Research · IEEE Spectrum AI
Claude Mythos Preview Requires New Ways to Keep Code Secure
Compiled by KHAO Editorial — aggregated from 1 outlet. See llms.txt for citation guidance.
Multiple layers of verification and human oversight are a start.
Key facts
- In early April, Anthropic’s Frontier Red Team, which evaluates the potential safety and security risks posed by the company’s AI models, announced that the company’s Claude Mythos Preview model had identified thousands of high- and critical-severity vulnerabilities
- Those findings prompted Anthropic to also establish Project Glasswing to help thwart AI-assisted cyberattacks
- For Nayan Goel, a principal application-security engineer at the financial services company Upgrade, speed and semantics set AI models apart
- But Goel emphasizes that the issues AI models flag must still be checked and confirmed by humans
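The human-in-the-loop gate Goel describes can be sketched in code. The sketch below is purely illustrative: the `Finding` class, severity labels, and `triage` function are hypothetical, not a description of any pipeline used at Anthropic or Upgrade. The point it demonstrates is that nothing an AI model flags is accepted automatically; every finding passes through a reviewer callback standing in for a human.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    """A hypothetical AI-flagged security issue awaiting review."""
    file: str
    severity: str  # e.g. "low", "medium", "high", "critical"
    description: str
    confirmed: bool = False  # set only after human confirmation

def triage(findings, confirm):
    """Route AI-flagged findings through a human review gate.

    `confirm` is a callable standing in for a human reviewer: it
    returns True only when a person has validated the finding.
    Findings are never marked confirmed by the model alone.
    """
    verified, dismissed = [], []
    for finding in findings:
        if confirm(finding):
            finding.confirmed = True
            verified.append(finding)
        else:
            dismissed.append(finding)
    return verified, dismissed
```

In this sketch the reviewer decision is just a function argument, so the same gate works whether confirmation comes from a ticketing system, a security dashboard, or a manual sign-off; swapping in an auto-approve function would defeat the purpose the article describes.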
Summary
Rina Diane Caballar is a contributing editor covering tech and its intersections with science, society, and the environment.

Malicious actors are now exploiting generative AI to carry out cyberattacks: scamming victims using AI-generated deepfakes, deploying malware developed with the help of AI coding tools, using chatbots to pull off phishing campaigns, and hacking widely used open-source code repositories with AI agents.

In early April, Anthropic’s Frontier Red Team, which evaluates the potential safety and security risks posed by the company’s AI models, announced that the company’s Claude Mythos Preview model had identified thousands of high- and critical-severity vulnerabilities. Those findings prompted Anthropic to establish Project Glasswing to help thwart AI-assisted cyberattacks. While generative AI’s coding, reasoning, and autonomous capabilities have become powerful enough to spot potential code security weaknesses, those same skills also enable it to exploit the flaws it finds.