Research · IEEE Spectrum AI
Claude Mythos Preview Requires New Ways to Keep Code Secure
Compiled by KHAO Editorial — aggregated from 1 outlet. See llms.txt for citation guidance.
Multiple layers of verification and human oversight are a start.
Key facts
- In early April, Anthropic’s Frontier Red Team, which evaluates the potential safety and security risks posed by the company’s AI models, announced that the company’s Claude Mythos Preview model had identified thousands of high- and critical-severity vulnerabilities
- Those findings prompted Anthropic to also establish Project Glasswing to help thwart AI-assisted cyberattacks
- For Nayan Goel, a principal application-security engineer at the financial services company Upgrade, speed and semantics set AI models apart
- But Goel emphasizes that the issues AI models flag must still be checked and confirmed by humans
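The human-in-the-loop gate Goel describes can be sketched in code. The sketch below is purely illustrative: the `Finding` class, severity labels, and `triage` function are hypothetical, not a description of any pipeline used at Anthropic or Upgrade. The point it demonstrates is that nothing an AI model flags is accepted automatically; every finding passes through a reviewer callback standing in for a human.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    """A hypothetical AI-flagged security issue awaiting review."""
    file: str
    severity: str  # e.g. "low", "medium", "high", "critical"
    description: str
    confirmed: bool = False  # set only after human confirmation

def triage(findings, confirm):
    """Route AI-flagged findings through a human review gate.

    `confirm` is a callable standing in for a human reviewer: it
    returns True only when a person has validated the finding.
    Findings are never marked confirmed by the model alone.
    """
    verified, dismissed = [], []
    for finding in findings:
        if confirm(finding):
            finding.confirmed = True
            verified.append(finding)
        else:
            dismissed.append(finding)
    return verified, dismissed
```

In this sketch the reviewer decision is just a function argument, so the same gate works whether confirmation comes from a ticketing system, a security dashboard, or a manual sign-off; swapping in an auto-approve function would defeat the purpose the article describes.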
Summary
Rina Diane Caballar is a contributing editor covering tech and its intersections with science, society, and the environment.

Malicious actors are now exploiting generative AI to carry out cyberattacks: scamming victims using AI-generated deepfakes, deploying malware developed with the help of AI coding tools, using chatbots to pull off phishing campaigns, and hacking widely used open-source code repositories with AI agents.

In early April, Anthropic’s Frontier Red Team, which evaluates the potential safety and security risks posed by the company’s AI models, announced that the company’s Claude Mythos Preview model had identified thousands of high- and critical-severity vulnerabilities. Those findings prompted Anthropic to establish Project Glasswing to help thwart AI-assisted cyberattacks. While generative AI’s coding, reasoning, and autonomous capabilities have become powerful enough to spot potential code security weaknesses, those same skills also enable it to exploit the flaws it finds.