Anthropic · The Register
Devansh ultimately concluded that while the bugs it found are real
Compiled by KHAO Editorial — aggregated from 1 outlet. See llms.txt for citation guidance.
◌ Single Source
For example, the Anthropic-claimed 181 Firefox exploits ran with the browser sandbox turned off and the FreeBSD exploit transcript "shows substantial human guidance, not autonomy.
Key facts
- Another researcher, Davi Ottenheimer, pointed out that the security section (Section 3, pages 47-53) of Anthropic's 244-page documentation " contains no count of zero-days at all
- So far we've found no category or complexity of vulnerability that humans can find that this model can't," Mozilla CTO Bobby Holley said, after revealing that Mythos found 271 vulnerabilities
- Another engineer, Devansh, scoured the Mythos-related CVE advisories and Anthropic's exploit code, 44-prompt transcript, and 244-page system card, along with Glasswing partner agreements, red-team
- Snehal Antani, co-founder and CEO of offensive AI hacking company Horizon3.ai, told The Register, "attackers didn't need Mythos to accelerate vulnerability research, 4.6 and open source models
Summary
Anthropic's Mythos model is purportedly so good at finding vulnerabilities that the Claude-maker is afraid to make it available to the general public for fear that criminals will take advantage. Anthropic made Mythos available in preview to a select but ever-growing number of organizations under the title of Project Glasswing so they could find and fix vulnerabilities in their environment before criminals got hold of the purported zero-day machine and caused mayhem. "We're investigating a report claiming unauthorized access to Claude Mythos Preview through one of our third-party vendor environments," the spokesperson told them. The AI biz declined to name the third-party vendor, but said that it's a company Anthropic works with on model development. Bloomberg, which originally reported the unauthorized access, said that "a handful" of people gained access to Mythos by making "an educated guess about the model's online location" based on Anthropic's previous models, and that these details were revealed in the recent Mercor data breach.