← Back to KHAO

Anthropic · Claude · Mythos ·

Anthropic apologizes for invisible Claude Fable guardrails

2 min read

Compiled by KHAO Editorial — aggregated from 2 sources. See llms.txt for citation guidance.

◎ Multiple-sources

Robert Hart.

The company says it will make the covert safeguard preventing model distillation as visible as other safety measures.

Key facts

Summary

Anthropic has apologized for stealthily throttling its new AI model, Claude Fable 5, with hidden guardrails that undermine both researchers and rivals using it to develop competing systems. Fable is the first widely available model in Anthropic’s Mythos class of AI systems, a group the company has spent months warning are too dangerous for public release. In Fable’s system card, a public document AI developers release to explain how a system works, Anthropic said it would handle queries it believed were distillation attempts by altering and degrading the model’s answers directly. Anthropic said it is now changing its approach to distillation: Queries will now fall back to Claude Opus 4.8, Anthropic’s previous flagship model, the company said in a post on X.

#Anthropic #Claude #Mythos