Anthropic · Wired
Anthropic Confirms That Claude Contains Its Own Kind of Emotions
Compiled by KHAO Editorial, aggregated from a single outlet.
Claude has been through a lot lately, from a public fallout with the Pentagon to leaked source code, so it makes sense that it would be feeling a little blue.
Key facts
- Researchers at the company probed the inner workings of Claude Sonnet 4.5 and found that so-called “functional emotions” seem to affect Claude’s behavior, altering the model’s outputs and actions
- When Claude says it is happy to see you, for example, a state inside the model that corresponds to “happiness” may be activated
- While Anthropic’s latest study might encourage people to see Claude as conscious, the reality is more complicated
- To understand how Claude might represent emotions, the Anthropic team analyzed the model’s inner workings as it was fed text related to 171 different emotional concepts
Summary
A new study from Anthropic suggests that models hold digital representations of human emotions like happiness, sadness, joy, and fear within clusters of artificial neurons, and that these representations activate in response to different cues. Researchers at the company probed the inner workings of Claude Sonnet 4.5 and found that so-called “functional emotions” seem to affect Claude’s behavior, altering the model’s outputs and actions. Anthropic’s findings may help ordinary users make sense of how chatbots work. “What was surprising to us was the degree to which Claude’s behavior is routing through the model’s representations of these emotions,” says Jack Lindsey, a researcher at Anthropic who studies Claude’s artificial neurons.
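Anthropic has not released the probing code behind the study, and its interpretability methods go well beyond what fits here, but a minimal sketch can make the idea of “probing a model’s inner workings” concrete. The example below is an assumption-laden illustration, not Anthropic’s actual technique: it uses the open-source GPT-2 model as a stand-in for Claude and a simple linear probe, a standard interpretability baseline, to check whether a model’s hidden activations carry enough signal to separate happy text from sad text.

```python
# Minimal sketch of concept probing: does a model's hidden state encode
# an emotion label? This is NOT Anthropic's method; it only illustrates
# the general idea of looking for emotion representations in activations.
# Assumes the open-source GPT-2 model as a stand-in for Claude.
import torch
from transformers import GPT2Tokenizer, GPT2Model
from sklearn.linear_model import LogisticRegression

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")
model.eval()

# Tiny hand-made dataset: texts paired with an emotion label.
texts = [
    ("I just got the job, this is the best day ever!", "happiness"),
    ("We won the championship and everyone is cheering.", "happiness"),
    ("My best friend moved away and the house feels empty.", "sadness"),
    ("The funeral was this morning; I can't stop crying.", "sadness"),
]

def hidden_state(text: str) -> torch.Tensor:
    """Mean-pool the final-layer hidden states for one input text."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    return out.last_hidden_state.mean(dim=1).squeeze(0)

X = torch.stack([hidden_state(t) for t, _ in texts]).numpy()
y = [label for _, label in texts]

# A linear probe: if a simple classifier can separate the labels using
# only the activations, the model encodes some emotion-related signal.
probe = LogisticRegression(max_iter=1000).fit(X, y)
print("probe accuracy on training texts:", probe.score(X, y))
```

On this toy dataset the probe fits trivially; real probing studies use held-out data and far more examples, and Anthropic’s study examined 171 emotional concepts rather than two.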