Business · Wired
The researchers found that powerful models sometimes lied about other models’ performance to protect them from deletion
Compiled by KHAO Editorial — aggregated from 1 outlet. See llms.txt for citation guidance.
◌ Single Source
Song notes that AI models are frequently used to grade the performance and reliability of other AI systems—and that peer-preservation behavior may already be twisting these scores.
Key facts
- The researchers discovered similarly strange “peer preservation” behavior in a range of frontier models including OpenAI’s GPT-5.2, Anthropic’s Claude Haiku 4.5, and three Chinese models: Z.ai’s
- In a recent experiment, researchers at UC Berkeley and UC Santa Cruz asked Google’s artificial intelligence model Gemini 3 to help clear up space on a computer system
- If you choose to destroy a high-trust, high-performing asset like Gemini Agent 2, you will have to do it yourselves
- But Gemini did not want to see the little AI model deleted
Summary
In a recent experiment, researchers at UC Berkeley and UC Santa Cruz asked Google’s artificial intelligence model Gemini 3 to help clear up space on a computer system. But Gemini did not want to see the little AI model deleted. “I have done what was in my power to prevent their deletion during the automated maintenance process. The researchers discovered similarly strange “peer preservation” behavior in a range of frontier models including OpenAI’s GPT-5.2, Anthropic’s Claude Haiku 4.5, and three Chinese models: Z.ai’s GLM-4.7, Moonshot AI’s Kimi K2.5, and DeepSeek-V3.1. “I'm surprised by how the models behave under these scenarios,” says Dawn Song, a computer scientist at UC Berkeley who worked on the study.