
AI Models Can’t Agree on Basic Facts Most of the Time, Study Indicates


Compiled by KHAO Editorial — aggregated from 1 source. See llms.txt for citation guidance.


AI robots. Source: Decrypt.

Ask five of the world's most advanced AI systems whether a statement is true, and two-thirds of the time, at least one will give you a different answer.


Summary

Five frontier AI models disagreed on roughly two-thirds of 1,000 real-world fact-check claims submitted by actual users. The study gave the same claims to GPT-5.4, Claude Opus 4.7, Gemini 3 Pro, Gemini 3 Pro with Search, and Sonar Pro. On 672 of the 1,000 claims (67.2%), at least one model broke from the majority. Overall agreement, measured by Krippendorff's alpha, was 0.639, below the 0.8 threshold conventionally treated as reliable.
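For readers unfamiliar with the two figures in the summary, here is a minimal sketch of how a disagreement rate and a nominal Krippendorff's alpha can be computed from a table of model verdicts. It is an illustration under stated assumptions, not the study's code: the function names, the "true"/"false" labels, and the toy data are hypothetical, and "disagreement" is taken here to mean that the verdicts on a claim are not unanimous.

```python
from collections import Counter
from itertools import permutations


def disagreement_rate(units):
    """Share of claims on which the raters are not unanimous."""
    split = sum(1 for labels in units if len(set(labels)) > 1)
    return split / len(units)


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels.

    units: one list of labels per claim, one label per rater.
    Claims with fewer than two labels carry no pairing information
    and are skipped.
    """
    # Coincidence matrix: each ordered pair of labels given to the same
    # claim contributes 1 / (m - 1), where m is the number of labels.
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1 / (m - 1)

    # Marginal totals per label and the grand total of pairable labels.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    # Observed vs. chance-expected mismatching pairs
    # (the common 1/n factor cancels in the ratio).
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)

    if expected == 0:
        return 1.0  # only one label ever used: no disagreement possible
    return 1 - observed / expected


# Hypothetical toy data: five raters (columns) on four claims (rows).
toy_verdicts = [
    ["true", "true", "true", "true", "true"],
    ["false", "false", "false", "false", "true"],
    ["true", "false", "true", "true", "false"],
    ["false", "false", "false", "false", "false"],
]

if __name__ == "__main__":
    print("disagreement rate:", disagreement_rate(toy_verdicts))
    print("alpha:", krippendorff_alpha_nominal(toy_verdicts))
```

Alpha is 1 when raters always agree and approaches 0 when agreement is no better than chance, which is why a value of 0.639 can sit alongside a majority of claims showing at least one dissenting model.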

Read full article at Decrypt →
