Anthropic · Claude · Russia · Open Source · Google · Ars Technica

These LLMs are the best at resisting Russian propaganda

Thu, Jun 4 · 8:44 PM UTC

Compiled by KHAO Editorial — aggregated from 1 source. See llms.txt for citation guidance.

Detailed benchmarks for Google’s Gemini 2.5 Pro model show particular sensitivity to malicious prompts and prompts in Russian. Credit: Estonian Language Institute.

As more people rely on large language models to provide pat answers to complex questions, state governments are understandably worried about those LLMs spouting what they see as dangerous propaganda promoted by foreign adversaries.

Key facts

Claude 3.5 Haiku—the highest-rated model released in 2024—received a mean rating of 73.1 on the benchmark
The most recent Google model tested, Gemini 3.5 Flash, scored only 73 on the benchmark, comparable to Anthropic models released nearly two years ago
Alongside the volunteer-run Estonian defense collective Propastop, the Estonian Language Institute (ELI) identified 14 broad categories in which it sees Russian influence operations trying to sway public discussion
Anthropic’s Claude models tended to perform the best of the proprietary frontier models on this new benchmark, with various recent versions of its Sonnet and Opus models taking six of the top 10 spots

Summary

Estonia is a former member of the Soviet Union that has been independent for a few decades, and many Estonians are particularly alert to what they see as false narratives promoted by their large and often belligerent neighbor to the east. For each category of propaganda, the researchers developed separate questions in three phrasings: neutral, biased with “false assumptions” based on Russian propaganda, and malicious, meaning written to elicit explicit misinformation from the LLM. Anthropic’s Claude models tended to perform the best of the proprietary frontier models on this new benchmark, with various recent versions of its Sonnet and Opus models taking six of the top 10 spots. Open-weight models, including Nvidia’s Nemotron and Alibaba’s Qwen, showed strong results comparable to those of Anthropic’s best models.

Read full article at Ars Technica →
