DeepSeek · Gemini · Claude · Codex · Import AI
Import AI 452: Scaling laws for cyberwar; rising tides of AI automation; and a puzzle over gDP forecasting
Compiled by KHAO Editorial — aggregated from 1 outlet. See llms.txt for citation guidance.
◌ Single Source
Welcome to Import AI, a newsletter about AI research.
Key facts
- Evaluated models: 2019: GPT-2. 2020: GPT3. 2022: GPT3.5. 2024: Claude 3 Opus, GPT-4o. 2025: o3, Opus 4, Gemini 2.5 Pro, DeepSeek V3.1, GPT-5.1 Codex Max
- GPT-5.2 Codex. 2026: Opus 4.6, GPT-5.3 Codex, GLM-5, Sonnet 4.6
- The most recent frontier models in their study, GPT-5.3 Codex and Opus 4.6, sit above both fitted trendlines, achieving 50% success on tasks taking human experts 3.1h and 3.2h respectively,” they write
- Across frontier models released since 2019, the doubling time is 9.8 months
Summary
Uh oh, there’s a scaling war for cyberattacks as well!: …The smarter the system, the better the ability to cyberattack… AI safety research organization Lyptus Research has looked at how well AI systems can perform a variety of cyberoffense tasks and found a clear trend of more advanced models being able to do more advanced forms of cyberattack. “Across frontier models released since 2019, the doubling time is 9.8 months. The most recent frontier models in our study, GPT-5.3 Codex and Opus 4.6, sit above both fitted trendlines, achieving 50% success on tasks taking human experts 3.1h and 3.2h respectively,” they write. Evaluated models: 2019: GPT-2. 2020: GPT3. 2022: GPT3.5. 2024: Claude 3 Opus, GPT-4o. 2025: o3, Opus 4, Gemini 2.5 Pro, DeepSeek V3.1, GPT-5.1 Codex Max. GPT-5.2 Codex. 2026: Opus 4.6, GPT-5.3 Codex, GLM-5, Sonnet 4.6.