Description:
ARC-AGI 3 has officially launched as the first interactive version of the Abstraction and Reasoning Corpus benchmark, and Matthew Berman walks through what makes this release a meaningful step forward in measuring artificial general intelligence. Unlike coding, math, or science benchmarks, where top AI systems compete against the best human experts, ARC-AGI tasks are trivially solvable by average humans yet remain stubbornly difficult for frontier models: the defining characteristic that makes the benchmark compelling.
Berman traces the progression from ARC-AGI 1 (now nearly saturated, with top models approaching 93–94%) to ARC-AGI 2, where even the best current systems fall well short: GPT 5.4 Pro Extra High leads at 72% at a cost of $39 per task, followed by Gemini 3.1 Pro at 69% and Claude Opus 4.6 medium at 68%, while humans still achieve 100%. ARC Prize maintains a $2 million prize for full saturation.
The third iteration is a major format departure. Instead of pattern-completion puzzles, ARC-AGI 3 drops both humans and AI agents into an undescribed video game environment with zero instructions and a limited turn budget. Berman demonstrates live gameplay, showing how the challenge requires genuine exploration and generalization rather than pattern memorization. The benchmark is designed to resist the memorization strategies that allowed AI systems to climb earlier leaderboards, making it the most robust test of open-ended reasoning published to date.
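To make the format concrete, here is a minimal sketch of what an agent loop for this kind of benchmark could look like, assuming a Gym-style interface. The environment class, action names, and turn budget below are illustrative stand-ins, not the actual ARC Prize harness or API.

```python
# Hypothetical sketch of an ARC-AGI-3-style agent loop: no instructions,
# an opaque observation, and a hard turn budget. UnknownGameEnv, the action
# names, and TURN_BUDGET are all assumptions for illustration.
import random
from collections import Counter

class UnknownGameEnv:
    """Hypothetical environment: rules, goal, and scoring are hidden from the agent."""

    def reset(self) -> list[list[int]]:
        # Return an opaque 8x8 grid; the agent gets no legend or objective.
        return [[0] * 8 for _ in range(8)]

    def step(self, action: str) -> tuple[list[list[int]], bool]:
        # Returns (next observation, done). A real game would mutate hidden state here.
        return [[0] * 8 for _ in range(8)], False

ACTIONS = ["up", "down", "left", "right", "interact"]
TURN_BUDGET = 100  # the limited turn budget Berman describes

env = UnknownGameEnv()
obs = env.reset()
tried = Counter()

for turn in range(TURN_BUDGET):
    # With zero prior knowledge, the only viable opening strategy is
    # exploration: spread attempts across the action space (least-tried
    # first, random tiebreak) rather than pattern-matching against
    # memorized puzzles.
    action = min(ACTIONS, key=lambda a: (tried[a], random.random()))
    tried[action] += 1
    obs, done = env.step(action)
    if done:
        break
```

The point of the sketch is the constraint structure, not the policy: because the environment offers no task description, any scoring agent has to spend part of its budget discovering the rules before it can exploit them, which is exactly the memorization-resistant property Berman highlights.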
📺 Source: Matthew Berman · Published March 27, 2026
🏷️ Format: News Analysis