DeepMind’s New AI Found A Strange New Way To Think



Descriptions:

Dr. Károly Zsolnai-Fehér of Two Minute Papers examines DeepMind’s AlphaProof Nexus, a new AI system that solved 9 of approximately 350 open mathematical problems originally posed by the legendary Hungarian mathematician Paul Erdős. These problems had resisted human proof for up to 56 years, and the system solved them at a cost of a few hundred dollars per solved problem.

The system’s core innovation is an Elo-ranked tournament loop built around Lean, the formal proof verification language. Rather than relying on a single model pass, AlphaProof Nexus iterates in three steps: a primary AI proposes proofs, a cheaper judge model compares pairs of failed attempts and assigns Elo scores (borrowing from chess rating systems), and each new iteration restarts from the highest-scoring incorrect solution rather than from scratch. A formal Lean validator provides ground truth: it ensures that claimed proofs are actually correct, which guards against the hallucination problem that makes standard LLMs unreliable for rigorous mathematics. The architecture illustrates a principle the video emphasizes throughout: intelligence increasingly lives in the harness and loop around models, not solely in the models themselves.
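The ranking step described above can be sketched in a few lines of Python. This is a minimal illustration of standard Elo updates applied to pairwise comparisons of failed attempts, not DeepMind’s implementation. The function names, the K-factor of 32, the starting rating of 1000, the all-pairs schedule, and the toy judge are all assumptions made for the example; the source does not state how the real system pairs attempts or tunes its ratings.

```python
from itertools import combinations

K = 32  # assumed Elo step size (a common chess default, not from the source)


def expected_score(r_a, r_b):
    """Standard Elo expected score of a player rated r_a against r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update(ratings, winner, loser, k=K):
    """Move rating points from the loser to the winner (zero-sum update)."""
    delta = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += delta
    ratings[loser] -= delta


def rank_attempts(attempts, judge, start=1000.0):
    """Rate failed attempts by letting a judge compare every pair once."""
    ratings = {a: start for a in attempts}
    for a, b in combinations(attempts, 2):
        winner = judge(a, b)  # hypothetical stand-in for the cheaper judge model
        loser = b if winner == a else a
        update(ratings, winner, loser)
    return ratings


def best_attempt(ratings):
    """Pick the highest-rated failed attempt as the seed for the next round."""
    return max(ratings, key=ratings.get)


def toy_judge(a, b):
    """Toy judge for demonstration only: prefers the longer attempt."""
    return a if len(a) >= len(b) else b


attempts = ["sketch", "partial lemma", "nearly complete proof"]
ratings = rank_attempts(attempts, toy_judge)
seed = best_attempt(ratings)
```

In a full loop, `seed` would be handed back to the proposing model as the starting point for the next round, and any attempt accepted by the Lean checker would end the search. Those parts are left out here because the source does not describe their interfaces.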

The video covers limitations with unusual candor. The 350-problem subset was chosen partly for ease of formalization, which introduces selection bias. Smaller models solved exactly zero problems, reinforcing that frontier-scale compute still matters despite benchmark compression. Open questions also remain around the optimal tradeoff between model size and tournament rounds at a fixed compute budget. The progression from GPT-3’s arithmetic failures to solving decades-old open problems in four years frames the result as a trajectory data point rather than a ceiling.


📺 Source: Two Minute Papers · Published June 05, 2026
🏷️ Format: News Analysis
