AI is getting REALLY good at math. But how good, exactly?

Description:

David Shapiro conducts a structured investigation into the current state of AI mathematical reasoning, examining both what large language models have demonstrably achieved and where hard limits remain. The video opens with a clear motivation: mathematics is upstream of nearly every scientific and engineering discipline, so commoditizing advanced math capability would have compounding effects across energy, biology, medicine, and materials science.

On the achievement side, Shapiro covers OpenAI and Google DeepMind both earning gold-level scores on the 2025 International Mathematical Olympiad, as well as AI solutions to select Erdős problems validated by mathematician Terence Tao. He attributes much of this progress not to traditional scaling but to inference-time compute techniques, particularly Monte Carlo tree search, which companies like Ilya Sutskever’s new venture are doubling down on. On the limits side, no Millennium Prize problems have been solved, and Shapiro characterizes current systems as performing at a ‘strong undergraduate’ level rather than frontier research.
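
The video mentions Monte Carlo tree search only at a high level. As a rough sketch of what "inference-time compute" means in practice, here is a minimal, self-contained Python example of UCT-style MCTS applied to a toy arithmetic puzzle; the target value, operations, depth limit, and reward are invented for illustration and are not from the video.

```python
import math
import random

# Toy "reasoning" task: starting from 1, reach TARGET within MAX_DEPTH steps
# by repeatedly applying one of a few operations. This stands in for choosing
# the next step in a longer chain of reasoning.
TARGET = 24
MAX_DEPTH = 5
OPS = [("+1", lambda x: x + 1), ("+3", lambda x: x + 3), ("*2", lambda x: x * 2)]

class Node:
    def __init__(self, value, depth, parent=None, move=None):
        self.value, self.depth = value, depth
        self.parent, self.move = parent, move
        self.children = []
        self.visits = 0
        self.total_reward = 0.0
        self.untried = list(OPS) if depth < MAX_DEPTH else []

    def uct_child(self, c=1.4):
        # Upper Confidence Bound for Trees: balance exploitation and exploration.
        return max(
            self.children,
            key=lambda ch: ch.total_reward / ch.visits
            + c * math.sqrt(math.log(self.visits) / ch.visits),
        )

def reward(value):
    # Closer to the target is better; an exact hit scores 1.0.
    return 1.0 / (1.0 + abs(TARGET - value))

def rollout(value, depth):
    # Random playout from a leaf to estimate how promising it is.
    while depth < MAX_DEPTH:
        _, op = random.choice(OPS)
        value, depth = op(value), depth + 1
    return reward(value)

def mcts(iterations=2000):
    root = Node(value=1, depth=0)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend via UCT until a node with untried moves.
        while not node.untried and node.children:
            node = node.uct_child()
        # 2. Expansion: try one unexplored operation.
        if node.untried:
            name, op = node.untried.pop()
            node = Node(op(node.value), node.depth + 1, parent=node, move=name)
            node.parent.children.append(node)
        # 3. Simulation: random rollout to the depth limit.
        r = rollout(node.value, node.depth)
        # 4. Backpropagation: update statistics back up to the root.
        while node is not None:
            node.visits += 1
            node.total_reward += r
            node = node.parent
    # Read off the most-visited path as the chosen "solution".
    path, node = [], root
    while node.children:
        node = max(node.children, key=lambda ch: ch.visits)
        path.append((node.move, node.value))
    return path

if __name__ == "__main__":
    for move, value in mcts():
        print(move, "->", value)
```

The four phases (selection, expansion, simulation, backpropagation) are the standard MCTS skeleton; the systems discussed in the video presumably replace the random rollout and hand-written reward here with learned models and far larger search budgets.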

The practical implications receive substantial attention: the combination of improved math and coding capability could make formal verification — historically requiring roughly 20 person-years per 9,000 lines of code, as with the seL4 microkernel — a routine engineering standard rather than a mission-critical luxury. Viewers interested in AI’s trajectory in scientific research and software reliability will find the video a useful, evidence-grounded overview.
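
For a concrete sense of what "formal verification" buys, here is a deliberately tiny Lean 4 example, assuming Mathlib is available for the `ring` tactic; the function, theorem, and property are invented for illustration. The proof checker accepts the file only if every claim is machine-checked, and the seL4 effort mentioned above applies the same discipline, at vastly greater cost, to an entire microkernel.

```lean
import Mathlib.Tactic

-- A tiny "program": the sum 1 + 2 + ... + n.
def sumTo : Nat → Nat
  | 0 => 0
  | n + 1 => (n + 1) + sumTo n

-- A specification that must be proved, not just tested:
-- the closed-form Gauss formula, established by induction.
theorem two_mul_sumTo (n : Nat) : 2 * sumTo n = n * (n + 1) := by
  induction n with
  | zero => rfl
  | succ k ih =>
    simp only [sumTo]      -- unfold one step of the recursion
    rw [Nat.mul_add, ih]   -- distribute, then apply the induction hypothesis
    ring                   -- close the remaining algebraic identity
```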


📺 Source: David Shapiro · Published January 20, 2026
🏷️ Format: Deep Dive
