The “Final Boss” of Deep Learning

Description:

Machine Learning Street Talk hosts a long-form technical conversation with DeepMind researchers probing the fundamental mismatch between how modern neural networks are built and what we need them to do reliably. The opening provocation is concrete: large language models cannot perform basic addition. They learn statistical patterns that approximate correct answers, but changing a single digit in an arithmetic problem exposes the failure: the model either produces nonsense or confidently repeats an incorrect pattern. The researchers argue that tool use (calling an external calculator) patches the symptom without addressing the underlying architectural gap, and that efficiency favors models that can reason internally rather than repeatedly offloading to tools.
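
The digit-flip probe described above is easy to script. Below is a minimal sketch, not from the episode: `perturb_digit` and `query_model` are hypothetical names, and the stub model stands in for any real LLM endpoint so the example runs on its own.

```python
import random
from typing import Callable

def perturb_digit(problem: str) -> str:
    """Replace one random digit in an arithmetic prompt with a different digit."""
    digit_positions = [i for i, ch in enumerate(problem) if ch.isdigit()]
    i = random.choice(digit_positions)
    new = random.choice([d for d in "0123456789" if d != problem[i]])
    return problem[:i] + new + problem[i + 1:]

def probe_addition(query_model: Callable[[str], str], a: int, b: int) -> None:
    """Check whether a model's answer tracks the true sum after a digit flip."""
    flipped = perturb_digit(f"{a} + {b} =")
    x, _, y, _ = flipped.split()              # recover the perturbed operands
    truth = int(x) + int(y)
    answer = query_model(flipped).strip()
    status = "OK" if answer == str(truth) else "PATTERN FAILURE"
    print(f"{flipped} model={answer!r} truth={truth} -> {status}")

# Hypothetical stand-in: a "model" that memorized the original problem
# and keeps emitting its answer regardless of the perturbed digits.
def memorizing_model(prompt: str) -> str:
    return "579"  # the answer to the unperturbed "123 + 456 ="

probe_addition(memorizing_model, 123, 456)
```

Because the flip always changes the true sum while the memorizing stub keeps answering "579", the probe reliably prints a pattern failure, which is exactly the behavior the episode attributes to statistical approximation.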

The deeper thread is the mathematical limits of geometric deep learning. Geometric approaches assume the operations being modeled are invertible, i.e. that transformations can always be undone, but many real algorithms are not. Dijkstra's shortest-path algorithm maps many different weighted graphs to identical outputs, making it fundamentally non-invertible and unrepresentable within symmetry-based frameworks. This realization led the researchers from group theory to monoids (dropping the invertibility requirement) and eventually to category theory (further dropping the requirement that any two operations compose; composition is defined only when their types line up), as a more general mathematical language for what neural networks should learn.
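
To make the non-invertibility claim concrete: two distinct weighted graphs can yield identical shortest-path distances, so no function can recover the input graph from Dijkstra's output. A minimal sketch (my own illustration, not from the episode):

```python
import heapq

def dijkstra(graph: dict, source: str) -> dict:
    """Standard Dijkstra; graph maps node -> list of (neighbor, weight)."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Two different graphs: the direct a->c edge has weight 2 in one and 5 in
# the other, yet the shortest a->c route runs through b in both cases.
g1 = {"a": [("b", 1), ("c", 2)], "b": [("c", 1)]}
g2 = {"a": [("b", 1), ("c", 5)], "b": [("c", 1)]}

print(dijkstra(g1, "a"))  # {'a': 0, 'b': 1, 'c': 2}
print(dijkstra(g2, "a"))  # {'a': 0, 'b': 1, 'c': 2}
# Identical outputs from distinct inputs: the map from graphs to distances
# is many-to-one, hence not invertible.
```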

The episode traces the emerging field of categorical deep learning: what it means architecturally to relax the constraints of groups, how monoid-based models handle asymmetric computations, and why category theory may offer a path toward systems genuinely aligned with algorithmic reasoning rather than pattern approximation. It is among the more technically rigorous public discussions of where deep learning theory is heading beyond the transformer paradigm.
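
As an illustrative gloss on the group-to-monoid-to-category progression (my own sketch, not the researchers' formalism): integer addition forms a group, `max` forms only a monoid, and typed function composition forms a category where composition is defined only when types match.

```python
# Group: integers under addition. Every element has an inverse,
# so every transformation can be undone.
assert 7 + (-7) == 0

# Monoid: integers under max. Associative with identity -inf, but
# max(3, 5) == 5 destroys information; no inverse recovers the 3.
# This is the shape of Dijkstra-style updates: legal, but not undoable.
assert max(3, max(5, 2)) == max(max(3, 5), 2)  # associativity
assert max(float("-inf"), 5) == 5              # identity element

# Category: arrows (typed functions) compose only when the codomain of
# one matches the domain of the next, unlike a monoid's total operation.
def compose(f, g):
    dom_g = g.__annotations__["x"]
    cod_f = f.__annotations__["return"]
    if dom_g is not cod_f:
        raise TypeError("arrows do not compose: type mismatch")
    def h(x):
        return g(f(x))
    return h

def length(x: str) -> int:
    return len(x)

def double(x: int) -> int:
    return 2 * x

print(compose(length, double)("abc"))   # 6: str -> int -> int composes
try:
    compose(double, length)             # int -> int then str -> int
except TypeError as e:
    print(e)                            # rejected: types do not line up
```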


📺 Source: Machine Learning Street Talk · Published December 22, 2025
🏷️ Format: Deep Dive
