The “Final Boss” of Deep Learning

Description:

Machine Learning Street Talk hosts a long-form technical conversation with DeepMind researchers probing the fundamental mismatch between how modern neural networks are built and what we need them to do reliably. The opening provocation is concrete: large language models cannot perform basic addition. They learn statistical patterns that approximate correct answers, but changing a single digit in an arithmetic problem exposes the failure: the model either produces nonsense or confidently repeats an incorrect pattern. The researchers argue that tool use (calling an external calculator) patches the symptom without addressing the underlying architectural gap, and that efficiency favors models that can reason internally rather than repeatedly offloading to tools.
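
The digit-flip probe described above is easy to script. Below is a minimal sketch, not from the episode: `perturb_digit` and `query_model` are hypothetical names, and the stub model stands in for any real LLM endpoint so the example runs on its own.

```python
import random
from typing import Callable

def perturb_digit(problem: str) -> str:
    """Replace one random digit in an arithmetic prompt with a different digit."""
    digit_positions = [i for i, ch in enumerate(problem) if ch.isdigit()]
    i = random.choice(digit_positions)
    new = random.choice([d for d in "0123456789" if d != problem[i]])
    return problem[:i] + new + problem[i + 1:]

def probe_addition(query_model: Callable[[str], str], a: int, b: int) -> None:
    """Check whether a model's answer tracks the true sum after a digit flip."""
    flipped = perturb_digit(f"{a} + {b} =")
    x, _, y, _ = flipped.split()              # recover the perturbed operands
    truth = int(x) + int(y)
    answer = query_model(flipped).strip()
    status = "OK" if answer == str(truth) else "PATTERN FAILURE"
    print(f"{flipped} model={answer!r} truth={truth} -> {status}")

# Hypothetical stand-in: a "model" that memorized the original problem
# and keeps emitting its answer regardless of the perturbed digits.
def memorizing_model(prompt: str) -> str:
    return "579"  # the answer to the unperturbed "123 + 456 ="

probe_addition(memorizing_model, 123, 456)
```

Because the flip always changes the true sum while the memorizing stub keeps answering "579", the probe reliably prints a pattern failure, which is exactly the behavior the episode attributes to statistical approximation.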

The deeper thread is the mathematical limits of geometric deep learning. Geometric approaches assume the operations being modeled are invertible, i.e. that transformations can always be undone, but many real algorithms are not. Dijkstra's shortest-path algorithm maps many different weighted graphs to identical outputs, making it fundamentally non-invertible and unrepresentable within symmetry-based frameworks. This realization led the researchers from group theory to monoids (dropping the invertibility requirement) and eventually to category theory (further dropping the requirement that any two operations compose; composition is defined only when their types line up), as a more general mathematical language for what neural networks should learn.
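
To make the non-invertibility claim concrete: two distinct weighted graphs can yield identical shortest-path distances, so no function can recover the input graph from Dijkstra's output. A minimal sketch (my own illustration, not from the episode):

```python
import heapq

def dijkstra(graph: dict, source: str) -> dict:
    """Standard Dijkstra; graph maps node -> list of (neighbor, weight)."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Two different graphs: the direct a->c edge has weight 2 in one and 5 in
# the other, yet the shortest a->c route runs through b in both cases.
g1 = {"a": [("b", 1), ("c", 2)], "b": [("c", 1)]}
g2 = {"a": [("b", 1), ("c", 5)], "b": [("c", 1)]}

print(dijkstra(g1, "a"))  # {'a': 0, 'b': 1, 'c': 2}
print(dijkstra(g2, "a"))  # {'a': 0, 'b': 1, 'c': 2}
# Identical outputs from distinct inputs: the map from graphs to distances
# is many-to-one, hence not invertible.
```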

The episode traces the emerging field of categorical deep learning: what it means architecturally to relax the constraints of groups, how monoid-based models handle asymmetric computations, and why category theory may offer a path toward systems genuinely aligned with algorithmic reasoning rather than pattern approximation. It is among the more technically rigorous public discussions of where deep learning theory is heading beyond the transformer paradigm.
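
As an illustrative gloss on the group-to-monoid-to-category progression (my own sketch, not the researchers' formalism): integer addition forms a group, `max` forms only a monoid, and typed function composition forms a category where composition is defined only when types match.

```python
# Group: integers under addition. Every element has an inverse,
# so every transformation can be undone.
assert 7 + (-7) == 0

# Monoid: integers under max. Associative with identity -inf, but
# max(3, 5) == 5 destroys information; no inverse recovers the 3.
# This is the shape of Dijkstra-style updates: legal, but not undoable.
assert max(3, max(5, 2)) == max(max(3, 5), 2)  # associativity
assert max(float("-inf"), 5) == 5              # identity element

# Category: arrows (typed functions) compose only when the codomain of
# one matches the domain of the next, unlike a monoid's total operation.
def compose(f, g):
    dom_g = g.__annotations__["x"]
    cod_f = f.__annotations__["return"]
    if dom_g is not cod_f:
        raise TypeError("arrows do not compose: type mismatch")
    def h(x):
        return g(f(x))
    return h

def length(x: str) -> int:
    return len(x)

def double(x: int) -> int:
    return 2 * x

print(compose(length, double)("abc"))   # 6: str -> int -> int composes
try:
    compose(double, length)             # int -> int then str -> int
except TypeError as e:
    print(e)                            # rejected: types do not line up
```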


📺 Source: Machine Learning Street Talk · Published December 22, 2025
🏷️ Format: Deep Dive
