The Mathematical Foundations of Intelligence [Professor Yi Ma]

Description:

Machine Learning Street Talk hosts Professor Yi Ma, inaugural director of the School of Computing and Data Science at the University of Hong Kong, former full professor at UC Berkeley, and IEEE/ACM Fellow, for a deep discussion of his mathematical theory of intelligence. The conversation centers on his recently published book, which argues that intelligence can be formally grounded in two principles: parsimony, the drive to compress data into maximally compact, structured representations, and self-consistency, the requirement that those representations, when decoded back to the data space, remain consistent with the original observations.
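
One concrete formalization of the parsimony principle comes from Ma and collaborators' earlier work on maximal coding rate reduction (MCR², Yu et al., 2020). The sketch below follows that paper's notation and is offered as illustration; the book may state the principle in a more general form.

```latex
% MCR^2: parsimony as lossy-coding compression.
% Z = [z_1, ..., z_n] in R^{d x n} are learned features, Pi_j is the
% diagonal membership matrix of group j, eps is the coding distortion.
\[
R(Z) = \frac{1}{2}\log\det\!\left(I + \frac{d}{n\epsilon^{2}}\, Z Z^{\top}\right),
\qquad
R_c(Z,\Pi) = \sum_{j=1}^{k} \frac{\operatorname{tr}(\Pi_j)}{2n}
\log\det\!\left(I + \frac{d}{\operatorname{tr}(\Pi_j)\,\epsilon^{2}}\, Z \Pi_j Z^{\top}\right),
\]
\[
\max_{Z}\; \Delta R(Z,\Pi) = R(Z) - R_c(Z,\Pi).
\]
```

Maximizing ΔR drives the features of each group toward a compact subspace while keeping the groups well separated, which is what "compress data into maximally compact representations" cashes out to in this framework.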

A central technical contribution discussed is the CRATE architecture (Coding RAte reduction TransformEr), a family of "white-box" transformers in which every architectural component, including the attention heads, the MLP-like layers, and the residual connections, is derived as a step of an optimization algorithm on a sparse rate reduction objective rather than adopted as an empirical heuristic. Ma explains why this interpretability matters: it lets researchers read off what representation each layer is computing, rather than treating the network as an inscrutable black box. He also unpacks why the non-convex optimization landscapes arising from the low-dimensional structure of natural data turn out to be surprisingly benign, with few problematic local minima, a phenomenon he calls "the blessing of dimensionality."
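For readers who want to see the coding-rate quantities numerically, here is a minimal NumPy sketch of the rate reduction computation described above. It is an illustration only, not code from the CRATE implementation; the function names and the default `eps` are assumptions made for this example.

```python
import numpy as np

def coding_rate(Z: np.ndarray, eps: float = 0.5) -> float:
    """Coding rate R(Z) of features Z (d x n): a lossy-coding measure of
    the volume the features span, up to distortion eps. Larger R means
    the features are more spread out (less compressible)."""
    d, n = Z.shape
    # log det(I + (d / (n * eps^2)) * Z Z^T), via slogdet for stability
    gram = np.eye(d) + (d / (n * eps**2)) * (Z @ Z.T)
    _, logdet = np.linalg.slogdet(gram)
    return 0.5 * logdet

def rate_reduction(Z: np.ndarray, labels: np.ndarray, eps: float = 0.5) -> float:
    """Delta R = R(Z) - sum_j (n_j / n) * R(Z_j): the gap between the rate
    of the whole feature set and the weighted rates of its groups.
    Maximizing it compresses each group while keeping groups apart."""
    d, n = Z.shape
    compressed = 0.0
    for j in np.unique(labels):
        Zj = Z[:, labels == j]      # features belonging to group j
        nj = Zj.shape[1]
        gram = np.eye(d) + (d / (nj * eps**2)) * (Zj @ Zj.T)
        _, logdet = np.linalg.slogdet(gram)
        compressed += (nj / n) * 0.5 * logdet
    return coding_rate(Z, eps) - compressed

# Toy usage: two well-separated clusters yield a large rate reduction.
rng = np.random.default_rng(0)
Z = np.hstack([rng.normal(0, 1, (8, 50)), rng.normal(5, 1, (8, 50))])
labels = np.array([0] * 50 + [1] * 50)
print(rate_reduction(Z, labels))
```

In CRATE, quantities of this kind are not just a training loss: each layer of the network is interpreted as an incremental step that improves such an objective, which is what makes the architecture "white-box."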

The interview closes on open problems: the distinction between compression and abstraction, the gap between memorization and genuine understanding, and the goal of universal induction — an inductive counterpart to Turing’s universal computation. For researchers and technically minded practitioners interested in the theoretical underpinnings of modern deep learning, this is a substantive and rare first-person account from one of the field’s leading mathematical voices.


📺 Source: Machine Learning Street Talk · Published December 13, 2025
🏷️ Format: Interview
