Why Scale Will Not Solve AGI | Vishal Misra – The a16z Show


Description:

Columbia University professor Vishal Misra joins the a16z podcast to present his mathematical framework for understanding how large language models actually function, and why he believes scaling alone cannot produce AGI. Misra, who says he built one of the earliest retrieval-augmented generation (RAG) implementations at ESPN in 2020 while working on a cricket database query system, has since developed formal mathematical models of LLM behavior, beginning with representing these models as enormous probability matrices mapping every possible prompt to a distribution over vocabulary tokens.
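The "probability matrix" view can be sketched as follows. This is an illustrative toy, not Misra's actual formalism: each row of a (conceptually enormous) stochastic matrix is indexed by a prompt, each column by a vocabulary token, and generation is just repeated sampling from the row for the current prompt. The prompts, tokens, and probabilities below are hypothetical.

```python
import random

# Toy sketch: an "LLM" as a stochastic matrix whose rows are prompts and
# whose columns are vocabulary tokens; each row sums to 1.
MATRIX = {
    "how many": {"runs": 0.6, "wickets": 0.3, "overs": 0.1, "<eos>": 0.0},
    "how many runs": {"runs": 0.0, "wickets": 0.1, "overs": 0.1, "<eos>": 0.8},
}

def next_token_dist(prompt: str) -> dict:
    """Look up the row of the prompt-to-token probability matrix."""
    return MATRIX[prompt]

def sample_next(prompt: str, rng: random.Random) -> str:
    """Autoregressive step: draw one token from the prompt's row."""
    dist = next_token_dist(prompt)
    tokens, probs = zip(*dist.items())
    return rng.choices(tokens, weights=probs, k=1)[0]

rng = random.Random(0)
print(sample_next("how many", rng))
```

A real model never stores this matrix explicitly; the transformer's weights define the row for any prompt on demand.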

The centerpiece of his research is a concept he calls the “Bayesian wind tunnel” — an experimental framework inspired by aerospace testing. By giving transformer models tasks where memorization is computationally impossible yet the correct Bayesian posterior can be calculated analytically, his team (alongside colleagues at Columbia and DeepMind) demonstrated that transformers compute precise Bayesian posteriors to within 10⁻³ bits accuracy. Their taxonomy of results is striking: transformers handle all tested Bayesian tasks, Mamba architectures handle most, LSTMs handle only a subset, and MLPs fail completely — revealing an architectural rather than data-driven effect.
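The wind-tunnel logic can be illustrated with the simplest such task (my example, not necessarily one from the paper): a Beta-Bernoulli source, where the exact Bayesian posterior predictive has a closed form, so any model's next-token probability can be scored against it in bits via a KL divergence.

```python
import math

def posterior_predictive(heads: int, tails: int, a: float = 1.0, b: float = 1.0) -> float:
    """Exact P(next = heads | data) under a Beta(a, b) prior."""
    return (heads + a) / (heads + tails + a + b)

def kl_bits(p: float, q: float) -> float:
    """KL divergence, in bits, between Bernoulli(p) and Bernoulli(q)."""
    kl = 0.0
    for pi, qi in ((p, q), (1 - p, 1 - q)):
        if pi > 0:
            kl += pi * math.log2(pi / qi)
    return kl

# After observing 7 heads and 3 tails with a uniform Beta(1, 1) prior,
# the exact posterior predictive is (7 + 1) / (10 + 2) = 2/3.
exact = posterior_predictive(7, 3)

# A hypothetical model outputting 0.66 for this query would sit within
# 10^-3 bits of the exact Bayesian answer.
gap = kl_bits(exact, 0.66)
print(exact, gap)
```

The actual experiments use tasks rich enough that memorization is infeasible, but the scoring principle is the same: compare the model's predictive distribution to the analytically known posterior.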

Misra frames this Bayesian interpretation as fundamental to understanding both the power and limits of current LLMs, and poses a sharp test for genuine AGI: train a model only on pre-1916 physics and see if it derives the theory of relativity. His argument is that next-token prediction over existing human knowledge cannot produce the kind of novel conceptual leap that defines true general intelligence.


📺 Source: a16z · Published March 17, 2026
🏷️ Format: Interview