Descriptions:
Ali Behrouz — Cornell PhD student and Google researcher — joins Nathan Labenz on the Cognitive Revolution to unpack Nested Learning, his biologically-inspired machine learning architecture that Jeff Dean has called a potential paradigm shift. The core idea: different components of a model update at different temporal frequencies, mirroring how human memory operates across working memory and long-term storage. This allows the model to adapt rapidly to new contexts while preserving foundational knowledge — a key step toward genuine continual learning that current transformer architectures cannot achieve.
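As a rough illustration of the multi-frequency idea only, and not Behrouz's actual architecture, the toy sketch below updates one scalar parameter on every step and a second one only every few steps, using the error accumulated in between. The function name `run_multi_timescale`, the two scalar "levels", and all learning rates are hypothetical choices made for this example.

```python
def run_multi_timescale(targets, slow_period=4, fast_lr=0.5, slow_lr=0.1):
    """Toy two-level learner: `fast` updates every step, `slow` every `slow_period` steps.

    Hypothetical illustration only; the prediction is simply fast + slow.
    """
    fast = 0.0      # high-frequency component (adapts quickly to new context)
    slow = 0.0      # low-frequency component (changes rarely, retains more)
    pending = 0.0   # error accumulated since the last slow update
    n_fast = 0
    n_slow = 0
    for step, target in enumerate(targets, start=1):
        # Gradient of 0.5 * (prediction - target)^2 with respect to either parameter.
        error = (fast + slow) - target
        fast -= fast_lr * error
        n_fast += 1
        pending += error
        if step % slow_period == 0:
            # The slow level only moves on the averaged error over its window.
            slow -= slow_lr * (pending / slow_period)
            pending = 0.0
            n_slow += 1
    return fast, slow, n_fast, n_slow


fast, slow, n_fast, n_slow = run_multi_timescale([1.0] * 8)
print(n_fast, n_slow)
```

With eight steps and `slow_period=4`, the fast parameter is updated eight times and the slow one twice, which is the only property this sketch is meant to show.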
The conversation also covers Behrouz’s newer paper, “Language Models Need Sleep,” which introduces an offline consolidation phase. In that phase, models distill recently acquired knowledge from high-frequency update layers into slower-evolving layers and generate synthetic training data from recent experiences, closely paralleling how human memory consolidates during sleep. Behrouz further argues that all deep learning components can be understood as forms of associative memory, leading him to call conventional architectures an “illusion” and to develop expressive optimizers that learn their own update rules and outperform both Adam and Muon.
Empirical results show Nested Learning models matching transformers on standard benchmarks while outperforming them on difficult tasks including effective recall over 10 million tokens and simultaneous translation of multiple previously unseen languages. The episode closes with a candid discussion of continual learning’s privacy and alignment risks — and why Behrouz is cautiously optimistic that models that evolve through ongoing user interaction could ultimately produce a more diverse and stable AI ecosystem.
📺 Source: Cognitive Revolution “How AI Changes Everything” · Published June 03, 2026
🏷️ Format: Interview