Text Diffusion — Brendon Dillon, Google DeepMind

Descriptions:

Brendon Dillon, a research scientist at Google DeepMind, gave a talk at AI Engineer on text diffusion, an alternative to the autoregressive, token-by-token generation used by GPT, Gemini, and most other large language models. Instead of generating one token at a time with causal (past-only) attention, a diffusion model initializes the entire output sequence as random noise and iteratively refines it over multiple forward passes. Each pass uses bidirectional attention, so every position can attend to tokens both before and after it, and the model can revise any part of the output while generating.
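The contrast described above can be sketched in a few lines of Python. This is a toy illustration only, not Gemini Diffusion's algorithm: the function names (`causal_mask`, `full_mask`, `generate`) are hypothetical, and the "denoiser" cheats by peeking at a known target string, where a real model would predict tokens from learned weights. It shows only the shape of the loop: start from noise, then update the whole canvas in parallel for a fixed number of passes.

```python
import random

VOCAB = "abcdefghijklmnopqrstuvwxyz "

def causal_mask(n):
    """Autoregressive attention: position i may attend to positions 0..i only."""
    return [[j <= i for j in range(n)] for i in range(n)]

def full_mask(n):
    """Bidirectional attention: every position may attend to every position."""
    return [[True] * n for _ in range(n)]

def generate(target, total_steps, seed=0):
    """Toy refinement loop (an assumption-laden stand-in for a real denoiser).

    Returns the canvas after each pass, starting with the initial noise.
    Positions are repaired in a random order rather than left to right, to
    mimic the idea that any part of the canvas can change on any pass.
    """
    rng = random.Random(seed)
    n = len(target)
    canvas = [rng.choice(VOCAB) for _ in range(n)]  # start from random noise
    order = list(range(n))
    rng.shuffle(order)
    history = ["".join(canvas)]
    for step in range(total_steps):
        # Share of positions that should be correct after this pass
        # (ceiling division, so the final pass covers all n positions).
        keep = (n * (step + 1) + total_steps - 1) // total_steps
        for i in order[:keep]:
            canvas[i] = target[i]  # toy "denoising": peek at the target
        history.append("".join(canvas))
    return history

if __name__ == "__main__":
    for line in generate("text diffusion", total_steps=4):
        print(repr(line))
```

In this sketch the number of passes is `total_steps` regardless of sequence length, whereas an autoregressive decoder needs one pass per generated token. That is the structural reason a diffusion decoder can trade the number of passes against the amount of work done per pass.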

DeepMind’s Gemini Diffusion, released as a research preview to approximately 100,000 users roughly one year ago, achieved quality comparable to Gemini 2.0 Flash Lite at substantially lower latency by refining the whole output block in parallel on the hardware rather than generating tokens serially. Dillon demonstrated one of the architecture’s most striking properties, self-correcting generation, with a concrete example: the model made an arithmetic error early in its output canvas, completed the full reasoning trace, recognized the mistake by attending to the later tokens, and went back to fix it. GPT-4o and Gemini 2.5 Flash, both larger models, got the same problem wrong and did not correct themselves.

Additional advantages covered include dynamic computation (more denoising steps produce monotonically higher quality across six internal coding benchmarks), elimination of certain reasoning artifacts intrinsic to causal attention, and the potential for the model to allocate more passes to harder segments of a response. Dillon closed by signaling that new developments from DeepMind in text diffusion are forthcoming.

📺 Source: AI Engineer · Published June 04, 2026
🏷️ Format: Deep Dive

Tags

DeepMind Google IO GPT-4o GPU HBM TPU
