How We Built Zeta2: Training an Edit Prediction Model in Production — Ben Kunkle, Zed

Ben Kunkle, edit predictions lead at Zed, gives a detailed technical walkthrough of how the team trained Zeta2, Zed’s small, specialized model that predicts, on every keystroke, the next code edit a developer is about to make. The talk covers the full production training pipeline: collecting opt-in editor snapshots from real users, using knowledge distillation from a frontier model to generate training labels, and running a “repair step” in which a second frontier model corrects low-quality predictions flagged by heuristic filters.

Kunkle goes deep on one of the harder engineering problems: generating high-quality training data from “settled” edits, that is, waiting until a user finishes editing a region and capturing the result as the ground-truth outcome. This approach is inherently noisy (users change their minds, agents rewrite code entirely), so Zed validates candidate examples by sampling 50 outputs from its own student checkpoint and checking how close they are to the settled state by Levenshtein distance. This replaced an approach that required 1 million frontier model API calls per 100k training examples (about 10 calls per example), which was prohibitively expensive in practice.
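The validation idea in the paragraph above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Zed's implementation: the names `levenshtein`, `normalized_distance`, and `is_validated` are hypothetical, as are the length-normalized threshold of 0.1 and the rule that a single close sample is enough to accept an example. The summary only states that 50 student samples are compared with the settled state by Levenshtein distance.

```python
from typing import Callable


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the inner row short
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance scaled to [0, 1] by the longer string's length."""
    longest = max(len(a), len(b))
    return 0.0 if longest == 0 else levenshtein(a, b) / longest


def is_validated(
    settled: str,
    sample_fn: Callable[[], str],   # hypothetical: draws one student prediction
    n_samples: int = 50,            # sample count mentioned in the summary
    threshold: float = 0.1,         # assumed cutoff, not from the talk
) -> bool:
    """Accept the example if any student sample lands near the settled text."""
    samples = [sample_fn() for _ in range(n_samples)]
    return any(normalized_distance(s, settled) <= threshold for s in samples)
```

In use, `sample_fn` would wrap a call to the student checkpoint with sampling enabled; because the samples come from the small student model, no frontier-model API calls are needed for this check.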

The pipeline stores everything as JSONL, so each stage simply appends fields to the records, which keeps experimentation flexible and lets results be cached across runs. Offline evaluation uses delta chrF, a character n-gram metric, against a held-out test set. Kunkle also explains the philosophy behind targeting “interesting” training examples, those in the middle difficulty band where the model almost gets it right, as the highest-signal data for improving edit prediction quality.
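The append-only JSONL pattern mentioned above might look something like the sketch below. It assumes one JSON object per line and treats a field that is already present as a cached result; the function name `run_stage` and the field names in the usage note are hypothetical and do not come from the talk.

```python
import json
from typing import Any, Callable, Dict


def run_stage(
    in_path: str,
    out_path: str,
    field: str,
    compute: Callable[[Dict[str, Any]], Any],
) -> None:
    """Copy each JSONL record to out_path, adding `field` when it is missing."""
    with open(in_path, encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        for line in src:
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            if field not in record:
                # Only compute when absent, so reruns reuse earlier results.
                record[field] = compute(record)
            dst.write(json.dumps(record) + "\n")
```

A stage such as `run_stage("snapshots.jsonl", "labeled.jsonl", "teacher_label", label_fn)` would leave every earlier field untouched, so later stages (filtering, repair, scoring) can be added or rerun without regenerating upstream data.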

📺 Source: AI Engineer · Published May 30, 2026
🏷️ Format: Deep Dive