RAG is dead, right?? — Kuba Rogut, Turbopuffer

Foundation Models · 2 months ago

Description:

Kuba Rogut, a deployed engineer at Turbopuffer, pushes back on the “RAG is dead” wave that swept AI social media in late 2025 and early 2026, presenting Google search volume data showing that interest in RAG actually hit a new inflection point midway through 2025. His core argument has two parts. First, the dismissal conflates simple vector search with retrieval-augmented generation as a whole, which properly encompasses full-text search (BM25), regex, grep, and structured filters. Second, hybrid retrieval that combines these methods is quietly becoming the default architecture for serious agentic applications.
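To make "hybrid retrieval" concrete, here is a minimal sketch of one common way to merge a vector-search ranking with a full-text ranking: reciprocal rank fusion (RRF). The summary does not say which fusion method the talk or Turbopuffer uses, so RRF, the constant `k=60`, and the file names below are illustrative assumptions only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) in every list it appears in;
    the scores are summed across lists. k=60 is a conventional default,
    assumed here rather than taken from the talk.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from two retrievers for the same query.
vector_hits = ["auth.py", "session.py", "login.md"]
keyword_hits = ["login.md", "auth.py", "routes.py"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents that rank well in both lists (`auth.py`, `login.md`) rise above documents that only one retriever found, which is the practical appeal of combining methods instead of relying on vector search alone.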

The talk uses Cursor as its primary case study of production-grade retrieval. Cursor uses Merkle trees to calculate similarity between codebases opened by teammates, copying over existing embeddings and re-chunking only changed files rather than starting from scratch each session. Cursor’s internal benchmark shows semantic search drives a 12.5–13.5% improvement in answer accuracy across models, rising to nearly 24% for its Composer model specifically. An online A/B test found 2.6% better code retention in large codebases and a 2.2% reduction in dissatisfied user requests. Rogut notes that these gains are understated because semantic search is not triggered on every query. Turbopuffer powers Cursor’s semantic search layer.
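The Merkle-tree idea can be sketched in a few lines: hash every file, hash every directory from its children's hashes, and skip any subtree whose hash already matches. This is a simplified illustration, not Cursor's implementation; the nested-dict representation of a codebase and the function names `merkle` and `changed_files` are assumptions made for the example.

```python
import hashlib


def merkle(node):
    """Build a Merkle tree from a nested dict.

    `node` is either a str (file contents) or a dict mapping names to
    child nodes (a directory). Returns (hash, children), where children
    is None for a file.
    """
    if isinstance(node, str):
        return hashlib.sha256(b"file:" + node.encode()).hexdigest(), None
    children = {name: merkle(node[name]) for name in sorted(node)}
    h = hashlib.sha256(b"dir:")
    for name, (child_hash, _) in children.items():
        h.update(name.encode() + b"\0" + child_hash.encode())
    return h.hexdigest(), children


def changed_files(old, new, path=""):
    """Return paths of files in `new` that are absent from or differ in `old`."""
    if old is not None and old[0] == new[0]:
        return []  # identical subtree: existing embeddings can be reused
    _, new_children = new
    if new_children is None:
        return [path]  # a new or modified file that needs re-chunking
    old_children = (old[1] or {}) if old is not None else {}
    out = []
    for name, child in new_children.items():
        child_path = f"{path}/{name}" if path else name
        out.extend(changed_files(old_children.get(name), child, child_path))
    return out


# Hypothetical codebases: a teammate's indexed copy and a local checkout.
teammate = {"README.md": "hi", "src": {"a.py": "print(1)", "b.py": "print(2)"}}
mine = {"README.md": "hi", "src": {"a.py": "print(1)", "b.py": "print(3)", "c.py": "new"}}

to_reindex = changed_files(merkle(teammate), merkle(mine))
print(to_reindex)
```

Only `src/b.py` and `src/c.py` need new embeddings here; everything under a matching hash is skipped without being read again.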

Rogut also addresses Claude Code’s well-documented decision to use grep instead of vector search, framing embeddings as “cached compute” and grep as “per-session discovery”: embeddings pay off when the indexing cost amortizes across many queries, and grep can be the better fit when it does not. The session closes by defining what agentic search actually means in practice: giving agents a set of retrieval tools (vector, full-text, and file system) to iteratively find and reason over context until a task is complete.
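The amortization argument reduces to a break-even calculation. The sketch below assumes a deliberately simple cost model (a one-time indexing cost plus a fixed per-query cost for each approach); the function name and all numbers are hypothetical and do not come from the talk.

```python
import math


def break_even_queries(index_cost, scan_cost_per_query, indexed_cost_per_query):
    """Smallest query count at which indexing is no more expensive than scanning.

    Indexed total:  index_cost + n * indexed_cost_per_query
    Scanning total: n * scan_cost_per_query
    Returns None when an indexed query is not cheaper than a scan,
    because the up-front cost is then never recovered.
    """
    saving_per_query = scan_cost_per_query - indexed_cost_per_query
    if saving_per_query <= 0:
        return None
    return math.ceil(index_cost / saving_per_query)


# Hypothetical costs in arbitrary units (for example, seconds of compute).
n = break_even_queries(index_cost=120.0, scan_cost_per_query=2.0, indexed_cost_per_query=0.5)
print(n)
```

With these made-up numbers the index pays for itself after 80 queries. A short-lived session that issues a handful of searches never gets there, which is the case where per-session discovery with grep makes sense.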

📺 Source: AI Engineer · Published June 09, 2026
🏷️ Format: Deep Dive
