Descriptions:
Kuba Rogut of Turbopuffer presents original benchmark results comparing three code retrieval strategies for Claude Code: the default agentic search (grep-based file exploration), windowed grep, and semantic search powered by Turbopuffer’s serverless vector database with Voyage Code embeddings.
The headline finding: Claude Code’s default approach achieves 65% file precision — roughly one in three files it reads is irrelevant to the task. Adding windowed grep improves this to around 80%, and combining windowed grep with semantic search pushes file precision to 87%. The benchmark spans 50 tasks drawn from a Claude Code evaluation suite and measures file-level precision, file-level recall, and line-level recall across all three conditions. Rogut introduces TurboGrep, an open-source CLI that chunks a codebase using a tree-splitter library, embeds it with the Voyage Code model, and indexes it into Turbopuffer for fast semantic retrieval at query time.
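To make the file-level metrics concrete, here is a minimal sketch of how file precision and file recall could be computed for a single task. The description does not spell out the benchmark's exact definitions, so this assumes the standard set-based ones: precision is the share of files the agent read that were relevant, and recall is the share of relevant files the agent read. The function name and the file paths are hypothetical and are not taken from the benchmark.

```python
def file_precision_recall(files_read, relevant_files):
    """Return (precision, recall) at file granularity for one task.

    Assumed definitions (not confirmed by the source):
      precision = |read AND relevant| / |read|
      recall    = |read AND relevant| / |relevant|
    """
    read = set(files_read)
    relevant = set(relevant_files)
    hits = read & relevant
    precision = len(hits) / len(read) if read else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical task: the agent opened 20 files, 13 of which were relevant,
# and it never opened one further relevant file.
files_read = [f"src/file_{i}.py" for i in range(20)]
relevant_files = [f"src/file_{i}.py" for i in range(13)] + ["src/missed.py"]

precision, recall = file_precision_recall(files_read, relevant_files)
print(f"file precision: {precision:.2f}, file recall: {recall:.2f}")
```

In this made-up task, 13 of the 20 files read are relevant, which gives the same 65% file precision as the headline figure: about one in three reads goes to a file the task did not need.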
The talk frames the infrastructure investment with Cursor’s published research, which found that semantic search yields a 24% relative improvement in answer accuracy and a 2.6% increase in code retention in large codebases. It also lays out the “cached compute” argument: the upfront embedding cost amortizes across every agent session that runs against the same codebase, so token savings compound at scale. For teams building on Claude Code, evaluating vector database options, or designing RAG pipelines for code-heavy workloads, this is a rigorous, data-backed starting point with reproducible tooling.
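The “cached compute” argument comes down to simple break-even arithmetic, sketched below. Every number here is a hypothetical placeholder rather than a figure from the talk: a one-time indexing cost is paid up front, and each agent session is assumed to save a fixed amount through fewer irrelevant file reads.

```python
import math


def break_even_sessions(index_cost, saving_per_session):
    """Number of sessions needed before the one-time index cost is recovered."""
    if saving_per_session <= 0:
        raise ValueError("saving_per_session must be positive")
    return math.ceil(index_cost / saving_per_session)


def net_savings(sessions, index_cost, saving_per_session):
    """Cumulative savings after a given number of sessions, minus the index cost."""
    return sessions * saving_per_session - index_cost


# Hypothetical figures, in dollars: $10 to embed and index the codebase once,
# and $0.25 saved per agent session.
INDEX_COST = 10.0
SAVING_PER_SESSION = 0.25

sessions_needed = break_even_sessions(INDEX_COST, SAVING_PER_SESSION)
savings_at_100 = net_savings(100, INDEX_COST, SAVING_PER_SESSION)
print(sessions_needed, savings_at_100)
```

Under these assumed figures the index pays for itself after 40 sessions, and each session after that is net savings. A real estimate would also need to account for re-indexing as the codebase changes.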
📺 Source: AI Engineer · Published June 03, 2026
🏷️ Format: Benchmark Test