AI Dev 26 x SF | Jerry Liu: My Agent Can’t Read a PDF?

Foundation Models · 2 months ago

Description:

Jerry Liu, co-founder and CEO of LlamaIndex, delivers a conference talk at AI Dev 26 in San Francisco explaining why document parsing remains one of the most underestimated bottlenecks in production agentic AI systems. Citing more than one billion pages processed and 300,000 users on the LlamaParse platform, Liu argues that most agent failures trace back not to reasoning capability but to low-quality document context: garbage in, garbage out at enterprise scale.

Liu breaks down why 20 years of OCR progress still leaves major gaps for AI workflows: complex tables, multi-column layouts, embedded charts, and fine-grained financial data confuse even frontier vision-language models (VLMs) like Claude Opus 4.7. He notes that naively screenshotting pages and feeding them into a VLM works for interactive assistants where users absorb token costs, but becomes economically unworkable when processing millions of documents. LlamaParse’s approach combines specialized layout detection with bounding box grounding to achieve accuracy levels that general-purpose VLMs cannot match at comparable cost.

Beyond raw extraction accuracy, Liu emphasizes citation infrastructure as a critical design requirement for enterprise agent workflows: financial analysts, legal teams, and insurance processors need to trace an agent’s conclusion back to a specific region in the source document. This grounding capability, knowing not just what the text says but exactly where on the page it appears, does not come out of the box with standard VLM API calls and requires dedicated layout modeling to implement reliably.
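
To make the grounding idea concrete, here is a minimal Python sketch of what a citation record might look like when each extracted span carries its page number and bounding box. Everything in it is an assumption for illustration: the names `BoundingBox`, `GroundedSpan`, `find_citations`, and `format_citation`, the coordinate convention, and the sample data are hypothetical and do not reflect LlamaParse's actual output format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Rectangle on a page; coordinates assumed to start at the top-left corner."""
    x0: float
    y0: float
    x1: float
    y1: float


@dataclass(frozen=True)
class GroundedSpan:
    """A piece of extracted text plus where it sits in the source document."""
    text: str
    page: int
    bbox: BoundingBox


def find_citations(spans, quote):
    """Return every span whose text contains the quote (case-insensitive)."""
    needle = quote.lower()
    return [s for s in spans if needle in s.text.lower()]


def format_citation(span):
    """Render a human-readable pointer back to the page region."""
    b = span.bbox
    return f"page {span.page}, region ({b.x0:g}, {b.y0:g})-({b.x1:g}, {b.y1:g})"


# Hypothetical parser output for a two-page financial report (made-up values).
spans = [
    GroundedSpan("Total revenue: $4.2M", page=1, bbox=BoundingBox(72, 140, 310, 156)),
    GroundedSpan("Operating margin: 18%", page=2, bbox=BoundingBox(72, 220, 298, 236)),
]

for hit in find_citations(spans, "operating margin"):
    print(hit.text, "->", format_citation(hit))
```

With a record like this, an agent's answer can carry the page and region alongside the quoted value, so a reviewer can open the source document and check the exact spot. A plain text-only extraction has nothing equivalent to point at.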

📺 Source: DeepLearningAI · Published May 22, 2026
🏷️ Format: Deep Dive

Tags

Claude Code, Claude Opus 4.6, Claude Opus 4.7, Gemini 3 Pro, GPT-55, LlamaIndex, Simon Willison
