Why do AI models hallucinate?


Descriptions:

Jordan, a researcher at Anthropic, explains why AI models like Claude sometimes hallucinate — confidently stating false information — and what Anthropic is doing to reduce it. The video starts with a live demonstration: asking Claude about research papers by Jared Kaplan produces plausible-sounding but entirely fabricated titles, illustrating how hallucinations can be indistinguishable from accurate responses.

The explanation of the root cause centers on how large language models learn: by pattern-matching across enormous amounts of text to predict likely next tokens. When asked about obscure topics, niche researchers, or recent events for which training data is sparse, the model fills the gaps with plausible-sounding completions rather than admitting uncertainty. Anthropic addresses this by training Claude to recognize and express uncertainty, running thousands of adversarial test questions designed to surface false-confidence errors, and tracking metrics such as citation fabrication rate and appropriate hedging frequency across model versions.
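
To make the two metrics named above concrete, here is a minimal sketch of how they might be computed over a set of graded responses. The record schema, the field names, and the exact definitions are assumptions made for illustration; the description does not say how Anthropic actually defines or measures them.

```python
from dataclasses import dataclass


@dataclass
class EvalRecord:
    """One graded model response (hypothetical schema, for illustration only)."""
    citations_total: int       # citations the response produced
    citations_fabricated: int  # citations a reviewer could not verify
    answerable: bool           # whether the question had a known answer
    hedged: bool               # whether the response expressed uncertainty


def citation_fabrication_rate(records):
    """Fraction of all produced citations that could not be verified."""
    total = sum(r.citations_total for r in records)
    fabricated = sum(r.citations_fabricated for r in records)
    return fabricated / total if total else 0.0


def appropriate_hedging_rate(records):
    """Assumed definition: share of unanswerable questions where the model hedged."""
    unanswerable = [r for r in records if not r.answerable]
    if not unanswerable:
        return 0.0
    return sum(r.hedged for r in unanswerable) / len(unanswerable)


# Made-up example data, not real evaluation results.
records = [
    EvalRecord(citations_total=4, citations_fabricated=1, answerable=True, hedged=False),
    EvalRecord(citations_total=0, citations_fabricated=0, answerable=False, hedged=True),
    EvalRecord(citations_total=2, citations_fabricated=2, answerable=False, hedged=False),
    EvalRecord(citations_total=2, citations_fabricated=0, answerable=True, hedged=False),
]

print(citation_fabrication_rate(records))  # 3 of 8 citations unverified
print(appropriate_hedging_rate(records))   # hedged on 1 of 2 unanswerable questions
```

Computing the same numbers for each model version would give the kind of version-over-version comparison the description mentions, under these assumed definitions.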

The video closes with practical user guidance: ask the model to cite sources and then verify them, start fresh chats to fact-check prior outputs, tell the model upfront that ‘I don’t know’ is an acceptable answer, and apply extra skepticism to specific numbers, dates, and names. Jordan notes that while Claude hallucinates significantly less than it did a year prior, the problem remains unsolved across the entire industry — and the rarity of errors today may paradoxically make them harder for users to catch.


📺 Source: Claude · Published April 15, 2026
🏷️ Format: Deep Dive
