ChatGPT and Claude Got Smarter. Not More Honest.


Description:

Dylan Davis, who runs an AI consultancy and uses Claude, ChatGPT, and Gemini daily across client engagements, makes the case that newer, smarter models have quietly become worse at admitting uncertainty — a phenomenon he calls the ‘honesty gap.’ He cites an OpenAI research paper on the topic and pairs it with the concept of automation bias: as AI sounds more confident, users check its outputs less, compounding errors over time.

The bulk of the video is a practical walkthrough of three prompt rules designed to close that gap. The first forces blank-field outputs with required explanations when the model is uncertain, grounding responses strictly to source documents. The second recalibrates the model’s incentive structure by explicitly stating that a wrong answer carries three times the cost of a blank answer. The third adds a self-audit step requiring the model to flag its own assumptions and inferences separately from extracted facts.
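The three rules can be sketched as a reusable prompt preamble. This is a minimal illustration of the pattern described above, not Davis's exact wording; the rule phrasing and the `build_prompt` helper are assumptions for demonstration:

```python
# Hypothetical sketch of the three prompt rules as a reusable preamble.
# The wording is illustrative, not Davis's actual prompts.

HONESTY_RULES = """Rules for this extraction task:
1. Answer ONLY from the source document. If a field cannot be supported
   by the text, leave it blank and explain why in a 'reason' note.
2. A wrong answer costs three times as much as a blank answer. When in
   doubt, prefer blank.
3. After extracting, list separately: (a) facts taken from the source,
   (b) assumptions or inferences you made.
"""

def build_prompt(task: str, source_text: str) -> str:
    """Prepend the honesty rules to any extraction task."""
    return f"{HONESTY_RULES}\nTask: {task}\n\nSource document:\n{source_text}"
```

Because the rules travel as a plain text prefix, the same preamble can be pasted into Claude, ChatGPT, or Gemini without modification.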

Davis demonstrates each rule against a real contract-extraction scenario, showing side-by-side outputs with and without the prompts. The before-after comparison reveals that without the rules, AI fills in ambiguous fields like payment terms (where two conflicting clauses exist) by silently picking one; with the rules, it surfaces the conflict and leaves the decision to the user. The prompts themselves are short — a few lines each — making them immediately portable to any Claude, ChatGPT, or Gemini workflow.
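The output shape the rules encourage can be sketched as a simple structure: uncertain fields stay blank with an attached reason, and assumptions are listed apart from extracted facts. The field names and clause numbers below are hypothetical placeholders, not taken from Davis's demo:

```python
# Hypothetical example of a rule-compliant extraction result:
# the conflicting payment-terms field is left blank rather than
# silently resolved, and assumptions are flagged separately.
extraction = {
    "fields": {
        "party_a": "Acme Corp",        # placeholder value
        "payment_terms": None,          # left blank instead of guessing
    },
    "reasons": {
        "payment_terms": "Clause 4.2 says net-30 but clause 9.1 says "
                         "net-60; conflict left for human review.",
    },
    "assumptions": [
        "'Acme Corp' and 'Acme Corporation' treated as the same party.",
    ],
}

def needs_review(result: dict) -> list[str]:
    """Return the field names the model declined to fill."""
    return [name for name, value in result["fields"].items() if value is None]
```

A downstream workflow can then route only the `needs_review` fields to a human, which is where the time savings over checking every field come from.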


📺 Source: Dylan Davis · Published March 28, 2026
🏷️ Format: Tutorial Demo
