Description:
Nate B Jones addresses CTOs and engineering leads building production systems on Claude Code and the Anthropic API, arguing that the foundational failure in most AI projects is not model quality or architecture choice but the inability to define what ‘correct’ means before picking tools. The video’s central thesis: correctness is upstream of everything, and teams that skip this step build against a shifting target, changing definitions mid-stream and blaming hallucinations on the model rather than on the reward signals they themselves designed.
The technical argument draws on Goodhart’s Law as it applies to reinforcement learning: proxy metrics for correctness become targets the model learns to satisfy, and they often diverge from the actual intent. Gemini 3’s single-turn optimization at the expense of multi-turn coherence is presented as a concrete fingerprint of this dynamic: the behavior reflects the training signal the model received, not a failure of intelligence. Jones references OpenAI’s published guidance on evaluations and an Anthropic paper on reward hacking to ground the argument in named, citable sources.
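The proxy-versus-intent gap is easy to see in a toy grader. The Python sketch below is purely illustrative and not from the video; the grader functions, keyword list, and refund scenario are hypothetical stand-ins for whatever correctness signal a team actually wires into its evals.

```python
# Illustrative only: a proxy "correctness" grader that an answer can satisfy
# while missing the user's actual intent. All names and data are hypothetical.

def proxy_grader(answer: str, required_keywords: list[str]) -> float:
    """Proxy metric: fraction of required keywords present in the answer."""
    hits = sum(1 for kw in required_keywords if kw.lower() in answer.lower())
    return hits / len(required_keywords)

def intent_grader(answer: str, must_state_refund_denied: bool) -> bool:
    """What the team actually cares about: does the answer reach the right
    policy outcome? A keyword proxy cannot see this."""
    says_denied = "not eligible" in answer.lower() or "cannot refund" in answer.lower()
    return says_denied == must_state_refund_denied

# A keyword-stuffed answer that games the proxy but gives the wrong outcome.
gamed = "Per our refund policy and warranty terms, your refund is approved."
keywords = ["refund", "policy", "warranty"]

print(proxy_grader(gamed, keywords))                          # 1.0 -- proxy says perfect
print(intent_grader(gamed, must_state_refund_denied=True))    # False -- intent violated
```

Optimizing against the proxy alone rewards the keyword-stuffed answer; that divergence between the measured score and the intended outcome is the Goodhart dynamic the video describes.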
The practical framework covers how to structure correctness definitions as multi-dimensional bundles (truthfulness, completeness, tone, policy compliance, cost, auditability) before making architecture decisions about RAG, agent count, or context engineering. For agentic systems that combine structured and unstructured data, Jones addresses the human-in-the-loop calibration problem: when to trust the system over the human operator, and how to run the organizational change-management conversations when those trust boundaries need to shift. The video closes with a note on hallucination causation, framing overconfident output as a systemic response to ambiguous correctness incentives rather than a model defect.
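As a concrete illustration of the bundle idea, here is a minimal Python sketch. The dimension names come from the description above; the thresholds, field types, and passes() gate are assumptions made for illustration, not the framework presented in the video.

```python
# Minimal sketch of a multi-dimensional correctness bundle, assuming a team
# wants per-dimension thresholds rather than one blended score.
# Dimension names follow the video; thresholds and structure are illustrative.
from dataclasses import dataclass

@dataclass
class CorrectnessBundle:
    truthfulness: float       # 0-1, e.g. from a fact-checking eval
    completeness: float       # 0-1, covers all required fields
    tone: float               # 0-1, matches brand / policy voice
    policy_compliance: bool   # hard gate, never averaged away
    cost_usd: float           # per-request spend
    auditable: bool           # trace and sources attached

THRESHOLDS = {"truthfulness": 0.9, "completeness": 0.8, "tone": 0.7}
MAX_COST_USD = 0.05

def passes(bundle: CorrectnessBundle) -> tuple[bool, list[str]]:
    """Evaluate every dimension separately and report which ones failed."""
    failures = []
    for name, threshold in THRESHOLDS.items():
        if getattr(bundle, name) < threshold:
            failures.append(name)
    if not bundle.policy_compliance:
        failures.append("policy_compliance")
    if bundle.cost_usd > MAX_COST_USD:
        failures.append("cost_usd")
    if not bundle.auditable:
        failures.append("auditable")
    return (not failures, failures)
```

Keeping policy compliance and auditability as hard gates rather than folding them into an average is the design point: a blended score is itself a proxy metric that can hide exactly the failures the bundle is meant to surface.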
📺 Source: AI News & Strategy Daily | Nate B Jones · Published December 16, 2025
🏷️ Format: Deep Dive
