Description:
Aparna Dhinakaran from Arize AI presents a framework called “prompt learning” — a lightweight alternative to reinforcement learning for improving AI coding agents. Drawing on Andrej Karpathy’s concept of system prompt learning, she argues that iteratively refining system prompts with LLM-as-judge feedback is more practical for most teams than full RL training, which demands large datasets and dedicated data science resources.
The talk demonstrates the technique applied to two coding agents: Cline (running Claude Sonnet 4.5) and Claude Code. Starting from vanilla SWE-bench baselines (30% GitHub issue resolution for Cline, 40% for Claude Code), the team ran a multi-step pipeline: execute the agent on coding tasks, evaluate outputs with an LLM judge that generates natural-language explanations of failures, then route those explanations through a meta-prompt to produce updated rules appended to the agent's CLAUDE.md or Cline rules file. No fine-tuning required.
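For readers who want the mechanics, here is a minimal Python sketch of one iteration of that pipeline. The function names, prompt templates, and file paths are illustrative assumptions rather than code from the talk; `run_agent` and `call_llm` stand in for whatever agent harness and LLM client a team already uses.

```python
# Minimal sketch of one prompt-learning iteration, assuming a generic
# agent harness and LLM client. Names, prompts, and paths below are
# illustrative assumptions, not the implementation shown in the talk.
from pathlib import Path

RULES_FILE = Path("CLAUDE.md")  # or the Cline rules file

JUDGE_PROMPT = """You are judging a coding agent's attempt at a GitHub issue.
Issue: {issue}
Agent output: {output}
Did the agent resolve the issue? If not, explain the failure in plain English."""

META_PROMPT = """An LLM judge produced these failure explanations:
{explanations}

Current rules in the agent's rules file:
{rules}

Write new or revised rules that would prevent these failures."""


def run_agent(issue: str) -> str:
    """Run the coding agent (e.g. Cline or Claude Code) on one task."""
    raise NotImplementedError  # wire up your agent harness here


def call_llm(prompt: str) -> str:
    """Call any chat-completion model and return its text response."""
    raise NotImplementedError  # wire up your LLM client here


def prompt_learning_step(issues: list[str]) -> None:
    rules = RULES_FILE.read_text() if RULES_FILE.exists() else ""
    explanations = []
    for issue in issues:
        output = run_agent(issue)
        # LLM-as-judge: rich natural-language feedback, not a scalar reward.
        explanations.append(call_llm(JUDGE_PROMPT.format(issue=issue, output=output)))
    # The meta-prompt turns those explanations into updated rules...
    new_rules = call_llm(
        META_PROMPT.format(explanations="\n\n".join(explanations), rules=rules)
    )
    # ...which are appended to the rules file. No gradient updates involved.
    RULES_FILE.write_text(rules.rstrip() + "\n\n" + new_rules + "\n")
```

Each pass appends judge-derived rules to the rules file, so lessons accumulate in the agent's system prompt across iterations without any model training.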
Dhinakaran contrasts this with RL's scalar reward signal, which she likens to receiving an exam grade with no written feedback, and argues that rich English-language explanations make prompt learning far more sample-efficient. The talk also references the viral leak of Claude's full system prompt to underscore how seriously frontier labs treat prompt engineering as a competitive differentiator, and why teams building on top of these models should too.
📺 Source: AI Engineer · Published December 23, 2025
🏷️ Format: Deep Dive
