Description:
Ben AI demonstrates how to adapt Andrej Karpathy's Auto Research framework, originally developed for optimizing machine learning pipelines, for use with Claude Code and Claude Cowork to build self-improving AI agents for real content and workflow tasks. Use cases include LinkedIn post writing skills, newsletter subject line optimization, landing page copy, and CLAUDE.md knowledge routing.
The framework operates as an autonomous loop: a main orchestrator agent proposes a hypothesis for improvement, a sub-agent runs a blind test using the updated skill or prompt, and a separate evaluation layer scores the result. Evaluation can be deterministic (a Python script checking a binary condition) or handled by an LLM judge sub-agent when the criterion is too nuanced for code. If the change improves the baseline, it is kept; otherwise it is discarded. The loop continues until a target score is reached or a maximum iteration count is hit.
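A minimal sketch of that loop, written in Python for illustration only: the `propose`, `blind_test`, and `score` callables stand in for the orchestrator agent, the blind-testing sub-agent, and the evaluation layer described above, and their names and signatures are assumptions rather than the framework's actual interfaces.

```python
from typing import Callable

def optimize(
    skill: str,
    propose: Callable[[str, float], str],    # orchestrator: hypothesis -> edited skill/prompt
    blind_test: Callable[[str], list[str]],  # sub-agent: run the candidate skill blind on held-out tasks
    score: Callable[[list[str]], float],     # evaluator: deterministic script or LLM judge
    target: float = 1.0,
    max_iters: int = 10,
) -> tuple[str, float]:
    """Keep a proposed change only if it beats the current baseline score."""
    best = score(blind_test(skill))
    for _ in range(max_iters):
        if best >= target:
            break                                   # target score reached
        candidate = propose(skill, best)            # hypothesis for improvement
        candidate_score = score(blind_test(candidate))
        if candidate_score > best:                  # improvement: keep the change
            skill, best = candidate, candidate_score
        # otherwise the candidate is discarded and a new hypothesis is tried
    return skill, best
```

The structure mirrors the description: the loop terminates either when the target score is reached or when the maximum iteration count is exhausted.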
Concrete results shown include a LinkedIn writing skill improving from 80% to 100% compliance across two custom criteria in five autonomous iterations, a more complex multi-criteria optimization that improved on its 68% baseline by 27% over ten iterations, and a CLAUDE.md routing optimization achieving a 9.9% gain in five iterations. The video emphasizes that optimization quality is bounded by criterion precision: criteria must produce a true/false result, specifying exact conditions such as character counts or named formats rather than vague quality goals. A hedged example of what such deterministic, binary criteria could look like follows below.
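The sketch below shows deterministic, code-checkable criteria of the kind the video calls for. The specific checks (hook length, closing question) and thresholds are illustrative assumptions, not the criteria used in the video; the point is that each one returns an unambiguous True/False.

```python
def hook_under_200_chars(post: str) -> bool:
    """Pass if the first line (the hook) is at most 200 characters. (Assumed threshold.)"""
    lines = post.splitlines()
    return bool(lines) and len(lines[0]) <= 200

def ends_with_question(post: str) -> bool:
    """Pass if the post closes with a question, a named format that code can verify."""
    return post.rstrip().endswith("?")

def compliance(post: str) -> float:
    """Fraction of binary criteria the post satisfies: the score the loop optimizes."""
    checks = [hook_under_200_chars(post), ends_with_question(post)]
    return sum(checks) / len(checks)
```

Criteria that an LLM judge sub-agent handles instead of a script would follow the same contract: a nuanced question phrased so the judge can still answer it with a single pass/fail.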
📺 Source: Ben AI · Published April 07, 2026
🏷️ Format: Workflow Case Study
