AutoResearch Clearly Explained (and how to use it)

Descriptions:

David Ondrej delivers a detailed technical walkthrough of AutoResearch, an open-source project by Andrej Karpathy (OpenAI co-founder and former director of AI at Tesla, where he led the Autopilot vision team) that lets AI agents run experiments autonomously and improve their own results iteratively. The core concept: give an agent a single modifiable file, a fixed time budget, and an immutable evaluation function, then let it run hundreds of experiments overnight in an automated loop, committing each improvement and discarding each failure via `git reset`.
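The keep-or-discard loop described above can be sketched in a few lines. This is a minimal illustration and not Karpathy's actual code: the names `research_loop`, `propose`, and `evaluate` are hypothetical, the candidate is an arbitrary Python value rather than a file on disk, and it assumes a lower score is better. In a real setup, the "keep" branch would correspond to a git commit and the "discard" branch to a `git reset`.

```python
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


def research_loop(
    baseline: T,
    propose: Callable[[T, int], T],   # hypothetical: the agent's edit step
    evaluate: Callable[[T], float],   # hypothetical: the locked evaluation
    n_experiments: int,
) -> Tuple[T, float, List[bool]]:
    """Greedy keep-or-revert loop. Assumes a lower score is better."""
    best = baseline
    best_score = evaluate(baseline)
    kept: List[bool] = []
    for i in range(n_experiments):
        candidate = propose(best, i)
        score = evaluate(candidate)
        if score < best_score:
            # Improvement: keep it (the analogue of committing the change).
            best, best_score = candidate, score
            kept.append(True)
        else:
            # No improvement: drop it (the analogue of `git reset`).
            kept.append(False)
    return best, best_score, kept


if __name__ == "__main__":
    # Toy stand-ins: the "file" is a number, the metric is distance from 3.
    best, score, kept = research_loop(
        baseline=0,
        propose=lambda x, i: x + (1 if i % 2 == 0 else -1),
        evaluate=lambda x: float((x - 3) ** 2),
        n_experiments=6,
    )
    print(best, score, kept)
```

Because every candidate is built from the current best, a failed experiment never contaminates later ones, which is the property the commit-or-reset discipline provides.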

The video explains AutoResearch’s three-file architecture: `program.md` (human-defined goals and constraints), `train.py` (the sole file the agent may modify), and `prepare.py` (the locked evaluation function that prevents the agent from gaming its own score). The fixed time budget ensures experiments are directly comparable — the agent cannot “cheat” by simply training longer. Karpathy originally applied this to optimizing a GPT-2 training script; the video argues all major frontier labs will eventually run some version of this loop.
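One way to picture the fixed time budget is a training loop that stops at a wall-clock deadline rather than after a set number of steps. The sketch below is an assumption about how such a budget could be enforced, not a description of the project's real `train.py`; the function name `train_with_budget` and its parameters are hypothetical. The clock is injectable so the behaviour can be checked without waiting.

```python
import time
from typing import Callable


def train_with_budget(
    step: Callable[[], None],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Run `step` repeatedly until the wall-clock budget is spent.

    Returns the number of completed steps. Every experiment gets the same
    budget, so a candidate can only win by using its time better, not by
    running longer.
    """
    deadline = clock() + budget_seconds
    steps = 0
    while clock() < deadline:
        step()
        steps += 1
    return steps


if __name__ == "__main__":
    # A stand-in "training step" that only sleeps briefly.
    done = train_with_budget(lambda: time.sleep(0.01), budget_seconds=0.1)
    print(f"completed {done} steps within the budget")
```

Under this scheme a faster or more efficient training script completes more useful work inside the same window, and the locked evaluation then scores the result.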

Beyond model training, Ondrej surveys practical applications: marketing teams running 36,000 A/B experiments per year instead of 30, codebase performance optimization, fine-tuning open-source models for on-device inference, and automated prompt engineering for agent system prompts. The video closes with the three necessary conditions for a successful AutoResearch loop — a single quantifiable metric, a fully automated evaluation with no human in the loop, and exactly one changeable file — along with a beginner-friendly guide to building a first loop from scratch.


📺 Source: David Ondrej · Published March 27, 2026
🏷️ Format: Deep Dive
