Stop Fixing Your Claude Skills. Autoresearch Does It For You

Description:

Nick Saraev demonstrates how to apply Andrej Karpathy's recently released Auto Research GitHub repository to create self-improving Claude Code skills. The core problem: Claude Code skills (reusable prompt modules) produce inconsistent output, with Saraev estimating a roughly 30% failure rate for his own skills. Rather than manually debugging prompts, the auto-research approach runs a skill repeatedly against a standardized evaluation set and lets an agent iteratively refine the prompt until measured performance improves.

The method maps directly onto Karpathy's nanoGPT auto-research structure: the `train.py` file corresponds to the skill's markdown file, and `program.md` becomes the agent's instruction prompt. The critical ingredient is an objective metric: a set of binary yes/no evaluation questions that removes subjective judgment. Saraev walks through a concrete example, improving a diagram-generator skill against four criteria: text legibility, adherence to a pastel color palette, a linear left-to-right layout, and absence of numbered ordering.
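The loop described above can be sketched in a few lines. This is a minimal illustration, not code from the auto-research repository: the helper names (`generate`, `refine`), the output fields, and the fixed round count are all assumptions made for the example.

```python
# Hypothetical sketch of the evaluate-and-refine loop. `generate` stands in
# for running the skill on one test case; `refine` stands in for the agent
# proposing a revised prompt. Neither name comes from the actual repo.

def evaluate(output: dict) -> float:
    """Score one diagram output against the four binary yes/no criteria."""
    checks = [
        output["text_legible"],               # is all text readable?
        output["palette"] == "pastel",        # pastel color palette?
        output["layout"] == "left-to-right",  # linear layout?
        not output["numbered"],               # no numbered ordering?
    ]
    return sum(checks) / len(checks)

def mean_score(prompt, test_cases, generate):
    """Average score of a prompt across the standardized evaluation set."""
    return sum(evaluate(generate(prompt, c)) for c in test_cases) / len(test_cases)

def improve_skill(prompt, test_cases, generate, refine, rounds=5):
    """Keep a revised prompt only when its measured score improves."""
    best_prompt = prompt
    best_score = mean_score(prompt, test_cases, generate)
    for _ in range(rounds):
        candidate = refine(best_prompt)
        score = mean_score(candidate, test_cases, generate)
        if score > best_score:
            best_prompt, best_score = candidate, score
    return best_prompt
```

The binary criteria are what make the loop work: because each check is a hard yes/no, the agent's revisions are judged by an objective number rather than by how good the output "feels."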

The same methodology applied to website optimization reduced page load time from 1,100 milliseconds to 67 milliseconds across 67 test iterations, illustrating that the technique extends well beyond prompt tuning to any repeatable process with a measurable output. The full workflow runs inside an IDE called Anti-Gravity. Saraev also notes that accumulated research logs from these runs become a durable asset — transferable to future, more capable models like GPT-6 or Claude Opus 5 to continue where previous iterations left off.


📺 Source: Nick Saraev · Published March 13, 2026
🏷️ Format: Tutorial Demo
