SkillOpt: Microsoft’s New Way to ‘Train’ AI Agents: Run Locally

Coding & Dev Tools2 weeks ago

SkillOpt: Microsoft’s New Way to ‘Train’ AI Agents: Run Locally

Descriptions:

Microsoft Research’s SkillOpt takes a different approach to improving AI agent performance: instead of fine-tuning model weights, it trains a “skill document” — a plain markdown file — using the same optimization loop as neural network training, complete with epochs, batch sizes, learning rates, and validation gates. The model itself never changes; only the instructions it receives evolve. Fahd Mirza walks through what this looks like in practice on a local Ubuntu system with an NVIDIA RTX A6000 (48GB VRAM).

The SkillOpt loop works in four steps: the target model runs a batch of tasks using the current skill document as context (rollout), an optimizer model analyzes failures and proposes patches (the backward pass), patches are aggregated and filtered down to a token budget (analogous to learning rate), and a held-out validation gate accepts or rejects the updated skill. Additional mechanisms include cross-epoch momentum updates to prevent forgetting and a meta-skill that helps the optimizer learn which edit types tend to generalize.

Mirza serves Qwen 3.5 4B locally via vLLM, uses ALFWorld (a text-based household task simulation benchmark) as the test environment, and runs a single-epoch training loop with a batch of four tasks to show a complete cycle. He also documents a disk-space error mid-run and resolves it live. SkillOpt is compatible with any OpenAI-compatible API — including Anthropic, OpenAI, and Azure — making it viable as a lightweight alternative to full fine-tuning for teams that need fast, cost-efficient skill improvement without touching model weights.

📺 Source: Fahd Mirza · Published June 22, 2026
🏷️ Format: Hands On Build

1 Item

Channels

No Image Available

Fahd Mirza

Tags

Anthropic Azure OpenAI VLLM

Prev

Ponytail + OpenClaw + Ollama: 20K Tokens to 2K Tokens – Don’t Overbuild

Next

How to Generate 2+ Minute AI Videos: JoyAI-Echo Complete Guide|Lossless vs. Lite ComfyUI Workflow:

18 Related Posts

Related Posts

09:39

Coding & Dev Tools

DeepSeek DFlash on Gemma 12B Locally: Up To 5x Faster

21 hours ago

15:45

Coding & Dev Tools

Every AI Agent Demo Stops at Email. I Pointed Mine at the Bills That Cost You Money.

21 hours ago

24:28

Coding & Dev Tools

Fable 5 is WILD…

2 days ago

08:08

Coding & Dev Tools

I Embedded Whisper.cpp Into a Real App

2 days ago

21:09

Coding & Dev Tools

I Built a Real AI Jarvis That Controls My Computer

3 days ago

13:29

Coding & Dev Tools

Control What Your AI Agents Can Do: Archestra + Ollama Hands-On

4 days ago