Ornith 1.0 9B: Self-Improving Model for Agentic Coding – Run Locally

Coding & Dev Tools1 week ago

Ornith 1.0 9B: Self-Improving Model for Agentic Coding – Run Locally

Descriptions:

Fahd Mirza walks through a complete installation and evaluation of Ornith 1.0 9B, a newly released open-source model family built specifically for agentic coding tasks. Running on an Nvidia H100 with 80 GB VRAM, Mirza serves the model in full precision using vLLM and puts it through two distinct real-world tests, offering a practical look at what the 9-billion-parameter variant can actually do.

The first test involves a live World Cup 2026 tracker application with a silent tiebreaker logic bug — Ornith autonomously reads the codebase via the Hermes agent framework, makes 22 tool calls over about three minutes, and correctly fixes the goal-difference sorting error without human guidance. The second test is a one-shot code generation challenge: building an interactive spit-grill simulation with animated rotating chickens that transition from raw to golden to burnt. The video also covers Ornith’s benchmark profile, which shows the 9B model outperforming Gemma 4 31B on most coding evals including SWE-bench and Terminal-bench, while falling slightly behind on Claude Eval.

Mirza explains the model’s underlying training approach — Group Policy Optimization (GPO) — in which the model writes its own step-by-step plan before attempting a solution, runs multiple attempts, collects a reward signal, and refines both the plan and the output together. The full Ornith lineup (9B, 35B, and 397B) is MIT-licensed and available in GGUF format, making local deployment accessible on commodity hardware.

📺 Source: Fahd Mirza · Published June 25, 2026
🏷️ Format: Hands On Build

1 Item

Channels

No Image Available

Fahd Mirza

Tags

Gemma 4 31B Hermes MCP SWE-bench VLLM

Prev

Post-AGI Equilibria: Are There Any Good Ones?

Next

Why Toilets and MSG Are Winning the AI Boom | Bloomberg Tech: Asia 6/26/2026

18 Related Posts

Related Posts

09:39

Coding & Dev Tools

DeepSeek DFlash on Gemma 12B Locally: Up To 5x Faster

23 hours ago

15:45

Coding & Dev Tools

Every AI Agent Demo Stops at Email. I Pointed Mine at the Bills That Cost You Money.

23 hours ago

24:28

Coding & Dev Tools

Fable 5 is WILD…

2 days ago

08:08

Coding & Dev Tools

I Embedded Whisper.cpp Into a Real App

2 days ago

21:09

Coding & Dev Tools

I Built a Real AI Jarvis That Controls My Computer

3 days ago

13:29

Coding & Dev Tools

Control What Your AI Agents Can Do: Archestra + Ollama Hands-On

4 days ago