Descriptions:
Fahd Mirza walks through a complete installation and evaluation of Ornith 1.0 9B, a newly released open-source model family built specifically for agentic coding tasks. Running on an Nvidia H100 with 80 GB VRAM, Mirza serves the model in full precision using vLLM and puts it through two distinct real-world tests, offering a practical look at what the 9-billion-parameter variant can actually do.
The first test involves a live World Cup 2026 tracker application with a silent tiebreaker logic bug. Ornith autonomously reads the codebase via the Hermes agent framework, makes 22 tool calls over about three minutes, and correctly fixes the goal-difference sorting error without human guidance. The second test is a one-shot code generation challenge: building an interactive spit-grill simulation with animated rotating chickens that transition from raw to golden to burnt. The video also covers Ornith’s benchmark profile, which shows the 9B model outperforming Gemma 4 31B on most coding evals including SWE-bench and Terminal-bench, while falling slightly behind on Claude Eval.
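The description does not show the tracker's code or the exact fix, so the following is only a minimal sketch of what a goal-difference tiebreak in a group table can look like. The `TeamRecord` class, the `rank_group` function, the sample teams, and the simplified ordering (points, then goal difference, then goals scored) are all assumptions made for illustration, not the application's actual logic or the official tournament regulations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TeamRecord:
    """Hypothetical standings row; field names are illustrative only."""
    name: str
    points: int
    goals_for: int
    goals_against: int

    @property
    def goal_difference(self) -> int:
        return self.goals_for - self.goals_against


def rank_group(teams):
    """Sort by points, then goal difference, then goals scored (all descending).

    The team name is used as a final, stable fallback so the order is
    deterministic when every other criterion is tied.
    """
    return sorted(
        teams,
        key=lambda t: (-t.points, -t.goal_difference, -t.goals_for, t.name),
    )


# Made-up sample data: Team A and Team B are level on points,
# so goal difference decides which one ranks higher.
group = [
    TeamRecord("Team A", points=6, goals_for=4, goals_against=3),
    TeamRecord("Team B", points=6, goals_for=5, goals_against=1),
    TeamRecord("Team C", points=3, goals_for=2, goals_against=4),
]

ranking = [team.name for team in rank_group(group)]
print(ranking)
```

A bug of the "silent" kind described above could arise, for example, if the goal-difference term were sorted ascending or left out of the key: the table would still render without errors, but tied teams would appear in the wrong order.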
Mirza explains the model’s underlying training approach, Group Policy Optimization (GPO), in which the model writes its own step-by-step plan before attempting a solution, runs multiple attempts, collects a reward signal, and refines both the plan and the output together. The full Ornith lineup (9B, 35B, and 397B) is MIT-licensed and available in GGUF format, making local deployment accessible on commodity hardware.
📺 Source: Fahd Mirza · Published June 25, 2026
🏷️ Format: Hands On Build







