OmniCoder-9B Running Locally: I Tried to Break It With Real Engineering Tasks


Description:

Fahd Mirza puts OmniCoder-9B, a coding-focused model from Tesslr fine-tuned on the Qwen3.5 9B hybrid architecture, through a hands-on local evaluation on an Nvidia RTX 6000 with 48 GB of VRAM. The model claims a GPQA Diamond score of 83.8% and was trained on 425,000 curated agentic trajectories sourced from frontier-model outputs, giving it habits such as reading files before writing, responding to compiler diagnostics, and making minimal diffs rather than rewriting entire files.

Mirza serves the model using vLLM, configures recommended hyperparameters (temperature 0.6, top-K 20, top-P 0.95) through Open WebUI, and then prompts it with a challenging task: generate a self-contained HTML file simulating a Kerbal Space Program-style rocket booster simulator. The resulting output includes physics simulation, canvas rendering, and game-state management — and actually runs in the browser. The model also supports a 262K-token context window and chain-of-thought reasoning via think tags, and is fully open-weight under Apache 2.0.
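As a rough sketch of the setup described above, the snippet below builds a chat request with the recommended sampling hyperparameters (temperature 0.6, top-k 20, top-p 0.95) for vLLM's OpenAI-compatible server. The model identifier, port, and endpoint path are assumptions for illustration; the video does not state them exactly.

```python
import json

# Hypothetical model identifier; the exact repo name is not given in the video.
MODEL = "Tesslr/OmniCoder-9B"

# Assumed local endpoint, e.g. after running: vllm serve <model> --port 8000
ENDPOINT = "http://localhost:8000/v1/chat/completions"

def build_request(prompt: str) -> dict:
    """Build an OpenAI-compatible chat payload with the recommended
    sampling hyperparameters from the video."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.6,
        "top_p": 0.95,
        # top_k is a vLLM sampling extension, not part of the base OpenAI schema.
        "top_k": 20,
    }

payload = build_request(
    "Generate a self-contained HTML file simulating a rocket booster."
)
print(json.dumps(payload, indent=2))
```

Open WebUI exposes the same three knobs in its model settings, so the values above mirror what Mirza configures through the UI.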

The video is most valuable for its behavioral analysis angle: rather than just quoting leaderboard numbers, Mirza highlights the agentic coding habits baked into OmniCoder-9B through its training data, and explains why those behaviors matter more than raw benchmark scores for real automated coding pipelines. Developers evaluating small open-weight coding models for local agent setups will find this a useful practical reference.


📺 Source: Fahd Mirza · Published March 14, 2026
🏷️ Format: Hands-On Build
