Descriptions:
Fahd Mirza takes Laguna M.1 — Poolside’s new 225-billion-parameter mixture-of-experts coding model with 23 billion parameters active per token — through two structured tests designed to probe real-world agentic coding performance. The model’s weights are publicly available on Hugging Face under a permissive license, though its scale requires a multi-GPU cluster; Mirza accesses it via API using the Hermes agent framework.
In the first test, Mirza points Laguna M.1 at a broken full-stack World Cup 2026 tracker application with a port communication failure between the backend and the frontend. The model reads hundreds of files, identifies the root cause, and produces working fix instructions; once they are applied, the app loads correctly. The second test asks the model to generate a procedurally animated tree simulation from scratch using only HTML canvas and physics-based growth, with no external libraries.
On benchmarks, Laguna M.1 outperforms Mistral’s Devstral 2 (a dense 123B model) across SWE-bench Verified, SWE-bench Multilingual, BenchPro, and TerminalBench, and edges out GLM 4.7 on two of those four. It falls short of DeepSeek V4 Flash and Qwen 3.5. Mirza notes the model’s reasoning trace shows some redundant file re-reading, suggesting chain-of-thought depth is an area for improvement in future versions. Overall, the video offers a candid early assessment of where a new open-weight coding model sits in a competitive field.
📺 Source: Fahd Mirza · Published June 18, 2026
🏷️ Format: Review







