13:29 Coding & Dev Tools 4 days ago Control What Your AI Agents Can Do: Archestra + Ollama Hands-On Fahd Mirza walks through Orchestra (stylized as Archestra), an open-source platform built for running AI agents safely in production,... 0 comments 1.3K views
08:09 Business & Strategy 6 days ago Someone Fine-Tuned a Model on 10 Examples in 3 Minutes: Qwable-5 27B Coder Fahd Mirza uses a deliberately provocative model release — Qwable-5 27B Coder — as the centerpiece of a broader argument about credib... 0 comments 2.6K views
14:42 Tutorials 2 weeks ago Qwen3.6 27B (Pi-Reasoning GGUF) – Fine-Tuned for Local Heavy AI Agent Fahd Mirza tests Pi-Reasoning, a community fine-tune of Qwen 3.6 27B built specifically for agentic coding — tasks like reading files... 0 comments 3.8K views
09:40 Benchmarks 3 weeks ago DFlash Just Got Faster: 4x Speed with 160 tok/s Locally Fahd Mirza benchmarks DFlash with SGLang's new SpecV2 overlapping scheduler on an NVIDIA H100 80GB GPU, demonstrating a 4.3x throughp... 0 comments 2K views
13:26 Research & Benchmarks 4 weeks ago Gemma4 12B vs Qwen3.6 27B — The Veteran vs The Newcomer Fahd Mirza runs a structured head-to-head comparison of Gemma 4 12B and Qwen 3.6 27B on the same Nvidia H100 80GB VRAM system, testin... 0 comments 4.3K views
32:57 Tutorials 1 month ago Unsloth Studio is insane… fine-tune any AI model locally Unsloth Studio is a free, open-source desktop application that brings LLM fine-tuning to consumer hardware — and this video by David... 0 comments 8.3K views
10:06 Coding & Dev Tools 1 month ago DFlash Leaves Qwen Territory – Gemma 4 31B Now Runs 5x Faster with Speculative Decoding Fahd Mirza demonstrates the first end-to-end deployment of Llama Box DFlash with Google's Gemma 4 31B model, following the merge of P... 0 comments 3.4K views
10:48 Tutorials 1 month ago LM Studio Just Got MTP — Qwen3.6-27B Runs 63% Faster with One Toggle Fahd Mirza demonstrates how to enable Multi-Token Prediction (MTP) speculative decoding in LM Studio's new beta release (version 0.4.... 0 comments 5.6K views
09:45 Tutorials 2 months ago Llama.cpp Just Got MTP – Qwen3.6 27B Runs 2x Faster Locally with Two Flags Multi-token prediction (MTP) support has officially merged into the mainline llama.cpp repository—not a fork or custom branch, but th... 0 comments 3K views
11:14 Coding & Dev Tools 2 months ago Qwen3.7 Has Arrived – And It's Already Beating GPT-5.2 & Grok-4.20 Alibaba's Qwen team quietly dropped two new preview models—Qwen3.7 Max and Qwen3.7 Plus—onto Qwen Chat, and Fahd Mirza wasted no time... 0 comments 6.4K views