10:06 | Coding & Dev Tools | 3 weeks ago | DFlash Leaves Qwen Territory – Gemma 4 31B Now Runs 5x Faster with Speculative Decoding | Fahd Mirza demonstrates the first end-to-end deployment of Llama Box DFlash with Google's Gemma 4 31B model, following the merge of P... | 0 comments | 3.4K views
08:43 | Tutorials | 1 month ago | DFlash Drafter for Gemma 4 26B – Official Speculative Decoding is Here: Run Locally | ZLab, the UC San Diego research team that invented DFlash speculative decoding, has released the first official drafter model paired... | 0 comments | 525 views
08:41 | Tutorials | 1 month ago | Gemma 4 31B at 196 tok/s with RedHat DFlash Speculator Locally | This hands-on tutorial from the Fahd Mirza channel demonstrates running Google's Gemma 4 31B model locally at 196 tokens per second u... | 0 comments | 2.2K views
08:57 | Benchmarks | 1 month ago | Google Releases Gemma 4 MTP Drafters – Run Locally and DFlash Comparison | Fahd Mirza demonstrates Google's newly released MTP (multi-token prediction) draft models for the Gemma 4 family, running live tests... | 0 comments | 5.2K views
15:31 | Coding & Dev Tools | 1 month ago | PFlash + Qwen3.6-27B-DFlash: 10x Faster Prefill on a Single GPU: Run Locally | Fahd Mirza builds and benchmarks PFlash, a prefill acceleration tool that dramatically reduces the blank-screen wait time when feedin... | 0 comments | 3.8K views