DFlash - Frontier Models

There are 6 items on this page

10:50

Research & Benchmarks - 4 weeks ago

Fahd Mirza puts Poolside's newly released Laguna XS 2.1 through a live evaluation using the Hermit agentic framework. The model is a...

10:06

Coding & Dev Tools - 2 months ago

Fahd Mirza demonstrates the first end-to-end deployment of Llama Box DFlash with Google's Gemma 4 31B model, following the merge of P...

08:43

Tutorials - 3 months ago

ZLab, the UC San Diego research team that invented DFlash speculative decoding, has released the first official drafter model paired...

08:41

Tutorials - 3 months ago

This hands-on tutorial from the Fahd Mirza channel demonstrates running Google's Gemma 4 31B model locally at 196 tokens per second u...

08:57

Benchmarks - 3 months ago

Fahd Mirza demonstrates Google's newly released MTP (multi-token prediction) draft models for the Gemma 4 family, running live tests...

15:31

Coding & Dev Tools - 3 months ago

Fahd Mirza builds and benchmarks PFlash, a prefill acceleration tool that dramatically reduces the blank-screen wait time when feedin...