vLLM - Frontier Models

There are 63 items on this page

13:26

Tutorials · 4 weeks ago

At the AI Engineer summit, Audrey Hsu, developer advocate at RunPod, delivers a live demo showing how to deploy a production-ready LL...

10:33

Coding & Dev Tools · 4 weeks ago

JetBrains has released Mellum 2, a 12-billion-parameter mixture-of-experts coding model that runs at the compute cost of a 2.5-billio...

09:08

Tutorials · 1 month ago

Fahd Mirza walks through the installation and configuration of Hermes agent desktop, the newly released GUI for the Hermes agent fram...

11:06

Tutorials · 1 month ago

Fahd Mirza examines Nvidia's official FP4 quantization of Qwen3 35B A22B — a release validated by Nvidia's own model optimizer tool a...

10:07

Tutorials · 1 month ago

Fahd Mirza walks through the local deployment and testing of Dolphin X1 Trinity Nano, the first model trained entirely within a custo...

09:39

Coding & Dev Tools · 1 month ago

Step 3.7 Flash is a 198 billion parameter sparse mixture-of-experts model from Step One, activating only 11 billion parameters per to...

08:53

Coding & Dev Tools · 1 month ago

LFM2.5-8B-A1B is Liquid AI's latest open-weight model — an 8.3 billion parameter mixture-of-experts architecture that activates only...

10:06

Coding & Dev Tools · 1 month ago

Fahd Mirza demonstrates the first end-to-end deployment of Llama Box DFlash with Google's Gemma 4 31B model, following the merge of P...

11:11

Tutorials · 1 month ago

Fahd Mirza walks through a complete local installation and live evaluation of MiniCPM 5 in its 1 billion parameter variant, released...