VLLM - Frontier Models

There are 63 items on this page

13:29

Coding & Dev Tools · 4 days ago

Control What Your AI Agents Can Do: Archestra + Ollama Hands-On

Fahd Mirza walks through Archestra, an open-source platform built for running AI agents safely in production,...

14:48

Business & Strategy · 6 days ago

Turbocharge Your Agent’s Retrieval with TurboQuant – Shashi Jagtap, Superagentic AI

Shashi Jagtap, founder of SuperAgentic AI, presents at the AI Engineer conference on TurboQuant — a vector embedding compression algo...

08:51

Tutorials · 1 week ago

OpenJarvis + Ollama: Local AI Agent That Tracks Every Watt

Fahd Mirza walks through the installation and hands-on testing of Open Jarvis, a newly released local-first personal AI framework dev...

09:41

Coding & Dev Tools · 1 week ago

Qwen-AgentWorld: One AI Model That Simulates 7 Different Environments

Fahd Mirza walks through a complete local installation and live demonstration of Qwen-AgentWorld, a novel "world model" from the Qwen...

10:04

Coding & Dev Tools · 1 week ago

Ornith 1.0 9B: Self-Improving Model for Agentic Coding – Run Locally

Fahd Mirza walks through a complete installation and evaluation of Ornith 1.0 9B, a newly released open-source model family built spe...

09:00

Coding & Dev Tools · 2 weeks ago

SkillOpt: Microsoft’s New Way to ‘Train’ AI Agents: Run Locally

Microsoft Research's SkillOpt takes a different approach to improving AI agent performance: instead of fine-tuning model weights, it...

18:46

Foundation Models · 3 weeks ago

You Might Not Need 50 Diffusion Steps — Ziv Ilan, Nvidia

At the AI Engineer conference, Nvidia's Ziv Ilan — a researcher in Nvidia's AI labs team based in Paris — presents a practical framew...

12:48

Tutorials · 3 weeks ago

DiffusionGemma: 1100 Tokens/sec: Google’s Fastest Open Model Yet Locally

Fahd Mirza installs and stress-tests Google DeepMind's DiffusionGemma — a 26-billion-parameter mixture-of-experts model that abandons...

20:19

Coding & Dev Tools · 4 weeks ago

GPU Cloud Deployment Without Leaving Your IDE — Audry Hsu, RunPod

Audrey Hsu, developer advocate at RunPod, demonstrates the company's new IDE-integrated GPU deployment tooling at AI Engineer, showin...

13:26

Tutorials · 4 weeks ago

Under 5 minutes to a deployed LLM endpoint — Audry Hsu, RunPod

At the AI Engineer summit, Audrey Hsu, developer advocate at RunPod, delivers a live demo showing how to deploy a production-ready LL...