29:59 Interviews 1 month ago ⚡️ Google’s Open AI Strategy — Omar Sanseviero, Google DeepMind In this Latent Space podcast interview, Omar Sanseviero from Google DeepMind walks through the technical decisions and strategic thin... 0 comments 313 views
10:10 Tutorials 1 month ago Intern-S2-Preview FP8: 35B Scientific Multimodal Model Running Locally InternLM's latest release, Intern-S2-Preview, is a 35-billion-parameter scientific multimodal model that takes a different approach t... 0 comments 86 views
09:16 Research & Benchmarks 2 months ago Command A+: Cohere’s “Best Model Ever” Is Kind of Disappointing Fahd Mirza puts Cohere's newly released Command A Plus through its paces in this hands-on review, offering a skeptical take on a mode... 0 comments 516 views
33:45 Coding & Dev Tools 2 months ago AI Dev 26 x SF | Eda Zhou & Mahdi Ghodsi: Building Personal AI Agents with Open Source Models At the AI Dev 26 conference in San Francisco, AMD engineers Eda Zhou and Mahdi Ghodsi lead a hands-on workshop teaching attendees how... 0 comments 570 views
15:56 Research & Benchmarks 2 months ago MiniCPM-V 4.6: The Agent Vision Model Sam Witteveen examines MiniCPM-V 4.6, a 1.3 billion parameter vision-language model released by OpenBMB, a joint initiative between AI... 0 comments 2.3K views
08:06 Research & Benchmarks2 months ago MTP vs DFlash — Speculative Decoding Explained Simply This video by Fahd Mirza offers a clear, structured comparison of two speculative decoding techniques — Multi-Token Prediction (MTP)... 0 comments 1K views
43:11 Agents & Automation2 months ago Local Hermes & Openclaw on Beelink in 43 mins Keith AI delivers a detailed, framework-driven evaluation of running Hermes and OpenClaw locally on a Beelink S10 Max mini PC — the d... 0 comments 1.6K views
08:28 Coding & Dev Tools 2 months ago Qwen3-8B at 74 tok/s with RedHat DFlash Speculator on vLLM Locally Fahd Mirza walks through running Red Hat's DFlash speculative decoding implementation on Qwen3-8B using vLLM, achieving 74 tokens per... 0 comments 1.6K views
11:00 Tutorials2 months ago NVIDIA Nemotron Elastic: 3-in-1 Elastic LLM Like Russian Dolls in One File NVIDIA's Nemotron Elastic model family packs three reasoning models — 30B, 23B, and 12B parameters — into a single checkpoint file us... 0 comments 1.4K views
13:37 Research & Benchmarks2 months ago Zaya1 8B – Intelligence Efficiency by Zyphra – Run Locally Zyphra, a San Francisco AI lab known for earlier releases like Zonos and ZR1, has returned with Zaya 1 (Zia) 8B — an open-source mixt... 0 comments 2.6K views