16:32 Foundation Models4 weeks ago LLM Observability, Evaluation, Experimentation Platform — Dat Ngo, Arize Dat Ngo, AI architect at Arize AI, presents a structured framework for making LLM systems observable, evaluable, and experimentally i... 0 comments 439 views
19:04 Foundation Models4 weeks ago Evals Are Broken, Use Them Anyway — Ara Khan, Cline Ara Khan, an engineer on the Cline team, delivers a pointed critique of how the AI industry uses evaluation benchmarks — and why most... 0 comments 1.3K views
06:49 Foundation Models4 weeks ago Nanowhale-100m: Fascinating Implementation of DeepSeek-V4 Architecture Fahd Mirza walks through Nanowhale-100M, a 110 million parameter language model built entirely from scratch—no borrowed weights—that... 0 comments 1.1K views
25:20 Foundation Models4 weeks ago Beyond Transcription: Building Voice AI That Understands Conversations — Hervé Bredin, pyannoteAI Hervé Bredin, chief science officer and co-founder of pyannoteAI, presents a conference talk exploring what becomes possible when voi... 0 comments 552 views
22:38 Foundation Models4 weeks ago Building Agent Interfaces: Lessons from Chrome DevTools (MCP) for Agents — Michael Hablich, Google Michael Hablich, Product Manager for Chrome DevTools at Google, shares four engineering lessons from building Chrome DevTools for Age... 0 comments 604 views
44:53 Foundation Models4 weeks ago It’s starting… Matthew Berman breaks down Anthropic's newly published paper on recursive self-improvement, which traces the evolution of AI developm... 0 comments 21.3K views
15:59 Foundation Models4 weeks ago Nemotron 3 Ultra NVIDIA’s 550B Open Model NVIDIA has released Nemotron 3 Ultra, a 550 billion parameter mixture-of-experts model built specifically for agentic workloads, and... 0 comments 1.7K views
28:03 Foundation Models4 weeks ago Text Diffusion — Brendon Dillon, Google DeepMind Brendan Dillon, a research scientist at Google DeepMind, delivered a technically rigorous presentation at AI Engineer on text diffusi... 0 comments 435 views
16:30 Foundation Models4 weeks ago SWE-rebench: Lessons from Evaluating Coding Agents — Ibragim Badertdinov, Nebius Ibragim Badertdinov, an AI researcher at Nebius with an unconventional background—a trained dentist turned NeurIPS and ICML author—pr... 0 comments 845 views
23:25 Foundation Models4 weeks ago The Art & Science of Benchmarking Agents — Vincent Chen, Snorkel AI Vincent Chen, research fellow and co-founder at Snorkel AI, took the stage at AI Engineer to share meta-level lessons on what separat... 0 comments 377 views
12:49 Foundation Models1 month ago BDD, ADR, PRD, WTF: Capturing Decisions for Humans and AI Alike — Michal Cichra, Safe Intelligence In this AI Engineer conference talk, Michal Cichra — formerly of Microsoft and Red Hat, now building Spec 27, a new agent testing pro... 0 comments 3.8K views
07:19 Foundation Models1 month ago Claude Opus 4.8: Lying Machine No More? Two Minute Papers host Dr. Karoly Zsolnai-Fehér goes beyond the benchmark headlines to work through Anthropic's 244-page system card... 0 comments 27.8K views
16:58 Foundation Models1 month ago Beyond Components: Designing Generative UI for MCP Apps — Ruben Casas, Postman Ruben Casas, staff engineer at Postman, traces the evolution of AI-generated user interfaces from the earliest ChatGPT copy-paste exp... 0 comments 1K views
19:05 Foundation Models1 month ago How Lovable self-improves every hour — Benjamin Verbeek, Lovable Benjamin Verbeek, a member of technical staff at Lovable, delivers a conference talk at AI Engineer on how the vibe-coding platform... 0 comments 1K views
20:40 Foundation Models1 month ago Task Fidelity Scaling Laws — Kobie Crawford, Snorkel Kobie Crawford, developer advocate at Snorkel AI, presents original research from the company's frontier AI data lab quantifying how... 0 comments 312 views
12:40 Foundation Models1 month ago What Lies Beneath the API — Benjamin Cowen, Modal Benjamin Cowen, a forward-deployed machine learning engineer at Modal, delivers a conference talk examining one of the most consequen... 0 comments 346 views