Foundation Models - Frontier Models

There are 355 items on this page

16:32

Foundation Models · 4 weeks ago

LLM Observability, Evaluation, Experimentation Platform — Dat Ngo, Arize

Dat Ngo, AI architect at Arize AI, presents a structured framework for making LLM systems observable, evaluable, and experimentally i...

19:04

Foundation Models · 4 weeks ago

Evals Are Broken, Use Them Anyway — Ara Khan, Cline

Ara Khan, an engineer on the Cline team, delivers a pointed critique of how the AI industry uses evaluation benchmarks — and why most...

06:49

Foundation Models · 4 weeks ago

Nanowhale-100M: Fascinating Implementation of DeepSeek-V4 Architecture

Fahd Mirza walks through Nanowhale-100M, a 110 million parameter language model built entirely from scratch—no borrowed weights—that...

25:20

Foundation Models · 4 weeks ago

Beyond Transcription: Building Voice AI That Understands Conversations — Hervé Bredin, pyannoteAI

Hervé Bredin, chief science officer and co-founder of pyannoteAI, presents a conference talk exploring what becomes possible when voi...

22:38

Foundation Models · 4 weeks ago

Building Agent Interfaces: Lessons from Chrome DevTools (MCP) for Agents — Michael Hablich, Google

Michael Hablich, Product Manager for Chrome DevTools at Google, shares four engineering lessons from building Chrome DevTools for Age...

44:53

Foundation Models · 4 weeks ago

It’s starting…

Matthew Berman breaks down Anthropic's newly published paper on recursive self-improvement, which traces the evolution of AI developm...

15:59

Foundation Models · 4 weeks ago

Nemotron 3 Ultra: NVIDIA’s 550B Open Model

NVIDIA has released Nemotron 3 Ultra, a 550 billion parameter mixture-of-experts model built specifically for agentic workloads, and...

28:03

Foundation Models · 4 weeks ago

Text Diffusion — Brendan Dillon, Google DeepMind

Brendan Dillon, a research scientist at Google DeepMind, delivered a technically rigorous presentation at AI Engineer on text diffusi...

16:30

Foundation Models · 4 weeks ago

SWE-rebench: Lessons from Evaluating Coding Agents — Ibragim Badertdinov, Nebius

Ibragim Badertdinov, an AI researcher at Nebius with an unconventional background—a trained dentist turned NeurIPS and ICML author—pr...

23:25

Foundation Models · 4 weeks ago

The Art & Science of Benchmarking Agents — Vincent Chen, Snorkel AI

Vincent Chen, research fellow and co-founder at Snorkel AI, took the stage at AI Engineer to share meta-level lessons on what separat...

12:49

Foundation Models · 1 month ago

BDD, ADR, PRD, WTF: Capturing Decisions for Humans and AI Alike — Michal Cichra, Safe Intelligence

In this AI Engineer conference talk, Michal Cichra — formerly of Microsoft and Red Hat, now building Spec 27, a new agent testing pro...

07:19

Foundation Models · 1 month ago

Claude Opus 4.8: Lying Machine No More?

Two Minute Papers host Dr. Károly Zsolnai-Fehér goes beyond the benchmark headlines to work through Anthropic's 244-page system card...

16:58

Foundation Models · 1 month ago

Beyond Components: Designing Generative UI for MCP Apps — Ruben Casas, Postman

Ruben Casas, staff engineer at Postman, traces the evolution of AI-generated user interfaces from the earliest ChatGPT copy-paste exp...

19:05

Foundation Models · 1 month ago

How Lovable self-improves every hour — Benjamin Verbeek, Lovable

Benjamin Verbeek, a member of technical staff at Lovable, delivers a conference talk at AI Engineer on how the vibe-coding platform...

20:40

Foundation Models · 1 month ago

Task Fidelity Scaling Laws — Kobie Crawford, Snorkel

Kobie Crawford, developer advocate at Snorkel AI, presents original research from the company's frontier AI data lab quantifying how...