27:10 Interviews 3 months ago The End of SWE-Bench Verified — Mia Glaese & Olivia Watkins, OpenAI Frontier Evals Mia Glaese, VP of Research at OpenAI overseeing the Codex, human data, and alignment teams, and Olivia Watkins from OpenAI's Frontier... 0 comments 3.8K views
21:43 Research & Benchmarks 3 months ago Google wins again. Gemini 3.1 Pro review Google's Gemini 3.1 Pro is reviewed across an extensive range of capability tests, accessible via the Gemini app by selecting the Pro... 0 comments 122.9K views
33:45 Tutorials 3 months ago New #1 open source AI model just dropped GLM-5 from Zhipu AI, available for free at z.ai, is reviewed here as a top-performing open-source language model with a chat interfac... 0 comments 97.4K views
08:50 Business & Strategy 3 months ago Sam Altman Finally Admits It: “We Screwed Up” At a recent OpenAI town hall, CEO Sam Altman made an unusually candid admission: the company "just screwed up" GPT-5.2's writing qu... 0 comments 42.8K views
17:45 Interviews 5 months ago [State of Code Evals] After SWE-bench, Code Clash & SOTA Coding Benchmarks recap — John Yang John Yang, creator of SWE-bench, sits down with the Latent Space podcast at NeurIPS 2025 to survey the state of coding evaluations he... 0 comments 1.5K views
10:23 Business & Strategy 5 months ago Anthropic's New Benchmark Changes Everything—Most People Will Miss Why Nate B Jones of AI News & Strategy Daily breaks down the latest results from METR (Model Evaluation and Threat Research), the nonprof... 0 comments 42.1K views
38:50 Business & Strategy 5 months ago New open Nano Banana, AI plays any video game, new top open source models, long videos: AI NEWS This AI news roundup from AI Search covers the most significant open-source releases from the final week of December 2025. The video... 0 comments 135K views
10:56 Foundation Models 5 months ago The Unreasonable Effectiveness of Prompt Learning – Aparna Dhinakaran, Arize Aparna Dhinakaran from Arize AI presents a framework called "prompt learning" — a lightweight alternative to reinforcement learning... 0 comments 15.4K views
47:55 Business & Strategy 5 months ago Insane 3D models, realtime AI video, new #1 open model, realtime AI worlds, Gemini 3 Flash: AI NEWS This weekly AI news roundup from AI Search covers an unusually dense cluster of major open-source releases from the third week of Dec... 0 comments 94.9K views
01:09:17 Interviews 5 months ago Codex 5.2 Launch Revealed: How OpenAI Got Non-Engineers Shipping Real Code This interview brings together two members of OpenAI's Codex engineering team — Tibo, an engineering lead, and Ed, a design engineer... 0 comments 9.2K views