Intelligence with Everyone: RL @ MiniMax, with Olive Song, from AIE NYC & Inference by Turing Post

Interviews · 4 months ago


Descriptions:

Olive Song, a senior researcher specializing in reinforcement learning and model evaluation at Chinese AI company MiniMax, gives an unusually candid look at how the team builds and trains its frontier open-weight models in this crossover episode from Nathan Labenz’s Cognitive Revolution podcast. MiniMax’s latest model, M2.5, currently tops the OpenRouter usage leaderboard. The episode combines Olive’s presentation at the AI Engineer conference in New York with an extended interview from Ksenia Se’s Turing Post podcast, Inference.

Technically, the episode covers several specific advances. MiniMax’s interleaved thinking technique lets a model take an action, receive feedback from its environment, and pause to reason before proceeding, which the episode credits with significantly better performance on long-horizon agentic tasks than standard chain-of-thought. Their perturbation pipeline systematically varies the training environment to force robust generalization rather than pattern memorization. One particularly concrete finding: running reinforcement learning at full FP32 precision, rather than reduced precision, produces measurably better results by keeping training behavior closer to the theoretical algorithmic ideal. Olive frames this as closing the gap between implementation and theory.
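To make the precision point concrete, here is a minimal sketch of how rounding alone can move an importance ratio away from its theoretical value. It is not MiniMax’s training code: the helper names (`to_fp32`, `to_bf16`, `ratio_drift`), the choice of bfloat16 as the reduced format, and the toy sequence of 256 tokens that all share a log-probability of -0.05 are assumptions made for illustration. In an on-policy step, the ratio between the trainer’s and the sampler’s probability of the same tokens should be exactly 1; the sketch measures how far it drifts when one side’s per-token log-probabilities are rounded.

```python
import math
import struct


def to_fp32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_bf16(x: float) -> float:
    """Emulate bfloat16 (8 exponent bits, 7 mantissa bits) with
    round-to-nearest-even. Hypothetical helper for illustration only."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add the rounding bias, then keep only the upper 16 bits.
    bits += 0x7FFF + ((bits >> 16) & 1)
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def ratio_drift(true_logps, cast) -> float:
    """Return |ratio - 1|, where the ratio compares a sequence probability
    built from rounded per-token log-probs against the unrounded one.
    Sums are accumulated in float64; only per-token values are rounded."""
    rounded_sum = sum(cast(lp) for lp in true_logps)
    exact_sum = sum(true_logps)
    return abs(math.exp(rounded_sum - exact_sum) - 1.0)


if __name__ == "__main__":
    # Toy assumption: 256 tokens, each with true log-prob -0.05.
    logps = [-0.05] * 256
    print("bf16 drift:", ratio_drift(logps, to_bf16))
    print("fp32 drift:", ratio_drift(logps, to_fp32))
```

Because every token in this toy sequence shares one value, the rounding errors all point the same way, so this is close to a worst case; in a real sequence they would partly cancel. Under these assumptions the bfloat16 drift is on the order of 1e-2, while the FP32 drift stays below 1e-6.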

The episode also explores how MiniMax’s unusual structure, in which both foundation models and consumer-facing applications are developed in-house, creates tight feedback loops between researchers and developers, enabling faster identification and correction of model weaknesses. Olive discusses the ongoing battle against reward hacking, the tedious debugging process when training runs produce unexpected behavior, and how the team uses internal AI agents to manage the daily flood of research publications. She acknowledges that MiniMax’s models do not yet match those of the top American labs but argues the RL techniques and organizational approach are worth studying regardless.

📺 Source: Cognitive Revolution “How AI Changes Everything” · Published February 22, 2026
🏷️ Format: Interview

Companies: MiniMax

Tags: Claude, MiniMax, OpenRouter
