Description:
Two Minute Papers host Dr. Károly Zsolnai-Fehér breaks down Sonic, a new robot controller from NVIDIA’s humanoid robotics lab led by Jim Fan and Professor Yuke Zhu. The system enables a robot to accept commands in virtually any modality (live video of a human demonstrating a movement, spoken instructions, music, or plain text) and translate them into fluid, stable physical motion without requiring human-annotated action labels during training.
What makes Sonic technically notable is its scale efficiency. The final model contains just 42 million parameters — small enough to run on a smartphone — yet it was trained on 100 million frames of raw human motion using a pipeline that learns motion transitions without manual labeling. A key architectural detail is the root trajectory spring model, which dampens abrupt user commands through an exponential decay term so the robot settles smoothly at target positions without oscillating or injuring itself. The system encodes multimodal inputs through a motion generator and human encoder into universal tokens, which a decoder then maps to motor commands — a design that allows seamless switching between input types mid-task.
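The video does not give the exact formulation of the root trajectory spring model, but the behavior it describes (exponential decay toward the commanded target, no oscillation) matches a critically damped spring. The step function and stiffness value below are illustrative assumptions for showing that idea, not Sonic's actual controller:

```python
import numpy as np

def spring_damped_step(pos, vel, target, dt, stiffness=25.0):
    """Advance a critically damped spring toward `target` by one time step.

    Critical damping (damping = 2 * sqrt(stiffness)) gives the exponential
    settling behavior described in the video: the position converges to the
    target without overshooting or oscillating. The stiffness value is an
    illustrative choice, not a parameter taken from Sonic.
    """
    damping = 2.0 * np.sqrt(stiffness)
    accel = stiffness * (target - pos) - damping * vel   # spring + damper force
    vel = vel + accel * dt                               # semi-implicit Euler step
    pos = pos + vel * dt
    return pos, vel

# An abrupt 1 m step command is smoothed into a gradual, non-oscillating approach.
pos, vel = np.zeros(2), np.zeros(2)      # planar root position and velocity
target = np.array([1.0, 0.0])            # sudden user command
for _ in range(200):                     # 200 steps at 100 Hz = 2 seconds
    pos, vel = spring_damped_step(pos, vel, target, dt=0.01)
print(pos)                               # ~[1.0, 0.0], reached without overshoot
```

Critical damping is the standard choice when overshoot is unacceptable: the position approaches the target as quickly as possible without ever swinging past it, which is why the robot can take abrupt commands without lurching.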
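The encoder-to-universal-tokens-to-decoder pipeline can be sketched in a few lines of PyTorch. Everything below (layer types, feature dimensions, the 29-motor output head) is a hypothetical stand-in meant to show the data flow the video describes, not the released Sonic architecture:

```python
import torch
import torch.nn as nn

class SonicStyleController(nn.Module):
    """Sketch of the encode-to-universal-tokens / decode-to-motors data flow.

    Layer types, feature dimensions, and the 29-motor output are illustrative
    placeholders, not the released Sonic architecture.
    """

    def __init__(self, token_dim=256, num_motors=29):
        super().__init__()
        # One encoder per input modality, each projecting into the shared token space.
        self.encoders = nn.ModuleDict({
            "video": nn.Linear(512, token_dim),   # e.g. pooled frame features
            "text":  nn.Linear(768, token_dim),   # e.g. language-model embeddings
            "audio": nn.Linear(128, token_dim),   # e.g. spectrogram features for speech or music
        })
        # A decoder maps the sequence of universal tokens to per-joint motor targets.
        self.decoder = nn.GRU(token_dim, token_dim, batch_first=True)
        self.motor_head = nn.Linear(token_dim, num_motors)

    def forward(self, modality, features):
        tokens = self.encoders[modality](features)   # (batch, seq, token_dim)
        hidden, _ = self.decoder(tokens)
        return self.motor_head(hidden)               # (batch, seq, num_motors)

# Usage: route a text command through the text encoder; video or audio features
# would reuse the same decoder, which is what makes mid-task switching cheap.
model = SonicStyleController()
text_features = torch.randn(1, 8, 768)   # stand-in for embedded text tokens
motor_targets = model("text", text_features)
print(motor_targets.shape)               # torch.Size([1, 8, 29])
```

Because every modality lands in the same token space, changing the input source only swaps the active encoder while the decoder keeps running, which is the property that allows switching between video, speech, music, and text commands mid-task.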
Training required 128 GPUs over three days, but the resulting models are being released openly and are lightweight enough for consumer hardware. The video is particularly useful for anyone tracking the convergence of large-scale motion data, small deployable models, and multimodal control as a viable path toward general-purpose robot behavior.
📺 Source: Two Minute Papers · Published April 25, 2026
🏷️ Format: Deep Dive
