NVIDIA Launches Nemotron 3 Super: 120B LatentMoE Explained & Tested

Description:

Fahd Mirza covers the launch of NVIDIA’s Nemotron 3 Super, a 120-billion-parameter language model built on a novel architecture called LatentMoE (Latent Mixture of Experts). Unlike standard MoE designs, Nemotron 3 Super compresses input data into a lower-dimensional latent space before routing to expert subnetworks — keeping active parameters at 12 billion during inference, which significantly reduces compute cost without a proportional drop in capability.
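The video does not give the exact layer equations, so the sketch below is only an illustration of the general latent-MoE idea it describes: tokens are projected down to a smaller latent dimension, a router selects a few experts that operate in that compressed space, and the result is projected back to model width. All names and sizes here (LatentMoELayer, d_latent, n_experts, top_k) are assumptions for illustration, not Nemotron 3 Super's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentMoELayer(nn.Module):
    """Illustrative latent-MoE block: compress tokens into a latent space,
    route to experts there, then project back to the model dimension."""

    def __init__(self, d_model=8192, d_latent=2048, n_experts=64, top_k=2):
        super().__init__()
        self.down = nn.Linear(d_model, d_latent, bias=False)   # compress
        self.up = nn.Linear(d_latent, d_model, bias=False)      # decompress
        self.router = nn.Linear(d_latent, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_latent, 4 * d_latent, bias=False),
                nn.SiLU(),
                nn.Linear(4 * d_latent, d_latent, bias=False),
            )
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                      # x: (batch, seq, d_model)
        z = self.down(x)                       # (batch, seq, d_latent)
        gates = F.softmax(self.router(z), dim=-1)
        weights, idx = gates.topk(self.top_k, dim=-1)
        weights = weights / weights.sum(dim=-1, keepdim=True)

        out = torch.zeros_like(z)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., k] == e        # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(z[mask])
        return self.up(out)                    # back to d_model
```

Because the router and experts work on the latent vectors rather than the full hidden states, only the selected experts' (smaller) weights are exercised per token, which is how a 120B-parameter model can keep roughly 12B parameters active at inference time.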

The video explains several technical features in accessible terms: NVFP4 quantization using 4-bit precision to cut memory requirements and boost speed; a 1-million-token context window suited to full codebase processing; multi-token prediction for faster generation; and a configurable chain-of-thought reasoning mode that generates a hidden internal trace before responding, recommended specifically for complex coding and math tasks. Running the model locally requires approximately eight H100 80GB GPUs.
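To see why 4-bit precision matters at this scale, the back-of-envelope arithmetic below estimates weight memory at a few precisions. It is a simplification (real NVFP4 storage also carries per-block scaling factors, and it ignores KV cache and activations); the 120B parameter count and the roughly eight H100 80GB figure are the numbers quoted in the video.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Rough memory needed for model weights alone, in GB."""
    return n_params * bits_per_param / 8 / 1e9

total_params = 120e9  # Nemotron 3 Super total parameters, per the video

for label, bits in [("FP16/BF16", 16), ("FP8", 8), ("NVFP4 (4-bit)", 4)]:
    print(f"{label:>14}: ~{weight_memory_gb(total_params, bits):.0f} GB of weights")

# FP16/BF16: ~240 GB, FP8: ~120 GB, NVFP4: ~60 GB.
# The ~8x H100 80GB figure leaves headroom beyond the weights for the
# KV cache, which grows large at a 1-million-token context, plus
# activations and serving overhead.
```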

Mirza tests Nemotron 3 Super on two prompts via NVIDIA’s hosted interface: a self-contained HTML simulation of an AI-managed plant growth system with live sensor dashboards and a “first bud detected” event sequence, and a multilingual role-play involving characters speaking French, German, Spanish, and Arabic. Both outputs are evaluated positively — the HTML demo produces functional animations and the language test shows correct grammar and cultural register across supported languages. Mirza flags NVIDIA’s hosted inference interface as a persistent usability weak point the company should address.


📺 Source: Fahd Mirza · Published March 11, 2026
🏷️ Format: Review
