Description:
Latent Space hosts Guillaume Lample, Mistral’s Chief Scientist, and Pavan Kumar Reddy, Head of Audio Research, for a first-party announcement of Voxtral TTS, Mistral’s first text-to-speech model. The 3-billion-parameter model supports nine languages and is built on top of the Ministral base. It introduces a novel autoregressive flow matching architecture paired with an in-house neural audio codec that converts audio into latent semantic and acoustic tokens, delivering speech at a fraction of the cost of competing TTS services while matching their quality at the base model level.
The conversation traces Mistral’s full audio model lineage: the original Voxtral ASR model released in summer 2024, a multilingual transcription update in January 2025 adding context biasing and real-time streaming, and now the generation side with Voxtral TTS. Pavan explains the architectural distinction between understanding models (audio encoder feeding continuous embeddings into a transformer decoder) and generation models (requiring the neural codec on the output side), and why the autoregressive flow matching approach was selected after iterating through several internal architectures.
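The understanding-vs-generation split Pavan describes can be sketched as two mirrored pipelines: the ASR side encodes audio into continuous embeddings consumed by a transformer decoder, while the TTS side autoregressively emits discrete latent tokens that a neural codec decodes back into a waveform. The sketch below is a hedged illustration of those data flows only; every function name, frame size, and token rate is an assumption for demonstration, not Mistral's actual implementation.

```python
import random

random.seed(0)

# Understanding (ASR) path: audio encoder -> continuous embeddings -> decoder.
# Generation (TTS) path: AR model -> discrete latent tokens -> codec decoder.
# All shapes and rates below are illustrative assumptions.

def audio_encoder(waveform, frame=160, dim=64):
    """Stand-in encoder: raw samples -> one continuous embedding per frame
    (160 samples ~ 10 ms at 16 kHz, chosen for illustration)."""
    num_frames = len(waveform) // frame
    return [[random.random() for _ in range(dim)] for _ in range(num_frames)]

def transformer_decoder(embeddings):
    """Stand-in decoder: consumes continuous embeddings, emits text."""
    return "<transcript>"  # placeholder transcript

def ar_model(text, num_tokens=8, vocab=1024):
    """Stand-in autoregressive model: text -> discrete latent token ids
    (the 'semantic and acoustic tokens' the codec operates on)."""
    return [random.randrange(vocab) for _ in range(num_tokens)]

def codec_decode(tokens, samples_per_token=320):
    """Stand-in neural codec decoder: latent tokens -> waveform samples."""
    return [random.random() for _ in range(len(tokens) * samples_per_token)]

audio = [random.random() for _ in range(16000)]   # 1 s of fake 16 kHz audio
text = transformer_decoder(audio_encoder(audio))  # understanding: audio -> text
wave = codec_decode(ar_model(text))               # generation: text -> audio
print(len(audio_encoder(audio)), len(wave))       # 100 embedding frames, 2560 samples
```

The asymmetry is the point of the episode's distinction: the input side can stay continuous because the transformer only reads the embeddings, but the output side needs a discrete token interface so the autoregressive model has something to predict, which is why the codec only appears in the generation pipeline.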
The second half covers Forge, Mistral’s enterprise training platform announced at GTC — the same internal tooling Mistral’s science team uses for continued pretraining, SFT, and RLHF, now offered to customers to fine-tune models on proprietary data. Guillaume argues that enterprise customers using closed-source models are leaving enormous value on the table by not leveraging domain-specific datasets they have accumulated for years, and that fine-tuning on that data can produce models that dramatically outperform general-purpose alternatives for specialized tasks, including building models where a target language represents 50% of the training mix rather than under 1%.
📺 Source: Latent Space · Published March 30, 2026
🏷️ Format: Interview
