Mistral 3: Europe’s Answer to DeepSeek or Too Little, Too Late?


Sam Witteveen reviews Mistral’s first major model release in five months: a four-model suite headlined by Mistral Large 3, a 675-billion-parameter mixture-of-experts model with 41 billion active parameters. Notably, this active-parameter ratio (roughly 6%) is significantly higher than in recent MoE releases from OpenAI and Qwen, which typically keep active parameters well under 5% of total. Alongside the flagship model, Mistral has shipped three dense “Mini Mistral” models at 3B, 8B, and 14B parameter sizes—each released with base weights, instruction-tuned variants, and reasoning versions, making this one of the more complete open-weight drops of the period.
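The active-parameter comparison above is simple arithmetic; a minimal sketch makes it concrete. The Mistral Large 3 figures come from the review itself, while the sub-5% comparison point is the rough range the review attributes to other recent MoE releases (check each model card for exact values):

```python
def active_ratio(active_params_b: float, total_params_b: float) -> float:
    """Fraction of parameters active per token in a mixture-of-experts model."""
    return active_params_b / total_params_b

# Mistral Large 3, per the review: 41B active out of 675B total.
mistral_large_3 = active_ratio(41, 675)
print(f"Mistral Large 3 active ratio: {mistral_large_3:.1%}")  # about 6.1%

# Other recent MoE releases, per the review, sit well under 5% active.
print(mistral_large_3 > 0.05)  # True
```

A higher active ratio means more compute per token for the same total parameter count, which is one way a model can trade inference cost for capability.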

On benchmarks, Mistral Large 3 sits roughly on par with DeepSeek 3.1 and Kimi K2 in Mistral’s own comparisons, though Witteveen flags those comparisons as selective—the model ranks 28th overall on the LLM Arena leaderboard, a far cry from frontier closed-model performance. Filtered to Apache 2.0-licensed models, however, it sits near the top, edging out several Qwen 3 variants. The Mini models show competitive results against same-size Gemma and Qwen offerings, with strong instruction-following scores at the 14B tier. Mistral also signals that a reasoning version of Large 3 is forthcoming.

Witteveen frames the release within a broader ecosystem gap: few organizations are still releasing multi-size model families with base weights, leaving Mistral as one of the only alternatives to Qwen for practitioners who need fine-tuneable open models across a range of sizes. For teams building on open infrastructure, this release meaningfully expands the available options.


📺 Source: Sam Witteveen · Published December 03, 2025
🏷️ Format: Review