Tiny Aya – Cohere’s Mini Multilingual Models


Description:

Sam Witteveen reviews Cohere’s newly released Tiny Aya model family, a suite of small multilingual models designed to bring capable language AI to low-resource languages that larger general-purpose models have historically underserved. The video opens with a clear technical explanation of why multilingual coverage fails at small scales: insufficient training data for languages with a sparse Wikipedia presence, and tokenizers that decompose non-Latin scripts character by character rather than into meaningful subword units, dramatically increasing token counts and making coherent language modeling much harder.
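The fragmentation point is easy to verify empirically. Below is a minimal sketch using the English-centric GPT-2 tokenizer purely as an illustration (it is not the tokenizer discussed in the video); the sample sentences are illustrative translations of "the weather is nice today":

```python
# Sketch: English-centric tokenizers fragment non-Latin scripts into many
# byte-level pieces, inflating sequence length per sentence.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative choice only

samples = {
    "English": "The weather is nice today.",
    "Hindi": "आज मौसम अच्छा है।",
    "Thai": "วันนี้อากาศดี",
}

for language, text in samples.items():
    tokens = tokenizer.tokenize(text)
    # Expect a much higher token-to-character ratio for the non-Latin scripts.
    print(f"{language}: {len(tokens)} tokens for {len(text)} characters")
```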

The Tiny Aya lineup centers on a ~3.3 billion parameter base model pretrained on 70-plus languages. From that foundation, Cohere released four post-trained variants: Tiny Global (broad multilingual instruction tuning), Tiny Earth (West Asia, Africa, and Europe: Arabic, Turkish, Hebrew, and 41 additional languages), Tiny Fire (South Asian languages including Hindi, Bengali, Tamil, and Nepali, with code-switching support), and Tiny Water (Asia-Pacific coverage including Tagalog, Vietnamese, Thai, Chinese, Khmer, and Burmese). Witteveen explains that each regional variant was trained with a specialized SFT recipe and then merged, a technique he highlights as interesting even outside the multilingual context.
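The video does not spell out Cohere’s exact merge method, but one common recipe is uniform parameter averaging across the specialized checkpoints. A hedged sketch of that idea follows; the repo ids are placeholders, not real model names:

```python
# Sketch: merge specialized SFT variants by averaging their weights.
# This is one common merge technique, not necessarily the one Cohere used.
import torch
from transformers import AutoModelForCausalLM

variant_ids = ["org/sft-variant-a", "org/sft-variant-b"]  # hypothetical repos
models = [AutoModelForCausalLM.from_pretrained(mid) for mid in variant_ids]

merged_state = {}
for name, tensor in models[0].state_dict().items():
    if tensor.is_floating_point():
        # Average each parameter tensor elementwise across the variants.
        merged_state[name] = torch.stack(
            [m.state_dict()[name] for m in models]
        ).mean(dim=0)
    else:
        merged_state[name] = tensor  # copy non-float buffers unchanged

merged = AutoModelForCausalLM.from_pretrained(variant_ids[0])
merged.load_state_dict(merged_state)
```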

Cohere also released multilingual training datasets and benchmarks alongside the models, enabling developers to fine-tune for specific target languages. All models are available on Hugging Face. Witteveen positions the Tiny Aya family as a practical choice for production applications where compute constraints, latency requirements, or on-device deployment rule out larger alternatives from the Gemma 3 or Qwen families.
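Since the weights are on Hugging Face, trying a variant locally follows the standard transformers workflow. A minimal sketch, assuming a hypothetical repo id; check the actual Cohere model cards for the real names and chat template:

```python
# Sketch: load a Tiny Aya variant from Hugging Face and run one chat turn.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CohereLabs/tiny-aya-global"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

messages = [{"role": "user", "content": "Translate 'good morning' into Tamil."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```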


📺 Source: Sam Witteveen · Published February 23, 2026
🏷️ Format: Review
