Description:
Cohere has quietly released a new automatic speech recognition model called Cohere Transcribe, and in this video Fahd Mirza walks through a complete local installation and live test on Ubuntu using an Nvidia RTX 6000 GPU with 48GB VRAM.
The model is a 2-billion-parameter ASR system built on a conformer architecture, which interleaves transformer-style self-attention with convolutional layers; incoming waveforms are first converted into mel spectrograms before being fed to the network. Released under an Apache 2.0 license, it supports 14 languages, including Arabic, Japanese, Korean, Polish, Spanish, and Vietnamese, and is claimed to be up to three times faster than comparable dedicated ASR models at its parameter scale. Inference requires only around 5GB of VRAM. The model is gated on Hugging Face, so the video also covers the authentication and token setup process.
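The mel-spectrogram front end mentioned above can be sketched in a few lines of NumPy. This is a generic illustration of the technique, not Cohere's actual preprocessing: the frame size (400 samples), hop length (160), and 80-bin filterbank are common ASR defaults at 16 kHz, assumed here for demonstration.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels=80, n_fft=400, sr=16000):
    # Triangular filters spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):          # rising edge of the triangle
            if c > l:
                fb[i - 1, k] = (k - l) / (c - l)
        for k in range(c, r):          # falling edge of the triangle
            if r > c:
                fb[i - 1, k] = (r - k) / (r - c)
    return fb

def mel_spectrogram(wave, sr=16000, n_fft=400, hop=160, n_mels=80):
    # Frame the signal, window each frame, take the power spectrum,
    # then project onto the mel filterbank and take the log.
    frames = []
    for start in range(0, len(wave) - n_fft + 1, hop):
        frame = wave[start:start + n_fft] * np.hanning(n_fft)
        frames.append(np.abs(np.fft.rfft(frame)) ** 2)
    power = np.array(frames).T                    # (n_fft//2 + 1, n_frames)
    return np.log(mel_filterbank(n_mels, n_fft, sr) @ power + 1e-10)

# 1 second of a 440 Hz tone sampled at 16 kHz
t = np.arange(16000) / 16000.0
mels = mel_spectrogram(np.sin(2 * np.pi * 440 * t))
print(mels.shape)  # (80, 98)
```

The resulting (n_mels, n_frames) matrix is the kind of input a conformer encoder consumes in place of the raw waveform.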
Mirza runs the model against audio samples in all 14 supported languages and measures real-time VRAM usage during transcription. He also clearly outlines the model’s known limitations: no automatic language detection, no timestamp generation, no speaker diarization, and a tendency to hallucinate text when fed silence. The transparent treatment of both strengths and shortcomings — including praise for Cohere’s unusually candid model card — makes this a practical reference for developers evaluating local or self-hosted speech-to-text options.
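Because the model can hallucinate text when fed silence, a practical mitigation for self-hosted pipelines is to gate audio on signal energy before transcribing it at all. The sketch below is an illustrative pre-filter, not anything shown in the video; the RMS threshold and frame size are assumed values you would tune for your own recordings.

```python
import numpy as np

def is_mostly_silence(wave, threshold=1e-3, frame=1600):
    # Hypothetical guard: compute per-frame RMS energy and treat the clip
    # as silence when the median frame stays below the threshold.
    rms = [np.sqrt(np.mean(wave[i:i + frame] ** 2))
           for i in range(0, len(wave), frame)]
    return bool(np.median(rms) < threshold)

silence = np.zeros(16000)                                   # 1 s of silence
tone = 0.1 * np.sin(2 * np.pi * 440 * np.arange(16000) / 16000.0)
print(is_mostly_silence(silence), is_mostly_silence(tone))  # True False
```

Clips flagged as silence can then be skipped or returned as empty transcripts instead of being passed to the ASR model.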
📺 Source: Fahd Mirza · Published March 27, 2026
🏷️ Format: Tutorial Demo
