Descriptions:
Fahd Mirza puts Cohere’s newly released Command A Plus through its paces in this hands-on review, offering a skeptical take on a model Cohere is billing as its best ever. Command A Plus is a mixture-of-experts model with 218 billion total parameters, of which only 25 billion are active at inference time. It supports a 128K input context window, text and image inputs, reasoning, tool use, and 48 languages, a significant jump from the 23 languages supported by its predecessor, Command A.
Mirza runs three live tests on Cohere’s hosted playground and a Hugging Face space. A coding task asking for a self-contained HTML notification center UI returns what he describes as gibberish, leading him to conclude that coding is effectively out of scope for this model. A multilingual translation test shows strong results for languages well represented in the training data but notable hallucinations and failures for Southeast Asian languages, a gap he argues is unacceptable given what general-purpose models like Qwen and DeepSeek can do. A handwritten equation image test fares better: the model solves both equations correctly and identifies errors, though Mirza finds that its hedged phrasing undermines confidence.
The benchmark story is more positive: Command A Plus shows substantial gains over Command A Reasoning on agentic coding, instruction following, math, and data analysis. But Mirza’s central argument is that in mid-2026, a model of this size should be delivering more confident, precise real-world results, and that Cohere’s continued focus on large enterprise models rather than smaller, more accessible ones is a strategic miscalculation.
📺 Source: Fahd Mirza · Published May 20, 2026
🏷️ Format: Review







