ZAYA1-VL-8B: Efficient Open Visual Intelligence – Run Locally

Description:

Fahd Mirza puts ZAYA1-VL-8B, the new vision-language model from Zyphra, through its paces on an NVIDIA RTX 6000 with 48GB of VRAM, showing installation, VRAM consumption (just over 26GB), and a series of progressively harder tests. The model uses a Mixture-of-Experts architecture with 8 billion total parameters but only about 700 million active during inference, and it was trained on roughly 140 billion vision-language tokens, a far leaner corpus than the trillions used by competing models.
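For readers who want to reproduce the local setup, here is a minimal loading sketch. The Hugging Face repo id, the AutoModelForVision2Seq class, and the prompt format are assumptions, not confirmed by the video; check the model card for the actual loading code.

```python
# Minimal local-inference sketch. The repo id "Zyphra/ZAYA1-VL-8B" and the
# AutoModelForVision2Seq class are assumptions; consult the model card for
# the real identifiers.
import torch
from transformers import AutoProcessor, AutoModelForVision2Seq
from PIL import Image

model_id = "Zyphra/ZAYA1-VL-8B"  # hypothetical repo id

processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForVision2Seq.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 8B params in bf16 is ~16 GB of weights;
    device_map="auto",           # activations and cache plausibly account for
    trust_remote_code=True,      # the ~26 GB of VRAM observed in the video
)

image = Image.open("newspaper.jpg")
prompt = "Transcribe all headlines in this newspaper page."
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=512)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```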

The benchmark results are notable: Zyphra claims ZAYA1-VL beats Molmo, DeepSeek-VL2, and Qwen3-VL at equivalent active parameter counts. Mirza's live tests cover dense OCR on a vintage newspaper (strong performance across five headlines), handwritten letter extraction (an initial failure on prompt following, then a clean success once the prompt was clarified), and multilingual text recognition on an AI-generated airport sign in English, Japanese, Korean, and Russian. A broader multilingual test with more obscure languages, including several Southeast Asian scripts, reveals clear gaps, suggesting multilinguality is not the model's strongest suit.
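Each of these tests reduces to one image plus a natural-language instruction. A hypothetical helper, assuming the model and processor from the sketch above; the prompts are illustrative paraphrases, not quotes from the video:

```python
# Illustrative OCR-style prompts mirroring the video's tests. The ocr() helper
# is hypothetical and reuses the model/processor loaded in the sketch above.
from PIL import Image

def ocr(model, processor, image_path: str, instruction: str) -> str:
    """Run a single image + instruction through the model and decode the reply."""
    image = Image.open(image_path)
    inputs = processor(images=image, text=instruction, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=512)
    return processor.batch_decode(out, skip_special_tokens=True)[0]

# Dense OCR on a newspaper page:
# ocr(model, processor, "newspaper.jpg", "List every headline on this page, verbatim.")
# Handwriting, with the explicit instruction that fixed the initial failure:
# ocr(model, processor, "letter.jpg", "Transcribe the handwritten text only; do not summarize.")
# Multilingual sign reading:
# ocr(model, processor, "airport_sign.jpg", "Read each line of this sign and name its language.")
```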

The video also walks through the architecture's two key design ideas: treating image tokens differently from text tokens during causal processing, and an efficiency-first training philosophy. Released under the Apache 2.0 license, ZAYA1-VL-8B is fully usable for commercial purposes and runnable locally, making it a practical option for developers who need vision capabilities without cloud API costs.
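The video describes the image/text distinction only at a high level. One common realization in open VLMs is to let image tokens attend bidirectionally among themselves while text tokens remain strictly causal; whether ZAYA1-VL uses exactly this scheme is an assumption. A minimal sketch of such a mask:

```python
# Sketch of a mixed attention mask: image tokens attend bidirectionally among
# themselves, text tokens stay causal. This is one common way VLMs treat image
# tokens differently during causal processing; it is an assumption here, not
# a confirmed detail of ZAYA1-VL.
import torch

def build_mixed_mask(is_image: torch.Tensor) -> torch.Tensor:
    """is_image: bool tensor of shape (seq_len,), True where the token is an
    image patch. Returns a (seq_len, seq_len) bool mask, True = may attend."""
    n = is_image.shape[0]
    causal = torch.tril(torch.ones(n, n, dtype=torch.bool))      # default: causal
    image_block = is_image.unsqueeze(0) & is_image.unsqueeze(1)  # image<->image pairs
    return causal | image_block  # image tokens also see later image tokens

# Example: 4 image patches followed by 3 text tokens.
mask = build_mixed_mask(torch.tensor([True, True, True, True, False, False, False]))
print(mask.int())
```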


📺 Source: Fahd Mirza · Published May 09, 2026
🏷️ Format: Benchmark Test
