NeuTTS Nano Multilingual: Great Idea, Disappointing Execution

Descriptions:

Fahd Mirza tests NeuTTS Nano’s newly released multilingual voice cloning models (French, Spanish, and German variants) alongside the original English Nano. All four share the same 120-million-parameter open-source architecture under an MIT license. The models are marketed for low-power, on-device deployment, including on Raspberry Pi, with headline claims of 3-second voice cloning and real-time CPU inference.

Installation uses a Gradio interface running on localhost port 7860 and pulls a model file under 1GB. The first major discrepancy surfaces immediately: although no GPU was explicitly requested, the model consumed over 17GB of VRAM during inference, which suggests that true CPU-only deployment would be impractically slow. Mirza notes that this contradicts the product’s positioning as edge-friendly.
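One way to check a "real-time" claim like this is to measure the real-time factor (RTF): the time spent synthesizing divided by the duration of the audio produced. The sketch below is a minimal, generic helper and not part of NeuTTS or of Mirza's test setup. The `synthesize` callable is hypothetical: it stands in for whatever function runs the model and is assumed to return the length of the generated audio in seconds.

```python
import time
from typing import Callable


def real_time_factor(synthesize: Callable[[], float]) -> float:
    """Time one synthesis call and return elapsed seconds / audio seconds.

    `synthesize` is a hypothetical stand-in for a TTS call. It is assumed
    to run the model once and return the duration of the generated audio
    in seconds.
    """
    start = time.perf_counter()
    audio_seconds = synthesize()
    elapsed = time.perf_counter() - start

    if audio_seconds <= 0:
        raise ValueError("synthesize() must return a positive audio duration")

    # Below 1.0: audio is produced faster than it plays back.
    # Above 1.0: generation is slower than playback.
    return elapsed / audio_seconds
```

An RTF below 1.0 means the model generates speech faster than it plays back, which is the usual bar for "real-time". Running the same measurement with and without a GPU visible to the process would show how much of the speed depends on the accelerator.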

Quality testing across all four languages disappoints consistently. English voice cloning fails to convincingly replicate vocal characteristics, Spanish output is described as falling flat, and the German model generates errors and fails to produce usable audio across multiple attempts. Mirza reveals that he had previously recorded a standalone NeuTTS Nano video that he chose not to publish because of poor results; this video is a second attempt that reached the same conclusion. For developers targeting European markets with multilingual voice applications, he recommends looking at alternatives such as KiTTEN TTS, which he covered separately. The video is a useful calibration point for anyone evaluating lightweight open-source TTS options against the current state of the market.


📺 Source: Fahd Mirza · Published February 26, 2026
🏷️ Format: Review
