Description:
Neil Zeghidour, co-founder of Gradian AI and creator of Moshi (the first full-duplex speech-to-speech model), gives a technically grounded assessment of where voice AI actually stands relative to the ‘Her moment’ — the benchmark of genuinely natural, human-feeling conversational voice set by the 2013 film. Gradian spun out of a non-profit lab funded by Eric Schmidt, Rodolphe Saadé, and Xavier Niel, and its portfolio includes Moshi, Pocket TTS (a CPU-optimized TTS model), and voice cloning from as little as 10 seconds of audio.
The core technical argument is architectural: virtually every deployed voice AI system — including the best offerings from ElevenLabs and OpenAI’s Advanced Voice Mode — is half-duplex. The model is either listening or speaking; it cannot handle simultaneous speech, natural interruptions, or back-channeling. In Japanese conversation, up to 20% of dialogue involves overlapping speech, and back-channeling (continuous ‘mhm’ affirmations) is a sign of active listening. Half-duplex systems break on all of these, making them feel robotic regardless of voice quality. Moshi is presented as the only production model that has crossed into full-duplex territory.
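To make the architectural distinction concrete, here is a minimal Python sketch of the two control flows. All names (`asr`, `llm`, `tts`, `mic`, `speaker`, `model.step`) are hypothetical illustrations, not Moshi's or any vendor's actual API; the point is the shape of the loop, not the implementation.

```python
def half_duplex_loop(asr, llm, tts, mic, speaker):
    """Half-duplex: strict turn-taking. The system is either listening
    or speaking, never both, so overlap and interruptions are lost."""
    while True:
        # Listen until an endpointing heuristic decides the user is done.
        audio = mic.record_until_silence()
        reply = llm.generate(asr.transcribe(audio))
        # Speak. A "mhm" or an interruption arriving now is dropped
        # (or worse, mistaken for the start of a new turn).
        speaker.play(tts.synthesize(reply))


def full_duplex_loop(model, mic, speaker, frame_ms=80):
    """Full-duplex (Moshi-style): a single model ingests the user's
    audio stream and emits its own stream at every frame, so overlapping
    speech and back-channels are just part of the modeled sequence."""
    state = model.init_state()
    while True:
        user_frame = mic.read_frame(frame_ms)           # always listening
        model_frame, state = model.step(user_frame, state)
        speaker.play_frame(model_frame)                 # always speaking;
                                                        # silence is an explicit output
```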
Zeghidour also dissects latency strategies in cascaded systems (STT → LLM → TTS pipelines), including the technique of generating filler speech while the LLM processes to mask response delays. His broader point: speech-to-speech architecture reduces latency but does not solve the human-conversation problem alone — the underlying model intelligence must also be sufficient for the voice interface to feel genuinely useful rather than just smoother-sounding.
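The filler technique is easy to see in code. Below is a hedged sketch of one way to implement it with `asyncio`; `llm`, `tts`, and `speaker` are async interfaces invented for illustration, and the 200 ms threshold is an arbitrary choice, not a figure from the talk.

```python
import asyncio

async def cascaded_turn(user_text, llm, tts, speaker,
                        filler="Hmm, let me think."):
    """One turn of an STT -> LLM -> TTS pipeline that masks LLM latency
    by speaking a canned filler phrase while the answer is generated."""
    # Start the slow LLM call in the background.
    llm_task = asyncio.create_task(llm.generate(user_text))

    # If the answer is not back almost immediately, play filler speech
    # (in practice the filler audio would be pre-synthesized and cached).
    done, _ = await asyncio.wait({llm_task}, timeout=0.2)
    if not done:
        await speaker.play(await tts.synthesize(filler))

    # Speak the real answer as soon as it arrives.
    reply = await llm_task
    await speaker.play(await tts.synthesize(reply))
```

Note that this masks perceived latency rather than removing it, which is exactly the limitation the paragraph above describes: smoother turn handoffs alone do not make the conversation feel human.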
📺 Source: AI Engineer · Published May 09, 2026
🏷️ Format: Deep Dive