Description:
Andres Marafioti, head of multimodal research at Hugging Face, introduces Reachy Mini, a $300 open-source desktop robot built for hackers, researchers, and students who want to experiment with human-robot interaction without the $50,000-plus price tag of commercial humanoids. Presented at the AI Engineer conference, the talk argues that voice AI is now mature enough to power genuinely expressive robots, and that there is still a window to build the interaction paradigms before a handful of companies lock them in.
The technical core of Reachy Mini is its voice pipeline. Speech is transcribed every 150 milliseconds using Parakeet (chosen for speed), partial transcripts are streamed to an LLM so the robot can react in real time, and responses are synthesized via Coqui TTS, all orchestrated through Hugging Face’s open-source speech-to-speech framework. The robot also uses tool calling to trigger physical movements and emotions, runs camera-based face tracking, and handles echo cancellation locally. Hugging Face serves inference for the fleet and has shipped 7,500 units; across those units, the conversation app is the most-used feature by a wide margin.
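As a rough illustration of the loop described above, here is a minimal Python sketch of streaming transcription feeding an LLM and then a TTS step. It is not the Reachy Mini code or the speech-to-speech framework's API: `windows`, `run_turn`, and the `fake_*` stand-ins are hypothetical names, and the 16 kHz sample rate is an assumption. Only the 150 ms transcription interval comes from the talk.

```python
from typing import Callable, Iterator, List

SAMPLE_RATE = 16_000  # assumed sample rate, not stated in the talk
WINDOW_MS = 150       # transcription interval mentioned in the talk
WINDOW_SAMPLES = SAMPLE_RATE * WINDOW_MS // 1000


def windows(audio: List[float], size: int = WINDOW_SAMPLES) -> Iterator[List[float]]:
    """Yield all audio heard so far, growing by one window per step."""
    for end in range(size, len(audio) + size, size):
        yield audio[:min(end, len(audio))]


def run_turn(
    audio: List[float],
    transcribe: Callable[[List[float]], str],
    on_partial: Callable[[str], None],
    respond: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Re-transcribe every window, stream changed partials, then reply."""
    partial = ""
    for heard in windows(audio):
        text = transcribe(heard)
        if text != partial:      # forward only when the transcript changed
            partial = text
            on_partial(partial)  # e.g. let the LLM start reacting early
    reply = respond(partial)     # final LLM response on the full transcript
    return synthesize(reply)     # TTS step, returns audio bytes


# Hypothetical stand-ins for the ASR, LLM, and TTS models.
WORDS = ["hello", "reachy", "mini"]


def fake_transcribe(heard: List[float]) -> str:
    n = -(-len(heard) // WINDOW_SAMPLES)  # ceiling division: windows heard
    return " ".join(WORDS[:n])


def fake_respond(text: str) -> str:
    return f"You said: {text}"


def fake_synthesize(reply: str) -> bytes:
    return reply.encode("utf-8")
```

Calling `run_turn` with a list of samples, the three stand-ins, and a callback such as `print` for `on_partial` shows the key design point: the callback fires on each changed partial transcript, so a downstream model can begin reacting before the speaker has finished.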
Marafioti situates Reachy Mini against the broader voice AI landscape, citing the GPT Realtime API, Mistral’s Voxtral, and the 80-million-parameter Kokoro model, to argue that all the pieces now exist for community-built robot interaction. The robot deliberately ships unassembled, echoing the early PC kit ethos, and every model and agent in the stack is open source.
📺 Source: AI Engineer · Published May 29, 2026
🏷️ Format: Keynote Launch







