Description:
Prince Canuma, a core contributor to Apple’s MLX framework and engineer at Arcee, delivers a conference demo showing how to deploy and manage AI agents — including voice and vision agents — entirely on Apple Silicon devices without any cloud dependency. The talk draws on his personal motivation: building accessible technology for his father, who lost his sight in 2020 and lives in a region with unreliable internet access.
Canuma walks through the MLX ecosystem, which now counts over 1.5 million downloads and more than 4,000 ported models. He demonstrates MLX VLM (the vision-language model runtime that also powers LM Studio) running Google’s Gemma 4 26B locally on a MacBook with 96 GB of unified memory, real-time object detection using a Roboflow model via the new MLX Swift bindings, and live background segmentation, all shown running fully offline. He notes that even M1 MacBooks can run very large models by leveraging device storage.
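For readers who want to try the same kind of on-device pipeline, here is a minimal sketch of loading and prompting a vision-language model with the mlx-vlm Python package. The checkpoint name, image path, and prompt are illustrative assumptions rather than the exact setup from the demo, and the calls follow mlx-vlm's documented load/generate workflow, not the talk's own code.

```python
# Minimal sketch of local VLM inference with the mlx-vlm Python package.
# The checkpoint below is an assumed small public example; swap in a larger
# model such as the Gemma release shown in the talk if memory allows.
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model_path = "mlx-community/Qwen2-VL-2B-Instruct-4bit"  # assumed example checkpoint
model, processor = load(model_path)   # weights are cached locally on first run
config = load_config(model_path)

images = ["desk_photo.jpg"]           # a local image file, so inference stays offline
prompt = apply_chat_template(
    processor, config, "Describe what is on the desk.", num_images=len(images)
)

# Generation runs entirely in the Mac's unified memory via MLX.
output = generate(model, processor, prompt, images, max_tokens=256, verbose=False)
print(output)
```

Once the weights are cached, a script like this runs with networking disabled, which is the fully-offline property the demo emphasizes.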
The session is aimed at developers who want to reduce cloud subscription costs and build privacy-preserving, low-latency AI applications. Key takeaways include day-zero MLX support for frontier open-source releases like Gemma 4, the viability of omnimodal (vision + audio) pipelines on iPhone and iPad, and a growing library of community projects that extend MLX into voice agents, accessibility tools, and local coding assistants.
📺 Source: AI Engineer · Published May 11, 2026
🏷️ Format: Tutorial Demo
