This AI Model Runs On Your Phone (With No Internet)!


Description:

Matt Wolfe covers the newly released Qwen 3.5 model family—launched March 2, 2026, in 800M, 2B, 4B, and 9B parameter variants—and demonstrates running it entirely offline on an iPhone using the Locally AI app, with no data sent to OpenAI, Anthropic, Google, or any cloud service.

The Locally AI app, available on the App Store with a 4.8-star rating from 579 reviews, supports multiple open-weight models including Gemma 2, Llama 3.2, and Qwen 3.5. Wolfe walks through the download and setup process on an iPhone 17 Pro, noting device requirements: the 4B model needs an iPhone 15 Pro or newer, the 2B runs on iPhone 15, and the 800M works on iPhone 14 or newer. Download took roughly five minutes on home Wi-Fi. The app supports custom system prompts, adjustable temperature, Siri shortcut integration, and a thinking mode that enables on-device chain-of-thought reasoning.
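The device requirements above amount to a simple mapping from iPhone generation to the largest model variant it can run. A minimal sketch of that selection logic (the device ordering, function name, and model labels are illustrative assumptions, not part of the Locally AI app):

```python
# Hypothetical sketch: pick the largest Qwen 3.5 variant a device can run,
# based on the minimum-device requirements stated in the video.
# The linear device ordering below is a simplification for illustration.

DEVICE_ORDER = ["iPhone 14", "iPhone 15", "iPhone 15 Pro", "iPhone 16", "iPhone 17 Pro"]

# Minimum device per model variant, per the video's stated requirements.
MIN_DEVICE = {
    "Qwen 3.5 4B": "iPhone 15 Pro",
    "Qwen 3.5 2B": "iPhone 15",
    "Qwen 3.5 800M": "iPhone 14",
}

def largest_supported_model(device: str):
    """Return the biggest variant whose minimum device is at or below `device`."""
    rank = DEVICE_ORDER.index(device)
    for model in ("Qwen 3.5 4B", "Qwen 3.5 2B", "Qwen 3.5 800M"):
        if rank >= DEVICE_ORDER.index(MIN_DEVICE[model]):
            return model
    return None  # device older than iPhone 14: no variant supported

print(largest_supported_model("iPhone 17 Pro"))  # Qwen 3.5 4B
print(largest_supported_model("iPhone 14"))      # Qwen 3.5 800M
```

So Wolfe's iPhone 17 Pro clears the bar for the full 4B model, while an iPhone 14 is limited to the 800M variant.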

In practice, Qwen 3.5 handles brainstorming tasks fluidly but shows some gaps on common-sense logical problems. Wolfe notes the 4B model benchmarks favorably against GPT-4o Nano across most standard evaluations. As the chat history grows longer, the model slows noticeably—a practical constraint for extended sessions. The video is a useful starting point for anyone interested in private, offline AI on mobile hardware.


📺 Source: Matt Wolfe · Published March 04, 2026
🏷️ Format: Tutorial Demo
