This AI Model Runs On Your Phone (With No Internet)!


Description:

Matt Wolfe covers the newly released Qwen 3.5 model family—launched March 2, 2026, in 800M, 2B, 4B, and 9B parameter variants—and demonstrates running it entirely offline on an iPhone using the Locally AI app, with no data sent to OpenAI, Anthropic, Google, or any cloud service.

The Locally AI app, available on the App Store with a 4.8-star rating from 579 reviews, supports multiple open-weight models including Gemma 2, Llama 3.2, and Qwen 3.5. Wolfe walks through the download and setup process on an iPhone 17 Pro, noting device requirements: the 4B model needs an iPhone 15 Pro or newer, the 2B runs on iPhone 15, and the 800M works on iPhone 14 or newer. Download took roughly five minutes on home Wi-Fi. The app supports custom system prompts, adjustable temperature, Siri shortcut integration, and a thinking mode that enables on-device chain-of-thought reasoning.
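The device requirements above amount to a simple mapping from iPhone generation to the largest model variant it can run. A minimal sketch of that selection logic (the device ordering, function name, and model labels are illustrative assumptions, not part of the Locally AI app):

```python
# Hypothetical sketch: pick the largest Qwen 3.5 variant a device can run,
# based on the minimum-device requirements stated in the video.
# The linear device ordering below is a simplification for illustration.

DEVICE_ORDER = ["iPhone 14", "iPhone 15", "iPhone 15 Pro", "iPhone 16", "iPhone 17 Pro"]

# Minimum device per model variant, per the video's stated requirements.
MIN_DEVICE = {
    "Qwen 3.5 4B": "iPhone 15 Pro",
    "Qwen 3.5 2B": "iPhone 15",
    "Qwen 3.5 800M": "iPhone 14",
}

def largest_supported_model(device: str):
    """Return the biggest variant whose minimum device is at or below `device`."""
    rank = DEVICE_ORDER.index(device)
    for model in ("Qwen 3.5 4B", "Qwen 3.5 2B", "Qwen 3.5 800M"):
        if rank >= DEVICE_ORDER.index(MIN_DEVICE[model]):
            return model
    return None  # device older than iPhone 14: no variant supported

print(largest_supported_model("iPhone 17 Pro"))  # Qwen 3.5 4B
print(largest_supported_model("iPhone 14"))      # Qwen 3.5 800M
```

So Wolfe's iPhone 17 Pro clears the bar for the full 4B model, while an iPhone 14 is limited to the 800M variant.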

In practice, Qwen 3.5 handles brainstorming tasks fluidly but shows some gaps on common-sense logical problems. Wolfe notes the 4B model benchmarks favorably against GPT-4o Nano across most standard evaluations. As the chat history grows longer, the model slows noticeably—a practical constraint for extended sessions. The video is a useful starting point for anyone interested in private, offline AI on mobile hardware.


📺 Source: Matt Wolfe · Published March 04, 2026
🏷️ Format: Tutorial Demo
