Description:
Fahd Mirza demonstrates how to locally install and run Qwen 3.6 35B-A3B, Alibaba's latest mixture-of-experts model released as fully open weights. The model has 35 billion total parameters but activates only about 3 billion per token through expert routing, delivering the knowledge base of a large dense model at a fraction of the compute cost. The tutorial runs on an Ubuntu system with a single NVIDIA H100 (80GB VRAM), using vLLM as the inference engine.
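To make the sparse-activation idea concrete, here is a toy top-k routing sketch. It is illustrative only: the hidden size, the number of experts run per token, and the router itself are assumptions, not the model's actual code.

```python
import numpy as np

def top_k_moe(x, gate_w, experts, k=8):
    """Toy mixture-of-experts layer: route one token to its top-k experts.

    x       : (hidden,) activation for a single token
    gate_w  : (num_experts, hidden) router weights
    experts : list of callables, one per expert FFN
    Only the k selected experts run, so compute scales with k,
    not with the total expert count.
    """
    logits = gate_w @ x                # score every expert
    top = np.argsort(logits)[-k:]      # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()           # softmax over the selected experts only
    # Weighted sum of the k expert outputs
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Tiny demo: 256 experts, as described for the model; k=8 is an assumption
rng = np.random.default_rng(0)
hidden, num_experts = 64, 256
gate_w = rng.normal(size=(num_experts, hidden))
experts = [lambda x, W=rng.normal(size=(hidden, hidden)) * 0.01: W @ x
           for _ in range(num_experts)]
out = top_k_moe(rng.normal(size=hidden), gate_w, experts)
print(out.shape)  # (64,)
```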
The video covers the full technical setup: downloading the weights via the Hugging Face CLI (26 shards), configuring vLLM with a 32K context cap to fit within the H100's remaining headroom after the model weights consume roughly 70GB, setting GPU memory utilization to 0.90, enabling the tool-calling flags needed by agentic frameworks, and handling the model's thinking blocks correctly with the reasoning parser. The architecture is explained throughout: 40 layers alternating gated delta attention with standard gated attention, each feeding into a 256-expert MoE block, with a native 262K-token context window extensible to 1 million. A notable new feature called thinking preservation retains the model's reasoning chain across multi-turn conversations, which Mirza flags as significant for long agentic sessions.
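The description's settings map onto vLLM roughly as follows. This is a sketch, not the video's exact commands: the Hugging Face repo id is a guess based on Qwen's naming scheme, and the tool-calling and reasoning-parser options mentioned above are flags on vLLM's OpenAI-compatible server (`vllm serve`) rather than the offline API shown here.

```python
# Sketch of the setup described in the video, using vLLM's offline Python API.
from huggingface_hub import snapshot_download
from vllm import LLM, SamplingParams

MODEL = "Qwen/Qwen3.6-35B-A3B"  # hypothetical repo id, guessed from Qwen's naming

# Equivalent of the Hugging Face CLI download step (pulls all 26 shards)
snapshot_download(repo_id=MODEL)

llm = LLM(
    model=MODEL,
    max_model_len=32_768,         # 32K context cap: weights eat ~70GB of the 80GB card
    gpu_memory_utilization=0.90,  # leave ~10% headroom, as configured in the video
)

out = llm.chat(
    [{"role": "user", "content": "Say hello in three languages."}],
    SamplingParams(max_tokens=128),
)
print(out[0].outputs[0].text)

# When serving instead, the tool-calling and reasoning settings become server
# flags, roughly like this (exact parser names are assumptions):
#   vllm serve Qwen/Qwen3.6-35B-A3B --max-model-len 32768 \
#       --gpu-memory-utilization 0.90 --enable-auto-tool-choice \
#       --tool-call-parser hermes --reasoning-parser qwen3
```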
Live tests include a tabbed CSS interface, a single-shot playable Flappy Bird-style HTML game (rendered with no external dependencies), and a multilingual announcement task. Benchmark comparisons against Qwen 3.5 and Claude Sonnet 4.5 are discussed, with the model showing competitive performance on agentic coding and multimodal understanding. A follow-up video covering OpenClaw integration for autonomous coding is referenced.
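For the thinking-preservation behavior flagged above, a minimal multi-turn client sketch against a locally served endpoint might look like the following. The `reasoning_content` field is what vLLM's reasoning parsers expose on responses; feeding it back into the message history is an assumption about how the preserved chain would be carried between turns, not something the description confirms.

```python
# Minimal multi-turn sketch against a local vLLM server started with a
# reasoning parser enabled. Endpoint and model name are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
MODEL = "Qwen/Qwen3.6-35B-A3B"  # hypothetical repo id

messages = [{"role": "user", "content": "Plan a tabbed CSS interface."}]
reply = client.chat.completions.create(model=MODEL, messages=messages)
msg = reply.choices[0].message

# With a reasoning parser, vLLM splits the thinking block out of the answer
print("thinking:", msg.reasoning_content)
print("answer:", msg.content)

# Assumed pattern for thinking preservation: carry the reasoning back into
# the history so later turns can see the earlier chain of thought.
messages.append({
    "role": "assistant",
    "content": msg.content,
    "reasoning_content": msg.reasoning_content,  # assumption, not a documented field
})
messages.append({"role": "user", "content": "Now add keyboard navigation."})
reply = client.chat.completions.create(model=MODEL, messages=messages)
print(reply.choices[0].message.content)
```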
📺 Source: Fahd Mirza · Published April 16, 2026
🏷️ Format: Hands On Build