Description:
Fahd Mirza walks through a complete, reproducible guide to fine-tuning Qwen 3.5 0.8B locally using the Unsloth library and LoRA (Low-Rank Adaptation). The tutorial targets practitioners who want to specialize a small open-weight model for a specific domain — in this case, Turkish kebab expertise — using a custom 200-example Q&A dataset assembled with ChatGPT and Claude.
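The video's exact dataset schema isn't shown here, but instruction-tuning sets like this one are commonly stored as question/answer pairs in JSON Lines. A hypothetical record (the content below is illustrative, not from the video's dataset) might look like:

```python
# Hypothetical single training record; question/answer keys and the
# kebab content are assumptions, not taken from the video's dataset.
import json

record = {
    "question": "What distinguishes Adana kebab from Urfa kebab?",
    "answer": "Adana kebab uses minced meat mixed with hot red pepper "
              "flakes, while Urfa kebab omits the chili.",
}

# One JSONL row of a 200-example dataset: one JSON object per line.
line = json.dumps(record, ensure_ascii=False)
print(line)
```

Two hundred such rows are small by pretraining standards, but for LoRA-based domain specialization a few hundred focused examples is a typical starting point.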
The walkthrough covers the full pipeline: creating a conda virtual environment on Ubuntu with an Nvidia RTX 6000 GPU (48GB VRAM), installing PyTorch and Unsloth, and attaching LoRA adapters to the query, key, value, and output projection layers of the attention mechanism, plus the feed-forward gate and up/down projections. Key hyperparameters are explained in plain terms — LoRA rank of 16 as a memory-accuracy sweet spot, gradient accumulation steps to achieve an effective batch size of 8 without multiplying VRAM demand, and three training epochs over the 200-record dataset with real-time loss monitoring every five steps.
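The gradient-accumulation trick mentioned above can be sketched numerically: summing (then averaging) the gradients of several small micro-batches reproduces the gradient of one large batch, so peak memory only ever holds a micro-batch. The toy model below (a single weight with squared-error loss, not the video's code) shows four micro-batches of 2 matching a full batch of 8:

```python
# Toy sketch of gradient accumulation (not the video's training code):
# 4 micro-batches of size 2 yield the same mean gradient as one batch
# of 8, so the effective batch size is 8 without 4x the activations.

def grad(w, xs, ys):
    """Mean gradient of mean((w*x - y)^2) with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

w = 0.5
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]

# One pass over the full batch of 8 (high peak memory in a real model).
full = grad(w, xs, ys)

# Accumulate over 4 micro-batches of 2, then average.
accum = 0.0
for i in range(0, 8, 2):
    accum += grad(w, xs[i:i+2], ys[i:i+2])
accum /= 4

print(full, accum)  # equal up to float rounding
```

In the actual pipeline the same idea appears as a per-device batch size times a `gradient_accumulation_steps` count whose product is 8; the optimizer only steps after the accumulated micro-batches.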
Mirza also clarifies the conceptual distinction between supervised fine-tuning (SFT) and direct preference optimization (DPO), and explains why LoRA — which trains only a tiny fraction of the model’s 800 million parameters — makes fine-tuning accessible on consumer hardware. The video concludes with inference testing on the resulting domain-specialized model. The approach applies directly to other Qwen 3.5 series models.
📺 Source: Fahd Mirza · Published March 04, 2026
🏷️ Format: Tutorial Demo
