Fine-Tune Gemma-4 on Your Own Dataset Locally: Step-by-Step Tutorial


Description:

This step-by-step tutorial from Fahd Mirza covers fine-tuning Google’s Gemma 4 E2B model locally on a custom dataset using Unsloth and LoRA (Low-Rank Adaptation). The example dataset contains roughly 100 detailed question-and-answer pairs about the Gandhara civilization — formatted in ShareGPT style with human/GPT message pairs in JSONL — chosen to illustrate how fine-tuning gives a base model precise, deep knowledge in a niche domain it would otherwise answer only superficially.
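The description doesn't show the actual Gandhara records, but a ShareGPT-style JSONL file like the one described stores each sample as a list of human/GPT message turns, one JSON object per line. A minimal sketch with a hypothetical Q&A pair:

```python
import json

# Hypothetical record (the real dataset's text is not shown in the description):
# ShareGPT style keeps each sample's turns under a "conversations" key,
# alternating "human" and "gpt" messages.
record = {
    "conversations": [
        {"from": "human",
         "value": "Where was the Gandhara civilization centered?"},
        {"from": "gpt",
         "value": "Gandhara was centered around the Peshawar valley in "
                  "present-day northwestern Pakistan."},
    ]
}

# One such object per line yields the JSONL file the trainer consumes.
line = json.dumps(record, ensure_ascii=False)
print(line)
```

A dataset of roughly 100 pairs is simply ~100 of these lines; the chat-template step later maps the `from`/`value` fields onto the model's own role tokens.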

The E2B model has 5.1 billion total parameters but only an effective 2.3 billion active during inference thanks to per-layer embeddings, and the fine-tuning run stays under 8GB of VRAM (Mirza demonstrates on an NVIDIA H100). He walks through the complete pipeline: setting up a conda environment, installing Unsloth and PyTorch, applying LoRA adapters (which freeze the base model and add small trainable layers to the attention and MLP modules), formatting the data with Gemma 4's chat template, and configuring the trainer. Every hyperparameter is explained in plain terms: batch size 2 for memory safety, gradient accumulation steps of 4 for an effective batch size of 8, and three full epochs over the dataset, so no ML background is needed to follow along.
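The LoRA idea summarized above, freezing the base weights and training only a small low-rank correction, can be sketched in plain NumPy. This is a conceptual illustration, not Unsloth's implementation, and the dimensions and rank below are made up:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 512, 512, 16   # illustrative sizes, not Gemma's real dims

W = rng.standard_normal((d_out, d_in))        # frozen base weight (no gradients)
A = rng.standard_normal((rank, d_in)) * 0.01  # trainable low-rank down-projection
B = np.zeros((d_out, rank))                   # trainable up-projection, zero-init
                                              # so training starts from the base model

def lora_forward(x):
    # Base output plus the low-rank correction; only A and B would be trained.
    return W @ x + B @ (A @ x)

# The adapter is tiny next to the frozen layer it augments:
print(f"frozen: {W.size:,}  trainable: {A.size + B.size:,} "
      f"({100 * (A.size + B.size) / W.size:.1f}%)")
```

The same arithmetic applies to the trainer settings: a per-device batch size of 2 with gradient accumulation of 4 means gradients from 4 micro-batches are summed before each optimizer step, giving the effective batch size of 8 mentioned above.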

The resulting fine-tuned output is small (only the LoRA delta between the base and adapted model is saved), and Mirza notes it can be merged into the base model with a single command. The video is a practical end-to-end reference for developers who want to specialize an efficient open-weights model on domain-specific data without expensive cloud infrastructure.
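The one-command merge works because the low-rank delta can be folded back into the base weight once training is done. A NumPy sketch of the idea (not the actual Unsloth merge command):

```python
import numpy as np

rng = np.random.default_rng(1)
d, r = 256, 8                                  # illustrative sizes

W = rng.standard_normal((d, d))                # frozen base weight
A = rng.standard_normal((r, d)) * 0.01         # "trained" LoRA factors (random here)
B = rng.standard_normal((d, r)) * 0.01

x = rng.standard_normal(d)

adapter_out = W @ x + B @ (A @ x)   # base + adapter path used during training
W_merged = W + B @ A                # fold the delta into the base weight once
merged_out = W_merged @ x           # single matmul, no extra layers at inference

print(np.allclose(adapter_out, merged_out))  # → True
```

This is also why the saved artifact is small: only `A` and `B` (the delta) need to be stored, not a second full copy of `W`.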


📺 Source: Fahd Mirza · Published April 03, 2026
🏷️ Format: Tutorial Demo
