Description:
Fahd Mirza demonstrates how to run Gemma 4 E2B Uncensored, a modified version of Google’s Gemma 4 2B model, locally using Ollama on an NVIDIA RTX 6000 with 48 GB of VRAM. The model’s refusal behavior has been removed with a technique called abliteration, and the result is released under the Apache 2.0 license for AI safety research and red-teaming applications.
The video provides a detailed technical explanation of how abliteration works: researchers feed the model approximately 400 harmful and 400 harmless prompts, capture the internal activations at each layer, and compute the difference vector, called the refusal direction, that separates compliant from non-compliant responses. This direction is then projected out of the model weights using a norm-preserving projection, which keeps the magnitude of each weight vector constant while shifting its direction away from refusal behavior. KL-divergence scores around 340–346 for the E2B variant indicate that the overall output distribution remains nearly intact; only the refusal mechanism is targeted. The tool used, Heretic, is publicly available.
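The pipeline described above can be sketched in a few lines of NumPy. This is a toy illustration under stated assumptions (random vectors standing in for real prompt activations, one small weight matrix, made-up dimensions), not Heretic's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64  # hidden dimension (toy size; real models use thousands)

# Stand-ins for per-prompt activations captured at one layer.
harmful_acts = rng.normal(size=(400, d)) + 1.5   # ~400 "harmful" prompts
harmless_acts = rng.normal(size=(400, d))        # ~400 "harmless" prompts

# Refusal direction: difference of the mean activations, normalized.
refusal_dir = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
refusal_dir /= np.linalg.norm(refusal_dir)

def ablate_norm_preserving(W, r):
    """Remove each weight row's component along unit vector r, then
    rescale every row back to its original norm."""
    orig_norms = np.linalg.norm(W, axis=1, keepdims=True)
    W_proj = W - np.outer(W @ r, r)              # orthogonal projection
    new_norms = np.linalg.norm(W_proj, axis=1, keepdims=True)
    return W_proj * (orig_norms / np.maximum(new_norms, 1e-12))

W = rng.normal(size=(128, d))                    # toy weight matrix
W_ablated = ablate_norm_preserving(W, refusal_dir)

# Rows no longer have any component along the refusal direction...
assert np.allclose(W_ablated @ refusal_dir, 0.0, atol=1e-8)
# ...but each row keeps its original magnitude.
assert np.allclose(np.linalg.norm(W_ablated, axis=1),
                   np.linalg.norm(W, axis=1))
```

The final two checks show why the norm-preserving variant is gentle on the model: the ablated weights carry no component along the refusal direction, yet every row keeps its original magnitude, which helps leave the rest of the output distribution largely undisturbed.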
Mirza frames the content explicitly for security engineers, red teamers, and AI safety researchers who need uncensored outputs for legitimate testing. The installation walkthrough covers Ollama setup, a version-upgrade fix for a model-loading compatibility error, and GGUF format options for both GPU and CPU inference. All model size variants are available for download.
📺 Source: Fahd Mirza · Published April 05, 2026
🏷️ Format: Tutorial Demo
