Description:
Fahd Mirza demonstrates how to run Gemma 4 E2B Uncensored, a modified version of Google’s Gemma 4 2B model, locally using Ollama on an NVIDIA RTX 6000 with 48 GB of VRAM. The model’s refusal behavior has been removed with a technique called abliteration, and the result is released under the Apache 2.0 license for AI safety research and red-teaming applications.
The video provides a detailed technical explanation of how abliteration works: researchers feed the model approximately 400 harmful and 400 harmless prompts, capture the internal activations at each layer, and compute the difference vector, called the refusal direction, that separates compliant from non-compliant responses. This direction is then projected out of the model weights using a norm-preserving projection, which keeps the magnitude of each weight vector constant while shifting its direction away from refusal behavior. KL-divergence scores around 340–346 for the E2B variant indicate that the overall output distribution remains nearly intact; only the refusal mechanism is targeted. The tool used, Heretic, is publicly available.
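The pipeline described above can be sketched in a few lines of NumPy. This is a toy illustration under stated assumptions (random vectors standing in for real prompt activations, one small weight matrix, made-up dimensions), not Heretic's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64  # hidden dimension (toy size; real models use thousands)

# Stand-ins for per-prompt activations captured at one layer.
harmful_acts = rng.normal(size=(400, d)) + 1.5   # ~400 "harmful" prompts
harmless_acts = rng.normal(size=(400, d))        # ~400 "harmless" prompts

# Refusal direction: difference of the mean activations, normalized.
refusal_dir = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
refusal_dir /= np.linalg.norm(refusal_dir)

def ablate_norm_preserving(W, r):
    """Remove each weight row's component along unit vector r, then
    rescale every row back to its original norm."""
    orig_norms = np.linalg.norm(W, axis=1, keepdims=True)
    W_proj = W - np.outer(W @ r, r)              # orthogonal projection
    new_norms = np.linalg.norm(W_proj, axis=1, keepdims=True)
    return W_proj * (orig_norms / np.maximum(new_norms, 1e-12))

W = rng.normal(size=(128, d))                    # toy weight matrix
W_ablated = ablate_norm_preserving(W, refusal_dir)

# Rows no longer have any component along the refusal direction...
assert np.allclose(W_ablated @ refusal_dir, 0.0, atol=1e-8)
# ...but each row keeps its original magnitude.
assert np.allclose(np.linalg.norm(W_ablated, axis=1),
                   np.linalg.norm(W, axis=1))
```

The final two checks show why the norm-preserving variant is gentle on the model: the ablated weights carry no component along the refusal direction, yet every row keeps its original magnitude, which helps leave the rest of the output distribution largely undisturbed.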
Mirza frames the content explicitly for security engineers, red teamers, and AI safety researchers who need uncensored outputs for legitimate testing. The installation walkthrough covers Ollama setup, a version-upgrade fix for a model-loading compatibility error, and GGUF format options for both GPU and CPU inference. All model size variants are available for download.
📺 Source: Fahd Mirza · Published April 05, 2026
🏷️ Format: Tutorial Demo
