Google just dropped Gemma 4… (WOAH)

Description:

Matthew Berman delivers a thorough breakdown of Google’s Gemma 4 model family, covering all four released sizes: effective 2B and 4B parameter models designed for on-device deployment, a 26B mixture-of-experts model with 4 billion active parameters, and a 31B dense model that currently ranks third among all open-weights models on the LMArena text leaderboard, behind only GLM-5 and Kimi K2.5, both of which are dramatically larger.

Berman explains the ‘effective parameter’ (E2B/E4B) nomenclature, which refers to per-layer embedding tables that maximize parameter efficiency for edge deployments without increasing layer count. He plots every model on a parameter-count versus Elo-score chart, showing Gemma 4 31B performing comparably to Qwen 3.5 (a 397B/17B-active MoE) at a fraction of the size. He notes that Kimi K2.5, a near-trillion-parameter model, cannot even run on an NVIDIA GB300 with 750 GB of unified memory, which makes Gemma 4’s efficiency particularly significant.
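
To make the ‘effective parameter’ idea concrete, here is a rough sketch with made-up numbers. It is not Gemma 4’s actual architecture: it only illustrates the accounting, assuming per-layer embedding tables can be kept in host memory and streamed in as needed, so that only the remaining weights count toward the model’s effective size on the accelerator.

```python
# Illustrative only: hypothetical parameter counts, not Gemma 4's real figures.
def effective_params(total_params: float, ple_params: float) -> float:
    """Parameters that must stay resident on the accelerator at inference time,
    assuming per-layer embedding (PLE) tables can be offloaded to host memory."""
    return total_params - ple_params

total = 5.5e9  # hypothetical total parameter count
ple = 1.5e9    # hypothetical per-layer embedding parameters kept off-accelerator

print(f"Total: {total / 1e9:.1f}B, effective: {effective_params(total, ple) / 1e9:.1f}B")
# -> Total: 5.5B, effective: 4.0B  (a model like this would be labeled "E4B")
```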

On capabilities, Berman highlights native function calling, structured JSON output, multimodal support across video and images, and explicit OpenClaw compatibility confirmed in Hugging Face’s launch blog. He frames Gemma 4 as the right model for a hybrid deployment strategy: frontier hosted models like Claude Opus 4.6 or GPT-5 for hard reasoning tasks, with Gemma 4 handling the majority of agentic workloads locally or via OpenRouter.
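
Since the video emphasizes native function calling and structured JSON output, the hybrid setup Berman describes can talk to Gemma 4 through any OpenAI-compatible endpoint (a local server or OpenRouter). The sketch below is illustrative only: the model id "google/gemma-4-31b", the API-key placeholder, and the get_weather tool are assumptions for the example, not confirmed names.

```python
# Minimal sketch of function calling against an OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",  # or a local server's URL
    api_key="YOUR_API_KEY",                   # placeholder
)

# Declare a tool the model may call natively (hypothetical example tool).
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="google/gemma-4-31b",  # assumed model id; check the provider's model list
    messages=[{"role": "user", "content": "What's the weather in Austin?"}],
    tools=tools,
)

# If the model chose to call the tool, its arguments arrive as structured JSON.
msg = resp.choices[0].message
if msg.tool_calls:
    print(msg.tool_calls[0].function.name)
    print(msg.tool_calls[0].function.arguments)
else:
    print(msg.content)
```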


📺 Source: Matthew Berman · Published April 03, 2026
🏷️ Format: Deep Dive
