Open Models at Google DeepMind — Cassidy Hardin, Google DeepMind


Description:

Google DeepMind researcher Cassidy Hardin presents a detailed technical breakdown of Gemma 4, the latest generation of Google’s open-source model family launched in late April 2026. The lineup spans four sizes: two on-device ‘effective’ models (2B and 4B) optimized for phones, iPads, and laptops, alongside a 26B mixture-of-experts model and a flagship 31B dense model. All Gemma 4 models ship under an Apache 2.0 license, a deliberate move to broaden commercial accessibility.

The 31B dense model ranks third on the global LM Arena leaderboard—outperforming models more than 20 times its size—with a 256k context window and native support for reasoning, function calling, and structured JSON outputs. The 26B MoE activates only 8 of its 128 experts per forward pass, requiring just 3.8 billion active parameters during inference. Architectural changes include a 5:1 interleaved local-to-global attention layer ratio, sliding window attention (1,024 tokens for larger models), and grouped query attention to reduce memory pressure.
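The two mechanisms above can be sketched in a few lines: top-k expert routing (8 of 128 experts active per token) and a 5:1 interleave of local sliding-window and global attention layers. This is an illustrative sketch under assumed names and sizes, not Gemma 4's actual configuration or API.

```python
import numpy as np

def pick_experts(router_logits: np.ndarray, k: int = 8) -> np.ndarray:
    """Top-k MoE routing: of the 128 experts, only the k highest-scoring
    are activated for a token, which keeps active parameters small."""
    return np.argsort(router_logits)[-k:][::-1]

def layer_schedule(num_layers: int, local_to_global: int = 5) -> list:
    """A 5:1 interleave: five local (sliding-window) attention layers
    for every one global attention layer."""
    pattern = ["local"] * local_to_global + ["global"]
    return [pattern[i % len(pattern)] for i in range(num_layers)]

def sliding_window_visible(q: int, k: int, window: int = 1024) -> bool:
    """Causal sliding-window attention: query position q may attend to
    key position k only if k falls within the last `window` positions."""
    return 0 <= q - k < window

schedule = layer_schedule(12)
print(schedule[:6])  # five 'local' layers, then one 'global'
```

With a 1,024-token window, a query at position 2,000 can attend to position 1,500 but not to position 500, bounding per-layer attention memory regardless of the 256k context length.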

Hardin goes deep on Per Layer Embeddings (PLE), the key innovation enabling the effective models' on-device efficiency. PLE adds a dedicated 256-dimensional embedding table per layer, stored in flash memory rather than VRAM—dramatically reducing the memory footprint that constrains mobile inference. The 2B effective model carries 35 layers and the 4B carries 42, with token representations refined at each stage. This architecture allows the effective models to significantly outperform prior Gemma generations at the same scale.
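The PLE idea as described can be sketched as follows: each layer owns a small embedding table that stays in flash, and only the rows for the current tokens are fetched per layer, so the tables never need to sit in accelerator memory. The in-memory list standing in for flash-resident tables, and all names here, are illustrative assumptions, not Gemma's actual implementation.

```python
import numpy as np

VOCAB, PLE_DIM, NUM_LAYERS = 1000, 256, 35  # the 2B model carries 35 layers

rng = np.random.default_rng(0)
# Stand-in for flash-resident per-layer tables; a real on-device runtime
# would memory-map these from storage instead of holding them in VRAM.
flash_tables = [rng.standard_normal((VOCAB, PLE_DIM), dtype=np.float32)
                for _ in range(NUM_LAYERS)]

def per_layer_embedding(layer: int, token_ids: np.ndarray) -> np.ndarray:
    """Fetch only the rows this layer needs for the current tokens —
    a small working set per step, regardless of total table size."""
    return flash_tables[layer][token_ids]

tokens = np.array([3, 17, 42])
hidden = np.zeros((len(tokens), PLE_DIM), dtype=np.float32)
for layer in range(NUM_LAYERS):
    # Token representations are refined at each stage: inject the
    # layer's dedicated embedding before the (omitted) transformer block.
    hidden = hidden + per_layer_embedding(layer, tokens)

print(hidden.shape)  # (3, 256)
```

The per-step fetch is `len(tokens) × 256` floats per layer, which is why offloading the full tables to flash trades negligible bandwidth for a large VRAM saving.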


📺 Source: AI Engineer · Published April 27, 2026
🏷️ Format: Keynote Launch
