TLMs: Tiny LLMs and Agents on Edge Devices with LiteRT-LM — Cormac Brick, Google


Description:

Cormac Brick, tech lead for Google AI Edge with a decade of experience spanning Intel NPU architecture and Google’s Pixel AI features, delivers a detailed technical presentation on running LLMs and agentic workloads on mobile and edge devices. The talk centers on two product areas: LiteRT-LM, Google’s LLM inference runtime for Android, iOS, and other edge platforms, and a new agent skills framework built on top of the latest Gemma 4 models that enables on-device agentic behavior without a cloud round-trip.

Brick explains the distinction between small language models (SLMs, roughly hundreds of millions to a few billion parameters) and tiny language models (TLMs, sub-100M), walking through the performance profiles Google observes across device classes. He covers Gemma 4’s launch the prior week alongside an Android and iOS reference app, performance numbers on mobile hardware, and how agent skills are structured — each skill is a self-contained unit of JavaScript, a spec file, and optional API credentials, with an orchestrator model routing user intent to the appropriate skill. Google’s internal team built approximately 80 skills using this pattern, with Gemini CLI and Claude Code as the primary authoring tools.
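The skill structure described above can be sketched roughly as follows. This is an illustrative toy, not the actual LiteRT-LM or agent-skills API: every name here (`weatherSpec`, `routeIntent`, `handle`) is hypothetical, and the keyword-matching router stands in for the orchestrator model the talk describes.

```javascript
// Hypothetical illustration of the skill layout from the talk:
// each skill bundles a spec, a JavaScript entry point, and (optionally) credentials.

// Spec file (e.g. weather.skill.json), shown inline for brevity.
const weatherSpec = {
  name: "get_weather",
  description: "Fetch the current weather for a city",
  parameters: { city: "string" },
};

// The skill's JavaScript implementation, stubbed out here.
// A real skill would call an external API using its bundled credentials.
async function getWeather({ city }) {
  return `Sunny in ${city}`;
}

const skills = [{ spec: weatherSpec, run: getWeather }];

// Toy router: match user intent to a skill by keyword.
// In the talk, an orchestrator *model* performs this routing step.
function routeIntent(utterance) {
  const u = utterance.toLowerCase();
  return skills.find((s) => u.includes(s.spec.name.split("_").pop())) ?? null;
}

async function handle(utterance) {
  const skill = routeIntent(utterance);
  if (!skill) return "No matching skill";
  return skill.run({ city: "Dublin" }); // parameter extraction elided
}
```

Because each skill is self-contained (spec, code, credentials), an authoring tool such as Gemini CLI or Claude Code can generate one skill at a time without touching the rest of the catalog, which is presumably how the internal team scaled to roughly 80 skills.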

The second half focuses on fine-tuning and deploying tiny models to edge devices, ending with a real shipped application built using TLM technology. The talk is directly relevant to anyone building offline-capable, latency-sensitive, or privacy-preserving AI features on Android, iOS, or embedded platforms using Google’s open toolchain.


📺 Source: AI Engineer · Published May 03, 2026
🏷️ Format: Deep Dive
