AI on Android: Ask me Anything — Florina Muntenescu & Oli Gaymond, Google DeepMind

Descriptions:

At the AI Engineer conference, Florina Muntenescu (developer relations engineer, Google) and Oli Gaymond (product manager for Android AI, Google DeepMind) led an interactive Q&A session on the options available to developers building AI features on Android. Their presentation walked through the tradeoffs between on-device inference, hybrid approaches, and full cloud inference, and explained when each approach makes sense for real-world applications.

The centerpiece of the technical content was Gemini Nano, Google’s most efficient on-device model, which shares its architecture with the recently released Gemma 4 but is optimized specifically for Android hardware. Developers access it through the ML Kit GenAI APIs, which include task-specific APIs for summarization, proofreading, and rewriting, as well as a flexible Prompt API that accepts text and image input. The AI Core system service hosts the model once for all apps, so individual apps do not need to bundle a 3–4 GB model; it also handles hardware-specific optimization and isolates each app’s requests for privacy. For developers who need custom models, LiteRT LM offers a lower-level alternative with more control.
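
As a rough illustration of why a shared model host matters for storage, the arithmetic can be sketched as below. This is a back-of-the-envelope model, not anything shown in the session: the class and method names are hypothetical, and the app count in the usage note is an assumed figure. Only the 3–4 GB model size comes from the description above.

```java
/** Hypothetical helper for comparing per-app bundling with one shared model copy. */
class ModelStorageEstimate {

    /** Storage used if every app ships its own copy of the model. */
    static double bundledGb(int appCount, double modelSizeGb) {
        return appCount * modelSizeGb;
    }

    /** Storage used if a single system-level copy serves all apps. */
    static double sharedGb(int appCount, double modelSizeGb) {
        // With no apps using the model, nothing needs to be stored.
        return appCount > 0 ? modelSizeGb : 0.0;
    }

    /** Storage saved by sharing instead of bundling. */
    static double savedGb(int appCount, double modelSizeGb) {
        return bundledGb(appCount, modelSizeGb) - sharedGb(appCount, modelSizeGb);
    }
}
```

For example, with an assumed five apps and a 3.5 GB model, `ModelStorageEstimate.savedGb(5, 3.5)` evaluates to 14.0: bundling would take 17.5 GB, while one shared copy takes 3.5 GB.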

The Q&A portion surfaced practical engineering concerns: battery drain under continuous inference, RAM consumption, latency when many apps share the same model concurrently, and strategies for deferring batch workloads to background processing while the device is charging. Gaymond and Muntenescu addressed each candidly, acknowledging current hardware limitations while pointing to AI Core as the platform-level abstraction that handles most optimization concerns automatically.
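
The idea of deferring batch workloads until the device is charging can be sketched in plain Java as follows. This is a minimal model under stated assumptions, not the speakers' implementation: `DeferredBatchQueue` and its methods are hypothetical names, and the "inference" step is a stand-in string operation. In a real app this gating would normally be delegated to the platform's background scheduling (for example, a requires-charging constraint in WorkManager); the description does not say which mechanism the speakers recommended.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical queue that holds batch inference jobs until the device is charging. */
class DeferredBatchQueue {
    private final Queue<String> pending = new ArrayDeque<>();

    /** Queue a job (represented here by its input text) instead of running it immediately. */
    void enqueue(String input) {
        pending.add(input);
    }

    /** Number of jobs still waiting to run. */
    int pendingCount() {
        return pending.size();
    }

    /**
     * Called when the power state changes. Drains the queue in FIFO order only
     * while charging; otherwise leaves every job pending and returns an empty list.
     */
    List<String> onPowerStateChanged(boolean isCharging) {
        List<String> processed = new ArrayList<>();
        if (!isCharging) {
            return processed;
        }
        while (!pending.isEmpty()) {
            // Stand-in for an on-device inference call (assumption, not a real API).
            processed.add("processed:" + pending.poll());
        }
        return processed;
    }
}
```

Jobs enqueued while on battery stay in the queue, and a later call to `onPowerStateChanged(true)` processes them in the order they were added. The design choice here is to trade immediacy for battery life, which only suits work the user is not actively waiting on.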


📺 Source: AI Engineer · Published May 22, 2026
🏷️ Format: Deep Dive
