Description:
In a wide-ranging interview filmed at the Google Cloud campus, Matthew Berman sits down with Google Cloud CEO Thomas Kurian to discuss how Google has managed to supply compute to its own models, to Anthropic and OpenAI, and to enterprise customers simultaneously, even as competitors describe themselves as perpetually compute-constrained. Kurian traces this back to infrastructure decisions made over a decade ago: locking in real estate for data centers, shifting from construction to manufacturing, diversifying energy sources, and building its own silicon across eleven successive generations of TPUs.
The conversation covers the 8th generation TPU lineup in detail. Kurian confirms that for the first time Google has split the chip family into two distinct products — one optimized primarily for training, one for inference — with the inference chip, 8i, seeing demand far beyond initial projections. He also discusses Mythos, widely rumored to be the first 10 trillion parameter model, and addresses how Google balances monetization across selling raw TPU access, hosting inference for other AI labs, and running its own Gemini models. A notable subplot is the expansion of TPUs into capital markets, with Citadel cited as a firm shifting algorithmic trading workloads from numerical computation to inference.
Kurian also weighs in on Google's relationship with NVIDIA, the "goodput" metric Google uses to measure effective throughput on TPU clusters, and what he considers the industry's next major bottleneck. For anyone tracking AI infrastructure competition — who owns the silicon, who controls the supply chain, and how that shapes the frontier model race — this is one of the more substantive executive interviews available.
📺 Source: Matthew Berman · Published April 24, 2026
🏷️ Format: Interview