Description:
CES 2026 may be remembered less for its gadgets than for the moment the AI industry's shift to an industrial footing became impossible to ignore. Nate B. Jones breaks down what Nvidia's announcements (particularly the Vera Rubin rack-scale platform) actually signal about where AI infrastructure investment is heading, and why inference, not training, is now the dominant cost center for every major AI lab.
The Rubin platform is a six-chip rack-scale system that Nvidia claims cuts inference token generation costs by a factor of 10 while supporting 10-million-token context windows. Crucially, the platform ships with a dedicated inference context memory storage tier — essentially externalizing the KV cache from the GPU itself — which Jones reads as an explicit acknowledgment that inference scaling is now as much a memory and data-movement problem as a compute problem. Sam Altman’s October 2025 figure of 800 million weekly active ChatGPT users illustrates the permanent serving load that now dwarfs any individual training run.
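To make the memory-and-data-movement point concrete, here is a rough KV-cache sizing sketch in Python. Every model parameter below is an illustrative assumption (a hypothetical fp16 decoder with 80 layers, 8 grouped-query KV heads, and a head dimension of 128), not a figure from the video or from Nvidia:

```python
# Back-of-envelope KV-cache sizing. All model parameters here are
# illustrative assumptions, not figures from the video or from Nvidia.

BYTES_FP16 = 2      # bytes per element in fp16/bf16
NUM_LAYERS = 80     # hypothetical decoder depth
NUM_KV_HEADS = 8    # hypothetical grouped-query-attention KV heads
HEAD_DIM = 128      # hypothetical per-head dimension

def kv_cache_bytes(seq_len: int) -> int:
    """KV-cache size for one sequence: K and V tensors, each of
    shape [layers, kv_heads, seq_len, head_dim], in fp16."""
    per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * BYTES_FP16
    return per_token * seq_len

for ctx in (128_000, 1_000_000, 10_000_000):
    print(f"{ctx:>12,} tokens -> {kv_cache_bytes(ctx) / 1e9:,.1f} GB")
```

Under these assumptions the cache grows from roughly 42 GB at a 128K-token context to about 3.3 TB at 10 million tokens, far beyond any single GPU's HBM. That is the arithmetic behind moving inference context into its own storage tier rather than squeezing it into GPU memory.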
Jones connects the hardware story to OpenAI's supply chain positioning: a $38 billion AWS capacity lock, a multi-billion-dollar CoreWeave deal, and the Stargate project's new partnership with Samsung and SK Hynix targeting 900,000 DRAM wafers per month. Reuters data showing DRAM prices up over 300% in Q4 2025 underscores how tight the supply chain has become, and why the companies that locked in capacity agreements early are structurally advantaged heading into the scale-out phase of 2026.
📺 Source: AI News & Strategy Daily | Nate B Jones · Published January 08, 2026
🏷️ Format: News Analysis