Description:
DeepInfra, a purpose-built AI inference cloud, has raised $107 million in a funding round backed by NVIDIA, Samsung, and Super Micro. In this Bloomberg Technology interview, CEO Nicolas outlines the company’s strategy for scaling inference infrastructure and driving down cost per token across a growing open-source model ecosystem.
The company currently processes 5 trillion tokens per week across eight data centers, with plans to expand across the US and into Europe and Asia later in 2026. Nicolas explains how efficiency gains come from full-stack optimization — from data center selection and cluster architecture to software, with KV caching highlighted as especially critical as agentic workloads generate large volumes of repeated context-heavy requests.
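The KV-caching point can be illustrated with a minimal sketch (this is a generic prefix-caching toy, not DeepInfra's implementation; the `PrefixKVCache` class and its behavior are assumptions for illustration): agentic workloads repeatedly resend the same long system prompt, so caching the key-value state computed during prefill lets every request after the first skip that expensive pass.

```python
import hashlib

class PrefixKVCache:
    """Toy prefix cache: maps a prompt prefix to its (simulated) KV state.
    Hypothetical illustration of the general prefix/KV-caching idea."""
    def __init__(self):
        self.store = {}
        self.hits = 0
        self.misses = 0

    def _key(self, prefix: str) -> str:
        # Hash the prefix so long prompts make compact cache keys.
        return hashlib.sha256(prefix.encode()).hexdigest()

    def get_or_compute(self, prefix: str):
        k = self._key(prefix)
        if k in self.store:
            self.hits += 1          # prefill skipped: KV state reused
            return self.store[k]
        self.misses += 1
        # Stand-in for the expensive prefill pass that builds KV tensors.
        kv_state = f"kv({len(prefix)} chars)"
        self.store[k] = kv_state
        return kv_state

# A long system prompt shared across every turn of an agent loop.
system_prompt = "You are an agent with access to these tools... " * 50
cache = PrefixKVCache()
for user_turn in ["plan", "search", "summarize"]:
    cache.get_or_compute(system_prompt)

print(cache.hits, cache.misses)  # → 2 1
```

Only the first request pays the prefill cost; the two later turns hit the cache, which is why repeated context-heavy agentic traffic benefits so much from this technique.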
The conversation also covers supply chain pressures that have intensified since early 2026, including shortages of GPUs, high-bandwidth memory, and storage — areas where strategic investors like Samsung and Super Micro provide supply access advantages. Nicolas also addresses the competitive landscape, noting that while Cerebras is pursuing an IPO and positioning itself as an NVIDIA alternative, DeepInfra is doubling down on NVIDIA hardware while focusing on inference efficiency. The interview offers a clear window into how purpose-built inference clouds are differentiating on infrastructure depth rather than model ownership.
📺 Source: Bloomberg Technology · Published May 04, 2026
🏷️ Format: Interview