Lessons from Trillion Token Deployments at Fortune 500s — Alessandro Cappelli, Adaptive ML

Foundation Models · 2 months ago

Description:

Alessandro Cappelli, co-founder and Chief Customer Officer of Adaptive ML, gives a conference talk at AI Engineer that draws on production deployments at Fortune 500 clients including AT&T, Manulife, and CCS. His central thesis is that reinforcement learning, rather than prompt engineering or supervised fine-tuning, is what separates AI pilots from systems that reach production and stay there. Cappelli cites a figure that 95% of generative AI pilots never make it to production, and attributes this to what he calls the “myth of the last mile”: the false assumption that the hard part is building an impressive demo, not surviving the continuous improvement cycle that follows.

Adaptive ML’s RL ops platform lets enterprises retrain and refine specialized language models in a closed loop, using real business feedback, KPIs, and LLM-as-judge reward signals. Cappelli explains that smaller RL-trained models can match frontier model performance at a fraction of the inference cost. His concrete example is AT&T, which spends millions annually on call transcript summarization and could dramatically reduce that figure by replacing a large hosted model with a compact RL-optimized one.
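As a rough illustration of how business feedback, a KPI, and an LLM-as-judge score might be folded into a single reward signal, here is a minimal Python sketch. It is not Adaptive ML's implementation: the `Feedback` fields, the `combined_reward` function, the linear weighting, and the default `judge_weight` are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    """One scored model output (all fields are hypothetical for this sketch)."""
    judge_score: float  # LLM-as-judge rating, assumed already scaled to [0, 1]
    kpi_value: float    # raw business KPI observed for this output
    kpi_min: float      # lowest KPI value considered when normalizing
    kpi_max: float      # highest KPI value considered when normalizing


def _clamp01(x: float) -> float:
    """Clamp a value into the [0, 1] range."""
    return min(1.0, max(0.0, x))


def combined_reward(fb: Feedback, judge_weight: float = 0.5) -> float:
    """Blend the judge score and the normalized KPI into one scalar reward.

    The linear blend is an assumption; the talk description does not say
    how the signals are combined.
    """
    if not 0.0 <= judge_weight <= 1.0:
        raise ValueError("judge_weight must be between 0 and 1")
    span = fb.kpi_max - fb.kpi_min
    # Guard against a degenerate KPI range before normalizing.
    kpi_norm = 0.0 if span <= 0 else _clamp01((fb.kpi_value - fb.kpi_min) / span)
    return judge_weight * _clamp01(fb.judge_score) + (1.0 - judge_weight) * kpi_norm
```

In a closed loop, a scalar like this would be computed per model output and passed to whatever RL trainer is in use. The weighting between judge and KPI is a tuning decision rather than something the description specifies.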

The talk then turns to agent training. Cappelli explains how Adaptive ML plugs RL-trained models directly into existing agentic workflows, as it did with Manulife’s established agent infrastructure, using business outcomes as reward signals. When labeled data is scarce, he describes using the RL environment itself as a synthetic data pipeline via rejection sampling, which bootstraps the first training run. A companion workshop by Adaptive ML colleagues Letizia and Joao, available on their engineering site, demonstrates the full pipeline hands-on.
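The rejection-sampling idea can be sketched in a few lines of Python. This is a generic best-of-N filter, not the pipeline shown in the talk or workshop: `rejection_sample`, its parameters, and the `generate` and `reward` callables are hypothetical stand-ins for a model's sampling call and an environment's reward function.

```python
from typing import Callable, List, Tuple


def rejection_sample(
    prompts: List[str],
    generate: Callable[[str], str],
    reward: Callable[[str, str], float],
    num_candidates: int = 4,
    threshold: float = 0.5,
) -> List[Tuple[str, str]]:
    """Build a synthetic (prompt, completion) dataset by rejection sampling.

    For each prompt, draw several candidate completions, score each with
    the reward function, and keep the best one only if it clears the
    threshold. Prompts with no acceptable candidate are dropped.
    """
    if num_candidates < 1:
        raise ValueError("num_candidates must be at least 1")
    dataset: List[Tuple[str, str]] = []
    for prompt in prompts:
        candidates = [generate(prompt) for _ in range(num_candidates)]
        scored = [(reward(prompt, c), c) for c in candidates]
        best_score, best = max(scored, key=lambda pair: pair[0])
        if best_score >= threshold:
            dataset.append((prompt, best))
    return dataset
```

The accepted pairs can then seed a first supervised or RL training run when no labeled data exists. The threshold and the number of candidates trade dataset size against quality, and both values here are placeholders.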

📺 Source: AI Engineer · Published May 12, 2026
🏷️ Format: Deep Dive


