How People Actually Use AI Agents


Description:

An in-depth breakdown of Anthropic’s research paper ‘Measuring AI Agent Autonomy in Practice,’ which draws on real-world usage data from Claude Code and the public API to study how people actually deploy AI agents—as opposed to the idealized conditions captured by the widely cited METR benchmark. The episode explains the key methodological distinction: METR measures what a model can accomplish without human interaction, while Anthropic’s study tracks tool usage, session duration, and supervision behaviors as they occur in live workflows.

Several findings stand out. Median Claude Code session length grew from roughly 25 minutes to over 45 minutes across model releases, then dipped back toward 40 minutes in early 2026, a period in which the Claude Code user base doubled. Anthropic attributes the dip partly to this influx of new users and partly to a return to more constrained work tasks after the holidays. On supervision, new users enable full auto-approval about 20% of the time versus roughly 40% for experienced users; yet experienced users also interrupt Claude mid-task at nearly double the rate of newcomers (9% vs. 5%), suggesting that as trust accumulates, oversight shifts from pre-action gatekeeping to in-flight course correction.
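
To make the supervision metrics concrete, here is a minimal Python sketch of how rates like these could be computed from session logs. The `Session` fields, the cohort split, and the rate definitions are illustrative assumptions for this sketch, not Anthropic's actual telemetry schema or methodology.

```python
from dataclasses import dataclass
from statistics import median

@dataclass
class Session:
    """One agent session. Field names are hypothetical, not Anthropic's schema."""
    duration_min: float   # wall-clock session length in minutes
    auto_approve: bool    # user enabled full auto-approval for this session
    interrupted: bool     # user interrupted the agent mid-task
    experienced: bool     # user is past some assumed tenure threshold

def supervision_summary(sessions: list[Session]) -> dict:
    """Median session length plus auto-approval and interruption rates."""
    if not sessions:
        return {}
    n = len(sessions)
    return {
        "median_minutes": median(s.duration_min for s in sessions),
        "auto_approval_rate": sum(s.auto_approve for s in sessions) / n,
        "interruption_rate": sum(s.interrupted for s in sessions) / n,
    }

def by_cohort(sessions: list[Session]) -> dict[str, dict]:
    """Split into the two cohorts the episode contrasts and summarize each."""
    new = [s for s in sessions if not s.experienced]
    vet = [s for s in sessions if s.experienced]
    return {
        "new_users": supervision_summary(new),
        "experienced_users": supervision_summary(vet),
    }
```

On data like that described in the episode, `by_cohort` would show new users with a lower auto-approval rate (~20% vs. ~40%) but also a lower interruption rate (~5% vs. ~9%) than experienced users.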

The episode positions these findings within a broader argument: agent autonomy is not purely a function of model capability but of the full sociotechnical context—user trust, task type, and interaction design. Claude Code is described as arguably the first AI agent with genuine product-market fit, making its usage data a uniquely valuable window into how agentic AI is actually being adopted.


📺 Source: The AI Daily Brief: Artificial Intelligence News · Published February 19, 2026
🏷️ Format: Deep Dive
