Descriptions:
Nate B Jones of AI News & Strategy Daily breaks down the latest results from METR (Model Evaluation and Threat Research), the nonprofit benchmark organization known for its Personal Task Runtime (PTR) graph measuring how long AI agents can perform useful work autonomously. Unlike capped benchmarks such as SWE-bench, PTR has no upper ceiling, making it uniquely suited to tracking long-horizon agentic progress.
The centerpiece of the analysis is Anthropic’s Claude Opus 4.5, which METR clocked at roughly 4 hours and 45 minutes of human-equivalent work at a 50% success rate, and at 27 to 28 minutes at the stricter 80% threshold. Jones argues this data points to a super-exponential growth curve, with AI agentic capability doubling approximately every four to four-and-a-half months, a pace he distinguishes sharply from ordinary exponential growth.
The practical implications Jones draws are wide-ranging: if the doubling rate holds, agents capable of a full week of autonomous work could be a reality by late 2026. He frames skill in delegating to AI agents as a compounding career advantage, arguing that power-law distributions of productivity will increasingly separate those who master agentic workflows early from those who wait. The video is a useful orientation for anyone trying to contextualize where frontier model capability stands on long-duration task performance.
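The "full week by late 2026" projection can be sanity-checked with a little arithmetic. The sketch below is illustrative only and is not taken from the video or from METR. It assumes that a "full week" means a 40-hour work week, that the starting point is the 4 hours 45 minutes reported at the 50% success rate, and that the doubling time stays fixed at either end of the four to four-and-a-half month range. The function name `months_to_reach` is hypothetical.

```python
import math


def months_to_reach(target_hours: float, start_hours: float, doubling_months: float) -> float:
    """Months for a task horizon to grow from start_hours to target_hours,
    assuming it doubles once every doubling_months (constant doubling time)."""
    doublings = math.log2(target_hours / start_hours)
    return doublings * doubling_months


START_HOURS = 4.75        # 4 h 45 min at 50% success, as quoted in the summary
WORK_WEEK_HOURS = 40.0    # assumption: "a full week" means a 40-hour work week

for doubling_months in (4.0, 4.5):  # the range Jones cites
    months = months_to_reach(WORK_WEEK_HOURS, START_HOURS, doubling_months)
    print(f"doubling every {doubling_months} months -> about {months:.1f} months")
```

Under these assumptions, getting from 4.75 to 40 hours takes about three doublings, or roughly 12 to 14 months, which is in the neighborhood of the late-2026 estimate. A different reading of "a full week" (for example, 168 continuous hours) would push the date out by about two more doublings.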
📺 Source: AI News & Strategy Daily | Nate B Jones · Published December 29, 2025
🏷️ Format: Opinion Editorial






