What the Freakiness of 2025 in AI Tells Us About 2026

Descriptions:

AI Explained delivers a ten-takeaway retrospective on 2025’s most consequential AI developments, paired with five forward-looking observations for 2026. The video opens with an assessment of reasoning models — the year’s defining paradigm shift — noting that while thinking longer measurably boosts accuracy on hard problems, it may reduce output diversity and does not appear to produce reasoning paths that weren’t already latent in the underlying base model. The host also references a Demis Hassabis interview clarifying that scaling hasn’t hit a wall so much as entered a regime of diminishing — but still substantial — returns.

A significant portion of the runtime is devoted to a careful methodological critique of the METR benchmark, which estimates how long a task AI systems can complete autonomously and has been widely cited in governmental AI analyses and long-range forecasts. The host surfaces several underreported limitations: the 1-4 hour task range is derived from only 14 samples; the resulting 95% confidence interval spans 1 hour 49 minutes to 20 hours 25 minutes; Claude paradoxically performs better on 16-hour tasks than on 2-4 hour ones; and raising the success threshold from 50% to 80% markedly lowers all reported metrics. The host also warns that incentives to game a benchmark grow as it gains public prominence.
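To see why the choice of success threshold matters so much, here is a minimal Python sketch. It assumes, purely for illustration, that success probability follows a logistic curve in the log of task length; the function names, the 120-minute horizon, and the slope are hypothetical values chosen for the example, not figures from the video or from METR.

```python
import math

def success_probability(task_minutes, h50_minutes, slope):
    """Hypothetical logistic model: success falls as log2(task length) grows."""
    x = math.log2(task_minutes) - math.log2(h50_minutes)
    return 1.0 / (1.0 + math.exp(slope * x))

def horizon_at(threshold, h50_minutes, slope):
    """Task length (minutes) at which the modeled success rate equals `threshold`."""
    # Solve 1 / (1 + exp(slope * x)) = threshold for x, then undo the log2.
    x = math.log((1.0 - threshold) / threshold) / slope
    return h50_minutes * 2.0 ** x

# Illustrative inputs only (assumptions, not reported results).
h50 = 120.0   # task length with a 50% modeled success rate
slope = 0.6   # steepness of the decline per doubling of task length

h80 = horizon_at(0.8, h50, slope)
print(f"50% horizon: {h50:.0f} min, 80% horizon: {h80:.1f} min")
```

Under these assumed values the 80% horizon comes out far shorter than the 50% horizon, which is the kind of gap the host points to. A shallower slope widens the gap further, so headline numbers depend heavily on which threshold is quoted.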

Other milestones covered include Google DeepMind’s Genie 3 (dynamic playable worlds from text or image prompts at 720p), Veo 3.1 and Sora 2 for video generation, and the mainstreaming of AI-generated social content that fools millions of viewers. The host argues that AI slop has now crossed from novelty to cultural background noise.


📺 Source: AI Explained · Published December 23, 2025
🏷️ Format: Opinion Editorial
