Descriptions:
Michele Catasta, head of AI at Replit, argues at the AI Engineer conference that the most consequential frontier for autonomous coding agents is not raw capability but accessibility to non-technical users. Most coding agents assume a developer in the loop who can evaluate outputs and course-correct. Replit Agent 3, by contrast, is designed for knowledge workers who cannot make technical decisions and have little tolerance for QA cycles. Catasta calls this the Waymo model of autonomy, as opposed to the Tesla FSD model, in which a licensed driver must remain ready to intervene.
The technical centerpiece of the talk is autonomous testing, a necessity when end users cannot provide structured feedback. Catasta describes a spectrum that runs from static analysis and Language Server Protocol (LSP) checks, through unit testing and API endpoint testing, to full browser-based integration testing, which Replit now uses to verify that built web applications behave correctly without human involvement. Rather than computer use (which relies on screenshots and tends to be slow and expensive), Replit’s browser testing operates through DOM abstractions, simulating user interactions efficiently enough to run continuously during development.
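The testing spectrum above can be pictured as a ladder that runs the cheapest checks first and stops at the first failure. The following is a minimal sketch under stated assumptions, not Replit's implementation: the names `Check`, `first_failure`, `parses`, and `add_works` are hypothetical, and only the two cheapest rungs are shown.

```python
import ast
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Check:
    name: str                # label for the rung, e.g. "static" or "unit"
    run: Callable[[], bool]  # returns True when the check passes


def first_failure(checks: List[Check]) -> Optional[str]:
    """Run checks in order (cheapest first); return the first failing name, or None."""
    for check in checks:
        if not check.run():
            return check.name  # stop early: later rungs are slower and costlier
    return None


def parses(source: str) -> bool:
    """Cheapest rung (hypothetical): does the generated code even parse?"""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def add_works(source: str) -> bool:
    """Unit-test rung (hypothetical): execute the code and exercise one function."""
    namespace: dict = {}
    exec(source, namespace)
    return namespace["add"](2, 3) == 5


generated = "def add(a, b):\n    return a - b\n"  # deliberately buggy sample

ladder = [
    Check("static", lambda: parses(generated)),
    Check("unit", lambda: add_works(generated)),
    # API and browser rungs would follow here; omitted to stay self-contained.
]

print(first_failure(ladder))  # prints "unit": the code parses but add() is wrong
```

Ordering the rungs by cost means a syntax error never triggers a browser session, which is one plausible reason to keep the cheaper checks even when full integration testing is available.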
The agent is also designed to generate application code that is amenable to automated verification from the start, structuring apps so the testing agent can reliably exercise them. Catasta frames the broader trajectory as moving from ReAct-based agents, through native tool-calling agents, to the current third generation: long-horizon autonomous agents capable of running coherently for hours. Replit Agent 3 is Replit's current implementation of this third generation, targeting the full spectrum of knowledge workers rather than just professional developers.
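One way to read "amenable to automated verification" is that generated markup carries stable hooks, so a DOM-level tester can work from a compact structural summary instead of screenshots. The sketch below illustrates that idea with Python's standard `html.parser`. It is an assumption for illustration only: the `data-testid` convention, the `InteractiveElementCollector` class, and the `summarize_dom` helper are hypothetical and are not described in the talk.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional

# Tags a simulated user could click or type into (an illustrative subset).
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


class InteractiveElementCollector(HTMLParser):
    """Collects interactive elements and the attributes a tester could target."""

    def __init__(self) -> None:
        super().__init__()
        self.elements: List[Dict[str, Optional[str]]] = []

    def handle_starttag(self, tag, attrs):
        if tag in INTERACTIVE_TAGS:
            attr_map = dict(attrs)
            self.elements.append({
                "tag": tag,
                "id": attr_map.get("id"),
                "test_id": attr_map.get("data-testid"),  # assumed stable hook
                "type": attr_map.get("type"),
            })


def summarize_dom(html: str) -> List[Dict[str, Optional[str]]]:
    """Return a compact list of interactive elements found in the HTML."""
    collector = InteractiveElementCollector()
    collector.feed(html)
    return collector.elements


page = (
    '<form><input id="email" type="email" data-testid="email-input">'
    '<button id="submit" data-testid="submit-btn">Sign up</button></form>'
    "<p>Welcome</p>"
)

for element in summarize_dom(page):
    print(element)
```

A summary like this is far smaller than a screenshot and names each target explicitly, which is consistent with the description's point that DOM-based testing can be cheap enough to run continuously.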
📺 Source: AI Engineer · Published December 22, 2025
🏷️ Format: Deep Dive







