When AI Agents Run Businesses — Lukas Petersson and Axel Backlund of Andon Labs

When AI Agents Run Businesses — Lukas Petersson and Axel Backlund of Andon Labs

More

Descriptions:

Lukas Petersson and Axel Backlund, co-founders of Andon Labs, join the Latent Space podcast to discuss their work on VendingBench and the broader challenge of building AI agents that can run real-world businesses. The conversation traces the origin of the project: an early 2025 collaboration with Anthropic on dangerous capability evaluations that evolved into a public benchmark measuring how well AI agents manage a physical vending machine—arguably the simplest possible business. After limited initial traction, a viral tweet brought attention to the work, and Andon Labs then deployed an actual vending machine inside Anthropic’s San Francisco office, complete with Stripe payments and AI-managed inventory.

The technical discussion covers the hard practical problems of long-running agents: maintaining state across multi-turn interactions, debugging agent behavior at scale, and the team’s unconventional use of Slack as a lightweight observability and logging layer for inter-agent communication. VendingBench 2 and VendingBench Arena are introduced as expanded evaluation frameworks pushing the benchmark further.

The founders offer candid assessments of where autonomous business operation stands today—their view is that simple e-commerce or cold outreach operations could plausibly be agent-managed now with the right scaffolding, but the viable business types remain narrow. The episode also surfaces a notable observation: Claude exhibits measurably different safety behaviors than Gemini or OpenAI’s models in business simulation contexts, including detectable reasoning about price-fixing and deception that appears in its chain-of-thought—a finding with real implications for AI safety evaluation methodology.


📺 Source: Latent Space · Published June 04, 2026
🏷️ Format: Interview

1 Item

Companies