World Models & General Intuition: Khosla’s largest bet since LLMs & OpenAI

Descriptions:

This Latent Space episode offers the first public look at General Intuition (GI), a world model startup spun out of Medal, a gaming clip platform with 12 million users and 3.8 billion clips of peak human gameplay. GI's founder, Pim de Witte, turned down a reported $500 million offer from OpenAI for the dataset and instead raised a $134 million seed round led by Khosla Ventures, described as Vinod Khosla's largest single seed bet since his original OpenAI investment. The episode includes an exclusive, audio-described preview of GI's models, whose gameplay behavior the host confirms is strikingly humanlike.

The technical discussion centers on what distinguishes world models from conventional video generation: rather than predicting the most visually likely next frame, a world model must represent the full distribution of possible outcomes given a specific action and produce the next game state accordingly. GI's current models use pure imitation learning on pixel inputs alone (no game state, no reinforcement learning, no fine-tuning) and run in real time against human opponents. The conversation covers the DIAMOND paper (a world model playable on a single RTX 4090, trained on ~87 hours of data) as the external validation that convinced several major labs to approach GI, and argues that GI's proprietary dataset plays the role for world models that Common Crawl plays for language models.
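To make the distinction between action-conditioned prediction and plain next-frame prediction concrete, here is a minimal toy sketch in Python. It is not GI's method or the DIAMOND architecture: the 1-D track, the 10% "slip" rate, and every name in it (`collect`, `ActionConditionedModel`, `FramePredictor`) are assumptions invented for illustration. It only shows that a model conditioned on the action can return a distribution over next states, while a predictor that ignores the action collapses to a single most frequent guess.

```python
import random
from collections import Counter, defaultdict


def collect(n, seed=0):
    """Generate hypothetical (state, action, next_state) transitions on a 1-D track."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        s = rng.randint(1, 8)
        a = rng.choice(["left", "right"])
        step = -1 if a == "left" else 1
        # Assumed dynamics: 10% of the time the move "slips" and the state is unchanged.
        ns = s if rng.random() < 0.1 else s + step
        data.append((s, a, ns))
    return data


class ActionConditionedModel:
    """Tabular stand-in for a world model: estimates P(next_state | state, action)."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def fit(self, data):
        for s, a, ns in data:
            self.counts[(s, a)][ns] += 1

    def distribution(self, s, a):
        c = self.counts[(s, a)]
        total = sum(c.values())
        return {ns: k / total for ns, k in c.items()}

    def sample(self, s, a, rng):
        dist = self.distribution(s, a)
        return rng.choices(list(dist), weights=list(dist.values()))[0]


class FramePredictor:
    """Action-blind baseline: returns the single most frequent next state."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def fit(self, data):
        for s, _a, ns in data:
            self.counts[s][ns] += 1

    def predict(self, s):
        return self.counts[s].most_common(1)[0][0]


data = collect(5000)
wm = ActionConditionedModel()
wm.fit(data)
fp = FramePredictor()
fp.fit(data)

print("P(next | s=4, left): ", wm.distribution(4, "left"))
print("P(next | s=4, right):", wm.distribution(4, "right"))
print("action-blind guess for s=4:", fp.predict(4))
```

The action-conditioned table assigns most of its probability to a different next state for each action and keeps the rare "slip" outcome, whereas the action-blind baseline has to commit to one answer no matter what the player does. Real world models replace the lookup table with a learned generative model over pixels, but the conditioning structure is the same idea.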

GI’s strategic roadmap draws a deliberate parallel to Anthropic’s focused dominance in coding: nail one vertical (gaming simulation), use world models to expand to adjacent domains, and build a customer base years before any lab can synthesize comparable spatial-temporal training data. Robotics and embodied AI are the stated long-term targets.


📺 Source: Latent Space · Published December 06, 2025
🏷️ Format: Interview