Description:
Sam Witteveen introduces MirrorThinker 1.5, a pair of open-weight models (235B and 30B) built on the Qwen Mixture-of-Experts architecture and released under an MIT license, designed specifically for long-horizon agentic research tasks. The central claim, that MirrorThinker's 30B model with only 3 billion active parameters can match trillion-parameter models like Kimi K2 on tool-heavy benchmarks, frames the video's core argument: a shift away from massive monolithic models toward smaller, tool-capable ones.
The video includes a full notebook walkthrough running MirrorThinker 235B as a vLLM server on an A100 80GB GPU, exposed through vLLM's OpenAI-compatible API. Witteveen builds a custom agent loop from scratch (no LangChain or other frameworks), wiring in DuckDuckGo search, BeautifulSoup web fetching, a Python code executor, and date/calculator utilities. He explains the model's context management strategy for handling long multi-step runs and its design target of 400 tool calls per session.
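The serve-then-loop pattern he walks through maps onto a few lines of standard tooling. Below is a minimal sketch, not the video's notebook: the served model name, port, and single-tool schema are placeholder assumptions, and it uses the real `openai` and `duckduckgo_search` packages against vLLM's OpenAI-compatible endpoint.

```python
# Assumed serving step (vLLM's OpenAI-compatible server), e.g.:
#   vllm serve MirrorThinker/MirrorThinker-1.5-30B --max-model-len 262144
# Model path above is a placeholder, not confirmed by the video.
import json

from duckduckgo_search import DDGS  # pip install duckduckgo-search
from openai import OpenAI           # pip install openai

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

def web_search(query: str) -> str:
    """Return the top DuckDuckGo results as a JSON string."""
    with DDGS() as ddgs:
        return json.dumps(list(ddgs.text(query, max_results=5)))

TOOLS = [{
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web with DuckDuckGo.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

messages = [{"role": "user", "content": "Summarize recent MoE serving work."}]
for _ in range(10):  # toy cap; the video's design target is ~400 calls/session
    resp = client.chat.completions.create(
        model="MirrorThinker-1.5-30B",  # assumed served-model name
        messages=messages,
        tools=TOOLS,
    )
    msg = resp.choices[0].message
    messages.append(msg)
    if not msg.tool_calls:  # no tool requested: the model answered directly
        print(msg.content)
        break
    for call in msg.tool_calls:  # run each requested tool, feed results back
        args = json.loads(call.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": web_search(**args),
        })
```

A real research agent adds the video's other tools (page fetching, a code executor, date/calculator helpers) as further entries in `TOOLS`, plus the context-trimming the model needs to stay inside its window across hundreds of calls.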
Benchmark comparisons show MirrorThinker competing with DeepSeek V3.2, Kimi K2 Thinking, GLM 4.7, and MiniMax models, with state-of-the-art results on browser-use evaluations. The 256K-token context window and the MIT license together make it a strong candidate for teams building research agents or high-output generation pipelines that need cost-effective local or cloud deployment.
📺 Source: Sam Witteveen · Published January 07, 2026
🏷️ Format: Hands On Build