Description:
Sam Witteveen delivers a hands-on review of Gemini 3 Flash following roughly a week of early access provided by Google DeepMind ahead of the public launch. The core finding: Gemini 3 Flash is substantially stronger than its predecessor Gemini 2.5 Flash and broadly competitive with Gemini 2.5 Pro, while outperforming Gemini 3 Pro on at least one key benchmark — SWE-bench Verified. On Humanity’s Last Exam, Flash scores 33.7% versus Pro’s 37.5%; on GPQA Diamond and MMMU Pro the gap is similarly narrow. Witteveen attributes the Flash model’s edge in some evals to better post-training tuning rather than superior base intelligence, expecting Gemini 3 Pro to close the gap at general availability.
Pricing lands at $0.50 per million input tokens and $3.00 per million output tokens across all context lengths — up from $0.30 input on the previous Flash — but token efficiency partially offsets the increase: the model consistently uses fewer output tokens than both Gemini 2.5 Flash and 2.5 Pro to complete equivalent tasks. Google's own Antigravity IDE and the Gemini CLI are both planning to adopt Gemini 3 Flash as their primary model.
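The pricing-versus-efficiency trade-off above is easy to sanity-check with arithmetic. The sketch below uses the per-million-token rates quoted in the video; the token counts are invented purely to illustrate how lower output usage can offset the higher input rate.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float = 0.50, output_price: float = 3.00) -> float:
    """Estimate the cost in USD of one request at per-million-token rates.

    Defaults are the Gemini 3 Flash rates quoted in the video
    ($0.50/M input, $3.00/M output).
    """
    return input_tokens / 1e6 * input_price + output_tokens / 1e6 * output_price

# Hypothetical task: 100K input tokens; assume Gemini 3 Flash needs
# fewer output tokens than 2.5 Flash (at its $0.30/M input rate) for the same job.
flash_3_cost = request_cost(100_000, 20_000)                     # $0.11
flash_25_cost = request_cost(100_000, 30_000, input_price=0.30)  # $0.12
```

Under these illustrative numbers the newer model comes out slightly cheaper per task despite the higher input rate, which is the token-efficiency point Witteveen makes.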
The video walks through practical code examples using the Gemini API with Pydantic-based structured outputs: extracting action items and decisions from meeting transcripts, analyzing food images to generate full recipes with calorie estimates, and handling PDF and audio inputs. Witteveen recommends the model as the new default daily-driver for production app developers who need strong multimodal extraction capability at a price point below the Pro tier.
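The structured-output pattern described above can be sketched with Pydantic alone. The schema below is illustrative (the field names are not taken from the video); it shows the shape of a meeting-transcript extraction and validates a sample JSON payload of the kind a schema-constrained model response would return.

```python
from pydantic import BaseModel


class ActionItem(BaseModel):
    owner: str
    task: str


class MeetingSummary(BaseModel):
    decisions: list[str]
    action_items: list[ActionItem]


# Sample payload standing in for a schema-constrained model response.
raw = (
    '{"decisions": ["Ship the beta on Friday"],'
    ' "action_items": [{"owner": "Ana", "task": "Update the docs"}]}'
)
summary = MeetingSummary.model_validate_json(raw)
print(summary.action_items[0].owner)  # Ana
```

With the google-genai Python SDK, a Pydantic class like `MeetingSummary` can be passed as the `response_schema` (with `response_mime_type="application/json"`) in the request config so the model's output is constrained to this shape.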
📺 Source: Sam Witteveen · Published December 17, 2025
🏷️ Format: Review
