AI Dev 26 x SF | Andrew Filev: Multi Model Pipelines—How to Get Better AI Results for Less

Descriptions:

Andrew Filev, CEO of Zencoder, presents findings from his company’s in-house applied research lab at AI Dev SF 2026, sharing the results of experiments that ran across roughly 50 engineers before rollout to customers. The central thesis: treating AI coding as a system-engineering problem, rather than simply picking the most powerful model, yields dramatically better results at far lower cost.

The talk introduces a plan-implement-review pipeline in which the planner uses the best available model (Opus 4.6 or GPT 5.5 at the time of the experiments) while the implementation step is handed off to cheaper, faster models. Counterintuitively, Gemini Flash outperformed Opus on SWE-bench Pro’s hardest problems when given a high-quality plan, and it resolved additional issues that Opus missed, a result attributed to model diversity rather than raw capability. Filev quantifies the financial stakes: teams using Opus for most tasks spend roughly $2,000 per engineer per month in API costs, making cost-aware pipeline design a business necessity, not an optimization afterthought.
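The plan-implement-review flow described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline from the talk: the `Model` callable type, `run_pipeline`, the `APPROVE` convention, the retry loop that feeds review feedback back to the implementer, and the three stub models are all hypothetical names and design choices. No vendor SDK is called; in practice each callable would wrap whichever model a team assigns to that stage (for example, the strongest model for planning and a cheaper one for implementation).

```python
from dataclasses import dataclass
from typing import Callable

# Assumption: a "model" is any callable mapping a prompt string to a reply.
# Real vendor SDK wiring is deliberately left out of this sketch.
Model = Callable[[str], str]


@dataclass
class PipelineResult:
    plan: str
    patch: str
    review: str
    approved: bool
    attempts: int


def run_pipeline(task: str, planner: Model, implementer: Model,
                 reviewer: Model, max_attempts: int = 2) -> PipelineResult:
    """Plan once with the planner, then implement and review up to max_attempts times."""
    plan = planner(f"Write a step-by-step implementation plan for:\n{task}")
    feedback = ""
    patch = review = ""
    for attempt in range(1, max_attempts + 1):
        prompt = f"Plan:\n{plan}\n\nImplement the plan."
        if feedback:
            # Hypothetical retry step: pass the reviewer's objections back.
            prompt += f"\n\nAddress this review feedback:\n{feedback}"
        patch = implementer(prompt)
        review = reviewer(
            f"Plan:\n{plan}\n\nPatch:\n{patch}\n\nReply APPROVE or list the problems."
        )
        if review.strip().upper().startswith("APPROVE"):
            return PipelineResult(plan, patch, review, True, attempt)
        feedback = review
    return PipelineResult(plan, patch, review, False, max_attempts)


# Stub models standing in for real ones, so the sketch runs offline.
def stub_planner(prompt: str) -> str:
    return "1. Add null check in parse()\n2. Add regression test"


def stub_implementer(prompt: str) -> str:
    return "patch v2" if "review feedback" in prompt else "patch v1"


def stub_reviewer(prompt: str) -> str:
    return "APPROVE" if "patch v2" in prompt else "Missing regression test"


result = run_pipeline("Fix crash on empty input",
                      stub_planner, stub_implementer, stub_reviewer)
print(result.approved, result.attempts)
```

Keeping each stage behind a plain callable means the planner, implementer, and reviewer can be swapped independently, which is one way to experiment with the cost and model-diversity trade-offs the talk describes.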

Filev also covers spec-driven development (SDD), the role of the human engineer as a system architect rather than a direct code author, and why the planning stage justifies the highest model spend. The session is grounded in real benchmark data and internal production metrics, making it a practical reference for engineering leaders evaluating how to scale AI coding responsibly.

📺 Source: DeepLearningAI · Published May 22, 2026
🏷️ Format: Deep Dive

Tags

Anthropic, Claude Code, Claude Opus 4.6, Gemini Flash, SWE-bench
