Descriptions:
Andrew Filev, CEO of Zencoder, presents findings from his company’s in-house applied research lab at AI Dev SF 2026, sharing the results of experiments run internally across roughly 50 engineers before the approach was rolled out to customers. The central thesis: treating AI coding as a system-engineering problem, rather than simply picking the most powerful model, yields dramatically better results at far lower cost.
The talk introduces a plan-implement-review pipeline in which the planner uses the best available model (Opus 4.6 or GPT 5.5 at the time of the experiments) while the implementation step is handed off to cheaper, faster models. Counterintuitively, Gemini Flash outperformed Opus on SWE-Bench Pro’s hardest problems when given a high-quality plan, and it resolved additional issues that Opus missed, a result Filev attributes to model diversity rather than raw capability. Filev also quantifies the financial stakes: teams using Opus for most tasks spend roughly $2,000 per engineer per month in API costs, making cost-aware pipeline design a business necessity, not an optimization afterthought.
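For readers who want a concrete picture, the plan-implement-review split described above might be sketched roughly as follows. This is an illustration only, not the pipeline shown in the talk: the `Pipeline` class, the prompt wording, the `APPROVE` convention, and the two-round review limit are all assumptions, and each model is stood in for by a plain Python function rather than a real API client.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-in: a "model" is any function from prompt text to reply text.
ModelFn = Callable[[str], str]


@dataclass
class Pipeline:
    planner: ModelFn      # assumed: the strongest, most expensive model
    implementer: ModelFn  # assumed: a cheaper, faster model
    reviewer: ModelFn     # assumed: ideally a different model, for diversity
    max_rounds: int = 2   # assumed cap on review/revise cycles

    def run(self, task: str) -> str:
        # 1. Plan once with the expensive model.
        plan = self.planner(f"Write a step-by-step plan for: {task}")
        # 2. Implement the plan with the cheap model.
        patch = self.implementer(f"Implement this plan:\n{plan}")
        # 3. Review; revise with the cheap model until approved or out of rounds.
        for _ in range(self.max_rounds):
            verdict = self.reviewer(
                f"Review the patch against the plan.\nPLAN:\n{plan}\nPATCH:\n{patch}"
            )
            if verdict.strip().upper().startswith("APPROVE"):
                break
            patch = self.implementer(
                f"Revise the patch.\nPLAN:\n{plan}\nFEEDBACK:\n{verdict}\nPATCH:\n{patch}"
            )
        return patch
```

In this sketch the expensive model is called exactly once per task, while every implementation and revision call goes to the cheaper model, which is where the cost savings described in the talk would come from.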
Filev also covers spec-driven development (SDD), the role of the human engineer as a system architect rather than a direct code author, and why the planning stage justifies the highest model spend. The session is grounded in real benchmark data and internal production metrics, making it a practical reference for engineering leaders evaluating how to scale AI coding responsibly.
📺 Source: DeepLearningAI · Published May 22, 2026
🏷️ Format: Deep Dive







