Description:
Samraj Moorjani, a Databricks engineer who has spent the past two years focused on MLflow and agent quality, opened his AI Dev 25 NYC talk with a pointed question: if you wouldn’t ship untested software to production, why are teams shipping untested AI agents? The session is a ground-level guide to using MLflow, Databricks’ open-source GenAI platform, to apply software engineering rigor to the problem of agent reliability.
Moorjani identifies the specific ways agent QA differs from traditional software testing: non-deterministic outputs, unpredictable user behavior, domain expertise that lives outside the engineering team, and the three-way tradeoff between cost, latency, and quality. His solution architecture rests on two interlocking pillars. The first is MLflow’s tracing capability, which provides step-by-step observability into agent execution and is the prerequisite for everything else (a minimal sketch follows below). The second is a two-tier evaluation system: offline regression suites built from real production traces and human-labeled examples, plus online production monitors powered by LLM judges that scale expert feedback.
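To make the tracing pillar concrete, here is a minimal sketch using MLflow’s tracing API (`@mlflow.trace` and `mlflow.start_span`). The agent itself is a stand-in: `lookup_policy`, the question, and the hard-coded responses are invented for illustration; in a real agent the tool would hit a retriever and the inner span would wrap an actual LLM call.

```python
import mlflow
from mlflow.entities import SpanType

# Hypothetical tool step; only the MLflow tracing calls are real API.
@mlflow.trace(span_type=SpanType.TOOL)
def lookup_policy(query: str) -> str:
    # Stand-in for a retrieval call against a document store.
    return f"Policy text relevant to: {query}"

@mlflow.trace(span_type=SpanType.AGENT)
def answer(question: str) -> str:
    context = lookup_policy(question)
    # Wrap the model call in an explicit span so its inputs and outputs
    # show up as one step in the trace tree.
    with mlflow.start_span(name="llm_call", span_type=SpanType.LLM) as span:
        span.set_inputs({"question": question, "context": context})
        response = f"Answer grounded in: {context}"  # stand-in for an LLM call
        span.set_outputs(response)
    return response

answer("What is the refund window?")
```

Each invocation records a trace, a tree of spans (agent → tool → LLM) with inputs, outputs, and latency per step, which is exactly the raw material the offline regression suites and online monitors consume.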
The talk walks through the managed Databricks version of MLflow, where evaluation datasets are backed by Unity Catalog for enterprise governance—fine-grained access controls and lineage tracking included by default. A particularly practical segment covers model-swap decisions: rather than guessing whether switching to a cheaper or newer model will degrade quality, MLflow’s eval diffing makes the tradeoff visible across a versioned dataset. Teams working on production agents in regulated or high-stakes domains will find the quality-assurance framework directly applicable.
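The sketch below shows the shape of such a model-swap comparison using the open-source `mlflow.evaluate` API: run the same suite once per candidate model, each as its own MLflow run, then diff the metrics. The managed, Unity Catalog-backed datasets and built-in LLM judges from the talk are approximated here with a hand-rolled pandas dataset and the stock question-answering evaluator; the dataset rows, model names, and `make_predict_fn` wrapper are all invented for illustration.

```python
import mlflow
import pandas as pd

# Hypothetical regression suite; in the managed setup this would be a
# versioned, Unity Catalog-backed evaluation dataset.
eval_data = pd.DataFrame({
    "inputs": ["What is the refund window?", "How do I reset my password?"],
    "ground_truth": ["30 days from delivery.", "Use the account settings page."],
})

def make_predict_fn(model_name: str):
    """Stand-in for wiring the agent to a given model; names are illustrative."""
    def predict(df: pd.DataFrame) -> pd.Series:
        return df["inputs"].map(lambda q: f"[{model_name}] answer to: {q}")
    return predict

# One MLflow run per candidate model, so the two runs can be
# compared side by side in the MLflow UI.
for model_name in ["current-model", "cheaper-model"]:
    with mlflow.start_run(run_name=f"eval-{model_name}"):
        results = mlflow.evaluate(
            model=make_predict_fn(model_name),
            data=eval_data,
            targets="ground_truth",
            model_type="question-answering",
        )
        print(model_name, results.metrics)
```

Because both runs score the same versioned dataset, the per-metric deltas answer the cost-versus-quality question directly instead of leaving it to intuition.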
📺 Source: DeepLearningAI · Published December 05, 2025
🏷️ Format: Deep Dive
