Mind the Gap (In your Agent Observability) — Amy Boyd & Nitya Narasimhan, Microsoft

Description:

Amy Boyd and Nitya Narasimhan from Microsoft’s Azure AI Foundry developer relations team deliver a hands-on workshop at AI Engineer London titled “Mind the Gap,” using the London Underground safety announcement as a sustained analogy for the distance between what an agent is designed to do and what it actually does in production. Their central argument: observability must be built in from day one, not retrofitted, and trace-linked evaluations are the key mechanism for shortening the gap between detecting a problem and diagnosing its root cause.

Boyd opens by demonstrating Azure AI Foundry's no-code evaluation tooling: creating a project, attaching a model with web search, running the agent, selecting evaluation metrics such as task adherence and safety, and reviewing trace-linked results in the Foundry UI. A concrete example from the session, in which task adherence scores unexpectedly low, illustrates how early evals surface quality issues before an agent reaches production.

Narasimhan then shifts to the SDK layer, covering how to implement tracing programmatically, write custom prompt-based and code-based evaluators, and interpret evaluation results tied directly to specific trace steps. The presenters explain why this matters: when a model swap causes tool-call efficiency to drop, for example, evals flag the regression and the trace shows exactly where in the execution the behavior changed. All workshop assets are available in a maintained GitHub repository, with ongoing support through a Microsoft Foundry Discord channel.
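The workshop's own code lives in the linked GitHub repository; as a minimal sketch of the pattern Narasimhan describes, the example below implements a code-based custom evaluator as a plain Python callable and runs it over a small JSONL dataset through the azure-ai-evaluation package's evaluate() entry point, which is one way Foundry evaluations can be driven from code. The evaluator class, metric, dataset, and column names are illustrative assumptions, not the presenters' code.

```python
# Minimal sketch of a code-based custom evaluator: a callable that accepts
# named columns from the evaluation dataset and returns a dict of metrics.
from azure.ai.evaluation import evaluate  # pip install azure-ai-evaluation


class ToolCallEfficiencyEvaluator:
    """Illustrative (hypothetical) metric: did the agent stay within a
    tool-call budget for this run?"""

    def __init__(self, max_tool_calls: int = 3):
        self.max_tool_calls = max_tool_calls

    def __call__(self, *, tool_call_count: int, **kwargs):
        # Each key returned here becomes a column in the per-row results.
        efficient = int(tool_call_count) <= self.max_tool_calls
        return {
            "tool_call_efficiency": 1.0 if efficient else 0.0,
            "tool_call_count": int(tool_call_count),
        }


if __name__ == "__main__":
    # "agent_runs.jsonl" is a hypothetical dataset with one JSON object per
    # line, e.g. {"query": "...", "response": "...", "tool_call_count": 5}.
    results = evaluate(
        data="agent_runs.jsonl",
        evaluators={"efficiency": ToolCallEfficiencyEvaluator(max_tool_calls=3)},
        # Map dataset columns onto the evaluator's keyword arguments.
        evaluator_config={
            "efficiency": {
                "column_mapping": {"tool_call_count": "${data.tool_call_count}"}
            }
        },
    )
    print(results["metrics"])  # aggregate metrics across all rows
```

A prompt-based evaluator follows the same contract but delegates scoring to a model via a prompt template; when the underlying runs are traced, the per-row scores can then be reviewed alongside the corresponding trace steps in the Foundry portal, which is the trace-linked workflow the presenters emphasize.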


📺 Source: AI Engineer · Published May 14, 2026
🏷️ Format: Deep Dive
