AI Dev 25 x NYC | Scott Yak: Building MCP Servers That Make Agents More Effective


Description:

Scott Yak, an engineer at Datadog, used his AI Dev 25 NYC slot to make a counterintuitive argument: evals can become a source of joy if you consolidate your agent tooling into a well-designed MCP server. The talk opens with a diagnosis of how agent teams typically operate—each team building its own tools, each tool failing differently, each team maintaining its own isolated tool-call failure evals—resulting in duplicated effort and change-detector tests that break every time a tool interface shifts.

Yak’s solution is a centralized MCP server owned by a small dedicated team and shared across all agent teams in the organization. Because evals now measure outcome-level behavior rather than individual tool calls, they remain stable across tool-surface changes and can be reused across teams. Datadog instruments its server with LLM observability so the full cycle—code change, eval run, results in dashboard—completes in under two minutes, with shareable trace URLs that can be dropped directly into Slack for collaborative debugging. The talk includes a live walkthrough of Cursor and Claude Code querying Datadog’s MCP server for HTTP errors, with a detailed explanation of how the agent manages its context window using tool descriptions returned by the server.
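The contrast between brittle tool-call evals and outcome-level evals can be sketched as follows. This is an illustrative example only: `run_agent`, the tool names, and the assertions are hypothetical stand-ins, not Datadog's actual eval harness.

```python
def run_agent(query):
    """Hypothetical stand-in for an agent run: returns the tool calls
    it made and the final answer it produced."""
    tool_calls = [("search_logs", {"query": "status:error", "limit": 50})]
    answer = "Found 3 HTTP 500 errors in the checkout service."
    return tool_calls, answer

def brittle_tool_call_eval(tool_calls):
    # Change-detector style: asserts the exact tool-call shape, so it
    # breaks whenever the tool surface shifts (rename, new parameter).
    return tool_calls == [("search_logs", {"query": "status:error", "limit": 50})]

def outcome_level_eval(answer):
    # Outcome-level: only checks the final behavior, so it survives
    # tool-surface changes and can be shared across agent teams.
    return "500" in answer and "checkout" in answer

tool_calls, answer = run_agent("Which services are throwing HTTP errors?")
print(brittle_tool_call_eval(tool_calls))  # passes today, fails after a rename
print(outcome_level_eval(answer))          # keeps passing after tool changes
```

If the shared MCP server later renames `search_logs` or changes its parameters, only the brittle eval breaks; the outcome-level check is unaffected, which is what lets it be reused across teams.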

For agent platform teams, the key takeaway is organizational: a centralized MCP server means tool-call failures are handled by specialists, agent builders focus on higher-level reasoning, and any improvement to the shared server immediately improves every downstream agent that uses it.


📺 Source: DeepLearningAI · Published December 05, 2025
🏷️ Format: Deep Dive
