Making agentic workflows trustworthy and verifiable with a custom DSL



Description:

James Brady from Elicit presents a technically detailed account of how his team built a custom domain-specific language — internally referred to as HPL — to make their agentic research workflows legible, reproducible, and auditable. The talk opens with a deceptively simple question: if two AI systems produce identical outputs, are they equally trustworthy? Brady argues they are not, and that the mechanism by which an answer is produced matters as much as the answer itself — a principle that shapes every design decision in HPL.

The language was built around three core requirements. First, legibility: the agent’s process must be readable and spot-checkable by both human users and other agents running critique passes. Second, fidelity of iteration: users should be able to add layers, change direction, and extend the work without the model drifting from the original intent. Third, faithful execution: a validated process should run exactly as specified, not approximately. The implementation uses a Python service that parses HPL into an abstract syntax tree, performs type checking, and walks the tree to interpret it. Syntax errors are caught at the parsing stage and returned cheaply to a “curator” agent for correction before any expensive inference runs.
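The parse, type-check, and interpret stages described above can be sketched in a few lines of Python. This is a minimal illustration, not Elicit's implementation: HPL's actual syntax is not shown in the description, so the toy grammar here (integer and string literals joined by `+`) is an assumption, parsed with Python's own `ast` module for brevity. The names `CheckError`, `type_of`, `interpret`, and `run` are hypothetical.

```python
import ast


class CheckError(Exception):
    """Raised for syntax or type errors found before any evaluation."""


def parse(source: str) -> ast.expr:
    # Stage 1: turn source text into a syntax tree (toy grammar, not real HPL).
    try:
        return ast.parse(source, mode="eval").body
    except SyntaxError as exc:
        raise CheckError(f"syntax error: {exc.msg}") from exc


def type_of(node: ast.expr) -> type:
    # Stage 2: static type check; nothing is evaluated here.
    if isinstance(node, ast.Constant) and type(node.value) in (int, str):
        return type(node.value)
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        left, right = type_of(node.left), type_of(node.right)
        if left is not right:
            raise CheckError(
                f"type error: cannot add {left.__name__} and {right.__name__}"
            )
        return left
    raise CheckError(f"unsupported construct: {type(node).__name__}")


def interpret(node: ast.expr):
    # Stage 3: walk the validated tree. In a real agent this is where
    # expensive work (such as model calls) would happen.
    if isinstance(node, ast.Constant):
        return node.value
    return interpret(node.left) + interpret(node.right)


def run(source: str) -> dict:
    # Errors from the cheap stages are returned as feedback (here, a dict
    # standing in for a message to a correcting agent) before stage 3 runs.
    try:
        tree = parse(source)
        type_of(tree)
    except CheckError as exc:
        return {"ok": False, "feedback": str(exc)}
    return {"ok": True, "value": interpret(tree)}
```

Calling `run("1 + 2")` reaches the interpreter, while `run("1 + 'a'")` and `run("1 +")` stop at the checking stages and return feedback only. The point of the ordering is that a malformed program never reaches the costly step.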

A critical performance enabler is the content-addressed store. Because HPL is a pure language, evaluating the same expression always yields the same result, so any expression that has been evaluated before can be retrieved by hash rather than recomputed. This makes reinterpreting the full program on each iteration fast enough to be practical. Brady includes a live demo of Elicit’s research agent and discusses the team’s explicit positioning on the speed-versus-rigor tradeoff: firmly on the high-rigor end, suited for systematic literature review rather than fast conversational lookups.
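The content-addressing idea can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: expressions are represented as nested lists such as `["add", 1, ["mul", 2, 3]]`, the address is a SHA-256 hash of a canonical JSON encoding, and the class and method names are hypothetical. The description does not say how Elicit's store hashes or persists results.

```python
import hashlib
import json


class ContentAddressedStore:
    """Caches results of pure expressions, keyed by a hash of the expression."""

    def __init__(self):
        self._results = {}
        self.misses = 0  # number of compound expressions actually computed

    @staticmethod
    def address(expr) -> str:
        # Canonical encoding: structurally identical expressions always
        # serialize to the same bytes, and so to the same hash.
        canonical = json.dumps(expr, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def evaluate(self, expr):
        if not isinstance(expr, list):
            return expr  # integer literal: nothing to cache
        key = self.address(expr)
        if key in self._results:
            return self._results[key]  # seen before: reuse, do not recompute
        self.misses += 1
        result = self._compute(expr)
        self._results[key] = result
        return result

    def _compute(self, expr):
        # Sub-expressions go back through evaluate(), so they are cached too.
        op, *args = expr
        values = [self.evaluate(arg) for arg in args]
        if op == "add":
            return sum(values)
        if op == "mul":
            product = 1
            for value in values:
                product *= value
            return product
        raise ValueError(f"unknown op: {op}")


store = ContentAddressedStore()
first = store.evaluate(["add", 1, ["mul", 2, 3]])       # computes mul and add
second = store.evaluate(["add", ["mul", 2, 3], 10])     # reuses the cached mul
```

Caching by hash is only safe because the expressions are pure. If evaluating `["mul", 2, 3]` could have side effects or depend on hidden state, returning a stored result would change the program's behavior.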

📺 Source: Claude · Published May 22, 2026
🏷️ Format: Deep Dive


