TypeScript Expect DSL
Python equivalent: Python Expect DSL
The Expect DSL provides a fluent, chainable API to declare assertions against agent traces. Assertions are batched and sent to the Attest engine for evaluation.
Installation
```bash
pnpm add @attest-ai/core
# or with vitest integration
pnpm add @attest-ai/vitest
```

Import

```ts
import { attestExpect } from '@attest-ai/core';

// Or via vitest integration (re-exports the same function)
import { attestExpect } from '@attest-ai/vitest';
```

attestExpect()
Creates an ExpectChain for declaring assertions against an agent result or trace.

```ts
function attestExpect(result: AgentResult | Trace): ExpectChain
```

Parameters:
| Parameter | Type | Description |
|---|---|---|
| result | AgentResult \| Trace | The agent result or raw trace to assert against. A raw Trace is auto-wrapped in AgentResult. |
Returns: ExpectChain, a fluent builder. Chain assertion methods, then pass the chain to evaluate().

```ts
const chain = attestExpect(result)
  .outputContains("hello")
  .latencyUnder(5000)
  .tokensUnder(1000);

const evaluated = await evaluate(chain);
expect(evaluated.passed).toBe(true);
```

ExpectChain Properties
assertions

```ts
get assertions(): Assertion[]
```

Returns a copy of all assertions added to the chain.

trace

```ts
get trace(): Trace
```

Returns the underlying trace from the wrapped AgentResult.
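The chaining and copy semantics described above can be illustrated with a small stand-alone sketch. `SketchChain` and `SketchAssertion` are hypothetical names invented for this example; the real `ExpectChain` and `Assertion` types are not reproduced here, and this is not how the library is implemented.

```typescript
// Hypothetical stand-in for the library's Assertion type (shape is assumed).
type SketchAssertion = { type: string; spec: Record<string, unknown> };

class SketchChain {
  private readonly items: SketchAssertion[] = [];

  // Each method records an assertion and returns `this` so calls can be chained.
  outputContains(value: string): this {
    this.items.push({ type: "content", spec: { value } });
    return this;
  }

  tokensUnder(maxTokens: number): this {
    this.items.push({ type: "constraint", spec: { field: "tokens", max: maxTokens } });
    return this;
  }

  // Return a shallow copy so callers cannot mutate the chain's internal list.
  get assertions(): SketchAssertion[] {
    return [...this.items];
  }
}

const sketch = new SketchChain().outputContains("hello").tokensUnder(1000);
const snapshot = sketch.assertions;
snapshot.pop(); // mutates only the copy
console.log(sketch.assertions.length); // 2
```

Because nothing is evaluated while chaining, the whole list can be handed over in one batch when the chain is evaluated.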
Layer 1: Schema Assertions
Schema assertions validate structural properties of output and tool interactions using JSON Schema.
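As a rough illustration of what a structural check involves, the sketch below validates a value against a tiny subset of JSON Schema (`type`, `required`, and `properties` only). `MiniSchema` and `miniValidate` are hypothetical names for this example; a real JSON Schema validator covers far more keywords, and this is not how the Attest engine evaluates schemas.

```typescript
// A deliberately small subset of JSON Schema, for illustration only.
type MiniSchema = {
  type?: "object" | "string" | "number" | "array";
  required?: string[];
  properties?: Record<string, MiniSchema>;
};

function miniValidate(value: unknown, schema: MiniSchema): boolean {
  // Check the declared primitive or container type first.
  if (schema.type === "string" && typeof value !== "string") return false;
  if (schema.type === "number" && typeof value !== "number") return false;
  if (schema.type === "array" && !Array.isArray(value)) return false;
  const isPlainObject = typeof value === "object" && value !== null && !Array.isArray(value);
  if (schema.type === "object" && !isPlainObject) return false;

  if (isPlainObject) {
    const record = value as Record<string, unknown>;
    // Every required key must be present.
    for (const key of schema.required ?? []) {
      if (!(key in record)) return false;
    }
    // Present properties must satisfy their own sub-schema.
    for (const [key, sub] of Object.entries(schema.properties ?? {})) {
      if (key in record && !miniValidate(record[key], sub)) return false;
    }
  }
  return true;
}

const personSchema: MiniSchema = {
  type: "object",
  required: ["name", "age"],
  properties: { name: { type: "string" }, age: { type: "number" } },
};

console.log(miniValidate({ name: "Ada", age: 36 }, personSchema)); // true
console.log(miniValidate({ name: "Ada" }, personSchema)); // false (missing "age")
```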
outputMatchesSchema()
Validates the entire structured output against a JSON Schema.

```ts
outputMatchesSchema(schema: Record<string, unknown>): this
```

```ts
attestExpect(result).outputMatchesSchema({
  type: "object",
  required: ["name", "age"],
  properties: {
    name: { type: "string" },
    age: { type: "number", minimum: 0 },
  },
});
```

outputFieldMatchesSchema()

Validates a single field of the output against a JSON Schema.

```ts
outputFieldMatchesSchema(field: string, schema: Record<string, unknown>): this
```

```ts
attestExpect(result).outputFieldMatchesSchema("recommendations", {
  type: "array",
  items: { type: "string" },
  minItems: 1,
});
```

toolArgsMatchSchema()

Validates the arguments passed to a specific tool call.

```ts
toolArgsMatchSchema(toolName: string, schema: Record<string, unknown>): this
```

```ts
attestExpect(result).toolArgsMatchSchema("search", {
  type: "object",
  required: ["query"],
  properties: {
    query: { type: "string", minLength: 1 },
  },
});
```

toolResultMatchesSchema()

Validates the result returned by a specific tool call.

```ts
toolResultMatchesSchema(toolName: string, schema: Record<string, unknown>): this
```

```ts
attestExpect(result).toolResultMatchesSchema("search", {
  type: "object",
  required: ["results"],
  properties: {
    results: { type: "array" },
  },
});
```

Layer 2: Constraint Assertions
Constraint assertions enforce numeric bounds on cost, latency, tokens, and step counts. All of them accept a soft option: a soft failure is reported as a warning and does not fail the test.
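One way to picture the hard/soft distinction is as a filter applied when results are rolled up. The sketch below is an assumption about how such a roll-up could work, not the engine's actual logic; `Outcome` and `summarize` are hypothetical names.

```typescript
// Hypothetical per-assertion outcome; field names are assumptions.
type Outcome = { name: string; passed: boolean; soft: boolean };

function summarize(outcomes: Outcome[]): { passed: boolean; warnings: string[] } {
  const failed = outcomes.filter((o) => !o.passed);
  return {
    // The run passes as long as every failure was marked soft.
    passed: failed.every((o) => o.soft),
    // Soft failures are surfaced as warnings instead of test failures.
    warnings: failed.filter((o) => o.soft).map((o) => o.name),
  };
}

const summary = summarize([
  { name: "costUnder", passed: false, soft: true },
  { name: "latencyUnder", passed: true, soft: false },
]);
console.log(summary.passed); // true
console.log(summary.warnings); // [ 'costUnder' ]
```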
costUnder()
Asserts total cost is at or below a threshold (USD).

```ts
costUnder(maxCost: number, opts?: { soft?: boolean }): this
```

```ts
attestExpect(result).costUnder(0.05);
attestExpect(result).costUnder(0.10, { soft: true }); // warning only
```

latencyUnder()

Asserts total latency is at or below a threshold (milliseconds).

```ts
latencyUnder(maxMs: number, opts?: { soft?: boolean }): this
```

```ts
attestExpect(result).latencyUnder(3000);
```

tokensUnder()

Asserts total token usage is at or below a threshold.

```ts
tokensUnder(maxTokens: number, opts?: { soft?: boolean }): this
```

```ts
attestExpect(result).tokensUnder(2000);
```

tokensBetween()

Asserts total token usage falls within a range (inclusive).

```ts
tokensBetween(minTokens: number, maxTokens: number, opts?: { soft?: boolean }): this
```

```ts
attestExpect(result).tokensBetween(100, 2000);
```

stepCount()

Asserts the total number of steps matches an operator/value pair.

```ts
stepCount(operator: string, value: number, opts?: { soft?: boolean }): this
```

Supported operators: "eq", "lte", "gte", "lt", "gt", "between".
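The operator names read as the usual numeric comparisons. The sketch below shows one plausible interpretation, with the `Bounds` shape borrowed from the `constraint()` options and `"between"` assumed to be inclusive (as `tokensBetween()` is documented to be). `compare` is a hypothetical helper, not part of the library.

```typescript
type Operator = "eq" | "lte" | "gte" | "lt" | "gt" | "between";
type Bounds = { value?: number; min?: number; max?: number };

function compare(actual: number, operator: Operator, bounds: Bounds): boolean {
  // Missing bounds default to NaN, which makes every comparison false.
  const { value = NaN, min = NaN, max = NaN } = bounds;
  switch (operator) {
    case "eq":
      return actual === value;
    case "lte":
      return actual <= value;
    case "gte":
      return actual >= value;
    case "lt":
      return actual < value;
    case "gt":
      return actual > value;
    case "between":
      // Assumption: both ends of the range are inclusive.
      return actual >= min && actual <= max;
  }
}

console.log(compare(3, "lte", { value: 5 })); // true
console.log(compare(3, "eq", { value: 5 })); // false
console.log(compare(500, "between", { min: 50, max: 500 })); // true
```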
```ts
attestExpect(result).stepCount("lte", 5);
attestExpect(result).stepCount("eq", 3);
```

toolCallCount()

Asserts the number of tool call steps matches an operator/value pair.

```ts
toolCallCount(operator: string, value: number, opts?: { soft?: boolean }): this
```

```ts
attestExpect(result).toolCallCount("gte", 1);
attestExpect(result).toolCallCount("lte", 3);
```

constraint()

Generic constraint for arbitrary trace fields.

```ts
constraint(
  field: string,
  operator: string,
  opts?: { value?: number; min?: number; max?: number; soft?: boolean },
): this
```

```ts
attestExpect(result).constraint("metadata.cost_usd", "lte", { value: 0.01 });
attestExpect(result).constraint("metadata.total_tokens", "between", {
  min: 50,
  max: 500,
});
```

Layer 3: Trace Assertions
Trace assertions validate the sequence and composition of tool calls within a trace.
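The sequence checks in this layer differ mainly in strictness: an in-order check tolerates other calls in between, an exact check does not, and a loop check counts consecutive repeats. The sketch below expresses those three ideas over a plain list of tool names. The helper names are hypothetical and the logic is an interpretation of the method descriptions, not the engine's code.

```typescript
// True when `expected` appears within `calls` in order; other calls may sit between.
function calledInOrder(calls: string[], expected: string[]): boolean {
  let next = 0;
  for (const call of calls) {
    if (next < expected.length && call === expected[next]) next++;
  }
  return next === expected.length;
}

// True only when the two sequences are identical, element by element.
function calledExactly(calls: string[], expected: string[]): boolean {
  return calls.length === expected.length && calls.every((c, i) => c === expected[i]);
}

// Length of the longest run of consecutive calls to `tool`.
function longestRun(calls: string[], tool: string): number {
  let best = 0;
  let current = 0;
  for (const call of calls) {
    current = call === tool ? current + 1 : 0;
    best = Math.max(best, current);
  }
  return best;
}

const calls = ["search", "rank", "rank", "summarize"];
console.log(calledInOrder(calls, ["search", "summarize"])); // true
console.log(calledExactly(calls, ["search", "summarize"])); // false
console.log(longestRun(calls, "rank")); // 2
```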
toolsCalledInOrder()
Asserts that the named tools appear in the trace in the given order (other tools may appear between them).

```ts
toolsCalledInOrder(tools: string[], opts?: { soft?: boolean }): this
```

```ts
attestExpect(result).toolsCalledInOrder(["search", "summarize"]);
```

toolsCalledExactly()

Asserts the exact ordered sequence of tool calls; no extra tools are allowed.

```ts
toolsCalledExactly(tools: string[], opts?: { soft?: boolean }): this
```

```ts
attestExpect(result).toolsCalledExactly(["search", "rank", "format"]);
```

noToolLoops()

Asserts a tool is not called more than maxRepetitions consecutive times.

```ts
noToolLoops(tool: string, maxRepetitions?: number, opts?: { soft?: boolean }): this
```

```ts
attestExpect(result).noToolLoops("retry_fetch", 2);
```

noDuplicateTools()

Asserts no tool is called more than once across the entire trace.

```ts
noDuplicateTools(opts?: { soft?: boolean }): this
```

```ts
attestExpect(result).noDuplicateTools();
```

requiredTools()

Asserts that all specified tools appear in the trace (in any order).

```ts
requiredTools(tools: string[], opts?: { soft?: boolean }): this
```

```ts
attestExpect(result).requiredTools(["validate_input", "generate_response"]);
```

forbiddenTools()

Asserts that none of the specified tools appear in the trace.

```ts
forbiddenTools(tools: string[], opts?: { soft?: boolean }): this
```

```ts
attestExpect(result).forbiddenTools(["delete_user", "drop_table"]);
```

Layer 4: Content Assertions
Content assertions inspect the textual content of outputs.
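Most methods in this layer reduce to substring tests with an optional case-sensitivity switch. The sketch below shows that idea with hypothetical helpers. It assumes matching is case-insensitive unless `caseSensitive` is set, which is a guess based on the option name and is not confirmed here.

```typescript
// Substring test; the case-insensitive default is an assumption.
function containsText(text: string, value: string, caseSensitive = false): boolean {
  return caseSensitive
    ? text.includes(value)
    : text.toLowerCase().includes(value.toLowerCase());
}

// All keywords must be present.
function hasAllKeywords(text: string, keywords: string[]): boolean {
  return keywords.every((k) => containsText(text, k));
}

// At least one keyword must be present.
function hasAnyKeyword(text: string, keywords: string[]): boolean {
  return keywords.some((k) => containsText(text, k));
}

const output = "Price: $10. Shipping is free.";
console.log(hasAllKeywords(output, ["price", "shipping"])); // true
console.log(hasAllKeywords(output, ["price", "availability"])); // false
console.log(hasAnyKeyword(output, ["availability", "shipping"])); // true
console.log(containsText(output, "price", true)); // false (case differs)
```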
outputContains()
Asserts the output message contains a substring.

```ts
outputContains(value: string, opts?: { caseSensitive?: boolean; soft?: boolean }): this
```

```ts
attestExpect(result).outputContains("recommendation");
attestExpect(result).outputContains("JSON", { caseSensitive: true });
```

outputNotContains()

Asserts the output message does not contain a substring.

```ts
outputNotContains(value: string, opts?: { caseSensitive?: boolean; soft?: boolean }): this
```

```ts
attestExpect(result).outputNotContains("error");
```

outputMatchesRegex()

Asserts the output message matches a regular expression pattern.

```ts
outputMatchesRegex(pattern: string, opts?: { soft?: boolean }): this
```

```ts
attestExpect(result).outputMatchesRegex("\\d{3}-\\d{4}");
```

outputHasAllKeywords()

Asserts the output contains all specified keywords.

```ts
outputHasAllKeywords(
  keywords: string[],
  opts?: { caseSensitive?: boolean; soft?: boolean },
): this
```

```ts
attestExpect(result).outputHasAllKeywords(["price", "availability", "shipping"]);
```

outputHasAnyKeyword()

Asserts the output contains at least one of the specified keywords.

```ts
outputHasAnyKeyword(
  keywords: string[],
  opts?: { caseSensitive?: boolean; soft?: boolean },
): this
```

```ts
attestExpect(result).outputHasAnyKeyword(["yes", "confirmed", "approved"]);
```

outputForbids()

Asserts the output does not contain any of the specified terms. Always a hard failure.

```ts
outputForbids(terms: string[]): this
```

```ts
attestExpect(result).outputForbids(["password", "ssn", "credit card"]);
```

contentContains()

Generic content check against an arbitrary trace target path.

```ts
contentContains(
  target: string,
  value: string,
  opts?: { caseSensitive?: boolean; soft?: boolean },
): this
```

```ts
attestExpect(result).contentContains("steps[0].result.completion", "success");
```

Layer 5: Embedding Similarity
Embedding assertions compare semantic similarity between the output and a reference string. They require an embedding provider.
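The comparison is scored with cosine similarity against a minimum threshold (0.8 by default for `outputSimilarTo()`). The sketch below shows that scoring step on hand-written vectors. In practice the vectors would come from the embedding provider, and `cosineSimilarity` and `isSimilar` are hypothetical helpers rather than library functions.

```typescript
// Cosine similarity of two equal-length, non-zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Passes when the score meets the minimum threshold.
function isSimilar(a: number[], b: number[], threshold = 0.8): boolean {
  return cosineSimilarity(a, b) >= threshold;
}

console.log(isSimilar([1, 0], [1, 0])); // true  (score 1)
console.log(isSimilar([1, 1], [1, 0])); // false (score about 0.707)
console.log(isSimilar([1, 1], [1, 0], 0.7)); // true with a lower threshold
```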
outputSimilarTo()
Asserts the output is semantically similar to a reference text.

```ts
outputSimilarTo(
  reference: string,
  opts?: { threshold?: number; model?: string; soft?: boolean },
): this
```

| Option | Type | Default | Description |
|---|---|---|---|
| threshold | number | 0.8 | Minimum cosine similarity score (0.0 to 1.0) |
| model | string | Provider default | Embedding model override |
| soft | boolean | false | Treat failure as warning |

```ts
attestExpect(result).outputSimilarTo(
  "The product is available and ships within 2 business days",
  { threshold: 0.85 },
);
```

Layer 6: LLM Judge
LLM judge assertions use a language model to evaluate output quality against criteria. This is the most expensive assertion layer.
passesJudge()
Asserts the output passes an LLM judge evaluation.

```ts
passesJudge(
  criteria: string,
  opts?: { rubric?: string; threshold?: number; model?: string; soft?: boolean },
): this
```

| Option | Type | Default | Description |
|---|---|---|---|
| rubric | string | "default" | Evaluation rubric name |
| threshold | number | 0.8 | Minimum judge score (0.0 to 1.0) |
| model | string | Provider default | Judge model override |
| soft | boolean | false | Treat failure as warning |

```ts
attestExpect(result).passesJudge(
  "Response is helpful, accurate, and addresses all parts of the user question",
  { threshold: 0.9 },
);

attestExpect(result).passesJudge("Response does not contain hallucinated facts", {
  rubric: "factuality",
  model: "gpt-4.1",
});
```

Layer 7: Trace Tree (Multi-Agent)
Trace tree assertions validate multi-agent delegation patterns, cross-agent data flow, and aggregate metrics across the full agent tree.
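To make "delegation depth" and "aggregate metrics" concrete, the sketch below walks a small hand-built agent tree. The `AgentNode` shape is hypothetical and does not mirror the real trace format, and counting the root as depth 0 is an assumption.

```typescript
// Hypothetical tree node; the real trace tree structure is not shown here.
type AgentNode = { agentId: string; costUsd: number; children: AgentNode[] };

// Assumption: the root agent is depth 0 and each delegation adds 1.
function maxDepth(node: AgentNode): number {
  return node.children.reduce((deepest, child) => Math.max(deepest, 1 + maxDepth(child)), 0);
}

// Sum of cost over the node and all of its descendants.
function totalCost(node: AgentNode): number {
  return node.children.reduce((sum, child) => sum + totalCost(child), node.costUsd);
}

// True when an agent with the given id exists anywhere in the tree.
function wasCalled(node: AgentNode, agentId: string): boolean {
  return node.agentId === agentId || node.children.some((c) => wasCalled(c, agentId));
}

const tree: AgentNode = {
  agentId: "orchestrator",
  costUsd: 0.125,
  children: [
    {
      agentId: "researcher",
      costUsd: 0.25,
      children: [{ agentId: "fetcher", costUsd: 0.0625, children: [] }],
    },
    { agentId: "writer", costUsd: 0.0625, children: [] },
  ],
};

console.log(maxDepth(tree)); // 2
console.log(totalCost(tree)); // 0.5
console.log(wasCalled(tree, "fetcher")); // true
```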
agentCalled()
Asserts a specific agent was invoked somewhere in the trace tree.

```ts
agentCalled(agentId: string, opts?: { soft?: boolean }): this
```

```ts
attestExpect(result).agentCalled("researcher");
```

delegationDepth()

Asserts the maximum delegation depth does not exceed a limit.

```ts
delegationDepth(maxDepth: number, opts?: { soft?: boolean }): this
```

```ts
attestExpect(result).delegationDepth(3);
```

agentOutputContains()

Asserts a specific agent's output contains a substring.

```ts
agentOutputContains(
  agentId: string,
  value: string,
  opts?: { caseSensitive?: boolean; soft?: boolean },
): this
```

```ts
attestExpect(result).agentOutputContains("summarizer", "conclusion");
```

crossAgentDataFlow()

Asserts data flows from one agent to another through a specific field.

```ts
crossAgentDataFlow(
  fromAgent: string,
  toAgent: string,
  field: string,
  opts?: { soft?: boolean },
): this
```

```ts
attestExpect(result).crossAgentDataFlow("researcher", "writer", "findings");
```

followsTransitions()

Asserts the agent delegation graph contains specific parent-child transitions.

```ts
followsTransitions(transitions: [string, string][], opts?: { soft?: boolean }): this
```

```ts
attestExpect(result).followsTransitions([
  ["orchestrator", "researcher"],
  ["orchestrator", "writer"],
]);
```

aggregateCostUnder()

Asserts the total cost across all agents in the tree is at or below a threshold.

```ts
aggregateCostUnder(maxCost: number, opts?: { soft?: boolean }): this
```

```ts
attestExpect(result).aggregateCostUnder(0.50);
```

aggregateTokensUnder()

Asserts the total token usage across all agents in the tree is at or below a threshold.

```ts
aggregateTokensUnder(maxTokens: number, opts?: { soft?: boolean }): this
```

```ts
attestExpect(result).aggregateTokensUnder(10000);
```

agentOrderedBefore()

Asserts one agent completed before another started.

```ts
agentOrderedBefore(agentA: string, agentB: string, opts?: { soft?: boolean }): this
```

```ts
attestExpect(result).agentOrderedBefore("planner", "executor");
```

agentsOverlap()

Asserts two agents had overlapping execution windows (ran concurrently).

```ts
agentsOverlap(agentA: string, agentB: string, opts?: { soft?: boolean }): this
```

```ts
attestExpect(result).agentsOverlap("researcher_a", "researcher_b");
```

agentWallTimeUnder()

Asserts a specific agent's wall-clock execution time is under a threshold.

```ts
agentWallTimeUnder(agentId: string, maxMs: number, opts?: { soft?: boolean }): this
```

```ts
attestExpect(result).agentWallTimeUnder("summarizer", 5000);
```

orderedAgents()

Asserts agents execute in ordered groups. Agents within the same group may run in any order, but groups execute sequentially.

```ts
orderedAgents(groups: string[][], opts?: { soft?: boolean }): this
```

```ts
attestExpect(result).orderedAgents([
  ["planner"],                      // Phase 1
  ["researcher_a", "researcher_b"], // Phase 2 (parallel)
  ["writer"],                       // Phase 3
]);
```

Full Example
```ts
import { attestExpect, agent, TraceBuilder } from '@attest-ai/core';
import { evaluate } from '@attest-ai/vitest';
import { describe, it, expect } from 'vitest';

const myAgent = agent("assistant", (builder, args) => {
  builder.addLlmCall("gpt-4.1", {
    args: { prompt: args.question },
    result: { completion: "Paris is the capital of France." },
  });
  builder.addToolCall("fact_check", {
    args: { claim: "Paris is the capital of France" },
    result: { verified: true },
  });
  return { message: "Paris is the capital of France." };
});

describe("assistant agent", () => {
  it("produces correct, efficient output", async () => {
    const result = myAgent({ question: "What is the capital of France?" });

    const chain = attestExpect(result)
      // Layer 1: Schema
      .outputMatchesSchema({
        type: "object",
        required: ["message"],
        properties: { message: { type: "string" } },
      })
      // Layer 2: Constraints
      .tokensUnder(500)
      .costUnder(0.01)
      // Layer 3: Trace
      .requiredTools(["fact_check"])
      .forbiddenTools(["delete_data"])
      // Layer 4: Content
      .outputContains("Paris")
      .outputNotContains("error");

    const evaluated = await evaluate(chain);
    expect(evaluated.passed).toBe(true);
  });
});
```

Assertion Layer Summary
| Layer | Type | Cost | Methods |
|---|---|---|---|
| 1 | Schema | Free | outputMatchesSchema, outputFieldMatchesSchema, toolArgsMatchSchema, toolResultMatchesSchema |
| 2 | Constraint | Free | costUnder, latencyUnder, tokensUnder, tokensBetween, stepCount, toolCallCount, constraint |
| 3 | Trace | Free | toolsCalledInOrder, toolsCalledExactly, noToolLoops, noDuplicateTools, requiredTools, forbiddenTools |
| 4 | Content | Free | outputContains, outputNotContains, outputMatchesRegex, outputHasAllKeywords, outputHasAnyKeyword, outputForbids, contentContains |
| 5 | Embedding | Paid | outputSimilarTo |
| 6 | LLM Judge | Paid | passesJudge |
| 7 | Trace Tree | Free | agentCalled, delegationDepth, agentOutputContains, crossAgentDataFlow, followsTransitions, aggregateCostUnder, aggregateTokensUnder, agentOrderedBefore, agentsOverlap, agentWallTimeUnder, orderedAgents |