TypeScript Expect DSL

Python equivalent: Python Expect DSL

The Expect DSL provides a fluent, chainable API to declare assertions against agent traces. Assertions are batched and sent to the Attest engine for evaluation.

pnpm add @attest-ai/core
# or with vitest integration
pnpm add @attest-ai/vitest

import { attestExpect } from '@attest-ai/core';
// Or via vitest integration (re-exports the same function)
import { attestExpect } from '@attest-ai/vitest';

Creates an ExpectChain for declaring assertions against an agent result or trace.

function attestExpect(result: AgentResult | Trace): ExpectChain

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| result | AgentResult \| Trace | The agent result or raw trace to assert against. A raw Trace is auto-wrapped in AgentResult. |

Returns: ExpectChain — a fluent builder. Chain assertion methods, then pass to evaluate().

const chain = attestExpect(result)
  .outputContains("hello")
  .latencyUnder(5000)
  .tokensUnder(1000);
const evaluated = await evaluate(chain);
expect(evaluated.passed).toBe(true);

get assertions(): Assertion[]

Returns a copy of all assertions added to the chain.

get trace(): Trace

Returns the underlying trace from the wrapped AgentResult.
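
The chaining and copy-on-read behaviour described above can be illustrated with a small stand-alone sketch. `SketchChain` and `SketchAssertion` are hypothetical names invented for this example, and the assertion shape is an assumption; the real ExpectChain in @attest-ai/core may store assertions differently.

```typescript
// Hypothetical, simplified shapes; not the library's actual types.
interface SketchAssertion {
  type: string;
  spec: Record<string, unknown>;
}

class SketchChain {
  private readonly items: SketchAssertion[] = [];

  // Each assertion method records a spec and returns `this` so calls chain.
  outputContains(value: string): this {
    this.items.push({ type: "content", spec: { check: "contains", value } });
    return this;
  }

  tokensUnder(maxTokens: number): this {
    this.items.push({
      type: "constraint",
      spec: { field: "tokens", operator: "lte", value: maxTokens },
    });
    return this;
  }

  // Return a shallow copy so callers cannot mutate the chain's internal list.
  get assertions(): SketchAssertion[] {
    return [...this.items];
  }
}

const sketchChain = new SketchChain().outputContains("hello").tokensUnder(1000);
const snapshot = sketchChain.assertions;
snapshot.pop(); // mutates only the copy
console.log(sketchChain.assertions.length); // 2
```

Declaring the return type as `this` rather than the class name keeps chaining type-safe for subclasses, which is presumably why the signatures in this reference use it.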


Schema assertions validate structural properties of output and tool interactions using JSON Schema.

Validates the entire structured output against a JSON Schema.

outputMatchesSchema(schema: Record<string, unknown>): this
attestExpect(result).outputMatchesSchema({
  type: "object",
  required: ["name", "age"],
  properties: {
    name: { type: "string" },
    age: { type: "number", minimum: 0 },
  },
});

Validates a single field of the output against a JSON Schema.

outputFieldMatchesSchema(field: string, schema: Record<string, unknown>): this
attestExpect(result).outputFieldMatchesSchema("recommendations", {
  type: "array",
  items: { type: "string" },
  minItems: 1,
});

Validates the arguments passed to a specific tool call.

toolArgsMatchSchema(toolName: string, schema: Record<string, unknown>): this
attestExpect(result).toolArgsMatchSchema("search", {
  type: "object",
  required: ["query"],
  properties: {
    query: { type: "string", minLength: 1 },
  },
});

Validates the result returned by a specific tool call.

toolResultMatchesSchema(toolName: string, schema: Record<string, unknown>): this
attestExpect(result).toolResultMatchesSchema("search", {
  type: "object",
  required: ["results"],
  properties: {
    results: { type: "array" },
  },
});

Constraint assertions enforce numeric bounds on cost, latency, tokens, and step counts. All support a soft option: soft failures are warnings, not test failures.

Asserts total cost is at or below a threshold (USD).

costUnder(maxCost: number, opts?: { soft?: boolean }): this
attestExpect(result).costUnder(0.05);
attestExpect(result).costUnder(0.10, { soft: true }); // warning only

Asserts total latency is at or below a threshold (milliseconds).

latencyUnder(maxMs: number, opts?: { soft?: boolean }): this
attestExpect(result).latencyUnder(3000);

Asserts total token usage is at or below a threshold.

tokensUnder(maxTokens: number, opts?: { soft?: boolean }): this
attestExpect(result).tokensUnder(2000);

Asserts total token usage falls within a range (inclusive).

tokensBetween(minTokens: number, maxTokens: number, opts?: { soft?: boolean }): this
attestExpect(result).tokensBetween(100, 2000);

Asserts the total number of steps matches an operator/value pair.

stepCount(operator: string, value: number, opts?: { soft?: boolean }): this

Supported operators: "eq", "lte", "gte", "lt", "gt", "between".

attestExpect(result).stepCount("lte", 5);
attestExpect(result).stepCount("eq", 3);

Asserts the number of tool call steps matches an operator/value pair.

toolCallCount(operator: string, value: number, opts?: { soft?: boolean }): this
attestExpect(result).toolCallCount("gte", 1);
attestExpect(result).toolCallCount("lte", 3);

Generic constraint for arbitrary trace fields.

constraint(
field: string,
operator: string,
opts?: { value?: number; min?: number; max?: number; soft?: boolean },
): this
attestExpect(result).constraint("metadata.cost_usd", "lte", { value: 0.01 });
attestExpect(result).constraint("metadata.total_tokens", "between", {
min: 50,
max: 500,
});
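
To make the operator and soft-failure semantics concrete, here is a minimal sketch of how a single constraint could be evaluated. It is not the Attest engine's code: the names `holds`, `evaluateConstraint`, and `Outcome` are hypothetical, and treating "between" as inclusive is an assumption carried over from the tokensBetween description.

```typescript
type Operator = "eq" | "lte" | "gte" | "lt" | "gt" | "between";
type SingleValueOperator = Exclude<Operator, "between">;
interface Bounds { value?: number; min?: number; max?: number }
type Outcome = "pass" | "warn" | "fail";

// One comparison function per single-value operator.
const comparators: Record<SingleValueOperator, (actual: number, value: number) => boolean> = {
  eq: (actual, value) => actual === value,
  lte: (actual, value) => actual <= value,
  gte: (actual, value) => actual >= value,
  lt: (actual, value) => actual < value,
  gt: (actual, value) => actual > value,
};

function holds(actual: number, operator: Operator, bounds: Bounds): boolean {
  if (operator === "between") {
    const { min, max } = bounds;
    // Assumption: both ends of the range are inclusive.
    return min !== undefined && max !== undefined && actual >= min && actual <= max;
  }
  const { value } = bounds;
  if (value === undefined) return false; // single-value operators need `value`
  return comparators[operator](actual, value);
}

// A soft constraint downgrades a failure to a warning.
function evaluateConstraint(
  actual: number,
  operator: Operator,
  bounds: Bounds,
  soft = false,
): Outcome {
  if (holds(actual, operator, bounds)) return "pass";
  return soft ? "warn" : "fail";
}

console.log(evaluateConstraint(0.02, "lte", { value: 0.01 }));          // fail
console.log(evaluateConstraint(0.02, "lte", { value: 0.01 }, true));    // warn
console.log(evaluateConstraint(120, "between", { min: 50, max: 500 })); // pass
```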

Trace assertions validate the sequence and composition of tool calls within a trace.

Asserts that the named tools appear in the trace in the given order (other tools may appear between them).

toolsCalledInOrder(tools: string[], opts?: { soft?: boolean }): this
attestExpect(result).toolsCalledInOrder(["search", "summarize"]);

Asserts the exact ordered sequence of tool calls — no extra tools allowed.

toolsCalledExactly(tools: string[], opts?: { soft?: boolean }): this
attestExpect(result).toolsCalledExactly(["search", "rank", "format"]);
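
The difference between the two ordering assertions is that one is a subsequence check and the other is an exact match. The sketch below illustrates that distinction on plain arrays of tool names; `calledInOrder` and `calledExactly` are hypothetical helpers written for this example, not functions exported by the library.

```typescript
// Subsequence check: expected tools appear in order, with gaps allowed.
function calledInOrder(actual: string[], expected: string[]): boolean {
  let next = 0; // index of the next expected tool still to be found
  for (const tool of actual) {
    if (next < expected.length && tool === expected[next]) next++;
  }
  return next === expected.length;
}

// Exact check: same tools, same order, same length.
function calledExactly(actual: string[], expected: string[]): boolean {
  return (
    actual.length === expected.length &&
    actual.every((tool, i) => tool === expected[i])
  );
}

const observedCalls = ["search", "rank", "summarize"];

console.log(calledInOrder(observedCalls, ["search", "summarize"])); // true
console.log(calledExactly(observedCalls, ["search", "summarize"])); // false
console.log(calledInOrder(observedCalls, ["summarize", "search"])); // false
```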

Asserts a tool is not called more than maxRepetitions consecutive times.

noToolLoops(tool: string, maxRepetitions?: number, opts?: { soft?: boolean }): this
attestExpect(result).noToolLoops("retry_fetch", 2);

Asserts no tool is called more than once across the entire trace.

noDuplicateTools(opts?: { soft?: boolean }): this
attestExpect(result).noDuplicateTools();
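
As a rough illustration of what these two checks measure, the sketch below computes the longest consecutive run of one tool and tests for duplicates across the whole call list. The helper names are hypothetical, and the library's default for maxRepetitions is not stated here, so the sketch requires it explicitly.

```typescript
// Length of the longest run of back-to-back calls to `tool`.
function longestConsecutiveRun(calls: string[], tool: string): number {
  let longest = 0;
  let current = 0;
  for (const name of calls) {
    current = name === tool ? current + 1 : 0;
    longest = Math.max(longest, current);
  }
  return longest;
}

// Passes when `tool` never repeats more than `maxRepetitions` times in a row.
function hasNoToolLoop(calls: string[], tool: string, maxRepetitions: number): boolean {
  return longestConsecutiveRun(calls, tool) <= maxRepetitions;
}

// Passes when every tool name appears at most once in the whole list.
function hasNoDuplicateTools(calls: string[]): boolean {
  return new Set(calls).size === calls.length;
}

const toolCalls = ["fetch", "retry_fetch", "retry_fetch", "retry_fetch", "parse"];

console.log(longestConsecutiveRun(toolCalls, "retry_fetch")); // 3
console.log(hasNoToolLoop(toolCalls, "retry_fetch", 2));      // false
console.log(hasNoDuplicateTools(toolCalls));                  // false
```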

Asserts that all specified tools appear in the trace (in any order).

requiredTools(tools: string[], opts?: { soft?: boolean }): this
attestExpect(result).requiredTools(["validate_input", "generate_response"]);

Asserts that none of the specified tools appear in the trace.

forbiddenTools(tools: string[], opts?: { soft?: boolean }): this
attestExpect(result).forbiddenTools(["delete_user", "drop_table"]);
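
Both checks reduce to set membership over the tool names seen in the trace. A minimal sketch, using hypothetical helpers that return the offending names rather than a boolean so a failure message could list them:

```typescript
// Required tools that never appeared in the trace.
function missingRequired(calls: string[], required: string[]): string[] {
  const seen = new Set(calls);
  return required.filter((tool) => !seen.has(tool));
}

// Calls in the trace that hit a forbidden tool.
function forbiddenUsed(calls: string[], forbidden: string[]): string[] {
  const banned = new Set(forbidden);
  return calls.filter((tool) => banned.has(tool));
}

const traceTools = ["validate_input", "search", "generate_response"];

console.log(missingRequired(traceTools, ["validate_input", "generate_response"])); // []
console.log(missingRequired(traceTools, ["validate_input", "audit_log"]));         // ["audit_log"]
console.log(forbiddenUsed(traceTools, ["delete_user", "drop_table"]));             // []
```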

Content assertions inspect the textual content of outputs.

Asserts the output message contains a substring.

outputContains(value: string, opts?: { caseSensitive?: boolean; soft?: boolean }): this
attestExpect(result).outputContains("recommendation");
attestExpect(result).outputContains("JSON", { caseSensitive: true });

Asserts the output message does not contain a substring.

outputNotContains(value: string, opts?: { caseSensitive?: boolean; soft?: boolean }): this
attestExpect(result).outputNotContains("error");

Asserts the output message matches a regular expression pattern.

outputMatchesRegex(pattern: string, opts?: { soft?: boolean }): this
attestExpect(result).outputMatchesRegex("\\d{3}-\\d{4}");

Asserts the output contains all specified keywords.

outputHasAllKeywords(
keywords: string[],
opts?: { caseSensitive?: boolean; soft?: boolean },
): this
attestExpect(result).outputHasAllKeywords(["price", "availability", "shipping"]);

Asserts the output contains at least one of the specified keywords.

outputHasAnyKeyword(
keywords: string[],
opts?: { caseSensitive?: boolean; soft?: boolean },
): this
attestExpect(result).outputHasAnyKeyword(["yes", "confirmed", "approved"]);
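
The all/any distinction and the caseSensitive option can be sketched with plain string operations. This is an illustration only: the helper names are hypothetical, and defaulting caseSensitive to false is an assumption, since the default is not stated in this reference.

```typescript
// Substring test; case-insensitive unless `caseSensitive` is set.
// Assumption: the default is case-insensitive matching.
function containsText(output: string, needle: string, caseSensitive = false): boolean {
  return caseSensitive
    ? output.includes(needle)
    : output.toLowerCase().includes(needle.toLowerCase());
}

// Every keyword must be present.
function hasAllKeywords(output: string, keywords: string[], caseSensitive = false): boolean {
  return keywords.every((keyword) => containsText(output, keyword, caseSensitive));
}

// At least one keyword must be present.
function hasAnyKeyword(output: string, keywords: string[], caseSensitive = false): boolean {
  return keywords.some((keyword) => containsText(output, keyword, caseSensitive));
}

const sampleOutput = "Price: $10. Shipping is free.";

console.log(hasAllKeywords(sampleOutput, ["price", "shipping"]));        // true
console.log(hasAllKeywords(sampleOutput, ["price", "availability"]));    // false
console.log(hasAnyKeyword(sampleOutput, ["availability", "shipping"]));  // true
console.log(containsText(sampleOutput, "price", true));                  // false
```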

Asserts the output does not contain any of the specified terms. Always a hard failure.

outputForbids(terms: string[]): this
attestExpect(result).outputForbids(["password", "ssn", "credit card"]);

Generic content check against an arbitrary trace target path.

contentContains(
target: string,
value: string,
opts?: { caseSensitive?: boolean; soft?: boolean },
): this
attestExpect(result).contentContains("steps[0].result.completion", "success");

Embedding assertions compare semantic similarity between the output and a reference string. They require an embedding provider.

Asserts the output is semantically similar to a reference text.

outputSimilarTo(
reference: string,
opts?: { threshold?: number; model?: string; soft?: boolean },
): this
| Option | Type | Default | Description |
| --- | --- | --- | --- |
| threshold | number | 0.8 | Minimum cosine similarity score (0.0 to 1.0) |
| model | string | Provider default | Embedding model override |
| soft | boolean | false | Treat failure as warning |
attestExpect(result).outputSimilarTo(
"The product is available and ships within 2 business days",
{ threshold: 0.85 },
);
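
The threshold option is compared against a cosine similarity score. The sketch below shows that calculation on small hand-written vectors; real embeddings come from the configured provider and have far more dimensions, and `cosineSimilarity` and `meetsThreshold` are hypothetical helpers, not library functions.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vector has no direction
  return dot / Math.sqrt(normA * normB);
}

// Assumption: the score must be at least the threshold to pass.
function meetsThreshold(score: number, threshold = 0.8): boolean {
  return score >= threshold;
}

const identical = cosineSimilarity([1, 0, 1], [1, 0, 1]); // 1
const partial = cosineSimilarity([1, 1], [1, 0]);         // about 0.707

console.log(meetsThreshold(identical)); // true
console.log(meetsThreshold(partial));   // false
```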

LLM judge assertions use a language model to evaluate output quality against criteria. This is the most expensive assertion layer.

Asserts the output passes an LLM judge evaluation.

passesJudge(
criteria: string,
opts?: { rubric?: string; threshold?: number; model?: string; soft?: boolean },
): this
| Option | Type | Default | Description |
| --- | --- | --- | --- |
| rubric | string | "default" | Evaluation rubric name |
| threshold | number | 0.8 | Minimum judge score (0.0 to 1.0) |
| model | string | Provider default | Judge model override |
| soft | boolean | false | Treat failure as warning |
attestExpect(result).passesJudge(
"Response is helpful, accurate, and addresses all parts of the user question",
{ threshold: 0.9 },
);
attestExpect(result).passesJudge("Response does not contain hallucinated facts", {
rubric: "factuality",
model: "gpt-4.1",
});

Trace tree assertions validate multi-agent delegation patterns, cross-agent data flow, and aggregate metrics across the full agent tree.

Asserts a specific agent was invoked somewhere in the trace tree.

agentCalled(agentId: string, opts?: { soft?: boolean }): this
attestExpect(result).agentCalled("researcher");

Asserts the maximum delegation depth does not exceed a limit.

delegationDepth(maxDepth: number, opts?: { soft?: boolean }): this
attestExpect(result).delegationDepth(3);
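
Both of these checks are simple recursive walks over the agent tree. The sketch below uses a hypothetical `AgentNode` shape and counts the root agent as depth 0; how the engine actually represents the tree and where it starts counting are assumptions.

```typescript
// Hypothetical tree shape; not the library's trace format.
interface AgentNode {
  agentId: string;
  children: AgentNode[];
}

// True when `agentId` appears anywhere in the tree.
function agentWasCalled(node: AgentNode, agentId: string): boolean {
  return node.agentId === agentId || node.children.some((c) => agentWasCalled(c, agentId));
}

// Number of delegation hops on the longest root-to-leaf path (root = 0).
function maxDelegationDepth(node: AgentNode): number {
  let deepest = 0;
  for (const child of node.children) {
    deepest = Math.max(deepest, 1 + maxDelegationDepth(child));
  }
  return deepest;
}

const agentTree: AgentNode = {
  agentId: "orchestrator",
  children: [
    { agentId: "researcher", children: [{ agentId: "fetcher", children: [] }] },
    { agentId: "writer", children: [] },
  ],
};

console.log(agentWasCalled(agentTree, "fetcher"));  // true
console.log(agentWasCalled(agentTree, "reviewer")); // false
console.log(maxDelegationDepth(agentTree));         // 2
```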

Asserts a specific agent’s output contains a substring.

agentOutputContains(
agentId: string,
value: string,
opts?: { caseSensitive?: boolean; soft?: boolean },
): this
attestExpect(result).agentOutputContains("summarizer", "conclusion");

Asserts data flows from one agent to another through a specific field.

crossAgentDataFlow(
fromAgent: string,
toAgent: string,
field: string,
opts?: { soft?: boolean },
): this
attestExpect(result).crossAgentDataFlow("researcher", "writer", "findings");

Asserts the agent delegation graph contains specific parent-child transitions.

followsTransitions(transitions: [string, string][], opts?: { soft?: boolean }): this
attestExpect(result).followsTransitions([
["orchestrator", "researcher"],
["orchestrator", "writer"],
]);
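
One way to picture this check is to collect every parent-child edge in the tree and confirm each requested transition is among them. The sketch below does that with a hypothetical `DelegationNode` type; it assumes agent IDs do not contain the "->" separator used to build edge keys.

```typescript
// Hypothetical tree shape for illustration.
interface DelegationNode {
  agentId: string;
  children: DelegationNode[];
}

// Collect every parent -> child edge as a "parent->child" key.
function collectEdges(node: DelegationNode, edges: Set<string> = new Set<string>()): Set<string> {
  for (const child of node.children) {
    edges.add(`${node.agentId}->${child.agentId}`);
    collectEdges(child, edges);
  }
  return edges;
}

// True when every requested transition exists somewhere in the tree.
function hasTransitions(root: DelegationNode, transitions: [string, string][]): boolean {
  const edges = collectEdges(root);
  return transitions.every(([parent, child]) => edges.has(`${parent}->${child}`));
}

const delegationTree: DelegationNode = {
  agentId: "orchestrator",
  children: [
    { agentId: "researcher", children: [] },
    { agentId: "writer", children: [] },
  ],
};

console.log(hasTransitions(delegationTree, [["orchestrator", "researcher"], ["orchestrator", "writer"]])); // true
console.log(hasTransitions(delegationTree, [["researcher", "writer"]]));                                   // false
```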

Asserts the total cost across all agents in the tree is at or below a threshold.

aggregateCostUnder(maxCost: number, opts?: { soft?: boolean }): this
attestExpect(result).aggregateCostUnder(0.50);

Asserts the total token usage across all agents in the tree is at or below a threshold.

aggregateTokensUnder(maxTokens: number, opts?: { soft?: boolean }): this
attestExpect(result).aggregateTokensUnder(10000);
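
Aggregate checks differ from costUnder and tokensUnder in that they sum a metric over every agent in the tree rather than reading one trace. A minimal sketch, assuming a hypothetical `MeteredNode` shape with per-agent token and cost figures:

```typescript
// Hypothetical per-agent metrics; the field names are assumptions.
interface MeteredNode {
  agentId: string;
  tokens: number;
  costUsd: number;
  children: MeteredNode[];
}

// Sum one metric over the node and all of its descendants.
function sumOverTree(node: MeteredNode, metric: "tokens" | "costUsd"): number {
  let total = node[metric];
  for (const child of node.children) {
    total += sumOverTree(child, metric);
  }
  return total;
}

const meteredTree: MeteredNode = {
  agentId: "orchestrator",
  tokens: 1200,
  costUsd: 0.1,
  children: [
    { agentId: "researcher", tokens: 3400, costUsd: 0.15, children: [] },
    { agentId: "writer", tokens: 2100, costUsd: 0.2, children: [] },
  ],
};

const totalTokens = sumOverTree(meteredTree, "tokens");
const totalCost = sumOverTree(meteredTree, "costUsd");

console.log(totalTokens);          // 6700
console.log(totalTokens <= 10000); // true: within the token budget
console.log(totalCost <= 0.5);     // true: within the cost budget
```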

Asserts one agent completed before another started.

agentOrderedBefore(agentA: string, agentB: string, opts?: { soft?: boolean }): this
attestExpect(result).agentOrderedBefore("planner", "executor");

Asserts two agents had overlapping execution windows (ran concurrently).

agentsOverlap(agentA: string, agentB: string, opts?: { soft?: boolean }): this
attestExpect(result).agentsOverlap("researcher_a", "researcher_b");
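
Both timing assertions can be read as comparisons between execution windows. The sketch below uses a hypothetical `AgentWindow` type with start and end timestamps in milliseconds. How the engine treats windows that merely touch (one ends exactly when the other starts) is not documented here; this sketch counts that as ordered and not overlapping.

```typescript
// Hypothetical execution window for one agent.
interface AgentWindow {
  agentId: string;
  startMs: number;
  endMs: number;
}

// A finished (at or) before B started.
function orderedBefore(a: AgentWindow, b: AgentWindow): boolean {
  return a.endMs <= b.startMs;
}

// The two windows share some strictly positive span of time.
function windowsOverlap(a: AgentWindow, b: AgentWindow): boolean {
  return a.startMs < b.endMs && b.startMs < a.endMs;
}

const planner: AgentWindow = { agentId: "planner", startMs: 0, endMs: 100 };
const executor: AgentWindow = { agentId: "executor", startMs: 100, endMs: 300 };
const researcherA: AgentWindow = { agentId: "researcher_a", startMs: 100, endMs: 400 };
const researcherB: AgentWindow = { agentId: "researcher_b", startMs: 120, endMs: 350 };

console.log(orderedBefore(planner, executor));         // true
console.log(windowsOverlap(planner, executor));        // false
console.log(windowsOverlap(researcherA, researcherB)); // true
```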

Asserts a specific agent’s wall-clock execution time is under a threshold.

agentWallTimeUnder(agentId: string, maxMs: number, opts?: { soft?: boolean }): this
attestExpect(result).agentWallTimeUnder("summarizer", 5000);

Asserts agents execute in ordered groups. Agents within the same group may run in any order, but groups execute sequentially.

orderedAgents(groups: string[][], opts?: { soft?: boolean }): this
attestExpect(result).orderedAgents([
["planner"], // Phase 1
["researcher_a", "researcher_b"], // Phase 2 (parallel)
["writer"], // Phase 3
]);
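
One plausible reading of the group rule is that every agent in a group must finish before any agent in the next group starts. The sketch below implements that reading; the `Span` type and `groupsRunInOrder` helper are hypothetical, and the engine's actual rule (for example, how it treats agents missing from the trace) may differ.

```typescript
// Hypothetical execution span for one agent, in milliseconds.
interface Span {
  startMs: number;
  endMs: number;
}

function groupsRunInOrder(spans: Map<string, Span>, groups: string[][]): boolean {
  // Resolve agent IDs to spans; an agent that never ran fails the check.
  const resolved: Span[][] = [];
  for (const group of groups) {
    const groupSpans: Span[] = [];
    for (const id of group) {
      const span = spans.get(id);
      if (span === undefined) return false;
      groupSpans.push(span);
    }
    resolved.push(groupSpans);
  }
  // Each group must fully finish before the next group begins.
  for (let i = 0; i + 1 < resolved.length; i++) {
    const latestEnd = Math.max(...resolved[i].map((s) => s.endMs));
    const earliestStart = Math.min(...resolved[i + 1].map((s) => s.startMs));
    if (latestEnd > earliestStart) return false;
  }
  return true;
}

const agentSpans = new Map<string, Span>([
  ["planner", { startMs: 0, endMs: 100 }],
  ["researcher_a", { startMs: 100, endMs: 400 }],
  ["researcher_b", { startMs: 120, endMs: 350 }],
  ["writer", { startMs: 400, endMs: 600 }],
]);

const phases = [["planner"], ["researcher_a", "researcher_b"], ["writer"]];

console.log(groupsRunInOrder(agentSpans, phases));                    // true
console.log(groupsRunInOrder(agentSpans, [["writer"], ["planner"]])); // false
```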

import { attestExpect, agent, TraceBuilder } from '@attest-ai/core';
import { evaluate } from '@attest-ai/vitest';
import { describe, it, expect } from 'vitest';

const myAgent = agent("assistant", (builder, args) => {
  builder.addLlmCall("gpt-4.1", {
    args: { prompt: args.question },
    result: { completion: "Paris is the capital of France." },
  });
  builder.addToolCall("fact_check", {
    args: { claim: "Paris is the capital of France" },
    result: { verified: true },
  });
  return { message: "Paris is the capital of France." };
});

describe("assistant agent", () => {
  it("produces correct, efficient output", async () => {
    const result = myAgent({ question: "What is the capital of France?" });
    const chain = attestExpect(result)
      // Layer 1: Schema
      .outputMatchesSchema({
        type: "object",
        required: ["message"],
        properties: { message: { type: "string" } },
      })
      // Layer 2: Constraints
      .tokensUnder(500)
      .costUnder(0.01)
      // Layer 3: Trace
      .requiredTools(["fact_check"])
      .forbiddenTools(["delete_data"])
      // Layer 4: Content
      .outputContains("Paris")
      .outputNotContains("error");
    const evaluated = await evaluate(chain);
    expect(evaluated.passed).toBe(true);
  });
});

| Layer | Type | Cost | Methods |
| --- | --- | --- | --- |
| 1 | Schema | Free | outputMatchesSchema, outputFieldMatchesSchema, toolArgsMatchSchema, toolResultMatchesSchema |
| 2 | Constraint | Free | costUnder, latencyUnder, tokensUnder, tokensBetween, stepCount, toolCallCount, constraint |
| 3 | Trace | Free | toolsCalledInOrder, toolsCalledExactly, noToolLoops, noDuplicateTools, requiredTools, forbiddenTools |
| 4 | Content | Free | outputContains, outputNotContains, outputMatchesRegex, outputHasAllKeywords, outputHasAnyKeyword, outputForbids, contentContains |
| 5 | Embedding | Paid | outputSimilarTo |
| 6 | LLM Judge | Paid | passesJudge |
| 7 | Trace Tree | Free | agentCalled, delegationDepth, agentOutputContains, crossAgentDataFlow, followsTransitions, aggregateCostUnder, aggregateTokensUnder, agentOrderedBefore, agentsOverlap, agentWallTimeUnder, orderedAgents |