
Overview

Attest is a testing framework for AI agents. It provides an 8-layer graduated assertion pipeline that starts with free deterministic checks and escalates to expensive LLM-based evaluation only when needed.

Testing AI agents is different from testing traditional software:

  • Non-deterministic outputs — The same input produces different outputs across runs
  • Multi-step behavior — Agents make sequences of decisions (tool calls, API calls, delegations)
  • Cost per test — LLM-as-judge evaluation costs money on every run
  • Framework fragmentation — Teams use OpenAI, LangChain, CrewAI, Google ADK — each with different tracing formats

Most existing tools solve this by throwing an LLM judge at everything. That’s expensive, slow, and flaky. Attest takes a different approach: check what you can deterministically first, then escalate.
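The check-cheap-first idea can be sketched in a few lines of plain Python. The function names below are illustrative stand-ins, not Attest's API, and the LLM judge is stubbed out so the sketch stays free and deterministic:

```python
def check_contains(output: str, needle: str) -> bool:
    # Cheap, deterministic substring check (analogous to a content-layer assertion).
    return needle in output

def llm_judge(output: str, criterion: str) -> bool:
    # Stand-in for an expensive LLM-as-judge call (~$0.001 per run in a real pipeline).
    # Stubbed here: always passes.
    return True

def graduated_check(output: str, needle: str, criterion: str) -> bool:
    # Escalate only when the free check cannot already settle the question.
    if not check_contains(output, needle):
        return False  # Fail fast: no LLM call, no cost.
    return llm_judge(output, criterion)

print(graduated_check("2+2 is 4", "4", "mathematically correct"))  # True
```

A failing deterministic check short-circuits the pipeline, so the expensive judge only runs on outputs that already cleared the free layers.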

1. Instrument your agent

Use one of 11 adapters (or the @agent decorator) to capture execution traces:

from attest import agent, expect

@agent("my-agent")
async def my_agent(query: str) -> str:
    # Your agent logic — tool calls, LLM calls, delegations
    ...

result = await my_agent("What's 2+2?")
2. Write assertions

expect(result).output_to_contain("4")                       # Layer 4: Free, instant
expect(result).to_have_steps(["llm_call"])                  # Layer 3: Free, instant
expect(result).output_to_satisfy("mathematically correct")  # Layer 6: ~$0.001

3. Run with pytest (Python) or vitest (TypeScript)

# Python
pytest test_agent.py -v
# TypeScript
npx vitest run
Features

  • 8 assertion layers — Schema, constraint, trace, content, embedding, LLM judge, trace tree, plugin
  • 11 framework adapters — OpenAI, Anthropic, Gemini, Ollama, LangChain, LlamaIndex, CrewAI, Google ADK, OpenTelemetry, Manual
  • Multi-agent trace trees — Test delegation chains, cross-agent assertions, temporal ordering
  • Simulation mode — Run tests without API calls using mock tools and personas
  • Continuous evaluation — Sample production traces and run assertions with alerting
  • Drift detection — Monitor agent behavior changes over time
  • Cost tracking — Per-assertion cost metrics with tier budgets
  • Plugin system — Extend with custom evaluation logic via entry points
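As a rough illustration of the cost-tracking idea, a per-tier budget check can be sketched in plain Python. The class, tier names, and budget values here are hypothetical, not Attest's API or defaults:

```python
from collections import defaultdict

# Hypothetical per-tier budgets in USD (illustrative values only).
TIER_BUDGETS = {"deterministic": 0.0, "embedding": 0.01, "llm_judge": 0.05}

class CostTracker:
    def __init__(self):
        self.spent = defaultdict(float)

    def record(self, tier: str, cost: float) -> None:
        # Accumulate the cost of one assertion under its tier.
        self.spent[tier] += cost

    def over_budget(self) -> list[str]:
        # Return the tiers whose accumulated cost exceeds their budget.
        return [t for t, c in self.spent.items() if c > TIER_BUDGETS.get(t, 0.0)]

tracker = CostTracker()
tracker.record("deterministic", 0.0)   # free layers cost nothing
tracker.record("llm_judge", 0.001)     # an LLM-judge assertion at ~$0.001
print(tracker.over_budget())           # → []
```

Recording costs per tier rather than in one lump makes it easy to alert on the expensive layers specifically while leaving the free layers unmetered.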

Attest uses a subprocess architecture: a Go engine handles assertion evaluation while SDKs (Python, TypeScript) handle instrumentation and the developer API.

┌─────────────┐                        ┌──────────────┐
│  Python SDK │      JSON-RPC 2.0      │  Go Engine   │
│  or TS SDK  │ ◄─── NDJSON/stdio ───► │ (subprocess) │
└─────────────┘                        └──────────────┘

The engine is a single static binary — no runtime dependencies, no containers. SDKs auto-download the correct version on first run.
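The wire format described above can be sketched as newline-delimited JSON-RPC 2.0 frames. The method name and params below are hypothetical, not the engine's actual RPC surface:

```python
import json

def encode_frame(method: str, params: dict, msg_id: int) -> bytes:
    # One JSON-RPC 2.0 request per line (NDJSON framing over stdin/stdout).
    msg = {"jsonrpc": "2.0", "id": msg_id, "method": method, "params": params}
    return (json.dumps(msg) + "\n").encode()

def decode_frames(stream: bytes) -> list[dict]:
    # Split the byte stream on newlines and parse each complete frame.
    return [json.loads(line) for line in stream.splitlines() if line.strip()]

# Hypothetical request: ask the engine to evaluate a layer-4 content check.
frame = encode_frame("assert.evaluate", {"layer": 4, "needle": "4"}, 1)
print(decode_frames(frame)[0]["method"])  # assert.evaluate
```

Newline framing keeps the protocol trivially parseable on both sides: each side reads a line, parses it as JSON, and dispatches on the method or id.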

Current release: v0.4.2 (SDK), v0.4.0 (engine). Alpha stage — API surface is stabilizing but breaking changes are still possible before v1.0.

Package             Registry   Install
attest-ai           PyPI       pip install attest-ai
@attest-ai/core     npm        npm install @attest-ai/core
@attest-ai/vitest   npm        npm install @attest-ai/vitest