Changelog

All notable changes to Attest are documented here. Versions follow Semantic Versioning.


Production Hardening — test coverage, TS CLI, documentation

  • Go engine test coverage — Integration tests for evaluate_batch, submit_plugin_result, and shutdown JSON-RPC methods. Coverage for concurrent request handling, schema compiler cache, and trace validation.
  • TypeScript CLI — npx attest init scaffolds a vitest-based test project with @attest-ai/core and @attest-ai/vitest pre-configured.
  • TypeScript examples — Four ported examples (quickstart, openai-adapter, schema-assertions, content-assertions) in the attest-examples repo.
  • Documentation site — Updated changelog, configuration reference, migration guide (v0.5 to v0.6), and SDK reference pages covering all v0.5–v0.7 features.
  • Engine: integration tests for all JSON-RPC methods, concurrent request handling, error paths
  • Python SDK: ExpectChain.plugin(), aggregate_latency_under(), all_tools_called() assertion tests
  • TypeScript SDK: plugin system, continuous eval runner, branded types, discriminated union specs

    # Python
    uv add attest-ai@latest
    # TypeScript
    pnpm add @attest-ai/core@latest @attest-ai/vitest@latest

TypeScript Parity — full feature alignment with Python SDK

  • Discriminated union specs — Step type uses a kind discriminant (llm_call | tool_call | retrieval | agent_call) for exhaustive switch handling without type guards.
  • Branded types — TraceId, AssertionId, and AgentId newtypes prevent string mixing at compile time.
  • Plugin system (TS) — PluginRegistry and AttestPlugin interface for registering custom assertion plugins in TypeScript. Matches the Python attest.plugins entry point API.
  • Continuous eval (TS) — ContinuousEvalRunner, Sampler, and AlertDispatcher ported from Python. Supports sampling strategies and alerting via webhooks.
  • LangChain.js adapter — @attest-ai/core/adapters/langchain captures traces from LangChain.js agents with automatic callback instrumentation.
  • TraceAdapter type safety — Adapter traceFromResponse() return type is Trace (not unknown), enabling end-to-end type inference.
  • CJS dual output — tsup-based build produces both ESM and CommonJS bundles. package.json exports map resolves the correct format automatically.
  • Python TraceTree analytics — TraceTree.summary() returns aggregate metrics (total cost, tokens, latency, agent count, max depth) across the full delegation tree.
  • Adapter API rename — capture() is replaced by traceFromResponse() on all TypeScript adapters. The old name is removed (no deprecation shim).
  • CJS consumers — If you previously used a bundler workaround for ESM-only @attest-ai/core, remove it. The package now ships dual ESM/CJS.
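
The aggregate metrics listed for TraceTree.summary() can be pictured as a single walk over the delegation tree. The sketch below is illustrative only: the Node class, its field names, the choice to sum latency, and counting depth from 0 at the root are all assumptions, not the SDK's actual TraceTree implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for one agent's trace in a delegation tree."""
    agent_id: str
    cost_usd: float = 0.0
    tokens: int = 0
    latency_ms: int = 0
    children: List["Node"] = field(default_factory=list)


def summarize(root: Node) -> Dict[str, float]:
    """Aggregate metrics over the whole tree with an iterative depth-first walk."""
    total_cost = 0.0
    total_tokens = 0
    total_latency_ms = 0
    agents = set()
    max_depth = 0

    stack = [(root, 0)]  # (node, depth); the root sits at depth 0 (assumed)
    while stack:
        node, depth = stack.pop()
        total_cost += node.cost_usd
        total_tokens += node.tokens
        total_latency_ms += node.latency_ms  # summed here; an assumption
        agents.add(node.agent_id)
        max_depth = max(max_depth, depth)
        for child in node.children:
            stack.append((child, depth + 1))

    return {
        "total_cost": total_cost,
        "total_tokens": total_tokens,
        "total_latency_ms": total_latency_ms,
        "agent_count": len(agents),
        "max_depth": max_depth,
    }
```

An explicit stack is used instead of recursion so that very deep delegation chains cannot hit Python's recursion limit.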

    pnpm add @attest-ai/core@latest @attest-ai/vitest@latest

Performance — engine-only optimizations

  • Schema compiler cache — JSON Schema compilation results are cached per-schema hash. Eliminates recompilation on repeated evaluate_batch calls with the same schema.
  • Trace validation optimization — Trace validation short-circuits on first error in non-verbose mode, reducing per-evaluation overhead.
  • SQL query optimizations — History store queries use covering indexes for list_results and drift_query. Batch inserts use prepared statements.
  • Prepared statement pooling — Frequently-used SQL statements are prepared once and reused across evaluations.
  • Result pagination — list_results supports limit/offset for large history stores.
  • segmentio/encoding — JSON codec switched from encoding/json to segmentio/encoding/json for ~2x faster marshal/unmarshal on evaluation payloads.
  • Engine-only release. No SDK changes. Update the engine binary via ATTEST_ENGINE_PATH or let auto-download fetch the new version.
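
The schema compiler cache amounts to "compile once per distinct schema, keyed by a content hash". The engine itself is written in Go; the Python sketch below only illustrates the idea. SchemaCache, its methods, and the choice of SHA-256 over canonical JSON are hypothetical, since the changelog does not say how the hash is computed.

```python
import hashlib
import json
from typing import Any, Callable, Dict


class SchemaCache:
    """Hypothetical cache: compile each distinct schema once, keyed by content hash."""

    def __init__(self, compile_fn: Callable[[dict], Any]) -> None:
        self._compile = compile_fn
        self._compiled: Dict[str, Any] = {}
        self.misses = 0  # number of real compilations performed

    @staticmethod
    def schema_hash(schema: dict) -> str:
        # Canonical JSON (sorted keys, no extra whitespace) so that schemas
        # with the same content but different key order hash identically.
        canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get(self, schema: dict) -> Any:
        key = self.schema_hash(schema)
        if key not in self._compiled:
            self.misses += 1
            self._compiled[key] = self._compile(schema)
        return self._compiled[key]
```

Repeated batches that carry the same schema would then pay the compilation cost once and reuse the cached result afterwards.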

Robustness — timeouts, bounded resources, new assertions

  • Budget tracking — BudgetTracker enforces per-evaluation cost limits. Set ATTEST_BUDGET_MAX_COST to cap total spend across judge and embedding assertions.
  • Concurrent request support — Engine handles multiple evaluate_batch requests concurrently with per-request isolation.
  • Configurable judge cache — Judge response cache size is configurable via ATTEST_JUDGE_CACHE_MAX_MB (default: 100 MB). Cache eviction uses LRU.
  • History retention policy — ATTEST_HISTORY_MAX_ROWS and ATTEST_HISTORY_MAX_AGE_DAYS control automatic cleanup of old evaluation results.
  • Engine read timeout — ATTEST_ENGINE_TIMEOUT (default: 30s) prevents SDK hangs when the engine process stalls.
  • Bounded continuous eval queue — ATTEST_CONTINUOUS_QUEUE_SIZE (default: 1000) caps the evaluation queue. Overflow uses backpressure instead of unbounded growth.
  • ExpectChain.plugin() — Chain custom plugin assertions alongside built-in ones: expect(result).output_contains("ok").plugin("my_plugin", config).
  • Simulation mode (TS) — ATTEST_SIMULATION=1 works in the TypeScript SDK, returning deterministic mock results without an engine process.
  • Engine read timeout — SDK no longer hangs indefinitely if the engine process crashes or stalls mid-response.
  • History store cleanup — Unbounded row growth in SQLite history is now capped by retention policy.
| Variable | Purpose | Default |
|---|---|---|
| ATTEST_BUDGET_MAX_COST | Maximum USD spend per evaluation | unset (unlimited) |
| ATTEST_JUDGE_CACHE_MAX_MB | Judge response LRU cache size (MB) | 100 |
| ATTEST_HISTORY_MAX_ROWS | Maximum rows in history store | 10000 |
| ATTEST_HISTORY_MAX_AGE_DAYS | Auto-delete results older than N days | 90 |
| ATTEST_ENGINE_TIMEOUT | Engine response timeout (seconds) | 30 |
| ATTEST_CONTINUOUS_QUEUE_SIZE | Continuous eval queue capacity | 1000 |
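
As a rough illustration of how these variables and their defaults fit together, the sketch below reads them from an environment mapping. The variable names and defaults come from the table above; RobustnessConfig, load_config, and the parsing rules (empty string treated as unset) are assumptions made for this example and are not part of the SDK.

```python
import os
from dataclasses import dataclass
from typing import Callable, Mapping, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class RobustnessConfig:
    """Hypothetical container for the settings listed in the table."""
    budget_max_cost: Optional[float]  # None means unlimited
    judge_cache_max_mb: int
    history_max_rows: int
    history_max_age_days: int
    engine_timeout_s: float
    continuous_queue_size: int


def _read(env: Mapping[str, str], name: str, default: T, cast: Callable[[str], T]) -> T:
    # Treat a missing or empty variable as "use the default" (assumed behavior).
    raw = env.get(name, "").strip()
    return cast(raw) if raw else default


def load_config(env: Mapping[str, str] = os.environ) -> RobustnessConfig:
    return RobustnessConfig(
        budget_max_cost=_read(env, "ATTEST_BUDGET_MAX_COST", None, float),
        judge_cache_max_mb=_read(env, "ATTEST_JUDGE_CACHE_MAX_MB", 100, int),
        history_max_rows=_read(env, "ATTEST_HISTORY_MAX_ROWS", 10000, int),
        history_max_age_days=_read(env, "ATTEST_HISTORY_MAX_AGE_DAYS", 90, int),
        engine_timeout_s=_read(env, "ATTEST_ENGINE_TIMEOUT", 30.0, float),
        continuous_queue_size=_read(env, "ATTEST_CONTINUOUS_QUEUE_SIZE", 1000, int),
    )
```

Passing the environment as a parameter keeps the function easy to exercise in tests without mutating os.environ.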

    # Python
    uv add attest-ai@latest
    # TypeScript
    pnpm add @attest-ai/core@latest

Correctness & Safety — adapter fixes, engine hardening

  • submit_plugin_result — Implement the previously-stubbed submit_plugin_result JSON-RPC method. Plugin evaluations now round-trip correctly through the engine.
  • Trace ID validation — Engine rejects traces with missing or malformed trace_id fields instead of silently accepting them.
  • Step type validation — Unknown step types (kind field) return a typed error instead of being silently dropped.
  • Assertion ID uniqueness — Engine enforces unique assertion IDs within a batch. Duplicate IDs return an error.
  • Error response codes — All engine errors use spec-compliant JSON-RPC error codes (-32600 to -32603).
  • Shutdown draining — shutdown waits for in-flight evaluations to complete (5s timeout) before exiting.
  • OpenAI — Tool call arguments are parsed from JSON string to object. Previously, function.arguments was passed as a raw string, causing schema assertions on tool args to fail.
  • Ollama — Empty tool call arrays are normalized to undefined instead of []. Prevents phantom “0 tool calls” in trace summaries.
  • Gemini — Token count extraction reads usage_metadata.total_token_count (was missing, reported as 0).
  • LangChain — Callback handler implements full BaseCallbackHandler protocol including ignore_chat_model and ignore_retriever.
  • Anthropic — System prompt is captured as a separate step when present.
  • delegate() — parent_trace_id is now set correctly on child traces, fixing broken TraceTree traversal for multi-agent scenarios.
  • Adapter integration tests — Each adapter has a dedicated test suite verifying trace capture, token counting, and tool call extraction.
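
The OpenAI fix above comes down to decoding function.arguments, which the Chat Completions API returns as a JSON-encoded string, before it reaches schema assertions. A minimal sketch of that normalization follows; parse_tool_arguments is a hypothetical helper, and falling back to the raw value on malformed JSON is an assumption rather than the adapter's documented behavior.

```python
import json
from typing import Any


def parse_tool_arguments(raw: Any) -> Any:
    """Decode tool-call arguments from a JSON string into a Python object.

    Values that are already decoded are returned unchanged. Malformed JSON
    is returned as the original string (assumed fallback).
    """
    if not isinstance(raw, str):
        return raw
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return raw


# Shape of one tool call as it appears in a Chat Completions response message.
tool_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
}
args = parse_tool_arguments(tool_call["function"]["arguments"])
```

With the arguments decoded into a dict, a schema assertion can validate individual fields such as city instead of failing on an opaque string.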

    # Python
    uv add attest-ai@latest
    # TypeScript
    pnpm add @attest-ai/core@latest

SDK patch release — adapter fixes and async compatibility

  • LangChain adapter — Add missing callback protocol attributes (ignore_agent, ignore_retry, raise_error) required by LangChain’s BaseCallbackHandler interface. Handle LangGraph AIMessage and ToolMessage output formats so traces capture tool-call responses correctly.
  • expect() DSL — Accept Trace directly in addition to AgentResult. Auto-wraps into AgentResult for manual adapter workflows that build traces via TraceBuilder without going through a provider adapter.
  • Plugin fixture — Run the engine event loop in a background daemon thread with a run_coroutine_threadsafe() bridge. Fixes “Future attached to a different loop” errors when pytest-asyncio tests (e.g., google-adk) call into the engine from a separate event loop.
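
The plugin fixture change follows a common asyncio pattern: one event loop lives in a daemon thread, and callers on other threads or loops submit coroutines to it with asyncio.run_coroutine_threadsafe(). The sketch below shows that pattern in isolation. BackgroundLoop and fake_engine_call are hypothetical names, and this is not the fixture's actual code.

```python
import asyncio
import threading
from typing import Any, Coroutine, Optional


class BackgroundLoop:
    """Owns a single event loop that runs forever in a daemon thread."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def run(self, coro: Coroutine[Any, Any, Any], timeout: Optional[float] = None) -> Any:
        """Schedule a coroutine on the background loop and block for its result."""
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout)

    def close(self) -> None:
        # Stop the loop from its own thread, then wait for the thread to exit.
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def fake_engine_call(value: int) -> int:
    """Stand-in for an async engine request."""
    await asyncio.sleep(0)
    return value * 2
```

Because every engine coroutine runs on the same dedicated loop, futures are never shared across loops, which is the condition that triggers the error. Code that is itself running inside another event loop would typically await asyncio.wrap_future(...) on the returned future rather than block on result().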

    uv add attest-ai@latest
  • No engine changes. The Go engine binary remains at v0.4.0. ENGINE_VERSION is unchanged; auto-download continues to fetch v0.4.0 binaries.

SDK patch release — engine auto-download

  • Engine auto-download — Both Python and TypeScript SDKs now automatically download the attest-engine binary from GitHub Releases on first use. No manual binary setup required after uv add attest-ai or pnpm add @attest-ai/core.

  • SHA256 verification — Downloaded binaries are verified against checksums-sha256.txt from the release. Checksum mismatch aborts the download with a clear error.

  • Version-pinned cache — Binaries are cached at ~/.attest/bin/ with a .engine-version marker. SDK version mismatch triggers automatic re-download.

  • Discovery chain — Engine binary resolution follows a predictable order:

    ATTEST_ENGINE_PATH env var
    → PATH lookup
    → ~/.attest/bin/ (shared cache, version-checked)
    → ../../bin/ (monorepo dev layout)
    → ./bin/ (local)
    → auto-download from GitHub Releases
    → actionable error message
  • Opt-out — Set ATTEST_ENGINE_NO_DOWNLOAD=1 to disable network access. The error message explains alternative installation methods.

  • pytest plugin — pytest.skip() replaced with pytest.fail() when the engine binary is missing. With auto-download in place, silent skipping is no longer appropriate; real errors are now surfaced.
  • TypeScript VERSION — Corrected from 0.3.0 to 0.4.1.
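
The SHA256 verification step can be sketched as follows. The layout of checksums-sha256.txt is not described here, so the example assumes the conventional sha256sum format with one "<hex digest>  <filename>" pair per line. The function names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, Union


def parse_checksums(text: str) -> Dict[str, str]:
    """Parse sha256sum-style lines into {filename: lowercase hex digest}."""
    sums: Dict[str, str] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            digest, name = parts
            # sha256sum marks binary-mode entries with a leading '*'.
            sums[name.lstrip("*")] = digest.lower()
    return sums


def sha256_file(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large binaries are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Union[str, Path], checksums_text: str) -> None:
    """Raise ValueError if the file is unlisted or its digest does not match."""
    name = Path(path).name
    expected = parse_checksums(checksums_text).get(name)
    if expected is None:
        raise ValueError(f"no checksum listed for {name}")
    actual = sha256_file(path)
    if actual != expected:
        raise ValueError(f"checksum mismatch for {name}: expected {expected}, got {actual}")
```

A caller would delete the partially trusted file and abort when verify_download raises, rather than caching the binary.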

    # Python
    uv add attest-ai
    # TypeScript
    pnpm add @attest-ai/core
| Variable | Purpose | Default |
|---|---|---|
| ATTEST_ENGINE_PATH | Absolute path to engine binary — skips all discovery | unset |
| ATTEST_ENGINE_NO_DOWNLOAD | 1 / true / yes disables auto-download | unset (enabled) |
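
The discovery order and the two variables above can be read as a simple first-match search. The sketch below is a simplified illustration under several assumptions: the relative paths are resolved against a caller-supplied base directory, the cache version check is omitted, platform-specific file extensions are ignored, and resolve_engine is a hypothetical name rather than an SDK function.

```python
import shutil
from pathlib import Path
from typing import Mapping, Optional, Tuple

BINARY = "attest-engine"  # binary name as given in this changelog
TRUTHY = {"1", "true", "yes"}


def download_disabled(env: Mapping[str, str]) -> bool:
    """ATTEST_ENGINE_NO_DOWNLOAD accepts 1 / true / yes (case-insensitivity assumed)."""
    return env.get("ATTEST_ENGINE_NO_DOWNLOAD", "").strip().lower() in TRUTHY


def resolve_engine(env: Mapping[str, str], home: Path, base: Path) -> Tuple[str, Optional[Path]]:
    """Return ("path", p) for a local binary, or ("download", None) as the last resort."""
    explicit = env.get("ATTEST_ENGINE_PATH")
    if explicit:
        return ("path", Path(explicit))  # an explicit path skips all discovery

    on_path = shutil.which(BINARY, path=env.get("PATH", ""))
    if on_path:
        return ("path", Path(on_path))

    candidates = [
        home / ".attest" / "bin" / BINARY,    # shared cache (version check omitted)
        base / ".." / ".." / "bin" / BINARY,  # monorepo dev layout
        base / "bin" / BINARY,                # local
    ]
    for candidate in candidates:
        if candidate.is_file():
            return ("path", candidate)

    if download_disabled(env):
        raise RuntimeError(
            "attest-engine not found and auto-download is disabled; "
            "set ATTEST_ENGINE_PATH or install the binary manually"
        )
    return ("download", None)
```

Returning a tagged tuple keeps the network step out of the resolver, so the lookup order can be tested without touching GitHub Releases.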

Production & Polish

  • Result history with SQLite storage
  • Drift detection (σ-based statistical thresholds)
  • Continuous eval runner with sampling and alerting
  • Plugin system (attest.plugins entry point group)
  • CrewAI adapter (11 adapters total)
  • CLI init and validate commands
  • MkDocs documentation site
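
To make the σ-based drift threshold concrete, here is a minimal sketch of one common formulation: flag a new score when it lies more than k sample standard deviations from the historical mean. The function name, the default of k = 2, and the handling of short or constant histories are assumptions. The engine's actual drift rule is not specified in this changelog.

```python
from statistics import mean, stdev
from typing import Sequence


def is_drift(history: Sequence[float], current: float, sigmas: float = 2.0) -> bool:
    """Return True when `current` deviates from the history by more than `sigmas` * σ."""
    if len(history) < 2:
        return False  # not enough data to estimate σ (assumed policy)
    mu = mean(history)
    sd = stdev(history)  # sample standard deviation
    if sd == 0:
        return current != mu  # constant history: any change counts as drift
    return abs(current - mu) > sigmas * sd


# Five past scores with mean 0.80 and sample σ of roughly 0.016.
past_scores = [0.80, 0.82, 0.81, 0.79, 0.78]
dropped = is_drift(past_scores, 0.70)  # 0.10 below the mean
steady = is_drift(past_scores, 0.81)   # 0.01 above the mean
```

In this example the score of 0.70 falls well outside the 2σ band of about ±0.032 and is flagged, while 0.81 stays inside it.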

Simulation & Multi-Agent

  • Layers 7-8: simulation runtime, multi-agent testing
  • TypeScript SDK (first npm publish: @attest-ai/core, @attest-ai/vitest)
  • Framework adapters: LangChain, Google ADK, LlamaIndex

Semantic & Judge Layers

  • Layers 5-6: ONNX local embeddings, LLM-as-judge
  • Soft failure support
  • OTel adapter
  • setup-attest GitHub Action

Foundation

  • Layers 1-4: schema validation, cost/performance, trace structure, content validation
  • Python SDK with pytest plugin
  • 4 provider adapters: OpenAI, Anthropic, Gemini, Ollama
  • PyPI + GitHub release