# Drift Detection
Drift detection uses the continuous evaluation pipeline to monitor how agent behavior changes over time. When assertions that previously passed start failing, the system dispatches alerts via webhooks and Slack.
## What is Drift?

Agent drift occurs when an AI agent’s behavior changes without code changes. Common causes:
- Model updates — Provider deploys a new model version
- Data drift — Input distribution shifts over time
- Tool changes — External APIs change behavior or rate limits
- Prompt sensitivity — Subtle changes in system prompts cascade
- Context degradation — RAG retrieval quality degrades as data grows stale
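Whatever the cause, the observable signal is the same: assertions that used to pass begin failing. As a standalone illustration of the idea (not the attest API, which handles this internally), a rolling pass-rate check against a recorded baseline looks like this:

```python
from collections import deque


def make_drift_detector(baseline_pass_rate: float, window: int = 100, tolerance: float = 0.10):
    """Flag drift when the rolling assertion pass rate falls more than
    `tolerance` below the recorded baseline. Illustrative sketch only."""
    results = deque(maxlen=window)  # keeps only the most recent `window` results

    def observe(passed: bool) -> bool:
        results.append(passed)
        if len(results) < window:
            return False  # not enough samples to judge yet
        rate = sum(results) / len(results)
        return rate < baseline_pass_rate - tolerance

    return observe


# Baseline: 98% of traces passed during the reference period
detect = make_drift_detector(baseline_pass_rate=0.98, window=50, tolerance=0.05)
```

Each call to `observe` records one assertion result and returns whether the rolling pass rate has dropped below the baseline minus the tolerance.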
## Configuration

### Global Config

```python
from attest import config

config(
    sample_rate=0.05,
    alert_webhook="https://hooks.example.com/attest-drift",
    alert_slack_url="https://hooks.slack.com/services/T.../B.../xxx",
)
```
### Environment Variables

| Variable | Description |
|---|---|
| `ATTEST_SAMPLE_RATE` | Fraction of traces to evaluate for drift |
| `ATTEST_ALERT_WEBHOOK` | Webhook URL for drift alerts |
| `ATTEST_ALERT_SLACK_URL` | Slack webhook URL for drift alerts |
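Assuming these variables are read at process startup, a deployment might set them like so (values illustrative, mirroring the `config()` call above):

```shell
# Evaluate 5% of production traces for drift
export ATTEST_SAMPLE_RATE=0.05
# Where the AlertDispatcher posts JSON alert payloads
export ATTEST_ALERT_WEBHOOK="https://hooks.example.com/attest-drift"
# Optional Slack incoming-webhook URL for human-readable alerts
export ATTEST_ALERT_SLACK_URL="https://hooks.slack.com/services/T.../B.../xxx"
```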
## Setting Up Drift Assertions

Define assertions that capture your agent’s expected behavior baseline:
```python
from attest import Assertion

drift_assertions = [
    # Output quality — should always produce helpful responses
    Assertion(
        assertion_id="drift_no_errors",
        type="content",
        spec={
            "target": "output.message",
            "check": "forbidden",
            "values": ["I cannot", "I'm unable", "error occurred", "stacktrace"],
        },
    ),
    # Latency — agent should respond within SLO
    Assertion(
        assertion_id="drift_latency",
        type="constraint",
        spec={"field": "metadata.latency_ms", "operator": "lte", "value": 5000, "soft": True},
    ),
    # Cost — per-request cost should stay within budget
    Assertion(
        assertion_id="drift_cost",
        type="constraint",
        spec={"field": "metadata.cost_usd", "operator": "lte", "value": 0.05},
    ),
    # Tool usage — agent should call the expected tools
    Assertion(
        assertion_id="drift_tools",
        type="trace",
        spec={"check": "required_tools", "tools": ["search", "format_response"], "soft": True},
    ),
    # Token budget — prevent token explosion from loops
    Assertion(
        assertion_id="drift_tokens",
        type="constraint",
        spec={"field": "metadata.total_tokens", "operator": "lte", "value": 4000},
    ),
    # No tool loops
    Assertion(
        assertion_id="drift_no_loops",
        type="trace",
        spec={"check": "loop_detection", "tool": "search", "max_repetitions": 3},
    ),
]
```
## Alert Payload

When an assertion fails, the `AlertDispatcher` sends a payload to configured endpoints:

```json
{
  "drift_type": "constraint_violation",
  "score": 0.0,
  "trace_id": "trc_abc123def456",
  "assertion_id": "drift_latency",
  "explanation": "metadata.latency_ms (8200) > 5000"
}
```
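A receiving service only needs these five fields. A minimal handler sketch, where the field names come from the payload above but the routing rule (page on hard failures, ticket otherwise) is a hypothetical policy:

```python
import json


def handle_drift_alert(body: str) -> str:
    """Parse an attest drift-alert payload and decide how to route it."""
    alert = json.loads(body)
    # A score of 0.0 indicates a hard assertion failure
    severity = "page" if alert["score"] == 0.0 else "ticket"
    return (
        f"{severity}: {alert['drift_type']} on {alert['assertion_id']} "
        f"(trace {alert['trace_id']}): {alert['explanation']}"
    )
```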
### Slack Format

```
[attest] drift alert — type=constraint_violation score=0.0 trace_id=trc_abc123def456
```
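The Slack text can be reproduced from the webhook payload fields; a sketch of the formatting, assuming the dispatcher interpolates payload fields directly:

```python
def format_slack_alert(alert: dict) -> str:
    # Matches the single-line Slack format shown above
    return (
        f"[attest] drift alert — type={alert['drift_type']} "
        f"score={alert['score']} trace_id={alert['trace_id']}"
    )
```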
## Drift Detection Patterns

### Pattern 1: Output Quality Regression

Detect when model updates cause the agent to produce lower-quality outputs:

```python
# Content assertions catch obvious quality drops
Assertion(
    assertion_id="quality_keywords",
    type="content",
    spec={
        "target": "output.message",
        "check": "keyword_any",
        "values": ["helpful", "here's", "found"],
        "soft": True,
    },
)
```
### Pattern 2: Cost Explosion

Detect when changes cause the agent to use more tokens:

```python
Assertion(
    assertion_id="cost_budget",
    type="constraint",
    spec={"field": "metadata.cost_usd", "operator": "lte", "value": 0.02},
)

Assertion(
    assertion_id="token_budget",
    type="constraint",
    spec={
        "field": "metadata.total_tokens",
        "operator": "between",
        "min": 100,
        "max": 3000,
    },
)
```
### Pattern 3: Tool Behavior Changes

Detect when the agent stops using expected tools or starts looping:

```python
# Required tools
Assertion(
    assertion_id="tool_presence",
    type="trace",
    spec={"check": "required_tools", "tools": ["search", "summarize"]},
)

# No infinite loops
Assertion(
    assertion_id="no_loops",
    type="trace",
    spec={"check": "loop_detection", "tool": "search", "max_repetitions": 2},
)

# Forbidden tools (deprecated or dangerous)
Assertion(
    assertion_id="no_deprecated",
    type="trace",
    spec={"check": "forbidden_tools", "tools": ["old_search_v1", "unsafe_eval"]},
)
```
### Pattern 4: Multi-Agent Delegation Drift

For multi-agent systems, detect when delegation patterns change:

```python
# Expected delegation structure
Assertion(
    assertion_id="delegation_structure",
    type="trace_tree",
    spec={
        "check": "follows_transitions",
        "transitions": [["orchestrator", "researcher"], ["orchestrator", "writer"]],
    },
)

# Depth limit — prevent runaway recursion
Assertion(
    assertion_id="delegation_depth",
    type="trace_tree",
    spec={"check": "delegation_depth", "max_depth": 3},
)
```
## Production Deployment

### FastAPI Example

```python
from attest import Assertion, ContinuousEvalRunner
from attest.client import AttestClient
from attest.engine_manager import EngineManager


async def setup_drift_monitor():
    engine = EngineManager()
    await engine.start()
    client = AttestClient(engine)

    runner = ContinuousEvalRunner(
        client=client,
        assertions=drift_assertions,  # from examples above
        sample_rate=0.05,
        alert_webhook="https://hooks.example.com/attest",
        alert_slack_url="https://hooks.slack.com/services/T.../B.../xxx",
    )
    await runner.start()
    return runner, engine


# In request handler
async def handle(query: str, runner: ContinuousEvalRunner):
    result = await my_agent(query)
    await runner.submit(result.trace)
    return result
```
### Sample Rate Guidelines

| Environment | Rate | Rationale |
|---|---|---|
| Development | 1.0 | Evaluate every trace |
| Staging | 0.5 | Half of traces, catch regressions |
| Production (layers 1-4 only) | 0.1-0.5 | Free assertions, higher coverage |
| Production (with layers 5-6) | 0.01-0.05 | Paid assertions, minimize cost |
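Sampling is typically an independent per-trace coin flip, so a rate from the table translates directly into an expected fraction of evaluated traces. A sketch under that assumption (the helper name is hypothetical, not part of attest):

```python
import random


def should_evaluate(sample_rate: float, rng: random.Random) -> bool:
    """Bernoulli sample: evaluate this trace with probability sample_rate."""
    return rng.random() < sample_rate


rng = random.Random(0)  # seeded for reproducibility
evaluated = sum(should_evaluate(0.05, rng) for _ in range(10_000))
# With rate 0.05 over 10,000 traces, roughly 500 are evaluated
```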
## Relationship to Testing

Drift detection complements test-time assertions:
| Concern | Test-time (pytest) | Production (drift) |
|---|---|---|
| When | CI/CD pipeline | Live traffic |
| Coverage | Synthetic test cases | Real user inputs |
| Cost | Per test run | Per sampled request |
| Assertions | All layers | Layers 1-4 preferred |
| Failure mode | Test failure | Alert notification |
Use tests for comprehensive assertion coverage with synthetic inputs. Use drift detection for lightweight monitoring with real production traffic.