
Drift Detection

Drift detection uses the continuous evaluation pipeline to monitor how agent behavior changes over time. When assertions that previously passed start failing, the system dispatches alerts via webhooks and Slack.

Agent drift occurs when an AI agent’s behavior changes without code changes. Common causes:

  • Model updates — Provider deploys a new model version
  • Data drift — Input distribution shifts over time
  • Tool changes — External APIs change behavior or rate limits
  • Prompt sensitivity — Subtle changes in system prompts cascade
  • Context degradation — RAG retrieval quality degrades as data grows stale
Drift detection is configured programmatically or via environment variables:

from attest import config

config(
    sample_rate=0.05,
    alert_webhook="https://hooks.example.com/attest-drift",
    alert_slack_url="https://hooks.slack.com/services/T.../B.../xxx",
)
Variable                  Description
ATTEST_SAMPLE_RATE        Fraction of traces to evaluate for drift
ATTEST_ALERT_WEBHOOK      Webhook URL for drift alerts
ATTEST_ALERT_SLACK_URL    Slack webhook URL for drift alerts
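The same settings can be supplied through the environment; a sketch (values are illustrative, matching the config call above):

```shell
# Evaluate 5% of production traces for drift
export ATTEST_SAMPLE_RATE=0.05
# Endpoints that receive drift alerts
export ATTEST_ALERT_WEBHOOK="https://hooks.example.com/attest-drift"
export ATTEST_ALERT_SLACK_URL="https://hooks.slack.com/services/T.../B.../xxx"
```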

Define assertions that capture your agent’s expected behavior baseline:

from attest import Assertion

drift_assertions = [
    # Output quality — should always produce helpful responses
    Assertion(
        assertion_id="drift_no_errors",
        type="content",
        spec={
            "target": "output.message",
            "check": "forbidden",
            "values": ["I cannot", "I'm unable", "error occurred", "stacktrace"],
        },
    ),
    # Latency — agent should respond within SLO
    Assertion(
        assertion_id="drift_latency",
        type="constraint",
        spec={"field": "metadata.latency_ms", "operator": "lte", "value": 5000, "soft": True},
    ),
    # Cost — per-request cost should stay within budget
    Assertion(
        assertion_id="drift_cost",
        type="constraint",
        spec={"field": "metadata.cost_usd", "operator": "lte", "value": 0.05},
    ),
    # Tool usage — agent should call the expected tools
    Assertion(
        assertion_id="drift_tools",
        type="trace",
        spec={"check": "required_tools", "tools": ["search", "format_response"], "soft": True},
    ),
    # Token budget — prevent token explosion from loops
    Assertion(
        assertion_id="drift_tokens",
        type="constraint",
        spec={"field": "metadata.total_tokens", "operator": "lte", "value": 4000},
    ),
    # No tool loops
    Assertion(
        assertion_id="drift_no_loops",
        type="trace",
        spec={"check": "loop_detection", "tool": "search", "max_repetitions": 3},
    ),
]

When an assertion fails, the AlertDispatcher sends a payload to configured endpoints:

{
  "drift_type": "constraint_violation",
  "score": 0.0,
  "trace_id": "trc_abc123def456",
  "assertion_id": "drift_latency",
  "explanation": "metadata.latency_ms (8200) > 5000"
}
A matching line is also written to the application log:

[attest] drift alert — type=constraint_violation score=0.0 trace_id=trc_abc123def456
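A receiving endpoint only needs to parse this JSON. A minimal sketch of a handler (the function name and the severity split on score are assumptions, not part of attest):

```python
import json


def handle_drift_alert(body: str) -> str:
    """Parse an attest drift alert payload and summarize it for routing."""
    alert = json.loads(body)
    # Assumption: a hard failure scores 0.0 and should page someone
    severity = "page" if alert["score"] == 0.0 else "notify"
    return (
        f"[{severity}] {alert['drift_type']} on {alert['assertion_id']} "
        f"(trace {alert['trace_id']}): {alert['explanation']}"
    )


# The payload from the example above
payload = """{
  "drift_type": "constraint_violation",
  "score": 0.0,
  "trace_id": "trc_abc123def456",
  "assertion_id": "drift_latency",
  "explanation": "metadata.latency_ms (8200) > 5000"
}"""
print(handle_drift_alert(payload))
```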

Detect when model updates cause the agent to produce lower-quality outputs:

# Content assertions catch obvious quality drops
Assertion(
    assertion_id="quality_keywords",
    type="content",
    spec={
        "target": "output.message",
        "check": "keyword_any",
        "values": ["helpful", "here's", "found"],
        "soft": True,
    },
)
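The content checks used here reduce to case-insensitive substring tests. A plain-Python sketch of the assumed semantics (not attest's implementation):

```python
def check_forbidden(text: str, values: list[str]) -> bool:
    """Pass if none of the forbidden phrases appear in the output."""
    lowered = text.lower()
    return not any(v.lower() in lowered for v in values)


def check_keyword_any(text: str, values: list[str]) -> bool:
    """Pass if at least one expected keyword appears in the output."""
    lowered = text.lower()
    return any(v.lower() in lowered for v in values)


print(check_forbidden("Here's what I found.", ["I cannot", "error occurred"]))   # True
print(check_keyword_any("Here's what I found.", ["helpful", "here's", "found"]))  # True
```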

Detect when changes cause the agent to use more tokens or cost more per request:

Assertion(
    assertion_id="cost_budget",
    type="constraint",
    spec={"field": "metadata.cost_usd", "operator": "lte", "value": 0.02},
)

Assertion(
    assertion_id="token_budget",
    type="constraint",
    spec={
        "field": "metadata.total_tokens",
        "operator": "between",
        "min": 100,
        "max": 3000,
    },
)
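The constraint operators above can be pictured as simple comparisons against trace metadata. A simplified sketch of the assumed evaluation (attest's real engine additionally handles soft failures and nested field paths):

```python
def check_constraint(metadata: dict, spec: dict) -> bool:
    """Evaluate one constraint spec against flat trace metadata."""
    # "metadata.cost_usd" -> "cost_usd" (assumes a flat metadata dict)
    value = metadata[spec["field"].removeprefix("metadata.")]
    op = spec["operator"]
    if op == "lte":
        return value <= spec["value"]
    if op == "between":
        return spec["min"] <= value <= spec["max"]
    raise ValueError(f"unknown operator: {op}")


meta = {"cost_usd": 0.015, "total_tokens": 2800}
print(check_constraint(meta, {"field": "metadata.cost_usd", "operator": "lte", "value": 0.02}))  # True
print(check_constraint(meta, {"field": "metadata.total_tokens", "operator": "between", "min": 100, "max": 3000}))  # True
```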

Detect when the agent stops using expected tools or starts looping:

# Required tools
Assertion(
    assertion_id="tool_presence",
    type="trace",
    spec={"check": "required_tools", "tools": ["search", "summarize"]},
)

# No infinite loops
Assertion(
    assertion_id="no_loops",
    type="trace",
    spec={"check": "loop_detection", "tool": "search", "max_repetitions": 2},
)

# Forbidden tools (deprecated or dangerous)
Assertion(
    assertion_id="no_deprecated",
    type="trace",
    spec={"check": "forbidden_tools", "tools": ["old_search_v1", "unsafe_eval"]},
)
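loop_detection can be sketched as counting the longest consecutive run of a tool in the trace's call sequence (an assumed semantics; the trace below is illustrative):

```python
def max_consecutive_calls(tool_calls: list[str], tool: str) -> int:
    """Longest run of back-to-back calls to `tool` in a trace."""
    best = run = 0
    for name in tool_calls:
        run = run + 1 if name == tool else 0
        best = max(best, run)
    return best


trace = ["search", "search", "summarize", "search", "search", "search"]
print(max_consecutive_calls(trace, "search"))  # 3
# Violates max_repetitions=2, but would pass max_repetitions=3
print(max_consecutive_calls(trace, "search") <= 2)  # False
```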

For multi-agent systems, detect when delegation patterns change:

# Expected delegation structure
Assertion(
    assertion_id="delegation_structure",
    type="trace_tree",
    spec={
        "check": "follows_transitions",
        "transitions": [["orchestrator", "researcher"], ["orchestrator", "writer"]],
    },
)

# Depth limit — prevent runaway recursion
Assertion(
    assertion_id="delegation_depth",
    type="trace_tree",
    spec={"check": "delegation_depth", "max_depth": 3},
)
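The delegation_depth check can be pictured as the height of the agent call tree. A sketch with a toy tree (the node shape is an assumption for illustration, not attest's trace format):

```python
def delegation_depth(node: dict) -> int:
    """Height of a delegation tree: an agent with no delegates has depth 1."""
    children = node.get("delegates", [])
    if not children:
        return 1
    return 1 + max(delegation_depth(child) for child in children)


tree = {
    "agent": "orchestrator",
    "delegates": [
        {"agent": "researcher", "delegates": [{"agent": "searcher"}]},
        {"agent": "writer"},
    ],
}
print(delegation_depth(tree))  # 3 — exactly at the max_depth=3 limit above
```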
To wire everything together, start the engine, create a client, and run a ContinuousEvalRunner that samples and evaluates submitted traces:

from attest import Assertion, ContinuousEvalRunner
from attest.client import AttestClient
from attest.engine_manager import EngineManager


async def setup_drift_monitor():
    engine = EngineManager()
    await engine.start()
    client = AttestClient(engine)
    runner = ContinuousEvalRunner(
        client=client,
        assertions=drift_assertions,  # from examples above
        sample_rate=0.05,
        alert_webhook="https://hooks.example.com/attest",
        alert_slack_url="https://hooks.slack.com/services/T.../B.../xxx",
    )
    await runner.start()
    return runner, engine


# In request handler
async def handle(query: str, runner: ContinuousEvalRunner):
    result = await my_agent(query)
    await runner.submit(result.trace)
    return result
Recommended sample rates by environment:

Environment                      Rate         Rationale
Development                      1.0          Evaluate every trace
Staging                          0.5          Half of traces, catch regressions
Production (layers 1-4 only)     0.1-0.5      Free assertions, higher coverage
Production (with layers 5-6)     0.01-0.05    Paid assertions, minimize cost
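Sampling itself amounts to an independent coin flip per trace; a sketch of the assumed behavior:

```python
import random


def should_evaluate(sample_rate: float) -> bool:
    """Decide whether this trace enters the drift-evaluation pipeline."""
    return random.random() < sample_rate


# At rate 0.05, roughly 5% of 10,000 traces are evaluated
random.seed(0)
sampled = sum(should_evaluate(0.05) for _ in range(10_000))
print(sampled)
```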

Drift detection complements test-time assertions:

Concern         Test-time (pytest)      Production (drift)
When            CI/CD pipeline          Live traffic
Coverage        Synthetic test cases    Real user inputs
Cost            Per test run            Per sampled request
Assertions      All layers              Layers 1-4 preferred
Failure mode    Test failure            Alert notification

Use tests for comprehensive assertion coverage with synthetic inputs. Use drift detection for lightweight monitoring with real production traffic.