# Migrating from PromptFoo

Upgrade from PromptFoo’s YAML-based testing to Attest’s fluent Python/TypeScript API.
## Why Migrate?

PromptFoo is great for prompt evaluation. Attest provides:
- Agent testing — Not just prompts, entire agents
- Code-first — Python or TypeScript instead of YAML
- 8-layer assertions — From schema to simulation
- Framework adapters — LangChain, CrewAI, LlamaIndex
- Trace inspection — See model calls, tools, costs
- Multi-agent — Simulation runtime for scenarios
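The fluent assertion DSL behind several of these features is plain method chaining: each assertion method checks a condition, then returns `self`. A minimal toy sketch of the pattern (illustrative only; `FakeResult` and this toy `Expect` class are not Attest's actual implementation):

```python
import re


class FakeResult:
    """Stand-in for an agent result; Attest's real result object differs."""

    def __init__(self, output, cost):
        self.output = output
        self.cost = cost


class Expect:
    """Toy fluent chain: each assertion method checks, then returns self."""

    def __init__(self, result):
        self.result = result

    def output_contains(self, text):
        assert text in self.result.output, f"output missing {text!r}"
        return self

    def output_matches(self, pattern):
        assert re.search(pattern, self.result.output), f"no match for {pattern!r}"
        return self

    def cost_under(self, limit):
        assert self.result.cost < limit, f"cost {self.result.cost} not under {limit}"
        return self


# Chaining works because every method returns the same Expect instance.
Expect(FakeResult("4", 0.001)).output_contains("4").output_matches(r"^\d+$").cost_under(0.01)
```

Because every method returns the same object, any assertion can be appended anywhere in the chain, which is what makes the Attest examples below read as a single expression.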
## Concept Mapping

| PromptFoo | Attest | Notes |
|---|---|---|
| `prompts/` | Agent functions | Code instead of files |
| `test.yaml` config | Python test functions | More expressive |
| `providers` | Adapters | Built-in providers |
| `asserts` | Assertion methods | Fluent DSL |
| `eval` | Pytest + `expect()` | Native test framework |
| Metric | Judge prompt | LLM evaluation |
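The "Metric → Judge prompt" row is the least mechanical translation: a PromptFoo script metric is deterministic code, while a judge is an LLM call. The shape of the translation can be sketched with a deterministic stub (illustrative only; Attest's `passes_judge()` calls a real model):

```python
def stub_judge(prompt, output):
    """Deterministic stand-in for an LLM judge; the real thing calls a model."""
    # Mirrors the PromptFoo script metric: score = output.includes('yes') ? 1 : 0
    return "yes" in output.lower()


assert stub_judge("Is this response helpful?", "Yes, because...") is True
assert stub_judge("Is this response helpful?", "It depends.") is False
```

The judge takes the evaluation question plus the model output and returns a verdict; swapping the string check for a model call is the only structural change.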
## Step-by-Step Migration

### 1. Replace YAML Config with Python Tests

#### PromptFoo (YAML)

```yaml
providers:
  - id: openai:gpt-4o-mini
    config:
      temperature: 0.7

prompts:
  - id: simple_qa
    raw: "Answer this question: {{question}}"

tests:
  - vars:
      question: "What is 2+2?"
    assert:
      - type: contains
        value: "4"
      - type: regex
        value: "^\\d+$"
      - type: cost
        threshold: 0.01
```

#### Attest (Python)
```python
from attest import expect
from openai import OpenAI

client = OpenAI(api_key="sk-...")


def test_simple_qa():
    """Test simple Q&A."""
    result = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Answer: What is 2+2?"}],
    )

    (expect(result)
        .output_contains("4")
        .output_matches(r"^\d+$")
        .cost_under(0.01))
```

### 2. Convert PromptFoo Providers to Adapters

#### PromptFoo
```yaml
providers:
  - id: openai:gpt-4o-mini
  - id: anthropic:claude-3-sonnet
  - id: localai:mistral
```

#### Attest
Section titled “Attest”from attest.adapters import openai, anthropic, ollama
# OpenAIopenai_result = openai.create_completion( model="gpt-4o-mini", messages=[...])
# Anthropicanthropic_result = anthropic.create_message( model="claude-3-sonnet", messages=[...])
# Local (Ollama)local_result = ollama.generate( model="mistral", prompt="...")3. Convert PromptFoo Asserts to Expect
Section titled “3. Convert PromptFoo Asserts to Expect”PromptFoo Asserts
```yaml
assert:
  - type: contains
    value: "success"
  - type: regex
    value: '^\d{4}-\d{2}-\d{2}$'
  - type: length
    value: 100
    threshold: 0.1
  - type: cost
    threshold: 0.05
  - type: json-path
    value: "$.status"
```

#### Attest Expects
```python
(expect(result)
    .output_contains("success")
    .output_matches(r'^\d{4}-\d{2}-\d{2}$')
    .word_count_between(90, 110)
    .cost_under(0.05)
    .matches_schema({"type": "object", "properties": {"status": {}}}))
```

### 4. Convert Test Variables to Test Functions

#### PromptFoo
```yaml
tests:
  - description: "Math question"
    vars:
      question: "What is 2+2?"
      topic: "math"
    assert:
      - type: contains
        value: "4"
  - description: "History question"
    vars:
      question: "When was WWII?"
      topic: "history"
    assert:
      - type: contains
        value: "1939"
```

#### Attest
```python
import pytest

from attest import expect


@pytest.mark.parametrize("question,expected", [
    ("What is 2+2?", "4"),
    ("When was WWII?", "1939"),
])
def test_qa(question, expected):
    """Test Q&A across topics."""
    result = agent.run(question)
    expect(result).output_contains(expected)
```

### 5. Replace Metric Scripts with Judge Prompts

#### PromptFoo
```yaml
tests:
  - vars:
      query: "Is this helpful?"
    assert:
      - type: script
        value: "result.output.length > 100"
      - type: script
        value: |
          const score = result.output.includes('yes') ? 1 : 0;
          return { pass: score > 0.5 };
```

#### Attest
```python
result = agent.run("question")

(expect(result)
    .word_count_between(100, 1000)
    .passes_judge(
        prompt="Is this response helpful?",
        model="gpt-4o",
        scoring="binary",
    ))
```

## Complete Migration Example

### Before: PromptFoo Setup
```yaml
providers:
  - id: openai:gpt-4o-mini

prompts:
  - id: qa_agent
    raw: |
      You are a helpful assistant.
      Answer this question: {{query}}

tests:
  - description: "Math question"
    vars:
      query: "What is 2+2?"
    assert:
      - type: contains
        value: "4"
      - type: regex
        value: '^4$'
      - type: cost
        threshold: 0.01

  - description: "History question"
    vars:
      query: "What year did WWII end?"
    assert:
      - type: contains
        value: "1945"
      - type: cost
        threshold: 0.02

  - description: "Response quality"
    vars:
      query: "Tell me about Python"
    assert:
      - type: length
        threshold: 0.5
        value: 200
      - type: javascript
        value: |
          const words = result.output.split(' ').length;
          return { pass: words > 50, score: Math.min(words / 100, 1) };
```

Run with:

```shell
promptfoo eval
```

### After: Attest Setup
```python
import pytest

from attest import expect


class TestAgent:
    @pytest.fixture
    def agent(self):
        from my_app import create_agent
        return create_agent()

    def test_math_question(self, agent):
        """Math question should return correct answer."""
        result = agent.run("What is 2+2?")

        (expect(result)
            .output_contains("4")
            .output_matches("^4$")
            .cost_under(0.01))

    def test_history_question(self, agent):
        """History question should return correct year."""
        result = agent.run("What year did WWII end?")

        (expect(result)
            .output_contains("1945")
            .cost_under(0.02))

    def test_response_quality(self, agent):
        """Response should be comprehensive."""
        result = agent.run("Tell me about Python")

        (expect(result)
            .word_count_between(150, 500)
            .passes_judge("Is this well-written and informative?"))
```

Run with:

```shell
pytest test_agent.py -v
```

## PromptFoo to Attest Feature Map

| PromptFoo Feature | Attest Equivalent |
|---|---|
| Web UI evaluation | Python/TS code + pytest |
| YAML config | Python test functions |
| Multiple prompts | Multiple test functions |
| Provider selection | Adapter selection |
| Contains assert | `.output_contains()` |
| Regex assert | `.output_matches()` |
| Cost tracking | `.cost_under()` |
| Custom JS eval | `.passes_judge()` |
| CSV test data | pytest parametrize |
| Batch evaluation | pytest + parametrize |
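The "CSV test data → pytest parametrize" row can be made concrete with the standard library. A sketch, assuming a CSV with `question,expected` columns (the file name `cases.csv` and the demo rows are illustrative; in a real migration the CSV comes from your PromptFoo suite):

```python
import csv
from pathlib import Path

import pytest

# Demo data standing in for an exported PromptFoo test-case CSV.
Path("cases.csv").write_text(
    "question,expected\nWhat is 2+2?,4\nWhat year did WWII end?,1945\n"
)


def load_cases(path):
    """Read (question, expected) pairs from a CSV with a header row."""
    with Path(path).open(newline="") as f:
        return [(row["question"], row["expected"]) for row in csv.DictReader(f)]


@pytest.mark.parametrize("question,expected", load_cases("cases.csv"))
def test_from_csv(question, expected):
    # In your suite: expect(agent.run(question)).output_contains(expected)
    assert question and expected
```

Because `load_cases()` runs at collection time, each CSV row becomes its own pytest test, mirroring PromptFoo's per-row evaluation.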
## Key Differences

### 1. Code Over Config

PromptFoo uses YAML; Attest uses Python or TypeScript, which is more expressive and testable:
```python
# Attest: can use variables, loops, logic
for query in queries:
    result = agent.run(query)
    expect(result).output_contains("...")
```

### 2. Agents Over Prompts

Test entire agents, not just prompts:
```python
from attest.adapters import langchain

# Full LangChain agent with tools
agent = langchain.create_agent(llm, tools)
result = agent.invoke({"input": "..."})

expect(result).trace_contains_tool("google_search")
```

### 3. Fluent API Over Assertions List

Chain assertions naturally:
```python
# Attest: fluent, readable
(expect(result)
    .output_contains("answer")
    .cost_under(0.05)
    .passes_judge("Is correct?"))
```

## Migration Checklist

- Convert `promptfooconfig.yaml` to test functions
- Replace provider declarations with adapters
- Map all test cases to `@pytest.mark.parametrize`
- Convert `assert` statements to `expect()` chains
- Replace JavaScript metrics with `.passes_judge()`
- Run `pytest` and verify all tests pass
- Add to CI/CD pipeline
- Remove PromptFoo config files
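For the CI/CD item: the migrated suite is plain pytest, so any CI system that can run a shell command works. A minimal sketch as a GitHub Actions step (hypothetical fragment; adapt the runner and the install line to your pipeline and to Attest's installation docs):

```yaml
# Hypothetical CI step; adjust to your pipeline.
- name: Run agent tests
  run: |
    pip install pytest  # plus Attest itself, per its installation docs
    pytest test_agent.py -q
```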
### Can I run PromptFoo and Attest in parallel?

Yes, during migration; consolidate on Attest once the migration is complete.

### How do I evaluate multiple prompts like PromptFoo?

Use test parametrization:
```python
@pytest.mark.parametrize("prompt,expected", [
    ("Prompt A", "Expected A"),
    ("Prompt B", "Expected B"),
])
def test_prompts(prompt, expected):
    result = agent.run(prompt)
    expect(result).output_contains(expected)
```

### How do I replace PromptFoo’s batch mode?

pytest automatically runs all test functions:

```shell
pytest test_agent.py  # Runs all tests
```

### What about PromptFoo’s web UI?
Attest is code-first: use your IDE and test output for feedback. No UI is needed.
## Related

- Expect DSL Reference — All assertion methods
- Quickstart — Getting started
- Adapters Reference — Provider integrations