Test Runner
The Lamdis test runner (lamdis-runs) is an open-source engine for testing AI assistants and agents. It executes test suites against your chatbots, copilots, RAG systems, or workflow agents.
Overview
The test runner supports:
- Multi-turn conversations — Exercise complex, multi-step dialogues with your assistant
- LLM-based judging — Use semantic evaluation to check assistant responses
- HTTP request steps — Create or verify data via API calls during tests
- Variable interpolation — Pass data between steps dynamically
- Multiple execution channels — HTTP chat, OpenAI Chat, or AWS Bedrock
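To make these capabilities concrete, here is a rough sketch of what a multi-turn test with an HTTP step and an LLM-judged assertion could look like. The field names and {{...}} interpolation syntax below are illustrative assumptions, not the exact lamdis-runs schema; see the Test Steps and Variable Interpolation pages for the real format.

```jsonc
// Illustrative only — field names and the {{...}} syntax are hypothetical,
// not the exact lamdis-runs test schema.
{
  "name": "refund-status-flow",
  "persona": "Customer asking about a delayed refund",
  "steps": [
    { "type": "http_request", "name": "createOrder", "saveAs": "order" },
    { "type": "message", "text": "Where is the refund for order {{order.id}}?" },
    {
      "type": "judge",
      "rubric": "The assistant acknowledges the delay and gives a refund timeline."
    }
  ]
}
```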
Architecture
```
┌─────────────────────────────────────────────────────────┐
│ Test Suite │
│ ┌─────────────────────────────────────────────────┐ │
│ │ Tests (messages, steps, assertions) │ │
│ └─────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ Test Runner Engine │ │
│ │ - Step execution │ │
│ │ - Variable bag management │ │
│ │ - Transcript tracking │ │
│ └─────────────────────────────────────────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ Target Assistant │ │ LLM Judge │ │
│ │ (Your AI) │ │ (OpenAI/Bedrock) │ │
│ └──────────────────┘ └──────────────────┘ │
└─────────────────────────────────────────────────────────┘
```

Storage Providers
The test runner supports three storage modes:
| Mode | DB_PROVIDER | Required Env Var | Best For |
|---|---|---|---|
| Local | local | (none) | CLI runs, CI pipelines |
| MongoDB | mongo | MONGO_URL | Persistent storage |
| PostgreSQL | postgres | DATABASE_URL | Enterprise setups |
Auto-Detection
When DB_PROVIDER is not set, the runner automatically detects the storage mode:
- DATABASE_URL starting with postgres → PostgreSQL
- MONGO_URL set → MongoDB
- Otherwise → Local (in-memory/JSON)
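For example, the following shell snippets select each mode explicitly (connection strings are placeholders); when DB_PROVIDER is set, auto-detection is not needed:

```bash
# Local (in-memory/JSON) — no database variables needed
export DB_PROVIDER=local

# MongoDB — connection string is a placeholder
export DB_PROVIDER=mongo
export MONGO_URL="mongodb://localhost:27017/lamdis-runs"

# PostgreSQL — connection string is a placeholder
export DB_PROVIDER=postgres
export DATABASE_URL="postgres://user:pass@localhost:5432/lamdis_runs"
```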
Execution Channels
The test runner can communicate with your assistant via:
HTTP Chat (http_chat)
Standard HTTP endpoint. The runner POSTs to your assistant’s /chat endpoint:
```json
{
  "message": "User's message",
  "transcript": [...previous messages...],
  "persona": "Optional persona text"
}
```

Expected response:
```json
{
  "reply": "Assistant's response"
}
```

OpenAI Chat (openai_chat)
Direct OpenAI Chat API integration. Requires OPENAI_API_KEY.
Bedrock Chat (bedrock_chat)
AWS Bedrock integration. Requires AWS credentials and region.
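As a quick sanity check of the http_chat contract shown above, you can POST to your own endpoint directly; the URL and message below are placeholders:

```bash
curl -X POST "https://your-assistant.example.com/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What is your refund policy?",
    "transcript": [],
    "persona": "New customer"
  }'
# A conforming endpoint returns: {"reply": "..."}
```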
Judge Providers
The LLM judge evaluates assistant responses against your rubrics:
| Provider | Environment Variable | Model Variable |
|---|---|---|
| OpenAI (default) | OPENAI_API_KEY | OPENAI_MODEL |
| AWS Bedrock | JUDGE_PROVIDER=bedrock | BEDROCK_JUDGE_MODEL_ID |
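The default OpenAI judge only needs the API key; the model override is optional (values below are placeholders):

```bash
export OPENAI_API_KEY=sk-...
# Optional: override the default judging model (gpt-4o-mini)
export OPENAI_MODEL=gpt-4o
```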
Using Different Models
You can use a faster model for chat simulation and a stronger model for judging:
```bash
export JUDGE_PROVIDER=bedrock
export BEDROCK_CHAT_MODEL_ID=anthropic.claude-3-haiku-20240307-v1:0
export BEDROCK_JUDGE_MODEL_ID=anthropic.claude-3-sonnet-20240229-v1:0
```

File Organization
When using JSON-based test definitions:
```
configs/
├── auth/         # Authentication configurations
├── requests/     # Reusable HTTP request definitions
├── personas/     # End-user persona definitions
├── assistants/   # Assistant endpoint configurations
├── tests/        # Individual test files
└── suites/       # Test suite groupings
```

Running Tests
Via CLI (Local Development)
```bash
# Run a single test file
npm run run-file -- tests/my-tests.json

# Run a suite
npm run run-file -- suites/my-suite.json
```

Via API (CI/CD)
```bash
curl -X POST "$LAMDIS_RUNS_URL/internal/runs/start" \
  -H "x-api-token: $LAMDIS_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "json",
    "suites": ["legal-tests"],
    "webhookUrl": "https://ci.example.com/webhook"
  }'
```

Environment Variables
Required
| Variable | Description |
|---|---|
| LAMDIS_API_TOKEN | Token to protect /internal endpoints |
| OPENAI_API_KEY | OpenAI API key (when using the OpenAI judge) |
Optional
| Variable | Description | Default |
|---|---|---|
| PORT | HTTP port | 3101 |
| DB_PROVIDER | Storage mode | auto-detect |
| JUDGE_PROVIDER | Judge provider | openai |
| OPENAI_MODEL | OpenAI model for judging | gpt-4o-mini |
| BEDROCK_MODEL_ID | Bedrock model | anthropic.claude-3-haiku-* |
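Putting the pieces together, a minimal environment for an OpenAI-judged run might look like this (token and key values are placeholders):

```bash
# Required
export LAMDIS_API_TOKEN=change-me
export OPENAI_API_KEY=sk-...

# Optional overrides (defaults shown in the table above)
export PORT=3101
export OPENAI_MODEL=gpt-4o-mini
```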
See the full configuration reference in Environment Variables.
Next Steps
- Test Steps — Learn about different step types
- Variable Interpolation — Dynamic data between steps
- Requests & Auth — API integration in tests
- CI/CD Integration — Automate test runs