Test Steps
Test steps define the sequence of actions in your test. Each step executes in order, building up the conversation transcript and variable bag.
Step Types Overview
| Type | Purpose | Key Properties |
|---|---|---|
| message | Send a user message to the assistant | role, content |
| request | Execute an HTTP request | requestId, inputMappings, saveAs |
| assistant_check | Evaluate the assistant’s response | mode, rubric, threshold |
| user_objective | Multi-turn conversation with a goal | description, maxTurns |
| extract | Extract data from conversation | variableName, description |
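To make the in-order execution concrete, here is a minimal Python sketch of how a runner might dispatch steps by type. Only the five step type names come from the table above; `run_steps`, the handler functions, and the bag layout are hypothetical and are not the actual runner's API.

```python
from typing import Any, Callable

Bag = dict[str, Any]
Handler = Callable[[dict[str, Any], Bag], None]


def handle_message(step: dict[str, Any], bag: Bag) -> None:
    # Hypothetical handler: record the outgoing message in the transcript.
    bag["transcript"].append({"role": step["role"], "content": step["content"]})


def handle_noop(step: dict[str, Any], bag: Bag) -> None:
    # Placeholder for step types this sketch does not model.
    pass


HANDLERS: dict[str, Handler] = {
    "message": handle_message,
    "request": handle_noop,
    "assistant_check": handle_noop,
    "user_objective": handle_noop,
    "extract": handle_noop,
}


def run_steps(steps: list[dict[str, Any]]) -> Bag:
    """Run steps strictly in order, sharing one bag across all of them."""
    bag: Bag = {"transcript": [], "var": {}, "steps": {}}
    for step in steps:
        handler = HANDLERS.get(step["type"])
        if handler is None:
            raise ValueError(f"Unknown step type: {step['type']!r}")
        handler(step, bag)
    return bag
```

A dispatch table keeps each step type's logic isolated, so an unknown type fails loudly instead of being silently skipped.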
Message Step
Sends a user message to the assistant and waits for a response.
{
  "type": "message",
  "role": "user",
  "content": "What's the weather like today?"
}
Properties
| Property | Type | Required | Description |
|---|---|---|---|
| type | "message" | ✅ | Step type identifier |
| role | "user" \| "system" | ✅ | Message sender role |
| content | string | ✅ | Message content (supports variable interpolation) |
| id | string | | Step identifier for referencing outputs |
| name | string | | Human-readable step name |
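The required and optional properties above can be checked before a test runs. The following Python sketch is one way to do that; `validate_message_step` is a hypothetical helper written for illustration, not something the framework is known to provide.

```python
from typing import Any

ALLOWED_ROLES = {"user", "system"}


def validate_message_step(step: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the step looks valid."""
    problems: list[str] = []
    if step.get("type") != "message":
        problems.append('type must be "message"')
    role = step.get("role")
    if not isinstance(role, str) or role not in ALLOWED_ROLES:
        problems.append('role must be "user" or "system"')
    if not isinstance(step.get("content"), str):
        problems.append("content must be a string")
    # id and name are optional, but must be strings when present.
    for key in ("id", "name"):
        if key in step and not isinstance(step[key], str):
            problems.append(f"{key} must be a string when provided")
    return problems
```

Returning every problem at once, rather than raising on the first, makes it easier to fix a step definition in a single pass.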
Variable Interpolation in Content
Use variables from previous steps:
{
  "type": "message",
  "role": "user",
  "content": "Transfer $${bag.var.amount} to account ${bag.var.targetAccount}"
}
Request Step
Executes an HTTP request defined in your requests/ configuration. Use this to set up test data, verify backend state, or trigger external actions.
{
  "type": "request",
  "id": "create_account",
  "requestId": "accounts.create_test",
  "inputMappings": {
    "accountId": "acct-12345",
    "status": "active",
    "balance": 5000
  },
  "saveAs": "newAccount"
}
Properties
| Property | Type | Required | Description |
|---|---|---|---|
| type | "request" | ✅ | Step type identifier |
| requestId | string | ✅ | Reference to request definition (e.g., accounts.create_test) |
| inputMappings | object | | Input values passed to the request body |
| saveAs | string | | Variable name to store the response |
| id | string | | Step identifier for $steps.{id}.output syntax |
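The id row refers to the $steps.{id}.output syntax. As a rough illustration of how such references could be resolved, the Python sketch below substitutes them from a dictionary of step outputs. The function name, the regular expression, and the shape of `step_outputs` are assumptions; the real runner may parse these references differently.

```python
import re
from typing import Any

# Matches references such as $steps.create_acct.output.accountId
STEP_REF = re.compile(r"\$steps\.(\w+)\.output((?:\.\w+)*)")


def resolve_step_refs(text: str, step_outputs: dict[str, Any]) -> str:
    """Replace each $steps.{id}.output[.field...] reference with its value."""

    def replace(match: re.Match) -> str:
        value: Any = step_outputs[match.group(1)]
        # Walk any nested field names that follow ".output".
        for field in filter(None, match.group(2).split(".")):
            value = value[field]
        return str(value)

    return STEP_REF.sub(replace, text)
```

In this sketch an unknown step ID or field raises a KeyError, which surfaces a misspelled reference instead of sending the literal placeholder to the assistant.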
Accessing Request Results
After a request step executes, you can access its response:
// Step 1: Create account
{
  "type": "request",
  "id": "create_acct",
  "requestId": "accounts.create_test",
  "inputMappings": { "balance": 1000 },
  "saveAs": "acct"
}
// Step 2: Use the created account ID
{
  "type": "message",
  "role": "user",
  "content": "Show me details for account ${bag.var.acct.accountId}"
}
Or use the step ID syntax:
{
  "type": "message",
  "role": "user",
  "content": "Account created: $steps.create_acct.output.accountId"
}
Assistant Check Step
Evaluates the assistant’s response using an LLM judge or keyword matching.
Judge Mode (Semantic Evaluation)
{
  "type": "assistant_check",
  "mode": "judge",
  "name": "Check risk warning",
  "rubric": "The assistant must clearly warn about investment risks and recommend diversification.",
  "threshold": 0.7,
  "scope": "last",
  "severity": "error"
}
Properties (Judge Mode)
| Property | Type | Required | Description |
|---|---|---|---|
| type | "assistant_check" | ✅ | Step type identifier |
| mode | "judge" | ✅ | Use LLM-based evaluation |
| rubric | string | ✅ | Evaluation criteria for the judge |
| threshold | number | | Pass threshold (0-1); default varies by judge |
| scope | "last" \| "transcript" | | What to evaluate: last response or full conversation |
| severity | "error" \| "warning" | | Impact on test pass/fail |
| name | string | | Human-readable name shown in results |
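The interaction between threshold and severity can be sketched in Python as follows. This is an illustration under two unconfirmed assumptions: a score passes when it is greater than or equal to the threshold, and a failed check with severity "warning" is reported without failing the test. `check_outcome` is a hypothetical name.

```python
def check_outcome(score: float, threshold: float, severity: str = "error") -> str:
    """Classify a judge score as "pass", "warn", or "fail".

    Assumptions (not confirmed by the documentation): the score passes when
    it is >= threshold, and a failed check with severity "warning" is
    reported without failing the test.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    if score >= threshold:
        return "pass"
    return "warn" if severity == "warning" else "fail"
```

With the example step above (threshold 0.7, severity "error"), a judge score of 0.6 would be classified as "fail" under these assumptions.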
Includes Mode (Keyword Matching)
{
  "type": "assistant_check",
  "mode": "includes",
  "scope": "last",
  "includes": ["risk", "diversify", "consult"],
  "severity": "error"
}
Properties (Includes Mode)
| Property | Type | Required | Description |
|---|---|---|---|
| mode | "includes" | ✅ | Use keyword matching |
| includes | string[] | ✅ | Keywords that must appear in response |
| scope | "last" \| "transcript" | | Where to search for keywords |
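Keyword matching of this kind can be approximated in a few lines of Python. The sketch below assumes a case-insensitive substring test; the documentation does not say whether the real check is case-sensitive or word-bounded, and `missing_keywords` is a hypothetical helper.

```python
def missing_keywords(response: str, keywords: list[str]) -> list[str]:
    """Return the keywords that do not appear in the response.

    Assumption: matching is a case-insensitive substring test. The real
    runner may match differently (for example, case-sensitively).
    """
    haystack = response.lower()
    return [kw for kw in keywords if kw.lower() not in haystack]
```

An empty result means every keyword was found, so the check would pass; a non-empty result names exactly which keywords were absent.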
Variable Check Mode
{
  "type": "assistant_check",
  "mode": "variable_check",
  "variablePath": "bag.var.accountStatus",
  "expectEquals": "active"
}
User Objective Step
Defines a multi-turn conversational goal. The runner iterates until the objective is met or the maximum number of turns is reached.
{
  "type": "user_objective",
  "description": "Successfully schedule an appointment for next Tuesday at 2pm",
  "minTurns": 2,
  "maxTurns": 8,
  "iterativeConversation": true,
  "continueAfterPass": false,
  "attachedChecks": [
    {
      "type": "assistant_check",
      "mode": "judge",
      "rubric": "The assistant confirmed the appointment details correctly."
    }
  ]
}
Properties
| Property | Type | Required | Description |
|---|---|---|---|
| type | "user_objective" | ✅ | Step type identifier |
| description | string | ✅ | Goal description for the LLM to pursue |
| minTurns | number | | Minimum conversation turns |
| maxTurns | number | | Maximum turns before stopping (default: 8) |
| iterativeConversation | boolean | | Whether to iterate (default: true) |
| continueAfterPass | boolean | | Continue after objective is met |
| attachedChecks | AssistantCheckStep[] | | Checks to run at each turn |
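The way minTurns, maxTurns, and continueAfterPass might combine is easier to see as a loop. The Python sketch below is an interpretation, not the runner's actual logic: `run_objective` and the `take_turn` callback are hypothetical, the default of 1 for `min_turns` is assumed, and only the default of 8 for `max_turns` comes from the table.

```python
from typing import Callable


def run_objective(
    take_turn: Callable[[int], bool],
    min_turns: int = 1,
    max_turns: int = 8,
    continue_after_pass: bool = False,
) -> tuple[bool, int]:
    """Drive a conversation until the objective is met or turns run out.

    take_turn(n) is a hypothetical callback that plays turn n (1-based) and
    returns True when the objective is judged to be met after that turn.
    Returns (objective_met, turns_used).
    """
    met = False
    turns_used = 0
    for turn in range(1, max_turns + 1):
        turns_used = turn
        met = take_turn(turn) or met
        # Stop early only once the objective is met and minTurns is satisfied.
        if met and turn >= min_turns and not continue_after_pass:
            break
    return met, turns_used
```

Under this reading, `continue_after_pass=True` always plays all `max_turns` turns, which would suit attached checks that need to observe the whole conversation.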
Extract Step
Extracts structured information from the conversation using an LLM. The AI analyzes the conversation and extracts the requested data into a variable you can use in subsequent steps.
This is particularly useful for:
- Extracting confirmation numbers, order IDs, or reference codes from assistant responses
- Pulling out specific data points mentioned in the conversation
- Capturing dynamic values that you need to verify or use later
{
  "type": "extract",
  "id": "get_order_id",
  "variableName": "orderId",
  "description": "Extract the order confirmation number mentioned in the assistant's response",
  "scope": "last"
}
How It Works
- The LLM receives the conversation context (last message or full transcript)
- It uses your description to understand what to look for
- The extracted value is stored in bag.var.{variableName}
- Subsequent steps can reference this variable
Properties
| Property | Type | Required | Description |
|---|---|---|---|
| type | "extract" | ✅ | Step type identifier |
| variableName | string | ✅ | Name for the extracted variable (letters, numbers, underscores only) |
| description | string | ✅ | Clear description of what to extract; be specific |
| scope | "last" \| "transcript" | | Where to extract from (default: last) |
| id | string | | Step identifier for $steps.{id}.output syntax |
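The naming rule for variableName (letters, numbers, underscores only) is simple to check ahead of time. The Python sketch below assumes "letters" means ASCII letters, which the documentation does not specify; `is_valid_variable_name` is a hypothetical helper.

```python
import re

# Letters, digits, and underscores only, as stated in the properties table.
# Assumption: "letters" means ASCII letters.
VARIABLE_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_valid_variable_name(name: str) -> bool:
    """Return True when name contains only letters, digits, and underscores."""
    return VARIABLE_NAME.fullmatch(name) is not None
```

Using `fullmatch` rather than `match` matters here: it rejects names such as "order-id" where only a prefix satisfies the pattern.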
Writing Good Extraction Descriptions
The quality of extraction depends on your description. Be specific:
❌ Too vague:
{ "description": "Get the ID" }
✅ Specific and clear:
{ "description": "Extract the order confirmation code: 'ORD-' followed by 8 alphanumeric characters" }
✅ With context:
{ "description": "Find the account balance amount (in dollars) mentioned when the assistant confirmed the transfer" }
Supported Data Types
The LLM can extract various data types:
| Type | Example Description | Extracted As |
|---|---|---|
| String | "The confirmation code" | "ORD-12345678" |
| Number | "The account balance amount" | 1500.00 |
| Boolean | "Whether the transfer was successful" | true |
| Date | "The appointment date" | "2024-03-15" |
Using Extracted Values
After extraction, the value is available in the variable bag:
// Step 1: Extract order ID
{
  "type": "extract",
  "variableName": "orderId",
  "description": "The order ID from the confirmation"
}
// Step 2: Use extracted value
{
  "type": "message",
  "role": "user",
  "content": "Can you check the status of order ${bag.var.orderId}?"
}
Step Execution Order
Steps execute sequentially, with each step able to access:
- Transcript: all previous messages, in bag.transcript
- Last response: the most recent assistant reply, in bag.last.assistant
- Variables: values from saveAs and extract, in bag.var
- Step outputs: results keyed by step ID, in bag.steps
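The four bag areas listed above can be pictured as one nested dictionary, with ${...} placeholders resolved by walking a dotted path. The Python sketch below illustrates that idea; the dictionary layout, `lookup`, and `interpolate` are assumptions made for illustration and do not describe the runner's real implementation.

```python
import re
from typing import Any

PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def lookup(bag: dict[str, Any], path: str) -> Any:
    """Resolve a dotted path such as "bag.var.acct.accountId" against the bag."""
    parts = path.split(".")
    if parts[0] == "bag":  # the leading "bag" names the root itself
        parts = parts[1:]
    value: Any = bag
    for part in parts:
        value = value[part]
    return value


def interpolate(text: str, bag: dict[str, Any]) -> str:
    """Replace every ${...} placeholder in text with its value from the bag."""
    return PLACEHOLDER.sub(lambda m: str(lookup(bag, m.group(1).strip())), text)


# Hypothetical bag contents after a request step saved its response as "acct".
bag = {
    "transcript": [],
    "last": {"assistant": ""},
    "var": {"acct": {"accountId": "acct-12345"}},
    "steps": {},
}
```

The same `lookup` helper would also serve a variable_check step, since a variablePath such as bag.var.accountStatus uses the same dotted form.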
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Step 1    │ --> │   Step 2    │ --> │   Step 3    │
│  (request)  │     │  (message)  │     │   (check)   │
└─────────────┘     └─────────────┘     └─────────────┘
       │                   │                   │
       ▼                   ▼                   ▼
 bag.var.acct       bag.transcript    bag.last.assistant
 bag.steps.s1              │                   │
                           └───────────────────┘
Complete Test Example
{
  "id": "account-transfer-test",
  "name": "Account Transfer Flow",
  "steps": [
    {
      "type": "request",
      "id": "setup",
      "requestId": "accounts.create_test",
      "inputMappings": { "balance": 10000 },
      "saveAs": "sourceAccount"
    },
    {
      "type": "message",
      "role": "user",
      "content": "I'd like to transfer $500 to my savings account"
    },
    {
      "type": "assistant_check",
      "mode": "judge",
      "name": "Verify transfer confirmation",
      "rubric": "The assistant must confirm the transfer amount and ask for confirmation before proceeding",
      "threshold": 0.8
    },
    {
      "type": "message",
      "role": "user",
      "content": "Yes, please confirm the transfer"
    },
    {
      "type": "extract",
      "variableName": "transferId",
      "description": "The transfer confirmation ID or reference number"
    },
    {
      "type": "request",
      "id": "verify",
      "requestId": "transfers.get",
      "inputMappings": { "transferId": "${bag.var.transferId}" }
    },
    {
      "type": "assistant_check",
      "mode": "variable_check",
      "variablePath": "bag.steps.verify.output.status",
      "expectEquals": "completed"
    }
  ]
}
Next Steps
- Variable Interpolation — Dynamic data syntax
- Requests & Auth — HTTP request configuration
- Test Runner — Execution engine overview