Test Steps

Test steps define the sequence of actions in your test. Each step executes in order, building up the conversation transcript and variable bag.

Step Types Overview

| Type | Purpose | Key Properties |
| --- | --- | --- |
| message | Send a user message to the assistant | role, content |
| request | Execute an HTTP request | requestId, inputMappings, saveAs |
| assistant_check | Evaluate the assistant's response | mode, rubric, threshold |
| user_objective | Multi-turn conversation with a goal | description, maxTurns |
| extract | Extract data from conversation | variableName, description |

Message Step

Sends a user message to the assistant and waits for a response.

{ "type": "message", "role": "user", "content": "What's the weather like today?" }

Properties

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| type | "message" | | Step type identifier |
| role | "user" \| "system" | | Message sender role |
| content | string | | Message content (supports variable interpolation) |
| id | string | | Step identifier for referencing outputs |
| name | string | | Human-readable step name |

Variable Interpolation in Content

Use variables from previous steps:

{ "type": "message", "role": "user", "content": "Transfer $${bag.var.amount} to account ${bag.var.targetAccount}" }
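
How the runner resolves placeholders is not specified on this page. The TypeScript sketch below is only an illustration, assuming that `${...}` wraps a dot-separated path into the variable bag and that unresolved placeholders are left unchanged. The names `interpolate` and `lookupPath` are hypothetical and not part of the product.

```typescript
// Hypothetical sketch: resolve ${bag.var.x} placeholders against a variable bag.
// The real runner's rules (escaping, missing variables) may differ.
type Bag = { var: { [key: string]: unknown } };

// Walk a dot-separated path such as "bag.var.acct.accountId".
function lookupPath(root: { [key: string]: unknown }, path: string): unknown {
  let current: unknown = root;
  const parts = path.split(".");
  for (let i = 0; i < parts.length; i++) {
    if (current === null || typeof current !== "object") {
      return undefined;
    }
    current = (current as { [key: string]: unknown })[parts[i]];
  }
  return current;
}

function interpolate(template: string, bag: Bag): string {
  return template.replace(/\$\{([^}]+)\}/g, (match: string, path: string): string => {
    const value = lookupPath({ bag: bag }, path.trim());
    // Assumption: an unresolved placeholder is kept as-is rather than erased.
    return value === undefined ? match : String(value);
  });
}

const exampleBag: Bag = { var: { amount: 250, targetAccount: "acct-12345" } };
const rendered = interpolate(
  "Transfer $${bag.var.amount} to account ${bag.var.targetAccount}",
  exampleBag
);
console.log(rendered); // Transfer $250 to account acct-12345
```

In this sketch the first `$` is a literal dollar sign; only the `${...}` part is treated as a placeholder.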

Request Step

Executes an HTTP request defined in your requests/ configuration. Use this to set up test data, verify backend state, or trigger external actions.

{
  "type": "request",
  "id": "create_account",
  "requestId": "accounts.create_test",
  "inputMappings": {
    "accountId": "acct-12345",
    "status": "active",
    "balance": 5000
  },
  "saveAs": "newAccount"
}

Properties

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| type | "request" | | Step type identifier |
| requestId | string | | Reference to request definition (e.g., accounts.create_test) |
| inputMappings | object | | Input values passed to the request body |
| saveAs | string | | Variable name to store the response |
| id | string | | Step identifier for $steps.{id}.output syntax |

Accessing Request Results

After a request step executes, you can access its response:

// Step 1: Create account
{
  "type": "request",
  "id": "create_acct",
  "requestId": "accounts.create_test",
  "inputMappings": { "balance": 1000 },
  "saveAs": "acct"
}

// Step 2: Use the created account ID
{
  "type": "message",
  "role": "user",
  "content": "Show me details for account ${bag.var.acct.accountId}"
}

Or use the step ID syntax:

{ "type": "message", "content": "Account created: $steps.create_acct.output.accountId" }
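
The exact grammar of the step ID syntax is not documented here. As a rough illustration, the sketch below assumes a reference has the form `$steps.<id>.output` followed by optional dot-separated keys, and that references to unknown steps are left untouched. `resolveStepRefs` and `StepOutputs` are hypothetical names.

```typescript
// Hypothetical sketch: expand $steps.<id>.output.<key> references in a string.
// Assumes identifiers consist of letters, digits, and underscores only.
type StepOutputs = { [stepId: string]: { output: unknown } };

function resolveStepRefs(text: string, steps: StepOutputs): string {
  const pattern = /\$steps\.([A-Za-z0-9_]+)\.output((?:\.[A-Za-z0-9_]+)*)/g;
  return text.replace(pattern, (match: string, stepId: string, rest: string): string => {
    const step = steps[stepId];
    if (step === undefined) {
      return match; // unknown step: leave the reference as written
    }
    let current: unknown = step.output;
    const keys = rest.split(".").filter((k: string): boolean => k.length > 0);
    for (let i = 0; i < keys.length; i++) {
      if (current === null || typeof current !== "object") {
        return match;
      }
      current = (current as { [key: string]: unknown })[keys[i]];
    }
    return current === undefined ? match : String(current);
  });
}

const recordedSteps: StepOutputs = {
  create_acct: { output: { accountId: "acct-12345" } }
};
console.log(
  resolveStepRefs("Account created: $steps.create_acct.output.accountId", recordedSteps)
);
```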

Assistant Check Step

Evaluates the assistant’s response using an LLM judge or keyword matching.

Judge Mode (Semantic Evaluation)

{
  "type": "assistant_check",
  "mode": "judge",
  "name": "Check risk warning",
  "rubric": "The assistant must clearly warn about investment risks and recommend diversification.",
  "threshold": 0.7,
  "scope": "last",
  "severity": "error"
}

Properties (Judge Mode)

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| type | "assistant_check" | | Step type identifier |
| mode | "judge" | | Use LLM-based evaluation |
| rubric | string | | Evaluation criteria for the judge |
| threshold | number | | Pass threshold (0-1); default varies by judge |
| scope | "last" \| "transcript" | | What to evaluate: last response or full conversation |
| severity | "error" \| "warning" | | Impact on test pass/fail |
| name | string | | Human-readable name shown in results |
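
To make the interplay of threshold and severity concrete, here is a small sketch. It assumes the judge returns a score between 0 and 1, that a score equal to the threshold passes, and that only a failed check with severity "error" fails the test. None of these details are confirmed by this page, and `evaluateJudgeScore` is a hypothetical helper.

```typescript
// Hypothetical sketch of how a judge score might map to a check outcome.
type Severity = "error" | "warning";

interface JudgeCheck {
  threshold: number; // pass threshold in the range 0-1
  severity: Severity;
}

interface CheckOutcome {
  passed: boolean;    // did the score reach the threshold?
  failsTest: boolean; // does this outcome fail the whole test?
}

function evaluateJudgeScore(check: JudgeCheck, score: number): CheckOutcome {
  // Assumption: the comparison is inclusive (score === threshold passes).
  const passed = score >= check.threshold;
  // Assumption: a failed "warning" check is reported but does not fail the test.
  return { passed: passed, failsTest: !passed && check.severity === "error" };
}

const strict = evaluateJudgeScore({ threshold: 0.7, severity: "error" }, 0.65);
const lenient = evaluateJudgeScore({ threshold: 0.7, severity: "warning" }, 0.65);
console.log(strict.failsTest, lenient.failsTest);
```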

Includes Mode (Keyword Matching)

{
  "type": "assistant_check",
  "mode": "includes",
  "scope": "last",
  "includes": ["risk", "diversify", "consult"],
  "severity": "error"
}

Properties (Includes Mode)

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| mode | "includes" | | Use keyword matching |
| includes | string[] | | Keywords that must appear in response |
| scope | "last" \| "transcript" | | Where to search for keywords |
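
Keyword matching can be pictured with the sketch below. It assumes that every keyword must be present, that matching is case-insensitive, that "last" means the most recent assistant message, and that "transcript" means all messages in the conversation. The case handling and the exact text searched are assumptions, and `missingKeywords` is a hypothetical helper.

```typescript
// Hypothetical sketch of an "includes" check: report which keywords are missing.
type Scope = "last" | "transcript";

interface TranscriptMessage {
  role: "user" | "assistant" | "system";
  content: string;
}

function missingKeywords(
  transcript: TranscriptMessage[],
  keywords: string[],
  scope: Scope
): string[] {
  let searched = "";
  if (scope === "last") {
    // Assumption: "last" refers to the most recent assistant message.
    for (let i = transcript.length - 1; i >= 0; i--) {
      if (transcript[i].role === "assistant") {
        searched = transcript[i].content;
        break;
      }
    }
  } else {
    // Assumption: "transcript" searches every message in the conversation.
    searched = transcript
      .map((m: TranscriptMessage): string => m.content)
      .join("\n");
  }
  const haystack = searched.toLowerCase();
  return keywords.filter(
    (k: string): boolean => haystack.indexOf(k.toLowerCase()) === -1
  );
}

const sampleTranscript: TranscriptMessage[] = [
  { role: "user", content: "Should I invest everything in one stock?" },
  { role: "assistant", content: "That carries risk. It is safer to diversify." },
  { role: "user", content: "Anything else?" },
  { role: "assistant", content: "Please consult a licensed advisor." }
];
console.log(missingKeywords(sampleTranscript, ["risk", "diversify", "consult"], "last"));
```

An empty result means the check would pass in this sketch.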

Variable Check Mode

{
  "type": "assistant_check",
  "mode": "variable_check",
  "variablePath": "bag.var.accountStatus",
  "expectEquals": "active"
}
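
A variable check can be thought of as a path lookup followed by a comparison. The sketch below assumes that `variablePath` is a dot-separated path starting at `bag` and that the comparison is strict equality; both are assumptions, and `runVariableCheck` and `getByPath` are hypothetical names.

```typescript
// Hypothetical sketch of a variable_check: look up a path and compare the value.
interface VariableCheck {
  variablePath: string; // e.g. "bag.var.accountStatus"
  expectEquals: unknown;
}

function getByPath(root: { [key: string]: unknown }, path: string): unknown {
  let current: unknown = root;
  const parts = path.split(".");
  for (let i = 0; i < parts.length; i++) {
    if (current === null || typeof current !== "object") {
      return undefined;
    }
    current = (current as { [key: string]: unknown })[parts[i]];
  }
  return current;
}

function runVariableCheck(check: VariableCheck, bag: { [key: string]: unknown }): boolean {
  // Assumption: strict equality, so "5000" and 5000 do not match.
  return getByPath({ bag: bag }, check.variablePath) === check.expectEquals;
}

const checkBag: { [key: string]: unknown } = {
  var: { accountStatus: "active" },
  steps: { verify: { output: { status: "completed" } } }
};
console.log(
  runVariableCheck({ variablePath: "bag.var.accountStatus", expectEquals: "active" }, checkBag)
);
```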

User Objective Step

Defines a multi-turn conversational goal. The runner iterates until the objective is met or the maximum number of turns is reached.

{
  "type": "user_objective",
  "description": "Successfully schedule an appointment for next Tuesday at 2pm",
  "minTurns": 2,
  "maxTurns": 8,
  "iterativeConversation": true,
  "continueAfterPass": false,
  "attachedChecks": [
    {
      "type": "assistant_check",
      "mode": "judge",
      "rubric": "The assistant confirmed the appointment details correctly."
    }
  ]
}

Properties

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| type | "user_objective" | | Step type identifier |
| description | string | | Goal description for the LLM to pursue |
| minTurns | number | | Minimum conversation turns |
| maxTurns | number | | Maximum turns before stopping (default: 8) |
| iterativeConversation | boolean | | Whether to iterate (default: true) |
| continueAfterPass | boolean | | Continue after objective is met |
| attachedChecks | AssistantCheckStep[] | | Checks to run at each turn |
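
The turn limits can be read as a bounded loop. The sketch below is a guess at that control flow, not the runner's actual implementation: it assumes `minTurns` defaults to 1, that an objective met before `minTurns` still counts once the minimum is reached, and that `continueAfterPass` keeps the loop running until `maxTurns`. The `playTurn` callback is a hypothetical stand-in for one simulated user turn plus its attached checks.

```typescript
// Hypothetical sketch of the user_objective turn loop.
interface ObjectiveOptions {
  minTurns?: number;
  maxTurns?: number;
  continueAfterPass?: boolean;
}

interface ObjectiveResult {
  met: boolean;
  turnsUsed: number;
}

// playTurn runs one conversation turn and reports whether the objective is met.
function runObjective(
  options: ObjectiveOptions,
  playTurn: (turn: number) => boolean
): ObjectiveResult {
  const minTurns = options.minTurns === undefined ? 1 : options.minTurns; // assumed default
  const maxTurns = options.maxTurns === undefined ? 8 : options.maxTurns; // documented default
  const continueAfterPass = options.continueAfterPass === true;

  let met = false;
  let turnsUsed = 0;
  for (let turn = 1; turn <= maxTurns; turn++) {
    turnsUsed = turn;
    if (playTurn(turn)) {
      met = true;
    }
    if (met && turn >= minTurns && !continueAfterPass) {
      break; // objective reached and minimum satisfied: stop early
    }
  }
  return { met: met, turnsUsed: turnsUsed };
}

// Example: the objective is met on the first turn, but minTurns forces a second one.
const objectiveResult = runObjective(
  { minTurns: 2, maxTurns: 8 },
  (turn: number): boolean => turn >= 1
);
console.log(objectiveResult.met, objectiveResult.turnsUsed);
```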

Extract Step

Extracts structured information from the conversation using an LLM. The AI analyzes the conversation and extracts the requested data into a variable you can use in subsequent steps.

This is particularly useful for:

  • Extracting confirmation numbers, order IDs, or reference codes from assistant responses
  • Pulling out specific data points mentioned in the conversation
  • Capturing dynamic values that you need to verify or use later

{
  "type": "extract",
  "id": "get_order_id",
  "variableName": "orderId",
  "description": "Extract the order confirmation number mentioned in the assistant's response",
  "scope": "last"
}

How It Works

  1. The LLM receives the conversation context (last message or full transcript)
  2. It uses your description to understand what to look for
  3. The extracted value is stored in bag.var.{variableName}
  4. Subsequent steps can reference this variable

Properties

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| type | "extract" | | Step type identifier |
| variableName | string | | Name for the extracted variable (letters, numbers, underscores only) |
| description | string | | Clear description of what to extract; be specific |
| scope | "last" \| "transcript" | | Where to extract from (default: last) |
| id | string | | Step identifier for $steps.{id}.output syntax |
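
The naming rule for `variableName` (letters, numbers, underscores only) can be checked before a test runs. The helper below is a hypothetical sketch that takes the rule literally; whether the runner also rejects names that start with a digit is not stated here.

```typescript
// Hypothetical helper: validate a variableName against the documented character rule.
function isValidVariableName(name: string): boolean {
  // One or more ASCII letters, digits, or underscores; nothing else.
  return /^[A-Za-z0-9_]+$/.test(name);
}

console.log(isValidVariableName("orderId"));  // true
console.log(isValidVariableName("order-id")); // false
```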

Writing Good Extraction Descriptions

The quality of extraction depends on your description. Be specific:

❌ Too vague:

{ "description": "Get the ID" }

✅ Specific and clear:

{ "description": "Extract the 8-character alphanumeric order confirmation code that starts with 'ORD-'" }

✅ With context:

{ "description": "Find the account balance amount (in dollars) mentioned when the assistant confirmed the transfer" }

Supported Data Types

The LLM can extract various data types:

| Type | Example Description | Extracted As |
| --- | --- | --- |
| String | "The confirmation code" | "ORD-12345678" |
| Number | "The account balance amount" | 1500.00 |
| Boolean | "Whether the transfer was successful" | true |
| Date | "The appointment date" | "2024-03-15" |

Using Extracted Values

After extraction, the value is available in the variable bag:

// Step 1: Extract order ID
{
  "type": "extract",
  "variableName": "orderId",
  "description": "The order ID from the confirmation"
}

// Step 2: Use extracted value
{
  "type": "message",
  "content": "Can you check the status of order ${bag.var.orderId}?"
}

Step Execution Order

Steps execute sequentially, with each step able to access:

  1. Transcript — All previous messages in bag.transcript
  2. Last response — Most recent assistant reply in bag.last.assistant
  3. Variables — Values from saveAs and extract in bag.var
  4. Step outputs — Results by step ID in bag.steps

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Step 1    │ --> │   Step 2    │ --> │   Step 3    │
│  (request)  │     │  (message)  │     │   (check)   │
└─────────────┘     └─────────────┘     └─────────────┘
       │                   │                   │
       ▼                   ▼                   ▼
  bag.var.acct       bag.transcript    bag.last.assistant
  bag.steps.s1             │                   │
                           └───────────────────┘
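
The four bag locations listed above can be modeled as a single object that each step updates in turn. The sketch below is an illustration only: the `RunBag` shape mirrors the paths named in this section, while `createBag`, `recordRequest`, and `recordExchange` are hypothetical helpers rather than runner APIs.

```typescript
// Hypothetical model of the bag that steps read from and write to.
interface BagMessage {
  role: string;
  content: string;
}

interface RunBag {
  transcript: BagMessage[];                       // bag.transcript
  last: { assistant: string };                    // bag.last.assistant
  var: { [name: string]: unknown };               // bag.var
  steps: { [id: string]: { output: unknown } };   // bag.steps
}

function createBag(): RunBag {
  return { transcript: [], last: { assistant: "" }, var: {}, steps: {} };
}

// A request step stores its response under saveAs and under its step id.
function recordRequest(
  bag: RunBag,
  id: string | undefined,
  saveAs: string | undefined,
  output: unknown
): void {
  if (saveAs !== undefined) {
    bag.var[saveAs] = output;
  }
  if (id !== undefined) {
    bag.steps[id] = { output: output };
  }
}

// A message step appends both sides of the exchange and updates the last reply.
function recordExchange(bag: RunBag, userText: string, assistantReply: string): void {
  bag.transcript.push({ role: "user", content: userText });
  bag.transcript.push({ role: "assistant", content: assistantReply });
  bag.last.assistant = assistantReply;
}

const runBag = createBag();
recordRequest(runBag, "s1", "acct", { accountId: "acct-12345" });
recordExchange(runBag, "Show me my account", "Here are the details for acct-12345.");
console.log(runBag.last.assistant);
```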

Complete Test Example

{
  "id": "account-transfer-test",
  "name": "Account Transfer Flow",
  "steps": [
    {
      "type": "request",
      "id": "setup",
      "requestId": "accounts.create_test",
      "inputMappings": { "balance": 10000 },
      "saveAs": "sourceAccount"
    },
    {
      "type": "message",
      "role": "user",
      "content": "I'd like to transfer $500 to my savings account"
    },
    {
      "type": "assistant_check",
      "mode": "judge",
      "name": "Verify transfer confirmation",
      "rubric": "The assistant must confirm the transfer amount and ask for confirmation before proceeding",
      "threshold": 0.8
    },
    {
      "type": "message",
      "role": "user",
      "content": "Yes, please confirm the transfer"
    },
    {
      "type": "extract",
      "variableName": "transferId",
      "description": "The transfer confirmation ID or reference number"
    },
    {
      "type": "request",
      "id": "verify",
      "requestId": "transfers.get",
      "inputMappings": { "transferId": "${bag.var.transferId}" }
    },
    {
      "type": "assistant_check",
      "mode": "variable_check",
      "variablePath": "bag.steps.verify.output.status",
      "expectEquals": "completed"
    }
  ]
}
