Test Steps

Test steps define the sequence of actions in your test. Each step executes in order, building up the conversation transcript and variable bag.

Step Types Overview

Type	Purpose	Key Properties
`message`	Send a user message to the assistant	`role`, `content`
`request`	Execute an HTTP request	`requestId`, `inputMappings`, `saveAs`
`assistant_check`	Evaluate the assistant’s response	`mode`, `rubric`, `threshold`
`user_objective`	Multi-turn conversation with a goal	`description`, `maxTurns`
`extract`	Extract data from conversation	`variableName`, `description`

Message Step

Sends a user message to the assistant and waits for a response.


{
  "type": "message",
  "role": "user",
  "content": "What's the weather like today?"
}

Properties

Property	Type	Required	Description
`type`	`"message"`	✅	Step type identifier
`role`	`"user"` \| `"system"`	✅	Message sender role
`content`	string	✅	Message content (supports variable interpolation)
`id`	string		Step identifier for referencing outputs
`name`	string		Human-readable step name

Variable Interpolation in Content

Use variables from previous steps:


{
  "type": "message",
  "role": "user",
  "content": "Transfer $${bag.var.amount} to account ${bag.var.targetAccount}"
}
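
To make the substitution concrete, here is a minimal Python sketch of how `${...}` placeholders could be resolved against the variable bag. The `lookup` and `interpolate` helpers and the nested-dict bag layout are assumptions for illustration, not the runner's actual implementation. Note that a literal `$` placed directly before a placeholder is left untouched.

```python
import re

# Hypothetical resolver: the real runner's interpolation rules may differ.
_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def lookup(bag, path):
    """Walk a dotted path such as 'bag.var.amount' through nested dicts."""
    parts = path.strip().split(".")
    if parts[0] != "bag":
        raise KeyError(f"path must start with 'bag': {path}")
    value = bag
    for key in parts[1:]:
        value = value[key]
    return value


def interpolate(template, bag):
    """Replace every ${...} placeholder with the value found in the bag."""
    return _PLACEHOLDER.sub(lambda m: str(lookup(bag, m.group(1))), template)


bag = {"var": {"amount": 250, "targetAccount": "acct-12345"}}
print(interpolate("Transfer $${bag.var.amount} to account ${bag.var.targetAccount}", bag))
# prints: Transfer $250 to account acct-12345
```

In this sketch an unknown variable raises `KeyError`; whether the real runner fails the step or leaves the placeholder in place is not specified here.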

Request Step

Executes an HTTP request defined in your requests/ configuration. Use this to set up test data, verify backend state, or trigger external actions.


{
  "type": "request",
  "id": "create_account",
  "requestId": "accounts.create_test",
  "inputMappings": {
    "accountId": "acct-12345",
    "status": "active",
    "balance": 5000
  },
  "saveAs": "newAccount"
}

Properties

Property	Type	Required	Description
`type`	`"request"`	✅	Step type identifier
`requestId`	string	✅	Reference to request definition (e.g., `accounts.create_test`)
`inputMappings`	object		Input values passed to the request body
`saveAs`	string		Variable name to store the response
`id`	string		Step identifier for `$steps.{id}.output` syntax

Accessing Request Results

After a request step executes, you can access its response:


// Step 1: Create account
{
  "type": "request",
  "id": "create_acct",
  "requestId": "accounts.create_test",
  "inputMappings": { "balance": 1000 },
  "saveAs": "acct"
}
 
// Step 2: Use the created account ID
{
  "type": "message",
  "role": "user",
  "content": "Show me details for account ${bag.var.acct.accountId}"
}

Or use the step ID syntax:


{
  "type": "message",
  "role": "user",
  "content": "Account created: $steps.create_acct.output.accountId"
}
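
As a rough illustration of the step ID syntax, the following Python sketch resolves `$steps.{id}.output` references against a dictionary of step results. The regular expression, the `resolve_step_refs` name, and the shape of the `steps` dictionary are assumptions made for this example.

```python
import re

# Hypothetical pattern: $steps.<id>.output followed by optional .field segments.
_STEP_REF = re.compile(r"\$steps\.(\w+)\.output((?:\.\w+)*)")


def resolve_step_refs(text, steps):
    """Substitute each $steps.<id>.output[.field...] reference with its value."""
    def replace(match):
        value = steps[match.group(1)]["output"]
        # group(2) looks like ".accountId"; drop the empty piece before the first dot.
        for key in match.group(2).split(".")[1:]:
            value = value[key]
        return str(value)

    return _STEP_REF.sub(replace, text)


steps = {"create_acct": {"output": {"accountId": "acct-12345", "balance": 1000}}}
print(resolve_step_refs("Account created: $steps.create_acct.output.accountId", steps))
# prints: Account created: acct-12345
```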

Assistant Check Step

Evaluates the assistant’s response using an LLM judge or keyword matching, or checks a stored variable against an expected value.

Judge Mode (Semantic Evaluation)


{
  "type": "assistant_check",
  "mode": "judge",
  "name": "Check risk warning",
  "rubric": "The assistant must clearly warn about investment risks and recommend diversification.",
  "threshold": 0.7,
  "scope": "last",
  "severity": "error"
}

Properties (Judge Mode)

Property	Type	Required	Description
`type`	`"assistant_check"`	✅	Step type identifier
`mode`	`"judge"`	✅	Use LLM-based evaluation
`rubric`	string	✅	Evaluation criteria for the judge
`threshold`	number		Pass threshold (0-1), default varies by judge
`scope`	`"last"` \| `"transcript"`		What to evaluate: last response or full conversation
`severity`	`"error"` \| `"warning"`		Impact on test pass/fail
`name`	string		Human-readable name shown in results
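
The interaction between `threshold` and `severity` can be sketched as follows. This is an illustrative Python model only: the inclusive comparison, the placeholder default threshold of 0.5, and the rule that only `"error"` severity fails the test are all assumptions, since the actual default varies by judge.

```python
def evaluate_judge_check(step, score, default_threshold=0.5):
    """Return (passed, fails_test) for a judge score between 0 and 1.

    Assumptions: the check passes when score >= threshold, and only a
    failed check with severity "error" fails the whole test.
    """
    threshold = step.get("threshold", default_threshold)
    passed = score >= threshold
    fails_test = (not passed) and step.get("severity", "error") == "error"
    return passed, fails_test


check = {"type": "assistant_check", "mode": "judge", "threshold": 0.7, "severity": "error"}
print(evaluate_judge_check(check, 0.65))  # prints: (False, True)
print(evaluate_judge_check(check, 0.7))   # prints: (True, False)
```

Under these assumptions, switching `severity` to `"warning"` would still report the failed check but would not fail the test.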

Includes Mode (Keyword Matching)


{
  "type": "assistant_check",
  "mode": "includes",
  "scope": "last",
  "includes": ["risk", "diversify", "consult"],
  "severity": "error"
}

Properties (Includes Mode)

Property	Type	Required	Description
`mode`	`"includes"`	✅	Use keyword matching
`includes`	string[]	✅	Keywords that must appear in response
`scope`	`"last"` \| `"transcript"`		Where to search for keywords
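
A keyword check of this kind might behave like the Python sketch below. It assumes case-insensitive substring matching and that every keyword must be present; the real matching rules (case sensitivity, word boundaries) are not specified here, and `check_includes` is a hypothetical name.

```python
def check_includes(text, keywords, case_sensitive=False):
    """Return (passed, missing): passed is True only if every keyword occurs in text."""
    haystack = text if case_sensitive else text.lower()
    missing = [
        keyword
        for keyword in keywords
        if (keyword if case_sensitive else keyword.lower()) not in haystack
    ]
    return len(missing) == 0, missing


reply = "All investments carry risk. Diversify and consult an advisor."
print(check_includes(reply, ["risk", "diversify", "consult"]))
# prints: (True, [])
```

With literal substring matching, a reply that says "diversification" would not satisfy the keyword `diversify`, which is one reason to prefer judge mode when wording may vary.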

Variable Check Mode

Compares the value stored at `variablePath` with `expectEquals`; the check passes when the two are equal.


{
  "type": "assistant_check",
  "mode": "variable_check",
  "variablePath": "bag.var.accountStatus",
  "expectEquals": "active"
}
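
The comparison can be pictured with a short Python sketch that walks the dotted `variablePath` through the bag and compares the result with `expectEquals`. The helper names and the choice to treat a missing path as a failed check are assumptions for illustration.

```python
def get_path(bag, path):
    """Return (found, value) for a dotted path such as 'bag.var.accountStatus'."""
    parts = path.split(".")
    if parts[0] == "bag":
        parts = parts[1:]
    value = bag
    for key in parts:
        if not isinstance(value, dict) or key not in value:
            return False, None
        value = value[key]
    return True, value


def variable_check(step, bag):
    """Pass only when the path exists and its value equals expectEquals."""
    found, actual = get_path(bag, step["variablePath"])
    return found and actual == step["expectEquals"]


bag = {"var": {"accountStatus": "active"}}
step = {"mode": "variable_check", "variablePath": "bag.var.accountStatus", "expectEquals": "active"}
print(variable_check(step, bag))  # prints: True
```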

User Objective Step

Defines a multi-turn conversational goal. The runner iterates until the objective is met or `maxTurns` is reached.


{
  "type": "user_objective",
  "description": "Successfully schedule an appointment for next Tuesday at 2pm",
  "minTurns": 2,
  "maxTurns": 8,
  "iterativeConversation": true,
  "continueAfterPass": false,
  "attachedChecks": [
    {
      "type": "assistant_check",
      "mode": "judge",
      "rubric": "The assistant confirmed the appointment details correctly."
    }
  ]
}

Properties

Property	Type	Required	Description
`type`	`"user_objective"`	✅	Step type identifier
`description`	string	✅	Goal description for the LLM to pursue
`minTurns`	number		Minimum conversation turns
`maxTurns`	number		Maximum turns before stopping (default: 8)
`iterativeConversation`	boolean		Whether to iterate (default: true)
`continueAfterPass`	boolean		Continue after objective is met
`attachedChecks`	AssistantCheckStep[]		Checks to run at each turn
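
One plausible reading of how the turn limits interact is sketched below in Python. The semantics are assumptions: the objective is not evaluated before `minTurns`, the loop stops at the first pass unless `continueAfterPass` is true, and `iterativeConversation` and `attachedChecks` are ignored for brevity. The `take_turn` and `objective_met` callbacks are hypothetical stand-ins for the conversation and the LLM evaluation.

```python
def run_objective(step, take_turn, objective_met):
    """Run turns until the objective is met or maxTurns is reached.

    take_turn(n) performs one user/assistant exchange;
    objective_met() reports whether the goal has been reached.
    Returns (passed, turns_used).
    """
    min_turns = step.get("minTurns", 1)
    max_turns = step.get("maxTurns", 8)
    passed = False
    turns = 0
    while turns < max_turns:
        turns += 1
        take_turn(turns)
        if turns >= min_turns and objective_met():
            passed = True
            if not step.get("continueAfterPass", False):
                break
    return passed, turns


history = []
step = {"type": "user_objective", "minTurns": 2, "maxTurns": 8, "continueAfterPass": False}
# Simulated conversation in which the goal is reached on the third turn.
print(run_objective(step, history.append, lambda: len(history) >= 3))
# prints: (True, 3)
```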

Extract Step

Extracts structured information from the conversation using an LLM. The AI analyzes the conversation and extracts the requested data into a variable you can use in subsequent steps.

This is particularly useful for:

Extracting confirmation numbers, order IDs, or reference codes from assistant responses
Pulling out specific data points mentioned in the conversation
Capturing dynamic values that you need to verify or use later


{
  "type": "extract",
  "id": "get_order_id",
  "variableName": "orderId",
  "description": "Extract the order confirmation number mentioned in the assistant's response",
  "scope": "last"
}

How It Works

The LLM receives the conversation context (last message or full transcript)
It uses your description to understand what to look for
The extracted value is stored in bag.var.{variableName}
Subsequent steps can reference this variable
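
The flow above can be sketched in Python as follows. The `run_extract` function, the `ask_llm` callback, and the message shape in `bag["transcript"]` are assumptions for illustration; a regular expression stands in for the LLM so the example runs on its own.

```python
import re


def run_extract(step, bag, ask_llm):
    """Pick the context by scope, ask the LLM, and store the result in bag['var']."""
    scope = step.get("scope", "last")
    if scope == "last":
        context = bag["last"]["assistant"]
    else:
        context = "\n".join(f'{m["role"]}: {m["content"]}' for m in bag["transcript"])
    value = ask_llm(step["description"], context)
    bag.setdefault("var", {})[step["variableName"]] = value
    return value


def fake_llm(description, context):
    """Stand-in for the LLM call: finds an ORD- code with a regular expression."""
    match = re.search(r"ORD-\w+", context)
    return match.group(0) if match else None


bag = {"transcript": [], "last": {"assistant": "Your order ORD-12345678 is confirmed."}, "var": {}}
step = {"type": "extract", "variableName": "orderId",
        "description": "Extract the order confirmation number", "scope": "last"}
run_extract(step, bag, fake_llm)
print(bag["var"]["orderId"])  # prints: ORD-12345678
```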

Properties

Property	Type	Required	Description
`type`	`"extract"`	✅	Step type identifier
`variableName`	string	✅	Name for the extracted variable (letters, numbers, underscores only)
`description`	string	✅	Clear description of what to extract—be specific!
`scope`	`"last"` \| `"transcript"`		Where to extract from (default: `last`)
`id`	string		Step identifier for `$steps.{id}.output` syntax
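
If you generate test files programmatically, it can help to validate `variableName` before running. The Python sketch below applies the stated rule (letters, numbers, underscores only); treating letters as ASCII and allowing a leading digit are assumptions, since neither is specified here.

```python
import re

# Assumption: "letters" means ASCII letters, and a leading digit is allowed.
_VALID_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_valid_variable_name(name):
    """True when the name contains only letters, digits, and underscores."""
    return _VALID_NAME.fullmatch(name) is not None


for candidate in ["orderId", "order_id", "order-id", "order id", ""]:
    print(repr(candidate), is_valid_variable_name(candidate))
```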

Writing Good Extraction Descriptions

The quality of extraction depends on your description. Be specific:

❌ Too vague:


{ "description": "Get the ID" }

✅ Specific and clear:


{ "description": "Extract the order confirmation code: 'ORD-' followed by 8 alphanumeric characters" }

✅ With context:


{ "description": "Find the account balance amount (in dollars) mentioned when the assistant confirmed the transfer" }

Supported Data Types

The LLM can extract various data types:

Type	Example Description	Extracted As
String	"The confirmation code"	`"ORD-12345678"`
Number	"The account balance amount"	`1500.00`
Boolean	"Whether the transfer was successful"	`true`
Date	"The appointment date"	`"2024-03-15"`

Using Extracted Values

After extraction, the value is available in the variable bag:


// Step 1: Extract order ID
{
  "type": "extract",
  "variableName": "orderId",
  "description": "The order ID from the confirmation"
}
 
// Step 2: Use extracted value
{
  "type": "message",
  "role": "user",
  "content": "Can you check the status of order ${bag.var.orderId}?"
}

Step Execution Order

Steps execute sequentially, with each step able to access:

Transcript — All previous messages in bag.transcript
Last response — Most recent assistant reply in bag.last.assistant
Variables — Values from saveAs and extract in bag.var
Step outputs — Results by step ID in bag.steps


┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Step 1     │ --> │  Step 2     │ --> │  Step 3     │
│  (request)  │     │  (message)  │     │  (check)    │
└─────────────┘     └─────────────┘     └─────────────┘
       │                   │                   │
       ▼                   ▼                   ▼
    bag.var.acct      bag.transcript     bag.last.assistant
    bag.steps.s1           │                   │
                           └───────────────────┘
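
The diagram can be read as a simple loop over the steps that threads one bag through every handler. The Python sketch below is a simplified model, not the actual runner: the `run_steps` function, the handler signatures, and the stub handlers are hypothetical, and only the bag keys (`transcript`, `last.assistant`, `var`, `steps`) come from the list above.

```python
def run_steps(steps, handlers):
    """Execute steps in order, recording outputs in a shared bag."""
    bag = {"transcript": [], "last": {"assistant": None}, "var": {}, "steps": {}}
    for step in steps:
        output = handlers[step["type"]](step, bag)
        if "id" in step:
            bag["steps"][step["id"]] = {"output": output}
        if "saveAs" in step:
            bag["var"][step["saveAs"]] = output
    return bag


def fake_request(step, bag):
    """Stub HTTP call: echoes the inputs back with a fixed account ID."""
    return {"accountId": "acct-12345", **step.get("inputMappings", {})}


def fake_message(step, bag):
    """Stub assistant: records the user message and an echoed reply."""
    bag["transcript"].append({"role": step["role"], "content": step["content"]})
    reply = "Echo: " + step["content"]
    bag["transcript"].append({"role": "assistant", "content": reply})
    bag["last"]["assistant"] = reply
    return reply


steps = [
    {"type": "request", "id": "s1", "requestId": "accounts.create_test",
     "inputMappings": {"balance": 1000}, "saveAs": "acct"},
    {"type": "message", "role": "user", "content": "Show my balance"},
]
bag = run_steps(steps, {"request": fake_request, "message": fake_message})
print(bag["var"]["acct"]["accountId"])  # prints: acct-12345
print(bag["last"]["assistant"])         # prints: Echo: Show my balance
```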

Complete Test Example


{
  "id": "account-transfer-test",
  "name": "Account Transfer Flow",
  "steps": [
    {
      "type": "request",
      "id": "setup",
      "requestId": "accounts.create_test",
      "inputMappings": { "balance": 10000 },
      "saveAs": "sourceAccount"
    },
    {
      "type": "message",
      "role": "user",
      "content": "I'd like to transfer $500 to my savings account"
    },
    {
      "type": "assistant_check",
      "mode": "judge",
      "name": "Verify transfer confirmation",
      "rubric": "The assistant must confirm the transfer amount and ask for confirmation before proceeding",
      "threshold": 0.8
    },
    {
      "type": "message",
      "role": "user",
      "content": "Yes, please confirm the transfer"
    },
    {
      "type": "extract",
      "variableName": "transferId",
      "description": "The transfer confirmation ID or reference number"
    },
    {
      "type": "request",
      "id": "verify",
      "requestId": "transfers.get",
      "inputMappings": { "transferId": "${bag.var.transferId}" }
    },
    {
      "type": "assistant_check",
      "mode": "variable_check",
      "variablePath": "bag.steps.verify.output.status",
      "expectEquals": "completed"
    }
  ]
}
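
Before running a test like this, a quick structural check can catch missing properties. The Python sketch below encodes the required properties from the tables on this page; the requirements listed for `variable_check` are an assumption, since that mode has no property table, and `validate_step` is a hypothetical helper rather than part of the tool.

```python
# Required properties per step type, taken from the property tables above.
REQUIRED = {
    "message": ["role", "content"],
    "request": ["requestId"],
    "user_objective": ["description"],
    "extract": ["variableName", "description"],
}
# Required properties per assistant_check mode.
CHECK_REQUIRED = {
    "judge": ["rubric"],
    "includes": ["includes"],
    "variable_check": ["variablePath", "expectEquals"],  # assumption: not documented
}


def validate_step(step):
    """Return a list of problems found in one step definition (empty if none)."""
    step_type = step.get("type")
    if step_type == "assistant_check":
        mode = step.get("mode")
        if mode not in CHECK_REQUIRED:
            return [f"unknown or missing mode: {mode!r}"]
        required = CHECK_REQUIRED[mode]
    elif step_type in REQUIRED:
        required = REQUIRED[step_type]
    else:
        return [f"unknown or missing type: {step_type!r}"]
    return [f"{step_type}: missing '{key}'" for key in required if key not in step]


print(validate_step({"type": "message", "content": "Yes, please confirm the transfer"}))
# prints: ["message: missing 'role'"]
```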

Next Steps

Variable Interpolation — Dynamic data syntax
Requests & Auth — HTTP request configuration
Test Runner — Execution engine overview