
Observability

Track activity and troubleshoot issues using built-in Analytics and Test Results. Export data or integrate with your own tooling as needed.

Analytics

  • Metrics: test runs, pass/fail rates, judge scores, and latency (average, p50, p95, max).
  • Summary: Suite performance over time with trend charts.
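
If you export run data and want to reproduce these metrics in your own tooling, the calculation can be sketched as follows. This is a minimal illustration, not the product's implementation: the record fields `passed` and `latency_ms` and the helper names `percentile` and `summarize` are hypothetical, and the nearest-rank percentile method is an assumption, so values may differ slightly from what the Analytics view reports.

```python
import math
from statistics import mean


def percentile(values, pct):
    """Return the nearest-rank percentile of a non-empty list of numbers."""
    if not values:
        raise ValueError("percentile() requires at least one value")
    ordered = sorted(values)
    # Nearest-rank: the smallest value with at least pct% of the data at or below it.
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(runs):
    """Summarize exported run records.

    Each record is assumed (hypothetically) to be a dict with a boolean
    "passed" field and a numeric "latency_ms" field.
    """
    if not runs:
        raise ValueError("summarize() requires at least one run")
    latencies = [run["latency_ms"] for run in runs]
    passed = sum(1 for run in runs if run["passed"])
    return {
        "runs": len(runs),
        "pass_rate": passed / len(runs),
        "latency_avg": mean(latencies),
        "latency_p50": percentile(latencies, 50),
        "latency_p95": percentile(latencies, 95),
        "latency_max": max(latencies),
    }


# Illustrative records only; not real exported data.
example_runs = [
    {"passed": True, "latency_ms": 100},
    {"passed": True, "latency_ms": 200},
    {"passed": False, "latency_ms": 300},
    {"passed": True, "latency_ms": 400},
]
summary = summarize(example_runs)
```

With a small sample, p95 and max often coincide under the nearest-rank method; the two diverge only once there are enough runs for the 95th-percentile rank to fall below the last value.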

Test Results

  • Inspect step inputs/outputs and error details per run.
  • Use run detail pages to debug and review transcripts.

Tips

  • Keep rubrics clear; the judge's reasoning is easier to interpret against a clear rubric, which helps when debugging unexpected failures.
  • Export transcripts and results for audits or post-incident reviews.