
Send your data to subject matter experts

Use labeling queues when you want a subject matter expert or third party to label spans without exposing the full traces view. Reviewers get a focused interface showing only what they need to annotate. Once labeled, those examples become the ground truth dataset you use to validate evaluators and run experiments.
Ask Alyx to create a labeling queue, send data to it, and optionally annotate data. For example:
  • “Send spans where latency is over 5 seconds to my Slow Response labeling queue”
  • “Send spans where hallucination eval scored 0 to the Hallucination Review queue”
[Screenshot: Tracing view with an eval filter applied; the Alyx sidebar suggests sending low-scoring hallucination spans to the Hallucination Review labeling queue]
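The filters in the prompts above reduce to simple predicates over span records. This is an illustrative sketch only — the span fields (`id`, `latency_s`, `hallucination_score`) and helper names are hypothetical, not a real SDK schema; Alyx handles this selection for you.

```python
# Hypothetical span records, mimicking the kind of data Alyx filters
# when you ask for "spans where latency is over 5 seconds".
spans = [
    {"id": "a1", "latency_s": 2.3, "hallucination_score": 1},
    {"id": "b2", "latency_s": 6.1, "hallucination_score": 0},
    {"id": "c3", "latency_s": 7.8, "hallucination_score": 1},
]

def slow_spans(spans, threshold_s=5.0):
    """Select spans whose latency exceeds the threshold (in seconds)."""
    return [s for s in spans if s["latency_s"] > threshold_s]

def failed_hallucination_eval(spans):
    """Select spans where the hallucination eval scored 0."""
    return [s for s in spans if s["hallucination_score"] == 0]

# Spans matching either prompt would be routed to the named queue.
print([s["id"] for s in slow_spans(spans)])               # → ['b2', 'c3']
print([s["id"] for s in failed_hallucination_eval(spans)])  # → ['b2']
```

Either result set is what would land in the reviewer's queue — only the matching spans, not the full traces view.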

Build a ground truth dataset

A ground truth dataset is a curated set of labeled examples that captures the range of behaviors your system should and should not produce. It gives you a stable benchmark for validating automated evaluators and a reusable dataset to run experiments against as your prompts and models evolve.
Ask Alyx to create a dataset from spans of interest, append spans to an existing dataset, or suggest examples that cover edge cases for your rubric. Example prompts:
  • “Create a dataset from the spans I filtered in this trace view and include inputs and outputs”
  • “Append these high-error spans to my regression benchmark dataset”
  • “Suggest 20 diverse examples for a golden dataset based on my last week’s traces”
[Screenshot: Tracing view with a span filter applied; the Alyx sidebar offers to create a golden dataset from factual spans, with a preview table and an Accept and Create Dataset action]
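Once reviewers have labeled the queued spans, the resulting ground truth dataset is just labeled input/output pairs you can score an evaluator against. The sketch below is hypothetical — the record fields and the `validate_evaluator` helper are illustrative, not part of any SDK — but it shows how such a dataset serves as a benchmark.

```python
# Hypothetical labeled spans produced by reviewers in a labeling queue.
ground_truth = [
    {"input": "What is the capital of France?", "output": "Paris", "label": "factual"},
    {"input": "Who wrote Hamlet?", "output": "Christopher Marlowe", "label": "hallucination"},
    {"input": "Water boils at 100 C at sea level.", "output": "True", "label": "factual"},
]

def validate_evaluator(evaluator, ground_truth):
    """Return the evaluator's accuracy against reviewer labels."""
    correct = sum(
        1 for ex in ground_truth
        if evaluator(ex["input"], ex["output"]) == ex["label"]
    )
    return correct / len(ground_truth)

# A toy evaluator standing in for an automated hallucination check.
def toy_evaluator(inp, out):
    return "hallucination" if out == "Christopher Marlowe" else "factual"

accuracy = validate_evaluator(toy_evaluator, ground_truth)
print(accuracy)  # → 1.0
```

Because the labels are human-verified, the same dataset can be reused to re-run experiments as your prompts and models change, with evaluator accuracy tracked against a stable benchmark.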