Prerequisites
Before configuring classification metrics, make sure:

- Your dataset has a column with categorical ground truth labels (e.g., `expected_category`, `true_label`).
- Your experiments produce an output column with predicted labels (e.g., `output`, `predicted_label`).
- Column values are clean categorical strings. Metrics are computed by exact match against the selected positive class.
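Because metrics rely on exact string matching, small inconsistencies such as stray whitespace or mixed casing (e.g., `"Spam"` vs. `"spam"`) will be counted as mismatches. A minimal pandas sketch of cleaning a label column before upload, using a hypothetical `expected_category` column:

```python
import pandas as pd

# Hypothetical dataset with messy ground truth labels. Exact-match
# comparison would treat " Spam" and "spam" as different classes.
df = pd.DataFrame({
    "expected_category": [" Spam", "ham", "SPAM ", "ham"],
})

# Normalize to clean categorical strings: trim whitespace, lowercase.
df["expected_category"] = df["expected_category"].str.strip().str.lower()

print(df["expected_category"].tolist())  # ['spam', 'ham', 'spam', 'ham']
```

Apply the same normalization to the predicted-label column in your experiment outputs so both sides compare consistently.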
Configure metrics settings
Select Ground Truth Column
Choose the column from your dataset that contains the true labels. The dropdown shows all columns available in the dataset version.
Select Predicted Column
Choose the column from your experiment outputs that contains the predicted labels. The dropdown shows columns available across the selected experiments.
Select Positive Class
Pick the value that represents the positive class for binary metric computation. The dropdown is populated with distinct values from the ground truth column you selected.
Changing the ground truth column resets the positive class selection.
Metrics computed
Arize computes the following binary classification metrics using the selected positive class. Rows where either the ground truth or the predicted value is null are excluded.

| Metric | Formula | Description |
|---|---|---|
| Accuracy | (TP + TN) / Total | Fraction of predictions that match the ground truth |
| Precision | TP / (TP + FP) | Of all positive predictions, how many are correct |
| Recall | TP / (TP + FN) | Of all actual positives, how many were correctly predicted as positive |
| F1 | 2 · TP / (2 · TP + FP + FN) | Harmonic mean of Precision and Recall |
- TP (True Positive) — predicted and ground truth both match the positive class
- TN (True Negative) — neither predicted nor ground truth match the positive class
- FP (False Positive) — predicted matches the positive class, ground truth does not
- FN (False Negative) — ground truth matches the positive class, predicted does not
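The definitions above can be sketched in plain Python. This is an illustrative reimplementation of the described computation, not Arize's internal code; the column values are hypothetical:

```python
def binary_metrics(truth, predicted, positive_class):
    """Binary classification metrics against a chosen positive class.

    Rows where either value is None are excluded, and labels are
    compared by exact match, mirroring the behavior described above.
    """
    pairs = [(t, p) for t, p in zip(truth, predicted)
             if t is not None and p is not None]
    tp = sum(1 for t, p in pairs if t == positive_class and p == positive_class)
    tn = sum(1 for t, p in pairs if t != positive_class and p != positive_class)
    fp = sum(1 for t, p in pairs if t != positive_class and p == positive_class)
    fn = sum(1 for t, p in pairs if t == positive_class and p != positive_class)
    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0,
    }

# Example: the fourth row is dropped because its ground truth is null.
truth = ["spam", "ham", "spam", None, "ham"]
pred  = ["spam", "spam", "ham", "spam", "ham"]
print(binary_metrics(truth, pred, "spam"))
# {'accuracy': 0.5, 'precision': 0.5, 'recall': 0.5, 'f1': 0.5}
```

Note that precision, recall, and F1 depend on which value you pick as the positive class; accuracy does not.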