Import paths
Every class lives under theairflow.providers.arize_ax namespace:
Common params. Every operator accepts
arize_ax_conn_id (default arize_ax_default). Most accept space_id, which the provider resolves in this order: the operator argument, then the connection’s extra.default_space, then the Airflow Variable arize_ax_space_id. Create* operators accept if_exists="skip" for idempotent re-runs. Delete* and Get* operators accept ignore_if_missing=True. See design patterns.Hook
ArizeAxHook is the only hook class. Every operator instantiates it in execute() to get a cached ArizeClient. You only use the hook directly when no operator covers your pattern.
list_datasets, create_experiment, export_spans_to_df, trigger_task_run, log_spans, list_evaluators, and so on). It also handles pagination, retries transient Flight errors on span exports, and re-raises SDK 404s and backend errors as AirflowException with clear messages.
Datasets
Module:airflow.providers.arize_ax.operators.datasets
| Operator | Purpose | Key params |
|---|---|---|
ArizeAxListDatasetsOperator | List datasets in a space. | space_id, limit, cursor |
ArizeAxCreateDatasetOperator | Create a dataset, optionally with starter examples. | space_id, name, examples or examples_path, if_exists |
ArizeAxGetDatasetOperator | Fetch a dataset by ID. | dataset_id |
ArizeAxDeleteDatasetOperator | Delete a dataset. | dataset_id, ignore_if_missing |
ArizeAxListDatasetExamplesOperator | List examples in a dataset (or a specific version). | dataset_id, dataset_version_id, limit, all |
ArizeAxAppendDatasetExamplesOperator | Append examples from memory, CSV, or JSON. | dataset_id, examples or examples_path |
ArizeAxExportDatasetExamplesToFileOperator | Export all examples to JSON (paginated via SDK Flight). | dataset_id, path, dataset_version_id |
ArizeAxEvalDatasetHealthOperator | Score a dataset on freshness, diversity, and coverage vs production. | dataset_id, project_id, start_time, end_time |
ArizeAxSmartDatasetRefreshOperator | Evolve a dataset with diverse, high-value production examples. | dataset_id, project_id, start_time, end_time, diversity_strategy |
Experiments
Module:airflow.providers.arize_ax.operators.experiments
| Operator | Purpose | Key params |
|---|---|---|
ArizeAxListExperimentsOperator | List experiments for a dataset. | dataset_id, limit, cursor |
ArizeAxCreateExperimentOperator | Create experiment with pre-computed runs. | name, dataset_id, experiment_runs, task_fields, evaluator_columns, if_exists |
ArizeAxGetExperimentOperator | Fetch an experiment by ID. | experiment_id |
ArizeAxDeleteExperimentOperator | Delete an experiment. | experiment_id, ignore_if_missing |
ArizeAxRunExperimentOperator | Run a task + evaluators end-to-end against a dataset. | name, dataset_id, task, evaluators, concurrency, dry_run |
ArizeAxListExperimentRunsOperator | List runs for an experiment. | experiment_id, limit, cursor |
ArizeAxExportExperimentRunsToFileOperator | Export all runs to a JSON file. | experiment_id, path |
ArizeAxGetExperimentScoreOperator | Aggregate eval scores; gates on min_score. | experiment_id, metric_names, aggregation, min_score |
ArizeAxCompareExperimentsOperator | Compare candidate vs baseline; gates on regression. | candidate_experiment_id, baseline_experiment_id, pass_threshold, fail_on_regression |
ArizeAxDetectEvalDriftOperator | Detect metric drift between experiments; gates on drift. | experiment_id, baseline_id, drift_threshold, fail_on_drift |
ArizeAxEvaluatorCalibrationOperator | Measure LLM-judge correlation vs human labels; gates on calibration. | experiment_id, human_label_column, calibration_threshold, fail_on_poor_calibration |
ArizeAxBehavioralRegressionOperator | Compare behavioral distributions; gates on regression. | candidate_id, baseline_id, metrics, fail_on_regression |
ArizeAxEvalBudgetAllocatorOperator | Distribute evaluation budget across projects. | projects, total_budget, allocation_strategy |
Projects
Module:airflow.providers.arize_ax.operators.projects
| Operator | Purpose | Key params |
|---|---|---|
ArizeAxListProjectsOperator | List projects in a space. | space_id, limit, cursor |
ArizeAxCreateProjectOperator | Create a project. | space_id, name, if_exists |
ArizeAxGetProjectOperator | Fetch a project by ID. | project_id |
ArizeAxDeleteProjectOperator | Delete a project. | project_id, ignore_if_missing |
Spans
Module:airflow.providers.arize_ax.operators.spans
| Operator | Purpose | Key params |
|---|---|---|
ArizeAxListSpansOperator | List spans from a project (alpha). | project_id, space_id, filter, start_time, end_time |
ArizeAxSpansLogOperator | Bulk-log spans from DataFrame, CSV, or JSON. | space_id, project_name, dataframe or dataframe_path, evals_dataframe |
ArizeAxSpansUpdateEvaluationsOperator | Update span evaluations from a DataFrame. | space_id, project_name, dataframe or dataframe_path |
ArizeAxSpansUpdateAnnotationsOperator | Update span annotations from a DataFrame. | space_id, project_name, dataframe or dataframe_path |
ArizeAxSpansUpdateMetadataOperator | Update span metadata from a DataFrame. | space_id, project_name, dataframe or dataframe_path |
ArizeAxSpansExportToDataframeOperator | Export filtered spans to a DataFrame. | space_id, project_name, start_time, end_time, where, columns |
ArizeAxSpansExportToParquetOperator | Export filtered spans to a Parquet file. | space_id, project_name, start_time, end_time, path, where |
ArizeAxExportSpansToFineTuningOperator | Export spans as OpenAI fine-tuning JSONL. | space_id, project_name, start_time, end_time, path |
ArizeAxGetSpanMetricsOperator | Aggregate latency, cost, and token metrics. | space_id, project_name, start_time, end_time |
ArizeAxCurateSpansToDatasetOperator | Curate filtered spans into a dataset. | space_id, project_name, dataset_id, where |
ArizeAxExportAnnotatedSpansOperator | Export annotated spans to Parquet or JSON. | space_id, project_name, start_time, end_time, path, format |
ArizeAxAdaptiveSamplingOperator | Pick priority spans for evaluation via uncertainty/novelty/anomaly. | space_id, project_name, sample_size, strategy |
ML
Module:airflow.providers.arize_ax.operators.ml
| Operator | Purpose | Key params |
|---|---|---|
ArizeAxMLLogBatchOperator | Batch-log ML predictions from DataFrame or CSV. | space_id, model_name, model_type, environment, dataframe, schema |
ArizeAxMLLogStreamOperator | Stream a single ML prediction. | space_id, model_name, model_type, environment, log_params |
ArizeAxMLExportToDataframeOperator | Export ML data to a DataFrame. | space_id, model_name, environment, start_time, end_time, where |
ArizeAxMLExportToParquetOperator | Export ML data to a Parquet file. | space_id, model_name, environment, start_time, end_time, path |
Evaluators
Module:airflow.providers.arize_ax.operators.evaluators (alpha)
| Operator | Purpose | Key params |
|---|---|---|
ArizeAxListEvaluatorsOperator | List LLM-template evaluators in Eval Hub. | space_id, limit, cursor, name_search |
ArizeAxCreateEvaluatorOperator | Create an LLM-template evaluator. | space_id, name, evaluator_type, commit_message, template_config, if_exists |
ArizeAxGetEvaluatorOperator | Fetch an evaluator by ID. | evaluator_id |
ArizeAxUpdateEvaluatorOperator | Update evaluator metadata (name, description). | evaluator_id, name, description |
ArizeAxDeleteEvaluatorOperator | Delete an evaluator. | evaluator_id, ignore_if_missing |
ArizeAxAddEvaluatorVersionOperator | Add a new template version. | evaluator_id, commit_message, template_config |
ArizeAxListEvaluatorVersionsOperator | List versions of an evaluator. | evaluator_id, limit, cursor |
ArizeAxGetEvaluatorVersionOperator | Fetch a specific version. | evaluator_id, version_id |
Tasks
Module:airflow.providers.arize_ax.operators.tasks (alpha)
Evaluation tasks attach evaluators to a project or dataset and execute on demand or on a schedule.
| Operator | Purpose | Key params |
|---|---|---|
ArizeAxListTasksOperator | List evaluation tasks (filter by space/project/dataset/type). | space_id, project_id, dataset_id, eval_task_type |
ArizeAxGetTaskOperator | Fetch a task by ID. | task_id_param |
ArizeAxCreateTaskOperator | Create a task that attaches evaluators to a project or dataset. | name, eval_task_type, evaluators, project_id or dataset_id, sampling_rate, is_continuous |
ArizeAxListTaskRunsOperator | List runs for a task. | task_id_param, limit, cursor |
ArizeAxGetTaskRunOperator | Fetch a task run by ID. | run_id |
ArizeAxTriggerTaskRunOperator | Trigger an on-demand run; override_evaluations=True re-scores existing spans. | task_id_param, data_start_time, data_end_time, max_spans, override_evaluations |
ArizeAxCancelTaskRunOperator | Cancel a pending or running task run. | run_id |
Prompts
Module:airflow.providers.arize_ax.operators.prompts
| Operator | Purpose | Key params |
|---|---|---|
ArizeAxListPromptsOperator | List prompts in Prompt Hub. | space_id, limit, cursor |
ArizeAxGetPromptOperator | Fetch a prompt by ID, version ID, or label. | prompt_id, version_id, version_label |
ArizeAxCreatePromptOperator | Create a prompt with an initial version. | space_id, name, messages or messages_task_id, provider, model, if_exists |
ArizeAxDeletePromptOperator | Delete a prompt. | prompt_id, ignore_if_missing |
ArizeAxPromotePromptOperator | Apply a label to a prompt version (e.g. production). | prompt_id, version_id, label |
ArizeAxComparePromptsOperator | Compare two prompt versions by experiment scores. | prompt_id_1, prompt_id_2, experiment_id |
Spaces
Module:airflow.providers.arize_ax.operators.spaces
| Operator | Purpose | Key params |
|---|---|---|
ArizeAxListSpacesOperator | List spaces accessible to the authenticated user. | limit, cursor |
ArizeAxGetSpaceOperator | Fetch a space by ID. | space_id |
ArizeAxCreateSpaceOperator | Create a space in an organization. | name, organization_id, description |
ArizeAxUpdateSpaceOperator | Update space name or description. | space_id, name, description |
ArizeAxDeleteSpaceOperator | Delete a space (alpha). | space_id, ignore_if_missing |
Annotation configs & queues
Module:airflow.providers.arize_ax.operators.annotations (alpha)
For configuring HITL annotation flows. Pair with ArizeAxAnnotationQueueSensor to gate downstream tasks until a minimum number of annotation configs exist in the space.
| Operator | Purpose | Key params |
|---|---|---|
ArizeAxListAnnotationConfigsOperator | List annotation configurations. | space_id, limit, cursor |
ArizeAxCreateAnnotationConfigOperator | Create a config (freeform, category, or numeric). | space_id, name, config_type, minimum_score, maximum_score |
ArizeAxDeleteAnnotationConfigOperator | Delete a config. | annotation_config, space_id |
ArizeAxListAnnotationQueuesOperator | List queues. | space_id, limit, cursor |
ArizeAxGetAnnotationQueueOperator | Fetch a queue by ID or name. | queue_id or queue_name, space_id |
ArizeAxCreateAnnotationQueueOperator | Create a queue with annotators and configs. | space_id, name, description, annotators, configs |
ArizeAxUpdateAnnotationQueueOperator | Update queue metadata, annotators, or configs. | queue_id, name, instructions, configs, annotators |
ArizeAxDeleteAnnotationQueueOperator | Delete a queue. | queue_id, ignore_if_missing |
ArizeAxListAnnotationQueueRecordsOperator | List pending records (max 500/page). | queue_id, limit, cursor |
ArizeAxAddAnnotationQueueRecordsOperator | Add record sources to a queue (max 2 sources). | queue_id, sources, num_records |
ArizeAxAnnotateQueueRecordOperator | Submit annotations on a record. | queue_id, record_id, annotations |
ArizeAxAssignQueueRecordOperator | Assign a record to one or more annotators. | queue_id, record_id, annotators |
API keys
Module:airflow.providers.arize_ax.operators.api_keys
| Operator | Purpose | Key params |
|---|---|---|
ArizeAxListAPIKeysOperator | List API keys (user or service). | limit, cursor |
ArizeAxCreateAPIKeyOperator | Create an API key. | key_type, description, organization_id |
ArizeAxDeleteAPIKeyOperator | Delete an API key. | api_key_id, ignore_if_missing |
ArizeAxRefreshAPIKeyOperator | Rotate (refresh) an API key. | api_key_id |
AI integrations
Module:airflow.providers.arize_ax.operators.ai_integrations (alpha)
Register external agent platforms (LangGraph, OpenAI Agents, Autogen) so Arize evaluators can run against them.
| Operator | Purpose | Key params |
|---|---|---|
ArizeAxListAIIntegrationsOperator | List AI integrations. | space_id, limit, cursor |
ArizeAxGetAIIntegrationOperator | Fetch an integration by ID. | integration_id |
ArizeAxCreateAIIntegrationOperator | Create an integration. | name, provider, space_id, api_key, base_url, model_names, if_exists |
ArizeAxUpdateAIIntegrationOperator | Update an integration. | integration_id, name, provider, api_key |
ArizeAxDeleteAIIntegrationOperator | Delete an integration. | integration_id, ignore_if_missing |
Sensors
Module:airflow.providers.arize_ax.sensors.arize_ax
All sensors accept the standard Airflow poke_interval, timeout, mode, and soft_fail.
| Sensor | Waits until… | Key params |
|---|---|---|
ArizeAxExperimentCompleteSensor | Experiment reaches a terminal state. | experiment_id, min_runs |
ArizeAxExperimentRunCountSensor | Experiment has at least N completed runs. | experiment_id, min_runs |
ArizeAxEvaluationScoreSensor | Experiment metric mean ≥ threshold. | experiment_id, metric_name, min_score |
ArizeAxDatasetReadySensor | Dataset has ≥ N examples. | dataset_id, min_examples, dataset_version_id |
ArizeAxSpanCountSensor | Project has ≥ N spans in the window. | project_id, min_count, start_time, end_time, filter |
ArizeAxSpanIngestionSensor | Span count stops growing rapidly (ingestion stable). | project_id, stable_window_seconds |
ArizeAxAnnotationQueueSensor | At least N annotation configs exist in the space. | space_id, min_count |
ArizeAxTaskRunSensor | Evaluation task run reaches a terminal state. | run_id |
Resources
Provider overview
Install, connection setup, design patterns, and example DAG walkthroughs.
Example DAGs
Runnable DAGs covering CI/CD gating, prompt lifecycle, dataset curation, and more.
Python SDK v8
The underlying
ArizeClient API the provider wraps.