phoenix/js/packages/phoenix-cli at main · Arize-ai/phoenix (GitHub)
Phoenix CLI is a command-line interface for your Phoenix projects. Fetch traces, list datasets, and export experiment results directly from your terminal, or pipe them into AI coding agents like Claude Code, Cursor, Codex, and Gemini CLI.
You can use Phoenix CLI for:
  • Immediate Debugging: Fetch the most recent trace of a failed or unexpected run with a single command
  • Bulk Export: Export large numbers of traces or experiment results to JSON files for offline analysis
  • Dataset & Experiment Access: List datasets and retrieve full experiment data including runs, evaluations, and trace IDs
  • Terminal Workflows: Integrate trace and experiment data into your existing tools, piping output to Unix utilities like jq
  • AI Coding Assistants: Use with Claude Code, Cursor, Windsurf, or other AI-powered tools to analyze agent traces and experiments
Don’t see your use case covered? @arizeai/phoenix-cli is open source. Issues and PRs are welcome.

Installation

npm install -g @arizeai/phoenix-cli
Or run directly with npx:
npx @arizeai/phoenix-cli

Quick Start

# Configure your Phoenix instance
export PHOENIX_HOST=http://localhost:6006
export PHOENIX_PROJECT=my-project
export PHOENIX_API_KEY=your-api-key  # if authentication is enabled

# Fetch the most recent trace
px traces --limit 1

# Fetch a specific trace by ID
px trace abc123def456

# Export traces to a directory
px traces ./my-traces --limit 50

Environment Variables

Variable                 Description
PHOENIX_HOST             Phoenix API endpoint (e.g., http://localhost:6006)
PHOENIX_PROJECT          Project name or ID
PHOENIX_API_KEY          API key for authentication (if required)
PHOENIX_CLIENT_HEADERS   Custom headers as JSON string
CLI flags take priority over environment variables.
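
Because flags win over environment variables, a single command can target a different instance without changing the exported values. The wrapper below is a minimal sketch of that idea; px_on is a hypothetical helper name, not part of the CLI, and the URL in the usage line is a placeholder.

```shell
# px_on: hypothetical wrapper (not part of Phoenix CLI).
# Usage: px_on <endpoint-url> <px arguments...>
# Appends --endpoint so the flag overrides PHOENIX_HOST for this call only.
px_on() {
  if [ "$#" -lt 2 ]; then
    echo "usage: px_on <endpoint-url> <px arguments...>" >&2
    return 2
  fi
  endpoint="$1"
  shift
  px "$@" --endpoint "$endpoint"
}
```

For example, px_on http://staging.example.com:6006 traces --limit 1 fetches from the placeholder staging host while PHOENIX_HOST keeps pointing at your default instance.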

Commands

px projects

List all available projects.
px projects
px projects --format raw  # JSON output for piping
Option              Description                           Default
--endpoint <url>    Phoenix API endpoint                  From env
--api-key <key>     Phoenix API key                       From env
--format <format>   Output format: pretty, json, or raw   pretty
--no-progress       Disable progress indicators
--limit <number>    Maximum projects to fetch per page    100

px traces [directory]

Fetch recent traces from the configured project.
px traces --limit 10                          # Output to stdout
px traces ./my-traces --limit 10              # Save to directory
px traces --last-n-minutes 60 --limit 20      # Filter by time
px traces --since 2026-01-13T10:00:00Z        # Since timestamp
px traces --format raw --no-progress | jq     # Pipe to jq
Option                      Description                                 Default
[directory]                 Save traces as JSON files to directory      stdout
-n, --limit <number>        Number of traces to fetch (newest first)    10
--last-n-minutes <number>   Only fetch traces from the last N minutes
--since <timestamp>         Fetch traces since ISO timestamp
--endpoint <url>            Phoenix API endpoint                        From env
--project <name>            Project name or ID                          From env
--api-key <key>             Phoenix API key                             From env
--format <format>           pretty, json, or raw                        pretty
--no-progress               Disable progress output
--max-concurrent <number>   Maximum concurrent fetches                  10
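
The --since option takes an ISO timestamp, which is tedious to type by hand. As a sketch, the hypothetical helper iso_hours_ago below builds one in UTC, in the same shape as the 2026-01-13T10:00:00Z example above. It tries GNU date first and falls back to the BSD/macOS syntax; the helper name is an assumption for illustration, not something the CLI provides.

```shell
# iso_hours_ago: hypothetical helper (not part of Phoenix CLI).
# Prints the UTC time N hours ago as YYYY-MM-DDTHH:MM:SSZ.
iso_hours_ago() {
  hours="$1"
  fmt='+%Y-%m-%dT%H:%M:%SZ'
  # GNU date (Linux) understands -d; BSD/macOS date uses -v instead.
  date -u -d "$hours hours ago" "$fmt" 2>/dev/null ||
    date -u -v "-${hours}H" "$fmt"
}
```

It can then be used as px traces --since "$(iso_hours_ago 2)" --limit 20.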

px trace <trace-id>

Fetch a specific trace by ID.
px trace abc123def456
px trace abc123def456 --file trace.json      # Save to file
px trace abc123def456 --format raw | jq      # Pipe to jq
Option              Description                      Default
--file <path>       Save to file instead of stdout   stdout
--format <format>   pretty, json, or raw             pretty
--endpoint <url>    Phoenix API endpoint             From env
--project <name>    Project name or ID               From env
--api-key <key>     Phoenix API key                  From env
--no-progress       Disable progress indicators

px datasets

List all available datasets.
px datasets
px datasets --format json                    # JSON output
px datasets --format raw --no-progress | jq  # Pipe to jq
Option              Description                   Default
--endpoint <url>    Phoenix API endpoint          From env
--api-key <key>     Phoenix API key               From env
--format <format>   pretty, json, or raw          pretty
--no-progress       Disable progress indicators
--limit <number>    Maximum number of datasets

px dataset <dataset-identifier>

Fetch examples from a dataset.
px dataset query_response                        # Fetch all examples
px dataset query_response --split train          # Filter by split
px dataset query_response --split train --split test  # Multiple splits
px dataset query_response --version <version-id> # Specific version
px dataset query_response --file dataset.json    # Save to file
px dataset query_response --format raw | jq '.examples[].input'
Option              Description                                Default
--split <name>      Filter by split (can be used repeatedly)
--version <id>      Fetch from specific dataset version        latest
--file <path>       Save to file instead of stdout             stdout
--format <format>   pretty, json, or raw                       pretty
--endpoint <url>    Phoenix API endpoint                       From env
--api-key <key>     Phoenix API key                            From env
--no-progress       Disable progress indicators
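
Since --split can be repeated, a list of split names can be expanded into flags programmatically. A minimal sketch, with px_dataset_splits as a hypothetical helper name that is not part of the CLI:

```shell
# px_dataset_splits: hypothetical helper (not part of Phoenix CLI).
# Usage: px_dataset_splits <dataset-identifier> [split ...]
px_dataset_splits() {
  dataset="$1"
  shift
  # Rewrite the positional parameters from "a b" to "--split a --split b".
  for split in "$@"; do
    set -- "$@" --split "$split"
    shift
  done
  px dataset "$dataset" "$@" --format raw --no-progress
}
```

Calling px_dataset_splits query_response train test runs the same command as the multiple-splits example above, with raw output suitable for piping.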

px experiments --dataset <name-or-id>

List experiments for a dataset, optionally exporting full data to files.
px experiments --dataset my-dataset                 # List experiments
px experiments --dataset my-dataset --format json   # JSON output
px experiments --dataset my-dataset ./experiments   # Export to directory
Option                   Description                                 Default
--dataset <name-or-id>   Dataset name or ID (required)
[directory]              Export experiment JSON files to directory   stdout
--endpoint <url>         Phoenix API endpoint                        From env
--api-key <key>          Phoenix API key                             From env
--format <format>        pretty, json, or raw                        pretty
--no-progress            Disable progress indicators
--limit <number>         Maximum number of experiments

px experiment <experiment-id>

Fetch a single experiment with all run data, including inputs, outputs, evaluations, and trace IDs.
px experiment RXhwZXJpbWVudDox
px experiment RXhwZXJpbWVudDox --file exp.json   # Save to file
px experiment RXhwZXJpbWVudDox --format json     # JSON output
Option              Description                      Default
--file <path>       Save to file instead of stdout   stdout
--format <format>   pretty, json, or raw             pretty
--endpoint <url>    Phoenix API endpoint             From env
--api-key <key>     Phoenix API key                  From env
--no-progress       Disable progress indicators

Output Formats

pretty (default): human-readable tree view.
┌─ Trace: abc123def456
│
│  Input: What is the weather in San Francisco?
│  Output: The weather is currently sunny...
│
│  Spans:
│  └─ ✓ agent_run (CHAIN) - 1250ms
│     ├─ ✓ llm_call (LLM) - 800ms
│     └─ ✓ tool_execution (TOOL) - 400ms
└─
json: formatted JSON with indentation.
raw: compact JSON for piping to jq or other tools.
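
A script that is sometimes run interactively and sometimes piped can pick the format itself. The sketch below uses the shell's [ -t 1 ] test to check whether stdout is a terminal; px_auto is a hypothetical wrapper name, not a CLI feature.

```shell
# px_auto: hypothetical wrapper (not part of Phoenix CLI).
# Uses the pretty tree view on a terminal, compact JSON when piped or redirected.
px_auto() {
  if [ -t 1 ]; then
    px "$@" --format pretty
  else
    px "$@" --format raw --no-progress
  fi
}
```

With this in place, px_auto traces --limit 5 prints the tree view in a terminal, while px_auto traces --limit 5 | jq receives compact JSON without progress output.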

JSON Structure

{
  "traceId": "abc123def456",
  "spans": [
    {
      "name": "chat_completion",
      "context": {
        "trace_id": "abc123def456",
        "span_id": "span-1"
      },
      "span_kind": "LLM",
      "parent_id": null,
      "start_time": "2026-01-17T10:00:00.000Z",
      "end_time": "2026-01-17T10:00:01.250Z",
      "status_code": "OK",
      "attributes": {
        "llm.model_name": "gpt-4",
        "llm.token_count.prompt": 512,
        "llm.token_count.completion": 256,
        "input.value": "What is the weather?",
        "output.value": "The weather is sunny..."
      }
    }
  ],
  "rootSpan": { ... },
  "startTime": "2026-01-17T10:00:00.000Z",
  "endTime": "2026-01-17T10:00:01.250Z",
  "duration": 1250,
  "status": "OK"
}
Spans include OpenInference semantic attributes like llm.model_name, llm.token_count.*, input.value, output.value, tool.name, and exception.*.
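
As an illustration of working with these attributes, the sketch below totals prompt and completion tokens per trace with jq. It assumes the raw output of px traces is an array of trace objects shaped like the structure above; token_totals is a hypothetical helper name, and spans without token counts contribute nothing to the sums.

```shell
# token_totals: hypothetical helper (not part of Phoenix CLI).
# Reads an array of traces on stdin and prints one compact JSON object per trace.
token_totals() {
  jq -c '.[] | {
    traceId,
    prompt: ([.spans[].attributes["llm.token_count.prompt"] // 0] | add),
    completion: ([.spans[].attributes["llm.token_count.completion"] // 0] | add)
  }'
}
```

Usage: px traces --limit 20 --format raw --no-progress | token_totals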

Examples

Debug failed traces

px traces --limit 20 --format raw --no-progress | jq '.[] | select(.status == "ERROR")'

Find slowest traces

px traces --limit 10 --format raw --no-progress | jq 'sort_by(-.duration) | .[0:3]'

Extract LLM models used

px traces --limit 50 --format raw --no-progress | \
  jq -r '.[].spans[] | select(.span_kind == "LLM") | .attributes["llm.model_name"]' | sort -u

Count errors

px traces --limit 100 --format raw --no-progress | jq '[.[] | select(.status == "ERROR")] | length'

List datasets and experiments

# List all datasets
px datasets --format raw --no-progress | jq '.[].name'
# Output: "query_response"

# List experiments for a dataset
px experiments --dataset query_response --format raw --no-progress | \
  jq '.[] | {id, successful_run_count, failed_run_count}'
# Output: {"id":"RXhwZXJpbWVudDox","successful_run_count":249,"failed_run_count":1}

# Export all experiment data for a dataset to a directory
px experiments --dataset query_response ./experiments/

Analyze experiment results

# Get input queries and latency from an experiment
px experiment RXhwZXJpbWVudDox --format raw --no-progress | \
  jq '.[] | {query: .input.query, latency_ms, trace_id}'

# Find failed runs in an experiment
px experiment RXhwZXJpbWVudDox --format raw --no-progress | \
  jq '.[] | select(.error != null) | {query: .input.query, error}'
# Output: {"query":"looking for complex fodmap meal ideas","error":"peer closed connection..."}

# Calculate average latency across runs
px experiment RXhwZXJpbWVudDox --format raw --no-progress | \
  jq '[.[].latency_ms] | add / length'
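
The same fields support a quick failure summary. This sketch assumes, as the examples above do, that the raw output is an array of runs whose error field is null or absent on success; failure_rate is a hypothetical helper name.

```shell
# failure_rate: hypothetical helper (not part of Phoenix CLI).
# Reads an array of experiment runs on stdin and prints "<failed>/<total> runs failed".
failure_rate() {
  jq -r '"\([.[] | select(.error != null)] | length)/\(length) runs failed"'
}
```

Usage: px experiment RXhwZXJpbWVudDox --format raw --no-progress | failure_rate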

Use with AI Coding Assistants

Phoenix CLI is designed to work seamlessly with AI coding assistants like Claude Code, Cursor, and Windsurf.

Claude Code

Ask Claude Code:
Use px to fetch the last 3 traces from my Phoenix project and analyze them for potential improvements
Claude Code will discover the CLI via px --help and fetch your traces for analysis.

Cursor / Windsurf

Run the CLI in the integrated terminal, then ask the AI to interpret the output:
Fetch my recent Phoenix traces using px and explain what my agent is doing

License

Apache 2.0