# Phoenix CLI
Debug and analyze LLM applications using the Phoenix CLI (px).
## Quick Start

### Installation

```shell
npm install -g @arizeai/phoenix-cli
```

Or run directly with npx:

```shell
npx @arizeai/phoenix-cli
```
## Configuration

Set environment variables before running commands:

```shell
export PHOENIX_HOST=http://localhost:6006
export PHOENIX_PROJECT=my-project
export PHOENIX_API_KEY=your-api-key  # if authentication is enabled
```
CLI flags override environment variables when specified.
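The precedence rule can be sketched in Python; the `resolve_setting` function and its arguments are illustrative, not part of the CLI:

```python
import os

def resolve_setting(flag_value, env_var, default=None):
    """Resolve one setting: an explicit CLI flag wins, then the
    environment variable, then the default."""
    if flag_value is not None:
        return flag_value
    return os.environ.get(env_var, default)

os.environ["PHOENIX_HOST"] = "http://localhost:6006"
print(resolve_setting(None, "PHOENIX_HOST"))                  # env var used
print(resolve_setting("http://remote:6006", "PHOENIX_HOST"))  # flag wins
```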
## Debugging Workflows

### Debug a failing LLM application

Fetch recent traces to see what's happening:

```shell
px traces --limit 10
```
Find failed traces:

```shell
px traces --limit 50 --format raw --no-progress | jq '.[] | select(.status == "ERROR")'
```
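If jq isn't available, the same filter can be written in Python against the raw-format output (the sample data below is illustrative):

```python
import json

# Sample of raw-format output: a JSON array of traces with a status field.
raw = '[{"traceId": "a1", "status": "OK"}, {"traceId": "b2", "status": "ERROR"}]'

traces = json.loads(raw)
failed = [t for t in traces if t["status"] == "ERROR"]
print([t["traceId"] for t in failed])  # ['b2']
```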
Get details on a specific trace:

```shell
px trace <trace-id>
```
Look for errors in spans:

```shell
px trace <trace-id> --format raw --no-progress | jq '.spans[] | select(.status_code == "ERROR")'
```
### Find performance issues

Get the slowest traces:

```shell
px traces --limit 20 --format raw --no-progress | jq 'sort_by(-.duration) | .[0:5]'
```
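The same ranking can be done in Python on the raw-format output (sample data is illustrative; durations are milliseconds, matching the trace JSON shown later):

```python
import json

raw = '''[
  {"traceId": "a", "duration": 300},
  {"traceId": "b", "duration": 1250},
  {"traceId": "c", "duration": 700}
]'''

traces = json.loads(raw)
# Equivalent of jq 'sort_by(-.duration) | .[0:5]': top 5 by descending duration.
slowest = sorted(traces, key=lambda t: -t["duration"])[:5]
print([t["traceId"] for t in slowest])  # ['b', 'c', 'a']
```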
Analyze span durations within a trace:

```shell
px trace <trace-id>
```
### Analyze LLM usage

Extract models and token counts:

```shell
px traces --limit 50 --format raw --no-progress | \
  jq -r '.[].spans[] | select(.span_kind == "LLM") | {model: .attributes["llm.model_name"], prompt_tokens: .attributes["llm.token_count.prompt"], completion_tokens: .attributes["llm.token_count.completion"]}'
```
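To aggregate token usage per model rather than list it span by span, a Python sketch over the same raw-format output (sample data is illustrative):

```python
import json
from collections import defaultdict

# Illustrative raw-format output with one LLM span per trace.
raw = '''[
  {"traceId": "a", "spans": [{"span_kind": "LLM", "attributes": {
    "llm.model_name": "gpt-4",
    "llm.token_count.prompt": 512,
    "llm.token_count.completion": 256}}]},
  {"traceId": "b", "spans": [{"span_kind": "LLM", "attributes": {
    "llm.model_name": "gpt-4",
    "llm.token_count.prompt": 100,
    "llm.token_count.completion": 50}}]}
]'''

totals = defaultdict(lambda: {"prompt": 0, "completion": 0})
for trace in json.loads(raw):
    for span in trace["spans"]:
        if span["span_kind"] != "LLM":
            continue
        attrs = span["attributes"]
        model = attrs["llm.model_name"]
        totals[model]["prompt"] += attrs.get("llm.token_count.prompt", 0)
        totals[model]["completion"] += attrs.get("llm.token_count.completion", 0)

print(dict(totals))  # {'gpt-4': {'prompt': 612, 'completion': 306}}
```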
### Review experiment results

List datasets:

```shell
px datasets
```

List experiments for a dataset:

```shell
px experiments --dataset my-dataset
```
Analyze experiment failures:

```shell
px experiment <experiment-id>
```
Calculate average latency:

```shell
px experiment <experiment-id>
```
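The average can be computed from the experiment's JSON output; the `runs` and `latency_ms` field names below are hypothetical, since the exact run schema isn't shown here:

```python
import json

# Hypothetical experiment export; field names are illustrative only.
raw = '{"runs": [{"latency_ms": 900}, {"latency_ms": 1100}, {"latency_ms": 1000}]}'

runs = json.loads(raw)["runs"]
avg = sum(r["latency_ms"] for r in runs) / len(runs)
print(avg)  # 1000.0
```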
## Command Reference

### px traces

Fetch recent traces from a project.

```shell
px traces [directory] [options]
```

| Option | Description |
| --- | --- |
| `[directory]` | Save traces as JSON files to directory |
| `-n, --limit <n>` | Maximum number of traces to fetch |
### px trace

Fetch a specific trace by ID.

```shell
px trace <trace-id> [options]
```

| Option | Description |
| --- | --- |
| `--file <path>` | Save the trace as a JSON file |
### px datasets

List all datasets.

```shell
px datasets [options]
```
### px dataset

Fetch examples from a dataset.

```shell
px dataset <dataset-name> [options]
```

| Option | Description |
| --- | --- |
| `--split <name>` | Fetch only examples from the given split |
### px experiments

List experiments for a dataset.

```shell
px experiments --dataset <dataset-name>
```

| Option | Description |
| --- | --- |
| `--dataset <name>` | Dataset to list experiments for |
### px experiment

Fetch a single experiment with run data.

```shell
px experiment <experiment-id>
```
### px prompts

List all prompts.

```shell
px prompts [options]
```
### px prompt

Fetch a specific prompt.

```shell
px prompt <prompt-name>
```
## Output Formats

- `pretty` (default): Human-readable tree view
- `json`: Formatted JSON with indentation
- `raw`: Compact JSON for piping to `jq` or other tools

Use `--format raw --no-progress` when piping output to other commands.
## Trace Structure

Traces contain spans with OpenInference semantic attributes:

```json
{
  "traceId": "abc123",
  "spans": [
    {
      "name": "chat_completion",
      "span_kind": "LLM",
      "status_code": "OK",
      "attributes": {
        "llm.model_name": "gpt-4",
        "llm.token_count.prompt": 512,
        "llm.token_count.completion": 256,
        "input.value": "What is the weather?",
        "output.value": "The weather is sunny..."
      }
    }
  ],
  "duration": 1250,
  "status": "OK"
}
```
Key span kinds: `LLM`, `CHAIN`, `TOOL`, `RETRIEVER`, `EMBEDDING`, `AGENT`.

Key attributes for LLM spans:

- `llm.model_name`: Model used
- `llm.provider`: Provider name (e.g., "openai")
- `llm.token_count.prompt` / `llm.token_count.completion`: Token counts
- `llm.input_messages.<i>`: Input messages (indexed, with role and content)
- `llm.output_messages.<i>`: Output messages (indexed, with role and content)
- `input.value` / `output.value`: Raw input/output as text
- `exception.message`: Error message if failed
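A short Python sketch of walking this structure, using the trace JSON from above (not part of the CLI itself):

```python
import json

trace = json.loads('''{
  "traceId": "abc123",
  "spans": [{
    "name": "chat_completion",
    "span_kind": "LLM",
    "status_code": "OK",
    "attributes": {
      "llm.model_name": "gpt-4",
      "llm.token_count.prompt": 512,
      "llm.token_count.completion": 256,
      "input.value": "What is the weather?",
      "output.value": "The weather is sunny..."
    }
  }],
  "duration": 1250,
  "status": "OK"
}''')

# Summarize each span: kind, name, model (for LLM spans), and any error.
for span in trace["spans"]:
    attrs = span["attributes"]
    line = f'{span["span_kind"]}: {span["name"]}'
    if span["span_kind"] == "LLM":
        line += f' ({attrs.get("llm.model_name")})'
    if "exception.message" in attrs:
        line += f' ERROR: {attrs["exception.message"]}'
    print(line)  # LLM: chat_completion (gpt-4)
```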