From arize-platform
Manages Arize AI datasets via ax CLI: list with pagination, get details, create from CSV/JSON/Parquet files, delete, export data, extract IDs with jq. For Arize ML platform ops.
Slash command: /arize-platform:arize-datasets
Manage datasets in the Arize AI platform using the `ax` CLI.
The user must have:
The ax CLI installed (pip install arize-ax-cli)
A configuration profile set up (ax config init)
ax datasets list
Options:
--output <format> - Output format: table (default), json, csv, parquet
--profile <name> - Use specific configuration profile
--limit <n> - Limit number of results
--offset <n> - Skip first n results (pagination)
Examples:
# List as table (default)
ax datasets list
# List as JSON
ax datasets list --output json
# List with pagination
ax datasets list --limit 10 --offset 0
# Use production profile
ax datasets list --profile production
Extracting Dataset IDs:
To find a specific dataset ID for use in other operations:
# Get all dataset IDs and names as JSON
ax datasets list --output json | jq '.[] | {id: .id, name: .name}'
# Find a dataset ID by name
ax datasets list --output json | jq -r '.[] | select(.name == "Training Data") | .id'
# Save dataset ID to a variable
DATASET_ID=$(ax datasets list --output json | jq -r '.[] | select(.name == "Training Data") | .id')
echo "Found dataset: $DATASET_ID"
# Use the ID in subsequent commands
ax datasets get "$DATASET_ID"
ax datasets delete "$DATASET_ID"
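The name lookup above silently yields an empty variable when nothing matches, and several IDs when names collide, which is risky right before a delete. A minimal sketch of a guard, assuming one ID per line on stdin; `pick_single_id` is a hypothetical helper, not part of the ax CLI:

```shell
# Hypothetical helper (not part of ax): read candidate IDs on stdin,
# one per line, and succeed only if exactly one non-empty line arrives.
pick_single_id() {
  ids=$(grep -v '^[[:space:]]*$' || true)
  if [ -z "$ids" ]; then
    echo "no dataset matched" >&2
    return 1
  fi
  count=$(printf '%s\n' "$ids" | wc -l | tr -d ' ')
  if [ "$count" -ne 1 ]; then
    echo "expected 1 match, found $count" >&2
    return 1
  fi
  printf '%s\n' "$ids"
}

# Usage (requires the ax CLI and jq):
#   DATASET_ID=$(ax datasets list --output json \
#     | jq -r '.[] | select(.name == "Training Data") | .id' \
#     | pick_single_id) || exit 1
```

With this in place, a missing or ambiguous name stops the script instead of passing an empty or multi-line value to later commands.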
Without jq (using grep):
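# Rough search: print the lines around the matching name and keep the "id" line
ax datasets list --output json | grep -B 2 -A 2 "Training Data" | grep '"id"'
# More reliable pattern (assumes pretty-printed JSON with "id" on the line directly before "name")
ax datasets list --output json | grep -B 1 '"name": "Training Data"' | grep '"id"' | cut -d'"' -f4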
Retrieve information about a specific dataset:
ax datasets get <dataset-id>
Options:
--output <format> - Output format
--profile <name> - Configuration profile to use
Examples:
# Get dataset details
ax datasets get ds_abc123xyz
# Get as JSON
ax datasets get ds_abc123xyz --output json
# Get from production environment
ax datasets get ds_abc123xyz --profile production
Create a dataset from a file:
ax datasets create --file <path> [options]
Supported File Formats:
CSV (.csv)
JSON (.json, .jsonl)
Parquet (.parquet)
Options:
--name <name> - Dataset name (required or inferred from filename)
--description <text> - Dataset description
--profile <name> - Configuration profile to use
Examples:
# Create from CSV
ax datasets create --file data.csv --name "Training Data" --description "Production training set"
# Create from JSON
ax datasets create --file examples.json --name "Test Examples"
# Create from Parquet
ax datasets create --file dataset.parquet --name "Large Dataset"
# Use staging profile
ax datasets create --file data.csv --name "Test Data" --profile staging
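Before uploading, it can help to confirm that the file exists, is non-empty, and has one of the supported extensions. The sketch below is one way to do that; `check_dataset_file` is a hypothetical helper, and the extension match is case-sensitive by assumption (the CLI itself may be more lenient):

```shell
# Hypothetical pre-flight check (not part of ax): succeed only for an
# existing, non-empty file whose extension is one of the supported formats.
check_dataset_file() {
  file=$1
  if [ ! -s "$file" ]; then
    echo "missing or empty file: $file" >&2
    return 1
  fi
  case "$file" in
    *.csv|*.json|*.jsonl|*.parquet) return 0 ;;
    *) echo "unsupported extension: $file" >&2; return 1 ;;
  esac
}

# Usage (requires the ax CLI):
#   check_dataset_file data.csv && ax datasets create --file data.csv --name "Training Data"
```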
Remove a dataset from Arize:
ax datasets delete <dataset-id>
Options:
--profile <name> - Configuration profile to use
--yes or -y - Skip confirmation prompt
Examples:
# Delete with confirmation
ax datasets delete ds_abc123xyz
# Delete without confirmation
ax datasets delete ds_abc123xyz --yes
# Delete from production
ax datasets delete ds_abc123xyz --profile production
⚠️ Warning: Deletion is permanent. Always verify the dataset ID before deleting.
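One way to follow the warning in a script is to look the dataset up first and delete only if the lookup succeeds. This is a sketch, not a definitive implementation: `confirm_then_delete` is a hypothetical wrapper, and it assumes `ax datasets get` exits with a non-zero status when the ID does not exist.

```shell
# Hypothetical wrapper (not part of ax): look the dataset up first and
# delete only if the lookup succeeds.
confirm_then_delete() {
  dataset_id=$1
  if [ -z "$dataset_id" ]; then
    echo "refusing to delete: empty dataset ID" >&2
    return 1
  fi
  # Show the dataset so its name can be checked.
  # Assumption: a failed lookup returns a non-zero exit status.
  if ! ax datasets get "$dataset_id"; then
    echo "dataset not found, nothing deleted: $dataset_id" >&2
    return 1
  fi
  # No --yes here, so the CLI's own confirmation prompt still applies.
  ax datasets delete "$dataset_id"
}

# Usage (requires the ax CLI):
#   confirm_then_delete ds_abc123xyz
```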
Export dataset examples to various formats:
ax datasets get <dataset-id> --output <format>
Export Formats:
json - JSON format
csv - Comma-separated values
parquet - Apache Parquet format
Examples:
# Export to JSON
ax datasets get ds_abc123xyz --output json > dataset.json
# Export to CSV
ax datasets get ds_abc123xyz --output csv > dataset.csv
# Export to Parquet
ax datasets get ds_abc123xyz --output parquet > dataset.parquet
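A plain `>` redirect creates the output file even when the export fails, which can leave an empty or truncated file behind. A possible workaround is to write to a temporary name and move it into place only on success; `export_dataset` and the `.partial` suffix are hypothetical choices for illustration:

```shell
# Hypothetical helper (not part of ax): export to a temporary file first,
# then move it into place, so a failed export never leaves a partial file.
export_dataset() {
  dataset_id=$1
  format=$2
  outfile=$3
  case "$format" in
    json|csv|parquet) ;;
    *) echo "unsupported export format: $format" >&2; return 1 ;;
  esac
  tmpfile="$outfile.partial"
  if ax datasets get "$dataset_id" --output "$format" > "$tmpfile" && [ -s "$tmpfile" ]; then
    mv "$tmpfile" "$outfile"
  else
    rm -f "$tmpfile"
    echo "export failed for $dataset_id" >&2
    return 1
  fi
}

# Usage (requires the ax CLI):
#   export_dataset ds_abc123xyz csv dataset.csv
```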
When working across different environments (dev, staging, production):
# List datasets in production
ax datasets list --profile production
# Create dataset in staging
ax datasets create --file test_data.csv --profile staging
# Get dataset from dev environment
ax datasets get ds_dev_123 --profile dev
For accounts with many datasets, use pagination:
# First page (10 items)
ax datasets list --limit 10 --offset 0
# Second page
ax datasets list --limit 10 --offset 10
# Third page
ax datasets list --limit 10 --offset 20
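The page-by-page commands above can be wrapped in a loop that advances the offset until nothing is returned. The sketch below assumes that an empty page is printed as an empty JSON array (`[]`); `list_all_pages` is a hypothetical helper, and it caps the number of pages so the loop always ends:

```shell
# Hypothetical helper (not part of ax): print every page of
# `ax datasets list` as JSON until an empty page comes back.
list_all_pages() {
  page_size=${1:-10}
  max_pages=${2:-100}   # safety cap so the loop always terminates
  offset=0
  pages=0
  while [ "$pages" -lt "$max_pages" ]; do
    page=$(ax datasets list --output json --limit "$page_size" --offset "$offset") || return 1
    # Assumption: an empty page is an empty JSON array, "[]".
    compact=$(printf '%s' "$page" | tr -d '[:space:]')
    if [ -z "$compact" ] || [ "$compact" = "[]" ]; then
      break
    fi
    printf '%s\n' "$page"
    offset=$((offset + page_size))
    pages=$((pages + 1))
  done
}

# Usage (requires the ax CLI):
#   list_all_pages 10 > all_pages.json
```

Each page is emitted as its own JSON array, so the combined output is a stream of arrays rather than one merged array.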
# 1. List all datasets and find the one you want
ax datasets list --output json | jq '.[] | {id: .id, name: .name}'
# 2. Extract the specific dataset ID by name
DATASET_ID=$(ax datasets list --output json | jq -r '.[] | select(.name == "Production Data") | .id')
# 3. Get detailed information about that dataset
ax datasets get "$DATASET_ID"
# 4. Export the dataset if needed
ax datasets get "$DATASET_ID" --output csv > dataset_export.csv
# 1. Create dataset
ax datasets create --file data.csv --name "My Dataset"
# 2. Find the new dataset ID
DATASET_ID=$(ax datasets list --output json | jq -r '.[] | select(.name == "My Dataset") | .id')
echo "Created dataset: $DATASET_ID"
# 3. Verify details
ax datasets get "$DATASET_ID"
# 1. Export existing dataset
ax datasets get ds_abc123 --output csv > dataset.csv
# 2. Modify the CSV file (manual editing)
# 3. Create new version
ax datasets create --file dataset.csv --name "Updated Dataset v2"
# 1. Export from production
ax datasets get ds_prod_123 --profile production --output json > prod_data.json
# 2. Import to staging
ax datasets create --file prod_data.json --name "Production Copy" --profile staging
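The export-then-import steps above can be combined so that the intermediate file is always cleaned up and a failed export never triggers an import. A minimal sketch; `copy_dataset` is a hypothetical helper, and it relies on the same behavior as the workflow above (JSON produced by `get` being accepted by `create`):

```shell
# Hypothetical helper (not part of ax): copy one dataset from a source
# profile to a destination profile via a temporary JSON file.
copy_dataset() {
  dataset_id=$1
  new_name=$2
  src_profile=$3
  dst_profile=$4
  workdir=$(mktemp -d) || return 1
  export_file="$workdir/export.json"
  status=0
  if ax datasets get "$dataset_id" --profile "$src_profile" --output json > "$export_file"; then
    ax datasets create --file "$export_file" --name "$new_name" --profile "$dst_profile" || status=1
  else
    echo "export from profile '$src_profile' failed" >&2
    status=1
  fi
  # Remove the temporary directory whether or not the copy succeeded.
  rm -rf "$workdir"
  return "$status"
}

# Usage (requires the ax CLI):
#   copy_dataset ds_prod_123 "Production Copy" production staging
```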
# 1. List all datasets
ax datasets list --output json > all_datasets.json
# 2. Review and identify datasets to delete (manual review)
# 3. Delete old datasets
ax datasets delete ds_old_001 --yes
ax datasets delete ds_old_002 --yes
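For longer cleanup lists, the reviewed IDs can be kept in a text file and deleted in a loop. The sketch below defaults to a dry run that only prints what would be deleted; `delete_from_file` and the `DRY_RUN` variable are hypothetical names introduced for illustration:

```shell
# Hypothetical helper (not part of ax): delete every dataset ID listed in
# a file, one per line; blank lines and lines starting with # are skipped.
# Set DRY_RUN=0 to actually delete; by default it only prints the plan.
delete_from_file() {
  id_file=$1
  failed=0
  while IFS= read -r dataset_id || [ -n "$dataset_id" ]; do
    case "$dataset_id" in
      ''|'#'*) continue ;;
    esac
    if [ "${DRY_RUN:-1}" != "0" ]; then
      echo "would delete: $dataset_id"
    elif ! ax datasets delete "$dataset_id" --yes < /dev/null; then
      # stdin is redirected so the CLI cannot consume the remaining IDs.
      echo "failed to delete: $dataset_id" >&2
      failed=1
    fi
  done < "$id_file"
  return "$failed"
}

# Usage (requires the ax CLI):
#   delete_from_file old_datasets.txt              # dry run
#   DRY_RUN=0 delete_from_file old_datasets.txt    # permanent deletion
```

Because deletion is permanent, running the dry run first and reading its output is the safer order.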
Table output (the default) is a human-readable table with columns for ID, Name, Created, and Status.
JSON output is structured JSON with full dataset metadata:
{
"id": "ds_abc123xyz",
"name": "Training Data",
"description": "Production training set",
"created_at": "2024-01-15T10:30:00Z",
"num_examples": 1000,
"size_bytes": 52428800
}
CSV output is comma-separated values, useful for importing into spreadsheets or pandas.
Parquet output is an efficient columnar format, ideal for large datasets and data processing.
ax datasets list
ax config show
ax config show --expand
ax config init
Supported formats are CSV, JSON (including JSONL), and Parquet. Check:
For very large datasets:
For datasets with many examples:
--limit to restrict output size
ax datasets get ds_abc123 --output json > dataset.json
--limit and --offset
DATASET_ID=$(ax datasets list --output json | jq -r '.[] | select(.name == "My Dataset") | .id')
ax datasets list --output json | jq '.[] | .id'
ax datasets list --output json | jq '.[] | {id, name}'
ax datasets get "$DATASET_ID" to confirm before deleting
prod, staging, dev
Use this skill when users want to:
Don't use this skill for:
/arize-graphql-analytics instead)
/setup-arize-cli instead)
npx claudepluginhub arize-ai/arize-claude-code-plugin --plugin arize-platform