Operates a Traceway observability instance via CLI: login, query exceptions/logs/endpoints/metrics, and debug production issues to root cause. Activated by /traceway commands.
Slash command: `/tw:traceway`
Drive a Traceway instance from the terminal with the `traceway` CLI. The first word of the argument decides the flow:
| Invocation | Flow |
|---|---|
| `/traceway login` | Login: install the CLI if missing, authenticate, select a project |
| `/traceway debug <issue ref or bug description>` | Debug: resolve the issue and investigate to root cause |
| `/traceway <anything else>` | Query: answer the observability question with CLI reads |
| `/traceway` (no argument) | Ask what they want: log in, debug an issue, or run a query |
The CLI is under active development. If a flag documented here does not appear in
traceway <command> --help, trust the binary.
- Any `list` / `show` / `query` subcommand may run freely; they never mutate server state.
- `exceptions archive` / `unarchive` are the only mutating data commands; only run them when the user asks by name, with `--yes` in non-interactive contexts. "Look at this error" means read it, not archive it.
- Use `--output json` with `jq`, and `--fields a,b,c` to trim responses. Keep `--page-size` at 10 to 20 for triage.
- Use `--since 1h` for "now" questions, `--since 24h` otherwise. `--since` accepts `s`, `m`, `h`, and lowercase `Nd` (no `1w`, no `7d2h`). Absolute windows go via `--from` / `--to` (RFC3339).
- Errors arrive as `{"error":"<stable_id>","message":"...","hint":"...","exit_code":N}` on stderr; branch on the `error` field.
- Do not run `traceway login` yourself; switch to the Login flow and let the user enter credentials.

Users paste dashboard URLs (`https://<instance>/<route>`) as references in any flow. Resolve by route family:
| URL path | Identifies | How to fetch it |
|---|---|---|
| `/issues/<hash>` and `/issues/<hash>/events` | Exception group (hash = 16 hex chars) | `traceway exceptions show <hash>` |
| `/issues/<hash>/<occurrenceId>` (UUID) | One occurrence within the group | `traceway exceptions occurrence <occurrenceId> --recorded-at <t>` where `t` is the URL's `?t=` param. Direct and fast; also returns the occurrence's `sessionId` and session recording. No URL? Get `recordedAt` from the `traceway exceptions show <hash>` occurrences |
| `/endpoints/<endpoint>` | Endpoint group; the segment is the URL-encoded endpoint name (`GET%20%2Fapi%2Fusers%2F%3Aid` is `GET /api/users/:id`) | Decode it, then `traceway endpoints list --search "<decoded name>"` (the group has no id; `endpoints show` is for one request, see the next row) |
| `/endpoints/<endpoint>/<endpointId>` | One request (transaction) of that endpoint | `traceway endpoints show <endpointId> --recorded-at <t>` (`t` = the URL's `?t=` param). Returns the request, its span waterfall, and any linked exception/messages |
| `/tasks/<task>` | Background task group | No CLI for the group; for one run use the next row |
| `/tasks/<task>/<taskId>` | Single task run | `traceway tasks show <taskId> --recorded-at <t>` (`t` = the URL's `?t=` param) |
| `/sessions/<sessionId>` | Session (the exceptions that fired during it; replay stays dashboard-only) | `traceway sessions show <sessionId> --started-at <t>`. The URL has no `?t=`; use the session's start, the URL's `from=`, or a linked occurrence's `recordedAt` (it falls inside the window). Occurrences reference sessions via their `sessionId` |
| `/ai-traces/<traceName>` | AI trace group | No CLI for the group; for one trace use the next row |
| `/ai-traces/<traceName>/<traceId>` | Single AI trace | `traceway ai-traces show <traceId> --recorded-at <t>` (`t` = the URL's `?t=` param); returns token/cost stats + the conversation |
| `/logs` | Logs page (its filters are not stored in the URL) | `traceway logs query` with flags taken from the user's description |
| `/issues`, `/endpoints`, `/metrics`, `/` | List and dashboard pages | The matching `list` / `query` command |
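The endpoint-group row requires URL-decoding the path segment before it is passed to `--search`. A minimal sketch of that step in portable shell, assuming the segment only contains ASCII escapes; `urldecode` is a hypothetical helper, not part of the `traceway` CLI:

```shell
# urldecode: hypothetical helper (not a traceway subcommand).
# Turns %XX escapes back into characters; intended for ASCII endpoint names.
urldecode() {
  printf '%s\n' "$1" | awk '
    BEGIN { hex = "0123456789ABCDEF" }
    {
      out = ""
      # Consume the line left to right, one %XX escape at a time.
      while (match($0, /%[0-9A-Fa-f][0-9A-Fa-f]/)) {
        h = toupper(substr($0, RSTART + 1, 2))
        code = (index(hex, substr(h, 1, 1)) - 1) * 16 + (index(hex, substr(h, 2, 1)) - 1)
        out = out substr($0, 1, RSTART - 1) sprintf("%c", code)
        $0 = substr($0, RSTART + RLENGTH)
      }
      print out $0
    }'
}

# The path segment after /endpoints/ in a pasted dashboard URL.
segment='GET%20%2Fapi%2Fusers%2F%3Aid'
name=$(urldecode "$segment")
echo "$name"
# Next step: traceway endpoints list --search "$name"
```

Decoded text is appended to the output and never rescanned, so an encoded percent sign (`%25`) cannot trigger a second round of decoding.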
Time window: most dashboard URLs carry ?preset=<p> or ?from=<iso>&to=<iso> (sticky across pages); honor them instead of the default window.
- `preset` values `5m 30m 60m 3h 6h 12h 24h 3d 7d` map directly to `--since`; the CLI has no month unit, so map `1M` to `--since 30d` and `3M` to `--since 90d`.
- `from`/`to` are ISO timestamps; pass them via `--from`/`--to`, appending `Z` (or the correct offset) when missing, since the CLI requires RFC3339.
- With neither present, fall back to the default `--since` per the ground rules.

`preset`/`from`/`to` set the window for list/group views. Detail URLs additionally carry `?t=<iso>`, the single record's timestamp, URL-encoded. That `t` value is exactly what the by-id commands need as `--recorded-at` (or `--started-at` for sessions). See "Fast by-id lookups" next.
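The window mapping can be sketched as two small shell functions. Both names (`preset_to_since`, `ensure_rfc3339`) are hypothetical helpers introduced for illustration, and the second one assumes a timestamp without a zone designator is meant as UTC:

```shell
# preset_to_since: hypothetical helper mapping a dashboard ?preset= value
# to a --since argument.
preset_to_since() {
  case "$1" in
    5m|30m|60m|3h|6h|12h|24h|3d|7d) printf '%s\n' "$1" ;;  # passes through unchanged
    1M) printf '30d\n' ;;   # the CLI has no month unit
    3M) printf '90d\n' ;;
    *)  return 1 ;;         # unknown preset: let the caller pick a default window
  esac
}

# ensure_rfc3339: append Z when an ISO timestamp has no zone designator.
# Assumption: a zone-less value is UTC; use the real offset if it is not.
ensure_rfc3339() {
  case "$1" in
    *Z|*[+-][0-9][0-9]:[0-9][0-9]) printf '%s\n' "$1" ;;
    *) printf '%sZ\n' "$1" ;;
  esac
}

since=$(preset_to_since 1M)
from=$(ensure_rfc3339 2026-06-11T13:00:00)
echo "--since $since"
echo "--from $from"
```

A non-zero return for an unknown preset lets the caller fall back to the default window instead of passing an unsupported unit to the CLI.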
The by-id detail commands — exceptions occurrence, endpoints show, tasks show, ai-traces show, sessions show, traces show — require the record's timestamp (--recorded-at, or --started-at for sessions). Telemetry tables are partitioned by day: with the timestamp the lookup is bounded to a small window and ClickHouse prunes to a few partitions; without it the server scans every partition (slow cold load). The flag is mandatory for exactly this reason — never omit it. It can be approximate (within ±24h), and you can recover or estimate it when it isn't handed to you; see "When you don't have the timestamp" below.
Where the timestamp comes from, in order of preference:
- Dashboard URL: the `?t=<iso>` param is the record's `recordedAt`; URL-decode it and pass it verbatim. (Sessions have no `t`; use the session start, `from=`, or a linked occurrence's `recordedAt`.)
- Exception group: every `exceptions show` occurrence carries `recordedAt`. Capture the id and its `recordedAt` together, then drill in.

Query order when you hold an id: resolve its `recordedAt` first (URL, group, or notification), then call the by-id command with it.
The flag is required, so you must supply something — but it can be approximate. The lookup window is ±24h around what you pass (±48h for traces show), and if the record isn't in that window the server falls back to an unbounded scan. So a timestamp within a day of the truth stays fast; a wrong guess still returns the right record, just slower. Resolve it in this order:
- When you know the parent group (for example `/issues/<hash>/<occurrenceId>` pasted without `?t=`), run `traceway exceptions show <hash>` and read that occurrence's `recordedAt`; the hash endpoint needs no timestamp. A group's `firstSeen`/`lastSeen` from `exceptions list` bound when its occurrences happened (`lastSeen` ≈ the most recent one).
- Otherwise estimate: the group's `firstSeen`/`lastSeen` or the URL's `preset`/`from` window put you inside ±24h, which is good enough for a fast lookup.

Traceway issue notifications (email / Slack / webhook) embed everything for a direct, fast lookup. The body contains:
- `Hash: <16-hex>`: the exception group → `traceway exceptions show <hash>`.
- `Exception ID: <uuid>`: the specific occurrence.
- `Occurred at: 2006-01-02 15:04:05 UTC`: the occurrence timestamp. Convert to RFC3339: replace the space with `T` and ` UTC` with `Z` (→ `2006-01-02T15:04:05Z`).
- `View details: /issues/<hash>`: the deep link.

So from a notification, go straight to the occurrence (fast), then pivot reusing the same timestamp:
traceway exceptions occurrence <Exception ID> --recorded-at <Occurred at → RFC3339> --output json
# the result carries distributedTraceId and sessionId → traces show / sessions show below
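The `Occurred at` conversion is mechanical, so it can be sketched in shell. `occurred_to_rfc3339` is a hypothetical helper, the sample body is made-up data, and the parsing assumes each field sits on its own line exactly as labelled above:

```shell
# occurred_to_rfc3339: hypothetical helper for the "Occurred at" value.
# "2006-01-02 15:04:05 UTC" -> "2006-01-02T15:04:05Z"
occurred_to_rfc3339() {
  printf '%s\n' "$1" | sed -e 's/ UTC$/Z/' -e 's/ /T/'
}

# Made-up notification body, for illustration only.
body='Hash: 0123456789abcdef
Exception ID: 11111111-2222-3333-4444-555555555555
Occurred at: 2006-01-02 15:04:05 UTC
View details: /issues/0123456789abcdef'

# Pull the two fields the by-id lookup needs.
occ_id=$(printf '%s\n' "$body" | sed -n 's/^Exception ID: //p')
ts=$(occurred_to_rfc3339 "$(printf '%s\n' "$body" | sed -n 's/^Occurred at: //p')")

# Print the command rather than running it, so the sketch works without the CLI.
echo "traceway exceptions occurrence $occ_id --recorded-at $ts --output json"
```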
traceway version
If it prints a version, skip to authentication.
Prebuilt binaries are on the tracewayapp/traceway releases page under cli/vX.Y.Z tags (the latest release may be a Backend release, so filter for CLI tags):
OS=$(uname -s | tr '[:upper:]' '[:lower:]')
ARCH=$(uname -m); [ "$ARCH" = "aarch64" ] && ARCH=arm64
URL=$(curl -s "https://api.github.com/repos/tracewayapp/traceway/releases?per_page=20" \
| grep -o "https://[^\"]*traceway_[^\"]*_${OS}_${ARCH}\.tar\.gz" | head -1)
TMP=$(mktemp -d)
curl -sL "$URL" | tar -xz -C "$TMP"
mkdir -p ~/.local/bin   # install does not create the target directory
install -m 755 "$TMP/traceway" ~/.local/bin/traceway && rm -rf "$TMP"
Make sure ~/.local/bin is on PATH (or install to /usr/local/bin). Fallback, build from source (requires Go):
git clone https://github.com/tracewayapp/traceway && cd traceway/cli
go build -o bin/traceway ./cmd/traceway && install -m 755 bin/traceway ~/.local/bin/traceway
Verify with traceway version.
Login prompts for the password interactively, so ask the user to run it themselves (in Claude Code, suggest typing ! traceway login --url https://<instance> so the output lands in the session):
traceway login --url https://<traceway-instance>
Non-interactive alternative when the password is in a secret store (never echo a password into the command line or shell history):
printf '%s' "$TRACEWAY_PASSWORD" | traceway login --url https://<instance> --username <email> --password-stdin
Multiple instances or accounts coexist via profiles: traceway login --url ... --profile work, then traceway profiles list / traceway profiles use work.
traceway projects list
traceway projects use <project-id>
traceway exceptions list --since 24h
The selected project is used implicitly by all subsequent commands.
The Debug flow is invoked as `/traceway debug issue X` or `/traceway debug <free-form bug description>`.
X can be several things; resolve it to an exception hash (16 hex chars):
| Reference looks like | How to resolve |
|---|---|
| Dashboard URL | See "Resolving Dashboard URLs" above; for /issues/... URLs the path segment right after /issues/ is the hash, and ?preset/?from/?to give the time window |
| Bare 16-char hex string | Already the hash |
| Anything else (title, error message, type, file name) | Search: traceway exceptions list --since 7d --search "<text>"; widen to --since 30d (and --include-archived) if empty |
| No issue reference, just a bug description | Skip to triage below |
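The first two rows of this table are purely syntactic, so they can be checked before falling back to a search. A minimal sketch, assuming `ref_to_hash` as a hypothetical helper name; it returns non-zero when the reference needs `traceway exceptions list --search` instead:

```shell
# ref_to_hash: hypothetical helper. Prints the 16-hex exception hash for a bare
# hash or an /issues/<hash>... URL; returns 1 when the reference needs a search.
ref_to_hash() {
  ref=$1
  case "$ref" in
    */issues/*)
      ref=${ref#*/issues/}   # drop everything up to and including /issues/
      ref=${ref%%[/?#]*}     # keep only the first path segment
      ;;
  esac
  # Accept exactly 16 hex characters, nothing else.
  if printf '%s\n' "$ref" | grep -Eq '^[0-9a-fA-F]{16}$'; then
    printf '%s\n' "$ref"
  else
    return 1
  fi
}

# Made-up URL and hash, for illustration only.
if hash=$(ref_to_hash 'https://traceway.example/issues/0123456789abcdef/events?preset=24h'); then
  echo "traceway exceptions show $hash"
else
  echo "no hash in reference; search by text instead"
fi
```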
When a search returns multiple groups, show a shortlist (hash, count, lastSeen, first stack line) and ask the user which one before drilling in.
traceway exceptions list --since 7d --search "checkout" --output json \
| jq '.data[]? | {hash: .exceptionHash, count, lastSeen, top: (.stackTrace | split("\n")[0])}'
traceway exceptions show <hash>
This is the high-value call: full stack trace, occurrence list with recordedAt, attributes (user IDs, app versions, request context), and optional distributedTraceId / sessionId per occurrence. firstSeen correlates with deploys: a group that first appeared right after a release points at that release's diff. A bogus hash exits 5 with not_found; fall back to search.
From the description extract symptom, affected endpoint/feature, and time window, then read several signals before forming a hypothesis:
traceway exceptions list --since 24h --order-by lastSeen # what is erroring (firstSeen for regressions, count for volume)
traceway logs query --since 24h --min-severity 17 # errors and worse
traceway logs query --since 24h --search "payment declined" # search log bodies
traceway logs query --since 24h --service checkout-api --min-severity 13
traceway endpoints list --since 24h --search "checkout" # latency p50/p95/p99 and error counts, --order-by impact|count|p95|lastSeen
Severity is an OTel number, not a name: 1 TRACE, 5 DEBUG, 9 INFO, 13 WARN, 17 ERROR, 21 FATAL. The flag is --min-severity 17, never --severity error.
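When a user names a level ("errors", "warnings"), translate it to the number before building the command. A small sketch of that lookup; `severity_number` is a hypothetical helper, and the values are the ones listed above:

```shell
# severity_number: hypothetical helper mapping a level name to the OTel
# severity number that --min-severity expects. Case-insensitive.
severity_number() {
  case "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')" in
    TRACE) echo 1 ;;
    DEBUG) echo 5 ;;
    INFO)  echo 9 ;;
    WARN)  echo 13 ;;
    ERROR) echo 17 ;;
    FATAL) echo 21 ;;
    *)     return 1 ;;   # unknown name: ask instead of guessing
  esac
}

# Print the resulting command rather than running it.
echo "traceway logs query --since 24h --min-severity $(severity_number error)"
```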
Correlate by trace: when an occurrence or log line carries a trace ID, pull the whole request timeline; this is usually the fastest route to a root cause:
traceway exceptions show "$HASH" --output json | jq -r '.occurrences[0].distributedTraceId // empty' \
| xargs -I{} traceway logs query --trace-id {} --output json
Pull the whole cross-service trace and the user's session, reusing the occurrence's recordedAt as the (mandatory) time hint so both lookups stay partition-bounded:
OCC=$(traceway exceptions show $HASH --output json | jq -c '.occurrences[0]')
TS=$(jq -r '.recordedAt' <<<"$OCC")
DT=$(jq -r '.distributedTraceId // empty' <<<"$OCC")
SID=$(jq -r '.sessionId // empty' <<<"$OCC")
[ -n "$DT" ] && traceway traces show "$DT" --recorded-at "$TS" # every endpoint/task/ai-trace/exception node across services
[ -n "$SID" ] && traceway sessions show "$SID" --started-at "$TS" # the session + the exceptions that fired in it
traces show is usually the single highest-value RCA call: it stitches one logical request together end to end across services.
Check metrics for systemic causes (spikes lining up with firstSeen suggest saturation rather than a code bug):
traceway metrics query --name system.cpu.utilization --aggregation max --since 24h
traceway metrics query --name <name> --aggregation avg|sum|count|min|max [--tag key=value] [--group-by <tag>]
The CLI also accepts p50|p95|p99, but the server has no quantile aggregation for metric points and silently computes avg for them — never present those as percentiles. Latency percentiles come from traceway endpoints list, computed from raw request durations. There is no metrics list; a bogus name returns an empty series: {} cleanly, so probing names is safe. Host metrics from the Traceway OTel Agent live under system.* names, and OTLP histogram metrics are stored as two series, <name>.avg and <name>.count.
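Because the percentile aggregations are accepted but silently computed as `avg`, a guard in front of `traceway metrics query` can prevent mislabelled numbers. A sketch under that assumption; `check_aggregation` is a hypothetical helper, not a CLI feature:

```shell
# check_aggregation: hypothetical guard to run before `traceway metrics query`.
# Returns 0 only for aggregations the server really computes for metric points.
check_aggregation() {
  case "$1" in
    avg|sum|count|min|max) return 0 ;;
    p50|p95|p99)
      echo "aggregation '$1' is computed as avg for metric points; use 'traceway endpoints list' for latency percentiles" >&2
      return 1 ;;
    *)
      echo "unknown aggregation '$1'" >&2
      return 1 ;;
  esac
}

agg=max
if check_aggregation "$agg"; then
  echo "traceway metrics query --name system.cpu.utilization --aggregation $agg --since 24h"
fi
```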
Check what changed around `firstSeen`: `git log --since "<firstSeen>" --until "<firstSeen + 1h>"` or the deploy history.

Summarize: symptom, evidence (exception hashes, log excerpts, metric anomalies), root cause, fix. Include `traceway exceptions show <hash>` references so the user can verify. After a fix is deployed and verified, archive only when the user asks:
traceway exceptions archive <hash> --yes
For free-form requests ("what's broken in prod?", "is /api/checkout slow?", "show errors for service X"), use the read commands directly.
| Command | Purpose |
|---|---|
| `traceway projects {list,use}` | List or select the active project |
| `traceway exceptions list` | Grouped exceptions; `--search`, `--search-type text\|regex`, `--order-by lastSeen\|firstSeen\|count`, `--include-archived` |
| `traceway exceptions show <hash>` | One group: full stack trace + occurrences |
| `traceway exceptions occurrence <id> --recorded-at <t>` | One occurrence by id (fast): full detail + sessionId + recording |
| `traceway exceptions archive/unarchive <hash>...` | Mutating; explicit user request + `--yes` only |
| `traceway logs query` | Logs; `--search` (`--search-type body\|attribute`), `--service`, `--min-severity <n>`, `--trace-id` |
| `traceway endpoints list` | Per-endpoint p50/p95/p99 and counts; `--search`, `--order-by impact\|count\|p95\|lastSeen` |
| `traceway endpoints show <id> --recorded-at <t>` | One request by id: span waterfall + linked errors |
| `traceway tasks show <id> --recorded-at <t>` | One background task run by id |
| `traceway ai-traces show <id> --recorded-at <t>` | One AI trace by id + its conversation |
| `traceway sessions show <id> --started-at <t>` | One session by id + the exceptions that fired in it |
| `traceway traces show <id> --recorded-at <t>` | Distributed trace: every service node sharing the id |
| `traceway metrics query --name <metric>` | Time series; `--aggregation`, `--tag`, `--group-by`, `--interval-minutes` |
| `traceway profiles {list,use}`, `login`, `logout`, `version` | Profile and session management |
The by-id show/occurrence commands take their id from a dashboard URL, a notification, or an exceptions show occurrence — and require the record's timestamp (--recorded-at / --started-at); see "Fast by-id lookups" above.
Not implemented yet (do not fabricate flags; point the user at the web UI): list verbs for tasks / sessions / ai-traces / traces (only by-id show exists for those), and metrics list/discover.
# What's broken right now
traceway exceptions list --since 1h --order-by lastSeen --page-size 10 --output json \
| jq '.data[]? | {hash: .exceptionHash, count, lastSeen}'
# Did anything NEW break since a deploy at 13:00 UTC
traceway exceptions list --from 2026-06-11T13:00:00Z --to "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
--order-by firstSeen --output json \
| jq '.data[]? | select(.firstSeen >= "2026-06-11T13:00:00Z") | {hash: .exceptionHash, firstSeen, count}'
# Worst endpoint by latency
traceway endpoints list --since 1h --order-by p95 --page-size 1 --output json | jq '.data[0]'
# Errors for one service (exceptions --search is free text, not a service filter; use logs)
traceway logs query --service checkout-api --min-severity 17 --since 1h --output json \
| jq '.data[]? | {timestamp, body, traceId}'
Empty results (data: null or data: []) are not errors: widen the window, re-check the active project (traceway projects list), and if the app was never connected to Traceway, set it up first (the traceway-setup skill).