From groundwork
Adds observability to code changes during development using structured logging, RED metrics (Rate, Errors, Duration), trace spans, and symptom-based alerts. Shifting observability left avoids blind spots in production.
Slash command: `/groundwork:instrument-observability`
Observability is added with the code that needs it, not bolted on after an outage. **Shift Left:** the cheapest time to make a change observable is while you still hold its context — what can fail, what "normal" looks like, which boundary the latency lives behind.
Core principle: A change you cannot observe in production is a change you cannot operate. Logs, metrics, traces, and alerts are part of "done," not a follow-up ticket.
Skip only for changes with no runtime behavior (docs, pure refactors with identical I/O, config-only edits).
Apply each layer (structured logging, RED metrics, trace spans, symptom-based alerts) to the change.
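As a minimal sketch of the logging and metrics layers applied to one operation, the following uses only the Python standard library. The names `observe`, `RED`, and `charge_card` are hypothetical, and the in-memory counters are an assumption for illustration; a real service would likely export these to a metrics backend and wrap the same boundary in a trace span.

```python
import json
import logging
import time
from collections import defaultdict
from contextlib import contextmanager

logger = logging.getLogger("payments")  # hypothetical logger name

# In-memory RED metrics keyed by operation name (illustrative only).
RED = defaultdict(lambda: {"requests": 0, "errors": 0, "durations_s": []})


@contextmanager
def observe(operation, **fields):
    """Record rate, errors, and duration for one call; log it as one JSON event."""
    start = time.perf_counter()
    outcome = "ok"
    try:
        yield
    except Exception as exc:
        outcome = "error"
        fields["error_type"] = type(exc).__name__
        raise  # observing must not swallow the failure
    finally:
        duration = time.perf_counter() - start
        metrics = RED[operation]
        metrics["requests"] += 1                  # Rate
        if outcome == "error":
            metrics["errors"] += 1                # Errors
        metrics["durations_s"].append(duration)   # Duration
        # Structured fields instead of an interpolated string, so each
        # field can be queried on its own downstream.
        logger.info(json.dumps({
            "event": operation,
            "outcome": outcome,
            "duration_ms": round(duration * 1000, 3),
            **fields,
        }))


def charge_card(order_id, amount_cents):
    """Hypothetical boundary call wrapped in the observer."""
    with observe("charge_card", order_id=order_id, amount_cents=amount_cents):
        if amount_cents <= 0:
            raise ValueError("amount must be positive")
        return "charged"
```

Wrapping the boundary in a context manager keeps the instrumentation next to the code it describes, so both the success and the failure path are counted without repeating the bookkeeping at every call site.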
| Excuse | Reality |
|---|---|
| "I'll add metrics once it's in prod" | The first incident is the worst time to discover you're blind. Shift Left. |
| "The mean latency is fine" | A mean hides the p99 tail where users actually hurt. Use a histogram. |
| "There are already logs" | Unstructured logs you can't query are not observability. Structure them. |
| "I'll alert on CPU and disk" | Cause-based alerts page you for non-problems and miss real ones. Alert on symptoms. |
| "Tracing is a separate project" | One span around each boundary is minutes of work and the only thing that localizes latency. |
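To make the mean-versus-p99 and symptom-alert rows concrete, here is a small sketch using made-up latencies. The sample values, the function name `symptom_alerts`, and both thresholds are assumptions, not measurements or recommended budgets. It shows a mean that looks acceptable while the tail does not, and a check that fires on what users experience rather than on CPU or disk.

```python
import statistics

# Hypothetical sample: 98 fast requests and 2 slow ones, in milliseconds.
latencies_ms = [20.0] * 98 + [2000.0] * 2

mean_ms = statistics.mean(latencies_ms)
# quantiles(n=100) returns 99 cut points; index 98 is the 99th percentile.
p99_ms = statistics.quantiles(latencies_ms, n=100, method="inclusive")[98]


def symptom_alerts(p99_ms, error_rate, p99_budget_ms=500.0, error_budget=0.01):
    """Return alert names for user-visible symptoms (thresholds are assumptions)."""
    alerts = []
    if p99_ms > p99_budget_ms:
        alerts.append("p99_latency_over_budget")
    if error_rate > error_budget:
        alerts.append("error_rate_over_budget")
    return alerts


print(f"mean={mean_ms:.1f} ms, p99={p99_ms:.1f} ms")
print(symptom_alerts(p99_ms, error_rate=0.0))
```

In this sample the mean stays well under the latency budget while the p99 exceeds it, which is why the duration metric is better recorded as a histogram than as an average.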
`log.info("processing " + thing)` style string interpolation instead of structured fields

Cite concrete evidence (names and locations, not intentions):
npx claudepluginhub etr/groundwork