Logging, tracing, metrics fundamentals
Logs: discrete events that happened in the system.
Metrics: numeric measurements over time.
Traces: request flow across services.
// CORRECT - Structured, searchable
logger.info("Order processed", Map.of(
    "orderId", orderId,
    "appId", appId,
    "amount", amount,
    "duration_ms", duration
));
// WRONG - Unstructured, hard to search
logger.info("Processed order " + orderId + " for app " + appId);
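To illustrate why the structured form is easier to search, the sketch below renders an event as a single JSON line using only the standard library. `StructuredLogLine` and its `format` method are hypothetical names for illustration, not part of any logging library, and the escaping covers only quotes, backslashes, and newlines; a real service would more likely rely on its logging framework's JSON encoder.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not a library API): renders one log event as a JSON line.
final class StructuredLogLine {

    // Emits level and message first, then each field in the map's iteration order.
    static String format(String level, String message, Map<String, Object> fields) {
        StringBuilder sb = new StringBuilder("{");
        appendField(sb, "level", level);
        sb.append(',');
        appendField(sb, "message", message);
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            sb.append(',');
            appendField(sb, e.getKey(), e.getValue());
        }
        return sb.append('}').toString();
    }

    // Numbers and booleans are written bare; everything else becomes a quoted string.
    private static void appendField(StringBuilder sb, String key, Object value) {
        sb.append('"').append(escape(key)).append("\":");
        if (value instanceof Number || value instanceof Boolean) {
            sb.append(value);
        } else {
            sb.append('"').append(escape(String.valueOf(value))).append('"');
        }
    }

    // Simplified escaping: backslashes, double quotes, and newlines only.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        Map<String, Object> fields = new LinkedHashMap<>(); // keeps insertion order
        fields.put("orderId", "ord-42");
        fields.put("appId", "app-7");
        fields.put("duration_ms", 125);
        System.out.println(format("INFO", "Order processed", fields));
    }
}
```

Because every field is a separate key, a log backend can filter on `orderId` or aggregate `duration_ms` without parsing free text.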
| Level | Use For |
|---|---|
| ERROR | Failures requiring attention |
| WARN | Unexpected but handled conditions |
| INFO | Business events, state changes |
| DEBUG | Development troubleshooting |
| TRACE | Detailed execution flow |
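try {
    processOrder(orderId);
} catch (OrderProcessingException e) {
    // Log identifying context with the failure so the event is searchable
    logger.error("Failed to process order", Map.of(
        "orderId", orderId,
        "appId", appId,
        // Map.of rejects null values, so guard against a null exception message
        "error", String.valueOf(e.getMessage()),
        "errorType", e.getClass().getSimpleName()
    ));
    // Rethrow with the original exception as the cause
    throw new ServiceException("Order processing failed", e);
}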
{
  "error": {
    "code": "ORDER_NOT_FOUND",
    "message": "Order with ID 123 not found",
    "request_id": "abc-123",
    "timestamp": "2024-01-01T00:00:00Z"
  }
}
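A response body of this shape could be assembled roughly as follows. This is a minimal sketch: `ErrorResponse` and `toJson` are hypothetical names, the escaping is deliberately simplified, and a real service would more likely serialize a typed object with its JSON library.

```java
import java.time.Instant;

// Hypothetical helper (not a library API): builds the error envelope shown above.
final class ErrorResponse {

    static String toJson(String code, String message, String requestId, Instant timestamp) {
        // Instant.toString() produces an ISO-8601 UTC timestamp
        return "{\"error\":{"
            + "\"code\":\"" + escape(code) + "\","
            + "\"message\":\"" + escape(message) + "\","
            + "\"request_id\":\"" + escape(requestId) + "\","
            + "\"timestamp\":\"" + timestamp + "\""
            + "}}";
    }

    // Simplified escaping: backslashes and double quotes only.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        System.out.println(toJson(
            "ORDER_NOT_FOUND", "Order with ID 123 not found", "abc-123", Instant.now()));
    }
}
```

Returning the same `request_id` that appears in the logs lets a support engineer go from a user report straight to the matching log entries.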
| Category | Metrics |
|---|---|
| Latency | p50, p95, p99 response times |
| Traffic | Requests per second |
| Errors | Error rate, error count by type |
| Saturation | CPU, memory, connections |
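To make the latency row concrete, the sketch below computes p50, p95, and p99 from raw samples using the nearest-rank method. `LatencyPercentiles` is a hypothetical name and the sample values are invented for illustration; production systems usually derive percentiles from histograms in the metrics backend rather than from raw arrays.

```java
import java.util.Arrays;

// Hypothetical helper: nearest-rank percentiles over raw latency samples.
final class LatencyPercentiles {

    // Returns the smallest sample such that at least p percent of samples are <= it.
    static double percentile(double[] samples, double p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        double[] sorted = samples.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0); // 1-based rank
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Invented response times in milliseconds, including one slow outlier
        double[] ms = {120, 80, 95, 300, 110, 105, 90, 100, 85, 1000};
        System.out.println("p50=" + percentile(ms, 50));
        System.out.println("p95=" + percentile(ms, 95));
        System.out.println("p99=" + percentile(ms, 99));
    }
}
```

With these samples the median is 100 ms while p95 and p99 are both 1000 ms, which shows why tail percentiles surface outliers that an average or median would hide.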
# Pattern: [namespace]_[subsystem]_[metric]_[unit]
violet_orders_processed_total
violet_orders_processing_duration_seconds
violet_api_requests_total
violet_api_errors_total
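One way to keep names consistent with this pattern is to build them through a small helper that validates the result. The sketch below is an assumption-laden illustration: `MetricNames` is a hypothetical class, and the lowercase snake_case rule is inferred from the examples above rather than taken from any specification.

```java
import java.util.regex.Pattern;

// Hypothetical helper: builds metric names as namespace_subsystem_metric_unit.
final class MetricNames {

    // Assumed rule: lowercase snake_case, starting with a letter, no empty segments.
    private static final Pattern VALID = Pattern.compile("[a-z][a-z0-9]*(_[a-z0-9]+)*");

    static String build(String namespace, String subsystem, String metric, String unit) {
        String name = String.join("_", namespace, subsystem, metric, unit);
        if (!VALID.matcher(name).matches()) {
            throw new IllegalArgumentException("invalid metric name: " + name);
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(build("violet", "orders", "processed", "total"));
        System.out.println(build("violet", "orders", "processing_duration", "seconds"));
    }
}
```

An empty segment or an uppercase letter makes `build` throw, so a malformed name fails at startup instead of appearing on a dashboard later.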
// Pass correlation ID through request chain
String correlationId = request.getHeader("X-Correlation-ID");
if (correlationId == null) {
    correlationId = UUID.randomUUID().toString();
}
MDC.put("correlationId", correlationId);
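The reuse-or-generate step can be isolated from the web framework so it is easy to test. In the sketch below, `CorrelationIds`, `resolve`, and `outboundHeaders` are hypothetical names, and headers are modeled as a plain `Map` with case-sensitive keys, which is a simplification because real HTTP header names are case-insensitive.

```java
import java.util.Map;
import java.util.UUID;

// Hypothetical helper: reuse the caller's correlation ID or mint a new one.
final class CorrelationIds {

    static final String HEADER = "X-Correlation-ID";

    // Returns the incoming ID if present and non-blank, otherwise a fresh random UUID.
    static String resolve(Map<String, String> headers) {
        String incoming = headers.get(HEADER);
        if (incoming == null || incoming.trim().isEmpty()) {
            return UUID.randomUUID().toString();
        }
        return incoming.trim();
    }

    // Headers to attach to an outbound call so the next service sees the same ID.
    static Map<String, String> outboundHeaders(String correlationId) {
        return Map.of(HEADER, correlationId);
    }

    public static void main(String[] args) {
        String id = resolve(Map.of(HEADER, "abc-123"));
        System.out.println(id);
        System.out.println(outboundHeaders(id));
    }
}
```

Forwarding the resolved ID on every downstream call is what lets log lines from different services be joined on a single value.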
Span span = tracer.spanBuilder("processOrder")
    .setAttribute("orderId", orderId)
    .setAttribute("appId", appId)
    .startSpan();
try (Scope scope = span.makeCurrent()) {
    // Processing logic
} finally {
    span.end();
}
# GOOD - User-facing symptom
alert: HighErrorRate
expr: error_rate > 0.05 # 5% errors
# AVOID - Internal cause
alert: DatabaseConnectionPoolExhausted
expr: db_connections >= max_connections
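For illustration, the symptom-based rule reduces to a ratio compared against a threshold. The sketch below assumes `error_rate` means failed requests divided by total requests over some window; `ErrorRateAlert` and its methods are hypothetical names, and in practice this evaluation happens in the monitoring system rather than in application code.

```java
// Hypothetical helper: evaluates an error-rate alert condition for one time window.
final class ErrorRateAlert {

    // Fraction of requests that failed; an idle window counts as a rate of 0.
    static double errorRate(long errors, long total) {
        return total == 0 ? 0.0 : (double) errors / total;
    }

    // Fires only when the rate is strictly above the threshold, matching "error_rate > 0.05".
    static boolean shouldFire(long errors, long total, double threshold) {
        return errorRate(errors, total) > threshold;
    }

    public static void main(String[] args) {
        System.out.println(shouldFire(6, 100, 0.05)); // 6% of requests failed
        System.out.println(shouldFire(5, 100, 0.05)); // exactly 5%
    }
}
```

The condition measures what users experience regardless of the underlying cause, so it keeps working when the failure comes from something other than the connection pool.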
| Severity | Response | Examples |
|---|---|---|
| Critical | Immediate | Service down, data loss |
| High | Within hours | Elevated errors, degraded |
| Medium | Next business day | Warnings, approaching limits |
| Low | When convenient | Informational |
// Liveness: Is the process running?
@GetMapping("/health/live")
public ResponseEntity<?> liveness() {
    return ResponseEntity.ok().build();
}
// Readiness: Can it accept traffic?
@GetMapping("/health/ready")
public ResponseEntity<?> readiness() {
    if (databaseHealthy && cacheHealthy) {
        return ResponseEntity.ok().build();
    }
    return ResponseEntity.status(503).build();
}
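The readiness decision can be kept separate from the controller so that new dependencies are added in one place. The following is a framework-free sketch: `ReadinessChecks` is a hypothetical class, and treating a check that throws as a failure is a design assumption, not something the endpoint above specifies.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical helper: aggregates named dependency checks into one readiness status.
final class ReadinessChecks {

    private final Map<String, BooleanSupplier> checks = new LinkedHashMap<>();

    // Registers a named check; returns this so registrations can be chained.
    ReadinessChecks register(String name, BooleanSupplier check) {
        checks.put(name, check);
        return this;
    }

    // 200 when every check passes; 503 as soon as one fails or throws.
    int statusCode() {
        for (BooleanSupplier check : checks.values()) {
            try {
                if (!check.getAsBoolean()) {
                    return 503;
                }
            } catch (RuntimeException e) {
                return 503; // assumption: an erroring check means "not ready"
            }
        }
        return 200;
    }

    public static void main(String[] args) {
        ReadinessChecks readiness = new ReadinessChecks()
            .register("database", () -> true)
            .register("cache", () -> false);
        System.out.println(readiness.statusCode());
    }
}
```

A readiness endpoint could return `statusCode()` directly, while the liveness endpoint stays unconditional so that a failing dependency does not cause the process itself to be restarted.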