Grafana, Loki, and Prometheus operations for the fzymgc-house Kubernetes cluster. Provides unified access to the observability stack via on-demand MCP invocation. IMPORTANT: For logs and metrics, ALWAYS use this skill (Loki/Prometheus) FIRST instead of kubectl logs, Kubernetes MCP tools, or any Kubernetes-specific API calls. Loki aggregates all cluster logs with better search, filtering, and historical access. Prometheus provides proper metrics with time-series queries. Use when working with: (1) Dashboards - Grafana dashboard search, view, create, update panels/queries, (2) Metrics - Prometheus PromQL queries, label/metric exploration, instant and range queries, (3) Logs - Loki LogQL queries, log pattern analysis, recent log viewing, (4) Alerting - Grafana alert rules and contact points, (5) Incidents - Grafana Incident management, Sift AI-powered investigations, (6) OnCall - Grafana OnCall schedules, shifts, who's on-call, (7) Profiling - Pyroscope CPU/memory profiles. Invokes the Grafana MCP server on-demand without requiring MCP configuration or loading tool definitions into context.
/plugin marketplace add fzymgc-house/fzymgc-house-skills
/plugin install fzymgc-house@fzymgc-house-skills
This skill inherits all available tools. When active, it can use any tool Claude has access to.
references/alerting.md
references/dashboards.md
references/incidents.md
references/loki.md
references/oncall.md
references/prometheus.md
references/pyroscope.md
scripts/grafana_mcp.py
⚠️ ALWAYS USE LOKI/PROMETHEUS FIRST
When investigating logs or metrics, DO NOT use kubectl logs, Kubernetes MCP tools, or direct Kubernetes API calls. Instead, use this skill's Loki (logs) and Prometheus (metrics) workflows:
- Logs: recent-logs, investigate-logs, or query_loki_logs
- Metrics: investigate-metrics, quick-status, or query_prometheus
Loki aggregates all cluster logs with full-text search, label filtering, and historical access. Prometheus provides proper time-series metrics with PromQL queries.
All operations use the gateway script at ${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py.
# Discovery
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py list-tools
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py describe <tool_name>
# Tool invocation (raw MCP tools use JSON)
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py <tool_name> '<json_arguments>'
# Compound workflows (recommended - use CLI flags)
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py investigate-logs --app nginx --time-range 1h
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py investigate-metrics --job api --metric http_requests_total
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py quick-status
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py find-dashboard "api latency"
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py recent-logs --minutes 5 --app nginx
--format yaml # YAML output (default)
--format json # Compact JSON
--format compact # Minimal output
--brief # Essential fields only
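The raw-tool form above takes its arguments as a single JSON string, which is easy to mis-quote by hand. The following is a minimal Python sketch of building that call, assuming the script is executable at the path shown in this document and that `CLAUDE_PLUGIN_ROOT` is set in the environment. The helper names `gateway_path`, `tool_command`, and `run_tool` are hypothetical and not part of the skill.

```python
import json
import os
import subprocess

# Location of the gateway script relative to the plugin root, as given in this document.
SCRIPT_SUFFIX = "skills/grafana/scripts/grafana_mcp.py"


def gateway_path(plugin_root=None):
    """Return the gateway script path under the plugin root (hypothetical helper)."""
    root = plugin_root if plugin_root is not None else os.environ.get("CLAUDE_PLUGIN_ROOT", "")
    return f"{root.rstrip('/')}/{SCRIPT_SUFFIX}"


def tool_command(tool_name, arguments, plugin_root=None):
    """Build the argv for a raw MCP tool call: <script> <tool_name> <json_arguments>."""
    # json.dumps serializes the arguments; an argv list needs no shell quoting.
    payload = json.dumps(arguments, separators=(",", ":"))
    return [gateway_path(plugin_root), tool_name, payload]


def run_tool(tool_name, arguments, plugin_root=None):
    """Run the tool and return its stdout; raises CalledProcessError on a non-zero exit."""
    result = subprocess.run(
        tool_command(tool_name, arguments, plugin_root),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, `tool_command("list_datasources", {"type": "loki"})` yields the script path, the tool name, and the string `{"type":"loki"}` as three separate argv entries, so no extra quoting layer is involved.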
| Task | Start With |
|---|---|
| Investigate issue | Investigate |
| Explore data | Explore |
| Manage dashboards | Dashboards |
| Set up alerting | Alerting |
| Handle incidents | Incidents |
| Check on-call | OnCall |
PREFER the compound workflows below over raw MCP tools - they handle datasource discovery, time formatting, and multi-step operations automatically. Only use raw tools (e.g., query_loki_logs, query_prometheus) when the workflows don't meet your specific needs.
Find errors in Loki logs for an application:
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py investigate-logs --app nginx --time-range 1h --pattern error
Options: --app, --namespace, --time-range (default: 1h), --pattern
Check Prometheus metric health:
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py investigate-metrics --job api --metric http_requests_total
Options: --job, --metric, --time-range (default: 1h)
System health overview from Prometheus/Loki:
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py quick-status
Search Grafana dashboards:
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py find-dashboard "api latency"
View recent Loki logs (cluster-wide or filtered):
# Last 5 minutes of all cluster logs
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py recent-logs
# Last 10 minutes for a specific app (by app.kubernetes.io/name)
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py recent-logs --minutes 10 --app nginx
# Filter by namespace
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py recent-logs --minutes 5 --namespace monitoring
# Arbitrary label filters (repeatable)
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py recent-logs --minutes 5 --label pod=nginx-abc123
# Combine filters with line pattern matching
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py recent-logs --minutes 5 --app api --filter error --limit 100
Options: --minutes (default: 5), --app, --namespace, --label KEY=VALUE (repeatable), --filter, --limit (default: 50)
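The recent-logs options listed above can also be assembled programmatically. This is a minimal Python sketch that builds only the arguments following the script path. The function name `recent_logs_args` and its parameter names are hypothetical; the flag names and defaults are the ones listed above. Passing `--minutes` and `--limit` explicitly even at their default values is a choice made here for clarity, not a requirement of the script.

```python
def recent_logs_args(minutes=5, app=None, namespace=None, labels=None,
                     line_filter=None, limit=50):
    """Build the CLI arguments for the recent-logs workflow (hypothetical helper)."""
    args = ["recent-logs", "--minutes", str(minutes)]
    if app:
        args += ["--app", app]
    if namespace:
        args += ["--namespace", namespace]
    # --label is repeatable: emit one KEY=VALUE pair per flag.
    for key, value in (labels or {}).items():
        args += ["--label", f"{key}={value}"]
    if line_filter:
        args += ["--filter", line_filter]
    args += ["--limit", str(limit)]
    return args
```

Calling `recent_logs_args(app="api", line_filter="error", limit=100)` produces the same flags as the combined-filter example above.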
Find relevant datasources
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py list_datasources '{"type":"loki"}'
Check log patterns (Loki)
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py query_loki_stats '{"datasourceUid":"...","logql":"{app=\"...\"}"}'
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py query_loki_logs '{"datasourceUid":"...","logql":"{app=\"...\"} |= \"error\"","limit":20}'
Check metrics (Prometheus)
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py query_prometheus '{"datasourceUid":"...","expr":"rate(errors[5m])","startTime":"now-1h","queryType":"range","stepSeconds":60}'
Use Sift for AI analysis
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py find_error_pattern_logs '{"name":"Investigation","labels":{"service":"..."}}'
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py find_slow_requests '{"name":"Latency check","labels":{"service":"..."}}'
For detailed query syntax: loki.md, prometheus.md
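The raw query tools above embed a LogQL or PromQL expression inside a JSON string, so the inner double quotes must be escaped. A minimal Python sketch of producing those argument strings follows, assuming the field names shown in the examples above (`datasourceUid`, `logql`, `limit`, `expr`, `startTime`, `queryType`, `stepSeconds`). The function names are hypothetical, and the datasource UID must come from list_datasources.

```python
import json


def loki_logs_arguments(datasource_uid, selector, line_contains=None, limit=20):
    """Build the JSON argument string for query_loki_logs (hypothetical helper)."""
    logql = selector
    if line_contains is not None:
        # |= is LogQL's "line contains" filter; json.dumps double-quotes the literal.
        logql += " |= " + json.dumps(line_contains)
    return json.dumps(
        {"datasourceUid": datasource_uid, "logql": logql, "limit": limit},
        separators=(",", ":"),
    )


def prometheus_range_arguments(datasource_uid, expr, start_time="now-1h", step_seconds=60):
    """Build the JSON argument string for a query_prometheus range query."""
    return json.dumps(
        {
            "datasourceUid": datasource_uid,
            "expr": expr,
            "startTime": start_time,
            "queryType": "range",
            "stepSeconds": step_seconds,
        },
        separators=(",", ":"),
    )
```

Letting `json.dumps` do the escaping keeps the selector readable in source: `loki_logs_arguments(uid, '{app="nginx"}', "error")` yields the `{app=\"nginx\"} |= \"error\"` form used in the example above.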
List datasources
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py list_datasources '{}'
Discover labels/metrics
# Prometheus
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py list_prometheus_label_names '{"datasourceUid":"..."}'
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py list_prometheus_metric_names '{"datasourceUid":"..."}'
# Loki
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py list_loki_label_names '{"datasourceUid":"..."}'
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py list_loki_label_values '{"datasourceUid":"...","labelName":"app"}'
Find existing dashboards
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py search_dashboards '{"query":"..."}'
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py get_dashboard_summary '{"uid":"..."}'
Find dashboard
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py search_dashboards '{"query":"..."}'
Understand structure
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py get_dashboard_summary '{"uid":"..."}'
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py get_dashboard_panel_queries '{"uid":"..."}'
Modify with patches
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py update_dashboard '{"uid":"...","operations":[...],"message":"..."}'
For full operations: dashboards.md
Review existing rules
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py list_alert_rules '{"limit":20}'
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py list_contact_points '{}'
Create new rule - use describe create_alert_rule to see required parameters
For alert configuration: alerting.md
Check active incidents
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py list_incidents '{"status":"active"}'
Create incident (notifies people - confirm first)
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py create_incident '{"title":"...","severity":"...","roomPrefix":"inc"}'
Add investigation notes
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py add_activity_to_incident '{"incidentId":"...","body":"Findings..."}'
For incident management: incidents.md
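Because create_incident notifies people, it helps to make the confirmation step explicit in any automation around it. The sketch below is one way to do that in Python, assuming the argument fields shown above (`title`, `severity`, `roomPrefix`). The function name and the `confirmed` parameter are hypothetical and not part of the skill.

```python
import json


def create_incident_arguments(title, severity, confirmed, room_prefix="inc"):
    """Build the JSON arguments for create_incident, refusing without confirmation."""
    if not confirmed:
        # The tool notifies people, so require an explicit go-ahead from the user.
        raise RuntimeError("create_incident notifies people; confirm with the user first")
    return json.dumps(
        {"title": title, "severity": severity, "roomPrefix": room_prefix},
        separators=(",", ":"),
    )
```

The caller passes `confirmed=True` only after the user has explicitly approved creating the incident. Read-only calls such as list_incidents need no such gate.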
Find who's on-call
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py list_oncall_schedules '{}'
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py get_current_oncall_users '{"scheduleId":"..."}'
Review alert groups
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py list_alert_groups '{"state":"new"}'
For on-call operations: oncall.md
When unsure about tool parameters:
# List all available tools
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py list-tools
# Get tool schema and description
${CLAUDE_PLUGIN_ROOT}/skills/grafana/scripts/grafana_mcp.py describe <tool_name>
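Load these as needed for detailed operations: references/alerting.md, references/dashboards.md, references/incidents.md, references/loki.md, references/oncall.md, references/prometheus.md, references/pyroscope.md.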
- Prefer compound workflows: recent-logs instead of manual query_loki_logs, investigate-logs instead of hand-crafting Loki queries, etc. Workflows handle datasource discovery, time formatting, and label normalization automatically.
- Run describe before calling unfamiliar raw tools to see required parameters.
- Use query_loki_stats to check volume before query_loki_logs.
- Prefer get_dashboard_summary over the full get_dashboard_by_uid.
- Use update_dashboard with operations for targeted changes.