Grade component health based on regression triage metrics for OpenShift releases
This skill inherits all available tools. When active, it can use any tool Claude has access to.
Additional assets for this skill: README.md, generate_html_report.py, report_template.html
This skill analyzes and grades component health for OpenShift releases based on regression management metrics. It evaluates how well components manage their test regressions by analyzing triage coverage, triage timeliness, and resolution speed.
Use this skill when you need to:
Important Note: Grading is subjective and not meant to be a critique of team performance. This is intended to help identify where help is needed and track progress as we try to improve our regression response rates.
Python 3 Installation
which python3
Network Access
Required Scripts
plugins/component-health/skills/get-release-dates/get_release_dates.py
plugins/component-health/skills/list-regressions/list_regressions.py
plugins/component-health/skills/analyze-regressions/generate_html_report.py (for HTML reports)
plugins/component-health/skills/analyze-regressions/report_template.html (for HTML reports)
Extract the release version and optional component filter from the command arguments:
Example argument parsing:
/component-health:analyze-regressions 4.17
/component-health:analyze-regressions 4.21 --components Monitoring etcd
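The argument forms above could be parsed along these lines. This is a minimal sketch using Python's standard argparse and shlex modules; the function name parse_command_args is hypothetical and not part of the skill.

```python
import argparse
import shlex


def parse_command_args(arg_string):
    """Split a command argument string into (release, components).

    Hypothetical helper: the release is the first positional argument and
    --components takes zero or more component names.
    """
    parser = argparse.ArgumentParser(prog="analyze-regressions")
    parser.add_argument("release", help="Release version, e.g. 4.17")
    parser.add_argument("--components", nargs="*", default=[],
                        help="Optional component names to filter on")
    # shlex.split honours quoting, so "kube-apiserver" loses its quotes.
    args = parser.parse_args(shlex.split(arg_string))
    return args.release, args.components


print(parse_command_args("4.17"))
print(parse_command_args("4.21 --components Monitoring etcd"))
```

An empty component list means no filter was requested, so the --components flag would simply be left off later commands.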
Run the get_release_dates.py script to determine the development window for the release:
python3 plugins/component-health/skills/get-release-dates/get_release_dates.py \
--release 4.17
Expected output (JSON on stdout):
{
  "release": "4.17",
  "development_start": "2024-05-17T00:00:00Z",
  "feature_freeze": "2024-08-26T00:00:00Z",
  "code_freeze": "2024-09-30T00:00:00Z",
  "ga": "2024-10-29T00:00:00Z"
}
Processing steps:
development_start date - convert to YYYY-MM-DD format
ga date - convert to YYYY-MM-DD format (may be null for in-development releases)
development_start: Usually present; if null, omit the --start parameter
ga: Will be null for in-development releases; if null, omit the --end parameter
Date conversion example:
"2024-05-17T00:00:00Z" → "2024-05-17"
null → do not use this parameter
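The conversion above can be sketched in Python as follows. The helper name to_date_param is hypothetical, and the sketch assumes timestamps always use the exact form shown in the sample output (UTC with a trailing Z).

```python
from datetime import datetime


def to_date_param(timestamp):
    """Convert an ISO 8601 UTC timestamp to YYYY-MM-DD, or return None for null.

    Assumes the "YYYY-MM-DDTHH:MM:SSZ" form shown in the sample output;
    any other format raises ValueError rather than being silently truncated.
    """
    if timestamp is None:
        return None  # caller should omit the corresponding parameter
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.strftime("%Y-%m-%d")


print(to_date_param("2024-05-17T00:00:00Z"))
print(to_date_param(None))
```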
Run the list_regressions.py script with the appropriate arguments:
python3 plugins/component-health/skills/list-regressions/list_regressions.py \
--release 4.17 \
--start 2024-05-17 \
--end 2024-10-29 \
--short
Parameter rules:
--release: Always required (from Step 1)
--components: Optional, only if specified by user (from Step 1)
--start: Use development_start date from Step 2 (if not null)
--end: Use ga date from Step 2 (only if not null)
--short: Always include this flag
Example for GA'd release (4.17):
python3 plugins/component-health/skills/list-regressions/list_regressions.py \
--release 4.17 \
--start 2024-05-17 \
--end 2024-10-29 \
--short
Example for in-development release (4.21 with null GA):
python3 plugins/component-health/skills/list-regressions/list_regressions.py \
--release 4.21 \
--start 2025-09-02 \
--short
Example with component filter:
python3 plugins/component-health/skills/list-regressions/list_regressions.py \
--release 4.21 \
--components Monitoring etcd \
--start 2025-09-02 \
--short
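The parameter rules and the three example invocations above can be captured in one small helper. This is a sketch only: build_list_regressions_cmd is a hypothetical name, and it only assembles the argument list without running the script.

```python
# Script path as given in this document.
SCRIPT = "plugins/component-health/skills/list-regressions/list_regressions.py"


def build_list_regressions_cmd(release, start=None, end=None, components=None):
    """Assemble the list_regressions.py command line as an argument list."""
    cmd = ["python3", SCRIPT, "--release", release]
    if components:                 # optional, only if the user asked for a filter
        cmd += ["--components", *components]
    if start:                      # omitted when development_start is null
        cmd += ["--start", start]
    if end:                        # omitted when ga is null (in-development release)
        cmd += ["--end", end]
    cmd.append("--short")          # always included
    return cmd


print(" ".join(build_list_regressions_cmd("4.21", start="2025-09-02",
                                          components=["Monitoring", "etcd"])))
```

Returning a list instead of a single string means the command can be passed to subprocess.run without shell quoting concerns.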
The script outputs JSON to stdout with the following structure:
{
  "summary": {
    "total": 62,
    "triaged": 59,
    "triage_percentage": 95.2,
    "filtered_suspected_infra_regressions": 8,
    "time_to_triage_hrs_avg": 68,
    "time_to_triage_hrs_max": 240,
    "time_to_close_hrs_avg": 168,
    "time_to_close_hrs_max": 480,
    "open": {
      "total": 2,
      "triaged": 1,
      "triage_percentage": 50.0,
      "time_to_triage_hrs_avg": 48,
      "time_to_triage_hrs_max": 48,
      "open_hrs_avg": 120,
      "open_hrs_max": 200
    },
    "closed": {
      "total": 60,
      "triaged": 58,
      "triage_percentage": 96.7,
      "time_to_triage_hrs_avg": 72,
      "time_to_triage_hrs_max": 240,
      "time_to_close_hrs_avg": 168,
      "time_to_close_hrs_max": 480,
      "time_triaged_closed_hrs_avg": 96,
      "time_triaged_closed_hrs_max": 240
    }
  },
  "components": {
    "ComponentName": {
      "summary": {
        "total": 15,
        "triaged": 13,
        "triage_percentage": 86.7,
        "filtered_suspected_infra_regressions": 0,
        "time_to_triage_hrs_avg": 68,
        "time_to_triage_hrs_max": 180,
        "time_to_close_hrs_avg": 156,
        "time_to_close_hrs_max": 360,
        "open": {
          "total": 1,
          "triaged": 0,
          "triage_percentage": 0.0,
          "time_to_triage_hrs_avg": null,
          "time_to_triage_hrs_max": null,
          "open_hrs_avg": 72,
          "open_hrs_max": 72
        },
        "closed": {
          "total": 14,
          "triaged": 13,
          "triage_percentage": 92.9,
          "time_to_triage_hrs_avg": 68,
          "time_to_triage_hrs_max": 180,
          "time_to_close_hrs_avg": 156,
          "time_to_close_hrs_max": 360,
          "time_triaged_closed_hrs_avg": 88,
          "time_triaged_closed_hrs_max": 180
        }
      }
    }
  }
}
CRITICAL - Use Summary Counts:
summary.total, summary.open.total, summary.closed.total for counts
components.*.summary.* for per-component counts
--short flag)
Key Metrics to Extract:
From summary object:
summary.total - Total regressions
summary.triaged - Total triaged regressions
summary.triage_percentage - KEY HEALTH METRIC: Percentage triaged
summary.filtered_suspected_infra_regressions - Count of filtered infrastructure regressions
summary.time_to_triage_hrs_avg - KEY HEALTH METRIC: Average hours to triage
summary.time_to_triage_hrs_max - Maximum hours to triage
summary.time_to_close_hrs_avg - KEY HEALTH METRIC: Average hours to close
summary.time_to_close_hrs_max - Maximum hours to close
summary.open.total - Open regressions count
summary.open.triaged - Open triaged count
summary.open.triage_percentage - Open triage percentage
summary.closed.total - Closed regressions count
summary.closed.triaged - Closed triaged count
summary.closed.triage_percentage - Closed triage percentage
From components object:
components.*.summary.* for all per-component statistics
IMPORTANT - Closed Regression Triage:
summary.open.total - summary.open.triaged
Calculate grades based on three key metrics:
1. Triage Coverage (summary.triage_percentage):
2. Triage Timeliness (summary.time_to_triage_hrs_avg):
> 168 hours: Poor ❌
3. Resolution Speed (summary.time_to_close_hrs_avg):
> 720 hours (4+ weeks): Poor ❌
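Per-metric labels for the three metrics above could be derived as in the sketch below. The cut-offs are taken from the calculate_health_grade function given later in this document, since the intermediate tiers are not listed here; the function names and the "N/A" label for missing data are assumptions.

```python
def grade_coverage(pct):
    """Label triage coverage (a percentage, higher is better)."""
    if pct >= 90:
        return "Excellent"
    if pct >= 70:
        return "Good"
    if pct >= 50:
        return "Needs Improvement"
    return "Poor"


def grade_hours(hours, excellent, good, fair):
    """Label a duration in hours (lower is better) against three cut-offs."""
    if hours is None:
        return "N/A"  # assumption: missing data is reported, not graded
    if hours < excellent:
        return "Excellent"
    if hours < good:
        return "Good"
    if hours < fair:
        return "Needs Improvement"
    return "Poor"


def grade_triage_time(hours):
    """Triage timeliness: cut-offs at 24, 72 and 168 hours."""
    return grade_hours(hours, 24, 72, 168)


def grade_resolution_time(hours):
    """Resolution speed: cut-offs at 168, 336 and 720 hours."""
    return grade_hours(hours, 168, 336, 720)


print(grade_coverage(95.2), grade_triage_time(68), grade_resolution_time(480))
```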
For each component in components:
components.*.summary.* fields
Present a well-formatted text report with:
Display overall statistics from summary:
=== Overall Health Grade for Release 4.17 ===
Development Window: 2024-05-17 to 2024-10-29 (GA'd release)
Total Regressions: 62
Filtered Infrastructure Regressions: 8
Triaged: 59 (95.2%)
Open: 2 (50.0% triaged)
Closed: 60 (96.7% triaged)
Triage Coverage: ✅ Excellent (95.2%)
Triage Timeliness: ⚠️ Good (68 hours average, 240 hours max)
Resolution Speed: ✅ Excellent (168 hours average, 480 hours max)
Important: If the GA date is null (in-development release), note:
Development Window: 2025-09-02 onwards (In Development)
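The two forms of the Development Window line can be produced by one small formatter. This is a sketch; format_development_window is a hypothetical name, and the strings mirror the examples above.

```python
def format_development_window(start, end=None):
    """Format the Development Window line for GA'd and in-development releases."""
    if end is None:
        # GA date is null: the release is still in development.
        return f"Development Window: {start} onwards (In Development)"
    return f"Development Window: {start} to {end} (GA'd release)"


print(format_development_window("2024-05-17", "2024-10-29"))
print(format_development_window("2025-09-02"))
```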
Display ranked table from components.*.summary:
=== Component Health Scorecard ===
| Component | Triage Coverage | Triage Time | Resolution Time | Open | Grade |
|-----------------|-----------------|-------------|-----------------|------|-------|
| kube-apiserver | 100.0% | 58 hrs | 144 hrs | 1 | ✅ |
| etcd | 95.0% | 84 hrs | 192 hrs | 0 | ✅ |
| Monitoring | 86.7% | 68 hrs | 156 hrs | 1 | ⚠️ |
Highlight specific components with issues:
=== Components Needing Attention ===
Monitoring:
- 1 open untriaged regression (needs triage)
- Triage coverage: 86.7% (below 90%)
Example-Component:
- 5 open untriaged regressions (needs triage)
- Slow triage response: 120 hours average
- High open count: 5 open regressions
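One way to derive such a list from the per-component summaries is sketched below. The open untriaged count is open.total minus open.triaged, and the sort follows the priority order listed later in this document (open untriaged count, then low triage coverage, then slow triage, then total count). The function names are hypothetical, and the 90% coverage cut-off is taken from the example output above.

```python
def open_untriaged(summary):
    """Open regressions that still need triage."""
    return summary["open"]["total"] - summary["open"]["triaged"]


def attention_items(summary):
    """Return the issue lines for one component (empty if it looks healthy)."""
    issues = []
    count = open_untriaged(summary)
    if count > 0:
        noun = "regression" if count == 1 else "regressions"
        issues.append(f"{count} open untriaged {noun} (needs triage)")
    if summary["triage_percentage"] < 90:
        issues.append(f"Triage coverage: {summary['triage_percentage']}% (below 90%)")
    return issues


def priority_key(summary):
    """Sort key: most urgent components first."""
    return (
        -open_untriaged(summary),                        # most open untriaged first
        summary["triage_percentage"],                    # then lowest coverage
        -(summary.get("time_to_triage_hrs_avg") or 0),   # then slowest triage
        -summary["total"],                               # then most regressions
    )


def components_needing_attention(components):
    """Return [(name, issues)] for flagged components, most urgent first."""
    flagged = [
        (name, comp["summary"])
        for name, comp in components.items()
        if attention_items(comp["summary"])
    ]
    flagged.sort(key=lambda item: priority_key(item[1]))
    return [(name, attention_items(summary)) for name, summary in flagged]
```

Only open regressions feed the "needs triage" line, which matches the document's focus on open untriaged regressions.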
CRITICAL: When listing untriaged regressions that need action:
components.*.summary.open.total - components.*.summary.open.triaged
After displaying the text report, ask the user if they want an interactive HTML report:
Would you like me to generate an interactive HTML report? (yes/no)
If the user responds affirmatively:
The HTML report requires data in a specific structure. Transform the JSON data:
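# Prepare component data for the HTML template.
# `components` is the "components" object from the list_regressions.py output.
# calculate_health_grade() is the combined-grade function defined later in this document.
component_data = []
for component_name, component_obj in components.items():
    # Named component_summary so it does not shadow the top-level `summary`.
    component_summary = component_obj['summary']
    component_data.append({
        'name': component_name,
        'total': component_summary['total'],
        'open': component_summary['open']['total'],
        'closed': component_summary['closed']['total'],
        'triaged': component_summary['triaged'],
        'triage_percentage': component_summary['triage_percentage'],
        'time_to_triage_hrs_avg': component_summary.get('time_to_triage_hrs_avg'),
        'time_to_close_hrs_avg': component_summary.get('time_to_close_hrs_avg'),
        'health_grade': calculate_health_grade(component_summary),  # Combined grade
    })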
Use the generate_html_report.py script (or inline Python code):
python3 plugins/component-health/skills/analyze-regressions/generate_html_report.py \
--release 4.17 \
--data regression_data.json \
--output .work/component-health-4.17/report.html
Or use inline Python with the template:
import json
import os
from datetime import datetime

# `summary` is the top-level "summary" object from the list_regressions.py output;
# `component_data` is the list prepared in the previous step.

# Load template
with open('plugins/component-health/skills/analyze-regressions/report_template.html', 'r') as f:
    template = f.read()

# Replace placeholders
template = template.replace('{{RELEASE}}', '4.17')
template = template.replace('{{GENERATED_DATE}}', datetime.now().isoformat())
template = template.replace('{{SUMMARY_DATA}}', json.dumps(summary))
template = template.replace('{{COMPONENT_DATA}}', json.dumps(component_data))

# Write output, creating the target directory if needed
output_path = '.work/component-health-4.17/report.html'
os.makedirs(os.path.dirname(output_path), exist_ok=True)
with open(output_path, 'w') as f:
    f.write(template)
Open the HTML report in the user's default browser:
macOS:
open .work/component-health-4.17/report.html
Linux:
xdg-open .work/component-health-4.17/report.html
Windows:
start .work/component-health-4.17/report.html
Display the file path to the user:
HTML report generated: .work/component-health-4.17/report.html
Opening in your default browser...
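As a platform-independent alternative to the per-OS commands above, Python's standard webbrowser module can open the report. This is a sketch, not part of the skill's scripts; report_uri and open_report are hypothetical names.

```python
import webbrowser
from pathlib import Path


def report_uri(path):
    """Turn a (possibly relative) report path into an absolute file:// URI."""
    return Path(path).resolve().as_uri()


def open_report(path):
    """Ask the default browser to open the report; returns True on success."""
    return webbrowser.open(report_uri(path))


print(report_uri(".work/component-health-4.17/report.html"))
```

Calling open_report(".work/component-health-4.17/report.html") would then replace the open, xdg-open, or start command.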
Network Errors
URLError or connection timeout
Invalid Release Format
Release Dates Not Found
get_release_dates.py returns error
--start and --end parameters)
No Regressions Found
Component Filter No Matches
HTML Template Not Found
plugins/component-health/skills/analyze-regressions/report_template.html
Enable verbose output by examining stderr:
python3 plugins/component-health/skills/list-regressions/list_regressions.py \
--release 4.17 \
--short 2>&1 | tee debug.log
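When the scripts are driven from Python instead of the shell, stdout (the JSON) and stderr (the diagnostics) can be kept separate, as sketched below. The helper name run_json_script is hypothetical, and the sketch assumes the script exits with a non-zero status on failure, which this document does not confirm.

```python
import json
import subprocess
import sys


def run_json_script(cmd):
    """Run a command, parse its stdout as JSON, and return (data, stderr_text).

    Assumption: a non-zero exit status means failure; stderr is included in
    the raised error so the diagnostic messages are not lost.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"{cmd[0]} exited with {result.returncode}: {result.stderr.strip()}"
        )
    return json.loads(result.stdout), result.stderr


if __name__ == "__main__":
    # Stand-in command for demonstration; replace with the real script invocation.
    demo = [sys.executable, "-c", "import json; print(json.dumps({'total': 62}))"]
    data, diagnostics = run_json_script(demo)
    print(data["total"])
```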
Diagnostic messages include:
The text report should include:
Header
Overall Health Grade
Component Health Scorecard
Components Needing Attention
Footer
The HTML report should include:
/component-health:analyze-regressions 4.17
Execution flow:
/component-health:analyze-regressions 4.21 --components Monitoring etcd
Execution flow:
/component-health:analyze-regressions 4.21 --components "kube-apiserver"
Execution flow:
To calculate an overall health grade for a component, consider all three metrics:
def calculate_health_grade(summary):
    """Calculate combined health grade based on three key metrics."""
    triage_coverage = summary['triage_percentage']
    triage_time = summary.get('time_to_triage_hrs_avg')
    resolution_time = summary.get('time_to_close_hrs_avg')

    # Score each metric (0-3)
    coverage_score = (
        3 if triage_coverage >= 90 else
        2 if triage_coverage >= 70 else
        1 if triage_coverage >= 50 else
        0
    )

    time_score = 3  # Default to excellent if no data
    if triage_time is not None:
        time_score = (
            3 if triage_time < 24 else
            2 if triage_time < 72 else
            1 if triage_time < 168 else
            0
        )

    resolution_score = 3  # Default to excellent if no data
    if resolution_time is not None:
        resolution_score = (
            3 if resolution_time < 168 else
            2 if resolution_time < 336 else
            1 if resolution_time < 720 else
            0
        )

    # Average the scores
    avg_score = (coverage_score + time_score + resolution_score) / 3

    # Return grade
    if avg_score >= 2.5:
        return "Excellent ✅"
    elif avg_score >= 1.5:
        return "Good ⚠️"
    elif avg_score >= 0.5:
        return "Needs Improvement ⚠️"
    else:
        return "Poor ❌"
Rank components by priority based on:
High open untriaged count (most urgent)
summary.open.total - summary.open.triaged
Low triage coverage (second priority)
summary.triage_percentage
Slow triage response (third priority)
summary.time_to_triage_hrs_avg
High total regression count (fourth priority)
summary.total
Compare metrics across releases:
/component-health:analyze-regressions 4.17 --compare 4.16
Generate CSV report for spreadsheet analysis:
/component-health:analyze-regressions 4.17 --export-csv
Allow users to customize health grade thresholds:
/component-health:analyze-regressions 4.17 --triage-threshold 80
This skill can be used by:
/component-health:analyze-regressions command (primary)
get-release-dates - Fetches release development window dates
list-regressions - Fetches raw regression data
prow-job:analyze-test-failure - Analyzes individual test failures
.work/ directory for performance
--short flag is critical to prevent output truncation with large datasets
Possible causes:
Solutions:
get_release_dates.py
Context:
Actions:
Possible causes:
Solutions:
ls -la .work/component-health-*/report.html
This skill provides comprehensive component health analysis by:
The key focus is on actionable insights - particularly identifying open untriaged regressions that need immediate attention, while avoiding recommendations for closed regressions which cannot be retroactively triaged.