Systematic debugging methodology using evidence-based investigation to identify root causes. Use when encountering bugs, errors, unexpected behavior, failing tests, or intermittent issues. Enforces four-phase framework (root cause investigation, pattern analysis, hypothesis testing, implementation) with the iron law NO FIXES WITHOUT ROOT CAUSE FIRST. Covers runtime errors, logic bugs, integration failures, and performance issues. Useful when debugging, troubleshooting, investigating failures, or when --debug flag is mentioned.
This skill inherits all available tools. When active, it can use any tool Claude has access to.
examples/race-condition.mdexamples/runtime-error.mdreferences/reproduction.mdEvidence-based investigation → root cause → verified fix.
<when_to_use>
NOT for: well-understood issues with obvious fixes, feature requests, architecture planning </when_to_use>
<iron_law> NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST
Never propose solutions, "quick fixes", or "try this" without first understanding root cause through systematic investigation. </iron_law>
<phases> Track with TodoWrite. Phases advance forward only.| Phase | Trigger | activeForm |
|---|---|---|
| Collect Evidence | Session start, bug encountered | "Collecting evidence" |
| Isolate Variables | Evidence gathered, reproduction confirmed | "Isolating variables" |
| Formulate Hypotheses | Problem isolated, patterns identified | "Formulating hypotheses" |
| Test Hypothesis | Hypothesis formed | "Testing hypothesis" |
| Verify Fix | Fix identified and implemented | "Verifying fix" |
Situational (insert when triggered):
Workflow:
in_progresscompleted, add next in_progressin_progress<escalation>)
</phases>
<quick_start> When encountering a bug:
in_progress via TodoWrite<phase_1_root_cause> Goal: Understand what's actually happening, not what you think is happening.
Transition: Mark "Collect Evidence" complete and add "Isolate Variables" as in_progress when you have reproduction steps and initial evidence.
Steps:
Read error messages completely
Reproduce consistently
Check recent changes
git diff — what changed since it last worked?git log --since="yesterday" — recent commitsGather evidence systematically
Trace data flow backward
Red flags indicating need more investigation:
<phase_2_pattern_analysis> Goal: Learn from what works to understand what's broken.
Transition: Mark "Isolate Variables" complete and add "Formulate Hypotheses" as in_progress when you've identified key differences between working and broken cases.
Steps:
Find working examples in same codebase
rg "pattern"Read reference implementations completely
Identify every difference
Understand dependencies
Questions to answer:
<phase_3_hypothesis_testing> Goal: Test one specific idea with minimal changes.
Transition: Mark "Formulate Hypotheses" complete and add "Test Hypothesis" as in_progress when you have specific, evidence-based hypothesis.
Steps:
Form single, specific hypothesis
Design minimal test
Execute test
Verify or pivot
in_progress, form NEW hypothesisBad hypothesis examples:
Good hypothesis examples:
<phase_4_implementation> Goal: Fix root cause permanently with verification.
Transition: Mark "Test Hypothesis" complete and add "Verify Fix" as in_progress when you've confirmed hypothesis and ready to implement permanent fix.
Steps:
Create failing test case
Implement single fix
Verify fix works
Circuit breaker: 3 failed fixes
After fixing:
completedRuntime Errors (crashes, exceptions)
Investigation focus:
Common causes:
Key techniques:
Logic Bugs (wrong result, unexpected behavior)
Investigation focus:
Common causes:
Key techniques:
Integration Failures (API, database, external service)
Investigation focus:
Common causes:
Key techniques:
Intermittent Issues (works sometimes, fails others)
Investigation focus:
Common causes:
Key techniques:
Performance Issues (slow, memory leaks, high CPU)
Investigation focus:
Common causes:
Key techniques:
Instrumentation — add diagnostic logging without changing behavior:
function processData(data: Data): Result {
console.log('[DEBUG] processData input:', JSON.stringify(data));
const transformed = transform(data);
console.log('[DEBUG] after transform:', JSON.stringify(transformed));
const validated = validate(transformed);
console.log('[DEBUG] after validate:', JSON.stringify(validated));
const result = finalize(validated);
console.log('[DEBUG] processData result:', JSON.stringify(result));
return result;
}
Binary Search Debugging — find commit that introduced bug:
git bisect start
git bisect bad # Current commit is bad
git bisect good <last-good-commit> # Known good commit
# Git will check out middle commit
# Test if bug exists, then:
git bisect bad # if bug exists
git bisect good # if bug doesn't exist
# Repeat until git identifies exact commit
Differential Analysis — compare versions side by side:
# Working version
git show <good-commit>:path/to/file.ts > file-working.ts
# Broken version
git show <bad-commit>:path/to/file.ts > file-broken.ts
# Detailed diff
diff -u file-working.ts file-broken.ts
Timeline Analysis — correlate events for intermittent issues:
12:00:01.123 - Request received
12:00:01.145 - Database query started
12:00:01.167 - Cache check started
12:00:01.169 - Cache hit returned <-- Returned before DB!
12:00:01.234 - Database query completed
12:00:01.235 - Error: stale data <-- Bug symptom
</evidence>
<red_flags> If you catch yourself thinking or saying these — STOP, return to Phase 1:
ALL of these mean: STOP. Return to Phase 1.
Add new "Collect Evidence" task and mark current task complete. </red_flags>
<anti_patterns> Common debugging mistakes to avoid.
Random Walk — trying different things hoping one works without systematic investigation
Why it fails: Wastes time, may mask real issue, doesn't build understanding
Instead: Follow phases 1-2 to understand system
Quick Fix — implementing solution that stops symptom without finding root cause
Why it fails: Bug will resurface or manifest differently
Instead: Use phase 1 to find root cause before fixing
Cargo Cult — copying code from Stack Overflow without understanding why it works
Why it fails: May not apply to your context, introduces new issues
Instead: Use phase 2 to understand working examples thoroughly
Shotgun Approach — changing multiple things simultaneously "to be sure"
Why it fails: Can't tell which change fixed it or if you introduced new bugs
Instead: Use phase 3 to test one hypothesis at a time </anti_patterns>
<integration> Connect debugging to broader development workflow.Test-Driven Debugging:
Defensive Programming After Fix — add validation at multiple layers:
function processUser(userId: string): User {
// Input validation
if (!userId || typeof userId !== 'string') {
throw new Error('Invalid userId: must be non-empty string');
}
// Fetch with error handling
const user = await fetchUser(userId);
if (!user) {
throw new Error(`User not found: ${userId}`);
}
// Output validation
if (!user.email || !user.name) {
throw new Error('Invalid user data: missing required fields');
}
return user;
}
Documentation — after fixing, document:
Example:
/**
* Processes user data from API.
*
* Bug fix (2024-01-15): Added validation for missing email field.
* Root cause: API sometimes returns partial user objects when
* user hasn't completed onboarding.
* Prevention: Always validate required fields before processing.
*/
</integration>
<escalation>
When to ask for help or escalate:
Remember: Understanding the bug is more valuable than fixing it quickly. </completion>
<rules> ALWAYS: - Create "Collect Evidence" todo at session start - Follow four-phase framework systematically - Update todos when transitioning between phases - Create failing test before implementing fix - Test single hypothesis at a time - Document root cause after fixing - Mark "Verify Fix" complete only after all tests passNEVER: