From exarchos
Verifies implementation matches design specification for functional completeness, test adequacy, and test coverage. Stage 1 of two-stage review.
Slash command: /exarchos:skills-copilot-spec-review
Stage 1 of two-stage review: Verify implementation matches specification and follows TDD.
For a complete worked example, see references/worked-example.md.
MANDATORY: Before accepting any rationalization for approving without full verification, consult
references/rationalization-refutation.md. Every common excuse is catalogued with a counter-argument and the correct action.
Activate this skill when:
- /review command (first stage)

This skill runs in a SUBAGENT spawned by the orchestrator, not inline.
The orchestrator provides:
- exarchos_orchestrate({ action: "review_diff" }) (context-efficient)

The subagent:
The orchestrator is responsible for generating the diff before dispatching the spec-review subagent. The subagent does NOT generate its own diff.
Orchestrator responsibilities:
- exarchos_orchestrate({ action: "review_diff", worktreePath: "<worktree-path>", baseBranch: "main" })

Subagent responsibilities:
Instead of per-worktree diffs, receive an integrated diff from the
integration branch (e.g., feature/integration-branch) against main:
# Generate integrated diff for review
git diff main...integration > /tmp/combined-diff.patch
# Alternative: use review-diff script against integration branch via orchestrate
# exarchos_orchestrate({ action: "review_diff", worktreePath: "<worktree-path>", baseBranch: "main" })
This provides the complete picture of all changes across all tasks and reduces context consumption by 80-90%.
Before evaluating, query the review strategy runbook to determine the appropriate evaluation approach:
- exarchos_orchestrate({ action: "runbook", id: "review-strategy" }) to determine the review approach based on diff scope, prior fix cycles, and review stage.

After delegation completes, spec review examines:
This enables catching:
Spec Review focuses on:
- artifacts.intent: no intended-but-missing or delivered-but-unintended (scope-creep) work (when intentGrounding is supplied)

Does NOT cover (that's Quality Review):
The orchestrator captures the intended change as artifacts.intent (surfaces, a summary, and — when available — a one-line transcript summary) and threads it into your dispatch as an intentGrounding directive on the back-of-pipeline code-review path. When present, you MUST verify the delivered diff against the intended change:
- Intended-but-missing work: flag it as a spec issue.
- Delivered-but-unintended (scope-creep) work: flag it as a spec issue.

When no intentGrounding is supplied (an empty or un-resolvable diff), proceed against the diff alone; do NOT fabricate an intent. This grounding is additive to the spec-alignment checks below, not a replacement for them.
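The intent check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the text says artifacts.intent carries surfaces and a summary, but treating surfaces as path prefixes, and the names `Intent`, `IntentFinding`, and `checkIntent`, are hypothetical and not part of exarchos.

```typescript
// Hypothetical shape: the source only says artifacts.intent carries
// "surfaces" and a summary. Treating surfaces as path prefixes is an assumption.
interface Intent {
  surfaces: string[]; // assumed: path prefixes the change was meant to touch
  summary: string;
}

interface IntentFinding {
  category: "spec";
  kind: "intended-but-missing" | "scope-creep";
  subject: string;
}

function checkIntent(intent: Intent | undefined, changedFiles: string[]): IntentFinding[] {
  // No intentGrounding supplied: review the diff alone, never invent an intent.
  if (!intent) return [];

  const findings: IntentFinding[] = [];

  // Intended surfaces that no changed file touches.
  for (const surface of intent.surfaces) {
    if (!changedFiles.some((file) => file.startsWith(surface))) {
      findings.push({ category: "spec", kind: "intended-but-missing", subject: surface });
    }
  }

  // Changed files that fall outside every intended surface.
  for (const file of changedFiles) {
    if (!intent.surfaces.some((surface) => file.startsWith(surface))) {
      findings.push({ category: "spec", kind: "scope-creep", subject: file });
    }
  }
  return findings;
}

// Made-up example input, for illustration only.
const intentFindings = checkIntent(
  { surfaces: ["src/auth/", "src/rate-limit/"], summary: "Add login rate limiting" },
  ["src/auth/login.ts", "src/billing/invoice.ts"],
);
console.log(intentFindings);
```

Both kinds of finding are reported in the spec category, which is what routes them to a human under the escalation policy rather than into the auto-fix loop.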
For the full checklist with verification commands, tables, and report template, see references/review-checklist.md.
Verification:
npm run test:run
npm run test:coverage
npm run typecheck
exarchos_orchestrate({
action: "check_test_adequacy",
featureId: "<featureId>",
taskId: "<taskId>",
branch: "<branch>",
riskTier: "<low|medium|high>"
})
If review FAILS, the fix-loop is bounded by the shared escalation policy (config-resolvable escalation.maxIterations, default 5) — it does NOT loop unboundedly. check_review_verdict returns the escalate decision the loop MUST honor: on a NEEDS_FIXES verdict it carries escalate: true + escalationReason when the loop must stop (the auto-fix bound was hit OR a finding is intent-touching), and the report's routing instruction reflects this.
Two outcomes:
- escalate is absent/falsy. Re-dispatch to the implementer and re-review, as below. The verdict report surfaces the remaining budget (e.g. "fix cycle N/maxIterations").
- escalate: true. Do NOT re-dispatch /delegate --fixes. Surface the findings and escalationReason to the user and ask how to proceed (accept, redesign, or adjust scope). This happens when EITHER:
  - The auto-fix bound (escalation.maxIterations, default 5) is reached: the loop has fixed-and-re-reviewed that many times without converging; OR
  - A finding is intent-touching: a spec-category issue (intended-but-missing or scope-creep) that changes what was asked for, so a human decides rather than the loop silently "fixing" it. Intent-touching findings escalate immediately, regardless of how many cycles have run.

The fix-cycle count is event-sourced (prior review-verdict NEEDS_FIXES gate events): there is no separate counter to maintain, and check_review_verdict reads it for you.
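A minimal sketch of how a caller might honor the escalate decision. Only escalate, escalationReason, and the NEEDS_FIXES verdict come from the text above; the `VerdictResult`, `Route`, and `GateEvent` shapes and both function names are hypothetical. The counting helper only illustrates what "event-sourced" means here; in practice check_review_verdict reads the count for you.

```typescript
// Hypothetical result shape: only `escalate`, `escalationReason`, and the
// NEEDS_FIXES verdict value are named in the skill text.
interface VerdictResult {
  verdict: string;
  escalate?: boolean;
  escalationReason?: string;
}

type Route =
  | { action: "continue" }
  | { action: "redispatch" }
  | { action: "ask-user"; reason: string };

function routeVerdict(result: VerdictResult): Route {
  // Anything other than NEEDS_FIXES leaves the fix loop.
  if (result.verdict !== "NEEDS_FIXES") return { action: "continue" };
  // Bound hit or intent-touching finding: stop and hand the decision to the user.
  if (result.escalate) {
    return { action: "ask-user", reason: result.escalationReason ?? "unspecified" };
  }
  // Auto-fix path: send the issues back to the implementer.
  return { action: "redispatch" };
}

// Hypothetical event shape, used only to show that the cycle count is
// derived from prior gate events rather than stored in a counter.
interface GateEvent {
  gate: string;
  verdict: string;
}

function countFixCycles(events: GateEvent[]): number {
  return events.filter(
    (e) => e.gate === "review-verdict" && e.verdict === "NEEDS_FIXES",
  ).length;
}

console.log(routeVerdict({ verdict: "NEEDS_FIXES" }).action);
console.log(routeVerdict({ verdict: "NEEDS_FIXES", escalate: true, escalationReason: "intent-touching" }).action);
```

Checking escalate before re-dispatching keeps the loop bounded even if the caller never looks at the cycle count itself.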
Auto-fix path — re-dispatch to implementer:
- check_review_verdict re-evaluates the bound each pass

// Return to implementer (auto-fix path only, when escalate is falsy)
Task({
model: "opus",
description: "Fix spec review issues",
prompt: `
# Fix Required: Spec Review Failed
## Issues to Fix
1. Missing rate limiting implementation
- Add rate limiter middleware
- Test: RateLimiter_ExceedsLimit_Returns429
2. Email validation incomplete
- Add MX record check
- Test: ValidateEmail_InvalidDomain_ReturnsError
## Success Criteria
- All tests pass
- Coverage >80%
- All issues resolved
`
})
The subagent MUST return results as structured JSON. The orchestrator parses this JSON to populate state. Any other format is an error.
{
  "verdict": "pass | fail | blocked",
  "summary": "1-2 sentence summary",
  "issues": [
    {
      "severity": "HIGH | MEDIUM | LOW",
      "category": "spec | tdd | coverage",
      "file": "path/to/file",
      "line": 123,
      "description": "Issue description",
      "required_fix": "What must change"
    }
  ],
  "test_results": {
    "passed": 0,
    "failed": 0,
    "coverage_percent": 0
  }
}
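For readers who want to check a report before handing it back, the schema above might be typed and validated along these lines. The field names and enum values are taken from the JSON shape shown; the type names and `parseReport` are hypothetical, and the orchestrator's real parser may be stricter or looser.

```typescript
// Field names and allowed values mirror the JSON shape in the skill text.
interface Issue {
  severity: "HIGH" | "MEDIUM" | "LOW";
  category: "spec" | "tdd" | "coverage";
  file: string;
  line: number;
  description: string;
  required_fix: string;
}

interface SpecReviewReport {
  verdict: "pass" | "fail" | "blocked";
  summary: string;
  issues: Issue[];
  test_results: { passed: number; failed: number; coverage_percent: number };
}

const VERDICTS: string[] = ["pass", "fail", "blocked"];
const SEVERITIES: string[] = ["HIGH", "MEDIUM", "LOW"];
const CATEGORIES: string[] = ["spec", "tdd", "coverage"];

// Hypothetical helper: checks the enums and top-level structure only.
function parseReport(raw: string): SpecReviewReport {
  const data = JSON.parse(raw); // throws SyntaxError on non-JSON output
  if (typeof data !== "object" || data === null) throw new Error("report must be a JSON object");
  if (VERDICTS.indexOf(data.verdict) === -1) throw new Error("invalid verdict");
  if (typeof data.summary !== "string") throw new Error("summary must be a string");
  if (!Array.isArray(data.issues)) throw new Error("issues must be an array");
  for (const issue of data.issues) {
    if (typeof issue !== "object" || issue === null) throw new Error("issue must be an object");
    if (SEVERITIES.indexOf(issue.severity) === -1) throw new Error("invalid severity");
    if (CATEGORIES.indexOf(issue.category) === -1) throw new Error("invalid category");
  }
  const t = data.test_results;
  if (
    typeof t !== "object" || t === null ||
    typeof t.passed !== "number" ||
    typeof t.failed !== "number" ||
    typeof t.coverage_percent !== "number"
  ) {
    throw new Error("test_results must carry numeric passed, failed, coverage_percent");
  }
  return data as SpecReviewReport;
}

// Made-up sample values, for illustration only.
const sampleReport = parseReport(JSON.stringify({
  verdict: "fail",
  summary: "Rate limiting is missing.",
  issues: [{
    severity: "HIGH",
    category: "spec",
    file: "path/to/file",
    line: 123,
    description: "Missing rate limiting implementation",
    required_fix: "Add rate limiter middleware",
  }],
  test_results: { passed: 10, failed: 0, coverage_percent: 85 },
}));
console.log(sampleReport.verdict, sampleReport.issues.length);
```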
| Don't | Do Instead |
|---|---|
| Skip to quality review | Complete spec review first |
| Accept incomplete work | Return for fixes |
| Review code style here | Save for quality review |
| Approve without tests | Require test coverage |
| Let scope creep pass | Flag over-engineering |
If an issue spans multiple tasks:
Pass:
action: "update", featureId: "<id>", updates: {
"reviews": { "spec-review": { "status": "pass", "summary": "...", "issues": [] } }
}
Fail:
action: "update", featureId: "<id>", updates: {
"reviews": { "spec-review": { "status": "fail", "summary": "...", "issues": [{ "severity": "...", "file": "...", "description": "..." }] } }
}
Important: The review value MUST be an object with a status field (e.g., { "status": "pass" }), not a flat string (e.g., "pass"). The all-reviews-passed guard silently ignores non-object entries. Accepted statuses: pass, passed, approved, fixes-applied.
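To make the failure mode concrete, here is a rough sketch of a guard that behaves as described: non-object entries are skipped, so a flat string never counts as a passing review. The accepted statuses come from the text; `isPassingReview` and `allReviewsPassed` are hypothetical names, and requiring at least one object entry is an assumption about how the real guard ends up blocking.

```typescript
// Accepted statuses as listed in the skill text.
const PASSING_STATUSES: string[] = ["pass", "passed", "approved", "fixes-applied"];

// Hypothetical helper: true only for an object whose status is a passing value.
function isPassingReview(entry: unknown): boolean {
  if (typeof entry !== "object" || entry === null) return false;
  const status = (entry as { status?: unknown }).status;
  return typeof status === "string" && PASSING_STATUSES.indexOf(status) !== -1;
}

// Sketch of the guard: non-object entries are silently ignored, so a map
// that contains only flat strings has nothing left to pass on.
// Assumption: the guard needs at least one object entry to succeed.
function allReviewsPassed(reviews: Record<string, unknown>): boolean {
  const objectEntries = Object.keys(reviews)
    .map((name) => reviews[name])
    .filter((entry) => typeof entry === "object" && entry !== null);
  return objectEntries.length > 0 && objectEntries.every(isPassingReview);
}

console.log(allReviewsPassed({ "spec-review": { status: "pass", summary: "...", issues: [] } }));
console.log(allReviewsPassed({ "spec-review": "pass" }));
```

Under these assumptions the first call succeeds and the second does not, even though both "say" pass.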
For the full transition table, consult @skills/workflow-state/references/phase-transitions.md.
Quick reference:
- review → synthesize requires guard all-reviews-passed: all reviews.{name}.status must be passing
- review → delegate requires guard any-review-failed: triggers fix cycle when any review fails

Use exarchos_workflow({ action: "describe", actions: ["update", "init"] }) for parameter schemas and exarchos_workflow({ action: "describe", playbook: "feature" }) for phase transitions, guards, and playbook guidance. Use exarchos_orchestrate({ action: "describe", actions: ["check_test_adequacy", "check_static_analysis"] }) for orchestrate action schemas.
All transitions happen immediately without user confirmation:
Before invoking quality-review:
- reviews["spec-review"].status === "pass" in workflow state (all tasks passed)

Guard shape: The all-reviews-passed guard requires reviews["spec-review"] to be an object with a status field set to a passing value (pass, passed, approved, fixes-applied). Flat strings like reviews: { "spec-review": "pass" } are silently ignored and will block the review → synthesize transition.
Record the review as an object with a status field, not a flat string:
exarchos_workflow({ action: "update", featureId: "<id>", updates: {
reviews: { "spec-review": { status: "pass", summary: "...", issues: [] } }
}})
Gate events: Do NOT manually emit gate.executed events via exarchos_event. Gate events are automatically emitted by the check_review_verdict orchestrate handler. Manual emission causes duplicates.
exarchos_workflow({ action: "update", featureId: "<id>", updates: {
reviews: { "spec-review": { status: "fail", summary: "...", issues: [{ severity: "HIGH", file: "...", description: "..." }] } }
}})
[Invoke the exarchos:delegate skill with args: --fixes <plan-path>]
This is NOT a human checkpoint - workflow continues autonomously.
| Issue | Cause | Resolution |
|---|---|---|
| Test file not found | Task didn't create expected test | Check plan for test file paths, verify worktree contents |
| Coverage below threshold | Implementation incomplete or tests superficial | Add missing test cases, verify assertions are meaningful |
| Test-adequacy kill-probe fails | A new/changed test still passes against reverted source (vacuous) | Strengthen the test so reverting the implementation makes it fail |
| Diff too large for context | Many tasks with large changes | Generate per-worktree diffs with exarchos_orchestrate({ action: "review_diff", worktreePath: "<task-worktree>" }) to review incrementally |
- Review the diff from exarchos_orchestrate({ action: "review_diff" }) instead of reading full files; this reduces context by 80-90%
- Run exarchos_orchestrate({ action: "check_test_adequacy" }) in parallel with spec tracing