Sizes tasks (S/L/H/Fix), sets up branches, and runs session gates before any code changes. Use at the start of any coding task.
Slash command: /autopilot:dev-flow
!`cat .claude/dev-flow-config.md 2>/dev/null || true`
!cat .claude/model-routing-config.md 2>/dev/null || true
If no project config above, use defaults from references/model-routing.md.
!cat .claude/dispatch-config.md 2>/dev/null || true
If no project config above, autopilot's own fallback skills are primary for methodology; native Task dispatch for parallel; autopilot:reviewer for code review. See project-config-template/dispatch-config.md for the schema.
Run before any code changes. Size determines which path executes.
Total overhead target: under 5 seconds.
1. Confirm task: restate what will be done in one sentence.
2. Branch check: `git branch --show-current` -- confirm on expected branch.
3. Proceed to S Workflow. No further gates.
S-size skips: branch freshness, knowledge/digest review, draft plan overlap, risk escalation.
All gates must pass before any code changes begin. If any gate is blocked, surface to the decision-maker (user in normal mode, CEO in CEO mode).
1. Record session start SHA:
git rev-parse HEAD > .claude/session-start-sha
2. Branch check:
git branch --show-current
3. Branch freshness:
BEHIND=$(git log HEAD..main --oneline 2>/dev/null | wc -l)
AHEAD=$(git log main..HEAD --oneline 2>/dev/null | wc -l)
Evaluate using the freshness table below.
If main does not exist (new repo), skip this gate.
4. Knowledge and digest review:
Check .claude/knowledge/ for relevant prior learnings.
Check for unprocessed session digests.
5. Draft plan overlap check:
ls doc/plans/*.md 2>/dev/null (or project-configured path)
If draft plans exist, check if the current task overlaps with any draft plan
(same feature, same module, or same user story).
If overlap found:
- Normal mode: surface to user -- confirm whether to proceed or adopt the draft.
- CEO mode: CEO decides within DOA (tactical decision).
If no draft plans or no overlap: proceed.
6. Skill routing:
Check CLAUDE.md (or project config) for code-area-specific skills.
If a skill is listed for the target code area, invoke it before writing code.
**Active enforcement**: For L-size, this gate is backed by the L-1.6 TaskCreate
parent task (see L Workflow → Task tracking). Reading this bullet is NOT enough —
the TaskCreate is the forcing function that prevents skipping.
| Behind | Ahead | Status | Action |
|---|---|---|---|
| 0 | any | Up to date with main | Proceed |
| 1-5 | any | Slightly behind | Proceed with note |
| >5 | 0 | Behind, no local work | Warn user, recommend merge |
| >5 | >0 | DIVERGED | Flag to user before proceeding |
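The freshness table can be sketched as a small shell function. This is a minimal illustration, not part of the skill itself: `freshness_status` is a hypothetical helper name, and it assumes the two counts are passed in as plain integers.

```shell
# freshness_status BEHIND AHEAD
# Hypothetical helper: maps commit counts to the Status column of the table.
freshness_status() {
    behind=$(( $1 ))   # arithmetic expansion also strips any wc -l padding
    ahead=$(( $2 ))
    if [ "$behind" -eq 0 ]; then
        echo "Up to date with main"
    elif [ "$behind" -le 5 ]; then
        echo "Slightly behind"
    elif [ "$ahead" -eq 0 ]; then
        echo "Behind, no local work"
    else
        echo "DIVERGED"
    fi
}

# Usage inside a repository that has a main branch:
#   freshness_status "$(git rev-list --count HEAD..main)" \
#                    "$(git rev-list --count main..HEAD)"
```

`git rev-list --count` returns a bare integer, which avoids the whitespace padding some `wc -l` implementations emit.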
Same lightweight start as S, plus branch creation:
1. Confirm root cause: restate the bug and known fix in one sentence.
2. Branch: `git checkout -b fix/<description>`
3. Skill routing check for the target code area.
4. Proceed to Fix Workflow.
Fix skips: knowledge/digest review, plan overlap, branch freshness (short-lived branch).
Resuming work on an existing feature branch with an active project → follow the 5-step procedure in references/context-continuation.md (uncommitted-changes check, SHA refresh, branch freshness, resume point, skill routing). Context continuation never re-evaluates size — it uses the size from the original session.
These rules apply to ALL subsequent work in this session, regardless of which skills are invoked. They complement (not replace) any built-in skills — providing project-specific context.
When performing these activities, FIRST read the corresponding config file if it exists. The config provides project-specific tools, commands, known issues, and conventions. If the config file does not exist, proceed normally without it.
| Activity | Config File | What It Contains |
|---|---|---|
| Debugging (bugs, crashes, logic errors) | .claude/debug-config.md | Debug tools, Docker commands, known gotchas, layer-by-layer diagnosis |
| Writing or running tests | .claude/test-strategy-config.md | Test framework, commands, coverage thresholds, test pyramid conventions |
| Parallel task dispatch (team work) | .claude/team-config.md | Role templates, tech stack context, team size rules |
| Performance profiling | .claude/profiling-config.md | Profiling tools, metrics collection, baseline commands |
| Comparison audit (old vs new) | .claude/audit-config.md | Known by-design divergences, audit scope definitions |
| Methodology / reviewer / parallel dispatch routing | .claude/dispatch-config.md | Preference chains for debugging / testing / profiling / team / review / parallel dispatch (also auto-injected at top of this skill) |
Before committing or merging, invoke autopilot:quality-pipeline.
This is non-negotiable. The quality pipeline runs: test → scan → completeness → review.
When the user signals session end (or task completion for S-size):
- Update project docs (docs/projects/*/README.md + INDEX.md)
- Invoke the learn skill (autopilot:learn)
First ask: what kind of work is this?
| Nature | Criteria | Workflow |
|---|---|---|
| Fix | Bug fix — root cause known, solution clear. No design needed. | Fix (any module count) |
| H | Production broken — immediate fix needed. | Hotfix |
If neither → size the feature:
| Size | Criteria | Workflow |
|---|---|---|
| S | Single commit (single module, no interface change, self-contained) | Direct commit |
| L | Multiple commits (3+ modules / public API / incompatible data / Feature Flag / user requests planning) | Plan + Project |
Fix vs L: "Do I need to design the solution, or just implement a known fix?" Design → L. Known fix → Fix.
Risk Escalation (force L for features): money/points, auth/security, production protocol changes. Risk-escalated bug fixes stay Fix but add PR review before merge.
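The sizing decision can be read as a short decision procedure. The sketch below is one possible encoding: `classify_task` is a hypothetical helper, each argument is a literal `yes` or `no`, and checking H before Fix is an assumption (the tables do not state an order between the two).

```shell
# classify_task PROD_BROKEN KNOWN_FIX L_TRIGGER RISK_AREA
#   PROD_BROKEN  production is broken, immediate fix needed
#   KNOWN_FIX    bug fix with known root cause and clear solution
#   L_TRIGGER    3+ modules, public API, incompatible data, feature flag,
#                or the user asked for planning
#   RISK_AREA    money/points, auth/security, production protocol change
classify_task() {
    if [ "$1" = yes ]; then echo "H"; return; fi
    if [ "$2" = yes ]; then
        # Module count never escalates a bug fix; risk only adds PR review.
        if [ "$4" = yes ]; then echo "Fix + PR review"; else echo "Fix"; fi
        return
    fi
    # Features: any L trigger or any risk area forces L.
    if [ "$3" = yes ] || [ "$4" = yes ]; then echo "L"; else echo "S"; fi
}
```

For example, `classify_task no yes yes no` prints `Fix`, because a known fix that crosses several modules still follows the Fix workflow.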
Size is evaluated once at start, but scope can grow. After every commit, self-check:
Has the scope grown beyond original S-size?
- 3+ commits already made
- 3+ files in different modules changed
- User asked for additional features beyond original goal
If yes → re-evaluate as L-size:
- Create project dir + README + INDEX (retroactive)
- Record prior commits as completed phases
- Continue with L Workflow tracking
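The post-commit self-check could be scripted roughly as follows. Both helper names are hypothetical, and treating each top-level directory as one "module" is an assumption that should be adapted to the project's real layout.

```shell
# count_modules: reads changed file paths on stdin and prints the number of
# distinct top-level directories. Files at the repository root are ignored.
count_modules() {
    awk -F/ 'NF > 1 { seen[$1] = 1 } END { n = 0; for (d in seen) n++; print n }'
}

# scope_grown COMMITS MODULES EXTRA_REQUESTS
# Succeeds (exit 0) when any of the three S-size limits has been hit.
scope_grown() {
    [ "$1" -ge 3 ] || [ "$2" -ge 3 ] || [ "$3" = yes ]
}

# Usage inside a repository, with the session start SHA recorded earlier:
#   start=$(cat .claude/session-start-sha)
#   commits=$(git rev-list --count "$start"..HEAD)
#   modules=$(git diff --name-only "$start"..HEAD | count_modules)
#   if scope_grown "$commits" "$modules" no; then echo "re-evaluate as L"; fi
```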
S Session End (lite):
1. Retry check:
"Did I retry any non-trivial operation 2+ times?
If yes, invoke `learn` skill to record the finding."
2. Deferred items:
If anything was postponed, add to BACKLOG with context + trigger condition.
3. Confirm commit:
Verify the change landed on the correct branch.
S does not use TodoWrite -- too few steps to justify tracking overhead.
Bug fix with clear root cause. No plan/project needed. Feature branch for traceability.
git checkout -b fix/<description>
Log the fix in doc/projects/ongoing-maintenance/YYYY-MM.md (or the project-configured projects path, e.g. docs/ plural; check the injected config so you don't create a stray sibling tree):
| MM-DD | commit_hash | fix(area): root cause → fix (spans N modules) |
If the fix revealed a non-obvious lesson, invoke learn skill.
Fix does NOT create: plan, project dir, or PR (unless risk-escalated).
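A maintenance-log row in the format above could be generated like this. `fix_log_row` is a hypothetical helper, and the English wording of the row is an assumption about how the project wants entries phrased.

```shell
# fix_log_row HASH AREA ROOT_CAUSE FIX MODULE_COUNT
# Prints one table row dated today (MM-DD).
fix_log_row() {
    printf '| %s | %s | fix(%s): %s → %s (spans %s modules) |\n' \
        "$(date +%m-%d)" "$1" "$2" "$3" "$4" "$5"
}

# Usage (the log path depends on the project config):
#   fix_log_row "$(git rev-parse --short HEAD)" network \
#       "stale socket" "reset on reconnect" 2 >> "$LOG_FILE"
```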
Continuous execution: proceed between Phases without asking "continue?". Stop only for: Staging Gate | Build/test failure | Design decision needed | Context near limit.
Task tracking (MANDATORY at L-1): Create Phase Todos at start (extract p0...pN + completion from plan) AND create TWO parent tasks. Both are non-optional forcing functions — missing either one = failed L-1 gate:
TaskCreate: "L-1.6: Skill routing — invoke required skills for all affected code areas"
description: MANDATORY before any implementation phase. Input: the module/surface list
produced by L-1.5 Scope Completeness Audit. For each affected area, consult project
CLAUDE.md and/or .claude/skill-routing.md for required skills. Invoke each required
skill via the Skill tool (reading the file is NOT invoking). Mark this task completed
ONLY after:
(a) every required skill has been invoked via Skill tool, AND
(b) one-line summary of "what this skill told me for this task" is captured in
session context (either a note or a TaskCreate subtask).
If a module has no skill routing entry, mark N/A with a one-line justification.
Phase implementation tasks (P0..PN) MUST be created with blockedBy=[this task] so
they cannot start until skill routing is confirmed done.
TaskCreate: "L-5: Invoke autopilot:finish-flow"
description: MANDATORY L-size completion. Invoke autopilot:finish-flow which will
expand into 6 discrete sub-tasks (Final Goal Review, Pre-Merge Review, Merge,
Post-Merge Review, Archive, L Session End). Do not mark this completed until the
skill has run and all 6 sub-tasks reach completed.
Both parent tasks are forcing functions: they remain pending through every phase and are
surfaced by system-reminder after each tool use. They cannot be silently skipped because
marking them completed requires explicit work — L-1.6 requires Skill-tool invocations,
L-5 requires invoking autopilot:finish-flow which itself creates 6 more discrete pending
tasks.
Why L-1.6 exists (historical rationale): On 2026-04-11, reconnect-regression-fix ran
the full fix workflow against src/network/, src/lobby/, and E2E tests without invoking
twgs-network / twgs-debug / other project skills. The existing "gate 6: Skill routing"
bullet in L-size Full Gates (Phase 1 Session Start, line ~65) is passive markdown and got
mentally compressed into "I know this area" — the same failure mode that L-5 hit before
finish-flow replaced it. This active TaskCreate applies the identical passive→active
pattern that worked for L-5. Missing twgs-* skill invocations don't produce immediate
bugs, but they systematically waste the knowledge base the project has invested in.
Phase task dependency (mechanical enforcement, not just a reminder): When TaskCreating
phase tasks P0..PN, each MUST be created with blockedBy=[L-1.6]. This means phases
literally cannot be claimed/started until L-1.6 reaches completed. The system-reminder
surfaces pending L-1.6 after every tool use; the blockedBy dependency makes starting
implementation impossible without first resolving it. Two layers of defense.
If either parent task is missing at any point after L-1: STOP, create it retroactively, then continue. For L-1.6 specifically, if implementation has already started without skill routing: pause current phase, create L-1.6 now, invoke the missing skills, then resume.
Confirm before starting. Record in the project README:
## Project Goal
> **Final goal**: [one sentence]
> **Success criteria**: [quantifiable conditions]
> **Scope boundary**: [explicit include/exclude]
Quantifiable means each criterion must include (a) a measurable threshold (number, percentage, boolean state, or named command output), AND (b) how it will be verified.
| Verdict | Example criterion |
|---|---|
| PASS | "API returns <200ms for 95th percentile (measured by load test)." |
| FAIL | "Performance is acceptable." |
Any criterion without a threshold or verification method means the plan is incomplete. Do not proceed until fixed.
CEO mode: SKIP intent confirmation -- CEO already confirmed OKR during Startup. Do not ask the user again.
A correctly-executed phase plan cannot recover from an incomplete scope. Before creating phase tasks, run a dimensions audit so the scope boundary reflects every surface this change touches, not just the one the task description mentions.
Create a discrete TaskCreate as the first item:
TaskCreate: "L-1.5: Scope completeness audit — enumerate all affected surfaces"
description: Before phase TaskCreate. Walk the dimensions checklist below.
For each "yes" row, either add a phase task for it OR document in README
scope boundary why it's explicitly out-of-scope. Do NOT mark this task
completed without dimension-by-dimension coverage recorded in README.
Dimensions checklist (non-exhaustive starter — add project-specific rows as needed):
| Dimension | Trigger |
|---|---|
| Source code + tests | Almost always |
| User-facing docs (README, guides, help text) | Any user-visible behavior change |
| API / interface reference | Any public interface change |
| Config file templates / examples | Any new or changed config format |
| CHANGELOG entry | Any release-worthy change to a versioned artifact |
| Version bump (semver) | Any externally-visible change to a versioned artifact |
| Version sync verification (grep) | Any version bump — grep the old version string across all tracked files (don't pre-filter by extension; tomorrow's repo may add .toml / Dockerfile / .yaml). If the grep returns N hits, the edit list must touch all N. Never enumerate the file list from memory |
| Migration guide / notes | Any breaking change or schema change |
| Dependent repos / external consumers | Any interface change with downstream consumers |
| Credit / attribution | Any feature absorbing external OSS, prior art, or third-party design — README's Inspired By / credits / acknowledgements section must list the source(s) |
| Dogfood target | Any tooling/infra change (does it apply to itself?) |
For each "yes" row, either:
- add a phase task for it, or
- document in the README.md scope boundary why it's explicitly out-of-scope
Feeds into L-1.6: The module/surface list produced here is the direct input to the
L-1.6 Skill routing TaskCreate. Every "Source code + tests" module enumerated here must
have its required project skills invoked before any phase starts. Do not mark L-1.5
completed without first cross-referencing each module against .claude/skill-routing.md
(or project equivalent).
Historical rationale (why this gate exists): On 2026-04-11, the dev-flow-l5-enforcement
project shipped the new finish-flow skill but initially missed the autopilot-side
user-facing surface (README skill count, CHANGELOG entry, template example, plugin version
bump). The source-code dimension was complete; the documentation dimension was invisible.
The finish-flow forcing function could not recover this — it enforces closing discipline,
not scope completeness. This is a different failure mode that belongs at L-1, not L-5.
Why "Version sync verification (grep)" and "Credit / attribution" exist (added v2.2.1):
The v2.2.0 think-tank-dialectic release walked the dimensions checklist correctly but
still had two near-misses: (1) marketplace.json's version bump was missed because the
audit was walked from memory instead of grepping the old version string, so the edit list
forgot one of the two version files; (2) the README's Inspired By section was not
updated to credit the two source repos (agora, council-of-high-intelligence) because
the dimensions checklist had no row for attribution at all. Both failures share a root
cause: the audit was enumerated rather than grepped. The two new rows make grep the
default for version bumps, and add attribution as a first-class dimension whenever
external prior art is absorbed.
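The grep-first version audit might look like the sketch below. `version_hits` is a hypothetical helper; it searches a directory tree rather than only tracked files, so inside a repository `git grep -lF -- "$old"` is the closer match to "all tracked files".

```shell
# version_hits OLD_VERSION [DIR]
# Lists every file under DIR (default: current directory) that still contains
# the old version string. -F matches the string literally, so the dots in
# 2.2.0 are not treated as regex wildcards. No extension filter is applied.
version_hits() {
    grep -rlF --exclude-dir=.git -- "$1" "${2:-.}"
}

# Usage: the edit list must touch every file this prints.
#   version_hits 2.2.0
#   version_hits 2.2.0 | wc -l    # N hits means N files to edit
```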
CEO mode: CEO performs the audit autonomously and records the coverage in the README scope boundary. Do not ask the user to enumerate dimensions — that's CEO tactical work.
docs/plans/YYYY-MM-DD-<feature-name>.md
Goal verification -- answer all three before starting each phase:
Pass threshold: Q1=yes, Q2=yes (essential), Q3=yes. Any "no" or "unsure" = blocked. Surface to decision-maker before proceeding.
CEO mode: CEO evaluates the three questions autonomously. Only escalate to user (Board) if the answer is "no" AND the required response is a strategic pivot (goal change, scope expansion) -- per CEO's DOA.
Drift signals:
| Signal | Response |
|---|---|
| "This phase has low ROI, skip it" | STOP -- Does it affect the final goal? |
| "We can do this later" | STOP -- Any hidden dependencies? |
| "Project is basically done" | STOP -- Has the final goal been achieved? |
| "User probably just wants..." | STOP -- Ask and confirm directly. |
Execution: Implement -> quality gate -> commit -> mark phase done.
Backlog safety (before deferring anything):
If deferral passes: add to BACKLOG with context + trigger condition, mark phase "Deferred" in project docs.
Phase advance gate -- all must be true before starting the next phase:
CEO mode: CEO verifies all prerequisites. No user confirmation needed for passing gates.
Invoke autopilot:finish-flow. That skill owns the L-size closing sequence. On invocation
it TaskCreates 6 discrete sub-tasks (Final Goal Review → Pre-Merge Review → Merge → Post-Merge
Review → Archive → L Session End), each with an explicit verification output. Every sub-task
must be individually completed — they cannot be batched or compressed.
Why delegated: Historically L-5 was an inline 6-step list that got mentally compressed into
"one action" and silently skipped. The finish-flow skill replaces passive markdown with
active TaskCreate reminders that system-reminder surfaces until addressed. See
autopilot:finish-flow for the full size → sub-tasks table.
CEO mode: All 6 sub-tasks are within CEO DOA (tactical, reversible, local git ops). CEO does not pause to ask the user between sub-tasks — execute all, then report.
Trigger: Phase/feature awaiting user review | session ending with undeployed committed changes.
Deploy per project config (default: build + restart).
Production is broken. Smallest possible fix, fastest path to stable.
Task tracking (MANDATORY at H-1): Create a parent closing task at the start:
TaskCreate: "H-9: Invoke autopilot:finish-flow"
description: MANDATORY hotfix completion. Invoke autopilot:finish-flow which will
expand into 6 discrete sub-tasks (verify fix, quality gate, merge to main, post-incident
learn, delete hotfix branch, session end).
git checkout -b hotfix/<description> main
Invoke autopilot:finish-flow; it expands the remaining closing sequence into 6 discrete sub-tasks (verify fix → quality gate → merge to main --no-ff → post-incident learn (MANDATORY) → delete hotfix branch → session end). Each must be individually completed.
H workflow prioritizes speed. The forcing function does not add steps; it only prevents skipping the existing ones. For rollback situations, invoke finish-flow after the rollback is verified stable.
L-size and H-size: Session End is a sub-task inside autopilot:finish-flow (L-5.6 / H-9.6), not a standalone section you run yourself. Do not duplicate the checklist here; finish-flow creates the discrete tasks and this section is their reference material.
S and Fix: finish-flow is optional. You may either run the inline S-Lite below or invoke autopilot:finish-flow for the same effect in TaskCreate form.
The L Session End sub-task (L-5.6) runs the full checklist below. Create a checklist and complete each item before concluding.
1. Verify completion:
- User's last request is completed (or user explicitly said pause/stop).
- No background work pending.
- If on a feature branch: check if branch is merged to main.
If not merged, flag to user before proceeding.
2. Update project docs:
- Update project progress table and last-updated date.
- Sync project index.
- If 100% complete + merged: invoke project archival.
3. Knowledge extraction -- ask yourself:
- Stepped on a non-obvious landmine? -> record in .claude/knowledge/
- Made an architecture decision? -> record in project docs
- Discovered a process gap? -> update relevant skill
- Learned something cross-session useful? -> record in persistent memory
- None of the above? -> skip, do not force it
4. Deferred items:
Anything postponed goes to BACKLOG with:
- Context: what it is and why it was deferred
- Trigger condition: when it should be picked up
Backlog safety: if the item affects the final goal, do NOT defer.
5. Triggered BACKLOG pickup:
Check if any BACKLOG items have their trigger condition met by this session's work.
Scope "this session" using session-start-sha:
git log --oneline $(cat .claude/session-start-sha 2>/dev/null || echo "HEAD~10")..HEAD
Surface matches to decision-maker:
- Normal mode: present to user for action.
- CEO mode: CEO decides autonomously (tactical). Record in CEO Report.
6. Invoke learn skill:
Produce a session learning summary covering:
- Errors encountered and resolved (root cause + fix)
- Key decisions made (rationale)
- Surprises or counter-intuitive discoveries
7. Staging verify (if applicable):
Confirm staging reflects latest code.
Skip if: mid-implementation, only doc changes, or no staging environment.
8. Checklist summary:
Output pass/fail for each gate. Include in PR description for L-size tasks.
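The session scoping in step 5 can be wrapped in a helper. `session_range` is a hypothetical name; unlike the inline command, this sketch also falls back when the SHA file exists but is empty. The `HEAD~10` fallback only resolves in repositories with more than ten commits.

```shell
# session_range [SHA_FILE]
# Prints the revision range covering this session's commits.
session_range() {
    start=$(cat "${1:-.claude/session-start-sha}" 2>/dev/null || true)
    echo "${start:-HEAD~10}..HEAD"
}

# Usage inside a repository:
#   git log --oneline "$(session_range)"
```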
If the session was long or context feels degraded, measure token budget:
Budget baseline: 200K tokens = 100%.
Approximate conversion: 1 token ~ 3.5 bytes (blended estimate for mixed-language codebases).
Report three layers:
- Fixed (loaded every session): CLAUDE.md, MEMORY.md, auto-injected context
- Loaded this session: skills invoked in current conversation
- On-demand (not yet loaded): remaining skills, knowledge files
If usage > 70%: flag for attention.
If specific files are bloated: recommend compress or split strategies.
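The budget arithmetic works out as follows. Both helper names are hypothetical; the conversion uses the 3.5 bytes-per-token estimate from the text (written as bytes × 2 / 7 to stay in integer arithmetic), so results are truncated and approximate.

```shell
# budget_percent BYTES
# Prints the estimated percentage of the 200K-token budget that BYTES consume.
budget_percent() {
    tokens=$(( $1 * 2 / 7 ))            # 1 token ~ 3.5 bytes
    echo $(( tokens * 100 / 200000 ))   # 200K tokens = 100%
}

# over_budget BYTES
# Succeeds (exit 0) when estimated usage is above the 70% attention threshold.
over_budget() {
    [ "$(budget_percent "$1")" -gt 70 ]
}

# Usage for the fixed layer (file names as listed in the text):
#   budget_percent "$(cat CLAUDE.md MEMORY.md | wc -c)"
```

As a worked example, 350000 bytes is about 100000 tokens, which is 50% of the budget.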
After code changes, verify documentation matches the new state — see the changed→update mapping table in references/post-feature-doc-sync.md. Skip doc sync for: bug fixes, minor value tweaks, log message changes.
!cat .claude/skill-routing.md 2>/dev/null || true
AI makes the marginal cost of completeness near-zero. When choosing between approaches:
| Wrong | Correct |
|---|---|
| Bug fix escalated to L because it crosses 3 modules | Use Fix -- module count doesn't determine bug fix workflow |
| Ask "continue?" after Phase | Proceed directly to next Phase |
| Team commit task says only "commit changes" | Must include quality gate |
| User provides plan -> skip project setup | Project dir must be created regardless |
| End session after merge | Must continue: post-merge -> archive -> session end |
| Skip branch freshness on L-size | Always check before starting L-size work |
| Force knowledge extraction when nothing happened | Skip -- do not force it |
| Defer work that affects the final goal | Never defer goal-critical items |
| Re-evaluate size on context continuation | Use size from the original session |
| Auto-execute context reduction without confirmation | List the operations for confirmation, with numbered choices |
| Skip the L-1 / H-1 parent closing TaskCreate "because I remember the steps" | The parent task IS the forcing function — memory is exactly what keeps failing; always create it |
| Skip the L-1.6 skill routing TaskCreate "because I already read CLAUDE.md" | Reading ≠ invoking. The TaskCreate exists because passive bullets get mentally compressed into "I know this area". Invoke each required skill via the Skill tool, even if you "remember" it |
| Create phase tasks without blockedBy=[L-1.6] | The dependency is the mechanical enforcement; a pending L-1.6 that doesn't actually block implementation is just another reminder to ignore |
| Mark L-1.6 completed after "reading" the skill files in knowledge base | Reading skill markdown is not the same as Skill-tool invocation. The invocation loads the skill into the session context and creates the explicit decision record. Read ≠ invoke |
| Inline L-5 / H-9 steps instead of invoking finish-flow | Always invoke finish-flow; inlining defeats the TaskCreate forcing mechanism |
| Mark parent L-5 / H-9 completed while finish-flow sub-tasks still pending | Parent only completes after all sub-tasks reach completed |
| Batch multiple finish-flow sub-tasks into one TaskCreate call | Each sub-task is its own TaskCreate — batching breaks the surface-per-tool-use mechanism |
| Enumerate L-size phases before running the L-1.5 Scope Completeness Audit | Scope audit determines WHICH phases should exist — it runs first |
| Skip the scope audit "because the task is obvious" | Invisible scope holes are the whole reason the audit exists; shipping an incomplete deliverable is always cheaper to prevent than to fix |
- fix/ branch created, root cause confirmed
- blockedBy=[L-1.6]
User may request skipping process steps. When overridden:
- [OVERRIDE: skipped {step}] in commit message
Cannot be overridden (explain why and suggest alternatives):