Enforce a specification-driven development loop where all code, tests, and fixes are traced back to a single SPEC.md file, with automated drift detection, adversarial spec review, and root-cause backpropagation from bugs into the spec.
Plan-then-execute against SPEC.md. Native Claude Code loop, no sub-agents.
Drift detector. Diff SPEC.md against code. Read-only, zero writes.
Spare-budget design pass. Make one shallow module deep — smaller interface, behavior held, tests green before & after.
Sharpen a fuzzy idea into §G/§C before spec. Calibrated interrogation, one question at a time.
Gather external knowledge into §R so build grounds in facts, not hallucinations. Every finding cites a source.
Bug → spec protocol. When a bug is found or a test fails, trace the cause, decide whether a new §V invariant would catch recurrence, append to §B. This is the one non-obvious thing SDD does that plan-then-execute doesn't. Triggers on test failure, bug report, post-mortem, or explicit user ask.
Plan-then-execute implementation against SPEC.md. Native single-thread loop, no sub-agents. On test or build failure, auto-invokes the backprop skill before retrying — a failed verification always considers whether a new §V invariant would prevent recurrence. Triggers when the user asks to build, implement, execute the spec, or tackle a specific §T task (`build §T.3`, `build --next`, `implement next task`, `run the build`). Expects SPEC.md to exist; if not, defers to the spec skill.
Caveman encoding for SPEC.md and spec-adjacent writes. Loaded by /spec, /build, /check. Cuts tokens ~75% vs prose while staying precise. Triggers on any write to SPEC.md or when user says "caveman", "compress this", "be brief".
Read-only drift detector. Diffs SPEC.md against current code and reports violations grouped by severity. Writes nothing — suggests remedies via the spec or build skills but never invokes them. Triggers when the user asks to check drift, audit the spec, verify invariants, or ask whether code still matches the spec. Phrasings: "check drift", "audit the spec", "does the code still match §V", "check invariants", "spec vs code".
Optional design-improvement pass for when you have spare usage to drain. Finds the shallowest modules in the code the spec touches, researches a deeper design, and proposes refactors that shrink interfaces and hide decisions — behavior held constant, tests green before and after. Proposes §I/§V/§T edits, never silent rewrites. Triggers when the user says "deepen this", "improve the design", "this module feels shallow", "pull complexity down", "use spare budget on the codebase", or invokes /ck:deepen. Leans on the codebase-design skill's deep-module vocabulary when present.
Own this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimOwn this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimBased on adoption, maintenance, documentation, and repository signals. Not a security audit or endorsement.
compressed spec-driven development for claude code
one file · one loop · zero sub-agents
Plan-then-execute forgets. SDD remembers — but most SDD frameworks bury that value under agent swarms, dashboards, and ceremony that costs more tokens than it saves.
Cavekit is the simplest full loop: grill → spec → research → review →
build, over one SPEC.md file, no sub-agents. Three commands you run
every time; four more you reach for only when the change earns it.
The spine is three properties that earn their tokens:
SPEC.md at repo root survives context resets. It is
the agent's long-term memory: lose the window, reload the spec, keep going.§B entry; classes
of bug become §V invariants the spec never forgets.And one rule that keeps it from bloating into the frameworks it replaces:
right-size. A one-line fix is just /build. The full chain is for
genuinely uncertain or high-blast-radius work — never for a typo.
the loop — run these every time:
| cmd | job |
|---|---|
/ck:spec | create / amend / backprop SPEC.md. Sole mutator. |
/ck:build | native plan → execute against spec. Names which test proves each §V. Auto-backprops on failure. |
/ck:check | read-only drift report. Lists §V / §I / §T violations. The drift detector. |
reach for these — only when the change earns the ceremony:
| cmd | job |
|---|---|
/ck:grill | interrogate a fuzzy idea into a sharp §G/§C, one question at a time, before you spec. |
/ck:research | gather external knowledge into §R so build grounds in facts, not hallucinations. Every finding cites a source. |
/ck:review | adversarial senior review of the spec before build. Refutes, hardens §V, ends in a go/no-go gate. |
/ck:deepen | spare-budget design pass — make one shallow module deep. Behavior held, tests green before & after. |
One line, via the skills CLI:
npx skills add JuliusBrussee/cavekit
Installs nine skills into ~/.claude/skills/: spec, build, check
(the loop), grill, research, review, deepen (reach-for), plus
caveman and backprop (the utilities). Claude activates each when its
trigger context matches — e.g. "write a spec for…" invokes spec, a fuzzy
idea invokes grill, a risky change before build invokes review. Claude
Code picks them up on next launch.
Or via the Claude Code marketplace (also adds the /ck:spec, /ck:build,
/ck:check, /ck:grill, /ck:research, /ck:review, /ck:deepen slash
commands):
/plugin marketplace add juliusbrussee/cavekit
/plugin install ck@cavekit
Or clone directly:
git clone https://github.com/juliusbrussee/cavekit.git ~/.claude/plugins/cavekit
See FORMAT.md. Sections: §G goal, §C constraints, §I
interfaces, §R research (optional, pipe table), §V invariants, §T tasks
(pipe table), §B bugs (pipe table). Each verb owns specific sections —
no verb rewrites a section it does not own.
FORMAT.md spec schema + caveman encoding + sectioned ownership
commands/ seven thin slash-command entry points → the skills (loop + reach-for)
skills/spec spec mutator — sole writer
skills/build plan-execute, verification contract
skills/check drift report
skills/grill sharpen a fuzzy idea → §G/§C before spec
skills/research external knowledge → §R, every finding sourced
skills/review adversarial senior review of the spec → hardens §V
skills/deepen spare-budget design pass — make one module deep
skills/caveman encoding utility
skills/backprop bug → spec protocol (six steps)
cat SPEC.md is the dashboard.The previous generation is not deprecated — it is frozen at tag
v3.1.0 and
remains a fully working plugin.
What it is:
Ultra-compressed communication mode. Cuts ~75% of tokens while keeping full technical accuracy by speaking like a caveman.
npx claudepluginhub juliusbrussee/cavekit --plugin ckSpecification-Driven Development with Process Discipline for Claude Code
GitHub Spec-Kit integration for constitution-based spec-driven development (7-phase workflow)
GitHub Spec-Kit integration for Specification-Driven Development - define WHAT and HOW before coding
Spec-driven development with task-by-task execution. Research, requirements, design, tasks, autonomous implementation, and epic triage for multi-spec feature decomposition.
Skills-first specification-driven development framework with 7 agent skills for planning, implementation, review, and shipping. Natural language activation with intelligent agent orchestration. Includes /plan, /implement, /research commands plus managing-specifications, implementing-features, and reviewing-and-shipping skills.
Spec-driven development plugin for Claude Code. Markdown specs as the source of truth, code downstream.