harness-claude

Name: harness-claude
Author: vasuag09


Lean full-SDLC Claude Code harness — Discover, Plan, Implement, Verify, Maintain — with a vague-intent discovery entry above /spec, subagent orchestration, memory persistence, a product/UX + system design altitude, benchmark-gated cost/token optimization, guarded long-running agent runs, multi-agent parallel fan-out, a bug-fix fast lane, and a release & feedback loop (deploy + observe) that closes the pipeline.

npx claudepluginhub vasuag09/harness-claude --plugin harness-claude

Popularity

Stars: above average (median 0, average 281)

Installs: median 0, average 1

What's Inside

Agents (8)

code-reviewer

/code-reviewer

Expert code-review specialist. MUST BE USED immediately after writing or modifying code, before merge. Reviews quality, correctness, and maintainability. Read-only — reports findings with severity, does not edit.

planner

/planner

Implementation-planning specialist. Use PROACTIVELY for non-trivial features, refactors, or anything spanning multiple files. Turns a spec into a phased, risk-assessed task plan. Read-only — produces a plan, does not edit code.

architect

/architect

System-design specialist. Use PROACTIVELY for architectural decisions, new subsystems, interface/data-model design, or significant refactors. Produces a design note / ADR with trade-offs. Read-only — designs, does not implement.

security-reviewer

/security-reviewer

Security vulnerability specialist. MUST BE USED for any change touching auth, user input, DB queries, file I/O, external calls, crypto, secrets, or payments — and before merging such code. Scans for OWASP Top-10 and secret leakage. Read-only — reports, does not edit.

tdd-guide

/tdd-guide

Test-driven development specialist. Use PROACTIVELY when implementing a feature or fixing a bug. Enforces tests-first (RED → GREEN → REFACTOR) and ≥80% coverage. May write tests and implementation.

Skills (33)

implement

/implement

Execute a plan via test-driven development — write tests first, then minimal code, phase by phase. Use after /plan (and /architect if needed) to write the actual feature. Delegates to the harness-claude:tdd-guide agent.

architect

/architect

Make and record an architecture/design decision for non-trivial work — new subsystems, interfaces, data models, or significant refactors. Use when /plan flags a load-bearing decision. Delegates to the harness-claude:architect agent.

benchmark

/benchmark

Measure whether a harness component (a skill, agent, or rule) actually earns its keep — run the same task k times with and without it, in isolated git worktrees, and report pass@k (works once) and pass^k (works consistently). Opt-in; not wired into the default pipeline. Use to decide keep-vs-cut on evidence, or to fill the R5 benchmark for a /extract proposal.

build-fix

/build-fix

Resolve build failures and type/compile errors fast, with minimal diffs. Use when a build or typecheck fails during implementation. Delegates to the harness-claude:build-error-resolver agent.

deploy

/deploy

Release step that closes the loop after /ship — drive a change out to a running environment using the PROJECT'S OWN deploy mechanism (detect, never prescribe a stack), then smoke-test the deployed artifact and keep a guarded rollback path. Deploy is more outward-facing than git push, so it is arm-to-fire — plan and pre-check, then HALT and require an explicit `arm deploy` before any real outward action. Opt-in; invoked explicitly. Use after /ship to push a verified change to staging/prod under the harness's boundary discipline.
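The arm-to-fire behaviour described for /deploy (plan and pre-check, halt, then act only after an explicit `arm deploy`) can be sketched as a small gate. This is an illustration only, not the plugin's implementation: the skill itself is presumably a prompt rather than code, and `DeployGate`, `prechecks`, and the `deploy` callable are hypothetical names introduced here. The only detail taken from the listing is the arming phrase.

```python
from dataclasses import dataclass, field
from typing import Callable, List

ARM_PHRASE = "arm deploy"  # the explicit phrase named in the /deploy description


@dataclass
class DeployGate:
    """Hypothetical arm-to-fire gate: pre-check first, fire only once armed."""
    prechecks: List[Callable[[], bool]]
    armed: bool = False
    log: List[str] = field(default_factory=list)

    def plan(self) -> bool:
        """Run every pre-check; nothing outward-facing happens here."""
        ok = all(check() for check in self.prechecks)
        self.log.append("prechecks passed" if ok else "prechecks failed")
        return ok

    def arm(self, user_input: str) -> None:
        """Arm only on the exact phrase; anything else leaves the gate halted."""
        self.armed = user_input.strip() == ARM_PHRASE

    def fire(self, deploy: Callable[[], None]) -> bool:
        """Call the project's own deploy step only if checks pass and the gate is armed."""
        if not self.plan():
            return False
        if not self.armed:
            self.log.append("halted: waiting for explicit arm")
            return False
        deploy()
        self.armed = False  # one arm authorises one deploy
        self.log.append("deployed")
        return True
```

A caller would pass its own deploy function to `fire`. Until `arm("arm deploy")` has been called, `fire` returns `False` and the deploy function is never invoked, which is the property the description asks for.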

Hooks (1)

15 hooks across 5 events

MCP Servers (3)

magic

memory

sequential-thinking

Stats

Version: 0.14.0

Language: Shell

Stars: 2

Maintenance: Excellent

License: MIT

Last Commit: Jun 23, 2026

Added: Jun 11, 2026


Safety Signals

Critical

Matches all tools

Hooks run on every tool call, not just specific ones

Caution

Executes bash commands

Hook triggers when Bash tool is used

README

harness-claude

A lean, full-SDLC harness for Claude Code that benchmark-gates its own features — and kills the ones that don't earn their keep.

This is the first time I'm sharing this publicly. I built it privately, iterating toward v0.8.0 (an internal version count, not a prior release history), to turn Claude Code from a powerful-but-undisciplined chat loop into a real Plan → Implement → Verify → Maintain pipeline — scoped subagents, test-first gates, cross-session memory, reversible hooks.

The part I'm most proud of isn't a feature — it's the discipline behind one of them. Every cost optimization I tried had to beat a bare baseline on cost-per-successful-task, or I killed it and wrote down why. I ran 4 experiments. I shipped 2. See what I killed and why ↓

Stack focus: TypeScript/JS (React, Next, Vercel) and Python. MIT-licensed, plugin-installable, ~0-dependency hooks (plain Node).

PLAN ─────────────► IMPLEMENT ─────► VERIFY ─────────────► MAINTAIN
/spec  /research          /implement     /review  /security-review   /refactor-clean
/plan  /architect         /build-fix     /test    /verify  /ship      /onboard

claude plugin marketplace add vasuag09/harness-claude
/plugin            # find "harness-claude", enable it
/harness           # run the full pipeline on a real task

Built to evolve toward eval loops, retrieval, long-running agents, multi-agent orchestration, and computer-use (see ROADMAP.md).

What I killed (and why)

Every optimization here had to clear one gate: does it beat a bare baseline on cost-per-successful-task, at held consistency (pass^k)? Not "does it work" — "is it worth it." Two of four didn't clear it.

#	Optimization	Result	Why
1	Input-compression proxy (wire-level, compresses what's fed to the model)	🔴 Killed	Broke Claude Code's cache economics — compressing the cached prefix invalidated the cache that made it cheap, so it cost more
2	A code-graph MCP (a popular ~51k★ structural-index server)	🔴 Killed	Its fixed per-session context tax was bigger than the entire task cost on a normal-sized repo (might win on very large repos — out of scope here)
3	Generation reduction — the `/lazy` "build the minimum that actually works" reflex	🟢 Shipped, always-on	−35% generated output tokens, −23% LOC, at held accuracy (k=3, Opus/Sonnet, measured against the harness's own existing YAGNI rules)
4	Structural orientation hook (lightweight local code-index)	🟢 Shipped, always-on, with a caveat	A large-repo bet, not yet benchmarked at scale — does no harm on small repos, but isn't a proven win there either
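The gate applied to each row above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the harness's benchmark code: the `Run` record and every function name are hypothetical, cost-per-successful-task is assumed to mean total spend (failed runs included) divided by the number of successes, and pass@k / pass^k are taken as "at least one of k runs passed" and "all k runs passed".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Run:
    cost_usd: float  # total spend for one attempt at the task
    success: bool    # did the attempt pass its checks?


def pass_at_k(runs: List[Run]) -> bool:
    """pass@k: at least one of the k runs succeeded (works once)."""
    return any(r.success for r in runs)


def pass_hat_k(runs: List[Run]) -> bool:
    """pass^k: every one of the k runs succeeded (works consistently)."""
    return bool(runs) and all(r.success for r in runs)


def cost_per_successful_task(runs: List[Run]) -> float:
    """Total spend, failures included, divided by the number of successes."""
    successes = sum(r.success for r in runs)
    if successes == 0:
        return float("inf")  # nothing succeeded, so no finite cost per success
    return sum(r.cost_usd for r in runs) / successes


def clears_gate(baseline: List[Run], candidate: List[Run]) -> bool:
    """Keep the candidate only if it is cheaper per success without losing consistency."""
    held = pass_hat_k(candidate) >= pass_hat_k(baseline)
    cheaper = cost_per_successful_task(candidate) < cost_per_successful_task(baseline)
    return held and cheaper
```

Counting failed runs in the numerator is a deliberate choice in this sketch: an optimization that is cheap per run but fails often still pays for its failures, so it cannot clear the gate on low per-run cost alone.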

One honest caveat on #3: the dollar savings on a small task were only ~−3% (within noise) — cost here is dominated by cached input, and the output cut only moves real dollars on output-heavy work. The win is real, just not a headline multiplier.
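The gap between a 35% output-token cut and a roughly 3% dollar saving follows from simple arithmetic if only output spend shrinks and everything else stays fixed. The sketch below works backwards from the two figures in the caveat. Both function names are illustrative, the linear cost model is an assumption, and because the ~3% figure is described as within noise, the implied share is a rough indication rather than a measurement.

```python
def total_savings(output_share: float, output_cut: float) -> float:
    """Fractional drop in total cost when only output spend shrinks by output_cut."""
    return output_share * output_cut


def implied_output_share(total_cut: float, output_cut: float) -> float:
    """Output share of total spend that would turn output_cut into total_cut."""
    return total_cut / output_cut


# Figures from the caveat: about -35% output tokens, about -3% dollars.
share = implied_output_share(0.03, 0.35)
print(f"implied output share of spend: {share:.1%}")  # prints 8.6%

# Hypothetical output-heavy task where half the spend is output.
print(f"savings at 50% output share: {total_savings(0.5, 0.35):.1%}")  # prints 17.5%
```

Under these assumptions output would account for under a tenth of spend on the small task, which is consistent with the statement that cached input dominates cost there.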

Documentation

Guide	Read it for
docs/USAGE.md	Install → full pipeline → reuse-an-existing-codebase → common scenarios
docs/SKILLS.md	Every command + agent, what each does, what it delegates to
docs/HOOKS.md	Exactly what runs on your machine, and how to disable any hook
docs/TROUBLESHOOTING.md	Hooks not firing, slow `tsc`, MCP, git boundary — fixes
CONTRIBUTING.md	Adding skills/hooks, portability rules, PR checklist
ROADMAP.md	Where this is going (eval loops → retrieval → multi-agent → computer-use)

New here? Start with docs/USAGE.md.

Why it's structured as a plugin

It lives in its own repo and installs as a Claude Code plugin/marketplace, so you can:

Test in isolation — enable it via /plugin, run a real task, judge the result. Toggle it off to fall back. It never overwrites your existing ~/.claude.
Promote when proven — once it gives production-grade results, make it your default.
Share later — a plugin repo is already the shareable format; push to GitHub.
