From agent-harness
Turn any domain folder of skills into a bounded agentic loop: compile a goal into a verifiable task plan, execute tasks with the domain's own tools, verify every task with machine-run checks, retry with caps, escalate to a human when budgets exhaust, and refuse to close until everything is verified or explicitly waived. Use when you want an agent or subagent to pick up a goal and drive it to a verified close across one of this repo's 18 domains ('run this goal through the engineering harness', 'set up an agentic loop for marketing work', 'make the finance domain self-verifying'). NOT for authoring Claude Code Workflow-tool .js scripts (workflow-builder), N-agent tournaments on one task (agenthub), single-file metric optimization (autoresearch-agent), or discovering published loop recipes (loop-library).
Slash command: /agent-harness:agent-harness
- assets/harness_manifest.schema.json
- assets/harnesses/business-growth.json
- assets/harnesses/business-operations.json
- assets/harnesses/c-level-advisor.json
- assets/harnesses/commercial.json
- assets/harnesses/compliance-os.json
- assets/harnesses/engineering-team.json
- assets/harnesses/engineering.json
- assets/harnesses/finance.json
- assets/harnesses/loop-library.json
- assets/harnesses/markdown-html.json
- assets/harnesses/marketing-skill.json
- assets/harnesses/marketing.json
- assets/harnesses/product-team.json
- assets/harnesses/productivity.json
- assets/harnesses/project-management.json
- assets/harnesses/ra-qm-team.json
- assets/harnesses/research-ops.json
- assets/harnesses/research.json
- references/agentic_loop_canon.md

You are a harness operator, not a hero. The loop, not your optimism, decides when work is done. Your job: compile the goal into tasks with checks, execute one task at a time, let the controller adjudicate verification, and stop when the state machine says stop.
GOAL → goal_compiler → PLAN → loop_controller: [execute → verify]* → CLOSE
↑______retry (≤ max_attempts, changed approach)
└── ESCALATE on exhausted budgets — never fake success
Three layers, all JSON: a committed per-domain manifest (what skills/tools/checks exist), a per-goal plan (which tasks, which verifications, what "done" means), and a per-run state file (the single source of truth; a fresh session resumes from it alone).
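To make the plan and state layers concrete, here is a minimal Python sketch of how a next directive could be derived from a persisted state file alone. Every field name (`tasks`, `status`, `attempts`, `max_attempts_per_task` as a state key) and the `next_directive` helper are assumptions for illustration; the real shapes are defined by assets/harness_manifest.schema.json and the scripts, and the manifest layer is omitted here.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical plan shape: the real plan is produced by goal_compiler.py.
plan = {
    "goal": "design an SLO with an error budget",
    "max_attempts_per_task": 3,
    "tasks": [
        {"id": "T1", "check": "test -f slo.md"},
        {"id": "T2", "check": "test -f budget.md"},
    ],
}


def new_state(plan):
    """Build the per-run state: everything a fresh session needs to resume."""
    return {
        "max_attempts_per_task": plan["max_attempts_per_task"],
        "tasks": {t["id"]: {"status": "pending", "attempts": 0} for t in plan["tasks"]},
    }


def next_directive(state):
    """Derive the next action from the state alone (no in-memory context)."""
    for task_id, task in state["tasks"].items():
        if task["status"] in ("verified", "waived"):
            continue
        # attempts counts failed verifications; an exhausted budget escalates.
        if task["attempts"] >= state["max_attempts_per_task"]:
            return {"action": "escalate", "task": task_id}
        if task["status"] == "executed":
            return {"action": "verify", "task": task_id}
        return {"action": "execute", "task": task_id}
    return {"action": "close"}


state = new_state(plan)
state["tasks"]["T1"].update(status="verified", attempts=1)
state["tasks"]["T2"].update(status="pending", attempts=3)  # three failed verifies

# Simulate a fresh session: persist the state, reload it, and ask what is next.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "state.json"
    path.write_text(json.dumps(state, indent=2))
    resumed = json.loads(path.read_text())
    directive = next_directive(resumed)

print(directive)  # {'action': 'escalate', 'task': 'T2'}
```

Because the directive is a pure function of the reloaded file, nothing about the previous session needs to survive except the JSON on disk.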
# 0. Pick the domain manifest (18 committed under assets/harnesses/, e.g. engineering-team.json)
ls assets/harnesses/
# 1. Compile the goal (refuses vague goals with exit 3 + forcing questions)
python3 scripts/goal_compiler.py \
  --goal "audit the payments service and design an SLO with an error budget" \
  --manifest assets/harnesses/engineering.json --out plan.json
# 2. Initialize the loop state
python3 scripts/loop_controller.py init --plan plan.json --state .agent-harness/state.json
# 3. Drive the loop — repeat until directive is "close" or "escalate"
python3 scripts/loop_controller.py next --state .agent-harness/state.json
# → {"action": "execute", "task": "T1", ...}: open the task's skill (SKILL.md at
# skill_path), do the work with its tools, then:
python3 scripts/loop_controller.py record --state .agent-harness/state.json \
  --task T1 --phase execute --exit-code 0
# → the controller runs the task's checks ITSELF (subprocess, timeout, evidence log):
python3 scripts/loop_controller.py verify --state .agent-harness/state.json --task T1 --cwd <repo-root>
# 4. Close — refused (exit 4) while any task is unverified and unwaived
python3 scripts/loop_controller.py close --state .agent-harness/state.json
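Step 3 can also be scripted. The following is a sketch of a driver, not part of the skill: it assumes that `next` prints a single JSON object on stdout and that any nonzero exit from `next` means the loop should stop. The `drive` function, the `do_task` callback, and the injectable `run` parameter are hypothetical names introduced here; only the subcommands and flags come from the commands above.

```python
import json
import subprocess

CONTROLLER = ["python3", "scripts/loop_controller.py"]
STATE = ".agent-harness/state.json"


def run_controller(args):
    """Run one controller subcommand and return (exit_code, stdout)."""
    proc = subprocess.run(CONTROLLER + args, capture_output=True, text=True)
    return proc.returncode, proc.stdout


def drive(do_task, run=run_controller, state=STATE, cwd="."):
    """Repeat next -> execute -> record -> verify until close or escalate.

    do_task(directive) is a hypothetical callback that does the work for one
    task (using the task's skill) and returns its exit code.
    """
    while True:
        code, out = run(["next", "--state", state])
        if code != 0:
            # Assumption: nonzero means escalation or an exhausted budget.
            return {"action": "stopped", "exit_code": code}
        directive = json.loads(out)
        if directive["action"] != "execute":
            # "close", "escalate", or anything unexpected goes back to the caller.
            return directive
        task = directive["task"]
        exit_code = do_task(directive)
        run(["record", "--state", state, "--task", task,
             "--phase", "execute", "--exit-code", str(exit_code)])
        # The controller runs the checks itself; the driver never marks success.
        run(["verify", "--state", state, "--task", task, "--cwd", cwd])
```

The driver deliberately stops short of calling `close`: when it returns a "close" directive, the operator still runs step 4, which the controller can refuse.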
Regenerate a manifest after skills change (diff-stable, CI-checkable):
python3 scripts/harness_manifest_builder.py --domain engineering-team \
  --repo-root <repo-root> --out-dir assets/harnesses --no-timestamp
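One way a manifest can be made diff-stable and CI-checkable is deterministic serialization. The sketch below illustrates that general technique only; it is not the builder's actual implementation, and `render_manifest`, `is_stale`, and the manifest keys are hypothetical.

```python
import json


def render_manifest(manifest: dict) -> str:
    """Serialize with sorted keys and fixed indentation so output is byte-stable."""
    return json.dumps(manifest, indent=2, sort_keys=True, ensure_ascii=False) + "\n"


def is_stale(committed_text: str, regenerated: dict) -> bool:
    """CI check: True when the committed file differs from a fresh render."""
    return committed_text != render_manifest(regenerated)


# Same content, different key order: the rendered text is identical.
a = {"domain": "engineering-team", "skills": ["a-skill", "b-skill"]}
b = {"skills": ["a-skill", "b-skill"], "domain": "engineering-team"}
committed = render_manifest(a)

print(is_stale(committed, b))                                  # False
print(is_stale(committed, {**b, "skills": ["a-skill"]}))       # True
```

An embedded generation timestamp would make every regeneration differ, so a check like this only works when the timestamp is left out.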
- verify runs the checks via subprocess; a passing record --phase verify without --evidence is rejected (exit 6). You do not get to declare a task verified.
- max_attempts_per_task → escalated (exit 2); max_loop_iterations → escalate (exit 5). Exhausted budgets are never reported as success: a human waives (close --waive T3 --reason "..."), you don't.
- Every next directive is executable by a new session reading only the plan + state files. Long-running goals: run each iteration as its own session against the durable state.
- State lives in .agent-harness/, never in .agenthub/, .autoresearch/, or docs/TC/ (those belong to sibling skills).
- verify shell-executes each task's check command; only run the harness on plan/state files you or goal_compiler.py produced, never on files from untrusted input (see references/verification_discipline.md).

| # | Question | Recommended answer | Why (canon) |
|---|---|---|---|
| 1 | What single observable outcome means DONE? | A named artifact + a command that exits 0 against it | Verifier's law: invest in verifiability first |
| 2 | Which domain harness applies? | The domain whose skills name the deliverable; if two, run two sequential loops | Orchestrator-workers: scoped objectives beat mega-goals |
| 3 | What must NOT change? | List no-touch paths; put them in the goal text so the compiler's plan inherits them | Boundaries are part of a subagent spec |
| 4 | Who reviews escalations, and how fast? | A named human; escalations block the loop by design | Approval-required is a terminal state, not a nuisance |
| 5 | What is the iteration budget? | Default 12 loop iterations / 3 attempts per task; raise only with a reason | Caps are runtime errors, not advice (OpenAI SDK max_turns) |
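Question 1 asks for "a command that exits 0" against a named artifact. As a rough illustration of what a machine-run check with a timeout and an evidence record might look like, here is a Python sketch. The `run_check` helper and the evidence field names are assumptions, and it passes an argument list rather than a shell string, which differs from the shell execution the harness describes.

```python
import subprocess
import sys


def run_check(command, timeout_s=60.0, cwd=None):
    """Run one check command and return an evidence record.

    A check passes only when the process exits 0 within the timeout.
    """
    try:
        proc = subprocess.run(
            command, cwd=cwd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return {"command": command, "passed": False, "exit_code": None,
                "timed_out": True, "output_tail": ""}
    output = proc.stdout + proc.stderr
    return {"command": command, "passed": proc.returncode == 0,
            "exit_code": proc.returncode, "timed_out": False,
            "output_tail": output[-2000:]}  # keep the log bounded


# Stand-in checks that use the current interpreter instead of real artifacts.
ok = run_check([sys.executable, "-c", "print('artifact ok')"])
bad = run_check([sys.executable, "-c", "import sys; sys.exit(1)"])
slow = run_check([sys.executable, "-c", "import time; time.sleep(5)"], timeout_s=0.2)

print(ok["passed"], bad["exit_code"], slow["timed_out"])  # True 1 True
```

A timeout is treated as a failure rather than an error, so a hung check consumes an attempt like any other failed verification.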
| Code | Tool | Meaning |
|---|---|---|
| 0 | all | OK / directive emitted |
| 2 | loop_controller | Escalation required — a human must review the evidence log |
| 3 | goal_compiler | Goal too vague — answer the forcing questions, recompile |
| 4 | goal_compiler / loop_controller | No skill matched / close refused (unverified tasks) |
| 5 | loop_controller | Global iteration cap reached |
| 6 | loop_controller | Invalid transition (recording on verified task, evidence missing, unknown task) |
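A wrapper script might translate these exit codes into readable messages. The sketch below mirrors the table above; the `HarnessExit` enum and the `describe` helper are hypothetical names, not part of the shipped scripts. Code 4 is shared, so its meaning depends on which tool returned it.

```python
from enum import IntEnum


class HarnessExit(IntEnum):
    OK = 0
    ESCALATION_REQUIRED = 2
    GOAL_TOO_VAGUE = 3
    SHARED_CODE_4 = 4  # meaning depends on the tool, see describe()
    ITERATION_CAP = 5
    INVALID_TRANSITION = 6


_MEANINGS = {
    HarnessExit.OK: "ok / directive emitted",
    HarnessExit.ESCALATION_REQUIRED: "escalation required: a human must review the evidence log",
    HarnessExit.GOAL_TOO_VAGUE: "goal too vague: answer the forcing questions, recompile",
    HarnessExit.ITERATION_CAP: "global iteration cap reached",
    HarnessExit.INVALID_TRANSITION: "invalid transition",
}


def describe(tool: str, code: int) -> str:
    """Map (tool, exit code) to the meaning listed in the exit-code table."""
    try:
        status = HarnessExit(code)
    except ValueError:
        return f"unknown exit code {code}"
    if status is HarnessExit.SHARED_CODE_4:
        if tool == "goal_compiler":
            return "no skill matched"
        return "close refused: unverified tasks remain"
    return _MEANINGS[status]


print(describe("goal_compiler", 3))    # goal too vague: answer the forcing questions, recompile
print(describe("loop_controller", 4))  # close refused: unverified tasks remain
```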
- python3 scripts/harness_manifest_builder.py --sample, scripts/goal_compiler.py --sample, and scripts/loop_controller.py --sample all exit 0.
- A vague goal (--goal "make it better") exits 3 and prints forcing questions.
- loop_controller.py close on a state with an unverified task exits 4.
- loop_controller.py --sample shows a verify failure consuming an attempt and the loop still closing only after a passing verify with evidence.

- workflow-builder: authoring .js scripts for Claude Code's Workflow tool. NOT for goal-to-close loop state (this skill).
- … verification[].

See references/domain_harness_design.md for the three-layer architecture, the reuse map, and how to raise a domain's harness quality.
npx claudepluginhub motivatedc-creator/saafy --plugin agent-harness