From citadel
Metric-driven optimization loop in isolated worktrees: proposes changes, measures with a scalar metric command, keeps improvements, discards failures. Stops early on convergence or diminishing returns.
Slash command
/citadel:experiment
The user provides three things:
Example metric command: npm run build 2>&1 | tail -1 | grep -oP '\d+'
If any input is missing, ask for it. The metric MUST output a single number to stdout.
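The "single number on stdout" rule can be checked before any iteration starts. The following is a minimal sketch, not part of the skill itself; the function name `parse_metric` and the accepted number syntax are assumptions made for illustration.

```python
import re

# Hypothetical helper: accepts optional sign, decimals, and exponent notation.
_NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def parse_metric(stdout: str) -> float:
    """Return the metric value if stdout holds exactly one number, else raise."""
    text = stdout.strip()
    if not _NUMBER.fullmatch(text):
        # Empty output, extra words, or several lines all count as a metric failure.
        raise ValueError(f"metric must print a single number, got: {text!r}")
    return float(text)
```

Output such as `12 passing` or an empty string is rejected, which is the point at which the user would be asked for a different command.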
Baseline: {value} ({metric command})
For each iteration (up to budget):
isolation: "worktree")
node scripts/run-with-timeout.js 300)
Iteration {N}: {value} ({delta from baseline}) → {KEEP|DISCARD}
Change: {one-line description of what was tried}
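The keep-or-discard decision and the log line above might be sketched as follows. This is an illustration under stated assumptions: the source does not say whether a candidate is compared against the baseline or against the best value kept so far, so the sketch assumes the latter, and the names `is_improvement` and `iteration_log` are hypothetical.

```python
def is_improvement(value: float, best: float, direction: str) -> bool:
    """True if value beats the best kept value for the given direction."""
    if direction == "lower":
        return value < best
    if direction == "higher":
        return value > best
    raise ValueError(f"direction must be 'lower' or 'higher', got {direction!r}")


def iteration_log(n: int, value: float, baseline: float, best: float,
                  direction: str, change: str) -> str:
    """Format one iteration in the two-line log shape used by the skill."""
    # Assumption: the verdict compares against the best kept value so far,
    # while the reported delta is always relative to the baseline.
    verdict = "KEEP" if is_improvement(value, best, direction) else "DISCARD"
    delta = value - baseline
    return f"Iteration {n}: {value:g} ({delta:+g}) → {verdict}\nChange: {change}"
```

A tie counts as no improvement here, so an equal value is discarded; that choice is also an assumption.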
After each iteration, check:
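The report template lists three stop reasons: convergence, diminishing, and budget. The exact criteria are not given in this text, so the sketch below uses hypothetical rules: `patience` consecutive discards count as convergence, and `patience` consecutive kept gains below `min_gain` count as diminishing returns. Both parameters and their defaults are assumptions.

```python
from typing import Optional, Sequence


def stop_reason(verdicts: Sequence[str], gains: Sequence[float], budget: int,
                patience: int = 3, min_gain: float = 0.01) -> Optional[str]:
    """Return 'convergence', 'diminishing', 'budget', or None to keep going.

    verdicts: "KEEP"/"DISCARD" for every iteration run so far.
    gains:    relative improvement (as a fraction) of each kept iteration.
    """
    recent = list(verdicts[-patience:])
    # Hypothetical rule: nothing kept in the last `patience` iterations.
    if len(recent) == patience and all(v == "DISCARD" for v in recent):
        return "convergence"
    # Hypothetical rule: the last `patience` kept changes each gained very little.
    if len(gains) >= patience and all(g < min_gain for g in gains[-patience:]):
        return "diminishing"
    if len(verdicts) >= budget:
        return "budget"
    return None
```

The budget check comes last so that a more specific reason is reported when two conditions hold at once.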
Write results to .planning/research/experiment-{slug}.md:
# Experiment: {Description}
> Metric: `{command}`
> Direction: {lower|higher} is better
> Scope: {glob pattern}
> Budget: {N iterations}
> Date: {ISO date}
## Results
| Iteration | Value | Delta | Verdict | Change |
|-----------|-------|-------|---------|--------|
| baseline | {N} | — | — | — |
| 1 | {N} | {+/-} | KEEP | {desc} |
| 2 | {N} | {+/-} | DISCARD | {desc} |
## Outcome
- **Start**: {baseline}
- **End**: {final value}
- **Improvement**: {percentage}
- **Iterations**: {kept}/{total}
- **Stop reason**: {convergence|diminishing|budget}
## Kept Changes
{List of changes that were kept, with commit hashes}
Also log to .planning/telemetry/agent-runs.jsonl:
{"event":"experiment-complete","slug":"{slug}","baseline":0,"final":0,"improvement":"0%","kept":0,"total":0,"timestamp":"ISO"}
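One way to produce that telemetry line is sketched below. The field names come from the template above. The helper names, the one-decimal rounding of the percentage, and the UTC timestamp are assumptions.

```python
import json
from datetime import datetime, timezone


def improvement_percent(baseline: float, final: float, direction: str) -> float:
    """Percentage gain relative to the baseline; positive means better."""
    if baseline == 0:
        return 0.0  # assumption: avoid division by zero by reporting no change
    change = (baseline - final) if direction == "lower" else (final - baseline)
    return 100.0 * change / abs(baseline)


def telemetry_line(slug: str, baseline: float, final: float, direction: str,
                   kept: int, total: int) -> str:
    """Build one compact JSON line for .planning/telemetry/agent-runs.jsonl."""
    record = {
        "event": "experiment-complete",
        "slug": slug,
        "baseline": baseline,
        "final": final,
        "improvement": f"{improvement_percent(baseline, final, direction):.1f}%",
        "kept": kept,
        "total": total,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(record, separators=(",", ":"))


def append_jsonl(path: str, line: str) -> None:
    """Append a single record; JSONL expects exactly one JSON object per line."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(line + "\n")
```

Opening the file in append mode keeps earlier runs intact, which matters because the log is shared across agent runs.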
| Goal | Metric Command |
|---|---|
| Reduce bundle size | `npm run build 2>&1 \| grep -oP 'Total size: \K\d+'` |
| Reduce type errors | `npx tsc --noEmit 2>&1 \| grep -c 'error TS'` |
| Increase test pass rate | `npm test 2>&1 \| grep -oP '\d+(?= passing)'` |
| Reduce file count | `find src -name '*.ts' \| wc -l` |
| Reduce line count | `wc -l src/**/*.ts \| tail -1 \| awk '{print $1}'` |
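Commands like those in the table can be run and measured with a small wrapper. This is a sketch using Python's standard `subprocess` module rather than the skill's own `run-with-timeout.js` script. The function name `run_metric` is hypothetical, and the 300-second default assumes the `300` seen earlier is a timeout in seconds.

```python
import subprocess


def run_metric(command: str, timeout_s: float = 300.0) -> float:
    """Run a metric command through the shell and return its numeric output."""
    try:
        proc = subprocess.run(
            command,
            shell=True,           # the metric commands rely on pipes and redirects
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(f"metric command timed out after {timeout_s}s") from exc
    out = proc.stdout.strip()
    try:
        return float(out)
    except ValueError as exc:
        # Empty or non-numeric output is treated as a metric failure.
        raise RuntimeError(f"metric command did not print a single number: {out!r}") from exc
```

The exit code is deliberately ignored, since `grep -c` exits non-zero when it counts zero matches even though `0` is a valid metric value. With `shell=True` the command runs under the system default shell, so commands that depend on bash-specific features such as `**` globbing may need an explicit bash invocation.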
Disclosure: "Running experiment loop on [target] with fitness: [function]. Each iteration commits. Budget: [N iterations]."
Reversibility: amber — modifies source files across iterations; each iteration is committed; undo with git revert on kept commits.
Trust gates:
.planning/research/experiment-{slug}.md with all iteration rows filled
Metric command outputs nothing or non-numeric text: Treat as a metric failure. Ask the user to provide a command that outputs a single number to stdout before starting iterations.
No worktree support (e.g., shallow clone): Fall back to branch isolation. Create a branch, run changes there, measure, then delete or merge the branch. Never modify the working tree directly.
If .planning/research/ does not exist: Create it before writing the experiment report. If .planning/ itself doesn't exist, create the full path or output the report inline.
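Creating the missing directories before writing the report can be done in one call. The sketch below assumes the report body has already been rendered; the function name `write_report` and its `root` parameter are hypothetical.

```python
import tempfile
from pathlib import Path


def write_report(root: Path, slug: str, body: str) -> Path:
    """Write the experiment report, creating .planning/research/ if needed."""
    report = root / ".planning" / "research" / f"experiment-{slug}.md"
    # parents=True also creates .planning/ itself; exist_ok=True makes reruns safe.
    report.parent.mkdir(parents=True, exist_ok=True)
    report.write_text(body, encoding="utf-8")
    return report


if __name__ == "__main__":
    # Demo in a throwaway directory so no real repository is touched.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_report(Path(tmp), "bundle-size", "# Experiment: demo\n")
        print(path.relative_to(tmp))
```

If the directory cannot be created at all (for example on a read-only checkout), the fallback described above still applies: output the report inline.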
Budget exhausted with zero kept iterations: Report outcome as "no improvement found". This is a valid result — do not continue past the budget.
---HANDOFF---
- Experiment: {description}
- Result: {baseline} → {final} ({improvement}%)
- Kept: {N}/{total} iterations
- Stop reason: {reason}
- Report: .planning/research/experiment-{slug}.md
- Reversibility: amber — undo kept iterations with `git revert` on each kept commit
---
npx claudepluginhub sethgammon/citadel --plugin citadel