Explain the autoresearch methodology — a verifiable autonomous experiment loop — and evaluate whether the current repo is a good fit before running the autoresearch-verify and autoresearch-program skills. Use when a user wants to add autonomous experimentation to a project, asks about autoresearch/autoresearch, or is about to invoke the other autoresearch skills.
How this skill is triggered — by the user, by Claude, or both
Slash command
/will-wright-eng-skills:autoresearch-methodThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
autoresearch adds a verifiable autonomous experiment loop to a git repository:
autoresearch adds a verifiable autonomous experiment loop to a git repository:
agent edits bounded scope -> verifier runs -> result is scored -> commit is kept or reverted
The loop runs unattended. Every candidate is committed, scored against a fixed compute budget, and kept only if it strictly beats the current best on a single primary metric. Failures and regressions are reset to the parent commit. The methodology generalizes the pattern from karpathy/autoresearch.
A repo is a good candidate when all of the following hold:
If any condition is missing, surface that to the user before running the verify or program skills. Do not paper over a missing verifier or a fuzzy metric — the methodology is only as strong as those two pieces.
The skills run in this order:
program.md at the repo root with mutable/immutable scope baked in via light templating. After this skill runs, hand program.md to a fresh agent session; the skills are no longer in the picture.Each skill assumes the previous step is complete. Do not run autoresearch-program before autoresearch-verify — there is nothing to optimize against yet. Do not skip autoresearch-method if you have not first checked the repo fits the method.
Nothing on disk. This skill is informational. Its job is to:
autoresearch-verify, or surface the gap that blocks the method.npx claudepluginhub will-wright-eng/skillsCreates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.