From autodialectics
Run benchmark suites and manage policy evolution: create challengers, compare them against champions, and promote or roll back policies.
Slash command
/autodialectics:benchmark
Use this skill to benchmark policies and drive champion/challenger evolution.
autodialectics-mcp must be on PATH (install it with pip install autodialectics).
benchmark(suite_dir?, policy_id?): run the benchmark suite against a policy. Returns case-by-case results with scores and decisions.
evolve_policy(use_gepa?): analyze recent benchmark reports and create a challenger policy. Set use_gepa: false to skip the GEPA optimizer (simpler heuristic fallback).
promote_policy(policy_id): promote a challenger to champion if comparison rules allow.
rollback_policy(): revert to the previous champion if the current one regresses.
autodialectics benchmark
autodialectics evolve
autodialectics promote <policy_id>
autodialectics rollback
benchmark → evolve_policy → benchmark (with challenger) → compare → promote or rollback
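The cycle above can be sketched in Python. This is a minimal illustration under stated assumptions, not the plugin's implementation: the four callables are stand-ins for the MCP tools of the same names, the `score` and `policy_id` result keys are hypothetical (the source does not document return shapes), and the "strictly higher score wins" comparison is a placeholder for whatever comparison rules the plugin actually applies.

```python
from typing import Callable, Optional


def run_evolution_cycle(
    benchmark: Callable[..., dict],
    evolve_policy: Callable[..., dict],
    promote_policy: Callable[[str], dict],
    rollback_policy: Callable[[], dict],
    suite_dir: Optional[str] = None,
    champion_id: Optional[str] = None,
) -> str:
    """Run one benchmark -> evolve -> benchmark -> compare -> promote/rollback pass.

    Every callable is a stand-in for the MCP tool of the same name.
    The "score" and "policy_id" keys are assumed, not documented.
    """
    # 1. Benchmark the current champion to get a baseline.
    baseline = benchmark(suite_dir=suite_dir, policy_id=champion_id)

    # 2. Create a challenger from recent benchmark reports.
    challenger_id = evolve_policy()["policy_id"]

    # 3. Benchmark the challenger on the same suite.
    candidate = benchmark(suite_dir=suite_dir, policy_id=challenger_id)

    # 4. Compare (hypothetical rule: challenger must score strictly higher).
    if candidate["score"] <= baseline["score"]:
        return "kept_champion"
    promote_policy(challenger_id)

    # 5. Re-check the promoted policy; roll back if it regresses below baseline.
    confirmation = benchmark(suite_dir=suite_dir, policy_id=challenger_id)
    if confirmation["score"] < baseline["score"]:
        rollback_policy()
        return "rolled_back"
    return "promoted"
```

Passing the tools in as callables keeps the sketch independent of how they are actually invoked, whether through an MCP client or by wrapping the CLI commands listed earlier.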
If evolve_policy returns no_reports, run benchmarks first to generate data.
If the user passes a suite directory after /autodialectics:benchmark, use it as the benchmark suite path.
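The two rules above can be illustrated with a short Python sketch. It rests on assumptions: `parse_suite_dir` and `evolve_with_fallback` are hypothetical helper names, the callables stand in for the MCP tools, and the idea that `no_reports` arrives in a `status` field is a guess at the result shape, which the source does not specify.

```python
from typing import Callable, Optional


def parse_suite_dir(command_args: str) -> Optional[str]:
    """Return the text typed after /autodialectics:benchmark, or None if empty."""
    path = command_args.strip()
    return path or None


def evolve_with_fallback(
    benchmark: Callable[..., dict],
    evolve_policy: Callable[..., dict],
    suite_dir: Optional[str] = None,
) -> dict:
    """Call evolve_policy; if there are no reports yet, benchmark and retry once."""
    result = evolve_policy()
    # Assumed shape: the tool signals missing data via status == "no_reports".
    if result.get("status") == "no_reports":
        # Generate benchmark data first, using the user's suite path if given.
        benchmark(suite_dir=suite_dir)
        result = evolve_policy()
    return result
```

The retry is deliberately limited to one attempt so that a suite that produces no usable reports does not loop forever.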
npx claudepluginhub hmbown/plugins --plugin autodialectics