From edpa
Auto-calibrate EDPA CW signal weights using the Monte Carlo + coordinate-descent optimizer (v1.11+). One target file (cw_heuristics.yaml.tmpl), one metric (MAD on a synthetic corpus), two phases (random sample → coordinate descent). Use when: user says "calibrate CW", "auto-calibrate", "optimize heuristics", "recalibrate signals". Synthetic corpus — runnable any time, no ground-truth file required. Re-run after a real PI close once team-confirmed CW corrections are available (see "Re-run with real data" below).
How this skill is triggered — by the user, by Claude, or both
Slash command
/edpa:edpa-autocalibThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Optimizes the five **signal weights** (`assignee`, `pr_author`, `commit_author`,
Optimizes the five signal weights (assignee, pr_author, commit_author,
pr_reviewer, issue_comment) in plugin/edpa/templates/cw_heuristics.yaml.tmpl
against a synthetic corpus generated procedurally. The engine consumes those
weights directly — there is no role_weights or role_overrides block any
more (both were dropped in v1.11; see plugin/edpa/scripts/engine.py:864).
The optimizer is self-contained: it generates its own ground truth via Monte
Carlo, evaluates candidate weight vectors against it, and writes the best
candidate back into the template when --apply is passed.
$ARGUMENTS = optional flags forwarded to calibrate_signals.py. Common forms:
help → show current calibration metadata, propose a default runquick → adds --quick (200 MC samples; ~1 s; smoke test only)--scenarios <N> (e.g. 2000); default 1000apply → after calibration, write best weights back to the template--scenarios 2000 --seed 7 --apply --report report.json) →
passed verbatim$ARGUMENTS is empty)calibration: block from
plugin/edpa/templates/cw_heuristics.yaml.tmpl and print:
Last calibration:
method: MC random-sample + coordinate descent
scenarios: 1000 records: 31041
baseline MAD: 0.0861
calibrated: 0.0805 (+6.5%)
version: 1.11.0
timestamp: 2026-05-08T18:37:24Z
Suggested run:
python3 plugin/edpa/scripts/calibrate_signals.py \
--scenarios 1000 --seed 42
Apply best weights to the template? [N]
None for the synthetic path. The MC corpus is generated in-process; no
.edpa/data/ground_truth.yaml is needed. The legacy role_weights /
role_overrides schema and the evaluate_cw.py autoresearch evaluator
were removed in v1.18.2.
If the user explicitly asks to calibrate against real PI data, fall through to "Re-run with real data" below.
Target file: plugin/edpa/templates/cw_heuristics.yaml.tmpl
Script: plugin/edpa/scripts/calibrate_signals.py (LOCKED)
Metric: mean_absolute_deviation(predicted_cw, true_cw)
where predicted_cw = Σ weight × signal_count, per-item normalized
Direction: lower
Phases: (1) MC random sampling default 2000 samples (200 if --quick)
(2) coordinate descent refines top-5 candidates
Search space: 5D, each weight ∈ [0.1, 8.0]
Defaults: assignee 4.0, pr_author 3.4, commit_author 2.78,
pr_reviewer 2.25, issue_comment 1.14
CRITICAL: never edit calibrate_signals.py. The synthetic corpus generator
and the MAD cost function are inside the same script intentionally — the
locked-vs-tunable separation is preserved by structure: the cost function takes
only a candidate weight vector and pure-reads signal_count × weight with
per-item normalization (no parameters live inside the cost function itself).
If you are tempted to modify the generator to match a particular weight
vector, STOP — that gamifies the metric. Add new scenario flavors only when
they reflect a real-world contribution pattern the corpus does not yet model,
and even then file a separate PR — not inside a calibration run.
python3 plugin/edpa/scripts/calibrate_signals.py \
--scenarios "${SCENARIOS:-1000}" \
--seed "${SEED:-42}" \
${QUICK:+--quick} \
${APPLY:+--apply} \
${REPORT:+--report "$REPORT"}
Expected stdout (abridged):
Generating 1000 synthetic scenarios (seed=42)...
→ 31041 (person, item) records across 1000 scenarios
Baseline (v1.11 shipped defaults): MAD = 0.0861
Phase 1 — Monte Carlo random sampling (2000 samples)...
Top 5 candidates by MAD: ...
Phase 2 — Coordinate descent refinement...
Cand 1: 0.0820 → 0.0805 after refinement
...
Best calibrated weights (MAD = 0.0805):
assignee: 4.00
pr_author: 3.40
...
MAD improvement: 0.0861 → 0.0805 (+6.5%)
Summarize the run to the user:
--apply was used (template updated or not)When the user passed apply, calibrate_signals.py --apply has already
rewritten the template signals: block and refreshed the calibration:
metadata. Confirm by re-reading the target file's calibration: block and
echoing mad_calibrated and calibrated_at.
If not applied, leave the template untouched and tell the user how to apply later:
Re-run with: python3 plugin/edpa/scripts/calibrate_signals.py --apply
The MC corpus is a prior: it encodes plausible signal/cw mappings under
v1.11's procedural model. To improve on it, replace synthetic
SyntheticContribution records with team-confirmed corrections from a closed
PI retrospective.
Today this requires a small adapter (not yet implemented as a skill):
edpa_results.json).SyntheticContribution(person, role, item_id, true_cw=<confirmed_cw>, signal_counts=<observed_counts_from_engine>).generate_corpus(...) with the loaded real corpus and rerun the
optimizer.If the user asks for this flow now, refuse with: "Real-data calibration adapter not yet implemented. Run the synthetic MC pipeline first; track real corrections in retros until the adapter ships."
--scenarios 200 --quick (~1 s; may not improve over
baseline, that's fine).--scenarios 1000 --seed 42 (~10 s; 31 k records).--scenarios 2000 --seed 42 --apply (~30 s).--seed values. If best weights
agree within ~±0.3, the result is stable. If they diverge, raise
--scenarios.calibrate_signals.py missing → check plugin/edpa/scripts/; do not
recreate from template. Tell the user the plugin install is incomplete.MAD improvement: +0.0% after a full run → expected; the shipped defaults
are already near a local optimum on the v1.11 generator. Higher
--scenarios rarely changes this.Creates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.
npx claudepluginhub technomaton/edpa --plugin edpa