From armory
Replaces single-point guesses with structured three-point PERT estimates (best/likely/worst) including confidence intervals, unknowns, and assumptions. Useful for effort estimation, story pointing, or t-shirt sizing.
Slash command: /armory:estimate-calibrator
Replaces single-point guesses with structured three-point estimates: decomposes work into atomic units, estimates best/likely/worst case for each, identifies unknowns and assumptions, calculates aggregate ranges using PERT, and assigns confidence levels with explicit rationale.
| File | Contents | Load When |
|---|---|---|
| references/estimation-methods.md | PERT formula, three-point estimation, Monte Carlo basics | Always |
| references/unknown-categories.md | Technical, scope, external, and organizational uncertainty types | Unknown identification |
| references/calibration-tips.md | Cognitive biases in estimation, historical calibration, buffer strategies | Always |
| references/sizing-heuristics.md | Common task size patterns, complexity indicators, reference class data | Quick sizing needed |
If the work item is not already decomposed into atomic units:
For each task, estimate three scenarios:
| Scenario | Definition | Mindset |
|---|---|---|
| Best case | Everything goes right. No surprises. | "If I've done this exact thing before" |
| Likely case | Normal friction. Some minor obstacles. | "Realistic expectation with typical setbacks" |
| Worst case | Significant problems. Not catastrophic. | "Murphy's law but not a disaster" |
Key rule: Worst case is NOT "everything goes wrong." It's the realistic bad scenario (90th percentile), not the apocalyptic one (99th percentile).
Categorize unknowns that affect estimates:
| Category | Example | Impact |
|---|---|---|
| Technical | "Never used this library before" | Likely case inflated, worst case much higher |
| Scope | "Requirements may change" | All estimates may shift |
| External | "Depends on API access from partner" | Blocking risk — could delay entirely |
| Integration | "Haven't tested with production data" | Hidden complexity at integration |
| Organizational | "Need design approval" | Calendar time, not effort time |
For individual tasks, use the PERT formula:
Expected = (Best + 4 × Likely + Worst) / 6
Std Dev = (Worst - Best) / 6
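The two formulas above can be checked with a few lines of Python. This is a minimal sketch: the function name `pert_estimate` and the use of hours as the unit are illustrative assumptions, not part of the skill.

```python
def pert_estimate(best: float, likely: float, worst: float) -> tuple[float, float]:
    """Return (expected, std_dev) for one task using the three-point PERT formula."""
    # Guard against swapped inputs; the formula assumes best <= likely <= worst.
    if not best <= likely <= worst:
        raise ValueError("expected best <= likely <= worst")
    expected = (best + 4 * likely + worst) / 6
    std_dev = (worst - best) / 6
    return expected, std_dev


# Hypothetical task: 2h best, 4h likely, 12h worst.
expected, std_dev = pert_estimate(2, 4, 12)
print(f"expected={expected:.2f}h, std_dev={std_dev:.2f}h")  # expected=5.00h, std_dev=1.67h
```

Note how the long worst-case tail pulls the expected value (5h) above the likely case (4h).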
For aggregate (project) estimates:
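One widely used convention for this step, assuming the tasks are statistically independent, is to sum the per-task expected values and to combine the standard deviations as the square root of the summed variances. The sketch below illustrates that convention. The function name `aggregate_pert`, the independence assumption, and the two-sigma range (roughly 95% under a normal approximation) are assumptions, not something this skill specifies.

```python
import math


def aggregate_pert(tasks: list[tuple[float, float, float]]) -> tuple[float, float]:
    """Combine (best, likely, worst) triples into a project-level (expected, std_dev).

    Assumes tasks are independent, so their variances add.
    """
    total_expected = 0.0
    total_variance = 0.0
    for best, likely, worst in tasks:
        total_expected += (best + 4 * likely + worst) / 6
        total_variance += ((worst - best) / 6) ** 2
    return total_expected, math.sqrt(total_variance)


# Hypothetical three-task project, in hours.
tasks = [(2, 4, 12), (1, 2, 4), (3, 5, 9)]
expected, std_dev = aggregate_pert(tasks)
low, high = expected - 2 * std_dev, expected + 2 * std_dev
print(f"expected={expected:.1f}h, two-sigma range={low:.1f}h to {high:.1f}h")
```

If tasks share a common risk (for example, the same unfamiliar library), they are not independent and the combined standard deviation from this sketch will understate the real spread.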
| Confidence | Meaning | When |
|---|---|---|
| High | Likely case within ±20% | Well-understood task, team has done it before |
| Medium | Likely case within ±50% | Some unknowns, moderate familiarity |
| Low | Likely case within ±100% or more | Significant unknowns, new technology |
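The bands in the table can be read as a range around the likely case. A minimal sketch of that reading follows. `CONFIDENCE_BAND` and `likely_range` are hypothetical names, and treating Low as exactly ±100% is an assumption, since the table says "or more".

```python
# Half-width of the band around the likely case, as a fraction of the likely case.
# Low is a floor here: the table allows the band to be wider than 100%.
CONFIDENCE_BAND = {"High": 0.20, "Medium": 0.50, "Low": 1.00}


def likely_range(likely: float, confidence: str) -> tuple[float, float]:
    """Return the (low, high) range implied by a confidence level."""
    band = CONFIDENCE_BAND[confidence]
    return likely * (1 - band), likely * (1 + band)


# Hypothetical 10-hour likely case at each confidence level.
for level in ("High", "Medium", "Low"):
    low, high = likely_range(10, level)
    print(f"{level}: {low:.1f}h to {high:.1f}h")
```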
## Estimate: {Work Item}
### Summary
| Scenario | Duration |
|----------|----------|
| Best case | {time} |
| Likely case | {time} |
| Worst case | {time} |
| **PERT expected** | **{time}** |
| **Confidence** | **{High/Medium/Low}** |
### Task-Level Estimates
| # | Task | Best | Likely | Worst | PERT | Unknowns |
|---|------|------|--------|-------|------|----------|
| 1 | {task} | {time} | {time} | {time} | {time} | {key unknown or "None"} |
| 2 | {task} | {time} | {time} | {time} | {time} | {key unknown} |
| | **Total** | **{sum}** | **{sum}** | **{sum}** | **{pert}** | |
### Key Unknowns
| # | Unknown | Category | Impact on Estimate | Mitigation |
|---|---------|----------|-------------------|------------|
| 1 | {unknown} | {Technical/Scope/External/Integration/Organizational} | +{time} if realized | {spike, prototype, early test} |
### Assumptions
- {Assumption 1 — what must be true for this estimate to hold}
- {Assumption 2}
### Risk Factors
- {Risk}: If realized, adds {time}. Likelihood: {High/Medium/Low}.
### Confidence Rationale
**{High/Medium/Low}** because:
- {Specific reason — e.g., "Team has built 3 similar features"}
- {Specific reason — e.g., "External API is a new integration"}
### Recommendation
{Commit to PERT expected with {X}% buffer, or spike the top unknown first.}
| Problem | Resolution |
|---|---|
| Work item not decomposed | Decompose into 3-8 tasks first (or suggest task-decomposer skill). |
| No historical reference | Estimate relative to a known task: "This is about 2x the auth feature." |
| Stakeholder wants a single number | Provide PERT expected with buffer matching confidence level (High: +20%, Medium: +50%, Low: +100%). |
| Estimate seems too large | Check for scope creep in task list. Remove non-essential tasks. Identify what can be deferred. |
| Team has never done this type of work | Mark confidence as Low. Recommend a spike before committing to an estimate. |
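The single-number rule in the table (PERT expected plus a buffer keyed to confidence) might be applied as in the sketch below. The buffer percentages come from the table; the names `BUFFER_BY_CONFIDENCE` and `single_number_estimate` are illustrative.

```python
# Buffers from the troubleshooting table: High +20%, Medium +50%, Low +100%.
BUFFER_BY_CONFIDENCE = {"High": 0.20, "Medium": 0.50, "Low": 1.00}


def single_number_estimate(best: float, likely: float, worst: float, confidence: str) -> float:
    """Return the PERT expected value inflated by the buffer for the given confidence."""
    expected = (best + 4 * likely + worst) / 6
    return expected * (1 + BUFFER_BY_CONFIDENCE[confidence])


# Hypothetical task: 2h best, 4h likely, 12h worst, Medium confidence.
print(single_number_estimate(2, 4, 12, "Medium"))  # 7.5
```

Reporting the buffered figure alongside the confidence level keeps the stakeholder's single number tied to the uncertainty behind it.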
Push back if:
npx claudepluginhub mathews-tom/armory --plugin armory