Skill

autoresearch:reason

Adversarial multi-round reasoning loop with blind-judge panel to reach rigorous conclusions on complex questions. Use when you need devil's advocate critique or structured debate.

developer-tools

Popularity

Stars

Forks

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/autoresearch:reason

User invocable

Model invocable

Inline context

Default effort

Tool Access

This skill is limited to the following tools:

ReadWriteEditBashWebFetchWebSearch

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Adversarial multi-round reasoning loop with a blind-judge panel. Arguments are assigned

SKILL.md

163 lines · ~1.6k tokens

Stats

LanguagePython

Stars18

Forks2

MaintenanceExcellent

Last CommitJun 21, 2026

Actions

View Source View Plugin View on GitHub View README

autoresearch:reason

Adversarial multi-round reasoning loop with a blind-judge panel. Arguments are assigned crypto-random IDs before critique so judges evaluate logic, not author identity. Runs until convergence or budget exhausted.

Autonomy Directive

You are an autonomous reasoning agent. Once the debate begins:

NEVER STOP to ask permission between rounds.
NEVER ASK "should I continue?" mid-debate.
NEVER DECLARE CONVERGENCE prematurely — a single unchallenged round is not convergence.
The loop runs until: all positions converge OR budget exhausted OR user interrupts.
If neither condition is true, begin the next round immediately.

Setup

Step 1 — Define the question

If not provided, ask once: "What is the question or decision to reason about?" The question must be specific enough to allow falsifiable positions. Refuse vague inputs like "think about AI" — ask for a concrete framing.

Step 2 — Set parameters

Ask if not provided:

Number of positions (default: 3). Minimum 2, maximum 5.
Max rounds (default: 4). Minimum 2.
Convergence threshold: "Converged" = all judges rate the top position's logical score ≥8/10 AND no position has an unanswered rebuttal.

Round Structure

Each round follows this exact sequence:

Phase 1 — Propose Positions (Round 1 only)

Generate N distinct positions on the question. Positions must:

Be mutually distinguishable (not minor variations of each other)
Be stated as falsifiable claims, not vague preferences
Cover the genuine range of defensible views (not strawmen)

Write each position to reason/rounds.md as: Position [PENDING-ID]: [statement]

Phase 2 — Assign Crypto-Random IDs

Assign each position a random alphanumeric ID (e.g., ARG-7F3A, ARG-2C91). Write the mapping to reason/id-map.md — this file is sealed until the end (not read during debate). Replace all position labels in reason/rounds.md with their assigned IDs. From this point forward, all debate references use IDs only — never "Position 1" or author names.

Phase 3 — Blind Critique Round

For each argument ID, write a rigorous critique:

Identify the strongest assumption and test whether it holds
Find the weakest logical link in the chain
Propose a concrete counterexample or falsifying scenario
Rate logical strength: 1–10 (where 10 = logically airtight)

Critiques reference IDs only: "ARG-7F3A assumes X, which fails when Y..." Write all critiques to reason/rounds.md under ## Round [N] — Critiques.

Phase 4 — Rebuttal Round

Each argument ID responds to the critiques it received:

Acknowledge valid critiques (update or narrow the claim)
Rebut invalid critiques with specific evidence or reasoning
A position may concede partially — update the statement if weakened by critique

Write rebuttals to reason/rounds.md under ## Round [N] — Rebuttals.

Phase 5 — Judge Evaluation

Three independent judges evaluate all arguments:

Judge A: Logic and internal consistency
Judge B: Evidence quality and falsifiability
Judge C: Practical applicability

Each judge scores every argument ID on their dimension (1–10) and identifies the strongest argument. Judges reference IDs only. Write scores to reason/rounds.md under ## Round [N] — Judgment.

Phase 6 — Convergence Check

After each round:

If top-scoring argument has score ≥8/10 from all judges AND no unanswered rebuttal exists → converged.
If max rounds reached → converged by budget.
Otherwise → begin next round (return to Phase 3 with updated positions).

Output Files

All output goes to a reason/ folder in the working directory.

File	Purpose
`reason/rounds.md`	Per-round arguments, critiques, rebuttals, and scores (IDs only during debate)
`reason/verdict.md`	Final synthesis: winning argument with reasoning, minority positions summarized
`reason/id-map.md`	Revealed at end only: maps each ID → original position label

verdict.md structure

# Verdict: [Question]
Rounds completed: [N]
Convergence: [yes/no — budget exhausted]

## Winning Argument
ID: [ARG-XXXX] (revealed: [original position])
Score: [avg judge score]/10
Summary: [2-3 sentence synthesis]
Key evidence: [bullet points]

## Minority Positions
- [ARG-YYYY]: [why it lost — specific logical weakness identified]
- [ARG-ZZZZ]: [why it lost]

## Synthesis
[2-3 paragraphs on what the debate revealed — including any nuances that
don't fit cleanly into the winning argument]

## Confidence
[Low / Medium / High] — with explicit statement of remaining uncertainty

Anti-Anchoring Protocol

The ID system exists to prevent these failure modes:

Position ordering bias — "the first argument always sounds best"
Framing bias — "the argument labeled 'conservative' gets dismissed without evaluation"
Authority bias — "the argument by the named expert wins by default"

Judges MUST NOT reference position order, original labels, or authorship until after verdict.md is written and id-map.md is revealed.

Edge Cases

Situation	Handling
Two positions are identical	Merge them; reduce N by 1
One position dominates all others in round 1	Still run min 2 rounds — premature convergence is bias
Judges disagree strongly (spread ≥5 points)	Note as "contested" in verdict.md — no forced winner
Budget = 1 round	Complete one full cycle, write verdict with low confidence
Question has a factual answer	Note this upfront — reason is for judgment calls, not facts

autoresearch:reason

Popularity

Invocation

Tool Access

Context Preview

SKILL.md

autoresearch:reason

Popularity

Invocation

Tool Access

Context Preview

SKILL.md

autoresearch:reason

Autonomy Directive

Setup

Step 1 — Define the question

Step 2 — Set parameters

Round Structure

Phase 1 — Propose Positions (Round 1 only)

Phase 2 — Assign Crypto-Random IDs

Phase 3 — Blind Critique Round

Phase 4 — Rebuttal Round

Phase 5 — Judge Evaluation

Phase 6 — Convergence Check

Output Files

verdict.md structure

Anti-Anchoring Protocol

Edge Cases

Similar Skills

autoresearch:reason

Autonomy Directive

Setup

Step 1 — Define the question

Step 2 — Set parameters

Round Structure

Phase 1 — Propose Positions (Round 1 only)

Phase 2 — Assign Crypto-Random IDs

Phase 3 — Blind Critique Round

Phase 4 — Rebuttal Round

Phase 5 — Judge Evaluation

Phase 6 — Convergence Check

Output Files

verdict.md structure

Anti-Anchoring Protocol

Edge Cases

Similar Skills