From arbor
Coordinates a multi-phase research workflow by loading and managing phase skills for setup, ideation, execution, merge evaluation, novelty search, and reporting.
Slash command: /arbor:arbor-agent-orchestrator
Use this as the first skill for an Arbor-style research run. It is the phase
loader and policy owner; load the smaller skills only when their phase applies.
For normal user-facing use, prefer starting with arbor-research-agent; that
wrapper performs Arbor-style intake and then loads this orchestrator.
This suite mirrors the open-source branch of arbor, not the older
single hypothesis-tree extraction. The product entry point is arbor; the run
architecture is:
Read references/source-map.md when auditing against the source tree or when
you need exact file origins.
Read references/compatibility.md when packaging the suite for another agent
runtime or checking Codex/Claude Code portability.
Launch and contract: load arbor-agent-setup-intake.
Establish target cwd, metric, baseline status, budget, scope preference,
dev/test discipline, config/plugin choice, and session directory.
Coordinator loop: load arbor-agent-coordinator.
Run INIT, OBSERVE, IDEATE, SELECT, DISPATCH, DECIDE until the cycle cap,
budget limit, or diminishing returns call for a stop.
IDEATE only: load arbor-agent-ideate.
This is a hard gate for novelty/scientific runs. It must follow
TreeView(format="constraints") and precede every TreeAddNode. If a
plugin disables strict skills for performance-first MLE/Kaggle, use the
free-form path described by arbor-agent-plugins-hitl-budget instead.
Executor dispatch: load arbor-agent-executor.
Use for RunExecutor / RunExecutorParallel behavior, worktree lifecycle,
executor prompts, long RunTraining commands, report parsing, artifact
capture, and tree updates.
Merge and scoring: load arbor-agent-merge-eval.
Use before baseline metadata changes, merge attempts, B_test verification,
protected-path checks, and final test scoring.
Related work: load arbor-agent-search.
Use after a node is done or merged and beat trunk, especially before
merge decisions where novelty matters.
Domain adaptation and human gates: load
arbor-agent-plugins-hitl-budget when config mentions plugins, profiles,
mle_kaggle, lifecycle hooks, convergence, budget policy, or
interaction modes direction, review, or collaborative.
Resume and finalization: load arbor-agent-resume-report when the run
is interrupted/resumed, when dashboard/events/checkpoint artifacts matter,
or when producing REPORT.md.
No native Arbor tools: load arbor-agent-tools.
Use its scripts/arbor_state.py helper to emulate TreeView,
TreeAddNode, TreeSetMeta, TreeUpdateNode, TreePrune,
TreePropagate, executor prompt generation, eval score capture, merge
checks, and report generation in a plain Codex/Claude environment.
Record baseline_score, trunk_score, eval_cmd, eval_cmd_test,
dataset_info, metric_direction, and trunk_branch as metadata before
dispatching real executors.
Write eval commands with the {cwd} and {node_id} placeholders. Do not hardcode the
main repository path inside executor eval commands.
With arbor_state.py, run tree-mutating commands serially. Do not
parallelize init, meta, add, update, prune, propagate, eval,
record, worktree, or merge against the same run.
If the arbor CLI is installed and the user wants a real run, prefer
invoking it. If the user wants a skill-based reconstruction or a smoke test,
emulate the behavior with this suite and arbor-agent-tools.
When the user asks for a smoke test, forward test, dry run, or validation of
the skill suite, propagate smoke-only through the contract, metadata,
executor prompt, raw reports, and final summary.
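The metadata and placeholder rules above lend themselves to a small pre-dispatch check. The sketch below is only an illustration under stated assumptions: missing_metadata and render_eval_cmd are hypothetical helpers, not part of arbor_state.py, and the literal text replacement of {cwd} and {node_id} is an assumed substitution scheme rather than the documented one.

```python
# Metadata keys the text asks for before real executors are dispatched.
REQUIRED_KEYS = (
    "baseline_score",
    "trunk_score",
    "eval_cmd",
    "eval_cmd_test",
    "dataset_info",
    "metric_direction",
    "trunk_branch",
)


def missing_metadata(meta: dict) -> list[str]:
    """Return required keys that are absent or None (a score of 0.0 still counts as set)."""
    return [key for key in REQUIRED_KEYS if meta.get(key) is None]


def render_eval_cmd(template: str, cwd: str, node_id: str, main_repo: str) -> str:
    """Fill the {cwd} and {node_id} placeholders (hypothetical helper).

    Plain text replacement is used instead of str.format so that other
    braces in a shell command, such as awk '{print $1}', are left alone.
    """
    if main_repo and main_repo in template:
        raise ValueError("eval command hardcodes the main repository path")
    return template.replace("{cwd}", cwd).replace("{node_id}", node_id)


if __name__ == "__main__":
    meta = {"baseline_score": 0.0, "eval_cmd": "cd {cwd} && python eval.py --tag {node_id}"}
    print(missing_metadata(meta))
    print(render_eval_cmd(meta["eval_cmd"], "/tmp/worktrees/n1", "n1", "/home/me/repo"))
```

Rejecting a template that contains the main repository path catches the hardcoding mistake before an executor scores the wrong checkout.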
Use arbor_state.py parse-log, another
cached-score parser, a harmless echo, or an explicitly labelled mocked score
for plumbing validation.
Do not cat, raw rg, raw grep, or tail long training logs. Some logs
use carriage-return progress updates that make one physical line enormous.
Use arbor_state.py parse-log or normalize with tr '\r' '\n' before
matching; inspect at most 20 log lines when debugging a failure.
Generate the executor prompt with arbor_state.py prompt-executor --smoke.
Save the generated prompt as experiments/<node_id>/executor_prompt.md.
check, and REPORT.md.
Use this skeleton when no native arbor runtime is available:
Load arbor-agent-setup-intake; produce a contract and initialize
.arbor/sessions/<run_name>/.coordinator/idea_tree.json.
Load arbor-agent-coordinator; complete INIT and metadata.
Call TreeView(format="constraints").
Load arbor-agent-ideate; add 1-3 ideas.
Load arbor-agent-executor; dispatch one or more executors.
Load arbor-agent-search for validated winners when useful.
Load arbor-agent-merge-eval; merge, prune, or continue.
Load arbor-agent-resume-report; run final B_test only if it is available,
authorized, and the run is not smoke-only; write REPORT.md; summarize
artifact paths.
TreeView(format="constraints").
{cwd} substitution.
npx claudepluginhub ruc-nlpir/arbor --plugin arbor
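For the log-handling rule in the smoke-test guidance (carriage-return progress updates and a cap of 20 inspected lines), a rough Python equivalent of normalizing before matching might look like the following. The function name tail_matches and its parameters are hypothetical, and this is not how arbor_state.py parse-log is implemented. One small difference from tr '\r' '\n' is that a Windows-style \r\n pair is treated as a single line break here.

```python
import re


def tail_matches(raw: str, pattern: str, limit: int = 20) -> list[str]:
    """Return at most `limit` of the last lines that match `pattern`.

    Carriage returns are treated as line breaks so that a progress bar
    rewritten in place does not collapse into one enormous line.
    """
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    matcher = re.compile(pattern)
    hits = [line for line in text.split("\n") if matcher.search(line)]
    return hits[-limit:]


if __name__ == "__main__":
    # Hypothetical training log: three progress updates on one physical line.
    raw = "epoch 1 loss=0.9\repoch 2 loss=0.5\repoch 3 loss=0.3\nval_acc=0.81\n"
    for line in tail_matches(raw, r"loss=", limit=2):
        print(line)
```

Capping the result at the tail of the matches keeps the amount of log text pulled into context bounded, whatever the size of the log.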