Author, run, and debug CORAL tasks for autonomous coding agent evaluation — scaffold task definitions with custom graders, launch parallel agent experiments, monitor progress via CLI, and diagnose stalled runs with AI-driven root-cause analysis.
Use this subagent to diagnose a CORAL run that is misbehaving (agents restarting, every eval failing, the score plateaued) or to answer "is this run healthy?". It reads `coral status` / `coral log` / `coral show` / notes, identifies the pathology, and returns a ranked list of concrete fixes (fix the grader, resume with an instruction, tune heartbeats, fork a regressed line). It is read-only on the run: it recommends fixes but does not restart or stop anything.
Use this subagent to turn "optimize / speed up / improve this with CORAL" into a working CORAL task. Give it the code (or just a repo and a rough goal) and it acts immediately — explores the repo to infer the optimization target, scaffolds a .coral_workspace/, writes the grader, and iterates `coral validate` until the grader cleanly scores the seed, then hands back a ready-to-launch task. Delegate here whenever the user wants CORAL pointed at existing code.
The fast path from zero to a running CORAL experiment — what CORAL is and when to reach for it, installing the `coral` CLI, registering a runtime with `coral setup`, and the `.coral_workspace/` convention for pointing CORAL at code you already have and want optimized. Use this whenever the user asks "what is coral", "should I use coral for this", wants to install or get coral set up, hits a "command not found" for coral or doesn't have it installed yet, or says "use coral to optimize / speed up / improve this code" and you need the end-to-end onboarding from install to a launched run. Hands off to `setting-up-coral` (runtime bindings), `creating-a-coral-task` (grader authoring), and `running-coral-experiments` (operating a run) for depth.
Author a new CORAL task — the three pieces that must line up (`task.yaml`, `seed/`, a packaged `grader/`), the `coral init` → `coral validate` → smoke-test loop, and how to pick a grader pattern (stdout float, test pass-rate, ratio-vs-baseline, multi-metric, or an LLM rubric judge). Use whenever the user wants to create a CORAL task, write or wire a grader, port a benchmark into CORAL, score open-ended outputs (reports/memos) with a judge, or debug a grader that crashes on the seed / ranks the leaderboard backwards / leaks the answer key. Deep references for the TaskGrader API, grader patterns, rubric judges, and the full task.yaml schema live alongside this skill.
Run and manage CORAL experiments from the operator side — launch agents with `coral start` (dotlist overrides, model/count, tmux vs local), monitor with `coral status` / `coral log` / `coral show` / the web dashboard, and drive the loop with `coral resume` (inject instructions, fork from an attempt), `coral heartbeat` (tune reflection cadence), and `coral stop`. Use whenever the user wants to start a CORAL run, check on agents, read scores/leaderboard, steer or resume a run, diagnose agents that keep restarting or fail every eval, scale to more agents or islands, or stop a run. Deep references for steering/heartbeat tuning and scaling/troubleshooting live alongside this skill.
One-time machine setup after installing the `coral` CLI — register local agent runtimes as named bindings with `coral setup` / `coral setup agent`, validate them with `coral agents doctor` (incl. a live hello-ping that catches expired auth and model typos), and reference them from a task via `agents.binding`. Use when the user is configuring which agent runtimes/models coral can use, hits a "runtime not found" / auth error when starting a run, or asks how to set up claude/codex/cursor for coral.
Uses power tools
Uses Bash, Write, or Edit tools
CORAL is infrastructure for autonomous AI agent organizations that run experiments, share knowledge, and continuously improve solutions. Give it a codebase and a grader, and CORAL handles the rest: isolated workspaces, safe evaluation, persistent shared state, and multi-agent collaboration. Natively integrated with Claude Code, OpenCode, Codex, Cursor Agent, and Kiro.
Auto-discovery of graders at eval/grader.py is deprecated and has been removed. Wire graders via grader.entrypoint, pointing at a packaged grader. See the custom grader guide.
curl -fsSL https://raw.githubusercontent.com/Human-Agent-Society/CORAL/main/install.sh | sh
Installs the latest coral release globally via uv tool install. Pin a specific release with CORAL_VERSION=<tag> if you need to. See Installation docs for manual install, dev setup, and prerequisites.
coral init my-task # scaffold a task
cd my-task && coral start -c task.yaml # launch agents
Prefer to author and run CORAL tasks from inside your own Claude Code or Codex without memorizing the CLI? Install the CORAL plugin, a skills-first bundle (no MCP) that teaches the workflows (coral setup → init/validate → start/status/log) and checks that coral is installed on session start.
Claude Code:
/plugin marketplace add Human-Agent-Society/CORAL
/plugin install coral@coral-marketplace
Codex (v0.117.0+):
codex plugin marketplace add Human-Agent-Society/CORAL
codex plugin add coral@coral-marketplace
Both pull from this repo's marketplace manifests; the plugin lives under plugin/.
Quickstart — point CORAL at code you already have. Once installed, open the repo whose code you want to optimize and just ask:
use coral to optimize this — make sample() in saga/decode.py faster without changing its output
The plugin scaffolds a gitignored .coral_workspace/, drops your code into a seed/, writes a grader for your metric, and loops coral validate until the task is launch-ready — then hands you the coral start command. On Claude Code a coral-task-author subagent does the whole grind autonomously (and a coral-run-doctor triages a stuck run); on any harness the bundled skills walk the same path.
Skills: coral-quickstart (install → setup → .coral_workspace/), setting-up-coral (runtime bindings), creating-a-coral-task (grader authoring), running-coral-experiments (operate a run). See the Harness Plugin guide or plugin/README.md for agents, the skills-dir alternative, and other harnesses.
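To make the quickstart request above concrete ("faster without changing its output"), here is a minimal sketch of the scoring logic a ratio-vs-baseline grader might use. It is plain Python and does not use CORAL's TaskGrader API or the task.yaml schema. The names `best_time`, `speedup_score`, `slow_sum`, `fast_sum`, and `wrong_sum` are hypothetical stand-ins for illustration, not part of CORAL.

```python
import time
from typing import Any, Callable, Sequence


def best_time(fn: Callable[[Any], Any], inputs: Sequence[Any], repeats: int = 5) -> float:
    """Return the fastest wall-clock time (seconds) to run fn over all inputs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for x in inputs:
            fn(x)
        best = min(best, time.perf_counter() - start)
    return best


def speedup_score(
    baseline: Callable[[Any], Any],
    candidate: Callable[[Any], Any],
    inputs: Sequence[Any],
    repeats: int = 5,
) -> float:
    """Score = baseline_time / candidate_time, or 0.0 if any output differs."""
    # Correctness gate: a faster function that changes the output scores zero.
    for x in inputs:
        if candidate(x) != baseline(x):
            return 0.0
    t_base = best_time(baseline, inputs, repeats)
    t_cand = best_time(candidate, inputs, repeats)
    # Guard against a zero-duration measurement on very fast candidates.
    return t_base / max(t_cand, 1e-12)


# Hypothetical stand-ins for the seed code and two agent attempts.
def slow_sum(n: int) -> int:
    total = 0
    for i in range(n):
        total += i
    return total


def fast_sum(n: int) -> int:
    # Closed form for 0 + 1 + ... + (n - 1); same output as slow_sum.
    return n * (n - 1) // 2


def wrong_sum(n: int) -> int:
    # Fast but incorrect: sums 1..n instead of 0..n-1.
    return n * (n + 1) // 2


if __name__ == "__main__":
    inputs = [10_000, 50_000, 100_000]
    print("correct and faster:", speedup_score(slow_sum, fast_sum, inputs))
    print("faster but wrong:  ", speedup_score(slow_sum, wrong_sum, inputs))
```

The correctness check runs before any timing so that an attempt cannot climb the leaderboard by trading accuracy for speed. Taking the best of several repeats is one simple way to reduce timing noise; a real grader may need a more careful measurement setup.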
npx claudepluginhub human-agent-society/coral --plugin coral