Watches AO worker tmux sessions, classifies state (WORKING/IDLE/QUEUED/DEAD/COMPLETED), auto-remediates trust TUI blocks, and push-notifies on stuck sessions. Use /babysit to start a monitoring loop on specific workers, all active workers, or workers matching a PR/branch.
Slash command: `/claude-commands:babysit`
**Purpose:** Watch AO worker tmux sessions, classify state, auto-remediate known failures, and notify on stuck or dead shells. Sits *alongside* the launchd-managed `ao lifecycle-worker` (which runs the system-level reaction/poll loop) — `babysit` is the **Claude-side observer** for individual spawned sessions.
Use when:
- You spawned a worker (`ao spawn` / `ao spawn -p project "..."`) and want to keep an eye on it

Do NOT use when one of these skills fits better:
- `/auton`: it diagnoses the jleechanclaw + AO autonomy chain end-to-end
- `/ao-lifecycle-triage`: for log-driven triage
- `/ao-worker-dispatch`
- `/ao-session-monitor`: babysit uses it internally for one-pane detection but is multi-worker / multi-cycle

| Skill | When to use instead |
|---|---|
| `/ao-session-monitor` | Single pane, one-shot classification (babysit embeds the same one-liner) |
| `/babysit-openclaw` | Slack-thread-based openclaw monitoring (different model: openclaw posts to Slack; AO workers use tmux) |
| `/auton` | System-level: why is the autonomous chain not driving PRs to N-green? |
| `/ao-lifecycle-triage` | One stuck worker, deep log dive |
| `/ao-spawn-gate` | Pre-spawn resource check |
State classification (builds on `ao-session-monitor`):

| Indicator | State | Babysit action |
|---|---|---|
| `✻✶✳✽✾` + duration, `Running…`, `Bash(`/`Read(` etc. in last 20 lines | WORKING | No action; log every 5 min |
| `Baked for Xm` or `Sautéed for Xm` and X ≥ 30 | STALLED-COMPLETED | Push-notify: "Worker has been completed-but-untouched for Xm — review output" |
| `❯` prompt with no activity indicators in 20 lines | IDLE | If lifecycle-worker should have sent a message, push-notify |
| `Press up to edit queued messages` | QUEUED | If queued > 10 min, push-notify (worker may be stuck on input) |
| Trust TUI: "Do you trust the contents of this project?" with no auto `--add-dir` | TUI-BLOCKED | Auto-remediate: send Enter to select "Yes, I trust this folder" |
| `+uncommitted` in status bar with no recent `Bash(git` in 25 lines | HAS-WORK-NO-COMMIT | Push-notify at 15 min: "Worker has uncommitted edits and is not committing" |
| Pane dead (no `❯`, no activity indicators, no recent tool output) for > 5 min | DEAD | Push-notify: "Worker tmux pane is dead — manual respawn required" |
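The STALLED-COMPLETED row depends on reading a minute count out of the pane text and comparing it to 30. A minimal sketch of that check follows, assuming the duration is always printed as whole minutes (for example `Baked for 42m`); the function names `completed_minutes` and `is_stalled_completed` are hypothetical and not part of babysit.

```bash
# Hypothetical helper: print the minute count from the last
# "Baked for Xm" / "Sautéed for Xm" line in the given pane text,
# or print nothing if there is no such line.
# Assumption: the duration is printed as whole minutes.
completed_minutes() {
  printf '%s\n' "$1" \
    | grep -oE '(Baked|Sautéed) for [0-9]+m' \
    | tail -n 1 \
    | grep -oE '[0-9]+'
}

# Succeeds when the worker has sat completed for 30 minutes or more.
is_stalled_completed() {
  local mins
  mins=$(completed_minutes "$1")
  [ -n "$mins" ] && [ "$mins" -ge 30 ]
}
```

A caller would pass the output of `tmux capture-pane -p` as the single argument and push-notify only when `is_stalled_completed` succeeds.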
- Trust TUI (`Do you trust...`): send Enter once, and only if the prompt is visible. Never send Enter on a `❯` prompt without the trust question.

`/babysit snapshot [session-name]`

```
/babysit snapshot          # all ao-* sessions
/babysit snapshot ao-6312  # one session
```

Runs the ao-session-monitor one-liner, prints the table, exits.
`/babysit watch <session-name> [--max-min N]`

```
/babysit watch ao-6312 --max-min 60
```

Polls every 60s. Auto-remediates TUI/queue. Push-notifies on stalled/dead. Exits when:
- stop / cancel
- `--max-min` reached (default 90)

`/babysit watch-all [--max-min N]`

Same as `/babysit watch`, but applies to all ao-* tmux sessions. A per-worker status table is printed every 5 min.
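The `--max-min` exit condition shared by the watch modes comes down to a deadline check around a fixed-interval poll. The sketch below is one way to express it; `max_min_reached`, `watch_session`, and `poll_once` are hypothetical names, and `poll_once` stands in for one classify/remediate/notify pass.

```bash
# Hypothetical helper: succeed once a watch has run for max_min minutes.
# started_at and now are epoch seconds; max_min defaults to 90.
max_min_reached() {
  local started_at=$1 now=$2 max_min=${3:-90}
  [ $(( now - started_at )) -ge $(( max_min * 60 )) ]
}

# Sketch of a watch loop. poll_once is assumed to be defined by the
# caller and to perform one pass over the given session.
watch_session() {
  local session=$1 max_min=${2:-90} interval=${3:-60}
  local started_at
  started_at=$(date +%s)
  until max_min_reached "$started_at" "$(date +%s)" "$max_min"; do
    poll_once "$session"
    sleep "$interval"
  done
}
```

Passing the current time as an argument keeps the deadline check free of side effects, so it can be exercised without waiting.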
`/babysit pr <PR#|branch> [--max-min N]`

```
/babysit pr 661
/babysit pr fix/bd-rgk0-skeptic-cron-trigger-age-filter
```

Resolves the PR/branch → tmux session via `ao list` and the worktree's git state, then watches.
```bash
# Classify every ao-<number> tmux session from its recent pane text.
for s in $(tmux list-sessions 2>/dev/null | grep "ao-[0-9]" | cut -d: -f1); do
  name=${s##*-}                                 # "ao-6312" -> "6312"
  last=$(tmux capture-pane -t "$s" -p -S -20)   # pane text, starting 20 lines into scrollback
  pr=$(echo "$last" | grep -oE "PR: #[0-9]+" | head -1)
  uc=""; echo "$last" | grep -q "uncommitted" && uc="+uc"
  # A spinner glyph, a verb, "…" and a parenthesised duration mean the worker is active.
  activity=$(echo "$last" | grep -oE "[✻✶✳✽✾] [A-Za-z]+…[^)]*\)" | tail -1)
  if [ -n "$activity" ]; then echo " $name: WORKING $pr $uc ($activity)"
  elif echo "$last" | grep -qE "Baked|Sautéed"; then echo " $name: completed $pr"
  elif echo "$last" | grep -q "queued"; then echo " $name: QUEUED $pr"
  else echo " $name: idle $pr $uc"
  fi
done
```
```bash
tmux capture-pane -t "$s" -p -S -30 | grep -q "Do you trust the contents of this project" \
  && tmux send-keys -t "$s" Enter
```
Pre-check: only send Enter if no Enter has been sent in the last 60s (sentinel file ~/.cache/babysit/${s}.last_enter).
```python
from claude_code import push_notification  # conceptual
# Use the PushNotification tool with:
#   message: "ao-6312 stalled-completed for 35m — review output"
#   status: proactive
```
Cap: 1 push per session per 30 min. Sentinel: ~/.cache/babysit/${s}.last_notify.
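Both limits above (one Enter per 60s, one push per session per 30 min) follow the same pattern: compare the timestamp in a sentinel file against a minimum gap, and stamp the file only when the action is allowed. A minimal sketch, assuming the sentinels hold an epoch-millisecond value; `throttle_ok`, `now_ms`, and the `BABYSIT_DIR` override are hypothetical names.

```bash
# Assumption: sentinels live under ~/.cache/babysit and hold one
# epoch-ms timestamp. BABYSIT_DIR is a hypothetical override.
BABYSIT_DIR=${BABYSIT_DIR:-$HOME/.cache/babysit}

# Epoch milliseconds at one-second resolution (portable across date variants).
now_ms() { echo $(( $(date +%s) * 1000 )); }

# throttle_ok <sentinel-file> <min-gap-seconds> [now-ms]
# Succeeds and stamps the sentinel when no stamp exists or the last
# stamp is at least min-gap-seconds old. Otherwise fails and leaves
# the sentinel untouched.
throttle_ok() {
  local file=$1 gap_ms=$(( $2 * 1000 )) now=${3:-$(now_ms)} last
  if [ -s "$file" ]; then
    last=$(cat "$file")
    [ $(( now - last )) -ge "$gap_ms" ] || return 1
  fi
  mkdir -p "$(dirname "$file")" && echo "$now" > "$file"
}

# Possible call sites (not executed here):
#   throttle_ok "$BABYSIT_DIR/$s.last_enter" 60 && tmux send-keys -t "$s" Enter
#   throttle_ok "$BABYSIT_DIR/$s.last_notify" 1800 && echo "notify: $s"
```

Stamping only on success means a suppressed attempt does not extend the quiet period.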
The bash one-liner runs every 60s. Each iteration reads and updates the per-session sentinel files:

```
~/.cache/babysit/
  ao-6312.last_enter   # epoch ms of last Enter sent
  ao-6312.last_notify  # epoch ms of last push-notify
  ao-6312.last_state   # WORKING|IDLE|QUEUED|DEAD|COMPLETED|TUI
  ao-6312.started_at   # epoch ms of first observation
```
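The `last_state` and `started_at` files are what let a poll distinguish a state change from a repeat observation. The sketch below shows one way to record a state and report whether it changed; `record_state` and the `BABYSIT_DIR` override are hypothetical names, and the file names follow the listing above.

```bash
# BABYSIT_DIR is a hypothetical override for the sentinel directory.
BABYSIT_DIR=${BABYSIT_DIR:-$HOME/.cache/babysit}

# record_state <session> <state>
# Writes <session>.last_state, and <session>.started_at on the first
# observation. Succeeds only when the state differs from the previous
# poll, so callers can act on transitions rather than on every tick.
record_state() {
  local s=$1 new=$2 prev=""
  mkdir -p "$BABYSIT_DIR" || return 2
  [ -f "$BABYSIT_DIR/$s.started_at" ] \
    || echo $(( $(date +%s) * 1000 )) > "$BABYSIT_DIR/$s.started_at"
  [ -f "$BABYSIT_DIR/$s.last_state" ] && prev=$(cat "$BABYSIT_DIR/$s.last_state")
  echo "$new" > "$BABYSIT_DIR/$s.last_state"
  [ "$new" != "$prev" ]
}
```

A first observation counts as a transition here, which is a design choice of this sketch rather than documented babysit behavior.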
- Exit conditions: stop, `--max-min` reached, DEAD state, COMPLETED + PR merged
- When running under `/auton` or `/eloop`, babysit must not stack: exit cleanly and let those drive the system-level state

```
Session   State        PR    Uncommitted  Last activity
ao-6312   WORKING      #661  no           ✻ Cascading… (3m 12s)
ao-6309   TUI-BLOCKED  #?    no           (trust prompt)
ao-6305   STALLED-CMP  #657  yes          Baked for 42m
ao-6302   DEAD         —     —            (no output 8m)
```

```
> /babysit watch ao-6312
Watching ao-6312 (PR #661, branch fix/bd-rgk0-skeptic-cron-trigger-age-filter)
Polling every 60s. Will push-notify on stalled/dead/TUI-block. Max run 90 min.
[17:42:01] WORKING — ✻ Germinating… (0m 12s)
[17:43:01] WORKING — ✻ Germinating… (1m 14s)
[17:44:01] WORKING — ✻ Running tools… (2m 03s)
[17:45:30] TUI-BLOCKED — "Do you trust..." → sent Enter
[17:45:35] WORKING — ✻ Reading… (0m 04s)
...
```

```
> /babysit snapshot
Session   State        PR    Uncommitted  Last activity
ao-6312   WORKING      #661  no           ✻ Germinating… (0m 30s)
ao-6309   IDLE         —     no           (no activity 4m)
ao-6305   STALLED-CMP  #657  yes          Baked for 42m
```

```
> /babysit pr 661
Resolved 661 → ao-6312 (worktree $HOME/.worktrees/agent-orchestrator/ao-6312, branch fix/bd-rgk0-...)
Watching ao-6312. Same as /babysit watch ao-6312.
```
By default, babysit observes and reports: it posts status updates but does not steer workers. In DRIVER mode, babysit takes the next step when it detects a stuck-on-failure pattern: it extracts the specific failure and sends an ao send with an exact fix instruction.
Trigger rule (mandatory): if the same CI gate failure (same check name, same error class) appears in ≥2 consecutive polls for the same worker, babysit MUST switch to DRIVER mode for that worker.
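The trigger rule needs a per-worker count of consecutive polls showing the same check name and error class. One possible sketch keeps that streak in a sentinel file; the `fail_streak` file name, the `note_failure` and `clear_failure` functions, and the `BABYSIT_DIR` override are all hypothetical.

```bash
# BABYSIT_DIR is a hypothetical override for the sentinel directory.
BABYSIT_DIR=${BABYSIT_DIR:-$HOME/.cache/babysit}

# note_failure <session> <check-name> <error-class>
# Tracks consecutive polls with the same check + error class in a
# hypothetical <session>.fail_streak file ("<count> <key>").
# Succeeds once the streak reaches 2, i.e. DRIVER mode should engage.
note_failure() {
  local s=$1 key="$2/$3" file prev_key="" count=0
  file="$BABYSIT_DIR/$s.fail_streak"
  mkdir -p "$BABYSIT_DIR" || return 2
  # Count comes first so a key containing spaces is read back whole.
  [ -f "$file" ] && read -r count prev_key < "$file"
  if [ "$key" = "$prev_key" ]; then count=$(( count + 1 )); else count=1; fi
  echo "$count $key" > "$file"
  [ "$count" -ge 2 ]
}

# clear_failure <session>: call on a poll where the check passes.
clear_failure() { rm -f "$BABYSIT_DIR/$1.fail_streak"; }
```

A change of error class resets the streak to 1, which matches the "same check name, same error class" wording of the rule.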
What to extract (the minimum to act on):
- File and line (e.g., `packages/cli/src/doctor.ts:142`)
- The error (e.g., `dist/index.js` argv shape not recognized in non-canonical check)
- The change (e.g., change regex to accept `'/path/dist/index.js'` as canonical binary)

What to send (use the SendMessage / `ao send` channel, NOT just a push notification):

```
ao send <session-id> "DRIVER (babysit): <check-name> failing 2+ ticks.
Failure: <one-line description of the error>
Fix: change <file>:<line> — <specific patch in plain English>
Verify: <command to run before pushing> (e.g., pnpm -C packages/cli test)"
```
Idempotency: one DRIVER send per failure-class per 30 minutes (sentinel ~/.cache/babysit/${s}.last_driver_${checkname}). After sending, babysit reverts to observer mode for the same check name until the failure either changes class or resolves.
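Building the DRIVER message from the extracted fields can be kept separate from sending it, which makes the "no specifics, no send" rule easy to enforce. A minimal sketch follows; `driver_message` is a hypothetical helper that only formats text in the shape of the template above and refuses to produce output when any field is missing.

```bash
# driver_message <check-name> <failure> <file:line> <patch> <verify-cmd>
# Prints the four-line DRIVER message body. Fails without output when
# any field is empty, so the caller can push-notify the user instead.
driver_message() {
  local arg
  for arg in "$1" "$2" "$3" "$4" "$5"; do
    [ -n "$arg" ] || return 1
  done
  printf 'DRIVER (babysit): %s failing 2+ ticks.\n' "$1"
  printf 'Failure: %s\n' "$2"
  printf 'Fix: change %s — %s\n' "$3" "$4"
  printf 'Verify: %s\n' "$5"
}

# Possible call site (not executed here):
#   msg=$(driver_message "$check" "$failure" "$loc" "$patch" "$verify") \
#     && ao send "$session" "$msg"
```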
Do NOT in DRIVER mode:
- Send an `ao send` with raw `gh run` output pasted in. Distill to one specific actionable instruction.
- Stack with `/auton` or `/eloop`: those are system-level drivers; let them run or stop babysit.

Why this exists: babysit's success metric was "status posted", not "failure fixed". Generic nudges do not move workers that have already tried and failed; they need exact file:line fix instructions. Observed in PR #7618 (rate-limit) over 4+ hours with 15+ babysit status updates and zero progress. Memory: [[babysit-not-a-driver]]. See also [[pr-driver-loop-contract]] in ~/.claude/CLAUDE.md.
Do NOT:
- Run `/babysit` and `/auton` simultaneously: they overlap and produce duplicate push-notifications
- Send Enter on a `❯` prompt without verifying a question is being asked: Enter on an empty prompt submits a blank, which is harmless but noise
- Set `--max-min` beyond 180 without explicit user approval: long watches consume push-notification quota and can mask real alerts
- Send a generic `ao send "fix CI"` message: that is the exact failure mode DRIVER mode was created to prevent. If you cannot extract a specific file:line + change, push-notify the user instead.

| Failure | Detection | Recovery |
|---|---|---|
| tmux server not running | `tmux has-session` returns 1 | Push-notify: "tmux server down — restart with tmux start-server" |
| ao-* sessions exist but `capture-pane` returns empty | `last == ""` | Mark DEAD, push-notify |
| Sentinel dir not writable | `mkdir -p` fails | Print to stderr, continue without idempotency (degraded) |
| `ao list` not in PATH | `which ao` returns empty | Print to stderr, use tmux-only mode (no PR resolution) |
| Push-notification tool unavailable | ImportError | Fall back to printing to stdout + writing a sentinel flag file `~/.cache/babysit/${s}.needs_human` |
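Several rows of the table are checks that can run once before the loop starts, so babysit knows up front which degraded mode it is in. The sketch below is one hedged way to do that; `have_cmd` and `preflight` are hypothetical names, and `command -v` is used here in place of `which` as the POSIX way to test for a command.

```bash
# have_cmd <name>: succeed if the command is on PATH.
have_cmd() { command -v "$1" >/dev/null 2>&1; }

# preflight <sentinel-dir>
# Prints one key=value line per capability so the caller can pick a mode.
preflight() {
  local dir=$1
  if have_cmd tmux && tmux has-session 2>/dev/null; then
    echo "tmux=up"
  else
    echo "tmux=down"            # push-notify: tmux server down
  fi
  if have_cmd ao; then
    echo "pr_resolution=on"
  else
    echo "pr_resolution=off"    # tmux-only mode, no PR resolution
  fi
  if mkdir -p "$dir" 2>/dev/null && [ -w "$dir" ]; then
    echo "idempotency=on"
  else
    echo "idempotency=off"      # degraded: continue without sentinels
  fi
}
```

For example, `preflight "$HOME/.cache/babysit"` prints three lines that a wrapper could grep before deciding whether to resolve PRs or throttle notifications.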
Why not use the `Monitor` tool directly?

`Monitor` watches a long-running script and emits a notification per stdout line, which suits tailing logs. babysit is stateful: it diffs state across polls, applies idempotency via sentinel files, and remediates only on transitions. A `Monitor` invocation of `tmux capture-pane` would either flood notifications (every 60s) or miss transitions (if filtered too tightly). The stateful watch loop is the right primitive.
- `/ao-session-monitor`: single-pane one-shot detection (babysit embeds its one-liner)
- `/babysit-openclaw`: Slack-thread-based, single-shot, different model
- `/auton`: system-level autonomy diagnostic
- `/ao-lifecycle-triage`: log-driven deep triage of a single stuck worker
- `/evolve-loop` / `/eloop`: autonomous loop; babysit is intentionally NOT autonomous (always opt-in)

Install: `npx claudepluginhub jleechanorg/claude-commands --plugin claude-commands`