Agent self-improvement + retrieval. Captures learnings via colon-namespaced sub-skills (reflect, reflect:consolidate, reflect:ingest, reflect:recall, reflect-status) and auto-injects relevant prior learnings via SessionStart/UserPromptSubmit hooks. PostToolUse arms low-cost mini-learning capture; Stop enqueues short-session reflection. v4 cost rearchitecture: a $0 skip-gate + cascade slice the drain so it runs on Sonnet under hard caps (8 turns / 180s / 2M-token poison), with cost observability (reflect cost) and a weekly Opus synthesis pass. Philosophy: Correct once, never again; recall everything next time.
Full conversation scan for self-improvement. Detects behavioral corrections and knowledge signals, classifies them, proposes agent updates and knowledge notes with entity sidecars for GraphRAG indexing. Correct once, never again.
Retrieve relevant prior learnings from the global knowledge base. Hybrid vector + graph search over 170+ indexed learnings, reranked by confidence, recency, and tag overlap. Use when starting work, debugging a recurring problem, or before implementing a feature that may have prior art.
The global knowledge indexer. Harvests ALL memory sources across all tools (Claude, Codex, Copilot, Gemini) and all project types into the unified GraphRAG + QMD knowledge base. Archives originals, generates entity sidecars, and dual-indexes for future retrieval. This is THE command that makes the knowledge base comprehensive.
Project-level memory consolidation. Merges orphaned worktree memory directories into a single .agents/MEMORY.md for the current project. Deduplicates, sections, and proposes cleanup of orphan dirs. Does NOT index into the global knowledge base — use reflect:ingest for that.
Show reflection metrics, pending reviews, sidecar coverage, and GraphRAG health. Read-only views into the reflect system state. Can also approve/reject pending low-confidence items.
Long-term memory for AI coding agents — correct once, never again.
reflect captures every correction and design decision your AI assistant makes, indexes them into a hybrid GraphRAG + BM25 knowledge base, and recalls the relevant ones at the start of every new session, before you type your first prompt.
Works across Claude Code, Codex CLI, and GitHub Copilot — same engine, same KB, three harnesses.
📖 Full documentation → stevengonsalvez.github.io/agents-in-a-box — architecture, per-harness setup, and the Postgres backend in depth.
If you've used AI coding assistants for more than a week, you've corrected the same mistake twice. Maybe ten times. The assistant doesn't remember that:
The context window forgets the moment the session ends. reflect fixes that by capturing corrections as structured learnings, indexing them into a searchable knowledge base, and recalling the relevant ones at the start of every new session — so a fix you make once is a fix you never have to make again.
The engine lives at the repo root — install it with uv and the [graph] extra (pulls the full GraphRAG + vector stack):
uv tool install --upgrade 'git+https://github.com/stevengonsalvez/ainb-reflect-memory.git[graph]'
Verify with reflect --version.
reflect init # one-time: create the KB at ~/.claude/global-learnings/
reflect add ./my-solution.md # capture a learning (optional --entities sidecar)
reflect search "how did we fix the tokio panic" # hybrid GraphRAG + BM25 recall
The plugin (hooks + skills) that wires reflect into your agent harness lives under plugin/. Install it from this repo's marketplace:
claude plugin marketplace add stevengonsalvez/ainb-reflect-memory
claude plugin install reflect@ainb-reflect-memory
See plugin/README.md for the lifecycle hooks, sub-skills, and the Codex / Copilot adapters. (ainb reflect bootstrap installs the engine + prints system-tool steps in one shot.)
reflect runs a capture → index → recall loop:
/reflect analyses your conversation, classifies corrections vs. successes, and writes a Markdown learning note plus a YAML entity sidecar (people, files, libraries, decisions). A PreCompact hook fires automatically when the agent compacts a conversation, so nothing is lost.
At SessionStart, a hook runs hybrid search using the new session's working dir + recent commits as the query, fuses the results, reranks by confidence × recency × tag overlap, and injects the top three into the agent's context before you type anything.
The markdown KB (~/.claude/global-learnings/*.md) is always the local source of truth, and all LLM, embedding, and clustering work stays client-side.
What changes between the two modes is only the derived vector + graph store.
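The rerank step in the recall loop (confidence × recency × tag overlap, top three injected) can be pictured with a minimal Python sketch. This is not reflect's actual code: the Learning dataclass, the 30-day half-life for recency, and the use of Jaccard similarity for tag overlap are all assumptions chosen for illustration; the source states only the three factors and the top-three cutoff.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Learning:
    """Hypothetical shape of an indexed learning (not reflect's real schema)."""
    title: str
    confidence: float      # assumed to lie in [0, 1]
    updated_at: datetime   # timezone-aware timestamp
    tags: frozenset[str]


def recency_weight(updated_at: datetime, now: datetime,
                   half_life_days: float = 30.0) -> float:
    """Exponential decay: the weight halves every half_life_days (assumed value)."""
    age_days = max((now - updated_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def tag_overlap(learning_tags: frozenset[str], query_tags: frozenset[str]) -> float:
    """Jaccard similarity between the two tag sets (assumed overlap measure)."""
    union = learning_tags | query_tags
    if not union:
        return 0.0
    return len(learning_tags & query_tags) / len(union)


def rerank(candidates: list[Learning], query_tags: frozenset[str],
           now: datetime, top_k: int = 3) -> list[Learning]:
    """Score each candidate as confidence * recency * tag overlap; keep the top_k."""
    def score(item: Learning) -> float:
        return (item.confidence
                * recency_weight(item.updated_at, now)
                * tag_overlap(item.tags, query_tags))
    return sorted(candidates, key=score, reverse=True)[:top_k]


if __name__ == "__main__":
    now = datetime(2025, 1, 31, tzinfo=timezone.utc)
    query = frozenset({"rust", "tokio"})
    pool = [
        Learning("a", 0.9, now, frozenset({"rust", "tokio"})),
        Learning("b", 0.9, now - timedelta(days=30), frozenset({"rust", "tokio"})),
        Learning("c", 1.0, now, frozenset({"python"})),
        Learning("d", 0.5, now, frozenset({"rust"})),
    ]
    print([item.title for item in rerank(pool, query, now)])
```

Because the score is a plain product, a learning with no shared tags scores zero no matter how confident or recent it is. A real implementation might smooth that factor; the sketch keeps the literal product to mirror the description above.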