From total-recall
Initializes total-recall memory system for Claude Code sessions. Handles session_start, permission blocks, and startup announcements to ensure memory features are active.
How this skill is triggered (by the user, by Claude, or both): Slash command
/total-recall:using-total-recall
The summary Claude sees in its skill listing, used to decide when to auto-load this skill:
This skill ensures the total-recall memory system is active for this session.
- Call the `session_start` MCP tool now (if it already ran server-side, it returns cached results instantly).
- If `session_start` is blocked by permissions (e.g., in TUI fullscreen / dontAsk mode):
  - Tell the user that total-recall needs `mcp__plugin_total-recall_total-recall__session_start` allowed to function, and that they may need to adjust permissions if they want total-recall active this session.
  - Suggest running `/total-recall:commands setup` to auto-configure permissions for future sessions.
- Report the `tierSummary`.
- Report the `storage` backend (e.g. "sqlite", "cortex", "postgres"). If it shows a fallback like "sqlite (cortex failed)", flag this prominently.
- If `lastSessionAge` is present, mention when the last session was.
- If `hints` are present, briefly surface the most relevant ones.
- If `pinned_budget_pressure` is present in hints, surface it prominently: pinned entries are eating over half the context budget, so suggest unpinning or trimming entries.
- If `backgroundTasks.reindex` shows `state: "running"`, tell the user a one-time embedding re-index is in progress (done/total): local semantic retrieval is degraded until it completes. It runs in the background and the server is fully usable meanwhile; re-check via the status tool or the next `session_start`. (This is normal after an embedding-model change or upgrade; it is not an error.)
- If `backgroundTasks.setup` is present (`event: "provisioned"`), briefly note that first-run setup finished (the memory engine was downloaded). This is a one-time message.
- Use `hints` to inform your behavior throughout the session.

The built-in web UI is available at any time via `total-recall ui` (opens a local browser dashboard on port 5577 by default). It is independent of the MCP session: no AI assistant is needed to use it.
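The startup reporting rules above can be sketched as a small function that turns a `session_start` result into announcement lines. This is only an illustration: the field names come from the text, but the `SessionStartResult` type, the string shape of `hints`, and the "failed" substring check for a storage fallback are assumptions rather than the plugin's documented schema. Surfacing the "most relevant" general hints is left out because the text does not define relevance.

```typescript
// Hypothetical result shape: field names are taken from the text above,
// but their exact types are assumptions made for this sketch.
interface SessionStartResult {
  tierSummary?: string;
  storage?: string;
  lastSessionAge?: string;
  hints?: string[];
  backgroundTasks?: {
    reindex?: { state: string; done?: number; total?: number };
    setup?: { event: string };
  };
}

// Build the lines to announce at startup, in the order the rules list them.
function buildAnnouncement(r: SessionStartResult): string[] {
  const lines: string[] = [];
  if (r.tierSummary) lines.push("Tiers: " + r.tierSummary);
  if (r.storage) {
    // Assumption: a fallback is recognizable by the word "failed",
    // as in "sqlite (cortex failed)".
    const isFallback = r.storage.indexOf("failed") !== -1;
    lines.push((isFallback ? "WARNING: storage fallback: " : "Storage: ") + r.storage);
  }
  if (r.lastSessionAge) lines.push("Last session: " + r.lastSessionAge);
  const hints = r.hints ?? [];
  // Assumption: budget pressure shows up as a hint that mentions its name.
  if (hints.some((h) => h.indexOf("pinned_budget_pressure") !== -1)) {
    lines.push("WARNING: pinned entries use over half the context budget; consider unpinning or trimming.");
  }
  const reindex = r.backgroundTasks?.reindex;
  if (reindex && reindex.state === "running") {
    const progress =
      reindex.done !== undefined && reindex.total !== undefined
        ? " (" + reindex.done + "/" + reindex.total + ")"
        : "";
    lines.push("Embedding re-index in progress" + progress + "; semantic retrieval is degraded until it completes.");
  }
  if (r.backgroundTasks?.setup?.event === "provisioned") {
    lines.push("First-run setup finished (memory engine downloaded).");
  }
  return lines;
}
```

Keeping the function pure (data in, strings out) makes it easy to check each rule in isolation, independent of how the host actually delivers the tool result.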
Once initialized, follow these behaviors throughout the session. Tool calls will be visible to the user.
Since the MCP bootstrap shim (bin/start.js) stays connected while it provisions and starts the engine, a tools/call may return a structured not-ready result before the engine is up instead of dropping the connection. The response shape is an MCP tool result with isError: true and a JSON text payload:
{ "status": "not_ready", "phase": "<phase>", "hint": "total-recall is still starting up; retry in a moment." }
The key phases you will actually see:
| phase | What it means | What to do |
|---|---|---|
| `provisioning` | First launch after a plugin update: the shim is downloading and sha256-verifying the engine binary from GitHub Releases. | Wait 5–10 seconds and retry the tool call. Self-heals; the connection stays up. |
| `engine-restarting` | The engine crashed and the shim is restarting it. | Retry shortly (5–10 s). After repeated failures the shim reports `engine-failed`. |
| `engine-failed` | Engine could not be started after several attempts. | Surface the phase to the user and suggest restarting the MCP host. |
The MCP connection never drops during provisioning — this eliminates the old MCP error -32000: Connection closed that would appear on first launch after a plugin update. Once the shim is proxying to the engine (phase: "proxying"), it emits notifications/tools/list_changed and normal operation resumes.
This is distinct from the engine's own model_not_ready error (embedding model, described below). Both use the same recovery approach: wait briefly and retry.
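As an illustration of handling the not-ready result described above, the sketch below parses a tool result and maps the phase to an action. The `ToolResult` type mirrors the general MCP tool-result layout (`isError` plus a list of content items); `parseNotReady` and `actionForPhase` are hypothetical helper names, and treating every phase other than `engine-failed` as retryable is an assumption based on the payload's own hint text.

```typescript
// Minimal view of an MCP tool result: only the fields this sketch needs.
interface ToolResult {
  isError?: boolean;
  content: Array<{ type: string; text?: string }>;
}

interface NotReady {
  status: "not_ready";
  phase: string;
  hint: string;
}

// Return the not-ready payload if the result carries one, otherwise null.
function parseNotReady(result: ToolResult): NotReady | null {
  if (!result.isError) return null;
  for (const item of result.content) {
    if (item.type !== "text" || typeof item.text !== "string") continue;
    try {
      const payload: unknown = JSON.parse(item.text);
      if (typeof payload !== "object" || payload === null) continue;
      const p = payload as { status?: unknown; phase?: unknown; hint?: unknown };
      if (p.status === "not_ready") {
        return { status: "not_ready", phase: String(p.phase ?? ""), hint: String(p.hint ?? "") };
      }
    } catch {
      // Not JSON: some other kind of error text, so keep looking.
    }
  }
  return null;
}

type Action = "retry" | "surface_to_user";

// Assumption: only engine-failed is terminal; every other phase is retried.
function actionForPhase(phase: string): Action {
  return phase === "engine-failed" ? "surface_to_user" : "retry";
}
```

A caller would typically check `parseNotReady` first, wait 5–10 seconds when the action is `retry`, and fall through to ordinary error handling when the function returns `null`.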
When session_start returns an error response containing "error": "model_not_ready", parse the JSON payload and follow the recovery flow based on reason:
| reason | What it means | What to do |
|---|---|---|
| `downloading` | First-run bootstrap is in progress (loading/validating the ~133 MB bundled ONNX model). Another process or this one holds the lock. | Wait 5–10 seconds and call `session_start` again. Repeat up to 12 times (~2 minutes total). Surface a brief status to the user on the first retry: "Total-recall is preparing its embedding model on first run. This is a one-time setup." |
| `missing` | Bundled model not found on disk. The runtime does not download it; it must be present in the artifact (fetched + sha256-verified at build via `scripts/fetch-bge-small.sh`). | Surface the `hint` field to the user verbatim (it contains manual install instructions) and proceed without memory features for this session. |
| `corrupted` | Model file present but failed checksum (e.g., partial install or a bad bundled file). | Call `session_start` once more in case it was a transient read. If it fails again with the same reason, surface the `hint` field to the user verbatim (it contains manual install instructions) and proceed without memory. |
| `failed` | Other unrecoverable error preparing the model. | Surface the `hint` field verbatim to the user (manual install commands) and proceed without memory features for this session. Do NOT keep retrying; that will only delay the user. |
After successful recovery, all subsequent total-recall behaviors (capture, retrieve, session end) should resume normally. If recovery is impossible, the assistant must continue helping the user with their actual task — memory unavailability is a degraded mode, not a fatal error.
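The recovery flow in the table above can be expressed as a pure decision function, sketched here under stated assumptions. `nextStep` and the `Step` type are hypothetical names; the 10-second wait is the upper end of the 5–10 second range; and falling back to surfacing the hint once the 12 `downloading` retries are exhausted is an assumption, since the text does not say what happens at that point.

```typescript
type Step =
  | { kind: "retry"; waitMs: number; notifyUser: boolean }
  | { kind: "surface_hint" }; // show `hint` verbatim, continue without memory

const MAX_DOWNLOAD_RETRIES = 12; // 12 retries x 10 s is roughly 2 minutes
const DOWNLOAD_WAIT_MS = 10000;

// Decide what to do after a model_not_ready error.
// `retriesSoFar` is the number of retries already made for this reason.
function nextStep(reason: string, retriesSoFar: number): Step {
  switch (reason) {
    case "downloading":
      if (retriesSoFar < MAX_DOWNLOAD_RETRIES) {
        // Tell the user about the one-time setup on the first retry only.
        return { kind: "retry", waitMs: DOWNLOAD_WAIT_MS, notifyUser: retriesSoFar === 0 };
      }
      return { kind: "surface_hint" }; // assumption: degrade after the budget is spent
    case "corrupted":
      // One extra attempt in case the failure was a transient read.
      return retriesSoFar < 1
        ? { kind: "retry", waitMs: 0, notifyUser: false }
        : { kind: "surface_hint" };
    case "missing":
    case "failed":
    default:
      // Unrecoverable (or unknown) reasons: never loop.
      return { kind: "surface_hint" };
  }
}
```

Separating the decision from the waiting and the actual tool call keeps the retry budget explicit, and it makes the rule that memory unavailability is a degraded mode rather than a fatal error visible in a single place.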
When you detect these patterns in user messages, call memory_store:
- `memory_pin` (for new content: `memory_store` with `pinned: true`)
- `memory_unpin`

Do NOT ask permission; just store it.
Storage constraints (apply when storing or pinning):
On each user message that is a question or task request:
- Call `memory_search` with the message, searching warm tier

`memory_search` and `kb_search` now return a `retrievalId` alongside results.
After you use retrieved memories in your work, call memory_feedback:
- If the retrieved memories were used: `memory_feedback({ retrievalId, used: true })`
- If they were not used: `memory_feedback({ retrievalId, used: false })`

Report once per search, as soon as you know the outcome. Skip the call only when `retrievalId` is empty. Do NOT ask permission; just report it. This is what makes the Dashboard "Retrieval quality" metric real.
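The "once per search" rule for `memory_feedback` can be illustrated with a small tracker that refuses empty or already-reported IDs. `FeedbackTracker` and `FeedbackCall` are hypothetical names introduced for this sketch; only the tool name and the `{ retrievalId, used }` arguments come from the text.

```typescript
// Description of the tool call to make; actually sending it is up to the host.
interface FeedbackCall {
  tool: "memory_feedback";
  args: { retrievalId: string; used: boolean };
}

class FeedbackTracker {
  // IDs that have already been reported in this session.
  private reported: { [retrievalId: string]: true } = {};

  // Returns the call to make, or null when the call should be skipped.
  report(retrievalId: string, used: boolean): FeedbackCall | null {
    if (retrievalId === "") return null; // the only case the text says to skip
    if (this.reported[retrievalId] === true) return null; // report once per search
    this.reported[retrievalId] = true;
    return { tool: "memory_feedback", args: { retrievalId, used } };
  }
}
```

For example, calling `report("r1", true)` twice yields one call followed by `null`, so a later, contradictory outcome for the same search is never sent.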
Pinned directives are re-asserted automatically near the live edge by the
per-turn pinned floor (where the host supports it — see the capability matrix in
skills/commands/SKILL.md). You do not need to do anything for this.
Additionally: when the user makes a significant task switch (a clearly new
piece of work), call session_refresh once. This re-prepends the pinned block
and refreshes hot-tier context near the current generation point. On hosts
without a per-turn floor (e.g. Cursor) this is the primary way pinned directives
stay salient — so do it on task switches there especially.
1. Call `session_context` to get current hot tier entries
2. Run the `total-recall:compactor` agent with the entries as input
3. Call `session_end` for final bookkeeping

npx claudepluginhub strvmarv/total-recall-marketplace --plugin total-recall