Skill

using-total-recall

Initializes the total-recall memory system for Claude Code sessions. Handles session_start, permission blocks, and startup announcements to ensure memory features are active.

developer-tools

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/total-recall:using-total-recall

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

This skill ensures the total-recall memory system is active for this session.

SKILL.md

129 lines · ~2.3k tokens

Stats

Language: C#

Stars: 11

Forks: 3

Maintenance: Excellent

Last commit: Jun 23, 2026

Using total-recall

This skill ensures the total-recall memory system is active for this session.

Immediate Action

1. Call the total-recall session_start MCP tool now. If it already ran server-side, it returns cached results instantly.
2. If session_start is blocked by permissions (e.g., in TUI fullscreen / dontAsk mode):
   - Tell the user: "total-recall session_start was blocked by permissions. The memory system needs mcp__plugin_total-recall_total-recall__session_start allowed to function. You may need to adjust permissions if you want total-recall active this session."
   - Suggest running /total-recall:commands setup to auto-configure permissions for future sessions
   - Proceed without memory features; this is degraded mode, not fatal
3. Announce startup using the returned data:
   - Report tier summary: pinned, hot, warm, cold, KB counts from tierSummary
   - Report storage backend from storage (e.g. "sqlite", "cortex", "postgres"). If it shows a fallback like "sqlite (cortex failed)", flag this prominently.
   - If lastSessionAge is present, mention when the last session was
   - If hints are present, briefly surface the most relevant ones
   - If pinned_budget_pressure is present in hints, surface it prominently: pinned entries are consuming over half the context budget, so suggest unpinning or trimming entries
   - If backgroundTasks.reindex shows state: "running", tell the user a one-time embedding re-index is in progress (done/total) and that local semantic retrieval is degraded until it completes. It runs in the background and the server is fully usable meanwhile; re-check via the status tool or the next session_start. (This is normal after an embedding-model change or upgrade; it is not an error.)
   - If backgroundTasks.setup is present (event: "provisioned"), briefly note that first-run setup finished (the memory engine was downloaded). This is a one-time message.
   - Keep it to 2-3 lines max
4. Use hints to inform your behavior throughout the session
5. Incorporate the returned context to inform your responses
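
The announcement step can be pictured as a small formatter over the session_start result. This is a minimal sketch, not the plugin's implementation: it assumes tierSummary maps tier names to counts (with a hypothetical "kb" key), that hints is a list of strings, and that backgroundTasks.reindex carries done and total fields. The function name is illustrative only.

```python
def startup_announcement(result: dict) -> str:
    """Build a startup summary of at most three lines from a session_start result."""
    tiers = result.get("tierSummary", {})
    # Assumed shape: tierSummary maps tier name -> entry count.
    counts = ", ".join(
        f"{name} {tiers.get(name, 0)}"
        for name in ("pinned", "hot", "warm", "cold", "kb")
    )
    storage = str(result.get("storage", "unknown"))
    first = f"total-recall active: {counts}; storage: {storage}"
    if "failed" in storage:
        # A fallback such as "sqlite (cortex failed)" should stand out.
        first += " [WARNING: storage fallback]"
    lines = [first]

    extras = []
    if result.get("lastSessionAge"):
        extras.append(f"last session: {result['lastSessionAge']}")
    reindex = result.get("backgroundTasks", {}).get("reindex", {})
    if reindex.get("state") == "running":
        done, total = reindex.get("done", 0), reindex.get("total", 0)
        extras.append(f"re-index running ({done}/{total})")
    if extras:
        lines.append("; ".join(extras))

    hints = result.get("hints", [])
    # Assumed shape: hints is a list of strings.
    if any("pinned_budget_pressure" in hint for hint in hints):
        lines.append("Pinned entries use over half the context budget; consider unpinning.")
    elif hints:
        lines.append(f"hint: {hints[0]}")
    return "\n".join(lines[:3])
```

Budget pressure takes priority over other hints here so that the summary stays within the three-line limit.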

The built-in web UI is available at any time via total-recall ui (opens a local browser dashboard on port 5577 by default). It is independent of the MCP session — no AI assistant needed to use it.

Ongoing Behaviors

Once initialized, follow these behaviors throughout the session. Tool calls will be visible to the user.

Handling startup not-ready results (shim provisioning)

Since the MCP bootstrap shim (bin/start.js) stays connected while it provisions and starts the engine, a tools/call may return a structured not-ready result before the engine is up instead of dropping the connection. The response shape is an MCP tool result with isError: true and a JSON text payload:

{ "status": "not_ready", "phase": "<phase>", "hint": "total-recall is still starting up; retry in a moment." }

The key phases you will actually see:

| phase | What it means | What to do |
| --- | --- | --- |
| `provisioning` | First launch after a plugin update: the shim is downloading and sha256-verifying the engine binary from GitHub Releases. | Wait 5–10 seconds and retry the tool call. Self-heals; the connection stays up. |
| `engine-restarting` | The engine crashed and the shim is restarting it. | Retry shortly (5–10 s). After repeated failures the shim reports `engine-failed`. |
| `engine-failed` | Engine could not be started after several attempts. | Surface the phase to the user and suggest restarting the MCP host. |

The MCP connection never drops during provisioning — this eliminates the old MCP error -32000: Connection closed that would appear on first launch after a plugin update. Once the shim is proxying to the engine (phase: "proxying"), it emits notifications/tools/list_changed and normal operation resumes.

This is distinct from the engine's own model_not_ready error (embedding model, described below). Both use the same recovery approach: wait briefly and retry.
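
The phase handling above can be sketched as a small classifier over a tool result. This is an illustrative sketch only: the function name and the returned action labels are hypothetical, and treating unlisted phases as transient is an assumption based on the payload's "retry in a moment" hint.

```python
import json
from typing import Optional

RETRYABLE_PHASES = {"provisioning", "engine-restarting"}


def not_ready_action(is_error: bool, text: str) -> Optional[str]:
    """Return "retry", "surface", or None for an MCP tool result.

    None means the result is not a shim not-ready payload.
    """
    if not is_error:
        return None
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict) or payload.get("status") != "not_ready":
        return None
    phase = payload.get("phase")
    if phase in RETRYABLE_PHASES:
        return "retry"    # wait 5-10 s, then repeat the tool call
    if phase == "engine-failed":
        return "surface"  # tell the user; suggest restarting the MCP host
    return "retry"        # assumption: unlisted phases are treated as transient
```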

Handling model bootstrap errors (embedding model)

When session_start returns an error response containing "error": "model_not_ready", parse the JSON payload and follow the recovery flow based on reason:

| reason | What it means | What to do |
| --- | --- | --- |
| `downloading` | First-run bootstrap is in progress (loading/validating the ~133 MB bundled ONNX model). Another process or this one holds the lock. | Wait 5–10 seconds and call `session_start` again. Repeat up to 12 times (~2 minutes total). Surface a brief status to the user on the first retry: "Total-recall is preparing its embedding model on first run. This is a one-time setup." |
| `missing` | Bundled model not found on disk. The runtime does not download it; it must be present in the artifact (fetched + sha256-verified at build via `scripts/fetch-bge-small.sh`). | Surface the `hint` field to the user verbatim (it contains manual install instructions) and proceed without memory features for this session. |
| `corrupted` | Model file present but failed checksum (e.g., partial install or a bad bundled file). | Call `session_start` once more in case it was a transient read. If it fails again with the same reason, surface the `hint` field to the user verbatim (it contains manual install instructions) and proceed without memory. |
| `failed` | Other unrecoverable error preparing the model. | Surface the `hint` field verbatim to the user (manual install commands) and proceed without memory features for this session. Do NOT keep retrying; that will only delay the user. |

After successful recovery, all subsequent total-recall behaviors (capture, retrieve, session end) should resume normally. If recovery is impossible, the assistant must continue helping the user with their actual task — memory unavailability is a degraded mode, not a fatal error.
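
The recovery flow can be sketched as a bounded retry loop. This is a hedged sketch, not the skill's actual mechanism: call_session_start is a hypothetical callable returning the JSON payload as text, and the 10-second default wait is one choice within the 5–10 second range given above.

```python
import json
import time
from typing import Callable


def recover_model(
    call_session_start: Callable[[], str],
    sleep: Callable[[float], None] = time.sleep,
    wait_seconds: float = 10.0,
) -> dict:
    """Call session_start until it succeeds or recovery is ruled out.

    A returned result that still has error == "model_not_ready" means
    "surface the hint and proceed without memory features".
    """
    downloading_retries = 0
    corrupted_retried = False
    while True:
        result = json.loads(call_session_start())
        if result.get("error") != "model_not_ready":
            return result
        reason = result.get("reason")
        if reason == "downloading" and downloading_retries < 12:
            downloading_retries += 1
            sleep(wait_seconds)
        elif reason == "corrupted" and not corrupted_retried:
            corrupted_retried = True  # one extra attempt for a transient read
        else:
            # missing, failed, or retries exhausted: stop retrying.
            return result
```

Passing sleep as a parameter keeps the loop testable without real delays.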

Capture (continuous)

When you detect these patterns in user messages, call memory_store:

- Correction: "no", "not that", "actually", "use X instead" -> type "correction"
- Preference: How the user wants things done -> type "preference"
- Decision: Non-obvious architectural or design choices -> type "decision"
- Pin: "pin that", "never forget this", "keep this permanently" -> call memory_pin (for new content: memory_store with pinned: true)
- Unpin: "unpin X" -> memory_unpin

Do NOT ask permission — just store it.

Storage constraints (apply when storing or pinning):

- Pins are short directives, not reference material. Pinned entries are capped at 500 characters. If the content is longer, distill the RULE into <= 500 chars and pin the distillation; keep the full detail as a normal warm memory.
- Store atomic, concise memories at every tier: one fact per entry; split compound observations into separate memories. Long reference content belongs in the knowledge base (kb_ingest), not in memories.
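
The pin size rule can be sketched as a planning helper that decides which store calls to make. This is an illustration under assumptions: the "content" key is a hypothetical argument name (only pinned: true appears in the text above), and the distillation itself is left to the caller.

```python
PIN_CHAR_LIMIT = 500  # cap on pinned entries, from the constraint above


def plan_pin(content: str, distilled_rule: str = "") -> list:
    """Return the memory_store calls needed to pin `content`.

    Each call is a dict of hypothetical memory_store arguments.
    """
    if len(content) <= PIN_CHAR_LIMIT:
        return [{"content": content, "pinned": True}]
    if not distilled_rule or len(distilled_rule) > PIN_CHAR_LIMIT:
        raise ValueError("content exceeds 500 chars; supply a distilled rule <= 500 chars")
    return [
        {"content": distilled_rule, "pinned": True},  # short directive
        {"content": content, "pinned": False},        # full detail, normal memory
    ]
```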

Retrieve (continuous)

On each user message that is a question or task request:

1. Call memory_search with the message, searching warm tier
2. If top score < 0.5, also search cold/knowledge tier
3. Use results to inform your response
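
The two-stage lookup can be sketched as follows. The search(query, tier) callable is a hypothetical wrapper around memory_search, the result shape (a list of dicts with a "score" field) and the tier name "cold" are assumptions, and an empty warm result is treated as falling below the threshold.

```python
from typing import Callable

SCORE_THRESHOLD = 0.5


def tiered_search(query: str, search: Callable[[str, str], list]) -> list:
    """Search the warm tier, widening to cold when the best score is weak."""
    warm = search(query, "warm")
    top = max((hit["score"] for hit in warm), default=0.0)
    if top >= SCORE_THRESHOLD:
        return warm
    cold = search(query, "cold")
    # Merge both tiers, best match first.
    return sorted(warm + cold, key=lambda hit: hit["score"], reverse=True)
```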

Feedback (continuous)

memory_search and kb_search now return a retrievalId alongside results. After you use retrieved memories in your work, call memory_feedback:

- Used it → memory_feedback({ retrievalId, used: true })
- Searched and nothing was relevant → memory_feedback({ retrievalId, used: false })

Report once per search, as soon as you know the outcome. Skip the call only when retrievalId is empty. Do NOT ask permission — just report it. This is what makes the Dashboard "Retrieval quality" metric real.
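
The "once per search, skip when empty" rule can be sketched as a small guard around the feedback call. The class name and the send callable are hypothetical; send stands in for whatever actually invokes memory_feedback.

```python
from typing import Callable


class FeedbackReporter:
    """Sends at most one memory_feedback payload per retrievalId."""

    def __init__(self, send: Callable[[dict], None]):
        self._send = send
        self._reported = set()

    def report(self, retrieval_id: str, used: bool) -> bool:
        """Return True if feedback was sent, False if it was skipped."""
        if not retrieval_id or retrieval_id in self._reported:
            return False  # empty id or already reported
        self._reported.add(retrieval_id)
        self._send({"retrievalId": retrieval_id, "used": used})
        return True
```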

Pinned directives (continuous)

Pinned directives are re-asserted automatically near the live edge by the per-turn pinned floor (where the host supports it — see the capability matrix in skills/commands/SKILL.md). You do not need to do anything for this.

Additionally: when the user makes a significant task switch (a clearly new piece of work), call session_refresh once. This re-prepends the pinned block and refreshes hot-tier context near the current generation point. On hosts without a per-turn floor (e.g. Cursor) this is the primary way pinned directives stay salient — so do it on task switches there especially.

Session End

1. Call session_context to get current hot tier entries
2. If there are 2+ hot entries, launch the total-recall:compactor agent with the entries as input
3. Parse the agent's JSON decisions and execute them
4. Call session_end for final bookkeeping
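
The end-of-session sequence can be sketched as an orchestration function. All four callables are hypothetical stand-ins for the tools and agent named above, and the decision format is assumed to be a list of dicts already parsed from the agent's JSON.

```python
from typing import Callable


def end_session(
    get_hot_entries: Callable[[], list],
    run_compactor: Callable[[list], list],
    execute: Callable[[dict], None],
    session_end: Callable[[], None],
) -> int:
    """Run the session-end steps in order; return the number of decisions executed."""
    entries = get_hot_entries()  # stands in for session_context
    executed = 0
    if len(entries) >= 2:
        # Compaction only runs when there are at least two hot entries.
        for decision in run_compactor(entries):
            execute(decision)
            executed += 1
    session_end()  # final bookkeeping always runs
    return executed
```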

Rules

- Let tool calls be visible, so users can see that memory is working
- ALWAYS store corrections; they are the highest-value memories
- ALWAYS search warm tier before answering project questions
- NEVER modify host tool files (Claude Code memory/, CLAUDE.md, etc.)
