From memstack
Monitors and troubleshoots the TokenStack proxy for Claude Code. Use it for questions about token savings or for proxy issues.
Slash command
/memstack:compress
Monitor and troubleshoot the built-in TokenStack compression proxy for CC sessions.
When this skill activates, output:
Compress - Checking TokenStack status...
Then execute the protocol below.
| Context | Status |
|---|---|
| User says "tokenstack", "compression stats", "check proxy" | ACTIVE - run status check |
| User asks about token savings or context window | ACTIVE - point to dashboard report |
| Proxy errors or API connection failures appear | ACTIVE - run health diagnostics |
| General discussion about CC features | DORMANT - do not activate |
| User is actively coding (no proxy issues) | DORMANT - do not activate |
TokenStack is a transparent proxy between Claude Code and the Anthropic API. It compresses tool output before it reaches the API, extending effective context and lowering token cost.
This skill checks that the proxy is running and routing, and troubleshoots connection issues. For the full feature overview, enable steps, and transform tables, use the Token Optimization skill.
TokenStack ships inside the memstack-skill-loader package. There is no separate install.
python -m memstack_skill_loader dashboard --with-proxy
This starts the proxy on 127.0.0.1:8787 and sets ANTHROPIC_BASE_URL automatically.
python -m memstack_skill_loader proxy
curl http://127.0.0.1:8787/health
A healthy proxy responds on /health. If there is no response, the proxy is not running.
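The same health probe can be scripted when curl is not available. The following is a minimal sketch, not part of memstack: the function name `proxy_is_healthy` is hypothetical, and it assumes that any 2xx response on /health means the proxy is up, since the response body format is not documented here.

```python
import http.client
import urllib.error
import urllib.request


def proxy_is_healthy(base_url: str = "http://127.0.0.1:8787", timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/health answers with a 2xx status.

    Assumption: any 2xx response counts as healthy; the body is ignored.
    """
    url = base_url.rstrip("/") + "/health"
    # Bypass any system-wide HTTP proxy settings so the probe reaches localhost directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or an HTTP error status: treat as not healthy.
        return False
```

Calling `proxy_is_healthy()` with no arguments probes the default address used above; a `False` result corresponds to the "no response" case.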
Note: the proxy has no /stats endpoint. Live savings are reported in the dashboard, not over curl. See "Reading Savings" below.
Open the dashboard and look at the proxy indicator. A live PRO or FREE badge means Claude Code traffic is routing through TokenStack. If the badge reads "not detected," the dashboard was started without --with-proxy; restart it with the flag.
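As a complement to the dashboard badge, routing can be sanity-checked from the environment. This is a rough sketch under two assumptions: that ANTHROPIC_BASE_URL is visible in the environment being inspected, and that pointing at port 8787 on the loopback address means traffic goes through TokenStack. The helper name `routes_through_proxy` is hypothetical.

```python
import os
from urllib.parse import urlsplit


def routes_through_proxy(env=None, port: int = 8787) -> bool:
    """Return True if ANTHROPIC_BASE_URL points at the local proxy port.

    Assumption: the proxy is reachable as 127.0.0.1 or localhost.
    """
    env = os.environ if env is None else env
    base_url = env.get("ANTHROPIC_BASE_URL", "")
    if not base_url:
        return False
    try:
        parts = urlsplit(base_url)
        return parts.hostname in {"127.0.0.1", "localhost"} and parts.port == port
    except ValueError:
        # Malformed URL or non-numeric port.
        return False
```

The dashboard badge remains the authoritative signal; this check only confirms that the variable is set the way the proxy expects.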
Check the port:
netstat -ano | findstr 8787
If nothing is listening, start it:
python -m memstack_skill_loader dashboard --with-proxy
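The netstat command above is Windows-specific. A cross-platform port check might look like the sketch below, which simply attempts a TCP connection; `port_is_listening` is a hypothetical helper, not something the package provides.

```python
import socket


def port_is_listening(host: str = "127.0.0.1", port: int = 8787, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0
```

A `True` result only shows that some process holds the port; the /health probe is still needed to confirm that the process is the proxy.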
Savings are reported in the dashboard (default http://localhost:3333), in the Overview header and the Burn Report.
| Tier | Transforms |
|---|---|
| Free | Six lossless text reductions, always applied |
| Pro | Adds seven transforms including AST truncation (license-gated) |
A valid Pro license switches the proxy to Pro tier automatically. The dashboard badge shows the active tier.
| Symptom | Fix |
|---|---|
| No response on /health | Proxy not running. Start with python -m memstack_skill_loader dashboard --with-proxy. |
| Proxy indicator shows "not detected" | Dashboard was started without --with-proxy. Restart with the flag. |
| Badge shows FREE but you hold Pro | License cache may be stale. The proxy revalidates the license on next start. |
| Cost figures differ from Anthropic Console | Estimates use list token prices and do not model server-side prompt caching. For billed cost, check console.anthropic.com. |
TokenStack Status
Proxy: Running on :8787
Tier: PRO
Savings: read in the dashboard (Overview header and Burn Report)
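For illustration, the status report above could be assembled from check results as in this sketch. The function `format_status` is hypothetical, and the "Not running" wording for the failure case is an assumption; only the running-case layout comes from the template above.

```python
def format_status(running: bool, port: int = 8787, tier: str = "") -> str:
    """Render the status report in the layout shown above."""
    # Assumption: the source does not specify the wording when the proxy is down.
    proxy_line = f"Proxy: Running on :{port}" if running else "Proxy: Not running"
    lines = ["TokenStack Status", proxy_line]
    if running and tier:
        lines.append(f"Tier: {tier}")
    lines.append("Savings: read in the dashboard (Overview header and Burn Report)")
    return "\n".join(lines)
```

The tier line is omitted when the proxy is down, since the tier cannot be observed in that state.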
--with-proxy
python -m memstack_skill_loader proxy
| Skill | Scope | When to Use |
|---|---|---|
| Token Optimization | Full TokenStack overview, enable steps, transform tables | Understanding or turning on compression |
| Compress (this) | Health, routing, troubleshooting | Proxy not running or not routing |
/stats curl; savings are now read from the dashboard. (Jun 2026)
npx claudepluginhub cwinvestments/memstack --plugin memstack