Portable multi-model orchestration: delegate to Ollama cloud, NVIDIA NIM, NVIDIA Security, and Codex from Claude Code.
Hand off to Codex for review, rescue, or adversarial verification
Auto-route a task — Opus picks models, dispatches in parallel, Codex verifies
List all available delegation models across providers
Security audit / PII / guardrail task via NVIDIA Security NIM
Delegate a prompt to an NVIDIA NIM frontier model
Admin access level
Server config contains admin-level keywords
Requires secrets
Needs API keys or credentials to function
Portable Claude Code plugin for automatic multi-model orchestration. Opus plans + synthesizes. Sonnet/Haiku/Ollama cloud/NVIDIA NIM/NVIDIA Security/Codex execute in parallel. Codex verifies before merge. No user prompting for model choice — Opus auto-routes from task signal.
Drop into any project and start delegating across providers immediately.
One plugin (multi-model) bundling 3 MCP servers + 6 slash commands + 1 auto-trigger skill.
No .mcp.json needed; the plugin manifest loads the MCP servers.

| Requirement | Notes |
|---|---|
| Claude Code | Version with plugin + marketplace support |
| Node.js ≥ 18 | on PATH |
| @modelcontextprotocol/sdk, zod (global npm) | npm i -g @modelcontextprotocol/sdk zod |
| MCP_GLOBAL_MODULES env | Points at your global node_modules. Windows: C:\Users\<you>\AppData\Roaming\npm\node_modules. macOS/Linux: output of npm root -g. |
| NVIDIA_API_KEY (optional) | For NVIDIA NIM + Security. Get at build.nvidia.com. |
| OLLAMA_HOST (optional) | Default http://localhost:11434. Ollama cloud models require an Ollama install + cloud-enabled account. |
| Codex plugin (optional) | For /codex:review, /codex:rescue, /codex:adversarial-review. Install from openai/codex-plugin-cc. Requires the Codex CLI on PATH. |
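The requirements in the table can be checked before installing. The following is a minimal sketch, not part of the plugin: the function names `node_version_ok` and `preflight` are hypothetical, and it only inspects the items the table lists (Node.js major version, `MCP_GLOBAL_MODULES`, and the two optional variables).

```shell
# Sketch only: function names are hypothetical, not shipped with the plugin.

# Succeeds when a Node.js version string such as "v18.17.0" has major >= 18.
node_version_ok() {
  local version major
  version="${1#v}"          # strip a leading "v" if present
  major="${version%%.*}"    # keep the text before the first dot
  case "$major" in
    ''|*[!0-9]*) return 1 ;;  # empty or non-numeric major version
  esac
  [ "$major" -ge 18 ]
}

# Prints one line per check; returns non-zero if a required item is missing.
preflight() {
  local failed
  failed=0
  if command -v node >/dev/null 2>&1 && node_version_ok "$(node --version)"; then
    echo "ok: node $(node --version)"
  else
    echo "missing: Node.js >= 18 on PATH"
    failed=1
  fi
  if [ -n "${MCP_GLOBAL_MODULES:-}" ] && [ -d "$MCP_GLOBAL_MODULES" ]; then
    echo "ok: MCP_GLOBAL_MODULES=$MCP_GLOBAL_MODULES"
  else
    echo "missing: MCP_GLOBAL_MODULES (try: npm root -g)"
    failed=1
  fi
  # Optional settings only produce a note, never a failure.
  [ -n "${NVIDIA_API_KEY:-}" ] || echo "note: NVIDIA_API_KEY unset; NVIDIA NIM + Security will not be usable"
  echo "note: OLLAMA_HOST=${OLLAMA_HOST:-http://localhost:11434}"
  return "$failed"
}
```

Sourcing the snippet and running `preflight` prints a short report; a non-zero exit status means a required item is missing.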
Two commands, any project, any machine:
claude plugin marketplace add ranjankumarpatel/claude-code-multi-model
claude plugin install multi-model@claude-code-multi-model
Restart Claude Code → plugin auto-loads with its 3 MCP servers. Verify:
claude mcp list # expect plugin:multi-model:{ollama,nvidia-nim,nvidia-security}
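The verification step can be scripted. This sketch assumes that the output of `claude mcp list` contains the three server names literally as written in the comment above (`plugin:multi-model:ollama` and so on); the exact output format is not confirmed here, and `missing_mcp_servers` is a hypothetical helper name.

```shell
# Sketch only: reads `claude mcp list` output on stdin and prints every
# expected server name that does not appear in it. Assumes the names show up
# verbatim in the listing.
missing_mcp_servers() {
  local listing server status
  status=0
  listing="$(cat)"
  for server in ollama nvidia-nim nvidia-security; do
    case "$listing" in
      *"plugin:multi-model:$server"*) ;;          # found, nothing to report
      *) echo "plugin:multi-model:$server"; status=1 ;;
    esac
  done
  return "$status"
}
```

Usage: `claude mcp list | missing_mcp_servers`. No output and exit status 0 means all three servers were listed.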
Updates: claude plugin update multi-model@claude-code-multi-model.
For hacking on the plugin itself:
git clone https://github.com/ranjankumarpatel/claude-code-multi-model.git
claude plugin marketplace add /absolute/path/to/claude-code-multi-model
claude plugin install multi-model@claude-code-multi-model
Set once per machine (shell profile):
# Required for MCP servers to find the SDK
export MCP_GLOBAL_MODULES="$(npm root -g)"
# Optional — NVIDIA NIM + Security
export NVIDIA_API_KEY="nvapi-..."
# Optional — override Ollama host
export OLLAMA_HOST="http://localhost:11434"
Windows PowerShell:
setx MCP_GLOBAL_MODULES "C:\Users\$env:USERNAME\AppData\Roaming\npm\node_modules"
setx NVIDIA_API_KEY "nvapi-..."
Install MCP deps globally:
npm i -g @modelcontextprotocol/sdk zod
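To confirm the global install landed where `MCP_GLOBAL_MODULES` points, a check along these lines may help. It is a sketch under one assumption: that each package is present when its `package.json` exists under the modules directory. The helper name `missing_mcp_deps` is hypothetical.

```shell
# Sketch only: prints each required package that is absent from the given
# node_modules directory. Prints nothing and returns 0 when both are present.
missing_mcp_deps() {
  local modules_dir pkg status
  modules_dir="$1"
  status=0
  for pkg in "@modelcontextprotocol/sdk" "zod"; do
    if [ ! -f "$modules_dir/$pkg/package.json" ]; then
      echo "$pkg"
      status=1
    fi
  done
  return "$status"
}
```

Usage: `missing_mcp_deps "${MCP_GLOBAL_MODULES:-$(npm root -g)}"`. Any package name it prints still needs `npm i -g`.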
Codex is optional but recommended — it's the verification gate + rescue executor in the auto-routing pattern.
Check that codex runs on your terminal.
Install the plugin: claude plugin install codex@claude-code-multi-model
Run /codex:review or /codex:rescue inside Claude Code.
If Codex is not installed, multi-model still works; auto-routing will simply skip the Codex verification step.
Opus never edits files or runs shell commands directly. It parses your request, decomposes it into subtasks, and dispatches each to the best executor using this rubric:
| Task signal | Auto-route to |
|---|---|
| Bulk read / grep / rename / format | Haiku |
| Multi-file refactor, debugging, tests | Sonnet |
| Deep chain-of-thought reasoning | kimi-k2-thinking:cloud or deepseek-r1 |
| Coding second opinion / alt-frontier | gemma4:31b-cloud or nemotron-ultra |
| Long-context / agentic / vision | kimi-k2.5:cloud |
| Multilingual / non-English code | mistral-large |
| Large general-purpose | llama405b |
| Security audit / CVE / OWASP / PII / injection | NVIDIA Security |
| Stuck / failing tests / pre-merge verify | Codex |
| ≥2 independent subtasks | Parallel in one message |
You just state the goal. Opus reports the route in one line (e.g. Routing: refactor → Sonnet; rename → Haiku; audit → NVIDIA Security) and runs.
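The rubric can be read as an ordered lookup from task signal to executor. The sketch below illustrates that reading with plain keyword matching. This is an assumption made for illustration: in the plugin, Opus infers the route from the request itself, and the `route_task` helper, its keyword lists, the match order, and the `unrouted` fallback are all hypothetical.

```shell
# Sketch only: maps a task description to an executor by keyword.
# Keywords, ordering, and the "unrouted" fallback are illustrative assumptions.
route_task() {
  local task
  task="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$task" in
    *security*|*cve*|*owasp*|*pii*|*injection*) echo "NVIDIA Security" ;;
    *stuck*|*"failing test"*|*pre-merge*)       echo "Codex" ;;
    *grep*|*rename*|*format*|*"bulk read"*)     echo "Haiku" ;;
    *refactor*|*debug*|*test*)                  echo "Sonnet" ;;
    *multilingual*|*non-english*)               echo "mistral-large" ;;
    *long-context*|*agentic*|*vision*)          echo "kimi-k2.5:cloud" ;;
    *)                                          echo "unrouted" ;;
  esac
}
```

For example, `route_task "Rename variables across src/"` prints `Haiku`, and `route_task "Security audit of the login handler"` prints `NVIDIA Security`. Security and Codex signals are matched first here so that a request like "failing tests" is not captured by the broader `test` keyword.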
npx claudepluginhub ranjankumarpatel/claude-code-multi-model --plugin multi-model