Model routing configuration templates and strategies for cost optimization, speed optimization, quality optimization, and intelligent fallback chains. Use when building AI applications with OpenRouter, implementing model routing strategies, optimizing API costs, setting up fallback chains, implementing quality-based routing, or when user mentions model routing, cost optimization, fallback strategies, model selection, intelligent routing, or dynamic model switching.
Additional assets for this skill:
README.md
examples/cost-routing-example.md
examples/dynamic-routing-example.md
examples/fallback-chain-example.md
examples/monitoring-example.md
scripts/analyze-cost-savings.sh
scripts/generate-routing-config.sh
scripts/test-fallback-chain.sh
scripts/validate-routing-config.sh
templates/balanced-routing.json
templates/cost-optimized-routing.json
templates/custom-routing-template.json
templates/quality-optimized-routing.json
templates/routing-config.py
templates/routing-config.ts
templates/speed-optimized-routing.json

Production-ready model routing configurations and strategies for OpenRouter that optimize for cost, speed, quality, or balanced performance with intelligent fallback chains.
This skill provides comprehensive templates, scripts, and strategies for implementing sophisticated model routing in OpenRouter-powered applications. It helps you:
Use this skill when:
Goal: Minimize API costs while maintaining acceptable quality
Strategy:
Template: templates/cost-optimized-routing.json
Best for:
Goal: Minimize latency and response time
Strategy:
Template: templates/speed-optimized-routing.json
Best for:
Goal: Maximize output quality with premium models
Strategy:
Template: templates/quality-optimized-routing.json
Best for:
Goal: Dynamically route based on task complexity
Strategy:
Template: templates/balanced-routing.json
Best for:
Goal: Implement domain-specific routing logic
Template: templates/custom-routing-template.json
Customizable factors:
validate-routing-config.sh
test-fallback-chain.sh
generate-routing-config.sh
analyze-cost-savings.sh
Configuration Templates (JSON):
cost-optimized-routing.json - Free/cheap models with premium fallback
speed-optimized-routing.json - Fastest models with streaming
quality-optimized-routing.json - Premium models with fallbacks
balanced-routing.json - Task-based dynamic routing
custom-routing-template.json - Template for custom strategies

Code Templates:
routing-config.ts - TypeScript routing configuration
routing-config.py - Python routing configuration

Examples:
cost-routing-example.md - Complete cost-optimized routing setup
dynamic-routing-example.md - Task complexity-based routing
fallback-chain-example.md - 3-tier fallback strategy
monitoring-example.md - Cost tracking and analytics setup

Determine your optimization goals:
# Interactive strategy selector
./scripts/generate-routing-config.sh
Answer questions about:
# Generate from strategy type
./scripts/generate-routing-config.sh cost-optimized > config.json
# Or copy template
cp templates/cost-optimized-routing.json config.json
# Validate syntax and model availability
./scripts/validate-routing-config.sh config.json
Checks:
# Test fallback behavior
./scripts/test-fallback-chain.sh config.json
Simulates failures to ensure graceful degradation.
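To make "graceful degradation" concrete, here is a minimal Python sketch of a fallback loop exercised with simulated failures. It is not the logic of test-fallback-chain.sh, whose internals are not shown here; `call_with_fallback`, `make_flaky_client`, and `AllModelsFailed` are hypothetical names, and the stand-in client makes no network calls.

```python
from typing import Callable, Sequence, Set, Tuple


class AllModelsFailed(RuntimeError):
    """Raised when every model in the chain has failed."""


def call_with_fallback(
    models: Sequence[str], call_model: Callable[[str], str]
) -> Tuple[str, str]:
    """Try each model in order; return (model_used, response) for the first success."""
    errors = []
    for model in models:
        try:
            return model, call_model(model)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{model}: {exc}")
    raise AllModelsFailed("; ".join(errors))


def make_flaky_client(failing: Set[str]) -> Callable[[str], str]:
    """Build a stand-in client (hypothetical) that fails for the given model names."""
    def call_model(model: str) -> str:
        if model in failing:
            raise ConnectionError("simulated outage")
        return f"response from {model}"
    return call_model


chain = [
    "meta-llama/llama-3.2-3b-instruct:free",
    "anthropic/claude-4.5-sonnet",
    "openai/gpt-4o-mini",
]

# Simulate an outage of the primary model only; the second entry should answer.
used, reply = call_with_fallback(chain, make_flaky_client({chain[0]}))
print(used)
```

Passing `set(chain)` as the failing set exercises the worst case, where the loop raises `AllModelsFailed` with one error message per model.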
# Compare routing strategies
./scripts/analyze-cost-savings.sh config.json baseline-config.json
Shows projected savings and performance tradeoffs.
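The arithmetic behind a projected-savings figure can be sketched as follows. This is an illustration only, not the script's actual calculation: the request volume, token count, and per-token price are made-up numbers, and `monthly_cost` and `savings_fraction` are hypothetical helpers.

```python
def monthly_cost(requests: int, tokens_per_request: int, usd_per_million_tokens: float) -> float:
    """Cost in USD for a month of traffic at a flat per-token price."""
    return requests * tokens_per_request * usd_per_million_tokens / 1_000_000


def savings_fraction(baseline_usd: float, candidate_usd: float) -> float:
    """Fraction of the baseline spend that the candidate strategy saves."""
    if baseline_usd == 0:
        return 0.0
    return (baseline_usd - candidate_usd) / baseline_usd


# Hypothetical workload and price; replace with your own measurements.
REQUESTS = 100_000
TOKENS_PER_REQUEST = 1_000
PREMIUM_USD_PER_M = 10.0

# Baseline: every request goes to the premium model.
baseline = monthly_cost(REQUESTS, TOKENS_PER_REQUEST, PREMIUM_USD_PER_M)

# Candidate: assume one request in five falls back to the premium model
# and the rest are served by a free model.
escalated = REQUESTS // 5
candidate = (
    monthly_cost(REQUESTS - escalated, TOKENS_PER_REQUEST, 0.0)
    + monthly_cost(escalated, TOKENS_PER_REQUEST, PREMIUM_USD_PER_M)
)

print(f"baseline ${baseline:.2f}, candidate ${candidate:.2f}, "
      f"saved {savings_fraction(baseline, candidate):.0%}")
```

The escalation rate dominates the result, so it is worth measuring on real traffic rather than assuming.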
{
  "primary": "meta-llama/llama-3.2-3b-instruct:free",
  "fallback": [
    "anthropic/claude-4.5-sonnet",
    "openai/gpt-4o-mini"
  ]
}
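One way an application might consume a primary/fallback configuration like the one above is to flatten it into a single ordered list of models to try. The sketch below assumes only the two keys shown (`primary`, `fallback`); `fallback_order` is a hypothetical helper, not part of the skill's templates.

```python
import json
from typing import List

CONFIG_JSON = """
{
  "primary": "meta-llama/llama-3.2-3b-instruct:free",
  "fallback": [
    "anthropic/claude-4.5-sonnet",
    "openai/gpt-4o-mini"
  ]
}
"""


def fallback_order(config: dict) -> List[str]:
    """Return the primary model followed by its fallbacks, without duplicates."""
    ordered = [config["primary"], *config.get("fallback", [])]
    # dict.fromkeys keeps the first occurrence of each name and preserves order.
    return list(dict.fromkeys(ordered))


models = fallback_order(json.loads(CONFIG_JSON))
print(models)
```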
{
  "simple_tasks": {
    "models": ["google/gemma-2-9b-it:free"]
  },
  "medium_tasks": {
    "models": ["anthropic/claude-4.5-sonnet"]
  },
  "complex_tasks": {
    "models": ["openai/gpt-4o"]
  }
}
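A task-based configuration like the one above needs some classifier to decide which bucket a request belongs to. The sketch below uses prompt length in words as a stand-in; the thresholds (20 and 200 words) and the `classify` and `pick_model` helpers are assumptions for illustration, and a real deployment would likely use a richer signal.

```python
import json

ROUTES = json.loads("""
{
  "simple_tasks": {"models": ["google/gemma-2-9b-it:free"]},
  "medium_tasks": {"models": ["anthropic/claude-4.5-sonnet"]},
  "complex_tasks": {"models": ["openai/gpt-4o"]}
}
""")


def classify(prompt: str) -> str:
    """Map a prompt to a task bucket using word count (hypothetical heuristic)."""
    words = len(prompt.split())
    if words < 20:
        return "simple_tasks"
    if words < 200:
        return "medium_tasks"
    return "complex_tasks"


def pick_model(prompt: str, routes: dict = ROUTES) -> str:
    """Return the first model listed for the prompt's task bucket."""
    return routes[classify(prompt)]["models"][0]


print(pick_model("What is the capital of France?"))
```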
{
  "peak_hours": {
    "models": ["openai/gpt-4o-mini"],
    "max_latency_ms": 1000
  },
  "off_peak": {
    "models": ["google/gemini-pro"],
    "max_latency_ms": 3000
  }
}
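Selecting between the `peak_hours` and `off_peak` entries above could look like the following sketch. The configuration does not say when peak hours are, so the 09:00 to 17:00 window is an assumption, as is the `select_window` helper.

```python
from datetime import datetime

ROUTES = {
    "peak_hours": {"models": ["openai/gpt-4o-mini"], "max_latency_ms": 1000},
    "off_peak": {"models": ["google/gemini-pro"], "max_latency_ms": 3000},
}


def select_window(now: datetime, peak_start: int = 9, peak_end: int = 17) -> str:
    """Return the config key for the given time (peak window is an assumption)."""
    return "peak_hours" if peak_start <= now.hour < peak_end else "off_peak"


route = ROUTES[select_window(datetime.now())]
print(route["models"][0], route["max_latency_ms"])
```

The `max_latency_ms` value can then serve as the request timeout, so a slow response during peak hours triggers the fallback sooner.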
{
  "free_tier": {
    "models": ["meta-llama/llama-3.2-3b-instruct:free"],
    "rate_limit": 10
  },
  "premium_tier": {
    "models": ["anthropic/claude-4.5-sonnet"],
    "rate_limit": 1000
  }
}
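Tier-based routing with a rate limit might be enforced as in the sketch below. The configuration above does not state the unit of `rate_limit`, so this assumes it means requests per user per time window; `TierRouter` is a hypothetical class that keeps counters in memory and expects the caller to reset them when a window ends.

```python
from collections import defaultdict
from typing import Optional

TIERS = {
    "free_tier": {"models": ["meta-llama/llama-3.2-3b-instruct:free"], "rate_limit": 10},
    "premium_tier": {"models": ["anthropic/claude-4.5-sonnet"], "rate_limit": 1000},
}


class TierRouter:
    """Pick a model by user tier and enforce a per-user request budget."""

    def __init__(self, tiers: dict) -> None:
        self.tiers = tiers
        self.counts = defaultdict(int)  # requests made per user in the current window

    def route(self, user_id: str, tier: str) -> Optional[str]:
        """Return the tier's first model, or None if the user is over the limit."""
        if self.counts[user_id] >= self.tiers[tier]["rate_limit"]:
            return None
        self.counts[user_id] += 1
        return self.tiers[tier]["models"][0]

    def reset_window(self) -> None:
        """Clear all counters; call this when a new rate-limit window starts."""
        self.counts.clear()


router = TierRouter(TIERS)
results = [router.route("alice", "free_tier") for _ in range(11)]
print(results[0], results[-1])
```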
google/gemma-2-9b-it:free
meta-llama/llama-3.2-3b-instruct:free
meta-llama/llama-3.2-1b-instruct:free
microsoft/phi-3-mini-128k-instruct:free
Use for: High-volume, simple tasks, development

openai/gpt-4o-mini
google/gemini-flash-1.5
Use for: Production workloads, balanced cost/quality

anthropic/claude-4.5-sonnet
openai/gpt-4o
google/gemini-pro-1.5
Use for: Complex reasoning, critical tasks, high quality

openai/gpt-4-vision-preview
anthropic/claude-4.5-sonnet (code-specific)
google/gemini-pro-1.5 (1M+ tokens)

All models in fallback chain failing:
Higher costs than expected:
Quality degradation:
High latency:
See examples directory for complete implementations:
Skill Location: plugins/openrouter/skills/model-routing-patterns/
Version: 1.0.0
Supported Frameworks: Node.js, Python, TypeScript, any OpenRouter-compatible client