Implement cost-optimized LLM routing with NO OpenAI. Use tiered model selection (DeepSeek, Haiku, Sonnet) to achieve 70-90% cost savings. Triggers on "LLM costs", "model selection", "cost optimization", "which model", "DeepSeek", "Claude pricing", "reduce AI costs".
/plugin marketplace add ScientiaCapital/scientia-superpowers
/plugin install scientiacapital-scientia-superpowers@ScientiaCapital/scientia-superpowers

This skill inherits all available tools. When active, it can use any tool Claude has access to.
Achieve 70-90% cost savings with intelligent model routing. NO OpenAI allowed.
NEVER use OpenAI models in this ecosystem.
Allowed providers:
| Model | Cost per 1M tokens | Use Case |
|---|---|---|
| DeepSeek V3 | $0.14 input / $0.28 output | Simple queries, classification |
| Claude Haiku | $0.25 input / $1.25 output | Moderate complexity |
| Gemini Flash | FREE (limited) | MVP, prototyping |
| Claude Sonnet | $3.00 input / $15.00 output | Complex reasoning |
| Claude Opus | $15.00 input / $75.00 output | Expert tasks only |
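To make the tiers concrete, here is a small sketch (prices taken from the table above; model keys are shorthand labels, not API identifiers) that computes what a single 1,000-token-in / 500-token-out request costs on each tier:

```python
# Per-1M-token prices (input, output) in USD, from the table above
PRICES = {
    "deepseek-v3": (0.14, 0.28),
    "claude-haiku": (0.25, 1.25),
    "claude-sonnet": (3.00, 15.00),
    "claude-opus": (15.00, 75.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at per-1M-token pricing."""
    price_in, price_out = PRICES[model]
    return input_tokens / 1_000_000 * price_in + output_tokens / 1_000_000 * price_out

# Cost of a 1,000-in / 500-out request on each tier
for model in PRICES:
    print(f"{model}: ${request_cost(model, 1_000, 500):.6f}")
```

At this request shape, Sonnet costs roughly 37× as much as DeepSeek V3 ($0.0105 vs $0.00028 per request), which is the gap the routing below exploits.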
Use for: simple queries and classification (the cheapest tier).

```python
import os
from openai import OpenAI  # OpenRouter exposes an OpenAI-compatible API; no OpenAI models are used

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

response = client.chat.completions.create(
    model="deepseek/deepseek-chat",
    messages=[{"role": "user", "content": prompt}],
    max_tokens=500,
)
```
Use for: moderate-complexity tasks.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-5-haiku-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": prompt}],
)
```
Use for: complex reasoning.

```python
# Reuses the anthropic.Anthropic() client from the Haiku example above
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    messages=[{"role": "user", "content": prompt}],
)
```
````python
from enum import Enum

class TaskComplexity(Enum):
    SIMPLE = "simple"
    MODERATE = "moderate"
    COMPLEX = "complex"

def route_to_model(complexity: TaskComplexity) -> str:
    """Route to the appropriate model based on complexity."""
    routing = {
        TaskComplexity.SIMPLE: "deepseek/deepseek-chat",
        TaskComplexity.MODERATE: "claude-3-5-haiku-20241022",
        TaskComplexity.COMPLEX: "claude-sonnet-4-20250514",
    }
    return routing[complexity]

def estimate_complexity(prompt: str) -> TaskComplexity:
    """Estimate task complexity from prompt characteristics."""
    # Simple heuristics: length, presence of code, analysis keywords
    word_count = len(prompt.split())
    has_code = "```" in prompt or "def " in prompt or "function" in prompt
    has_analysis = any(w in prompt.lower() for w in ["analyze", "compare", "evaluate"])
    if word_count < 50 and not has_code and not has_analysis:
        return TaskComplexity.SIMPLE
    elif word_count < 200 or (has_code and not has_analysis):
        return TaskComplexity.MODERATE
    else:
        return TaskComplexity.COMPLEX
````
```python
from typing import Optional

def smart_complete(prompt: str, force_model: Optional[str] = None) -> str:
    """Complete with automatic model routing."""
    if force_model:
        model = force_model
    else:
        complexity = estimate_complexity(prompt)
        model = route_to_model(complexity)
    # Dispatch to the matching client (helpers wrap the API calls shown above)
    if model.startswith("deepseek"):
        return call_openrouter(model, prompt)
    return call_anthropic(model, prompt)
```
For MVPs and prototyping, use Gemini Flash (FREE):

```python
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content(prompt)
```
Limits:
Track costs per project:
```python
import json
from datetime import datetime, timezone
from pathlib import Path

COST_LOG = Path.home() / ".claude" / "llm_costs.jsonl"

def log_cost(project: str, model: str, input_tokens: int, output_tokens: int) -> float:
    """Log LLM usage for cost tracking. Returns this call's cost in USD."""
    # Per-1M-token prices (input, output), matching the pricing table above
    costs = {
        "deepseek/deepseek-chat": (0.14, 0.28),
        "claude-3-5-haiku-20241022": (0.25, 1.25),
        "claude-sonnet-4-20250514": (3.00, 15.00),
        "gemini-1.5-flash": (0.0, 0.0),  # free tier
    }
    # Conservative fallback for unknown models so costs are never silently undercounted
    input_cost, output_cost = costs.get(model, (10.0, 30.0))
    total = (input_tokens / 1_000_000 * input_cost) + (output_tokens / 1_000_000 * output_cost)
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "project": project,
        "model": model,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "cost_usd": round(total, 6),
    }
    COST_LOG.parent.mkdir(parents=True, exist_ok=True)
    with open(COST_LOG, "a") as f:
        f.write(json.dumps(entry) + "\n")
    return total
```
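Once calls are logged, a small aggregator answers "what is each project spending?". A minimal sketch (the function name is an assumption, not part of the skill) that reads the same JSONL format `log_cost` writes:

```python
import json
from collections import defaultdict
from pathlib import Path

def cost_by_project(log_path: Path) -> dict:
    """Sum cost_usd per project from a JSONL cost log."""
    totals = defaultdict(float)
    for line in log_path.read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        entry = json.loads(line)
        totals[entry["project"]] += entry["cost_usd"]
    return dict(totals)
```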
For voice pipelines (vozlux, solarvoice-ai):
```python
def get_voice_tier(subscription: str) -> dict:
    """Map a subscription level to its TTS/STT/LLM stack."""
    tiers = {
        "starter": {
            "tts": "polly",
            "stt": "deepgram-base",
            "llm": "deepseek",
        },
        "pro": {
            "tts": "cartesia",
            "stt": "deepgram-nova",
            "llm": "haiku",
        },
        "enterprise": {
            "tts": "cartesia",
            "stt": "deepgram-nova",
            "llm": "sonnet",
        },
    }
    # Unknown subscription levels fall back to the cheapest tier
    return tiers.get(subscription, tiers["starter"])
```
For a typical Scientia project:
| Usage Level | DeepSeek Heavy | Mixed Tier | Sonnet Heavy |
|---|---|---|---|
| Light (10K queries) | $1.40 | $8 | $90 |
| Medium (100K queries) | $14 | $80 | $900 |
| Heavy (1M queries) | $140 | $800 | $9,000 |
Recommendation: Use Mixed Tier routing for 90%+ of use cases.
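The table's totals depend on how many tokens an average query consumes, which it does not state. A hedged helper for producing such estimates yourself; the 1,000-input-token query shape below is an illustrative assumption, not a measured figure:

```python
def estimate_cost(queries: int, input_tokens: int, output_tokens: int,
                  price_in: float, price_out: float) -> float:
    """Total USD cost for `queries` requests at per-1M-token prices."""
    per_query = (input_tokens / 1_000_000 * price_in
                 + output_tokens / 1_000_000 * price_out)
    return queries * per_query

# 10K queries of ~1,000 input tokens each at DeepSeek V3 pricing
light_deepseek = estimate_cost(10_000, 1_000, 0, 0.14, 0.28)  # ≈ $1.40
```

Swapping in Sonnet pricing for the same workload shows how quickly the heavy column grows, which is why the Mixed Tier recommendation holds for most traffic.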
Required in .env:
```bash
# Primary (Anthropic)
ANTHROPIC_API_KEY=sk-ant-...

# Cost optimization (OpenRouter for DeepSeek)
OPENROUTER_API_KEY=sk-or-...

# Free tier (Google)
GOOGLE_API_KEY=AIza...

# NEVER set these:
# OPENAI_API_KEY=   # FORBIDDEN
```
lang-core enforces NO OpenAI at runtime:
```python
import os

def validate_environment() -> None:
    """Block OpenAI usage."""
    if os.environ.get("OPENAI_API_KEY"):
        raise EnvironmentError(
            "OpenAI is not allowed in Scientia projects. "
            "Use ANTHROPIC_API_KEY or OPENROUTER_API_KEY instead."
        )
```