Optimize cloud costs — budget alerts, resource right-sizing, usage analysis, FinOps practices, and cost allocation for Firebase and GCP
Slash command
/cure-product-engineering:finops [project-or-service]
When to use
Use when optimizing cloud costs, setting budget alerts, or right-sizing resources on Firebase/GCP. NOT for financial modeling (use saas-financial-model). NOT for burn rate tracking (use burn-rate-tracker).
Before starting, gather project context silently:
- Read PORTFOLIO.md if it exists in the project root or parent directories for product/team context
- Run cat package.json 2>/dev/null || cat build.gradle.kts 2>/dev/null || cat Podfile 2>/dev/null to detect the stack
- Run git log --oneline -5 2>/dev/null for recent changes
- Run ls src/ app/ lib/ functions/ 2>/dev/null to understand project structure
Cloud financial operations framework for Firebase and GCP projects. Use when setting up cost visibility, optimizing spend, establishing budgets, or building a cost-aware engineering culture. Every dollar spent on infrastructure should be traceable to a feature or user segment.
| Type | When to Use | Output |
|---|---|---|
| Cost Audit | Monthly or after bill shock — understand where money goes | Per-service cost breakdown, waste identification, optimization recommendations |
| Budget Setup | New project or new fiscal period — set guardrails | Budget alerts, spending limits, anomaly detection |
| Optimization Initiative | Costs growing faster than usage — reduce waste | Right-sizing plan, architecture changes, committed use discounts |
| Cost Allocation | Multi-product or multi-team — assign costs to owners | Tagging strategy, per-team dashboards, chargeback model |
| Forecasting | Planning phase — predict future spend | Growth-based projections, scenario modeling |
Every project MUST have:
1. GCP Billing Export to BigQuery (enabled once, runs continuously)
2. Monthly cost report emailed to engineering lead + finance
3. Per-service cost dashboard (Looker Studio, formerly Data Studio)
4. Anomaly alerts for >20% day-over-day increase
Enable billing export:
GCP Console → Billing → Billing export → BigQuery export → Enable
Dataset: billing_export (create in same project)
This gives you raw billing data for custom queries and dashboards.
Every GCP resource MUST be labeled:
Required labels:
project: "antigravity" — which product
environment: "production" — dev / staging / production
team: "backend" — owning team
feature: "payments" — specific feature (for per-feature cost tracking)
cost-center: "engineering" — budget category
Apply labels:
Cloud Functions: setGlobalOptions({ labels: { project: "antigravity", ... } })
Cloud Run: gcloud run services update SERVICE --labels=project=antigravity
Cloud Storage: gsutil label set labels.json gs://BUCKET
Firestore: labels set at project level in console
Labels enable:
- Filter billing by team, feature, environment
- Answer "How much does the payments feature cost?"
- Answer "What percentage of spend is dev vs. production?"
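A labeling policy only helps if it is checked. The following is a minimal sketch of such a check, assuming resource labels have already been fetched into a plain object; the `LabeledResource` shape and the `missingLabels` helper are hypothetical names used for illustration, not part of any GCP SDK.

```typescript
// Required label keys, taken from the "Required labels" list above.
const REQUIRED_LABELS = ["project", "environment", "team", "feature", "cost-center"] as const;

// Hypothetical shape: a resource name plus whatever labels it currently carries.
interface LabeledResource {
  name: string;
  labels: Record<string, string>;
}

// Returns the required label keys that are missing or empty on a resource.
function missingLabels(resource: LabeledResource): string[] {
  return REQUIRED_LABELS.filter((key) => !resource.labels[key]);
}

const example: LabeledResource = {
  name: "payments-api",
  labels: { project: "antigravity", environment: "production", team: "backend" },
};
console.log(missingLabels(example)); // lists "feature" and "cost-center"
```

A check like this could run in CI or a scheduled job so unlabeled resources are caught before they show up as unattributed spend.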
| Service | Typical Cost Driver | How to Track |
|---|---|---|
| Cloud Functions | Invocations + compute time | Cloud Monitoring → function/execution_count |
| Firestore | Reads/writes/deletes | Firebase Console → Usage tab |
| Cloud Storage | Storage volume + egress | GCP Console → Storage → Usage |
| Cloud Run | CPU + memory per request | Cloud Monitoring → container metrics |
| Firebase Auth | Monthly active users (MAU) | Firebase Console → Auth → Usage |
| Firebase Hosting | Bandwidth + storage | Firebase Console → Hosting → Usage |
| Secret Manager | Access operations | GCP Console → Secret Manager |
| Cloud Scheduler | Job executions | Minimal cost, rarely an issue |
| Networking/Egress | Cross-region data transfer | Often the hidden cost; monitor closely |
-- BigQuery query: monthly cost by environment
-- Rows without an "environment" label are grouped under environment = NULL
SELECT
  env_label.value AS environment,
  SUM(cost) AS total_cost,
  SUM(cost) / SUM(SUM(cost)) OVER () * 100 AS pct_of_total
FROM `PROJECT.billing_export.gcp_billing_export_v1_*`
LEFT JOIN UNNEST(labels) AS env_label ON env_label.key = "environment"
WHERE invoice.month = FORMAT_DATE('%Y%m', CURRENT_DATE())
GROUP BY environment
ORDER BY total_cost DESC;
-- Target: production > 70% of total, dev+staging < 30%
-- If dev/staging > 30%, you have waste to clean up
See reference/details.md (section “Step 4: Firebase-Specific Optimization”) for full detail.
If your workload is predictable, commit for savings:
| Resource | On-Demand | 1-Year CUD | 3-Year CUD |
|---|---|---|---|
| Cloud Run CPU | $0.00002400 per vCPU-second | -17% | -40% |
| Cloud Run Memory | $0.00000250 per GiB-second | -17% | -40% |
| Compute Engine | varies | -37% | -55% |
| Cloud SQL | varies | -25% | -52% |
When to commit:
✅ Stable production workload running > 6 months
✅ Baseline always-on compute (minInstances)
❌ Never commit for dev/staging environments
❌ Never commit for new projects (wait 3 months for data)
Review monthly — GCP provides right-sizing recommendations in Console:
GCP Console → Compute Engine → VM Instances → Right-sizing recommendations
GCP Console → Cloud Run → Services → Metrics (check actual vs. allocated)
Cloud Run right-sizing:
1. Check actual CPU/memory usage in Cloud Monitoring
2. If peak memory < 50% of allocation → reduce allocation
3. If CPU utilization consistently < 30% → reduce CPU or increase concurrency
4. Set CPU throttling = true (only charge for active request processing)
Cloud Functions right-sizing:
1. Check execution times in Firebase Console → Functions → Dashboard
2. If avg execution < 1s with 1GiB memory → try 256MiB
3. If cold start is the problem → increase minInstances, not memory
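The Cloud Run thresholds above can be expressed as a small rule check. This is a sketch under stated assumptions: the `ServiceUsage` shape and `rightSizingAdvice` function are hypothetical, and the usage numbers are assumed to come from Cloud Monitoring by some means not shown here.

```typescript
// Hypothetical usage snapshot for one Cloud Run service; field names are illustrative.
interface ServiceUsage {
  allocatedMemoryMiB: number;
  peakMemoryMiB: number;
  avgCpuUtilization: number; // fraction between 0 and 1
}

// Applies the two Cloud Run rules from the list above and returns matching advice.
function rightSizingAdvice(usage: ServiceUsage): string[] {
  const advice: string[] = [];
  if (usage.peakMemoryMiB < 0.5 * usage.allocatedMemoryMiB) {
    advice.push("reduce memory allocation");
  }
  if (usage.avgCpuUtilization < 0.3) {
    advice.push("reduce CPU or increase concurrency");
  }
  return advice;
}

console.log(rightSizingAdvice({ allocatedMemoryMiB: 1024, peakMemoryMiB: 300, avgCpuUtilization: 0.2 }));
```

The function only flags candidates; whether "consistently" low CPU holds still needs a look at the metric over a representative period.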
For batch processing, ML training, CI/CD runners:
- Spot VMs: 60-91% discount, but can be preempted with 30s notice
- Use for: CI/CD build agents, batch data processing, ML training
- Never for: user-facing services, databases, stateful workloads
gcloud compute instances create batch-worker \
--provisioning-model=SPOT \
--instance-termination-action=STOP \
--machine-type=e2-standard-4
Not every request needs GPT-4 or Claude Opus.
Route by complexity to minimize cost:
| Tier | Model | Input cost per 1M tokens | Use For |
|---|---|---|---|
| Fast | GPT-4o-mini | $0.15 | Classification, extraction, simple Q&A |
| Fast | Claude Haiku | $0.25 | Validation, formatting, summarization |
| Standard | GPT-4o | $2.50 | Most features, content generation |
| Standard | Claude Sonnet | $3.00 | Code generation, analysis |
| Premium | GPT-4 | $30.00 | Complex reasoning (rarely needed) |
| Premium | Claude Opus | $15.00 | Critical decisions, legal/financial |
Implementation:
1. Classify request complexity at the edge (use fast tier model)
2. Route to appropriate tier based on classification
3. Log cost per request for tracking
4. Set per-user or per-feature token budgets
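Steps 2 and 3 above might look like the following sketch. It assumes the complexity classification (step 1) has already been produced elsewhere, for example by a fast-tier model call that is not shown. All names (`routeRequest`, `RequestLog`, `TIER_FOR`) are hypothetical, and the prices are simply the first model listed per tier in the table above.

```typescript
type Tier = "fast" | "standard" | "premium";
type Complexity = "simple" | "moderate" | "complex";

// Assumed mapping from classified complexity to model tier.
const TIER_FOR: Record<Complexity, Tier> = {
  simple: "fast",
  moderate: "standard",
  complex: "premium",
};

// Input price in USD per 1M tokens, using the first model of each tier in the table.
const INPUT_PRICE_PER_MILLION: Record<Tier, number> = {
  fast: 0.15,
  standard: 2.5,
  premium: 30,
};

interface RequestLog {
  feature: string;
  tier: Tier;
  inputTokens: number;
  estimatedCostUsd: number;
}

// Picks a tier for the request and returns a log record with the estimated input cost.
function routeRequest(feature: string, complexity: Complexity, inputTokens: number): RequestLog {
  const tier = TIER_FOR[complexity];
  const estimatedCostUsd = (inputTokens / 1_000_000) * INPUT_PRICE_PER_MILLION[tier];
  return { feature, tier, inputTokens, estimatedCostUsd };
}

console.log(routeRequest("chat-assistant", "moderate", 1_000_000));
```

Output-token cost is left out to keep the sketch short; a real log record would likely track both.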
// lib/ai-cost.ts: track and limit AI spend per feature
interface TokenBudget {
  feature: string;
  dailyLimit: number; // max tokens per day
  monthlyLimit: number; // max tokens per month
  currentDaily: number;
  currentMonthly: number;
}

// Budget defaults per feature:
const BUDGETS: Record<string, { daily: number; monthly: number }> = {
  "chat-assistant": { daily: 500_000, monthly: 10_000_000 },
  "content-generator": { daily: 1_000_000, monthly: 20_000_000 },
  "code-review": { daily: 200_000, monthly: 5_000_000 },
  "search-summarize": { daily: 300_000, monthly: 8_000_000 },
};

// Check budget before every AI call:
// If daily budget exceeded → queue for tomorrow or downgrade model tier
// If monthly budget exceeded → disable feature, alert engineering
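The budget check described in the comments above could be sketched as follows. The `FeatureUsage` shape, the `checkBudget` function, and the decision names are assumptions for illustration; how usage counters are stored and reset is out of scope here.

```typescript
type BudgetDecision = "allow" | "queue-or-downgrade" | "disable-and-alert";

// Hypothetical per-feature usage counters alongside their limits (in tokens).
interface FeatureUsage {
  dailyLimit: number;
  monthlyLimit: number;
  currentDaily: number;
  currentMonthly: number;
}

// Decides what to do with a request that would consume `requestedTokens`.
// The monthly limit is checked first because it triggers the stronger response.
function checkBudget(usage: FeatureUsage, requestedTokens: number): BudgetDecision {
  if (usage.currentMonthly + requestedTokens > usage.monthlyLimit) {
    return "disable-and-alert";
  }
  if (usage.currentDaily + requestedTokens > usage.dailyLimit) {
    return "queue-or-downgrade";
  }
  return "allow";
}

// Example using the "chat-assistant" defaults (500k daily, 10M monthly).
const chatUsage: FeatureUsage = {
  dailyLimit: 500_000,
  monthlyLimit: 10_000_000,
  currentDaily: 499_000,
  currentMonthly: 3_000_000,
};
console.log(checkBudget(chatUsage, 2_000)); // "queue-or-downgrade"
```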
Cache identical or similar AI requests to avoid redundant API calls:
| Strategy | Cache TTL | Estimated Savings |
|---|---|---|
| Exact match (same prompt) | 24 hours | 20-40% for repeated queries |
| Semantic similarity | 1 hour | 10-20% for similar queries |
| Embedding cache | 7 days | Avoids re-embedding same documents |
| Precomputed responses | 30 days | For known common questions |
Implementation:
1. Hash the prompt + model + temperature as cache key
2. Store in Redis/Firestore with TTL
3. Check cache before every API call
4. Log cache hit/miss ratio — target > 30% hit rate
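A minimal sketch of the exact-match strategy, following the four steps above. It uses Node's built-in `crypto` module for hashing and an in-memory map as a stand-in for Redis or Firestore; the `cacheKey` function and `TtlCache` class are hypothetical names, and a production version would delegate TTL handling to the backing store.

```typescript
import { createHash } from "node:crypto";

// Step 1: derive a deterministic cache key from prompt + model + temperature.
function cacheKey(prompt: string, model: string, temperature: number): string {
  const payload = JSON.stringify({ prompt, model, temperature });
  return createHash("sha256").update(payload).digest("hex");
}

// Steps 2-4: in-memory stand-in for a TTL store that also counts hits and misses.
class TtlCache {
  private store = new Map<string, { value: string; expiresAt: number }>();
  hits = 0;
  misses = 0;

  set(key: string, value: string, ttlMs: number, now: number = Date.now()): void {
    this.store.set(key, { value, expiresAt: now + ttlMs });
  }

  get(key: string, now: number = Date.now()): string | undefined {
    const entry = this.store.get(key);
    if (entry && entry.expiresAt > now) {
      this.hits++;
      return entry.value;
    }
    this.store.delete(key); // drop expired entries lazily
    this.misses++;
    return undefined;
  }

  hitRate(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}

const cache = new TtlCache();
const key = cacheKey("Summarize this ticket", "fast-tier-model", 0);
cache.set(key, "cached response", 24 * 60 * 60 * 1000); // 24-hour TTL for exact matches
console.log(cache.get(key), cache.hitRate());
```

Including the temperature in the key keeps deterministic and sampled requests from sharing cached responses.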
# Set up four-tier budget alerts (50/80/100/120%) for every project
gcloud billing budgets create \
  --billing-account=BILLING_ACCOUNT_ID \
  --display-name="PROJECT_NAME Monthly Budget" \
  --budget-amount=500 \
  --threshold-rule=percent=0.5,basis=current-spend \
  --threshold-rule=percent=0.8,basis=current-spend \
  --threshold-rule=percent=1.0,basis=current-spend \
  --threshold-rule=percent=1.2,basis=current-spend \
  --notifications-rule-pubsub-topic=projects/PROJECT_ID/topics/billing-alerts
Alert tiers and response:
50% — Informational: email to engineering lead
80% — Warning: Slack alert to team channel, review spend
100% — Action required: freeze non-essential environments, investigate
120% — Escalation: alert CTO, consider emergency cost reduction
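A subscriber on the billing-alerts topic needs to map spend to one of these tiers. The sketch below shows only that mapping; the `alertTier` function and tier names are hypothetical, and parsing the actual Pub/Sub budget notification is not shown.

```typescript
type AlertTier = "none" | "informational" | "warning" | "action-required" | "escalation";

// Maps current spend against the budget to the alert tiers listed above.
function alertTier(spendUsd: number, budgetUsd: number): AlertTier {
  const ratio = spendUsd / budgetUsd;
  if (ratio >= 1.2) return "escalation";
  if (ratio >= 1.0) return "action-required";
  if (ratio >= 0.8) return "warning";
  if (ratio >= 0.5) return "informational";
  return "none";
}

console.log(alertTier(450, 500)); // "warning"
```

Checking the highest threshold first means each spend level resolves to exactly one tier.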
Set up forecast-based alerts as a first line of anomaly detection:
GCP Console → Billing → Budgets & alerts → Create budget
✅ Enable "Forecasted spend" alerts
✅ Set alert at 100% of forecasted budget
Custom anomaly detection (Cloud Function):
1. Query BigQuery billing export daily
2. Compare today's spend to 7-day rolling average
3. Alert if > 50% above average (could indicate: runaway function, DDoS, misconfigured autoscaling)
4. Auto-scale-down non-production environments on anomaly detection
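Step 2 and step 3 reduce to a small comparison once the daily totals are available. This sketch assumes the totals have already been queried from the BigQuery billing export into an array ordered oldest to newest; the `isSpendAnomaly` function is a hypothetical name, and the alerting and scale-down actions are left out.

```typescript
// dailySpend: daily cost totals in USD, oldest first, with today's spend as the last element.
// Returns true when today's spend exceeds the previous 7-day average by more than `threshold`.
function isSpendAnomaly(dailySpend: number[], threshold: number = 0.5): boolean {
  if (dailySpend.length < 8) return false; // need 7 prior days plus today
  const today = dailySpend[dailySpend.length - 1];
  const previousWeek = dailySpend.slice(-8, -1);
  const average = previousWeek.reduce((sum, value) => sum + value, 0) / previousWeek.length;
  return average > 0 && today > average * (1 + threshold);
}

console.log(isSpendAnomaly([10, 10, 10, 10, 10, 10, 10, 16])); // true: 16 is more than 50% above 10
```

Today's figure is excluded from the rolling average so that a spike does not raise its own baseline.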
| Environment | Monthly Cap | Enforcement |
|---|---|---|
| Development | $50 | Auto-shutdown resources at cap |
| Staging | $200 | Alert at 80%, review at 100% |
| Production | $2,000+ | Alert tiers (50/80/100/120%) |
| Shared services | $100 | Alert at 80% |
Enforcement:
- Dev environments: Cloud Scheduler job to shut down nightly
- Staging: reduce to zero instances outside business hours
- Production: never auto-shutdown, but alert aggressively
# Shut down dev Cloud Run services nightly
gcloud scheduler jobs create http dev-shutdown \
--schedule="0 20 * * MON-FRI" \
--uri="https://REGION-PROJECT.cloudfunctions.net/shutdownDev" \
--http-method=POST
Any change that increases monthly cost by >$100 requires:
1. Cost estimate in the PR description
2. Approval from engineering lead
3. Updated budget if needed
PR template addition:
## Cost Impact
- [ ] No cost change
- [ ] Estimated monthly increase: $___
- [ ] New service/resource: ___ at estimated $___/month
- [ ] Cost reviewed by: @engineering-lead
Track cost-per-feature monthly:
| Feature | Monthly Cost | Users | Cost/User | Trend |
|---|---|---|---|---|
| Authentication | $12 | 10,000 | $0.001 | Stable |
| Chat (AI-powered) | $340 | 2,000 | $0.170 | Growing |
| Image uploads | $85 | 5,000 | $0.017 | Stable |
| Search | $45 | 8,000 | $0.006 | Stable |
| Notifications | $20 | 10,000 | $0.002 | Stable |
Use this to:
- Identify features that cost more than they're worth
- Set pricing tiers based on actual cost (AI features = premium tier)
- Justify infrastructure investments with per-user economics
- Track if optimization efforts are working (cost/user should decrease)
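The Cost/User column is monthly cost divided by active users. A minimal sketch of that calculation, using the figures from the table above; the `FeatureCost` shape and `costPerUser` function are hypothetical names.

```typescript
interface FeatureCost {
  feature: string;
  monthlyCostUsd: number;
  users: number;
}

// Cost per user for one feature; returns 0 when there are no users to avoid dividing by zero.
function costPerUser(row: FeatureCost): number {
  return row.users === 0 ? 0 : row.monthlyCostUsd / row.users;
}

const features: FeatureCost[] = [
  { feature: "Authentication", monthlyCostUsd: 12, users: 10_000 },
  { feature: "Chat (AI-powered)", monthlyCostUsd: 340, users: 2_000 },
  { feature: "Image uploads", monthlyCostUsd: 85, users: 5_000 },
  { feature: "Search", monthlyCostUsd: 45, users: 8_000 },
  { feature: "Notifications", monthlyCostUsd: 20, users: 10_000 },
];

// Rank features from most to least expensive per user.
const ranked = [...features].sort((a, b) => costPerUser(b) - costPerUser(a));
for (const row of ranked) {
  console.log(`${row.feature}: $${costPerUser(row).toFixed(3)} per user`);
}
```

Sorting by cost per user rather than total cost puts the AI-powered chat feature at the top, which is the signal the pricing-tier decision relies on.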
Every sprint planning should include:
1. Review current month spend vs. budget (5 minutes)
2. Flag any infrastructure tickets with cost implications
3. Assign cost tags to new features before development starts
4. Review optimization backlog — pick 1 cost ticket per sprint
Sprint board labels:
💰 cost-increase — this ticket will increase infrastructure spend
💰 cost-reduction — this ticket reduces infrastructure spend
💰 cost-neutral — no expected cost change
Make costs visible to every engineer:
1. Weekly cost Slack bot
Post to #engineering: "This week's cloud spend: $X (+Y% vs last week)"
Include top 3 cost drivers
2. Per-PR cost estimation
GitHub Action that estimates cost impact of infrastructure changes
Flag PRs that add new Cloud Functions, increase memory, add services
3. Monthly cost review
15-minute meeting: review spend, celebrate optimizations, plan reductions
Rotate presenter — every engineer should present once per quarter
4. Cost leaderboard (gamification)
Track optimization wins per engineer
Celebrate biggest cost reductions in team retros
Before analysis, gather infrastructure context:
Generate using Write:
- docs/finops-report.md: findings with projected savings
- monitoring/budget-alerts.tf: Terraform budget alerts
- scripts/right-size-resources.sh: identify over-provisioned resources
- analytics/cost-queries.sql: BigQuery queries for cost analysis
FINOPS REPORT
Project: [NAME]
Date: [TODAY]
Prepared by: [NAME]
COST SUMMARY
┌──────────────────────────┬────────────────────────────────────┐
│ Field │ Value │
├──────────────────────────┼────────────────────────────────────┤
│ Current Monthly Spend │ $[X] │
│ Budget │ $[X] │
│ Spend vs. Budget │ [X%] │
│ Month-over-Month Change │ [+/-X%] │
│ Top Cost Driver │ [Service name: $X] │
│ Optimization Potential │ $[X] / month │
│ Cost per User │ $[X] │
│ FinOps Maturity │ [Crawl / Walk / Run] │
└──────────────────────────┴────────────────────────────────────┘
DELIVERABLES GENERATED:
- [ ] Per-service cost breakdown with trend analysis
- [ ] Cost allocation tags applied to all resources
- [ ] Budget alerts configured (50%, 80%, 100%, 120%)
- [ ] Firebase optimization recommendations with estimated savings
- [ ] GCP right-sizing recommendations
- [ ] AI/API cost management strategy
- [ ] Per-environment spending limits
- [ ] Cost approval workflow for PRs
- [ ] Monthly cost review process established
- [ ] Unit economics per feature calculated
RELATED SKILLS:
- /engineering-cost-model — project-level cost estimation
- /infrastructure-scaffold — infra configs with cost defaults
- /saas-financial-model — pricing tiers based on actual costs
- /performance-review — performance optimization often reduces cost
npx claudepluginhub cure-consulting-group/productengineeringskills --plugin cure-product-engineering