Multi-agent parallel performance profiling workflow for identifying bottlenecks in data pipelines. Use when investigating performance issues, optimizing ingestion pipelines, profiling database operations, or diagnosing throughput bottlenecks in multi-stage data systems.
This skill inherits all available tools. When active, it can use any tool Claude has access to.

Additional assets for this skill:
- references/impact_quantification_guide.md
- references/integration_report_template.md
- references/profiling_template.py

Prescriptive workflow for spawning parallel profiling agents to comprehensively identify performance bottlenecks across multiple system layers. Applied to a real pipeline, it discovered that QuestDB ingests at 1.1M rows/sec (11x faster than target), proving the database was NOT the bottleneck: CloudFront download accounted for 90% of pipeline time.
When to use this skill:
- Investigating performance issues in multi-stage data systems
- Optimizing ingestion pipelines
- Profiling database operations
- Diagnosing throughput bottlenecks

Key outcomes:
- Evidence-based identification of the real bottleneck before any optimization work begins
- Prioritized P0/P1/P2 recommendations with quantified expected impact
Agent 1: Profiling (Instrumentation)
Agent 2: Database Configuration Analysis
Agent 3: Client Library Analysis
Agent 4: Batch Size Analysis
Agent 5: Integration & Synthesis
Parallel Execution (all 5 agents run simultaneously):
Agent 1 (Profiling) → [PARALLEL]
Agent 2 (DB Config) → [PARALLEL]
Agent 3 (Client Library) → [PARALLEL]
Agent 4 (Batch Size) → [PARALLEL]
Agent 5 (Integration) → [PARALLEL - reads tmp/ outputs from others]
Key Principle: No dependencies between investigation agents (1-4). Integration agent synthesizes findings.
Dynamic Todo Management:
Each agent produces:
- A written analysis report in its own tmp/ subdirectory (e.g., PROFILING_REPORT.md)
- Any supporting scripts it creates (e.g., profile_pipeline.py)
Example Profiling Code:
```python
import time

# Profile a multi-stage pipeline at phase boundaries.
# download_from_cdn, extract_zip, parse_csv, and ingest_to_db
# are the project-specific stage functions.
def profile_pipeline(url):
    results = {}

    # Phase 1: Download
    start = time.perf_counter()
    data = download_from_cdn(url)
    results["download"] = time.perf_counter() - start

    # Phase 2: Extract
    start = time.perf_counter()
    csv_data = extract_zip(data)
    results["extract"] = time.perf_counter() - start

    # Phase 3: Parse
    start = time.perf_counter()
    df = parse_csv(csv_data)
    results["parse"] = time.perf_counter() - start

    # Phase 4: Ingest
    start = time.perf_counter()
    ingest_to_db(df)
    results["ingest"] = time.perf_counter() - start

    # Analysis: report each phase's share of total wall time
    total = sum(results.values())
    for phase, duration in results.items():
        pct = (duration / total) * 100
        print(f"{phase}: {duration:.3f}s ({pct:.1f}%)")

    return results
```
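The repeated start/stop timing above can be factored into a small context manager. A minimal sketch; the `timed` helper is illustrative and not part of the skill's templates:

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(results, phase):
    """Record the wall-clock duration of one pipeline phase."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[phase] = time.perf_counter() - start

# Usage inside profile_pipeline:
#     with timed(results, "download"):
#         data = download_from_cdn(url)
```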
Priority Levels:
- P0: highest impact, typically highest effort (e.g., architectural changes)
- P1: high impact, moderate effort
- P2: quick wins (e.g., 1.3x improvement for 4-8 hours of effort)
Impact Reporting Format:
```markdown
### Recommendation: [Optimization Name] (P0/P1/P2) - [IMPACT LEVEL]
**Impact**: 🔴/🟠/🟡 **[N]x improvement**
**Effort**: High/Medium/Low ([N] days)
**Expected Improvement**: [Current]K → [Target]K rows/sec
**Rationale**:
- [Why this matters]
- [Supporting evidence from profiling]
- [Comparison to alternatives]
**Implementation**:
[Code snippet or architecture description]
```
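A hypothetical filled-in entry, using numbers from the case study later in this document (the effort estimate is illustrative):

```markdown
### Recommendation: Concurrent Downloads (P0) - CRITICAL
**Impact**: 🔴 **~9.5x improvement**
**Effort**: Medium (2-3 days)
**Expected Improvement**: 47K → 450K rows/sec
**Rationale**:
- Download consumed 90% of pipeline wall time (857ms per symbol)
- QuestDB ingests at 1.1M rows/sec, so the database has ample headroom
- Parallelizing 10 per-symbol downloads attacks the dominant cost directly
**Implementation**:
Run per-symbol downloads concurrently (e.g., 10 workers) instead of sequentially.
```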
Integration Agent Responsibilities:
- Poll the tmp/ subdirectories and read the reports produced by Agents 1-4
- Synthesize findings into MASTER_INTEGRATION_REPORT.md
- Rank recommendations by consensus and quantified impact
Consensus Criteria: a recommendation enters the Top 3 only when the findings of multiple agents independently support it.
Input: Performance metric below SLO
Output: Problem statement with baseline metrics
Example Problem Statement:
```
Performance Issue: BTCUSDT 1m ingestion at 47K rows/sec
Target SLO: >100K rows/sec
Gap: 53% below target
Pipeline: CloudFront download → ZIP extract → CSV parse → QuestDB ILP ingest
```
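The gap figure is simply (target − current) / target; for the numbers above, (100K − 47K) / 100K = 53% below target.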
Directory Structure:
```
tmp/perf-optimization/
├── profiling/                    # Agent 1
│   ├── profile_pipeline.py
│   └── PROFILING_REPORT.md
├── questdb-config/               # Agent 2
│   └── CONFIG_ANALYSIS.md
├── python-client/                # Agent 3
│   └── CLIENT_ANALYSIS.md
├── batch-size/                   # Agent 4
│   └── BATCH_ANALYSIS.md
└── MASTER_INTEGRATION_REPORT.md  # Agent 5
```
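A minimal sketch for scaffolding this layout before spawning agents (paths follow the tree above; nothing here is mandated by the skill):

```python
from pathlib import Path

ROOT = Path("tmp/perf-optimization")

# One working directory per investigation agent (1-4);
# the integration agent writes its report at the root.
for subdir in ["profiling", "questdb-config", "python-client", "batch-size"]:
    (ROOT / subdir).mkdir(parents=True, exist_ok=True)
```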
Agent Assignment:
IMPORTANT: Use a single message with multiple Task tool calls for true parallelism
Example:
I'm going to spawn 5 parallel investigation agents:
[Uses Task tool 5 times in a single message]
- Agent 1: Profiling
- Agent 2: QuestDB Config
- Agent 3: Python Client
- Agent 4: Batch Size
- Agent 5: Integration (depends on others completing)
Execution:
- All 5 agents run simultaneously (the user observes five parallel tool calls)
- Each agent writes to its own tmp/ subdirectory
- The integration agent polls for completed reports (see the polling sketch below)
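A hedged sketch of that polling loop: report names match the directory structure above, while the timeout and poll interval are illustrative choices, not part of the skill.

```python
import time
from pathlib import Path

ROOT = Path("tmp/perf-optimization")
EXPECTED_REPORTS = [
    ROOT / "profiling" / "PROFILING_REPORT.md",      # Agent 1
    ROOT / "questdb-config" / "CONFIG_ANALYSIS.md",  # Agent 2
    ROOT / "python-client" / "CLIENT_ANALYSIS.md",   # Agent 3
    ROOT / "batch-size" / "BATCH_ANALYSIS.md",       # Agent 4
]

def wait_for_reports(timeout_s=1800, poll_s=15):
    """Block until all four investigation reports exist."""
    deadline = time.monotonic() + timeout_s
    while True:
        missing = [p for p in EXPECTED_REPORTS if not p.exists()]
        if not missing:
            return  # all agents done; safe to synthesize the master report
        if time.monotonic() > deadline:
            raise TimeoutError(f"Reports still missing: {[str(p) for p in missing]}")
        time.sleep(poll_s)
```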
Progress Tracking:
Completion Criteria: all four investigation reports exist under tmp/perf-optimization/ and MASTER_INTEGRATION_REPORT.md has been written.
Report Structure:
```markdown
# Master Performance Optimization Integration Report

## Executive Summary
- Critical discovery (what is/isn't the bottleneck)
- Key findings from each agent (1-sentence summary)

## Top 3 Recommendations (Consensus)
1. [P0 Optimization] - HIGHEST IMPACT
2. [P1 Optimization] - HIGH IMPACT
3. [P2 Optimization] - QUICK WIN

## Agent Investigation Summary
### Agent 1: Profiling
### Agent 2: Database Config
### Agent 3: Client Library
### Agent 4: Batch Size

## Implementation Roadmap
### Phase 1: P0 Optimizations (Week 1)
### Phase 2: P1 Optimizations (Week 2)
### Phase 3: P2 Quick Wins (As time permits)
```
For each recommendation: implement the change, re-run the profiling script, and verify the measured improvement matches the expectation.
Example Implementation:
```bash
# Before optimization
uv run python tmp/perf-optimization/profiling/profile_pipeline.py
# Output: 47K rows/sec, download=857ms (90%)

# Implement P0 recommendation (concurrent downloads)
# [Make code changes]

# After optimization
uv run python tmp/perf-optimization/profiling/profile_pipeline.py
# Output: 450K rows/sec, download=90ms per symbol * 10 concurrent (90%)
```
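A minimal sketch of the concurrent-download change, assuming the per-URL download function from the profiling example and the 10 workers from the measured run (the helper name and signature are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

def download_all(urls, download_fn, max_workers=10):
    """Run per-symbol downloads concurrently instead of sequentially.

    download_fn is the project's single-URL download function
    (download_from_cdn in the profiling example above).
    """
    # Downloads are I/O-bound, so threads overlap the network waits.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(download_fn, urls))
```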
Context: Pipeline achieving 47K rows/sec, target 100K rows/sec (53% below SLO)
Assumptions Before Investigation:
- The database (QuestDB) write path was assumed to be the bottleneck

Findings After 5-Agent Investigation:
- QuestDB ingests at 1.1M rows/sec (11x faster than target); database time was only ~4% of the pipeline
- CloudFront download accounted for 90% of pipeline wall time
Top 3 Recommendations:
Impact: Discovered database ingests at 1.1M rows/sec (11x faster than target) - proving database was never the bottleneck
Outcome: Avoided wasting 2-3 weeks optimizing database when download was the real bottleneck
❌ Bad: Profile the database only and assume it's the bottleneck
✅ Good: Profile the entire pipeline (download → extract → parse → ingest)

❌ Bad: Run Agent 1, wait, then run Agent 2, wait, etc.
✅ Good: Spawn all 5 agents in parallel using a single message with multiple Task calls

❌ Bad: "Let's optimize the database config first" (assumption-driven)
✅ Good: Profile first, discover the database is only 4% of the time, optimize download instead

❌ Bad: Only implement P0 (highest impact, highest effort)
✅ Good: Implement P2 quick wins (1.3x for 4-8 hours of effort) while planning P0

❌ Bad: Implement an optimization and assume it worked
✅ Good: Re-run the profiling script and verify the expected improvement was achieved (see the sketch below)
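A small sketch of that verification step, assuming profile_pipeline returns the per-phase timings; the helper name and row_count parameter are illustrative, and the 100K SLO comes from the case study:

```python
def verify_slo(results, row_count, slo_rows_per_sec=100_000):
    """Confirm an optimization actually moved throughput past the SLO."""
    total_s = sum(results.values())
    throughput = row_count / total_s
    print(f"Throughput: {throughput / 1000:.0f}K rows/sec "
          f"(SLO: {slo_rows_per_sec / 1000:.0f}K)")
    return throughput >= slo_rows_per_sec
```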
Not applicable - profiling scripts are project-specific (stored in tmp/perf-optimization/)
- profiling_template.py - Template for phase-boundary instrumentation
- integration_report_template.md - Template for the master integration report
- impact_quantification_guide.md - How to assess P0/P1/P2 priorities

Not applicable - profiling artifacts are project-specific