Guides building production multi-agent systems with OpenAI AgentKit and Agents SDK — orchestration, handoffs, routines, and architectural patterns.
Slash command: /claude-skills-library:openai-agentkit
This skill provides comprehensive guidance on building production-ready multi-agent systems using OpenAI's AgentKit platform and Agents SDK, following 2025 best practices.
AgentKit is a complete platform for building, deploying, and optimizing agents with enterprise-grade tooling.
Core Components:
The Agents SDK is the production evolution of the experimental Swarm framework. Use the Agents SDK for all production work; Swarm is educational only.
Migration Note: If you encounter legacy Swarm code, migrate it to the Agents SDK immediately.
An Agent encapsulates:
Design Principle: Agents should be lightweight and specialized rather than monolithic and general-purpose.
A routine is a sequence of actions an agent can perform:
Think of it as: A mini-workflow that an agent owns and executes autonomously.
Handoffs enable agent-to-agent transitions in execution flow.
Key Pattern: When an agent encounters a task outside its specialization, it hands off to a more appropriate agent.
Example:
# Triage agent determines which specialist to use
if task.type == "refund":
    handoff_to(refund_agent)
elif task.type == "sales":
    handoff_to(sales_agent)
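The routing logic above can be made concrete without any SDK. What follows is a minimal plain-Python sketch, not the Agents SDK API: `Task`, `SimpleAgent`, `ROUTES`, and `route` are hypothetical names introduced for illustration, and falling back to a general support agent for unknown task types is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical request record; `type` is assumed to be set by a classifier."""
    type: str
    text: str


@dataclass(frozen=True)
class SimpleAgent:
    """Hypothetical stand-in for an agent; only the name matters here."""
    name: str


refund_agent = SimpleAgent("refund_agent")
sales_agent = SimpleAgent("sales_agent")
support_agent = SimpleAgent("support_agent")  # assumed fallback, not in the original

# Explicit routing table: task type -> specialist.
ROUTES = {"refund": refund_agent, "sales": sales_agent}


def route(task: Task) -> SimpleAgent:
    """Return the specialist for a task, or the fallback agent if none matches."""
    return ROUTES.get(task.type, support_agent)


print(route(Task("refund", "I was charged twice")).name)
```

Keeping the routes in a table rather than an if/elif chain makes the handoff conditions easy to list and test.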
Use Case: Routing requests to specialized sub-agents
Structure:
User Request → Triage Agent → [Determines Category] → Specialist Agent
Example Implementation:
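# Triage agent with handoff capabilities
from agents import Agent  # Agents SDK (pip install openai-agents)

# refund_agent, sales_agent, and support_agent are specialist Agent instances
# defined elsewhere; analyze_request is a function tool defined elsewhere.
triage_agent = Agent(
    name="Customer Service Triage",
    instructions="Analyze customer requests and route to appropriate specialist",
    tools=[analyze_request],  # the Agents SDK takes `tools`; `functions` was Swarm's parameter
    handoffs=[refund_agent, sales_agent, support_agent],
)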
When to Use:
Use Case: Multi-step workflows where each step has a specialist
Structure:
Step 1 Agent → [Complete] → Handoff → Step 2 Agent → ... → Final Agent
Example:
# Research → Analysis → Report Generation pipeline
research_agent → analysis_agent → report_agent
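A sequential pipeline like the one above can be sketched as a chain of functions that each take and return a state dictionary. This is a simplified illustration under assumptions: the stage functions `research`, `analyze`, and `report` are hypothetical stand-ins that do trivial work, where a real stage would wrap a model call.

```python
from typing import Callable

# Each stage is a plain function from state dict to state dict (an assumption).
Stage = Callable[[dict], dict]


def research(state: dict) -> dict:
    return {**state, "sources": [f"note about {state['topic']}"]}


def analyze(state: dict) -> dict:
    return {**state, "findings": len(state["sources"])}


def report(state: dict) -> dict:
    return {**state, "report": f"{state['findings']} finding(s) on {state['topic']}"}


def run_pipeline(stages: list[Stage], state: dict) -> dict:
    """Run stages in order; each stage receives the previous stage's output."""
    for stage in stages:
        state = stage(state)
    return state


result = run_pipeline([research, analyze, report], {"topic": "handoffs"})
print(result["report"])
```

Because each stage returns a new dictionary that includes everything it received, later stages can still read what earlier stages produced.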
When to Use:
Use Case: Breaking complex tasks into parallel subtasks
Structure:
Coordinator Agent
↓
├─→ Subtask Agent 1
├─→ Subtask Agent 2
└─→ Subtask Agent 3
↓
Synthesis Agent (combines results)
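The coordinator structure above can be sketched with the standard library's thread pool. This is an illustrative sketch, not an AgentKit API: `subtask_agent`, `synthesize`, and `coordinate` are hypothetical names, and the worker just upper-cases its input where a real one would call a model or tool.

```python
from concurrent.futures import ThreadPoolExecutor


def subtask_agent(subtask: str) -> str:
    """Hypothetical worker; a real one would call a model or tool."""
    return subtask.upper()


def synthesize(results: list[str]) -> str:
    """Hypothetical synthesis step: combine subtask results in order."""
    return " | ".join(results)


def coordinate(subtasks: list[str]) -> str:
    # pool.map preserves input order even if workers finish out of order.
    with ThreadPoolExecutor(max_workers=3) as pool:
        results = list(pool.map(subtask_agent, subtasks))
    return synthesize(results)


print(coordinate(["pricing", "inventory", "shipping"]))
```

Threads suit this case because agent work is typically I/O-bound (waiting on network calls); the synthesis step runs only after every subtask has returned.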
When to Use:
DO:
✅ Keep agents focused on single responsibilities
✅ Provide clear, specific instructions in system prompts
✅ Define explicit handoff conditions
✅ Use descriptive agent names (helps with debugging)
✅ Test agents in isolation before integration
DON'T:
❌ Create monolithic "do-everything" agents
❌ Allow agents to communicate directly (use handoffs)
❌ Over-engineer with too many specialized agents
❌ Ignore error handling in handoffs
❌ Skip agent boundary testing
Effective Routines:
Example:
routine = {
    "name": "Process Refund",
    "entry": "User requests refund",
    "steps": [
        "Verify order exists",
        "Check refund eligibility",
        "Calculate refund amount",
        "Process payment reversal",
        "Send confirmation"
    ],
    "exit": "Refund confirmed or rejection reason provided",
    "error_handling": "Escalate to human agent if verification fails"
}
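A routine definition like the one above only describes the steps. One possible way to execute such a routine is sketched below, under stated assumptions: each step is a hypothetical function that returns True on success, the 30-day eligibility window is invented for the example, and any failed step triggers escalation (a simplification of the error handling described above). Only the first three steps are modeled.

```python
from typing import Callable

# A step takes the shared context and returns True on success (an assumption).
Step = Callable[[dict], bool]


def verify_order(ctx: dict) -> bool:
    return ctx.get("order_id") is not None


def check_eligibility(ctx: dict) -> bool:
    return ctx.get("days_since_purchase", 0) <= 30  # assumed 30-day policy


def calculate_refund(ctx: dict) -> bool:
    ctx["refund_amount"] = ctx.get("order_total", 0.0)
    return True


def run_routine(steps: list[tuple[str, Step]], ctx: dict) -> dict:
    """Run steps in order; stop at the first failure and record where it happened."""
    for name, step in steps:
        if not step(ctx):
            return {"status": "escalated", "failed_step": name, "context": ctx}
    return {"status": "completed", "failed_step": None, "context": ctx}


REFUND_STEPS = [
    ("Verify order exists", verify_order),
    ("Check refund eligibility", check_eligibility),
    ("Calculate refund amount", calculate_refund),
]

ok = run_routine(REFUND_STEPS, {"order_id": "A1", "days_since_purchase": 5, "order_total": 20.0})
late = run_routine(REFUND_STEPS, {"order_id": "A1", "days_since_purchase": 90, "order_total": 20.0})
print(ok["status"], late["status"], late["failed_step"])
```

Recording the name of the failed step gives the human agent who receives the escalation a concrete starting point.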
Critical Elements:
Example:
def handoff_condition(state):
    """Determine if handoff needed"""
    if state.requires_specialized_knowledge:
        return specialist_agent
    if state.exceeds_authority_level:
        return supervisor_agent
    return None  # Continue with current agent
Principle: Frameworks that limit LLM involvement and rely on predefined or direct execution flows operate more efficiently.
Strategies:
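One way to limit LLM involvement, sketched here as an assumption rather than a documented AgentKit feature, is to check deterministic rules first and call a model only when no rule matches. The patterns, the `classify` function, and the `fake_llm` stand-in are all hypothetical.

```python
import re
from typing import Callable

# Deterministic rules checked before any model call. Patterns are illustrative.
RULES: list[tuple[re.Pattern, str]] = [
    (re.compile(r"\brefund\b|\bmoney back\b", re.IGNORECASE), "refund"),
    (re.compile(r"\bquote\b|\bpricing\b", re.IGNORECASE), "sales"),
]


def classify(text: str, llm_fallback: Callable[[str], str]) -> tuple[str, bool]:
    """Return (category, used_llm). The model is consulted only if no rule matches."""
    for pattern, category in RULES:
        if pattern.search(text):
            return category, False
    return llm_fallback(text), True


def fake_llm(text: str) -> str:
    """Stand-in for a model call so the sketch runs offline."""
    return "support"


print(classify("I want a refund", fake_llm))
print(classify("My app keeps crashing", fake_llm))
```

The second element of the returned tuple makes it easy to measure how often the model is actually needed.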
Principle: Give agents only the tools they need for their specialty.
Pattern:
# Specialized agents get targeted toolsets
refund_agent.tools = [verify_order, calculate_refund, process_payment]
sales_agent.tools = [check_inventory, create_quote, process_order]
# NOT: both agents get all 6 tools
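Tool scoping of this kind can also be checked in code. The sketch below uses tool names as stand-ins for real tool objects; `AGENT_TOOLS`, `is_allowed`, and `shared_tools` are hypothetical helpers, not part of any SDK.

```python
# Tool names stand in for real tool objects in this sketch.
AGENT_TOOLS = {
    "refund_agent": {"verify_order", "calculate_refund", "process_payment"},
    "sales_agent": {"check_inventory", "create_quote", "process_order"},
}


def is_allowed(agent_name: str, tool_name: str) -> bool:
    """An agent may only call tools in its own toolset."""
    return tool_name in AGENT_TOOLS.get(agent_name, set())


def shared_tools(tools_by_agent: dict[str, set[str]]) -> set[str]:
    """Return tools assigned to more than one agent, a sign of blurry boundaries."""
    seen: set[str] = set()
    shared: set[str] = set()
    for tools in tools_by_agent.values():
        shared |= seen & tools
        seen |= tools
    return shared


print(is_allowed("sales_agent", "process_payment"))
print(shared_tools(AGENT_TOOLS))
```

A check like `shared_tools` can run in a test suite so that overlapping toolsets are caught when a new agent is added.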
Problem: Too many agents for simple tasks creates overhead
Solution: Start simple, add agents only when complexity demands it
Problem: Agent A → Agent B → Agent A creates loops
Solution: Design clear hierarchy or state-based termination
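State-based termination can be sketched as a guard that tracks the handoff path. The example below is a simplified illustration: it assumes each agent has at most one fixed handoff target (the hypothetical `next_of` mapping) and treats any revisit as a loop, which is stricter than some systems need. `HandoffLoopError` and `follow_handoffs` are names invented here.

```python
from typing import Optional


class HandoffLoopError(RuntimeError):
    """Raised when a handoff chain revisits an agent or exceeds the hop limit."""


def follow_handoffs(start: str, next_of: dict[str, Optional[str]], max_hops: int = 5) -> list[str]:
    """Follow handoffs from `start`, refusing cycles and overly long chains."""
    path = [start]
    current = start
    while (target := next_of.get(current)) is not None:
        if target in path:
            raise HandoffLoopError(f"loop detected: {' -> '.join(path + [target])}")
        if len(path) > max_hops:  # len(path) equals the hop count after this handoff
            raise HandoffLoopError(f"more than {max_hops} handoffs")
        path.append(target)
        current = target
    return path


print(follow_handoffs("triage", {"triage": "refund", "refund": None}))
```

In a live system the same two checks (visited set and hop budget) would sit inside the orchestration loop rather than run over a static mapping.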
Problem: Agents lose context across handoffs
Solution: Implement proper state management and context passing
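One possible shape for context passing is a single state object that travels with the request. This is a minimal sketch under assumptions: `HandoffContext` and its fields are hypothetical, and a production system would likely add persistence and size limits.

```python
from dataclasses import dataclass, field


@dataclass
class HandoffContext:
    """Hypothetical shared state that travels with the request across handoffs."""
    request: str
    facts: dict = field(default_factory=dict)    # what agents have learned so far
    history: list = field(default_factory=list)  # (from_agent, to_agent, reason)

    def hand_off(self, from_agent: str, to_agent: str, reason: str) -> "HandoffContext":
        """Record the transition; the same object is passed on so nothing is dropped."""
        self.history.append((from_agent, to_agent, reason))
        return self


ctx = HandoffContext(request="Refund for order A1")
ctx.facts["order_id"] = "A1"  # learned by the triage agent
ctx = ctx.hand_off("triage_agent", "refund_agent", "refund request")
print(ctx.facts["order_id"], ctx.history)
```

The receiving agent reads `facts` instead of asking the user again, and `history` doubles as an audit trail for debugging.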
Problem: Overlapping agent responsibilities cause conflicts
Solution: Define explicit agent domains and decision criteria
Agent-Level:
System-Level:
Unit Testing:
# Test individual agent behaviors
def test_refund_agent():
    result = refund_agent.process(valid_refund_request)
    assert result.status == "approved"
    assert result.amount > 0
Integration Testing:
# Test agent handoffs
def test_triage_to_refund():
    initial_state = {"request": "I want a refund"}
    final_state = orchestrator.run(initial_state)
    assert final_state.handling_agent == "refund_agent"
    assert final_state.completed
End-to-End Testing:
# Test full user journeys
def test_customer_journey():
    scenarios = load_test_scenarios()
    for scenario in scenarios:
        result = system.execute(scenario)
        assert result.meets_requirements()
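The test outlines above rely on objects defined elsewhere. A self-contained variant is sketched below, assuming a design in which the model call is injected as a function so tests can substitute a fake; `make_triage` and the agent names are hypothetical.

```python
from typing import Callable


def make_triage(model: Callable[[str], str]) -> Callable[[str], str]:
    """Build a triage function around an injected model (hypothetical design)."""
    allowed = {"refund_agent", "sales_agent", "support_agent"}

    def triage(request: str) -> str:
        choice = model(request).strip()
        # Do not trust the model's output blindly: fall back on unknown labels.
        return choice if choice in allowed else "support_agent"

    return triage


def test_triage_routes_refund():
    triage = make_triage(lambda request: "refund_agent")
    assert triage("I want a refund") == "refund_agent"


def test_triage_falls_back_on_unknown_label():
    triage = make_triage(lambda request: "billing_wizard")
    assert triage("???") == "support_agent"


test_triage_routes_refund()
test_triage_falls_back_on_unknown_label()
```

Injecting the model keeps these tests fast and deterministic; checks against a live model can then be reserved for a smaller end-to-end suite.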
Essential Observability:
Tools:
Agent Security:
Data Protection:
Horizontal Scaling:
Vertical Optimization:
from openai import OpenAI

client = OpenAI()

# Define specialized agent
# troubleshooting_guide and escalate_to_human are function schemas (name,
# description, parameters) defined elsewhere.
support_agent = {
    "name": "Technical Support Agent",
    "model": "gpt-4o",
    "instructions": """You are a technical support specialist.
    Help users troubleshoot technical issues.
    If issue requires refund, hand off to refund agent.
    If issue is sales-related, hand off to sales agent.""",
    "tools": [
        {"type": "function", "function": troubleshooting_guide},
        {"type": "function", "function": escalate_to_human},
    ],
}
def execute_agent_workflow(initial_request):
    # is_complete, build_messages, and determine_handoff are application-specific
    # helpers that are not shown here. Agents are dicts shaped like support_agent.
    current_agent = triage_agent
    context = {"request": initial_request, "history": []}
    while not is_complete(context):
        # Execute current agent
        response = client.chat.completions.create(
            model=current_agent["model"],
            messages=build_messages(context, current_agent),
            tools=current_agent["tools"],
        )
        # Check for handoff
        next_agent = determine_handoff(response)
        if next_agent:
            context["history"].append({
                "from": current_agent["name"],
                "to": next_agent["name"],
            })
            current_agent = next_agent
        else:
            context["result"] = response
            break
    return context
Use AgentKit for OpenAI-based workflows, Claude SDK for Anthropic-based workflows, and MCP to bridge data sources to both.
LangGraph provides more fine-grained control flow. Use AgentKit for simpler workflows, LangGraph for complex state machines.
AgentKit agents can consume MCP servers as tools, standardizing data source connections.
Key Changes:
Replace swarm.run() with Agents SDK orchestration
Timeline: Swarm is maintenance-only. Migrate all production code by Q2 2025.
Use OpenAI AgentKit when:
Consider alternatives when:
Official Documentation:
Community:
This skill ensures you build robust, scalable, production-ready multi-agent systems using OpenAI's latest platform capabilities in 2025.
npx claudepluginhub frankxai/claude-skills-library --plugin claude-skills-library