Guide for designing effective tools for AI agents. Use when creating tools for custom agent systems or any AI tool interfaces. Provides principles for tool naming, input/output design, error handling, and evaluation methodologies that maximize agent effectiveness.
This skill inherits all available tools. When active, it can use any tool Claude has access to.
references/evaluation_guide.md
references/tool_design_patterns.md

This skill provides comprehensive guidance for designing tools that AI agents can use effectively. Whether building custom agent tools or any AI-accessible interfaces, these principles maximize agent success in accomplishing real-world tasks.
Note: Use the more specific mcp-builder skill if you want to create an MCP server.
The quality of a tool system is measured not by how comprehensively it implements features, but by how well it enables AI agents to accomplish realistic, complex tasks using only the tools provided.
Before implementing any tool system, understand these foundational principles for designing tools that AI agents can use effectively:
Principle: Design thoughtful, high-impact workflow tools rather than simply wrapping existing API endpoints.
Why it matters: Agents need to accomplish complete tasks, not just make individual API calls. Tools that consolidate related operations reduce the number of steps agents must take and improve success rates.
How to apply:
- Consolidate multi-step operations into single tools (e.g., a schedule_event that both checks availability and creates the event)

Examples:
- Instead of: check_calendar_availability, create_calendar_event, send_event_notification
- Better: a single schedule_event with parameters for checking conflicts and sending notifications

Principle: Agents have constrained context windows - make every token count.
Why it matters: When agents run out of context, they fail to complete tasks. Verbose tool outputs force agents to make difficult decisions about what information to keep or discard.
How to apply:
Examples:
- Return concise summaries by default, with a detailed=true parameter for full data

Principle: Error messages should guide agents toward correct usage patterns, not just report failures.
Why it matters: Agents learn tool usage through feedback. Clear, educational errors help agents self-correct and succeed on retry.
How to apply:
Examples:
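As an illustration, a validation error can be returned as structured guidance rather than a bare failure. The tool name and fields below are hypothetical:

```ruby
# Hypothetical search_users tool: the error names the constraint that was
# violated and suggests a concrete retry, so the agent can self-correct.
def search_users(query:, limit: 20)
  if limit > 100
    return {
      error: "limit must be between 1 and 100 (got #{limit})",
      suggestion: "Retry with limit=100 and use the 'offset' parameter " \
                  "to page through additional results."
    }
  end
  { users: [], query: query, limit: limit } # real search elided
end
```

On retry, the agent has everything it needs in the response itself, with no need to consult external documentation.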
Principle: Tool names and organization should reflect how humans think about tasks, not just API structure.
Why it matters: Agents use tool names and descriptions to decide which tool to call. Natural naming improves tool discovery and reduces wrong tool selections.
How to apply:
- Use verb_noun names that match user intent: search_users, create_project, send_message
- Namespace tools by service: slack_send_message, not just send_message

Examples:
- Instead of: api_endpoint_users_post, api_endpoint_users_get, api_endpoint_users_delete
- Better: create_user, search_users, delete_user

Principle: Create realistic evaluation scenarios early and let agent feedback drive tool improvements.
Why it matters: Only by testing tools with actual agents can you discover usability issues. Prototype quickly and iterate based on real agent performance.
How to apply:
Process:
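To make the process concrete, here is a minimal sketch of an evaluation loop. The run_agent entry point is a placeholder for your agent runtime, and the case shown is illustrative:

```ruby
# Each case pairs a realistic task prompt with a programmatic check of
# the agent's transcript.
EvalCase = Struct.new(:prompt, :check)

CASES = [
  EvalCase.new(
    "Schedule a 30-minute meeting with Jane next week",
    ->(transcript) { transcript.include?("schedule_event") }
  )
]

def run_agent(prompt)
  # Placeholder: invoke the agent with the tool set under test and
  # return its transcript for scoring.
  "agent called schedule_event(attendee: 'Jane', duration_min: 30)"
end

results = CASES.map { |c| c.check.call(run_agent(c.prompt)) }
puts "passed #{results.count(true)}/#{results.size}"
```

Failures point directly at the tool that confused the agent, which is exactly the feedback the iteration loop needs.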
Follow this systematic framework when designing any tool for AI agents:
1. Identify Core Workflows
2. Design Input Schemas
3. Design Output Formats
- Include pagination metadata in list responses (has_more, next_offset, total_count)

4. Plan Error Handling
Tool Naming Conventions:
- Use verb_noun format: search_users, create_project
- Prefix with the service name: github_create_issue, slack_send_message

Tool Descriptions: Write comprehensive descriptions that include:
Tool Annotations (if supported by your system):
- readOnlyHint: true for read-only operations
- destructiveHint: false for non-destructive operations
- idempotentHint: true if repeated calls have the same effect
- openWorldHint: true if interacting with external systems

Code Quality Checklist:
Testing:
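As a sketch, tool handlers can be exercised directly before any agent-level testing. The create_project handler below is hypothetical:

```ruby
# A hypothetical tool handler with input validation.
def create_project(name:)
  raise ArgumentError, "name is required and must be non-empty" if name.to_s.strip.empty?
  { id: "P1", name: name }
end

# Exercise the success path...
project = create_project(name: "Roadmap")
raise "unexpected result" unless project[:name] == "Roadmap"

# ...and confirm the validation error is raised with a helpful message.
begin
  create_project(name: "  ")
  raise "expected ArgumentError"
rescue ArgumentError => e
  raise "unhelpful message" unless e.message.include?("name")
end
```

Checking error messages in tests, not just error classes, keeps the educational quality of failures from regressing.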
All tools that return data should support multiple formats for flexibility:
JSON Format (response_format="json")

Purpose: Machine-readable structured data for programmatic processing
Best practices:
Example:
```json
{
  "users": [
    {
      "id": "U123456",
      "name": "John Doe",
      "email": "john@example.com",
      "role": "developer",
      "active": true
    }
  ],
  "total": 150,
  "count": 20,
  "has_more": true,
  "next_offset": 20
}
```
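A response like the one above can be assembled with a small helper. The names below are illustrative, and total here is the size of the already-filtered result set:

```ruby
# Build the paginated envelope from an already-filtered result set.
def users_response(users, offset:, limit:)
  page = users[offset, limit] || []
  {
    users: page,
    total: users.size,
    count: page.size,
    has_more: offset + page.size < users.size,
    next_offset: offset + page.size
  }
end
```

The has_more flag lets the agent decide whether another call is worthwhile without counting items itself.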
Markdown Format (response_format="markdown", typically the default)

Purpose: Human-readable formatted text for user presentation
Best practices:
Example:
```markdown
## Users (20 of 150)

- **John Doe** (@john.doe)
  - Email: john@example.com
  - Role: Developer
  - Status: Active
- **Jane Smith** (@jane.smith)
  - Email: jane@example.com
  - Role: Designer
  - Status: Active

*Showing 20 results. Use offset=20 to see more.*
```
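Both formats can sit behind a single response_format parameter, as in this sketch (field names hypothetical):

```ruby
require 'json'

# Dispatch on response_format; default to markdown for readability, and
# fail loudly on unknown values so the agent can correct itself.
def format_users(users, response_format: "markdown")
  case response_format
  when "json"
    JSON.generate("users" => users, "count" => users.size)
  when "markdown"
    users.map { |u| "- **#{u["name"]}** (#{u["role"]})" }.join("\n")
  else
    raise ArgumentError,
          "response_format must be 'json' or 'markdown', got '#{response_format}'"
  end
end
```

Defaulting to markdown keeps the common path readable while still offering structured output on request.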
For tools that list resources:
Implementation requirements:
- Respect the limit parameter (never load all results when a limit is specified)
- Include pagination metadata: has_more, next_offset/next_cursor, total_count

Response structure:
```json
{
  "items": [...],
  "total": 150,
  "count": 20,
  "offset": 0,
  "has_more": true,
  "next_offset": 20
}
```
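For the next_cursor variant, the cursor can opaquely encode the continuation point. A minimal sketch, assuming an offset-backed store (a real implementation might encode a sort key or snapshot token instead):

```ruby
require 'base64'

# The cursor is just an encoded offset here.
def encode_cursor(offset)
  Base64.urlsafe_encode64(offset.to_s)
end

def decode_cursor(cursor)
  cursor ? Base64.urlsafe_decode64(cursor).to_i : 0
end

def list_page(items, limit: 20, cursor: nil)
  offset = decode_cursor(cursor)
  page = items[offset, limit] || []
  next_offset = offset + page.size
  more = next_offset < items.size
  {
    items: page,
    total: items.size,
    count: page.size,
    has_more: more,
    next_cursor: more ? encode_cursor(next_offset) : nil
  }
end
```

A nil cursor on the last page gives the agent an unambiguous stopping signal.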
Clear guidance in responses: include instructions for getting more data, e.g., "Use offset=20 to see more."
To prevent overwhelming context windows:
Implementation:
Example handling:
```ruby
require 'json'

CHARACTER_LIMIT = 25_000

def truncate_response(response, data)
  return response unless response.to_json.length > CHARACTER_LIMIT

  # Keep at least one item; drop the second half of the list
  truncated_data = data[0...[1, data.length / 2].max]
  response[:items] = truncated_data
  response[:truncated] = true
  response[:truncation_message] =
    "Response truncated from #{data.length} to #{truncated_data.length} items. " \
    "Use 'offset' parameter or add filters like status='active' to see more."
  response
end
```
Security and usability:
Schema design:
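For instance, a search tool's input schema can constrain values up front so invalid calls fail before execution. The tool and fields below are hypothetical:

```ruby
# JSON Schema for a hypothetical search_users tool. Enums, numeric
# bounds, and explicit defaults steer the agent toward valid calls.
SEARCH_USERS_SCHEMA = {
  "type" => "object",
  "properties" => {
    "query" => {
      "type" => "string",
      "description" => "Name or email substring to match"
    },
    "role" => { "type" => "string", "enum" => %w[developer designer manager] },
    "limit" => {
      "type" => "integer", "minimum" => 1, "maximum" => 100, "default" => 20
    },
    "response_format" => {
      "type" => "string", "enum" => %w[json markdown], "default" => "markdown"
    }
  },
  "required" => ["query"]
}.freeze
```

Descriptions on each property double as inline documentation the agent reads when choosing arguments.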
This skill includes reference documentation for deeper exploration:
references/tool_design_patterns.md: Comprehensive patterns and anti-patterns for common tool design scenarios with detailed examples.
references/evaluation_guide.md: Complete methodology for creating evaluation questions that test tool effectiveness with AI agents, including how to run evaluations and interpret results.
For detailed examples and advanced patterns: