Expert prompt optimization system for building production-ready AI features. Use when users request help improving prompts, want to create system prompts, need prompt review/critique, ask for prompt optimization strategies, want to analyze prompt effectiveness, mention prompt engineering best practices, request prompt templates, or need guidance on structuring AI instructions. Also use when users provide prompts and want suggestions for improvement.
Installation:
/plugin marketplace add breethomas/pm-thought-partner
/plugin install breethomas-pm-thought-partner@breethomas/pm-thought-partner

This skill inherits all available tools. When active, it can use any tool Claude has access to.
References: references/examples.md, references/research.md

Master system for creating, analyzing, and optimizing prompts for AI products using research-backed techniques and battle-tested production patterns.
When improving any prompt, follow this systematic process:
Begin with what the model CANNOT do, not what it should do.
Pattern:
NEVER:
- [TOP 3 FAILURE MODES - BE SPECIFIC]
- Use meta-phrases ("I can help you", "let me assist")
- Provide information you're not certain about
ALWAYS:
- [TOP 3 SUCCESS BEHAVIORS - BE SPECIFIC]
- Acknowledge uncertainty when present
- Follow the output format exactly
Why: LLMs are more consistent at avoiding specific patterns than following general instructions. "Never say X" is more reliable than "Always be helpful."
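For concreteness, here is a hypothetical filled-in constraints block for a code-review bot (the specifics are invented for illustration):

```
NEVER:
- Approve a change you have not read in full
- Invent APIs or function signatures when suggesting fixes
- Use meta-phrases ("I can help you", "let me assist")
ALWAYS:
- Point to the exact file and line for every issue raised
- Rank findings by severity
- Acknowledge uncertainty when present
```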
Use formatting that signals technical documentation quality: XML tags for Claude (e.g., <system_constraints>, <task_instructions>), JSON schemas for GPT-3.5, and Markdown for documentation-style tasks.
Why: Well-structured documents trigger higher-quality training data patterns.
Don't optimize manually - let the model do it using this meta-prompt:
You are a prompt optimization specialist. Your job is to improve prompts for production AI systems.
CURRENT PROMPT:
[User's prompt here]
PERFORMANCE DATA:
- Main failure modes: [List top 3 if known]
- Target use case: [Describe]
OPTIMIZATION TASK:
1. Identify the top 3 weaknesses in this prompt
2. Rewrite to fix those weaknesses using these principles:
- Hard constraints over soft instructions
- Specific examples over generic guidance
- Structured format over free text
3. Predict the improvement percentage for each change
CONSTRAINTS:
- Must maintain core functionality
- Cannot exceed 150% of current token count
- Must include failure mode handling
OUTPUT:
Optimized prompt + rationale for each change
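This loop is easy to automate. A minimal sketch using the Anthropic Python SDK - the model id is a placeholder, and the elided sections come from the meta-prompt above:

```python
# Minimal sketch: run the meta-prompt above through the Anthropic Python SDK.
# Assumes the `anthropic` package is installed and ANTHROPIC_API_KEY is set;
# the model id is a placeholder - substitute a current one.
import anthropic

META_PROMPT = """You are a prompt optimization specialist. Your job is to \
improve prompts for production AI systems.

CURRENT PROMPT:
{prompt}

PERFORMANCE DATA:
- Main failure modes: {failures}
- Target use case: {use_case}
"""  # append the OPTIMIZATION TASK / CONSTRAINTS / OUTPUT sections from above

def optimize_prompt(prompt: str, failures: list[str], use_case: str) -> str:
    client = anthropic.Anthropic()
    response = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder model id
        max_tokens=2000,
        messages=[{"role": "user", "content": META_PROMPT.format(
            prompt=prompt,
            failures="; ".join(failures) or "unknown",
            use_case=use_case,
        )}],
    )
    return response.content[0].text
```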
Test the prompt systematically:
1. Identify the top 3 failure patterns and address them explicitly in the prompt.
2. Define clear success metrics (for example, pass rate per eval category and cost per call).
3. Optimize in two phases:
Phase 1: Climb Up for Quality - start with the most capable model and a richer prompt until every metric passes.
Phase 2: Descend for Cost - once quality holds, compress the prompt and step down to cheaper models, re-running evals at each step.
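A minimal harness sketch for that loop, with `run_prompt` standing in for your model call and a deliberately naive pass criterion:

```python
# Sketch of a systematic eval loop: run every case, report pass rate per bucket.
# `run_prompt` is a stand-in for your model call; the substring check is a
# deliberately simplistic grader - swap in real grading logic.
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    bucket: str        # "happy_path", "edge_case", or "adversarial"
    user_input: str
    must_contain: str  # naive pass criterion for the sketch

def run_eval(prompt: str, cases: list[EvalCase],
             run_prompt: Callable[[str, str], str]) -> dict[str, float]:
    hits: dict[str, list[bool]] = defaultdict(list)
    for case in cases:
        output = run_prompt(prompt, case.user_input)
        hits[case.bucket].append(case.must_contain in output)
    return {bucket: sum(ok) / len(ok) for bucket, ok in hits.items()}

# Usage: gate deployment on every bucket clearing an (illustrative) threshold.
# scores = run_eval(candidate_prompt, cases, run_prompt)
# assert all(rate >= 0.9 for rate in scores.values()), scores
```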
Use this battle-tested template structure:
<system_role>
You are [SPECIFIC ROLE], not a general AI assistant.
You [CORE FUNCTION] for [TARGET USER].
</system_role>
<hard_constraints>
NEVER:
- [FAILURE MODE 1 - SPECIFIC]
- [FAILURE MODE 2 - SPECIFIC]
- [FAILURE MODE 3 - SPECIFIC]
- Use meta-phrases ("I can help you", "let me assist")
ALWAYS:
- [SUCCESS BEHAVIOR 1 - SPECIFIC]
- [SUCCESS BEHAVIOR 2 - SPECIFIC]
- [SUCCESS BEHAVIOR 3 - SPECIFIC]
- Acknowledge uncertainty when present
</hard_constraints>
<context_info>
Current user: [USER_CONTEXT]
Available tools: [TOOL_LIST]
Key limitations: [SPECIFIC_LIMITATIONS]
</context_info>
<task_instructions>
Your job is to [CORE TASK] by:
1. [STEP 1 - SPECIFIC ACTION]
2. [STEP 2 - SPECIFIC ACTION]
3. [STEP 3 - SPECIFIC ACTION]
If [EDGE_CASE_1], then [SPECIFIC_RESPONSE].
If [EDGE_CASE_2], then [SPECIFIC_RESPONSE].
If [EDGE_CASE_3], then [SPECIFIC_RESPONSE].
</task_instructions>
<output_format>
Respond using this exact structure:
[SECTION_1]: [DESCRIPTION]
[SECTION_2]: [DESCRIPTION]
Requirements:
- [FORMAT_REQUIREMENT_1]
- [FORMAT_REQUIREMENT_2]
</output_format>
<examples>
Example 1 - Happy Path:
Input: [TYPICAL_INPUT]
Output: [IDEAL_RESPONSE]
Example 2 - Edge Case:
Input: [EDGE_CASE_INPUT]
Output: [EDGE_CASE_RESPONSE]
Example 3 - Complex:
Input: [COMPLEX_SCENARIO]
Output: [COMPLEX_RESPONSE]
</examples>
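For reference, a hypothetical partially filled instance for a SaaS billing-support agent (every product detail is invented); the remaining sections fill in the same way:

```
<system_role>
You are a billing support agent for AcmeCRM, not a general AI assistant.
You resolve invoice and refund questions for paying customers.
</system_role>

<hard_constraints>
NEVER:
- Quote a refund amount without the invoice record in context
- Promise timelines the billing system does not guarantee
- Use meta-phrases ("I can help you", "let me assist")
ALWAYS:
- Cite the invoice ID you are referencing
- Escalate disputes over $500 to a human agent
- Acknowledge uncertainty when present
</hard_constraints>
```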
Chain-of-Table
Best for: Financial dashboards, data analysis, table processing
Performance: 8.69% improvement on table tasks
How: Make the AI manipulate table structure step-by-step, not reason about tables in text
Chain-of-Thought
Best for: Arithmetic reasoning, logic puzzles, formal reasoning
Limitations: Only works on 100B+ parameter models; minimal benefit for content generation
When NOT to use: Classification, content generation, most business tasks
Few-Shot Examples
When it helps: The task requires a specific style, or format examples improve output
When it hurts: Advanced reasoning tasks (o1, DeepSeek R1 models)
Best practice: Test systematically - few-shot has the highest variability of any technique
Multi-Shot (Conversation Examples)
Best for: Customer support, sales conversations, multi-turn interactions
How: Show entire conversation flows, not isolated examples
Benefit: Teaches conversation patterns, not just individual responses
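For instance, a hypothetical multi-shot block for a billing bot, with all conversation details invented:

```
<examples>
Example - Full refund conversation:
User: I was charged twice this month.
Assistant: I see two charges, on invoices INV-1042 and INV-1043. Were both on the same card?
User: Yes, same card.
Assistant: Thanks - INV-1043 is a duplicate. I have flagged it for refund; expect it within 3-5 business days.
</examples>
```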
Problem: One massive prompt trying to do sentiment analysis, routing, response generation, and task management simultaneously.
Fix: Break the work into specialized, nested prompts, one each for sentiment analysis, routing, response generation, and task management, chained together (see the sketch below).
Each prompt does ONE thing exceptionally well.
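A sketch of what that split looks like in code, with `call_llm(system_prompt, user_input)` standing in for your model client:

```python
# Sketch of nested prompting: each stage is one narrow, specialized prompt.
# `call_llm(system_prompt, user_input)` is a stand-in for your model client.
from typing import Callable

def handle_ticket(ticket: str, call_llm: Callable[[str, str], str]) -> str:
    sentiment = call_llm(
        "Classify the sentiment of this message as positive, neutral, or "
        "negative. Output one word.", ticket)
    route = call_llm(
        "Route this ticket to exactly one team: billing, technical, or "
        "account. Output one word.", ticket)
    return call_llm(
        f"You are a {route} support agent. The customer's tone is {sentiment}. "
        "Draft a reply that resolves the issue in under 120 words.", ticket)
```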
Problem: Prompt works perfectly on clean, polite, well-formatted demo data but fails on 40% of real production inputs.
Fix: Build an eval suite from real chaos: typos, fragments, anger, ambiguity, and malformed inputs sampled from production logs (a sampling sketch follows below).
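One way to assemble that suite, assuming production logs and a triage function (`classify` is a placeholder), using the 20/60/20 weighting recommended later in this skill:

```python
# Sketch: sample an eval set from real production inputs, weighted toward
# edge cases. `classify` is a placeholder for your triage logic and must
# return "happy_path", "edge_case", or "adversarial".
import random
from typing import Callable

def sample_eval_set(logs: list[str], classify: Callable[[str], str],
                    size: int = 100) -> list[str]:
    buckets: dict[str, list[str]] = {
        "happy_path": [], "edge_case": [], "adversarial": []}
    for entry in logs:
        buckets[classify(entry)].append(entry)
    weights = {"happy_path": 0.2, "edge_case": 0.6, "adversarial": 0.2}
    sample: list[str] = []
    for bucket, share in weights.items():
        k = min(int(size * share), len(buckets[bucket]))
        sample.extend(random.sample(buckets[bucket], k))
    return sample
```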
Problem: Shipping a prompt and never updating it as business evolves, user needs change, and new edge cases emerge.
Fix: Build continuous optimization into the workflow: log production failures, re-run evals on a schedule, and revise the prompt as the business and user needs change (a decay-detection sketch follows below).
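The monitoring half can start as a simple pass-rate comparison; a sketch, with illustrative window and threshold values:

```python
# Sketch: flag prompt decay by comparing recent scheduled-eval pass rates
# to a recorded baseline. Window and drop threshold are illustrative.
def detect_decay(history: list[float], baseline: float,
                 window: int = 5, drop: float = 0.05) -> bool:
    """history holds chronological pass rates from scheduled eval runs."""
    recent = history[-window:]
    return len(recent) == window and sum(recent) / window < baseline - drop
```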
Shorter, structured prompts have major advantages:
Example comparison: the same task, rewritten in compressed, structured form, produced equivalent output while the verbose original cost 88% more in tokens.
Benefits of compression: lower cost per call, lower latency, fewer instructions for the model to drop, and easier maintenance.
When to use longer prompts: Complex tasks requiring extensive context, edge case handling, or when that 88% cost increase delivers proportional value.
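To quantify the trade-off for your own prompts, a sketch using the tiktoken tokenizer (the encoding name and per-token price are placeholders; substitute your model's tokenizer and rates):

```python
# Sketch: compare token counts and rough cost of two prompt versions.
# The encoding name and per-token price are placeholders - substitute
# your model's tokenizer and rates.
import tiktoken

def compare_prompts(verbose: str, compressed: str,
                    usd_per_1k_tokens: float = 0.003) -> None:
    enc = tiktoken.get_encoding("cl100k_base")  # placeholder encoding
    v, c = len(enc.encode(verbose)), len(enc.encode(compressed))
    saved = (v - c) / 1000 * usd_per_1k_tokens
    print(f"verbose: {v} tokens, compressed: {c} tokens "
          f"({(v - c) / v:.0%} smaller, ~${saved:.4f} saved per call)")
```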
When a user provides a prompt to improve:
1. Identify Current State
2. Analyze Against Framework
3. Provide Specific Recommendations
4. Offer Complete Rewrite
5. Suggest Testing Strategy
Conciseness Matters - Context window is shared. Only include what Claude doesn't already know.
Structure = Quality - XML for Claude, JSON for GPT-3.5, Markdown for docs. Format signals quality.
Hard Constraints Over Soft - "Never do X" is more reliable than "Be helpful."
Systematic Testing - Build evals with 20% happy path, 60% edge cases, 20% adversarial.
Continuous Optimization - Prompts decay as business evolves. Build iteration into workflow.
Cost-Performance Balance - Climb for quality first, then descend for cost optimization.
Use Chain-of-Table when: the task manipulates tables or structured data (dashboards, data analysis, report processing).
Use Chain-of-Thought when: the task involves arithmetic, logic, or formal reasoning, and the model is large enough (100B+ parameters).
Use Few-Shot when: the output must match a specific style or format, and testing confirms examples help rather than hurt.
Use Multi-Shot when: the product is conversational and you can show whole interaction flows.
Use Nested Prompting when: a single prompt is being asked to do several jobs; split it so each prompt does one thing well.
When providing prompt improvements, always: explain the rationale for each change, predict its impact, and pair it with a way to test the result.