Accessibility-first UI automation using IDB. Query accessibility tree (fast, 50 tokens) before screenshots (slow, 170 tokens). Use when automating simulator interactions, tapping UI elements, finding buttons, or testing user flows. Covers idb-ui-describe, idb-ui-tap, idb-ui-find-element patterns.
This skill inherits all available tools. When active, it can use any tool Claude has access to.
Use the execute_idb_command MCP tool for all UI automation
The xclaude-plugin provides the execute_idb_command MCP tool which consolidates all IDB UI automation operations into a single, token-efficient dispatcher.
This is the most important rule: When automating UI interactions, you MUST use the execute_idb_command MCP tool.
- Always use execute_idb_command for all UI automation, element finding, and accessibility queries
- Never run idb commands directly in bash or in a terminal
Why? The MCP tool provides a single, token-efficient dispatcher for every IDB operation.
If execute_idb_command fails, the issue is with parameters or app state - not that you should use bash.
Always query the accessibility tree first. Only use screenshots as a fallback.
Use the execute_idb_command MCP tool with operation describe to access the accessibility tree.
| Approach | Time | Tokens | Reliability |
|---|---|---|---|
| Accessibility tree | ~120ms | ~50 | Survives theme changes |
| Screenshot | ~2000ms | ~170 | Breaks on visual changes |
Result: by the figures above, roughly 16x faster with about 70% fewer tokens, and more reliable
Before starting automation, check whether the app has good accessibility support:
Invoke the execute_idb_command MCP tool:
{
"operation": "check-accessibility",
"target": "booted"
}
Interprets:
Note: Most modern iOS apps have good accessibility support. Skip this check if you're confident.
execute_idb_command with operation: "describe"
This is your starting point for all UI automation:
Invoke the execute_idb_command MCP tool:
{
"operation": "describe",
"target": "booted",
"parameters": {
"operation": "all"
}
}
Returns:
{
"elements": [
{
"label": "Login",
"type": "Button",
"frame": { "x": 100, "y": 400, "width": 175, "height": 50 },
"centerX": 187,
"centerY": 425,
"enabled": true,
"visible": true
},
{
"label": "Email",
"type": "TextField",
"value": "",
"frame": { "x": 50, "y": 300, "width": 275, "height": 44 },
"centerX": 187,
"centerY": 322
}
]
}
Use centerX and centerY for tap coordinates.
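The centerX and centerY values in the sample response are consistent with the midpoint of each frame, rounded down. The following Python sketch reproduces them under that assumption; the rounding rule is inferred from the sample, not documented, and center_of is a hypothetical helper name.

```python
def center_of(frame: dict) -> tuple[int, int]:
    """Return the integer midpoint of a frame with x, y, width, height.

    Assumption: midpoints are rounded down, which matches the sample response.
    """
    return (frame["x"] + frame["width"] // 2,
            frame["y"] + frame["height"] // 2)


login_frame = {"x": 100, "y": 400, "width": 175, "height": 50}
email_frame = {"x": 50, "y": 300, "width": 275, "height": 44}

print(center_of(login_frame))  # (187, 425)
print(center_of(email_frame))  # (187, 322)
```

In practice the response already carries both values, so a helper like this is only useful as a sanity check or when an element reports a frame but no center.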
Option A: Search by Label/Text (Preferred)
{
"operation": "find-element",
"target": "booted",
"parameters": {
"query": "Login"
}
}
Option B: Manual Search
From the accessibility tree response, find the element you want by:
- label: Button text, field labels
- type: Button, TextField, Cell, etc.
- value: Current input value
- visible: Only interact with visible elements
Tap:
{
"operation": "tap",
"target": "booted",
"parameters": {
"x": 187,
"y": 425
}
}
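The manual search and the tap payload can be tied together in a few lines. This is a minimal Python sketch over the describe response shape shown earlier; find_tappable and tap_request are hypothetical helper names, and treating a missing enabled or visible flag as true is an assumption.

```python
from typing import Optional


def find_tappable(elements: list[dict], label: str,
                  element_type: Optional[str] = None) -> Optional[dict]:
    """Return the first visible, enabled element matching label (and type)."""
    for el in elements:
        if el.get("label") != label:
            continue
        if element_type is not None and el.get("type") != element_type:
            continue
        # Assumption: elements that omit these flags are treated as usable.
        if el.get("visible", True) and el.get("enabled", True):
            return el
    return None


def tap_request(element: dict, target: str = "booted") -> dict:
    """Build a tap payload from an element's center coordinates."""
    return {
        "operation": "tap",
        "target": target,
        "parameters": {"x": element["centerX"], "y": element["centerY"]},
    }


elements = [
    {"label": "Login", "type": "Button", "centerX": 187, "centerY": 425,
     "enabled": True, "visible": True},
    {"label": "Email", "type": "TextField", "value": "",
     "centerX": 187, "centerY": 322},
]
login = find_tappable(elements, "Login", "Button")
print(tap_request(login))
```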
Input Text:
{
"operation": "input",
"target": "booted",
"parameters": {
"text": "user@example.com"
}
}
Keyboard Actions:
{
"operation": "input",
"target": "booted",
"parameters": {
"key": "return"
}
}
Available keys: return, home, delete, space, escape, tab, up, down, left, right
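Because text and key share the input operation, it can help to build the payload in one place and reject keys outside the list above before anything is sent. A minimal Python sketch; input_request is a hypothetical helper, and the rule that a request carries exactly one of text or key is an assumption drawn from the two examples.

```python
VALID_KEYS = {"return", "home", "delete", "space", "escape",
              "tab", "up", "down", "left", "right"}


def input_request(text=None, key=None, target="booted"):
    """Build an input payload carrying either text or a single key."""
    if (text is None) == (key is None):
        raise ValueError("provide exactly one of text or key")
    if key is not None and key not in VALID_KEYS:
        raise ValueError(f"unsupported key: {key!r}")
    params = {"text": text} if text is not None else {"key": key}
    return {"operation": "input", "target": target, "parameters": params}


print(input_request(text="user@example.com"))
print(input_request(key="return"))
```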
After each interaction, query the accessibility tree again to verify the result:
{
"operation": "describe",
"target": "booted"
}
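One way to turn that second describe into a pass/fail check is to look for a label that should only exist on the expected screen. A minimal Python sketch, assuming the response shape shown earlier; screen_shows is a hypothetical helper and the "Welcome" label is made up for illustration.

```python
def screen_shows(describe_result: dict, label: str) -> bool:
    """True if any visible element in a describe response has this label."""
    return any(
        el.get("label") == label and el.get("visible", True)
        for el in describe_result.get("elements", [])
    )


before = {"elements": [{"label": "Login", "type": "Button", "visible": True}]}
after = {"elements": [{"label": "Welcome", "type": "StaticText", "visible": True}]}

print(screen_shows(before, "Welcome"))  # False
print(screen_shows(after, "Welcome"))   # True
```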
Login flow:
1. describe → Find "Email" text field
2. tap → Focus email field
3. input → Type email
4. describe → Find "Password" text field
5. tap → Focus password field
6. input → Type password
7. describe → Find "Login" button
8. tap → Submit form
9. describe → Verify next screen
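The nine steps above can be sketched as a single function. This is only an illustration: call is a hypothetical stand-in for however your client invokes execute_idb_command, and fake_call below simulates it so the sketch runs without a simulator.

```python
from typing import Callable


def login_flow(call: Callable[[dict], dict], email: str, password: str) -> dict:
    """Run describe/tap/input for each field, then return the final tree."""

    def send(op: str, parameters=None) -> dict:
        request = {"operation": op, "target": "booted"}
        if parameters is not None:
            request["parameters"] = parameters
        return call(request)

    def tap_label(label: str) -> None:
        tree = send("describe", {"operation": "all"})
        # Raises StopIteration if the label is absent from the tree.
        element = next(el for el in tree["elements"] if el["label"] == label)
        send("tap", {"x": element["centerX"], "y": element["centerY"]})

    tap_label("Email")
    send("input", {"text": email})
    tap_label("Password")
    send("input", {"text": password})
    tap_label("Login")
    return send("describe", {"operation": "all"})  # caller verifies the screen


log = []


def fake_call(request: dict) -> dict:
    """Simulated dispatcher: records operations and returns a fixed tree."""
    log.append(request["operation"])
    if request["operation"] == "describe":
        return {"elements": [
            {"label": "Email", "centerX": 187, "centerY": 322},
            {"label": "Password", "centerX": 187, "centerY": 372},
            {"label": "Login", "centerX": 187, "centerY": 425},
        ]}
    return {}


login_flow(fake_call, "user@example.com", "example-password")
print(log)
```

Injecting the dispatcher keeps the flow testable offline; a real run would pass a function that forwards each payload to the MCP tool.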
Button navigation:
1. describe → Get all buttons
2. find-element → Search for specific button
3. tap → Execute tap
4. describe → Verify navigation
Form filling:
1. describe → Get all text fields
2. For each field:
- tap → Focus field
- input → Enter text
- input key:return → Next field
3. describe → Find submit button
4. tap → Submit
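The per-field loop above maps naturally onto a function that turns a describe response into an ordered list of payloads. A minimal Python sketch; form_fill_requests is a hypothetical helper, and matching fields to values by label is an assumption about how the form is identified.

```python
def form_fill_requests(fields: list[dict], values: dict[str, str]) -> list[dict]:
    """Return tap, input-text, and input-return payloads for each text field
    whose label has a value to enter."""
    requests = []
    for field in fields:
        label = field.get("label")
        if field.get("type") != "TextField" or label not in values:
            continue
        requests.append({"operation": "tap", "target": "booted",
                         "parameters": {"x": field["centerX"],
                                        "y": field["centerY"]}})
        requests.append({"operation": "input", "target": "booted",
                         "parameters": {"text": values[label]}})
        requests.append({"operation": "input", "target": "booted",
                         "parameters": {"key": "return"}})
    return requests


fields = [
    {"label": "Email", "type": "TextField", "centerX": 187, "centerY": 322},
    {"label": "Login", "type": "Button", "centerX": 187, "centerY": 425},
]
for request in form_fill_requests(fields, {"Email": "user@example.com"}):
    print(request)
```

The submit step is left to the caller: after sending these payloads, run describe again, find the submit button, and tap it.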
Scroll to find an element:
1. describe → Check if element visible
2. If not visible:
- gesture (swipe up) → Scroll
- describe → Check again
3. find-element → Locate target
4. tap → Interact
{
"operation": "gesture",
"target": "booted",
"parameters": {
"gesture_type": "swipe",
"direction": "up",
"duration": 200
}
}
Directions: up, down, left, right
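The describe, swipe, describe loop needs an upper bound so it cannot scroll forever when the element does not exist. A minimal Python sketch with a capped number of swipes; call is a hypothetical stand-in for the execute_idb_command invocation, max_swipes is an arbitrary default, and fake_call simulates a list whose target appears after two swipes.

```python
def scroll_until_visible(call, label, max_swipes=5):
    """Swipe up until a visible element with this label appears, or give up."""
    swipe = {
        "operation": "gesture",
        "target": "booted",
        "parameters": {"gesture_type": "swipe", "direction": "up",
                       "duration": 200},
    }
    for attempt in range(max_swipes + 1):
        tree = call({"operation": "describe", "target": "booted"})
        for el in tree.get("elements", []):
            if el.get("label") == label and el.get("visible", True):
                return el
        if attempt < max_swipes:
            call(swipe)
    return None


state = {"swipes": 0}


def fake_call(request):
    """Simulated dispatcher: the element becomes visible after two swipes."""
    if request["operation"] == "gesture":
        state["swipes"] += 1
        return {}
    return {"elements": [{"label": "Settings", "centerX": 187, "centerY": 600,
                          "visible": state["swipes"] >= 2}]}


found = scroll_until_visible(fake_call, "Settings")
print(found, state["swipes"])
```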
{
"operation": "gesture",
"target": "booted",
"parameters": {
"gesture_type": "button",
"button": "HOME"
}
}
Buttons: HOME, LOCK, SIRI, SIDE_BUTTON, APPLE_PAY, SCREENSHOT, APP_SWITCH
Only use screenshots if:
- Accessibility quality is "poor", as reported by check-accessibility:
  { "operation": "check-accessibility", "target": "booted" }
- Visual verification is needed
- The element is not in the accessibility tree
For everything else, use the accessibility tree.
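The fallback rule reduces to a single boolean test. A minimal Python sketch; should_use_screenshot is a hypothetical helper, and passing the quality as a plain string such as "poor" is an assumption, since the check-accessibility response format is not shown here.

```python
def should_use_screenshot(accessibility_quality: str,
                          needs_visual_check: bool = False,
                          element_in_tree: bool = True) -> bool:
    """Return True only when one of the three fallback conditions holds."""
    return (accessibility_quality == "poor"
            or needs_visual_check
            or not element_in_tree)


print(should_use_screenshot("good"))                         # False
print(should_use_screenshot("poor"))                         # True
print(should_use_screenshot("good", element_in_tree=False))  # True
```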
Problem: find-element returns no results
Solutions:
- Run describe to see all elements
Problem: Tap executes but nothing happens
Solutions:
- Check that the element has enabled: true
- Check that the element has visible: true
- Tap the element's center coordinates (centerX, centerY)
Problem: Text input not appearing
Solutions:
- Try sending keyboard keys (return, delete)
If using screenshots with idb-ui-tap, coordinates may need scaling:
{
"operation": "tap",
"target": "booted",
"parameters": {
"x": 187,
"y": 425,
"applyScreenshotScale": true,
"screenshotScaleX": 0.5,
"screenshotScaleY": 0.5
}
}
But with accessibility-first, this is rarely needed.
Use describe with point coordinates for specific regions.
This skill works with the execute_idb_command tool and these resources:
- xc://operations/idb: Complete IDB operations reference
- xc://reference/accessibility: Accessibility tree structure guide
- xc://workflows/accessibility-first: This workflow pattern
Remember: accessibility tree first, screenshots last. It is faster, cheaper, and more reliable.