Quickstart
Before You Start
This page assumes a working Clawperator installation: CLI installed, device
connected, Operator APK installed, and clawperator doctor --json returning
"criticalOk": true.
If you have not reached that state yet, complete Setup first.
If you are choosing between runtime-skill discovery, MCP, and direct CLI automation after install, read Host Agent Orientation first. This page starts after that choice and focuses on direct observe / decide / act execution.
The Automation Loop
Clawperator is a deterministic actuator. The agent reasons and decides; Clawperator executes and returns structured data. Every automation follows the same three-step loop:
Observe -> Decide -> Act
- Observe - take a UI snapshot to read the current device state
- Decide - parse the XML hierarchy to find the right node to target
- Act - send an execution payload with the next action and wait for the result
Repeat until the task is done.
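The loop can be sketched as a small driver. This is an illustrative skeleton, not part of Clawperator: run_loop and its observe, decide, and act callables are hypothetical names, and the iteration cap is an assumed safety limit.

```python
from typing import Callable, Optional


def run_loop(
    observe: Callable[[], str],
    decide: Callable[[str], Optional[dict]],
    act: Callable[[dict], None],
    max_iterations: int = 10,
) -> int:
    """Run observe -> decide -> act until decide() returns None.

    Returns the number of payloads that were executed.
    """
    executed = 0
    for _ in range(max_iterations):
        xml = observe()        # Observe: current UI hierarchy as XML text
        payload = decide(xml)  # Decide: next payload, or None when the task is done
        if payload is None:
            return executed
        act(payload)           # Act: send the payload and wait for the result
        executed += 1
    raise RuntimeError("task did not finish within max_iterations")
```

Keeping the three steps as separate callables keeps the reasoning (decide) in the agent and the device work (observe, act) behind the CLI.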
Step 1: Observe
The canonical observation action is snapshot_ui. Run it with the built-in
snapshot command:
clawperator snapshot --json --device <device_serial>
On success, the result envelope contains the XML hierarchy in
envelope.stepResults[0].data.text.
Success conditions to check before proceeding:
{
"envelope": {
"status": "success",
"stepResults": [
{
"actionType": "snapshot_ui",
"success": true,
"data": {
"text": "<?xml version='1.0' encoding='UTF-8' standalone='yes' ?>\n<hierarchy rotation=\"0\">...</hierarchy>"
}
}
]
}
}
Branch rule: only proceed when envelope.status == "success" and
envelope.stepResults[0].success == true. If data.text is missing, the Node layer
converts the step to a SNAPSHOT_EXTRACTION_FAILED error. See Snapshot Format
for recovery.
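A minimal sketch of this branch rule on the agent side, assuming the JSON printed by the snapshot command has the shape shown above. SnapshotError and extract_snapshot_xml are hypothetical helper names, not part of Clawperator.

```python
import json


class SnapshotError(Exception):
    """Raised when a snapshot result does not meet the success conditions."""


def extract_snapshot_xml(raw_json: str) -> str:
    """Return the XML hierarchy from a snapshot result, or raise SnapshotError."""
    result = json.loads(raw_json)
    envelope = result.get("envelope", {})
    if envelope.get("status") != "success":
        raise SnapshotError(f"envelope status: {envelope.get('status')!r}")
    steps = envelope.get("stepResults") or []
    if not steps or steps[0].get("success") is not True:
        raise SnapshotError("first step did not succeed")
    text = (steps[0].get("data") or {}).get("text")
    if not text:
        raise SnapshotError("data.text is missing")
    return text
```

Raising on every unmet condition keeps the Decide step from parsing a stale or empty hierarchy.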
Step 2: Decide
Parse the XML to find the node you want to act on. The key attributes for targeting are:
| Attribute | Use for |
|---|---|
| text | Matching visible label text |
| resource-id | Stable structural identifiers like android:id/title |
| content-desc | Accessibility labels when text is empty |
| clickable | Only clickable="true" nodes can be tapped directly |
| scrollable | Identifies containers to target for scroll actions |
| bounds | Screen coordinates in [x1,y1][x2,y2] form |
For example, in a Settings hierarchy the "Connections" row appears as:
<!-- The row container: clickable="true", no resource-id -->
<node
text=""
resource-id=""
class="android.widget.LinearLayout"
clickable="true"
enabled="true"
bounds="[30,1290][1050,1499]">
<!-- Title: clickable="false", resource-id="android:id/title" -->
<node
text="Connections"
resource-id="android:id/title"
class="android.widget.TextView"
clickable="false"
bounds="[216,1332][507,1402]" />
<!-- Summary: shows current sub-settings -->
<node
text="Wi-Fi • Bluetooth • SIM manager"
resource-id="android:id/summary"
class="android.widget.TextView"
clickable="false"
bounds="[216,1402][816,1457]" />
</node>
To tap this row, target the child title text. Clawperator will resolve the
nearest clickable ancestor if the matched node is not itself clickable.
Selector to use:
{
"textEquals": "Connections",
"resourceId": "android:id/title"
}
See Selectors for the full NodeMatcher contract.
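To check a selector against a snapshot before sending it, an agent can run the same lookup locally. The sketch below uses Python's standard xml.etree.ElementTree. find_click_target is a hypothetical helper that approximates the clickable-ancestor resolution described above. It is not Clawperator's own implementation.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def find_click_target(xml_text: str, text: str, resource_id: str) -> Optional[ET.Element]:
    """Find a node matching text and resource-id, then return that node or
    its nearest clickable ancestor. Returns None when nothing usable matches."""
    root = ET.fromstring(xml_text)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    for node in root.iter("node"):
        if node.get("text") != text or node.get("resource-id") != resource_id:
            continue
        current = node
        while current is not None:
            if current.get("clickable") == "true":
                return current
            current = parents.get(current)
    return None
```

For the Connections example, the title node matches and the function returns the enclosing LinearLayout row, whose bounds cover the full tappable area.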
Step 3: Act
Send an execution payload via clawperator exec --json. The payload lists one
or more actions in sequence. Clawperator dispatches them in order and returns a
single result envelope.
Example - click the "Connections" row, wait for navigation, then take a snapshot:
{
"commandId": "nav-to-connections",
"taskId": "nav-to-connections",
"source": "agent-loop",
"expectedFormat": "android-ui-automator",
"timeoutMs": 15000,
"actions": [
{
"id": "click-1",
"type": "click",
"params": {
"matcher": {
"textEquals": "Connections",
"resourceId": "android:id/title"
}
}
},
{
"id": "wait-1",
"type": "wait_for_navigation",
"params": {
"expectedPackage": "com.android.settings",
"timeoutMs": 5000
}
},
{
"id": "snap-2",
"type": "snapshot_ui"
}
],
"mode": "direct"
}
Save this as payload.json and run:
clawperator exec --json --device <device_serial> < payload.json
Success conditions:
- envelope.status == "success"
- envelope.stepResults has three entries, all with success: true
- envelope.stepResults[2].data.text contains the Connections screen hierarchy
Branch rule: if any step fails, the envelope still returns but the failed
step has success: false. Check envelope.stepResults[i].success individually,
not just the top-level status. See Errors for recovery by code.
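A sketch of running a payload and applying this branch rule from Python. run_payload and first_failed_step are hypothetical helper names. The code assumes the result envelope is written to stdout as a single JSON document, and it does not inspect the process exit code because this page does not specify it.

```python
import json
import subprocess
from typing import Optional


def run_payload(payload: dict, device_serial: str) -> dict:
    """Pipe the payload to `clawperator exec --json` on stdin and parse stdout."""
    completed = subprocess.run(
        ["clawperator", "exec", "--json", "--device", device_serial],
        input=json.dumps(payload),
        capture_output=True,
        text=True,
    )
    # Assumption: stdout holds exactly one JSON result envelope.
    return json.loads(completed.stdout)


def first_failed_step(result: dict) -> Optional[dict]:
    """Return the first step result whose success is not true, or None."""
    steps = result.get("envelope", {}).get("stepResults", [])
    for index, step in enumerate(steps):
        if step.get("success") is not True:
            return {"index": index, **step}
    return None
```

A caller would treat the run as successful only when the top-level status is "success" and first_failed_step returns None.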
Putting It Together
A complete agent sequence for reading the Android version:
1. Pre-flight
clawperator doctor --json --device <device_serial>
# Require: "criticalOk": true
2. Open Settings and observe
{
"commandId": "open-settings",
"taskId": "open-settings",
"source": "agent-loop",
"expectedFormat": "android-ui-automator",
"timeoutMs": 20000,
"actions": [
{
"id": "open-1",
"type": "open_app",
"params": { "applicationId": "com.android.settings" }
},
{
"id": "wait-1",
"type": "wait_for_navigation",
"params": { "expectedPackage": "com.android.settings", "timeoutMs": 5000 }
},
{
"id": "snap-1",
"type": "snapshot_ui"
}
],
"mode": "direct"
}
3. Read snapshot, find "About phone"
Parse stepResults[2].data.text. Search for a node with
resource-id="android:id/title" and text="About phone". If it is not
visible, scroll down the recycler_view and take another snapshot.
Scroll payload (use when the target row is below the visible area):
{
"commandId": "scroll-settings",
"taskId": "scroll-settings",
"source": "agent-loop",
"expectedFormat": "android-ui-automator",
"timeoutMs": 10000,
"actions": [
{
"id": "scroll-1",
"type": "scroll",
"params": {
"direction": "down",
"container": { "resourceId": "com.android.settings:id/recycler_view" }
}
},
{
"id": "snap-2",
"type": "snapshot_ui"
}
],
"mode": "direct"
}
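The scroll-and-check cycle can be bounded so the agent does not scroll forever. In this sketch, scroll_until_visible, is_visible, and scroll_and_snapshot are hypothetical names. The cap of five scrolls and the rule "an unchanged hierarchy means the end of the list" are assumptions, not documented Clawperator behavior.

```python
from typing import Callable, Optional

# The two actions from the scroll payload above.
SCROLL_ACTIONS = [
    {
        "id": "scroll-1",
        "type": "scroll",
        "params": {
            "direction": "down",
            "container": {"resourceId": "com.android.settings:id/recycler_view"},
        },
    },
    {"id": "snap-2", "type": "snapshot_ui"},
]


def scroll_until_visible(
    xml_text: str,
    is_visible: Callable[[str], bool],
    scroll_and_snapshot: Callable[[list], str],
    max_scrolls: int = 5,
) -> Optional[str]:
    """Return the first snapshot in which the target is visible, else None."""
    for _ in range(max_scrolls):
        if is_visible(xml_text):
            return xml_text
        next_xml = scroll_and_snapshot(SCROLL_ACTIONS)
        if next_xml == xml_text:
            return None  # Assumption: unchanged hierarchy means end of list
        xml_text = next_xml
    return xml_text if is_visible(xml_text) else None
```

Here scroll_and_snapshot stands for whatever function wraps the actions in a full payload, runs it, and returns the new data.text.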
4. Navigate and read the value
Once "About phone" is visible, click it and snapshot the next screen to find the "Android version" row:
{
"commandId": "read-android-version",
"taskId": "read-android-version",
"source": "agent-loop",
"expectedFormat": "android-ui-automator",
"timeoutMs": 20000,
"actions": [
{
"id": "click-about",
"type": "click",
"params": {
"matcher": { "textEquals": "About phone", "resourceId": "android:id/title" }
}
},
{
"id": "wait-about",
"type": "wait_for_navigation",
"params": { "expectedPackage": "com.android.settings", "timeoutMs": 5000 }
},
{
"id": "snap-3",
"type": "snapshot_ui"
}
],
"mode": "direct"
}
Parse stepResults[2].data.text for the "Android version" row. The value
appears as the text attribute of the android:id/summary node that sits next to
the android:id/title node with text="Android version".
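A sketch of that final read, assuming the title and summary nodes share a parent row as in the Connections example earlier on this page. read_summary_for_title is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def read_summary_for_title(xml_text: str, title: str) -> Optional[str]:
    """Return the summary text that shares a parent with the given title node."""
    root = ET.fromstring(xml_text)
    for parent in root.iter("node"):
        children = list(parent)
        has_title = any(
            child.get("resource-id") == "android:id/title"
            and child.get("text") == title
            for child in children
        )
        if not has_title:
            continue
        # Assumption: the value lives in a sibling android:id/summary node.
        for child in children:
            if child.get("resource-id") == "android:id/summary":
                return child.get("text")
    return None
```

Calling read_summary_for_title(xml, "Android version") on the About phone snapshot returns the version string, or None if the row layout differs on the device.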
What to Read Next
| Topic | Page |
|---|---|
| All action types and parameters | Actions |
| Selector syntax for node matching | Selectors |
| Snapshot XML format and annotated example | Snapshot Format |
| Full result envelope contract | API Overview |
| Error codes and recovery steps | Errors |
| Scrolling and multi-step navigation patterns | Navigation Patterns |
| Environment variable controls | Environment Variables |