Agent Quickstart
This is the fastest path for a cold-start agent to go from "Clawperator is installed" to "I ran a real command and know how to inspect the result."
Use this page first, then branch into:
- Clawperator Snapshot Format for the exact snapshot_ui output structure
- Clawperator Node API - Agent Guide for the full contract
- API Overview for a shorter reference path
- Multi-Device Workflows for Agents if more than one Android target is connected
What Clawperator is
Clawperator is the hand. Your external agent is the brain.
- the brain decides what to do on the user's behalf
- Clawperator executes validated Android UI actions deterministically
- Clawperator returns structured results that the brain can reason over
Clawperator is not a planner or a policy engine. It does not decide which user inputs are appropriate. It just provides the interaction primitives.
Before you start
Make sure the runtime is ready:
clawperator doctor --output json
clawperator devices --output json
If multiple devices are connected, always pass --device-id <device_id> so targeting stays explicit.
If the current CLI behavior and a narrative doc ever seem to disagree, prefer subcommand help for the exact shipped flags and usage:
clawperator observe snapshot --help
clawperator observe screenshot --help
clawperator skills compile-artifact --help
clawperator skills run --help
clawperator doctor --help
Step 1 - Take a snapshot
The quickest way to inspect the current UI is:
clawperator observe snapshot --device-id <device_id> --output json
Successful snapshot output includes:
- one result envelope
- stepResults[0].actionType = "snapshot_ui"
- stepResults[0].data.actual_format = "hierarchy_xml"
- optional snapshot metadata such as foreground_package, has_overlay, and window_count
- stepResults[0].data.text containing the XML hierarchy
Example response shape:
{
  "ok": true,
  "envelope": {
    "commandId": "cmd-snap-1",
    "taskId": "task-snap-1",
    "status": "success",
    "stepResults": [
      {
        "id": "snap1",
        "actionType": "snapshot_ui",
        "success": true,
        "data": {
          "actual_format": "hierarchy_xml",
          "foreground_package": "com.android.settings",
          "has_overlay": "false",
          "window_count": "2",
          "text": "<?xml version=\"1.0\" encoding=\"UTF-8\"?><hierarchy rotation=\"0\">...</hierarchy>"
        }
      }
    ],
    "error": null
  },
  "deviceId": "<device_id>",
  "terminalSource": "clawperator_result"
}
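As a minimal sketch of reading that shape, the Python helper below pulls the snapshot fields out of an already-parsed response. The function name snapshot_metadata is hypothetical, and the string-to-bool and string-to-int conversions assume the metadata arrives as strings, as it does in the example above. Check the Clawperator Snapshot Format page before relying on that.

```python
from typing import Any, Dict


def snapshot_metadata(response: Dict[str, Any]) -> Dict[str, Any]:
    """Extract snapshot fields from a parsed `observe snapshot` response.

    Hypothetical helper: the field names come from the example response
    shape, but the type conversions below are assumptions.
    """
    step = response["envelope"]["stepResults"][0]
    data = step.get("data") or {}
    overlay = data.get("has_overlay")
    windows = data.get("window_count")
    return {
        "action_type": step.get("actionType"),
        "actual_format": data.get("actual_format"),
        "foreground_package": data.get("foreground_package"),
        # The example shows "false" and "2" as strings, so convert them;
        # metadata is optional, so missing keys stay None.
        "has_overlay": None if overlay is None else str(overlay).lower() == "true",
        "window_count": None if windows is None else int(windows),
        "xml": data.get("text"),
    }
```

Missing metadata keys come back as None rather than raising, which matches the "optional" wording in the list above.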
Read the XML in data.text to identify:
- stable resource-id values
- visible labels in text
- icon labels in content-desc
- scroll containers via scrollable="true"
The full parsing contract lives in Clawperator Snapshot Format.
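The four lookups above can be sketched with Python's standard xml.etree.ElementTree. This is an illustration, not the parsing contract: summarize_hierarchy is a hypothetical name, and the code walks every element instead of assuming a particular tag name for UI nodes.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def summarize_hierarchy(xml_text: str) -> Dict[str, List[str]]:
    """Collect the attributes an agent typically targets from data.text.

    Hypothetical helper; only the attribute names (resource-id, text,
    content-desc, scrollable) come from the quickstart text.
    """
    root = ET.fromstring(xml_text)
    summary: Dict[str, List[str]] = {
        "resource_ids": [],
        "labels": [],
        "icon_labels": [],
        "scroll_containers": [],
    }
    for element in root.iter():
        resource_id = element.get("resource-id") or ""
        if resource_id and resource_id not in summary["resource_ids"]:
            summary["resource_ids"].append(resource_id)
        if element.get("text"):
            summary["labels"].append(element.get("text"))
        if element.get("content-desc"):
            summary["icon_labels"].append(element.get("content-desc"))
        if element.get("scrollable") == "true":
            # Record the container's resource-id, or "" when it has none.
            summary["scroll_containers"].append(resource_id)
    return summary
```

Empty attribute values are skipped so the summary only lists selectors that can actually be matched.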
Step 2 - Run a first execution payload
The smallest useful first execution usually opens an app, gives it a moment to settle, then snapshots the result.
Example payload:
{
  "commandId": "quickstart-001",
  "taskId": "quickstart-001",
  "source": "cold-start-agent",
  "expectedFormat": "android-ui-automator",
  "timeoutMs": 15000,
  "actions": [
    {
      "id": "open-settings",
      "type": "open_app",
      "params": { "applicationId": "com.android.settings" }
    },
    {
      "id": "settle",
      "type": "sleep",
      "params": { "durationMs": 1000 }
    },
    {
      "id": "snap-settings",
      "type": "snapshot_ui"
    }
  ]
}
Run it with:
clawperator execute --device-id <device_id> --execution /path/to/execution.json --output json
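If the agent is driving the CLI from Python, the same payload and command can be assembled programmatically. The sketch below is illustrative only: the helper names are hypothetical, the flags are exactly those shown above, and run_execution assumes that --output json prints a single JSON document on stdout.

```python
import json
import subprocess
from pathlib import Path
from typing import Any, Dict, List


def build_open_and_snapshot(application_id: str, command_id: str) -> Dict[str, Any]:
    """Build the open_app -> sleep -> snapshot_ui payload from the example."""
    return {
        "commandId": command_id,
        "taskId": command_id,
        "source": "cold-start-agent",
        "expectedFormat": "android-ui-automator",
        "timeoutMs": 15000,
        "actions": [
            {"id": "open-app", "type": "open_app",
             "params": {"applicationId": application_id}},
            {"id": "settle", "type": "sleep", "params": {"durationMs": 1000}},
            {"id": "snap", "type": "snapshot_ui"},
        ],
    }


def write_execution(payload: Dict[str, Any], path: Path) -> Path:
    """Write the payload to disk so it can be passed via --execution."""
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return path


def execute_argv(device_id: str, execution_path: Path) -> List[str]:
    """Mirror the documented command line as an argument list."""
    return ["clawperator", "execute", "--device-id", device_id,
            "--execution", str(execution_path), "--output", "json"]


def run_execution(device_id: str, execution_path: Path) -> Dict[str, Any]:
    """Run the CLI and parse stdout (assumed to be one JSON document)."""
    completed = subprocess.run(execute_argv(device_id, execution_path),
                               capture_output=True, text=True, check=False)
    return json.loads(completed.stdout)
```

Passing the command as an argument list avoids shell quoting issues with device IDs and file paths.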
Step 3 - Read the result envelope correctly
Always check both:
- envelope.status
- each stepResults[n].success
Those mean different things:
- envelope.status = "failed" means the execution as a whole failed
- stepResults[n].success = false means one step failed, even if the overall execution still completed
That distinction matters for actions like close_app, where the overall execution may complete even though the step reports a per-step failure.
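One way to apply both checks in code is sketched below. summarize_result is a hypothetical helper, and it assumes "success" is the status value for a completed execution, as in the snapshot example response.

```python
from typing import Any, Dict


def summarize_result(response: Dict[str, Any]) -> Dict[str, Any]:
    """Report execution-level and step-level outcomes separately.

    Hypothetical helper built on the envelope fields named in this page.
    """
    envelope = response.get("envelope") or {}
    steps = envelope.get("stepResults") or []
    # Treat anything other than an explicit True as a failed step.
    failed_step_ids = [step.get("id") for step in steps
                       if step.get("success") is not True]
    execution_ok = envelope.get("status") == "success"
    return {
        "execution_ok": execution_ok,
        "failed_step_ids": failed_step_ids,
        "fully_ok": execution_ok and not failed_step_ids,
    }
```

Keeping execution_ok and failed_step_ids separate lets the agent decide per action whether a step failure is fatal or, as with close_app, tolerable.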
Step 4 - Use the default agent loop
For unknown apps, use a single-action plus re-observe loop:
- snapshot
- decide what to press or read
- execute one action or one tight sequence
- snapshot again
- repeat
This is the safe default for cold-start usage. Use larger multi-action payloads only after the UI path is already known.
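The loop can be sketched in Python with the three stages passed in as callables, so the control flow stays independent of how the CLI is invoked. All names here are hypothetical. In practice, snapshot and execute would wrap the clawperator commands shown earlier, and decide is the agent's own logic.

```python
from typing import Any, Callable, List, Optional, Tuple


def run_agent_loop(
    snapshot: Callable[[], Any],
    decide: Callable[[Any], Optional[Any]],
    execute: Callable[[Any], Any],
    max_steps: int = 10,
) -> List[Tuple[Any, Any, Any]]:
    """Snapshot, decide, execute one action, then observe again.

    Stops when decide() returns None or when max_steps is reached.
    Returns (observation, action, result) for every executed step.
    """
    history: List[Tuple[Any, Any, Any]] = []
    for _ in range(max_steps):
        observation = snapshot()      # always re-observe before acting
        action = decide(observation)  # the brain picks the next action
        if action is None:            # nothing left to do
            break
        result = execute(action)      # the hand performs it
        history.append((observation, action, result))
    return history
```

The max_steps cap is a safety choice for unknown apps: it bounds the run if the UI never reaches a state the agent recognizes as done.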