Snapshot Format
Purpose
Define what snapshot_ui returns, where the XML hierarchy is attached in the result envelope, what extraction failures look like, and what parts of the snapshot contract an agent can rely on.
Sources
- Snapshot extraction: apps/node/src/domain/executions/snapshotHelper.ts
- Snapshot post-processing: apps/node/src/domain/executions/runExecution.ts
- Hard limits: apps/node/src/contracts/limits.ts
- Snapshot builder: apps/node/src/domain/observe/snapshot.ts
- Action contract summary: docs/api/actions.md
What snapshot_ui Returns
snapshot_ui is the canonical read-only UI observation action. The Android runtime writes the hierarchy dump to logcat, then the Node layer extracts the XML and attaches it to the successful step result as data.text.
The built-in clawperator snapshot command constructs a one-step execution with these exact defaults:
- source: "clawperator-observe"
- expectedFormat: "android-ui-automator"
- timeoutMs: 30000 when buildSnapshotExecution() is called without an override
- one action with id: "snap" and type: "snapshot_ui"
- mode: "direct"
- commandId is generated as snapshot-${Date.now()}-${Math.random().toString(36).slice(2, 9)}
- taskId equals the generated commandId
For CLI snapshot --json, machine-checkable success means:
- exit code 0
- top-level JSON has envelope
- envelope.status == "success"
- envelope.stepResults[0].actionType == "snapshot_ui"
- envelope.stepResults[0].success == true
- envelope.stepResults[0].data.text is present
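The checklist above can be folded into a single predicate. The following is a minimal sketch, not part of the Clawperator codebase: the type names and the isSnapshotSuccess function are hypothetical, and the types model only the fields named in the checklist.

```typescript
// Hypothetical types: only the fields named in the success checklist are modeled.
interface StepResult {
  actionType?: string;
  success?: boolean;
  data?: { text?: string; [key: string]: unknown };
}

interface CliOutput {
  envelope?: { status?: string; stepResults?: StepResult[] };
}

// Returns true only when every machine-checkable success condition holds.
function isSnapshotSuccess(exitCode: number, stdout: string): boolean {
  if (exitCode !== 0) return false;

  let parsed: CliOutput | null;
  try {
    parsed = JSON.parse(stdout) as CliOutput | null;
  } catch {
    return false; // stdout was not valid JSON
  }

  const step = parsed?.envelope?.stepResults?.[0];
  return (
    parsed?.envelope?.status === "success" &&
    step?.actionType === "snapshot_ui" &&
    step?.success === true &&
    typeof step?.data?.text === "string"
  );
}
```

A caller would pass the process exit code and the captured stdout of clawperator snapshot --json, and treat a false result as a reason to inspect the step's data.error.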
Example one-step payload from the clawperator snapshot builder:
{
"commandId": "snapshot-1700000000000-abcd123",
"taskId": "snapshot-1700000000000-abcd123",
"source": "clawperator-observe",
"expectedFormat": "android-ui-automator",
"timeoutMs": 30000,
"actions": [
{
"id": "snap",
"type": "snapshot_ui"
}
],
"mode": "direct"
}
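For readers who want to see how such a payload could be assembled, here is a hedged sketch based only on the defaults listed above. The function name sketchSnapshotExecution, its timeoutMs parameter, and the SnapshotExecution type are hypothetical illustrations; the real buildSnapshotExecution() in snapshot.ts may have a different signature.

```typescript
// Hypothetical shape mirroring the example payload above.
interface SnapshotExecution {
  commandId: string;
  taskId: string;
  source: string;
  expectedFormat: string;
  timeoutMs: number;
  actions: { id: string; type: string }[];
  mode: string;
}

// Illustrative stand-in for the real builder; the parameter is an assumption.
function sketchSnapshotExecution(timeoutMs: number = 30000): SnapshotExecution {
  // Same id recipe as documented: timestamp plus a short random base-36 suffix.
  const commandId = `snapshot-${Date.now()}-${Math.random().toString(36).slice(2, 9)}`;
  return {
    commandId,
    taskId: commandId, // taskId equals the generated commandId
    source: "clawperator-observe",
    expectedFormat: "android-ui-automator",
    timeoutMs,
    actions: [{ id: "snap", type: "snapshot_ui" }],
    mode: "direct",
  };
}
```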
How Snapshot Data Flows
The current flow is:
- Android executes snapshot_ui.
- Android writes the hierarchy dump into logcat lines that include the marker [TaskScope] UI Hierarchy:.
- Node reads those logcat lines after execution.
- extractSnapshotsFromLogs() reconstructs one or more XML documents from the log stream.
- attachSnapshotsToStepResults() walks backward through successful snapshot_ui steps and attaches the extracted XML as stepResults[i].data.text.
- markExtractionFailedSnapshotSteps() converts any still-successful snapshot step with missing data.text into a failed step with data.error = "SNAPSHOT_EXTRACTION_FAILED".
- addSettleWarnings() may attach data.warn if the snapshot action immediately follows click or scroll_and_click.
Debugging details that matter when extraction goes wrong:
- runExecution() clears logcat before dispatch with adb logcat -c
- after the execution finishes, Node dumps logcat with adb logcat -d -v tag
- snapshotHelper.ts only extracts blocks from lines containing [TaskScope] UI Hierarchy:
Important boundaries:
- Node does not parse the XML into a typed object. It treats the hierarchy as opaque text.
- When multiple snapshots exist in one execution, Node attaches the most recent extracted snapshot to the most recent successful snapshot_ui step, walking backward through both lists.
- If no successful snapshot_ui steps exist, extraction output is ignored.
- Node only reads logcat for snapshot extraction when the result envelope already contains at least one snapshot_ui step.
- For direct snapshot executions like clawperator snapshot, the step id matches the action id ("snap").
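The backward pairing and the follow-up failure marking can be illustrated with a short sketch. This is an approximation written from the description above, not the code in snapshotHelper.ts: the names attachBackward and markMissing are hypothetical, and the sketch mutates its input for brevity, which the real helpers may not do.

```typescript
// Hypothetical step shape; data is simplified to string values.
interface StepResult {
  id: string;
  actionType: string;
  success: boolean;
  data: Record<string, string>;
}

// Pair the newest snapshot with the newest successful snapshot_ui step,
// then keep moving backward through both lists.
function attachBackward(steps: StepResult[], snapshots: string[]): void {
  let snapIdx = snapshots.length - 1;
  for (let i = steps.length - 1; i >= 0 && snapIdx >= 0; i--) {
    const step = steps[i];
    const xml = snapshots[snapIdx];
    if (step && xml !== undefined && step.actionType === "snapshot_ui" && step.success) {
      step.data.text = xml;
      snapIdx--;
    }
  }
}

// Any snapshot step that is still successful but has no text becomes a failure.
function markMissing(steps: StepResult[]): void {
  for (const step of steps) {
    if (step.actionType === "snapshot_ui" && step.success && step.data.text === undefined) {
      step.success = false;
      step.data.error = "SNAPSHOT_EXTRACTION_FAILED";
    }
  }
}
```

With two successful snapshot_ui steps and only one extracted document, the later step receives the XML and the earlier one is rewritten as a failure.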
Envelope Placement
Successful snapshot_ui data lives inside the step result, not in a separate top-level field:
{
"envelope": {
"commandId": "snapshot-1700000000000-abcd123",
"taskId": "snapshot-1700000000000-abcd123",
"status": "success",
"stepResults": [
{
"id": "snap",
"actionType": "snapshot_ui",
"success": true,
"data": {
"text": "<?xml version=\"1.0\" encoding=\"UTF-8\"?><hierarchy rotation=\"0\">...</hierarchy>"
}
}
],
"error": null
},
"deviceId": "<device_serial>",
"terminalSource": "clawperator_result",
"isCanonicalTerminal": true
}
Verification pattern - confirm the snapshot contract is active:
clawperator snapshot --json --device <device_serial>
Check these exact fields:
{
"envelope": {
"status": "success",
"stepResults": [
{
"actionType": "snapshot_ui",
"success": true,
"data": {
"text": "<?xml version=\"1.0\" encoding=\"UTF-8\"?><hierarchy rotation=\"0\">...</hierarchy>"
}
}
]
}
}
The XML Format
Node's contract is that data.text contains the raw XML hierarchy string. Node does not validate individual XML attributes, but the extracted content follows Android UI Automator style hierarchy dumps with a <hierarchy> root and nested <node> elements.
Typical node attributes visible in current snapshots include:
| XML attribute | Meaning | Related selector field |
|---|---|---|
| resource-id | Android resource id, often package:id/name | resourceId |
| text | visible text | textEquals, textContains |
| content-desc | accessibility label | contentDescEquals, contentDescContains |
| class | widget class name such as android.widget.TextView | none |
| bounds | screen rectangle in "[x1,y1][x2,y2]" form | none |
| package | package name for the node | none |
| clickable | "true" or "false" | none |
| enabled | "true" or "false" | none |
| scrollable | "true" or "false" | none |
Important limits on what to infer:
- data.text is the only Node-guaranteed snapshot success field today.
- NodeMatcher.role is a Clawperator selector concept documented in Selectors, not a direct XML attribute.
- A node appearing in the XML does not guarantee it is currently reachable on screen. Use bounds, scrolling, and follow-up actions to confirm reachability.
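Because bounds arrives as a string in "[x1,y1][x2,y2]" form, a consumer usually has to parse it before reasoning about position. The sketch below shows one way to do that; parseBounds, hasArea, and the Rect type are hypothetical names, and a non-empty rectangle is only a heuristic hint, not proof that a node is reachable.

```typescript
interface Rect {
  left: number;
  top: number;
  right: number;
  bottom: number;
}

// Parse a bounds attribute such as "[216,1503][661,1573]".
// Returns null when the string does not match the expected form.
function parseBounds(bounds: string): Rect | null {
  const m = /^\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]$/.exec(bounds);
  if (!m) return null;
  return {
    left: Number(m[1]),
    top: Number(m[2]),
    right: Number(m[3]),
    bottom: Number(m[4]),
  };
}

// Heuristic only: a zero-area rectangle cannot be tapped,
// but a non-empty one may still be covered or off screen.
function hasArea(rect: Rect): boolean {
  return rect.right > rect.left && rect.bottom > rect.top;
}
```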
Current runtime note:
- Android-side code currently also emits keys such as actual_format, foreground_package, has_overlay, overlay_package, and window_count
- those keys are not documented as Node-guaranteed success fields in the current Node contract
- agents should rely on data.text first and treat other snapshot metadata as opportunistic runtime data
Realistic XML Fragment
<?xml version="1.0" encoding="UTF-8"?>
<hierarchy rotation="0">
<node
index="0"
text=""
resource-id="com.android.settings:id/recycler_view"
class="androidx.recyclerview.widget.RecyclerView"
package="com.android.settings"
content-desc=""
clickable="false"
enabled="true"
scrollable="true"
bounds="[0,884][1080,2196]">
<node
index="0"
text="Connected devices"
resource-id="android:id/title"
class="android.widget.TextView"
package="com.android.settings"
content-desc=""
clickable="false"
enabled="true"
bounds="[216,1503][661,1573]" />
</node>
</hierarchy>
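Since Node hands the hierarchy over as opaque text, any attribute lookup happens on the consumer side. The following is a deliberately simple, regex-based sketch for quick inspection of fragments like the one above; listNodes is a hypothetical helper, it flattens the tree (parent/child structure is lost), and it does not decode XML entities. A real XML parser is the safer choice for anything beyond ad hoc checks.

```typescript
type NodeAttrs = Record<string, string>;

// Collect the attributes of every <node ...> tag, in document order.
// Assumes double-quoted attribute values with no raw ">" inside them.
function listNodes(xml: string): NodeAttrs[] {
  const nodes: NodeAttrs[] = [];
  const tagRe = /<node\b([^>]*)>/g;
  let tag: RegExpExecArray | null;
  while ((tag = tagRe.exec(xml)) !== null) {
    const attrs: NodeAttrs = {};
    const attrRe = /([\w-]+)="([^"]*)"/g;
    const body = tag[1] ?? "";
    let attr: RegExpExecArray | null;
    while ((attr = attrRe.exec(body)) !== null) {
      const name = attr[1];
      if (name !== undefined) attrs[name] = attr[2] ?? "";
    }
    nodes.push(attrs);
  }
  return nodes;
}
```

For example, filtering the result on attrs["resource-id"] === "android:id/title" would list the row titles visible in the snapshot.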
Annotated Live-Device Example
Full clawperator snapshot --json output from an Android emulator running
Android 15 (API 35) with the Settings app open. This example uses the emulator
because its results are reproducible: anyone can create and run the same virtual device.
Envelope
{
"envelope": {
"commandId": "snapshot-1774926121032-v6kvd37",
"taskId": "snapshot-1774926121032-v6kvd37",
"status": "success",
"stepResults": [
{
"id": "snap",
"actionType": "snapshot_ui",
"success": true,
"data": {
"actual_format": "hierarchy_xml",
"foreground_package": "com.android.settings",
"has_overlay": "false",
"window_count": "2",
"text": "<?xml version='1.0' encoding='UTF-8' standalone='yes' ?>\n<hierarchy rotation=\"0\">...</hierarchy>"
}
}
],
"error": null
},
"deviceId": "emulator-5554",
"terminalSource": "clawperator_result",
"isCanonicalTerminal": true
}
Fields to note:
- actual_format, foreground_package, has_overlay, and window_count appear in the data object alongside text. They are runtime-detail fields emitted by the Android side and are not part of the Node-guaranteed contract. An agent may read them opportunistically (for example, confirming foreground_package before proceeding), but must not depend on them being present in all environments or versions.
- terminalSource: "clawperator_result" and isCanonicalTerminal: true are outer-envelope fields added by the terminal output layer; they are not part of the envelope sub-object.
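One way to read such metadata opportunistically is to return a three-state answer, so that a missing field never turns into a false negative. This is a sketch under that assumption; foregroundPackage and isExpectedApp are hypothetical helper names.

```typescript
// Read foreground_package if the runtime happened to emit it.
function foregroundPackage(data: Record<string, unknown>): string | undefined {
  const value = data["foreground_package"];
  return typeof value === "string" && value.length > 0 ? value : undefined;
}

// true/false when the field is present, undefined when it is absent.
// Callers should fall back to inspecting data.text in the undefined case.
function isExpectedApp(data: Record<string, unknown>, expected: string): boolean | undefined {
  const pkg = foregroundPackage(data);
  return pkg === undefined ? undefined : pkg === expected;
}
```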
Annotated XML Fragment
The full hierarchy is trimmed to show the structurally important layers. Omitted attributes (checkable, checked, focused, long-clickable, password, selected) are present in real output but rarely useful for targeting.
This example was captured on an emulator running Android 15 with a 1080 x 2400 pixel display.
<?xml version='1.0' encoding='UTF-8' standalone='yes' ?>
<!--
rotation="0" - device is portrait. Bounds coordinates use portrait dimensions.
On this emulator: 1080 x 2400 pixels.
-->
<hierarchy rotation="0">
<!-- Main scrollable container for Settings homepage -->
<node
resource-id="com.android.settings:id/settings_homepage_container"
class="android.widget.ScrollView"
package="com.android.settings"
clickable="false"
enabled="true"
scrollable="true"
bounds="[0,136][1080,2337]">
<!-- Toolbar containing the Settings title -->
<node
resource-id="com.android.settings:id/settings_toolbar"
class="android.widget.LinearLayout"
package="com.android.settings"
clickable="false"
bounds="[0,136][1080,292]">
<!--
The visible screen title. Use text="Settings" to confirm the active screen.
clickable="false" - this is a label, not a tap target.
-->
<node
text="Settings"
resource-id="com.android.settings:id/action_bar"
class="android.widget.TextView"
package="com.android.settings"
clickable="false"
enabled="true"
bounds="[48,185][267,243]" />
<!--
Search button. Use resource-id to target.
clickable="true" - this is a tap target.
-->
<node
text=""
resource-id="com.android.settings:id/search_action_bar"
class="android.widget.Button"
package="com.android.settings"
content-desc="Search settings"
clickable="true"
enabled="true"
bounds="[912,136][1080,292]" />
</node>
<!--
Main content list container.
Use resourceId selector with value "com.android.settings:id/main_content".
-->
<node
resource-id="com.android.settings:id/main_content"
class="android.widget.LinearLayout"
package="com.android.settings"
clickable="false"
bounds="[0,292][1080,2337]">
<!--
A settings list row. The row container is clickable.
Target by the child title text.
-->
<node
text=""
resource-id=""
class="android.widget.LinearLayout"
package="com.android.settings"
clickable="true"
enabled="true"
bounds="[0,315][1080,483]">
<node
text="Network &amp; internet"
resource-id="android:id/title"
class="android.widget.TextView"
package="com.android.settings"
clickable="false"
enabled="true"
bounds="[144,351][579,447]" />
<!--
Summary text node showing current state.
resource-id="android:id/summary" shows current Wi-Fi status.
-->
<node
text="Wi-Fi"
resource-id="android:id/summary"
class="android.widget.TextView"
package="com.android.settings"
clickable="false"
enabled="true"
bounds="[144,351][936,447]" />
</node>
<!-- Second row follows the same pattern -->
<node
text=""
resource-id=""
class="android.widget.LinearLayout"
package="com.android.settings"
clickable="true"
enabled="true"
bounds="[0,483][1080,651]">
<node
text="Connected devices"
resource-id="android:id/title"
class="android.widget.TextView"
package="com.android.settings"
clickable="false"
enabled="true"
bounds="[144,519][621,615]" />
<node
text="Bluetooth"
resource-id="android:id/summary"
class="android.widget.TextView"
package="com.android.settings"
clickable="false"
enabled="true"
bounds="[144,519][936,615]" />
</node>
</node>
</node>
</hierarchy>
Targeting Patterns from This Example
| Goal | Selector approach |
|---|---|
| Confirm Settings is open | textEquals: "Settings" on a TextView |
| Tap a named settings row | resourceId: "android:id/title" + textEquals: "Network & internet" to locate, then click parent row |
| Read current state of a row | resourceId: "android:id/summary" + nearby textEquals for the row title |
| Tap the search button | contentDescEquals: "Search settings" or resourceId: "com.android.settings:id/search_action_bar" |
| Scroll the list | resourceId: "com.android.settings:id/settings_homepage_container" as scroll target |
Extraction Failure
If a snapshot_ui step initially succeeds but Node cannot attach data.text, Node rewrites that step into a failure:
{
"id": "snap",
"actionType": "snapshot_ui",
"success": false,
"data": {
"error": "SNAPSHOT_EXTRACTION_FAILED",
"message": "UI hierarchy extraction produced no output for this step. Check clawperator version compatibility and logcat extraction health."
}
}
This is not just a warning. It changes the step to success: false, and later envelope reconciliation can change the whole execution to status: "failed".
Typical recovery:
- Run clawperator version --check-compat.
- Run clawperator doctor --json.
- Re-run the snapshot with --verbose if you need to inspect log correlation and the [TaskScope] UI Hierarchy: marker.
Verification pattern - confirm extraction failure handling:
clawperator snapshot --json --device <device_serial>
If extraction failed, branch on:
{
"envelope": {
"status": "failed",
"stepResults": [
{
"actionType": "snapshot_ui",
"success": false,
"data": {
"error": "SNAPSHOT_EXTRACTION_FAILED"
}
}
]
}
}
Related error case:
- if the command never returns an envelope at all, the caller gets a top-level RESULT_ENVELOPE_TIMEOUT error instead of a snapshot step result
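A caller can branch on the step-level fields described in this section with a small classifier. The sketch below is an illustration only: SnapshotOutcome and classifySnapshotStep are hypothetical names, and the RESULT_ENVELOPE_TIMEOUT case is intentionally left out because it arrives as a top-level error rather than as a step result.

```typescript
type SnapshotOutcome =
  | { kind: "ok"; xml: string }
  | { kind: "extraction_failed" }
  | { kind: "other_failure"; error?: string };

// Hypothetical minimal step shape, limited to fields shown in this section.
interface SnapshotStep {
  success: boolean;
  data?: { text?: string; error?: string };
}

function classifySnapshotStep(step: SnapshotStep): SnapshotOutcome {
  if (step.success && typeof step.data?.text === "string") {
    return { kind: "ok", xml: step.data.text };
  }
  if (!step.success && step.data?.error === "SNAPSHOT_EXTRACTION_FAILED") {
    return { kind: "extraction_failed" };
  }
  // Anything else: a failed step with another error code, or an unexpected shape.
  return { kind: "other_failure", error: step.data?.error };
}
```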
Settle Warning
Node also adds a best-effort warning to successful snapshots when the immediately preceding action was click or scroll_and_click:
{
"warn": "snapshot captured without a preceding sleep step; UI may not have settled - consider adding a sleep step between click and snapshot_ui"
}
This warning appears only when:
- the snapshot step is successful
- Node can map the step id back to the original action order
- the previous action in that execution was click or scroll_and_click
Any intervening action such as sleep, wait_for_node, or read_text suppresses this warning because it may already provide settling time.
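The three conditions can be expressed as one check over the original action list. This sketch is written from the description above and is not the addSettleWarnings() implementation; needsSettleWarning and the Action type are hypothetical.

```typescript
interface Action {
  id: string;
  type: string;
}

const SETTLE_SENSITIVE = new Set(["click", "scroll_and_click"]);

// True when a successful snapshot step directly follows click or scroll_and_click.
function needsSettleWarning(actions: Action[], stepId: string, stepSucceeded: boolean): boolean {
  if (!stepSucceeded) return false;
  // Map the step id back to its position in the original action order.
  const idx = actions.findIndex((a) => a.id === stepId);
  if (idx <= 0) return false; // unknown id, or nothing precedes the step
  const current = actions[idx];
  const previous = actions[idx - 1];
  if (!current || !previous) return false;
  return current.type === "snapshot_ui" && SETTLE_SENSITIVE.has(previous.type);
}
```

Inserting a sleep action between the click and the snapshot changes the previous action's type, so the function returns false, matching the suppression rule described above.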
Snapshot Line Limit
LIMITS.MAX_SNAPSHOT_LINES is 2000.
This constant is defined in apps/node/src/contracts/limits.ts, but the current Node extraction path in snapshotHelper.ts does not actively clamp snapshots to 2000 lines. Treat it as a documented size-limit constant, not as a currently enforced truncation rule in extraction.
The same limits file also defines:
MAX_SNAPSHOT_BYTES = 262144
Operationally:
- do not assume Node will truncate at 2000 lines today
- do assume very large hierarchies are higher risk across extraction, payload handling, and downstream consumers
- keep large XML payloads within the documented size constants when possible
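Since extraction does not clamp today, a consumer that wants to respect these constants has to measure the payload itself. The sketch below copies the two values from this section; checkSnapshotSize is a hypothetical helper, and counting lines by newline and bytes as UTF-8 are assumptions, because the source does not state how the limits are meant to be measured.

```typescript
// Values copied from the limits described above.
const MAX_SNAPSHOT_LINES = 2000;
const MAX_SNAPSHOT_BYTES = 262144;

interface SnapshotSizeReport {
  lines: number;
  bytes: number;
  withinLineLimit: boolean;
  withinByteLimit: boolean;
}

// Assumption: lines are separated by "\n" and size is measured in UTF-8 bytes.
function checkSnapshotSize(xml: string): SnapshotSizeReport {
  const lines = xml.split("\n").length;
  const bytes = new TextEncoder().encode(xml).length;
  return {
    lines,
    bytes,
    withinLineLimit: lines <= MAX_SNAPSHOT_LINES,
    withinByteLimit: bytes <= MAX_SNAPSHOT_BYTES,
  };
}
```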
Successful Step Example
{
"id": "snap",
"actionType": "snapshot_ui",
"success": true,
"data": {
"text": "<?xml version=\"1.0\" encoding=\"UTF-8\"?><hierarchy rotation=\"0\"><node index=\"0\" text=\"Settings\" resource-id=\"com.android.settings:id/action_bar\" class=\"android.widget.TextView\" package=\"com.android.settings\" content-desc=\"\" clickable=\"false\" enabled=\"true\" bounds=\"[0,0][1080,176]\" /></hierarchy>"
}
}
What To Rely On
- rely on stepResults[i].data.text as the canonical snapshot payload
- rely on SNAPSHOT_EXTRACTION_FAILED when text extraction failed after execution
- rely on RESULT_ENVELOPE_TIMEOUT when no usable result envelope returned at all
- treat data.warn as advisory only
- treat Android-emitted metadata fields beyond text as runtime details, not as Node-guaranteed contract fields
- use Selectors to map XML attributes into actionable selector objects