Clawperator Node Runtime and API Design
Product naming:
- Product: Clawperator
- Android package/application namespace: com.clawperator.operator
Purpose
Clawperator is a deterministic actuator tool that allows agents to execute Android automations on behalf of a user. It provides a stable layer for LLM-driven device control with deterministic inputs/outputs, eliminating the need for brittle, direct recipe-specific shell scripting.
Execution model:
- Agents call Clawperator CLI/API.
- Clawperator performs adb and Android tooling interactions.
- Clawperator sends validated runtime commands to Android (ACTION_AGENT_COMMAND).
- Clawperator returns structured execution results.
Critical requirement:
- Skill artifacts are optional.
- If no artifact exists, or an artifact is wrong/stale due to UI drift, feature flags, staged rollouts, or account-level variants, agents must still execute using generic runtime actions and live UI observation.
Agent-customer policy:
- The Clawperator Node runtime interface (CLI + HTTP API) is the primary/default interface for agents.
- The Android APK/runtime service is an execution target, not the agent-facing integration surface.
- Agents should not need direct adb for common tasks.
- Raw adb remains available as an explicit fallback for edge cases and debugging.
Design implication:
- If a workflow is common (for example package listing, screenshots, device discovery, app open/close, execution, snapshot, logs), provide a first-class Clawperator command/API for it.
Shipped Commands
Core commands:
- clawperator doctor: Validate prerequisites and environment.
- clawperator devices: Discover connected device IDs.
- clawperator packages list: Confirm presence of receiver and target apps on device.
- clawperator execute: Run an execution JSON payload.
- clawperator observe snapshot: Get current UI hierarchy as hierarchy_xml.
- clawperator observe screenshot: Capture device screen.
- clawperator action [open-app|click|read|wait|type]: Single-step interaction wrappers.
- clawperator serve: Start HTTP/SSE server for remote agent access.
- clawperator doctor --fix: Best-effort environment remediation.
- clawperator skills install/update/search/run: Skills lifecycle.
- clawperator version --check-compat: CLI/APK compatibility check.
Contracts:
- Canonical Envelope: [Clawperator-Result] {JSON} is the ONLY way success/failure is reported.
- expectedFormat Required: Every observation/execution must include expectedFormat: "android-ui-automator".
- Single-Flight Lock: Only one execution per deviceId/receiverPackage at a time. Overlaps return EXECUTION_CONFLICT_IN_FLIGHT.
HTTP API Server (serve)
When running clawperator serve [--port <number>] [--host <string>], a local HTTP server is started to allow remote agents to interact with Clawperator without direct CLI access.
⚠️ Security Warning: The HTTP API currently provides no authentication or authorization. By default, it binds to 127.0.0.1 (localhost) for safety. If you bind to 0.0.0.0 or a public IP via --host, any client on your network can remotely control your connected Android devices. Only expose this API on trusted networks or behind an authenticated gateway.
REST Endpoints
- GET /devices: List all connected Android devices and their states.
- POST /execute: Execute a full JSON execution payload.
  - Body: {"execution": {...}, "deviceId": "...", "receiverPackage": "..."}
  - Returns: RunExecutionResult (200 OK or 4xx/5xx on failure).
  - Status 423 Locked: Returned if another execution is in flight for the target device.
- POST /observe/snapshot: Quick helper for UI capture.
  - Body: {"deviceId": "...", "receiverPackage": "..."}
- POST /observe/screenshot: Quick helper for visual capture.
  - Body: {"deviceId": "...", "receiverPackage": "..."}
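As a rough illustration of the POST /execute contract above, the sketch below only builds the request and classifies the documented status codes; it does not send anything. The helper names, the port 3000, and the sample receiverPackage value are assumptions made for the example, not part of the documented API.

```typescript
// Body shape taken from the POST /execute description above.
interface ExecuteRequestBody {
  execution: Record<string, unknown>;
  deviceId: string;
  receiverPackage: string;
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: prepares (but does not send) a POST /execute request.
function buildExecuteRequest(baseUrl: string, body: ExecuteRequestBody): PreparedRequest {
  return {
    url: baseUrl.replace(/\/+$/, "") + "/execute", // tolerate a trailing slash
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

// Hypothetical helper: maps the documented status codes to a coarse outcome.
type ExecuteOutcome = "ok" | "locked" | "failed";
function classifyExecuteStatus(status: number): ExecuteOutcome {
  if (status === 423) return "locked"; // another execution is in flight
  return status >= 200 && status < 300 ? "ok" : "failed";
}

// Port 3000 is an assumed value; use whatever was passed to --port.
const executeRequest = buildExecuteRequest("http://127.0.0.1:3000/", {
  execution: { commandId: "cmd-123", actions: [] },
  deviceId: "emulator-5554",
  receiverPackage: "com.clawperator.operator",
});
```

The prepared object can be handed to any HTTP client; a 423 outcome means the caller should wait for the in-flight execution rather than resend immediately.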
Event Streaming (SSE)
The server provides a real-time event stream at GET /events. Callers should use a standard SSE client to subscribe.
- Event: clawperator:result: Emitted when an execution reaches a terminal state (success or failure) and a deviceId is known.
  - Data: {"deviceId": "...", "envelope": {...}}
- Event: clawperator:execution: Emitted for every attempt to run an execution, including pre-resolution failures.
  - Data: {"deviceId": "...", "input": {...}, "result": {...}}
- Event: heartbeat: Upon connection, a {"code": "CONNECTED", ...} message is sent to verify the stream is active.
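For callers without an SSE library, the events listed above could be decoded with a small frame parser. This is a minimal sketch that assumes standard SSE framing (event: and data: fields, frames separated by a blank line) and assumes each chunk contains only complete frames; the function name and the sample payloads are illustrative.

```typescript
interface SseEvent {
  event: string;
  data: string;
}

// Parses a text chunk made of complete SSE frames into event/data pairs.
function parseSseChunk(chunk: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const frame of chunk.split(/\r?\n\r?\n/)) {
    let event = "message"; // default event name in the SSE format
    const dataLines: string[] = [];
    for (const line of frame.split(/\r?\n/)) {
      if (line === "" || line.charAt(0) === ":") continue; // blank line or comment
      const idx = line.indexOf(":"); // first colon separates field from value
      const field = idx === -1 ? line : line.slice(0, idx);
      let value = idx === -1 ? "" : line.slice(idx + 1);
      if (value.charAt(0) === " ") value = value.slice(1); // strip one leading space
      if (field === "event") event = value;
      else if (field === "data") dataLines.push(value);
    }
    if (dataLines.length > 0) events.push({ event, data: dataLines.join("\n") });
  }
  return events;
}

// Sample stream text; payload contents are made up for the example.
const sampleStream =
  'event: heartbeat\ndata: {"code":"CONNECTED"}\n\n' +
  'event: clawperator:result\ndata: {"deviceId":"emulator-5554","envelope":{"status":"success"}}\n\n';
const parsedEvents = parseSseChunk(sampleStream);
```

Because only the first colon on a line is treated as the separator, event names that themselves contain a colon, such as clawperator:result, survive intact.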
Concurrency and Locking
The server utilizes an in-memory single-flight lock per deviceId. If a second request arrives for the same device while an execution is in progress, the server returns HTTP 423 (Locked) immediately.
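The locking rule can be sketched as a small in-memory structure. This is an illustration under assumptions, not the server's actual code: the class and function names are hypothetical, and pairing HTTP 423 with EXECUTION_CONFLICT_IN_FLIGHT in one rejection object is a guess at how the two documented signals relate.

```typescript
// In-memory single-flight lock keyed by deviceId.
class DeviceLock {
  // Object.create(null) avoids collisions with inherited keys such as "__proto__".
  private inFlight: Record<string, true> = Object.create(null);

  // Returns true and takes the lock if the device is idle, false otherwise.
  tryAcquire(deviceId: string): boolean {
    if (this.inFlight[deviceId]) return false;
    this.inFlight[deviceId] = true;
    return true;
  }

  release(deviceId: string): void {
    delete this.inFlight[deviceId];
  }
}

type Admission =
  | { admitted: true }
  | { admitted: false; httpStatus: 423; code: "EXECUTION_CONFLICT_IN_FLIGHT" };

// Rejects immediately instead of queueing, matching the behavior described above.
function admitExecution(lock: DeviceLock, deviceId: string): Admission {
  if (lock.tryAcquire(deviceId)) return { admitted: true };
  return { admitted: false, httpStatus: 423, code: "EXECUTION_CONFLICT_IN_FLIGHT" };
}

const deviceLock = new DeviceLock();
const firstAdmission = admitExecution(deviceLock, "emulator-5554");
const secondAdmission = admitExecution(deviceLock, "emulator-5554"); // same device, still busy
const otherDeviceAdmission = admitExecution(deviceLock, "emulator-5556"); // different device
deviceLock.release("emulator-5554");
const afterReleaseAdmission = admitExecution(deviceLock, "emulator-5554");
```

Since the lock lives in process memory, it presumably does not coordinate across multiple server processes targeting the same device.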
Determinism Doctrine
- No Hidden Logic: Clawperator never retries a failed action or auto-falls back to a different strategy (e.g., from artifact to direct).
- Pre-Flight Validation: Every execution is validated against the target device and receiver capabilities before any ADB call is made.
- Canonical Result: Exactly one terminal envelope per commandId. If a timeout occurs, the CLI emits a RESULT_ENVELOPE_TIMEOUT error.
Error Taxonomy
LLM agents must use these codes to decide their next step.
Setup & Connectivity
- ADB_NOT_FOUND: ADB is missing from PATH.
- NO_DEVICES: No Android devices are connected via USB/Network.
- MULTIPLE_DEVICES_DEVICE_ID_REQUIRED: More than one device exists; specify --device-id.
- RECEIVER_NOT_INSTALLED: The target receiver package is not on the device.
Execution & State
- EXECUTION_VALIDATION_FAILED: The execution JSON is malformed or invalid.
- EXECUTION_ACTION_UNSUPPORTED: The requested action type is not supported by the runtime.
- EXECUTION_CONFLICT_IN_FLIGHT: A command is already running on the target device.
- RESULT_ENVELOPE_TIMEOUT: The command ran but no terminal envelope was received within the timeout.
- RESULT_ENVELOPE_MALFORMED: Logcat emitted an invalid JSON envelope.
UI & Nodes
- NODE_NOT_FOUND: The selector (matcher) failed to find the target UI element.
- NODE_NOT_CLICKABLE: The target element was found but is not enabled/clickable.
- SECURITY_BLOCK_DETECTED: A system-level security overlay (e.g., "Package Installer" or "Permission Dialog") is blocking interaction.
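An agent might encode the taxonomy above as a typed lookup so that unknown codes are detected rather than silently accepted. The sketch below only mirrors the three groups listed in this section; the category labels and the function name are illustrative choices, not identifiers defined by Clawperator.

```typescript
// Error codes from this section, grouped by the heading they appear under.
const ERROR_CATEGORIES = {
  ADB_NOT_FOUND: "setup_connectivity",
  NO_DEVICES: "setup_connectivity",
  MULTIPLE_DEVICES_DEVICE_ID_REQUIRED: "setup_connectivity",
  RECEIVER_NOT_INSTALLED: "setup_connectivity",
  EXECUTION_VALIDATION_FAILED: "execution_state",
  EXECUTION_ACTION_UNSUPPORTED: "execution_state",
  EXECUTION_CONFLICT_IN_FLIGHT: "execution_state",
  RESULT_ENVELOPE_TIMEOUT: "execution_state",
  RESULT_ENVELOPE_MALFORMED: "execution_state",
  NODE_NOT_FOUND: "ui_nodes",
  NODE_NOT_CLICKABLE: "ui_nodes",
  SECURITY_BLOCK_DETECTED: "ui_nodes",
} as const;

type ClawperatorErrorCode = keyof typeof ERROR_CATEGORIES;
type ErrorCategory = (typeof ERROR_CATEGORIES)[ClawperatorErrorCode];

// Returns the category for a known code, or "unknown" for anything else.
function categorizeError(code: string): ErrorCategory | "unknown" {
  return Object.prototype.hasOwnProperty.call(ERROR_CATEGORIES, code)
    ? ERROR_CATEGORIES[code as ClawperatorErrorCode]
    : "unknown";
}
```

Which follow-up an agent takes per category (for example re-observing the UI after a ui_nodes error) is left to the agent's own policy.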
Detailed Step-Level Error Handling
While the top-level status indicates overall command success, individual stepResults can provide granular failure diagnostics. This is essential for agents to reason about partial completions.
Step Error Format
When a step fails but the runtime continues (or fails fast), the stepResults entry will include:
- success: false
- data.error: A stable machine-readable error code.
- data.message: A human-readable (and LLM-readable) explanation.
Example: UNSUPPORTED_RUNTIME_CLOSE
This error occurs when a close_app action is dispatched to the Android runtime. Because of sandbox restrictions, the runtime cannot reliably close other apps.
Desired Outcome: The agent should see this error and know that the 'Hand' (Node CLI) is responsible for pre-flight closure via ADB.
{
  "id": "step-1",
  "actionType": "close_app",
  "success": false,
  "data": {
    "application_id": "com.example.app",
    "error": "UNSUPPORTED_RUNTIME_CLOSE",
    "message": "Android runtime cannot reliably close apps. Use the Clawperator Node API or 'adb shell am force-stop' directly for this action."
  }
}
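To reason about partial completions, an agent could pull the failing entries out of stepResults. The sketch below assumes the step shape shown in the example above; the function name and the "UNKNOWN" placeholder used when data.error is absent are hypothetical.

```typescript
// Step shape based on the example above; extra data fields are allowed.
interface StepResult {
  id: string;
  actionType: string;
  success: boolean;
  data?: { error?: string; message?: string; [key: string]: unknown };
}

interface StepFailure {
  id: string;
  error: string;
  message: string;
}

// Collects the machine-readable code and explanation for every failed step.
function collectStepFailures(stepResults: StepResult[]): StepFailure[] {
  return stepResults
    .filter((step) => !step.success)
    .map((step) => ({
      id: step.id,
      error: step.data?.error ?? "UNKNOWN", // placeholder, not a Clawperator code
      message: step.data?.message ?? "",
    }));
}

const sampleStepResults: StepResult[] = [
  {
    id: "step-1",
    actionType: "close_app",
    success: false,
    data: {
      application_id: "com.example.app",
      error: "UNSUPPORTED_RUNTIME_CLOSE",
      message: "Android runtime cannot reliably close apps.",
    },
  },
  { id: "step-2", actionType: "open_app", success: true },
];
const stepFailures = collectStepFailures(sampleStepResults);
```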
Safety & Concurrency
In-Flight Semantics
A command is considered "in-flight" from the moment the ADB broadcast is sent until the [Clawperator-Result] is received or the timeoutMs is reached. If a command times out, the lock is held for an additional 2000ms "settle" window before allowing the next execution.
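The timing rule above can be written as a small calculation. This is a sketch with hypothetical names; it also assumes that an envelope arriving after the deadline is treated the same as a timeout, which the text does not state explicitly.

```typescript
const SETTLE_WINDOW_MS = 2000; // extra hold time after a timeout, per the rule above

// Returns the timestamp (ms) at which the device lock can be released.
// sentAtMs: when the ADB broadcast was sent.
// resultAtMs: when [Clawperator-Result] arrived, or undefined if it never did.
function lockReleaseTime(sentAtMs: number, timeoutMs: number, resultAtMs?: number): number {
  const deadline = sentAtMs + timeoutMs;
  if (resultAtMs !== undefined && resultAtMs <= deadline) {
    return resultAtMs; // normal completion: release as soon as the envelope arrives
  }
  return deadline + SETTLE_WINDOW_MS; // timeout: hold the lock through the settle window
}

const releaseOnResult = lockReleaseTime(0, 90000, 4500); // envelope arrived at 4.5 s
const releaseOnTimeout = lockReleaseTime(0, 90000); // no envelope received
```

With a 90000 ms timeout, a command that never reports back keeps the device locked for 92000 ms in total.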
PII Redaction Policy
By default, Clawperator returns full-fidelity UI text to the agent for maximum reasoning accuracy.
- User Warning: Results will contain sensitive data (names, account digits, OTPs) if they are visible on the screen.
- Agent Mitigation: Do not ship raw Clawperator results to long-term storage without user consent.
API-First, ADB-Capable
This runtime is intentionally API-first:
- Agents should use Clawperator commands/APIs by default.
- Clawperator should wrap common Android/adb operations behind stable, typed contracts.
- Direct adb usage is a fallback path, not the baseline integration model.
Direct adb is still supported for:
- unsupported/emerging edge cases,
- low-level diagnostics,
- temporary gaps before a stable Clawperator primitive exists.
When fallback adb is used, Clawperator should still encourage convergence back to first-class APIs by:
- exposing equivalent primitives as they become common,
- keeping result/error formats structured and machine-readable,
- documenting fallback-to-API migration paths.
Skill Artifact Optionality and Failure Handling
Skill artifacts are optional, but fallback behavior is explicit:
- If artifact compile succeeds, execute compiled execution.
- If artifact compile fails, Clawperator returns a structured compile error and does not auto-fallback.
- If runtime verification fails, Clawperator returns a structured execution failure and does not auto-retry with alternate strategy.
- Agent chooses next step (retry, inspect UI, switch to direct actions, or abort).
Runtime must expose a mode on each execution:
- artifact_compiled
- direct
This keeps behavior deterministic and avoids hidden control-flow in the runtime.
Execution Unit Contract
Use one term everywhere: execution.
- compile produces an execution.
- execute runs an execution.
Execution schema aligns with Android AgentCommand constraints.
Execution input may come from:
- skill artifact compile output, or
- direct action list authored by agent/tooling.
Example execution:
{
  "commandId": "cmd-123",
  "taskId": "task-123",
  "source": "openclaw",
  "expectedFormat": "android-ui-automator",
  "timeoutMs": 90000,
  "actions": [
    { "id": "close", "type": "close_app", "params": { "applicationId": "com.example.app" } },
    { "id": "open", "type": "open_app", "params": { "applicationId": "com.example.app" } },
    { "id": "wait", "type": "sleep", "params": { "durationMs": 3000 } }
  ]
}
Device Selection Policy (v1)
deviceId?: string is supported on execute/observe.
Selection behavior:
- If exactly one connected device exists and deviceId is omitted, use that device.
- If more than one connected device exists, deviceId is required.
- If provided deviceId is not connected in device state, fail preflight.
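The selection rules above can be sketched as one pure function. Two things here are assumptions: that only devices reported by adb in the device state count as connected, and the DEVICE_NOT_CONNECTED code, which is a hypothetical name since this document does not list a code for that case.

```typescript
// state is the adb-reported state, e.g. "device", "offline", "unauthorized".
interface ConnectedDevice {
  id: string;
  state: string;
}

type DeviceSelection =
  | { ok: true; deviceId: string }
  | { ok: false; code: string };

function selectDevice(devices: ConnectedDevice[], requestedId?: string): DeviceSelection {
  const ready = devices.filter((d) => d.state === "device");
  if (requestedId !== undefined) {
    // An explicit deviceId must refer to a device in the "device" state.
    return ready.some((d) => d.id === requestedId)
      ? { ok: true, deviceId: requestedId }
      : { ok: false, code: "DEVICE_NOT_CONNECTED" }; // hypothetical code
  }
  if (ready.length === 0) return { ok: false, code: "NO_DEVICES" };
  if (ready.length > 1) return { ok: false, code: "MULTIPLE_DEVICES_DEVICE_ID_REQUIRED" };
  return { ok: true, deviceId: ready[0].id };
}

const oneDevice: ConnectedDevice[] = [{ id: "emulator-5554", state: "device" }];
const twoDevices: ConnectedDevice[] = [
  { id: "emulator-5554", state: "device" },
  { id: "R58M123ABC", state: "device" },
];
const implicitSelection = selectDevice(oneDevice);
const ambiguousSelection = selectDevice(twoDevices);
const explicitSelection = selectDevice(twoDevices, "R58M123ABC");
```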
Agentic Best-Effort Mode
Best-effort mode is a first-class execution path for unknown or drifting UIs.
Behavior goals:
- Observe current UI (snapshot_ui).
- Identify likely anchors (toolbar/tab/menu/button/search patterns).
- Attempt constrained navigation/action.
- Re-observe and verify progress.
- Retry within safety bounds.
Best-effort does not imply unsafe freeform behavior; all attempts remain within validated runtime action limits and capability policy.
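One way an agent could structure such a loop is sketched below. It is an illustration, not Clawperator's implementation: the hook names are hypothetical, the hooks are synchronous for brevity (real calls would be asynchronous), and only the step and consecutive-failure bounds from the Safety Bounds section are enforced; MAX_BEST_EFFORT_RUNTIME_MS is left out.

```typescript
const MAX_BEST_EFFORT_STEPS = 30;
const MAX_CONSECUTIVE_FAILED_ATTEMPTS = 5;

// Agent-supplied hooks (hypothetical names).
interface BestEffortHooks<S> {
  observe(): S; // e.g. a snapshot_ui call
  isDone(snapshot: S): boolean; // agent-defined goal check
  attempt(snapshot: S): boolean; // one constrained action; true if it succeeded
}

interface BestEffortOutcome {
  status: "done" | "step_limit" | "too_many_failures";
  steps: number; // number of attempts made
}

function runBestEffort<S>(hooks: BestEffortHooks<S>): BestEffortOutcome {
  let consecutiveFailures = 0;
  for (let step = 0; step < MAX_BEST_EFFORT_STEPS; step++) {
    const snapshot = hooks.observe(); // observe, then re-observe on each pass
    if (hooks.isDone(snapshot)) return { status: "done", steps: step };
    if (hooks.attempt(snapshot)) {
      consecutiveFailures = 0;
    } else if (++consecutiveFailures >= MAX_CONSECUTIVE_FAILED_ATTEMPTS) {
      return { status: "too_many_failures", steps: step + 1 };
    }
  }
  return { status: "step_limit", steps: MAX_BEST_EFFORT_STEPS };
}

// Toy UI: the goal is reached after three successful navigation steps.
let currentScreen = 0;
const navigationOutcome = runBestEffort<number>({
  observe: () => currentScreen,
  isDone: (screen) => screen === 3,
  attempt: () => {
    currentScreen += 1;
    return true;
  },
});
// A UI where every attempt fails stops at the consecutive-failure bound.
const stuckOutcome = runBestEffort<number>({
  observe: () => 0,
  isDone: () => false,
  attempt: () => false,
});
```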
Important ownership split:
- Clawperator provides primitives and structured observations.
- The agent owns exploration policy/strategy.
- Clawperator should not silently invent fallback control flow.
Cardinality drift handling:
- Execution should tolerate mismatches between recipe assumptions and live UI (for example expected second device tile but only one exists).
- Runtime returns structured ambiguity/partial outcomes rather than hard-failing every mismatch.
- Agent decides whether to proceed with alternate target selection or stop.
Result Transport Channel (v1 choice)
Chosen v1 mechanism:
- logcat JSON envelope with strict prefix.
Required Android emission format (single line):
[Clawperator-Result] {"commandId":"...","taskId":"...","status":"success|failed","stepResults":[...],"error":null}
Current implementation note:
- Android emits canonical [Clawperator-Result] terminal envelopes for command completion.
Rules:
- Exactly one terminal result envelope per commandId.
- Envelope payload must be valid single-line JSON.
- Clawperator parser filters by commandId and prefix.
- Non-envelope logs are ignored for result semantics.
This removes ad-hoc scraping patterns and provides deterministic parsing until a stronger transport is added.
Additionally, intermediate observation envelopes may be emitted with prefix:
[Clawperator-Event] {json...}
This supports agent feedback loops during best-effort execution.
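The parsing rules for the terminal envelope can be sketched as a single line-level function. This is a minimal sketch, not the shipped parser: the function and type names are hypothetical, and it assumes the prefix may be preceded by arbitrary logcat metadata on the same line.

```typescript
const RESULT_PREFIX = "[Clawperator-Result] ";

// Field names follow the required emission format shown above.
interface ResultEnvelope {
  commandId: string;
  taskId: string;
  status: "success" | "failed";
  stepResults: unknown[];
  error: unknown;
}

type ParseOutcome =
  | { kind: "envelope"; envelope: ResultEnvelope }
  | { kind: "ignored" } // not an envelope, or belongs to another commandId
  | { kind: "malformed" }; // would surface as RESULT_ENVELOPE_MALFORMED

function parseResultLine(line: string, commandId: string): ParseOutcome {
  const at = line.indexOf(RESULT_PREFIX);
  if (at === -1) return { kind: "ignored" }; // non-envelope log line
  let payload: unknown;
  try {
    payload = JSON.parse(line.slice(at + RESULT_PREFIX.length));
  } catch {
    return { kind: "malformed" };
  }
  if (typeof payload !== "object" || payload === null) return { kind: "malformed" };
  const candidate = payload as Partial<ResultEnvelope>;
  if (candidate.commandId !== commandId) return { kind: "ignored" };
  if (candidate.status !== "success" && candidate.status !== "failed") {
    return { kind: "malformed" };
  }
  return { kind: "envelope", envelope: candidate as ResultEnvelope };
}

// Sample lines; the log tag on the first line is made up.
const sampleLogLines = [
  "I/SomeTag: unrelated log line",
  '[Clawperator-Result] {"commandId":"cmd-999","taskId":"t","status":"success","stepResults":[],"error":null}',
  '[Clawperator-Result] {"commandId":"cmd-123","taskId":"task-123","status":"success","stepResults":[],"error":null}',
];
const parseOutcomes = sampleLogLines.map((l) => parseResultLine(l, "cmd-123"));
```

Only the third sample line yields an envelope for cmd-123; the first two are ignored, which keeps unrelated log traffic from affecting result semantics.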
Safety Bounds (hard constants)
Public limits (v1):
- MAX_EXECUTION_ACTIONS = 50
- MAX_EXECUTION_TIMEOUT_MS = 120000
- MIN_EXECUTION_TIMEOUT_MS = 1000
- MAX_PAYLOAD_BYTES = 64000
- MAX_RETRY_ATTEMPTS_PER_STEP = 10
- MAX_SNAPSHOT_LINES = 2000
- MAX_SNAPSHOT_BYTES = 262144
Action policy:
- denylist by default for unsupported/unsafe action types
- allow only runtime-supported actions in v1
Best-effort specific bounds:
- MAX_BEST_EFFORT_STEPS = 30
- MAX_BEST_EFFORT_RUNTIME_MS = 180000
- MAX_CONSECUTIVE_FAILED_ATTEMPTS = 5
Supported action types (v1):
- open_app
- close_app
- wait_for_node
- click
- scroll_and_click
- read_text
- snapshot_ui
- sleep
- type_text
- doctor_ping
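A pre-flight check against these limits might look like the sketch below. It covers only the action count, timeout range, expectedFormat, and supported action types; the payload-size limit is omitted, and the function name and problem strings are illustrative rather than Clawperator's actual validation output.

```typescript
const MAX_EXECUTION_ACTIONS = 50;
const MAX_EXECUTION_TIMEOUT_MS = 120000;
const MIN_EXECUTION_TIMEOUT_MS = 1000;

const SUPPORTED_ACTION_TYPES = [
  "open_app", "close_app", "wait_for_node", "click", "scroll_and_click",
  "read_text", "snapshot_ui", "sleep", "type_text", "doctor_ping",
];

interface ExecutionAction {
  id: string;
  type: string;
  params?: Record<string, unknown>;
}

interface Execution {
  commandId: string;
  taskId: string;
  source: string;
  expectedFormat: string;
  timeoutMs: number;
  actions: ExecutionAction[];
}

// Returns a list of human-readable problems; an empty list means the checks passed.
function validateExecution(execution: Execution): string[] {
  const problems: string[] = [];
  if (execution.expectedFormat !== "android-ui-automator") {
    problems.push("expectedFormat must be android-ui-automator");
  }
  if (execution.timeoutMs < MIN_EXECUTION_TIMEOUT_MS || execution.timeoutMs > MAX_EXECUTION_TIMEOUT_MS) {
    problems.push("timeoutMs out of range");
  }
  if (execution.actions.length > MAX_EXECUTION_ACTIONS) {
    problems.push("too many actions");
  }
  for (const action of execution.actions) {
    if (SUPPORTED_ACTION_TYPES.indexOf(action.type) === -1) {
      problems.push("unsupported action type: " + action.type);
    }
  }
  return problems;
}

// The example execution from the Execution Unit Contract section.
const exampleExecution: Execution = {
  commandId: "cmd-123",
  taskId: "task-123",
  source: "openclaw",
  expectedFormat: "android-ui-automator",
  timeoutMs: 90000,
  actions: [
    { id: "close", type: "close_app", params: { applicationId: "com.example.app" } },
    { id: "open", type: "open_app", params: { applicationId: "com.example.app" } },
    { id: "wait", type: "sleep", params: { durationMs: 3000 } },
  ],
};
const exampleProblems = validateExecution(exampleExecution);
```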
Doctor and Dependency Management
clawperator doctor checks:
- adb installed and executable
- adb server reachable
- connected devices and states
- target package presence and version compatibility
- Android Developer Options and USB debugging (advisory)
- end-to-end handshake via doctor_ping
clawperator doctor --fix capabilities (best effort):
- restart adb server
- run clawperator grant-device-permissions
- print exact remediation when automatic fix is unavailable
See Clawperator Doctor for the full check list and JSON report shape.
Skill Integration Mechanism
Canonical source of skills:
- clawperator-skills repository
Distribution model:
- clawperator-skills CI generates skills-index.json on main.
- clawperator skills install clones the local skills checkout on first setup.
- clawperator skills update [--ref <ref>] refreshes the checkout and can pin to a specific ref when needed.
- Local cache stores synced artifacts for deterministic offline execution.
Runtime should execute against cached/pinned skill content, not live network fetches during execution.
Skill compilation requirements are defined in:
docs/design/skill-design.md
When skill artifacts are missing/stale, runtime can still execute direct executions supplied by the agent.
Skill Implementation Language Strategy
To set a maintainable baseline for future skills:
- Preferred language for new non-trivial skills: Node.js with TypeScript.
- Bash is allowed only for thin wrappers and simple glue.
- Python is a planned secondary path after Node contracts and tooling are stable.
Rationale:
- Better testability, typing, and reuse for parsing-heavy and multimodal workflows.
- Safer payload construction and lower shell-quoting risk than large Bash scripts.
- Cleaner evolution toward SDK-backed skill execution.
Migration policy:
- Do not mass-rewrite all existing Bash skills immediately.
- For new high-value or high-complexity skills, prefer Node.js/TypeScript implementations.
- Temporary Bash implementations (including the current Life360 flow) are acceptable only as stopgaps and must be queued for early migration once minimal Node skill SDK/runtime helpers are in place.
Agent-Friendly Command and Alias Layer
Because agents are primary customers, Clawperator should accept intuitive aliases that normalize to canonical actions.
Examples:
- tap -> click
- press -> click
- long_press -> click with long-click params
- wait_for -> wait_for_node
- find -> wait_for_node
- read -> read_text
- snapshot -> snapshot_ui
- sleep -> sleep
- action: Primary entry point for single-step interactions.
Rules:
- Canonical form is stored and logged.
- Aliases are input-only conveniences.
- Alias table is explicit/versioned (no fuzzy guessing in parser).
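An explicit alias table of this kind could be sketched as a plain lookup with exact-match semantics. The function name is hypothetical, and the longClick parameter name is an assumption; the document only says that long_press maps to click with long-click params.

```typescript
interface CanonicalAction {
  type: string;
  params: Record<string, unknown>;
}

// Explicit alias table; keys are matched exactly (no case folding, no fuzzy matching).
const ACTION_ALIASES: Record<string, { type: string; params?: Record<string, unknown> }> = {
  tap: { type: "click" },
  press: { type: "click" },
  long_press: { type: "click", params: { longClick: true } }, // param name is assumed
  wait_for: { type: "wait_for_node" },
  find: { type: "wait_for_node" },
  read: { type: "read_text" },
  snapshot: { type: "snapshot_ui" },
  sleep: { type: "sleep" },
};

// Maps an input action type to its canonical form; unknown types pass through
// unchanged so that later validation can reject them explicitly.
function normalizeAction(type: string, params: Record<string, unknown> = {}): CanonicalAction {
  const alias = Object.prototype.hasOwnProperty.call(ACTION_ALIASES, type)
    ? ACTION_ALIASES[type]
    : undefined;
  if (!alias) return { type, params };
  return { type: alias.type, params: { ...alias.params, ...params } };
}

const normalizedTap = normalizeAction("tap", { text: "Sign in" });
const normalizedLongPress = normalizeAction("long_press");
const untouched = normalizeAction("Tap"); // different case, so no alias applies
```

Storing only the canonical form after this step matches the rule that aliases are input-only conveniences.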
Node Module Structure
- src/cli/*: command handlers and argument parsing
- src/domain/doctor/*: prerequisites and auto-fix logic
- src/domain/devices/*: adb discovery and selection
- src/domain/skills/*: install/update/search/run/list/get/compile-artifact
- src/domain/executions/*: validation, run, state transitions
- src/adapters/android-bridge/*: adb broadcast + logcat result envelope parsing
- src/contracts/*: schema constants, JSON types
Determinism and Validation Requirements
- Skill artifact compile must be pure and deterministic.
- Execution validation must occur before any adb call.
- Every run must emit correlated IDs: executionId, commandId, taskId, deviceId.
- Side-effecting executions must include verification signals in step results.
- Direct/fallback executions must include explicit mode/status metadata.
- Artifact compile must fail if required input variables are missing (no implicit PII/user-literal substitution).
Testing Strategy
Clawperator should define layered tests, with real-device execution as a first-class requirement.
- Unit tests (Node/CLI)
  - execution schema validation and hard bounds
  - device selection policy
  - alias normalization to canonical actions
  - result envelope parser correctness
- Integration tests (mock adb/logcat)
  - doctor/device discovery behavior
  - compile -> execute orchestration
  - failure contracts and fallback instruction pointers
- Android instrumentation tests
  - ACTION_AGENT_COMMAND execution path
  - [Clawperator-Result] envelope emission
  - step result mapping and verification semantics
- Real-device tests
  - run a baseline skill/execution on a known installed app (current baseline can be Google Home)
  - verify end-to-end reliability across close/open/session policy behavior
- Future dedicated test APK
  - create a controlled Android app exposing stable test UI elements/states
  - migrate core conformance tests to this APK to reduce third-party app drift risk
Security and Policy
- Capability-based execution gating (from skill/artifact metadata).
- Per-profile allowlist/denylist for capabilities and packages.
- Disable dangerous capabilities by default (purchase_risk off unless explicit policy).
- Audit trail for compile, execute, and result envelopes.
- Best-effort mode still obeys capability policy and hard limits.
Stability & Versioning
Clawperator follows Semantic Versioning (SemVer) for the Node SDK/CLI and its API contracts.
Versioning Rules
- Major Bump (1.x.x): Breaking changes to the result envelope JSON schema, CLI command removal, or incompatible ACTION_AGENT_COMMAND protocol changes.
- Minor Bump (x.1.x): New supported actions, new CLI commands, or backward-compatible schema additions.
- Patch Bump (x.x.1): Bug fixes, internal refactoring, or documentation updates.
Stability Boundary
- Stable (v1): execute, observe snapshot, devices, and the [Clawperator-Result] envelope structure.
- Alpha/Unstable: execute best-effort, --serve (HTTP), and any feature marked as (Upcoming) in these docs. These may break without a major version bump until they are promoted to stable.