# Clawperator Full Documentation This document contains all the technical documentation for Clawperator, compiled into a single file for easy digestion by AI agents. # Home # Clawperator Docs Clawperator is a deterministic actuator tool for LLM-driven Android automation. It acts as the "hand" for an LLM "brain," allowing agents to automate device control on a dedicated Android device. Use this page as a technical map: - start with setup if you are preparing a device or emulator - start with the Node API guide if you are integrating an agent - use the reference section for exact command, payload, and error semantics ## Machine-facing routes - [llms.txt](https://docs.clawperator.com/llms.txt) - compact machine-oriented docs entrypoint - [llms-full.txt](https://docs.clawperator.com/llms-full.txt) - compiled full docs corpus on the docs host - [sitemap.xml](https://docs.clawperator.com/sitemap.xml) - canonical crawl map for this docs host ## Recommended paths ### Start here if you are integrating an agent - [Node API - Agent Guide](ai-agents/node-api-for-agents.md) - Canonical CLI and HTTP API contract for agent builders - [Operator LLM Playbook](design/operator-llm-playbook.md) - Practical operating rules for observation, action loops, and skill execution - [API Overview](reference/api-overview.md) - Execution payload, action types, result envelope shape, and snapshot semantics - [CLI Reference](reference/cli-reference.md) - Command-line entrypoints and flags ### Start here if you are preparing runtime infrastructure - [First-Time Setup](getting-started/first-time-setup.md) - Install the CLI, choose an Android environment, and prepare the [Clawperator Operator Android app](getting-started/android-operator-apk.md) - [Running Clawperator on Android](getting-started/running-clawperator-on-android.md) - Canonical actuator model, physical device vs emulator, and user responsibilities - [OpenClaw First Run](getting-started/openclaw-first-run.md) - Task-oriented runbook for 
installing Clawperator, preparing Android, and completing a first real skill run - [Clawperator Operator Android app](getting-started/android-operator-apk.md) - Package variants, installation, and permissions for Clawperator's Android app ## Getting Started - [Running Clawperator on Android](getting-started/running-clawperator-on-android.md) - Canonical actuator model, physical device vs emulator, and user responsibilities - [Clawperator Terminology](getting-started/terminology.md) - Canonical definitions for Android device, Operator app, user-installed Android apps, and other core terms - [First-Time Setup](getting-started/first-time-setup.md) - Install the CLI, choose an Android environment, and prepare the [Clawperator Operator Android app](getting-started/android-operator-apk.md) - [OpenClaw First Run](getting-started/openclaw-first-run.md) - Task-oriented runbook for installing Clawperator, preparing Android, and completing a first real skill run - [Project Overview](getting-started/project-overview.md) - Mission, architecture, and repository surfaces - [Clawperator Operator Android app](getting-started/android-operator-apk.md) - Package variants, installation, and permissions for Clawperator's Android app ## For AI Agents - [Node API - Agent Guide](ai-agents/node-api-for-agents.md) - Canonical CLI and HTTP API reference for agents - [Operator LLM Playbook](design/operator-llm-playbook.md) - Action contracts, runtime conventions, and skill packaging - [API Overview](reference/api-overview.md) - Execution payload, action types, result envelopes, and snapshot delivery - [CLI Reference](reference/cli-reference.md) - Exact command surface for local and scripted integrations ## Reference - [CLI Reference](reference/cli-reference.md) - Command-line usage and flags - [API Overview](reference/api-overview.md) - Execution payload, action types, result envelopes, and snapshot semantics - [Error Codes](reference/error-codes.md) - Structured runtime and API error code 
reference - [Doctor](reference/node-api-doctor.md) - Runtime readiness checks, exit behavior, and JSON report shape ## Architecture - [System Overview](architecture/architecture.md) - High-level architecture and execution flow - [Node Runtime and API Design](design/node-api-design.md) - Detailed Node contract and runtime design ## Troubleshooting - [Version Compatibility](troubleshooting/compatibility.md) - [Troubleshooting the Operator App](troubleshooting/troubleshooting.md) - [Known Issues](troubleshooting/known-issues.md) - [Crash Logs](troubleshooting/crash-logs.md) ## Skills - [Usage Model](skills/usage-model.md) - [Skill Authoring Guidelines](skills/skill-authoring-guidelines.md) - [Skill Design](design/skill-design.md) - [Device Prep and Runtime Tips](skills/device-prep-and-runtime-tips.md) - [Skills Verification](skills/skills-verification.md) - [Blocked Terms Policy](skills/blocked-terms-policy.md) --- # Getting Started # Running Clawperator on Android Clawperator operates an Android device on behalf of a user. In these docs, "Android device" means either: - a physical Android phone connected over `adb` - a local Android emulator provisioned through the Node CLI This is the canonical actuator model for Clawperator. The Node runtime talks to an Android device, and the Android device runs the [Clawperator Operator Android app](android-operator-apk.md). Canonical definitions for terms such as "Android device", "[Clawperator Operator Android app](android-operator-apk.md)", and "user-installed Android apps" live in [Clawperator Terminology](terminology.md). ## The actuator model ```text Agent (Brain) | Clawperator Node Runtime | Android Device (physical or emulator) ``` Clawperator operates the device UI. It does not own account setup, app configuration, or user credentials. ## User responsibilities Before automation starts, the user is responsible for preparing the Android device. 
That includes: - installing the apps the automation will target - signing into Google, Play Store, and any app-specific accounts - completing first-run flows, prompts, and app configuration - ensuring the user-installed Android apps are in a usable state for automation Clawperator does not: - create accounts - sign into accounts - configure apps on the user's behalf - bypass authentication or anti-abuse gates Agents should assume the device already contains the required apps, logins, and configuration. ## Choosing an Android environment Clawperator supports two actuator environments. ### Option A - Physical Android device This is the recommended path for production use. - best app compatibility - least divergence from a normal user device - lowest risk of emulator detection - strong fit for persistent or long-running automation Use a physical device when reliability matters more than convenience. ### Option B - Android emulator This is primarily for development, testing, and situations where no physical device is available. - no dedicated hardware required - quick to reprovision - useful for local validation and agent development Some apps detect emulator environments or refuse login on them. Clawperator does not guarantee third-party app compatibility on emulators. ## Emulator provisioning Provisioning is owned by the Node CLI and API, not by `install.sh`. ```bash clawperator provision emulator ``` Provisioning is deterministic and reuse-first: 1. Reuse a running supported emulator. 2. Start a stopped supported AVD. 3. Create a new supported AVD if none exist. 
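
The reuse-first order above can be sketched as a small decision function. This is only an illustration under stated assumptions: `provision_action` and its two count arguments are hypothetical names, not part of the Clawperator CLI, and the real provisioning logic may consider more than these two inputs.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the Clawperator CLI.
# $1 = number of running supported emulators
# $2 = number of stopped supported AVDs
# Prints which reuse-first branch would apply.
provision_action() {
  local running="$1" stopped="$2"
  if [ "$running" -gt 0 ]; then
    echo "reuse-running"      # 1. reuse a running supported emulator
  elif [ "$stopped" -gt 0 ]; then
    echo "start-stopped"      # 2. start a stopped supported AVD
  else
    echo "create-new"         # 3. create a new supported AVD
  fi
}

provision_action 1 2   # prints: reuse-running
provision_action 0 2   # prints: start-stopped
provision_action 0 0   # prints: create-new
```

The point of the ordering is that a new AVD is created only when nothing reusable exists, so repeated provisioning calls converge on the same emulator.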
The default supported emulator profile is: - Android API level `35` - device profile `pixel_7` - ABI `arm64-v8a` - Google Play system image - default AVD name `clawperator-pixel` Even on an emulator, the user still needs to: - sign in to a Google account if Play Store access is needed - install the user-installed Android apps - sign in to those apps - complete any first-run configuration ## Installing the Clawperator Operator Android app Use the canonical install command to install the [Clawperator Operator Android app](android-operator-apk.md) and grant required permissions in one step: ```bash clawperator operator setup --apk ~/.clawperator/downloads/operator.apk ``` If multiple devices are connected, target one explicitly: ```bash clawperator operator setup --apk ~/.clawperator/downloads/operator.apk --device-id ``` This command installs the APK, grants the accessibility service and notification listener permissions, and verifies the package is ready. > Do not use raw `adb install` for normal setup. It installs the APK but leaves the device in an unusable state without required permissions. If the Operator APK crashes after initial setup and Android revokes its permissions, run the remediation command: ```bash clawperator grant-device-permissions ``` Do not use this as part of normal setup. The normal setup path is always `clawperator operator setup` (`clawperator operator install` remains an alias). ## Verifying setup Use `doctor` to confirm the Android device is ready: ```bash clawperator doctor ``` `doctor` verifies that the device is reachable, the [Clawperator Operator Android app](android-operator-apk.md) is installed, and the runtime handshake is working. All critical checks must pass before automation starts. --- # Clawperator Terminology This page defines the canonical terms used across Clawperator documentation. Prefer these terms when writing or updating docs so agents can rely on stable meanings. 
## Android device The Android environment that Clawperator operates on behalf of a user. An Android device can be: - a physical Android device connected over `adb` - a local Android emulator provisioned through the Node CLI or HTTP API When a doc says "Android device", it does not imply physical hardware only. ## Physical Android device A real Android phone or tablet connected to the host, usually over USB with `adb` enabled. This is the recommended actuator environment for production automation because it has the best app compatibility and the least divergence from normal user behavior. ## Android emulator A local Android Virtual Device provisioned and managed by the Node CLI or HTTP API. Use this term for the emulator environment itself, not for the Android apps running inside it. ## Actuator device The Android device that Clawperator operates. This is a conceptual term for the execution environment. In practice it means the same runtime target as "Android device". ## Clawperator Operator Android app Clawperator's own Android app, documented in [Clawperator Operator Android app](android-operator-apk.md). This is the app installed on the Android device so Clawperator can receive commands, observe the UI through Android Accessibility, and execute actions. Important: this is not the same thing as the Android apps the user wants Clawperator to operate. ## User-installed Android apps The Android apps the user wants Clawperator to operate, such as Settings, shopping apps, banking apps, ride-share apps, or social apps. These apps are the user's responsibility to: - install - sign into - configure - keep ready for automation Avoid vague phrases like "target app" when "user-installed Android app" is more precise. ## Node runtime The Clawperator Node CLI and HTTP API running on the host machine. This layer validates commands, talks to `adb`, manages emulator lifecycle, dispatches executions, and reads result envelopes from the Android device. 
## Agent The external LLM-driven system that reasons about what to do next and calls the Clawperator Node runtime. In the Clawperator ecosystem, this usually means OpenClaw or another OpenClaw-like agent that uses Clawperator as its execution hand. Clawperator is the execution hand. The agent is the planning brain. ## Receiver package The Android application ID of the installed [Clawperator Operator Android app](android-operator-apk.md) variant that the Node runtime should talk to. Current package IDs: - `com.clawperator.operator` for the release build - `com.clawperator.operator.dev` for the local debug build ## Execution A validated command payload sent from the Node runtime to the [Clawperator Operator Android app](android-operator-apk.md). An execution contains one or more actions and produces exactly one canonical `[Clawperator-Result]` envelope. ## Skill A packaged automation recipe distributed through the skills bundle or sibling skills repository. Skills are not the same thing as the core Clawperator runtime. They sit above it and use the runtime to operate user-installed Android apps. --- # First-Time Setup Clawperator requires an Android device to operate. This device may be: * a physical Android phone * an Android emulator In both cases, the device must be configured with the apps and user logins required by the automation. Clawperator operates the UI on that device. It does not create accounts, sign into apps, or complete app configuration on behalf of the user. For an overview of the actuator model and user responsibilities, see [Running Clawperator on Android](running-clawperator-on-android.md). --- ## Step 1 - Install CLI The simplest path is to run the installer: ```bash curl -fsSL https://clawperator.com/install.sh | bash ``` The installer installs the CLI, downloads the latest stable [Clawperator Operator Android app](android-operator-apk.md) package locally, and runs `clawperator doctor` to detect missing setup. 
Historical versions and release notes remain available on [GitHub Releases](https://github.com/clawperator/clawperator/releases).

---

## Step 2 - Choose Android Environment

### Option A: Physical Device

Requirements: USB cable, any Android 5.0+ device.

1. **Enable Developer Options:** On the device, open **Settings**, go to **About Phone**, and tap **Build Number** 7 times until you see "You are now a developer". Go back to **Settings** - a **Developer Options** entry will appear.
2. **Enable USB Debugging:** Open **Developer Options**, turn on the main toggle at the top of the screen if it is off, then enable **USB Debugging**.
3. **Connect via USB:** Connect the device to your machine. On the device, a dialog will appear: **"Allow USB debugging?"** Tap **Allow** (optionally check "Always allow from this computer").

Verify the connection:

```bash
adb devices
```

You should see your device listed as `device` (not `unauthorized`).

### Option B: Android Emulator

Clawperator manages the Android emulator lifecycle. No manual AVD setup is required.

Requirements: `adb`, `emulator`, `sdkmanager`, `avdmanager` in `PATH`.

Provision the emulator:

```bash
clawperator provision emulator --output json
```

This command reuses a running supported emulator, starts a stopped supported AVD, or creates a new AVD with the default profile.

The default supported emulator profile is:

- Android API level `35`
- Google Play system image
- ABI `arm64-v8a`
- device profile `pixel_7`
- AVD name `clawperator-pixel`

You can inspect configured AVDs at any time:

```bash
clawperator emulator list --output json
clawperator emulator inspect clawperator-pixel --output json
```

If both a physical device and an emulator are connected, you will need to pass `--device-id ` to later commands.
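
When more than one device is attached, it can help to list only the serials that are ready for use before choosing a `--device-id` value. The following is a minimal sketch, assuming the standard tab-separated `adb devices` output; `ready_serials` is a hypothetical helper name and the serials in the sample are made up.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the Clawperator CLI.
# Reads `adb devices` output on stdin and prints the serial of every
# entry whose state is exactly "device" (unauthorized and offline
# entries, and the header line, are skipped).
ready_serials() {
  awk '$2 == "device" { print $1 }'
}

# Sample input shaped like `adb devices` output (serials are made up).
sample=$'List of devices attached\nemulator-5554\tdevice\nR58M123ABC\tunauthorized\n'
printf '%s' "$sample" | ready_serials   # prints: emulator-5554
```

In practice the function would be fed from the real tool, for example `adb devices | ready_serials`, and one of the printed serials would then be passed as the device ID.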
--- ## Step 3 - Install the Clawperator Operator Android app Use the canonical install command to install the [Clawperator Operator Android app](android-operator-apk.md) and grant required permissions in one step: ```bash clawperator operator setup --apk ~/.clawperator/downloads/operator.apk ``` If you have multiple devices connected, specify the target device: ```bash clawperator operator setup \ --apk ~/.clawperator/downloads/operator.apk \ --device-id ``` For local debug builds, specify the receiver package: ```bash clawperator operator setup \ --apk ~/.clawperator/downloads/operator-debug.apk \ --receiver-package com.clawperator.operator.dev ``` This command runs three phases in order: 1. Installs the APK onto the device via `adb`. 2. Grants required device permissions (accessibility service, notification listener). 3. Verifies the package is accessible after install. The command fails with a structured error if any phase fails. The error includes which phase failed and why, so agents and users can diagnose and recover. > Do not use raw `adb install` for normal setup. It copies the APK but leaves the device in a partial state without required permissions. Use `clawperator operator setup` instead. > Always use `clawperator operator setup` for setup. `clawperator operator install` remains an alias. Only run `clawperator grant-device-permissions` after the Operator APK crashes and Android revokes the accessibility or notification permissions. --- ## Step 4 - Verify Setup Run the diagnostic check: ```bash clawperator doctor ``` A fully configured device will show all checks passing. 
Common warnings:

| Warning | Fix |
| :--- | :--- |
| `DEVICE_UNAUTHORIZED` | Tap "Allow" on the device USB debugging dialog |
| `RECEIVER_NOT_INSTALLED` | Complete Step 3 (run `clawperator operator setup`) |
| `DEVICE_ACCESSIBILITY_NOT_RUNNING` | If the Operator APK crashed after setup, run `clawperator grant-device-permissions` to restore the revoked permissions |
| `DEVICE_DEV_OPTIONS_DISABLED` | Enable Developer options (physical device only) |
| `DEVICE_USB_DEBUGGING_DISABLED` | Enable USB debugging (physical device only) |

---

## Step 5 - Run Your First Command

Observe the current UI state:

```bash
clawperator observe snapshot --device-id
```

Open an app:

```bash
clawperator action open-app \
  --app com.android.settings \
  --device-id \
  --receiver-package com.clawperator.operator
```

> Use `com.clawperator.operator` for the release [Clawperator Operator Android app](android-operator-apk.md), `com.clawperator.operator.dev` for the local debug build.

Before running real automations, make sure the Android apps the user wants Clawperator to operate are installed, signed in, and already configured on the device or emulator.

---

## Keeping the Device Ready

For reliable automation:

- Keep the device **screen unlocked** (set screen timeout to maximum or "Never" in Display settings)
- Keep the device **plugged in** (charging)
- Keep the Clawperator **Accessibility Service enabled**
- For physical devices: keep **USB Debugging enabled**

---

## Troubleshooting

See [Troubleshooting the Operator App](https://docs.clawperator.com/troubleshooting/troubleshooting/) for common issues.

For environment checks: `clawperator doctor --output pretty`

Verify the installed CLI and [Clawperator Operator Android app](android-operator-apk.md) pair explicitly:

```bash
clawperator version --check-compat --receiver-package com.clawperator.operator
```

---

# OpenClaw First Run

This guide is the first successful OpenClaw + Clawperator path.
It is written for agents and technical operators who want one clear outcome: 1. install Clawperator from the public website 2. prepare an Android device or emulator 3. install skills 4. verify the runtime 5. run a real skill and return a result If the user says something like: > "clawperator.com looks interesting - set this up for me, and tell me the battery % on my home battery" this is the guide an agent should follow. ## What this guide assumes - OpenClaw is running on a Mac mini or similar host - the host can access `https://clawperator.com` and `https://docs.clawperator.com` - one Android actuator environment will be used: - a physical Android device connected over `adb` - or a local Android emulator provisioned through Clawperator - the user is responsible for app installation, login, and app configuration on that Android device Clawperator operates the Android UI. It does not create accounts, sign into apps, or complete first-run app setup on behalf of the user. ## Trust boundary The first-time USB debugging authorization prompt on a physical Android device still requires a human tap on the device. 
Everything else in this guide is intended to be automatable by an agent once: - the physical device has already authorized this host for `adb` - or the emulator path is available and the Android SDK tools are installed ## Step 1 - Install Clawperator from the public website Start from the public install command shown on `clawperator.com`: ```bash curl -fsSL https://clawperator.com/install.sh | bash ``` The installer: - installs the `clawperator` CLI - downloads the latest stable [Clawperator Operator Android app](android-operator-apk.md) package - runs `clawperator doctor` - installs or upgrades the [Clawperator Operator Android app](android-operator-apk.md) when the Android device is present and ready - runs `clawperator skills install` - writes `CLAWPERATOR_SKILLS_REGISTRY` into the shell profile when skills install succeeds Verify the CLI is available: ```bash clawperator --version ``` ## Step 2 - Choose the Android environment Clawperator can run against either a physical Android device or a local Android emulator. ### Option A - Physical Android device This is the preferred path for compatibility and long-running reliability. Verify that a device is connected: ```bash adb devices clawperator devices ``` If the device appears as `unauthorized`, the first-time USB debugging prompt must be accepted by a human on the Android device before continuing. ### Option B - Android emulator If no physical device is available, provision an emulator after the CLI is installed: ```bash clawperator provision emulator --output json ``` This command reuses a running supported emulator, starts a stopped supported AVD, or creates a new Google Play emulator if none exist. Capture the returned serial and use it as `--device-id` in later commands. ## Step 3 - Verify or install the Clawperator Operator Android app The [Clawperator Operator Android app](android-operator-apk.md) must be installed on the Android device. The installer usually handles this automatically. 
If needed, install it with the canonical command: ```bash clawperator operator setup --apk ~/.clawperator/downloads/operator.apk --device-id ``` If only one device is connected, `--device-id` can be omitted: ```bash clawperator operator setup --apk ~/.clawperator/downloads/operator.apk ``` For a local debug build of the [Clawperator Operator Android app](android-operator-apk.md): ```bash clawperator operator setup \ --apk ~/.clawperator/downloads/operator-debug.apk \ --receiver-package com.clawperator.operator.dev ``` This command is the normal path for agents and operators. It installs the APK and grants required permissions (accessibility, notification listener) in one step. `clawperator operator install` remains a compatibility alias. Do not split initial setup into separate `adb install` and `grant-device-permissions` calls. Reserve `clawperator grant-device-permissions` for recovery after the Operator APK crashes and Android revokes the previously granted permissions. ## Step 5 - Run doctor Before OpenClaw relies on the runtime, verify the environment: ```bash clawperator doctor --device-id --output pretty ``` Expected checks include: - `adb` healthy - device connected and reachable - [Clawperator Operator Android app](android-operator-apk.md) installed - handshake successful If doctor does not pass, stop and fix the environment before moving on. ## Step 6 - Verify skills installation The installer should have run `clawperator skills install`, but agents should verify it. 
Check the registry env var: ```bash echo $CLAWPERATOR_SKILLS_REGISTRY ``` The expected path is: ```bash ~/.clawperator/skills/skills/skills-registry.json ``` If the env var is empty, set it explicitly: ```bash export CLAWPERATOR_SKILLS_REGISTRY="$HOME/.clawperator/skills/skills/skills-registry.json" ``` If skills were not installed, run: ```bash clawperator skills install ``` Confirm that skills are visible: ```bash clawperator skills list ``` ## Step 7 - Run one safe verification skill Before attempting a user-facing app skill, run a known-safe verification skill: ```bash clawperator skills run com.android.settings.capture-overview --device-id ``` If this succeeds, the end-to-end OpenClaw + Clawperator + Android path is working. ## Step 8 - Find the SolaX skill The public skills bundle includes a SolaX battery skill: - skill ID: `com.solaxcloud.starter.get-battery` - Android app ID: `com.solaxcloud.starter` - purpose: read current SolaX Cloud battery percentage Agents should confirm that the skill is present: ```bash clawperator skills search --app com.solaxcloud.starter clawperator skills get com.solaxcloud.starter.get-battery ``` ## Step 9 - Understand the preconditions for the SolaX run For this first-run path, the user must already have: - the SolaX Cloud Android app installed on the Android device - the user signed into the SolaX app - the app in a usable state for automation That means the agent does **not** need SolaX cloud credentials. The agent still needs the Android runtime to be ready and the correct device selected. If the user also wants Google Home or another app later, the same rule applies: the user owns the app install and login state, while Clawperator only operates the UI. 
## Step 10 - Run the SolaX skill Run the public SolaX battery skill: ```bash clawperator skills run com.solaxcloud.starter.get-battery --device-id ``` The expected success shape is a line like: ```text ✅ SolaX battery level: 61.0% ``` At this point the agent can answer the user directly with the current battery percentage. ## Exact first-run command sequence For agents that need one deterministic sequence, the normal path is: ```bash curl -fsSL https://clawperator.com/install.sh | bash clawperator --version clawperator devices clawperator operator setup --apk ~/.clawperator/downloads/operator.apk --device-id clawperator doctor --device-id --output pretty echo $CLAWPERATOR_SKILLS_REGISTRY clawperator skills list clawperator skills run com.android.settings.capture-overview --device-id clawperator skills search --app com.solaxcloud.starter clawperator skills get com.solaxcloud.starter.get-battery clawperator skills run com.solaxcloud.starter.get-battery --device-id ``` If no physical device is present, insert this after install: ```bash clawperator provision emulator --output json ``` Use the returned emulator serial as ``. 
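
Because each command in the sequence depends on the previous one succeeding, an agent may want to stop at the first failure rather than continue. The sketch below shows one way to do that. `run_steps` is a hypothetical wrapper, not a Clawperator command, and the demo uses harmless placeholder commands instead of the real sequence.

```bash
#!/usr/bin/env bash
# Hypothetical fail-fast runner, not part of the Clawperator CLI.
# Runs each argument as a shell command, in order, and stops at the
# first one that exits non-zero.
run_steps() {
  local step
  for step in "$@"; do
    echo "+ $step"
    if ! bash -c "$step"; then
      echo "stopped at: $step" >&2
      return 1
    fi
  done
}

# Demo with harmless commands: the third step never runs because the
# second one fails.
run_steps "echo first" "false" "echo never" || echo "sequence aborted"
```

For a real run, the arguments would be the commands listed above with the chosen device serial filled in, one quoted string per command.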
## What agents should document during first run When an agent is asked to "set this up for me", it should record: - the exact install command used - whether a physical Android device or emulator was chosen - the selected device serial - whether the [Clawperator Operator Android app](android-operator-apk.md) was installed automatically or manually - whether `operator setup` succeeded (or which phase failed) - whether `doctor` passed - whether skills installed cleanly - the value of `CLAWPERATOR_SKILLS_REGISTRY` - the exact skill command used - the final returned battery percentage - every ambiguity, manual step, or failure encountered ## Common failure points ### No Android device present If `adb devices` shows no device: - stop if the user expected a physical device and none is available - otherwise use the emulator path with `clawperator provision emulator` ### Device unauthorized If `adb devices` shows `unauthorized`, the Android device has not approved this host for USB debugging yet. A human must approve the prompt on the device before the agent can continue. ### Skills registry missing If `clawperator skills list` fails or the registry env var is empty: ```bash export CLAWPERATOR_SKILLS_REGISTRY="$HOME/.clawperator/skills/skills/skills-registry.json" clawperator skills install ``` ### SolaX app not ready If the SolaX skill fails because the app is logged out, showing onboarding, or blocked by an unexpected screen, that is not a Clawperator install problem. It means the Android device is not yet prepared by the user for that app workflow. ### Multiple devices connected Always pass `--device-id ` if more than one device is visible. This is especially important after emulator provisioning, because both a physical device and an emulator may be connected at the same time. ## Summary The first successful OpenClaw path is: 1. install Clawperator from `clawperator.com` 2. choose a physical Android device or provision an emulator 3. 
verify the [Clawperator Operator Android app](android-operator-apk.md) and permissions 4. confirm `doctor` passes 5. confirm skills are installed 6. run a safe verification skill 7. run `com.solaxcloud.starter.get-battery` 8. return the battery percentage to the user --- # Project Overview ## Mission Clawperator is a deterministic actuator tool for LLM-driven Android automation. It provides a stable execution layer that allows agents to perform actions on a dedicated Android device on behalf of a user. Clawperator can operate either a physical Android device or a local Android emulator. The emulator option is intended primarily for development and testing. Some applications may detect or block emulator environments. In both environments, the user owns app installation, sign-in, and app configuration. Clawperator only operates the device UI. This approach ensures that the user's primary phone (e.g., an iPhone) remains undisturbed while the actuator device handles automation tasks. Physical device setups have virtually no hardware requirements - any cheap or old Android device works. ## Core Philosophy: The Brain and the Hand Clawperator is designed as the execution "hand" for an LLM "brain": 1. **The Agent (The Brain):** An external LLM (e.g., OpenClaw) that owns reasoning, planning, and decision-making. It supervises the automation and decides which actions to take based on the user's intent. 2. **Clawperator (The Hand):** A deterministic actuator tool. It provides the reliable "fingers" to observe UI state (`snapshot_ui`), perform precise interactions (taps, scrolls), and report structured results back to the brain. **Clawperator is intentionally not intelligent.** It does not perform autonomous multi-step reasoning or agentic planning. It follows instructions and provides high-fidelity feedback so the Agent can drive the process. 
## Architecture The system consists of two primary layers: - **Android Runtime (`apps/android`):** An Android application that leverages Accessibility APIs to inspect the UI tree and perform actions (taps, scrolls, text entry, and system hard-keys like `back`/`home`). It listens for commands via a broadcast receiver. - **Node Runtime/CLI (`apps/node`):** The primary interface for agents. It wraps `adb` interactions, validates execution payloads, dispatches commands to the Android device, parses canonical result envelopes from logs, and owns the full Android emulator lifecycle (discovery, provisioning, creation, and teardown). ## Website Surfaces This repository also contains two separate public website builds: - **Landing site (`sites/landing`):** The static Next.js site for `https://clawperator.com`. This is the marketing homepage and install/download entrypoint. - **Docs site (`sites/docs`):** The MkDocs site for `https://docs.clawperator.com`. This is the technical documentation surface. Keep them distinct when editing: - Landing-site root files such as `install.sh`, `robots.txt`, `llms.txt`, and `sitemap.xml` belong to `sites/landing/public/`. - Docs content should be changed in `docs/`, `apps/node/src/`, or `../clawperator-skills/docs/`, not directly in `sites/docs/docs/`. - Docs-site root files such as `robots.txt` and `llms.txt` live in `sites/docs/static/` and are copied into the built site by `./scripts/docs_build.sh`. - Both public sites deploy automatically to Cloudflare after changes are merged to `main`. ## The Role of Skills Skills are reusable templates for app-specific workflows (e.g., "get thermostat temperature" or "check grocery prices"). - **Canonical Home:** `../clawperator-skills` (a dedicated sibling repository). - **Project-local maintenance skills:** Repository-specific Codex workflows live in `.agents/skills/`. 
For example, `.agents/skills/docs-generate/` regenerates the MkDocs site content in `sites/docs/docs/` from the canonical doc sources. - **Nature of Skills:** Due to the dynamic nature of mobile apps (A/B tests, server-side flags, unexpected popups), skills are treated as **highly informed context** for the Agent rather than purely deterministic scripts. - **Agent Responsibility:** The Agent uses skill templates as a baseline, modifying them at runtime to handle personal configurations (variable substitution) or UI drift. ## Package Identifiers - **`com.clawperator.operator`**: The production/release package name. - **`com.clawperator.operator.dev`**: The local development package name (used when building from source). ## Safety & Privacy - **Full-fidelity results:** By default, result envelopes contain exactly what is on screen, including sensitive text. Agents should not forward raw results to long-term storage without user consent. - **Control:** The "Two-Handed" model ensures that agents can only execute within the safety bounds defined by the Clawperator runtime. - **Observability:** Agents use `snapshot_ui` (`hierarchy_xml`) and screenshots to "see" the device state. --- # Clawperator Operator Android app This is Clawperator's own Android app. It runs as a background service on the dedicated Android device and executes actions requested by the Node API. This app is intentionally distinct from the Android apps the user wants Clawperator to operate. Those user-installed Android apps are not part of Clawperator itself. See [Clawperator Terminology](terminology.md) for the full distinction. ## Application IDs The app is distributed in two variants, each with its own application ID: * **`com.clawperator.operator`**: The stable, release version. This is the default package used by the CLI and intended for most users and remote AI agents. * **`com.clawperator.operator.dev`**: The local debug version. This is used by developers building the APK from source locally. 
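
As a small illustration of how a script might pick the right application ID, the sketch below maps a build variant to one of the two IDs listed above. The function name `receiver_package` and the variant labels `release` and `debug` are assumptions made for this example, not CLI features.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the Clawperator CLI.
# Maps a build variant label to the application ID listed above.
receiver_package() {
  case "$1" in
    release) echo "com.clawperator.operator" ;;
    debug)   echo "com.clawperator.operator.dev" ;;
    *)       echo "unknown variant: $1" >&2; return 1 ;;
  esac
}

receiver_package debug   # prints: com.clawperator.operator.dev
```

The printed value is what a caller would supply to `--receiver-package` when it is not using the default release package.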
*Note: The CLI communicates with `com.clawperator.operator` by default. If you are using a debug build, you must pass the `--receiver-package com.clawperator.operator.dev` flag to CLI commands.*

## Installation

### Prerequisites

- Android device with Developer Options and USB Debugging enabled.
- `adb` installed on your host machine.

### Automatic Installation

The easiest way to install is via the one-line installer:

```bash
curl -fsSL https://clawperator.com/install.sh | bash
```

This downloads the latest app package and installs it to your connected device.

### Manual Installation

To install manually:

1. Download the latest app package from [clawperator.com/operator.apk](https://clawperator.com/operator.apk).
2. Connect your device via USB.
3. Run the canonical install command:

```bash
clawperator operator setup --apk operator.apk
```

This command installs the APK and grants all required permissions in one step. See [First-Time Setup](first-time-setup.md) for full details.

> Do not use raw `adb install` for normal setup. It installs the APK but does not grant the permissions required for the app to operate.

### Historical Versions

Historical versions can be downloaded from `downloads.clawperator.com`. The URL structure follows the versioning pattern:

- `https://downloads.clawperator.com/operator/v<version>/operator-v<version>.apk`

Example for v0.3.0:

- [https://downloads.clawperator.com/operator/v0.3.0/operator-v0.3.0.apk](https://downloads.clawperator.com/operator/v0.3.0/operator-v0.3.0.apk)

## Required Permissions

The app requires three permissions to operate:

```bash
clawperator operator setup --apk <path-to-apk>
```

This command grants all three permissions automatically during install.
The permissions are:

| Permission | ADB mechanism | Purpose |
|---|---|---|
| Accessibility service | `settings put secure enabled_accessibility_services` | Read UI tree and dispatch gestures |
| Post notifications (`POST_NOTIFICATIONS`) | `pm grant <package> android.permission.POST_NOTIFICATIONS` | Show foreground service status notification (Android 13+) |
| Notification listener | `settings put secure enabled_notification_listeners` | Observe notifications from all apps on the device |

The accessibility service and notification listener permissions are enabled by appending the Clawperator service component to the relevant secure setting, matching the format Android uses internally. The `POST_NOTIFICATIONS` grant is a standard runtime permission grant; on Android 12 and below it is silently skipped.

If the Operator APK crashes after setup and Android revokes the granted permissions, use the remediation command:

```bash
clawperator grant-device-permissions
```

This re-grants the same permissions without reinstalling the APK. Do not use it for normal setup - agents should use `clawperator operator setup` for the initial install path every time. `clawperator operator install` remains an alias.

## Logging and Debugging

The app enables **debug logging in release builds** by default. This provides maximum visibility into UI events, accessibility node trees, and command execution states, so remote AI agents can observe internal state and troubleshoot issues directly through `adb logcat` even when using production APKs.

To view logs:

```bash
adb logcat | grep -E '(Operator|Clawperator)'
```

## Troubleshooting

- **Device not found:** Run `adb devices` to ensure your device is recognized.
- **Permission denied:** Ensure you have accepted the USB debugging prompt on your device.
- **Handshake timeout:** Ensure the Clawperator Accessibility Service is enabled in your device's Accessibility settings.
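When an agent captures `adb logcat` output itself instead of piping it through `grep`, the filter from the Logging and Debugging section can be applied in code. The sketch below mirrors the `(Operator|Clawperator)` pattern shown there. The helper name `filterOperatorLogLines` and the sample log lines are illustrative assumptions, not part of the Clawperator codebase.

```javascript
// Hypothetical helper: keeps only the lines that the documented
// `grep -E '(Operator|Clawperator)'` filter would keep.
const OPERATOR_LOG_PATTERN = /(Operator|Clawperator)/;

function filterOperatorLogLines(logText) {
  return logText
    .split(/\r?\n/)
    .filter((line) => OPERATOR_LOG_PATTERN.test(line));
}

// Sample input is made up for illustration; real logcat lines will differ.
const sampleLog = [
  "I/ActivityManager: Start proc 1234",
  "D/Clawperator: command received",
  "D/Operator: accessibility node tree updated",
].join("\n");

console.log(filterOperatorLogLines(sampleLog));
```

Like the `grep` pattern, the match is case-sensitive and works on whole lines, so multi-line log entries are not reassembled.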
--- # Architecture # Clawperator Architecture Clawperator is a two-layer system: an Android runtime that executes actions on the device, and a Node CLI/API that agents talk to. The Android runtime runs on an Android device which may be: * a physical Android device * a local Android emulator Emulator lifecycle belongs to the Node layer, not the installer. User accounts, app installs, and app configuration remain the user's responsibility on either environment. ## Layers ### Android Runtime (`apps/android`) The Android app runs as a persistent background service on the dedicated actuator device. It uses the Android Accessibility API to: - Inspect the live UI tree of any foreground application - Perform precise interactions: taps, scrolls, text entry - Listen for commands via a broadcast receiver (`ACTION_AGENT_COMMAND`) - Emit structured results via logcat using the canonical `[Clawperator-Result]` envelope The app ships in two variants: - `com.clawperator.operator` - release build of the [Clawperator Operator Android app](../getting-started/android-operator-apk.md), used by default - `com.clawperator.operator.dev` - local debug build of the [Clawperator Operator Android app](../getting-started/android-operator-apk.md), used when building from source ### Node CLI/API (`apps/node`) The Node package is the agent-facing interface. It: - Wraps all `adb` interactions so agents do not need to issue raw shell commands - Owns Android emulator discovery, creation, lifecycle, and provisioning - Validates execution payloads before dispatch - Broadcasts commands to the Android receiver via `adb shell am broadcast` - Reads and parses the `[Clawperator-Result]` envelope from logcat - Exposes an HTTP/SSE server (`clawperator serve`) for remote agent access - Provides `clawperator doctor` for environment diagnostics Android emulator support is intentionally implemented in the Node layer. `install.sh` remains a bootstrap script and does not manage emulator lifecycle. 
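As an illustration of the result-parsing responsibility listed above, the following is a minimal sketch of locating a `[Clawperator-Result]` marker in captured log text and parsing the JSON that follows it. This document does not specify the exact logcat line layout, so the single-line "marker followed by one JSON object" layout, the function name `extractResultEnvelope`, and the sample line are all assumptions.

```javascript
const RESULT_MARKER = "[Clawperator-Result]";

// Assumption: the marker and one complete JSON object share a single log line.
// The real Node runtime may read the envelope differently.
function extractResultEnvelope(logText) {
  for (const line of logText.split(/\r?\n/)) {
    const markerAt = line.indexOf(RESULT_MARKER);
    if (markerAt === -1) continue;
    const jsonStart = line.indexOf("{", markerAt + RESULT_MARKER.length);
    if (jsonStart === -1) continue;
    try {
      return JSON.parse(line.slice(jsonStart));
    } catch {
      // Text after the marker was not valid JSON; keep scanning later lines.
    }
  }
  return null; // No parsable envelope found.
}

// Made-up sample line for illustration only.
const sampleLine =
  'D/Operator: [Clawperator-Result] {"commandId":"cmd-1","status":"success","stepResults":[]}';
console.log(extractResultEnvelope(sampleLine));
```

Agents do not normally need this: the Node CLI/API already returns the parsed envelope, and the sketch only shows the shape of the problem it solves.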
## Data Flow

```
Agent
  |
  | CLI invocation or HTTP POST
  v
Node CLI/API (apps/node)
  |
  | adb shell am broadcast ACTION_AGENT_COMMAND
  v
Android Receiver (apps/android)
  |
  | Accessibility API actions
  v
Device UI
  |
  | [Clawperator-Result] envelope via logcat
  v
Node CLI/API
  |
  | Structured result
  v
Agent
```

## Emulator Provisioning Flow

When the agent needs an emulator instead of a physical device, the Node layer follows a deterministic reuse-first flow:

1. Inspect running emulators from `adb devices`.
2. Resolve running emulator names with `adb -s <serial> emu avd name`.
3. Reuse a running supported emulator if one exists.
4. Otherwise inspect configured AVDs from `~/.android/avd/`.
5. Start a stopped supported AVD if one exists.
6. Otherwise install the default system image and create a new AVD.
7. Start the emulator detached with `-no-snapshot-load -no-boot-anim`.
8. Wait for adb registration and Android boot completion.

The default supported profile is Android API `35`, Google Play, ABI `arm64-v8a`, device profile `pixel_7`, and AVD name `clawperator-pixel`.

## Android Build Modules

```
apps/android/
  app/              - Clawperator Operator Android app (com.clawperator.operator)
  app-conformance/  - Conformance test APK for execution layer testing
  shared/           - Shared Android modules (action engine, contracts, etc.)
```

## Conformance APK

`apps/android/app-conformance` is a dedicated test app with a deterministic, stable UI. It exists to test Clawperator's execution layer without relying on third-party apps.

## Website Surfaces

Two separate public sites are maintained in this repository:

- `sites/landing/` - Next.js static site for `https://clawperator.com` (marketing, installer)
- `sites/docs/` - MkDocs site for `https://docs.clawperator.com` (technical docs)

Both deploy automatically to Cloudflare when changes merge to `main`.

## Skills

Skills are packaged app-specific automation recipes maintained in a sibling repository (`../clawperator-skills`).
The Node CLI provides discovery, metadata lookup, and a convenience run wrapper. Skills are standalone and can also be invoked directly without the Node CLI. ## Key Design Constraints - **Deterministic execution:** one broadcast in, one `[Clawperator-Result]` envelope out - **Single-flight lock:** only one execution per device at a time - **No autonomous planning in the runtime:** Clawperator executes commands; reasoning stays in the agent - **No direct adb required for agents:** all routine automation goes through the Node CLI/API --- # For AI Agents # Clawperator Node API - Agent Guide Clawperator provides a deterministic execution layer for LLM agents to control Android devices. This guide covers the CLI and HTTP API contracts. ## Concepts - **Execution**: A payload of one or more actions dispatched to the device. Every execution produces exactly one `[Clawperator-Result]` envelope. - **Action**: A single step (`open_app`, `click`, `read_text`, etc.) within an execution. - **Snapshot**: A captured UI hierarchy dump (`hierarchy_xml`) for observing device state. - **Skill**: A packaged recipe from the skills repo, compiled into an execution payload. 
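To make the Execution and Action concepts concrete, here is a minimal sketch of assembling an execution from a list of actions. The field names and the `timeoutMs` range (1,000 to 120,000 ms) are taken from the Execution Payload section of this guide. The helper name `buildExecution`, its 30-second default, and the sample IDs are illustrative assumptions.

```javascript
const MIN_TIMEOUT_MS = 1000;
const MAX_TIMEOUT_MS = 120000;

// Hypothetical client-side helper; the Node runtime performs its own validation.
function buildExecution({ commandId, taskId, source, actions, timeoutMs = 30000 }) {
  if (!Array.isArray(actions) || actions.length === 0) {
    throw new Error("an execution needs one or more actions");
  }
  if (!Number.isFinite(timeoutMs) || timeoutMs < MIN_TIMEOUT_MS || timeoutMs > MAX_TIMEOUT_MS) {
    throw new RangeError(`timeoutMs must be within ${MIN_TIMEOUT_MS}-${MAX_TIMEOUT_MS} ms`);
  }
  return {
    commandId,
    taskId,
    source,
    expectedFormat: "android-ui-automator", // required on every execution
    timeoutMs,
    actions,
  };
}

const execution = buildExecution({
  commandId: "cmd-demo-1",
  taskId: "task-demo-1",
  source: "my-agent",
  actions: [
    { id: "open1", type: "open_app", params: { applicationId: "com.example.app" } },
    { id: "snap1", type: "snapshot_ui" },
  ],
});
console.log(JSON.stringify(execution, null, 2));
```

Checking the timeout range before dispatch only saves a round trip; an out-of-range value would otherwise be rejected with `EXECUTION_VALIDATION_FAILED`.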
## CLI Reference | Command | Description | | :--- | :--- | | `operator setup --apk ` | Install the Operator APK and grant required device permissions (canonical setup command, with `operator install` kept as an alias) | | `devices` | List connected Android serials and states | | `emulator list` | List configured Android Virtual Devices with compatibility metadata | | `emulator inspect ` | Show normalized metadata for one Android Virtual Device | | `emulator create [--name ]` | Create the default supported Android emulator | | `emulator start ` | Start an existing Android Virtual Device and wait until boot completes | | `emulator stop ` | Stop a running Android emulator by AVD name | | `emulator delete ` | Delete an Android Virtual Device by name | | `emulator status` | List running Android emulators and boot state | | `emulator provision` | Reuse or create a supported Android emulator and return its ADB serial | | `provision emulator` | Alias of `emulator provision` | | `execute --execution ` | Run a full execution payload | | `observe snapshot` | Capture UI hierarchy dump (`hierarchy_xml`) | | `observe screenshot` | Capture device screen as PNG | | `action open-app --app ` | Open an application | | `action click --selector ` | Click a UI element | | `action read --selector ` | Read text from element | | `action type --selector --text ` | Type text | | `action wait --selector ` | Wait for element | | `skills list` | List available skills | | `skills get ` | Show skill metadata | | `skills search [--app ] [--intent ] [--keyword ]` | Search skills by app, intent, or keyword (at least one filter required) | | `skills compile-artifact --artifact ` | Compile skill to execution payload | | `skills run [--device-id ]` | Invoke a skill script (convenience wrapper) | | `skills install` | Clone skills repo to `~/.clawperator/skills/` | | `skills update [--ref ]` | Pull latest skills (optionally pin to a ref) | | `grant-device-permissions` | Re-grant Operator permissions only 
after an Operator APK crash causes Android to revoke them | | `serve` | Start HTTP/SSE server | | `doctor` | Run environment diagnostics | | `version` | Print the CLI version or check CLI / Clawperator Operator Android app compatibility | **Global options:** `--device-id `, `--receiver-package `, `--output `, `--format ` (alias for `--output`), `--timeout-ms `, `--verbose` For agent callers, `--output json` is the canonical output mode. `pretty` is for human inspection. Default receiver package: - release app package: `com.clawperator.operator` - local debug app package: pass `--receiver-package com.clawperator.operator.dev` Use subcommand help when the docs and the current CLI differ: ```bash clawperator observe snapshot --help clawperator skills sync --help clawperator doctor --help ``` Use `clawperator version --check-compat` before automation batches when the agent needs to verify that the installed [Clawperator Operator Android app](../getting-started/android-operator-apk.md) matches the CLI's supported `major.minor` version: ```bash clawperator version --check-compat --receiver-package com.clawperator.operator ``` The response includes the CLI version, detected [Clawperator Operator Android app](../getting-started/android-operator-apk.md) version, app `versionCode`, receiver package, compatibility verdict, and remediation guidance on mismatch. ## HTTP API (`clawperator serve`) Start with `clawperator serve [--port ] [--host ]`. Default: `127.0.0.1:3000`. > **Security:** The API is unauthenticated. Binds to localhost by default. Only use `--host 0.0.0.0` on trusted networks. | Endpoint | Description | | :--- | :--- | | `GET /devices` | Returns `{ ok: true, devices: [...] 
}` | | `GET /android/emulators` | Returns configured AVDs with compatibility metadata | | `GET /android/emulators/:name` | Returns normalized metadata for one AVD | | `GET /android/emulators/running` | Returns running emulator devices and boot state | | `POST /android/emulators/create` | Ensure the system image exists and create an AVD | | `POST /android/emulators/:name/start` | Start an AVD and return a booted emulator device | | `POST /android/emulators/:name/stop` | Stop a running emulator by AVD name | | `DELETE /android/emulators/:name` | Delete an AVD by name | | `POST /android/provision/emulator` | Reuse or create a supported emulator and return a booted device | | `POST /execute` | Body: `{"execution": , "deviceId": "...", "receiverPackage": "..."}` | | `POST /observe/snapshot` | Capture UI tree | | `POST /observe/screenshot` | Capture screenshot | | `GET /skills` | List skills. Query params: `?app=&intent=&keyword=` | | `GET /skills/:skillId` | Get skill metadata | | `POST /skills/:skillId/run` | Run skill. Body: `{"deviceId": "...", "args": [...]}` | | `GET /events` | SSE stream: `clawperator:result`, `clawperator:execution`, `heartbeat` | See `apps/node/examples/basic-api-usage.js` for a complete SSE + REST example. ## Android Emulator Support Note: Clawperator does not configure accounts, install the Android apps the user wants Clawperator to operate, or complete first-run app setup inside the emulator. Agents should assume the emulator already contains the logged-in Android apps required for automation. Clawperator supports Android emulator provisioning as an alternative runtime to a physical Android device. Emulator lifecycle management lives in the Node CLI and HTTP API, not in `install.sh`. Provisioning policy is deterministic: 1. Reuse a running supported emulator. 2. Start a stopped supported AVD. 3. Create a new supported AVD if none exist. 
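The three-step policy above can be sketched as a pure decision function. The input shape (`name`, `supported`, `running`) and the function name `chooseProvisioningStep` are simplifying assumptions made for illustration; the actual Node implementation derives compatibility from AVD metadata and may differ.

```javascript
const DEFAULT_AVD_NAME = "clawperator-pixel";

// avds: array of { name, supported, running } - a hypothetical, simplified view.
function chooseProvisioningStep(avds) {
  const supported = avds.filter((avd) => avd.supported);

  // 1. Reuse a running supported emulator.
  const running = supported.find((avd) => avd.running);
  if (running) return { step: "reuse", avdName: running.name };

  // 2. Start a stopped supported AVD.
  const stopped = supported.find((avd) => !avd.running);
  if (stopped) return { step: "start", avdName: stopped.name };

  // 3. Create a new supported AVD if none exist.
  return { step: "create", avdName: DEFAULT_AVD_NAME };
}

console.log(chooseProvisioningStep([
  { name: "old-api-30", supported: false, running: true },
  { name: "clawperator-pixel", supported: true, running: false },
]));
```

Unsupported AVDs are ignored even when they are already running, which is why the example starts the stopped supported AVD rather than reusing the running one.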
The default supported emulator profile is:

- Android API level `35`
- Google Play system image
- ABI `arm64-v8a`
- device profile `pixel_7`
- default AVD name `clawperator-pixel`

Compatibility is determined from AVD metadata under:

- `~/.android/avd/<name>.avd/config.ini`
- `~/.android/avd/<name>.ini`

The implementation normalizes and evaluates these fields:

- `PlayStore.enabled`
- `abi.type`
- `image.sysdir.1`
- `hw.device.name`

Inspect one AVD:

```bash
clawperator emulator inspect clawperator-pixel --output json
```

Provision a ready emulator:

```bash
clawperator provision emulator --output json
```

Typical provisioning result (CLI output):

```json
{
  "type": "emulator",
  "avdName": "clawperator-pixel",
  "serial": "emulator-5554",
  "booted": true,
  "created": false,
  "started": false,
  "reused": true
}
```

HTTP response from `POST /android/provision/emulator` wraps the same payload with `"ok": true`.

If both a physical device and an emulator are connected, continue to pass `--device-id <serial>` to execution and observe commands so targeting stays explicit.

## Execution Payload

Every execution requires `expectedFormat: "android-ui-automator"`.

```json
{
  "commandId": "unique-id-123",
  "taskId": "task-456",
  "source": "my-agent",
  "expectedFormat": "android-ui-automator",
  "timeoutMs": 60000,
  "actions": [
    { "id": "step1", "type": "open_app", "params": { "applicationId": "com.example.app" } },
    { "id": "step2", "type": "enter_text", "params": { "matcher": { "resourceId": "com.example.app:id/search_input" }, "text": "hello", "submit": true } },
    { "id": "step3", "type": "click", "params": { "matcher": { "textEquals": "Login" }, "clickType": "default" } },
    { "id": "step4", "type": "snapshot_ui" }
  ]
}
```

**Execution timeout limit:** `timeoutMs` is schema-validated. The allowed range is 1,000-120,000 ms (1 second to 2 minutes). Submitting a value outside this range causes `EXECUTION_VALIDATION_FAILED` - the execution is rejected before any action runs.
Operations that require longer running time must be split across multiple execution payloads. For install or download flows, use `wait_for_node` polling within the 120-second window rather than a single long sleep. **Result envelope:** Exactly one `[Clawperator-Result]` JSON block is emitted to logcat on completion. Node reads and returns it. See the Result Envelope section for the full shape and per-action `data` contents. ## NodeMatcher Reference A NodeMatcher identifies a single UI element for action targeting. Used as a required param in `click`, `enter_text`, `read_text`, `wait_for_node`, and `scroll_and_click`. All specified fields are combined with AND semantics: every specified field must match the target element. At least one field is required per matcher. | Field | Type | Description | | :--- | :--- | :--- | | `resourceId` | `string` | Developer-assigned element ID. Format: `"com.example.app:id/element_name"`. Most stable - prefer over all others when present. | | `contentDescEquals` | `string` | Exact match on accessibility content description. Use for icon buttons with no visible text. | | `textEquals` | `string` | Exact match on visible text label. Fragile for server-driven or localized content. | | `textContains` | `string` | Substring match on visible text. Use when full text is dynamic or may be truncated. | | `contentDescContains` | `string` | Substring match on accessibility label. Fallback for partial or dynamic accessibility labels. | | `role` | `string` | Matches by Clawperator semantic role name (`button`, `textfield`, `text`, `switch`, `checkbox`, `image`, `listitem`, `toolbar`, `tab`). Derived from runtime role inference, not the raw UIAutomator `class` string. Generally low selectivity - many elements share a role. Use as a secondary constraint for most roles. 
**Exception:** `role: "textfield"` targets the inferred text-input role, which is derived from common Android text-input widgets (for example class names containing `EditText`, `TextInputEditText`, or `AutoCompleteTextView`). It is the correct primary selector for text input fields in apps that do not assign `resource-id` to their inputs (which includes many production apps such as Google Play Store). In those apps `role: "textfield"` may be the only reliable way to target the active text input. | **Selector priority (most to least stable):** `resourceId` > `contentDescEquals` > `textEquals` > `textContains` > `contentDescContains` > `role` Combine fields to increase specificity when a single field is ambiguous: ```json { "resourceId": "com.example.app:id/submit_btn", "textEquals": "Submit" } ``` ## Action Reference ### Action types and params | Action | Required params | Optional params | | :--- | :--- | :--- | | `open_app` | `applicationId: string` | - | | `open_uri` | `uri: string` | `retry: object` | | `close_app` | `applicationId: string` | - | | `click` | `matcher: NodeMatcher` | `clickType: "default" \| "long_click" \| "focus"` (default: `"default"`) | | `enter_text` | `matcher: NodeMatcher`, `text: string` | `submit: boolean` (default: `false`), `clear: boolean` (accepted by Node contract, currently ignored by Android runtime) | | `read_text` | `matcher: NodeMatcher` | `validator: "temperature"` (only supported validator today), `retry: object` | | `wait_for_node` | `matcher: NodeMatcher` | `retry: object` - controls polling attempts and backoff delays (see `retry` object shape below). There is no per-action `timeoutMs`; the outer execution `timeoutMs` is the only wall-clock limit. 
| | `snapshot_ui` | - | `retry: object` | | `take_screenshot` | - | `path: string`, `retry: object` | | `sleep` | `durationMs: number` | - | | `scroll_and_click` | `target: NodeMatcher` | `container: NodeMatcher`, `direction: "down" \| "up" \| "left" \| "right"` (default: `"down"`), `maxSwipes: number` (default: `10`, range: 1-50), `distanceRatio: number` (default: `0.7`, range: 0-1), `settleDelayMs: number` (default: `250`, range: 0-10000), `findFirstScrollableChild: boolean` (default: `true`), `clickAfter: boolean` (default: `true`), `scrollRetry: object` (default preset: `maxAttempts=4`, `initialDelayMs=400`, `maxDelayMs=2000`, `backoffMultiplier=2.0`, `jitterRatio=0.15`), `clickRetry: object` (default preset: `maxAttempts=5`, `initialDelayMs=500`, `maxDelayMs=3000`, `backoffMultiplier=2.0`, `jitterRatio=0.15`) | | `scroll` | - | `container: NodeMatcher` (default: auto-detect first scrollable), `direction: "down" \| "up" \| "left" \| "right"` (default: `"down"` - reveals content further down, finger swipes up), `distanceRatio: number` (default: `0.7`, range: 0-1), `settleDelayMs: number` (default: `250`, range: 0-10000), `findFirstScrollableChild: boolean` (default: `true`), `retry: object` (default: no retry - see scroll behavior note) | | `scroll_until` | - | `container: NodeMatcher` (default: auto-detect), `direction: "down" \| "up" \| "left" \| "right"` (default: `"down"`), `distanceRatio: number` (default: `0.7`, range: 0-1), `settleDelayMs: number` (default: `250`, range: 0-10000), `maxScrolls: number` (default: `20`, range: 1-200), `maxDurationMs: number` (default: `10000`, range: 0-120000), `noPositionChangeThreshold: number` (default: `3`, range: 1-20), `findFirstScrollableChild: boolean` (default: `true`) | | `press_key` | `key: "back" \| "home" \| "recents"` | - | ### CLI-to-action-type mapping | CLI command | Payload action type | | :--- | :--- | | `action type --selector --text ` | `enter_text` | | `action click --selector ` | `click` | | `action 
read --selector ` | `read_text` |
| `action wait --selector ` | `wait_for_node` |
| `action open-app --app ` | `open_app` |
| `action open-uri --uri ` | `open_uri` |
| `action press-key --key ` | `press_key` |
| `observe snapshot` | `snapshot_ui` |

### Action behavior notes

- `sleep.durationMs` must be in the range `0`-`120000` ms. Values above the cap are rejected with `EXECUTION_VALIDATION_FAILED` before dispatch (consistent with the execution `timeoutMs` validation). Sleep time also counts against the outer execution `timeoutMs` budget.

**`retry` object shape:** All action types that accept a `retry` param use the same object schema:

```json
{
  "maxAttempts": 3,
  "initialDelayMs": 500,
  "maxDelayMs": 3000,
  "backoffMultiplier": 2.0,
  "jitterRatio": 0.15
}
```

`maxAttempts` is capped at 10. `initialDelayMs` and `maxDelayMs` are capped at 30,000 and 60,000 ms respectively. Omit the `retry` field to use the action's default preset. For `wait_for_node`, the default is `UiReadiness` (`maxAttempts=5`, `initialDelayMs=500`, `maxDelayMs=3000`, `backoffMultiplier=2.0`, `jitterRatio=0.15`).

**`open_app`:** Opens the app's default launch activity by `applicationId`.

**`open_uri`:** Opens a URI using the Clawperator Android app's implicit `ACTION_VIEW` intent - no adb shortcut is used. The Android device's registered handler for the URI scheme is invoked directly. Any URI scheme is supported: deep links (`market://details?id=com.actionlauncher.playstore`), standard URLs (`https://example.com`), and custom app schemes. If no application is registered for the URI scheme, the action fails with `URI_NOT_HANDLED`. A chooser dialog may appear on devices with multiple handlers for a scheme; follow the `open_uri` step with a `snapshot_ui` to verify the expected app is in the foreground. The alias `open_url` is also accepted and normalized to `open_uri`.

**`close_app`:** The Node layer intercepts `close_app` actions and runs `adb shell am force-stop <applicationId>` before dispatching to Android.
The Android step always returns `success: false` with `data.error: "UNSUPPORTED_RUNTIME_CLOSE"` - this is expected. The overall execution `status` remains `"success"` and the app is force-stopped. Do not treat this step result as a recoverable failure. **`click`:** Finds the node matching `matcher` and performs the specified `clickType`. The default click type is `"default"` (standard accessibility click with gesture fallback). Use `"long_click"` for long-press targets and `"focus"` to focus without activating. **`click` example request (`/execute`):** ```json { "deviceId": "", "execution": { "commandId": "cmd-click-1", "taskId": "task-click-1", "source": "local-test", "expectedFormat": "android-ui-automator", "timeoutMs": 30000, "actions": [ { "id": "click1", "type": "click", "params": { "matcher": { "resourceId": "com.example.app:id/submit_button" }, "clickType": "default" } } ] } } ``` **`click` example success response:** ```json { "ok": true, "envelope": { "commandId": "cmd-click-1", "taskId": "task-click-1", "status": "success", "stepResults": [ { "id": "click1", "actionType": "click", "success": true, "data": { "click_types": "click" } } ], "error": null }, "deviceId": "", "terminalSource": "clawperator_result" } ``` **`enter_text`:** The CLI command is `action type` but the execution payload action type is `enter_text`. The `submit` param triggers a keyboard Enter/submit after typing - use this for search fields and single-field forms where pressing Enter submits. The Node contract still accepts `clear`, but the Android runtime does not implement it yet, so it currently has no effect. 
**`enter_text` example request (`/execute`):** ```json { "deviceId": "", "execution": { "commandId": "cmd-type-1", "taskId": "task-type-1", "source": "local-test", "expectedFormat": "android-ui-automator", "timeoutMs": 30000, "actions": [ { "id": "type1", "type": "enter_text", "params": { "matcher": { "resourceId": "com.example.app:id/search_input" }, "text": "hello world", "submit": true } } ] } } ``` **`enter_text` example success response:** ```json { "ok": true, "envelope": { "commandId": "cmd-type-1", "taskId": "task-type-1", "status": "success", "stepResults": [ { "id": "type1", "actionType": "enter_text", "success": true, "data": { "text": "hello world", "submit": "true" } } ], "error": null }, "deviceId": "", "terminalSource": "clawperator_result" } ``` **`read_text`:** `validator` is not an open-ended string in practice. The Android runtime currently supports only `"temperature"` and rejects any other value. **`read_text` example request (`/execute`):** ```json { "deviceId": "", "execution": { "commandId": "cmd-read-1", "taskId": "task-read-1", "source": "local-test", "expectedFormat": "android-ui-automator", "timeoutMs": 30000, "actions": [ { "id": "read1", "type": "read_text", "params": { "matcher": { "resourceId": "com.example.app:id/temperature_label" } } } ] } } ``` **`read_text` example success response:** ```json { "ok": true, "envelope": { "commandId": "cmd-read-1", "taskId": "task-read-1", "status": "success", "stepResults": [ { "id": "read1", "actionType": "read_text", "success": true, "data": { "text": "22.5 C", "validator": "none" } } ], "error": null }, "deviceId": "", "terminalSource": "clawperator_result" } ``` **`snapshot_ui`:** Clawperator returns a single canonical snapshot format: `hierarchy_xml`. The Android runtime writes the hierarchy dump to device logcat, and the Node layer injects that raw XML into `data.text` after execution. `data.actual_format` is always `"hierarchy_xml"` for successful snapshot steps. 
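A minimal sketch of pulling the snapshot XML out of an `/execute` response, using the response shape from the examples in this guide (`envelope.stepResults[].data`). The helper name `getSnapshotXml` and the sample XML string are illustrative assumptions.

```javascript
// Hypothetical helper: returns the hierarchy XML from the first snapshot_ui step.
function getSnapshotXml(response) {
  const steps = (response.envelope && response.envelope.stepResults) || [];
  const step = steps.find((s) => s.actionType === "snapshot_ui");
  if (!step) {
    throw new Error("response contains no snapshot_ui step");
  }
  const data = step.data || {};
  if (!step.success) {
    throw new Error(`snapshot_ui failed: ${data.error || "unknown error"}`);
  }
  if (data.actual_format !== "hierarchy_xml") {
    throw new Error(`unexpected snapshot format: ${data.actual_format}`);
  }
  return data.text;
}

// Made-up response for illustration; real XML is much larger.
const sampleResponse = {
  ok: true,
  envelope: {
    status: "success",
    stepResults: [
      {
        id: "snap1",
        actionType: "snapshot_ui",
        success: true,
        data: { actual_format: "hierarchy_xml", text: "<hierarchy></hierarchy>" },
      },
    ],
  },
};
console.log(getSnapshotXml(sampleResponse));
```

Throwing on a failed step keeps the agent from treating an empty or missing `data.text` as an empty screen.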
`observe snapshot` (CLI subcommand) and `snapshot_ui` (execution action type) use the same internal pipeline and produce identical output. `observe snapshot` builds a single-action execution internally and calls `runExecution`. Use `observe snapshot` for ad-hoc inspection from the command line. Use `snapshot_ui` as a step within a multi-action execution payload. **Failure case - extraction error:** If snapshot post-processing finishes without attaching UI hierarchy text to the step (`data.text` remains absent), the step returns `success: false` with `data.error: "SNAPSHOT_EXTRACTION_FAILED"`. A common cause is that logcat does not contain a matching `[TaskScope] UI Hierarchy:` marker for the step, but partial extraction or other logcat mismatches can also trigger this error. This typically means the installed clawperator binary is out of date with the Android Operator APK. Run `clawperator version --check-compat` and `clawperator doctor` to diagnose. See Troubleshooting for resolution steps. **`snapshot_ui` example request (`/execute`):** ```json { "deviceId": "", "execution": { "commandId": "cmd-snap-1", "taskId": "task-snap-1", "source": "local-test", "expectedFormat": "android-ui-automator", "timeoutMs": 30000, "actions": [ { "id": "snap1", "type": "snapshot_ui" } ] } } ``` **`snapshot_ui` example success response:** ```json { "ok": true, "envelope": { "commandId": "cmd-snap-1", "taskId": "task-snap-1", "status": "success", "stepResults": [ { "id": "snap1", "actionType": "snapshot_ui", "success": true, "data": { "actual_format": "hierarchy_xml", "text": "..." } } ], "error": null }, "deviceId": "", "terminalSource": "clawperator_result" } ``` **`take_screenshot`:** `observe screenshot` uses the same execution contract under the hood. Android reports `UNSUPPORTED_RUNTIME_SCREENSHOT`, then the Node layer captures the screenshot via `adb exec-out screencap -p`, writes it to `data.path`, and normalizes the step result to `success: true` when capture succeeds. 
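Because a `close_app` step reports `success: false` with `UNSUPPORTED_RUNTIME_CLOSE` even when the force-stop worked (see the `close_app` note above), an agent that scans `stepResults` for failures may want to skip that one expected case. The sketch below shows one way to do that. The helper name `unexpectedFailures` is an assumption, and the envelope shape follows the example responses in this guide.

```javascript
// Hypothetical helper: returns failed steps, ignoring the documented
// close_app / UNSUPPORTED_RUNTIME_CLOSE result, which is expected.
function unexpectedFailures(envelope) {
  return (envelope.stepResults || []).filter((step) => {
    if (step.success) return false;
    const isExpectedCloseApp =
      step.actionType === "close_app" &&
      step.data != null &&
      step.data.error === "UNSUPPORTED_RUNTIME_CLOSE";
    return !isExpectedCloseApp;
  });
}

// Made-up envelope for illustration.
const sampleEnvelope = {
  status: "success",
  stepResults: [
    { id: "close1", actionType: "close_app", success: false, data: { error: "UNSUPPORTED_RUNTIME_CLOSE" } },
    { id: "snap1", actionType: "snapshot_ui", success: false, data: { error: "SNAPSHOT_EXTRACTION_FAILED" } },
    { id: "shot1", actionType: "take_screenshot", success: true, data: { path: "/tmp/shot.png" } },
  ],
};
console.log(unexpectedFailures(sampleEnvelope).map((s) => s.id));
```

No special case is needed for `take_screenshot`, since the Node layer already normalizes that step to `success: true` when capture succeeds.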
**`press_key`:** Issues a system-level key event via the Android Accessibility Service (`performGlobalAction`). Supported keys: `"back"`, `"home"`, `"recents"`. The alias `key_press` is normalized to `press_key`. No retry - this action is single-attempt by design.

Requires the Clawperator Operator accessibility service to be running on the device. If the service is unavailable, the execution returns a top-level failed envelope with `status: "failed"` and no `stepResults`. Use `clawperator doctor` to diagnose accessibility service availability before running executions that include `press_key`. When testing local/debug builds, pass the matching `receiverPackage` (`com.clawperator.operator.dev`) instead of relying on the default release package.

Returns `success: false` with `data.error: "GLOBAL_ACTION_FAILED"` if the OS reports the global action could not be performed (rare soft OS failure - accessibility service was running but Android declined the action).

**`press_key` key scope:** This action covers only Android accessibility global actions. Non-global keys - `enter`, `search`, `volume_up`, `volume_down`, `escape`, and raw keycodes - are not supported by `press_key`. They use a different Android mechanism (`input keyevent`) that is not routed through the Operator accessibility service. Use `adb shell input keyevent <keycode>` outside the execution payload for those keys until a dedicated raw-key primitive is added.
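Since only three keys are supported, an agent can check the key before building the action and route everything else elsewhere. The following is a client-side sketch under that assumption. The helper name `pressKeyAction` is hypothetical, and the real validation happens in the Node runtime.

```javascript
// Keys backed by accessibility global actions, per the text above.
const GLOBAL_KEYS = ["back", "home", "recents"];

// Hypothetical builder: returns a press_key action or throws for unsupported keys.
function pressKeyAction(id, key) {
  if (!GLOBAL_KEYS.includes(key)) {
    throw new Error(`press_key params.key must be one of: ${GLOBAL_KEYS.join(", ")}`);
  }
  return { id, type: "press_key", params: { key } };
}

console.log(pressKeyAction("home1", "home"));
```

A key such as `enter` fails this check, which is the cue to fall back to the `input keyevent` route described above instead of sending the action.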
**`press_key` example request (`/execute`):** ```json { "deviceId": "", "receiverPackage": "com.clawperator.operator.dev", "execution": { "commandId": "cmd-press-home", "taskId": "task-press-home", "source": "local-test", "expectedFormat": "android-ui-automator", "timeoutMs": 30000, "actions": [ { "id": "open1", "type": "open_app", "params": { "applicationId": "com.android.settings" } }, { "id": "home1", "type": "press_key", "params": { "key": "home" } }, { "id": "snap1", "type": "snapshot_ui" } ] } } ``` **`press_key` example success response:** ```json { "ok": true, "envelope": { "commandId": "cmd-press-home", "taskId": "task-press-home", "status": "success", "stepResults": [ { "id": "open1", "actionType": "open_app", "success": true, "data": { "application_id": "com.android.settings" } }, { "id": "home1", "actionType": "press_key", "success": true, "data": { "key": "home" } }, { "id": "snap1", "actionType": "snapshot_ui", "success": true, "data": { "actual_format": "hierarchy_xml", "text": "" } } ], "error": null }, "deviceId": "", "terminalSource": "clawperator_result" } ``` **`press_key` example validation failure:** ```json { "ok": false, "error": { "code": "EXECUTION_VALIDATION_FAILED", "message": "press_key params.key must be one of: back, home, recents", "details": { "path": "actions.0.params.key" } } } ``` **`scroll_and_click`:** This action has two separate retry knobs. `scrollRetry` controls the scroll/search loop and defaults to the `UiScroll` preset (`maxAttempts=4`, `initialDelayMs=400`, `maxDelayMs=2000`, `backoffMultiplier=2.0`, `jitterRatio=0.15`). `clickRetry` controls the final click attempt and defaults to the `UiReadiness` preset (`maxAttempts=5`, `initialDelayMs=500`, `maxDelayMs=3000`, `backoffMultiplier=2.0`, `jitterRatio=0.15`). **`clickAfter` flag:** When `clickAfter: false`, the action scrolls until the target is visible but does not click it. 
This is useful when you need to bring an element into view before a separate `snapshot_ui` or `read_text` action, or when you want to confirm presence before committing a click. **`scroll`:** Performs a single scroll gesture and reports whether content actually moved. Unlike `scroll_and_click`, this action has no target element and does not click. It is designed for exploratory navigation - panning through a list to observe content before deciding what to do next. **Direction semantics (content direction, not finger direction):** - `"down"` - reveals content further down the list. Finger swipes up. Default. - `"up"` - reveals content further up the list. Finger swipes down. - `"left"` / `"right"` - horizontal carousel navigation. Direction refers to the content movement, not the swipe direction. The action always reports one of three outcomes in `data.scroll_outcome`: - `"moved"` - gesture was dispatched and the list position changed. - `"edge_reached"` - gesture was dispatched but the container was already at its limit. This is `success: true`, not an error. It is the expected terminal state when paginating a finite list. - `"gesture_failed"` - the OS rejected the gesture dispatch (`success: false`). **Retry behavior:** `scroll` defaults to no retry (`retry` param defaults to a single attempt). This differs from most UI actions, which default to `UiReadiness` retry (3 attempts with backoff). The reason: retrying a scroll that returned `edge_reached` is wasteful. If the container may not have loaded yet, pass an explicit `retry` object or send a `wait_for_node` first. `container` targeting and the `findFirstScrollableChild` flag work the same way as `scroll_and_click`. If no `container` is provided, the first `scrollable="true"` node on screen is used. `findFirstScrollableChild` defaults to `true` - when the matched container itself is not scrollable, the runtime automatically uses its first scrollable descendant. 
Set to `false` only if you need strict container matching. **Auto-detect caveat:** On nested-scroll layouts, the first visible `scrollable="true"` node may be an outer wrapper rather than the content list you actually want. When the screen contains more than one plausible scroll surface, prefer an explicit `container` matcher using the list's `resource-id` from `snapshot_ui` rather than relying on auto-detect. Typical observe-decide-act loop using `scroll`: ```json [ { "id": "snap1", "type": "snapshot_ui" }, { "id": "scr1", "type": "scroll", "params": { "direction": "down" } }, { "id": "snap2", "type": "snapshot_ui" } ] ``` After receiving `snap2`, the agent compares it to `snap1`. If `scr1.data.scroll_outcome` is `"edge_reached"`, no further scrolling is possible in that direction. **`scroll` example request:** ```json { "deviceId": "", "execution": { "commandId": "cmd-scroll-1", "taskId": "task-explore", "source": "agent", "expectedFormat": "android-ui-automator", "timeoutMs": 30000, "actions": [ { "id": "snap1", "type": "snapshot_ui" }, { "id": "scr1", "type": "scroll", "params": { "direction": "down" } }, { "id": "snap2", "type": "snapshot_ui" } ] } } ``` **`scroll` example step result (content moved):** ```json { "id": "scr1", "actionType": "scroll", "success": true, "data": { "scroll_outcome": "moved", "direction": "down", "distance_ratio": "0.7" } } ``` **`scroll` example step result (at bottom of list):** ```json { "id": "scr1", "actionType": "scroll", "success": true, "data": { "scroll_outcome": "edge_reached", "direction": "down", "distance_ratio": "0.7" } } ``` **`scroll_until`:** Bounded scroll loop. Scrolls repeatedly until a termination condition fires and returns `termination_reason` so the agent knows why it stopped. Always applies caps even when not specified. Direction semantics are the same as `scroll`. `container`, `distanceRatio`, `settleDelayMs`, and `findFirstScrollableChild` behave identically to `scroll`. 
**Termination reasons (`data.termination_reason`):**

- `EDGE_REACHED` - content ended naturally (finite list). `success: true`.
- `MAX_SCROLLS_REACHED` - hit `maxScrolls` cap. `success: true`. Normal for infinite feeds.
- `MAX_DURATION_REACHED` - hit `maxDurationMs` cap. `success: true`. Normal for infinite feeds.
- `NO_POSITION_CHANGE` - no content movement across `noPositionChangeThreshold` consecutive scrolls. `success: true`.
- `CONTAINER_NOT_FOUND` - container resolution failed. `success: false`.
- `CONTAINER_NOT_SCROLLABLE` - container is not scrollable. `success: false`.

`MAX_SCROLLS_REACHED`, `MAX_DURATION_REACHED`, and `NO_POSITION_CHANGE` are clean terminal states, not errors. Agents scrolling infinite feeds should expect these and handle them without treating the action as failed.

**Current runtime caveat:** If the resolved container disappears mid-loop because the app navigated away or rebuilt the view tree unexpectedly, the current Android runtime can collapse that case into `EDGE_REACHED`. When a scroll loop might trigger navigation or heavy UI re-layout, follow it with `snapshot_ui` or `wait_for_node` before assuming the list truly ended.
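One way an agent might branch on these termination reasons is sketched below. The `StepResult` shape follows the step results shown in this document. The `ScrollVerdict` type, its outcome names, and `classifyScrollUntil` are illustrative assumptions, not part of the Clawperator contract.

```typescript
// Step result shape as shown in this document; all data values are strings.
interface StepResult {
  id: string;
  actionType: string;
  success: boolean;
  data: { [key: string]: string };
}

// Hypothetical agent-side summary of a scroll_until step.
interface ScrollVerdict {
  outcome: "list_ended" | "stopped_by_cap" | "stalled" | "failed";
  // True when the runtime caveat applies and a follow-up snapshot_ui is advisable.
  verifyWithSnapshot: boolean;
}

function classifyScrollUntil(step: StepResult): ScrollVerdict {
  // CONTAINER_NOT_FOUND and CONTAINER_NOT_SCROLLABLE arrive with success: false.
  if (!step.success) {
    return { outcome: "failed", verifyWithSnapshot: false };
  }
  switch (step.data.termination_reason) {
    case "EDGE_REACHED":
      // May hide a vanished container, so confirm before trusting it.
      return { outcome: "list_ended", verifyWithSnapshot: true };
    case "MAX_SCROLLS_REACHED":
    case "MAX_DURATION_REACHED":
      return { outcome: "stopped_by_cap", verifyWithSnapshot: false };
    case "NO_POSITION_CHANGE":
      return { outcome: "stalled", verifyWithSnapshot: false };
    default:
      // Unknown reason: treat conservatively as a failure.
      return { outcome: "failed", verifyWithSnapshot: false };
  }
}
```

Checking `success` before `termination_reason` means the two container failures never get mistaken for a clean stop.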
**`scroll_until` example request:** ```json { "commandId": "cmd-su-1", "taskId": "task-paginate", "expectedFormat": "android-ui-automator", "timeoutMs": 30000, "actions": [ { "id": "su1", "type": "scroll_until", "params": { "direction": "down", "maxScrolls": 25 } } ] } ``` **`scroll_until` example step result (finite list, reached bottom):** ```json { "id": "su1", "actionType": "scroll_until", "success": true, "data": { "termination_reason": "EDGE_REACHED", "scrolls_executed": "12", "direction": "down" } } ``` **`scroll_until` example step result (infinite feed, hit cap):** ```json { "id": "su1", "actionType": "scroll_until", "success": true, "data": { "termination_reason": "MAX_SCROLLS_REACHED", "scrolls_executed": "20", "direction": "down" } } ``` ## Pagination Recipe When an agent needs to read all content from a scrollable list, the correct approach depends on whether the list is finite or infinite. ### Finite lists (settings screens, contact lists, search results) Use a manual scroll loop: issue `scroll` actions one at a time, snapshot after each, and stop when `scroll_outcome` is `"edge_reached"`. This gives full control over when to stop and what to extract. ``` while true: snapshot_ui -> extract visible items scroll down -> if edge_reached: break ``` `maxScrolls` is not required here because the agent controls the loop and `edge_reached` is the natural termination condition. ### Infinite feeds (social media, Play Store, news feeds) Use `scroll_until` with an explicit `maxScrolls` cap. There is no true "bottom" on infinite-scroll lists - lazy loading means the edge is never definitively reached. `scroll_until` is designed for this case: it scrolls as far as the agent wants and returns a machine-readable `termination_reason`. 
```json { "id": "feed1", "type": "scroll_until", "params": { "direction": "down", "maxScrolls": 30 } } ``` After this action, `termination_reason` will be one of: - `MAX_SCROLLS_REACHED` - agent hit its own cap (normal for infinite feeds) - `EDGE_REACHED` - list actually ended (finite list reached bottom) - `NO_POSITION_CHANGE` - content stopped moving (stale list or true bottom) Both `MAX_SCROLLS_REACHED` and `NO_POSITION_CHANGE` are clean terminal states. Do not treat them as errors. **Required: always set `maxScrolls`.** Without an explicit cap, the default is 20 scrolls. For feeds where you want more coverage, pass a larger value. Never omit `maxScrolls` or rely on `NO_POSITION_CHANGE` alone as the termination condition for infinite feeds - a slow network load can pause position change temporarily and cause early exit. ### Returning to top After scrolling down a feed, use `scroll_until` with `direction: "up"` to return to the top. On finite lists, `EDGE_REACHED` signals the top. On infinite feeds, `NO_POSITION_CHANGE` or `MAX_SCROLLS_REACHED` signals that position has stabilized near the top. ```json { "id": "top1", "type": "scroll_until", "params": { "direction": "up", "maxScrolls": 50, "noPositionChangeThreshold": 3 } } ``` ## Result Envelope Every execution emits exactly one `[Clawperator-Result]` envelope: ```json { "commandId": "...", "taskId": "...", "status": "success" | "failed", "stepResults": [ { "id": "...", "actionType": "...", "success": true | false, "data": { "key": "value" } } ], "error": "human-readable reason" | null, "errorCode": "STABLE_CODE" | null } ``` - `status` is `"success"` when the execution completes (including partial step failures like `close_app`). `status` is `"failed"` only on total execution failure (dispatch error, timeout, validation failure). - `error` (top-level) contains a human-readable description of the failure. Do not branch agent logic on this string - it is not a stable contract. 
- `errorCode` (top-level) contains a stable, enumerated code when the failure has a known cause. Branch agent logic on this field. May be absent in envelopes from older APK versions or for unclassified failures. - `data.error` on a step result contains the per-step error code when `success` is `false`. - All `data` values are strings. `data` is always an object (never `null`), but may be empty. - Only one execution may be in flight per device. Concurrent requests for the same device return `EXECUTION_CONFLICT_IN_FLIGHT`. ### Per-action result data Typical `data` keys by action type: | Action | Typical `data` keys | | :--- | :--- | | `open_app` | `application_id` | | `open_uri` | `uri` | | `close_app` | `application_id`, `error` (`"UNSUPPORTED_RUNTIME_CLOSE"`), `message` | | `click` | `click_types` | | `enter_text` | `text` (text typed), `submit` (`"true"` or `"false"`) | | `read_text` | `text` (extracted text value), `validator` (`"none"` or validator type) | | `snapshot_ui` | `actual_format`, `text` (snapshot content - see note below) | | `take_screenshot` | `path` (local screenshot file path after Node capture) | | `wait_for_node` | `resource_id`, `label` (matched node details) | | `scroll_and_click` | `max_swipes`, `direction`, `click_types`, `click_after` (`"true"` or `"false"`) | | `scroll` | `scroll_outcome` (`"moved"`, `"edge_reached"`, or `"gesture_failed"`), `direction`, `distance_ratio`, `settle_delay_ms`, `resolved_container` (resourceId of auto-detected container, when present) | | `scroll_until` | `termination_reason` (see behavior note), `scrolls_executed`, `direction`, `resolved_container` (when present) | | `sleep` | `duration_ms` | | `press_key` | `key` (`"back"`, `"home"`, or `"recents"`) | For any failed step: `success: false` and `data.error` contains the error code string. **Snapshot content delivery:** The UI hierarchy is produced by the Android runtime and written to device logcat. 
The Node layer reads logcat after execution and injects the raw XML into `data.text`. `data.actual_format` is `"hierarchy_xml"` on successful snapshot steps.

**`read_text` value:** The extracted text value is in `data.text`.

## Snapshot Output Format

`snapshot_ui` and `clawperator observe snapshot` produce the canonical `hierarchy_xml` format. `data.actual_format` reports `"hierarchy_xml"` on success.

### `hierarchy_xml`

Structured XML produced by UIAutomator. Each `<node>` element represents one UI element. Attributes map directly to NodeMatcher fields:

| XML attribute | NodeMatcher field | Notes |
| :--- | :--- | :--- |
| `resource-id` | `resourceId` | `"com.example.app:id/name"`. Empty string when not set by developer. |
| `text` | `textEquals` / `textContains` | Visible text content. Empty string if none. |
| `content-desc` | `contentDescEquals` / `contentDescContains` | Accessibility label. Empty string if none. |
| `class` | - | Java widget class name, e.g., `"android.widget.Button"`. Informational only - `NodeMatcher.role` uses Clawperator semantic role names such as `button`, `textfield`, `text`, or `switch`, not the raw `class` attribute. |
| `clickable` | - | `"true"` if the element accepts tap events. |
| `scrollable` | - | `"true"` marks a scroll container. Use as `container` in `scroll_and_click`, `scroll`, or `scroll_until`. |
| `bounds` | - | `"[x1,y1][x2,y2]"` pixel rectangle. Useful for understanding spatial layout. |
| `enabled` | - | `"false"` means the element is visible but not interactable. |
| `long-clickable` | - | `"true"` if the element accepts long-press. Use `clickType: "long_click"`. |

**XML attribute escaping:** Snapshot output is XML, so special characters in attribute values are escaped when the hierarchy is serialized. For example, an apostrophe appears as `&apos;` and an ampersand as `&amp;`. These escaped sequences are returned as-is in `data.text` - they are not decoded after extraction.
When targeting elements with special characters in their labels, use `contentDescContains` or `textContains` with a substring that avoids the escaped characters rather than an exact match requiring the full escaped form. Example: for a node with `content-desc="Search for 'vlc'"`: - `contentDescContains: "Search for"` -- works - `contentDescEquals: "Search for 'vlc'"` -- fails (apostrophe is not decoded) **Annotated example from Android Settings main screen (live device capture):** ```xml ... ... ``` **Reading patterns:** - **Tap targets** are `clickable="true"` nodes. In list UIs these are often container (`LinearLayout`) nodes whose text-bearing children hold the visible label while the container itself has `text=""`. When you match any node, Clawperator first attempts `ACTION_CLICK` on the first `clickable="true"` ancestor it finds while walking up the tree from the matched node. If that accessibility click does not succeed, Clawperator falls back to a gesture tap at the center of the matched node's bounding box. This means matching a non-clickable label node (for example, `textEquals: "Connections"`) works correctly as long as it is visually inside a clickable parent tap target. If both mechanisms fail, the execution currently terminates with a failed envelope and empty `stepResults` rather than a per-step `NODE_NOT_CLICKABLE` code. - **Icon-only buttons** (no `text`) use `content-desc` for their label. Target with `contentDescEquals`. - **Scroll containers** have `scrollable="true"`. Pass their `resource-id` as the `container` matcher in `scroll_and_click` or `scroll`. If `container` is omitted from `scroll`, the first `scrollable="true"` node on screen is used automatically. - **Disabled elements** have `enabled="false"`. They cannot be interacted with - scrolling or waiting for a state change is required first. 
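The XML escaping guidance earlier in this section (prefer a `Contains` matcher on a substring that avoids escaped characters) can be sketched as a small helper. This is an illustrative assumption about how an agent might pick the substring, not Clawperator behavior. `safeSubstring` and `containsMatcher` are hypothetical names, and splitting on newlines as well is an extra precaution for multi-line labels.

```typescript
// Characters that XML serialization escapes in attribute values, plus line breaks.
const UNSAFE_CHARS = /['"&<>\n\r]/;

// Returns the longest trimmed run of the label that contains none of the
// unsafe characters, or null when no usable run exists.
function safeSubstring(label: string): string | null {
  let best = "";
  for (const part of label.split(UNSAFE_CHARS)) {
    const trimmed = part.trim();
    if (trimmed.length > best.length) {
      best = trimmed;
    }
  }
  return best.length > 0 ? best : null;
}

// Builds a contentDescContains matcher from a visible label, if possible.
function containsMatcher(label: string): { contentDescContains: string } | null {
  const substring = safeSubstring(label);
  return substring === null ? null : { contentDescContains: substring };
}
```

For the label `Search for 'vlc'`, this sketch yields the substring `Search for`, which matches the working example above. A longest-run heuristic can still pick an ambiguous substring on busy screens, so confirm the match against a fresh `snapshot_ui` when more than one node could qualify.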
**Apps with obfuscated or missing resource-ids:** Many production apps (Google Play Store, social media apps, banking apps) set `resource-id=""` on all or most nodes. In this case, fall back to content-desc and text matchers. The fallback priority for these apps is: 1. `contentDescEquals` - for elements with stable accessibility labels (icon buttons, tabs) 2. `textEquals` - for elements with stable visible text (button labels, section headers) 3. `contentDescContains` / `textContains` - when the value may include dynamic content, counts, or special characters (including HTML entities - see note above) 4. `role: "textfield"` - for text inputs when no `resource-id` is present Note: `content-desc` values sometimes contain newlines when an element's label spans multiple pieces of information (for example, an app name followed by developer name in Play Store results). Use `contentDescContains` with a known stable substring rather than a full exact match. ## Error Codes Branch agent logic on codes from `envelope.errorCode` (top-level Android result envelopes), `error.code` (Node API / CLI structured errors), or `stepResults[].data.error` (per-step failures). The `envelope.error` field contains a human-readable description and is not a stable contract. | Code | Source | Meaning | | :--- | :--- | :--- | | `SERVICE_UNAVAILABLE` | `envelope.errorCode` | Clawperator Operator accessibility service is not running on the device. Use `clawperator doctor` to diagnose. 
| | `EXECUTION_CONFLICT_IN_FLIGHT` | `error.code` | Device is busy with another execution | | `ANDROID_SDK_TOOL_MISSING` | `error.code` | A required Android SDK tool such as `adb`, `emulator`, `sdkmanager`, or `avdmanager` is not available | | `EMULATOR_NOT_FOUND` | `error.code` | Requested AVD does not exist | | `EMULATOR_NOT_RUNNING` | `error.code` | Requested AVD is not currently running | | `EMULATOR_ALREADY_RUNNING` | `error.code` | Requested operation requires the AVD to be stopped first | | `EMULATOR_UNSUPPORTED` | `error.code` | The AVD exists but does not satisfy Clawperator compatibility rules | | `EMULATOR_START_FAILED` | `error.code` | Emulator process failed to register with adb in time | | `EMULATOR_BOOT_TIMEOUT` | `error.code` | Emulator registered with adb but Android did not finish booting in time | | `ANDROID_AVD_CREATE_FAILED` | `error.code` | AVD creation failed | | `ANDROID_SYSTEM_IMAGE_INSTALL_FAILED` | `error.code` | System image install or SDK license acceptance failed | | `EMULATOR_STOP_FAILED` | `error.code` | Emulator stop request failed | | `EMULATOR_DELETE_FAILED` | `error.code` | Emulator deletion failed | | `NODE_NOT_FOUND` | `data.error` | Selector matched no UI element | | `RESULT_ENVELOPE_TIMEOUT` | `error.code` | Command dispatched but no result received | | `RECEIVER_NOT_INSTALLED` | `error.code` | [Clawperator Operator Android app](../getting-started/android-operator-apk.md) not found on device | | `DEVICE_UNAUTHORIZED` | `error.code` | Device not authorized for ADB | | `VERSION_INCOMPATIBLE` | `error.code` | CLI and installed [Clawperator Operator Android app](../getting-started/android-operator-apk.md) versions do not share the same `major.minor` | | `APK_VERSION_UNREADABLE` | `error.code` | The device package dump did not expose a readable [Clawperator Operator Android app](../getting-started/android-operator-apk.md) version | | `EXECUTION_VALIDATION_FAILED` | `error.code` | Payload failed schema validation | | 
`SECURITY_BLOCK_DETECTED` | `data.error` | Android blocked the action (e.g., secure keyboard) | | `NODE_NOT_CLICKABLE` | `data.error` | Reserved error code. Intended for "element found but not interactable", but not currently emitted consistently by the Android and Node runtimes. | | `UNSUPPORTED_RUNTIME_CLOSE` | `data.error` | Expected per-step result for all `close_app` steps. The Android runtime does not support a force-stop action response - the Node layer handles the close via `adb shell am force-stop` before dispatch. The overall execution `status` remains `"success"`. Treat as non-fatal. | | `SNAPSHOT_EXTRACTION_FAILED` | `data.error` | `snapshot_ui` step completed but the Node layer did not attach any snapshot text to the step during post-processing. The most common cause is a Node binary packaging mismatch or other logcat extraction issue. Rebuild or reinstall the npm package and check version compatibility. | | `GLOBAL_ACTION_FAILED` | `data.error` | `press_key` step result when the OS reports `performGlobalAction` returned false. Rare soft failure - the accessibility service was running but Android declined to execute the action. | | `CONTAINER_NOT_FOUND` | `data.error` | `scroll` step could not locate a scrollable container. Either no scrollable node is present on screen, or the provided `container` matcher matched nothing. | | `CONTAINER_NOT_SCROLLABLE` | `data.error` | `scroll` step found the matched container but it is not scrollable and no scrollable descendant was found. With the default `findFirstScrollableChild: true`, the runtime already walks one level down before raising this error. | | `GESTURE_FAILED` | `data.error` | `scroll` step: the OS rejected the gesture dispatch. The accessibility service was running but Android declined to execute the swipe gesture. Step returns `success: false`. | Primary top-level error taxonomy: `apps/node/src/contracts/errors.ts`. 
This table also includes runtime-only step error strings such as `UNSUPPORTED_RUNTIME_CLOSE`. ## Key Behaviors - **Single-flight:** One execution per device at a time. Concurrent requests return `EXECUTION_CONFLICT_IN_FLIGHT`. - **No hidden retries:** If an action fails, the error is returned immediately. Retry logic belongs in the agent. - **Deterministic results:** Exactly one terminal envelope per `commandId`. Timeouts return `RESULT_ENVELOPE_TIMEOUT` with diagnostics. - **Timeout override:** `--timeout-ms ` overrides the execution timeout for `execute`, `observe snapshot`, and `observe screenshot` within policy limits. - **Device targeting:** Specify `--device-id` when multiple devices are connected. Omit for single-device setups. - **Emulator reuse over creation:** Provisioning never creates duplicate AVDs when a supported running or stopped emulator already exists. - **Deterministic emulator boots:** Emulator starts use `-no-snapshot-load` and wait for both `sys.boot_completed` and `dev.bootcomplete`. - **Validation before dispatch:** Every payload is schema-validated before any ADB command is issued. ## Skills Skills are packaged Android automation scripts distributed via the public GitHub repository at `https://github.com/clawperator/clawperator-skills`. The Node API provides discovery and metadata - skills are standalone and can be invoked directly by agents without the Node API. ### Setup ```bash clawperator skills install export CLAWPERATOR_SKILLS_REGISTRY="$HOME/.clawperator/skills/skills/skills-registry.json" ``` ### Discovery ```bash # List all skills clawperator skills list # Search by Android application ID clawperator skills search --app com.android.settings # Get skill metadata clawperator skills get com.android.settings.capture-overview ``` ### Invocation Skills can be invoked three ways: 1. 
**Direct script invocation** (standalone - no Node API required): ```bash node ~/.clawperator/skills/skills/com.android.settings.capture-overview/scripts/capture_settings_overview.js ``` 2. **Convenience wrapper** via Node API: ```bash clawperator skills run com.android.settings.capture-overview --device-id ``` 3. **Artifact compile + execute** (for skills with `.recipe.json` artifacts): ```bash clawperator skills compile-artifact --artifact --vars '{"KEY":"value"}' clawperator execute --execution ``` ### Skills Run Response ```json { "ok": true, "skillId": "com.android.settings.capture-overview", "output": "Settings Overview captured\nTEXT_BEGIN\n...\nTEXT_END", "exitCode": 0, "durationMs": 8500 } ``` ## Use Cases - **Price comparison:** Open shopping apps, search, capture prices via `read_text`, return structured comparison. - **Location check:** Open a tracking app, capture current location data via snapshot or screenshot. - **Cross-app automation:** Read state from one app, act in another, report results. ## FAQ **Does Clawperator do autonomous planning?** No. It executes commands and reports structured results. Reasoning and planning stay in the agent. **How are concurrent executions handled?** Single-flight per device. A second overlapping execution returns `EXECUTION_CONFLICT_IN_FLIGHT`. **When should I use direct `adb` instead?** Use `adb` directly for operations not covered by the execution payload API: - **Diagnostics** when you need to inspect raw device state (logcat, package list, window focus). - **Pre-flight setup** outside the automation loop (granting permissions, installing APKs, checking installed packages). For routine UI automation, use Clawperator so result/error semantics stay consistent. **Can Clawperator open a specific URL or deep link?** Yes. Use the `open_uri` action with any URI scheme - `market://`, `https://`, or a custom app deep link. 
The Clawperator Android app issues an `ACTION_VIEW` intent directly on the device; no adb command is needed. Example: ```json { "id": "nav1", "type": "open_uri", "params": { "uri": "market://details?id=com.actionlauncher.playstore" } } ``` If multiple apps are registered for the URI scheme, a system chooser may appear. Follow the `open_uri` step with `snapshot_ui` to confirm the expected app opened. **Does Clawperator run skills?** Skills are standalone programs that agents can invoke directly. The Node API provides discovery (`skills list`, `skills search`), metadata (`skills get`), and a convenience `skills run` wrapper. Skills do not need the Node API to execute - agents can call skill scripts directly. **Does Clawperator configure accounts or app settings?** No. Clawperator automates the UI on whatever user-installed Android apps are already installed and signed in on the device. It does not log in to apps, create accounts, or configure device settings on behalf of the user. If an automation targets an app that requires authentication, the user must sign in to that app manually on the device before the agent runs. For emulators using a Google Play system image, the user must also sign in to a Google account before Play Store-gated apps are accessible. **How should agents handle sensitive text in results?** Default behavior is full-fidelity results for agent reasoning. PII redaction (`--safe-logs`) is a planned feature. --- # Reference # CLI Reference ``` clawperator [options] ``` --- ## Global Options These flags are parsed globally. Command support varies by path. 
| Flag | Description | |------|-------------| | `--device-id ` | Target Android device serial | | `--receiver-package ` | Target Operator package for broadcast dispatch | | `--output ` | Output format (default: `json`) | | `--timeout-ms ` | Override execution timeout for `execute`, `observe snapshot`, `observe screenshot`, and `inspect ui` within policy limits | | `--verbose` | Include debug diagnostics in output | | `--help` | Show help | | `--version` | Show version | Default receiver package: `com.clawperator.operator`. Use `--receiver-package com.clawperator.operator.dev` for local debug APKs. --- ## Commands ### `operator setup` Install the Clawperator Operator APK and grant required device permissions in one step. ``` clawperator operator setup --apk [--device-id ] [--receiver-package ] [--output ] ``` | Flag | Description | |------|-------------| | `--apk ` | Local path to the Operator APK file (required) | | `--device-id ` | Target Android device serial (required when multiple devices are connected) | | `--receiver-package ` | Operator package identifier (required when both release and debug variants are installed) | This is the canonical setup command. `clawperator operator install` remains a compatibility alias. It runs three phases in sequence: 1. **Install**: Copies the APK onto the device via `adb install -r`. 2. **Permissions**: Grants the accessibility service, notification posting, and notification listener permissions. 3. **Verification**: Confirms the package is visible via `pm list packages`. If `--receiver-package` is omitted, setup auto-detects the package only when exactly one known Operator variant is installed. If both release and debug variants are installed, pass `--receiver-package` explicitly. If any phase fails, the command exits with a structured JSON error identifying which phase failed. 
Error codes: | Code | Phase | Meaning | |------|-------|---------| | `OPERATOR_APK_NOT_FOUND` | pre-install | APK path does not exist on disk | | `OPERATOR_INSTALL_FAILED` | install | `adb install` returned a non-zero exit code | | `OPERATOR_GRANT_FAILED` | permissions | One or more permission grants failed | | `OPERATOR_VERIFY_FAILED` | verification | Package not found after install | Do not use raw `adb install` for normal setup. It installs the APK without granting permissions, leaving the device in an unusable state. For debug builds, pass `--receiver-package com.clawperator.operator.dev`. --- ### `emulator list` List configured Android Virtual Devices and their compatibility metadata. ``` clawperator emulator list [--output ] ``` --- ### `emulator inspect` Show the normalized metadata for one Android Virtual Device. ``` clawperator emulator inspect [--output ] ``` This is the diagnostic command for understanding whether an AVD is supported and why. --- ### `emulator create` Create the default supported Google Play Android Virtual Device. ``` clawperator emulator create [--name ] [--output ] ``` Defaults: - Android API `35` - Google Play image - ABI `arm64-v8a` - device profile `pixel_7` - AVD name `clawperator-pixel` --- ### `emulator start` Start an existing Android Virtual Device and wait for Android boot completion. ``` clawperator emulator start [--output ] ``` --- ### `emulator stop` Stop a running Android emulator by AVD name. ``` clawperator emulator stop [--output ] ``` --- ### `emulator delete` Delete an Android Virtual Device by name. ``` clawperator emulator delete [--output ] ``` --- ### `emulator status` List running Android emulators and boot state. ``` clawperator emulator status [--output ] ``` --- ### `emulator provision` Reuse or create a supported Android emulator and return a booted ADB target. ``` clawperator emulator provision [--output ] clawperator provision emulator [--output ] ``` Provisioning prefers: 1. 
a running supported emulator 2. a stopped supported AVD 3. creation of a new supported AVD --- ### `devices` List connected Android devices. ``` clawperator devices ``` **Output:** JSON array of `{ serial, state }` objects. --- ### `packages list` List installed package IDs on a device. ``` clawperator packages list [--device-id ] [--third-party] ``` | Flag | Description | |------|-------------| | `--device-id ` | Target device serial | | `--third-party` | Limit to third-party packages only | --- ### `execute` Execute a validated command payload. ``` clawperator execute --execution [--device-id ] [--receiver-package ] [--timeout-ms ] ``` | Flag | Description | |------|-------------| | `--execution ` | Execution payload as inline JSON or a path to a JSON file (required) | | `--device-id ` | Target device serial | | `--receiver-package ` | Target Operator package | | `--timeout-ms ` | Override execution timeout within policy limits | The `--execution` value must conform to the `Execution` contract (see [api-overview.md](./api-overview.md)). **Note:** `execute best-effort` is not implemented in this stage. Use `observe snapshot` + agent reasoning instead. --- ### `observe snapshot` Capture the current UI snapshot from the device. ``` clawperator observe snapshot [--device-id ] [--receiver-package ] [--timeout-ms ] [--output ] [--verbose] ``` Returns ASCII-formatted UI tree via the `snapshot_ui` action. --- ### `observe screenshot` Capture the current device screen as a PNG file. ``` clawperator observe screenshot [--device-id ] [--receiver-package ] [--timeout-ms ] [--output ] [--verbose] ``` The PNG is saved to a temp path and the path is returned in the result envelope. --- ### `inspect ui` Alias for `observe snapshot` with formatted output. ``` clawperator inspect ui [--device-id ] [--receiver-package ] [--timeout-ms ] [--output ] [--verbose] ``` `inspect ui` is a wrapper alias over `observe snapshot`. 
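For the `execute` command described above, an agent may find it easier to assemble the payload as a typed object and pass it as a single argument rather than hand-quoting JSON in a shell. The sketch below uses only field names that appear in this document's example payloads. `buildExecution` and `executeArgs` are hypothetical helpers, and the duplicate-id check is a local sanity check of this sketch, not a documented rule.

```typescript
interface Action {
  id: string;
  type: string;
  params?: { [key: string]: unknown };
}

// Fields as they appear in the example execution payloads in this document.
interface Execution {
  commandId: string;
  taskId: string;
  source: string;
  expectedFormat: "android-ui-automator";
  timeoutMs: number;
  actions: Action[];
}

function buildExecution(
  commandId: string,
  taskId: string,
  actions: Action[],
  timeoutMs: number = 30000
): Execution {
  // Local sanity check (an assumption of this sketch): step ids should be unique
  // so that stepResults can be matched back to actions unambiguously.
  const seen: string[] = [];
  for (const action of actions) {
    if (seen.indexOf(action.id) !== -1) {
      throw new Error(`duplicate action id: ${action.id}`);
    }
    seen.push(action.id);
  }
  return { commandId, taskId, source: "agent", expectedFormat: "android-ui-automator", timeoutMs, actions };
}

// Argument vector for `clawperator execute`, using the flags documented above.
function executeArgs(execution: Execution, deviceId?: string): string[] {
  const args = ["execute", "--execution", JSON.stringify(execution)];
  if (deviceId) {
    args.push("--device-id", deviceId);
  }
  return args;
}
```

Passing the resulting array to a process-spawning call that takes an argument vector (for example Node's `child_process.execFile`) avoids shell quoting problems with inline JSON.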
--- ### `action open-uri` Open a URI on the device using the system default handler. ``` clawperator action open-uri --uri [--device-id ] [--receiver-package ] ``` | Flag | Description | |------|-------------| | `--uri ` | URI to open (required). Any scheme: `https://`, `market://`, deep links, etc. | --- ### `action open-app` Open an app by package ID. ``` clawperator action open-app --app [--device-id ] [--receiver-package ] ``` | Flag | Description | |------|-------------| | `--app ` | Android application ID (required) | --- ### `action click` Click a UI node matching a selector. ``` clawperator action click --selector [--device-id ] [--receiver-package ] ``` | Flag | Description | |------|-------------| | `--selector ` | `NodeMatcher` JSON (required) | Example selector: `'{"resourceId":"com.example.app:id/button_ok"}'` --- ### `action read` Read text from a UI node. ``` clawperator action read --selector [--device-id ] [--receiver-package ] ``` | Flag | Description | |------|-------------| | `--selector ` | `NodeMatcher` JSON (required) | --- ### `action wait` Wait for a UI node to appear. ``` clawperator action wait --selector [--device-id ] [--receiver-package ] ``` | Flag | Description | |------|-------------| | `--selector ` | `NodeMatcher` JSON (required) | --- ### `action type` Type text into a UI node. ``` clawperator action type --selector --text [--submit] [--clear] [--device-id ] [--receiver-package ] ``` | Flag | Description | |------|-------------| | `--selector ` | `NodeMatcher` JSON (required) | | `--text ` | Text to type (required) | | `--submit` | Send submit/enter after typing | | `--clear` | Clear the field before typing | --- ### `skills list` List available skills from the local index/cache. ``` clawperator skills list ``` --- ### `skills get ` Show metadata for a specific skill. ``` clawperator skills get ``` --- ### `skills compile-artifact` Compile a skill artifact with optional variable substitution. 
``` clawperator skills compile-artifact --artifact [--vars ] clawperator skills compile-artifact --skill-id --artifact [--vars ] ``` | Flag / Arg | Description | |------------|-------------| | `` | Skill ID (positional) | | `--skill-id ` | Skill ID (named flag, alternative to positional) | | `--artifact ` | Artifact name, e.g. `ac-status` or `ac-status.recipe.json` (required) | | `--vars ` | Variable substitution JSON object (default: `{}`) | --- ### `skills search` Search skills by target application, intent, or keyword. ``` clawperator skills search [--app ] [--intent ] [--keyword ] ``` | Flag | Description | |------|-------------| | `--app ` | Filter by Android application ID | | `--intent ` | Filter by skill intent | | `--keyword ` | Search skill ID and summary text | At least one filter is required. --- ### `skills run` Invoke a skill's primary script as a convenience wrapper. ``` clawperator skills run [--device-id ] [-- ] ``` | Flag / Arg | Description | |------------|-------------| | `` | Skill ID (required) | | `--device-id ` | Device serial passed as first script arg | | `-- ` | Additional arguments passed through to the script | Skills are standalone programs. This command is a convenience - agents can also invoke skill scripts directly. --- ### `skills install` Clone the skills repository to `~/.clawperator/skills/`. ``` clawperator skills install ``` Prints the `CLAWPERATOR_SKILLS_REGISTRY` env var export instruction on success. --- ### `skills update` Pull latest skills from the repository. ``` clawperator skills update [--ref ] ``` | Flag | Description | |------|-------------| | `--ref ` | Pin to a specific git ref (default: `main`) | --- ### `skills sync` Sync and pin the skills index/cache to a specific git ref. ``` clawperator skills sync --ref ``` | Flag | Description | |------|-------------| | `--ref ` | Git ref to pin to (required) | Use `clawperator skills sync --help` when you need the current clone and registry-path guidance. 
--- ### `grant-device-permissions` Re-grant accessibility and notification permissions only after an Operator APK crash causes Android to revoke them. ``` clawperator grant-device-permissions [--device-id ] [--receiver-package ] [--output ] ``` This command is for **crash recovery only**. Use it after a previously working Operator APK crashes and Android revokes the accessibility or notification permissions. For initial setup, always use `clawperator operator setup` instead. Use the release package by default. Pass `--receiver-package com.clawperator.operator.dev` for local debug builds. --- ### `serve` Start a local HTTP/SSE server for remote control. ``` clawperator serve [--port ] [--host ] ``` | Flag | Description | |------|-------------| | `--port ` | Port to listen on (default: `3000`) | | `--host ` | Host to bind (default: `127.0.0.1`) | HTTP endpoints exposed: - `GET /devices` - list connected devices - `POST /execute` - run an execution payload - `POST /observe/snapshot` - capture UI snapshot - `POST /observe/screenshot` - capture screenshot - `GET /skills` - list or search skills - `GET /skills/:skillId` - get skill metadata - `POST /skills/:skillId/run` - run a skill script - `GET /events` - SSE stream of execution results See [api-overview.md](./api-overview.md) for HTTP API details. --- ### `doctor` Run environment and runtime checks. 
``` clawperator doctor [--output ] [--device-id ] [--receiver-package ] [--verbose] clawperator doctor --json clawperator doctor --fix clawperator doctor --full clawperator doctor --check-only ``` | Flag | Description | |------|-------------| | `--json` | Output as JSON (alias for `--output json`) | | `--fix` | Attempt non-destructive host fixes | | `--full` | Full Android build + install + handshake + smoke | | `--check-only` | Always exit 0 for CI or automation | | `--device-id ` | Target device serial | | `--receiver-package ` | Target Operator package | `doctor` checks APK presence before attempting version compatibility and handshake validation. Use `clawperator doctor --help` if you need the current timeout and package-target guidance. --- ### `version` Show the CLI version, or compare it with the installed Operator APK. ``` clawperator version clawperator version --check-compat [--device-id ] [--receiver-package ] ``` | Flag | Description | |------|-------------| | `--check-compat` | Compare the CLI version with the installed APK version | | `--device-id ` | Target device serial | | `--receiver-package ` | Target Operator package | `clawperator version --check-compat` reports the CLI version, installed APK version, APK `versionCode`, receiver package, compatibility verdict, and remediation guidance when versions do not match. Use `clawperator version --help` for the current compatibility-check notes and default receiver-package guidance. --- ## Exit Codes - `0` - success - `1` - error (JSON error object with `code` field printed to stdout) Error output always uses the `[Clawperator-Result]` terminal envelope format when dispatched via an execution, or a plain `{ code, message }` JSON object for CLI-level errors. 
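As a rough illustration of the exit-code contract above, a caller might interpret a finished CLI invocation as follows. This is a minimal sketch, not part of Clawperator: `interpretCliResult`, `CliError`, and `CliOutcome` are hypothetical names, and it assumes a CLI-level error arrives as a single plain `{ code, message }` JSON object on stdout. It does not attempt to parse the `[Clawperator-Result]` envelope format.

```ts
// Hypothetical caller-side helper; none of these names come from Clawperator itself.
interface CliError {
  code: string;
  message: string;
}

type CliOutcome =
  | { ok: true }
  | { ok: false; error: CliError | null; raw: string };

// Maps an exit code plus captured stdout to a structured outcome.
// Assumption: on exit code 1, stdout holds one JSON object with a `code` field.
function interpretCliResult(exitCode: number, stdout: string): CliOutcome {
  if (exitCode === 0) {
    return { ok: true };
  }
  try {
    const parsed: unknown = JSON.parse(stdout);
    if (typeof parsed === "object" && parsed !== null) {
      const obj = parsed as { [key: string]: unknown };
      if (typeof obj.code === "string") {
        const message = typeof obj.message === "string" ? obj.message : "";
        return { ok: false, error: { code: obj.code, message: message }, raw: stdout };
      }
    }
  } catch {
    // stdout was not valid JSON; fall through and report the raw text.
  }
  return { ok: false, error: null, raw: stdout };
}
```

Returning `error: null` instead of throwing keeps unparseable output available in `raw`, so the caller can still log it.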
--- # Error Codes All errors returned by the Node API use a structured `ClawperatorError` shape: ```ts { code: ErrorCode; message: string; hint?: string; details?: Record; fallback_instructions_path?: string; } ``` The `code` field is always one of the string constants listed below. --- ## Host | Code | Description | |------|-------------| | `HOST_DEPENDENCY_MISSING` | A required host-side tool or dependency is missing | --- ## Setup and Connectivity | Code | Description | |------|-------------| | `ADB_NOT_FOUND` | `adb` binary not found on the host | | `NO_DEVICES` | No Android devices are connected | | `MULTIPLE_DEVICES_DEVICE_ID_REQUIRED` | Multiple devices connected but no `--device-id` specified | | `RECEIVER_NOT_INSTALLED` | The Clawperator Operator APK is not installed on the device | | `DEVICE_NOT_FOUND` | The specified `--device-id` is not among connected devices | --- ## Execution and State | Code | Description | |------|-------------| | `EXECUTION_VALIDATION_FAILED` | The execution payload failed schema validation | | `EXECUTION_ACTION_UNSUPPORTED` | One or more action types in the payload are not supported | | `EXECUTION_CONFLICT_IN_FLIGHT` | Another execution is already running on this device | | `RESULT_ENVELOPE_TIMEOUT` | The device did not emit a `[Clawperator-Result]` envelope within the timeout | | `RESULT_ENVELOPE_MALFORMED` | The result envelope emitted by the device could not be parsed | | `SNAPSHOT_EXTRACTION_FAILED` | UI hierarchy extraction from device logs failed | --- ## UI and Nodes | Code | Description | |------|-------------| | `NODE_NOT_FOUND` | No UI node matched the provided `NodeMatcher` | | `NODE_NOT_CLICKABLE` | The matched node is not interactable | | `SECURITY_BLOCK_DETECTED` | A security overlay or lock screen blocked the action | | `CONTAINER_NOT_FOUND` | `scroll` or `scroll_until` could not locate a scrollable container. Either no scrollable node is present on screen, or the provided `container` matcher matched nothing. 
| | `CONTAINER_NOT_SCROLLABLE` | `scroll` or `scroll_until` found the matched container but it is not scrollable, and `findFirstScrollableChild` is false (or no scrollable descendant was found). | | `GESTURE_FAILED` | `scroll` step: the OS rejected the gesture dispatch. The accessibility service was running but Android declined to execute the swipe gesture. Step returns `success: false`. | --- ## Doctor and Host Checks These codes are produced by `clawperator doctor` and related checks. | Code | Description | |------|-------------| | `NODE_TOO_OLD` | Node.js version is below the required minimum | | `ADB_SERVER_FAILED` | The ADB server failed to start | | `ADB_NO_USB_PERMISSIONS` | The host lacks USB permissions to communicate with the device | | `DEVICE_UNAUTHORIZED` | Device is connected but has not authorized this host for ADB | | `DEVICE_OFFLINE` | Device is listed by ADB but is offline | | `DEVICE_SHELL_UNAVAILABLE` | ADB shell is not available on the device | | `RECEIVER_VARIANT_MISMATCH` | The installed APK variant (debug/release) does not match the expected variant | | `DEVICE_DEV_OPTIONS_DISABLED` | Developer options are not enabled on the device | | `DEVICE_USB_DEBUGGING_DISABLED` | USB debugging is not enabled on the device | | `DEVICE_ACCESSIBILITY_NOT_RUNNING` | The Clawperator accessibility service is not running | | `ANDROID_BUILD_FAILED` | The Android APK build step failed | | `ANDROID_INSTALL_FAILED` | APK installation on the device failed | | `ANDROID_APP_LAUNCH_FAILED` | The app failed to launch after install | | `SMOKE_OPEN_SETTINGS_FAILED` | Smoke test: opening device Settings failed | | `SCRCPY_NOT_FOUND` | `scrcpy` binary not found (optional dependency) | | `APK_VERSION_UNREADABLE` | The installed APK version could not be read from `adb shell dumpsys package` | | `APK_VERSION_INVALID` | The installed APK version string is not parseable for compatibility checks | | `CLI_VERSION_INVALID` | The CLI version string is not parseable for 
compatibility checks | | `VERSION_INCOMPATIBLE` | Node API and Android runtime versions are incompatible | | `LOGCAT_UNAVAILABLE` | Could not access device logcat | | `ANDROID_SDK_TOOL_MISSING` | A required Android SDK tool such as `adb`, `emulator`, `sdkmanager`, or `avdmanager` is not available | | `EMULATOR_NOT_FOUND` | The requested AVD does not exist | | `EMULATOR_ALREADY_RUNNING` | The requested operation requires the AVD to be stopped first | | `EMULATOR_NOT_RUNNING` | The requested AVD is not currently running | | `EMULATOR_UNSUPPORTED` | The AVD exists but does not satisfy Clawperator compatibility rules | | `EMULATOR_CREATE_FAILED` | Reserved generic emulator creation failure code | | `EMULATOR_START_FAILED` | Emulator process did not register with adb in time | | `EMULATOR_STOP_FAILED` | Emulator stop request failed | | `EMULATOR_DELETE_FAILED` | Emulator deletion failed | | `EMULATOR_BOOT_TIMEOUT` | Android boot completion did not finish before timeout | | `ANDROID_SYSTEM_IMAGE_INSTALL_FAILED` | Android SDK system image install or license acceptance failed | | `ANDROID_AVD_CREATE_FAILED` | `avdmanager` failed to create the AVD | --- ## Operator setup These codes are produced by `clawperator operator setup` (or the `operator install` alias). 
| Code | Description | |------|-------------| | `OPERATOR_APK_NOT_FOUND` | Local APK file not found | | `OPERATOR_INSTALL_FAILED` | `adb install` returned a non-zero exit code | | `OPERATOR_GRANT_FAILED` | One or more required device permission grants failed | | `OPERATOR_VERIFY_FAILED` | Operator package not visible to package manager after install | --- ## Internal / Other | Code | Description | |------|-------------| | `BROADCAST_FAILED` | ADB broadcast to the receiver package failed | | `PAYLOAD_TOO_LARGE` | Execution payload exceeds the 64,000 byte limit | | `DOCTOR_FAILED` | Doctor check runner encountered an unexpected error | --- ## Skills These codes are produced by the skills CLI commands (`skills list`, `skills get`, `skills search`, `skills run`, `skills compile-artifact`, `skills install`, `skills update`, `skills sync`) and may also be returned by the HTTP skills endpoints when running in serve mode. | Code | Description | |------|-------------| | `SKILL_NOT_FOUND` | No skill with the given ID exists in the registry | | `ARTIFACT_NOT_FOUND` | The named artifact does not exist for the skill | | `COMPILE_VARS_REQUIRED` | Reserved; not currently emitted | | `COMPILE_VAR_MISSING` | A required placeholder variable was not provided | | `COMPILE_VARS_PARSE_FAILED` | The `--vars` JSON string could not be parsed | | `COMPILE_VALIDATION_FAILED` | Compiled artifact failed execution schema validation | | `REGISTRY_READ_FAILED` | Could not read or parse the skills registry file | | `SKILL_SCRIPT_NOT_FOUND` | The skill's script file does not exist on disk | | `SKILL_EXECUTION_FAILED` | The skill script exited with a non-zero code | | `SKILL_EXECUTION_TIMEOUT` | The skill script exceeded the execution timeout | | `SKILLS_SYNC_FAILED` | Git clone or pull of the skills repository failed | | `SKILLS_GIT_NOT_FOUND` | `git` is not installed or not on PATH | --- ## Diagnostic Types Some errors include additional fields for deeper diagnosis. 
### TimeoutDiagnostics

Returned when `RESULT_ENVELOPE_TIMEOUT` occurs.

```ts
{
  code: "RESULT_ENVELOPE_TIMEOUT";
  message: string;
  lastCorrelatedEvents?: string[];   // last logcat lines correlated to this command
  broadcastDispatchStatus?: string;  // result of the ADB broadcast call
  deviceId?: string;
  receiverPackage?: string;
}
```

### BroadcastDiagnostics

Returned when `BROADCAST_FAILED` or `RECEIVER_NOT_INSTALLED` occurs.

```ts
{
  code: "BROADCAST_FAILED" | "RECEIVER_NOT_INSTALLED";
  message: string;
  lastCorrelatedEvents?: string[];
  broadcastDispatchStatus?: string;
  deviceId?: string;
  receiverPackage?: string;
}
```

---

## Doctor Check Result

`clawperator doctor` returns a `DoctorReport`:

```ts
{
  ok: boolean;
  deviceId?: string;
  receiverPackage?: string;
  checks: DoctorCheckResult[];
  nextActions?: string[];
}
```

Each check in `checks`:

```ts
{
  id: string;            // e.g. "host.adb.presence"
  status: "pass" | "warn" | "fail";
  code?: string;         // one of the error codes above
  summary: string;
  detail?: string;
  fix?: {
    title: string;
    platform: "mac" | "linux" | "win" | "any";
    steps: Array<{ kind: "shell" | "manual"; value: string }>;
  };
  deviceGuidance?: {
    screen: string;
    steps: string[];
  };
  evidence?: Record;
}
```

---

# API Overview

Clawperator exposes a Node-based interface for agent-driven device automation. The agent (brain) calls this API to dispatch actions to an Android device (hand). The API handles device resolution, payload validation, broadcast dispatch, and result collection.

---

## Interaction Model

1. Agent constructs an `Execution` payload.
2. Agent calls `execute` (CLI) or `POST /execute` (HTTP).
3. Clawperator validates the payload, resolves the device, dispatches via ADB broadcast, and waits for a `[Clawperator-Result]` envelope from logcat.
4. The result envelope is returned to the agent.

Single-flight enforcement: only one execution per device runs at a time. Concurrent calls return `EXECUTION_CONFLICT_IN_FLIGHT`.
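The single-flight rule can be mirrored on the caller side so that an agent avoids dispatching a second execution to a busy device. The sketch below is illustrative only: `DeviceFlightTracker` and its methods are hypothetical names, and it models the documented behavior rather than Clawperator's internal implementation.

```ts
// Hypothetical caller-side model of per-device single-flight enforcement.
type AcquireResult =
  | { ok: true }
  | { ok: false; code: "EXECUTION_CONFLICT_IN_FLIGHT" };

class DeviceFlightTracker {
  // Prototype-free object so arbitrary device serials are safe to use as keys.
  private busy: { [deviceId: string]: boolean } = Object.create(null);

  // Marks the device as busy, or reports a conflict if it already is.
  acquire(deviceId: string): AcquireResult {
    if (this.busy[deviceId] === true) {
      return { ok: false, code: "EXECUTION_CONFLICT_IN_FLIGHT" };
    }
    this.busy[deviceId] = true;
    return { ok: true };
  }

  // Call once the result envelope (or an error) has been received.
  release(deviceId: string): void {
    delete this.busy[deviceId];
  }
}
```

Tracking is keyed by device, so executions on different devices do not block one another.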
Clawperator can also provision and manage Android emulators through the Node layer. That gives agents a deterministic alternative runtime to a physical Android device. --- ## Execution Payload The core unit dispatched to the device. ```json { "commandId": "my-cmd-001", "taskId": "my-task-001", "source": "my-agent", "expectedFormat": "android-ui-automator", "timeoutMs": 30000, "actions": [ { "id": "step1", "type": "click", "params": { "matcher": { "resourceId": "com.example:id/submit" } } } ] } ``` ### Fields | Field | Type | Required | Description | |-------|------|----------|-------------| | `commandId` | string | yes | Unique command identifier (max 128 chars) | | `taskId` | string | yes | Task correlation ID (max 128 chars) | | `source` | string | yes | Caller identifier (max 64 chars) | | `expectedFormat` | string | yes | Must be `"android-ui-automator"` | | `timeoutMs` | number | yes | Timeout in ms (1000-120000) | | `actions` | array | yes | 1-50 actions | | `mode` | string | no | `"artifact_compiled"` or `"direct"` | ### Limits | Limit | Value | |-------|-------| | Max actions per execution | 50 | | Min timeout | 1,000 ms | | Max timeout | 120,000 ms | | Max payload size | 64,000 bytes | | Max ID length | 128 chars | | Max source length | 64 chars | | Max matcher value length | 512 chars | --- ## Action Types ### Canonical Types | Type | Required params | Description | |------|----------------|-------------| | `open_uri` | `uri` | Open a URI via the system default handler | | `open_app` | `applicationId` | Launch an app | | `close_app` | `applicationId` | Force-stop an app | | `click` | `matcher` | Tap a UI node | | `scroll_and_click` | `target` | Scroll to and tap a node | | `scroll` | - | Single scroll gesture with outcome reporting | | `scroll_until` | - | Bounded scroll loop with machine-readable termination reason | | `read_text` | `matcher` | Read text from a UI node | | `enter_text` | `matcher`, `text` | Type text into a UI node | | `wait_for_node` | 
`matcher` | Wait for a node to appear | | `snapshot_ui` | - | Capture the canonical `hierarchy_xml` UI tree | | `take_screenshot` | - | Capture screen as PNG | | `sleep` | `durationMs` | Pause execution | | `press_key` | `key` | Issue a system navigation key via accessibility | ### Aliases (normalized at input) | Alias | Canonical type | |-------|---------------| | `tap`, `press` | `click` | | `wait_for`, `find`, `find_node` | `wait_for_node` | | `read` | `read_text` | | `snapshot` | `snapshot_ui` | | `screenshot`, `capture_screenshot` | `take_screenshot` | | `type_text`, `text_entry`, `input_text` | `enter_text` | | `open_url` | `open_uri` | | `key_press` | `press_key` | --- ## NodeMatcher (Selector) Used to identify a UI node. At least one field is required. ```json { "resourceId": "com.example.app:id/button_ok", "role": "android.widget.Button", "textEquals": "OK", "textContains": "Submit", "contentDescEquals": "Submit button", "contentDescContains": "submit" } ``` All fields are optional but at least one must be non-empty. Values are ORed internally by the device runtime. 
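A caller can check a matcher against the documented constraints before dispatching it. The following is a minimal sketch, assuming the six fields shown above are the complete set and that the 512-character matcher value limit from the Limits table applies to each field. `validateMatcher` is a hypothetical helper, and length is measured in UTF-16 code units here, which may differ from how the runtime counts characters.

```ts
// Field names are taken from the NodeMatcher example above.
interface NodeMatcher {
  resourceId?: string;
  role?: string;
  textEquals?: string;
  textContains?: string;
  contentDescEquals?: string;
  contentDescContains?: string;
}

const MATCHER_FIELDS: Array<keyof NodeMatcher> = [
  "resourceId",
  "role",
  "textEquals",
  "textContains",
  "contentDescEquals",
  "contentDescContains",
];

// From the Limits table: max matcher value length is 512 chars.
const MAX_MATCHER_VALUE_LENGTH = 512;

// Hypothetical pre-flight check; returns a list of problems (empty when valid).
function validateMatcher(matcher: NodeMatcher): string[] {
  const problems: string[] = [];
  let nonEmptyCount = 0;
  for (let i = 0; i < MATCHER_FIELDS.length; i++) {
    const field = MATCHER_FIELDS[i];
    const value = matcher[field];
    if (value === undefined || value === "") {
      continue;
    }
    nonEmptyCount += 1;
    if (value.length > MAX_MATCHER_VALUE_LENGTH) {
      problems.push(field + " exceeds " + MAX_MATCHER_VALUE_LENGTH + " characters");
    }
  }
  if (nonEmptyCount === 0) {
    problems.push("at least one matcher field must be non-empty");
  }
  return problems;
}
```

Collecting every problem in one pass, rather than stopping at the first, lets an agent repair a payload in a single retry.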
--- ## Action Params Reference | Param | Type | Used by | |-------|------|---------| | `uri` | string | `open_uri` | | `applicationId` | string | `open_app`, `close_app` | | `matcher` | NodeMatcher | `click`, `read_text`, `enter_text`, `wait_for_node` | | `text` | string | `enter_text` | | `submit` | boolean | `enter_text` - press enter after typing | | `clear` | boolean | `enter_text` - clear field before typing | | `clickType` | string | `click` - `default`, `long_click`, or `focus` | | `target` | NodeMatcher | `scroll_and_click` | | `container` | NodeMatcher | `scroll_and_click`, `scroll`, `scroll_until` | | `direction` | string | `scroll_and_click`, `scroll`, `scroll_until` | | `maxSwipes` | number | `scroll_and_click` | | `clickAfter` | boolean | `scroll_and_click` - when `false`, scroll to target without clicking | | `maxScrolls` | number | `scroll_until` - maximum scroll iterations (default: 20) | | `maxDurationMs` | number | `scroll_until` - wall-clock cap in ms (default: 10000) | | `noPositionChangeThreshold` | number | `scroll_until` - consecutive no-movement scrolls before stopping (default: 3) | | `durationMs` | number | `sleep` | | `key` | `"back"\|"home"\|"recents"` | `press_key` | | `path` | string | `take_screenshot` - output file path | | `distanceRatio` | number | `scroll_and_click`, `scroll`, `scroll_until` | | `settleDelayMs` | number | `scroll_and_click`, `scroll`, `scroll_until` | | `findFirstScrollableChild` | boolean | `scroll_and_click`, `scroll`, `scroll_until` - auto-use first scrollable descendant (default: `true`) | | `retry` | object | per-step retry config | For `scroll` and `scroll_until`, omitting `container` uses the first visible `scrollable="true"` node. That is convenient on simple screens, but on nested-scroll layouts agents should prefer an explicit `container.resourceId` taken from `snapshot_ui`. --- ## Result Envelope All executions return a `ResultEnvelope` via the `[Clawperator-Result]` terminal signal. 
```json { "commandId": "my-cmd-001", "taskId": "my-task-001", "status": "success", "stepResults": [ { "id": "step1", "actionType": "click", "success": true, "data": {} } ], "error": null } ``` ### ResultEnvelope Fields | Field | Type | Description | |-------|------|-------------| | `commandId` | string | Correlates to the dispatched command | | `taskId` | string | Correlates to the task | | `status` | `"success"\|"failed"` | Overall execution status | | `stepResults` | array | Per-action results | | `error` | string or null | Top-level error message if failed | ### StepResult Fields | Field | Type | Description | |-------|------|-------------| | `id` | string | Action ID from the execution | | `actionType` | string | Canonical action type | | `success` | boolean | Whether the step succeeded | | `data` | object | Action-specific output data | | `error` | string | Step-level error message | For `snapshot_ui`, `data.text` contains the UI tree string. For `take_screenshot`, `data.path` contains the local file path of the PNG. For `press_key`, `data.key` contains the normalized lowercase key name. If Android rejects the global action, the step returns `success: false` with `data.error: "GLOBAL_ACTION_FAILED"`. --- ## HTTP API (serve mode) Start with `clawperator serve [--port 3000] [--host 127.0.0.1]`. ### `GET /android/emulators` Return configured Android Virtual Devices with normalized compatibility metadata. **Response:** ```json { "avds": [ { "name": "clawperator-pixel", "exists": true, "running": false, "supported": true, "apiLevel": 35, "abi": "arm64-v8a", "playStore": true, "deviceProfile": "pixel_7", "systemImage": "system-images;android-35;google_apis_playstore;arm64-v8a", "unsupportedReasons": [] } ] } ``` ### `GET /android/emulators/:name` Return the normalized view of one AVD. This is the emulator diagnosis endpoint. ### `GET /android/emulators/running` Return running emulator devices and boot state. 
**Response:** ```json { "devices": [ { "type": "emulator", "avdName": "clawperator-pixel", "serial": "emulator-5554", "booted": true } ] } ``` ### `POST /android/emulators/create` Create a new supported AVD. The request body may include: - `name` - `apiLevel` - `deviceProfile` - `abi` - `playStore` ### `POST /android/emulators/:name/start` Start an existing AVD and return a booted emulator device: ```json { "type": "emulator", "avdName": "clawperator-pixel", "serial": "emulator-5554", "booted": true } ``` ### `POST /android/emulators/:name/stop` Stop a running emulator by AVD name. ### `DELETE /android/emulators/:name` Delete an AVD by name. ### `POST /android/provision/emulator` Provision a supported emulator using deterministic reuse-first orchestration: 1. reuse a running supported emulator 2. start a stopped supported AVD 3. create a new supported AVD The default profile is Android API `35`, Google Play, ABI `arm64-v8a`, device profile `pixel_7`, and AVD name `clawperator-pixel`. ### `GET /devices` Returns connected devices. **Response:** ```json { "ok": true, "devices": [{ "serial": "", "state": "device" }] } ``` ### `POST /execute` Run an execution payload. **Request body:** ```json { "execution": { /* Execution object */ }, "deviceId": "", "receiverPackage": "com.clawperator.operator" } ``` **Response (success):** ```json { "ok": true, "envelope": { /* ResultEnvelope */ }, "deviceId": "" } ``` **Response (error):** ```json { "ok": false, "error": { "code": "ERROR_CODE", "message": "..." } } ``` **HTTP status codes:** | Code | Condition | |------|-----------| | 200 | Success | | 400 | Validation error, missing fields, ambiguous device | | 404 | Device not found or no devices | | 413 | Payload too large | | 423 | Execution conflict (in-flight) | | 504 | Result envelope timeout | ### `POST /observe/snapshot` Capture UI snapshot. Body: `{ "deviceId"?, "receiverPackage"? }`. Same response shape as `/execute`. ### `POST /observe/screenshot` Capture screenshot. 
Body: `{ "deviceId"?, "receiverPackage"? }`. Same response shape as `/execute`. The PNG path is in `envelope.stepResults[0].data.path`.

### `GET /skills`

List or search skills. Use query parameters to filter.

**Query parameters:** `?app=&intent=&keyword=` (all optional).

**Response:**

```json
{
  "ok": true,
  "skills": [{ "id": "...", "applicationId": "...", "intent": "...", "summary": "..." }],
  "count": 2
}
```

### `GET /skills/:skillId`

Get metadata for a specific skill.

**Response (success):**

```json
{
  "ok": true,
  "skill": { "id": "...", "applicationId": "...", "intent": "...", "summary": "...", "scripts": [...], "artifacts": [...] }
}
```

**Response (not found):** HTTP 404

```json
{ "ok": false, "error": { "code": "SKILL_NOT_FOUND", "message": "..." } }
```

### `POST /skills/:skillId/run`

Run a skill script (convenience wrapper).

**Request body:**

```json
{ "deviceId": "", "args": ["extra", "args"] }
```

Both fields are optional.

**Response (success):**

```json
{ "ok": true, "skillId": "...", "output": "...", "exitCode": 0, "durationMs": 8500 }
```

**Response (error):**

```json
{
  "ok": false,
  "error": { "code": "SKILL_EXECUTION_FAILED", "message": "...", "skillId": "...", "exitCode": 1, "stderr": "..." }
}
```

### `GET /events` (SSE)

Server-Sent Events stream. Emits three event types:

- `clawperator:result` - fired when an execution completes: `{ deviceId, envelope }`
- `clawperator:execution` - fired for every execution attempt: `{ deviceId, input, result }`
- `heartbeat` - initial connection confirmation

---

## Environment Variables

| Variable | Description |
|----------|-------------|
| `ADB_PATH` | Override path to `adb` binary |
| `CLAWPERATOR_RECEIVER_PACKAGE` | Default receiver package (fallback if not passed as option) |
| `CLAWPERATOR_SKILLS_REGISTRY` | Path to `skills-registry.json`. If unset, defaults to `./skills/skills-registry.json` relative to the working directory. After `skills install`, set to `~/.clawperator/skills/skills/skills-registry.json`.
| --- ## Receiver Packages | Variant | Package ID | |---------|-----------| | Release | `com.clawperator.operator` | | Debug / Local | `com.clawperator.operator.dev` | --- # Clawperator Doctor `clawperator doctor` is the runtime readiness check for the Node CLI. It verifies that the host environment, connected device, and installed [Clawperator Operator Android app](../getting-started/android-operator-apk.md) are in a usable state, and that the end-to-end command path is functional, before an agent relies on the device. This page describes the current shipped behavior. It replaces the older v0.1 design notes. ## Command Surface ```bash clawperator doctor [--output ] [--device-id ] [--receiver-package ] clawperator doctor --json clawperator doctor --fix clawperator doctor --full clawperator doctor --check-only ``` Supported flags: - `--output pretty|json` - select human-readable or machine-readable output - `--format pretty|json` - alias for `--output` - `--json` - shorthand for `--output json` - `--device-id ` - target one device when multiple are connected - `--receiver-package ` - override the target Operator package - `--fix` - run shell-based remediation steps from non-passing checks (both `fail` and `warn`) - `--full` - include Android build, install, launch, and smoke test checks - `--check-only` - always exit `0`, even when critical checks fail; does not change halt behavior (doctor still returns early on critical failures) Default receiver package: - release app package: `com.clawperator.operator` - local debug app package: `com.clawperator.operator.dev` If you use a local debug build of the [Clawperator Operator Android app](../getting-started/android-operator-apk.md), pass `--receiver-package com.clawperator.operator.dev` consistently to `doctor`, `operator setup`, `grant-device-permissions`, `version --check-compat`, and `observe snapshot`. ## What Doctor Checks Doctor runs checks in a fixed order. 
When a critical check fails, doctor returns immediately - all subsequent checks are skipped. The one exception is `device.capability`: it is a critical check (its failure marks the report as not ok), but a failure there does not halt the run; doctor continues into the runtime readiness phase regardless. `--full` mode inserts additional build/install/launch checks at two points in the sequence rather than appending them at the end. The full execution order is: ### 1. Host checks - `host.node.version` - Node.js major version must be `22` or newer - `host.adb.presence` - `adb` must be installed and reachable in `PATH` - `host.adb.server` - `adb start-server` must succeed - `host.java.version` - Java 17 or 21 must be installed (`--full` only) - `build.android.assemble` - runs `./gradlew :app:assembleDebug` (`--full` only) ### 2. Device discovery - `device.discovery` - exactly one reachable target device must be available, or `--device-id` must disambiguate multiple devices - `build.android.install` - runs `./gradlew :app:installDebug` (`--full` only) - `build.android.launch` - launches `clawperator.activity.MainActivity` (`--full` only) - `device.capability` - the target device shell must be reachable; the report also captures SDK level, `wm size`, and `wm density` as evidence ### 3. 
Runtime readiness - `readiness.apk.presence` - confirms the requested receiver package is installed, or warns if the other release/debug variant is installed instead - `readiness.version.compatibility` - verifies that the CLI and installed [Clawperator Operator Android app](../getting-started/android-operator-apk.md) share a compatible `major.minor` - `readiness.settings.dev_options` - warns if Android Developer Options are disabled - `readiness.settings.usb_debugging` - warns if USB debugging is disabled - `readiness.handshake` - sends a `doctor_ping` command and waits for one canonical `[Clawperator-Result]` envelope - `readiness.smoke` - opens Android Settings and runs a `snapshot_ui` step; passes if the snapshot command succeeds (`--full` only, runs after handshake) ## Critical vs Advisory Checks Not every warning makes the environment unusable. Critical checks currently include: - host Node/adb/java checks - device discovery and shell availability - Android build/install/launch checks in `--full` - [Clawperator Operator Android app](../getting-started/android-operator-apk.md) version compatibility - handshake - smoke test in `--full` Advisory warnings currently include: - [Clawperator Operator Android app](../getting-started/android-operator-apk.md) not installed or wrong release/debug variant installed - Developer Options disabled - USB debugging disabled Exit behavior: - normal mode exits `0` when all critical checks pass - normal mode exits `1` when any critical check fails - `--check-only` always exits `0` In other words, `doctor` is allowed to report warnings while still exiting successfully if the critical command path is usable. 
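The exit behavior above reduces to a small decision, sketched here against the `criticalOk` field of the JSON report. `doctorExitCode`, `countByStatus`, and the `DoctorReportLike` type are hypothetical names for illustration, not the CLI's actual code.

```ts
// Minimal subset of the doctor JSON report needed for this sketch.
interface DoctorCheckLike {
  id: string;
  status: "pass" | "warn" | "fail";
}

interface DoctorReportLike {
  criticalOk: boolean;
  checks: DoctorCheckLike[];
}

// Mirrors the documented exit rules:
// `--check-only` always exits 0; otherwise 0 only when all critical checks pass.
function doctorExitCode(report: DoctorReportLike, checkOnly: boolean): 0 | 1 {
  if (checkOnly) {
    return 0;
  }
  return report.criticalOk ? 0 : 1;
}

// Tallies check statuses, e.g. to show that warnings alone do not fail the run.
function countByStatus(checks: DoctorCheckLike[]): { pass: number; warn: number; fail: number } {
  const counts = { pass: 0, warn: 0, fail: 0 };
  for (let i = 0; i < checks.length; i++) {
    counts[checks[i].status] += 1;
  }
  return counts;
}
```

A report can therefore contain `warn` entries, such as disabled Developer Options, and still map to exit code `0`.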
## Output Model ### Pretty output Pretty output groups results into: - critical checks - advisory checks - a count of additional passed checks - a final readiness summary - `Next actions` with commands or manual steps ### JSON output `--output json`, `--format json`, and `--json` all return a `DoctorReport`: ```json { "ok": true, "criticalOk": true, "deviceId": "", "receiverPackage": "com.clawperator.operator", "checks": [ { "id": "readiness.handshake", "status": "pass", "summary": "Handshake successful." } ], "nextActions": [ "Try: clawperator observe snapshot --device-id " ] } ``` Important fields: - `ok` - currently mirrors whether all critical checks passed - `criticalOk` - explicit critical-check verdict used by the CLI exit code - `checks[]` - ordered check results with IDs, status, summary, and optional diagnostics - `nextActions[]` - optional; deduplicated shell commands or manual instructions; populated from non-passing check remediation steps, or suggested follow-up commands when all checks pass; omitted when there are no actions to surface. Note: when `--fix` is used, shell remediation steps are executed during finalization and are not included in `nextActions` - only manual and on-device guidance steps remain Each `DoctorCheckResult` can also include: - `code` - stable error code when one is available - `detail` - failure detail or extra context - `fix` - shell or manual remediation steps - `deviceGuidance` - on-device navigation instructions - `evidence` - structured diagnostic facts such as versions or device properties ## How `--fix` Works Today `--fix` does not have a separate repair plan. After the check run completes (or halts early on a critical failure), doctor applies available shell-based remediation steps from the collected checks during finalization. Checks are not re-run after fixes are applied. If the run halted early, only checks that ran before the halt will have their fix steps executed. 
Today that can include actions such as:

- restarting the adb server
- running `clawperator grant-device-permissions` after an Operator APK crash revoked permissions
- running follow-up diagnostic commands suggested by handshake failures

Manual steps and on-device guidance are still reported in `nextActions`; they are not automated.

Because `--fix` is driven by per-check shell steps, its behavior is intentionally narrow and deterministic. It is best treated as a convenience repair pass, not a full interactive bootstrap flow.

## Handshake Semantics

The handshake is the core end-to-end proof that the Node CLI can talk to the Operator app:

1. Doctor clears logcat for a clean capture window.
2. It broadcasts a `doctor_ping` execution to the configured receiver package.
3. It waits for a correlated `[Clawperator-Result]` envelope for up to 7000 ms.
4. It fails if the broadcast itself fails, the envelope times out, or the Operator returns a runtime error.

On handshake timeout, the report includes:

- broadcast dispatch status
- receiver package
- device id when available
- follow-up commands such as `clawperator grant-device-permissions` after a crash-revocation event and `clawperator observe snapshot --timeout-ms 5000`

## Common Usage

Basic local check:

```bash
clawperator doctor --output pretty
```

Machine-readable installer or automation check:

```bash
clawperator doctor --format json
```

Target a specific device and debug build of the [Clawperator Operator Android app](../getting-started/android-operator-apk.md):

```bash
clawperator doctor \
  --device-id \
  --receiver-package com.clawperator.operator.dev \
  --output pretty
```

Best-effort repair pass:

```bash
clawperator doctor --fix
```

Full Android build/install/smoke validation:

```bash
clawperator doctor --full
```

## Related Commands

- `clawperator version --check-compat` - version compatibility check without the full doctor report
- `clawperator grant-device-permissions` - restore Accessibility and related app ops
after an Operator APK crash causes Android to revoke them - `clawperator observe snapshot` - direct runtime check once doctor reports the environment is ready For initial installation and device setup, see [First-Time Setup](../getting-started/first-time-setup.md) and [OpenClaw First Run](../getting-started/openclaw-first-run.md). --- # Troubleshooting # Version Compatibility Clawperator expects the Node CLI and the installed [Clawperator Operator Android app](../getting-started/android-operator-apk.md) to move together. ## Compatibility rule The CLI and the [Clawperator Operator Android app](../getting-started/android-operator-apk.md) are compatible when their `major.minor` versions match. Examples: - CLI `0.1.4` and app `0.1.4` - compatible - CLI `0.1.4` and app `0.1.9` - compatible - CLI `0.1.4` and app `0.1.4-d` - compatible - CLI `0.1.4` and app `0.1.4-rc.1` - compatible - CLI `0.1.4` and app `0.2.1` - not compatible - CLI `0.1.4` and app `0.2.2` - not compatible - CLI `0.1.4` and app `0.2.3` - not compatible - CLI `0.1.4` and app `0.2.4` - not compatible - CLI `0.1.4` and app `0.2.5` - not compatible - CLI `0.1.4` and app `0.3.0` - not compatible Notes: - Patch differences are allowed. - The local debug suffix `-d` is ignored for compatibility checks. - Prerelease suffixes such as `-alpha.1`, `-beta.2`, and `-rc.1` are parsed, but compatibility still depends only on matching `major.minor`. 
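
The rule above can be expressed as a small comparison. This is a hedged sketch rather than the CLI's own parser: the function names and the accepted version grammar (`major.minor.patch` plus an optional `-suffix`) are assumptions made for illustration.

```javascript
// Illustrative parser, not the CLI's own: accepts "major.minor.patch" with an
// optional suffix such as "-d" or "-rc.1" and keeps only major and minor.
function majorMinor(version) {
  const match = /^(\d+)\.(\d+)\.\d+(?:-[0-9A-Za-z.]+)?$/.exec(version);
  return match ? `${match[1]}.${match[2]}` : null;
}

function isCompatible(cliVersion, appVersion) {
  const cli = majorMinor(cliVersion);
  const app = majorMinor(appVersion);
  // Unparseable versions are treated as incompatible in this sketch.
  return cli !== null && cli === app;
}

console.log(isCompatible("0.1.4", "0.1.9"));      // true
console.log(isCompatible("0.1.4", "0.1.4-rc.1")); // true
console.log(isCompatible("0.1.4", "0.2.1"));      // false
```

Patch numbers and suffixes are parsed but deliberately discarded, which mirrors the notes above.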
## Check versions Print the CLI version: ```bash clawperator version ``` Check the CLI against the installed [Clawperator Operator Android app](../getting-started/android-operator-apk.md): ```bash clawperator version --check-compat --receiver-package com.clawperator.operator ``` The compatibility check reports: - CLI version - installed [Clawperator Operator Android app](../getting-started/android-operator-apk.md) version - installed [Clawperator Operator Android app](../getting-started/android-operator-apk.md) `versionCode` - receiver package checked - compatibility verdict - remediation guidance when versions do not match `clawperator doctor` also runs this check after confirming the requested receiver package is installed. ## Common mismatch symptoms When the CLI and the [Clawperator Operator Android app](../getting-started/android-operator-apk.md) are out of sync, you may see: - `VERSION_INCOMPATIBLE` from `clawperator doctor` or `clawperator version --check-compat` - `RESULT_ENVELOPE_MALFORMED` if the CLI and the [Clawperator Operator Android app](../getting-started/android-operator-apk.md) disagree on result shape - `EXECUTION_ACTION_UNSUPPORTED` when the CLI sends an action the [Clawperator Operator Android app](../getting-started/android-operator-apk.md) does not support yet - timeouts or handshake failures after a partial upgrade ## Remediation Upgrade the CLI: ```bash npm install -g clawperator@latest ``` Install a compatible [Clawperator Operator Android app](../getting-started/android-operator-apk.md): ```bash clawperator operator setup --apk ``` If you are using a local debug build, make sure the receiver package matches the installed variant: - release app package: `com.clawperator.operator` - debug app package: `com.clawperator.operator.dev` If the [Clawperator Operator Android app](../getting-started/android-operator-apk.md) version cannot be read, verify the device can return package metadata: ```bash adb shell dumpsys package ``` --- # 
Troubleshooting the Operator App The Clawperator operator app on Android must satisfy **three requirements** before it is ready to accept agent commands. The in-app "doctor" screen shows the current state and turns **green** only when all three are met. This document walks through each requirement and how to fix it. ## The three requirements 1. **Developer settings enabled** - The device’s Developer options menu is unlocked. 2. **USB debugging enabled** - USB debugging is turned on in Developer options. 3. **Permissions granted** - The Clawperator accessibility (operator) service is enabled and running. If any requirement is not met, the app shows an orange background and a dedicated screen explaining what to do. --- ## 1. Developer settings enabled **What it means:** Android hides the Developer options menu until you unlock it by tapping "Build number" (or "System version") multiple times. **How to fix:** 1. Open **Settings** on the device. 2. Go to **About phone** (or **About device**). 3. Find **Build number** (on some devices this is under "Software information" or labeled "System version"). 4. Tap **Build number** **7 times in a row**. You should see a message like "You are now a developer!" or "Developer mode has been enabled." 5. Go back to the main Settings screen. You should now see **Developer options** (often under System or Additional settings). **In the app:** If this requirement is not met, the doctor screen shows "Android Developer mode must be turned on" and steps similar to the above. Use **Open system settings** to jump to Settings; on some devices the app opens About phone or Developer options directly. --- ## 2. USB debugging enabled **What it means:** Even after Developer options are visible, USB debugging must be turned on so that your computer (and tools like `adb` or the Node API) can talk to the device. **How to fix:** 1. Ensure **Developer options** are enabled (see requirement 1). 2. Open **Settings** → **Developer options**. 3. 
Find **USB debugging** and turn it **On**. 4. When you connect the device via USB (or use wireless debugging), you may see a prompt to **Allow USB debugging** for this computer. Check "Always allow from this computer" if you want to avoid the prompt next time, then tap **Allow**. **In the app:** If developer options are on but USB debugging is off, the doctor screen shows "USB debugging must be turned on" and an **Open Developer options** button to open that settings screen. --- ## 3. Permissions granted (accessibility / operator service) **What it means:** The Clawperator operator relies on an Android **Accessibility service** to inspect and act on the UI. That service must be enabled in system settings and running. **How to fix:** 1. On the **host computer**, run: ```bash clawperator grant-device-permissions ``` This uses `adb` to enable the Clawperator accessibility service on the connected device without requiring screen interaction. Add `--device-id ` if multiple devices are connected. 2. On the **device** (manual alternative): - Open **Settings** → **Accessibility** (or **Settings** → **Apps** → **Special app access** → **Accessibility**). - Find the **Clawperator** (or operator) service and turn it **On**. - Confirm any system dialog (e.g. "Allow [app] to observe your actions..."). 3. After enabling, run `clawperator doctor` to confirm the handshake passes. If doctor still fails, wait 2-3 seconds for the service to initialize and retry. **Note:** Android revokes the accessibility permission if the Clawperator app crashes. If the doctor screen shows "Permissions not granted" after the app had been ready, the service may have been disabled by a crash - re-run `clawperator grant-device-permissions` or re-enable the Clawperator service in Settings → Accessibility. --- ## Wireless Debugging (YMMV) Clawperator is designed to work with a dedicated, **always-on, permanently powered** Android device. 
For maximum reliability, a physical USB connection is strongly recommended. If you must use **Wireless Debugging**, be aware that your mileage may vary (YMMV) as connection stability can drop unexpectedly. 1. Ensure both the Android device and your host computer are on the same Wi-Fi network. 2. Go to **Settings** → **Developer options**. 3. Turn on **Wireless debugging**. 4. Tap **Wireless debugging** to see the IP address and port (e.g., `192.168.1.100:5555`). 5. On your computer, run: ```bash adb connect : ``` **Warning:** Wireless debugging sessions are prone to disconnection. If the device drops off the network, the Node CLI will return `NO_DEVICES`. For production use, always prefer a wired connection. --- ## Installer behavior `curl -fsSL https://clawperator.com/install.sh | bash` uses the stable metadata file at `https://downloads.clawperator.com/operator/latest.json`, downloads the immutable package for the [Clawperator Operator Android app](../getting-started/android-operator-apk.md) plus its `.sha256`, verifies the checksum, then handles device install like this: 1. **One connected device** - the installer offers to run `clawperator operator setup --apk ~/.clawperator/downloads/operator.apk --device-id `. 2. **Multiple connected devices** - the installer skips the install and prints `clawperator operator setup --apk ~/.clawperator/downloads/operator.apk --device-id `. 3. **No connected devices** - the installer skips the install and leaves the verified package for the [Clawperator Operator Android app](../getting-started/android-operator-apk.md) at `~/.clawperator/downloads/operator.apk`. 4. **`adb` missing** - the installer attempts to install `adb` automatically, or stops with a manual install link if it cannot. ## Emulator-Specific Issues Clawperator can provision a local Android emulator through the Node CLI and API. If provisioning fails, use the checks below. 
### Missing Android SDK tools

If provisioning returns `ANDROID_SDK_TOOL_MISSING`, verify that all required tools are available:

```bash
which adb
which emulator
which sdkmanager
which avdmanager
```

If one tool is outside your normal shell `PATH`, pass it explicitly when starting the HTTP API or CLI process:

```bash
ADB_PATH=/path/to/adb \
EMULATOR_PATH=/path/to/emulator \
SDKMANAGER_PATH=/path/to/sdkmanager \
AVDMANAGER_PATH=/path/to/avdmanager \
clawperator provision emulator
```

### AVD exists but is unsupported

If `clawperator emulator inspect --output json` shows `supported: false`, the AVD will not be auto-selected by provisioning.

Clawperator currently supports:

- Android API level `35`
- Google Play system image
- ABI `arm64-v8a`
- device profile `pixel_7`

Inspect the normalized metadata and unsupported reasons:

```bash
clawperator emulator inspect --output json
```

Clawperator evaluates compatibility from:

- `~/.android/avd/.avd/config.ini`
- `~/.android/avd/.ini`

The key fields are:

- `PlayStore.enabled`
- `abi.type`
- `image.sysdir.1`
- `hw.device.name`

### Emulator starts but never becomes ready

Provisioning waits for two Android boot signals:

- `getprop sys.boot_completed`
- `getprop dev.bootcomplete`

If either never flips to `1`, Clawperator returns `EMULATOR_BOOT_TIMEOUT`. This usually points to a broken AVD, stale emulator state, or a host-level emulator issue.

Recommended recovery:

1. Stop the emulator with `clawperator emulator stop `.
2. Delete the AVD with `clawperator emulator delete `.
3. Re-run `clawperator provision emulator`.

Clawperator starts emulators with `-no-snapshot-load` to avoid stale snapshot state, so repeated boot timeouts usually indicate a deeper emulator or SDK problem.

### Emulator process launches but does not appear in adb

If start returns `EMULATOR_START_FAILED`, the emulator process did not register with adb before the registration timeout expired.

Check:

- `adb devices`
- `clawperator emulator status --output json`

If the emulator window appears but adb never sees it, restart the adb server:

```bash
adb kill-server
adb start-server
```

Then retry:

```bash
clawperator emulator start clawperator-pixel
```

### App requires a Google account or Play Store sign-in

The default emulator profile uses a Google Play system image. Some apps require a Google account to be signed in to the emulator before they can run. Clawperator does not handle account setup.

To sign in:

1. Open the Play Store app on the emulator
2. Sign in with a Google account
3. Accept any prompts
4. Return to the home screen before running automations

Some apps require additional configuration (such as accepting terms of service or completing a first-run flow) before Clawperator can interact with them. If an agent is blocked on a login screen or onboarding flow, treat that as device-preparation work that must be completed by the user.

### App not installed or not detected on the emulator

If an automation targets an app that is not installed on the emulator, the `open_app` step will fail - the execution envelope will return `status: "failed"` with the reason in `envelope.error`. `NODE_NOT_FOUND` is a selector/matcher error and will not appear for a missing app. Install the app from the Play Store or via `adb install` before running.

### Slow emulator boot or sluggish UI

Android emulators are resource-intensive. On machines without hardware virtualization or GPU acceleration, boots can be slow and UI interactions may be sluggish.

Recommended settings:

- Ensure hardware virtualization / emulator acceleration is enabled on the host (for example, Hypervisor.framework on macOS, WHPX on Windows, or KVM on Linux)
- Ensure the Android emulator has GPU acceleration enabled (check AVD configuration)
- Allocate sufficient RAM to the emulator (2 GB minimum recommended)

If the emulator is consistently slow, consider using a physical device instead.

### Multiple devices connected

Once an emulator is provisioned, you may have both a physical device and an emulator connected at the same time. In that state, continue to pass `--device-id ` to `execute`, `observe`, `action`, and `skills run` commands.

### Installer cloned everything except skills

If the installer finishes but warns that skills setup was skipped, the core CLI and [Clawperator Operator Android app](../getting-started/android-operator-apk.md) are still installed. This does not block `clawperator doctor`, device discovery, or direct command execution.

To set up skills manually:

```bash
clawperator skills install
export CLAWPERATOR_SKILLS_REGISTRY="$HOME/.clawperator/skills/skills/skills-registry.json"
```

---

## Snapshot UI returns SNAPSHOT_EXTRACTION_FAILED

**Symptom:** A `snapshot_ui` step returns `success: false` with `data.error: "SNAPSHOT_EXTRACTION_FAILED"`. The device is connected and responding, but no UI hierarchy XML appears in `data.text`. The stderr output from the CLI contains a warning like:

```
[clawperator] WARN: snapshot_ui step "..." UI hierarchy extraction produced no output.
```

**Most common cause:** The installed `clawperator` npm binary is out of date. The compiled `snapshotHelper.js` in older published packages searches for a logcat marker (`TaskScopeDefault:`) that does not match the marker the Android Operator APK actually emits (`[TaskScope] UI Hierarchy:`). The APK is correct and requires no changes. Other less common causes, such as partial or truncated logcat capture, can also leave a `snapshot_ui` step without extracted text and produce the same error.

**How to confirm:**

```bash
clawperator version --check-compat --receiver-package com.clawperator.operator
```

This will report any version mismatch between the CLI and the installed APK.

**How to fix:**

1. Reinstall the npm package to get the current compiled binary:

   ```bash
   npm install -g clawperator
   ```

2. 
Verify the fix by running a snapshot: ```bash clawperator observe snapshot --device-id --output json ``` A working snapshot returns `data.text` containing XML starting with `/files/crash-log.txt ``` ## How to fetch the log (adb) Replace `` with your app ID (e.g., `com.clawperator.operator.dev`). View directly: ```bash adb shell run-as cat files/crash-log.txt adb shell run-as com.clawperator.operator.dev cat files/crash-log.txt ``` Copy to Downloads for easier access: ```bash adb shell run-as cp files/crash-log.txt /sdcard/Download/crash-log.txt ``` ## Notes - The file is append‑only by design and is **not** pruned or rotated. - If you don’t see a file yet, it will be created on first write (e.g., on app start or crash). --- # Skills # Usage Model Related runtime/API repository: [clawperator](https://github.com/clawperator/clawperator) ## Core Principle Skills are a reliability layer for execution, not a substitute for agent reasoning. The intended operating model is two-handed: 1. Execution hand (`clawperator` + these skills) - dispatches Android actions - interacts through accessibility APIs - logs/captures outputs for downstream interpretation 2. Reasoning hand (LLM/agent) - interprets outputs - decides next command - handles unexpected UI states - manages retries/fallbacks/escalations ## What Skills Guarantee Skills aim to provide: - repeatable command structure - known selector and parsing strategies - stable output formatting Skills do **not** guarantee: - that UI structure is unchanged - that remote config/experiments are inactive - that alerts/permission/update dialogs are absent - that app/login/account state is valid ## Recommended Agent Loop 1. Start from a deterministic baseline state. 2. Run skill command. 3. Parse outputs (and screenshot path if provided). 4. Validate expected signal presence. 5. If mismatch: inspect UI state, adapt command, retry with controlled strategy. 6. Return user-facing answer with confidence/fallback note if needed. 
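
The recommended loop above can be sketched as a bounded retry wrapper. Everything here is hypothetical scaffolding: `runSkillWithValidation` and its `runSkill`, `validate`, and `onMismatch` callbacks are names chosen for this example and are not part of Clawperator or the skills repository.

```javascript
// Hypothetical helper: `runSkill`, `validate`, and `onMismatch` are callbacks
// supplied by the agent; none of them are part of the Clawperator API.
function runSkillWithValidation({ runSkill, validate, onMismatch, maxAttempts = 3 }) {
  let output = null;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    output = runSkill(attempt); // step 2: run the skill command
    if (validate(output)) {     // step 4: is the expected signal present?
      return { ok: true, attempts: attempt, output };
    }
    // step 5: let the agent inspect and adapt before the next bounded retry
    if (attempt < maxAttempts && onMismatch) onMismatch(output, attempt);
  }
  // step 6: report the failure so the caller can answer with low confidence
  return { ok: false, attempts: maxAttempts, output };
}

// Usage with canned outputs standing in for real skill runs.
const cannedOutputs = ["", "Bedroom temperature: 23.7°C"];
const loopResult = runSkillWithValidation({
  runSkill: (attempt) => cannedOutputs[attempt - 1] ?? "",
  validate: (text) => /Bedroom temperature: -?\d+(\.\d+)?°C/.test(text),
});
console.log(loopResult.ok, loopResult.attempts); // true 2
```

Keeping the retry count explicit means the reasoning side, not the skill script, decides how persistent to be.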
## Minimal Baseline Demo (No App-Specific Skill Required)

Use Android Settings (`com.android.settings`) as a universal baseline probe:

1. `close_app` Settings
2. `open_app` Settings
3. short settle delay
4. `snapshot_ui` (`ascii`)
5. capture an ADB screenshot and persist absolute file path

This gives:

- a device-specific text snapshot (`snapshot_ui`) and
- a visual snapshot (ADB `screencap` path)

which together are useful for multimodal LLM interpretation before moving to app-specific skills.

You can run the packaged baseline skill via the Node API wrapper:

```bash
clawperator skills run com.android.settings.capture-overview --device-id
```

Or invoke the script directly (no Node API required):

```bash
# Pick the first serial that adb lists in the "device" (ready) state.
DEVICE_ID="$(adb devices | awk 'NR>1 && $2=="device" {print $1; exit}')"
./skills/com.android.settings.capture-overview/scripts/capture_settings_overview.sh "$DEVICE_ID" app.actiontask.operator.development
```

Always pass `--device-id` to `skills run` when more than one device is connected. Without it, the wrapper fails if device auto-detection is ambiguous.

## Anti-Patterns

- Blindly trusting the first result line.
- Treating skill scripts as static truth forever.
- Embedding business decisions directly in skill scripts.
- Ignoring non-zero exits or warning-only outputs.

## Practical Takeaway

These skills reduce execution friction. The LLM/agent remains responsible for correctness.

---

# Skill Authoring Guidelines (v0.1 PoC)

## Core Doctrine: Generic Actuator

**Clawperator is the Hand; the Agent is the Brain.**

1. **Generic Interface:** The clawperator CLI/Node API knows nothing about specific apps. It only executes Execution JSON payloads.
2. **External Logic:** All app-specific logic (selectors, navigation flows, data parsing) MUST live in this clawperator-skills repository.
3. **Plain Node.js (.js):** Skills should primarily be authored in Plain Node.js (.js scripts). This ensures a lightweight, high-performance environment with zero compilation overhead.
4. 
**Migrating from Bash:** New skills MUST be authored in Node.js. Existing Bash-based skills SHOULD be migrated when they require significant updates or grow in complexity. Prefer opportunistic migration over big-bang rewrites. --- ## 1. The Deterministic Lifecycle (Close-Sleep-Open) To ensure a predictable starting state, every skill MUST follow this sequence as its first set of actions: ```json [ { "id": "close", "type": "close_app", "params": { "applicationId": "com.example" } }, { "id": "wait_close", "type": "sleep", "params": { "durationMs": 1500 } }, { "id": "open", "type": "open_app", "params": { "applicationId": "com.example" } }, { "id": "wait_open", "type": "sleep", "params": { "durationMs": 8000 } } ] ``` * **Why Close?** Apps often cache state (previous searches, deep-linked tabs). Force-closing ensures the app starts on its true Home screen. * **How it works:** The **Clawperator Node CLI** automatically intercepts close_app actions and performs a genuine adb shell am force-stop **before** dispatching the command to the device. * **Why the 1.5s Sleep?** Android needs a moment to fully clean up the process after a force-stop before it can be reliably reopened. * **Why the 8s Sleep?** 8 seconds is a conservative default for slow-to-initialize apps (e.g. retail). Treat this as a guideline: tune `durationMs` per app for reliability, keeping it consistent across that app's skills. --- ## 2. Navigating Decoy UI Elements Many apps use fake UI elements on the home screen that act as triggers for the real interaction. * **The Search Bar Trap:** Often, the search bar on a home screen is just a TextView or Button styled to look like a field. Clicking it navigates to a new screen or opens an overlay with the real EditText. * **Strategy:** 1. click the decoy bar (often identifiable by textContains: Search). 2. sleep for 1-2s. 3. enter_text into the real field (identifiable by role: textfield or a specific resourceId). --- ## 3. Selector Strategy 1. 
**resourceId is King:** Always prefer resourceId (e.g., com.woolworths:id/search_src_text). It is the most stable selector across device locales. 2. **textEquals for Buttons:** Use textEquals for precise matches on menu items or specific labels. 3. **Avoid Coordinates:** Never use raw x,y coordinates. They break across different screen resolutions and aspect ratios. --- ## 4. Reliable Node.js Patterns * **Safe Payloads:** Use Node.js to build Execution objects as native literals and JSON.stringify() them to avoid shell escaping issues. * **Robust Parsing:** Never assume XML attribute order in UI snapshots. Use attribute-independent regex: `const match = line.match(/^(?=[^>]*\bresource-id="[^"]*target_id")(?=[^>]*\btext="([^"]*)")[^>]*>/);` * **Error Handling:** Check `execFileSync` errors for `e.stdout` and `e.stderr` (converted from Buffers to strings) to provide meaningful failure messages. --- ## 5. Compliance and Security (Blocked Terms) **CRITICAL:** To protect user privacy and project integrity, the following must NEVER be hardcoded in scripts or documentation: 1. **Personal Paths:** Never include /Users/name/. Use relative paths or temporary directories. 2. **Device Serials:** Never hardcode your physical device ID. Pass the device ID as a command-line argument. 3. **PII:** Use placeholders like Person or AC_TILE_NAME for any user-specific data. **Validation Command:** grep -rE "Users/|[0-9A-Fa-f]{16}" . --- ## 6. Mandatory Metadata Every Execution payload sent by a skill MUST include: * **expectedFormat**: Must be "android-ui-automator". * **timeoutMs**: Set a realistic timeout (e.g., 90000 for complex flows). --- # Device Prep and Runtime Tips Related runtime/API repository: [clawperator](https://github.com/clawperator/clawperator) ## Why This Matters Most skill failures are environment/state issues, not script syntax issues. ## Device Preparation 1. Keep target apps up to date. 2. 
Keep Google Play app updates enabled to reduce stale/forced-update interrupt screens. 3. Ensure required permissions/accessibility are already granted for target apps. 4. Keep device unlocked and stable during runs. 5. Avoid concurrent manual interaction while automation is running. ## Runtime Best Practices - Start from a known app state when skill requires it. - Use settle delays around navigation-heavy transitions. - Capture both structured output and screenshots when debugging. - Treat warning outputs as actionable signals, not silent success. ## Common Failure Modes - Unexpected modal dialogs (permissions, battery optimization, updates). - Remote-config UI changes altering selectors/text. - Partial rendering causing empty/early reads. - App session/login changes. ## Operational Advice for Agents - Verify expected text/fields exist before trusting value extraction. - If critical fields are missing, re-observe and retry with bounded attempts. - Return explicit uncertainty to the user when signal quality is degraded. 
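
One way to follow the advice above is to make the extraction step report whether the expected field was present at all, instead of returning a bare value. This is a minimal sketch: `extractReading` and its result shape are hypothetical, and the `Label: number` output format is an assumption that will not fit every skill.

```javascript
// Hypothetical extractor: looks for "<label>: <number>" in skill output and
// reports explicitly when the field is missing instead of guessing a value.
function extractReading(text, label) {
  // Escape the label so it is matched literally inside the regex.
  const escaped = label.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const match = new RegExp(`${escaped}:\\s*(-?\\d+(?:\\.\\d+)?)`).exec(text);
  if (!match) {
    return { found: false, value: null, note: `"${label}" not present in output` };
  }
  return { found: true, value: Number(match[1]), note: null };
}

console.log(extractReading("SolaX battery level: 61.0%", "SolaX battery level"));
console.log(extractReading("Could not parse values", "SolaX battery level"));
```

A `found: false` result is the cue to re-observe and retry within a bounded number of attempts, or to tell the user that the reading is unavailable.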
--- # Skills Verification Date: 2026-02-19 ## Structural checks - Registry/index regeneration: `./scripts/generate_skill_indexes.sh` -> OK - Shell syntax: `find skills -type f -path '*/scripts/*.sh' -print0 | xargs -0 -n1 bash -n` -> OK - Node API registry integration check: - `CLAWPERATOR_SKILLS_REGISTRY=-skills/skills/skills-registry.json node /apps/node/dist/cli/index.js skills list --output json` -> 10 skills ## Artifact compile checks (via Node API) - `com.globird.energy.get-usage / usage` -> OK - `com.google.android.apps.chromecast.app.get-aircon-status / ac-status` (with `AC_TILE_NAME=Master`) -> OK - `com.solaxcloud.starter.get-battery / battery` -> OK - `com.theswitchbot.switchbot.get-bedroom-temperature / bedroom-temperature` -> OK ## Live script checks (connected device: ``) - `get_bedroom_temperature.sh` -> PASS (`Bedroom temperature: 23.7°C`) - `get_solax_battery.sh` -> PASS (`SolaX battery level: 61.0%`) - `get_globird_usage.sh` -> PARTIAL (`Could not parse GloBird usage values`, script exit 0) - `search_woolworths_products.sh "Coke Zero"` -> FAIL (`stage=navigation`, could not focus search) - `search_coles_products.sh "Coke Zero"` -> FAIL (`stage=navigation`, could not focus search) - `get_life360_location.sh "Person"` -> FAIL (`person not found`; script listed discovered members) - `launch_scrcpy_readonly.sh INVALID_SERIAL` -> PASS expected failure (`device not connected/authorized`) - `capture_settings_overview.sh` -> PASS (`TEXT_BEGIN...TEXT_END` emitted plus `SCREENSHOT|path=...`) ## Notes - Failures above are runtime-state/app-navigation dependent, not metadata/layout failures. - Coles/Woolworths and Life360 scripts should be run with app state ready and target person/query values that exist. --- # Blocked Terms Policy (Local PII Guard) This repository supports a local pre-commit guard to reduce accidental commits of sensitive strings (for example personal names, device serials, or internal identifiers). 
## Why this exists

- Git history is hard to clean once sensitive text is committed.
- LLM-assisted workflows can unintentionally propagate local values into code/docs.
- A local denylist catches obvious leaks before commit.

## Shared local config path

`clawperator` and `clawperator-skills` both look for an optional user-scoped config directory at:

- `~/.clawperator/`

Expected files:

- `~/.clawperator/blocked-terms.txt`
- `~/.clawperator/pre-commit-blocked-terms.sh`

Bootstrap:

```bash
mkdir -p ~/.clawperator
cp ./blocked-terms.txt.example ~/.clawperator/blocked-terms.txt
```

## `blocked-terms.txt` format

- One term per line.
- Case-insensitive matching.
- Blank lines are ignored.
- Lines starting with `#` are comments.
- Use plain literal strings (no regex syntax required).

Example:

```text
# Personal identifiers
full legal name
device-serial-1234
family-member-name
```

## Install hooks in each repo

If a repo includes the helper installer script, run it from that repo root:

```bash
./scripts/install_blocked_terms_hook.sh
```

Today that helper lives in `clawperator-skills/scripts/install_blocked_terms_hook.sh`. For repos that do not ship the helper, create `.git/hooks/pre-commit` to exec `~/.clawperator/pre-commit-blocked-terms.sh`.

The helper writes `.git/hooks/pre-commit` to call the shared hook script. If `~/.clawperator/` is missing, the hook will warn and skip checks (non-blocking).

## What the hook checks

- Scans staged added lines (`git diff --cached`).
- Matches blocked terms literally (`grep -F -i`).
- Reports offending term + file.
- Blocks commit on match.
- Ignores the blocked-terms file itself.

## Scan already-committed content

If you have the helper script available, use:

```bash
./scripts/scan_blocked_terms.sh
```

Today that scanner lives in `clawperator-skills/scripts/scan_blocked_terms.sh`.

Modes:

- Default: scans current `HEAD` tree.
- `--history`: scans all reachable commits (slower).
- `--terms-file `: use alternate terms file.
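
To make the `blocked-terms.txt` rules and the literal, case-insensitive matching concrete, here is a small Node.js illustration. It is not the shipped hook, which is a shell script built on `grep -F -i`; the function names are hypothetical, and trimming whitespace around each term is an assumption of this sketch.

```javascript
// Illustration of the file-format rules; not the real hook implementation.
function parseBlockedTerms(fileText) {
  return fileText
    .split(/\r?\n/)
    .map((line) => line.trim()) // assumption: surrounding whitespace is ignored
    .filter((line) => line !== "" && !line.startsWith("#"));
}

// Literal, case-insensitive substring match, analogous to `grep -F -i`.
function findBlockedTerms(addedLines, terms) {
  const hits = [];
  for (const line of addedLines) {
    const lowered = line.toLowerCase();
    for (const term of terms) {
      if (lowered.includes(term.toLowerCase())) hits.push({ term, line });
    }
  }
  return hits;
}

const terms = parseBlockedTerms("# Personal identifiers\nfull legal name\n\ndevice-serial-1234\n");
console.log(findBlockedTerms(["serial = DEVICE-SERIAL-1234"], terms).length); // 1
```

Because terms are matched as plain substrings, characters such as `.` or `*` in a term carry no special meaning.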
## Scope and limitations - This is a **local developer control** by default. - `.git/hooks` is not versioned; each clone/user must install it. - For organization-wide enforcement, add CI checks and/or use a shared hooks path policy. --- # Design # Clawperator Node Runtime and API Design Product naming: - Product: `Clawperator` - Android package/application namespace: `com.clawperator.operator` ## Purpose Clawperator is a deterministic actuator tool that allows agents to execute Android automations on behalf of a user. It provides a stable layer for LLM-driven device control with deterministic inputs/outputs, eliminating the need for brittle, direct recipe-specific shell scripting. Execution model: 1. Agents call Clawperator CLI/API. 2. Clawperator performs `adb` and Android tooling interactions. 3. Clawperator sends validated runtime commands to Android (`ACTION_AGENT_COMMAND`). 4. Clawperator returns structured execution results. Critical requirement: - Skill artifacts are optional. - If no artifact exists, or an artifact is wrong/stale due to UI drift, feature flags, staged rollouts, or account-level variants, agents must still execute using generic runtime actions and live UI observation. Agent-customer policy: - The **Clawperator Node runtime interface** (CLI + HTTP API) is the primary/default interface for agents. - The Android APK/runtime service is an execution target, not the agent-facing integration surface. - Agents should not need direct `adb` for common tasks. - Raw `adb` remains available as an explicit fallback for edge cases and debugging. Design implication: - If a workflow is common (for example package listing, screenshots, device discovery, app open/close, execution, snapshot, logs), provide a first-class Clawperator command/API for it. ## Shipped Commands Core commands: - `clawperator doctor`: Validate prerequisites and environment. - `clawperator devices`: Discover connected device IDs. 
- `clawperator packages list`: Confirm presence of receiver and target apps on device. - `clawperator execute`: Run an execution JSON payload. - `clawperator observe snapshot`: Get current UI hierarchy as `hierarchy_xml`. - `clawperator observe screenshot`: Capture device screen. - `clawperator action [open-app|click|read|wait|type]`: Single-step interaction wrappers. - `clawperator serve`: Start HTTP/SSE server for remote agent access. - `clawperator doctor --fix`: Best-effort environment remediation. - `clawperator skills install/update/search/run`: Skills lifecycle. - `clawperator version --check-compat`: CLI/APK compatibility check. Contracts: - **Canonical Envelope:** `[Clawperator-Result] {JSON}` is the ONLY way success/failure is reported. - **`expectedFormat` Required:** Every observation/execution must include `expectedFormat: "android-ui-automator"`. - **Single-Flight Lock:** Only one execution per `deviceId` / `receiverPackage` at a time. Overlaps return `EXECUTION_CONFLICT_IN_FLIGHT`. ## HTTP API Server (`serve`) When running `clawperator serve [--port ] [--host ]`, a local HTTP server is started to allow remote agents to interact with Clawperator without direct CLI access. > ⚠️ **Security Warning**: The HTTP API currently provides **no authentication or authorization**. By default, it binds to `127.0.0.1` (localhost) for safety. If you bind to `0.0.0.0` or a public IP via `--host`, any client on your network can remotely control your connected Android devices. Only expose this API on trusted networks or behind an authenticated gateway. ### REST Endpoints - **`GET /devices`**: List all connected Android devices and their states. - **`POST /execute`**: Execute a full JSON execution payload. - Body: `{"execution": {...}, "deviceId": "...", "receiverPackage": "..."}` - Returns: `RunExecutionResult` (200 OK or 4xx/5xx on failure). - Status **423 Locked**: Returned if another execution is in flight for the target device. 
- **`POST /observe/snapshot`**: Quick helper for UI capture.
  - Body: `{"deviceId": "...", "receiverPackage": "..."}`
- **`POST /observe/screenshot`**: Quick helper for visual capture.
  - Body: `{"deviceId": "...", "receiverPackage": "..."}`

### Event Streaming (SSE)

The server provides a real-time event stream at **`GET /events`**. Callers should use a standard SSE client to subscribe.

- **Event: `clawperator:result`**: Emitted when an execution reaches a terminal state (success or failure) and a deviceId is known.
  - Data: `{"deviceId": "...", "envelope": {...}}`
- **Event: `clawperator:execution`**: Emitted for *every* attempt to run an execution, including pre-resolution failures.
  - Data: `{"deviceId": "...", "input": {...}, "result": {...}}`
- **Event: `heartbeat`**: Upon connection, a `{"code": "CONNECTED", ...}` message is sent to verify the stream is active.

### Concurrency and Locking

The server utilizes an in-memory single-flight lock per `deviceId`. If a second request arrives for the same device while an execution is in progress, the server returns **HTTP 423 (Locked)** immediately.

## Determinism Doctrine

1. **No Hidden Logic:** Clawperator never retries a failed action or auto-falls back to a different strategy (e.g., from `artifact` to `direct`).
2. **Pre-Flight Validation:** Every execution is validated against the target device and receiver capabilities before any ADB call is made.
3. **Canonical Result:** Exactly one terminal envelope per `commandId`. If a timeout occurs, the CLI emits a `RESULT_ENVELOPE_TIMEOUT` error.

## Error Taxonomy

LLM agents must use these codes to decide their next step.

### Setup & Connectivity

- `ADB_NOT_FOUND`: ADB is missing from PATH.
- `NO_DEVICES`: No Android devices are connected via USB/Network.
- `MULTIPLE_DEVICES_DEVICE_ID_REQUIRED`: More than one device exists; specify `--device-id`.
- `RECEIVER_NOT_INSTALLED`: The target receiver package is not on the device.

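
To illustrate how an agent might branch on the setup and connectivity codes above, here is a minimal sketch. Only the four error codes come from this document; the `NextStep` labels and the `triageSetupError` helper are hypothetical names chosen for the example and are not part of the Clawperator API.

```typescript
// The four setup/connectivity codes listed in the taxonomy above.
type SetupErrorCode =
  | "ADB_NOT_FOUND"
  | "NO_DEVICES"
  | "MULTIPLE_DEVICES_DEVICE_ID_REQUIRED"
  | "RECEIVER_NOT_INSTALLED";

// Hypothetical agent-side decisions; Clawperator does not define these labels.
type NextStep =
  | "install_adb"
  | "connect_device"
  | "pass_device_id"
  | "install_receiver";

// Each remediation mirrors the one-line description of the code it is keyed by.
const SETUP_NEXT_STEP: Readonly<Record<SetupErrorCode, NextStep>> = {
  ADB_NOT_FOUND: "install_adb",
  NO_DEVICES: "connect_device",
  MULTIPLE_DEVICES_DEVICE_ID_REQUIRED: "pass_device_id",
  RECEIVER_NOT_INSTALLED: "install_receiver",
};

// Returns undefined for any code outside the setup group, so the caller can
// hand it to whatever logic deals with execution or UI errors.
function triageSetupError(code: string): NextStep | undefined {
  return Object.prototype.hasOwnProperty.call(SETUP_NEXT_STEP, code)
    ? SETUP_NEXT_STEP[code as SetupErrorCode]
    : undefined;
}
```

An explicit lookup table, rather than string matching on messages, keeps the agent's branching tied to the stable machine-readable codes.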
### Execution & State

- `EXECUTION_VALIDATION_FAILED`: The execution JSON is malformed or invalid.
- `EXECUTION_ACTION_UNSUPPORTED`: The requested action type is not supported by the runtime.
- `EXECUTION_CONFLICT_IN_FLIGHT`: A command is already running on the target device.
- `RESULT_ENVELOPE_TIMEOUT`: The command ran but no terminal envelope was received within the timeout.
- `RESULT_ENVELOPE_MALFORMED`: Logcat emitted an invalid JSON envelope.

### UI & Nodes

- `NODE_NOT_FOUND`: The selector (matcher) failed to find the target UI element.
- `NODE_NOT_CLICKABLE`: The target element was found but is not enabled/clickable.
- `SECURITY_BLOCK_DETECTED`: A system-level security overlay (e.g., "Package Installer" or "Permission Dialog") is blocking interaction.

## Detailed Step-Level Error Handling

While the top-level `status` indicates overall command success, individual `stepResults` can provide granular failure diagnostics. This is essential for agents to reason about partial completions.

### Step Error Format

When a step fails but the runtime continues (or fails fast), the `stepResults` entry will include:

- `success: false`
- `data.error`: A stable machine-readable error code.
- `data.message`: A human-readable (and LLM-readable) explanation.

### Example: `UNSUPPORTED_RUNTIME_CLOSE`

This error occurs when a `close_app` action is dispatched to the Android runtime. Because of sandbox restrictions, the runtime cannot reliably close other apps.

**Desired Outcome:** The agent should see this error and know that the 'Hand' (Node CLI) is responsible for pre-flight closure via ADB.

```json
{
  "id": "step-1",
  "actionType": "close_app",
  "success": false,
  "data": {
    "application_id": "com.example.app",
    "error": "UNSUPPORTED_RUNTIME_CLOSE",
    "message": "Android runtime cannot reliably close apps. Use the Clawperator Node API or 'adb shell am force-stop' directly for this action."
  }
}
```

## Safety & Concurrency

### In-Flight Semantics

A command is considered "in-flight" from the moment the ADB broadcast is sent until the `[Clawperator-Result]` is received or the `timeoutMs` is reached. If a command times out, the lock is held for an additional 2000ms "settle" window before allowing the next execution.

### PII Redaction Policy

By default, Clawperator returns **full-fidelity** UI text to the agent for maximum reasoning accuracy.

- **User Warning:** Results *will* contain sensitive data (names, account digits, OTPs) if they are visible on the screen.
- **Agent Mitigation:** Do not ship raw Clawperator results to long-term storage without user consent.

## API-First, ADB-Capable

This runtime is intentionally **API-first**:

1. Agents should use Clawperator commands/APIs by default.
2. Clawperator should wrap common Android/adb operations behind stable, typed contracts.
3. Direct adb usage is a fallback path, not the baseline integration model.

Direct adb is still supported for:

- unsupported/emerging edge cases,
- low-level diagnostics,
- temporary gaps before a stable Clawperator primitive exists.

When fallback adb is used, Clawperator should still encourage convergence back to first-class APIs by:

- exposing equivalent primitives as they become common,
- keeping result/error formats structured and machine-readable,
- documenting fallback-to-API migration paths.

## Skill Artifact Optionality and Failure Handling

Skill artifacts are optional, but fallback behavior is explicit:

1. If artifact compile succeeds, execute compiled execution.
2. If artifact compile fails, Clawperator returns a structured compile error and does not auto-fallback.
3. If runtime verification fails, Clawperator returns a structured execution failure and does not auto-retry with alternate strategy.
4. Agent chooses next step (retry, inspect UI, switch to direct actions, or abort).


Runtime must expose a `mode` on each execution:

- `artifact_compiled`
- `direct`

This keeps behavior deterministic and avoids hidden control-flow in the runtime.

## Execution Unit Contract

Use one term everywhere: `execution`.

- `compile` produces an `execution`.
- `execute` runs an `execution`.

Execution schema aligns with Android `AgentCommand` constraints.

Execution input may come from:

1. skill artifact compile output, or
2. direct action list authored by agent/tooling.

Example execution:

```json
{
  "commandId": "cmd-123",
  "taskId": "task-123",
  "source": "openclaw",
  "expectedFormat": "android-ui-automator",
  "timeoutMs": 90000,
  "actions": [
    { "id": "close", "type": "close_app", "params": { "applicationId": "com.example.app" } },
    { "id": "open", "type": "open_app", "params": { "applicationId": "com.example.app" } },
    { "id": "wait", "type": "sleep", "params": { "durationMs": 3000 } }
  ]
}
```

## Device Selection Policy (v1)

`deviceId?: string` is supported on execute/observe.

Selection behavior:

1. If exactly one connected device exists and `deviceId` is omitted, use that device.
2. If more than one connected device exists, `deviceId` is required.
3. If provided `deviceId` is not connected in `device` state, fail preflight.

## Agentic Best-Effort Mode

Best-effort mode is a first-class execution path for unknown or drifting UIs.

Behavior goals:

1. Observe current UI (`snapshot_ui`).
2. Identify likely anchors (toolbar/tab/menu/button/search patterns).
3. Attempt constrained navigation/action.
4. Re-observe and verify progress.
5. Retry within safety bounds.

Best-effort does not imply unsafe freeform behavior; all attempts remain within validated runtime action limits and capability policy.

Important ownership split:

- Clawperator provides primitives and structured observations.
- The agent owns exploration policy/strategy.
- Clawperator should not silently invent fallback control flow.

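
The best-effort behavior goals above (observe, choose, attempt, re-observe, retry within bounds) can be sketched as an agent-owned loop. This is an illustrative sketch only: the `BestEffortHooks` interface, the `runBestEffort` function, and the outcome labels are hypothetical, the loop is synchronous for brevity where real device I/O would be asynchronous, and the two limits reuse the best-effort values listed under "Safety Bounds" in this document.

```typescript
// Limits taken from the "Best-effort specific bounds" in this document.
const MAX_BEST_EFFORT_STEPS = 30;
const MAX_CONSECUTIVE_FAILED_ATTEMPTS = 5;

// Hypothetical hooks the agent supplies; Clawperator only provides primitives.
interface BestEffortHooks<Snapshot, Action> {
  observe(): Snapshot;                               // e.g. a snapshot_ui call
  isGoalReached(snapshot: Snapshot): boolean;        // verify progress
  chooseAction(snapshot: Snapshot): Action | null;   // identify likely anchors
  attempt(action: Action): boolean;                  // constrained action, true on success
}

type BestEffortOutcome =
  | { status: "done"; steps: number }
  | {
      status: "gave_up";
      steps: number;
      reason: "no_candidate_action" | "too_many_failures" | "step_limit";
    };

function runBestEffort<S, A>(hooks: BestEffortHooks<S, A>): BestEffortOutcome {
  let consecutiveFailures = 0;
  for (let steps = 0; steps < MAX_BEST_EFFORT_STEPS; steps++) {
    const snapshot = hooks.observe();
    if (hooks.isGoalReached(snapshot)) return { status: "done", steps };

    const action = hooks.chooseAction(snapshot);
    if (action === null) {
      return { status: "gave_up", steps, reason: "no_candidate_action" };
    }

    if (hooks.attempt(action)) {
      consecutiveFailures = 0;
    } else {
      consecutiveFailures++;
      if (consecutiveFailures >= MAX_CONSECUTIVE_FAILED_ATTEMPTS) {
        return { status: "gave_up", steps: steps + 1, reason: "too_many_failures" };
      }
    }
  }
  // Re-observe once after the final attempt before reporting the step limit.
  if (hooks.isGoalReached(hooks.observe())) {
    return { status: "done", steps: MAX_BEST_EFFORT_STEPS };
  }
  return { status: "gave_up", steps: MAX_BEST_EFFORT_STEPS, reason: "step_limit" };
}
```

Keeping the loop on the agent side matches the ownership split: the runtime reports each failure once, and any retry is a visible decision made by the caller.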

Cardinality drift handling:

- Execution should tolerate mismatches between recipe assumptions and live UI (for example expected second device tile but only one exists).
- Runtime returns structured ambiguity/partial outcomes rather than hard-failing every mismatch.
- Agent decides whether to proceed with alternate target selection or stop.

## Result Transport Channel (v1 choice)

Chosen v1 mechanism:

- logcat JSON envelope with strict prefix.

Required Android emission format (single line):

- `[Clawperator-Result] {"commandId":"...","taskId":"...","status":"success|failed","stepResults":[...],"error":null}`

Current implementation note:

- Android emits canonical `[Clawperator-Result]` terminal envelopes for command completion.

Rules:

1. Exactly one terminal result envelope per `commandId`.
2. Envelope payload must be valid single-line JSON.
3. Clawperator parser filters by `commandId` and prefix.
4. Non-envelope logs are ignored for result semantics.

This removes ad-hoc scraping patterns and provides deterministic parsing until a stronger transport is added.

Additionally, intermediate observation envelopes may be emitted with prefix:

- `[Clawperator-Event] {json...}`

This supports agent feedback loops during best-effort execution.

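
The parsing rules above (strict prefix, single-line JSON, filter by `commandId`, ignore everything else) can be sketched as follows. This is an assumption-laden illustration rather than the actual Clawperator parser: `parseResultLine`, `ParseOutcome`, and `isRecord` are hypothetical names, and the envelope fields are taken from the emission format shown above.

```typescript
// Prefix from the required Android emission format above.
const RESULT_PREFIX = "[Clawperator-Result]";

interface ResultEnvelope {
  commandId: string;
  taskId: string;
  status: "success" | "failed";
  stepResults: unknown[];
  error: unknown;
}

// Hypothetical outcome type for this sketch.
type ParseOutcome =
  | { kind: "ignored" }
  | { kind: "malformed"; reason: string }
  | { kind: "envelope"; envelope: ResultEnvelope };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function parseResultLine(line: string, commandId: string): ParseOutcome {
  const at = line.indexOf(RESULT_PREFIX);
  if (at === -1) return { kind: "ignored" }; // rule 4: non-envelope logs

  let payload: unknown;
  try {
    payload = JSON.parse(line.slice(at + RESULT_PREFIX.length)); // rule 2
  } catch (_err) {
    return { kind: "malformed", reason: "payload is not valid JSON" };
  }
  if (!isRecord(payload)) {
    return { kind: "malformed", reason: "payload is not a JSON object" };
  }

  const id = payload["commandId"];
  if (typeof id !== "string") {
    return { kind: "malformed", reason: "missing commandId" };
  }
  if (id !== commandId) return { kind: "ignored" }; // rule 3: other commands

  const taskId = payload["taskId"];
  const rawStatus = payload["status"];
  const status =
    rawStatus === "success" ? "success" : rawStatus === "failed" ? "failed" : null;
  const stepResults = payload["stepResults"];
  if (typeof taskId !== "string" || status === null || !Array.isArray(stepResults)) {
    return { kind: "malformed", reason: "unexpected envelope shape" };
  }
  return {
    kind: "envelope",
    envelope: { commandId: id, taskId, status, stepResults, error: payload["error"] },
  };
}
```

A caller would feed each logcat line through `parseResultLine` and stop at the first `envelope` outcome for its `commandId`, treating a `malformed` outcome as the `RESULT_ENVELOPE_MALFORMED` case from the error taxonomy.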
## Safety Bounds (hard constants)

Public limits (v1):

- `MAX_EXECUTION_ACTIONS = 50`
- `MAX_EXECUTION_TIMEOUT_MS = 120000`
- `MIN_EXECUTION_TIMEOUT_MS = 1000`
- `MAX_PAYLOAD_BYTES = 64000`
- `MAX_RETRY_ATTEMPTS_PER_STEP = 10`
- `MAX_SNAPSHOT_LINES = 2000`
- `MAX_SNAPSHOT_BYTES = 262144`

Action policy:

- denylist by default for unsupported/unsafe action types
- allow only runtime-supported actions in v1

Best-effort specific bounds:

- `MAX_BEST_EFFORT_STEPS = 30`
- `MAX_BEST_EFFORT_RUNTIME_MS = 180000`
- `MAX_CONSECUTIVE_FAILED_ATTEMPTS = 5`

Supported action types (v1):

- `open_app`
- `close_app`
- `wait_for_node`
- `click`
- `scroll_and_click`
- `read_text`
- `snapshot_ui`
- `sleep`
- `type_text`
- `doctor_ping`

## Doctor and Dependency Management

`clawperator doctor` checks:

1. `adb` installed and executable
2. adb server reachable
3. connected devices and states
4. target package presence and version compatibility
5. Android Developer Options and USB debugging (advisory)
6. end-to-end handshake via `doctor_ping`

`clawperator doctor --fix` capabilities (best effort):

1. restart adb server
2. run `clawperator grant-device-permissions`
3. print exact remediation when automatic fix is unavailable

See [Clawperator Doctor](../reference/node-api-doctor.md) for the full check list and JSON report shape.

## Skill Integration Mechanism

Canonical source of skills:

- `clawperator-skills` repository

Distribution model:

1. `clawperator-skills` CI generates `skills-index.json` on `main`.
2. `clawperator skills install` clones the local skills checkout on first setup.
3. `clawperator skills update [--ref ]` refreshes the checkout and can pin to a specific ref when needed.
4. Local cache stores synced artifacts for deterministic offline execution.

Runtime should execute against cached/pinned skill content, not live network fetches during execution.


Skill compilation requirements are defined in:

- `docs/design/skill-design.md`

When skill artifacts are missing/stale, runtime can still execute direct executions supplied by the agent.

## Skill Implementation Language Strategy

To set a maintainable baseline for future skills:

1. Preferred language for new non-trivial skills: Node.js with TypeScript.
2. Bash is allowed only for thin wrappers and simple glue.
3. Python is a planned secondary path after Node contracts and tooling are stable.

Rationale:

- Better testability, typing, and reuse for parsing-heavy and multimodal workflows.
- Safer payload construction and lower shell-quoting risk than large Bash scripts.
- Cleaner evolution toward SDK-backed skill execution.

Migration policy:

- Do not mass-rewrite all existing Bash skills immediately.
- For new high-value or high-complexity skills, prefer Node.js/TypeScript implementations.
- Temporary Bash implementations (including the current Life360 flow) are acceptable only as stopgaps and must be queued for early migration once minimal Node skill SDK/runtime helpers are in place.

## Agent-Friendly Command and Alias Layer

Because agents are primary customers, Clawperator should accept intuitive aliases that normalize to canonical actions.

Examples:

- `tap` -> `click`
- `press` -> `click`
- `long_press` -> `click` with long-click params
- `wait_for` -> `wait_for_node`
- `find` -> `wait_for_node`
- `read` -> `read_text`
- `snapshot` -> `snapshot_ui`
- `sleep` -> `sleep`
- `action`: Primary entry point for single-step interactions.

Rules:

1. Canonical form is stored and logged.
2. Aliases are input-only conveniences.
3. Alias table is explicit/versioned (no fuzzy guessing in parser).

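
As a rough illustration of the alias rules above, the sketch below normalizes an input action through an explicit lookup table. The names `ALIAS_TABLE` and `normalizeAction` are hypothetical, and representing "long-click params" as `clickType: "long_click"` is an assumption borrowed from the `click` action's `clickType` parameter in the Operator LLM Playbook; the real alias layer may differ.

```typescript
interface CanonicalAction {
  type: string;
  params: Record<string, unknown>;
}

interface AliasEntry {
  type: string;                       // canonical action type
  params?: Record<string, unknown>;   // params implied by the alias, if any
}

// Explicit alias table: every mapping is listed, nothing is guessed.
const ALIAS_TABLE: Readonly<Record<string, AliasEntry>> = {
  tap: { type: "click" },
  press: { type: "click" },
  // Assumption: long-click is expressed through the click action's clickType.
  long_press: { type: "click", params: { clickType: "long_click" } },
  wait_for: { type: "wait_for_node" },
  find: { type: "wait_for_node" },
  read: { type: "read_text" },
  snapshot: { type: "snapshot_ui" },
};

// Returns the canonical form. Unknown names pass through unchanged so that
// later validation can reject them; there is no fuzzy matching here.
function normalizeAction(
  type: string,
  params: Record<string, unknown> = {},
): CanonicalAction {
  const entry = Object.prototype.hasOwnProperty.call(ALIAS_TABLE, type)
    ? ALIAS_TABLE[type]
    : undefined;
  if (entry === undefined) return { type, params: { ...params } };
  return { type: entry.type, params: { ...params, ...entry.params } };
}
```

Because only the canonical form is returned, anything stored or logged downstream never sees the alias, which is what rules 1 and 2 ask for.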
## Node Module Structure

- `src/cli/*` - command handlers and argument parsing
- `src/domain/doctor/*` - prerequisites and auto-fix logic
- `src/domain/devices/*` - adb discovery and selection
- `src/domain/skills/*` - install/update/search/run/list/get/compile-artifact
- `src/domain/executions/*` - validation, run, state transitions
- `src/adapters/android-bridge/*` - adb broadcast + logcat result envelope parsing
- `src/contracts/*` - schema constants, JSON types

## Determinism and Validation Requirements

1. Skill artifact compile must be pure and deterministic.
2. Execution validation must occur before any adb call.
3. Every run must emit correlated IDs: `executionId`, `commandId`, `taskId`, `deviceId`.
4. Side-effecting executions must include verification signals in step results.
5. Direct/fallback executions must include explicit mode/status metadata.
6. Artifact compile must fail if required input variables are missing (no implicit PII/user-literal substitution).

## Testing Strategy

Clawperator should define layered tests, with real-device execution as a first-class requirement.

1. Unit tests (Node/CLI)
   - execution schema validation and hard bounds
   - device selection policy
   - alias normalization to canonical actions
   - result envelope parser correctness
2. Integration tests (mock adb/logcat)
   - doctor/device discovery behavior
   - compile -> execute orchestration
   - failure contracts and fallback instruction pointers
3. Android instrumentation tests
   - `ACTION_AGENT_COMMAND` execution path
   - `[Clawperator-Result]` envelope emission
   - step result mapping and verification semantics
4. Real-device tests
   - run a baseline skill/execution on a known installed app (current baseline can be Google Home)
   - verify end-to-end reliability across close/open/session policy behavior
5. Future dedicated test APK
   - create a controlled Android app exposing stable test UI elements/states
   - migrate core conformance tests to this APK to reduce third-party app drift risk

## Security and Policy

1. Capability-based execution gating (from skill/artifact metadata).
2. Per-profile allowlist/denylist for capabilities and packages.
3. Disable dangerous capabilities by default (`purchase_risk` off unless explicit policy).
4. Audit trail for compile, execute, and result envelopes.
5. Best-effort mode still obeys capability policy and hard limits.

## Stability & Versioning

Clawperator follows Semantic Versioning (SemVer) for the Node SDK/CLI and its API contracts.

### Versioning Rules

- **Major Bump (`1.x.x`):** Breaking changes to the result envelope JSON schema, CLI command removal, or incompatible `ACTION_AGENT_COMMAND` protocol changes.
- **Minor Bump (`x.1.x`):** New supported actions, new CLI commands, or backward-compatible schema additions.
- **Patch Bump (`x.x.1`):** Bug fixes, internal refactoring, or documentation updates.

### Stability Boundary

- **Stable (v1):** `execute`, `observe snapshot`, `devices`, and the `[Clawperator-Result]` envelope structure.
- **Alpha/Unstable:** `execute best-effort`, `--serve` (HTTP), and any feature marked as `(Upcoming)` in these docs. These may break without a major version bump until they are promoted to stable.

---

# Operator LLM Playbook (Definitive)

This is the canonical reference for LLM-driven automation in Clawperator.


Use this doc for:

- running the app through `ACTION_AGENT_COMMAND`
- authoring/maintaining skill packages
- integrating skill scripts with OpenClaw
- understanding runtime components and naming

---

## 1) Runtime components (production)

These are runtime components (not debug-only):

- `com.clawperator.operator.runtime.OperatorCommandService`
- `com.clawperator.operator.runtime.OperatorCommandReceiver`

They own broadcast ingress for:

- **Action Namespace:** `com.clawperator.operator.ACTION_AGENT_COMMAND` (stable)
- **Package Target:** Varies by build (e.g., `com.clawperator.operator` or `com.clawperator.operator.dev`)

---

## 2) Command ingress contract

### Reliability rule (required)

For app automation commands, default to:

1. `close_app`
2. `open_app`
3. wait for stabilization
4. add small post-navigation settle delays (~500–1500ms) before critical reads/clicks

### Required fields

- `commandId: string`
- `taskId: string`
- `source: string`
- **`expectedFormat: "android-ui-automator"`** (Required for v1 compatibility)
- `actions: []`

### Determinism Doctrine

1. **Validation First:** No side effects if the payload is malformed.
2. **Exactly One Envelope:** Every command must emit a `[Clawperator-Result]`.
3. **No Retries:** The runtime never retries a failed step; it reports the failure immediately to the Brain.
4. **Stable IDs:** Correlate `commandId` and `taskId` end-to-end.

---

### Supported action types (current)

| Action type | Key params | Notes |
| :--- | :--- | :--- |
| `open_app` | `applicationId: string` | Launches app by package ID |
| `close_app` | `applicationId: string` | Node runs `adb shell am force-stop` pre-flight; Android step always returns `success: false` (expected) |
| `enter_text` | `matcher: NodeMatcher`, `text: string`, `submit?: boolean`, `clear?: boolean` | CLI: `action type`. `submit: true` presses Enter after typing. `clear` is accepted by Node but currently ignored by Android |
| `click` | `matcher: NodeMatcher`, `clickType?: "default"\|"long_click"\|"focus"` | CLI: `action click` |
| `read_text` | `matcher: NodeMatcher`, `validator?: "temperature"`, `retry?: object` | CLI: `action read`. Result in `data.text`. Other validator values are rejected by the runtime |
| `wait_for_node` | `matcher: NodeMatcher`, `retry?: object` | CLI: `action wait`. Waits with internal retry |
| `snapshot_ui` | `retry?: object` | CLI: `observe snapshot`. Snapshot content in `data.text` as `hierarchy_xml` |
| `take_screenshot` | `path?: string`, `retry?: object` | Node captures screenshot via ADB and returns local file path |
| `scroll_and_click` | `target: NodeMatcher`, `container?: NodeMatcher`, `direction?`, `maxSwipes?`, `distanceRatio?`, `settleDelayMs?`, `findFirstScrollableChild?`, `clickAfter?: boolean`, `scrollRetry?: object`, `clickRetry?: object` | Scrolls until target is visible, then clicks by default. Set `clickAfter: false` to reveal the target without tapping it. `scrollRetry` defaults to UiScroll; `clickRetry` defaults to UiReadiness |
| `scroll` | `container?: NodeMatcher`, `direction?`, `distanceRatio?`, `settleDelayMs?`, `findFirstScrollableChild?`, `retry?: object` | Performs exactly one scroll gesture and reports `scroll_outcome` as `moved`, `edge_reached`, or `gesture_failed` |
| `scroll_until` | `container?: NodeMatcher`, `direction?`, `distanceRatio?`, `settleDelayMs?`, `maxScrolls?`, `maxDurationMs?`, `noPositionChangeThreshold?`, `findFirstScrollableChild?` | Bounded scroll loop that returns `termination_reason`. Use for feed pagination with explicit caps |
| `sleep` | `durationMs: number` | Pause between steps. Must fit within the execution `timeoutMs` budget |

**`enter_text` vs CLI `action type`:** The CLI command is `action type` but the action type field in execution payloads is `enter_text`. These map to the same runtime action. When building execution payloads directly, always use `enter_text`.

**NodeMatcher fields:** `resourceId`, `contentDescEquals`, `textEquals`, `textContains`, `contentDescContains`, `role`. All fields are AND-combined. Prefer `resourceId` when available. Full reference in `docs/node-api-for-agents.md`.

**Scroll targeting rule:** If a screen contains nested or multiple scrollable containers, do not rely on auto-detect. Capture `snapshot_ui`, identify the intended list's `resource-id`, and pass it as `params.container`.

### Visual verification with ADB screenshots (recommended)

Use screenshots alongside UI-tree logs when building/debugging skills.

```bash
adb exec-out screencap -p > ./tmp/ui-check.png
```

---

## 3) Skills-first packaging (PII-safe)

Canonical unit is a skill package, not a standalone recipe file.

### Required structure

Skills are maintained in a dedicated sibling repository: `../clawperator-skills`.

Each skill follows this structure:

- `skills/./SKILL.md`
- `skills/./scripts/*.sh`

### Nature of Skills

Due to the dynamic nature of mobile apps (A/B tests, server-side flags, unexpected popups), skills are treated as **highly informed context** for the Agent rather than purely deterministic scripts.

- **Agent Responsibility:** The Agent uses skill templates as a baseline, modifying them at runtime to handle personal configurations (variable substitution) or UI drift.

### Rules

1. No PII in committed skill artifacts.
2. Use variables/placeholders for user-specific labels (for example `{{AC_TILE_NAME}}`).
3. Prefer stable selectors first (`resourceId`), text matching second.
4. Keep fallback matching strategy documented in `SKILL.md`.
5. Keep skill-specific scripts/artifacts inside the skill folder.

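
To make rule 2 concrete, here is a minimal sketch of filling `{{NAME}}` placeholders at runtime so that user-specific labels never need to be committed. The `fillPlaceholders` helper and its error message are hypothetical and not part of Clawperator; the sketch also assumes placeholder names are upper-case with digits and underscores, as in `{{AC_TILE_NAME}}`.

```typescript
// Replaces every {{NAME}} token with the runtime-supplied value.
// Throws when a value is missing or empty, so a template is never sent
// to the device with an unresolved or blank user-specific label.
function fillPlaceholders(
  template: string,
  vars: Readonly<Record<string, string>>,
): string {
  return template.replace(
    /\{\{([A-Z0-9_]+)\}\}/g,
    (_match: string, name: string): string => {
      const value = Object.prototype.hasOwnProperty.call(vars, name)
        ? vars[name]
        : undefined;
      if (value === undefined || value === "") {
        throw new Error(`Missing value for placeholder ${name}`);
      }
      return value;
    },
  );
}

// Example: the committed template carries only the placeholder.
const matcherTemplate = '{"textEquals":"{{AC_TILE_NAME}}"}';
const filledMatcher = fillPlaceholders(matcherTemplate, {
  AC_TILE_NAME: "Example Tile",
});
```

This sketch substitutes raw text, so values containing quotes or backslashes would need JSON escaping before being inserted into a JSON template.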

---

## 4) Current skill set

- `com.google.android.apps.chromecast.app.get-aircon-status`
- `com.google.android.apps.chromecast.app.set-aircon`
- `com.globird.energy.get-usage`
- `com.solaxcloud.starter.get-battery`
- `com.theswitchbot.switchbot.get-bedroom-temperature`

---

## 5) New skill authoring checklist

1. Start from a fresh app session (`close_app` then `open_app`).
2. Capture `snapshot_ui` and an ADB screenshot.
3. Identify robust selectors (`resource-id` first).
4. Create `skills/./SKILL.md`.
5. Add `scripts/*.sh` deterministic wrapper(s).
6. Add optional `artifacts/*.recipe.json` template(s) if helpful.
7. Validate on device end-to-end.
8. Update this playbook if conventions changed.

---

## 6) Where to update docs

- Skill model/design: `docs/design/skill-design.md`
- Canonical LLM/operator usage: `docs/design/operator-llm-playbook.md` (this file)
- App-specific skill packages: `skills/./...`

---

# Clawperator Skill Design

Product naming:

- Product: `Clawperator`
- Legacy Android module/package naming in current codebase: `ActionTask` (temporary during migration)
- Repository rename planned: TBD

## Purpose

Define a deterministic, open-source **skill package** format for Android automation guidance.

Skills are the primary artifact for agents and humans. Each skill may include:

- `SKILL.md` (canonical agent-facing interface)
- `scripts/` (deterministic wrappers)
- `artifacts/*.recipe.json` (optional deterministic execution templates)

Implementation language preference for `scripts/`:

1. Node.js/TypeScript is the default for new non-trivial skills.
2. Bash should be limited to small wrappers and command orchestration. Any non-trivial Bash skill is a temporary exception and must include a top-of-file migration note indicating it is queued for Node.js/TypeScript migration.
3. Python is supported as a future secondary option once a Python SDK contract is defined.


Practical note:

- Even accurate automations are inherently fuzzy because user/account UI state varies (feature flags, rollout state, number of devices, personalization).
- Skills should encode preferred strategy plus fallbacks, not brittle assumptions.

Important scope boundary:

- Skill artifacts are accelerators, not a hard dependency.
- Clawperator must remain useful when a skill artifact is missing, stale, or incorrect.
- Runtime execution APIs must support direct execution without prebuilt artifacts.

## Concept Model

Recipe artifact semantics (inside a skill) are split into three explicit layers:

1. `probe`
   - Single observation/read operation.
   - No side effects.
   - Typical shape: `snapshot_ui` + `read_text` + parse.
2. `flow`
   - Multi-step procedure.
   - Navigation and reads, optionally side-effect free.
3. `action`
   - Side-effecting procedure (toggle/set/click-with-effect).
   - Must include verification after side effect.

Allowed `recipe_type` values (required):

- `probe`
- `flow`
- `action`

## Repository Ownership

Recommended split and source of truth:

1. Core repo:
   - Android runtime + Node CLI/API
   - skill/recipe schema and compiler
2. Skills repo (`clawperator-skills`):
   - versioned skill folders
   - optional deterministic `.recipe.json` artifacts per skill

Absence of a skill artifact must not block execution. It only removes compile assistance.

Suggested layout:

```text
skills/
  skills-registry.json
  skills-registry.schema.json
  generated/
    manifest.json
    skills-index.min.json
    skills-index.jsonl
    by-app/
      .json
    by-prefix/
      .json
  tools/
    generate_skill_indexes.sh
  ./
    SKILL.md
    skill.json
    scripts/
      *.ts | *.js | *.sh | *.py
    artifacts/
      *.recipe.json
```

Language policy for new skills:

- If the skill includes substantial parsing, state handling, retries, or multimodal outputs, implement in TypeScript.
- If the skill is a tiny launcher/wrapper, Bash is acceptable.
- Avoid introducing both Node and Python SDK dependencies at once; stabilize Node SDK/contracts first, then add Python based on demand.

## Recipe Artifact Schema

Implementation note:

- Current `skills/*/artifacts/*.recipe.json` files in this repository are runtime `AgentCommand` templates used by scripts/wrappers.
- The richer metadata schema below is the design target and is not yet enforced by the current artifact generator/runtime path.

Each optional recipe artifact remains deterministic and versioned. If Markdown recipe files are used, they may start with YAML frontmatter.

Required fields:

- `recipe_id` (string, globally unique)
- `recipe_version` (semver)
- `recipe_type` (`probe|flow|action`)
- `application_id` (Android package id)
- `summary` (short purpose)
- `frameworks` (array from allowed taxonomy)
- `session_policy` (`fresh|resume_ok`)
- `app_build.version_name` (string)
- `app_build.version_code` (integer)
- `tested_on.android_api` (integer)
- `capabilities` (array from allowed capability set)
- `inputs` (array)
- `outputs` (array)

Recommended fields:

- `compatibility.min_version_code`
- `compatibility.max_version_code`
- `tested_on.device_model`
- `tested_on.locale`
- `tested_on.last_verified_at` (ISO-8601)
- `known_quirks`
- `failure_modes`
- `maintainers`
- `tags`
- `risk_level` (`low|medium|high`)

## Session Policy

`session_policy` is required.

- `fresh` (default)
  - Compiler auto-inserts step 0/1:
    - `close_app`
    - `open_app`
  - unless frontmatter sets `fresh_start_injected: false` and the recipe explicitly provides equivalent steps.
- `resume_ok`
  - compiler does not inject close/open.

This makes the reliability rule enforceable and uniform.

## PII and User-Specific Variable Policy

Canonical recipes must not contain user-specific literals.


Examples of forbidden literals in recipe files:

- personal names
- home/location labels
- device nicknames (for example a real AC tile label)
- physical device identifiers (for example real `adb` serials / `DEVICE_ID` values)
- account emails/addresses/phone numbers

Required approach:

1. User-specific/runtime-specific values must be declared in `inputs`.
2. Steps/selectors reference input variables, not hardcoded literals.
3. PII-like inputs must default to empty or generic placeholders, not real values.
4. Repositories should enforce a local sensitive-literal denylist via git hooks (`pre-commit` and/or `pre-push`) that scans staged diffs and blocks commits containing configured forbidden tokens.
5. Sensitive-literal scanning must include personal device serials/IDs used for local testing.

Local guardrail requirement:

- Keep denylist values in a local-only file that is never committed (for example under `.git/info/` or ignored local config).
- Hook output should identify offending file/line and fail with a clear remediation message.
- This is a complement to CI linting, not a replacement.

Example:

```yaml
inputs:
  - name: ac_tile_name
    type: string
    required: true
    default: ""
```

Compiler behavior:

- unresolved required variables fail compile deterministically (`RECIPE_INPUT_MISSING`).

## Framework Taxonomy

Allowed `frameworks` values:

- `views`
- `compose`
- `react-native`
- `expo`
- `flutter`
- `webview`
- `hybrid`
- `unknown`

## Capability Taxonomy

Required frontmatter field: `capabilities`.

Allowed values:

- `observe`
- `navigate`
- `click`
- `long_click`
- `type_text`
- `toggle`
- `purchase_risk`

Node runtime uses this list for policy gating (allow/deny before execution).

## Deterministic Step Schema

Markdown remains human-readable, but each step must include a structured YAML block that maps 1:1 to runtime actions.


Example:

~~~md
## Step 3: Scroll to Devices

```yaml
id: scroll_devices
type: scroll_and_click
selector:
  text_contains: Devices
  annotation: STRUCTURAL
container:
  class: androidx.recyclerview.widget.RecyclerView
wait:
  after_ms: 500
retries:
  count: 3
```
~~~

Required step block fields:

- `id` (unique within recipe)
- `type` (runtime action type)
- selector/matcher block required when action needs a target

Optional fields:

- `container`
- `wait`
- `retries`
- `confirm`
- `params`

## Selector DSL (v1)

Only these keys are allowed in selector blocks:

- `resource_id`
- `class`
- `text_equals`
- `text_contains`
- `content_desc_contains`
- `clickable`
- `enabled`
- `selected`
- `index_in_parent` (discouraged but allowed)
- `any_of`
- `all_of`
- `annotation`

`annotation` allowed values:

- `STRUCTURAL`
- `VARIABLE`
- `USER_SPECIFIC`
- `DERIVED`

Unknown selector keys are schema errors.

Variable interpolation rule:

- Selector text fields may use explicit variable placeholders only (for example `${inputs.ac_tile_name}`).
- Raw user-specific literal strings in selectors are schema/lint violations.

## Output Redaction

Each declared output field supports required redaction behavior:

- `redaction: none | hash | mask | drop`

Default policy:

- For text-derived outputs, default is `mask`.
- For non-text numeric state outputs, default is `none` unless overridden.

Example:

```yaml
outputs:
  - name: account_name
    type: string
    redaction: mask
  - name: battery_percent
    type: number
    redaction: none
```

## Runtime Mapping

Compiler may only emit supported runtime action types:

- `open_app`
- `close_app`
- `wait_for_node`
- `click`
- `scroll_and_click`
- `read_text`
- `snapshot_ui`
- `sleep`

Compilation must be deterministic and pure:

- no LLM reasoning at compile time
- no heuristics from prose-only sections
- identical input recipe + vars => identical execution output

## Validation Rules

Minimum validation checks:

1. Frontmatter required fields valid.
2. `recipe_type` and `frameworks` values valid.
3. `session_policy` present.
4. Every step has a valid YAML block.
5. Step action types are supported.
6. Selector DSL uses only allowed keys.
7. For `recipe_type=action` or capabilities containing `toggle|type_text|purchase_risk`, a verification requirement must exist after each side-effecting step.
8. User-specific literals are not present in frontmatter, selectors, or step blocks.
9. Required input variables used by selectors are declared in `inputs`.

Verification can be satisfied by:

- explicit `confirm` block in the step, or
- subsequent `read_text`/assertion step referencing changed state.

## Fuzzy Real-World Variance (Cardinality Drift)

Skill artifacts must tolerate entity-count variance.

Example:

- artifact may prefer interacting with "second AC tile" when two units exist
- on another account only one unit may exist

Guidance:

1. Prefer selector intent over absolute index assumptions.
2. Treat index-based targeting as optional fallback, not primary strategy.
3. If preferred target cardinality is unavailable, emit a structured ambiguity/partial result and continue with best-effort navigation where safe.
4. Capture variance notes in `known_quirks`; avoid overfitting until real user data justifies tighter constraints.

## CI Requirements

`clawperator-skills` CI should run:

1. per-skill metadata validation (`skills/*/skill.json`)
2. run `skills/tools/generate_skill_indexes.sh`
3. `skills/skills-registry.json` schema validation
4. registry/index cross-check (every registered path exists and every skill folder is registered)
5. markdown parse + frontmatter schema validation
6. step schema validation
7. deterministic compile test
8. PII lint pass
9. capability policy lint (forbidden combos by profile)
10. literal-detector lint pass (fail on likely user-specific literals where variables are required)
11. compile-time input completeness checks for required variables

## Testing Strategy

Recipe/compiler and runtime behavior must be tested at multiple levels.

1. Unit tests (fast, deterministic)
   - frontmatter/schema parsing
   - selector DSL validation
   - variable interpolation
   - redaction behavior
   - compile determinism (`same input => same execution`)
2. Integration tests (mocked bridges)
   - compile + execution orchestration
   - alias normalization
   - error contracts (`RECIPE_INPUT_MISSING`, `VERIFY_FAILED`, etc.)
3. Android instrumentation tests (real device/emulator)
   - run generated executions end-to-end against live UI tree
   - verify result envelope format and correlation IDs
4. Real-device smoke tests (required in CI/nightly where available)
   - baseline known app flow (current practical baseline: Google Home)
   - detect major runtime regressions in selectors/waits/verification behavior
5. Future dedicated test APK
   - build a stable Android test app with known UI elements/states
   - use as canonical non-flaky contract suite for `click/read/wait/scroll/toggle`

## Example Frontmatter (v1)

```yaml
recipe_id: com.google.android.apps.chromecast.app.toggle-power
recipe_version: 1.2.0
recipe_type: action
application_id: com.google.android.apps.chromecast.app
summary: Toggle climate power and verify final state.
frameworks:
  - compose
  - views
session_policy: fresh
app_build:
  version_name: "3.31.100"
  version_code: 5311000
tested_on:
  android_api: 34
capabilities:
  - observe
  - navigate
  - click
  - toggle
inputs:
  - name: ac_tile_name
    type: string
    required: true
    default: ""
outputs:
  - name: power
    type: string
    redaction: none
```

---