Native Apps

One-line summary

Three native client apps (macOS, iOS, Android) act as thin node clients that expose device capabilities (camera, microphone, screen, location, contacts, etc.) to the Gateway over WebSocket, while providing a local chat UI and voice interaction.

Responsibilities

  • Connect to the Gateway over WebSocket with Ed25519 device authentication and TLS certificate pinning
  • Advertise device capabilities (camera, screen, location, voice, contacts, calendar, etc.) to the Gateway
  • Execute Gateway-invoked commands (camera.snap, location.get, screen.record, canvas.navigate, etc.)
  • Provide a local chat interface with streaming responses and Markdown rendering
  • Support voice interaction: wake word detection and push-to-talk
  • Display the A2UI canvas via embedded WebView with JavaScript bridge

Architecture diagram

Key source files

Shared (Apple platforms)

| File | Role |
|---|---|
| apps/shared/OpenClawKit/ | Shared Swift package: Protocol, session management, device identity, command definitions, A2UI |
| apps/shared/OpenClawChatUI/ | Shared chat UI: ChatViewModel, ChatView, ChatTransport protocol, Markdown rendering |
| apps/shared/OpenClawProtocol/ | Protocol models: GatewayModels, AnyCodable for JSON handling |

macOS

| File | Role |
|---|---|
| apps/macos/Sources/AppState.swift | Global state: @Observable singleton managing connection, settings, capabilities |
| apps/macos/Sources/CanvasWindowController.swift | Canvas window: WKWebView management, A2UI bridge, file watching |
| apps/macos/Sources/VoiceWakeRuntime.swift | Voice wake: Configurable trigger words ("Claude", "Computer", "Jarvis"), SwabbleKit engine |
| apps/macos/Sources/ExecApprovalsGatewayPrompter.swift | Exec approvals: Operator approval prompts for tool execution |
| apps/macos/Package.swift | Dependencies: OpenClawKit, Sparkle 2.8, MenuBarExtraAccess, swift-subprocess |

iOS

| File | Role |
|---|---|
| apps/ios/Sources/OpenClawApp.swift | Entry point: SwiftUI @main, AppDelegate for push notifications + background tasks |
| apps/ios/Sources/GatewayConnectionController.swift | Connection: Bonjour/mDNS discovery, TLS pinning, WebSocket lifecycle |
| apps/ios/Sources/IOSGatewayChatTransport.swift | Chat transport: chat.send, chat.history, event subscriptions |
| apps/ios/Sources/CanvasController.swift | Canvas: WKWebView with JavaScript bridge for A2UI |
| apps/ios/Sources/VoiceWakeManager.swift | Voice wake: Speech framework, wake word detection |

Android

| File | Role |
|---|---|
| apps/android/.../MainActivity.kt | Entry point: Single ComponentActivity with Jetpack Compose |
| apps/android/.../NodeApp.kt | Application: Initializes NodeRuntime |
| apps/android/.../NodeRuntime.kt | Core runtime: GatewaySession, ChatController, voice, camera, screen, location handlers |
| apps/android/.../GatewaySession.kt | WebSocket: OkHttp WebSocket, protocol v3, auto-reconnect (350ms * 1.7^n, max 8s) |
| apps/android/.../InvokeDispatcher.kt | Command router: Routes Gateway invokes to platform handlers with permission/foreground checks |
| apps/android/.../DeviceIdentityStore.kt | Device identity: Ed25519 key pair in encrypted SharedPreferences |
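
The auto-reconnect schedule noted for GatewaySession.kt (350ms * 1.7^n, max 8s) can be sketched as below. This is an illustrative Python sketch, not the Kotlin implementation; the function name is hypothetical, and it assumes n counts failed attempts starting from 0.

```python
def reconnect_delay_ms(attempt: int, base_ms: float = 350.0,
                       factor: float = 1.7, cap_ms: float = 8000.0) -> float:
    """Delay before reconnect attempt `attempt` (0-based), capped at cap_ms."""
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    # Clamp the exponent so very large attempt counts cannot overflow a float.
    exponent = min(attempt, 64)
    return min(base_ms * factor ** exponent, cap_ms)


if __name__ == "__main__":
    # Print the first few delays; the cap takes over once the raw value passes 8 s.
    for n in range(8):
        print(n, round(reconnect_delay_ms(n)))
```

With these constants the raw delay first exceeds the 8 s cap at n = 6, so every later attempt waits the full 8 s.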

WebSocket protocol (shared across all platforms)

Protocol version

All apps use protocol v3 (min=3, max=3).
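
A min/max pair like this is typically resolved by intersecting the client and server ranges. The sketch below assumes the highest common version wins; the source does not say how the Gateway actually negotiates, and the function name is hypothetical.

```python
from typing import Optional


def negotiate_protocol(client_min: int, client_max: int,
                       server_min: int, server_max: int) -> Optional[int]:
    """Return the highest version both sides support, or None if the ranges do not overlap."""
    lowest = max(client_min, server_min)
    highest = min(client_max, server_max)
    return highest if lowest <= highest else None
```

With min=3 and max=3 on the client, any server range that does not include 3 yields no common version.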

Connect handshake

1. Client opens WebSocket to ws://host:port or wss://host:port
2. Server sends { type: "event", event: "connect.challenge", payload: { nonce } }
3. Client constructs signature payload:
   "v2|deviceId|clientId|clientMode|role|scopes|signedAtMs|token|nonce"
4. Client signs with Ed25519 private key
5. Client sends "connect" RPC with:
   {
     proto: { min: 3, max: 3 },
     client: { id, displayName, version, platform, mode },
     caps: ["canvas", "camera", "screen", "voiceWake", "location", ...],
     commands: [...list of supported commands...],
     permissions: { camera, microphone, location, ... },
     role: "node",
     auth: { device: { deviceId, publicKey, signedAtMs, signature, token } }
   }
6. Server responds with:
   {
     ok: true,
     server: { host },
     auth: { deviceToken },
     canvasHostUrl,
     snapshot: { sessionDefaults: { mainSessionKey } }
   }
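
Steps 3 to 5 can be sketched as follows. Only the field order of the signature payload and the shape of auth.device come from the handshake above. The rest is assumed for illustration: scopes joined with commas, binary values sent as unpadded base64url, and the Ed25519 signer passed in as a callable so the sketch needs no crypto library. Both function names are hypothetical.

```python
import base64
from typing import Callable, Sequence


def build_signature_payload(device_id: str, client_id: str, client_mode: str,
                            role: str, scopes: Sequence[str], signed_at_ms: int,
                            token: str, nonce: str) -> str:
    """Join the handshake fields in the documented order, separated by '|'."""
    # Assumption: scopes are serialized as a comma-separated list.
    fields = ["v2", device_id, client_id, client_mode, role,
              ",".join(scopes), str(signed_at_ms), token, nonce]
    return "|".join(fields)


def build_device_auth(device_id: str, public_key: bytes, signed_at_ms: int,
                      token: str, payload: str,
                      sign: Callable[[bytes], bytes]) -> dict:
    """Build the auth.device object for the connect RPC."""
    def b64url(raw: bytes) -> str:
        # Assumption: binary values travel as unpadded base64url text.
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

    # `sign` stands in for an Ed25519 signing call backed by the device key.
    signature = sign(payload.encode("utf-8"))
    return {
        "deviceId": device_id,
        "publicKey": b64url(public_key),
        "signedAtMs": signed_at_ms,
        "signature": b64url(signature),
        "token": token,
    }
```

Because the server-issued nonce is part of the signed string, a captured signature cannot be replayed against a later challenge.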

Node invoke flow

Gateway → Client:
  { type: "event", event: "node.invoke.request",
    payload: { invokeId, command, params } }

Client executes command (e.g., camera.snap, location.get)

Client → Gateway:
  { type: "req", method: "node.invoke.result",
    params: { invokeId, result: {...} } }
  OR
  { type: "req", method: "node.invoke.result",
    params: { invokeId, error: { code, message } } }
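
A minimal sketch of the client side of this flow: look up a handler for the requested command and wrap its outcome in a node.invoke.result frame. The handler table, the InvokeError class, and the error codes (UNSUPPORTED_COMMAND, INTERNAL) are hypothetical; only the frame shapes follow the messages above.

```python
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any]], Dict[str, Any]]


class InvokeError(Exception):
    """Raised by handlers to report a structured error to the Gateway."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(message)
        self.code = code
        self.message = message


def handle_invoke_request(event: Dict[str, Any],
                          handlers: Dict[str, Handler]) -> Dict[str, Any]:
    """Turn a node.invoke.request event into a node.invoke.result request frame."""
    payload = event["payload"]
    params: Dict[str, Any] = {"invokeId": payload["invokeId"]}
    command = payload["command"]
    try:
        handler = handlers.get(command)
        if handler is None:
            raise InvokeError("UNSUPPORTED_COMMAND", f"no handler for {command}")
        params["result"] = handler(payload.get("params") or {})
    except InvokeError as exc:
        params["error"] = {"code": exc.code, "message": exc.message}
    except Exception as exc:  # any other failure still produces a reply
        params["error"] = {"code": "INTERNAL", "message": str(exc)}
    return {"type": "req", "method": "node.invoke.result", "params": params}
```

Every request gets exactly one reply carrying either result or error, so the Gateway never waits on an invokeId that the client silently dropped.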

Device capabilities by platform

| Capability | iOS | Android | macOS |
|---|---|---|---|
| canvas | WKWebView | Android WebView | WKWebView |
| camera | AVFoundation | CameraX | - |
| screen | ReplayKit | MediaProjection | - |
| location | CoreLocation | FusedLocation | - |
| voiceWake | Speech framework | Android SpeechRecognizer | SwabbleKit |
| device | UIDevice info | Build info | - |
| contacts | Contacts framework | - | - |
| calendar | EventKit | - | - |
| reminders | EventKit | - | - |
| photos | Photos framework | - | - |
| watch | WatchConnectivity | - | - |
| sms | - | SmsManager | - |

Node commands by platform

| Command | iOS | Android | macOS |
|---|---|---|---|
| canvas.present / canvas.hide | Yes | Yes | Yes |
| canvas.navigate / canvas.evalJS | Yes | Yes | Yes |
| canvas.a2ui.push / a2ui.reset | Yes | Yes | Yes |
| canvas.snapshot | Yes | Yes | Yes |
| camera.snap / camera.clip | Yes | Yes | - |
| screen.record | Yes | Yes | - |
| location.get | Yes | Yes | - |
| system.notify | Yes | - | - |
| chat.push | Yes | - | - |
| talk.pttStart / pttStop | Yes | Yes | Yes |
| contacts.search / contacts.add | Yes | - | - |
| calendar.events / calendar.add | Yes | - | - |
| reminders.list / reminders.add | Yes | - | - |
| photos.latest | Yes | - | - |
| watch.status / watch.notify | Yes | - | - |
| motion.activity / motion.pedometer | Yes | - | - |
| device.status / device.info | Yes | Yes | - |
| sms.send | - | Yes | - |
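
The matrix above is what each client advertises in the commands field of the connect RPC. A small sketch of how such a matrix could be held as data and queried follows. Only a few rows are transcribed, and the platform keys and function names are hypothetical.

```python
from typing import Dict, FrozenSet, List

ALL = frozenset({"ios", "android", "macos"})
MOBILE = frozenset({"ios", "android"})

# A subset of the command table, keyed by command name.
COMMAND_SUPPORT: Dict[str, FrozenSet[str]] = {
    "canvas.present": ALL,
    "canvas.navigate": ALL,
    "canvas.snapshot": ALL,
    "talk.pttStart": ALL,
    "camera.snap": MOBILE,
    "screen.record": MOBILE,
    "location.get": MOBILE,
    "system.notify": frozenset({"ios"}),
    "contacts.search": frozenset({"ios"}),
    "sms.send": frozenset({"android"}),
}


def supports(platform: str, command: str) -> bool:
    """True if the table lists `command` as available on `platform`."""
    return platform in COMMAND_SUPPORT.get(command, frozenset())


def advertised_commands(platform: str) -> List[str]:
    """Sorted list of commands a platform would advertise at connect time."""
    return sorted(c for c, platforms in COMMAND_SUPPORT.items() if platform in platforms)
```

Keeping the table as data means one lookup can serve both the connect payload and a pre-dispatch check on incoming invokes.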

Platform differences

| Aspect | iOS | Android | macOS |
|---|---|---|---|
| Language | Swift 6 | Kotlin | Swift 6 |
| UI Framework | SwiftUI | Jetpack Compose (Material 3) | SwiftUI + AppKit |
| Min Version | iOS 18+ | API 31 (Android 12) | macOS 15+ |
| App Type | Full-screen app | Full-screen app | Menu bar app |
| WebSocket | URLSessionWebSocketTask (via OpenClawKit) | OkHttp 5.3 | URLSessionWebSocketTask (via OpenClawKit) |
| Crypto | CryptoKit (Ed25519) | BouncyCastle (Ed25519) | CryptoKit (Ed25519) |
| Background | BGAppRefreshTask + silent push | Foreground Service | Always running |
| Discovery | Bonjour/mDNS | mDNS + dnsjava (Tailscale) | Bonjour + wide-area DNS-SD |
| Auto-Update | App Store / TestFlight | APK auto-update | Sparkle 2.8 |
| Architecture | MVVM + Observation | MVVM + StateFlow | MVVM + Observation |
| WebView | WKWebView | Android WebView | WKWebView |

TLS and security

Certificate pinning:
  - SHA256 fingerprint of server TLS certificate
  - Trust-on-first-use (user must approve fingerprint once)
  - Stored per-gateway in device auth store
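
A trust-on-first-use pin check along these lines could look like the sketch below. It assumes the fingerprint is SHA-256 over the DER-encoded certificate and uses an in-memory dict in place of the real device auth store. The class and method names are hypothetical.

```python
import hashlib
import hmac
from typing import Callable, Dict


def fingerprint_sha256(cert_der: bytes) -> str:
    """Hex SHA-256 digest of a certificate (assumed to be DER-encoded)."""
    return hashlib.sha256(cert_der).hexdigest()


class PinStore:
    """In-memory stand-in for the per-gateway pin storage."""

    def __init__(self) -> None:
        self._pins: Dict[str, str] = {}

    def verify(self, gateway_id: str, cert_der: bytes,
               approve: Callable[[str], bool]) -> bool:
        """Accept a known pin, or ask the user once when no pin exists yet."""
        seen = fingerprint_sha256(cert_der)
        pinned = self._pins.get(gateway_id)
        if pinned is None:
            # First contact: the user must approve the fingerprint.
            if approve(seen):
                self._pins[gateway_id] = seen
                return True
            return False
        # Later contacts: the certificate must match the stored pin exactly.
        return hmac.compare_digest(pinned, seen)
```

In this sketch a changed certificate is rejected outright rather than re-prompted; whether the apps offer a re-approval path is not covered here.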

Loopback exception:
  - No TLS required for localhost/127.0.0.1 connections

Tailscale domains:
  - .ts.net domains enforce TLS
  - Wide-area DNS-SD discovery supported
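
The loopback and Tailscale rules can be combined into one URL check, sketched below. The treatment of hosts that are neither loopback nor .ts.net is not specified above, so it is exposed as a hypothetical parameter (defaulting to rejecting plaintext). The sketch also treats any loopback IP literal such as ::1 the same as 127.0.0.1, which is an assumption.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_host(host: str) -> bool:
    """True for 'localhost' and for loopback IP literals such as 127.0.0.1."""
    if host.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # not an IP literal


def is_gateway_url_allowed(url: str, allow_plaintext_elsewhere: bool = False) -> bool:
    """Apply the TLS rules to a ws:// or wss:// Gateway URL."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if not host:
        return False
    if parts.scheme == "wss":
        return True  # TLS is always acceptable
    if parts.scheme != "ws":
        return False
    if is_loopback_host(host):
        return True  # loopback exception: plaintext is fine
    if host == "ts.net" or host.endswith(".ts.net"):
        return False  # Tailscale domains must use TLS
    return allow_plaintext_elsewhere
```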

Device identity:
  - Ed25519 key pair generated on first launch
  - Private key stored in:
    - iOS: Keychain
    - Android: EncryptedSharedPreferences (key pair generated with BouncyCastle)
    - macOS: Keychain
  - Public key sent during connect handshake
  - Device token returned by server, stored for subsequent connects

A2UI Canvas integration

All three platforms embed a WebView for the A2UI canvas:

Gateway → Node: "canvas.a2ui.push" { messages: [...] }
Node → WebView: JavaScript injection
WebView → Node: window.openclawCanvasA2UIAction.postMessage(json)
Node → Gateway: "node.invoke.result" { ... }
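
The two bridge directions can be sketched as string handling on the node side: serialize pushed messages into a script for the WebView, and parse the JSON that the page posts back. The receiver function name `window.__a2uiReceive` and the message shapes are hypothetical; only the postMessage(json) channel comes from the flow above.

```python
import json
from typing import Any, Dict, List


def build_a2ui_injection(messages: List[Any],
                         receiver: str = "window.__a2uiReceive") -> str:
    """Build a script that hands `messages` to a receiver function in the page."""
    # json.dumps output is a valid JavaScript literal, so quotes and
    # backslashes inside messages cannot break out of the call.
    payload = json.dumps(messages)
    return (f"(function(){{ if (typeof {receiver} === 'function') "
            f"{{ {receiver}({payload}); }} }})();")


def parse_a2ui_action(raw: str) -> Dict[str, Any]:
    """Decode a JSON string posted by the page; reject anything but an object."""
    action = json.loads(raw)
    if not isinstance(action, dict):
        raise ValueError("A2UI action must be a JSON object")
    return action
```

Serializing through JSON rather than concatenating strings is the important part: message content supplied by the Gateway never becomes executable script text.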

Canvas lifecycle:
  - canvas.present → show WebView
  - canvas.navigate → load URL
  - canvas.a2ui.push → inject A2UI messages
  - canvas.snapshot → capture screenshot (base64)
  - canvas.hide → hide WebView
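
The lifecycle can be modeled as a small state holder that each canvas command mutates. This is a sketch only: the `url` parameter name and the `base64` result key are hypothetical, and the screenshot capture is passed in as a callable because it is platform-specific.

```python
import base64
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class CanvasModel:
    """Minimal state a node might track for its canvas WebView."""
    visible: bool = False
    url: Optional[str] = None
    a2ui_messages: List[Any] = field(default_factory=list)


def apply_canvas_command(model: CanvasModel, command: str, params: Dict[str, Any],
                         capture_png: Callable[[], bytes]) -> Dict[str, Any]:
    """Apply one canvas command to the model and return its invoke result."""
    if command == "canvas.present":
        model.visible = True
    elif command == "canvas.hide":
        model.visible = False
    elif command == "canvas.navigate":
        model.url = params["url"]
    elif command == "canvas.a2ui.push":
        model.a2ui_messages.extend(params["messages"])
    elif command == "canvas.snapshot":
        # Binary image data is base64-encoded so it fits in a JSON result.
        return {"base64": base64.b64encode(capture_png()).decode("ascii")}
    else:
        raise ValueError(f"unknown canvas command: {command}")
    return {}
```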

How it connects to other modules

  • Depends on:

    • gateway/ — WebSocket server handles all communication, serves canvas host, manages device auth
    • apps/shared/OpenClawKit/ — shared Swift package for iOS + macOS (protocol, session, identity, commands)
    • apps/shared/OpenClawChatUI/ — shared chat UI components for iOS + macOS
  • Depended on by:

    • None — native apps are leaf clients

My blind spots

  • Exact SwabbleKit wake word engine internals — third-party dependency, not OpenClaw source
  • How PeekabooBridgeHostCoordinator on macOS handles UI automation — seems like a significant capability
  • APK auto-update mechanism on Android (AppUpdateHandler) — how does it discover and apply updates?
  • Watch app features — WatchConnectivity is referenced but the actual watchOS app was not found in the source tree
  • Whether Android supports the full set of iOS commands (contacts, calendar, reminders, photos) or if these are iOS-only
  • How screen recording results are encoded and transmitted — size constraints for WebSocket frames
  • Voice wake accuracy and battery impact across platforms

Change frequency

  • iOS: High — most feature-rich platform, new commands and capabilities added regularly
  • Android: Medium — follows iOS features with some platform-specific additions (SMS, foreground service)
  • macOS: Medium — menu bar app with focused feature set, Peekaboo integration evolving
  • Shared (OpenClawKit): Medium — protocol changes propagate here first, then to platform code
  • Protocol (v3): Low — wire protocol is stable; capability additions don't require protocol changes