Security Deep Dive
How the architecture protects itself
Security is structural, not bolted on. Every component is designed with safe defaults, and protection mechanisms are out of reach of the agent they protect. This page details the runtime enforcement and isolation layers.
exec-guardian is an independent Python daemon that intercepts every shell command before execution. It listens on a Unix socket, outside the agent's workspace, and its code and policies are locked by OS-level immutability flags.
Three Enforcement Levels
Auto-deny: dangerous commands (sudo, rm -rf, chmod 777) are rejected automatically. No human review is needed.
Auto-allow: read-only commands (git status, curl, cat) are permitted automatically. This is the fast path, with no delay.
Ask: sensitive commands (git push, rm, kill, brew install) trigger an iMessage approval request carrying a unique verification code. If no approval arrives within 2 minutes, the command is denied automatically.
If the daemon crashes or becomes unreachable, ALL non-pre-authorized commands are denied. Fail-closed by design.
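The three levels can be sketched as a small classifier. This is a minimal illustration, not the guardian's actual code: the rule tables below are hypothetical stand-ins for what policies.json would contain, and the handling of compound commands is an assumption. curl is left out of the allow table because its treatment depends on the target host.

```python
import shlex
from enum import Enum


class Verdict(Enum):
    DENY = "auto-deny"
    ALLOW = "auto-allow"
    ASK = "ask-human"


# Hypothetical rule tables; the real rules live in policies.json.
DENY_PREFIXES = [("sudo",), ("rm", "-rf"), ("chmod", "777")]
ALLOW_PREFIXES = [("git", "status"), ("cat",)]
ASK_PREFIXES = [("git", "push"), ("rm",), ("kill",), ("brew", "install")]

# Characters that can chain or redirect commands in a shell.
SHELL_METACHARACTERS = set(";|&`$<>\n")


def _matches(tokens, prefixes):
    """True if the command starts with any of the given token prefixes."""
    return any(tuple(tokens[: len(p)]) == p for p in prefixes)


def classify(command: str) -> Verdict:
    try:
        tokens = shlex.split(command)
    except ValueError:
        return Verdict.DENY  # unbalanced quotes: refuse rather than guess
    if not tokens:
        return Verdict.DENY
    # Deny rules are checked first so "rm -rf" never reaches the "rm" ask rule.
    if _matches(tokens, DENY_PREFIXES):
        return Verdict.DENY
    # Assumption: compound commands never take the fast path.
    if SHELL_METACHARACTERS & set(command):
        return Verdict.ASK
    if _matches(tokens, ALLOW_PREFIXES):
        return Verdict.ALLOW
    # Known-sensitive and unrecognized commands both go to the human.
    return Verdict.ASK


print(classify("git status"))
print(classify("rm -rf /tmp/build"))
print(classify("git push origin main"))
```

Checking deny rules before allow rules, and sending anything unrecognized to review, keeps the default outcome on the restrictive side.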
Critical configuration files are protected with OS-level immutability flags (chflags uchg). The agent cannot modify its own rules, identity, or security configuration.
- openclaw.json
- SOUL.md
- SECURITY.md
- AGENTS.md
- IDENTITY.md
- TOOLS.md
- exec-guardian.py
- policies.json
Unlocking requires physical intervention by the human supervisor.
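A supervisor could verify that the locks are still in place with a short audit script. The sketch below is an assumption about how such a check might look, not part of the described system. It reads the BSD/macOS file flags via `os.stat` and tests for `stat.UF_IMMUTABLE`, which is the flag that `chflags uchg` sets. The function names are hypothetical.

```python
import os
import stat


def is_user_immutable(path: str) -> bool:
    """Return True if the file carries the user-immutable flag (uchg)."""
    st = os.stat(path)
    # st_flags exists on macOS and the BSDs only; treat its absence as "not set".
    flags = getattr(st, "st_flags", 0)
    return bool(flags & stat.UF_IMMUTABLE)


def unlocked_files(paths):
    """List the files that are missing or not currently locked."""
    problems = []
    for path in paths:
        if not os.path.exists(path) or not is_user_immutable(path):
            problems.append(path)
    return problems


# File names are taken from the list above; their directory is not specified,
# so relative paths are used here purely for illustration.
print(unlocked_files(["policies.json", "exec-guardian.py"]))
```

An empty result means every listed file exists and is locked; anything else warrants a manual look.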
PIN Security
Every high-impact action requires a numeric PIN code known only to the human supervisor. The PIN is changed after any suspected compromise and is never stored in agent sessions.
Current implementation: PIN required for email sending, code deployment, file deletion, and system configuration changes.
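One way to gate these actions without keeping the PIN itself anywhere is to store only a salted hash and compare in constant time. The sketch below is an assumption about a possible implementation. The action names, the hashing parameters, and the function names are all hypothetical.

```python
import hashlib
import hmac
import os

# Hypothetical identifiers for the four PIN-protected action types listed above.
PIN_PROTECTED_ACTIONS = {
    "send_email",
    "deploy_code",
    "delete_file",
    "change_system_config",
}


def hash_pin(pin: str, salt: bytes) -> bytes:
    """Derive a slow, salted hash so the PIN is never stored in the clear."""
    return hashlib.pbkdf2_hmac("sha256", pin.encode(), salt, 200_000)


def authorize(action: str, supplied_pin, salt: bytes, expected_hash: bytes) -> bool:
    """PIN gate only: actions outside the protected set are not checked here."""
    if action not in PIN_PROTECTED_ACTIONS:
        return True
    if not supplied_pin:
        return False
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(hash_pin(supplied_pin, salt), expected_hash)


salt = os.urandom(16)
stored = hash_pin("4821", salt)  # example PIN for the demo only
print(authorize("deploy_code", "4821", salt, stored))
print(authorize("deploy_code", "0000", salt, stored))
```

Because the supervisor supplies the PIN at the moment of approval, nothing in an agent session needs to hold it.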
Secured Mail-Reader
The mail-reader agent runs in exec allowlist mode. Only a single script is authorized: mail-extract with exactly 3 allowed flags (--hours, --account, --output). Everything else is denied.
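An allowlist of this shape can be checked in a few lines. The following is a sketch under stated assumptions: each flag takes exactly one space-separated value, no flag may repeat, and the function name `is_allowed` is hypothetical. The actual mail-extract argument rules are not specified here.

```python
ALLOWED_PROGRAM = "mail-extract"
ALLOWED_FLAGS = {"--hours", "--account", "--output"}


def is_allowed(argv: list) -> bool:
    """Accept only `mail-extract` followed by allowlisted flag/value pairs."""
    if not argv or argv[0] != ALLOWED_PROGRAM:
        return False
    args = argv[1:]
    # Assumption: every flag is followed by exactly one value.
    if len(args) % 2 != 0:
        return False
    seen = set()
    for flag, value in zip(args[0::2], args[1::2]):
        if flag not in ALLOWED_FLAGS or flag in seen:
            return False
        # A value that looks like a flag is treated as an injection attempt.
        if value.startswith("-"):
            return False
        seen.add(flag)
    return True


print(is_allowed(["mail-extract", "--hours", "24", "--account", "work"]))
print(is_allowed(["mail-extract", "--hours", "24", "--delete", "all"]))
```

Working on an argument list rather than a command string means shell quoting tricks have nothing to act on.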
Allowed: --hours, --account, --output. Denied: * (everything else).
Native iMessage handling is disabled. A Python daemon (imsg-watcher) polls the chat.db database continuously. Each incoming message spawns an isolated one-shot sub-agent that handles the request then terminates.
- No persistent sessions between messages
- 24h greeting cooldown per contact
- Project briefing auto-injection
- 300-second timeout per sub-agent
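The 24h greeting cooldown from the list above can be sketched as a small in-memory tracker. This is an illustration only: the class name is hypothetical, and a real daemon would presumably persist the timestamps so that a restart does not reset them.

```python
import time

GREETING_COOLDOWN_SECONDS = 24 * 60 * 60


class GreetingCooldown:
    """Tracks when each contact was last greeted."""

    def __init__(self, clock=time.time):
        # The clock is injectable so the logic can be tested without waiting.
        self._clock = clock
        self._last_greeted = {}

    def should_greet(self, contact: str) -> bool:
        now = self._clock()
        last = self._last_greeted.get(contact)
        if last is not None and now - last < GREETING_COOLDOWN_SECONDS:
            return False
        self._last_greeted[contact] = now
        return True


cooldown = GreetingCooldown()
print(cooldown.should_greet("alice"))  # first contact of the day
print(cooldown.should_greet("alice"))  # still inside the cooldown window
```

Because sub-agents are one-shot, state like this has to live in the watcher daemon rather than in any agent session.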
Each agent has its own set of tool permissions. The main agent has broad access, while sub-agents operate with minimal permissions scoped to their specific task. A mail-reader cannot deploy code. A bridge-reader has zero tools.
| Agent | Shell | Files | Network | Git | Deploy |
|---|---|---|---|---|---|
| main | | | | | |
| mail-reader | | | | | |
| bridge-reader | | | | | |
| kickoff-coding | | | | | |
| imsg-handler | | | | | |
Each sub-agent receives only the minimum permissions required for its specific task.
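A deny-by-default lookup is one simple way to express per-agent scoping. The sketch below encodes only what the text states (bridge-reader has zero tools, mail-reader cannot deploy). The tool names and the mail-reader entry are assumptions, and the other agents are omitted because their permissions are not given here.

```python
# Only entries supported by the text are included.
AGENT_TOOLS = {
    "bridge-reader": frozenset(),  # stated: zero tools
    # Assumption: shell access only, further narrowed by the exec allowlist.
    "mail-reader": frozenset({"shell"}),
}


def is_permitted(agent: str, tool: str) -> bool:
    """Unknown agents and unlisted tools fall through to 'no'."""
    return tool in AGENT_TOOLS.get(agent, frozenset())


print(is_permitted("mail-reader", "deploy"))
print(is_permitted("bridge-reader", "shell"))
```

Listing granted tools rather than forbidden ones means a newly added tool is unavailable to every sub-agent until someone grants it explicitly.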
iMessage CLI Safety
The iMessage sending CLI uses named flags only (--to, --text). Positional arguments are rejected. This prevents command injection through message content. 45-second timeout per send operation.
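The named-flags rule can be illustrated with `argparse`. This is a sketch, not the actual CLI: the program name `imsg-send` is hypothetical, and the `--flag=value` form used when building the command is an assumption about what the real tool accepts.

```python
import argparse
import subprocess

SEND_TIMEOUT_SECONDS = 45


class _StrictParser(argparse.ArgumentParser):
    def error(self, message):
        # Raise instead of exiting so callers can handle rejection themselves.
        raise ValueError(message)


def parse_send_args(argv):
    """Accept --to and --text only; any positional argument is an error."""
    parser = _StrictParser(prog="imsg-send", allow_abbrev=False, add_help=False)
    parser.add_argument("--to", required=True)
    parser.add_argument("--text", required=True)
    return parser.parse_args(argv)


def build_send_command(to: str, text: str) -> list:
    # The "=" form keeps a message that starts with "-" from being read as a flag.
    return ["imsg-send", f"--to={to}", f"--text={text}"]


def send(to: str, text: str) -> None:
    # A list argv with no shell means message content is never interpreted
    # as shell syntax. TimeoutExpired is raised after 45 seconds.
    subprocess.run(build_send_command(to, text), check=True,
                   timeout=SEND_TIMEOUT_SECONDS)


args = parse_send_args(["--to", "+15555550100", "--text", "hello; rm -rf ~"])
print(args.text)  # the payload stays plain text
```

The message body travels as a single argv element, so shell metacharacters inside it are delivered as text rather than executed.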
exec-guardian
Permanent daemon (launchd)
Runtime enforcement layer: every tool execution passes through the guardian before running.
HOW IT WORKS
1. Python daemon listening on a Unix socket (exec-approvals). Every exec call from OpenClaw is intercepted.
2. 3-tier decision engine: auto-deny (dangerous commands like rm -rf, curl to unknown hosts), auto-allow (safe commands matching the allowlist), ask-julien (ambiguous cases sent via iMessage for human approval).
3. Policies are defined in policies.json. Both the daemon and the policy file are locked with chflags uchg (immutable).
4. Fail-closed: if the guardian daemon crashes or is unreachable, ALL executions are blocked. The agent cannot bypass it.
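Fail-closed behavior on the caller's side might look like the sketch below. It is an assumption-laden illustration: the newline-delimited JSON wire format, the `verdict` field, and the function name are hypothetical, since the guardian's actual protocol is not described here.

```python
import json
import socket


def request_verdict(command: str, socket_path: str, timeout: float = 5.0) -> bool:
    """Ask the guardian whether `command` may run. Any failure means 'no'."""
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            sock.connect(socket_path)
            request = json.dumps({"command": command}).encode() + b"\n"
            sock.sendall(request)
            with sock.makefile("r") as reader:
                reply = reader.readline()
        return json.loads(reply).get("verdict") == "allow"
    except (OSError, ValueError, AttributeError):
        # Missing socket, timeout, empty or malformed reply: all treated as denial.
        return False


# With no daemon listening at this path, the call is refused rather than allowed.
print(request_verdict("git status", "/tmp/no-such-guardian.sock"))
```

The only path to `True` is a well-formed reply that explicitly says "allow", so a crashed or unreachable daemon can never be mistaken for approval.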
WHY IT MATTERS
An AI agent with shell access can do anything. The guardian ensures that even a compromised or confused agent cannot execute destructive commands without human approval. It's the last line of defense.
imsg-watcher
Permanent daemon (launchd)
Replaces native iMessage handling with a controlled, isolated message processing pipeline.
HOW IT WORKS
1. Polls macOS chat.db (SQLite) every few seconds for new messages. Native iMessage integration is disabled.
2. Each incoming message spawns a fresh one-shot sub-agent: new context, scoped tools, 300s timeout. When the sub-agent finishes, it is destroyed.
3. A 24h greeting cooldown per contact prevents spam. Project context is auto-injected based on contact.
4. The main agent never sees raw iMessage content; only the sub-agent does. If the sub-agent is compromised by a message, the damage is contained.
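The polling step can be sketched with the standard `sqlite3` module. The query assumes a simplified view of the chat.db schema (a `message` table with `text`, `handle_id`, and `is_from_me`, joined to a `handle` table whose `id` is the sender). The function names are hypothetical, and the demo runs against an in-memory stand-in rather than the real database.

```python
import sqlite3


def fetch_new_messages(conn, last_rowid):
    """Return (rowid, sender, text) for incoming messages newer than last_rowid."""
    return conn.execute(
        "SELECT m.ROWID, h.id, m.text FROM message AS m "
        "JOIN handle AS h ON h.ROWID = m.handle_id "
        "WHERE m.ROWID > ? AND m.is_from_me = 0 "
        "ORDER BY m.ROWID",
        (last_rowid,),
    ).fetchall()


def poll_once(conn, last_rowid, handle_message):
    """Dispatch each new message once and return the updated high-water mark."""
    for rowid, sender, text in fetch_new_messages(conn, last_rowid):
        handle_message(sender, text)  # e.g. spawn a one-shot sub-agent
        last_rowid = rowid
    return last_rowid


# In-memory stand-in for chat.db, reduced to the columns the query uses.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE handle (id TEXT)")
conn.execute("CREATE TABLE message (text TEXT, handle_id INTEGER, is_from_me INTEGER)")
conn.execute("INSERT INTO handle (id) VALUES ('+15555550100')")
conn.execute("INSERT INTO message (text, handle_id, is_from_me) VALUES ('hi', 1, 0)")
last = poll_once(conn, 0, lambda sender, text: print(sender, text))
```

A real watcher would likely open the database read-only, for example with `sqlite3.connect("file:<path>?mode=ro", uri=True)`, and call `poll_once` in a loop with a short sleep between iterations.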
WHY IT MATTERS
iMessage is an attack vector: anyone who knows the phone number can send prompt injections. The watcher isolates each message into a disposable sandbox. A malicious message can compromise at most one short-lived sub-agent; the main agent is unaffected.
