Security Deep Dive
How the architecture protects itself
Security is structural, not bolted on. Every component is designed with safe defaults, and protection mechanisms are out of reach of the agent they protect. EasyClaw v2 adds three new layers -- permission classifier, PreToolUse hooks, and verification agent -- that reinforce existing defenses. This page details all 8 enforcement and isolation layers.
An independent daemon that intercepts every shell command before execution. It runs on a local channel, outside the agent's workspace, with its code and policies locked by OS-level immutability flags. With EasyClaw v2, the permission classifier and PreToolUse hooks add two filtering layers upstream.
Three Enforcement Levels
- Auto-deny: dangerous commands (sudo, rm -rf, chmod 777) are rejected automatically. No human review is needed.
- Auto-allow: read-only commands (git status, curl, cat) are permitted automatically. This is the fast path, with no delay.
- Ask-human: sensitive commands (git push, rm, kill, brew install) trigger an iMessage approval request with a unique verification code. If no approval arrives within 2 minutes, the command is denied automatically.
If the daemon crashes or becomes unreachable, ALL non-pre-authorized commands are denied. Fail-closed by design.
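The three levels can be sketched as a small classification function. This is a minimal illustration, not the guardian's actual code: the `Decision` enum, the prefix tables, and the metacharacter check are assumptions, and the tables contain only the example commands named above.

```python
import shlex
from enum import Enum


class Decision(Enum):
    DENY = "auto-deny"
    ALLOW = "auto-allow"
    ASK = "ask-human"


# Hypothetical policy tables, filled with examples from the text.
DENY_PREFIXES = [["sudo"], ["rm", "-rf"], ["chmod", "777"]]
ALLOW_PREFIXES = [["git", "status"], ["cat"]]
# Assumption: commands that chain or redirect are never fast-pathed.
SHELL_METACHARS = set(";|&$><`")


def _starts_with(tokens, prefix):
    return tokens[: len(prefix)] == prefix


def classify(command: str) -> Decision:
    """Map a shell command to one of the three enforcement levels."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unparseable input (for example unbalanced quotes) is rejected.
        return Decision.DENY
    if not tokens:
        return Decision.DENY
    if any(_starts_with(tokens, p) for p in DENY_PREFIXES):
        return Decision.DENY
    if any(ch in SHELL_METACHARS for ch in command):
        return Decision.ASK
    if any(_starts_with(tokens, p) for p in ALLOW_PREFIXES):
        return Decision.ALLOW
    # Everything else (git push, rm, kill, brew install, ...) needs a human.
    return Decision.ASK
```

The deny check runs first so that a dangerous command can never reach the fast path, and anything unrecognized falls through to human approval rather than to execution.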
Critical configuration files are protected with OS-level immutability flags. The agent cannot modify its own rules, identity, or security configuration.
- Main configuration file
- Agent identity manifest
- Security policy
- Agent definitions
- Identity and personality
- Tool inventory
- Exec-guardian code
- Guardian policies

Unlocking requires physical intervention by the human supervisor.
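A supervisor could audit the lock state with a short script. The sketch below assumes the immutability flags are the BSD/macOS file flags set by `chflags uchg` or `chflags schg`. The text does not name the mechanism, so treat that as an assumption. The helper names are hypothetical.

```python
import os
import stat

# Flags set by `chflags uchg` (user immutable) and `chflags schg` (system immutable).
IMMUTABLE_MASK = stat.UF_IMMUTABLE | stat.SF_IMMUTABLE


def is_immutable(path: str) -> bool:
    """Return True if the file carries a BSD/macOS immutability flag."""
    # st_flags exists only on platforms with BSD file flags; elsewhere this reports False.
    flags = getattr(os.stat(path), "st_flags", 0)
    return bool(flags & IMMUTABLE_MASK)


def unlocked(paths):
    """Return the subset of paths that are NOT protected, for an audit report."""
    return [p for p in paths if not is_immutable(p)]
```

Running `unlocked([...])` over the protected files listed above should return an empty list on a correctly locked system.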
PIN Security
Every high-impact action requires a numeric PIN code known only to the human supervisor. The PIN is changed after any suspected compromise and is never stored in agent sessions.
Current implementation: PIN required for email sending, code deployment, file deletion, and system configuration changes.
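One way to gate these actions is sketched below. The text does not describe how the PIN is stored or checked, so the salted PBKDF2 hash, the constant-time comparison, and the action names are all assumptions made for illustration.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical action names for the four PIN-protected categories listed above.
PIN_REQUIRED = {"send_email", "deploy_code", "delete_file", "change_system_config"}


def hash_pin(pin: str, salt: bytes) -> bytes:
    """Derive a salted hash so the PIN itself is never kept in plain text."""
    return hashlib.pbkdf2_hmac("sha256", pin.encode(), salt, 100_000)


def authorize(action: str, supplied_pin: Optional[str], salt: bytes, stored_hash: bytes) -> bool:
    """Allow low-impact actions freely; require a matching numeric PIN otherwise."""
    if action not in PIN_REQUIRED:
        return True
    if supplied_pin is None or not supplied_pin.isdigit():
        return False
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(hash_pin(supplied_pin, salt), stored_hash)
```

In this sketch the check would run outside the agent's session, so the agent only ever sees the boolean result.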
Secured Mail-Reader
The mail-reader agent runs in exec allowlist mode. Only a single script is authorized: mail-extract with exactly 3 allowed flags (--hours, --account, --output). Everything else is denied.
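An allowlist of this shape can be checked before anything is executed. The sketch below assumes each flag takes exactly one value, which the text does not state. The function name is hypothetical.

```python
from typing import Sequence

ALLOWED_SCRIPT = "mail-extract"
ALLOWED_FLAGS = {"--hours", "--account", "--output"}


def is_allowed(argv: Sequence[str]) -> bool:
    """Accept only mail-extract with allowlisted flags, each followed by one value."""
    if not argv or argv[0] != ALLOWED_SCRIPT:
        return False
    args = list(argv[1:])
    if len(args) % 2 != 0:
        # A flag without a value, or a stray positional argument.
        return False
    for flag, value in zip(args[0::2], args[1::2]):
        if flag not in ALLOWED_FLAGS or value.startswith("-"):
            return False
    return True
```

For example, `is_allowed(["mail-extract", "--hours", "24"])` passes, while `["mail-extract", "--delete", "x"]` and `["bash", "-c", "..."]` are both rejected.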
- --hours: allowed
- --account: allowed
- --output: allowed
- * (everything else): denied

Native iMessage handling is disabled. An iMessage monitoring daemon polls the system database continuously. Each incoming message spawns an isolated one-shot sub-agent that handles the request then terminates.
- No persistent sessions between messages
- 24h greeting cooldown per contact
- Project briefing auto-injection
- 300-second timeout per sub-agent
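Because sub-agents keep no state between messages, the greeting cooldown presumably has to live in the monitoring daemon itself. A minimal sketch of that bookkeeping follows; the `GreetingTracker` class and its injectable clock are illustrative assumptions.

```python
import time

GREETING_COOLDOWN_SECONDS = 24 * 60 * 60


class GreetingTracker:
    """Remembers when each contact was last greeted."""

    def __init__(self, clock=time.time):
        self._clock = clock          # injectable so the logic can be tested
        self._last_greeted = {}      # contact -> timestamp of last greeting

    def should_greet(self, contact: str) -> bool:
        now = self._clock()
        last = self._last_greeted.get(contact)
        if last is not None and now - last < GREETING_COOLDOWN_SECONDS:
            return False
        self._last_greeted[contact] = now
        return True
```

The daemon would call `should_greet(contact)` before spawning a sub-agent and pass the result along as part of the injected context.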
Each agent has its own set of tool permissions. The main agent has broad access, while sub-agents operate with minimal permissions scoped to their specific task. A mail-reader cannot deploy code. A bridge-reader has zero tools.
| Agent | Shell | Files | Network | Git | Deploy |
|---|---|---|---|---|---|
| main | | | | | |
| mail-reader | | | | | |
| bridge-reader | | | | | |
| kickoff-coding | | | | | |
| imsg-handler | | | | | |
Each sub-agent receives only the minimum permissions required for its specific task.
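A deny-by-default lookup is one simple way to express per-agent scoping. In the sketch below, only two facts come from the text: bridge-reader has zero tools and mail-reader cannot deploy. The remaining entries are hypothetical placeholders.

```python
# Agent name -> tools it may use. Entries other than the two facts from the
# text (bridge-reader has none, mail-reader cannot deploy) are hypothetical.
AGENT_TOOLS = {
    "main": frozenset({"shell", "files", "network", "git", "deploy"}),
    "mail-reader": frozenset({"shell"}),   # assumed: only runs its allowlisted script
    "bridge-reader": frozenset(),          # from the text: zero tools
}


def can_use(agent: str, tool: str) -> bool:
    """Unknown agents and unlisted tools are denied by default."""
    return tool in AGENT_TOOLS.get(agent, frozenset())
```

Defaulting to an empty set means that forgetting to register a new sub-agent leaves it with no permissions instead of inheriting broad ones.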
iMessage CLI Safety
The iMessage sending CLI accepts named flags only (--to, --text) and rejects positional arguments, which prevents command injection through message content. Each send operation times out after 45 seconds.
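From the caller's side, the same idea looks like the sketch below: the message is passed as a single argument-vector entry and never goes through a shell. The `cli` executable name is supplied by the caller because the text does not give it, and the helper names are hypothetical.

```python
import subprocess

SEND_TIMEOUT_SECONDS = 45


def build_send_argv(cli: str, recipient: str, text: str) -> list:
    """Build the argument vector with named flags only."""
    return [cli, "--to", recipient, "--text", text]


def send(cli: str, recipient: str, text: str) -> subprocess.CompletedProcess:
    argv = build_send_argv(cli, recipient, text)
    # Passing a list (no shell) means characters such as ; | $ in the message
    # are plain data. subprocess.run raises TimeoutExpired after 45 seconds.
    return subprocess.run(argv, timeout=SEND_TIMEOUT_SECONDS, check=True)
```

A message such as `hi; rm -rf ~` therefore arrives at the CLI as one literal `--text` value.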
Permission Classifier
Before every action, the agent automatically evaluates its risk level. Read a file? Immediate execution. Delete a directory? Confirmation required. The classifier considers the action type, context, user history, and defined rules. Three possible outcomes: autonomous execution, confirmation request, or refusal with explanation.
Hook System (PreToolUse)
The hook system intercepts every action before execution. A PreToolUse hook can block, modify, or log any command. Over 15 events cover the agent's entire lifecycle. Hooks are declarative (JSON), chainable, and conditional.
Adversarial Verification Agent
An independent agent verifies results produced by other agents. It applies an adversarial process: questioning conclusions, looking for logical flaws, verifying sources. This mechanism adds a layer of reliability without slowing the main flow.
Exec-Guardian daemon
Permanent daemon (launchd)
Runtime enforcement layer -- every tool execution passes through the guardian before running.
HOW IT WORKS
1. Independent daemon exposed via a dedicated local channel for execution approvals. Every exec call from the agent is intercepted.
2. Three-tier decision engine: auto-deny (dangerous commands like rm -rf, curl to unknown hosts), auto-allow (safe commands matching the allowlist), ask-human (ambiguous cases sent via iMessage for human approval).
3. Policies are defined in a locked configuration file. The daemon and its policies are protected by OS immutability flags.
4. Fail-closed: if the guardian daemon crashes or is unreachable, ALL executions are blocked. The agent cannot bypass it.
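The fail-closed behavior can be illustrated from the caller's side. The sketch below assumes the local channel is a Unix domain socket carrying one JSON line per request and reply. The text specifies neither the transport nor the wire format, so both are hypothetical, as is the function name.

```python
import json
import socket


def request_approval(command: str, socket_path: str, timeout: float = 5.0) -> bool:
    """Ask the guardian for a verdict. Any failure counts as a denial."""
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            sock.connect(socket_path)
            sock.sendall(json.dumps({"command": command}).encode() + b"\n")
            with sock.makefile("r") as reader:
                reply = reader.readline()
        return json.loads(reply).get("decision") == "allow"
    except (OSError, ValueError, AttributeError):
        # Daemon down, socket missing, timeout, or malformed reply: fail closed.
        return False
```

The only path that returns `True` is an explicit, well-formed "allow" from a reachable daemon; every error path denies.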
WHY IT MATTERS
An AI agent with shell access can do anything. The guardian ensures that even a compromised or confused agent cannot execute destructive commands without human approval. It's the last line of defense.
iMessage watcher daemon
Permanent daemon (launchd)
Replaces native iMessage handling with a controlled, isolated message processing pipeline.
HOW IT WORKS
1. Polls the system iMessage database every few seconds for new messages. Native iMessage integration is disabled.
2. Each incoming message spawns a fresh one-shot sub-agent: new context, scoped tools, 300s timeout. When the sub-agent finishes, it's destroyed.
3. A 24h greeting cooldown per contact prevents spam. Project context is auto-injected based on the contact.
4. The main agent never sees raw iMessage content -- only the sub-agent does. If the sub-agent is compromised by a message, the damage is contained.
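The one-shot pattern in step 2 can be approximated with a separate process per message. This is a sketch under assumptions: the text does not say sub-agents are operating-system processes, and `agent_argv` stands in for whatever command would launch one.

```python
import subprocess

SUBAGENT_TIMEOUT_SECONDS = 300


def handle_message(message_text: str, agent_argv: list,
                   timeout: float = SUBAGENT_TIMEOUT_SECONDS) -> str:
    """Run one disposable sub-agent process for one message and return its reply."""
    try:
        result = subprocess.run(
            agent_argv,
            input=message_text,      # raw content goes only to the sub-agent
            capture_output=True,
            text=True,
            timeout=timeout,         # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return ""
    if result.returncode != 0:
        return ""
    return result.stdout.strip()
```

Nothing from one call survives into the next, so a message that derails its sub-agent costs at most one empty reply.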
WHY IT MATTERS
iMessage is an attack vector: anyone who knows the phone number can send prompt injections. The watcher isolates each message into a disposable sandbox. A malicious message kills one sub-agent -- the main agent is unaffected.
permission-classifier
Active continuously (inline) (NEW)
Second control layer above exec-guardian. Evaluates each action's risk based on type, context, and user history.
HOW IT WORKS
1. Every action passes through the classifier before reaching exec-guardian. The classifier analyzes: action type (read/write/delete), scope (local/remote), reversibility.
2. The history of permissions granted or denied by the user is considered. If the user has already authorized this type of action 10 times, the confirmation threshold drops.
3. Three possible outcomes: autonomous execution (low risk), confirmation request (medium risk), or refusal with explanation (high risk).
4. The classifier gradually adapts to each user's preferences through a learning mechanism.
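A toy version of this scoring is sketched below. The weights, thresholds, and the rule that history never relaxes a refusal are illustrative assumptions. Only the three inputs, the three outcomes, and the figure of 10 prior approvals come from the text.

```python
from collections import Counter
from dataclasses import dataclass

ACTION_RISK = {"read": 0, "write": 2, "delete": 4}   # hypothetical weights


@dataclass(frozen=True)
class Action:
    kind: str          # "read", "write", or "delete"
    remote: bool       # scope: local or remote
    reversible: bool


def risk_score(action: Action) -> int:
    score = ACTION_RISK.get(action.kind, 4)   # unknown kinds are scored like deletes
    if action.remote:
        score += 2
    if not action.reversible:
        score += 2
    return score


class PermissionClassifier:
    CONFIRM_AT = 3     # hypothetical thresholds
    REFUSE_AT = 7
    TRUST_AFTER = 10   # approvals after which confirmation is relaxed

    def __init__(self):
        self.approvals = Counter()

    def record_approval(self, action: Action) -> None:
        self.approvals[action.kind, action.remote] += 1

    def classify(self, action: Action) -> str:
        score = risk_score(action)
        if score >= self.REFUSE_AT:
            return "refuse"            # history never relaxes a refusal here
        if self.approvals[action.kind, action.remote] >= self.TRUST_AFTER:
            score -= 2                 # repeated approvals lower the effective risk
        return "confirm" if score >= self.CONFIRM_AT else "execute"
```

With these numbers a local, reversible delete asks for confirmation at first and runs autonomously after ten recorded approvals, while a remote, irreversible delete is always refused.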
WHY IT MATTERS
Exec-guardian applies static rules. The permission classifier adds a contextual intelligence layer: it learns when to ask and when to let through, reducing noise while maintaining security.
hooks (PreToolUse)
Event-driven triggering (NEW)
Extensible interception system. A PreToolUse hook can block, modify, or log any action before execution.
HOW IT WORKS
1. Hooks are declared in JSON within the agent's configuration. Each hook listens to a specific event (on_tool_call, pre_response, etc.).
2. A PreToolUse hook intercepts every tool call before execution. It can block the action, modify it, or log it in an audit trail.
3. Hooks are chainable: multiple hooks can listen to the same event, executing in the defined priority order.
4. Over 15 events cover the agent's entire lifecycle, from incoming message to final response.
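To make the chaining concrete, the sketch below runs a small hook chain. The JSON field names (`event`, `priority`, `action`, `when_tool`, `set`) and the lower-number-first ordering are invented for illustration. The actual hook schema is not shown in the text.

```python
import json

# Hypothetical declarative hook configuration.
HOOKS_JSON = """
[
  {"event": "PreToolUse", "priority": 20, "action": "log"},
  {"event": "PreToolUse", "priority": 10, "action": "block", "when_tool": "deploy"}
]
"""


def run_hooks(event, call, hooks, audit_log):
    """Apply matching hooks in ascending priority; return the call, or None if blocked."""
    matching = sorted((h for h in hooks if h["event"] == event),
                      key=lambda h: h["priority"])
    for hook in matching:
        # Conditional hooks only fire for the tool they name.
        if "when_tool" in hook and hook["when_tool"] != call["tool"]:
            continue
        if hook["action"] == "block":
            return None
        if hook["action"] == "log":
            audit_log.append(dict(call))
        elif hook["action"] == "modify":
            call = {**call, **hook["set"]}
    return call
```

With the configuration above, a `deploy` call is blocked before the logging hook runs, while a `read_file` call passes through and leaves one audit entry.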
WHY IT MATTERS
Hooks let you customize security without modifying the agent's code. An audit hook records every action, a compliance hook blocks unauthorized actions, a notification hook alerts in real-time.
verification-agent
Continuous passive observation (NEW)
Independent read-only agent that audits other agents' actions in real time. Cannot modify files or execute commands.
HOW IT WORKS
1. The verification agent receives a real-time stream of all actions executed by other agents (commands, file modifications, API calls).
2. Each action is evaluated against a set of rules: permission compliance, consistency with instructions, adherence to best practices.
3. The system identifies suspicious patterns: unusual action repetition, access to unexpected resources, inconsistent results.
4. Detected anomalies generate alerts classified by severity: info, warning, critical. Critical alerts interrupt the current action.
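Two of these checks, permission compliance and unusual repetition, can be sketched as a small read-only observer. The class names, the repeat limit, and the mapping of rules to severities are assumptions chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    severity: str   # "info", "warning", or "critical"
    reason: str


class Verifier:
    REPEAT_LIMIT = 3   # hypothetical: more repeats than this looks suspicious

    def __init__(self, allowed_tools):
        self.allowed_tools = allowed_tools   # agent name -> set of permitted tools
        self.seen = Counter()

    def observe(self, agent: str, tool: str, target: str) -> list:
        """Evaluate one observed action and return any alerts it raises."""
        alerts = []
        if tool not in self.allowed_tools.get(agent, set()):
            alerts.append(Alert("critical", f"{agent} used unpermitted tool {tool}"))
        self.seen[agent, tool, target] += 1
        if self.seen[agent, tool, target] > self.REPEAT_LIMIT:
            alerts.append(Alert("warning", f"{agent} repeated {tool} on {target}"))
        return alerts


def should_interrupt(alerts) -> bool:
    """Only critical alerts stop the current action."""
    return any(a.severity == "critical" for a in alerts)
```

The observer only reads the action stream and returns alerts. Acting on a critical alert is left to whatever component owns the running action.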
WHY IT MATTERS
An agent that can do everything and checks itself is not reliable. An independent, read-only agent that observes from the outside provides a fundamental layer of reliability. It's the separation of powers principle applied to AI.