Monitoring Deep Dive
Independent observation, radical transparency
An independent process continuously evaluates agent behavior, and the agent has no way to influence the results. The monitoring system never routes through the agent; it reports directly to the human operator.
Each machine runs a single-file Python web server that serves an auto-refreshing interface and has no external dependencies.
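As a rough illustration of what a single-file, dependency-free dashboard can look like, the sketch below uses only the Python standard library and refreshes the page with an HTML meta tag. The names render_page, DashboardHandler, and serve, the port, and the 10-second refresh interval are all assumptions made for this example; the actual dashboards may be built differently.

```python
import html
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

REFRESH_SECONDS = 10  # assumed interval; the real dashboards may use another value


def render_page(title, rows, refresh=REFRESH_SECONDS):
    """Build a minimal HTML page that asks the browser to reload itself."""
    items = "\n".join(f"<li>{html.escape(row)}</li>" for row in rows)
    return (
        "<!doctype html><html><head>"
        f'<meta http-equiv="refresh" content="{refresh}">'
        f"<title>{html.escape(title)}</title></head>"
        f"<body><h1>{html.escape(title)}</h1><ul>\n{items}\n</ul></body></html>"
    )


class DashboardHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: every GET returns a freshly rendered page."""

    def do_GET(self):
        rows = [f"rendered at {time.strftime('%H:%M:%S')}"]
        body = render_page("Dashboard", rows).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host="127.0.0.1", port=8080):
    """Start the server; blocks until interrupted."""
    HTTPServer((host, port), DashboardHandler).serve_forever()
```

Calling serve() starts the server on the assumed address. Because the refresh is done by the browser, the page needs no JavaScript and the server keeps no per-client state.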
Max Dashboard (6 tabs)
Inter-agent message logs and delivery status
Color-coded exec-guardian decisions
Pipeline status, extractions, MEMORY.md size
9-dimension scoring with trajectory analysis
Governor statistics from governor.db (SQLite)
Full site content: 9 use cases, 12 security layers
Eva Dashboard (6 tabs)
Inter-agent message logs
Exec-guardian decisions
Pipeline status
7-dimension scoring with radar chart
Governor statistics
Site content mirror
The Bridge Analytics tab reads directly from governor.db, an SQLite database maintained by the governor daemon. It provides real-time statistics on inter-agent communication volume, rate-limit events, and message delivery times.
Messages per direction (sent/received)
Rate-limit trigger frequency
Average delivery latency
STOP/GO event history
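The four statistics above could be computed with plain SQL through the standard sqlite3 module. The schema of governor.db is not described here, so the tables messages and events and their columns are purely hypothetical, and the sketch runs against an in-memory database rather than the real file.

```python
import sqlite3

# Hypothetical schema: the real governor.db layout is not documented here.
SCHEMA = """
CREATE TABLE messages (direction TEXT, sent_at REAL, delivered_at REAL);
CREATE TABLE events (kind TEXT, at REAL);
"""


def bridge_stats(conn):
    """Compute the four dashboard statistics from the assumed tables."""
    per_direction = dict(
        conn.execute("SELECT direction, COUNT(*) FROM messages GROUP BY direction")
    )
    avg_latency = conn.execute(
        "SELECT AVG(delivered_at - sent_at) FROM messages "
        "WHERE delivered_at IS NOT NULL"
    ).fetchone()[0]
    rate_limit_events = conn.execute(
        "SELECT COUNT(*) FROM events WHERE kind = 'rate_limit'"
    ).fetchone()[0]
    stop_go_history = conn.execute(
        "SELECT kind, at FROM events WHERE kind IN ('STOP', 'GO') ORDER BY at"
    ).fetchall()
    return {
        "per_direction": per_direction,
        "avg_latency": avg_latency,
        "rate_limit_events": rate_limit_events,
        "stop_go_history": stop_go_history,
    }


def make_demo_db():
    """Create an in-memory database with a few made-up rows for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO messages VALUES (?, ?, ?)",
        [("sent", 0.0, 0.5), ("sent", 1.0, 2.5), ("received", 2.0, 2.25)],
    )
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [("rate_limit", 1.0), ("STOP", 3.0), ("GO", 4.0), ("rate_limit", 5.0)],
    )
    conn.commit()
    return conn


stats = bridge_stats(make_demo_db())
```

A dashboard that only reads the database might prefer a read-only connection, which sqlite3 supports through a URI such as file:governor.db?mode=ro passed with uri=True.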
Eva uses a 7-dimension scoring model (vs 9 for Max). The radar chart dynamically adapts to the available dimensions.
correction_rate and satisfaction_rate are excluded for Eva due to insufficient session data.
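One way a radar chart can adapt to the available dimensions is to drop every dimension without data and space the remaining axes evenly around the circle. The sketch below assumes missing dimensions are represented as None; apart from correction_rate and satisfaction_rate, the dimension names (dim_a through dim_g) and all values are placeholders invented for the example.

```python
import math


def radar_axes(scores):
    """Return (dimension, angle_in_radians, value) for each dimension with data."""
    available = [(name, value) for name, value in scores.items() if value is not None]
    if not available:
        return []
    step = 2 * math.pi / len(available)  # equal angular spacing between axes
    return [(name, i * step, value) for i, (name, value) in enumerate(available)]


# Placeholder scores: two dimensions are missing, as described for Eva.
eva_scores = {
    "dim_a": 0.9, "dim_b": 0.8, "dim_c": 0.7, "dim_d": 0.85,
    "dim_e": 0.6, "dim_f": 0.75, "dim_g": 0.95,
    "correction_rate": None,
    "satisfaction_rate": None,
}
eva_axes = radar_axes(eva_scores)
```

With this approach the same function serves both agents: a 9-dimension input yields nine axes, and the 7-dimension input yields seven without any agent-specific branching.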
Trust Score -- 30-day trend
An independent Python script (stdlib only, zero dependencies) runs every 6 hours as a scheduled launchd job. It reads session JSONL files, computes 9 dimensions across 4 time windows (24h, 7d, 30d, all), and pushes the scores to Supabase PostgreSQL.
- The agent never scores itself -- completely independent process
- Trajectory over snapshot: 4 time windows for drift detection
- Robust statistics: median + MAD instead of mean + std
- Composite score 0-1 mapped to 5 trust bands
- Automatic drift alerts via iMessage when anomalies are detected
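The points on time windows, robust statistics, and trust bands can be sketched with the standard library alone. Everything specific here is an assumption: the 0.6745 scaling and 3.5 threshold follow the commonly used modified z-score convention rather than anything stated above, the five bands are simply equal-width bins with placeholder labels, and the function names are invented for the example.

```python
from statistics import median

# The four windows named above, in seconds; None means "all".
WINDOWS = {"24h": 86_400, "7d": 7 * 86_400, "30d": 30 * 86_400, "all": None}
BANDS = ["band_1", "band_2", "band_3", "band_4", "band_5"]  # placeholder labels


def mad(values):
    """Median absolute deviation around the median."""
    center = median(values)
    return median([abs(v - center) for v in values])


def robust_z(x, history):
    """Modified z-score of x against history, using median and MAD."""
    center, spread = median(history), mad(history)
    if spread == 0:
        return 0.0 if x == center else float("inf")
    return 0.6745 * (x - center) / spread


def is_drift(x, history, threshold=3.5):
    """Flag x as anomalous when its modified z-score exceeds the assumed threshold."""
    return abs(robust_z(x, history)) > threshold


def window_medians(samples, now):
    """Median score per window for (timestamp, value) samples; None if empty."""
    result = {}
    for name, span in WINDOWS.items():
        values = [v for t, v in samples if span is None or now - t <= span]
        result[name] = median(values) if values else None
    return result


def trust_band(score):
    """Map a composite score in [0, 1] to one of five equal-width bands."""
    clamped = min(max(score, 0.0), 1.0)
    return BANDS[min(int(clamped * 5), 4)]
```

Median and MAD are used because a single extreme session barely moves them, whereas it can shift a mean and inflate a standard deviation enough to hide the very anomaly being looked for. Comparing the 24h median against the 30d or all-time median from window_medians is one simple way to express trajectory rather than a single snapshot.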
