Incident Timeline — P0/P1 History
Reverse-chronological registry of every P0/P1 incident that produced a governance gate, feedback rule, or structural change in OpenClaw. Each entry links its causal chain: incident → derived rule → gate it enforces.
Asymmetry note: this hub links to g-gates-network, but the gates hub does NOT link back here. Gates are standing rules and are not governed by past incidents (asymmetric: true in frontmatter).
Coverage period: 2026-03 → present. Severity threshold: P0 = safety/revenue/data integrity risk; P1 = ops disruption or governance gap. P2 and below are not catalogued here.
Quick reference
| Field | Value |
|---|---|
| Coverage period | 2026-03 → 2026-05 (ongoing) |
| Incident count | 10 documented (P0: 4 / P1: 6) |
| Derived feedback rules | 12+ rules across 10 incidents |
| Last incident | 2026-05-03 (salesmsg-gateway-creds-exposure) |
| Severity threshold | P0 + P1 only |
| Cross-links | g-gates-network (one-way) · memory-rule-clusters · action-gate |
Incident timeline (reverse chronological)
| Date | Incident ID | Severity | One-line summary | Gate(s) derived | Rule(s) derived | Resolved? |
|---|---|---|---|---|---|---|
| 2026-05-03 | salesmsg-gateway-creds-exposure | P0 | Plaintext ANTHROPIC_API_KEY hardcoded in /etc/systemd/system/salesmsg-gateway.service; world-readable for 24+ days | G-NO-PLAINTEXT-CREDS | feedback_no_plaintext_creds | 🔴 Rotation pending |
| 2026-05-02 | vault-rsync-wipe-cascade | P1 | Wave 0 overnight outputs deleted by --delete flag in vault rsync; 15 maps + 3 hubs wiped; Wave 1 re-authored | G-SERVICE-PRE-START-DOC (rsync flag audit) | feedback_archive_not_delete | ✅ rsync flag patched |
| 2026-05-02 | aws-mac-impaired | P1 | EC2 Mac Ultra instance entered impaired AWS health state at 22:15 UTC; EBS preserved; investorlift-cookie-refresh.service failed downstream | G-FAILED-SERVICE-MTTR | feedback_failed_service_mttr | 🟡 Impaired cleared; VPS Playwright migration pending |
| 2026-05-01 | vps-reboot-5-internal-cascade | P1 | Reboot #5 was internal oomd cascade (NOT datacenter event); timer stampede + EADDRINUSE port flap triggered OOM kill; 4 reboots in 12h | G-FAILED-SERVICE-MTTR · CHOKEPOINT-1 | feedback_substrate_right_size_to_working_set · feedback_user_slice_cap_pattern | 🟡 F1-F4 applied; Phase 1 remediation in flight |
| 2026-05-01 | cost-monitor-v3-bypass | P1 | cost-monitor v3 made LLM calls that bypassed tool_calls Postgres writes; cost audit trail silent for those calls; root cause: missing CHOKEPOINT-1 enforcement at proxy | G-CHOKEPOINT · G-DUAL-WRITE | feedback_chokepoint_principle · feedback_dual_write_required | ✅ CHOKEPOINT-1 enforced at portkey-proxy layer |
| 2026-04-16 | aurora-soul-overwrite | P0 | Claude proposed 350-line system prompt that would have overwritten Aurora’s “Build, don’t coordinate” principle + Intelligence Layer Rules 1-6; Henry caught it; redirected to skill layer | G-PLAN-INDEX-REQUIRED (skill vs SOUL gate) | feedback_agent_identity_first | ✅ SOUL.md untouched; quo-thread-orchestrator skill layer added |
| 2026-04 | 9870-wrong-inbox | P0 | 9,870 messages routed to wrong inbox; root cause: incident-derived rule existed in MEMORY.md below the 200-line auto-load fold — invisible at session start | G-MEMORY-FOLD-PROTECTION | feedback_memory_fold_protection | ✅ Rule elevated above line 150 in MEMORY.md |
| 2026-04 | 374-dup-SMS | P0 | 374 duplicate SMS messages sent; rule preventing duplication existed but was below MEMORY.md fold — not loaded; blast-safety gate not enforced | G-MEMORY-FOLD-PROTECTION · G-BLOCKER-SURFACING | feedback_memory_fold_protection · feedback_never_send_without_auth | ✅ Dedup gate added above fold; blast-safety wired |
| 2026-03 | macos-firewall-tailscaled | P1 | macOS system firewall on EC2 Mac Ultra blocked tailscaled daemon despite per-user Tailscale agent running; SSH sessions silently dropped after VPN drop; hours of debug | n/a (no gate derived — feedback rule only) | feedback_macos_firewall_blocks_tailscaled | ✅ LaunchDaemon system-level tailscaled plist installed |
| 2026-03 | obsidian-sync-external-changes | P1 | Obsidian Sync’s “external changes” prompt blocked vault auto-sync after Mac edits; no rule existed for modal dialog in headless context | n/a (feedback rule only) | feedback_obsidian_sync_external_changes | ✅ Obsidian Git workflow patched; .obsidian config synced |
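The cost-monitor-v3-bypass row turns on the chokepoint idea: every LLM call passes through one function that records the call, so nothing can reach the model without leaving an audit row. The following is a minimal sketch of that idea, not the portkey-proxy implementation. It assumes a hypothetical `tool_calls` schema (`caller`, `prompt_chars`, `status`), uses an in-memory SQLite database as a stand-in for Postgres, and treats `backend` as a placeholder for whatever actually talks to the model.

```python
import sqlite3
from typing import Callable


def make_chokepoint(
    db: sqlite3.Connection, backend: Callable[[str], str]
) -> Callable[[str, str], str]:
    """Return a call_llm function that is the only path to `backend`."""
    # Hypothetical schema; the real tool_calls table is not described in this hub.
    db.execute(
        "CREATE TABLE IF NOT EXISTS tool_calls ("
        "id INTEGER PRIMARY KEY, caller TEXT, prompt_chars INTEGER, status TEXT)"
    )

    def call_llm(caller: str, prompt: str) -> str:
        # Write the audit row first: if the insert fails, the call never happens.
        cur = db.execute(
            "INSERT INTO tool_calls (caller, prompt_chars, status) "
            "VALUES (?, ?, 'started')",
            (caller, len(prompt)),
        )
        db.commit()
        row_id = cur.lastrowid

        status = "error"  # overwritten only when the backend returns normally
        try:
            reply = backend(prompt)
            status = "ok"
            return reply
        finally:
            # Record the outcome whether the backend succeeded or raised.
            db.execute(
                "UPDATE tool_calls SET status = ? WHERE id = ?", (status, row_id)
            )
            db.commit()

    return call_llm
```

A caller such as the cost monitor would receive only `call_llm`, never `backend` itself, which is what makes the audit write unavoidable in this sketch.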
Derived feedback rules (causal map)
One-way. These rules point back to the incidents that caused them. Do NOT make the gate hub link back here.
| Rule | Incident (cause) | Gate enforced | Cluster |
|---|---|---|---|
| feedback_no_plaintext_creds | salesmsg-gateway-creds-exposure (2026-05-03) | G-NO-PLAINTEXT-CREDS | security |
| feedback_credentials_repeat_exposure | salesmsg-gateway-creds-exposure + TOOLS.md 15-key leak (2026-05-03) | G-NO-PLAINTEXT-CREDS | security |
| feedback_archive_not_delete | vault-rsync-wipe-cascade (2026-05-02) | G-SERVICE-PRE-START-DOC (rsync audit) | ops |
| feedback_failed_service_mttr | aws-mac-impaired (2026-05-02) + investorlift-cookie-refresh failure | G-FAILED-SERVICE-MTTR | infra |
| feedback_substrate_right_size_to_working_set | vps-reboot-5-internal-cascade (2026-05-01) | G-FAILED-SERVICE-MTTR | infra |
| feedback_user_slice_cap_pattern | vps-reboot-5-internal-cascade (2026-05-01) | G-FAILED-SERVICE-MTTR | infra |
| feedback_chokepoint_principle | cost-monitor-v3-bypass (2026-05-01) | G-CHOKEPOINT | data |
| feedback_dual_write_required | cost-monitor-v3-bypass (2026-05-01) | G-DUAL-WRITE | data |
| feedback_agent_identity_first | aurora-soul-overwrite (2026-04-16) | G-PLAN-INDEX-REQUIRED | agent |
| feedback_memory_fold_protection | 9870-wrong-inbox + 374-dup-SMS (2026-04) | G-MEMORY-FOLD-PROTECTION | governance |
| feedback_never_send_without_auth | 374-dup-SMS (2026-04) | G-BLOCKER-SURFACING | messaging |
| feedback_macos_firewall_blocks_tailscaled | macos-firewall-tailscaled (2026-03) | n/a | infra |
Recurrence patterns
Three failure modes have each appeared at least twice. Recurrence signals systemic risk.
Pattern 1 — Invisible rules (G-MEMORY-FOLD-PROTECTION)
Incidents: 9870-wrong-inbox + 374-dup-SMS (both 2026-04)
Mechanism: Rule existed in MEMORY.md but below the 200-line auto-load boundary → invisible at session start → not enforced → repeat incident.
Fix applied: All incident-derived rules (incident_derived: true) must live above line 150 in MEMORY.md. Enforced by G-MEMORY-FOLD-PROTECTION.
Monitor: Run wc -l MEMORY.md; if the file exceeds 200 lines, move critical rules above line 150.
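The fold check can be automated rather than run by hand. The sketch below is one possible shape for it. It assumes that each incident-derived rule carries the literal text `incident_derived: true` on its own line in MEMORY.md, which this hub does not confirm; the 200-line fold and the line-150 limit are taken from the text above.

```python
from dataclasses import dataclass

FOLD_LINE = 200  # auto-load boundary stated in this hub
SAFE_LINE = 150  # incident-derived rules must sit above this line
MARKER = "incident_derived: true"  # assumption: tag appears on the rule's own line


@dataclass
class FoldReport:
    total_lines: int
    over_fold: bool       # True when the file is longer than FOLD_LINE
    misplaced: list[int]  # 1-indexed lines of tagged rules at or below SAFE_LINE


def check_memory_fold(text: str) -> FoldReport:
    """Report file length and any tagged rule that is not above SAFE_LINE."""
    lines = text.splitlines()
    misplaced = [
        n
        for n, line in enumerate(lines, start=1)
        if MARKER in line and n >= SAFE_LINE
    ]
    return FoldReport(len(lines), len(lines) > FOLD_LINE, misplaced)
```

Passing the contents of MEMORY.md to `check_memory_fold` returns the line numbers that need to move; an empty `misplaced` list means every tagged rule is above line 150.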
Pattern 2 — Plaintext credentials in tracked files (G-NO-PLAINTEXT-CREDS)
Incidents: salesmsg-gateway-creds-exposure (2026-05-03) + TOOLS.md leak of 15+ keys exposed for 24+ days (2026-05-03)
Mechanism: Operator convenience → paste credential directly into systemd unit or TOOLS.md → file becomes readable by any process with host access.
Fix applied: G-NO-PLAINTEXT-CREDS gate + NemoClaw L7 cred-proxy pattern (pending B1-B6 ratification). See credential-security-policy.
Monitor: security-audit-funnel.js --dry-run weekly; grep for sk-[A-Za-z0-9]{20,} in tracked files.
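The grep step can be expressed as a small scanner. This is a sketch only and is not the logic of security-audit-funnel.js, which this hub does not describe. It starts from the `sk-[A-Za-z0-9]{20,}` pattern above and, as an assumption, widens the character class with `-` and `_` so that keys with hyphenated segments after the `sk-` prefix are also caught. Matches are redacted so the report does not repeat the leak.

```python
import re
from pathlib import Path

# Pattern from this hub, with "-" and "_" added to the character class
# (assumption) so keys with hyphenated segments after "sk-" also match.
KEY_PATTERN = re.compile(r"sk-[A-Za-z0-9_-]{20,}")


def redact(secret: str) -> str:
    """Keep only the first 6 characters so the report leaks nothing useful."""
    return secret[:6] + "..."


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, redacted_match) for every key-like string."""
    hits = []
    for n, line in enumerate(text.splitlines(), start=1):
        for match in KEY_PATTERN.finditer(line):
            hits.append((n, redact(match.group())))
    return hits


def scan_files(paths: list[Path]) -> dict[str, list[tuple[int, str]]]:
    """Scan each file; unreadable or non-UTF-8 files are skipped."""
    report = {}
    for path in paths:
        try:
            text = path.read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError):
            continue
        hits = scan_text(text)
        if hits:
            report[str(path)] = hits
    return report
```

If the tracked files live in a git repository, the `paths` list could be built from the output of `git ls-files`; a non-empty report would then be treated as a G-NO-PLAINTEXT-CREDS violation.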
Pattern 3 — Failed services silently degrading data pipelines (G-FAILED-SERVICE-MTTR)
Incidents: aws-mac-impaired (2026-05-02) → investorlift-cookie-refresh failure; vps-reboot-5-internal-cascade (2026-05-01) → 4 reboots in 12h.
Mechanism: Service enters failed/impaired state → no alert → downstream pipeline silently degrades (0 photos, 0 IL deals) → only noticed when Henry queries data.
Fix applied: G-FAILED-SERVICE-MTTR gate: any service failed for >24h must be fixed, disabled, or archived. A daily cron checks systemctl --user list-units --state=failed and alerts Discord #ops.
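A rough sketch of what the daily check might look like follows. It is not the deployed cron job. It runs the same systemctl query with `--plain --no-legend` for easier parsing, and applies the 24h limit to a `failed_since` timestamp that the caller is assumed to supply (how that timestamp is obtained, and how the Discord alert is sent, are left out because this hub does not specify them).

```python
import subprocess
from datetime import datetime, timedelta

MTTR_LIMIT = timedelta(hours=24)  # threshold stated by G-FAILED-SERVICE-MTTR


def parse_failed_units(output: str) -> list[str]:
    """Extract unit names from `systemctl list-units --plain --no-legend` output."""
    units = []
    for line in output.splitlines():
        fields = line.split()
        if fields:  # skip blank lines
            units.append(fields[0])
    return units


def list_failed_units() -> list[str]:
    """Run the query from the gate text in a machine-readable form."""
    result = subprocess.run(
        ["systemctl", "--user", "list-units", "--state=failed",
         "--plain", "--no-legend"],
        capture_output=True, text=True, check=False,
    )
    return parse_failed_units(result.stdout)


def is_overdue(failed_since: datetime, now: datetime) -> bool:
    """True once a unit has been failed for longer than the 24h limit."""
    return now - failed_since > MTTR_LIMIT
```

Every unit returned by `list_failed_units` for which `is_overdue` is true would then need one of the three outcomes named by the gate: fix, disable, or archive.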
Open issues / TODOs
- salesmsg-gateway-creds-exposure (2026-05-03): ANTHROPIC_API_KEY rotation pending — see credential-security-policy P0 queue item #1
- Per-incident deep-dive pages (Wave 7 target): 9870-wrong-inbox, 374-dup-SMS, aurora-soul-overwrite, vps-reboot-5-internal-cascade, salesmsg-gateway-creds-exposure
- Add VPS reboot #3 and #4 entries (pre-2026-05-01) — dates and triggers not recorded in SESSION-AUDIT
- investorlift-cookie-refresh.service migration from VPS Playwright → SSH Mac Ultra (G-FAILED-SERVICE-MTTR clock running since 2026-05-02)
Related cluster
Governance hubs (one-way cross-links — incident → gate, not gate → incident)
- g-gates-network — all 10 documented gates derived from these incidents
- memory-rule-clusters — 11 clusters; incident-derived rules tagged incident_derived: true
- action-gate — covers read/write confirmation rules derived from 374-dup-SMS
- credential-security-policy — P0 queue derived from salesmsg-gateway-creds-exposure
- decision-log — strategic decisions made in response to incidents
Agents involved in incidents
Plans governing remediation
- openclaw-fragmentation-fix-2026-05-01 — master remediation plan; §A1 gates
- vps-cascade-permanent-remediation-2026-05-04 — vps-reboot-5 permanent fix
- nemoclaw-audit-2026-05-03 — cred-proxy pattern (salesmsg-gateway-creds-exposure counter)
- openclaw-obsidian-vault-2026-05-02 — vault rsync fix + EC2 Mac Ultra LaunchDaemon
System maps
- governance-gates-network — visual map of all gates
- incident-response-flow — response playbook map
Recent activity
- 2026-05-04: Hub created (W3-S4, Wave 3)
- 2026-05-03: salesmsg-gateway-creds-exposure added to P0 queue
- 2026-05-02: vault-rsync-wipe-cascade + aws-mac-impaired recorded
- 2026-05-01: vps-reboot-5 + cost-monitor-v3-bypass recorded; G-CHOKEPOINT gates derived