Cost Tracking Hub
This is the aggregating hub for all LLM and vendor cost instrumentation across OpenClaw. It references every integration that incurs cost, documents the CHOKEPOINT-1 tool_calls table, the cost-monitor cron, Discord #ops alert routing, and per-vendor cost rates. Read this hub when diagnosing cost overruns, auditing LLM spend, verifying CHOKEPOINT-1 compliance, or adding a new cost-incurring integration. It does not own the integrations — it aggregates their cost signals.
CHOKEPOINT-1 — tool_calls (Supabase CCP)
Per CLAUDE.md POSTGRES-CHOKEPOINT Phase 1.4: Every LLM call MUST write a tool_calls row before returning. This is the single cost audit surface. Any bypass = governance violation.
Schema
| Column | Type | Required | Notes |
|---|---|---|---|
| id | uuid | Y | PK |
| agent_id | text | Y (NOT NULL) | Agent making the call |
| tool_name | text | Y (NOT NULL) | Tool or model identifier |
| parameters | jsonb | Y | MUST include tier + model keys (CHECK constraint) |
| result_ok | bool | Y (NOT NULL) | success/failure |
| latency_ms | int | Y (NOT NULL) | wall-clock latency |
| called_at | timestamptz | Y (NOT NULL) | call timestamp |
| cost_usd | numeric | nullable | NULL = Max plan flat-rate; >0 = API-tier paid |
| tokens_in | int | nullable | NULL = Max plan |
| tokens_out | int | nullable | NULL = Max plan |
| cache_read_tokens | int | nullable | prompt cache reads |
| cache_write_tokens | int | nullable | prompt cache writes |
NULL cost_usd is intentional — signals flat-rate Max plan call (not an error).
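To make the schema rules concrete, here is a minimal sketch of a pre-insert check. The helper names `validateToolCallRow` and `billingKind` are hypothetical and not part of the OpenClaw codebase. The real enforcement is the NOT NULL and CHECK constraints in Postgres, so this only mirrors them on the client side.

```javascript
// Hypothetical helpers; the authoritative rules are the Postgres constraints.
// Columns marked "Y" in the schema table (id is assumed to be generated by the database).
const REQUIRED_COLUMNS = [
  'agent_id', 'tool_name', 'parameters', 'result_ok', 'latency_ms', 'called_at',
];

// Returns a list of human-readable problems; an empty list means the row looks valid.
function validateToolCallRow(row) {
  const errors = [];
  for (const col of REQUIRED_COLUMNS) {
    if (row[col] === null || row[col] === undefined) {
      errors.push(`${col} is required`);
    }
  }
  // parameters MUST include both tier and model keys.
  const params = row.parameters;
  if (params && typeof params === 'object') {
    for (const key of ['tier', 'model']) {
      if (!(key in params)) errors.push(`parameters.${key} is required`);
    }
  }
  return errors;
}

// NULL cost_usd marks a flat-rate Max plan call; a positive value marks an API-tier call.
function billingKind(row) {
  if (row.cost_usd === null || row.cost_usd === undefined) return 'flat-rate';
  return row.cost_usd > 0 ? 'api-tier' : 'unknown';
}
```

A row with `cost_usd: null` passes validation and is classified as `flat-rate`, which matches the note above that NULL is not an error.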
Migration files:
- workspace/migrations/2026-05-02-001-infra-config-changes.sql
- workspace/migrations/2026-05-02-002-tool-calls-not-null.sql
Health check
tool-calls-health-check.timer (every 5 min) compares Portkey call count vs tool_calls insert count. If delta >10% → Discord #ops alert.
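The drift rule can be sketched as follows. This is not the contents of tool-calls-health-check.js. It assumes that "delta" means the absolute count difference relative to the Portkey count, and the function names are hypothetical.

```javascript
// Hypothetical sketch of the drift rule; the real check lives in tool-calls-health-check.js.
// Assumption: delta = |portkeyCalls - toolCallInserts| / portkeyCalls.
function callCountDelta(portkeyCalls, toolCallInserts) {
  if (portkeyCalls === 0) {
    // No proxied calls in the window: any insert at all is unexplained drift.
    return toolCallInserts === 0 ? 0 : Infinity;
  }
  return Math.abs(portkeyCalls - toolCallInserts) / portkeyCalls;
}

// Alert only when the delta is strictly greater than the threshold (10% by default).
function shouldAlert(portkeyCalls, toolCallInserts, threshold = 0.10) {
  return callCountDelta(portkeyCalls, toolCallInserts) > threshold;
}
```

With 100 Portkey calls in a window, 95 inserts stays quiet and 85 inserts triggers the #ops alert.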
node /home/opsadmin/.openclaw/workspace/scripts/tool-calls-health-check.js
Fallback on Postgres outage
If Supabase CCP is unreachable, write the tool_calls row to /tmp/openclaw/tool-calls-fallback.jsonl instead. Drain the queue back into tool_calls within 1h or escalate.
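A minimal sketch of the fallback write path, under stated assumptions: `insertRow` is a caller-supplied async function standing in for the Supabase insert, and queue age is judged by the `called_at` of the first queued row. Both choices and both function names are illustrative, not the existing implementation.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Path and 1h deadline come from this section; everything else is a hypothetical sketch.
const FALLBACK_PATH = '/tmp/openclaw/tool-calls-fallback.jsonl';
const MAX_QUEUE_AGE_MS = 60 * 60 * 1000;

// Try Postgres first; on any failure, append the row as one JSON line so it is not lost.
async function writeToolCall(row, insertRow, fallbackPath = FALLBACK_PATH) {
  try {
    await insertRow(row);
    return 'postgres';
  } catch {
    fs.mkdirSync(path.dirname(fallbackPath), { recursive: true });
    fs.appendFileSync(fallbackPath, JSON.stringify(row) + '\n');
    return 'fallback';
  }
}

// Assumption: the first line is the oldest row, so its called_at gives the queue age.
function fallbackQueueIsStale(fallbackPath = FALLBACK_PATH, now = Date.now()) {
  if (!fs.existsSync(fallbackPath)) return false;
  const firstLine = fs
    .readFileSync(fallbackPath, 'utf8')
    .split('\n')
    .find((line) => line.trim() !== '');
  if (!firstLine) return false;
  const oldest = Date.parse(JSON.parse(firstLine).called_at);
  return now - oldest > MAX_QUEUE_AGE_MS;
}
```

Appending one JSON object per line keeps the drain step simple: each line can be parsed and replayed independently.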
LLM cost tiers
Primary — Anthropic Max Plan (flat-rate)
Two Max plan seats active:
| Seat | Account | Proxy port | Agents served |
|---|---|---|---|
| Primary | henryRERI | :18900 | Main interactive session |
| Secondary | teamsteph@betterfiles.com | :18903 (anthropic-max-router) | 40 agents |
Cost: Flat-rate monthly subscription (no per-token charges).
Signal in tool_calls: cost_usd = NULL, tokens_in = NULL, tokens_out = NULL.
Hub: anthropic
Routing layer — Portkey Proxy
Portkey (127.0.0.1:18900 primary, 127.0.0.1:18903 secondary) is the cost chokepoint — all LLM calls flow through it.
- portkey-proxy/proxy.js: proxy implementation
- Fix #1 (Kimi 404): per-model gating prevents 404 cascades
- Fix #16 (cache_control): max-plan path respects cache_control properly
- CHOKEPOINT-1 enforcement: tool-calls-health-check.timer every 5 min
- Hub: portkey
Fallback — OpenRouter
Used when Anthropic is unreachable or for budget routing.
Cost: API-tier per-token; rates vary by model routed.
Hub: openrouter (Wave 2)
Per-vendor cost rates
| Vendor | Cost model | Typical monthly | Notes |
|---|---|---|---|
| Anthropic (Max plan) | Flat-rate subscription | $200/seat × 2 | NULL in tool_calls |
| OpenRouter fallback | ~$0.002-0.015 / 1K tokens | Variable | Logged with cost_usd >0 |
| Voyage-4 embeddings | ~$0.06 / 1M tokens | ~$5-15 | Via supabase vec0 |
| Apollo (enrichment) | Credit-based; ~$0.01/contact | ~$10-50 | Via il-marketplace-pull |
| Hunter.io (email enrichment) | 500 free/mo; $49/mo paid | ~$49 | Via IL enrichment pipeline |
| SalesMsg SMS | ~$0.01/msg | ~$50-200 | Via salesmsg |
| OpenPhone SMS | Included in plan | Flat | Via openphone-quo |
| Hetzner VPS | CCX33 ~€20/mo | ~$22 | hetzner |
| Supabase CCP | Pro plan; usage-based above free tier | ~$25+ | supabase |
| GitHub Actions | Free tier (private repo) | ~$0 | github |
All costs reported to Discord #ops. See alert routing below.
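The per-token rates in the table above convert to spend with simple arithmetic. The sketch below uses the approximate table values. The function names are hypothetical, and the results are estimates rather than billing data.

```javascript
// Approximate rates copied from the cost table; treat results as estimates only.
function tokenCostUsd(tokens, usdPerUnit, tokensPerUnit) {
  return (tokens / tokensPerUnit) * usdPerUnit;
}

// OpenRouter: the table gives a range per 1K tokens, so return a low and a high bound.
function openRouterCostRange(tokens) {
  return {
    low: tokenCostUsd(tokens, 0.002, 1000),
    high: tokenCostUsd(tokens, 0.015, 1000),
  };
}

// Voyage-4 embeddings: roughly $0.06 per 1M tokens.
function voyageCostUsd(tokens) {
  return tokenCostUsd(tokens, 0.06, 1_000_000);
}
```

For example, 500K tokens routed through OpenRouter lands between about $1.00 and $7.50, and 100M embedding tokens on Voyage comes to about $6, which sits inside the ~$5-15 monthly range in the table.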
cost-monitor cron
cost-monitor.timer runs nightly to aggregate tool_calls spend and compare against thresholds.
| Alert | Threshold | Channel |
|---|---|---|
| LLM overage | >10% delta Portkey vs tool_calls | Discord #ops |
| Daily spend spike | cost_usd sum > $X in 24h | Discord #ops |
| Voyage embedding burn | embedding_cache miss rate >50% | Discord #ops |
| Fallback queue stale | /tmp/openclaw/tool-calls-fallback.jsonl >1h old | Discord #ops |
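The daily spend spike rule can be sketched as below. Whether cost-monitor.js exists is still unverified, so this is a hypothetical illustration. `thresholdUsd` stands in for the unspecified $X, and rows are assumed to arrive as plain objects with `called_at` and `cost_usd` fields.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Sum cost_usd over the trailing 24h. NULL cost_usd (flat-rate Max plan) contributes 0.
function dailySpendUsd(rows, now = Date.now()) {
  const windowStart = now - DAY_MS;
  return rows
    .filter((r) => {
      const t = Date.parse(r.called_at);
      return t >= windowStart && t <= now;
    })
    .reduce((sum, r) => sum + Number(r.cost_usd ?? 0), 0);
}

// thresholdUsd is a placeholder for the $X value, which this hub does not specify.
function spendSpikeAlert(rows, thresholdUsd, now = Date.now()) {
  const spend = dailySpendUsd(rows, now);
  return { spend, alert: spend > thresholdUsd };
}
```

Because flat-rate rows carry NULL cost, only API-tier calls such as the OpenRouter fallback can trip this alert.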
Note: The main agent owns ~80% of cron and is a bottleneck. Phase 11.2 (OSIL) plans to distribute cron ownership across specialist agents. This is flagged as an architectural debt item.
Main agent cron bottleneck
Drift finding (2026-05-01 Visual Mapping): The main agent owns ~80% of all cron timers (62 total). This is a bottleneck and single point of failure. OSIL Phase 2 will distribute cost-monitor, tool-calls-health-check, and friction-report timers to purpose-built agents.
Relevant plan: openclaw-self-improvement-layer-2026-05-03 — OSIL Phase 2 cron redistribution.
CHOKEPOINT-2 — infra_config_changes
Config changes (Portkey routing edits, model routing changes, agent route changes) MUST write an infra_config_changes row BEFORE going live.
Schema: change_type, target_system, after_state, reason, approved_by, applied_via (+ before_state, rollback_command, smoke_test_result).
Migration: workspace/migrations/2026-05-02-001-infra-config-changes.sql
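As an illustration of the write-before-live rule, the sketch below builds an infra_config_changes row and refuses to proceed when a field is missing. It assumes the six fields listed before the parentheses are required and the three inside are optional. The helper name `buildConfigChangeRow` is hypothetical.

```javascript
// Assumption: fields outside the parentheses are required, those inside are optional.
const CONFIG_CHANGE_REQUIRED = [
  'change_type', 'target_system', 'after_state', 'reason', 'approved_by', 'applied_via',
];
const CONFIG_CHANGE_OPTIONAL = ['before_state', 'rollback_command', 'smoke_test_result'];

// Hypothetical helper: returns a row with only known columns, or throws if incomplete.
function buildConfigChangeRow(input) {
  const missing = CONFIG_CHANGE_REQUIRED.filter(
    (field) => input[field] === undefined || input[field] === null || input[field] === '',
  );
  if (missing.length > 0) {
    throw new Error(`infra_config_changes row missing: ${missing.join(', ')}`);
  }
  const row = {};
  for (const field of [...CONFIG_CHANGE_REQUIRED, ...CONFIG_CHANGE_OPTIONAL]) {
    if (input[field] !== undefined) row[field] = input[field];
  }
  return row;
}
```

Throwing before any change is applied keeps the ordering explicit: no complete row, no go-live.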
Supabase CCP — cost-relevant tables
| Table | Purpose | Cost signal |
|---|---|---|
| tool_calls | LLM call audit trail | PRIMARY cost surface (CHOKEPOINT-1) |
| infra_config_changes | Config change audit | Config-change gate (CHOKEPOINT-2) |
| openphone_events | SMS send audit | SMS cost tracking |
| ib_blast_history | Blast dedup + history | IB blast cost signal |
| processed_webhook_events | Webhook dedup | 24h TTL |
| webhook_audit_log | Security audit | non-blocking write |
Supabase project: CCP (svueekfvfrvhylxygktb) — PRODUCTION. All cost writes go here.
Hub: supabase
Discord ops alert routing
All cost alerts route to Discord #ops channel. Sources:
| Alert source | Timer | Condition |
|---|---|---|
| tool-calls-health-check | every 5 min | Portkey vs tool_calls delta >10% |
| cost-monitor | nightly | Spend thresholds |
| security-audit-funnel | Monday 06:00 PT | Sig failure >5%, zero-event endpoints |
| weekly-index-audit | weekly | Governance logs stale >14d |
| friction-report | nightly 02:00 PT | P0/P1/P2 friction backlog |
Hub: discord
Adding a new cost-incurring integration
When a new vendor is added that incurs per-call or per-unit cost:
- Add vendor row to the cost rates table above
- Ensure every call writes a tool_calls row (CHOKEPOINT-1)
- Add cost threshold to cost-monitor.timer config
- Update the relevant integration hub with cost field
- Register alert in Discord #ops via discord
- Write infra_config_changes row (CHOKEPOINT-2)
Components
- workspace/scripts/tool-calls-health-check.js: Portkey vs tool_calls drift check (every 5 min)
- portkey-proxy/proxy.js: LLM routing proxy (ports 18900, 18903)
- workspace/migrations/2026-05-02-001-infra-config-changes.sql: CHOKEPOINT-2 schema
- workspace/migrations/2026-05-02-002-tool-calls-not-null.sql: CHOKEPOINT-1 NOT NULL constraints
- workspace/scripts/cost-monitor.js: nightly cost aggregation (if exists; flag if missing)
- /tmp/openclaw/tool-calls-fallback.jsonl: fallback queue on Postgres outage
Cross-links
Integrations aggregated by this hub
- anthropic — LLM primary (Max plan, flat-rate)
- portkey — LLM proxy; CHOKEPOINT-1 enforcement; dual ports
- supabase: tool_calls + infra_config_changes tables
- discord: ops cost alert channel
- salesmsg — SMS cost (~$0.01/msg)
- openphone-quo — SMS/calls (flat plan)
- hetzner — VPS hosting cost
- 1password — credential layer (cost: subscription)
- github — vault + CI/CD (free tier)
- aws — EC2 Mac Ultra (on-demand; paused 2026-05-02)
- investorlift — IL scraping (no direct API cost; Mac Ultra dependency)
Wave 2 integrations (pending hubs)
- openrouter — OpenRouter fallback (per-token, Wave 2)
- voyage — Voyage-4 embeddings (~$0.06/1M tokens, Wave 2)
Plans that govern this
- openclaw-fragmentation-fix-2026-05-01 — CHOKEPOINT gates §Phase 1.4
- openclaw-self-improvement-layer-2026-05-03 — OSIL Phase 2 cron redistribution
Feedback rules
- feedback_chokepoint_principle — single-surface writes; no bypasses
- feedback_dual_write_required — dual-write pattern enforcement
- feedback_cost_root_required — always trace cost to root cause
KB / source docs
System maps
- vm-cost-flow — cost signal flow diagram
- vm-portkey-routing — LLM routing topology
Related cluster — LLM routing
- portkey — proxy layer (CHOKEPOINT anchor)
- anthropic — primary LLM provider
- supabase — data substrate (tool_calls table)
- discord — ops alert channel
Open issues / TODOs
- cost-monitor.js existence needs verification: confirm it exists in workspace/scripts/ or flag as missing
- Main agent owns ~80% of cron (bottleneck); OSIL Phase 2 will redistribute
- PropStream daily quota not tracked in cost monitoring — add threshold
- Apollo + Hunter costs need baseline month measurement (no historical in tool_calls)
- Voyage embedding cache hit rate not tracked — add to health check
Recent activity
- 2026-05-03: hub created by W1-S9
- 2026-05-03: CHOKEPOINT-1/2, all integration cost rates, alert routing, and open issues documented