Portkey AI Gateway
Portkey is the OpenClaw LLM gateway proxy (port :18900) and the CHOKEPOINT-1 enforcement surface. Every LLM call from each of the 36 agents routes through proxy.js, which maps model → Portkey tier config, injects cache headers, and writes a tool_calls row to Supabase before returning. This file is the authoritative hub for all Portkey configuration, routing, dual-port topology, and drift detection. Read it whenever you touch model routing, cost tracking, virtual keys, or proxy behavior.
Quick reference
| Field | Value |
|---|---|
| Vendor | Portkey AI |
| URL | https://app.portkey.ai |
| KB doc | API |
| Auth method | API key (x-portkey-api-key header) |
| Auth credential | op://Aurora/portkey/api-key |
| Cred-proxy port | n/a (until B1-B6 ratified — see nemoclaw-audit-2026-05-03) |
| Webhook port | n/a |
| Webhook handler | n/a |
| Tunnel path | n/a |
| Backup/recovery | vendor-owned (Portkey cloud SaaS); no local backup |
| Primary proxy port | :18900 — henryRERI primary, all API-tier agent traffic |
| Second Max port | :18903 — teamsteph@betterfiles.com Max plan via anthropic-max-router, serves 40 agents |
| Proxy file | ~/.openclaw/portkey-proxy/proxy.js |
| Outbound API base | https://api.portkey.ai/v1 |
| Account tier | Production ($49/mo as of 2026-04-24) |
| Rate limits | Per PRO-TIER-LIMITS |
| Rate-limit action | 429 → exponential backoff (3 retries), Discord ops alert |
| Cost | $49/mo flat + Portkey-cloud LLM passthrough at provider rates |
| Discord alert channel | ops |
| Drift cadence | 5-min cron via tool-calls-health-check.timer |
| Status | production |
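The rate-limit action in the table (429 → exponential backoff, 3 retries, Discord ops alert) can be sketched as below. This is a minimal illustration, not the code in proxy.js: `withBackoff`, `baseDelayMs`, `onExhausted`, and the injected `sleep` are hypothetical names, and the 500 ms base delay is an assumption.

```javascript
// Hypothetical helper: retry a request on HTTP 429 with exponential backoff.
// baseDelayMs and onExhausted are assumptions, not values taken from proxy.js.
async function withBackoff(sendRequest, options = {}) {
  const {
    retries = 3,
    baseDelayMs = 500,
    onExhausted = () => {}, // e.g. post an alert to the Discord ops channel
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;

  for (let attempt = 0; attempt <= retries; attempt++) {
    const res = await sendRequest();
    if (res.status !== 429) return res;
    if (attempt === retries) {
      onExhausted(res); // all retries used up: alert and surface the 429
      return res;
    }
    await sleep(baseDelayMs * 2 ** attempt); // 500, 1000, 2000 ms by default
  }
}
```

With the defaults, a request that keeps returning 429 is sent four times in total (the initial attempt plus three retries) before the alert callback fires.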
Components
- `~/.openclaw/portkey-proxy/proxy.js`: Express proxy; maps model → tier config; injects `x-portkey-config`, `x-portkey-metadata`, `cache_control`; writes `tool_calls` to Supabase
- `~/.openclaw/portkey-proxy/tier-config.json`: Portkey tier config slugs: Strategic / Operations / Automation / Default / Voyage
- `~/.openclaw/workspace/scripts/tool-calls-health-check.js`: CHOKEPOINT-1 drift detector; compares Portkey call count vs Supabase `tool_calls` inserts; >10% delta triggers Discord ops
- `systemd unit: portkey-proxy.service`: user-level daemon binding `:18900`; restart via `systemctl --user restart portkey-proxy`
- `/tmp/openclaw/tool-calls-fallback.jsonl`: fallback queue when Supabase is unreachable; drains within 1h or escalates
- `~/.openclaw/workspace/knowledge-base/portkey/`: KB docs: API.md, GUARDRAILS.md, MODEL-CATALOG.md, OBSERVABILITY.md, ADMIN-API.md, PRO-TIER-LIMITS.md, PROMPT-LIBRARY.md
Dual-port topology (:18900 + :18903)
Two distinct routing paths exist for LLM calls:
Port :18900 — Primary Portkey proxy (henryRERI + API-tier agents)
- All 36 agents default here via `models.json` `baseUrl: http://127.0.0.1:18900`
- Outbound: `https://api.portkey.ai/v1` with active tier config slug injected
- Portkey applies load-balancing, caching, observability, virtual key isolation
Port :18903 — Max-plan local shortcut (teamsteph@betterfiles.com, 40 agents)
- `anthropic-max-router` binary bound at `127.0.0.1:18903`
- Routes OAuth Max subscription for `teamsteph@betterfiles.com`
- Added 2026-04-30; serves strategic + ops agents on Max plan flat-rate billing
- Phase 8 cutover FAILED 2026-05-02: `claude-max-api-proxy@teamsteph` at `:18910` only served OpenAI `/v1/chat/completions`; proxy.js sends Anthropic-native `/v1/messages`. Cutover blocked; `:18903` remains in place. See openclaw-fragmentation-fix-2026-05-01 §A2.9 finding F26.
- Concurrency cap: 5 in-flight max (`MAX_PLAN_LOCAL_CAP=5`); pool saturation returns explicit 502 with `x-openclaw-max-plan-fallback: pool_saturated`; NEVER a silent fallback to Portkey cloud
- Only Anthropic models route to `:18903` (gated by `isAnthropicModel` check); non-Anthropic models fall through to Portkey cloud path
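The concurrency cap on the `:18903` path could look roughly like the sketch below. `MAX_PLAN_LOCAL_CAP` and the 502 header value come from this hub; the `MaxPlanPool` class, `acquire()`, and `saturatedResponse()` are hypothetical names used only for illustration, and the response body text is an assumption.

```javascript
// Sketch of an in-flight cap for the :18903 path (names are hypothetical).
const MAX_PLAN_LOCAL_CAP = 5;

class MaxPlanPool {
  constructor(cap = MAX_PLAN_LOCAL_CAP) {
    this.cap = cap;
    this.inFlight = 0;
  }

  // Returns a release function, or null when the pool is saturated.
  acquire() {
    if (this.inFlight >= this.cap) return null;
    this.inFlight += 1;
    let released = false;
    return () => {
      if (released) return; // guard against double release
      released = true;
      this.inFlight -= 1;
    };
  }
}

// Explicit failure for a saturated pool: the request is NOT rerouted to Portkey cloud.
function saturatedResponse() {
  return {
    status: 502,
    headers: { "x-openclaw-max-plan-fallback": "pool_saturated" },
    body: { error: "max-plan pool saturated" }, // body text is an assumption
  };
}
```

A caller would invoke `acquire()` before forwarding, return `saturatedResponse()` when it yields `null`, and call the release function once the upstream response has finished.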
proxy.js key fixes
Fix #1 — Kimi 404 per-model gating
- Problem: Agents requesting `moonshot/kimi-for-coding` (or similar non-Claude models) caused 404s on the max-plan `:18903` path because that upstream only handles Anthropic models.
- Fix: `isAnthropicModel()` guard at routing-decision site; non-Anthropic models fall through to Portkey cloud path regardless of tier assignment.
- Reference: proxy.js lines ~340-348
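A guard of this kind might look like the following sketch. It is a hypothetical reconstruction, not the code at the referenced lines: the matching rule (an `anthropic/` prefix or a `claude-` model name), the `chooseUpstream` helper, and the `maxPlanTier` flag are all assumptions.

```javascript
// Hypothetical reconstruction of the guard; the real check lives in proxy.js.
// Assumption: Anthropic models carry an "anthropic/" provider prefix or a
// "claude-" model name.
function isAnthropicModel(model) {
  if (typeof model !== "string") return false;
  const name = model.toLowerCase();
  return name.startsWith("anthropic/") || name.startsWith("claude-");
}

// Only Anthropic models on a Max-plan tier may use the :18903 shortcut;
// everything else falls through to the Portkey cloud path.
function chooseUpstream(model, { maxPlanTier }) {
  if (maxPlanTier && isAnthropicModel(model)) return "http://127.0.0.1:18903";
  return "https://api.portkey.ai/v1";
}
```

Under these assumptions, `chooseUpstream("moonshot/kimi-for-coding", { maxPlanTier: true })` resolves to the Portkey cloud base URL even though the tier would otherwise select the local shortcut.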
Fix #16 — cache_control injection on max-plan path
- Problem: Old Tailscale max-proxy translated requests to OpenAI format, stripping `cache_control`; prompt caching was silently disabled for 35+ max-plan agents.
- Fix: New `anthropic-max-router` at `:18903` uses `claude-code-20250219` beta and speaks Anthropic-native. `cache_control: {type: "ephemeral"}` now injected on every Anthropic-format request for both `:18900` and `:18903` paths. System prompt, last user message block, and last tool definition all get `cache_control`.
- Reference: proxy.js lines ~362-395
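The injection described in Fix #16 can be illustrated with the sketch below. It is not the code at the referenced lines. `injectCacheControl`, `toBlocks`, and `markLast` are hypothetical names, and "last user message block" is read here as the final content block of the most recent `user` message, which is an assumption.

```javascript
const EPHEMERAL = { type: "ephemeral" };

// Normalize string content to an array of text blocks; copy existing blocks.
function toBlocks(content) {
  return typeof content === "string"
    ? [{ type: "text", text: content }]
    : content.map((block) => ({ ...block }));
}

// Mark the last element of a block list as a cache breakpoint.
function markLast(blocks) {
  if (blocks.length > 0) {
    const last = blocks.length - 1;
    blocks[last] = { ...blocks[last], cache_control: EPHEMERAL };
  }
  return blocks;
}

// Hypothetical helper: returns a copy of an Anthropic-format request body with
// cache_control on the system prompt, the last user message block, and the
// last tool definition. The input body is not mutated.
function injectCacheControl(body) {
  const out = { ...body };
  if (body.system) out.system = markLast(toBlocks(body.system));
  if (Array.isArray(body.messages)) {
    out.messages = body.messages.map((m) => ({ ...m }));
    const lastUser = out.messages.map((m) => m.role).lastIndexOf("user");
    if (lastUser !== -1) {
      out.messages[lastUser].content = markLast(toBlocks(out.messages[lastUser].content));
    }
  }
  if (Array.isArray(body.tools) && body.tools.length > 0) {
    out.tools = markLast(body.tools.map((tool) => ({ ...tool })));
  }
  return out;
}
```

String-valued `system` and `content` fields are converted to block arrays because `cache_control` attaches to a content block rather than to a bare string.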
CHOKEPOINT-1 enforcement
Every LLM call dispatches a tool_calls row to the Supabase CCP project before the response is returned. The write is fire-and-forget, so it does not block the response:
proxy.js → logToolCall() → POST /rest/v1/tool_calls (fire-and-forget, non-blocking)
↓ (if Supabase unreachable)
→ /tmp/openclaw/tool-calls-fallback.jsonl (drains within 1h)
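The flow above can be sketched as follows. This is an illustration under assumptions, not the `logToolCall()` in proxy.js: `postRow` is a hypothetical injected function that performs the POST to `/rest/v1/tool_calls` and rejects on network or HTTP failure, and the drain job that later replays the file is out of scope.

```javascript
const fs = require("node:fs");
const path = require("node:path");

const FALLBACK_PATH = "/tmp/openclaw/tool-calls-fallback.jsonl";

// Hypothetical sketch of the fire-and-forget write with a JSONL fallback.
// Callers on the request path do not await the returned promise.
function logToolCall(row, postRow, fallbackPath = FALLBACK_PATH) {
  return postRow(row).catch(() => {
    try {
      // Supabase unreachable: queue the row as one JSON object per line.
      fs.mkdirSync(path.dirname(fallbackPath), { recursive: true });
      fs.appendFileSync(fallbackPath, JSON.stringify(row) + "\n");
    } catch (err) {
      // Last resort: never let logging failures break the LLM response.
      console.error("tool_calls fallback write failed:", err.message);
    }
  });
}
```

Injecting `postRow` keeps the sketch independent of any particular HTTP client and makes the fallback branch easy to exercise in isolation.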
Required NOT-NULL fields (post-migration 2026-05-02-002): agent_id, tool_name, called_at, result_ok, latency_ms
Nullable (plan-tier signal): cost_usd, tokens_in, tokens_out, cache_read/write_tokens — NULL = Max-plan flat-rate; >0 = API-tier paid call
JSONB parameters required keys: tier and model (CHECK constraint tool_calls_parameters_required_keys)
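A pre-insert check mirroring these constraints might look like the sketch below. The field and key lists come from this section. `validateToolCallRow` and `planTierSignal` are hypothetical helpers, and treating the CHECK constraint as a key-presence test is an assumption about how it is defined.

```javascript
const REQUIRED_FIELDS = ["agent_id", "tool_name", "called_at", "result_ok", "latency_ms"];
const REQUIRED_PARAMETER_KEYS = ["tier", "model"];

// Hypothetical pre-insert check mirroring the NOT NULL columns and the
// tool_calls_parameters_required_keys CHECK constraint. Returns a list of
// problems; an empty list means the row should pass those constraints.
function validateToolCallRow(row) {
  const errors = [];
  for (const field of REQUIRED_FIELDS) {
    if (row[field] === null || row[field] === undefined) errors.push(`missing ${field}`);
  }
  const params = row.parameters ?? {};
  for (const key of REQUIRED_PARAMETER_KEYS) {
    if (!(key in params)) errors.push(`parameters missing ${key}`);
  }
  return errors;
}

// NULL cost_usd signals a Max-plan flat-rate call; > 0 signals an API-tier paid call.
function planTierSignal(row) {
  if (row.cost_usd === null || row.cost_usd === undefined) return "max-plan";
  return row.cost_usd > 0 ? "api-tier" : "unknown";
}
```

Note that `result_ok: false` and `latency_ms: 0` are valid values; only `null` and `undefined` count as missing.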
Drift detection: tool-calls-health-check.timer runs every 5 min; compares Portkey call count vs tool_calls insert count; >10% delta alerts Discord ops. See OBSERVABILITY for metrics schema.
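The drift comparison could be computed along these lines. The 10% threshold comes from this hub. `computeDrift` and `shouldAlert` are hypothetical names, and using the Portkey count as the denominator is an assumption about how the health check defines the delta.

```javascript
// Hypothetical drift check: relative gap between Portkey's call count and
// tool_calls inserts over the same window, with Portkey as the baseline.
function computeDrift(portkeyCount, toolCallsCount) {
  if (portkeyCount === 0) return toolCallsCount === 0 ? 0 : Infinity;
  return Math.abs(portkeyCount - toolCallsCount) / portkeyCount;
}

// Alert only when drift strictly exceeds the threshold (default 10%).
function shouldAlert(portkeyCount, toolCallsCount, threshold = 0.10) {
  return computeDrift(portkeyCount, toolCallsCount) > threshold;
}
```

For example, 1000 Portkey calls against 880 inserts is a 12% drift and would alert, while 1000 against 950 (5%) would not.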
Virtual keys (10 live)
| Slug | Purpose | Notes |
|---|---|---|
| `anthropic-primary` | Main API-tier Anthropic | Active |
| `anthropic-main-dfdf10` | Duplicate of primary | Consolidation candidate |
| `openclaw-produc-16fe8f` | Legacy DUP | Consolidation candidate |
| `anthropic-max-p-99c12d` | Max-plan Tailscale target | Dead credentials (401 on direct probe per 2026-04-24 audit) |
| `openai-producti-bdd7b2` | OpenAI | Active |
| `openrouter-fallback` | OpenRouter Tier 2 | Fallback routing |
| `moonshot-kimi-84f4a1` | Moonshot / Kimi Tier 2 | Needs credits |
| `voyage-embeddin-9dffd3` | Voyage embeddings | Active (616K req/mo) |
| `voyage` | Voyage DUP | Consolidation candidate |
| `google-ai-studi-8748cc` | Google AI Studio | Active |
Voyage VK alias: pc-opencl-22d508. Credentials reference: op://Aurora/portkey/api-key
Tier configs (12 total, live state 2026-04-24)
| Config slug | Tier | Notes |
|---|---|---|
| `pc-opencl-7b85fe` | Strategic (Opus) | Dormant since 2026-03-10 |
| `pc-opencl-62fac7` | Operations (Sonnet) | Dormant since 2026-03-10 |
| `pc-opencl-c14739` | Automation (Haiku) | Dormant since 2026-03-10 |
| `pc-opencl-d8b498` | High-Availability | Dormant since 2026-03-10 |
| `pc-opencl-454882` | Default (Sonnet/gsd) | Active |
| `pc-opencl-bd87ab` | Max-Plan (91% of traffic) | Active; routes to the `anthropic-primary` VK (API-tier), not the Max subscription (see correction below) |
| `pc-opencl-22d508` | Voyage Embeddings | Active |
| `pc-opencl-3b86e9` | Sonnet Optimized | Added 2026-03-02 |
| `pc-opencl-d4f457` | Gemma dense | Active |
| `pc-opencl-98748e` | Gemma MoE | Active |
| `pc-opencl-1373ad` | Gemma 26B | Active |
| `pc-opencl-2f97e0` | Gemma 31B | Active |
⚠️ CORRECTION (2026-04-24): pc-opencl-bd87ab was documented as “Anthropic Max via Tailscale” — this is WRONG. Live admin API shows it routes to anthropic-primary VK (API-tier), "mode": "simple" cache, no Tailscale custom_host. The anthropic-max-p-99c12d VK has dead credentials. See project_portkey_pro_audit_correction_2026-04-24 + plan amendment ∆8.
How it’s used
- Trigger: Any of the 36 agents issuing an LLM request via OpenClaw Gateway → `proxy.js` intercepts on `:18900` or `:18903`
- Workflow: Agent request → proxy extracts agent ID from URL path → reads model from body → maps to Portkey tier config slug → injects `x-portkey-config`, `x-portkey-metadata`, `x-portkey-cache-namespace` → strips provider prefixes (e.g. `anthropic/claude-...` → `claude-...`) → forwards to `https://api.portkey.ai/v1` (or `:18903` local shortcut) → logs cache status from response headers → fires `tool_calls` INSERT
- Agents involved: All 36 agents via _summary, _summary, and all Operations/Automation tier agents
- Failure mode (Portkey cloud down): Requests error out; no silent fallback to direct Anthropic (different billing). Discord ops alert via Portkey alert config.
- Failure mode (Supabase unreachable): `logToolCall()` falls back to `/tmp/openclaw/tool-calls-fallback.jsonl`; drains within 1h or escalates per CHOKEPOINT-1 spec
- Failure mode (`:18903` pool saturated): Explicit 502 with `x-openclaw-max-plan-fallback: pool_saturated`; never silently reroutes
- Success criteria: All LLM calls appear in Portkey dashboard + `tool_calls` Supabase table within 5 min; `tool-calls-health-check.timer` reports <10% drift
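The request-preparation part of the workflow above can be sketched as follows. The three header names come from this hub; everything else is an assumption made for illustration: the agent ID being the first URL path segment, the shape of the `tierConfigs` lookup and its `default` key, the metadata payload, and the per-agent cache namespace.

```javascript
// Strip a leading provider prefix such as "anthropic/" from a model name.
function stripProviderPrefix(model) {
  const slash = model.indexOf("/");
  return slash === -1 ? model : model.slice(slash + 1);
}

// Hypothetical: assumes the agent ID is the first URL path segment.
function extractAgentId(urlPath) {
  return urlPath.split("/").filter(Boolean)[0] ?? "unknown";
}

// Build the outbound headers and body. `tierConfigs` maps a requested model
// name to a Portkey config slug; its shape and "default" key are assumptions.
function prepareOutbound(urlPath, body, tierConfigs) {
  const agentId = extractAgentId(urlPath);
  const configSlug = tierConfigs[body.model] ?? tierConfigs.default;
  return {
    headers: {
      "x-portkey-config": configSlug,
      "x-portkey-metadata": JSON.stringify({ agent_id: agentId }),
      "x-portkey-cache-namespace": agentId, // per-agent namespace is an assumption
    },
    body: { ...body, model: stripProviderPrefix(body.model) },
  };
}
```

The tier lookup uses the model name as the agent sent it, and the prefix is stripped only on the forwarded body, so the mapping table can stay keyed by the prefixed names.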
Cross-links
Agents that touch this
- _summary — Strategic Opus tier; primary Max-plan consumer
- _summary — Operations Sonnet tier
- All 36 agents route via this gateway
Skills that invoke this
- aurora-model-swap — model routing reconfiguration
- acquisitions-followup — triggers LLM calls via gateway
- acquisitions-outreach — LLM dispatch for SMS generation
Plans that govern this
- openclaw-fragmentation-fix-2026-05-01 — CHOKEPOINT-1 definition + tool_calls schema migrations
- openclaw-self-improvement-layer-2026-05-03 — OSIL adds new model tiers via this gateway
- nemoclaw-audit-2026-05-03 — pending B1-B6; NemoClaw cred-proxy would inject at L7 (changes port topology)
- second-max-plan-decision — dual Max plan architecture decision
Feedback rules
- feedback_models_guardian_pattern: agent `models.json` gets overwritten by gateway on restart; edit `portkey-proxy/models-backups/` instead
- feedback_model_routing_pattern: Opus plans/audits in main; Sonnet executes via subagent; Haiku trivial
- feedback_chokepoint_principle: CHOKEPOINT-1 enforcement; every LLM call writes `tool_calls`
- feedback_dual_write_required: dual-write required for all state mutation
- feedback_action_gate_violation_repeated: service restarts require explicit auth
- project_portkey_pro_audit_correction_2026-04-24: correction for VK config errors in prior KB
KB / source docs
- API — universal API formats, virtual key management
- MODEL-CATALOG — integrations, budgets, rate limits
- OBSERVABILITY — metrics, alerts, feedback API
- GUARDRAILS — deterministic + LLM + partner guardrails
- ADMIN-API — endpoint groups, auth patterns
- PRO-TIER-LIMITS — retention, overage, gating
System maps
- vm-gateway-topology — gateway + proxy port map
- vm-osil-overview — OSIL LLM tier visualization
Related: LLM routing cluster
Cluster anchor: this hub
| Hub | Role |
|---|---|
| anthropic | Primary LLM provider; all calls route via this hub |
| supabase | tool_calls writes land here (CHOKEPOINT-1) |
| openrouter (Tier 2 — hub pending) | Fallback routing via openrouter-fallback VK |
| moonshot-kimi (Tier 2 — hub pending) | Non-Anthropic model routing; Fix #1 gating |
Related: Memory/embeddings cluster
Cluster anchor: supabase
| Hub | Role |
|---|---|
| anthropic | LLM primary; embeddings separate via Voyage |
| supabase | Vector storage for embeddings + tool_calls writes |
| voyage (Tier 2 — hub pending) | Embedding calls via voyage-embeddin-9dffd3 VK (616K req/mo) |
Related: Credential layer cluster
Cluster anchor: 1password
| Hub | Role |
|---|---|
| 1password | Stores op://Aurora/portkey/api-key |
| All integration hubs | Consume credentials via op:// references only |
Open issues / TODOs
- VK consolidation: 3 duplicate VKs (`anthropic-main-dfdf10`, `openclaw-produc-16fe8f`, `voyage` DUP) pending cleanup
- `anthropic-max-p-99c12d` VK has dead credentials (401); rotate or archive
- Phase 8 cutover to `:18910` blocked on `/v1/messages` format compat; see openclaw-fragmentation-fix-2026-05-01 §A2.9 F26
- `portkey-proxy/models-backups/` drift: verify all agent models.json backups are current after last gateway restart
- NemoClaw cred-proxy (B1-B6 pending) would change `:18900` port topology; see nemoclaw-audit-2026-05-03
- `tool-calls-health-check.timer` drift tolerance: current 10% threshold; evaluate for tightening after 30-day baseline
Related domain hubs
- ai-osil — AI/OSIL domain hub (this is the primary integration for OSIL)
Recent activity
- 2026-05-04: ai-osil domain hub cross-linked (ai-osil) as primary domain for OSIL AI self-improvement layer
- 2026-05-03: hub created (W1-S5)
- 2026-05-02: proxy.js Fix #1 (Kimi 404 per-model gating) + Fix #16 (cache_control on max-plan path)
- 2026-05-02: Phase 8 cutover to :18910 attempted + rolled back (F26)
- 2026-04-30: :18903 anthropic-max-router added for teamsteph@betterfiles.com Max plan
- 2026-04-24: Pro tier upgrade audit; VK correction (∆8); account tier documented