Portkey AI Gateway

Portkey is the OpenClaw LLM gateway proxy (port :18900) and the CHOKEPOINT-1 enforcement surface. Every LLM call from each of the 36 agents routes through proxy.js, which maps model → Portkey tier config, injects cache headers, and writes a tool_calls row to Supabase before returning. This file is the authoritative hub for all Portkey configuration, routing, dual-port topology, and drift detection. Read it whenever touching model routing, cost tracking, virtual keys, or proxy behavior.

Quick reference

Field | Value
Vendor | Portkey AI
URL | https://app.portkey.ai
KB doc | API
Auth method | API key (x-portkey-api-key header)
Auth credential | op://Aurora/portkey/api-key
Cred-proxy port | n/a (until B1-B6 ratified; see nemoclaw-audit-2026-05-03)
Webhook port | n/a
Webhook handler | n/a
Tunnel path | n/a
Backup/recovery | vendor-owned (Portkey cloud SaaS); no local backup
Primary proxy port | :18900 (henryRERI primary, all API-tier agent traffic)
Second Max port | :18903 (teamsteph@betterfiles.com Max plan via anthropic-max-router, serves 40 agents)
Proxy file | ~/.openclaw/portkey-proxy/proxy.js
Outbound API base | https://api.portkey.ai/v1
Account tier | Production ($49/mo as of 2026-04-24)
Rate limits | Per PRO-TIER-LIMITS
Rate-limit action | 429 → exponential backoff (3 retries), Discord ops alert
Cost | $49/mo flat + Portkey-cloud LLM passthrough at provider rates
Discord alert channel | ops
Drift cadence | 5-min cron via tool-calls-health-check.timer
Status | production
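The rate-limit action in the quick reference (429 → exponential backoff, 3 retries, then a Discord ops alert) could look roughly like the sketch below. This is not taken from proxy.js: the names withBackoff, backoffDelayMs, and onExhausted, and the 500 ms base delay, are assumptions for illustration. Only the 3-retry limit and the 429 trigger come from this hub.

```javascript
// Sketch only: withBackoff, backoffDelayMs, baseMs, and onExhausted are hypothetical names.
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Delay before retry number `attempt` (0-based): baseMs, 2*baseMs, 4*baseMs, ...
function backoffDelayMs(attempt, baseMs = 500) {
  return baseMs * 2 ** attempt;
}

// Calls send() once, then retries up to `retries` times while the response is HTTP 429.
async function withBackoff(send, options = {}) {
  const { retries = 3, baseMs = 500, sleep = defaultSleep, onExhausted = () => {} } = options;
  for (let attempt = 0; ; attempt++) {
    const res = await send();
    if (res.status !== 429) return res;
    if (attempt >= retries) {
      onExhausted(res); // assumed hook, e.g. post to the Discord ops channel
      return res;
    }
    await sleep(backoffDelayMs(attempt, baseMs));
  }
}
```

With the defaults, a request that keeps returning 429 is sent four times in total (one initial attempt plus three retries), with waits of 500, 1000, and 2000 ms between attempts.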

Components

  • ~/.openclaw/portkey-proxy/proxy.js — Express proxy; maps model → tier config; injects x-portkey-config, x-portkey-metadata, cache_control; writes tool_calls to supabase
  • ~/.openclaw/portkey-proxy/tier-config.json — Portkey tier config slugs: Strategic / Operations / Automation / Default / Voyage
  • ~/.openclaw/workspace/scripts/tool-calls-health-check.js: CHOKEPOINT-1 drift detector; compares Portkey call count vs Supabase tool_calls inserts; >10% delta triggers a Discord ops alert
  • systemd unit: portkey-proxy.service — user-level daemon binding :18900; restart via systemctl --user restart portkey-proxy
  • /tmp/openclaw/tool-calls-fallback.jsonl — fallback queue when Supabase unreachable; drains within 1h or escalates
  • ~/.openclaw/workspace/knowledge-base/portkey/ — KB docs: API.md, GUARDRAILS.md, MODEL-CATALOG.md, OBSERVABILITY.md, ADMIN-API.md, PRO-TIER-LIMITS.md, PROMPT-LIBRARY.md

Dual-port topology (:18900 + :18903)

Two distinct routing paths exist for LLM calls:

Port :18900 — Primary Portkey proxy (henryRERI + API-tier agents)

  • All 36 agents default here via models.json baseUrl: http://127.0.0.1:18900
  • Outbound: https://api.portkey.ai/v1 with active tier config slug injected
  • Portkey applies load-balancing, caching, observability, virtual key isolation

Port :18903 — Max-plan local shortcut (teamsteph@betterfiles.com, 40 agents)

  • anthropic-max-router binary bound at 127.0.0.1:18903
  • Routes OAuth Max subscription for teamsteph@betterfiles.com
  • Added 2026-04-30; serves strategic + ops agents on Max plan flat-rate billing
  • Phase 8 cutover FAILED 2026-05-02: claude-max-api-proxy@teamsteph at :18910 only served OpenAI /v1/chat/completions; proxy.js sends Anthropic-native /v1/messages. Cutover blocked; :18903 remains in place. See openclaw-fragmentation-fix-2026-05-01 §A2.9 finding F26.
  • Concurrency cap: 5 in-flight max (MAX_PLAN_LOCAL_CAP=5); pool saturation returns explicit 502 with x-openclaw-max-plan-fallback: pool_saturated — NEVER silent fallback to Portkey cloud
  • Only Anthropic models route to :18903 (gated by isAnthropicModel check); non-Anthropic models fall through to Portkey cloud path
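The concurrency cap on the :18903 path can be pictured as a small in-flight counter that refuses work instead of rerouting it. The sketch below is an illustration under assumptions, not the proxy.js implementation: MaxPlanPool, saturatedResponse, and forwardViaMaxPlan are hypothetical names, and the response body text is invented. The cap of 5, the 502 status, and the x-openclaw-max-plan-fallback: pool_saturated header come from the list above.

```javascript
// Sketch only: class and function names are hypothetical.
const MAX_PLAN_LOCAL_CAP = 5; // in-flight limit stated in this hub

class MaxPlanPool {
  constructor(cap = MAX_PLAN_LOCAL_CAP) {
    this.cap = cap;
    this.inFlight = 0;
  }
  // Reserves a slot; returns false when the pool is already full.
  tryAcquire() {
    if (this.inFlight >= this.cap) return false;
    this.inFlight += 1;
    return true;
  }
  release() {
    if (this.inFlight > 0) this.inFlight -= 1;
  }
}

// Explicit failure returned on saturation; the request is never rerouted to Portkey cloud.
function saturatedResponse() {
  return {
    status: 502,
    headers: { "x-openclaw-max-plan-fallback": "pool_saturated" },
    body: { error: "max-plan pool saturated" }, // body text is an assumption
  };
}

// Runs send() only if a slot is free, and always frees the slot afterwards.
async function forwardViaMaxPlan(pool, send) {
  if (!pool.tryAcquire()) return saturatedResponse();
  try {
    return await send();
  } finally {
    pool.release();
  }
}
```

Releasing in a finally block keeps the counter accurate even when the upstream call throws, which matters for a cap this small.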

proxy.js key fixes

Fix #1 — Kimi 404 per-model gating

  • Problem: Agents requesting moonshot/kimi-for-coding (or similar non-Claude models) caused 404s on the max-plan :18903 path because that upstream only handles Anthropic models.
  • Fix: isAnthropicModel() guard at routing-decision site; non-Anthropic models fall through to Portkey cloud path regardless of tier assignment.
  • Reference: proxy.js lines ~340-348
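A rough sketch of the Fix #1 guard follows. The real isAnthropicModel() lives in proxy.js and its matching rule is not documented here, so the rule below (an optional anthropic/ prefix followed by a claude- model name) is an assumption, as are chooseUpstream and the maxPlanEligible flag. The model name claude-example is a placeholder, not a real model ID.

```javascript
// Sketch only: the matching rule and chooseUpstream are assumptions, not proxy.js code.
const MAX_PLAN_LOCAL = "http://127.0.0.1:18903";
const PORTKEY_CLOUD = "https://api.portkey.ai/v1";

// Assumed rule: a model is Anthropic if its bare name starts with "claude-".
function isAnthropicModel(model) {
  if (typeof model !== "string") return false;
  const prefix = "anthropic/";
  const bare = model.startsWith(prefix) ? model.slice(prefix.length) : model;
  return bare.startsWith("claude-");
}

// Non-Anthropic models always take the Portkey cloud path, whatever the tier assignment.
function chooseUpstream(model, maxPlanEligible) {
  return maxPlanEligible && isAnthropicModel(model) ? MAX_PLAN_LOCAL : PORTKEY_CLOUD;
}
```

Under this sketch, chooseUpstream("moonshot/kimi-for-coding", true) resolves to the Portkey cloud base, so the request never reaches the max-plan upstream that would answer 404.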

Fix #16 — cache_control injection on max-plan path

  • Problem: Old Tailscale max-proxy translated requests to OpenAI format, stripping cache_control — prompt caching silently disabled for 35+ max-plan agents.
  • Fix: New anthropic-max-router at :18903 uses claude-code-20250219 beta and speaks Anthropic-native. cache_control: {type: "ephemeral"} now injected on every Anthropic-format request for both :18900 and :18903 paths. System prompt, last user message block, and last tool definition all get cache_control.
  • Reference: proxy.js lines ~362-395
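The injection described in Fix #16 can be sketched as a pure function over an Anthropic-format request body. This is an illustrative approximation, not the code at the referenced lines: injectCacheControl, asBlocks, and markLast are hypothetical names, and the choice to mark the last block of the system prompt is an assumption about how "system prompt" is handled when it has several blocks.

```javascript
// Sketch only: function names are hypothetical; the input body is not mutated.

// Normalizes string content into a one-element block array, copying existing blocks.
function asBlocks(content) {
  if (typeof content === "string") return [{ type: "text", text: content }];
  return content.map((block) => ({ ...block }));
}

// Adds cache_control to the last element of a (copied) array.
function markLast(items) {
  if (items.length > 0) {
    const last = items.length - 1;
    items[last] = { ...items[last], cache_control: { type: "ephemeral" } };
  }
  return items;
}

function injectCacheControl(body) {
  const out = { ...body };
  if (body.system) out.system = markLast(asBlocks(body.system));
  if (Array.isArray(body.tools) && body.tools.length > 0) {
    out.tools = markLast(body.tools.map((tool) => ({ ...tool })));
  }
  if (Array.isArray(body.messages)) {
    const lastUser = body.messages.map((m) => m.role).lastIndexOf("user");
    if (lastUser !== -1) {
      out.messages = body.messages.map((m, i) =>
        i === lastUser ? { ...m, content: markLast(asBlocks(m.content)) } : m
      );
    }
  }
  return out;
}
```

Working on copies keeps the original request intact, which is convenient if the same body is later logged or retried.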

CHOKEPOINT-1 enforcement

Every LLM call writes a tool_calls row to the Supabase CCP project before returning.

proxy.js → logToolCall() → POST /rest/v1/tool_calls (fire-and-forget, non-blocking)
         ↓ (if Supabase unreachable)
         → /tmp/openclaw/tool-calls-fallback.jsonl (drains within 1h)
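The flow above (dispatch the INSERT without blocking the response, queue to the JSONL file if Supabase is unreachable) might be wired roughly as follows. This is a sketch under assumptions: makeLogToolCall, supabaseInsert, and appendToFallbackFile are hypothetical names, SUPABASE_URL and SUPABASE_KEY are placeholder environment variables, and the returned outcome string exists only to make the sketch testable. It assumes Node 18+ for the global fetch.

```javascript
// Sketch only: names, env vars, and the outcome strings are assumptions.
const FALLBACK_PATH = "/tmp/openclaw/tool-calls-fallback.jsonl";

// Builds logToolCall from two injected side effects so each can be swapped or tested.
function makeLogToolCall({ insertRow, appendFallback }) {
  return function logToolCall(row) {
    // The request path is expected NOT to await this promise (fire-and-forget).
    return Promise.resolve()
      .then(() => insertRow(row))
      .then(() => "inserted")
      .catch(() =>
        Promise.resolve(appendFallback(JSON.stringify(row) + "\n")).then(() => "queued")
      )
      .catch((err) => {
        console.error("tool_calls fallback write failed:", err);
        return "lost";
      });
  };
}

// Assumed Supabase REST insert; fetch does not reject on HTTP errors, so check res.ok.
async function supabaseInsert(row) {
  const res = await fetch(`${process.env.SUPABASE_URL}/rest/v1/tool_calls`, {
    method: "POST",
    headers: {
      apikey: process.env.SUPABASE_KEY,
      Authorization: `Bearer ${process.env.SUPABASE_KEY}`,
      "Content-Type": "application/json",
      Prefer: "return=minimal",
    },
    body: JSON.stringify(row),
  });
  if (!res.ok) throw new Error(`tool_calls insert failed: HTTP ${res.status}`);
}

// Appends one JSON line to the fallback queue, creating the directory if needed.
async function appendToFallbackFile(line) {
  const { appendFile, mkdir } = await import("node:fs/promises");
  await mkdir("/tmp/openclaw", { recursive: true });
  await appendFile(FALLBACK_PATH, line);
}
```

A caller would build the logger once, for example makeLogToolCall({ insertRow: supabaseInsert, appendFallback: appendToFallbackFile }), and invoke it without await just before sending the response.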

Required NOT-NULL fields (post-migration 2026-05-02-002): agent_id, tool_name, called_at, result_ok, latency_ms

Nullable (plan-tier signal): cost_usd, tokens_in, tokens_out, cache_read/write_tokens — NULL = Max-plan flat-rate; >0 = API-tier paid call

JSONB parameters required keys: tier and model (CHECK constraint tool_calls_parameters_required_keys)
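The column rules above can be checked client-side before the INSERT, which gives a clearer error than a constraint violation from the database. The sketch below is illustrative only: validateToolCallRow and planTier are hypothetical helpers, and treating a cost_usd of exactly 0 as "unknown" is an assumption, since this hub only defines NULL and >0.

```javascript
// Sketch only: helper names and the "unknown" case are assumptions.
const REQUIRED_FIELDS = ["agent_id", "tool_name", "called_at", "result_ok", "latency_ms"];
const REQUIRED_PARAMETER_KEYS = ["tier", "model"];

// Returns a list of problems; an empty list means the row satisfies the documented rules.
function validateToolCallRow(row) {
  const errors = [];
  for (const field of REQUIRED_FIELDS) {
    if (row[field] === null || row[field] === undefined) errors.push(`${field} must not be null`);
  }
  const params = row.parameters;
  const isObject = typeof params === "object" && params !== null;
  for (const key of REQUIRED_PARAMETER_KEYS) {
    if (!isObject || !(key in params)) errors.push(`parameters.${key} is required`);
  }
  return errors;
}

// Reads the plan-tier signal from cost_usd: NULL means Max-plan, >0 means API-tier.
function planTier(row) {
  if (row.cost_usd === null || row.cost_usd === undefined) return "max-plan";
  return row.cost_usd > 0 ? "api-tier" : "unknown";
}
```

Note that result_ok: false is a valid value, so the check compares against null and undefined rather than testing truthiness.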

Drift detection: tool-calls-health-check.timer runs every 5 min; compares Portkey call count vs tool_calls insert count; >10% delta alerts Discord ops. See OBSERVABILITY for metrics schema.
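As a worked example of the drift rule, the comparison can be reduced to one ratio. The sketch below assumes the delta is measured relative to the Portkey call count; the actual denominator used by tool-calls-health-check.js is not stated here, and driftRatio and checkDrift are hypothetical names.

```javascript
// Sketch only: the choice of denominator is an assumption.

// Relative difference between the two counts, measured against the Portkey count.
function driftRatio(portkeyCalls, supabaseInserts) {
  if (portkeyCalls === 0) return supabaseInserts === 0 ? 0 : Infinity;
  return Math.abs(portkeyCalls - supabaseInserts) / portkeyCalls;
}

// Alerts only when drift is strictly greater than the threshold (10% by default).
function checkDrift(portkeyCalls, supabaseInserts, threshold = 0.1) {
  const ratio = driftRatio(portkeyCalls, supabaseInserts);
  return { ratio, alert: ratio > threshold };
}
```

For example, 200 Portkey calls against 170 tool_calls inserts is a 15% delta and would alert, while 200 against 190 is 5% and would not.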

Virtual keys (10 live)

Slug | Purpose | Notes
anthropic-primary | Main API-tier Anthropic | Active
anthropic-main-dfdf10 | Duplicate of primary | Consolidation candidate
openclaw-produc-16fe8f | Legacy DUP | Consolidation candidate
anthropic-max-p-99c12d | Max-plan Tailscale target | Dead credentials (401 on direct probe per 2026-04-24 audit)
openai-producti-bdd7b2 | OpenAI | Active
openrouter-fallback | OpenRouter Tier 2 | Fallback routing
moonshot-kimi-84f4a1 | Moonshot / Kimi Tier 2 | Needs credits
voyage-embeddin-9dffd3 | Voyage embeddings | Active (616K req/mo)
voyage | Voyage DUP | Consolidation candidate
google-ai-studi-8748cc | Google AI Studio | Active

Voyage VK alias: pc-opencl-22d508. Credentials reference: op://Aurora/portkey/api-key

Tier configs (12 total, live state 2026-04-24)

Config slug | Tier | Notes
pc-opencl-7b85fe | Strategic (Opus) | Dormant since 2026-03-10
pc-opencl-62fac7 | Operations (Sonnet) | Dormant since 2026-03-10
pc-opencl-c14739 | Automation (Haiku) | Dormant since 2026-03-10
pc-opencl-d8b498 | High-Availability | Dormant since 2026-03-10
pc-opencl-454882 | Default (Sonnet/gsd) | Active
pc-opencl-bd87ab | Max-Plan (91% of traffic) | Active; documented as routing via Anthropic Max subscription (see correction below)
pc-opencl-22d508 | Voyage Embeddings | Active
pc-opencl-3b86e9 | Sonnet Optimized | Added 2026-03-02
pc-opencl-d4f457 | Gemma dense | Active
pc-opencl-98748e | Gemma MoE | Active
pc-opencl-1373ad | Gemma 26B | Active
pc-opencl-2f97e0 | Gemma 31B | Active

⚠️ CORRECTION (2026-04-24): pc-opencl-bd87ab was documented as “Anthropic Max via Tailscale” — this is WRONG. Live admin API shows it routes to anthropic-primary VK (API-tier), "mode": "simple" cache, no Tailscale custom_host. The anthropic-max-p-99c12d VK has dead credentials. See project_portkey_pro_audit_correction_2026-04-24 + plan amendment ∆8.

How it’s used

  • Trigger: Any of the 36 agents issuing an LLM request via OpenClaw Gateway → proxy.js intercepts on :18900 or :18903
  • Workflow: Agent request → proxy extracts agent ID from URL path → reads model from body → maps to Portkey tier config slug → injects x-portkey-config, x-portkey-metadata, x-portkey-cache-namespace → strips provider prefixes (e.g. anthropic/claude-... becomes claude-...) → forwards to https://api.portkey.ai/v1 (or :18903 local shortcut) → logs cache status from response headers → fires tool_calls INSERT
  • Agents involved: All 36 agents via _summary, _summary, and all Operations/Automation tier agents
  • Failure mode — Portkey cloud down: Requests error out; no silent fallback to direct Anthropic (different billing). Discord ops alert via Portkey alert config.
  • Failure mode — Supabase unreachable: logToolCall() falls back to /tmp/openclaw/tool-calls-fallback.jsonl; drains within 1h or escalates per CHOKEPOINT-1 spec
  • Failure mode — :18903 pool saturated: Explicit 502 with x-openclaw-max-plan-fallback: pool_saturated; never silently reroutes
  • Success criteria: All LLM calls appear in Portkey dashboard + tool_calls Supabase table within 5 min; tool-calls-health-check.timer reports <10% drift
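The request-preparation steps of the workflow above can be sketched as one function. Several details are assumptions made for illustration and are not confirmed by this hub: the /agents/<id>/... URL shape, the use of the agent ID as the cache namespace, the agent_id metadata key, the model-to-slug mapping passed in by the caller, and the function names. The header names and the Default slug pc-opencl-454882 come from this document.

```javascript
// Sketch only: URL shape, metadata key, namespace value, and names are assumptions.
const DEFAULT_CONFIG_SLUG = "pc-opencl-454882"; // Default tier config listed in this hub

// Assumed path shape: /agents/<agentId>/...
function extractAgentId(path) {
  const match = /^\/agents\/([^/]+)\//.exec(path);
  return match ? match[1] : null;
}

// Drops a leading "<provider>/" segment, leaving the bare model name.
function stripProviderPrefix(model) {
  const slash = model.indexOf("/");
  return slash === -1 ? model : model.slice(slash + 1);
}

// tierSlugs is a Map from requested model name to Portkey config slug.
function buildUpstream(path, body, tierSlugs) {
  const agentId = extractAgentId(path) ?? "unknown";
  const slug = tierSlugs.get(body.model) ?? DEFAULT_CONFIG_SLUG;
  return {
    headers: {
      "x-portkey-config": slug,
      "x-portkey-metadata": JSON.stringify({ agent_id: agentId }),
      "x-portkey-cache-namespace": agentId,
    },
    body: { ...body, model: stripProviderPrefix(body.model) },
  };
}
```

The slug lookup uses the model name as the agent sent it, and the prefix is stripped only on the outgoing body, so a mapping keyed on prefixed names keeps working.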

Agents that touch this

  • _summary — Strategic Opus tier; primary Max-plan consumer
  • _summary — Operations Sonnet tier
  • All 36 agents route via this gateway

Skills that invoke this

Plans that govern this

Feedback rules

KB / source docs

System maps

Cluster anchor: this hub

Hub | Role
anthropic | Primary LLM provider; all calls route via this hub
supabase | tool_calls writes land here (CHOKEPOINT-1)
openrouter (Tier 2, hub pending) | Fallback routing via openrouter-fallback VK
moonshot-kimi (Tier 2, hub pending) | Non-Anthropic model routing; Fix #1 gating

Cluster anchor: supabase

Hub | Role
anthropic | LLM primary; embeddings separate via Voyage
supabase | Vector storage for embeddings + tool_calls writes
voyage (Tier 2, hub pending) | Embedding calls via voyage-embeddin-9dffd3 VK (616K req/mo)

Cluster anchor: 1password

Hub | Role
1password | Stores op://Aurora/portkey/api-key
All integration hubs | Consume credentials via op:// references only

Open issues / TODOs

  • VK consolidation: 3 duplicate VKs (anthropic-main-dfdf10, openclaw-produc-16fe8f, voyage DUP) pending cleanup
  • anthropic-max-p-99c12d VK has dead credentials (401) — rotate or archive
  • Phase 8 cutover to :18910 blocked on /v1/messages format compat — see openclaw-fragmentation-fix-2026-05-01 §A2.9 F26
  • portkey-proxy/models-backups/ drift: verify all agent models.json backups are current after last gateway restart
  • NemoClaw cred-proxy (B1-B6 pending) would change :18900 port topology — see nemoclaw-audit-2026-05-03
  • tool-calls-health-check.timer drift tolerance: current 10% threshold — evaluate for tightening after 30-day baseline
  • ai-osil — AI/OSIL domain hub (this is the primary integration for OSIL)

Recent activity

  • 2026-05-04: ai-osil domain hub cross-linked as the primary domain for the OSIL AI self-improvement layer
  • 2026-05-03: hub created (W1-S5)
  • 2026-05-02: proxy.js Fix #1 (Kimi 404 per-model gating) + Fix #16 (cache_control on max-plan path)
  • 2026-05-02: Phase 8 cutover to :18910 attempted + rolled back (F26)
  • 2026-04-30: :18903 anthropic-max-router added for teamsteph@betterfiles.com Max plan
  • 2026-04-24: Pro tier upgrade audit; VK correction (∆8); account tier documented