Portkey AI Gateway

Portkey is the OpenClaw LLM gateway proxy (port :18900) and the CHOKEPOINT-1 enforcement surface. Every LLM call from each of the 36 agents routes through proxy.js, which maps the requested model to a Portkey tier config, injects cache headers, and writes a tool_calls row to Supabase before returning. This file is the authoritative hub for all Portkey configuration, routing, dual-port topology, and drift detection. Read it whenever touching model routing, cost tracking, virtual keys, or proxy behavior.

Quick reference

Field	Value
Vendor	Portkey AI
URL	https://app.portkey.ai
KB doc	API
Auth method	API key (x-portkey-api-key header)
Auth credential	`op://Aurora/portkey/api-key`
Cred-proxy port	n/a (until B1-B6 ratified — see nemoclaw-audit-2026-05-03)
Webhook port	n/a
Webhook handler	n/a
Tunnel path	n/a
Backup/recovery	vendor-owned (Portkey cloud SaaS); no local backup
Primary proxy port	`:18900` — henryRERI primary, all API-tier agent traffic
Second Max port	`:18903` — teamsteph@betterfiles.com Max plan via `anthropic-max-router`, serves 40 agents
Proxy file	`~/.openclaw/portkey-proxy/proxy.js`
Outbound API base	`https://api.portkey.ai/v1`
Account tier	Production ($49/mo as of 2026-04-24)
Rate limits	Per PRO-TIER-LIMITS
Rate-limit action	429 → exponential backoff (3 retries), Discord ops alert
Cost	$49/mo flat + Portkey-cloud LLM passthrough at provider rates
Discord alert channel	ops
Drift cadence	5-min cron via `tool-calls-health-check.timer`
Status	production
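
The rate-limit action in the table (429, exponential backoff, 3 retries, Discord ops alert) could look roughly like the sketch below. The base delay, the doubling factor, and the injected `send`, `sleep`, and `alertOps` callbacks are assumptions for illustration; the source only specifies three retries and an ops alert.

```javascript
// Hypothetical helper: delay schedule for the retries (base delay is an assumption).
function backoffDelays(retries = 3, baseMs = 500) {
  return Array.from({ length: retries }, (_, i) => baseMs * 2 ** i);
}

// Retry `send` on HTTP 429 only; any other status is returned immediately.
// `send`, `sleep`, and `alertOps` are injected so the sketch stays self-contained.
async function withRateLimitRetry(send, options = {}) {
  const {
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
    alertOps = async () => {},
    retries = 3,
    baseMs = 500,
  } = options;

  let res = await send();
  for (const ms of backoffDelays(retries, baseMs)) {
    if (res.status !== 429) return res;
    await sleep(ms);
    res = await send();
  }
  // Still rate-limited after the last retry: raise the ops alert.
  if (res.status === 429) await alertOps("429 persisted after retries");
  return res;
}
```

With the assumed defaults the proxy would wait 500 ms, 1 s, and 2 s between attempts before alerting.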

Components

~/.openclaw/portkey-proxy/proxy.js — Express proxy; maps model → tier config; injects x-portkey-config, x-portkey-metadata, cache_control; writes tool_calls to supabase
~/.openclaw/portkey-proxy/tier-config.json — Portkey tier config slugs: Strategic / Operations / Automation / Default / Voyage
~/.openclaw/workspace/scripts/tool-calls-health-check.js — CHOKEPOINT-1 drift detector; compares Portkey call count vs Supabase tool_calls inserts; >10% delta triggers Discord ops
systemd unit: portkey-proxy.service — user-level daemon binding :18900; restart via systemctl --user restart portkey-proxy
/tmp/openclaw/tool-calls-fallback.jsonl — fallback queue when Supabase unreachable; drains within 1h or escalates
~/.openclaw/workspace/knowledge-base/portkey/ — KB docs: API.md, GUARDRAILS.md, MODEL-CATALOG.md, OBSERVABILITY.md, ADMIN-API.md, PRO-TIER-LIMITS.md, PROMPT-LIBRARY.md

Dual-port topology (`:18900` + `:18903`)

Two distinct routing paths exist for LLM calls:

Port `:18900` — Primary Portkey proxy (henryRERI + API-tier agents)

All 36 agents default here via models.json baseUrl: http://127.0.0.1:18900
Outbound: https://api.portkey.ai/v1 with active tier config slug injected
Portkey applies load-balancing, caching, observability, virtual key isolation

Port `:18903` — Max-plan local shortcut (teamsteph@betterfiles.com, 40 agents)

anthropic-max-router binary bound at 127.0.0.1:18903
Routes OAuth Max subscription for teamsteph@betterfiles.com
Added 2026-04-30; serves strategic + ops agents on Max plan flat-rate billing
Phase 8 cutover FAILED 2026-05-02: claude-max-api-proxy@teamsteph at :18910 only served OpenAI /v1/chat/completions; proxy.js sends Anthropic-native /v1/messages. Cutover blocked; :18903 remains in place. See openclaw-fragmentation-fix-2026-05-01 §A2.9 finding F26.
Concurrency cap: 5 in-flight max (MAX_PLAN_LOCAL_CAP=5); pool saturation returns explicit 502 with x-openclaw-max-plan-fallback: pool_saturated — NEVER silent fallback to Portkey cloud
Only Anthropic models route to :18903 (gated by isAnthropicModel check); non-Anthropic models fall through to Portkey cloud path
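
The concurrency cap and the explicit saturation response described above can be sketched as a small in-flight counter. Only `MAX_PLAN_LOCAL_CAP`, the 502 status, and the `x-openclaw-max-plan-fallback: pool_saturated` header come from the source; `createMaxPlanPool`, `saturationResponse`, and the response body shape are hypothetical names chosen for this example.

```javascript
const MAX_PLAN_LOCAL_CAP = 5; // cap named in the source

// Hypothetical in-flight counter for the :18903 path.
function createMaxPlanPool(cap = MAX_PLAN_LOCAL_CAP) {
  let inFlight = 0;
  return {
    // Reserves a slot and returns true, or returns false when saturated.
    tryAcquire() {
      if (inFlight >= cap) return false;
      inFlight += 1;
      return true;
    },
    // Frees a slot once the upstream call has finished.
    release() {
      if (inFlight > 0) inFlight -= 1;
    },
    get inFlight() {
      return inFlight;
    },
  };
}

// What a saturated pool answers with: an explicit 502 plus the marker header,
// instead of silently rerouting to Portkey cloud. Body shape is an assumption.
function saturationResponse() {
  return {
    status: 502,
    headers: { "x-openclaw-max-plan-fallback": "pool_saturated" },
    body: { error: "max-plan pool saturated" },
  };
}
```

A caller would check `tryAcquire()` before forwarding to :18903, return `saturationResponse()` on `false`, and call `release()` in a `finally` block once the upstream call completes.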

proxy.js key fixes

Fix #1 — Kimi 404 per-model gating

Problem: Agents requesting moonshot/kimi-for-coding (or similar non-Claude models) caused 404s on the max-plan :18903 path because that upstream only handles Anthropic models.
Fix: isAnthropicModel() guard at routing-decision site; non-Anthropic models fall through to Portkey cloud path regardless of tier assignment.
Reference: proxy.js lines ~340-348
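
A minimal sketch of the guard described in Fix #1 follows. The matching rule (an `anthropic/` prefix or a name starting with `claude`), the `chooseUpstream` helper, and the `maxPlanTier` flag are assumptions for illustration; the actual check at proxy.js lines ~340-348 may differ.

```javascript
// Assumption: Anthropic models are recognized by an "anthropic/" provider
// prefix or a name starting with "claude". The real check may differ.
function isAnthropicModel(model) {
  if (typeof model !== "string") return false;
  const name = model.toLowerCase();
  return name.startsWith("anthropic/") || name.startsWith("claude");
}

// Pick the upstream: only Anthropic models on a Max-plan tier may use :18903.
// Everything else falls through to the Portkey cloud path, whatever its tier.
function chooseUpstream(model, { maxPlanTier }) {
  if (maxPlanTier && isAnthropicModel(model)) return "http://127.0.0.1:18903";
  return "https://api.portkey.ai/v1";
}
```

Under these assumptions, `chooseUpstream("moonshot/kimi-for-coding", { maxPlanTier: true })` resolves to the Portkey cloud base URL, which is the behavior the fix is meant to guarantee.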

Fix #16 — `cache_control` injection on max-plan path

Problem: Old Tailscale max-proxy translated requests to OpenAI format, stripping cache_control — prompt caching silently disabled for 35+ max-plan agents.
Fix: New anthropic-max-router at :18903 uses claude-code-20250219 beta and speaks Anthropic-native. cache_control: {type: "ephemeral"} now injected on every Anthropic-format request for both :18900 and :18903 paths. System prompt, last user message block, and last tool definition all get cache_control.
Reference: proxy.js lines ~362-395
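
The three injection points named in Fix #16 (system prompt, last block of the last user message, last tool definition) can be illustrated with the sketch below. It assumes an Anthropic-format `/v1/messages` body; `injectCacheControl`, `toBlocks`, and `markLast` are hypothetical helper names, not the functions in proxy.js.

```javascript
const EPHEMERAL = { type: "ephemeral" };

// Normalize string content into Anthropic text blocks so a marker can attach.
// Existing block arrays are shallow-copied to avoid mutating the caller's body.
function toBlocks(content) {
  if (typeof content === "string") return [{ type: "text", text: content }];
  return Array.isArray(content) ? content.map((block) => ({ ...block })) : [];
}

// Attach cache_control to the final block of a list, if there is one.
function markLast(blocks) {
  if (blocks.length > 0) blocks[blocks.length - 1].cache_control = EPHEMERAL;
  return blocks;
}

// Returns a copy of the request body with cache_control on the system prompt,
// the last block of the last user message, and the last tool definition.
function injectCacheControl(body) {
  const out = { ...body };
  if (body.system) out.system = markLast(toBlocks(body.system));
  if (Array.isArray(body.messages)) {
    out.messages = body.messages.map((message) => ({ ...message }));
    const idx = out.messages.map((message) => message.role).lastIndexOf("user");
    if (idx !== -1 && out.messages[idx].content != null) {
      out.messages[idx].content = markLast(toBlocks(out.messages[idx].content));
    }
  }
  if (Array.isArray(body.tools) && body.tools.length > 0) {
    out.tools = body.tools.map((tool) => ({ ...tool }));
    out.tools[out.tools.length - 1].cache_control = EPHEMERAL;
  }
  return out;
}
```

The sketch places three markers per request and returns a new object, so the incoming body can still be logged unmodified.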

CHOKEPOINT-1 enforcement

Every LLM call writes a tool_calls row to the Supabase CCP project before returning.

proxy.js → logToolCall() → POST /rest/v1/tool_calls (fire-and-forget, non-blocking)
         ↓ (if Supabase unreachable)
         → /tmp/openclaw/tool-calls-fallback.jsonl (drains within 1h)
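
The write path in the diagram above can be sketched as follows. This is an illustration under stated assumptions, not the actual proxy.js code: `insertRow` and `appendLine` are hypothetical injected functions standing in for the Supabase POST and the file append, and `toFallbackLine` and `parseFallback` are illustrative helper names.

```javascript
const FALLBACK_PATH = "/tmp/openclaw/tool-calls-fallback.jsonl";

// One row per line, so the queue can be appended to and drained incrementally.
function toFallbackLine(row) {
  return JSON.stringify(row) + "\n";
}

// Fire-and-forget: the caller does not await anything here, so the LLM
// response is never delayed by the write. A failed insert is routed to the
// fallback queue; a failed fallback write is only logged.
function logToolCall(row, { insertRow, appendLine }) {
  Promise.resolve()
    .then(() => insertRow(row))
    .catch(() => appendLine(FALLBACK_PATH, toFallbackLine(row)))
    .catch((err) => console.error("tool_calls fallback write failed:", err));
}

// Drain helper: turn queued JSONL text back into rows, skipping blank lines.
function parseFallback(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}
```

A drain job could read the file, pass its contents through `parseFallback`, re-attempt each insert, and truncate the file only after every row has landed.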

Required NOT-NULL fields (post-migration 2026-05-02-002): agent_id, tool_name, called_at, result_ok, latency_ms

Nullable (plan-tier signal): cost_usd, tokens_in, tokens_out, cache_read/write_tokens — NULL = Max-plan flat-rate; >0 = API-tier paid call

JSONB parameters required keys: tier and model (CHECK constraint tool_calls_parameters_required_keys)
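
The column rules above lend themselves to a pre-insert check, sketched here. The field names and the NULL-versus-positive cost rule come from the source; `validateToolCallRow` and `planTier` are hypothetical helpers, and the database constraints remain the real enforcement.

```javascript
const REQUIRED_COLUMNS = ["agent_id", "tool_name", "called_at", "result_ok", "latency_ms"];
const REQUIRED_PARAMETER_KEYS = ["tier", "model"];

// Returns a list of problems; an empty list means the row satisfies the
// NOT-NULL columns and the required keys inside the JSONB parameters column.
function validateToolCallRow(row) {
  const problems = [];
  for (const column of REQUIRED_COLUMNS) {
    if (row[column] === null || row[column] === undefined) problems.push(`missing ${column}`);
  }
  const params = row.parameters && typeof row.parameters === "object" ? row.parameters : {};
  for (const key of REQUIRED_PARAMETER_KEYS) {
    if (!(key in params)) problems.push(`parameters.${key} missing`);
  }
  return problems;
}

// NULL cost_usd signals a Max-plan flat-rate call; a positive value signals an
// API-tier paid call. A zero cost is not defined by the source.
function planTier(row) {
  if (row.cost_usd === null || row.cost_usd === undefined) return "max-plan";
  return row.cost_usd > 0 ? "api-tier" : "unspecified";
}
```

Note that `result_ok: false` passes validation, since the column only has to be non-null.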

Drift detection: tool-calls-health-check.timer runs every 5 min; compares Portkey call count vs tool_calls insert count; >10% delta alerts Discord ops. See OBSERVABILITY for metrics schema.
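
The drift comparison reduces to a relative difference against a 10% threshold. The sketch below assumes the Portkey count is the denominator and that both counts cover the same time window; the real tool-calls-health-check.js may compute it differently, and `driftRatio` and `shouldAlert` are hypothetical names.

```javascript
// Relative drift between Portkey's call count and Supabase tool_calls inserts.
// Assumption: the Portkey count is the denominator.
function driftRatio(portkeyCalls, toolCallInserts) {
  if (portkeyCalls === 0) return toolCallInserts === 0 ? 0 : Infinity;
  return Math.abs(portkeyCalls - toolCallInserts) / portkeyCalls;
}

// True when the delta exceeds the threshold (10% per the source).
function shouldAlert(portkeyCalls, toolCallInserts, threshold = 0.1) {
  return driftRatio(portkeyCalls, toolCallInserts) > threshold;
}
```

For example, 1000 Portkey calls against 880 inserts is a 12% delta and would alert, while 950 inserts (5%) would not.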

Virtual keys (10 live)

Slug	Purpose	Notes
`anthropic-primary`	Main API-tier Anthropic	Active
`anthropic-main-dfdf10`	Duplicate of primary	Consolidation candidate
`openclaw-produc-16fe8f`	Legacy DUP	Consolidation candidate
`anthropic-max-p-99c12d`	Max-plan Tailscale target	Dead credentials (401 on direct probe per 2026-04-24 audit)
`openai-producti-bdd7b2`	OpenAI	Active
`openrouter-fallback`	OpenRouter Tier 2	Fallback routing
`moonshot-kimi-84f4a1`	Moonshot / Kimi Tier 2	Needs credits
`voyage-embeddin-9dffd3`	Voyage embeddings	Active (616K req/mo)
`voyage`	Voyage DUP	Consolidation candidate
`google-ai-studi-8748cc`	Google AI Studio	Active

Voyage VK alias: pc-opencl-22d508. Credentials reference: op://Aurora/portkey/api-key

Tier configs (12 total, live state 2026-04-24)

Config slug	Tier	Notes
`pc-opencl-7b85fe`	Strategic (Opus)	Dormant since 2026-03-10
`pc-opencl-62fac7`	Operations (Sonnet)	Dormant since 2026-03-10
`pc-opencl-c14739`	Automation (Haiku)	Dormant since 2026-03-10
`pc-opencl-d8b498`	High-Availability	Dormant since 2026-03-10
`pc-opencl-454882`	Default (Sonnet/gsd)	Active
`pc-opencl-bd87ab`	Max-Plan (91% of traffic)	Active; routes to `anthropic-primary` VK (API-tier), not the Max subscription (see correction below)
`pc-opencl-22d508`	Voyage Embeddings	Active
`pc-opencl-3b86e9`	Sonnet Optimized	Added 2026-03-02
`pc-opencl-d4f457`	Gemma dense	Active
`pc-opencl-98748e`	Gemma MoE	Active
`pc-opencl-1373ad`	Gemma 26B	Active
`pc-opencl-2f97e0`	Gemma 31B	Active

⚠️ CORRECTION (2026-04-24): pc-opencl-bd87ab was documented as “Anthropic Max via Tailscale” — this is WRONG. Live admin API shows it routes to anthropic-primary VK (API-tier), "mode": "simple" cache, no Tailscale custom_host. The anthropic-max-p-99c12d VK has dead credentials. See project_portkey_pro_audit_correction_2026-04-24 + plan amendment ∆8.

How it’s used

Trigger: Any of the 36 agents issuing an LLM request via OpenClaw Gateway → proxy.js intercepts on :18900 or :18903
Workflow: Agent request → proxy extracts agent ID from URL path → reads model from body → maps to Portkey tier config slug → injects x-portkey-config, x-portkey-metadata, x-portkey-cache-namespace → strips provider prefixes (e.g. anthropic/claude-... → claude-...) → forwards to https://api.portkey.ai/v1 (or :18903 local shortcut) → logs cache status from response headers → fires tool_calls INSERT
Agents involved: All 36 agents via _summary, _summary, and all Operations/Automation tier agents
Failure mode — Portkey cloud down: Requests error out; no silent fallback to direct Anthropic (different billing). Discord ops alert via Portkey alert config.
Failure mode — Supabase unreachable: logToolCall() falls back to /tmp/openclaw/tool-calls-fallback.jsonl; drains within 1h or escalates per CHOKEPOINT-1 spec
Failure mode — :18903 pool saturated: Explicit 502 with x-openclaw-max-plan-fallback: pool_saturated; never silently reroutes
Success criteria: All LLM calls appear in Portkey dashboard + tool_calls Supabase table within 5 min; tool-calls-health-check.timer reports <10% drift
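
The request-shaping part of the workflow above (agent ID extraction, tier lookup, header injection, prefix stripping) can be sketched as a pure function. Several details are assumptions made for the example: the agent ID is taken to be the first URL path segment, the tier lookup is a plain model-to-slug map, and the metadata key `agent_id` and the per-agent cache namespace are placeholders. Only the three header names and the prefix-stripping behavior come from the source.

```javascript
// Strip a provider prefix such as "anthropic/" from a model name.
function stripProviderPrefix(model) {
  const slash = model.indexOf("/");
  return slash === -1 ? model : model.slice(slash + 1);
}

// Build the headers and body to forward. `tierMap` and `defaultSlug` are
// hypothetical inputs standing in for whatever tier-config.json provides.
function buildForward(reqPath, body, tierMap, defaultSlug) {
  // Assumption: the first path segment identifies the agent.
  const agentId = reqPath.split("/").filter(Boolean)[0] ?? "unknown";
  const model = stripProviderPrefix(body.model);
  const slug = tierMap[model] ?? defaultSlug;
  return {
    agentId,
    headers: {
      "x-portkey-config": slug,
      "x-portkey-metadata": JSON.stringify({ agent_id: agentId }),
      "x-portkey-cache-namespace": agentId,
    },
    body: { ...body, model },
  };
}
```

Keeping this step free of I/O makes the mapping easy to unit test separately from the forwarding and logging code.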

Cross-links

Agents that touch this

_summary — Strategic Opus tier; primary Max-plan consumer
_summary — Operations Sonnet tier
All 36 agents route via this gateway

Skills that invoke this

aurora-model-swap — model routing reconfiguration
acquisitions-followup — triggers LLM calls via gateway
acquisitions-outreach — LLM dispatch for SMS generation

Plans that govern this

openclaw-fragmentation-fix-2026-05-01 — CHOKEPOINT-1 definition + tool_calls schema migrations
openclaw-self-improvement-layer-2026-05-03 — OSIL adds new model tiers via this gateway
nemoclaw-audit-2026-05-03 — pending B1-B6; NemoClaw cred-proxy would inject at L7 (changes port topology)
second-max-plan-decision — dual Max plan architecture decision

Feedback rules

feedback_models_guardian_pattern — agent models.json gets overwritten by gateway on restart; edit portkey-proxy/models-backups/ instead
feedback_model_routing_pattern — Opus plans/audits in main; Sonnet executes via subagent; Haiku trivial
feedback_chokepoint_principle — CHOKEPOINT-1 enforcement; every LLM call writes tool_calls
feedback_dual_write_required — dual-write required for all state mutation
feedback_action_gate_violation_repeated — service restarts require explicit auth
project_portkey_pro_audit_correction_2026-04-24 — correction for VK config errors in prior KB

KB / source docs

API — universal API formats, virtual key management
MODEL-CATALOG — integrations, budgets, rate limits
OBSERVABILITY — metrics, alerts, feedback API
GUARDRAILS — deterministic + LLM + partner guardrails
ADMIN-API — endpoint groups, auth patterns
PRO-TIER-LIMITS — retention, overage, gating

System maps

vm-gateway-topology — gateway + proxy port map
vm-osil-overview — OSIL LLM tier visualization

Related: LLM routing cluster

Cluster anchor: this hub

Hub	Role
anthropic	Primary LLM provider; all calls route via this hub
supabase	tool_calls writes land here (CHOKEPOINT-1)
openrouter (Tier 2 — hub pending)	Fallback routing via `openrouter-fallback` VK
moonshot-kimi (Tier 2 — hub pending)	Non-Anthropic model routing; Fix #1 gating

Related: Memory/embeddings cluster

Cluster anchor: supabase

Hub	Role
anthropic	LLM primary; embeddings separate via Voyage
supabase	Vector storage for embeddings + tool_calls writes
voyage (Tier 2 — hub pending)	Embedding calls via `voyage-embeddin-9dffd3` VK (616K req/mo)

Related: Credential layer cluster

Cluster anchor: 1password

Hub	Role
1password	Stores `op://Aurora/portkey/api-key`
All integration hubs	Consume credentials via `op://` references only

Open issues / TODOs

VK consolidation: 3 duplicate VKs (anthropic-main-dfdf10, openclaw-produc-16fe8f, voyage DUP) pending cleanup
anthropic-max-p-99c12d VK has dead credentials (401) — rotate or archive
Phase 8 cutover to :18910 blocked on /v1/messages format compat — see openclaw-fragmentation-fix-2026-05-01 §A2.9 F26
portkey-proxy/models-backups/ drift: verify all agent models.json backups are current after last gateway restart
NemoClaw cred-proxy (B1-B6 pending) would change :18900 port topology — see nemoclaw-audit-2026-05-03
tool-calls-health-check.timer drift tolerance: current 10% threshold — evaluate for tightening after 30-day baseline

Related domain hubs

ai-osil — AI/OSIL domain hub (this is the primary integration for OSIL)

Recent activity

2026-05-04: ai-osil domain hub cross-linked (ai-osil) as primary domain for OSIL AI self-improvement layer
2026-05-03: hub created (W1-S5)
2026-05-02: proxy.js Fix #1 (Kimi 404 per-model gating) + Fix #16 (cache_control on max-plan path)
2026-05-02: Phase 8 cutover to :18910 attempted + rolled back (F26)
2026-04-30: :18903 anthropic-max-router added for teamsteph@betterfiles.com Max plan
2026-04-24: Pro tier upgrade audit; VK correction (∆8); account tier documented
