Anthropic Claude API
Anthropic is the primary LLM provider for all 36 OpenClaw agents. All calls go via portkey — never direct. This hub documents the model lineup, dual Max plan architecture, billing topology, deprecation timeline, and the Hermes P2 demotion. Read it whenever choosing a model, auditing billing paths, planning OSIL model tiers, or rotating the Anthropic API key. For routing configuration, proxy fixes, virtual keys, and CHOKEPOINT-1 enforcement, start at portkey first.
Quick reference
| Field | Value |
|---|---|
| Vendor | Anthropic |
| URL | https://console.anthropic.com |
| KB doc | API |
| Auth method | API key (x-api-key header) — but NEVER sent direct; always via Portkey |
| Auth credential | op://Aurora/anthropic/api-key |
| Cred-proxy port | pending B1-B6 ratification — see nemoclaw-audit-2026-05-03 |
| Webhook port | n/a (all calls via Portkey; no inbound webhook) |
| Webhook handler | n/a |
| Webhook dedup table | n/a |
| Tunnel path | n/a |
| Direct API base | https://api.anthropic.com/v1/messages — FOR REFERENCE ONLY; OpenClaw never calls this directly |
| Proxy routing | :18900 primary (Portkey cloud) + :18903 Max shortcut — see portkey |
| Outbound API base | https://api.portkey.ai/v1 via portkey |
| Rate limits | Per Anthropic tier; surfaced in Portkey dashboard |
| Rate-limit action | 429 → Portkey handles backoff; Discord ops alert |
| Cost | API-tier: pay-per-token; Max-plan: flat-rate monthly subscription |
| Backup/recovery | Anthropic-owned (vendor cloud); no local backup |
| Discord alert channel | ops |
| Drift cadence | Model deprecation tracking; tool-calls-health-check.timer 5-min |
| Status | production |
Model lineup (authoritative — as of 2026-05-03)
⚠️ Model deprecation: old IDs sunset June 15 2026. Migrate before then.
| Model | Canonical ID | Cost (in/out) | Tier | Agents |
|---|---|---|---|---|
| Claude Opus 4.7 | claude-opus-4-7 | 25 per MTok | Strategic | aurora, solara (2 agents) |
| Claude Sonnet 4.6 | claude-sonnet-4-6 | 15 per MTok | Operations | 14 agents |
| Claude Haiku 4.5 | claude-haiku-4-5-20251001 | Lowest | Automation | 18+ agents |
Via Portkey, prefix with @anthropic-provider/ for direct-provider routing, e.g.:
@anthropic-provider/claude-opus-4-7@anthropic-provider/claude-sonnet-4-6@anthropic-provider/claude-haiku-4-5-20251001
Deprecated IDs (migrate by June 15 2026):
claude-opus-4-20250514→ useclaude-opus-4-7claude-sonnet-4-5-20250514→ useclaude-sonnet-4-6claude-haiku-4-20250514→ useclaude-haiku-4-5-20251001
Model routing policy: see feedback_model_routing_pattern — Opus plans/audits (main conversation); Sonnet subagents execute; Haiku trivial lookups only.
Dual Max plan topology
Two billing paths exist for Anthropic LLM calls:
Path A — API tier (:18900 → Portkey cloud → Anthropic API)
- Pay-per-token; surfaces in Portkey dashboard +
tool_callsSupabase table cost_usd, tokens_in, tokens_outfields populated in everytool_callsrow- Used by: all agents NOT routed via Path B
Path B — Max plan flat-rate (:18903 → anthropic-max-router → teamsteph@betterfiles.com OAuth)
teamsteph@betterfiles.comMax subscription serves 40 agentscost_usd= NULL intool_callsrows (flat-rate signal per CHOKEPOINT-1 spec)- Routes via
proxy.js → 127.0.0.1:18903; concurrency cap 5 (MAX_PLAN_LOCAL_CAP) henryRERIaccount Max plan remains on primary:18900path for interactive VSCode use- Added 2026-04-30; Phase 8 cutover to
:18910FAILED 2026-05-02 (format mismatch);:18903remains canonical
Path A vs Path B — cost signal table
Field in tool_calls | Path A (API-tier) | Path B (Max-plan) |
|---|---|---|
cost_usd | >0 | NULL |
tokens_in | >0 | NULL or 0 |
tokens_out | >0 | NULL or 0 |
cache_read_tokens | >0 if cache hit | >0 (from buffer parse) |
cache_write_tokens | >0 on first write | >0 (from buffer parse) |
For full proxy routing details, Fix #1 (Kimi 404 gating), and Fix #16 (cache_control on max-plan), see portkey.
Prompt caching (cache_control)
Anthropic supports prompt caching via cache_control: {type: "ephemeral"} on system prompts, large user message blocks, and tool definitions.
Fix #16 (2026-05-02): The old Tailscale max-proxy translated requests to OpenAI format, silently stripping cache_control for all max-plan agents. The new anthropic-max-router at :18903 uses the claude-code-20250219 beta header and accepts Anthropic-native format — cache_control is now injected by proxy.js on both :18900 and :18903 paths.
Effect: 35+ max-plan agents now get actual cache hits on their system prompts. Cache observability for the max-plan path is tapped via response buffer parsing (lines ~160-180 in proxy.js) and written to tool_calls.cache_read_tokens / cache_write_tokens.
See API for full prompt caching spec.
Hermes status — demoted to P2
Hermes (NousResearch-self-evolution) is NOT a dependency for OSIL. Demoted to P2 side-track per OSIL B5 ratification (2026-05-03).
OSIL Phase 1-14 proceeds with GEPA + DSPy + Reflexion + Voyager + autoresearch stack. Hermes is tracked as a future enhancement but does not gate any OSIL phase. Reference: openclaw-self-improvement-layer-2026-05-03 §B5.
NemoClaw cred-proxy (pending)
NemoClaw plan nemoclaw-audit-2026-05-03 includes B1-B6 blockers covering L7 credential injection that would change how the Anthropic API key reaches the proxy. Until B1-B6 are ratified, the credential path is:
op://Aurora/anthropic/api-key → master.env → proxy.js environment → x-api-key header (injected by Portkey)
Do NOT add any plaintext API key to hub files, config files, or scripts. op:// reference only per G-NO-PLAINTEXT-CREDS.
How it’s used
- Trigger: Any of the 36 agents issuing a structured-output or chat request; routed via portkey
proxy.js - Workflow (API-tier): Agent → OpenClaw Gateway
:18789→proxy.js:18900→ Portkey cloud →api.anthropic.com→ response buffered and returned;tool_callsrow written non-blocking - Workflow (Max-plan): Agent → OpenClaw Gateway
:18789→proxy.js:18900→isAnthropicModel()check → local shortcut to:18903→anthropic-max-router→ Anthropic OAuth → response;tool_callsrow written withcost_usd=NULL - Agents involved: All 36 agents; Strategic tier (Opus) = aurora, solara; Operations tier (Sonnet) = 14 agents; Automation tier (Haiku) = 18+ agents
- Failure mode — Anthropic API down: Portkey handles fallback to OpenRouter (Tier 2) or moonshot-kimi (Tier 2) per tier config; see portkey
- Failure mode — Max-plan pool saturated: 502 with
x-openclaw-max-plan-fallback: pool_saturated; agent must retry; no automatic fallback to API-tier (different billing) - Failure mode — model ID deprecated post-June 15 2026: 404/invalid model; update
models.jsonfor affected agents per feedback_models_guardian_pattern - Success criteria: All requests resolved via Portkey within SLA;
tool_callsrows present; cache hit rate visible in Portkey dashboard +tool_calls.cache_read_tokens
Cross-links
Agents that touch this
- _summary — Strategic Opus tier; primary Max-plan consumer (via
:18903) - _summary — Strategic Opus tier
- All 36 OpenClaw agents consume Anthropic models via this integration
Skills that invoke this
- aurora-model-swap — model ID rotation for Aurora agent
- claude-api — Anthropic SDK patterns, caching, tool use, model migration
- acquisitions-outreach — LLM call for SMS draft generation
- acquisitions-followup — LLM call for reply classification
- hubspot-deal-ingest — LLM enrichment calls
Plans that govern this
- openclaw-self-improvement-layer-2026-05-03 — OSIL: adds multi-tier DSPy/GEPA optimizers that call Claude; Hermes P2 demoted
- openclaw-fragmentation-fix-2026-05-01 — CHOKEPOINT-1 definition; tool_calls schema; proxy.js Fix #16
- nemoclaw-audit-2026-05-03 — NemoClaw cred-proxy pending B1-B6; would change credential injection path
- second-max-plan-decision — dual Max plan architecture; teamsteph@betterfiles.com topology
- vendor-deep-audit-comprehensive-2026-05-02 — Anthropic model cost baseline in audit
Feedback rules
- feedback_model_routing_pattern — Opus plans main; Sonnet executes via subagent; Haiku trivial only
- feedback_models_guardian_pattern —
models.jsonoverwritten by gateway on restart; edit backups dir - feedback_chokepoint_principle — CHOKEPOINT-1: every LLM call writes
tool_callsrow - feedback_dual_write_required — dual-write for all state mutation
- feedback_no_plaintext_creds — credentials as
op://references only; no raw keys in files - project_second_max_plan_decision — teamsteph Max plan decision log
KB / source docs
- API — messages API, streaming, tool use, prompt caching, extended thinking
- MODEL-CATALOG — full model lineup with context windows and pricing
- MODEL-CATALOG — how Portkey routes to Anthropic models
- OBSERVABILITY — Portkey-side metrics for Anthropic calls
System maps
- vm-gateway-topology — full call path visualization
- vm-osil-overview — OSIL LLM tiers (DSPy, GEPA, Reflexion over Claude)
Related: LLM routing cluster
Cluster anchor: portkey
| Hub | Role |
|---|---|
| portkey | All calls MUST route via this proxy; Fix #1 + Fix #16 live here |
| supabase | tool_calls writes; cost_usd / token accounting |
| openrouter (Tier 2 — hub pending) | Fallback if Anthropic unavailable |
| moonshot-kimi (Tier 2 — hub pending) | Non-Anthropic model alternative; gated by Fix #1 |
Related: Memory/embeddings cluster
Cluster anchor: supabase
| Hub | Role |
|---|---|
| portkey | Voyage embeddings via voyage-embeddin-9dffd3 VK (616K req/mo) |
| supabase | 44 SQLite DBs + vec0 vector search; chunk embeddings via Voyage |
| voyage (Tier 2 — hub pending) | Embedding model; Anthropic handles LLM; Voyage handles vectors |
Related: Credential layer cluster
Cluster anchor: 1password
| Hub | Role |
|---|---|
| 1password | Stores op://Aurora/anthropic/api-key |
| portkey | Consumes Anthropic key via virtual key abstraction |
Open issues / TODOs
- P0: Model deprecation deadline June 15 2026 — audit all
models.jsonfiles for old IDs (claude-opus-4-20250514,claude-sonnet-4-5-20250514,claude-haiku-4-20250514); migrate to canonical IDs before cutoff - P1: Phase 8 cutover blocked —
:18910binary format mismatch; see openclaw-fragmentation-fix-2026-05-01 §A2.9 F26; needs either proxy.js translation layer or different upstream binary - P1: NemoClaw B1-B6 — pending Henry ratification; if ratified, cred-proxy changes how
op://Aurora/anthropic/api-keyreaches the proxy; update this hub when resolved - P2: Hermes side-track — NousResearch-self-evolution demoted to P2; revisit in OSIL Wave 3 if GEPA/DSPy stack reaches token efficiency ceiling
- P2: Max-plan VK dead credentials —
anthropic-max-p-99c12dVK in Portkey returns 401; archive or rotate
Related domain hubs
- ai-osil — AI/OSIL domain hub (this is the primary integration for OSIL)
Recent activity
- 2026-05-04: ai-osil domain hub cross-linked (ai-osil) as primary domain for OSIL AI self-improvement layer
- 2026-05-03: hub created (W1-S5); Hermes P2 demotion documented; model deprecation timeline added
- 2026-05-02: proxy.js Fix #16 (cache_control on max-plan path); Phase 8 cutover attempted + rolled back
- 2026-04-30: :18903 Max shortcut added for teamsteph@betterfiles.com (40 agents)
- 2026-05-01: Second Max plan live; dual billing topology established