How Chat-Driven Execution Works
The Headline Diagram
Every prompt you type travels through eight distinct stages before a response appears. The diagram below maps the full journey — from keypress to memory persistence.
The 8 Stages Explained
Stage 1 — User Input → Claude Code CLI
The Claude Code CLI receives your prompt text. Nothing else has happened yet — no governance rules are loaded, no tools are called. The CLI is the entry point that bootstraps everything that follows.
Stage 2 — SessionStart Hook (first turn only)
On the very first turn of a session, the hook script on-session-start.sh fires before Claude processes anything. It reads SESSION-AUDIT.md (the running log of what is in flight), ACTIVE-PROJECT.md (the current primary work item), and performs a memory search using the claude-memory.py CLI against the 44 per-agent SQLite databases in /home/opsadmin/.openclaw/memory/. The result is injected directly into the system prompt, so Claude starts every session already oriented to current state rather than asking you “where were we?”
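That injection step can be sketched as follows. The file names come from the text above; the function shape, heading labels, and formatting are assumptions, not the real hook's code.

```python
from pathlib import Path

# Paths named in the text above; everything else in this sketch is illustrative.
AUDIT = Path.home() / "SESSION-AUDIT.md"
ACTIVE = Path.home() / "ACTIVE-PROJECT.md"

def build_session_context(memory_hits):
    """Assemble the audit log, active project, and top memory hits into
    one block to be injected into the system prompt."""
    parts = []
    for label, path in (("SESSION AUDIT", AUDIT), ("ACTIVE PROJECT", ACTIVE)):
        if path.exists():
            parts.append("## " + label + "\n" + path.read_text())
    if memory_hits:
        parts.append("## MEMORY HITS\n" + "\n".join("- " + h for h in memory_hits))
    return "\n\n".join(parts)
```

The point is that orientation is assembled from files and search results, not from asking the user.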
Stage 3 — Request Routing via CLAUDE.md
Claude reads the full governance document at /home/opsadmin/CLAUDE.md — specifically the Request Routing Protocol section and the Tool Trigger Conditions table. This step determines: Does a matching plan already exist in ~/.claude/plans/? Which agent domain owns this work? What skill, if any, should auto-invoke? Is this execution-mode work that should be dispatched to a Sonnet subagent rather than handled by Opus directly? Routing happens before any tool is called or any code is written.
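A toy sketch of that decision order follows. The trigger phrases here are illustrative guesses — the real triggers live in the Request Routing Protocol and the Tool Trigger Conditions table, not in this code.

```python
# Illustrative routing sketch -- trigger words and return labels are
# assumptions; the real logic is defined in CLAUDE.md.
def route(prompt):
    p = prompt.lower()
    if p.startswith("/"):
        return "skill"                        # slash command -> skill lookup
    if any(w in p for w in ("go ahead", "implement", "execute")):
        return "dispatch-to-sonnet-subagent"  # execution-mode work
    if any(w in p for w in ("what did we decide", "remember")):
        return "mcp:memory_search"            # memory retrieval trigger
    return "direct"                           # Opus handles it inline
```

The essential property is ordering: routing runs to completion before any tool is selected.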
Stage 4 — Tool Selection
Claude picks the appropriate tool for the work:
- Skill — loads a YAML workflow from ~/.claude/skills/<name>/SKILL.md and re-enters with that context in scope
- MCP tool — dispatches to one of the 14 connected MCP servers (configured in ~/.mcp.json), each of which runs as a child process
- Agent tool — spawns a Sonnet or Haiku subprocess with an isolated context window and a structured 8-element prompt
- Bash / Read / Edit — built-in tools that execute directly in the shell or filesystem
Stage 5 — Sub-system Dispatch
Each tool type has a different execution substrate. Bash tools run shell commands directly and return stdout/stderr. MCP tools invoke a JSON-RPC call to a connected server process — for example, memory_search calls into the openclaw-tools MCP server, which runs claude-memory.py under the hood. Agent tool dispatches create a fully isolated subprocess with its own context; that subagent reads its own files, calls its own tools, and returns a structured result. Skills re-inject YAML-based instructions and constraints into the current conversation before Claude continues.
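For the MCP path, the wire format is a JSON-RPC 2.0 request. A minimal sketch of the memory_search call mentioned above — the method name follows the MCP tools/call convention, and the top_k argument is an assumption:

```python
import json

# JSON-RPC 2.0 envelope for an MCP tool call; the "tools/call" method is
# the standard MCP shape, while the argument names are illustrative.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "memory_search",
        "arguments": {"query": "cred rotation", "top_k": 5},
    },
}
wire = json.dumps(request)
```

The server process receives this over stdio, runs claude-memory.py, and replies with a matching JSON-RPC response.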
Stage 6 — LLM Call via Portkey :18900
Every call to an LLM — Anthropic, OpenAI, Moonshot, or OpenRouter — goes through the Portkey proxy running at port 18900 (/home/opsadmin/.openclaw/portkey-proxy/proxy.js). The proxy reads a config slug (e.g. sonnet-for-coding, kimi-for-calls) to determine which provider and model to route to, applies the correct per-agent cache namespace, and attaches metadata (agent ID, session ID, tier). The response comes back through Portkey, which non-blockingly writes a row to the tool_calls Supabase table (CHOKEPOINT-1) and sends a trace to Langfuse at :18840 for observability. If Postgres is unreachable, the row is queued to /tmp/openclaw/tool-calls-fallback.jsonl for later drain.
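The write-with-fallback behavior can be sketched like this. The fallback path and file name come from the text above; insert_fn is a stand-in for the real Supabase client call, and the row fields are illustrative.

```python
import json
import os

FALLBACK = "/tmp/openclaw/tool-calls-fallback.jsonl"  # path from the text above

def record_tool_call(row, insert_fn):
    """Try the Postgres insert; on failure, append the row to a JSONL
    fallback file so a later drain job can replay it."""
    try:
        insert_fn(row)
        return "postgres"
    except Exception:
        os.makedirs(os.path.dirname(FALLBACK), exist_ok=True)
        with open(FALLBACK, "a") as f:
            f.write(json.dumps(row) + "\n")
        return "fallback"

def failing_insert(row):
    # Simulate Postgres being unreachable.
    raise ConnectionError("db down")
```

Because the write is non-blocking in the real proxy, a database outage degrades the audit trail to the queue file instead of delaying the LLM response.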
Stage 7 — PostToolUse Hook
After every tool call completes, on-post-tool.sh fires. It writes a JSONL audit entry to /home/opsadmin/.openclaw/logs/claude-code-audit.log and enqueues the interaction summary to the memory pipeline. The pipeline calls Voyage-4 to embed the text, then stores the embedding in the chunks_vec table (vec0 vector index) and the raw text in chunks_fts (FTS5 full-text index) of the relevant agent’s SQLite database. This is how decisions, plans, and discoveries survive across sessions.
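A simplified sketch of the dual write, assuming neither Voyage-4 nor the sqlite-vec extension is available here: a single plain table stands in for the vec0 and FTS5 indexes, and embed() is a placeholder for the real embedding call.

```python
import json
import sqlite3

# In-memory stand-in for an agent's SQLite database. The real pipeline
# writes to chunks_vec (vec0) and chunks_fts (FTS5); this sketch collapses
# both into one table to show the shape of the write.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, text TEXT, embedding TEXT)")

def embed(text):
    # Placeholder for a Voyage-4 embedding call.
    return [float(len(text)), 0.0]

def store_chunk(text):
    cur = db.execute(
        "INSERT INTO chunks (text, embedding) VALUES (?, ?)",
        (text, json.dumps(embed(text))),
    )
    return cur.lastrowid

first_id = store_chunk("Decided to rotate gateway creds weekly")
```

Storing both a vector and the raw text is what later enables the hybrid (vector + keyword) retrieval described in Stage 5's memory_search path.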
Stage 8 — Stop Hook (session end)
When the Claude Code session ends, on-stop.sh fires. It calls session-summary-extractor.py, which reads the bridge JSON file written at /tmp/claude-session-summary-$$.json (or falls back to plan-title inference if the bridge file is absent). The summary is appended to SESSION-AUDIT.md under the SESSION HISTORY section and enqueued for memory embedding. This is what makes “where did we leave off?” answerable at the start of the next session.
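The read-the-bridge-or-fall-back logic can be sketched as follows. The bridge-file mechanism is described above; the JSON field name and the fallback wording are assumptions.

```python
import json
import os

def extract_summary(bridge_path, plan_title):
    """Prefer the bridge JSON written during the session; if it is
    absent, fall back to a stub inferred from the plan title."""
    if os.path.exists(bridge_path):
        with open(bridge_path) as f:
            return json.load(f)["summary"]  # "summary" field is assumed
    return "(no bridge file) worked on: " + plan_title
```

Either branch yields a line that can be appended to SESSION-AUDIT.md, so the next SessionStart hook always has something to read.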
Five Worked Examples
Example 1 — Reading a Log
Henry types: “show me the gateway logs from last hour”
Routing identifies this as a read-only status query — no plan check needed, no subagent needed. Claude selects the Bash tool and runs:
journalctl --user -u openclaw-gateway --since "1 hour ago" --no-pager
The PreToolUse hook validates the command against the safety blocklist in pre-bash-check.sh — it passes. The command runs, stdout comes back, Claude formats it and returns the response. PostToolUse writes the audit entry. No LLM subagent was involved, no Portkey call was made. The total cost: zero tokens beyond the display response.
Example 2 — Searching Memory
Henry types: “what did we decide about the cred rotation?”
Routing matches the memory_search MCP tool trigger. Claude calls mcp__openclaw-tools__memory_search with the query string. The MCP server invokes claude-memory.py search "cred rotation", which embeds the query using Voyage-4 and runs a hybrid search (70% vector similarity via vec0 + 30% BM25 keyword via FTS5) across the memory database. The top-K chunks are returned, formatted, and displayed. No LLM subagent. No Portkey call for the retrieval itself — Voyage-4 runs locally in the Python process.
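The 70/30 blend reduces to a weighted sum over scores from the two indexes. A toy sketch with pre-normalized scores (the real values come from vec0 cosine similarity and FTS5 BM25):

```python
# Hybrid ranking as described above: 70% vector similarity, 30% BM25.
# Both inputs are assumed to be normalized to [0, 1] before blending.
def hybrid_score(vec_sim, bm25_norm):
    return 0.7 * vec_sim + 0.3 * bm25_norm

candidates = {
    "chunk-a": hybrid_score(0.92, 0.40),  # strong semantic match
    "chunk-b": hybrid_score(0.55, 0.95),  # strong keyword match
}
best = max(candidates, key=candidates.get)
```

With these weights, a strong semantic match outranks a strong keyword match — the blend favors meaning over exact wording.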
Example 3 — Sonnet Subagent Dispatch
Henry types: “go ahead and implement the new tool_calls schema migration”
Routing identifies the keyword “go ahead” as an execution-mode trigger (per G-MODEL-ROUTING-AT-EXEC). Opus — the main conversation model — acts as dispatcher. It builds an 8-element prompt containing the task goal, evidence gathered (the migration file path, schema diff, prior test results), the tool to use (psql + the migration SQL file), acceptance criteria (grep for new column, check NOT NULL constraint), rollback command (DROP COLUMN), and rule constraints (G-NO-PLAINTEXT-CREDS, CHOKEPOINT-3). This prompt is dispatched via the Agent tool with model: sonnet.
Sonnet runs in an isolated subprocess: it reads the migration file, validates it, runs psql against the Supabase connection string, checks the output, and returns a structured result. Opus receives the result, runs its own independent verification (querying the table schema directly), and only marks the phase complete if the check passes.
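The dispatch prompt might be represented like this. The text above names six of the eight elements explicitly; the keys mirror those names, and every literal value below is illustrative.

```python
# Illustrative structure for the subagent dispatch prompt. Keys follow the
# six elements named in the example; values are placeholders, not real paths.
dispatch_prompt = {
    "task_goal": "Apply the new tool_calls schema migration",
    "evidence": ["migration file path", "schema diff", "prior test results"],
    "tool": "psql + the migration SQL file",
    "acceptance_criteria": ["grep for new column", "check NOT NULL constraint"],
    "rollback": "DROP COLUMN",
    "constraints": ["G-NO-PLAINTEXT-CREDS", "CHOKEPOINT-3"],
}
```

Packing rollback and acceptance criteria into the prompt is what lets the dispatcher verify the subagent's result independently instead of trusting its self-report.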
Example 4 — Skill Invocation
Henry types: /dispo-blast
The Tool Trigger Conditions table in CLAUDE.md matches the /dispo-blast keyword to the dispo-blast skill. Claude invokes the Skill tool, which loads ~/.claude/skills/dispo-blast/SKILL.md — a YAML document that defines the workflow, the tools to call, the pre-blast history checks (already-blasted, recent-engagement, saturation, suppression), and the constraint that --dry-run is required unless live blast is explicitly authorized. Claude re-enters the conversation with that YAML in scope and executes the skill workflow step by step. The skill wraps existing scripts — it does not reimplement logic that already lives in dispo-blast-engine.js.
Example 5 — Hook Blocking a Dangerous Command
Henry (hypothetically) types: “rm -rf /”
Claude selects the Bash tool as the appropriate tool for this shell command. Before the command runs, the PreToolUse hook fires: pre-bash-check.sh evaluates the command against its blocklist of destructive patterns. The pattern rm -rf / matches. The hook exits with code 2, which signals the Claude Code harness to block the tool call. The Bash tool never executes. Claude receives the hook’s rejection and returns an explanation to Henry. The audit log records the blocked attempt.
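A minimal sketch of such a guard, with an illustrative blocklist (not the real patterns from pre-bash-check.sh):

```python
import re

# Example destructive-command patterns; the real blocklist is larger.
BLOCKLIST = [
    r"\brm\s+-rf\s+/(\s|$)",   # recursive delete of the filesystem root
    r"\bmkfs\.",               # reformat a filesystem
    r"\bdd\s+.*of=/dev/sd",    # raw write to a disk device
]

def check(command):
    """Return 2 (block) if the command matches a destructive pattern,
    else 0 (allow). Exit code 2 signals the harness to block the call."""
    for pattern in BLOCKLIST:
        if re.search(pattern, command):
            return 2
    return 0
```

Because the harness interprets the exit code before the Bash tool runs, the check is enforced outside the model's context and cannot be talked around.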
Why Hooks Matter
The four hooks — SessionStart, PreToolUse, PostToolUse, and Stop — enforce properties that neither Claude nor the operator should have to think about manually.
SessionStart solves the cold-start problem. Without it, every session would begin from scratch, forcing Henry to re-explain context. With it, Claude starts each session already loaded with the current work queue, the active project, and the top memory hits from prior sessions.
PreToolUse is the last safety gate before any shell command or external call executes. It runs synchronously, blocking execution if the command matches the blocklist. This cannot be bypassed by prompt engineering — it runs outside Claude’s context entirely.
PostToolUse is what makes the memory system self-maintaining. Every meaningful interaction gets embedded and stored automatically, without Claude needing to remember to call a save function. The embedding pipeline runs asynchronously so it does not add latency to the response.
Stop closes the loop. Without it, session history would exist only in the conversation transcript, invisible to future sessions. The stop hook extracts the summary and appends it to SESSION-AUDIT.md, which SessionStart reads the next time. This is the mechanism that makes the system accumulate institutional memory over time rather than resetting every 24 hours.
Why Portkey Is a Chokepoint
Every LLM call in the system — whether from the main conversation, a Sonnet subagent, or an agent running independently — funnels through the Portkey proxy at :18900. This is intentional and governed by CHOKEPOINT-1.
Three things happen at the chokepoint that cannot happen if calls bypass it: First, a tool_calls row is written to Supabase with the model, cost, token counts, latency, and agent ID. This is the audit trail that makes cost attribution and drift detection possible. Second, a trace is sent to Langfuse at :18840, providing request-level observability across all providers and all agents. Third, provider routing and per-agent cache namespaces are applied consistently — the same Sonnet call from two different agents gets routed to the right provider and billed to the right cost center.
A call that bypasses Portkey produces no audit row, no Langfuse trace, and no cache hit. The tool-calls-health-check.timer runs every 5 minutes comparing Portkey call counts against tool_calls inserts; a greater-than-10% delta fires a Discord alert to #ops.
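The delta check itself is simple arithmetic. A sketch using the 10% threshold from the text; the function names and the alerting mechanism (a Discord webhook in the real system) are stand-ins:

```python
# Fraction of Portkey calls that never produced a tool_calls row.
def audit_gap(portkey_calls, tool_call_rows):
    if portkey_calls == 0:
        return 0.0
    return (portkey_calls - tool_call_rows) / portkey_calls

# Threshold from the text: a greater-than-10% delta fires the alert.
def should_alert(portkey_calls, tool_call_rows):
    return audit_gap(portkey_calls, tool_call_rows) > 0.10
```

So 170 audit rows against 200 Portkey calls (a 15% gap) alerts, while 195 against 200 (2.5%) does not.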
Where to Go Next
- Chat Execution Sequence (full diagram) — the standalone sequence diagram in more detail
- Hook Lifecycle Map — how all four hooks interact with the broader system
- Skills Primer — what skills are, how they differ from MCP tools, and how to invoke them
- Architecture Snapshot — the full service topology, port registry, and agent tier map