Model Selector — Optimal Model Routing Gate

Presents an interactive popup to recommend and confirm the best model for an upcoming execution task. Implements the G-MODEL-ROUTING-AT-EXEC matrix as a user-facing decision gate.

When to invoke

  • Auto (Step 1) — called by wave-execute before any hub generation dispatch
  • Auto — called when GSD enters execution mode for a batch task of more than 3 items
  • Manual — /model-selector before any operation where model choice matters
  • Trigger keywords — “which model should I use”, “recommend a model”, “model select”

Model selection matrix

Task type | Recommended model | Dispatch method | Cost/M
Bulk templated generation (>3 same type) | Gemma 4 26B | mcp__gemma__gemma_ask or gemma-generate.js | $0.06
Reasoning / OSIL / governance / science | Gemma 4 31B | mcp__gemma__gemma_reason | $0.13
Audit / evidence gathering (grep-heavy) | Claude Haiku | Agent(model: "haiku") | $0.25
Novel hub / first instance of new type | Claude Sonnet | Agent(model: "sonnet") | $3.00
Code / script / webhook / cron page | Kimi (moonshot-v1-8k) | gemma-generate.js --model moonshot-v1-8k | ~$0.15
Orchestration / planning / rule arbitration | Claude Opus | Stay in main session | $15.00

Decision rule: proven pattern → Gemma 26B. First-ever instance → Sonnet. Code content → Kimi. Audit → Haiku, always. Novel architecture → stays with Opus in the main session.
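
The decision rule can be sketched as a small lookup. This is an illustration under stated assumptions, not the skill's actual implementation: the Task fields, the function names, and the precedence order (audit/code/orchestration first, then first-instance, then reasoning, then bulk) are hypothetical. The model names, dispatch methods, and rates are copied from the matrix.

```python
from dataclasses import dataclass

# Rows copied from the selection matrix:
# task type -> (recommended model, dispatch method, approx. $ per 1M tokens).
MATRIX = {
    "bulk": ("Gemma 4 26B", "mcp__gemma__gemma_ask or gemma-generate.js", 0.06),
    "reasoning": ("Gemma 4 31B", "mcp__gemma__gemma_reason", 0.13),
    "audit": ("Claude Haiku", 'Agent(model: "haiku")', 0.25),
    "novel": ("Claude Sonnet", 'Agent(model: "sonnet")', 3.00),
    "code": ("Kimi (moonshot-v1-8k)", "gemma-generate.js --model moonshot-v1-8k", 0.15),
    "orchestration": ("Claude Opus", "Stay in main session", 15.00),
}


@dataclass
class Task:
    # Hypothetical task descriptor; these fields are not defined by the skill.
    kind: str                     # "hub", "audit", "code", "reasoning" or "orchestration"
    first_instance: bool = False  # True when no proven pattern exists yet


def classify(task: Task) -> str:
    """Return the matrix row key for a task (precedence order is an assumption)."""
    if task.kind in ("audit", "code", "orchestration"):
        return task.kind
    if task.first_instance:
        return "novel"
    if task.kind == "reasoning":
        return "reasoning"
    return "bulk"  # proven pattern


def recommend(task: Task) -> tuple:
    """Look up (model, dispatch method, rate) for the classified task."""
    return MATRIX[classify(task)]
```

For example, recommend(Task("hub")) returns the Gemma 4 26B row, while recommend(Task("hub", first_instance=True)) returns the Claude Sonnet row.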

Execution protocol

Step 1 — Read task context

Before presenting the popup:

  • Identify the task scope (how many items, what type)
  • Pre-select a recommendation based on the matrix above
  • Calculate rough cost estimate (tokens × rate)
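
The rough estimate (tokens × rate) might look like the sketch below. It assumes a single blended per-million rate for input and output tokens, as in the Cost/M column; the function name and the per-item token counts in the example are hypothetical.

```python
def estimate_cost(items: int, tokens_in: int, tokens_out: int,
                  rate_per_million: float) -> float:
    """Rough cost in dollars: total tokens times a blended $/1M-token rate."""
    total_tokens = items * (tokens_in + tokens_out)
    return total_tokens / 1_000_000 * rate_per_million


# Hypothetical workload: 40 hubs at 15k tokens in and 5k tokens out each,
# priced at the Claude Sonnet rate from the matrix ($3.00/M).
cost = estimate_cost(40, 15_000, 5_000, 3.00)
print(f"~${cost:.2f}")
```

The estimate is only as good as the token guess, so it is best treated as an order-of-magnitude figure for the popup.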

Step 2 — Present AskUserQuestion popup

AskUserQuestion(questions=[{
  question: "Which model should execute this phase? (Recommended option pre-selected based on task type)",
  header: "Model route",
  multiSelect: false,
  options: [
    {
      label: "Gemma 4 26B — Bulk gen",
      description: "→ $0.06/M. Best for: bulk templated hubs, same pattern repeated >3×. 256K context. Deterministic. (Recommended for most waves)"
    },
    {
      label: "Claude Haiku — Audit",
      description: "→ $0.25/M. Best for: all 9 audit checks, grep-heavy evidence gathering, structured JSON output."
    },
    {
      label: "Claude Sonnet — Novel",
      description: "→ $3.00/M. Best for: first-instance of new hub type, complex cross-linking, unproven patterns."
    },
    {
      label: "Gemma 4 31B — Reasoning",
      description: "→ $0.13/M. Best for: OSIL/science/governance hubs. GPQA 84.3% vs Sonnet 74.1%."
    }
  ]
}])

Note: If the task routes to Kimi (code/script pages) or Opus (orchestration/planning), offer that model through the “Other” free-text option in AskUserQuestion. The four listed options cover the 80% case; the remaining routes use the free-text escape hatch.

Step 3 — Confirm and route

After selection, output the routing decision before proceeding:

MODEL-SELECTOR RESULT:
TASK: {task description — e.g., "Wave 4: 40 agent-peer hubs"}
CLASSIFIED AS: {task type}
RECOMMENDED: {model name}
CONFIRMED: {model name} [(same as recommended) | (overridden → X by Henry)]
DISPATCH: {exact dispatch method}
EST. ITEMS: {N}
EST. TOKENS: ~{Xk in / Yk out per item × N items}
EST. COST: ~${estimate}

Then proceed with dispatch using the confirmed model.
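
As a sketch, the result block in Step 3 could be rendered by a helper like the one below. The function name and parameters are hypothetical, and the override wording reflects one reading of the template, in which X is the model Henry chose.

```python
def format_result(task, task_type, recommended, confirmed, dispatch,
                  items, tokens_in_k, tokens_out_k, est_cost):
    """Render the MODEL-SELECTOR RESULT block as a single string."""
    if confirmed == recommended:
        status = "(same as recommended)"
    else:
        # Assumption: X in the template is the model chosen instead.
        status = f"(overridden → {confirmed} by Henry)"
    lines = [
        "MODEL-SELECTOR RESULT:",
        f"TASK: {task}",
        f"CLASSIFIED AS: {task_type}",
        f"RECOMMENDED: {recommended}",
        f"CONFIRMED: {confirmed} {status}",
        f"DISPATCH: {dispatch}",
        f"EST. ITEMS: {items}",
        f"EST. TOKENS: ~{tokens_in_k}k in / {tokens_out_k}k out per item × {items} items",
        f"EST. COST: ~${est_cost:.2f}",
    ]
    return "\n".join(lines)
```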

Integration with wave-execute

In wave-execute Step 1 (before pre-flight), call this skill to confirm the model for each task category in the wave:

Wave N contains:
  - Bulk hub generation: → invoke model-selector → expected: Gemma 26B
  - Audit pass: → invoke model-selector → expected: Haiku
  - Novel hubs (if any): → invoke model-selector → expected: Sonnet

If every category matches its expected default, skip the popup for those categories (Henry has pre-authorized the matrix) and show it only for novel/first-instance hubs.
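
The skip rule can be sketched as follows. The category keys and function names are hypothetical; the expected defaults are the three listed above. Treating a mismatch or an unlisted category as "show the popup" is an assumption, since the text only states the skip case.

```python
# Expected defaults per wave category, as listed above.
EXPECTED_DEFAULTS = {"bulk": "Gemma 26B", "audit": "Haiku", "novel": "Sonnet"}


def needs_popup(category: str, recommended: str) -> bool:
    """True when the user should be asked to confirm the model."""
    if category == "novel":
        return True  # novel/first-instance hubs always get the popup
    # Assumption: mismatches and unlisted categories also trigger the popup.
    return EXPECTED_DEFAULTS.get(category) != recommended


def popups_for_wave(recommendations: dict) -> list:
    """Return the categories in a wave that still need confirmation."""
    return [cat for cat, model in recommendations.items() if needs_popup(cat, model)]
```

For a wave with bulk, audit, and novel categories that all match their defaults, only the novel category would reach the popup.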

Cost comparison reference

For a typical Wave 4 (~40 hubs):

Route | Cost
Gemma 26B (bulk 38) + Haiku (audit) + Sonnet (2 novel) | ~$0.04
All Sonnet | ~$2.40
All Opus | ~$12.00

Savings vs all-Sonnet: ~98% cost reduction.
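
The savings figure follows directly from the route costs above. A minimal check, using only those three quoted costs (the function name is hypothetical):

```python
def savings_pct(baseline: float, actual: float) -> float:
    """Percentage cost reduction of `actual` relative to `baseline`."""
    return (1 - actual / baseline) * 100


mixed, all_sonnet, all_opus = 0.04, 2.40, 12.00  # route costs quoted above

print(round(savings_pct(all_sonnet, mixed), 1))  # 98.3
print(round(savings_pct(all_opus, mixed), 1))    # 99.7
```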

Override behavior

If Henry selects a different model than recommended:

  1. Log the override: "Henry overrode Gemma 26B → Sonnet for [reason]"
  2. Proceed with Henry’s choice — no blocking
  3. Note the override in the morning brief if it affected cost materially
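
The override steps could be captured by a helper along these lines. This is a sketch only: the function name, the returned fields, and the $1.00 materiality threshold are hypothetical, since the skill does not define what counts as a material cost change.

```python
def record_override(recommended: str, chosen: str, reason: str,
                    cost_delta: float, material_threshold: float = 1.00) -> dict:
    """Build the override log entry and flag it for the morning brief if material."""
    return {
        # Step 1: log line in the format shown above.
        "log": f"Henry overrode {recommended} → {chosen} for {reason}",
        # Step 2: never block; the chosen model is what gets dispatched.
        "dispatch_model": chosen,
        # Step 3: hypothetical threshold for "affected cost materially".
        "note_in_morning_brief": abs(cost_delta) >= material_threshold,
    }
```

For instance, record_override("Gemma 26B", "Sonnet", "complex cross-linking", 2.36) produces a log line and flags the override for the brief under the assumed threshold.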

Related

  • wave-execute — calls this as Step 1 before dispatch
  • feedback_model_routing_at_exec.md — G-MODEL-ROUTING-AT-EXEC gate
  • feedback_model_routing_pattern.md — full Opus/Sonnet/Haiku rubric
  • ~/.openclaw/workspace/scripts/wave-execute/gemma-generate.js — Gemma + Kimi fallback script

Invokes / Invoked by

Invokes: SKILL, AskUserQuestion
Invoked by: SKILL, _summary

Recent activity

  • 2026-05-04: skill created (Henry request post Wave 3 compact — “popup with a recommendation for the model selection for the best and highest productivity”)