openclaw · pattern · intermediate

Route Expensive Model Calls to Subagents with sessions_spawn

You don't need your best model for everything. You need it for the hard parts. The trick is wiring it so the orchestrator stays fast and the expensive model only shows up when it's actually earned.

Switching Tony’s primary model from MiniMax to Claude Opus felt like an upgrade. Smarter responses, better reasoning, more nuanced task management. It was an upgrade — for about four hours.

Then the health monitor started restarting the Discord session every ten minutes.

The Problem

After switching to Claude Opus as the primary model, OpenClaw’s internal health monitor began flagging the Discord provider as stuck on a reliable ten-minute cycle:

[health-monitor] [discord:default] health-monitor: restarting (reason: stuck)

The restarts were clean — Tony came back each time — but any in-progress response was dropped. Mid-sentence replies. Half-written task updates. The session logs told the story:

[EventQueue] Slow listener detected: DiscordMessageListener took 88.6 seconds
[EventQueue] Slow listener detected: InteractionEventListener took 65322ms

The health monitor runs on a 300-second interval with a 60-second grace window. Claude Opus responses were taking 65-90 seconds. The heartbeat couldn’t complete inside the grace window, so the monitor called it stuck and restarted. Tony wasn’t broken — he was thinking. The health monitor couldn’t tell the difference.

Why This Happens

OpenClaw’s Discord provider and the agent’s LLM inference run in the same Node.js event loop. When an LLM call blocks for 65+ seconds, the heartbeat that the health monitor is waiting for doesn’t fire. The monitor waits 60 seconds, gives up, and restarts the provider.

This isn’t a bug in the health monitor — it’s working correctly. A truly stuck session looks the same from the outside as a session waiting on a slow model. The health monitor can’t distinguish between “processing a long inference” and “hung.”
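The serial-queue behavior implied by those "Slow listener detected" log lines can be sketched in a few lines of Node.js. Everything here is a toy (the names `processSerially` and `demo`, and the 200ms/300ms timings standing in for the real 60s grace window and 65-90s inference) — it just shows why a cheap heartbeat still misses its deadline when it's queued behind a slow listener:

```javascript
const GRACE_MS = 200; // stands in for the real 60-second grace window

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Events are handled one at a time, like the EventQueue in the logs above.
async function processSerially(events) {
  for (const ev of events) {
    await ev.handler(); // each listener runs to completion before the next starts
  }
}

async function demo() {
  const enqueuedAt = Date.now();
  await processSerially([
    { name: "llm-response", handler: () => sleep(300) }, // slow model call
    { name: "heartbeat", handler: async () => {} },      // cheap, but queued behind it
  ]);
  const heartbeatLateness = Date.now() - enqueuedAt;
  // The heartbeat itself costs nothing; it is late only because the queue is serial.
  return heartbeatLateness > GRACE_MS; // true => the monitor would call this "stuck"
}
```

From the monitor's vantage point, a late heartbeat and a dead session are the same observation.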

The root fix is keeping the event loop responsive. The practical answer: don’t run slow models in the main session.

The Pattern: Orchestrator + Model-Specific Subagent

sessions_spawn accepts a model parameter that overrides the model for the spawned subagent run. The parent session stays on its primary model (fast, cheap); the subagent gets exactly the model it needs for the task.

{
  "tool": "sessions_spawn",
  "input": {
    "model": "anthropic/claude-opus-4-6",
    "label": "draft: pixelspace-manifesto",
    "task": "..."
  }
}

The subagent runs in an isolated agent:main:subagent:<uuid> session. It doesn’t share context with the parent — it gets a clean slate and whatever you put in task. After it completes, OpenClaw runs an announce step and posts the result back to the originating channel automatically.

The parent session returns { status: "accepted", runId, childSessionKey } immediately and moves on. The event loop stays clear.
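As a sketch of the orchestrator side: `sessionsSpawn` below is a hypothetical stand-in for however the sessions_spawn tool call surfaces in your code, but the return shape matches the `{ status, runId, childSessionKey }` described above:

```javascript
// Hypothetical wrapper: hand the slow model off to a child session and
// return immediately. `sessionsSpawn` is injected, not a real API name.
async function routeDraftToOpus(sessionsSpawn, brief) {
  const res = await sessionsSpawn({
    model: "anthropic/claude-opus-4-6",
    label: "draft: pixelspace-manifesto",
    task: brief,
  });
  if (res.status !== "accepted") {
    throw new Error(`spawn not accepted: ${res.status}`);
  }
  // The 65-90 second Opus call happens in the child session,
  // not in this event loop — nothing here waits on inference.
  return res.childSessionKey;
}
```

The parent never awaits the draft itself; the announce step delivers the result later.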

Applied: A Two-Model Draft Skill

Tony’s /draft skill was the right place to apply this. Content drafting is the one task that genuinely benefits from Opus — nuanced prose, voice matching, creative structure. Everything else (task management, morning briefings, quick responses) runs fine on MiniMax.

The skill is split into two responsibilities:

Tony (MiniMax — the orchestrator):

  1. Determine the topic (inline input → ideas backlog → synthesis from recent notes)
  2. Read the 3-5 most relevant knowledge notes
  3. Read VOICE.md — Adam’s full writing voice profile
  4. Compose a self-contained brief with all context embedded
  5. Call sessions_spawn with model: anthropic/claude-opus-4-6
  6. Immediately reply: ✍️ Drafting "{topic}" with Opus — will post when done.

Opus subagent (the writer):

  • Receives topic, voice profile, and source material in the task string — no file access needed
  • Writes the draft to disk
  • Runs image generation
  • Runs backup.sh to commit and push
  • Returns a four-line summary that gets announced back to the channel

The key design principle: the subagent gets a tight brief with everything embedded. It doesn’t need to browse the workspace — Tony did that work already. The brief itself reads like a structured document:

## Voice Profile (follow precisely)
{full contents of VOICE.md pasted here}

## Reference Material
{relevant knowledge note excerpts pasted here}

## Writing Instructions
400-800 words, flowing prose, no headers...

This is the pattern: preparation is cheap (fast model reads files), execution is expensive (slow model writes prose). Keep them separate.
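A minimal sketch of what the cheap preparation step produces, assuming a hypothetical `composeBrief` helper (the section names mirror the brief excerpt above; none of this is an OpenClaw API):

```javascript
// The fast model has already read VOICE.md and the relevant notes;
// everything the writer needs is packed into one self-contained string.
function composeBrief({ topic, voiceProfile, notes, instructions }) {
  return [
    `Write a draft on: ${topic}`,
    "",
    "## Voice Profile (follow precisely)",
    voiceProfile,
    "",
    "## Reference Material",
    ...notes,
    "",
    "## Writing Instructions",
    instructions,
  ].join("\n");
}
```

The resulting string goes straight into the `task` parameter of sessions_spawn — the subagent never touches the filesystem to gather context.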

Wiring the Backup

One gotcha: make the post-writing steps explicit and unconditional in the task brief. The first run of this skill had the subagent skip backup.sh entirely after image generation failed — it saw a failure and apparently interpreted “continue” loosely.

The fix is unambiguous instruction ordering:

## After Writing

1. Attempt hero image generation (non-blocking — always continue to step 2 regardless):
   cd /path/to/site && node scripts/generate-image.mjs /path/to/draft.md

2. Run backup — mandatory, always run even if step 1 failed:
   /path/to/scripts/backup.sh

3. Reply with exactly four lines:
   ...

Subagents will skip steps that feel optional if the brief is ambiguous. Make the mandatory steps look mandatory.

The Config Change

Switching back to MiniMax as primary is a one-line edit in openclaw.json, and OpenClaw hot-reloads it without a restart:

{
  "agents": {
    "defaults": {
      "model": {
        "primary": "minimax/MiniMax-M2.5",
        "fallbacks": [
          "anthropic/claude-opus-4-6",
          "ollama/qwen3:4b"
        ]
      }
    }
  }
}

The health monitor restarts stopped immediately. Tony’s Discord session has been stable since.

Key Takeaway

sessions_spawn’s model parameter lets you treat model selection as a routing decision, not a global setting. Keep your primary model fast enough that the event loop stays healthy. Spawn expensive models only for the tasks that justify the latency — and make sure those tasks run in subagents where the wait can’t destabilize your main session.

The orchestrator’s job is coordination, not computation. Give it a model that matches that job.

Resources

  • OpenClaw sessions_spawn docs — full parameter reference for spawning subagents, including model override, sandbox inheritance, and announce behavior.

FAQ

Why does a slow LLM response cause the OpenClaw health monitor to restart my Discord session?

OpenClaw's health monitor runs on a heartbeat interval (default: 300s) with a grace window (default: 60s). The Discord provider runs in the same Node.js event loop as your agent. When an LLM response takes longer than the grace window — Claude Opus regularly hits 65-90 seconds — the heartbeat can't complete in time and the monitor calls the session stuck, triggering a restart. The fix is either a faster primary model or a longer grace window.

How do I pass context to a subagent when spawning with sessions_spawn?

Embed it directly in the task string. sessions_spawn doesn't share file access or session history with the parent — the subagent gets a clean slate. Pack everything it needs (topic, voice guidelines, source material, instructions) into the task parameter as structured text. Think of it as writing a detailed brief, not a function call.

Can a subagent spawned with sessions_spawn commit and push to git?

Yes, if the exec tool is available (it is by default — subagents get the full tool set minus session tools). The subagent can run bash commands including git. On Linux with gh CLI installed as a credential helper, sourcing your .env file to set GITHUB_TOKEN before git push is enough. Make the backup step explicit and unconditional in your task brief — the subagent will skip optional steps if the brief is ambiguous.

What's the model parameter syntax for sessions_spawn?

Pass it as a string matching the provider/model-id format used in your openclaw.json. For example: model: 'anthropic/claude-opus-4-6'. Invalid values will error. Use agents_list to discover which agent ids are allowed if you're also routing to a different agentId.