The Problem
Tony needed to send tasks to Mel over HTTP. Simple enough: Tony POSTs a TASK message to Mel’s endpoint; Mel processes it and replies. Clean, direct, no Discord dependency.
The endpoint accepted messages and returned {"status": "received"}. Then it started returning 500s. Then it started timing out entirely.
The error, once we dug it out of the uvicorn logs:
```
TimeoutError
  File "server.py", line 76, in send_to_agent
    stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout=60)
```
Sixty seconds wasn’t enough. An OpenClaw agent turn — reading context, running Claude, producing a response — takes longer than that. The HTTP handler was await-ing the entire thing inline, the client timed out, and the whole request blew up.
Why This Happens
HTTP is request-response. The client sends a request and holds the connection open waiting for a reply. If you trigger an LLM inference call inside that handler, you’re asking the client to wait for Claude. Claude takes as long as Claude takes.
This is a shape mismatch, not a bug. HTTP expects milliseconds to seconds. LLM turns expect seconds to minutes. You can’t bridge that gap by raising a timeout — you bridge it by changing the shape.
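You can see the mismatch in miniature without any server at all. In this standalone sketch, a 0.05 s limit stands in for the HTTP client's 60 s timeout: once the work outgrows the limit, nudging the number doesn't change the outcome.

```python
import asyncio

async def slow_agent_turn():
    """Stand-in for an LLM turn that outlives the client's patience."""
    await asyncio.sleep(0.2)  # pretend this is a multi-minute Claude call
    return "response"

async def main():
    try:
        # 0.05 s plays the role of the HTTP client's 60 s timeout
        return await asyncio.wait_for(slow_agent_turn(), timeout=0.05)
    except asyncio.TimeoutError:
        return "timed out"

print(asyncio.run(main()))  # → timed out
```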
The fix has two parts:
- Accept immediately — return 200 before the agent starts working
- Callback endpoint — give the agent a route to POST its response when it’s done
The Fix
Part 1: Fire and Forget
Move the agent invocation off the request path using asyncio.create_task():
```python
import asyncio

from fastapi import FastAPI, Header, HTTPException

app = FastAPI()

# log(), handle(), and SHARED_SECRET are defined elsewhere in server.py

async def _run_agent(message: str):
    """Run openclaw agent in background; log errors."""
    try:
        proc = await asyncio.create_subprocess_exec(
            "openclaw", "agent", "--agent", "main", "-m", message,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
        stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout=300)
        if proc.returncode != 0:
            log("agent_error", {"error": stderr.decode()})
    except Exception as e:
        log("agent_error", {"error": str(e)})

def send_to_agent(message: str):
    """Schedule agent turn in background; return immediately."""
    asyncio.create_task(_run_agent(message))

@app.post("/message")
async def receive_message(payload: dict, x_shared_secret: str = Header(None)):
    if x_shared_secret != SHARED_SECRET:
        raise HTTPException(status_code=401, detail="unauthorized")
    log("inbound", payload)
    await handle(payload)
    return {"status": "received"}  # returns before the agent starts
```
The HTTP response goes back in milliseconds. The agent turn runs in the background.
Note: asyncio.create_task() requires an active event loop — FastAPI’s async context provides one. Don’t call this from a sync function.
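A minimal illustration of that constraint — the same call succeeds inside a running loop and raises RuntimeError outside one:

```python
import asyncio

async def work():
    return "done"

# Inside a running loop (what an async FastAPI handler gives you): fine.
async def handler():
    task = asyncio.create_task(work())  # scheduled on the current loop
    return await task

print(asyncio.run(handler()))  # → done

# Outside any loop (a plain sync function on some other thread): RuntimeError.
coro = work()
try:
    asyncio.create_task(coro)
except RuntimeError as e:
    print(e)  # typically "no running event loop"
    coro.close()  # suppress the "coroutine was never awaited" warning
```

If you genuinely need to schedule from another thread, `asyncio.run_coroutine_threadsafe(coro, loop)` is the standard route, provided you hold a reference to the running loop.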
Part 2: Give the Agent a Way to Reply
The agent now runs in the background with no way to send a response back. You need a callback endpoint — a route the agent can hit when it has something to say:
```python
from fastapi import Request

@app.post("/send")
async def send_message(request: Request, x_shared_secret: str = Header(None)):
    """The agent calls this to send a message to the other agent."""
    if x_shared_secret != SHARED_SECRET:
        raise HTTPException(status_code=401, detail="unauthorized")
    body = await request.json()
    msg_type = body.get("type")
    payload = body.get("payload", {})
    if not msg_type:
        raise HTTPException(status_code=400, detail="missing type")
    await send_to_other_agent(msg_type, payload)
    return {"status": "sent"}
```
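The source doesn't show `send_to_other_agent` itself. One stdlib-only way to sketch it — `PEER_URL` and the header values here are assumed config, not the original code — is to build the request in a pure helper and push the blocking `urllib` call onto a worker thread so it doesn't stall the event loop:

```python
import asyncio
import json
import urllib.request

# Assumed config values for illustration
PEER_URL = "http://peer-host:8700/message"
SHARED_SECRET = "your-shared-secret"

def build_callback_request(msg_type: str, payload: dict) -> urllib.request.Request:
    """Build the outbound POST: JSON body plus shared-secret header."""
    body = json.dumps({"type": msg_type, "payload": payload}).encode()
    return urllib.request.Request(
        PEER_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "X-Shared-Secret": SHARED_SECRET,
        },
        method="POST",
    )

async def send_to_other_agent(msg_type: str, payload: dict):
    """POST to the peer without blocking the event loop."""
    req = build_callback_request(msg_type, payload)
    loop = asyncio.get_running_loop()
    # urllib.request.urlopen is blocking, so run it on a worker thread
    await loop.run_in_executor(
        None, lambda: urllib.request.urlopen(req, timeout=10).read()
    )
```

An async HTTP client like httpx would do the same job without the executor hop; the executor version keeps the sketch dependency-free.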
Then include the curl command in the message you forward to the agent, so it knows exactly how to call back:
```python
REPLY_INSTRUCTIONS = f"""
To send your response, POST to /send:

curl -s -X POST http://your-server-ip:8700/send \\
  -H "X-Shared-Secret: your-shared-secret" \\
  -H "Content-Type: application/json" \\
  -d '{{"type": "BREAKDOWN", "payload": {{...}}}}'

Valid types: BREAKDOWN, STATUS, PR_READY.
"""
```
The agent reads this, does the work, constructs the response, and fires the curl. No polling. No long-polling. No WebSockets.
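The "include the curl command" step is plain string composition. A hypothetical sketch (the task text is invented for illustration) of stitching the instructions onto the inbound task before handing it to send_to_agent:

```python
# Abbreviated stand-in for the full REPLY_INSTRUCTIONS above
REPLY_INSTRUCTIONS = """\
To send your response, POST to /send with the X-Shared-Secret header.
Valid types: BREAKDOWN, STATUS, PR_READY.
"""

def compose_agent_message(task_text: str) -> str:
    """Append the callback instructions so the agent knows how to reply."""
    return f"{task_text}\n\n{REPLY_INSTRUCTIONS}"

msg = compose_agent_message("TASK: break the feature request into subtasks.")
print(msg.splitlines()[0])  # → TASK: break the feature request into subtasks.
```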
The Shape You’re Building
```
Sender → POST /message → "received"   (immediate)
              ↓
        [agent works]
              ↓
Agent  → POST /send    → forward to sender
```
Two endpoints, two directions, fully decoupled from HTTP timeouts.
Key Takeaway
Multi-agent HTTP communication isn’t request-response — it’s message-passing with a callback. The moment you put an LLM turn on the request path, you’ve already lost. Accept the message, fire the agent, return immediately. Give the agent a route to reply when it’s ready. That’s the shape.
FAQ
Q: Why not just use async/await to await the agent response in the handler?
await in FastAPI still blocks the HTTP response until the awaitable completes. asyncio.create_task() schedules the coroutine to run concurrently without blocking the response. You want the 200 to go back to the sender before the agent starts.
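The difference is observable with two toy handlers; the sleep stands in for the agent turn, and the timings are illustrative:

```python
import asyncio
import time

async def fake_agent_turn():
    await asyncio.sleep(0.2)  # stands in for a long LLM turn

async def awaiting_handler():
    await fake_agent_turn()  # response blocked until the turn finishes
    return {"status": "received"}

async def fire_and_forget_handler():
    asyncio.create_task(fake_agent_turn())  # scheduled, not awaited
    return {"status": "received"}

async def timed(handler):
    start = time.monotonic()
    await handler()
    elapsed = time.monotonic() - start
    await asyncio.sleep(0.25)  # let background tasks drain before the loop closes
    return elapsed

print(f"await:       {asyncio.run(timed(awaiting_handler)):.2f}s")         # ~0.20s
print(f"create_task: {asyncio.run(timed(fire_and_forget_handler)):.2f}s")  # ~0.00s
```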
Q: What if the agent fails silently in the background?
Log errors in _run_agent’s except block to a file you monitor. The sender gets a 200 regardless — if you need delivery guarantees, add a retry queue. For most agent-to-agent use cases, logging + monitoring is enough.
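There's a second failure mode worth guarding: the event loop holds only a weak reference to tasks, so a fire-and-forget task can be garbage-collected mid-flight, and an exception inside one sits unretrieved on the task object. A sketch of a hardened scheduler — the names here are mine, not from the server code — that keeps strong references and surfaces failures through a done callback:

```python
import asyncio

_background_tasks = set()  # strong refs: asyncio only holds tasks weakly
failures = []              # stand-in for the log file you monitor

def _on_task_done(task: asyncio.Task):
    """Done callback: drop our reference and surface any exception."""
    _background_tasks.discard(task)
    if not task.cancelled() and task.exception() is not None:
        failures.append(repr(task.exception()))  # real code: log(), not a list

def schedule_checked(coro) -> asyncio.Task:
    task = asyncio.create_task(coro)
    _background_tasks.add(task)
    task.add_done_callback(_on_task_done)
    return task

async def failing_agent():
    raise RuntimeError("agent crashed")

async def main():
    schedule_checked(failing_agent())
    await asyncio.sleep(0)  # let the task run...
    await asyncio.sleep(0)  # ...and the done callback fire

asyncio.run(main())
print(failures)  # → ["RuntimeError('agent crashed')"]
```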
Q: Can I use this pattern with any LLM framework, not just OpenClaw?
Yes. The pattern is framework-agnostic: subprocess exec, httpx call, or LangChain invoke — anything that takes time. Fire it with asyncio.create_task(), log failures, expose a callback endpoint.
Q: What’s the right timeout for asyncio.wait_for in _run_agent?
Set it to the longest reasonable agent turn, not the HTTP client timeout. 300 seconds (5 minutes) covers most Claude sessions. If tasks routinely exceed that, you have a different problem.