Files
nanoclaw/container
Chip Tonkin 1e7cb8b8c8 feat(providers): add codex provider via app-server JSON-RPC
Adds `codex` as a third AgentProvider alongside `claude` and `opencode`.

## Goal

Bring Codex up to the same feature bar as the Claude Agent SDK integration:
persistent sessions, streaming output, MCP tool access, and native
compaction — all driven through the AgentProvider interface without
polluting the poll-loop with per-SDK quirks.

## Why not the @openai/codex-sdk

The SDK-wrapping approach (sketched in agent-runner-details.md) was
tried first but fell short of Claude Agent SDK's feature set on
several fronts. The `codex app-server` JSON-RPC protocol closes most
of the gap:

| Capability                  | Claude SDK | Codex SDK | Codex app-server |
|-----------------------------|:----------:|:---------:|:----------------:|
| Session resume              |     ✓      |     ✓     |        ✓         |
| Streaming event output      |     ✓      |     ✓     |        ✓         |
| Mid-turn input push         |     ✓      |     ✗     | queue + drain*   |
| MCP servers                 |     ✓      |  config   |     config       |
| Approval hooks              |    hooks   |  limited  |  server reqs     |
| Native compaction trigger   |    auto    |     ✗     |        ✓         |
| Multi-turn subagents        |    Task    |     ✗     |   spawnAgent     |
| Config overrides            |     API    |     JS    |   -c flags       |

*Codex turns don't accept mid-turn input. The provider queues push()
 and drains between turns — matches the opencode provider pattern and
 the poll-loop doesn't dispatch new messages mid-turn anyway.

## Known shortcomings (both tractable in follow-ups)

**No native file/web tools.** Claude ships Read/Write/Edit/Glob/Grep/
WebFetch/WebSearch in-SDK. Codex (both SDK and app-server) leaves the
agent to shell out via bash — slower, messier output. Mitigation: a
follow-up can expose these as MCP tools on the container side, reusing
the existing `send_message` / `send_file` tool plumbing.

**Higher tool-call latency.** Every tool call round-trips through
JSON-RPC over stdio (~10-50ms per hop vs Claude's in-process dispatch).
Mitigation: batch MCP calls, pipeline commands — same playbook as
opencode.

Neither blocks feature parity for the common conversational + MCP
cases this PR targets.

## Files

- container/agent-runner/src/providers/codex-app-server.ts
    JSON-RPC transport: spawn, request/response, notification dispatch,
    auto-approval, thread lifecycle, MCP config.toml writer. Kept
    separate so it can be unit-tested without the full provider.
- container/agent-runner/src/providers/codex.ts
    CodexProvider implementing AgentProvider. Emits `activity` on every
    notification so idle timer stays honest. isSessionInvalid matches
    on stale-thread-ID errors for clean recovery.
- container/agent-runner/src/providers/codex.factory.test.ts
    Factory + isSessionInvalid + supportsNativeSlashCommands coverage.
- src/providers/codex.ts
    Host-side container contribution. Per-session ~/.codex copy with
    auth.json copied from host, so the container rewrites config.toml
    on every wake without touching the host's. Forwards OPENAI_API_KEY
    / CODEX_MODEL / OPENAI_BASE_URL.
- container/Dockerfile
    Pins CODEX_VERSION=0.121.0, adds @openai/codex to the pnpm global
    install block alongside claude-code/agent-browser/vercel.

## Validation

Smoke-tested end-to-end against real codex-cli 0.118 on macOS: observed
init → activity → progress → result events with a stable thread ID as
continuation, result text matched expected output. Unit tests: 26 pass
in agent-runner, 137 in host. Typecheck + prettier clean on all added
files.
2026-04-19 10:30:47 -04:00
..