Compare commits

...

25 Commits

Author SHA1 Message Date
gavrielc b429ab37b8 Merge pull request #2152 from glifocat/fix/opencode-process-group-and-timeout
fix(opencode): kill server process group + configurable IDLE_TIMEOUT_MS
2026-05-01 18:40:39 +03:00
gavrielc 09ddde33e1 Merge pull request #2153 from glifocat/fix/opencode-instructions-pipeline
fix(opencode): use native instructions config to load CLAUDE.md and fragments
2026-05-01 17:44:26 +03:00
gavrielc c0c46c14d6 fix(opencode): drop obsolete NANOCLAW_IS_MAIN / global CLAUDE.md branch
NANOCLAW_IS_MAIN no longer exists in the v2 codebase, and
groups/global/ is explicitly migrated away by composeGroupClaudeMd
(shared base now lives in container/CLAUDE.md → /app/CLAUDE.md, which
the instructions array already includes). Carrying /workspace/global
would give OpenCode more context than the Claude provider sees.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 17:44:01 +03:00
Ethan c993527e25 fix(opencode): load CLAUDE.md context via native instructions pipeline
Before: wrapPromptWithContext concatenated /workspace/agent/CLAUDE.md
verbatim into a <system>...</system> block in the user-message text.
That file is the host-composed entry point that contains only `@./...`
includes (the .claude-shared.md symlink to /app/CLAUDE.md, plus the
module fragments). OpenCode does not expand `@` in instruction files —
the syntax is a Claude Code convention for the model's own Read tool
to lazy-load. Result: the model received the literal lines
"@./.claude-shared.md\n@./.claude-fragments/module-...md" as text and
saw none of the actual content (workspace, memory, conversation
history, agents/core/interactive/scheduling/self-mod modules, OneCLI,
etc.). Confirmed empirically by dumping the constructed prompt for a
turn before any fix.

After: pass the concrete files via OpenCode's native `instructions`
config field. Per packages/opencode/src/session/instruction.ts on the
upstream `dev` branch, absolute paths and globs are resolved with
`fs.glob(basename, { cwd: dirname, absolute: true, include: 'file' })`,
files are read raw, and the resulting strings are concatenated into
the LLM's system prompt at packages/opencode/src/session/prompt.ts:1442.
That is the canonical channel — same one OpenCode uses for AGENTS.md /
CLAUDE.md auto-discovery.

Configured set:
  /app/CLAUDE.md                              shared base
  /workspace/agent/.claude-fragments/*.md     per-skill fragments
  /workspace/agent/CLAUDE.local.md            per-group memory
  /workspace/global/CLAUDE.md                 cross-group memory (non-main)

Removed: readClaudeMdForPrompt, the manual <system> wrap of its output,
and the now-unused `fs` import. The dynamic systemInstructions wrap
(assistant name + destinations) stays as-is — that varies per call.

Verified post-fix: Citiclaw (gemma4:31b via Ollama) responded to "what
self-modification tools do you have" with the exact MCP tool names and
a structured bullet list — vs. the previous ungrounded "I haven't
received instructions" response.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 12:54:56 +02:00
Ethan 0367dbb6f0 fix(opencode): kill server process group + configurable idle timeout
Two bugs in the upstream OpenCode provider that fire together when a
local backend (Ollama, llama.cpp) is slower than the hardcoded 90s
event timeout:

1. proc.kill('SIGKILL') only kills the wrapper process the spawn
   returned, not the opencode-linux-*/bin/opencode child it execs into.
   The child keeps holding port 4096, so the next spawnOpencodeServer()
   fails with "Failed to start server on port 4096" / EADDRINUSE.
   Fix: spawn detached and signal the whole process group via
   process.kill(-pid, 'SIGKILL') in a new killProcessTree() helper.

2. IDLE_TIMEOUT_MS = 90_000 is hardcoded. For a local 31B model the
   first prompt's time-to-first-token routinely exceeds that, tripping
   the timeout. Fix: read OPENCODE_IDLE_TIMEOUT_MS from env, default
   300_000 (5 min) — generous for cloud APIs, just enough for local.

Per-group override goes in container.json env (e.g. "600000" for a
slow local box), no rebuild needed since src/ is bind-mounted.

Same bugs exist on origin/providers — should be ported upstream.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 12:54:51 +02:00
gavrielc c7a7a709ed Merge main into providers
Bring providers up to date with main, including the channel-inbound
attachment path-traversal fix.

Resolved conflicts:
- .claude/skills/add-codex/SKILL.md: took main (newer CODEX_VERSION,
  more accurate schema docs)
- container/Dockerfile: kept main's stratified pnpm-install layers for
  caching, added a separate layer for @openai/codex, bumped Dockerfile
  CODEX_VERSION default to 0.124.0 to match SKILL.md
2026-04-28 13:53:47 +03:00
gavrielc 7c8d220115 Merge pull request #1966 from IamAdamJowett/fix/codex-resolve-imports-and-local-md
fix(codex-provider): resolve @-imports and load CLAUDE.local.md
2026-04-24 15:32:14 +03:00
Adam 2d2c3204bc fix(codex-provider): don't double-append global CLAUDE.md when @-imports resolve it
readAgentAndGlobalClaudeMd was appending /workspace/global/CLAUDE.md
explicitly at the end, but since @-import resolution was added the
group's own CLAUDE.md already pulls global in via its default
`@./.claude-global.md` import (see src/group-init.ts). Result:
non-main groups got global instructions inlined twice, wasting
context tokens and occasionally producing contradictory repeated
guidance.

Drop the explicit global append. Groups that want global content
import it via the @-directive in their CLAUDE.md (the default) —
groups that intentionally don't import it now correctly skip it,
matching Claude-backed agent behaviour. The NANOCLAW_IS_MAIN
env branch also becomes unnecessary and is removed.

Addresses Codex Review P2 on qwibitai/nanoclaw#1966.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:45:16 +10:00
Adam 4836cb59df fix(codex-provider): resolve @-imports and load CLAUDE.local.md
The Codex app-server doesn't expand Claude Code's @-import syntax and
doesn't auto-load CLAUDE.local.md from the working directory. Left
alone, Codex-backed agents saw only the raw import directives as
literal text — no composed CLAUDE.md, no module fragments, no
per-group memory. The Claude provider worked because Claude Code
resolves both natively; anything non-Claude was running half-blind.

Adds a pure resolveClaudeImports(content, baseDir) helper that inlines
`^@<path>$` directives relative to the file they live in, recursing
into imported files and breaking cycles. readAgentAndGlobalClaudeMd
now uses it on the group CLAUDE.md, then appends CLAUDE.local.md
(same resolution base), then the global CLAUDE.md.

Missing or cyclic imports are dropped silently (empty text) rather
than left as raw `@path` lines, which would confuse the model.

Fixes the symptom where a Codex-backed agent with a well-written
CLAUDE.local.md (e.g. image-delivery discipline, wiki instructions)
behaved as if none of it existed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:33:43 +10:00
gavrielc c20213a133 docs(add-codex): fix Dockerfile install step — separate RUN block, not combined list
Keep this in sync with main. The skill's prior instruction (append to a
combined `pnpm install -g` block) no longer matches main's Dockerfile, which
splits each global CLI into its own RUN layer for cache granularity. Update
to add a standalone RUN block for Codex that matches the existing pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 21:38:40 +03:00
gavrielc d9ed98fd65 style: apply prettier to merged files 2026-04-22 11:53:55 +03:00
gavrielc 5ae1c33fff Merge v2 into providers
Picks up 105 commits from v2 (engage modes, sender/channel approval flows,
host-sweep heartbeat lifecycle, setup/onecli refactor, setup-flow docs).

Retires 9 deprecated skills that moved out of this branch's scope:
add-compact, add-gmail, add-image-vision, add-pdf-reader, add-reactions,
add-telegram-swarm, add-voice-transcription, channel-formatting,
use-local-whisper.

Preserves providers-branch code: codex provider (6 files), opencode
provider + MCP bridge, plus the CODEX_VERSION Dockerfile ARG.

Conflict: container/Dockerfile — combined v2's CLAUDE_CODE_VERSION bump
(2.1.112 → 2.1.116) with the branch-local CODEX_VERSION ARG.
2026-04-22 11:52:37 +03:00
gavrielc af542adad5 Merge pull request #1843 from chiptoe-svg/feat/codex-provider
feat(providers): add codex provider via app-server JSON-RPC
2026-04-20 23:45:04 +03:00
Chip Tonkin 894b154e41 fix(codex): remove client-driven compaction threshold
The 40 000-input-token threshold and its supporting plumbing
(cumulativeInputTokens counter, thread/tokenUsage/updated handler,
setInputTokens callback on runOneTurn) were SDK-era residue. In the
SDK implementation we ran custom summarize-and-restart around 40k
because the SDK had no native compaction; that reasoning stopped
applying once we switched to app-server's native thread/compact/start.

With that migration the threshold wasn't removed, only re-plumbed —
calling thread/compact/start from the client every time we crossed
an arbitrary 40k. On 128k-context models that over-compacts by
~3× relative to the actual context window, adding latency for no
durability gain.

Trust app-server to handle its own compaction (same posture as the
Claude provider trusts Claude Code's auto_compact). If empirical
behavior shows turns failing at the context wall, we'll revisit with
a server-notification-driven trigger rather than a client-side
threshold.

Also removes now-unused imports (sendCodexRequest, local log helper).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 17:50:24 -04:00
Chip Tonkin 831ef88f16 fix(codex): harden resume fallback and MCP config.toml escaping
Surfaced by adversarial review of the PR diff.

1. thread/resume fallback (startOrResumeCodexThread). Previously fell back
   to thread/start on ANY resume error, silently discarding session state
   for transient / auth / version-mismatch failures that should surface.
   Now gated on STALE_THREAD_RE — the same pattern CodexProvider
   .isSessionInvalid uses on the caller side — and re-throws everything
   else so the poll-loop can decide what to do.

2. MCP config.toml (writeCodexMcpConfigToml). Raw string interpolation for
   command / args / env values would produce invalid TOML on unescaped
   quotes or backslashes, and silently reinterpret the value if a newline
   snuck in. Now routed through a new tomlBasicString helper that escapes
   `"` and `\` per TOML basic-string rules, and rejects newlines with a
   clear error rather than mask misconfiguration.

STALE_THREAD_RE moves to codex-app-server.ts (exported) since both files
need it; codex.ts imports to keep the two detection paths in sync.

Tests: tomlBasicString (quotes, backslashes, escape-order, newline
rejection) and STALE_THREAD_RE (stale vs transient messages).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 13:02:25 -04:00
Chip Tonkin d05923f274 feat(skills): /add-codex — install Codex as the agent provider
Ports the Codex provider files from the providers branch into trunk,
wires host and container barrels, updates the Dockerfile to install
the Codex CLI, and documents the three auth paths (ChatGPT
subscription, OPENAI_API_KEY, BYO OpenAI-compatible endpoint).

Mirrors the /add-opencode skill shape so the two non-Claude provider
install paths look familiar. No agent-runner package dependency —
Codex is a CLI binary, not a library.

Requested by the maintainer on the PR thread — the skill lands
alongside feat(providers): add codex provider via app-server JSON-RPC
so users have a guided install path when trunk eventually picks up
the provider.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 11:31:08 -04:00
Chip Tonkin aba618215d fix(codex): load /workspace/global/CLAUDE.md for non-main groups
Codex's app-server doesn't expand Claude Code's `@-import` syntax, so
the previous loader only passed group CLAUDE.md — global content never
reached the model. This mirrors the OpenCode provider's
readClaudeMdForPrompt: load group CLAUDE.md, then (unless NANOCLAW_IS_MAIN=1)
append /workspace/global/CLAUDE.md.

The literal `@./.claude-global.md` line at the top of group CLAUDE.md is
left in place. OpenCode does the same — keeps the two non-Claude
providers behaviorally consistent, and upstream may revise the group
template later.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 11:12:08 -04:00
Chip Tonkin 138c277fae fix(codex): remove no-op model_provider_base_url override
`-c model_provider_base_url=...` is not a valid Codex config key. Codex
uses nested paths (e.g. `model_providers.<name>.base_url`) per the
`-c` help text, so the override was creating a top-level TOML entry
that Codex never reads.

The override was also redundant: Codex's built-in `openai` provider
honors the `OPENAI_BASE_URL` env var natively (standard OpenAI SDK
convention), verified against codex-cli 0.118 — a request with
`OPENAI_BASE_URL=http://127.0.0.1:9/v1` tries to reach exactly that URL.

The host provider already forwards `OPENAI_BASE_URL` into the container
(src/providers/codex.ts), so BYO-endpoint users keep their config path
without this broken `-c` flag in the way.

Net: -5 LOC, one less silent-fail branch, no user-visible change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 11:10:07 -04:00
Chip Tonkin 1e7cb8b8c8 feat(providers): add codex provider via app-server JSON-RPC
Adds `codex` as a third AgentProvider alongside `claude` and `opencode`.

## Goal

Bring Codex up to the same feature bar as the Claude Agent SDK integration:
persistent sessions, streaming output, MCP tool access, and native
compaction — all driven through the AgentProvider interface without
polluting the poll-loop with per-SDK quirks.

## Why not the @openai/codex-sdk

The SDK-wrapping approach (sketched in agent-runner-details.md) was
tried first but fell short of Claude Agent SDK's feature set on
several fronts. The `codex app-server` JSON-RPC protocol closes most
of the gap:

| Capability                  | Claude SDK | Codex SDK | Codex app-server |
|-----------------------------|:----------:|:---------:|:----------------:|
| Session resume              |     ✓      |     ✓     |        ✓         |
| Streaming event output      |     ✓      |     ✓     |        ✓         |
| Mid-turn input push         |     ✓      |     ✗     | queue + drain*   |
| MCP servers                 |     ✓      |  config   |     config       |
| Approval hooks              |    hooks   |  limited  |  server reqs     |
| Native compaction trigger   |    auto    |     ✗     |        ✓         |
| Multi-turn subagents        |    Task    |     ✗     |   spawnAgent     |
| Config overrides            |     API    |     JS    |   -c flags       |

*Codex turns don't accept mid-turn input. The provider queues push()
 and drains between turns — matches the opencode provider pattern and
 the poll-loop doesn't dispatch new messages mid-turn anyway.

## Known shortcomings (both tractable in follow-ups)

**No native file/web tools.** Claude ships Read/Write/Edit/Glob/Grep/
WebFetch/WebSearch in-SDK. Codex (both SDK and app-server) leaves the
agent to shell out via bash — slower, messier output. Mitigation: a
follow-up can expose these as MCP tools on the container side, reusing
the existing `send_message` / `send_file` tool plumbing.

**Higher tool-call latency.** Every tool call round-trips through
JSON-RPC over stdio (~10-50ms per hop vs Claude's in-process dispatch).
Mitigation: batch MCP calls, pipeline commands — same playbook as
opencode.

Neither blocks feature parity for the common conversational + MCP
cases this PR targets.

## Files

- container/agent-runner/src/providers/codex-app-server.ts
    JSON-RPC transport: spawn, request/response, notification dispatch,
    auto-approval, thread lifecycle, MCP config.toml writer. Kept
    separate so it can be unit-tested without the full provider.
- container/agent-runner/src/providers/codex.ts
    CodexProvider implementing AgentProvider. Emits `activity` on every
    notification so idle timer stays honest. isSessionInvalid matches
    on stale-thread-ID errors for clean recovery.
- container/agent-runner/src/providers/codex.factory.test.ts
    Factory + isSessionInvalid + supportsNativeSlashCommands coverage.
- src/providers/codex.ts
    Host-side container contribution. Per-session ~/.codex copy with
    auth.json copied from host, so the container rewrites config.toml
    on every wake without touching the host's. Forwards OPENAI_API_KEY
    / CODEX_MODEL / OPENAI_BASE_URL.
- container/Dockerfile
    Pins CODEX_VERSION=0.121.0, adds @openai/codex to the pnpm global
    install block alongside claude-code/agent-browser/vercel.

## Validation

Smoke-tested end-to-end against real codex-cli 0.118 on macOS: observed
init → activity → progress → result events with a stable thread ID as
continuation, result text matched expected output. Unit tests: 26 pass
in agent-runner, 137 in host. Typecheck + prettier clean on all added
files.
2026-04-19 10:30:47 -04:00
gavrielc eb0055a0b0 Merge remote-tracking branch 'origin/v2' into providers-sync 2026-04-18 22:07:32 +03:00
gavrielc ef6ea87628 Merge branch 'v2' into providers 2026-04-18 19:15:58 +03:00
gavrielc b29df213ad Merge remote-tracking branch 'origin/v2' into providers 2026-04-18 15:59:25 +03:00
gavrielc dd53875574 fix(providers): restore opencode files that v2 merge deleted
The v2 merge brought in commit e0258e8 ('move opencode provider off
v2 trunk') which correctly removed opencode from v2 but also applied
that deletion on this branch, where opencode must live. Restored:

- container/agent-runner/src/providers/opencode.ts
- container/agent-runner/src/providers/mcp-to-opencode.ts
- container/agent-runner/src/providers/mcp-to-opencode.test.ts
- src/providers/opencode.ts
- @opencode-ai/sdk dep in container/agent-runner/package.json
- opencode self-registration imports in both providers barrels

All 21 container tests now pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 14:57:38 +03:00
gavrielc a1ce73c376 Merge branch 'v2' into providers 2026-04-18 14:54:54 +03:00
gavrielc 48e4172899 refactor(providers-branch): split opencode test into its own file
Moves the opencode test case out of factory.test.ts into
opencode.factory.test.ts so the /add-opencode skill can install the
opencode bundle as a pure file copy — no in-place merging of the shared
factory.test.ts required. Keeps factory.test.ts identical to v2 trunk.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 14:13:04 +03:00
16 changed files with 1499 additions and 0 deletions
+4
View File
@@ -20,6 +20,7 @@ ARG INSTALL_CJK_FONTS=false
# mean every rebuild silently picks up the latest and can break in lockstep
# across all users.
ARG CLAUDE_CODE_VERSION=2.1.116
ARG CODEX_VERSION=0.124.0
ARG AGENT_BROWSER_VERSION=latest
ARG VERCEL_VERSION=latest
ARG BUN_VERSION=1.3.12
@@ -101,6 +102,9 @@ RUN --mount=type=cache,target=/root/.cache/pnpm \
RUN --mount=type=cache,target=/root/.cache/pnpm \
pnpm install -g "agent-browser@${AGENT_BROWSER_VERSION}"
RUN --mount=type=cache,target=/root/.cache/pnpm \
pnpm install -g "@openai/codex@${CODEX_VERSION}"
RUN --mount=type=cache,target=/root/.cache/pnpm \
pnpm install -g "@anthropic-ai/claude-code@${CLAUDE_CODE_VERSION}"
+3
View File
@@ -7,6 +7,7 @@
"dependencies": {
"@anthropic-ai/claude-agent-sdk": "^0.2.116",
"@modelcontextprotocol/sdk": "^1.12.1",
"@opencode-ai/sdk": "^1.4.3",
"cron-parser": "^5.0.0",
"zod": "^4.0.0",
},
@@ -44,6 +45,8 @@
"@modelcontextprotocol/sdk": ["@modelcontextprotocol/sdk@1.29.0", "", { "dependencies": { "@hono/node-server": "^1.19.9", "ajv": "^8.17.1", "ajv-formats": "^3.0.1", "content-type": "^1.0.5", "cors": "^2.8.5", "cross-spawn": "^7.0.5", "eventsource": "^3.0.2", "eventsource-parser": "^3.0.0", "express": "^5.2.1", "express-rate-limit": "^8.2.1", "hono": "^4.11.4", "jose": "^6.1.3", "json-schema-typed": "^8.0.2", "pkce-challenge": "^5.0.0", "raw-body": "^3.0.0", "zod": "^3.25 || ^4.0", "zod-to-json-schema": "^3.25.1" }, "peerDependencies": { "@cfworker/json-schema": "^4.1.1" }, "optionalPeers": ["@cfworker/json-schema"] }, "sha512-zo37mZA9hJWpULgkRpowewez1y6ML5GsXJPY8FI0tBBCd77HEvza4jDqRKOXgHNn867PVGCyTdzqpz0izu5ZjQ=="],
"@opencode-ai/sdk": ["@opencode-ai/sdk@1.4.11", "", { "dependencies": { "cross-spawn": "7.0.6" } }, "sha512-EJxSfc7D/dda/vrw8zQe4g7yVTxERktvb5SvIBlGBnKYQJGOgo9RyA/1EL3l208rHeo6jm1sdrAF0E6o/k94ug=="],
"@types/bun": ["@types/bun@1.3.12", "", { "dependencies": { "bun-types": "1.3.12" } }, "sha512-DBv81elK+/VSwXHDlnH3Qduw+KxkTIWi7TXkAeh24zpi5l0B2kUg9Ga3tb4nJaPcOFswflgi/yAvMVBPrxMB+A=="],
"@types/node": ["@types/node@22.19.17", "", { "dependencies": { "undici-types": "~6.21.0" } }, "sha512-wGdMcf+vPYM6jikpS/qhg6WiqSV/OhG+jeeHT/KlVqxYfD40iYJf9/AE1uQxVWFvU7MipKRkRv8NSHiCGgPr8Q=="],
+1
View File
@@ -11,6 +11,7 @@
"dependencies": {
"@anthropic-ai/claude-agent-sdk": "^0.2.116",
"@modelcontextprotocol/sdk": "^1.12.1",
"@opencode-ai/sdk": "^1.4.3",
"cron-parser": "^5.0.0",
"zod": "^4.0.0"
},
@@ -0,0 +1,47 @@
import { describe, it, expect } from 'bun:test';
import { STALE_THREAD_RE, tomlBasicString } from './codex-app-server.js';
describe('tomlBasicString', () => {
it('leaves safe strings unchanged inside quotes', () => {
expect(tomlBasicString('hello')).toBe('"hello"');
expect(tomlBasicString('bun')).toBe('"bun"');
expect(tomlBasicString('/usr/local/bin/node')).toBe('"/usr/local/bin/node"');
});
it('escapes double-quotes', () => {
expect(tomlBasicString('a"b')).toBe('"a\\"b"');
expect(tomlBasicString('"quoted"')).toBe('"\\"quoted\\""');
});
it('escapes backslashes', () => {
expect(tomlBasicString('a\\b')).toBe('"a\\\\b"');
expect(tomlBasicString('C:\\path\\to\\bin')).toBe('"C:\\\\path\\\\to\\\\bin"');
});
it('escapes backslash before quote (order matters)', () => {
expect(tomlBasicString('\\"')).toBe('"\\\\\\""');
});
it('rejects strings containing newlines', () => {
expect(() => tomlBasicString('line1\nline2')).toThrow(/newline/);
expect(() => tomlBasicString('trailing\n')).toThrow(/newline/);
expect(() => tomlBasicString('crlf\r\nhere')).toThrow(/newline/);
});
});
describe('STALE_THREAD_RE', () => {
it('matches stale-thread error messages', () => {
expect(STALE_THREAD_RE.test('thread not found')).toBe(true);
expect(STALE_THREAD_RE.test('unknown thread xyz')).toBe(true);
expect(STALE_THREAD_RE.test('No such thread: abc')).toBe(true);
expect(STALE_THREAD_RE.test('invalid thread_id')).toBe(true);
});
it('does not match transient or unrelated errors', () => {
expect(STALE_THREAD_RE.test('rate limit exceeded')).toBe(false);
expect(STALE_THREAD_RE.test('authentication failed')).toBe(false);
expect(STALE_THREAD_RE.test('connection reset by peer')).toBe(false);
expect(STALE_THREAD_RE.test('internal server error')).toBe(false);
});
});
@@ -0,0 +1,393 @@
/**
* Codex app-server JSON-RPC transport primitives.
*
* Communicates with `codex app-server` over stdio. This module is just the
* plumbing — spawn the process, send requests, dispatch responses and
* notifications. Higher-level semantics (threads, turns, event translation)
* live in codex.ts.
*
* Kept separate so the transport can be unit-tested without pulling in the
* full provider and so any future Codex tooling (e.g. a CLI for manual
* debugging) can reuse the same primitives.
*/
import fs from 'fs';
import path from 'path';
import { spawn, type ChildProcess } from 'child_process';
import { createInterface, type Interface as ReadlineInterface } from 'readline';
function log(msg: string): void {
console.error(`[codex-app-server] ${msg}`);
}
const INIT_TIMEOUT_MS = 30_000;
/**
* Errors from `thread/resume` that indicate the thread ID is unusable —
* typically because the app-server has no memory of it (thread transcript
* was deleted, server was wiped, ID is from a different codex version).
* Only errors matching this pattern trigger silent fallback to a fresh
* thread; everything else bubbles up so the caller can decide what to do.
*
* Shared with `codex.ts`'s `isSessionInvalid` to keep the two detection
* paths in sync.
*/
export const STALE_THREAD_RE = /thread\s+not\s+found|unknown\s+thread|thread[_\s]id|no such thread/i;
/**
* Escape a string for emission inside a TOML basic string (double-quoted).
* Handles `"` and `\`. Rejects newlines: basic strings can't contain raw
* newlines, and silently converting them to `\n` would mask misconfiguration
* (e.g. a secret pasted with a trailing newline). Multiline strings are
* unsupported for `config.toml` use here.
*/
export function tomlBasicString(value: string): string {
if (value.includes('\n') || value.includes('\r')) {
throw new Error(
`MCP config value contains newline (not supported in config.toml): ${JSON.stringify(value.slice(0, 40))}${value.length > 40 ? '…' : ''}`,
);
}
return `"${value.replace(/\\/g, '\\\\').replace(/"/g, '\\"')}"`;
}
// ── JSON-RPC types ──────────────────────────────────────────────────────────
let nextRequestId = 1;
interface JsonRpcRequest {
id: number;
method: string;
params: Record<string, unknown>;
}
export interface JsonRpcResponse {
id: number;
result?: unknown;
error?: { code: number; message: string; data?: unknown };
}
export interface JsonRpcNotification {
method: string;
params: Record<string, unknown>;
}
export interface JsonRpcServerRequest {
id: number;
method: string;
params: Record<string, unknown>;
}
type JsonRpcMessage = JsonRpcResponse | JsonRpcNotification | JsonRpcServerRequest;
function makeRequest(method: string, params: Record<string, unknown>): JsonRpcRequest {
return { id: nextRequestId++, method, params };
}
function isResponse(msg: JsonRpcMessage): msg is JsonRpcResponse {
return 'id' in msg && ('result' in msg || 'error' in msg) && !('method' in msg);
}
function isServerRequest(msg: JsonRpcMessage): msg is JsonRpcServerRequest {
return 'id' in msg && 'method' in msg;
}
// ── App-server handle ───────────────────────────────────────────────────────
export interface AppServer {
process: ChildProcess;
readline: ReadlineInterface;
pending: Map<number, { resolve: (r: JsonRpcResponse) => void; reject: (e: Error) => void }>;
notificationHandlers: ((n: JsonRpcNotification) => void)[];
serverRequestHandlers: ((r: JsonRpcServerRequest) => void)[];
}
export function spawnCodexAppServer(configOverrides: string[] = []): AppServer {
const args = ['app-server', '--listen', 'stdio://'];
for (const override of configOverrides) args.push('-c', override);
log(`Spawning: codex ${args.join(' ')}`);
const proc = spawn('codex', args, {
stdio: ['pipe', 'pipe', 'pipe'],
env: { ...process.env },
});
const rl = createInterface({ input: proc.stdout! });
const server: AppServer = {
process: proc,
readline: rl,
pending: new Map(),
notificationHandlers: [],
serverRequestHandlers: [],
};
proc.stderr?.on('data', (chunk: Buffer) => {
const text = chunk.toString().trim();
if (text) log(`[stderr] ${text}`);
});
rl.on('line', (line: string) => {
if (!line.trim()) return;
let msg: JsonRpcMessage;
try {
msg = JSON.parse(line);
} catch {
log(`[parse-error] ${line.slice(0, 200)}`);
return;
}
if (isResponse(msg)) {
const handler = server.pending.get(msg.id);
if (handler) {
server.pending.delete(msg.id);
handler.resolve(msg);
}
} else if (isServerRequest(msg)) {
for (const h of server.serverRequestHandlers) h(msg);
} else if ('method' in msg) {
for (const h of server.notificationHandlers) h(msg as JsonRpcNotification);
}
});
proc.on('error', (err) => {
log(`[process-error] ${err.message}`);
for (const [, handler] of server.pending) handler.reject(err);
server.pending.clear();
});
proc.on('exit', (code, signal) => {
log(`[exit] code=${code} signal=${signal}`);
const err = new Error(`Codex app-server exited: code=${code} signal=${signal}`);
for (const [, handler] of server.pending) handler.reject(err);
server.pending.clear();
});
return server;
}
export function sendCodexRequest(
server: AppServer,
method: string,
params: Record<string, unknown>,
timeoutMs = 60_000,
): Promise<JsonRpcResponse> {
const req = makeRequest(method, params);
const line = JSON.stringify(req) + '\n';
return new Promise<JsonRpcResponse>((resolve, reject) => {
const timer = setTimeout(() => {
server.pending.delete(req.id);
reject(new Error(`Timeout waiting for ${method} response (${timeoutMs}ms)`));
}, timeoutMs);
server.pending.set(req.id, {
resolve: (r) => {
clearTimeout(timer);
resolve(r);
},
reject: (e) => {
clearTimeout(timer);
reject(e);
},
});
try {
server.process.stdin!.write(line);
} catch (err) {
clearTimeout(timer);
server.pending.delete(req.id);
reject(err instanceof Error ? err : new Error(String(err)));
}
});
}
export function sendCodexResponse(server: AppServer, id: number, result: unknown): void {
const line = JSON.stringify({ id, result }) + '\n';
try {
server.process.stdin!.write(line);
} catch (err) {
log(`[send-error] Failed to send response for id=${id}: ${err instanceof Error ? err.message : String(err)}`);
}
}
export function killCodexAppServer(server: AppServer): void {
try {
server.readline.close();
server.process.kill('SIGTERM');
} catch {
/* ignore */
}
}
// ── Auto-approval ───────────────────────────────────────────────────────────
// The container sandbox is already the security boundary; inside it, Codex's
// own approval prompts would just block every tool call on a user that isn't
// watching. Accept everything and let sandbox limits do the enforcement.
export function attachCodexAutoApproval(server: AppServer): void {
server.serverRequestHandlers.push((req) => {
const method = req.method;
log(`[approval] ${method}`);
switch (method) {
case 'item/commandExecution/requestApproval':
case 'item/fileChange/requestApproval':
sendCodexResponse(server, req.id, { decision: 'accept' });
break;
case 'item/permissions/requestApproval':
sendCodexResponse(server, req.id, {
permissions: { fileSystem: { read: ['/'], write: ['/'] }, network: { enabled: true } },
scope: 'session',
});
break;
case 'applyPatchApproval':
case 'execCommandApproval':
sendCodexResponse(server, req.id, { decision: 'approved' });
break;
case 'item/tool/call': {
const toolName = (req.params as { tool?: string }).tool || 'unknown';
log(`[approval] Unexpected dynamic tool call: ${toolName}`);
sendCodexResponse(server, req.id, {
success: false,
contentItems: [{ type: 'inputText', text: `Tool "${toolName}" is not available. Use MCP tools instead.` }],
});
break;
}
case 'item/tool/requestUserInput':
case 'mcpServer/elicitation/request':
sendCodexResponse(server, req.id, { input: null });
break;
default:
log(`[approval] Unknown method ${method}, generic accept`);
sendCodexResponse(server, req.id, { decision: 'accept' });
break;
}
});
}
// ── High-level helpers ──────────────────────────────────────────────────────
export async function initializeCodexAppServer(server: AppServer): Promise<void> {
log('Sending initialize…');
const resp = await sendCodexRequest(
server,
'initialize',
{
clientInfo: { name: 'nanoclaw', version: '1.0.0' },
capabilities: { experimentalApi: false },
},
INIT_TIMEOUT_MS,
);
if (resp.error) throw new Error(`Initialize failed: ${resp.error.message}`);
log('Initialize successful');
}
export interface ThreadParams {
model: string;
cwd: string;
sandbox?: string;
approvalPolicy?: string;
personality?: string;
baseInstructions?: string;
}
/**
* Start or resume a Codex thread. If `threadId` is provided, attempts
* `thread/resume` first and falls back to a fresh `thread/start` on failure
* (stale thread IDs commonly outlive containers). Returns the active thread
* ID either way.
*/
export async function startOrResumeCodexThread(
server: AppServer,
threadId: string | undefined,
params: ThreadParams,
): Promise<string> {
if (threadId) {
log(`Resuming thread: ${threadId}`);
const resp = await sendCodexRequest(server, 'thread/resume', {
threadId,
...(params as unknown as Record<string, unknown>),
});
if (!resp.error) {
log(`Thread resumed: ${threadId}`);
return threadId;
}
// Only fall through to fresh-thread on recognized stale-thread errors.
// Auth, version, or transient failures would otherwise silently discard
// session state — fail loud instead so the caller can retry or surface.
if (!STALE_THREAD_RE.test(resp.error.message)) {
throw new Error(`thread/resume failed: ${resp.error.message}`);
}
log(`Stale thread ${threadId}; starting fresh thread.`);
}
log('Starting new thread…');
const resp = await sendCodexRequest(server, 'thread/start', {
...(params as unknown as Record<string, unknown>),
});
if (resp.error) throw new Error(`thread/start failed: ${resp.error.message}`);
const result = resp.result as { thread?: { id?: string } } | undefined;
const newThreadId = result?.thread?.id;
if (!newThreadId) throw new Error('thread/start response missing thread ID');
log(`New thread: ${newThreadId}`);
return newThreadId;
}
export interface TurnParams {
threadId: string;
inputText: string;
model?: string;
cwd?: string;
}
export async function startCodexTurn(server: AppServer, params: TurnParams): Promise<void> {
const resp = await sendCodexRequest(server, 'turn/start', {
threadId: params.threadId,
input: [{ type: 'text', text: params.inputText }],
model: params.model,
cwd: params.cwd,
});
if (resp.error) throw new Error(`turn/start failed: ${resp.error.message}`);
}
// ── MCP config.toml ─────────────────────────────────────────────────────────
// Codex discovers MCP servers by reading ~/.codex/config.toml at startup.
// We rewrite it on every spawn from whatever mcpServers the agent-runner
// passes in, so the container's config reflects the current host wiring.
export interface CodexMcpServer {
command: string;
args?: string[];
env?: Record<string, string>;
}
export function writeCodexMcpConfigToml(servers: Record<string, CodexMcpServer>): void {
const codexConfigDir = path.join(process.env.HOME || '/home/node', '.codex');
fs.mkdirSync(codexConfigDir, { recursive: true });
const configTomlPath = path.join(codexConfigDir, 'config.toml');
const lines: string[] = [];
for (const [name, config] of Object.entries(servers)) {
lines.push(`[mcp_servers.${name}]`);
lines.push('type = "stdio"');
lines.push(`command = ${tomlBasicString(config.command)}`);
if (config.args && config.args.length > 0) {
const argsStr = config.args.map(tomlBasicString).join(', ');
lines.push(`args = [${argsStr}]`);
}
if (config.env && Object.keys(config.env).length > 0) {
lines.push(`[mcp_servers.${name}.env]`);
for (const [key, value] of Object.entries(config.env)) {
lines.push(`${key} = ${tomlBasicString(value)}`);
}
}
lines.push('');
}
fs.writeFileSync(configTomlPath, lines.join('\n'));
log(`Wrote MCP config.toml (${Object.keys(servers).length} server(s))`);
}
export function createCodexConfigOverrides(): string[] {
return ['features.use_linux_sandbox_bwrap=false'];
}
@@ -0,0 +1,80 @@
import fs from 'fs';
import os from 'os';
import path from 'path';
import { describe, it, expect } from 'bun:test';
import { createProvider } from './factory.js';
import { CodexProvider, resolveClaudeImports } from './codex.js';
describe('createProvider (codex)', () => {
it('returns CodexProvider for codex', () => {
expect(createProvider('codex')).toBeInstanceOf(CodexProvider);
});
it('flags stale thread errors as session-invalid', () => {
const p = new CodexProvider();
expect(p.isSessionInvalid(new Error('thread not found'))).toBe(true);
expect(p.isSessionInvalid(new Error('unknown thread 123'))).toBe(true);
expect(p.isSessionInvalid(new Error('No such thread: abc'))).toBe(true);
});
it('does not flag unrelated errors as session-invalid', () => {
const p = new CodexProvider();
expect(p.isSessionInvalid(new Error('rate limit exceeded'))).toBe(false);
expect(p.isSessionInvalid(new Error('connection reset'))).toBe(false);
expect(p.isSessionInvalid(new Error('codex app-server exited: code=1'))).toBe(false);
});
it('declares no native slash command support', () => {
const p = new CodexProvider();
expect(p.supportsNativeSlashCommands).toBe(false);
});
});
describe('resolveClaudeImports', () => {
function scratchDir(): string {
return fs.mkdtempSync(path.join(os.tmpdir(), 'codex-imports-'));
}
it('inlines a single relative import', () => {
const dir = scratchDir();
fs.writeFileSync(path.join(dir, 'fragment.md'), 'FRAGMENT CONTENT');
const resolved = resolveClaudeImports('before\n@./fragment.md\nafter', dir);
expect(resolved).toContain('FRAGMENT CONTENT');
expect(resolved).not.toContain('@./fragment.md');
expect(resolved).toMatch(/before[\s\S]*FRAGMENT CONTENT[\s\S]*after/);
});
it('expands nested imports relative to the parent file', () => {
const dir = scratchDir();
fs.mkdirSync(path.join(dir, 'sub'));
fs.writeFileSync(path.join(dir, 'sub', 'inner.md'), 'INNER');
fs.writeFileSync(path.join(dir, 'sub', 'outer.md'), '@./inner.md');
const resolved = resolveClaudeImports('@./sub/outer.md', dir);
expect(resolved).toBe('INNER');
});
it('drops missing imports to empty text rather than leaving raw @path', () => {
const dir = scratchDir();
const resolved = resolveClaudeImports('before\n@./does-not-exist.md\nafter', dir);
expect(resolved).not.toContain('@./does-not-exist.md');
expect(resolved).toContain('before');
expect(resolved).toContain('after');
});
it('breaks cycles', () => {
const dir = scratchDir();
fs.writeFileSync(path.join(dir, 'a.md'), '@./b.md');
fs.writeFileSync(path.join(dir, 'b.md'), '@./a.md');
// Just needs to terminate without a stack overflow.
const resolved = resolveClaudeImports('@./a.md', dir);
expect(typeof resolved).toBe('string');
});
it('leaves non-import @ mentions alone (only line-anchored @<path> is imported)', () => {
const dir = scratchDir();
const resolved = resolveClaudeImports('email @someone for details', dir);
expect(resolved).toBe('email @someone for details');
});
});
@@ -0,0 +1,332 @@
/**
* OpenAI Codex provider — wraps `codex app-server` via JSON-RPC.
*
* Unlike the (deprecated) @openai/codex-sdk approach, the app-server
* protocol exposes proper session/stream semantics, native compaction, and
* stable MCP config via ~/.codex/config.toml — which is the same mechanism
* the standalone codex CLI uses, so the container and host share one
* provider-integration story.
*
* Codex turns don't accept mid-turn input. Follow-up `push()` messages are
* queued and drained after the current turn completes (same pattern as the
* opencode provider — see poll-loop for why that's correct: the poll-loop
* only pushes once it has new pending messages, and we only drain between
* turns, so no message is dropped).
*/
import fs from 'fs';
import path from 'path';
import { registerProvider } from './provider-registry.js';
import type { AgentProvider, AgentQuery, ProviderEvent, ProviderOptions, QueryInput } from './types.js';
import {
type AppServer,
type JsonRpcNotification,
STALE_THREAD_RE,
attachCodexAutoApproval,
createCodexConfigOverrides,
initializeCodexAppServer,
killCodexAppServer,
spawnCodexAppServer,
startCodexTurn,
startOrResumeCodexThread,
writeCodexMcpConfigToml,
} from './codex-app-server.js';
/** Hard ceiling for a single turn. Guards against app-server wedging. */
const TURN_TIMEOUT_MS = 5 * 60 * 1000;
// ── System-prompt assembly ──────────────────────────────────────────────────
// Codex's app-server doesn't expand Claude Code's `@-import` syntax in
// CLAUDE.md, and doesn't auto-load CLAUDE.local.md from the working dir the
// way Claude Code does. Left alone, the agent sees only the raw import
// directives as literal text and none of the composed content — no shared
// CLAUDE.md, no module fragments, no per-group memory. We resolve both here
// so Codex (and any other non-Claude provider) gets the same effective
// system prompt the Claude provider gets natively.
/**
* Inline `@<path>` import directives (line-anchored) with the contents of
* the referenced file, resolved relative to `baseDir`. Recurses so imports
* within imported files expand too. Cycles and missing files are silently
* dropped (replaced with empty text) rather than left as raw `@path` lines,
* which would confuse the model.
*/
export function resolveClaudeImports(content: string, baseDir: string, seen: Set<string> = new Set()): string {
return content.replace(/^@(\S+)\s*$/gm, (_match, importPath: string) => {
try {
const resolved = path.resolve(baseDir, importPath);
if (seen.has(resolved)) return '';
if (!fs.existsSync(resolved)) return '';
const nextSeen = new Set(seen);
nextSeen.add(resolved);
const imported = fs.readFileSync(resolved, 'utf-8');
return resolveClaudeImports(imported, path.dirname(resolved), nextSeen);
} catch {
return '';
}
});
}
function readAgentAndGlobalClaudeMd(): string | undefined {
// Per-group CLAUDE.md is responsible for pulling in the global instructions
// if the group wants them (the default scaffold starts with
// `@./.claude-global.md` which resolveClaudeImports inlines). Appending
// `/workspace/global/CLAUDE.md` explicitly here would double-inline the
// global content for any non-main group, wasting context tokens and
// risking contradictory instructions. Groups that don't import global
// intentionally don't get it — same as Claude-backed agents.
const groupDir = '/workspace/agent';
const groupPath = `${groupDir}/CLAUDE.md`;
const localPath = `${groupDir}/CLAUDE.local.md`;
const parts: string[] = [];
if (fs.existsSync(groupPath)) {
parts.push(resolveClaudeImports(fs.readFileSync(groupPath, 'utf-8'), groupDir));
}
if (fs.existsSync(localPath)) {
parts.push(resolveClaudeImports(fs.readFileSync(localPath, 'utf-8'), groupDir));
}
return parts.length > 0 ? parts.join('\n\n---\n\n') : undefined;
}
function composeBaseInstructions(promptAddendum: string | undefined): string | undefined {
const claudeMd = readAgentAndGlobalClaudeMd();
const pieces = [claudeMd, promptAddendum].filter((s): s is string => Boolean(s));
return pieces.length > 0 ? pieces.join('\n\n---\n\n') : undefined;
}
// ── Provider ────────────────────────────────────────────────────────────────
export class CodexProvider implements AgentProvider {
readonly supportsNativeSlashCommands = false;
private readonly mcpServers: Record<string, { command: string; args: string[]; env: Record<string, string> }>;
private readonly model: string;
constructor(options: ProviderOptions = {}) {
this.mcpServers = options.mcpServers ?? {};
this.model = (options.env?.CODEX_MODEL as string | undefined) ?? 'gpt-5.4-mini';
}
isSessionInvalid(err: unknown): boolean {
const msg = err instanceof Error ? err.message : String(err);
return STALE_THREAD_RE.test(msg);
}
query(input: QueryInput): AgentQuery {
const pending: string[] = [];
let waiting: (() => void) | null = null;
let ended = false;
let aborted = false;
const kick = (): void => {
waiting?.();
};
pending.push(input.prompt);
const self = this;
async function* gen(): AsyncGenerator<ProviderEvent> {
// One app-server per query invocation. The poll-loop keeps a single
// query active per batch of pending messages and ends it on idle, so
// spawn-per-query matches that cadence naturally.
writeCodexMcpConfigToml(self.mcpServers);
const server = spawnCodexAppServer(createCodexConfigOverrides());
attachCodexAutoApproval(server);
let threadId: string | undefined = input.continuation;
let initYielded = false;
try {
await initializeCodexAppServer(server);
const threadParams = {
model: self.model,
cwd: input.cwd,
sandbox: 'danger-full-access',
approvalPolicy: 'never',
personality: 'friendly',
baseInstructions: composeBaseInstructions(input.systemContext?.instructions),
};
threadId = await startOrResumeCodexThread(server, threadId, threadParams);
while (!aborted) {
while (pending.length === 0 && !ended && !aborted) {
await new Promise<void>((resolve) => {
waiting = resolve;
});
waiting = null;
}
if (aborted) return;
if (pending.length === 0 && ended) return;
const text = pending.shift()!;
// One turn = one channel of streaming events. Each notification
// from the app-server yields an `activity` first (so the
// poll-loop's idle timer stays honest) and then, where relevant,
// an init / result / progress event.
yield* runOneTurn(
server,
threadId!,
text,
self.model,
input.cwd,
() => initYielded,
() => {
initYielded = true;
},
);
}
} finally {
killCodexAppServer(server);
}
}
return {
push: (message: string) => {
pending.push(message);
kick();
},
end: () => {
ended = true;
kick();
},
abort: () => {
aborted = true;
kick();
},
events: gen(),
};
}
}
// ── Per-turn event pump ─────────────────────────────────────────────────────
// Pulled out because the gen() loop above reads cleaner with it extracted,
// and because it's a natural seam for future unit tests that drive it with
// a fake notification stream.
async function* runOneTurn(
server: AppServer,
threadId: string,
inputText: string,
model: string,
cwd: string,
hasInit: () => boolean,
markInit: () => void,
): AsyncGenerator<ProviderEvent> {
// Mutable refs via object properties — TS can't track closure assignments
// for narrowing, but property access keeps the declared type visible.
const turnState: { error: Error | null } = { error: null };
let resultText = '';
let turnDone = false;
// Buffered event queue so we can `yield` across the async notification
// callback. Each notification pushes zero or more ProviderEvents; the
// generator drains the buffer.
const buffer: ProviderEvent[] = [];
let waker: (() => void) | null = null;
const kick = (): void => {
waker?.();
waker = null;
};
const handler = (n: JsonRpcNotification): void => {
const method = n.method;
const params = n.params;
// Every inbound notification counts as activity for the poll-loop's
// idle timer — yield before any event-specific translation so even
// long tool executions keep the loop awake.
buffer.push({ type: 'activity' });
switch (method) {
case 'thread/started': {
const thread = params.thread as { id?: string } | undefined;
if (thread?.id && !hasInit()) {
markInit();
buffer.push({ type: 'init', continuation: thread.id });
}
break;
}
case 'item/agentMessage/delta': {
const delta = params.delta as string;
if (delta) resultText += delta;
break;
}
case 'item/completed': {
const item = params.item as { type?: string; text?: string } | undefined;
if (item?.type === 'agentMessage' && item.text) resultText = item.text;
break;
}
case 'turn/completed':
turnDone = true;
break;
case 'turn/failed': {
const e = params.error as { message?: string } | undefined;
turnState.error = new Error(e?.message || 'Turn failed');
turnDone = true;
break;
}
case 'thread/status/changed': {
const status = params.status as string | undefined;
if (status) buffer.push({ type: 'progress', message: `status: ${status}` });
break;
}
default:
// Silently handle the many item/* notifications — they already
// contributed an activity event above.
break;
}
kick();
};
server.notificationHandlers.push(handler);
const timer = setTimeout(() => {
turnState.error = new Error(`Turn timed out after ${TURN_TIMEOUT_MS}ms`);
turnDone = true;
kick();
}, TURN_TIMEOUT_MS);
try {
// If we yield init before turn/start, the poll-loop stores
// continuation early and survives a mid-turn crash.
if (!hasInit()) {
markInit();
buffer.push({ type: 'init', continuation: threadId });
}
await startCodexTurn(server, { threadId, inputText, model, cwd });
while (true) {
while (buffer.length > 0) {
const ev = buffer.shift()!;
yield ev;
}
if (turnDone) break;
await new Promise<void>((resolve) => {
waker = resolve;
});
waker = null;
}
while (buffer.length > 0) yield buffer.shift()!;
if (turnState.error) {
yield { type: 'error', message: turnState.error.message, retryable: false };
return;
}
yield { type: 'result', text: resultText || null };
} finally {
clearTimeout(timer);
const idx = server.notificationHandlers.indexOf(handler);
if (idx >= 0) server.notificationHandlers.splice(idx, 1);
}
}
registerProvider('codex', (opts) => new CodexProvider(opts));
@@ -2,6 +2,7 @@ import { describe, it, expect } from 'bun:test';
import { createProvider, type ProviderName } from './factory.js';
import { ClaudeProvider } from './claude.js';
import { CodexProvider } from './codex.js';
import { MockProvider } from './mock.js';
describe('createProvider', () => {
@@ -9,6 +10,10 @@ describe('createProvider', () => {
expect(createProvider('claude')).toBeInstanceOf(ClaudeProvider);
});
it('returns CodexProvider for codex', () => {
expect(createProvider('codex')).toBeInstanceOf(CodexProvider);
});
it('returns MockProvider for mock', () => {
expect(createProvider('mock')).toBeInstanceOf(MockProvider);
});
@@ -3,4 +3,6 @@
// level. Skills add a new provider by appending one import line below.
import './claude.js';
import './codex.js';
import './mock.js';
import './opencode.js';
@@ -0,0 +1,59 @@
import { describe, it, expect } from 'bun:test';
import { mcpServersToOpenCodeConfig } from './mcp-to-opencode.js';
describe('mcpServersToOpenCodeConfig', () => {
it('maps nanoclaw + extra server like v2 index.ts merge', () => {
const servers = {
nanoclaw: {
command: 'node',
args: ['/app/src/mcp-tools/index.js'],
env: {
SESSION_INBOUND_DB_PATH: '/workspace/inbound.db',
SESSION_OUTBOUND_DB_PATH: '/workspace/outbound.db',
SESSION_HEARTBEAT_PATH: '/workspace/.heartbeat',
},
},
extra: {
command: 'npx',
args: ['-y', 'some-mcp'],
env: { FOO: 'bar' },
},
};
const mcp = mcpServersToOpenCodeConfig(servers);
expect(mcp.nanoclaw).toEqual({
type: 'local',
command: ['node', '/app/src/mcp-tools/index.js'],
environment: {
SESSION_INBOUND_DB_PATH: '/workspace/inbound.db',
SESSION_OUTBOUND_DB_PATH: '/workspace/outbound.db',
SESSION_HEARTBEAT_PATH: '/workspace/.heartbeat',
},
enabled: true,
});
expect(mcp.extra).toEqual({
type: 'local',
command: ['npx', '-y', 'some-mcp'],
environment: { FOO: 'bar' },
enabled: true,
});
});
it('omits environment when env is empty', () => {
const mcp = mcpServersToOpenCodeConfig({
x: { command: 'true', args: [], env: {} },
});
expect(mcp.x).toEqual({
type: 'local',
command: ['true'],
enabled: true,
});
});
it('returns empty record for undefined', () => {
expect(mcpServersToOpenCodeConfig(undefined)).toEqual({});
});
});
@@ -0,0 +1,39 @@
import type { McpServerConfig } from './types.js';
/** OpenCode `mcp` entry shape (local stdio server). */
export type OpenCodeMcpLocal = {
type: 'local';
command: string[];
environment?: Record<string, string>;
enabled: true;
};
/** OpenCode `mcp` entry shape (remote HTTP server). */
export type OpenCodeMcpRemote = {
type: 'remote';
url: string;
headers?: Record<string, string>;
enabled: true;
};
export type OpenCodeMcpEntry = OpenCodeMcpLocal | OpenCodeMcpRemote;
/**
* Map NanoClaw v2 MCP definitions (same shape as Claude Agent SDK) into
* OpenCode config `mcp` field. Stdio-only until `McpServerConfig` gains remote.
*/
export function mcpServersToOpenCodeConfig(
servers: Record<string, McpServerConfig> | undefined,
): Record<string, OpenCodeMcpEntry> {
const out: Record<string, OpenCodeMcpEntry> = {};
if (!servers) return out;
for (const [name, cfg] of Object.entries(servers)) {
out[name] = {
type: 'local',
command: [cfg.command, ...cfg.args],
...(Object.keys(cfg.env).length > 0 ? { environment: cfg.env } : {}),
enabled: true,
};
}
return out;
}
@@ -0,0 +1,10 @@
import { describe, it, expect } from 'bun:test';
import { createProvider } from './factory.js';
import { OpenCodeProvider } from './opencode.js';
describe('createProvider (opencode)', () => {
it('returns OpenCodeProvider for opencode', () => {
expect(createProvider('opencode')).toBeInstanceOf(OpenCodeProvider);
});
});
@@ -0,0 +1,423 @@
import { spawn, type ChildProcess } from 'child_process';
import { createOpencodeClient, type OpencodeClient } from '@opencode-ai/sdk';
import { registerProvider } from './provider-registry.js';
import type { AgentProvider, AgentQuery, ProviderEvent, ProviderOptions, QueryInput } from './types.js';
import { mcpServersToOpenCodeConfig } from './mcp-to-opencode.js';
function log(msg: string): void {
console.error(`[opencode-provider] ${msg}`);
}
const SESSION_STATUS_RETRY_ERROR_AFTER = 3;
/** Stale / dead OpenCode session heuristics (complement Claude-centric host patterns). */
const STALE_SESSION_RE =
/no conversation found|ENOENT.*\.jsonl|session.*not found|NotFoundError|connection reset|ECONNRESET|404|event timeout/i;
function killProcessTree(proc: ChildProcess): void {
if (!proc.pid) return;
try {
process.kill(-proc.pid, 'SIGKILL');
} catch {
try {
proc.kill('SIGKILL');
} catch {
/* ignore */
}
}
}
function spawnOpencodeServer(config: Record<string, unknown>, timeoutMs = 10_000): Promise<{ url: string; proc: ChildProcess }> {
return new Promise((resolve, reject) => {
const hostname = '127.0.0.1';
const port = 4096;
const proc = spawn('opencode', ['serve', `--hostname=${hostname}`, `--port=${port}`], {
env: {
...process.env,
OPENCODE_CONFIG_CONTENT: JSON.stringify(config),
},
detached: true,
});
const id = setTimeout(() => {
killProcessTree(proc);
reject(new Error(`Timeout waiting for OpenCode server to start after ${timeoutMs}ms`));
}, timeoutMs);
let output = '';
proc.stdout?.on('data', (chunk: Buffer) => {
output += chunk.toString();
for (const line of output.split('\n')) {
if (line.startsWith('opencode server listening')) {
const match = line.match(/on\s+(https?:\/\/[^\s]+)/);
if (match) {
clearTimeout(id);
resolve({ url: match[1], proc });
}
}
}
});
proc.stderr?.on('data', (chunk: Buffer) => {
output += chunk.toString();
});
proc.on('exit', (code) => {
clearTimeout(id);
let msg = `OpenCode server exited with code ${code}`;
if (output.trim()) msg += `\nServer output: ${output}`;
reject(new Error(msg));
});
proc.on('error', (err) => {
clearTimeout(id);
reject(err);
});
});
}
function wrapPromptWithContext(text: string, systemInstructions?: string): string {
let out = text;
if (systemInstructions) {
out = `<system>\n${systemInstructions}\n</system>\n\n${out}`;
}
return out;
}
function buildOpenCodeConfig(options: ProviderOptions): Record<string, unknown> {
const provider = process.env.OPENCODE_PROVIDER || 'anthropic';
const model = process.env.OPENCODE_MODEL;
const smallModel = process.env.OPENCODE_SMALL_MODEL;
const proxyUrl = process.env.ANTHROPIC_BASE_URL;
const providerModelId = model ? model.replace(new RegExp(`^${provider}/`), '') : undefined;
const providerSmallModelId = smallModel ? smallModel.replace(new RegExp(`^${provider}/`), '') : undefined;
const modelsToRegister = [providerModelId, providerSmallModelId]
.filter(Boolean)
.filter((mid, i, a) => a.indexOf(mid as string) === i);
const providerOptions: Record<string, unknown> =
provider === 'anthropic'
? {}
: {
[provider]: {
options: { apiKey: 'placeholder', baseURL: proxyUrl },
...(modelsToRegister.length > 0
? {
models: Object.fromEntries(
modelsToRegister.map((mid) => [mid, { id: mid, name: mid, tool_call: true }]),
),
}
: {}),
},
};
const mcp = mcpServersToOpenCodeConfig(options.mcpServers);
// Load shared base + per-group fragments + per-group memory through OpenCode's
// native instructions pipeline (session/instruction.ts). Absolute paths with
// globs are supported. Files are read raw — `@./...` includes are NOT expanded
// by OpenCode, so point at the concrete files, not at composed CLAUDE.md.
const instructions = [
'/app/CLAUDE.md',
'/workspace/agent/.claude-fragments/*.md',
'/workspace/agent/CLAUDE.local.md',
];
return {
...(model ? { model } : {}),
...(smallModel ? { small_model: smallModel } : {}),
enabled_providers: [provider],
permission: 'allow',
autoupdate: false,
snapshot: false,
provider: providerOptions,
instructions,
mcp,
};
}
type SharedRuntime = {
proc: ChildProcess;
client: OpencodeClient;
stream: AsyncGenerator<{ type: string; properties: Record<string, unknown> }, void, void>;
streamRelease: () => void;
};
let sharedRuntime: SharedRuntime | null = null;
let sharedConfigKey: string | null = null;
let sharedInit: Promise<SharedRuntime> | null = null;
function runtimeConfigKey(options: ProviderOptions): string {
return JSON.stringify({
mcp: mcpServersToOpenCodeConfig(options.mcpServers),
model: process.env.OPENCODE_MODEL,
small: process.env.OPENCODE_SMALL_MODEL,
op: process.env.OPENCODE_PROVIDER,
});
}
async function ensureSharedRuntime(options: ProviderOptions): Promise<SharedRuntime> {
const key = runtimeConfigKey(options);
if (sharedRuntime && sharedConfigKey === key) return sharedRuntime;
if (sharedInit) return sharedInit;
sharedInit = (async () => {
if (sharedRuntime) {
destroySharedRuntime();
}
const config = buildOpenCodeConfig(options);
const { url, proc } = await spawnOpencodeServer(config);
const client = createOpencodeClient({ baseUrl: url });
const sub = await client.event.subscribe();
const stream = sub.stream as AsyncGenerator<{ type: string; properties: Record<string, unknown> }, void, void>;
sharedRuntime = {
proc,
client,
stream,
streamRelease: () => {
void stream.return?.(undefined);
},
};
sharedConfigKey = key;
sharedInit = null;
return sharedRuntime;
})();
return sharedInit;
}
export function destroySharedRuntime(): void {
if (sharedRuntime) {
try {
sharedRuntime.streamRelease();
} catch {
/* ignore */
}
killProcessTree(sharedRuntime.proc);
sharedRuntime = null;
sharedConfigKey = null;
}
sharedInit = null;
}
function sessionErrorMessage(props: { error?: unknown }): string {
const err = props.error as { data?: { message?: string } } | undefined;
if (err && typeof err === 'object' && err.data && typeof err.data.message === 'string') {
return err.data.message;
}
return JSON.stringify(props.error) || 'OpenCode session error';
}
export class OpenCodeProvider implements AgentProvider {
readonly supportsNativeSlashCommands = false;
private readonly options: ProviderOptions;
private activeSessionId: string | undefined;
constructor(options: ProviderOptions = {}) {
this.options = options;
}
isSessionInvalid(err: unknown): boolean {
const msg = err instanceof Error ? err.message : String(err);
return STALE_SESSION_RE.test(msg);
}
query(input: QueryInput): AgentQuery {
if (input.continuation) {
this.activeSessionId = input.continuation;
} else {
this.activeSessionId = undefined;
}
const pending: string[] = [];
let waiting: (() => void) | null = null;
let ended = false;
let aborted = false;
const systemInstructions = input.systemContext?.instructions;
pending.push(wrapPromptWithContext(input.prompt, systemInstructions));
const kick = (): void => {
waiting?.();
};
const self = this;
const IDLE_TIMEOUT_MS = Number(process.env.OPENCODE_IDLE_TIMEOUT_MS) || 300_000;
async function* gen(): AsyncGenerator<ProviderEvent> {
let initYielded = false;
const rt = await ensureSharedRuntime(self.options);
const { client, stream } = rt;
while (!aborted) {
while (pending.length === 0 && !ended && !aborted) {
await new Promise<void>((resolve) => {
waiting = resolve;
});
waiting = null;
}
if (aborted) return;
if (pending.length === 0 && ended) return;
const text = pending.shift()!;
let sessionId = self.activeSessionId;
if (!sessionId) {
const created = await client.session.create();
if (created.error) {
throw new Error(`OpenCode: failed to create session: ${JSON.stringify(created.error)}`);
}
sessionId = created.data?.id;
if (!sessionId) throw new Error('OpenCode: failed to create session (no id)');
self.activeSessionId = sessionId;
}
if (!initYielded) {
yield { type: 'init', continuation: sessionId };
initYielded = true;
}
const promptRes = await client.session.promptAsync({
path: { id: sessionId },
body: { parts: [{ type: 'text', text }] },
});
if (promptRes.error) {
self.activeSessionId = undefined;
throw new Error(`OpenCode promptAsync: ${JSON.stringify(promptRes.error)}`);
}
const partTextByMessageId = new Map<string, string>();
const roleByMessageId = new Map<string, string>();
let lastEventAt = Date.now();
let eventTimedOut = false;
const timeoutCheck = setInterval(() => {
if (Date.now() - lastEventAt > IDLE_TIMEOUT_MS) {
log(`OpenCode event timeout (${IDLE_TIMEOUT_MS}ms) — clearing session ${sessionId}`);
eventTimedOut = true;
self.activeSessionId = undefined;
destroySharedRuntime();
kick();
}
}, 5000);
try {
turn: while (true) {
if (aborted) return;
if (eventTimedOut) {
throw new Error(`OpenCode event timeout (${IDLE_TIMEOUT_MS}ms)`);
}
const { value: ev, done } = await stream.next();
if (done) {
throw new Error('OpenCode SSE stream ended unexpectedly');
}
if (!ev?.type || ev.type === 'server.connected' || ev.type === 'server.heartbeat') continue;
lastEventAt = Date.now();
yield { type: 'activity' };
switch (ev.type) {
case 'message.updated': {
const info = ev.properties.info as { id?: string; role?: string } | undefined;
if (info?.id && info?.role) {
roleByMessageId.set(info.id, info.role);
}
break;
}
case 'message.part.updated': {
const part = ev.properties.part as { type?: string; messageID?: string; text?: string } | undefined;
if (part?.type === 'text' && part.messageID && part.text) {
partTextByMessageId.set(part.messageID, part.text);
}
break;
}
case 'permission.updated': {
const perm = ev.properties as { id?: string; sessionID?: string };
if (perm.sessionID === sessionId && perm.id) {
try {
await client.postSessionIdPermissionsPermissionId({
path: { id: sessionId, permissionID: perm.id },
body: { response: 'always' },
});
} catch (err) {
log(`Failed to auto-reply permission: ${err instanceof Error ? err.message : String(err)}`);
}
}
break;
}
case 'session.status': {
const props = ev.properties as {
sessionID?: string;
status?: { type?: string; attempt?: number; message?: string };
};
if (props.sessionID !== sessionId) break;
const st = props.status;
if (
st?.type === 'retry' &&
typeof st.attempt === 'number' &&
st.attempt >= SESSION_STATUS_RETRY_ERROR_AFTER &&
st.message
) {
self.activeSessionId = undefined;
throw new Error(`OpenCode retry limit (${st.attempt}): ${st.message}`);
}
break;
}
case 'session.error': {
const props = ev.properties as { sessionID?: string; error?: unknown };
if (props.sessionID === sessionId || props.sessionID === undefined) {
self.activeSessionId = undefined;
throw new Error(sessionErrorMessage(props));
}
break;
}
case 'session.idle': {
const sid = (ev.properties as { sessionID?: string }).sessionID;
if (sid === sessionId) {
break turn;
}
break;
}
default:
break;
}
}
} finally {
clearInterval(timeoutCheck);
}
let resultText = '';
for (const [msgId, role] of roleByMessageId) {
if (role === 'assistant') {
resultText = partTextByMessageId.get(msgId) ?? resultText;
}
}
yield { type: 'result', text: resultText || null };
}
}
return {
push: (message: string) => {
pending.push(wrapPromptWithContext(message, systemInstructions));
kick();
},
end: () => {
ended = true;
kick();
},
events: gen(),
abort: () => {
aborted = true;
this.activeSessionId = undefined;
kick();
destroySharedRuntime();
},
};
}
}
registerProvider('opencode', (opts) => new OpenCodeProvider(opts));
+49
View File
@@ -0,0 +1,49 @@
/**
* Host-side container config for the `codex` provider.
*
* Codex reads auth and MCP config from ~/.codex. We give each session its
* own private copy of that directory so:
*
* - The user's host ~/.codex/auth.json reaches the container without us
* touching their host config.toml (which the host's own `codex` CLI
* might be using).
* - The in-container provider can rewrite config.toml freely on every
* wake with container-appropriate MCP server paths, without racing
* other sessions or leaking per-session paths back to the host.
*
* Env passthrough covers the two knobs that are read at runtime:
* OPENAI_API_KEY — fallback auth when auth.json isn't a subscription token
* CODEX_MODEL — model override if the user wants something other than the default
* OPENAI_BASE_URL — rare, but supports API-compatible alternates
*/
import fs from 'fs';
import path from 'path';
import { registerProviderContainerConfig } from './provider-container-registry.js';
registerProviderContainerConfig('codex', (ctx) => {
const codexDir = path.join(ctx.sessionDir, 'codex');
fs.mkdirSync(codexDir, { recursive: true });
// Copy the host's auth.json into the per-session dir if it exists.
// We only copy auth.json, not the full ~/.codex — config.toml would
// get clobbered by the container on every wake anyway.
const hostHome = ctx.hostEnv.HOME;
if (hostHome) {
const hostAuth = path.join(hostHome, '.codex', 'auth.json');
if (fs.existsSync(hostAuth)) {
fs.copyFileSync(hostAuth, path.join(codexDir, 'auth.json'));
}
}
const env: Record<string, string> = {};
for (const key of ['OPENAI_API_KEY', 'CODEX_MODEL', 'OPENAI_BASE_URL'] as const) {
const value = ctx.hostEnv[key];
if (value) env[key] = value;
}
return {
mounts: [{ hostPath: codexDir, containerPath: '/home/node/.codex', readonly: false }],
env,
};
});
+3
View File
@@ -4,3 +4,6 @@
// needs (claude, mock) don't appear here.
//
// Skills add a new provider by appending one import line below.
import './codex.js';
import './opencode.js';
+49
View File
@@ -0,0 +1,49 @@
/**
* Host-side container config for the `opencode` provider.
*
* OpenCode's `opencode serve` process stores state under XDG_DATA_HOME, which
* we pin to a per-session host directory mounted at /opencode-xdg. The
* OPENCODE_* env vars tell the CLI which provider/model to use at runtime
* (read on the host, injected into the container). NO_PROXY / no_proxy are
* merged with host values so the in-container OpenCode client can talk to
* 127.0.0.1 even when HTTPS_PROXY is set by OneCLI.
*/
import fs from 'fs';
import path from 'path';
import { registerProviderContainerConfig } from './provider-container-registry.js';
function mergeNoProxy(current: string | undefined, additions: string): string {
if (!current?.trim()) return additions;
const parts = new Set(
current
.split(/[\s,]+/)
.map((s) => s.trim())
.filter(Boolean),
);
for (const addition of additions.split(',')) {
const trimmed = addition.trim();
if (trimmed) parts.add(trimmed);
}
return [...parts].join(',');
}
registerProviderContainerConfig('opencode', (ctx) => {
const opencodeDir = path.join(ctx.sessionDir, 'opencode-xdg');
fs.mkdirSync(opencodeDir, { recursive: true });
const env: Record<string, string> = {
XDG_DATA_HOME: '/opencode-xdg',
NO_PROXY: mergeNoProxy(ctx.hostEnv.NO_PROXY, '127.0.0.1,localhost'),
no_proxy: mergeNoProxy(ctx.hostEnv.no_proxy, '127.0.0.1,localhost'),
};
for (const key of ['OPENCODE_PROVIDER', 'OPENCODE_MODEL', 'OPENCODE_SMALL_MODEL'] as const) {
const value = ctx.hostEnv[key];
if (value) env[key] = value;
}
return {
mounts: [{ hostPath: opencodeDir, containerPath: '/opencode-xdg', readonly: false }],
env,
};
});