Compare commits

34 Commits

Author SHA1 Message Date
Omri Maya 780265225f chore(codex): trim filename comment to the non-obvious bit
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 16:58:35 +03:00
Omri Maya 9fa85ccf95 chore(codex): drop ponytail comment
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 16:56:03 +03:00
Omri Maya dfd3ee31a9 fix(codex): derive date-prefix strip from regex, drop magic offset
f.slice(11) silently assumed len("YYYY-MM-DD-") and was coupled to the
`dated` regex two lines up — change the date format and it points at the
wrong offset while still "working". Strip with f.replace(dated, '') so the
prefix length lives in exactly one place.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 16:54:34 +03:00
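
A minimal sketch of the change described in dfd3ee31a9, for illustration only. The commit names a `dated` regex but does not show it, so the pattern below is an assumption, as are the helper names `stripBySlice` and `stripByRegex`.

```typescript
// Hypothetical pattern: the commit refers to a `dated` regex without quoting it.
const dated = /^\d{4}-\d{2}-\d{2}-/;

// Before: the offset 11 silently encodes len("YYYY-MM-DD-").
// If the date format changes, this keeps "working" but cuts at the wrong place.
function stripBySlice(f: string): string {
  return f.slice(11);
}

// After: the prefix length lives only in the regex.
// A name without a date prefix is returned unchanged.
function stripByRegex(f: string): string {
  return f.replace(dated, '');
}
```

With these definitions, `stripByRegex('2026-06-15-thread-abc.md')` yields `'thread-abc.md'`, and an undated name passes through untouched, whereas `stripBySlice` would drop its first 11 characters regardless.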
Koshkoshinsk 47f8296e67 fix(codex): date-prefix the per-thread archive filename, stable across days
Restore a sortable `YYYY-MM-DD-` prefix to the conversation archive filename
(parity with the Claude path) while keeping it thread-stable: reuse the
thread's existing file regardless of date so later exchanges keep appending to
one file, and only stamp the creation date when the file doesn't exist yet.
Exact-suffix match past the date prefix avoids substring collisions between
threads with shared prefixes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 15:10:13 +03:00
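
The filename rule in 47f8296e67 can be sketched roughly as follows. This is not the provider's code: `resolveArchiveName`, the `.md` extension, and the `dated` pattern are assumptions made for the example.

```typescript
// Hypothetical date-prefix pattern (YYYY-MM-DD-).
const dated = /^\d{4}-\d{2}-\d{2}-/;

// Reuse the thread's existing file whatever its date; stamp `today` only
// when the thread has no file yet.
function resolveArchiveName(existing: string[], threadId: string, today: string): string {
  const suffix = threadId + '.md';
  for (const f of existing) {
    // Exact match on everything past the date prefix, so thread "bc"
    // does not collide with thread "abc" the way a suffix check would.
    if (dated.test(f) && f.replace(dated, '') === suffix) {
      return f;
    }
  }
  return today + '-' + suffix;
}
```

The exact comparison is the important part: a substring or ends-with check would let a short thread id match a longer one that happens to end in the same characters.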
Koshkoshinsk dab93fc592 feat(codex): append per-thread conversation archive
The Codex provider wrote one standalone file per exchange because its
app-server keeps conversation history server-side, with no on-disk
transcript to roll up at a compaction boundary. Key the archive file on
the thread/continuation id and append each completed exchange instead, so
a session lands in one growing file — matching the Claude path's
one-file-per-session granularity.

The thread-level header (provider, continuation id) is written once when
the file is created; each appended block carries its own timestamp and
status. Distinct threads still get distinct files.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 15:10:13 +03:00
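
As a rough illustration of the append behaviour described in dab93fc592, the sketch below uses an in-memory object in place of the filesystem. The `Exchange` shape, the header and block layout, and the function name are all hypothetical; only the write-header-once, append-per-exchange idea comes from the commit message.

```typescript
interface Exchange {
  timestamp: string;
  status: string;
  text: string;
}

// `files` stands in for the archive directory: file name -> contents.
function appendExchange(
  files: Record<string, string>,
  provider: string,
  threadId: string,
  ex: Exchange,
): void {
  const name = threadId + '.md';
  // Thread-level header, written only when the file is created.
  const header = '# provider: ' + provider + '\n# continuation: ' + threadId + '\n';
  // Each appended block carries its own timestamp and status.
  const block = '\n## ' + ex.timestamp + ' (' + ex.status + ')\n' + ex.text + '\n';
  const prev = files[name];
  files[name] = (prev === undefined ? header : prev) + block;
}
```

Two exchanges on the same thread end up in one growing file with a single header, while a different thread id produces a separate file.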
gavrielc 0ffe5582f0 refactor(codex): install Codex CLI via cli-tools.json manifest, not the Dockerfile
Trunk's adfae67 moved global Node CLI installs into container/cli-tools.json
(a json-merge seam) so a skill adds a CLI without editing the Dockerfile. The
Codex provider still hardcoded its CLI into the Dockerfile and guarded that
shape with a test. Migrate it:

- Drop the codex ARG + RUN from this branch's reference Dockerfile.
- Replace codex-dockerfile.test.ts with codex-cli-tools.test.ts, which asserts
  the @openai/codex entry in cli-tools.json (skips when the manifest is absent,
  e.g. on the bare providers branch).
- setup/providers/codex install-check verifies the manifest entry, not the
  Dockerfile.

Pairs with the trunk-side add-codex.sh / SKILL.md change that appends the
manifest entry instead of awk-ing the Dockerfile.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 21:36:16 +03:00
gavrielc 4b5576faea Merge pull request #2757 from nanocoai/providers-codex-v2
feat(codex): Codex agent-provider payload v2 — app-server on capability seams, vault-only auth
2026-06-14 21:29:20 +03:00
Omri Maya 904871aaa7 feat(codex): Codex agent-provider payload v2 — app-server on capability seams, vault-only auth
Rewrite the Codex provider onto the host's capability seams. Codex runs as a
real agent provider via `codex app-server` — planning, MCP tools, server-side
history, session resume — not as an MCP tool under Claude.

- Host provider (src/providers/codex.ts, codex-agents-md.ts): registers on the
  provider-container seam, composes AGENTS.md from the real config row, mounts a
  per-group ~/.codex state dir, vault-only auth stub (no credential in-container).
- Container runtime (codex.ts, codex-app-server.ts): app-server transport, turn
  lifecycle, racing-follow-up fix (clear the active turn on completion).
- Provider-owned per-exchange archiving (exchange-archive.ts) via onExchangeComplete.
- Codex CLI pinned to 0.138.0 in the Dockerfile (ARG + global install), guarded
  by a structural dockerfile test.
- macOS first-spawn fix: pre-create the auth-stub mountpoint.

The /add-codex skill is dropped from this branch — trunk is its canonical home.
The authored-skills canonical store is deferred to a future provider-seam PR
(its stale host-contribution assertion is removed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 07:49:23 +03:00
gavrielc 5e76b9d7e8 test(providers): add barrel-driven registration tests for opencode + codex
Behavior tests in both trees that import ONLY the real barrel (host
src/providers/index.ts → listProviderContainerConfigNames; container
providers/index.ts → listProviderNames) and assert the provider is present.
Unlike the existing *.factory.test.ts (which import the provider module
directly, self-register, and stay green when the barrel line is deleted),
these go red if the barrel reach-in is removed/drifts. The opencode container
test also implicitly guards @opencode-ai/sdk via its unmocked barrel import.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 19:09:08 +03:00
gavrielc b429ab37b8 Merge pull request #2152 from glifocat/fix/opencode-process-group-and-timeout
fix(opencode): kill server process group + configurable IDLE_TIMEOUT_MS
2026-05-01 18:40:39 +03:00
gavrielc 09ddde33e1 Merge pull request #2153 from glifocat/fix/opencode-instructions-pipeline
fix(opencode): use native instructions config to load CLAUDE.md and fragments
2026-05-01 17:44:26 +03:00
gavrielc c0c46c14d6 fix(opencode): drop obsolete NANOCLAW_IS_MAIN / global CLAUDE.md branch
NANOCLAW_IS_MAIN no longer exists in the v2 codebase, and
groups/global/ is explicitly migrated away by composeGroupClaudeMd
(shared base now lives in container/CLAUDE.md → /app/CLAUDE.md, which
the instructions array already includes). Carrying /workspace/global
would give OpenCode more context than the Claude provider sees.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 17:44:01 +03:00
Ethan c993527e25 fix(opencode): load CLAUDE.md context via native instructions pipeline
Before: wrapPromptWithContext concatenated /workspace/agent/CLAUDE.md
verbatim into a <system>...</system> block in the user-message text.
That file is the host-composed entry point that contains only `@./...`
includes (the .claude-shared.md symlink to /app/CLAUDE.md, plus the
module fragments). OpenCode does not expand `@` in instruction files —
the syntax is a Claude Code convention for the model's own Read tool
to lazy-load. Result: the model received the literal lines
"@./.claude-shared.md\n@./.claude-fragments/module-...md" as text and
saw none of the actual content (workspace, memory, conversation
history, agents/core/interactive/scheduling/self-mod modules, OneCLI,
etc.). Confirmed empirically by dumping the constructed prompt for a
turn before any fix.

After: pass the concrete files via OpenCode's native `instructions`
config field. Per packages/opencode/src/session/instruction.ts on the
upstream `dev` branch, absolute paths and globs are resolved with
`fs.glob(basename, { cwd: dirname, absolute: true, include: 'file' })`,
files are read raw, and the resulting strings are concatenated into
the LLM's system prompt at packages/opencode/src/session/prompt.ts:1442.
That is the canonical channel — same one OpenCode uses for AGENTS.md /
CLAUDE.md auto-discovery.

Configured set:
  /app/CLAUDE.md                              shared base
  /workspace/agent/.claude-fragments/*.md     per-skill fragments
  /workspace/agent/CLAUDE.local.md            per-group memory
  /workspace/global/CLAUDE.md                 cross-group memory (non-main)

Removed: readClaudeMdForPrompt, the manual <system> wrap of its output,
and the now-unused `fs` import. The dynamic systemInstructions wrap
(assistant name + destinations) stays as-is — that varies per call.

Verified post-fix: Citiclaw (gemma4:31b via Ollama) responded to "what
self-modification tools do you have" with the exact MCP tool names and
a structured bullet list — vs. the previous ungrounded "I haven't
received instructions" response.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 12:54:56 +02:00
Ethan 0367dbb6f0 fix(opencode): kill server process group + configurable idle timeout
Two bugs in the upstream OpenCode provider that fire together when a
local backend (Ollama, llama.cpp) is slower than the hardcoded 90s
event timeout:

1. proc.kill('SIGKILL') only kills the wrapper process the spawn
   returned, not the opencode-linux-*/bin/opencode child it execs into.
   The child keeps holding port 4096, so the next spawnOpencodeServer()
   fails with "Failed to start server on port 4096" / EADDRINUSE.
   Fix: spawn detached and signal the whole process group via
   process.kill(-pid, 'SIGKILL') in a new killProcessTree() helper.

2. IDLE_TIMEOUT_MS = 90_000 is hardcoded. For a local 31B model the
   first prompt's time-to-first-token routinely exceeds that, tripping
   the timeout. Fix: read OPENCODE_IDLE_TIMEOUT_MS from env, default
   300_000 (5 min) — generous for cloud APIs, just enough for local.

Per-group override goes in container.json env (e.g. "600000" for a
slow local box), no rebuild needed since src/ is bind-mounted.

Same bugs exist on origin/providers — should be ported upstream.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 12:54:51 +02:00
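
The two fixes in 0367dbb6f0 could look roughly like this. It is a sketch under assumptions: the kill function is injected so the example stays free of Node imports, the handling of an already-dead group (`ESRCH`) is a guess at reasonable behaviour rather than something the commit states, and `readIdleTimeoutMs` is a hypothetical name.

```typescript
type Env = Record<string, string | undefined>;
type KillFn = (pid: number, signal: string) => unknown;

const DEFAULT_IDLE_TIMEOUT_MS = 300_000; // 5 minutes, per the commit message

// Read OPENCODE_IDLE_TIMEOUT_MS, falling back to the default when the
// variable is unset, non-numeric, or not positive.
function readIdleTimeoutMs(env: Env): number {
  const raw = env['OPENCODE_IDLE_TIMEOUT_MS'];
  const parsed = raw === undefined ? NaN : Number(raw);
  return isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_IDLE_TIMEOUT_MS;
}

// Signal the whole process group. On POSIX, a negative pid targets the
// group, which only works if the child was spawned as a group leader
// (in Node, by passing `detached: true` to spawn).
function killProcessTree(pid: number, kill: KillFn): boolean {
  try {
    kill(-pid, 'SIGKILL');
    return true;
  } catch (err) {
    // Assumption: treat "no such process" as already cleaned up.
    if ((err as { code?: string }).code === 'ESRCH') return false;
    throw err;
  }
}
```

In a Node process the `kill` argument would typically be `process.kill`, and the per-group override from `container.json` arrives through the environment object passed to `readIdleTimeoutMs`.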
gavrielc c7a7a709ed Merge main into providers
Bring providers up to date with main, including the channel-inbound
attachment path-traversal fix.

Resolved conflicts:
- .claude/skills/add-codex/SKILL.md: took main (newer CODEX_VERSION,
  more accurate schema docs)
- container/Dockerfile: kept main's stratified pnpm-install layers for
  caching, added a separate layer for @openai/codex, bumped Dockerfile
  CODEX_VERSION default to 0.124.0 to match SKILL.md
2026-04-28 13:53:47 +03:00
gavrielc 7c8d220115 Merge pull request #1966 from IamAdamJowett/fix/codex-resolve-imports-and-local-md
fix(codex-provider): resolve @-imports and load CLAUDE.local.md
2026-04-24 15:32:14 +03:00
Adam 2d2c3204bc fix(codex-provider): don't double-append global CLAUDE.md when @-imports resolve it
readAgentAndGlobalClaudeMd was appending /workspace/global/CLAUDE.md
explicitly at the end, but since @-import resolution was added the
group's own CLAUDE.md already pulls global in via its default
`@./.claude-global.md` import (see src/group-init.ts). Result:
non-main groups got global instructions inlined twice, wasting
context tokens and occasionally producing contradictory repeated
guidance.

Drop the explicit global append. Groups that want global content
import it via the @-directive in their CLAUDE.md (the default) —
groups that intentionally don't import it now correctly skip it,
matching Claude-backed agent behaviour. The NANOCLAW_IS_MAIN
env branch also becomes unnecessary and is removed.

Addresses Codex Review P2 on qwibitai/nanoclaw#1966.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:45:16 +10:00
Adam 4836cb59df fix(codex-provider): resolve @-imports and load CLAUDE.local.md
The Codex app-server doesn't expand Claude Code's @-import syntax and
doesn't auto-load CLAUDE.local.md from the working directory. Left
alone, Codex-backed agents saw only the raw import directives as
literal text — no composed CLAUDE.md, no module fragments, no
per-group memory. The Claude provider worked because Claude Code
resolves both natively; anything non-Claude was running half-blind.

Adds a pure resolveClaudeImports(content, baseDir) helper that inlines
`^@<path>$` directives relative to the file they live in, recursing
into imported files and breaking cycles. readAgentAndGlobalClaudeMd
now uses it on the group CLAUDE.md, then appends CLAUDE.local.md
(same resolution base), then the global CLAUDE.md.

Missing or cyclic imports are dropped silently (empty text) rather
than left as raw `@path` lines, which would confuse the model.

Fixes the symptom where a Codex-backed agent with a well-written
CLAUDE.local.md (e.g. image-delivery discipline, wiki instructions)
behaved as if none of it existed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:33:43 +10:00
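
A minimal sketch of the import resolution described in 4836cb59df. The real helper is `resolveClaudeImports(content, baseDir)`; the version below adds an injected `readFile` callback and a `chain` parameter so it can run without a filesystem, and the small path helpers are simplified POSIX-only stand-ins. All of that is assumed for the example.

```typescript
type ReadFile = (path: string) => string | undefined;

// Simplified POSIX path helpers (assumption: no Windows paths).
function dirOf(p: string): string {
  const i = p.lastIndexOf('/');
  return i <= 0 ? '/' : p.slice(0, i);
}

function joinPath(base: string, rel: string): string {
  const full = rel.charAt(0) === '/' ? rel : base + '/' + rel;
  const parts: string[] = [];
  for (const seg of full.split('/')) {
    if (seg === '' || seg === '.') continue;
    if (seg === '..') parts.pop();
    else parts.push(seg);
  }
  return '/' + parts.join('/');
}

// Inline each `@<path>` line relative to the file it lives in. Missing or
// cyclic imports become empty text instead of leaking raw `@path` lines.
function resolveClaudeImports(
  content: string,
  baseDir: string,
  readFile: ReadFile,
  chain: string[] = [],
): string {
  return content
    .split('\n')
    .map((line) => {
      if (!/^@\S+$/.test(line)) return line;
      const target = joinPath(baseDir, line.slice(1));
      if (chain.indexOf(target) !== -1) return ''; // cycle: already being expanded
      const body = readFile(target);
      if (body === undefined) return ''; // missing file
      return resolveClaudeImports(body, dirOf(target), readFile, chain.concat(target));
    })
    .join('\n');
}
```

Tracking the chain of files currently being expanded, rather than every file ever seen, lets the same fragment be imported from two unrelated places while still stopping true cycles.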
gavrielc c20213a133 docs(add-codex): fix Dockerfile install step — separate RUN block, not combined list
Keep this in sync with main. The skill's prior instruction (append to a
combined `pnpm install -g` block) no longer matches main's Dockerfile, which
splits each global CLI into its own RUN layer for cache granularity. Update
to add a standalone RUN block for Codex that matches the existing pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 21:38:40 +03:00
gavrielc d9ed98fd65 style: apply prettier to merged files
2026-04-22 11:53:55 +03:00
gavrielc 5ae1c33fff Merge v2 into providers
Picks up 105 commits from v2 (engage modes, sender/channel approval flows,
host-sweep heartbeat lifecycle, setup/onecli refactor, setup-flow docs).

Retires 9 deprecated skills that moved out of this branch's scope:
add-compact, add-gmail, add-image-vision, add-pdf-reader, add-reactions,
add-telegram-swarm, add-voice-transcription, channel-formatting,
use-local-whisper.

Preserves providers-branch code: codex provider (6 files), opencode
provider + MCP bridge, plus the CODEX_VERSION Dockerfile ARG.

Conflict: container/Dockerfile — combined v2's CLAUDE_CODE_VERSION bump
(2.1.112 → 2.1.116) with the branch-local CODEX_VERSION ARG.
2026-04-22 11:52:37 +03:00
gavrielc af542adad5 Merge pull request #1843 from chiptoe-svg/feat/codex-provider
feat(providers): add codex provider via app-server JSON-RPC
2026-04-20 23:45:04 +03:00
Chip Tonkin 894b154e41 fix(codex): remove client-driven compaction threshold
The 40 000-input-token threshold and its supporting plumbing
(cumulativeInputTokens counter, thread/tokenUsage/updated handler,
setInputTokens callback on runOneTurn) were SDK-era residue. In the
SDK implementation we ran custom summarize-and-restart around 40k
because the SDK had no native compaction; that reasoning stopped
applying once we switched to app-server's native thread/compact/start.

With that migration the threshold wasn't removed, only re-plumbed —
calling thread/compact/start from the client every time we crossed
an arbitrary 40k. On 128k-context models that over-compacts by
~3× relative to the actual context window, adding latency for no
durability gain.

Trust app-server to handle its own compaction (same posture as the
Claude provider trusts Claude Code's auto_compact). If empirical
behavior shows turns failing at the context wall, we'll revisit with
a server-notification-driven trigger rather than a client-side
threshold.

Also removes now-unused imports (sendCodexRequest, local log helper).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 17:50:24 -04:00
Chip Tonkin 831ef88f16 fix(codex): harden resume fallback and MCP config.toml escaping
Surfaced by adversarial review of the PR diff.

1. thread/resume fallback (startOrResumeCodexThread). Previously fell back
   to thread/start on ANY resume error, silently discarding session state
   for transient / auth / version-mismatch failures that should surface.
   Now gated on STALE_THREAD_RE — the same pattern CodexProvider
   .isSessionInvalid uses on the caller side — and re-throws everything
   else so the poll-loop can decide what to do.

2. MCP config.toml (writeCodexMcpConfigToml). Raw string interpolation for
   command / args / env values would produce invalid TOML on unescaped
   quotes or backslashes, and silently reinterpret the value if a newline
   snuck in. Now routed through a new tomlBasicString helper that escapes
   `"` and `\` per TOML basic-string rules, and rejects newlines with a
   clear error rather than mask misconfiguration.

STALE_THREAD_RE moves to codex-app-server.ts (exported) since both files
need it; codex.ts imports to keep the two detection paths in sync.

Tests: tomlBasicString (quotes, backslashes, escape-order, newline
rejection) and STALE_THREAD_RE (stale vs transient messages).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 13:02:25 -04:00
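
The two hardening steps in 831ef88f16 might be sketched as below. The commit names `tomlBasicString` and `STALE_THREAD_RE` but shows neither body, so the regex pattern, the rejection of `\r` alongside `\n`, and the `isStaleThreadError` helper are assumptions. Other control characters, which TOML also requires escaping, are ignored here for brevity.

```typescript
// Hypothetical pattern: the real STALE_THREAD_RE is not shown in the commit.
const STALE_THREAD_RE = /thread not found|unknown thread/i;

// Only stale-thread errors should trigger a fall back to thread/start;
// anything else (transient, auth, version mismatch) should be re-thrown.
function isStaleThreadError(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return STALE_THREAD_RE.test(message);
}

// Quote a value as a TOML basic string.
function tomlBasicString(value: string): string {
  if (/[\r\n]/.test(value)) {
    throw new Error('TOML basic string must not contain a newline');
  }
  // Escape backslashes first; otherwise the backslash added for each
  // quote would itself be doubled by the second pass.
  const escaped = value.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
  return '"' + escaped + '"';
}
```

The escape order matters: reversing the two `replace` calls would turn `"` into `\\"`, which TOML reads as an escaped backslash followed by a string terminator.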
Chip Tonkin d05923f274 feat(skills): /add-codex — install Codex as the agent provider
Ports the Codex provider files from the providers branch into trunk,
wires host and container barrels, updates the Dockerfile to install
the Codex CLI, and documents the three auth paths (ChatGPT
subscription, OPENAI_API_KEY, BYO OpenAI-compatible endpoint).

Mirrors the /add-opencode skill shape so the two non-Claude provider
install paths look familiar. No agent-runner package dependency —
Codex is a CLI binary, not a library.

Requested by the maintainer on the PR thread — the skill lands
alongside feat(providers): add codex provider via app-server JSON-RPC
so users have a guided install path when trunk eventually picks up
the provider.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 11:31:08 -04:00
Chip Tonkin aba618215d fix(codex): load /workspace/global/CLAUDE.md for non-main groups
Codex's app-server doesn't expand Claude Code's `@-import` syntax, so
the previous loader only passed group CLAUDE.md — global content never
reached the model. This mirrors the OpenCode provider's
readClaudeMdForPrompt: load group CLAUDE.md, then (unless NANOCLAW_IS_MAIN=1)
append /workspace/global/CLAUDE.md.

The literal `@./.claude-global.md` line at the top of group CLAUDE.md is
left in place. OpenCode does the same — keeps the two non-Claude
providers behaviorally consistent, and upstream may revise the group
template later.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 11:12:08 -04:00
Chip Tonkin 138c277fae fix(codex): remove no-op model_provider_base_url override
`-c model_provider_base_url=...` is not a valid Codex config key. Codex
uses nested paths (e.g. `model_providers.<name>.base_url`) per the
`-c` help text, so the override was creating a top-level TOML entry
that Codex never reads.

The override was also redundant: Codex's built-in `openai` provider
honors the `OPENAI_BASE_URL` env var natively (standard OpenAI SDK
convention), verified against codex-cli 0.118 — a request with
`OPENAI_BASE_URL=http://127.0.0.1:9/v1` tries to reach exactly that URL.

The host provider already forwards `OPENAI_BASE_URL` into the container
(src/providers/codex.ts), so BYO-endpoint users keep their config path
without this broken `-c` flag in the way.

Net: -5 LOC, one less silent-fail branch, no user-visible change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 11:10:07 -04:00
Chip Tonkin 1e7cb8b8c8 feat(providers): add codex provider via app-server JSON-RPC
Adds `codex` as a third AgentProvider alongside `claude` and `opencode`.

## Goal

Bring Codex up to the same feature bar as the Claude Agent SDK integration:
persistent sessions, streaming output, MCP tool access, and native
compaction — all driven through the AgentProvider interface without
polluting the poll-loop with per-SDK quirks.

## Why not the @openai/codex-sdk

The SDK-wrapping approach (sketched in agent-runner-details.md) was
tried first but fell short of Claude Agent SDK's feature set on
several fronts. The `codex app-server` JSON-RPC protocol closes most
of the gap:

| Capability                  | Claude SDK | Codex SDK | Codex app-server |
|-----------------------------|:----------:|:---------:|:----------------:|
| Session resume              |     ✓      |     ✓     |        ✓         |
| Streaming event output      |     ✓      |     ✓     |        ✓         |
| Mid-turn input push         |     ✓      |     ✗     | queue + drain*   |
| MCP servers                 |     ✓      |  config   |     config       |
| Approval hooks              |    hooks   |  limited  |  server reqs     |
| Native compaction trigger   |    auto    |     ✗     |        ✓         |
| Multi-turn subagents        |    Task    |     ✗     |   spawnAgent     |
| Config overrides            |     API    |     JS    |   -c flags       |

*Codex turns don't accept mid-turn input. The provider queues push()
 and drains between turns — matches the opencode provider pattern and
 the poll-loop doesn't dispatch new messages mid-turn anyway.
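
The queue-and-drain behaviour from the footnote can be pictured with a small sketch. `TurnQueue` and its method names are hypothetical and do not reflect the actual `CodexProvider` internals; the point is only that `push()` never interrupts a running turn.

```typescript
class TurnQueue {
  private pending: string[] = [];
  private turnActive = false;

  // Always accepted, even mid-turn; the message simply waits.
  push(message: string): void {
    this.pending.push(message);
  }

  // Start a turn with everything queued so far, or return null when a
  // turn is already running or nothing is waiting.
  beginTurn(): string[] | null {
    if (this.turnActive || this.pending.length === 0) return null;
    this.turnActive = true;
    const batch = this.pending;
    this.pending = [];
    return batch;
  }

  // Clear the active turn on completion so the next drain can proceed.
  endTurn(): void {
    this.turnActive = false;
  }
}
```

A message pushed while a turn is active is held until `endTurn()` and then delivered by the next `beginTurn()`, so nothing is dropped.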

## Known shortcomings (both tractable in follow-ups)

**No native file/web tools.** Claude ships Read/Write/Edit/Glob/Grep/
WebFetch/WebSearch in-SDK. Codex (both SDK and app-server) leaves the
agent to shell out via bash — slower, messier output. Mitigation: a
follow-up can expose these as MCP tools on the container side, reusing
the existing `send_message` / `send_file` tool plumbing.

**Higher tool-call latency.** Every tool call round-trips through
JSON-RPC over stdio (~10-50ms per hop vs Claude's in-process dispatch).
Mitigation: batch MCP calls, pipeline commands — same playbook as
opencode.

Neither blocks feature parity for the common conversational + MCP
cases this PR targets.

## Files

- container/agent-runner/src/providers/codex-app-server.ts
    JSON-RPC transport: spawn, request/response, notification dispatch,
    auto-approval, thread lifecycle, MCP config.toml writer. Kept
    separate so it can be unit-tested without the full provider.
- container/agent-runner/src/providers/codex.ts
    CodexProvider implementing AgentProvider. Emits `activity` on every
    notification so idle timer stays honest. isSessionInvalid matches
    on stale-thread-ID errors for clean recovery.
- container/agent-runner/src/providers/codex.factory.test.ts
    Factory + isSessionInvalid + supportsNativeSlashCommands coverage.
- src/providers/codex.ts
    Host-side container contribution. Per-session ~/.codex copy with
    auth.json copied from host, so the container rewrites config.toml
    on every wake without touching the host's. Forwards OPENAI_API_KEY
    / CODEX_MODEL / OPENAI_BASE_URL.
- container/Dockerfile
    Pins CODEX_VERSION=0.121.0, adds @openai/codex to the pnpm global
    install block alongside claude-code/agent-browser/vercel.

## Validation

Smoke-tested end-to-end against real codex-cli 0.118 on macOS: observed
init → activity → progress → result events with a stable thread ID as
continuation, result text matched expected output. Unit tests: 26 pass
in agent-runner, 137 in host. Typecheck + prettier clean on all added
files.
2026-04-19 10:30:47 -04:00
gavrielc eb0055a0b0 Merge remote-tracking branch 'origin/v2' into providers-sync
2026-04-18 22:07:32 +03:00
gavrielc ef6ea87628 Merge branch 'v2' into providers
2026-04-18 19:15:58 +03:00
gavrielc b29df213ad Merge remote-tracking branch 'origin/v2' into providers
2026-04-18 15:59:25 +03:00
gavrielc dd53875574 fix(providers): restore opencode files that v2 merge deleted
The v2 merge brought in commit e0258e8 ('move opencode provider off
v2 trunk') which correctly removed opencode from v2 but also applied
that deletion on this branch, where opencode must live. Restored:

- container/agent-runner/src/providers/opencode.ts
- container/agent-runner/src/providers/mcp-to-opencode.ts
- container/agent-runner/src/providers/mcp-to-opencode.test.ts
- src/providers/opencode.ts
- @opencode-ai/sdk dep in container/agent-runner/package.json
- opencode self-registration imports in both providers barrels

All 21 container tests now pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 14:57:38 +03:00
gavrielc a1ce73c376 Merge branch 'v2' into providers
2026-04-18 14:54:54 +03:00
gavrielc 48e4172899 refactor(providers-branch): split opencode test into its own file
Moves the opencode test case out of factory.test.ts into
opencode.factory.test.ts so the /add-opencode skill can install the
opencode bundle as a pure file copy — no in-place merging of the shared
factory.test.ts required. Keeps factory.test.ts identical to v2 trunk.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 14:13:04 +03:00
61 changed files with 3545 additions and 1310 deletions
-161
@@ -1,161 +0,0 @@
---
name: add-codex
description: Use Codex (CLI + AppServer) as the full agent provider — planning, tool orchestration, native compaction, MCP tools, session resume — in place of the Claude Agent SDK. ChatGPT subscription or OPENAI_API_KEY. Per-group via agent_provider. Distinct from using OpenAI as an MCP tool (where Claude remains the planner).
---
# Codex agent provider
NanoClaw runs agents in a long-lived **poll loop** inside the container. The backend is selected with **`AGENT_PROVIDER`** (`claude` | `opencode` | `codex` | `mock`).
Trunk ships with only the `claude` provider baked in. This skill copies the Codex provider files in from the `providers` branch, wires them into the host and container barrels, updates the Dockerfile to install the Codex CLI, and rebuilds the image.
The Codex provider runs `codex app-server` as a child process and speaks JSON-RPC over stdio. That gives it native session resume, streaming events, MCP tool access, and `thread/compact/start` compaction — same feature bar as the Claude Agent SDK, without the Anthropic-only lock-in.
## Install
### Pre-flight
If all of the following are already present, skip to **Configuration**:
- `src/providers/codex.ts`
- `container/agent-runner/src/providers/codex.ts`
- `container/agent-runner/src/providers/codex-app-server.ts`
- `container/agent-runner/src/providers/codex.factory.test.ts`
- `import './codex.js';` line in `src/providers/index.ts`
- `import './codex.js';` line in `container/agent-runner/src/providers/index.ts`
- `ARG CODEX_VERSION` and `"@openai/codex@${CODEX_VERSION}"` in the pnpm global-install block in `container/Dockerfile`
Missing pieces — continue below. All steps are idempotent; re-running is safe.
### 1. Fetch the providers branch
```bash
git fetch origin providers
```
### 2. Copy the Codex source files
Wholesale copies (owned entirely by this skill — user edits to these files won't survive a re-run, as designed):
```bash
git show origin/providers:src/providers/codex.ts > src/providers/codex.ts
git show origin/providers:container/agent-runner/src/providers/codex.ts > container/agent-runner/src/providers/codex.ts
git show origin/providers:container/agent-runner/src/providers/codex-app-server.ts > container/agent-runner/src/providers/codex-app-server.ts
git show origin/providers:container/agent-runner/src/providers/codex.factory.test.ts > container/agent-runner/src/providers/codex.factory.test.ts
```
### 3. Append the self-registration imports
Each barrel gets one line — alphabetical placement keeps diffs small.
`src/providers/index.ts`:
```typescript
import './codex.js';
```
`container/agent-runner/src/providers/index.ts`:
```typescript
import './codex.js';
```
### 4. Add the Codex CLI to the container Dockerfile
Two edits to `container/Dockerfile`, both idempotent (skip if already present):
**(a)** In the "Pin CLI versions" ARG block (around line 18), add after `ARG CLAUDE_CODE_VERSION=...`:
```dockerfile
ARG CODEX_VERSION=0.124.0
```
**(b)** Add a new standalone `RUN` block for the Codex CLI, after the existing per-CLI install blocks (around line 106, right after the `@anthropic-ai/claude-code` block). The Dockerfile splits each global CLI into its own layer for cache granularity — keep that pattern; do not collapse them into a single combined `pnpm install -g` call:
```dockerfile
RUN --mount=type=cache,target=/root/.cache/pnpm \
    pnpm install -g "@openai/codex@${CODEX_VERSION}"
```
Note: **no agent-runner package dependency** — Codex is a CLI binary, not a library. Unlike OpenCode, there's nothing to add to `container/agent-runner/package.json`.
### 5. Build
```bash
pnpm run build # host
pnpm exec tsc -p container/agent-runner/tsconfig.json --noEmit # container typecheck
./container/build.sh # agent image
```
## Configuration
Codex supports two primary auth paths and one experimental BYO-endpoint path. Pick the one that matches your setup.
### Option A — ChatGPT subscription (recommended for individuals)
On the host (not inside the container), run Codex's OAuth login:
```bash
codex login
```
This writes `~/.codex/auth.json` with a subscription token. The host-side Codex provider ([src/providers/codex.ts](../../../src/providers/codex.ts)) copies `auth.json` into a per-session `~/.codex` directory mounted into the container — your host's own Codex CLI is never touched.
No `.env` variables required for this mode.
### Option B — API key (recommended for CI or API billing)
```env
OPENAI_API_KEY=sk-...
CODEX_MODEL=gpt-5.4-mini
```
The host forwards both variables into the container. If both subscription (`auth.json`) and `OPENAI_API_KEY` are present, Codex prefers the subscription.
### Option C — BYO OpenAI-compatible endpoint (experimental)
Codex's built-in `openai` provider honors the `OPENAI_BASE_URL` env var directly. Point it at any OpenAI-compatible endpoint — Groq, Together, self-hosted vLLM, an OpenAI proxy, etc.
```env
OPENAI_API_KEY=...
OPENAI_BASE_URL=https://api.groq.com/openai/v1
CODEX_MODEL=llama-3.3-70b-versatile
```
Codex also ships first-class local-runner flags — `codex --oss --local-provider ollama` or `--local-provider lmstudio` — that auto-detect a local server. To use those inside NanoClaw, set `CODEX_MODEL` to a model your local runner serves and add the corresponding base URL; see the Codex CLI docs for the full `model_provider = oss` configuration.
**Experimental caveat:** tool-calling quality depends on the model and endpoint. Not every OpenAI-compat provider implements the full function-calling spec, and smaller models (< 30B) often struggle with multi-step tool orchestration. Test before committing.
### Per group / per session
Set `"provider": "codex"` in the group's **`container.json`** (`groups/<folder>/container.json`) — the in-container runner reads `provider` from there, not from the DB. The DB columns **`agent_groups.agent_provider`** and **`sessions.agent_provider`** (session overrides group) only drive host-side provider contribution — per-session `~/.codex` mount, `OPENAI_*` / `CODEX_MODEL` env passthrough — and do not propagate into `container.json` at spawn time. Set both, or just edit `container.json`; if they disagree, the runner uses `container.json` and the host-side resolver falls back through session → group → `container.json` → `'claude'`.
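The host-side fallback order (session, then group, then `container.json`, then the `'claude'` default) could be sketched as below. The function name and parameter shape are hypothetical and only illustrate the precedence, not the actual resolver.

```typescript
// Each argument is the provider recorded at that level, if any.
// Empty strings and null are treated as "not set" (an assumption).
function resolveHostProvider(
  sessionProvider?: string | null,
  groupProvider?: string | null,
  containerJsonProvider?: string | null,
): string {
  return sessionProvider || groupProvider || containerJsonProvider || 'claude';
}
```

Note that this covers only the host-side contribution; the in-container runner reads `container.json` directly, which is why the two can disagree.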
`CODEX_MODEL` applies process-wide via `.env`; if you need different models for different groups, set them via `container_config.env` on the group.
Extra MCP servers still come from **`NANOCLAW_MCP_SERVERS`** / `container_config.mcpServers` on the host. The runner merges them into the same `mcpServers` object passed to all providers.
## Operational notes
- **Spawn-per-query:** Codex's app-server is spawned fresh per query invocation, matching the OpenCode pattern. No long-lived daemon to keep healthy across sessions.
- **Per-session `~/.codex` isolation:** each group gets its own copy of the host's `auth.json`. The container can rewrite `config.toml` freely on every wake without touching the host's Codex config.
- **Native compaction:** kicks in automatically at 40K cumulative input tokens between turns, via `thread/compact/start`. If compaction fails, the provider logs and continues uncompacted — no fatal error.
- **Approvals:** auto-accepted inside the container (the container is the sandbox; same posture as Claude/OpenCode).
- **Mid-turn input:** Codex turns don't accept mid-turn messages. Follow-up `push()` calls queue and drain between turns, matching the OpenCode pattern. The poll-loop only pushes between turns anyway, so no messages are dropped.
- **Stale thread recovery:** `isSessionInvalid` matches on stale-thread-ID errors (`thread not found`, `unknown thread`, etc.) so a cold-started app-server can recover cleanly when it sees a stored continuation it no longer has.
## Verify
```bash
grep -q "./codex.js" container/agent-runner/src/providers/index.ts && echo "container barrel: OK"
grep -q "./codex.js" src/providers/index.ts && echo "host barrel: OK"
grep -q "@openai/codex@" container/Dockerfile && echo "Dockerfile install: OK"
(cd container/agent-runner && bun test src/providers/codex.factory.test.ts)
```
After the image rebuild, set `"provider": "codex"` in a test group's `container.json` (and `agent_provider = 'codex'` in the DB for the host-side mounts), then send a message. A successful round-trip looks like:
- `init` event with a stable thread ID as continuation
- One or more `activity` / `progress` events during the turn
- `result` event with the model's reply
If the agent hangs or errors, check that `~/.codex/auth.json` exists on the host (Option A) or that `OPENAI_API_KEY` is being forwarded correctly (Option B): `docker exec` into a running container and run `env | grep -i openai` to confirm.
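The expected event sequence above can be checked mechanically when reading captured output. This is a rough sketch: `SketchEvent` is a hypothetical, simplified event shape (the real `ProviderEvent` type is not shown here), and `isHealthyRoundTrip` is an illustrative name.

```typescript
// Hypothetical, simplified event shape for illustration only.
type SketchEvent =
  | { type: 'init'; continuation: string }
  | { type: 'activity' }
  | { type: 'progress' }
  | { type: 'result'; text: string };

// True when the events follow the documented shape: an `init` carrying a
// thread ID, one or more `activity` / `progress` events, then a `result`.
function isHealthyRoundTrip(events: SketchEvent[]): boolean {
  if (events.length < 3) return false;
  const first = events[0];
  const last = events[events.length - 1];
  if (first.type !== 'init' || first.continuation === '') return false;
  if (last.type !== 'result') return false;
  const middle = events.slice(1, -1);
  return middle.every((e) => e.type === 'activity' || e.type === 'progress');
}

console.log(
  isHealthyRoundTrip([
    { type: 'init', continuation: 'thread-1' },
    { type: 'progress' },
    { type: 'result', text: 'hello' },
  ]),
);
```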
+2 -8
@@ -60,7 +60,7 @@ pnpm run build
1. Go to [api.slack.com/apps](https://api.slack.com/apps) and click **Create New App** > **From scratch**
2. Name it (e.g., "NanoClaw") and select your workspace
3. Go to **OAuth & Permissions** and add Bot Token Scopes:
- `chat:write`, `im:write`, `channels:history`, `groups:history`, `im:history`, `channels:read`, `groups:read`, `users:read`, `reactions:write`
- `chat:write`, `channels:history`, `groups:history`, `im:history`, `channels:read`, `groups:read`, `users:read`, `reactions:write`
4. Click **Install to Workspace** and copy the **Bot User OAuth Token** (`xoxb-...`)
5. Go to **Basic Information** and copy the **Signing Secret**
@@ -76,13 +76,7 @@ pnpm run build
10. Under **Subscribe to bot events**, add:
- `message.channels`, `message.groups`, `message.im`, `app_mention`
11. Click **Save Changes**
### Interactivity
12. Go to **Interactivity & Shortcuts** and toggle **Interactivity** on
13. Set the **Request URL** to the same `https://your-domain/webhook/slack`
14. Click **Save Changes**
15. Slack will show a banner asking you to **reinstall the app** — click it to apply the new settings
12. Slack will show a banner asking you to **reinstall the app** — click it to apply the new event subscriptions
### Configure environment
+1 -11
@@ -186,17 +186,7 @@ launchctl kickstart -k gui/$(id -u)/com.nanoclaw # restart
systemctl --user start|stop|restart nanoclaw
```
## Troubleshooting
Check these first when something goes wrong:
| What | Where |
|------|-------|
| Host logs | `logs/nanoclaw.error.log` first (delivery failures, crash-loop backoff, warnings), then `logs/nanoclaw.log` for the full routing chain |
| Setup logs | `logs/setup.log` (overall), `logs/setup-steps/*.log` (per-step: bootstrap, environment, container, onecli, mounts, service, etc.) |
| Session DBs | `data/v2-sessions/<agent-group>/<session>/`: `inbound.db` (`messages_in`: did the message reach the container?), `outbound.db` (`messages_out`: did the agent produce a response?) |
Note: container logs are lost after the container exits (`--rm` flag). If the agent silently failed inside the container, there's no persistent log to inspect.
Host logs: `logs/nanoclaw.log` (normal) and `logs/nanoclaw.error.log` (errors only — some delivery/approval failures only show up here).
## Supply Chain Security (pnpm)
+15
@@ -0,0 +1,15 @@
You are a NanoClaw agent. Your name, destinations, and message-sending rules are provided in the runtime system prompt at the top of each turn.
## Communication
Be concise. Prefer outcomes over play-by-play; when the work is done, the final message should be about the result.
When you produce a file for the user in the workspace — a document, export, or asset — deliver it with `send_file` in the same turn; announcing without sending is an unfinished reply.
## Workspace
Files you create are saved in `/workspace/agent/`. Use this for notes, research, artifacts, and anything that should persist across turns in this group.
## Conversation History
The `conversations/` folder holds searchable past conversation transcripts or exchange archives for this group. Use it to recall prior context when a request references something that happened before.
+3
@@ -7,6 +7,7 @@
"dependencies": {
"@anthropic-ai/claude-agent-sdk": "^0.2.116",
"@modelcontextprotocol/sdk": "^1.12.1",
"@opencode-ai/sdk": "^1.4.3",
"cron-parser": "^5.0.0",
"zod": "^4.0.0",
},
@@ -44,6 +45,8 @@
"@modelcontextprotocol/sdk": ["@modelcontextprotocol/sdk@1.29.0", "", { "dependencies": { "@hono/node-server": "^1.19.9", "ajv": "^8.17.1", "ajv-formats": "^3.0.1", "content-type": "^1.0.5", "cors": "^2.8.5", "cross-spawn": "^7.0.5", "eventsource": "^3.0.2", "eventsource-parser": "^3.0.0", "express": "^5.2.1", "express-rate-limit": "^8.2.1", "hono": "^4.11.4", "jose": "^6.1.3", "json-schema-typed": "^8.0.2", "pkce-challenge": "^5.0.0", "raw-body": "^3.0.0", "zod": "^3.25 || ^4.0", "zod-to-json-schema": "^3.25.1" }, "peerDependencies": { "@cfworker/json-schema": "^4.1.1" }, "optionalPeers": ["@cfworker/json-schema"] }, "sha512-zo37mZA9hJWpULgkRpowewez1y6ML5GsXJPY8FI0tBBCd77HEvza4jDqRKOXgHNn867PVGCyTdzqpz0izu5ZjQ=="],
"@opencode-ai/sdk": ["@opencode-ai/sdk@1.4.11", "", { "dependencies": { "cross-spawn": "7.0.6" } }, "sha512-EJxSfc7D/dda/vrw8zQe4g7yVTxERktvb5SvIBlGBnKYQJGOgo9RyA/1EL3l208rHeo6jm1sdrAF0E6o/k94ug=="],
"@types/bun": ["@types/bun@1.3.12", "", { "dependencies": { "bun-types": "1.3.12" } }, "sha512-DBv81elK+/VSwXHDlnH3Qduw+KxkTIWi7TXkAeh24zpi5l0B2kUg9Ga3tb4nJaPcOFswflgi/yAvMVBPrxMB+A=="],
"@types/node": ["@types/node@22.19.17", "", { "dependencies": { "undici-types": "~6.21.0" } }, "sha512-wGdMcf+vPYM6jikpS/qhg6WiqSV/OhG+jeeHT/KlVqxYfD40iYJf9/AE1uQxVWFvU7MipKRkRv8NSHiCGgPr8Q=="],
+1
@@ -11,6 +11,7 @@
"dependencies": {
"@anthropic-ai/claude-agent-sdk": "^0.2.116",
"@modelcontextprotocol/sdk": "^1.12.1",
"@opencode-ai/sdk": "^1.4.3",
"cron-parser": "^5.0.0",
"zod": "^4.0.0"
},
+11 -57
@@ -21,37 +21,6 @@ function generateId(): string {
return `msg-${Date.now()}-${Math.random().toString(36).slice(2, 8)}`;
}
/**
* Find the first matching substitution rule from the provider, if any.
* Returns the rule (caller logs the name) or null when nothing matches —
* there is intentionally no fallback message.
*/
function findSubstitution(
text: string,
provider: AgentProvider,
): { name: string; replace: string } | null {
for (const rule of provider.errorSubstitutions ?? []) {
if (rule.test.test(text)) return { name: rule.name, replace: rule.replace };
}
return null;
}
function writeSubstitutedMessage(
routing: RoutingContext,
ruleName: string,
text: string,
): void {
log(`Substituting output via rule "${ruleName}"`);
writeMessageOut({
id: generateId(),
kind: 'chat',
platform_id: routing.platformId,
channel_type: routing.channelType,
thread_id: routing.threadId,
content: JSON.stringify({ text }),
});
}
export interface PollLoopConfig {
provider: AgentProvider;
/**
@@ -202,7 +171,7 @@ export async function runPollLoop(config: PollLoopConfig): Promise<void> {
const skippedSet = new Set(skipped);
const processingIds = ids.filter((id) => !commandIds.includes(id) && !skippedSet.has(id));
try {
const result = await processQuery(query, routing, processingIds, config.provider, config.providerName);
const result = await processQuery(query, routing, processingIds, config.providerName);
if (result.continuation && result.continuation !== continuation) {
continuation = result.continuation;
setContinuation(config.providerName, continuation);
@@ -220,22 +189,15 @@ export async function runPollLoop(config: PollLoopConfig): Promise<void> {
clearContinuation(config.providerName);
}
// Write error response so the user knows something went wrong.
// Apply provider-defined substitutions first — e.g. swap "Please run
// /login" for an actionable host-aware message.
const sub = findSubstitution(errMsg, config.provider);
if (sub) {
writeSubstitutedMessage(routing, sub.name, sub.replace);
} else {
writeMessageOut({
id: generateId(),
kind: 'chat',
platform_id: routing.platformId,
channel_type: routing.channelType,
thread_id: routing.threadId,
content: JSON.stringify({ text: `Error: ${errMsg}` }),
});
}
// Write error response so the user knows something went wrong
writeMessageOut({
id: generateId(),
kind: 'chat',
platform_id: routing.platformId,
channel_type: routing.channelType,
thread_id: routing.threadId,
content: JSON.stringify({ text: `Error: ${errMsg}` }),
});
}
// Ensure completed even if processQuery ended without a result event
@@ -287,7 +249,6 @@ async function processQuery(
query: AgentQuery,
routing: RoutingContext,
initialBatchIds: string[],
provider: AgentProvider,
providerName: string,
): Promise<QueryResult> {
let queryContinuation: string | undefined;
@@ -349,14 +310,7 @@ async function processQuery(
// at all — either way the turn is finished.
markCompleted(initialBatchIds);
if (event.text) {
// Apply provider-defined substitutions before dispatch so banners
// like "Please run /login" surface as actionable host-aware text.
const sub = findSubstitution(event.text, provider);
if (sub) {
writeSubstitutedMessage(routing, sub.name, sub.replace);
} else {
dispatchResultText(event.text, routing);
}
dispatchResultText(event.text, routing);
}
}
}
@@ -1,47 +0,0 @@
import { describe, it, expect } from 'bun:test';
import { ClaudeProvider } from './claude.js';
describe('ClaudeProvider.errorSubstitutions', () => {
const provider = new ClaudeProvider();
const findRule = (name: string) => provider.errorSubstitutions.find((r) => r.name === name);
describe('auth-required', () => {
const rule = findRule('auth-required')!;
it('exists', () => {
expect(rule).toBeDefined();
});
it('matches the "Not logged in" banner', () => {
expect(rule.test.test('Not logged in · Please run /login')).toBe(true);
});
it('matches the "Invalid API key" banner', () => {
expect(rule.test.test('Invalid API key · Please run /login')).toBe(true);
});
it('matches with trailing content after the banner', () => {
expect(rule.test.test('Not logged in · Please run /login\n\nstack trace …')).toBe(true);
});
it('does not match when the agent quotes the phrase mid-sentence', () => {
const quoted = "The error 'Invalid API key · Please run /login' means your auth has expired.";
expect(rule.test.test(quoted)).toBe(false);
});
it('does not match when the phrase is wrapped in quotes at the start', () => {
const prose = '"Not logged in · Please run /login" is a Claude Code error.';
expect(rule.test.test(prose)).toBe(false);
});
it('does not match a different separator', () => {
expect(rule.test.test('Not logged in - Please run /login')).toBe(false);
});
it('replace text names the operator remediation', () => {
expect(rule.replace).toContain('Anthropic credentials');
expect(rule.replace).toContain('claude');
});
});
});
+2 -33
@@ -5,15 +5,7 @@ import { query as sdkQuery, type HookCallback, type PreCompactHookInput } from '
import { clearContainerToolInFlight, setContainerToolInFlight } from '../db/connection.js';
import { registerProvider } from './provider-registry.js';
import type {
AgentProvider,
AgentQuery,
ErrorSubstitution,
McpServerConfig,
ProviderEvent,
ProviderOptions,
QueryInput,
} from './types.js';
import type { AgentProvider, AgentQuery, McpServerConfig, ProviderEvent, ProviderOptions, QueryInput } from './types.js';
function log(msg: string): void {
console.error(`[claude-provider] ${msg}`);
@@ -234,12 +226,8 @@ function createPreCompactHook(assistantName?: string): HookCallback {
/**
* Claude Code auto-compacts context at this window (tokens). Kept here so
* the generic bootstrap doesn't need to know about Claude-specific env vars.
*
* Operator override: set CLAUDE_CODE_AUTO_COMPACT_WINDOW in the host env to
* raise or lower the threshold without editing source — useful when running
* with a 1M-context model variant or when emergency-tuning a deployment.
*/
const CLAUDE_CODE_AUTO_COMPACT_WINDOW = process.env.CLAUDE_CODE_AUTO_COMPACT_WINDOW || '165000';
const CLAUDE_CODE_AUTO_COMPACT_WINDOW = '165000';
/**
* Stale-session detection. Matches Claude Code's error text when a
@@ -248,27 +236,8 @@ const CLAUDE_CODE_AUTO_COMPACT_WINDOW = process.env.CLAUDE_CODE_AUTO_COMPACT_WIN
*/
const STALE_SESSION_RE = /no conversation found|ENOENT.*\.jsonl|session.*not found/i;
/**
* Provider-specific output substitutions. Each rule is a `(test, replace)`
* pair; the first match wins. The poll-loop applies these to result text
* and to error text before delivery so users see actionable host-aware
* messages instead of raw CLI banners they can't act on from chat.
*/
const ERROR_SUBSTITUTIONS: readonly ErrorSubstitution[] = [
{
name: 'auth-required',
// Anchored to start-of-string with the specific `·` separator (U+00B7)
// the CLI emits, so an agent that quotes the phrase verbatim mid-sentence
// in a normal reply doesn't trip the rule.
test: /^(Not logged in|Invalid API key)\s*·\s*Please run \/login/,
replace:
"I can't reach my Anthropic credentials right now. The operator running NanoClaw needs to re-run setup, or run `claude` in the project directory on the machine I'm running on.",
},
];
export class ClaudeProvider implements AgentProvider {
readonly supportsNativeSlashCommands = true;
readonly errorSubstitutions = ERROR_SUBSTITUTIONS;
private assistantName?: string;
private mcpServers: Record<string, McpServerConfig>;
@@ -0,0 +1,162 @@
import { describe, expect, it, afterEach } from 'bun:test';
import fs from 'fs';
import os from 'os';
import path from 'path';
import {
type AppServer,
attachCodexAutoApproval,
buildCodexProcessEnv,
tomlBasicString,
writeCodexConfigToml,
} from './codex-app-server.js';
let tmpHome: string | null = null;
const originalHome = process.env.HOME;
afterEach(() => {
process.env.HOME = originalHome;
if (tmpHome) {
fs.rmSync(tmpHome, { recursive: true, force: true });
tmpHome = null;
}
});
describe('Codex config TOML', () => {
it('escapes basic strings', () => {
expect(tomlBasicString('a "quoted" \\\\ value')).toBe('"a \\"quoted\\" \\\\\\\\ value"');
});
it('rejects newlines', () => {
expect(() => tomlBasicString('bad\nvalue')).toThrow(/newline/);
});
it('hardcodes danger-full-access + never and writes model, effort, and MCP servers', () => {
tmpHome = fs.mkdtempSync(path.join(os.tmpdir(), 'codex-home-'));
process.env.HOME = tmpHome;
writeCodexConfigToml(
{
nanoclaw: {
command: 'bun',
args: ['run', '/app/src/mcp-tools/index.ts'],
env: { FOO: 'bar' },
},
},
{ model: 'gpt-5', effort: 'medium' },
);
const content = fs.readFileSync(path.join(tmpHome, '.codex', 'config.toml'), 'utf-8');
expect(content).toContain('sandbox_mode = "danger-full-access"');
expect(content).toContain('approval_policy = "never"');
expect(content).toContain('project_doc_max_bytes = 32768');
expect(content).toContain('model = "gpt-5"');
expect(content).toContain('model_reasoning_effort = "medium"');
expect(content).not.toContain('[sandbox_workspace_write]');
expect(content).not.toContain('writable_roots =');
expect(content).toContain('[mcp_servers.nanoclaw]');
expect(content).toContain('command = "bun"');
expect(content).toContain('args = ["run", "/app/src/mcp-tools/index.ts"]');
expect(content).toContain('[mcp_servers.nanoclaw.env]');
expect(content).toContain('FOO = "bar"');
});
});
describe('Codex auto-approval', () => {
// NanoClaw (container isolation + OneCLI) is the boundary, so the handler accepts
// every request unconditionally — even paths/commands a sandbox policy would refuse.
it('grants full filesystem + network for permission requests', () => {
const { server, writes } = fakeServer();
attachCodexAutoApproval(server);
server.serverRequestHandlers[0]({
id: 1,
method: 'item/permissions/requestApproval',
params: { permissions: { fileSystem: { read: ['/workspace/agent'], write: ['/workspace/agent'] } } },
});
const result = JSON.parse(writes[0]).result as {
permissions: { fileSystem: { read: string[]; write: string[] }; network: { enabled: boolean } };
scope: string;
};
expect(result.scope).toBe('turn');
expect(result.permissions.fileSystem.read).toEqual(['/']);
expect(result.permissions.fileSystem.write).toEqual(['/']);
expect(result.permissions.network.enabled).toBe(true);
});
it('accepts file-change and command-exec approvals regardless of path', () => {
const { server, writes } = fakeServer();
attachCodexAutoApproval(server);
server.serverRequestHandlers[0]({ id: 2, method: 'item/fileChange/requestApproval', params: { grantRoot: '/etc' } });
server.serverRequestHandlers[0]({
id: 3,
method: 'item/commandExecution/requestApproval',
params: { command: 'rm -rf /', cwd: '/' },
});
expect(JSON.parse(writes[0]).result).toEqual({ decision: 'accept' });
expect(JSON.parse(writes[1]).result).toEqual({ decision: 'accept' });
});
it('approves legacy patch and command-exec approvals regardless of path', () => {
const { server, writes } = fakeServer();
attachCodexAutoApproval(server);
server.serverRequestHandlers[0]({
id: 4,
method: 'applyPatchApproval',
params: { fileChanges: { '/etc/passwd': {} } },
});
server.serverRequestHandlers[0]({ id: 5, method: 'execCommandApproval', params: { command: 'rm -rf /', cwd: '/' } });
expect(JSON.parse(writes[0]).result).toEqual({ decision: 'approved' });
expect(JSON.parse(writes[1]).result).toEqual({ decision: 'approved' });
});
it('fails closed for unknown server requests', () => {
const { server, writes } = fakeServer();
attachCodexAutoApproval(server);
server.serverRequestHandlers[0]({ id: 6, method: 'new/unknown/request' });
const response = JSON.parse(writes[0]);
expect(response.error.message).toContain('Unhandled Codex app-server request');
});
});
describe('Codex process env', () => {
it('forwards proxy/runtime env without leaking secret-like host env', () => {
const env = buildCodexProcessEnv({
PATH: '/bin',
HOME: '/home/node',
CODEX_HOME: '/home/node/.codex',
HTTPS_PROXY: 'http://proxy',
OPENAI_API_KEY: 'sk-test',
ONECLI_API_KEY: 'onecli-secret',
SOME_TOKEN: 'token',
});
expect(env.PATH).toBe('/bin');
expect(env.HOME).toBe('/home/node');
expect(env.CODEX_HOME).toBe('/home/node/.codex');
expect(env.HTTPS_PROXY).toBe('http://proxy');
expect(env.OPENAI_API_KEY).toBeUndefined();
expect(env.ONECLI_API_KEY).toBeUndefined();
expect(env.SOME_TOKEN).toBeUndefined();
});
});
function fakeServer(): { server: AppServer; writes: string[] } {
const writes: string[] = [];
const server = {
process: { stdin: { write: (line: string) => writes.push(line) } },
readline: { close: () => {} },
pending: new Map(),
notificationHandlers: [],
exitHandlers: [],
serverRequestHandlers: [],
} as unknown as AppServer;
return { server, writes };
}
@@ -0,0 +1,441 @@
import fs from 'fs';
import path from 'path';
import { spawn, type ChildProcess } from 'child_process';
import { createInterface, type Interface as ReadlineInterface } from 'readline';
// Cap Codex's project-doc loading (AGENTS.md). The host-side composer
// (src/providers/codex-agents-md.ts) enforces the same cap at compose time —
// host and container share no modules, so the constant lives in both.
const CODEX_PROJECT_DOC_MAX_BYTES = 32 * 1024;
function log(msg: string): void {
console.error(`[codex-app-server] ${msg}`);
}
const INIT_TIMEOUT_MS = 30_000;
export const STALE_THREAD_RE = /thread\s+not\s+found|unknown\s+thread|thread[_\s]id|no such thread/i;
let nextRequestId = 1;
export interface JsonRpcResponse {
id: number | string;
result?: unknown;
error?: { code: number; message: string; data?: unknown };
}
export interface JsonRpcNotification {
method: string;
params?: Record<string, unknown>;
}
export interface JsonRpcServerRequest {
id: number | string;
method: string;
params?: Record<string, unknown>;
}
type JsonRpcMessage = JsonRpcResponse | JsonRpcNotification | JsonRpcServerRequest;
export interface AppServer {
process: ChildProcess;
readline: ReadlineInterface;
pending: Map<number | string, { resolve: (r: JsonRpcResponse) => void; reject: (e: Error) => void }>;
notificationHandlers: Array<(n: JsonRpcNotification) => void>;
serverRequestHandlers: Array<(r: JsonRpcServerRequest) => void>;
/**
* Fired when the app-server process dies (exit or spawn error). Pending
* request/response pairs are rejected separately via failPending — but a
* turn in flight has NO pending request (turn/start already resolved); it
* is parked on a notification waker that a dead process will never kick.
* Without these handlers a mid-turn crash surfaces as a 10-minute turn
* timeout instead of the real exit code, after the --rm container has
* already taken the server's stderr with it.
*/
exitHandlers: Array<(err: Error) => void>;
}
export interface CodexMcpServer {
command: string;
args?: string[];
env?: Record<string, string>;
}
export type CodexReasoningEffort = 'none' | 'minimal' | 'low' | 'medium' | 'high' | 'xhigh';
// Codex runs unrestricted inside the container. NanoClaw's container isolation and
// the OneCLI allow-list are the security boundary — not Codex's own sandbox/approval
// primitives (which can't run here anyway: workspace-write/read-only need user
// namespaces, which the agent containers deny). Both are hardcoded as instance-level
// defaults in config.toml; threads and turns inherit them, never override them.
const CODEX_SANDBOX_MODE = 'danger-full-access';
const CODEX_APPROVAL_POLICY = 'never';
const CODEX_ENV_ALLOWLIST = new Set([
'ALL_PROXY',
'CURL_CA_BUNDLE',
'GIT_SSL_CAINFO',
'HOME',
'HTTP_PROXY',
'HTTPS_PROXY',
'LANG',
'LC_ALL',
'NODE_EXTRA_CA_CERTS',
'NO_PROXY',
'PATH',
'PNPM_HOME',
'REQUESTS_CA_BUNDLE',
'SSL_CERT_DIR',
'SSL_CERT_FILE',
'TEMP',
'TERM',
'TMP',
'TMPDIR',
'TZ',
'USER',
'all_proxy',
'http_proxy',
'https_proxy',
'no_proxy',
'CODEX_HOME',
]);
export interface ThreadParams {
model?: string;
cwd: string;
baseInstructions?: string;
developerInstructions?: string;
}
export interface TurnParams {
threadId: string;
inputText: string;
model?: string;
effort?: string;
cwd?: string;
}
export function spawnCodexAppServer(): AppServer {
const args = ['app-server', '--listen', 'stdio://'];
log(`Spawning: codex ${args.join(' ')}`);
const proc = spawn('codex', args, {
stdio: ['pipe', 'pipe', 'pipe'],
env: buildCodexProcessEnv(process.env),
});
const rl = createInterface({ input: proc.stdout! });
const server: AppServer = {
process: proc,
readline: rl,
pending: new Map(),
notificationHandlers: [],
exitHandlers: [],
serverRequestHandlers: [],
};
proc.stderr?.on('data', (chunk: Buffer) => {
const text = chunk.toString().trim();
if (text) log(`[stderr] ${text}`);
});
rl.on('line', (line: string) => {
if (!line.trim()) return;
let msg: JsonRpcMessage;
try {
msg = JSON.parse(line) as JsonRpcMessage;
} catch {
log(`[parse-error] ${line.slice(0, 200)}`);
return;
}
if (isResponse(msg)) {
const handler = server.pending.get(msg.id);
if (handler) {
server.pending.delete(msg.id);
handler.resolve(msg);
}
} else if (isServerRequest(msg)) {
for (const h of server.serverRequestHandlers) h(msg);
} else if ('method' in msg) {
for (const h of server.notificationHandlers) h(msg as JsonRpcNotification);
}
});
const failPending = (err: Error): void => {
for (const [, handler] of server.pending) handler.reject(err);
server.pending.clear();
};
proc.on('error', (err) => {
log(`[process-error] ${err.message}`);
failPending(err);
for (const h of [...server.exitHandlers]) h(err);
});
proc.on('exit', (code, signal) => {
log(`[exit] code=${code} signal=${signal}`);
const err = new Error(`Codex app-server exited: code=${code} signal=${signal}`);
failPending(err);
for (const h of [...server.exitHandlers]) h(err);
});
return server;
}
export function sendCodexRequest(
server: AppServer,
method: string,
params?: Record<string, unknown>,
timeoutMs = 60_000,
): Promise<JsonRpcResponse> {
const id = nextRequestId++;
const req = params === undefined ? { id, method } : { id, method, params };
const line = JSON.stringify(req) + '\n';
return new Promise<JsonRpcResponse>((resolve, reject) => {
const timer = setTimeout(() => {
server.pending.delete(id);
reject(new Error(`Timeout waiting for ${method} response (${timeoutMs}ms)`));
}, timeoutMs);
server.pending.set(id, {
resolve: (r) => {
clearTimeout(timer);
resolve(r);
},
reject: (e) => {
clearTimeout(timer);
reject(e);
},
});
try {
server.process.stdin!.write(line);
} catch (err) {
clearTimeout(timer);
server.pending.delete(id);
reject(err instanceof Error ? err : new Error(String(err)));
}
});
}
export function sendCodexNotification(server: AppServer, method: string, params?: Record<string, unknown>): void {
const line = JSON.stringify(params === undefined ? { method } : { method, params }) + '\n';
server.process.stdin!.write(line);
}
export function sendCodexResponse(server: AppServer, id: number | string, result: unknown): void {
try {
server.process.stdin!.write(JSON.stringify({ id, result }) + '\n');
} catch (err) {
log(`[send-error] response id=${id}: ${err instanceof Error ? err.message : String(err)}`);
}
}
export function killCodexAppServer(server: AppServer): void {
try {
server.readline.close();
server.process.kill('SIGTERM');
} catch {
/* ignore */
}
}
export async function initializeCodexAppServer(server: AppServer): Promise<void> {
const resp = await sendCodexRequest(
server,
'initialize',
{
clientInfo: { name: 'nanoclaw', title: 'NanoClaw', version: '2.0' },
capabilities: { experimentalApi: true },
},
INIT_TIMEOUT_MS,
);
if (resp.error) throw new Error(`initialize failed: ${resp.error.message}`);
sendCodexNotification(server, 'initialized');
}
export async function startOrResumeCodexThread(
server: AppServer,
threadId: string | undefined,
params: ThreadParams,
): Promise<string> {
const baseParams = {
model: params.model,
cwd: params.cwd,
approvalPolicy: CODEX_APPROVAL_POLICY,
sandbox: CODEX_SANDBOX_MODE,
baseInstructions: params.baseInstructions,
developerInstructions: params.developerInstructions,
personality: 'friendly',
sessionStartSource: 'startup',
persistExtendedHistory: false,
};
if (threadId) {
const resp = await sendCodexRequest(server, 'thread/resume', {
threadId,
...baseParams,
excludeTurns: true,
});
if (!resp.error) return threadId;
if (!STALE_THREAD_RE.test(resp.error.message)) {
throw new Error(`thread/resume failed: ${resp.error.message}`);
}
log(`Stale thread ${threadId}; starting fresh thread.`);
}
const resp = await sendCodexRequest(server, 'thread/start', {
...baseParams,
experimentalRawEvents: false,
});
if (resp.error) throw new Error(`thread/start failed: ${resp.error.message}`);
const result = resp.result as { thread?: { id?: string } } | undefined;
const newThreadId = result?.thread?.id;
if (!newThreadId) throw new Error('thread/start response missing thread ID');
return newThreadId;
}
export async function startCodexTurn(server: AppServer, params: TurnParams): Promise<string> {
const resp = await sendCodexRequest(server, 'turn/start', {
threadId: params.threadId,
input: [{ type: 'text', text: params.inputText, text_elements: [] }],
model: params.model,
effort: params.effort,
cwd: params.cwd,
});
if (resp.error) throw new Error(`turn/start failed: ${resp.error.message}`);
const result = resp.result as { turn?: { id?: string } } | undefined;
const turnId = result?.turn?.id;
if (!turnId) throw new Error('turn/start response missing turn ID');
return turnId;
}
export async function steerCodexTurn(
server: AppServer,
threadId: string,
turnId: string,
inputText: string,
): Promise<void> {
const resp = await sendCodexRequest(server, 'turn/steer', {
threadId,
expectedTurnId: turnId,
input: [{ type: 'text', text: inputText, text_elements: [] }],
});
if (resp.error) throw new Error(`turn/steer failed: ${resp.error.message}`);
}
export async function interruptCodexTurn(server: AppServer, threadId: string, turnId: string): Promise<void> {
const resp = await sendCodexRequest(server, 'turn/interrupt', { threadId, turnId }, 10_000);
if (resp.error) throw new Error(`turn/interrupt failed: ${resp.error.message}`);
}
// With approval_policy=never the command/patch approval requests don't fire, but the
// app-server still sends a few non-approval server→client requests (permission
// negotiation, MCP elicitations, tool calls) that must be answered or the turn hangs.
// NanoClaw is the boundary, so accept/grant everything.
export function attachCodexAutoApproval(server: AppServer): void {
server.serverRequestHandlers.push((req) => {
switch (req.method) {
case 'item/commandExecution/requestApproval':
case 'item/fileChange/requestApproval':
sendCodexResponse(server, req.id, { decision: 'accept' });
break;
case 'applyPatchApproval':
case 'execCommandApproval':
sendCodexResponse(server, req.id, { decision: 'approved' });
break;
case 'item/permissions/requestApproval':
sendCodexResponse(server, req.id, {
permissions: { fileSystem: { read: ['/'], write: ['/'] }, network: { enabled: true } },
scope: 'turn',
strictAutoReview: true,
});
break;
case 'item/tool/requestUserInput':
sendCodexResponse(server, req.id, { answers: {} });
break;
case 'mcpServer/elicitation/request':
sendCodexResponse(server, req.id, { action: 'cancel', content: null, _meta: null });
break;
case 'item/tool/call':
sendCodexResponse(server, req.id, { success: false, contentItems: [] });
break;
default:
sendCodexError(server, req.id, `Unhandled Codex app-server request: ${req.method}`);
break;
}
});
}
export function writeCodexConfigToml(
servers: Record<string, CodexMcpServer>,
opts: { model?: string; effort?: string } = {},
): void {
const codexConfigDir = path.join(process.env.HOME || '/home/node', '.codex');
fs.mkdirSync(codexConfigDir, { recursive: true });
const configTomlPath = path.join(codexConfigDir, 'config.toml');
// Instance-level defaults the app-server reads on startup; threads/turns inherit them.
const lines: string[] = [
`sandbox_mode = ${tomlBasicString(CODEX_SANDBOX_MODE)}`,
`approval_policy = ${tomlBasicString(CODEX_APPROVAL_POLICY)}`,
`project_doc_max_bytes = ${CODEX_PROJECT_DOC_MAX_BYTES}`,
];
if (opts.model) lines.push(`model = ${tomlBasicString(opts.model)}`);
if (opts.effort) lines.push(`model_reasoning_effort = ${tomlBasicString(opts.effort)}`);
lines.push('');
for (const [name, config] of Object.entries(servers)) {
lines.push(`[mcp_servers.${name}]`);
lines.push(`command = ${tomlBasicString(config.command)}`);
if (config.args && config.args.length > 0) {
lines.push(`args = [${config.args.map(tomlBasicString).join(', ')}]`);
}
if (config.env && Object.keys(config.env).length > 0) {
lines.push(`[mcp_servers.${name}.env]`);
for (const [key, value] of Object.entries(config.env)) {
lines.push(`${key} = ${tomlBasicString(value)}`);
}
}
lines.push('');
}
fs.writeFileSync(configTomlPath, lines.join('\n'));
}
export function buildCodexProcessEnv(env: NodeJS.ProcessEnv): NodeJS.ProcessEnv {
const next: NodeJS.ProcessEnv = {};
for (const key of CODEX_ENV_ALLOWLIST) {
const value = env[key];
if (value !== undefined) next[key] = value;
}
if (!next.CODEX_HOME) next.CODEX_HOME = next.HOME ? path.join(next.HOME, '.codex') : '/home/node/.codex';
if (!next.HOME) next.HOME = '/home/node';
return next;
}
export function tomlBasicString(value: string): string {
if (value.includes('\n') || value.includes('\r')) {
throw new Error(`MCP config value contains newline: ${JSON.stringify(value.slice(0, 40))}`);
}
return `"${value.replace(/\\/g, '\\\\').replace(/"/g, '\\"')}"`;
}
function sendCodexError(server: AppServer, id: number | string, message: string, data?: unknown): void {
try {
server.process.stdin!.write(JSON.stringify({ id, error: { code: -32000, message, data } }) + '\n');
} catch (err) {
log(`[send-error] error id=${id}: ${err instanceof Error ? err.message : String(err)}`);
}
}
function isResponse(msg: JsonRpcMessage): msg is JsonRpcResponse {
return 'id' in msg && ('result' in msg || 'error' in msg) && !('method' in msg);
}
function isServerRequest(msg: JsonRpcMessage): msg is JsonRpcServerRequest {
return 'id' in msg && 'method' in msg;
}
@@ -0,0 +1,39 @@
// Structural guard for the Codex CLI install in container/cli-tools.json.
//
// @openai/codex is a CLI *binary* installed from the global-CLI manifest (a
// json-merge seam), not an importable package, so the barrel-driven
// registration tests cannot see it. This test reads the real cli-tools.json
// and asserts the @openai/codex entry is present and pinned to an exact
// version. It goes red if the manifest entry is dropped or unpinned.
//
// Runs under bun (same suite as the container registration test):
// cd container/agent-runner && bun test src/providers/codex-cli-tools.test.ts
import { existsSync, readFileSync } from 'fs';
import path from 'path';
import { describe, it, expect } from 'bun:test';
// container/agent-runner/src/providers/ -> container/cli-tools.json
const MANIFEST = path.join(import.meta.dir, '..', '..', '..', 'cli-tools.json');
const manifestPresent = existsSync(MANIFEST);
// Guard the read: it runs at module load, before `describe.skipIf` can skip
// anything, so it has to tolerate the bare-branch (no manifest) case.
const tools: Array<{ name: string; version: string }> = manifestPresent
? JSON.parse(readFileSync(MANIFEST, 'utf8'))
: [];
const codex = tools.find((t) => t.name === '@openai/codex');
// cli-tools.json is a trunk file; on the bare providers branch it isn't present,
// so skip there. In an installed tree (trunk + this payload) it must carry the
// pinned @openai/codex entry.
describe.skipIf(!manifestPresent)('container/cli-tools.json codex CLI install', () => {
it('includes the @openai/codex entry', () => {
expect(codex).toBeDefined();
});
it('pins it to an exact semver (no latest, no ranges)', () => {
expect(codex?.version).toMatch(/^\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.-]+)?$/);
});
});
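To make the pinning rule in the test above concrete, here is a small sketch that runs the same pattern against a few version strings. The sample versions are made up for illustration and are not taken from `cli-tools.json`.

```typescript
// Same pattern as the test above: MAJOR.MINOR.PATCH with an optional
// pre-release or build suffix, and nothing else.
const EXACT_SEMVER = /^\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.-]+)?$/;

function isPinned(version: string): boolean {
  return EXACT_SEMVER.test(version);
}

// Illustrative inputs only.
const samples = ['0.42.0', '1.2.3-beta.1', 'latest', '^1.2.3', '~1.2.3', '1.2', '1.x'];
for (const v of samples) {
  console.log(`${v} -> ${isPinned(v) ? 'pinned' : 'rejected'}`);
}
```

Dist-tags such as `latest` and range operators such as `^` or `~` fail the match, which is what makes the test go red when the entry stops being an exact pin.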
@@ -0,0 +1,22 @@
/**
* Integration test for the codex provider's CONTAINER-side reach-in: the self-registration
* import in container/agent-runner/src/providers/index.ts. Importing the barrel runs
* codex.ts's top-level registerProvider('codex', …); without that import line
* createProvider('codex') throws 'Unknown provider' at runtime.
*
* Behavioral, not structural, and BARREL-ONLY: it imports the real barrel (./index.js),
* never ./codex.js directly, then asserts listProviderNames() contains the provider. The
* existing codex.factory.test.ts imports ./codex.js directly, so it self-registers and
* stays GREEN when the barrel line is deleted — a unit test, not a registration guard.
* This goes red if the barrel import is deleted/drifts or the barrel fails to evaluate.
*
* codex uses the @openai/codex CLI *binary* (not an importable package), so this test
* does not guard that dependency; the Dockerfile install line is guarded structurally
* and by the container build (see the skill validate step).
*/
import { describe, it, expect } from 'bun:test';
import { listProviderNames } from './provider-registry.js';
import './index.js'; // the real container provider barrel — triggers each provider's registerProvider()
describe('codex provider registration', () => {
it('registers codex via the provider barrel', () => {
expect(listProviderNames()).toContain('codex');
});
});
@@ -0,0 +1,17 @@
import { describe, expect, it } from 'bun:test';
import { CodexProvider } from './codex.js';
describe('CodexProvider', () => {
it('rejects unsupported reasoning effort values', () => {
expect(() => new CodexProvider({ effort: 'max' })).toThrow(/Unsupported Codex reasoning effort/);
});
it('normalizes supported reasoning effort values', () => {
expect(new CodexProvider({ effort: 'HIGH' })).toBeInstanceOf(CodexProvider);
});
it('accepts supported reasoning effort values', () => {
expect(new CodexProvider({ effort: 'xhigh' })).toBeInstanceOf(CodexProvider);
});
});
@@ -0,0 +1,419 @@
import fs from 'fs';
import path from 'path';
import { registerProvider } from './provider-registry.js';
import type {
AgentProvider,
AgentQuery,
McpServerConfig,
ProviderEvent,
ProviderExchange,
ProviderOptions,
QueryInput,
} from './types.js';
import { archiveProviderExchange } from './exchange-archive.js';
import {
type AppServer,
type CodexReasoningEffort,
type JsonRpcNotification,
STALE_THREAD_RE,
attachCodexAutoApproval,
initializeCodexAppServer,
interruptCodexTurn,
killCodexAppServer,
spawnCodexAppServer,
startCodexTurn,
startOrResumeCodexThread,
steerCodexTurn,
writeCodexConfigToml,
} from './codex-app-server.js';
const TURN_TIMEOUT_MS = 10 * 60 * 1000;
const SUPPORTED_EFFORTS = new Set<CodexReasoningEffort>(['none', 'minimal', 'low', 'medium', 'high', 'xhigh']);
export interface CodexRuntimeDeps {
writeCodexConfigToml: typeof writeCodexConfigToml;
spawnCodexAppServer: typeof spawnCodexAppServer;
attachCodexAutoApproval: typeof attachCodexAutoApproval;
initializeCodexAppServer: typeof initializeCodexAppServer;
startOrResumeCodexThread: typeof startOrResumeCodexThread;
startCodexTurn: typeof startCodexTurn;
steerCodexTurn: typeof steerCodexTurn;
interruptCodexTurn: typeof interruptCodexTurn;
killCodexAppServer: typeof killCodexAppServer;
}
const defaultCodexRuntimeDeps: CodexRuntimeDeps = {
writeCodexConfigToml,
spawnCodexAppServer,
attachCodexAutoApproval,
initializeCodexAppServer,
startOrResumeCodexThread,
startCodexTurn,
steerCodexTurn,
interruptCodexTurn,
killCodexAppServer,
};
function classifyError(message: string): string | undefined {
if (/auth|api key|unauthorized|login|credential/i.test(message)) return 'auth';
if (/quota|rate limit|insufficient|billing|credit/i.test(message)) return 'quota';
if (/sandbox|permission|denied/i.test(message)) return 'sandbox';
if (/thread|conversation|session/i.test(message)) return 'stale-session';
return undefined;
}
function normalizeEffort(effort: string | undefined): CodexReasoningEffort | undefined {
const normalized = effort?.trim().toLowerCase();
if (!normalized) return undefined;
if (!SUPPORTED_EFFORTS.has(normalized as CodexReasoningEffort)) {
throw new Error(`Unsupported Codex reasoning effort: ${effort}`);
}
return normalized as CodexReasoningEffort;
}
export class CodexProvider implements AgentProvider {
readonly supportsNativeSlashCommands = false;
// Codex has no native NanoClaw memory — opt in to the runner's persistent
// memory/ scaffold (see memory-scaffold.ts).
readonly usesMemoryScaffold = true;
// The app-server keeps history server-side; there is no on-disk transcript,
// so the provider persists each exchange itself into `conversations/`
// (see exchange-archive.ts). The poll-loop reports exchanges through this
// hook and does nothing else — archiving is payload code, not runner code.
onExchangeComplete(exchange: ProviderExchange): void {
archiveProviderExchange({
provider: 'codex',
prompt: exchange.prompt,
result: exchange.result,
continuation: exchange.continuation,
status: exchange.status,
});
}
private readonly mcpServers: Record<string, McpServerConfig>;
private readonly model?: string;
private readonly effort?: CodexReasoningEffort;
private readonly runtime: CodexRuntimeDeps;
constructor(options: ProviderOptions = {}, runtime: CodexRuntimeDeps = defaultCodexRuntimeDeps) {
this.mcpServers = options.mcpServers ?? {};
this.model = options.model;
this.runtime = runtime;
this.effort = normalizeEffort(options.effort);
}
isSessionInvalid(err: unknown): boolean {
const msg = err instanceof Error ? err.message : String(err);
return STALE_THREAD_RE.test(msg);
}
query(input: QueryInput): AgentQuery {
const pending: string[] = [input.prompt];
let waiting: (() => void) | null = null;
let ended = false;
let aborted = false;
let activeServer: AppServer | null = null;
let activeThreadId: string | null = null;
let activeTurnId: string | null = null;
let wakeActiveTurn: (() => void) | null = null;
const wake = (): void => {
waiting?.();
waiting = null;
};
const pushOrSteer = (message: string): void => {
if (activeServer && activeThreadId && activeTurnId) {
void this.runtime.steerCodexTurn(activeServer, activeThreadId, activeTurnId, message).catch(() => {
pending.push(message);
wake();
});
return;
}
pending.push(message);
wake();
};
const self = this;
async function* gen(): AsyncGenerator<ProviderEvent> {
self.runtime.writeCodexConfigToml(self.mcpServers, { model: self.model, effort: self.effort });
const server = self.runtime.spawnCodexAppServer();
activeServer = server;
self.runtime.attachCodexAutoApproval(server);
let threadId: string | undefined = input.continuation;
let initYielded = false;
try {
await self.runtime.initializeCodexAppServer(server);
threadId = await self.runtime.startOrResumeCodexThread(server, threadId, {
model: self.model,
cwd: input.cwd,
baseInstructions: input.systemContext?.instructions,
});
activeThreadId = threadId;
while (!aborted) {
while (pending.length === 0 && !ended && !aborted) {
await new Promise<void>((resolve) => {
waiting = resolve;
});
}
if (aborted) return;
if (pending.length === 0 && ended) return;
const text = pending.shift()!;
yield* runOneTurn(
server,
threadId,
text,
self.model,
self.effort,
input.cwd,
(turnId) => {
activeTurnId = turnId;
},
() => {
activeTurnId = null;
},
() => initYielded,
() => {
initYielded = true;
},
() => aborted,
(waker) => {
wakeActiveTurn = waker;
},
self.runtime.startCodexTurn,
);
}
} finally {
activeTurnId = null;
activeThreadId = null;
activeServer = null;
wakeActiveTurn = null;
self.runtime.killCodexAppServer(server);
}
}
return {
push: pushOrSteer,
end: () => {
ended = true;
wake();
},
abort: () => {
aborted = true;
if (activeServer && activeThreadId && activeTurnId) {
void this.runtime.interruptCodexTurn(activeServer, activeThreadId, activeTurnId).catch(() => {});
}
wakeActiveTurn?.();
wake();
},
events: gen(),
};
}
}
async function* runOneTurn(
server: AppServer,
threadId: string,
inputText: string,
model: string | undefined,
effort: string | undefined,
cwd: string,
setActiveTurn: (turnId: string) => void,
clearActiveTurn: () => void,
hasInit: () => boolean,
markInit: () => void,
isAborted: () => boolean,
setAbortWaker: (waker: (() => void) | null) => void,
startTurn: typeof startCodexTurn,
): AsyncGenerator<ProviderEvent> {
const state: { error: Error | null } = { error: null };
let resultText = '';
let turnDone = false;
let turnId: string | null = null;
// A finished turn can no longer absorb steered input: codex's turn/steer
// against a completed turn resolves as a no-op, so a follow-up routed there
// is lost silently. Clear the active-turn marker the moment the turn ends —
// before the generator drains and tears down in its `finally` — so
// pushOrSteer queues any racing follow-up into a fresh turn instead.
const finishTurn = (): void => {
turnDone = true;
clearActiveTurn();
};
const buffer: ProviderEvent[] = [];
let waker: (() => void) | null = null;
const kick = (): void => {
waker?.();
waker = null;
};
setAbortWaker(kick);
const handler = (n: JsonRpcNotification): void => {
const method = n.method;
const params = n.params ?? {};
buffer.push({ type: 'activity' });
switch (method) {
case 'thread/started': {
const thread = params.thread as { id?: string } | undefined;
if (thread?.id && !hasInit()) {
markInit();
buffer.push({ type: 'init', continuation: thread.id });
}
break;
}
case 'turn/started': {
const turn = params.turn as { id?: string } | undefined;
if (turn?.id) {
turnId = turn.id;
setActiveTurn(turn.id);
}
break;
}
case 'item/agentMessage/delta': {
const delta = params.delta as string | undefined;
if (delta) resultText += delta;
break;
}
case 'item/completed': {
const item = params.item as { type?: string; text?: string } | undefined;
if (item?.type === 'agentMessage' && item.text) resultText = item.text;
break;
}
case 'thread/status/changed': {
const status = params.status as string | undefined;
if (status) buffer.push({ type: 'progress', message: `status: ${status}` });
break;
}
case 'error': {
const err = params.error as { message?: string; additionalDetails?: string | null } | undefined;
const msg = [err?.message, err?.additionalDetails].filter(Boolean).join(': ') || 'Codex turn failed';
state.error = new Error(msg);
finishTurn();
break;
}
case 'turn/completed': {
const turn = params.turn as
| { error?: { message?: string; additionalDetails?: string | null } | null; items?: unknown[] }
| undefined;
const agentMessage = turn?.items
?.filter((item): item is { type: string; text?: string } => typeof item === 'object' && item !== null)
.find((item) => item.type === 'agentMessage' && item.text);
if (agentMessage?.text) resultText = agentMessage.text;
if (turn?.error) {
const msg =
[turn.error.message, turn.error.additionalDetails].filter(Boolean).join(': ') || 'Codex turn failed';
state.error = new Error(msg);
}
finishTurn();
break;
}
default:
break;
}
kick();
};
server.notificationHandlers.push(handler);
// A dead app-server can't send the notification this turn is parked on —
// end the turn immediately with the real cause instead of the 10-min timeout.
const onServerExit = (err: Error): void => {
if (turnDone) return;
state.error = err;
finishTurn();
kick();
};
server.exitHandlers.push(onServerExit);
const timer = setTimeout(() => {
state.error = new Error(`Turn timed out after ${TURN_TIMEOUT_MS}ms`);
finishTurn();
kick();
}, TURN_TIMEOUT_MS);
try {
if (!hasInit()) {
markInit();
buffer.push({ type: 'init', continuation: threadId });
}
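// Snapshot before the turn starts so an image written while turn/start is
// still in flight is not mistaken for a pre-existing one.
const imagesBefore = listGeneratedImages(threadId);
turnId = await startTurn(server, {
threadId,
inputText,
model,
effort,
cwd,
});
setActiveTurn(turnId);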
if (isAborted()) return;
while (true) {
while (buffer.length > 0) {
yield buffer.shift()!;
}
if (turnDone || isAborted()) break;
await new Promise<void>((resolve) => {
waker = resolve;
});
waker = null;
}
while (buffer.length > 0) yield buffer.shift()!;
if (isAborted()) return;
if (state.error) {
yield {
type: 'error',
message: state.error.message,
retryable: false,
classification: classifyError(state.error.message),
};
throw state.error;
}
for (const imagePath of listGeneratedImages(threadId)) {
if (!imagesBefore.has(imagePath)) {
yield { type: 'file', path: imagePath };
}
}
yield { type: 'result', text: resultText || null };
} finally {
clearTimeout(timer);
clearActiveTurn();
setAbortWaker(null);
const idx = server.notificationHandlers.indexOf(handler);
if (idx >= 0) server.notificationHandlers.splice(idx, 1);
const exitIdx = server.exitHandlers.indexOf(onServerExit);
if (exitIdx >= 0) server.exitHandlers.splice(exitIdx, 1);
}
}
/**
* Codex's built-in image generation saves into CODEX_HOME/generated_images/
* <threadId>/ — its native client renders those to the user, so the model
* believes delivery already happened and won't send_file them. The runner
* must deliver them itself: snapshot the dir at turn start, emit a `file`
* event for anything new at turn end.
*/
function listGeneratedImages(threadId: string): Set<string> {
const dir = path.join(process.env.CODEX_HOME || '/home/node/.codex', 'generated_images', threadId);
try {
return new Set(fs.readdirSync(dir).map((f) => path.join(dir, f)));
} catch {
return new Set();
}
}
registerProvider('codex', (opts) => new CodexProvider(opts));
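The image delivery in `runOneTurn` boils down to a before/after set difference over the thread's `generated_images` directory. A minimal sketch of that comparison follows, using in-memory sets instead of `fs.readdirSync`. The helper name `newlyCreated` and the sample paths are hypothetical, and the sort is added here only to make the output deterministic.

```typescript
// Hypothetical pure helper mirroring the snapshot comparison in runOneTurn:
// anything present after the turn but absent from the earlier snapshot is new.
function newlyCreated(before: Set<string>, after: Set<string>): string[] {
  return Array.from(after)
    .filter((p) => !before.has(p))
    .sort();
}

// Illustrative paths only.
const before = new Set(['/codex/generated_images/t1/ig_old.png']);
const after = new Set([
  '/codex/generated_images/t1/ig_old.png',
  '/codex/generated_images/t1/ig_new.png',
]);

console.log(newlyCreated(before, after));
console.log(newlyCreated(new Set<string>(), new Set<string>()));
```

A missing directory is treated as an empty snapshot in the real code, so a thread that never generates images yields no `file` events.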
@@ -0,0 +1,267 @@
import { describe, expect, it } from 'bun:test';
import fs from 'fs';
import os from 'os';
import path from 'path';
import { CodexProvider, type CodexRuntimeDeps } from './codex.js';
import type { AppServer, JsonRpcNotification, TurnParams } from './codex-app-server.js';
import type { ProviderEvent } from './types.js';
describe('CodexProvider active turns', () => {
it('steers follow-ups into the active turn and yields liveness activity', async () => {
const fake = createFakeCodexRuntime();
const provider = new CodexProvider({}, fake.runtime);
const query = provider.query({ prompt: 'first prompt', cwd: '/workspace/agent' });
const events: ProviderEvent[] = [];
const collect = collectEvents(query.events, events);
await waitFor(() => fake.startCalls.length === 1);
query.push('follow-up prompt');
await waitFor(() => fake.steerCalls.length === 1);
query.end();
fake.completeTurn('final answer');
await collect;
expect(fake.startCalls).toHaveLength(1);
expect(fake.startCalls[0].inputText).toBe('first prompt');
expect(fake.steerCalls).toEqual([{ threadId: 'thread-1', turnId: 'turn-1', inputText: 'follow-up prompt' }]);
expect(events.filter((event) => event.type === 'activity').length).toBeGreaterThanOrEqual(2);
expect(events.filter((event) => event.type === 'result')).toEqual([{ type: 'result', text: 'final answer' }]);
expect(fake.killed).toBe(true);
});
it('queues follow-ups for the next turn when steering is rejected', async () => {
const fake = createFakeCodexRuntime({ rejectSteer: true });
const provider = new CodexProvider({}, fake.runtime);
const query = provider.query({ prompt: 'first prompt', cwd: '/workspace/agent' });
const events: ProviderEvent[] = [];
const collect = collectEvents(query.events, events);
await waitFor(() => fake.startCalls.length === 1);
query.push('queued follow-up');
await waitFor(() => fake.steerCalls.length === 1);
await sleep(0);
fake.completeTurn('first answer');
await waitFor(() => fake.startCalls.length === 2);
query.end();
fake.completeTurn('second answer');
await collect;
expect(fake.startCalls.map((call) => call.inputText)).toEqual(['first prompt', 'queued follow-up']);
expect(fake.steerCalls).toHaveLength(1);
expect(events.filter((event) => event.type === 'result')).toEqual([
{ type: 'result', text: 'first answer' },
{ type: 'result', text: 'second answer' },
]);
});
it('queues a follow-up that races turn completion into a new turn, never steering the finished turn', async () => {
const fake = createFakeCodexRuntime();
const provider = new CodexProvider({}, fake.runtime);
const query = provider.query({ prompt: 'first prompt', cwd: '/workspace/agent' });
const events: ProviderEvent[] = [];
const collect = collectEvents(query.events, events);
await waitFor(() => fake.startCalls.length === 1);
// The turn completes, then a follow-up lands in the same tick — before the
// generator has drained and torn the turn down. codex's turn/steer no-ops
// on a finished turn (resolves without error), so steering here would drop
// the message silently. It must start a fresh turn instead.
fake.completeTurn('first answer');
query.push('racing follow-up');
await waitFor(() => fake.startCalls.length === 2);
query.end();
fake.completeTurn('second answer');
await collect;
expect(fake.steerCalls).toHaveLength(0);
expect(fake.startCalls.map((call) => call.inputText)).toEqual(['first prompt', 'racing follow-up']);
expect(events.filter((event) => event.type === 'result')).toEqual([
{ type: 'result', text: 'first answer' },
{ type: 'result', text: 'second answer' },
]);
});
it('interrupts the active turn and closes the stream on abort', async () => {
const fake = createFakeCodexRuntime();
const provider = new CodexProvider({}, fake.runtime);
const query = provider.query({ prompt: 'first prompt', cwd: '/workspace/agent' });
const events: ProviderEvent[] = [];
const collect = collectEvents(query.events, events);
await waitFor(() => fake.startCalls.length === 1);
query.abort();
await collect;
expect(fake.interruptCalls).toEqual([{ threadId: 'thread-1', turnId: 'turn-1' }]);
expect(events.some((event) => event.type === 'result')).toBe(false);
expect(fake.killed).toBe(true);
});
it('threads the configured model and effort into the turn', async () => {
const fake = createFakeCodexRuntime();
const provider = new CodexProvider({ model: 'gpt-5.5', effort: 'high' }, fake.runtime);
const query = provider.query({ prompt: 'first prompt', cwd: '/workspace/agent' });
const events: ProviderEvent[] = [];
const collect = collectEvents(query.events, events);
await waitFor(() => fake.startCalls.length === 1);
query.end();
fake.completeTurn('final answer');
await collect;
expect(fake.startCalls[0].model).toBe('gpt-5.5');
expect(fake.startCalls[0].effort).toBe('high');
expect(events.filter((event) => event.type === 'result')).toEqual([{ type: 'result', text: 'final answer' }]);
});
it('delivers harness-generated images as file events — the model never sends them itself', async () => {
const codexHome = fs.mkdtempSync(path.join(os.tmpdir(), 'codex-home-'));
const prevHome = process.env.CODEX_HOME;
process.env.CODEX_HOME = codexHome;
try {
const fake = createFakeCodexRuntime();
const provider = new CodexProvider({}, fake.runtime);
const query = provider.query({ prompt: 'make an image', cwd: '/workspace/agent' });
const events: ProviderEvent[] = [];
const collect = collectEvents(query.events, events);
await waitFor(() => fake.startCalls.length === 1);
// Codex's built-in image_gen writes into CODEX_HOME mid-turn.
const imagesDir = path.join(codexHome, 'generated_images', 'thread-1');
fs.mkdirSync(imagesDir, { recursive: true });
fs.writeFileSync(path.join(imagesDir, 'ig_abc.png'), 'png-bytes');
query.end();
fake.completeTurn('Here you go — created the image.');
await collect;
const files = events.filter((event) => event.type === 'file') as Array<{ type: 'file'; path: string }>;
expect(files).toHaveLength(1);
expect(files[0].path).toBe(path.join(imagesDir, 'ig_abc.png'));
// file events arrive before the result so delivery shares the turn.
expect(events.findIndex((e) => e.type === 'file')).toBeLessThan(events.findIndex((e) => e.type === 'result'));
} finally {
if (prevHome === undefined) delete process.env.CODEX_HOME;
else process.env.CODEX_HOME = prevHome;
fs.rmSync(codexHome, { recursive: true, force: true });
}
});
it('ends the turn immediately with the real cause when the app-server dies mid-turn', async () => {
const fake = createFakeCodexRuntime();
const provider = new CodexProvider({}, fake.runtime);
const query = provider.query({ prompt: 'prompt', cwd: '/workspace/agent' });
const events: ProviderEvent[] = [];
const collect = collectEvents(query.events, events);
await waitFor(() => fake.startCalls.length === 1);
// No pending request exists mid-turn (turn/start already resolved), so
// only the exitHandlers seam can end the turn — without it this parks
// on the waker until the 10-minute turn timeout.
fake.crashServer(new Error('Codex app-server exited: code=1 signal=null'));
// The generator yields the error event, then rethrows to its consumer.
await collect.catch(() => {});
const errors = events.filter((event) => event.type === 'error');
expect(errors).toHaveLength(1);
expect((errors[0] as { message: string }).message).toContain('app-server exited');
});
});
function createFakeCodexRuntime(opts: { rejectSteer?: boolean } = {}) {
const server = fakeServer();
const startCalls: TurnParams[] = [];
const steerCalls: Array<{ threadId: string; turnId: string; inputText: string }> = [];
const interruptCalls: Array<{ threadId: string; turnId: string }> = [];
let killed = false;
const notify = (method: string, params?: Record<string, unknown>): void => {
const notification: JsonRpcNotification = { method, params };
for (const handler of [...server.notificationHandlers]) handler(notification);
};
const runtime: CodexRuntimeDeps = {
writeCodexConfigToml: () => {},
spawnCodexAppServer: () => server,
attachCodexAutoApproval: () => {},
initializeCodexAppServer: async () => {},
startOrResumeCodexThread: async (_server, threadId) => threadId ?? 'thread-1',
startCodexTurn: async (_server, params) => {
startCalls.push(params);
const turnId = `turn-${startCalls.length}`;
notify('turn/started', { turn: { id: turnId } });
return turnId;
},
steerCodexTurn: async (_server, threadId, turnId, inputText) => {
steerCalls.push({ threadId, turnId, inputText });
if (opts.rejectSteer) throw new Error('steer rejected');
},
interruptCodexTurn: async (_server, threadId, turnId) => {
interruptCalls.push({ threadId, turnId });
},
killCodexAppServer: () => {
killed = true;
},
};
return {
runtime,
startCalls,
steerCalls,
interruptCalls,
get killed() {
return killed;
},
completeTurn(text: string) {
notify('turn/completed', { turn: { items: [{ type: 'agentMessage', text }] } });
},
crashServer(err: Error) {
for (const h of [...server.exitHandlers]) h(err);
},
};
}
function fakeServer(): AppServer {
return {
process: { stdin: { write: () => true }, kill: () => true },
readline: { close: () => {} },
pending: new Map(),
notificationHandlers: [],
exitHandlers: [],
serverRequestHandlers: [],
} as unknown as AppServer;
}
async function collectEvents(events: AsyncIterable<ProviderEvent>, sink: ProviderEvent[]): Promise<void> {
for await (const event of events) {
sink.push(event);
}
}
async function waitFor(condition: () => boolean, timeoutMs = 1000): Promise<void> {
const start = Date.now();
while (!condition()) {
if (Date.now() - start > timeoutMs) throw new Error('waitFor timeout');
await sleep(10);
}
}
function sleep(ms: number): Promise<void> {
return new Promise((resolve) => setTimeout(resolve, ms));
}
@@ -0,0 +1,136 @@
import { afterEach, describe, expect, it } from 'bun:test';
import fs from 'fs';
import os from 'os';
import path from 'path';
import { archiveProviderExchange } from './exchange-archive.js';
let tmpDir: string | null = null;
afterEach(() => {
if (tmpDir) {
fs.rmSync(tmpDir, { recursive: true, force: true });
tmpDir = null;
}
});
function makeTmpDir(): string {
tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-archive-'));
return tmpDir;
}
describe('provider exchange archive', () => {
it('appends same-thread exchanges into one file with a single header', () => {
const conversationsDir = makeTmpDir();
const timestamp = new Date('2026-06-03T12:34:56.789Z');
const first = archiveProviderExchange({
conversationsDir,
provider: 'codex',
prompt: 'hello',
result: 'world',
continuation: 'thread-123',
status: 'completed',
timestamp,
});
const second = archiveProviderExchange({
conversationsDir,
provider: 'codex',
prompt: 'hello again',
result: 'world again',
continuation: 'thread-123',
status: 'completed',
timestamp,
});
// Same thread → same date-prefixed, thread-stable file, not one per exchange.
expect(first).toBe('2026-06-03-codex-thread-123.md');
expect(second).toBe(first);
expect(fs.readdirSync(conversationsDir)).toHaveLength(1);
const content = fs.readFileSync(path.join(conversationsDir, first!), 'utf-8');
// Header (thread-level metadata) written exactly once.
expect(content.match(/# Codex Conversation/g)).toHaveLength(1);
expect(content).toContain('Provider: codex');
expect(content).toContain('Continuation/thread id: thread-123');
// Both exchanges present, each with its own status line.
expect(content).toContain('**User**: hello');
expect(content).toContain('**Assistant**: world');
expect(content).toContain('**User**: hello again');
expect(content).toContain('**Assistant**: world again');
expect(content.match(/Status: completed/g)).toHaveLength(2);
});
it('writes a separate file per thread', () => {
const conversationsDir = makeTmpDir();
const timestamp = new Date('2026-06-03T12:34:56.789Z');
const a = archiveProviderExchange({
conversationsDir,
provider: 'codex',
prompt: 'p',
result: 'r',
continuation: 'thread-a',
status: 'completed',
timestamp,
});
const b = archiveProviderExchange({
conversationsDir,
provider: 'codex',
prompt: 'p',
result: 'r',
continuation: 'thread-b',
status: 'completed',
timestamp,
});
expect(a).toBe('2026-06-03-codex-thread-a.md');
expect(b).toBe('2026-06-03-codex-thread-b.md');
expect(fs.readdirSync(conversationsDir)).toHaveLength(2);
});
it('keeps the creation-date prefix stable when later exchanges land on another day', () => {
const conversationsDir = makeTmpDir();
const first = archiveProviderExchange({
conversationsDir,
provider: 'codex',
prompt: 'a',
result: 'b',
continuation: 'thread-x',
status: 'completed',
timestamp: new Date('2026-06-03T10:00:00.000Z'),
});
// A later exchange on a different day must append to the same file, not
// mint a new 2026-06-05-* one (the bug a naive date-from-timestamp scheme
// would introduce).
const second = archiveProviderExchange({
conversationsDir,
provider: 'codex',
prompt: 'c',
result: 'd',
continuation: 'thread-x',
status: 'completed',
timestamp: new Date('2026-06-05T10:00:00.000Z'),
});
expect(first).toBe('2026-06-03-codex-thread-x.md');
expect(second).toBe(first);
expect(fs.readdirSync(conversationsDir)).toHaveLength(1);
});
it('skips empty result text', () => {
const conversationsDir = makeTmpDir();
const filename = archiveProviderExchange({
conversationsDir,
provider: 'codex',
prompt: 'hello',
result: ' ',
continuation: 'thread-123',
status: 'completed',
});
expect(filename).toBeNull();
expect(fs.readdirSync(conversationsDir)).toHaveLength(0);
});
});
@@ -0,0 +1,105 @@
import fs from 'fs';
import path from 'path';
/**
* Per-thread conversation archive for providers with no on-disk transcript —
* payload code, shipped with the provider that needs it. The provider's
* `onExchangeComplete` hook (see types.ts) calls this with each completed
* exchange; the runner never archives on a provider's behalf.
*
* One file per thread (keyed on the continuation id), named
* `<date>-<provider>-<thread>.md` and appended to as exchanges complete —
* mirroring the Claude path's one-file-per-session granularity and its
* date-prefixed, name-sortable filenames, since the Codex app-server keeps
* history server-side with no transcript to roll up at a compaction boundary.
* The date is the thread's creation day and stays stable across later appends.
*/
const DEFAULT_CONVERSATIONS_DIR = '/workspace/agent/conversations';
export interface ProviderExchangeArchiveOptions {
provider: string;
prompt: string;
result: string | null | undefined;
continuation?: string;
status: string;
timestamp?: Date;
conversationsDir?: string;
}
/**
* Append a single prompt/result exchange to its thread's conversation file,
* writing the thread-level header once when the file is first created. Returns
* the (thread-stable) filename, or null when there is nothing to archive
* (empty result).
*/
export function archiveProviderExchange(options: ProviderExchangeArchiveOptions): string | null {
const result = options.result?.trim();
if (!result) return null;
const timestamp = options.timestamp ?? new Date();
const conversationsDir =
options.conversationsDir || process.env.NANOCLAW_CONVERSATIONS_DIR || DEFAULT_CONVERSATIONS_DIR;
fs.mkdirSync(conversationsDir, { recursive: true });
const filename = threadArchiveFilename(conversationsDir, options.provider, options.continuation, timestamp);
const filePath = path.join(conversationsDir, filename);
// Thread-level metadata (provider, thread id) belongs in the header, written
// once. Per-exchange metadata (timestamp, status) rides in each appended
// block. Each block leads with a blank line + `---` so the separator renders
// as a thematic break, not a setext heading underline on the prior line.
const parts: string[] = [];
if (!fs.existsSync(filePath)) {
parts.push(
`# ${titleCase(options.provider)} Conversation`,
'',
`Provider: ${options.provider}`,
`Continuation/thread id: ${options.continuation || '(none)'}`,
);
}
parts.push(
'',
'---',
'',
`Archived: ${timestamp.toISOString()} · Status: ${options.status}`,
'',
`**User**: ${truncate(options.prompt)}`,
'',
`**Assistant**: ${truncate(result)}`,
'',
);
fs.appendFileSync(filePath, parts.join('\n'));
return filename;
}
function threadArchiveFilename(
dir: string,
provider: string,
continuation: string | undefined,
timestamp: Date,
): string {
const thread = sanitizeSlug(continuation || 'no-thread').slice(0, 48) || 'no-thread';
const suffix = `${sanitizeSlug(provider)}-${thread}.md`;
// Reuse this thread's existing file whatever day it was created; only stamp a
// new date when none exists. Match on the suffix after the date prefix.
const dated = /^\d{4}-\d{2}-\d{2}-/;
const existing = fs.readdirSync(dir).find((f) => dated.test(f) && f.replace(dated, '') === suffix);
if (existing) return existing;
return `${timestamp.toISOString().split('T')[0]}-${suffix}`;
}
function sanitizeSlug(value: string): string {
return value
.toLowerCase()
.replace(/[^a-z0-9]+/g, '-')
.replace(/^-+|-+$/g, '');
}
function titleCase(value: string): string {
return value ? value[0].toUpperCase() + value.slice(1) : 'Provider';
}
function truncate(value: string): string {
return value.length > 2000 ? value.slice(0, 2000) + '...' : value;
}
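The thread-stable naming in `threadArchiveFilename` can be illustrated without touching the filesystem. The sketch below applies the same matching rule to an in-memory list of filenames; `resolveArchiveName` is a hypothetical stand-in, not a function from this change.

```typescript
// Same date-prefix pattern as threadArchiveFilename.
const DATED_PREFIX = /^\d{4}-\d{2}-\d{2}-/;

// Hypothetical stand-in that takes the directory listing as an argument
// instead of calling fs.readdirSync.
function resolveArchiveName(existing: string[], suffix: string, now: Date): string {
  // Exact match on everything after the date prefix, so "thread-x" does not
  // collide with "thread-xy".
  const match = existing.find((f) => DATED_PREFIX.test(f) && f.replace(DATED_PREFIX, '') === suffix);
  if (match) return match;
  // No file for this thread yet: stamp the creation day (UTC).
  return `${now.toISOString().split('T')[0]}-${suffix}`;
}

const files = ['2026-06-03-codex-thread-x.md', '2026-06-04-codex-thread-xy.md'];
const later = new Date('2026-06-05T10:00:00.000Z');

console.log(resolveArchiveName(files, 'codex-thread-x.md', later));
console.log(resolveArchiveName(files, 'codex-thread-y.md', later));
```

The first call reuses the file created on 2026-06-03 even though the exchange lands two days later; the second call finds no match and gets a fresh name stamped with the current day.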
@@ -2,6 +2,7 @@ import { describe, it, expect } from 'bun:test';
import { createProvider, type ProviderName } from './factory.js';
import { ClaudeProvider } from './claude.js';
import { CodexProvider } from './codex.js';
import { MockProvider } from './mock.js';
describe('createProvider', () => {
@@ -9,6 +10,10 @@ describe('createProvider', () => {
expect(createProvider('claude')).toBeInstanceOf(ClaudeProvider);
});
it('returns CodexProvider for codex', () => {
expect(createProvider('codex')).toBeInstanceOf(CodexProvider);
});
it('returns MockProvider for mock', () => {
expect(createProvider('mock')).toBeInstanceOf(MockProvider);
});
@@ -3,4 +3,6 @@
// level. Skills add a new provider by appending one import line below.
import './claude.js';
import './codex.js';
import './mock.js';
import './opencode.js';
@@ -0,0 +1,59 @@
import { describe, it, expect } from 'bun:test';
import { mcpServersToOpenCodeConfig } from './mcp-to-opencode.js';
describe('mcpServersToOpenCodeConfig', () => {
it('maps nanoclaw + extra server like v2 index.ts merge', () => {
const servers = {
nanoclaw: {
command: 'node',
args: ['/app/src/mcp-tools/index.js'],
env: {
SESSION_INBOUND_DB_PATH: '/workspace/inbound.db',
SESSION_OUTBOUND_DB_PATH: '/workspace/outbound.db',
SESSION_HEARTBEAT_PATH: '/workspace/.heartbeat',
},
},
extra: {
command: 'npx',
args: ['-y', 'some-mcp'],
env: { FOO: 'bar' },
},
};
const mcp = mcpServersToOpenCodeConfig(servers);
expect(mcp.nanoclaw).toEqual({
type: 'local',
command: ['node', '/app/src/mcp-tools/index.js'],
environment: {
SESSION_INBOUND_DB_PATH: '/workspace/inbound.db',
SESSION_OUTBOUND_DB_PATH: '/workspace/outbound.db',
SESSION_HEARTBEAT_PATH: '/workspace/.heartbeat',
},
enabled: true,
});
expect(mcp.extra).toEqual({
type: 'local',
command: ['npx', '-y', 'some-mcp'],
environment: { FOO: 'bar' },
enabled: true,
});
});
it('omits environment when env is empty', () => {
const mcp = mcpServersToOpenCodeConfig({
x: { command: 'true', args: [], env: {} },
});
expect(mcp.x).toEqual({
type: 'local',
command: ['true'],
enabled: true,
});
});
it('returns empty record for undefined', () => {
expect(mcpServersToOpenCodeConfig(undefined)).toEqual({});
});
});
@@ -0,0 +1,39 @@
import type { McpServerConfig } from './types.js';
/** OpenCode `mcp` entry shape (local stdio server). */
export type OpenCodeMcpLocal = {
type: 'local';
command: string[];
environment?: Record<string, string>;
enabled: true;
};
/** OpenCode `mcp` entry shape (remote HTTP server). */
export type OpenCodeMcpRemote = {
type: 'remote';
url: string;
headers?: Record<string, string>;
enabled: true;
};
export type OpenCodeMcpEntry = OpenCodeMcpLocal | OpenCodeMcpRemote;
/**
* Map NanoClaw v2 MCP definitions (same shape as Claude Agent SDK) into
* OpenCode config `mcp` field. Stdio-only until `McpServerConfig` gains remote.
*/
export function mcpServersToOpenCodeConfig(
servers: Record<string, McpServerConfig> | undefined,
): Record<string, OpenCodeMcpEntry> {
const out: Record<string, OpenCodeMcpEntry> = {};
if (!servers) return out;
for (const [name, cfg] of Object.entries(servers)) {
out[name] = {
type: 'local',
command: [cfg.command, ...cfg.args],
...(Object.keys(cfg.env).length > 0 ? { environment: cfg.env } : {}),
enabled: true,
};
}
return out;
}
@@ -0,0 +1,22 @@
/**
* Integration test for the opencode provider's CONTAINER-side reach-in: the self-registration
* import in container/agent-runner/src/providers/index.ts. Importing the barrel runs
* opencode.ts's top-level registerProvider('opencode', …); without that import line
* createProvider('opencode') throws 'Unknown provider' at runtime.
*
* Behavioral, not structural, and BARREL-ONLY: it imports the real barrel (./index.js),
* never ./opencode.js directly, then asserts listProviderNames() contains the provider. The
* existing opencode.factory.test.ts imports ./opencode.js directly, so it self-registers and
* stays GREEN when the barrel line is deleted; it is a unit test, not a registration guard.
* This test goes red if the barrel import is deleted or drifts, if the barrel fails to
* evaluate, or if @opencode-ai/sdk is not installed (the unmocked barrel import throws), so
* it also implicitly guards that dependency.
*/
import { describe, it, expect } from 'bun:test';
import { listProviderNames } from './provider-registry.js';
import './index.js'; // the real container provider barrel — triggers each provider's registerProvider()
describe('opencode provider registration', () => {
it('registers opencode via the provider barrel', () => {
expect(listProviderNames()).toContain('opencode');
});
});
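For readers unfamiliar with the self-registration pattern this test guards, here is a minimal sketch. It is an assumed stand-in for `provider-registry.js`, whose real implementation is not shown in this diff; the factory type and the exact error text are illustrative only.

```typescript
// Hypothetical stand-in for provider-registry.js; the real module may differ.
type ProviderFactory = (opts: Record<string, unknown>) => { name: string };

const registry = new Map<string, ProviderFactory>();

function registerProvider(name: string, factory: ProviderFactory): void {
  registry.set(name, factory);
}

function listProviderNames(): string[] {
  return Array.from(registry.keys());
}

function createProvider(name: string, opts: Record<string, unknown> = {}): { name: string } {
  const factory = registry.get(name);
  if (!factory) throw new Error(`Unknown provider: ${name}`);
  return factory(opts);
}

// In the real code this call sits at the top level of opencode.ts and only runs
// when something imports that module, which is what the barrel line guarantees.
registerProvider('opencode', () => ({ name: 'opencode' }));

const names = listProviderNames();
let unknownError = '';
try {
  createProvider('not-registered');
} catch (err) {
  unknownError = err instanceof Error ? err.message : String(err);
}
```

The registry only knows about modules that have been evaluated, so deleting the barrel import removes the provider at runtime without producing a compile error. That is why the test asserts on `listProviderNames()` after importing the barrel alone.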
@@ -0,0 +1,10 @@
import { describe, it, expect } from 'bun:test';
import { createProvider } from './factory.js';
import { OpenCodeProvider } from './opencode.js';
describe('createProvider (opencode)', () => {
it('returns OpenCodeProvider for opencode', () => {
expect(createProvider('opencode')).toBeInstanceOf(OpenCodeProvider);
});
});
@@ -0,0 +1,423 @@
import { spawn, type ChildProcess } from 'child_process';
import { createOpencodeClient, type OpencodeClient } from '@opencode-ai/sdk';
import { registerProvider } from './provider-registry.js';
import type { AgentProvider, AgentQuery, ProviderEvent, ProviderOptions, QueryInput } from './types.js';
import { mcpServersToOpenCodeConfig } from './mcp-to-opencode.js';
function log(msg: string): void {
console.error(`[opencode-provider] ${msg}`);
}
const SESSION_STATUS_RETRY_ERROR_AFTER = 3;
/** Stale / dead OpenCode session heuristics (complement Claude-centric host patterns). */
const STALE_SESSION_RE =
/no conversation found|ENOENT.*\.jsonl|session.*not found|NotFoundError|connection reset|ECONNRESET|404|event timeout/i;
function killProcessTree(proc: ChildProcess): void {
if (!proc.pid) return;
try {
process.kill(-proc.pid, 'SIGKILL');
} catch {
try {
proc.kill('SIGKILL');
} catch {
/* ignore */
}
}
}
function spawnOpencodeServer(config: Record<string, unknown>, timeoutMs = 10_000): Promise<{ url: string; proc: ChildProcess }> {
return new Promise((resolve, reject) => {
const hostname = '127.0.0.1';
const port = 4096;
const proc = spawn('opencode', ['serve', `--hostname=${hostname}`, `--port=${port}`], {
env: {
...process.env,
OPENCODE_CONFIG_CONTENT: JSON.stringify(config),
},
detached: true,
});
const id = setTimeout(() => {
killProcessTree(proc);
reject(new Error(`Timeout waiting for OpenCode server to start after ${timeoutMs}ms`));
}, timeoutMs);
let output = '';
proc.stdout?.on('data', (chunk: Buffer) => {
output += chunk.toString();
for (const line of output.split('\n')) {
if (line.startsWith('opencode server listening')) {
const match = line.match(/on\s+(https?:\/\/[^\s]+)/);
if (match) {
clearTimeout(id);
resolve({ url: match[1], proc });
}
}
}
});
proc.stderr?.on('data', (chunk: Buffer) => {
output += chunk.toString();
});
proc.on('exit', (code) => {
clearTimeout(id);
let msg = `OpenCode server exited with code ${code}`;
if (output.trim()) msg += `\nServer output: ${output}`;
reject(new Error(msg));
});
proc.on('error', (err) => {
clearTimeout(id);
reject(err);
});
});
}
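The startup handshake above scans the accumulated output on every chunk, so a banner split across two chunks is still found once the rest arrives. A standalone sketch of that scan follows; `parseListeningUrl` is a hypothetical name, and the banner wording is taken from the check in this diff rather than verified against OpenCode itself.

```typescript
// Hypothetical extraction of the stdout scan used in spawnOpencodeServer.
function parseListeningUrl(output: string): string | undefined {
  for (const line of output.split('\n')) {
    if (!line.startsWith('opencode server listening')) continue;
    const match = line.match(/on\s+(https?:\/\/[^\s]+)/);
    if (match) return match[1];
  }
  return undefined;
}

// First chunk ends mid-banner; the second chunk completes the line.
const partial = 'starting up\nopencode server listen';
const complete = partial + 'ing on http://127.0.0.1:4096\n';
const urlFromPartial = parseListeningUrl(partial);
const urlFromComplete = parseListeningUrl(complete);
```

Rescanning the whole buffer is quadratic in the worst case, but the startup output is short, so the simpler code is a reasonable trade.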
function wrapPromptWithContext(text: string, systemInstructions?: string): string {
let out = text;
if (systemInstructions) {
out = `<system>\n${systemInstructions}\n</system>\n\n${out}`;
}
return out;
}
function buildOpenCodeConfig(options: ProviderOptions): Record<string, unknown> {
const provider = process.env.OPENCODE_PROVIDER || 'anthropic';
const model = process.env.OPENCODE_MODEL;
const smallModel = process.env.OPENCODE_SMALL_MODEL;
const proxyUrl = process.env.ANTHROPIC_BASE_URL;
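// Strip a leading "<provider>/" with a literal prefix check; building a RegExp from the
// env value would treat characters such as "." or "+" as regex syntax.
const stripProviderPrefix = (id: string): string =>
id.startsWith(`${provider}/`) ? id.slice(provider.length + 1) : id;
const providerModelId = model ? stripProviderPrefix(model) : undefined;
const providerSmallModelId = smallModel ? stripProviderPrefix(smallModel) : undefined;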
const modelsToRegister = [providerModelId, providerSmallModelId]
.filter(Boolean)
.filter((mid, i, a) => a.indexOf(mid as string) === i);
const providerOptions: Record<string, unknown> =
provider === 'anthropic'
? {}
: {
[provider]: {
options: { apiKey: 'placeholder', baseURL: proxyUrl },
...(modelsToRegister.length > 0
? {
models: Object.fromEntries(
modelsToRegister.map((mid) => [mid, { id: mid, name: mid, tool_call: true }]),
),
}
: {}),
},
};
const mcp = mcpServersToOpenCodeConfig(options.mcpServers);
// Load shared base + per-group fragments + per-group memory through OpenCode's
// native instructions pipeline (session/instruction.ts). Absolute paths with
// globs are supported. Files are read raw — `@./...` includes are NOT expanded
// by OpenCode, so point at the concrete files, not at composed CLAUDE.md.
const instructions = [
'/app/CLAUDE.md',
'/workspace/agent/.claude-fragments/*.md',
'/workspace/agent/CLAUDE.local.md',
];
return {
...(model ? { model } : {}),
...(smallModel ? { small_model: smallModel } : {}),
enabled_providers: [provider],
permission: 'allow',
autoupdate: false,
snapshot: false,
provider: providerOptions,
instructions,
mcp,
};
}
type SharedRuntime = {
proc: ChildProcess;
client: OpencodeClient;
stream: AsyncGenerator<{ type: string; properties: Record<string, unknown> }, void, void>;
streamRelease: () => void;
};
let sharedRuntime: SharedRuntime | null = null;
let sharedConfigKey: string | null = null;
let sharedInit: Promise<SharedRuntime> | null = null;
function runtimeConfigKey(options: ProviderOptions): string {
return JSON.stringify({
mcp: mcpServersToOpenCodeConfig(options.mcpServers),
model: process.env.OPENCODE_MODEL,
small: process.env.OPENCODE_SMALL_MODEL,
op: process.env.OPENCODE_PROVIDER,
});
}
async function ensureSharedRuntime(options: ProviderOptions): Promise<SharedRuntime> {
const key = runtimeConfigKey(options);
if (sharedRuntime && sharedConfigKey === key) return sharedRuntime;
if (sharedInit) return sharedInit;
sharedInit = (async () => {
try {
if (sharedRuntime) {
destroySharedRuntime();
}
const config = buildOpenCodeConfig(options);
const { url, proc } = await spawnOpencodeServer(config);
const client = createOpencodeClient({ baseUrl: url });
const sub = await client.event.subscribe();
const stream = sub.stream as AsyncGenerator<{ type: string; properties: Record<string, unknown> }, void, void>;
sharedRuntime = {
proc,
client,
stream,
streamRelease: () => {
void stream.return?.(undefined);
},
};
sharedConfigKey = key;
return sharedRuntime;
} finally {
// Clear the in-flight slot even when startup fails, so a later call can retry
// instead of receiving the same rejected promise.
sharedInit = null;
}
})();
return sharedInit;
}
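`ensureSharedRuntime` follows a single-flight pattern: concurrent callers share one in-flight initialization instead of each spawning a server. The sketch below shows the pattern in isolation, with one possible variation that clears the slot in a `finally` block so a failed start can be retried. `Runtime`, `startRuntime` and `ensureRuntime` are hypothetical stand-ins, not names from the diff.

```typescript
type Runtime = { id: number };

let initCalls = 0;
let inFlight: Promise<Runtime> | null = null;
let ready: Runtime | null = null;

// Hypothetical stand-in for spawning the server and subscribing to events.
async function startRuntime(): Promise<Runtime> {
  initCalls += 1;
  return { id: initCalls };
}

// Deliberately not an async function: returning the stored promise itself lets
// concurrent callers share the exact same object.
function ensureRuntime(): Promise<Runtime> {
  if (ready) return Promise.resolve(ready);
  if (inFlight) return inFlight;
  inFlight = (async () => {
    try {
      const rt = await startRuntime();
      ready = rt;
      return rt;
    } finally {
      // Runs on success and on failure, so the slot never holds a settled promise.
      inFlight = null;
    }
  })();
  return inFlight;
}

const first = ensureRuntime();
const second = ensureRuntime();
```

Both calls return the same promise and `startRuntime` runs once. The sketch leaves out the config-key comparison that the real function uses to decide when to tear down and respawn.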
export function destroySharedRuntime(): void {
if (sharedRuntime) {
try {
sharedRuntime.streamRelease();
} catch {
/* ignore */
}
killProcessTree(sharedRuntime.proc);
sharedRuntime = null;
sharedConfigKey = null;
}
sharedInit = null;
}
function sessionErrorMessage(props: { error?: unknown }): string {
const err = props.error as { data?: { message?: string } } | undefined;
if (err && typeof err === 'object' && err.data && typeof err.data.message === 'string') {
return err.data.message;
}
return JSON.stringify(props.error) || 'OpenCode session error';
}
export class OpenCodeProvider implements AgentProvider {
readonly supportsNativeSlashCommands = false;
private readonly options: ProviderOptions;
private activeSessionId: string | undefined;
constructor(options: ProviderOptions = {}) {
this.options = options;
}
isSessionInvalid(err: unknown): boolean {
const msg = err instanceof Error ? err.message : String(err);
return STALE_SESSION_RE.test(msg);
}
query(input: QueryInput): AgentQuery {
if (input.continuation) {
this.activeSessionId = input.continuation;
} else {
this.activeSessionId = undefined;
}
const pending: string[] = [];
let waiting: (() => void) | null = null;
let ended = false;
let aborted = false;
const systemInstructions = input.systemContext?.instructions;
pending.push(wrapPromptWithContext(input.prompt, systemInstructions));
const kick = (): void => {
waiting?.();
};
const self = this;
const IDLE_TIMEOUT_MS = Number(process.env.OPENCODE_IDLE_TIMEOUT_MS) || 300_000;
async function* gen(): AsyncGenerator<ProviderEvent> {
let initYielded = false;
const rt = await ensureSharedRuntime(self.options);
const { client, stream } = rt;
while (!aborted) {
while (pending.length === 0 && !ended && !aborted) {
await new Promise<void>((resolve) => {
waiting = resolve;
});
waiting = null;
}
if (aborted) return;
if (pending.length === 0 && ended) return;
const text = pending.shift()!;
let sessionId = self.activeSessionId;
if (!sessionId) {
const created = await client.session.create();
if (created.error) {
throw new Error(`OpenCode: failed to create session: ${JSON.stringify(created.error)}`);
}
sessionId = created.data?.id;
if (!sessionId) throw new Error('OpenCode: failed to create session (no id)');
self.activeSessionId = sessionId;
}
if (!initYielded) {
yield { type: 'init', continuation: sessionId };
initYielded = true;
}
const promptRes = await client.session.promptAsync({
path: { id: sessionId },
body: { parts: [{ type: 'text', text }] },
});
if (promptRes.error) {
self.activeSessionId = undefined;
throw new Error(`OpenCode promptAsync: ${JSON.stringify(promptRes.error)}`);
}
const partTextByMessageId = new Map<string, string>();
const roleByMessageId = new Map<string, string>();
let lastEventAt = Date.now();
let eventTimedOut = false;
const timeoutCheck = setInterval(() => {
if (Date.now() - lastEventAt > IDLE_TIMEOUT_MS) {
log(`OpenCode event timeout (${IDLE_TIMEOUT_MS}ms) — clearing session ${sessionId}`);
eventTimedOut = true;
self.activeSessionId = undefined;
destroySharedRuntime();
kick();
}
}, 5000);
try {
turn: while (true) {
if (aborted) return;
if (eventTimedOut) {
throw new Error(`OpenCode event timeout (${IDLE_TIMEOUT_MS}ms)`);
}
const { value: ev, done } = await stream.next();
if (done) {
throw new Error('OpenCode SSE stream ended unexpectedly');
}
if (!ev?.type || ev.type === 'server.connected' || ev.type === 'server.heartbeat') continue;
lastEventAt = Date.now();
yield { type: 'activity' };
switch (ev.type) {
case 'message.updated': {
const info = ev.properties.info as { id?: string; role?: string } | undefined;
if (info?.id && info?.role) {
roleByMessageId.set(info.id, info.role);
}
break;
}
case 'message.part.updated': {
const part = ev.properties.part as { type?: string; messageID?: string; text?: string } | undefined;
if (part?.type === 'text' && part.messageID && part.text) {
partTextByMessageId.set(part.messageID, part.text);
}
break;
}
case 'permission.updated': {
const perm = ev.properties as { id?: string; sessionID?: string };
if (perm.sessionID === sessionId && perm.id) {
try {
await client.postSessionIdPermissionsPermissionId({
path: { id: sessionId, permissionID: perm.id },
body: { response: 'always' },
});
} catch (err) {
log(`Failed to auto-reply permission: ${err instanceof Error ? err.message : String(err)}`);
}
}
break;
}
case 'session.status': {
const props = ev.properties as {
sessionID?: string;
status?: { type?: string; attempt?: number; message?: string };
};
if (props.sessionID !== sessionId) break;
const st = props.status;
if (
st?.type === 'retry' &&
typeof st.attempt === 'number' &&
st.attempt >= SESSION_STATUS_RETRY_ERROR_AFTER &&
st.message
) {
self.activeSessionId = undefined;
throw new Error(`OpenCode retry limit (${st.attempt}): ${st.message}`);
}
break;
}
case 'session.error': {
const props = ev.properties as { sessionID?: string; error?: unknown };
if (props.sessionID === sessionId || props.sessionID === undefined) {
self.activeSessionId = undefined;
throw new Error(sessionErrorMessage(props));
}
break;
}
case 'session.idle': {
const sid = (ev.properties as { sessionID?: string }).sessionID;
if (sid === sessionId) {
break turn;
}
break;
}
default:
break;
}
}
} finally {
clearInterval(timeoutCheck);
}
let resultText = '';
for (const [msgId, role] of roleByMessageId) {
if (role === 'assistant') {
resultText = partTextByMessageId.get(msgId) ?? resultText;
}
}
yield { type: 'result', text: resultText || null };
}
}
return {
push: (message: string) => {
pending.push(wrapPromptWithContext(message, systemInstructions));
kick();
},
end: () => {
ended = true;
kick();
},
events: gen(),
abort: () => {
aborted = true;
this.activeSessionId = undefined;
kick();
destroySharedRuntime();
},
};
}
}
registerProvider('opencode', (opts) => new OpenCodeProvider(opts));
@@ -14,33 +14,6 @@ export interface AgentProvider {
* (missing transcript, unknown session, etc.) and should be cleared.
*/
isSessionInvalid(err: unknown): boolean;
/**
* Provider-specific (test, replace) pairs applied to result text and
* error text before delivery to the user. Iterated in declaration
* order; the first rule whose `test` matches wins, and its `replace`
* is sent verbatim. If no rule matches, the original text passes
* through unchanged — there is no fallback.
*
* Use this to swap raw SDK/CLI banners that the user can't act on
* (e.g. "Please run /login" — they're not on the host) for actionable
* messages naming the operator's actual remediation path.
*/
errorSubstitutions?: readonly ErrorSubstitution[];
}
/**
* A single rule for swapping raw provider output with a user-facing
* message. Each rule is a `(test, replace)` pair plus a short `name`
* used only for logging when the rule fires.
*/
export interface ErrorSubstitution {
/** Short identifier for logs — e.g. "auth-required", "rate-limited". */
name: string;
/** Regex tested against the error/result text. First match wins. */
test: RegExp;
/** User-facing replacement when `test` matches. */
replace: string;
}
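The doc comment on `errorSubstitutions` describes a first-match-wins rule with no fallback. A minimal sketch of how a caller might apply it is shown here; `applySubstitutions` and the two sample rules are hypothetical, since the diff shows only the interface and not the code that consumes it.

```typescript
interface ErrorSubstitution {
  name: string;
  test: RegExp;
  replace: string;
}

// Hypothetical helper name; the diff only shows the interface, not the caller.
function applySubstitutions(text: string, rules: readonly ErrorSubstitution[]): string {
  for (const rule of rules) {
    // Declaration order matters: the first matching rule wins.
    if (rule.test.test(text)) return rule.replace;
  }
  // No rule matched: pass the original text through unchanged.
  return text;
}

// Sample rules for illustration only. Avoid the `g` flag here, because a global
// regex keeps lastIndex state between test() calls.
const rules: readonly ErrorSubstitution[] = [
  { name: 'auth-required', test: /please run \/login/i, replace: 'Ask the operator to re-authenticate the assistant.' },
  { name: 'rate-limited', test: /rate limit/i, replace: 'The provider is rate limiting requests. Try again shortly.' },
];

const swapped = applySubstitutions('Please run /login', rules);
const untouched = applySubstitutions('All good', rules);
```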
/**
+1 -1
@@ -1,6 +1,6 @@
{
"name": "nanoclaw",
"version": "2.0.16",
"version": "2.0.14",
"description": "Personal Claude assistant. Lightweight, secure, customizable.",
"type": "module",
"packageManager": "pnpm@10.33.0",
+4 -4
@@ -1,5 +1,5 @@
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="90" height="20" role="img" aria-label="134k tokens, 67% of context window">
<title>134k tokens, 67% of context window</title>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="90" height="20" role="img" aria-label="133k tokens, 66% of context window">
<title>133k tokens, 66% of context window</title>
<linearGradient id="s" x2="0" y2="100%">
<stop offset="0" stop-color="#bbb" stop-opacity=".1"/>
<stop offset="1" stop-opacity=".1"/>
@@ -15,8 +15,8 @@
<g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" font-size="11">
<text aria-hidden="true" x="26" y="15" fill="#010101" fill-opacity=".3">tokens</text>
<text x="26" y="14">tokens</text>
<text aria-hidden="true" x="71" y="15" fill="#010101" fill-opacity=".3">134k</text>
<text x="71" y="14">134k</text>
<text aria-hidden="true" x="71" y="15" fill="#010101" fill-opacity=".3">133k</text>
<text x="71" y="14">133k</text>
</g>
</g>
</a>


+39 -139
@@ -46,14 +46,13 @@ import {
} from './lib/setup-config-parse.js';
import { runAdvancedScreen } from './lib/setup-config-screen.js';
import { runWindowedStep } from './lib/windowed-runner.js';
import { detectRegisteredGroups, detectExistingDisplayName } from './environment.js';
import { pollHealth } from './onecli.js';
import { getLaunchdLabel, getSystemdUnit } from '../src/install-slug.js';
import { claudeCliAvailable, resolveTimezoneViaClaude } from './lib/tz-from-claude.js';
import * as setupLog from './logs.js';
import { ensureAnswer, fail, runQuietChild, runQuietStep } from './lib/runner.js';
import { emit as phEmit } from './lib/diagnostics.js';
import { accentGreen, brandBody, brandBold, brandChip, dimWrap, fitToWidth, note, wrapForGutter } from './lib/theme.js';
import { brandBold, brandChip, dimWrap, fitToWidth, wrapForGutter } from './lib/theme.js';
import { isValidTimezone } from '../src/timezone.js';
const CLI_AGENT_NAME = 'Terminal Agent';
@@ -122,47 +121,12 @@ async function main(): Promise<void> {
}
}
// Detect existing .env and offer to reuse it so the user doesn't have to
// paste credentials again on a re-run.
const existingEnv = detectExistingEnv();
if (existingEnv) {
const lines = Object.values(existingEnv.groups).map(
(g) => ` ${k.green('✓')} ${g.label}`,
);
note(lines.join('\n'), 'Found existing configuration');
const reuseChoice = ensureAnswer(
await brightSelect({
message: 'Use this existing environment?',
options: [
{ value: 'reuse', label: 'Yes, use what I already have', hint: 'recommended' },
{ value: 'fresh', label: 'No, start fresh' },
],
initialValue: 'reuse',
}),
) as 'reuse' | 'fresh';
setupLog.userInput('existing_env_choice', reuseChoice);
if (reuseChoice === 'reuse') {
for (const [key, value] of Object.entries(existingEnv.raw)) {
if (!process.env[key]) process.env[key] = value;
}
if (existingEnv.groups.onecli) skip.add('onecli');
if (detectRegisteredGroups(process.cwd())) {
skip.add('cli-agent');
skip.add('first-chat');
}
}
}
if (!skip.has('container')) {
p.log.message(brandBody(dimWrap('Your assistant lives in its own sandbox. It can only see what you explicitly share.', 4)));
p.log.message(dimWrap('Your assistant lives in its own sandbox. It can only see what you explicitly share.', 4));
p.log.message(
brandBody(
dimWrap(
'The first build pulls a base image and installs a few tools. On a fresh machine this usually takes 3–10 minutes.',
4,
),
dimWrap(
'The first build pulls a base image and installs a few tools. On a fresh machine this usually takes 3–10 minutes.',
4,
),
);
const res = await runWindowedStep('container', {
@@ -197,11 +161,9 @@ async function main(): Promise<void> {
if (!skip.has('onecli')) {
p.log.message(
brandBody(
dimWrap(
'Your assistant never gets your API keys directly. The vault adds them to approved requests as they leave the sandbox.',
4,
),
dimWrap(
'Your assistant never gets your API keys directly. The vault adds them to approved requests as they leave the sandbox.',
4,
),
);
@@ -325,27 +287,22 @@ async function main(): Promise<void> {
await fail('service', "Couldn't start NanoClaw.", 'See logs/nanoclaw.error.log for details.');
}
if (res.terminal?.fields.DOCKER_GROUP_STALE === 'true') {
p.log.warn(brandBody("NanoClaw's permissions need a tweak before it can reach Docker."));
p.log.warn("NanoClaw's permissions need a tweak before it can reach Docker.");
p.log.message(
brandBody(
' sudo setfacl -m u:$(whoami):rw /var/run/docker.sock\n' + ` systemctl --user restart ${getSystemdUnit()}`,
),
' sudo setfacl -m u:$(whoami):rw /var/run/docker.sock\n' + ` systemctl --user restart ${getSystemdUnit()}`,
);
}
}
let displayName: string | undefined;
async function resolveDisplayName(): Promise<string> {
if (displayName) return displayName;
const preset = process.env.NANOCLAW_DISPLAY_NAME?.trim();
const existing = detectExistingDisplayName(process.cwd());
const needsDisplayName = !skip.has('cli-agent') || !skip.has('channel');
if (needsDisplayName) {
const fallback = process.env.USER?.trim() || 'Operator';
displayName = preset || existing || (await askDisplayName(fallback));
return displayName;
const preset = process.env.NANOCLAW_DISPLAY_NAME?.trim();
displayName = preset || (await askDisplayName(fallback));
}
if (!skip.has('cli-agent')) {
await resolveDisplayName();
const res = await runQuietStep(
'cli-agent',
{
@@ -363,18 +320,16 @@ async function main(): Promise<void> {
}
if (!skip.has('first-chat')) {
p.log.message(
brandBody(
dimWrap(
"Your assistant runs in an isolated sandbox. I'm going to send it a quick test message (ping) and wait for a reply (pong) to confirm it's responding. First startup typically takes 30–60 seconds while the sandbox warms up.",
4,
),
dimWrap(
"Your assistant runs in an isolated sandbox. I'm going to send it a quick test message (ping) and wait for a reply (pong) to confirm it's responding. First startup typically takes 30–60 seconds while the sandbox warms up.",
4,
),
);
const ping = await confirmAssistantResponds();
if (ping === 'ok') {
phEmit('first_chat_ready');
const next = ensureAnswer(
await brightSelect<'continue' | 'chat'>({
await p.select({
message: 'What next?',
options: [
{
@@ -416,9 +371,6 @@ async function main(): Promise<void> {
let channelChoice: ChannelChoice = 'skip';
if (!skip.has('channel')) {
channelChoice = await askChannelChoice();
if (channelChoice !== 'skip') {
await resolveDisplayName();
}
if (channelChoice === 'telegram') {
await runTelegramChannel(displayName!);
} else if (channelChoice === 'discord') {
@@ -435,11 +387,9 @@ async function main(): Promise<void> {
await runIMessageChannel(displayName!);
} else {
p.log.info(
brandBody(
wrapForGutter(
'No messaging app for now. You can add one later (like Telegram, Discord, WhatsApp, Teams, Slack, or iMessage).',
4,
),
wrapForGutter(
'No messaging app for now. You can add one later (like Telegram, Discord, WhatsApp, Teams, Slack, or iMessage).',
4,
),
);
}
@@ -485,7 +435,7 @@ async function main(): Promise<void> {
);
}
if (notes.length > 0) {
note(notes.join('\n'), "What's left");
p.note(notes.join('\n'), "What's left");
}
// "What's left" is a soft failure — we don't abort like fail(), but the
// user is still stuck and a fix is exactly what claude-assist is for.
@@ -517,11 +467,11 @@ async function main(): Promise<void> {
];
const labelWidth = Math.max(...rows.map(([l]) => l.length));
const nextSteps = rows.map(([l, c]) => `${k.cyan(l.padEnd(labelWidth))} ${c}`).join('\n');
note(nextSteps, 'Try these');
p.note(nextSteps, 'Try these');
// Always-on warning goes before the "check your DMs" directive so the
// caveat doesn't land after the user's already looked away at their phone.
note(
p.note(
wrapForGutter(
"NanoClaw runs on this machine. It's only reachable while this computer is on and connected to the internet. For always-on availability, run it on a cloud VM — or keep this machine awake.",
6,
@@ -538,7 +488,7 @@ async function main(): Promise<void> {
// that the welcome-message signal was too easy to miss. Use p.note so it
// renders with a visible box, cyan-bold the directive line, and put it
// as the last thing before outro.
note(`${brandBold('→')} ${k.bold(`Check your ${dmTarget} — your assistant is saying hi.`)}`, 'Go say hi');
p.note(`${brandBold('→')} ${k.bold(`Check your ${dmTarget} — your assistant is saying hi.`)}`, 'Go say hi');
p.outro(k.green("You're set."));
} else {
p.outro(k.green("You're ready! Chat with `pnpm run chat hi`."));
@@ -560,7 +510,10 @@ function channelDmLabel(choice: ChannelChoice): string | null {
case 'imessage':
return 'iMessage';
case 'slack':
return 'Slack DMs';
// Slack install doesn't wire an agent or send a welcome DM — the
// driver prints its own "finish in your Slack app" note. Falling
// through to null avoids a misleading "check your Slack DMs" banner.
return null;
default:
return null;
}
@@ -617,7 +570,7 @@ function renderPingFailureNote(result: PingResult): void {
'No reply from your assistant within 30 seconds. Check `logs/nanoclaw.log` for clues, then try `pnpm run chat hi`.',
6,
);
note(body, 'Skipping the first chat');
p.note(body, 'Skipping the first chat');
}
/**
@@ -632,7 +585,7 @@ function renderPingFailureNote(result: PingResult): void {
* clearly optional.
*/
async function runFirstChat(): Promise<void> {
note(
p.note(
wrapForGutter(
[
'Your assistant runs in a sandbox on this machine.',
@@ -679,7 +632,7 @@ function sendChatMessage(message: string): Promise<void> {
async function runAuthStep(): Promise<void> {
if (anthropicSecretExists()) {
p.log.success(brandBody('Your Claude account is already connected.'));
p.log.success('Your Claude account is already connected.');
setupLog.step('auth', 'skipped', 0, { REASON: 'secret-already-present' });
return;
}
@@ -727,7 +680,7 @@ async function runAuthStep(): Promise<void> {
}
async function runSubscriptionAuth(): Promise<void> {
p.log.step(brandBody('Opening the Claude sign-in flow…'));
p.log.step('Opening the Claude sign-in flow…');
console.log(k.dim(' (a browser will open for sign-in; this part is interactive)'));
console.log();
const start = Date.now();
@@ -746,7 +699,7 @@ async function runSubscriptionAuth(): Promise<void> {
);
}
setupLog.step('auth', 'interactive', durationMs, { METHOD: 'subscription' });
p.log.success(brandBody('Claude account connected.'));
p.log.success('Claude account connected.');
}
async function runPasteAuth(method: 'oauth' | 'api'): Promise<void> {
@@ -756,7 +709,6 @@ async function runPasteAuth(method: 'oauth' | 'api'): Promise<void> {
const answer = ensureAnswer(
await p.password({
message: `Paste your ${label}`,
clearOnError: true,
validate: (v) => {
if (!v || !v.trim()) return 'Required';
if (!v.trim().startsWith(prefix)) {
@@ -970,11 +922,9 @@ async function runTimezoneStep(): Promise<void> {
tz = await resolveTimezoneViaClaude(raw);
} else {
p.log.warn(
brandBody(
wrapForGutter(
"That's not a standard IANA zone and I can't call Claude to interpret it here — try again with a zone like `America/New_York` or `Europe/London`.",
4,
),
wrapForGutter(
"That's not a standard IANA zone and I can't call Claude to interpret it here — try again with a zone like `America/New_York` or `Europe/London`.",
4,
),
);
}
@@ -1017,7 +967,7 @@ async function runTimezoneStep(): Promise<void> {
async function askDisplayName(fallback: string): Promise<string> {
const answer = ensureAnswer(
await p.text({
message: `What should your assistant call ${accentGreen('you')}?`,
message: 'What should your assistant call you?',
placeholder: fallback,
defaultValue: fallback,
}),
@@ -1063,56 +1013,6 @@ async function askChannelChoice(): Promise<ChannelChoice> {
// ─── interactive / env helpers ─────────────────────────────────────────
interface ExistingEnvGroup {
label: string;
keys: string[];
}
const ENV_KEY_GROUPS: Record<string, { label: string; keys: string[] }> = {
onecli: { label: 'OneCLI', keys: ['ONECLI_URL'] },
telegram: { label: 'Telegram', keys: ['TELEGRAM_BOT_TOKEN'] },
discord: { label: 'Discord', keys: ['DISCORD_BOT_TOKEN', 'DISCORD_APPLICATION_ID', 'DISCORD_PUBLIC_KEY'] },
slack: { label: 'Slack', keys: ['SLACK_BOT_TOKEN', 'SLACK_SIGNING_SECRET'] },
signal: { label: 'Signal', keys: ['SIGNAL_ACCOUNT'] },
teams: { label: 'Teams', keys: ['TEAMS_APP_ID', 'TEAMS_APP_PASSWORD', 'TEAMS_APP_TENANT_ID', 'TEAMS_APP_TYPE'] },
whatsapp: { label: 'WhatsApp', keys: ['ASSISTANT_HAS_OWN_NUMBER'] },
imessage: { label: 'iMessage', keys: ['IMESSAGE_LOCAL', 'IMESSAGE_ENABLED', 'IMESSAGE_SERVER_URL', 'IMESSAGE_API_KEY'] },
};
function detectExistingEnv(): { groups: Record<string, ExistingEnvGroup>; raw: Record<string, string> } | null {
const envPath = path.join(process.cwd(), '.env');
if (!fs.existsSync(envPath)) return null;
let content: string;
try {
content = fs.readFileSync(envPath, 'utf-8');
} catch {
return null;
}
const raw: Record<string, string> = {};
for (const line of content.split('\n')) {
const trimmed = line.trim();
if (!trimmed || trimmed.startsWith('#')) continue;
const eq = trimmed.indexOf('=');
if (eq < 1) continue;
raw[trimmed.slice(0, eq)] = trimmed.slice(eq + 1);
}
if (Object.keys(raw).length === 0) return null;
const groups: Record<string, ExistingEnvGroup> = {};
for (const [id, def] of Object.entries(ENV_KEY_GROUPS)) {
const found = def.keys.filter((key) => raw[key] !== undefined);
if (found.length > 0) {
groups[id] = { label: def.label, keys: found };
}
}
if (Object.keys(groups).length === 0) return null;
return { groups, raw };
}
function anthropicSecretExists(): boolean {
try {
const res = spawnSync('onecli', ['secrets', 'list'], {
@@ -1189,7 +1089,7 @@ function maybeReexecUnderSg(): void {
if (!/permission denied/i.test(err)) return;
if (spawnSync('which', ['sg'], { stdio: 'ignore' }).status !== 0) return;
p.log.warn(brandBody('Docker socket not accessible in current group. Re-executing under `sg docker`.'));
p.log.warn('Docker socket not accessible in current group. Re-executing under `sg docker`.');
const res = spawnSync('sg', ['docker', '-c', 'pnpm run setup:auto'], {
stdio: 'inherit',
env: { ...process.env, NANOCLAW_REEXEC_SG: '1' },
+7 -21
@@ -31,7 +31,6 @@ import { brightSelect } from '../lib/bright-select.js';
import { confirmThenOpen } from '../lib/browser.js';
import { askOperatorRole } from '../lib/role-prompt.js';
import { ensureAnswer, fail, runQuietChild } from '../lib/runner.js';
import { accentGreen, brandBody, note } from '../lib/theme.js';
const DEFAULT_AGENT_NAME = 'Nano';
const DISCORD_API = 'https://discord.com/api/v10';
@@ -156,7 +155,7 @@ async function askHasBotToken(): Promise<boolean> {
async function walkThroughBotCreation(): Promise<void> {
const url = 'https://discord.com/developers/applications';
note(
p.note(
[
"You'll create a Discord bot in the Developer Portal. It's free and takes about a minute.",
'',
@@ -185,7 +184,7 @@ function showTokenLocationReminder(hasExistingBot: boolean): void {
// to find it — tokens in the Dev Portal aren't visible after first reveal,
// and "Reset Token" issues a new one.
if (hasExistingBot) {
note(
p.note(
[
"Where to find your bot token:",
'',
@@ -217,7 +216,7 @@ async function walkThroughServerCreation(): Promise<void> {
// the web client and rely on the + button being visible. The steps below
// are the same whether they're in the desktop app or the browser.
const url = 'https://discord.com/channels/@me';
note(
p.note(
[
"A Discord server is just a private space for you and the bot. Free and takes 30 seconds.",
'',
@@ -240,22 +239,9 @@ async function walkThroughServerCreation(): Promise<void> {
}
async function collectDiscordToken(): Promise<string> {
const existing = process.env.DISCORD_BOT_TOKEN?.trim();
if (existing && /^[A-Za-z0-9._-]{50,}$/.test(existing)) {
const reuse = ensureAnswer(await p.confirm({
message: `Found an existing Discord bot token (${existing.slice(0, 10)}…). Use it?`,
initialValue: true,
}));
if (reuse) {
setupLog.userInput('discord_token', 'reused-existing');
return existing;
}
}
const answer = ensureAnswer(
await p.password({
message: 'Paste your bot token',
clearOnError: true,
validate: (v) => {
const t = (v ?? '').trim();
if (!t) return 'Token is required';
@@ -399,14 +385,14 @@ async function resolveOwnerUserId(
}
} else {
p.log.info(
brandBody("Your bot is owned by a Developer Team, so we need your Discord user ID directly."),
"Your bot is owned by a Developer Team, so we need your Discord user ID directly.",
);
}
return await promptForUserIdWithDevMode();
}
async function promptForUserIdWithDevMode(): Promise<string> {
note(
p.note(
[
"To get your Discord user ID:",
'',
@@ -444,7 +430,7 @@ async function promptInviteBot(
`&scope=bot` +
`&permissions=${INVITE_PERMISSIONS}`;
note(
p.note(
[
`@${botUsername} needs to share a server with you before it can DM you.`,
'',
@@ -520,7 +506,7 @@ async function resolveAgentName(): Promise<string> {
}
const answer = ensureAnswer(
await p.text({
message: `What should your ${accentGreen('assistant')} be called?`,
message: 'What should your assistant be called?',
placeholder: DEFAULT_AGENT_NAME,
defaultValue: DEFAULT_AGENT_NAME,
}),
+5 -19
@@ -36,7 +36,7 @@ import * as setupLog from '../logs.js';
import { brightSelect } from '../lib/bright-select.js';
import { askOperatorRole } from '../lib/role-prompt.js';
import { ensureAnswer, fail, runQuietChild } from '../lib/runner.js';
import { accentGreen, note, wrapForGutter } from '../lib/theme.js';
import { wrapForGutter } from '../lib/theme.js';
const DEFAULT_AGENT_NAME = 'Nano';
@@ -189,7 +189,7 @@ async function walkThroughFullDiskAccess(): Promise<void> {
}
const nodeDir = path.dirname(nodePath);
note(
p.note(
wrapForGutter(
[
`iMessage needs Full Disk Access granted to the Node binary:`,
@@ -222,20 +222,7 @@ async function walkThroughFullDiskAccess(): Promise<void> {
}
async function collectRemoteCreds(): Promise<RemoteCreds> {
const existingUrl = process.env.IMESSAGE_SERVER_URL?.trim();
const existingKey = process.env.IMESSAGE_API_KEY?.trim();
if (existingUrl && existingKey && /^https?:\/\//i.test(existingUrl)) {
const reuse = ensureAnswer(await p.confirm({
message: `Found existing Photon credentials (${existingUrl}). Use them?`,
initialValue: true,
}));
if (reuse) {
setupLog.userInput('imessage_remote_creds', 'reused-existing');
return { serverUrl: existingUrl, apiKey: existingKey };
}
}
note(
p.note(
[
"Photon is a separate service that owns an iMessage account and",
"exposes it over HTTP. NanoClaw will talk to it via its API.",
@@ -263,7 +250,6 @@ async function collectRemoteCreds(): Promise<RemoteCreds> {
const keyAnswer = ensureAnswer(
await p.password({
message: 'Photon API key',
clearOnError: true,
validate: (v) => ((v ?? '').trim() ? undefined : 'API key is required'),
}),
);
@@ -278,7 +264,7 @@ async function collectRemoteCreds(): Promise<RemoteCreds> {
}
async function askOperatorHandle(): Promise<string> {
note(
p.note(
[
"What phone number or email do you iMessage with?",
"That's where your assistant will send its welcome message.",
@@ -317,7 +303,7 @@ async function resolveAgentName(): Promise<string> {
}
const answer = ensureAnswer(
await p.text({
message: `What should your ${accentGreen('assistant')} be called?`,
message: 'What should your assistant be called?',
placeholder: DEFAULT_AGENT_NAME,
defaultValue: DEFAULT_AGENT_NAME,
}),
+3 -4
@@ -44,7 +44,6 @@ import {
writeStepEntry,
} from '../lib/runner.js';
import { askOperatorRole } from '../lib/role-prompt.js';
import { accentGreen, note } from '../lib/theme.js';
const DEFAULT_AGENT_NAME = 'Nano';
@@ -140,7 +139,7 @@ async function ensureSignalCli(): Promise<void> {
if (!probe.error && probe.status === 0) return;
if (process.platform === 'darwin') {
note(
p.note(
[
"NanoClaw talks to Signal through signal-cli, which isn't installed yet.",
'',
@@ -153,7 +152,7 @@ async function ensureSignalCli(): Promise<void> {
'signal-cli not found',
);
} else {
note(
p.note(
[
"NanoClaw talks to Signal through signal-cli, which isn't installed yet.",
'',
@@ -347,7 +346,7 @@ async function resolveAgentName(): Promise<string> {
}
const answer = ensureAnswer(
await p.text({
message: `What should your ${accentGreen('assistant')} be called?`,
message: 'What should your assistant be called?',
placeholder: DEFAULT_AGENT_NAME,
defaultValue: DEFAULT_AGENT_NAME,
}),
+28 -203
@@ -1,23 +1,24 @@
/**
* Slack channel flow for setup:auto.
*
* `runSlackChannel(displayName)` owns the full branch from creating a
* Slack app through the welcome DM:
* `runSlackChannel(displayName)` walks the operator from a bare Slack
* workspace through a running bot, then stops before wiring an agent:
*
* 1. Walk through creating a Slack app (api.slack.com/apps) — scopes,
* event subscriptions, and signing secret
* 2. Paste the bot token + signing secret (clack password prompts)
* 3. Validate via auth.test → resolves workspace + bot identity
* 4. Install the adapter (setup/add-slack.sh, non-interactive)
* 5. Ask for the operator's Slack user ID
* 6. conversations.open to get the DM channel ID
* 7. Ask for the messaging-agent name (defaulting to "Nano")
* 8. Wire the agent via scripts/init-first-agent.ts
* 5. Print the post-install checklist: set the public webhook URL in
* Slack's Event Subscriptions, DM the bot to bootstrap the channel,
* then `/manage-channels` to wire an agent.
*
* The welcome DM is sent via outbound delivery (chat.postMessage), which
* works without Event Subscriptions being configured. The user sees the
* greeting in Slack immediately; inbound replies require webhooks, so the
* post-install note covers that.
* Why no welcome DM here: unlike Discord/Telegram (gateway / long-poll),
* Slack needs a public Event Subscriptions URL for inbound events, and
* opening an unsolicited DM would need `im:write` scope we don't force
* the SKILL.md to require. Shipping an honest "here's what's left" note
* is better than a welcome DM the user won't receive until they
* configure the webhook anyway.
*
* All output obeys the three-level contract. See docs/setup-flow.md.
*/
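Step 3 of the flow described in the comment above validates the token with `auth.test`. The following is a minimal sketch of interpreting that response, assuming the documented `auth.test` fields (`ok`, `team`, `team_id`, `user`, `user_id`, `error`). The `WorkspaceSummary` shape and `interpretAuthTest` are hypothetical; the driver's own `WorkspaceInfo` interface is only partly visible in this diff.

```typescript
// Subset of the Slack auth.test response used here.
interface AuthTestResponse {
  ok: boolean;
  team?: string;
  team_id?: string;
  user?: string;
  user_id?: string;
  error?: string;
}

// Hypothetical shape; not the driver's actual WorkspaceInfo.
interface WorkspaceSummary {
  teamName: string;
  teamId: string;
  botName: string;
  botUserId: string;
}

type AuthTestResult =
  | { ok: true; info: WorkspaceSummary }
  | { ok: false; reason: string };

function interpretAuthTest(data: AuthTestResponse): AuthTestResult {
  // Slack reports failures in-band: HTTP 200 with ok=false and an error code.
  if (!data.ok) return { ok: false, reason: data.error ?? 'unknown_error' };
  if (!data.team || !data.team_id || !data.user || !data.user_id) {
    return { ok: false, reason: 'incomplete_response' };
  }
  return {
    ok: true,
    info: {
      teamName: data.team,
      teamId: data.team_id,
      botName: data.user,
      botUserId: data.user_id,
    },
  };
}

const okResult = interpretAuthTest({
  ok: true,
  team: 'Acme',
  team_id: 'T0000001',
  user: 'nano',
  user_id: 'U0000001',
});
const badResult = interpretAuthTest({ ok: false, error: 'invalid_auth' });
console.log(okResult, badResult);
```

Keeping the interpretation separate from the `fetch` call makes the success and failure branches testable without a network round trip.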
@@ -26,13 +27,11 @@ import k from 'kleur';
import * as setupLog from '../logs.js';
import { confirmThenOpen } from '../lib/browser.js';
import { askOperatorRole } from '../lib/role-prompt.js';
import { ensureAnswer, fail, runQuietChild } from '../lib/runner.js';
import { accentGreen, note, wrapForGutter } from '../lib/theme.js';
import { wrapForGutter } from '../lib/theme.js';
const SLACK_API = 'https://slack.com/api';
const SLACK_APPS_URL = 'https://api.slack.com/apps';
const DEFAULT_AGENT_NAME = 'Nano';
interface WorkspaceInfo {
teamName: string;
@@ -41,7 +40,10 @@ interface WorkspaceInfo {
botUserId: string;
}
export async function runSlackChannel(displayName: string): Promise<void> {
// displayName is reserved for when we start wiring the first agent here.
// Kept to match the `run<X>Channel(displayName)` signature every other
// channel driver uses, so auto.ts can dispatch without a branch.
export async function runSlackChannel(_displayName: string): Promise<void> {
await walkThroughAppCreation();
const token = await collectBotToken();
@@ -76,61 +78,19 @@ export async function runSlackChannel(displayName: string): Promise<void> {
);
}
const ownerUserId = await collectSlackUserId();
const dmChannelId = await openDmChannel(token, ownerUserId);
const platformId = `slack:${dmChannelId}`;
const role = await askOperatorRole('Slack');
setupLog.userInput('slack_role', role);
const agentName = await resolveAgentName();
const init = await runQuietChild(
'init-first-agent',
'pnpm',
[
'exec', 'tsx', 'scripts/init-first-agent.ts',
'--channel', 'slack',
'--user-id', `slack:${ownerUserId}`,
'--platform-id', platformId,
'--display-name', displayName,
'--agent-name', agentName,
'--role', role,
],
{
running: `Wiring ${agentName} to your Slack DMs…`,
done: 'Agent wired.',
},
{
extraFields: {
CHANNEL: 'slack',
AGENT_NAME: agentName,
PLATFORM_ID: platformId,
},
},
);
if (!init.ok) {
await fail(
'init-first-agent',
`Couldn't finish connecting ${agentName}.`,
'You can retry later with `/init-first-agent` in Claude Code.',
);
}
showPostInstallChecklist(info);
}
async function walkThroughAppCreation(): Promise<void> {
note(
p.note(
[
"You'll create a Slack app that the assistant talks through.",
"Free and stays inside the workspaces you pick.",
'',
' 1. Create a new app "From scratch", name it, pick a workspace',
' 2. OAuth & Permissions → add Bot Token Scopes:',
' chat:write, im:write, channels:history, groups:history,',
' im:history, channels:read, groups:read, users:read,',
' reactions:write',
' chat:write, channels:history, groups:history, im:history,',
' channels:read, groups:read, users:read, reactions:write',
' 3. App Home → enable "Messages Tab" and "Allow users to send',
' slash commands and messages from the messages tab"',
' 4. Basic Information → copy the "Signing Secret"',
@@ -151,22 +111,9 @@ async function walkThroughAppCreation(): Promise<void> {
}
async function collectBotToken(): Promise<string> {
const existing = process.env.SLACK_BOT_TOKEN?.trim();
if (existing && existing.startsWith('xoxb-') && existing.length >= 24) {
const reuse = ensureAnswer(await p.confirm({
message: `Found an existing Slack bot token (${existing.slice(0, 10)}…). Use it?`,
initialValue: true,
}));
if (reuse) {
setupLog.userInput('slack_bot_token', 'reused-existing');
return existing;
}
}
const answer = ensureAnswer(
await p.password({
message: 'Paste your Slack bot token',
clearOnError: true,
validate: (v) => {
const t = (v ?? '').trim();
if (!t) return 'Token is required';
@@ -185,22 +132,9 @@ async function collectBotToken(): Promise<string> {
}
async function collectSigningSecret(): Promise<string> {
const existing = process.env.SLACK_SIGNING_SECRET?.trim();
if (existing && /^[a-f0-9]{16,}$/i.test(existing)) {
const reuse = ensureAnswer(await p.confirm({
message: 'Found an existing Slack signing secret. Use it?',
initialValue: true,
}));
if (reuse) {
setupLog.userInput('slack_signing_secret', 'reused-existing');
return existing;
}
}
const answer = ensureAnswer(
await p.password({
message: 'Paste your Slack signing secret',
clearOnError: true,
validate: (v) => {
const t = (v ?? '').trim();
if (!t) return 'Signing secret is required';
@@ -287,135 +221,26 @@ async function validateSlackToken(token: string): Promise<WorkspaceInfo> {
}
}
async function collectSlackUserId(): Promise<string> {
note(
[
"To get your Slack member ID:",
'',
' 1. In Slack, click your profile picture (top right)',
' 2. Click "Profile"',
' 3. Click the three dots (⋯) → "Copy member ID"',
].join('\n'),
'Find your Slack user ID',
);
const answer = ensureAnswer(
await p.text({
message: 'Paste your Slack member ID',
validate: (v) => {
const t = (v ?? '').trim();
if (!t) return 'Member ID is required';
if (!/^U[A-Z0-9]{8,}$/.test(t)) {
return "That doesn't look like a Slack member ID (starts with U)";
}
return undefined;
},
}),
);
const id = (answer as string).trim();
setupLog.userInput('slack_user_id', id);
return id;
}
async function openDmChannel(token: string, userId: string): Promise<string> {
const s = p.spinner();
const start = Date.now();
s.start('Opening a DM channel…');
try {
const res = await fetch(`${SLACK_API}/conversations.open`, {
method: 'POST',
headers: {
Authorization: `Bearer ${token}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({ users: userId }),
});
const data = (await res.json()) as {
ok?: boolean;
channel?: { id?: string };
error?: string;
};
const elapsedS = Math.round((Date.now() - start) / 1000);
if (data.ok && data.channel?.id) {
s.stop(`DM channel ready. ${k.dim(`(${elapsedS}s)`)}`);
setupLog.step('slack-open-dm', 'success', Date.now() - start, {
DM_CHANNEL_ID: data.channel.id,
});
return data.channel.id;
}
const reason = data.error ?? `HTTP ${res.status}`;
s.stop(`Couldn't open a DM channel: ${reason}`, 1);
setupLog.step('slack-open-dm', 'failed', Date.now() - start, {
ERROR: reason,
});
if (reason === 'missing_scope') {
await fail(
'slack-open-dm',
"Your Slack app is missing the im:write scope.",
'Go to OAuth & Permissions in your Slack app settings, add the im:write scope, reinstall the app, then retry setup.',
);
}
await fail(
'slack-open-dm',
"Couldn't open a DM channel with you.",
`Slack said "${reason}". Check the member ID and app permissions, then retry.`,
);
} catch (err) {
const elapsedS = Math.round((Date.now() - start) / 1000);
s.stop(`Couldn't reach Slack. ${k.dim(`(${elapsedS}s)`)}`, 1);
const message = err instanceof Error ? err.message : String(err);
setupLog.step('slack-open-dm', 'failed', Date.now() - start, {
ERROR: message,
});
await fail(
'slack-open-dm',
"Couldn't reach Slack.",
'Check your internet connection and retry setup.',
);
}
}
async function resolveAgentName(): Promise<string> {
const preset = process.env.NANOCLAW_AGENT_NAME?.trim();
if (preset) {
setupLog.userInput('agent_name', preset);
return preset;
}
const answer = ensureAnswer(
await p.text({
message: `What should your ${accentGreen('assistant')} be called?`,
placeholder: DEFAULT_AGENT_NAME,
defaultValue: DEFAULT_AGENT_NAME,
}),
);
const value = (answer as string).trim() || DEFAULT_AGENT_NAME;
setupLog.userInput('agent_name', value);
return value;
}
function showPostInstallChecklist(info: WorkspaceInfo): void {
note(
p.note(
wrapForGutter(
[
`Your agent is wired to Slack and a welcome DM is on its way.`,
`To receive replies, Slack needs a public URL for delivering events:`,
`The Slack adapter is installed and your creds are saved. ${info.teamName} still needs two things before it can talk to you:`,
'',
' 1. Expose NanoClaw\'s webhook server (port 3000) via ngrok,',
' Cloudflare Tunnel, or a reverse proxy on a VPS.',
' 1. A public URL so Slack can deliver events.',
' NanoClaw serves a webhook on port 3000 by default — expose it',
' via ngrok, Cloudflare Tunnel, or a reverse proxy on a VPS.',
'',
' 2. In your Slack app → Event Subscriptions:',
' • Toggle "Enable Events" on',
` • Request URL: https://<your-public-host>/webhook/slack`,
' • Subscribe to bot events: message.channels, message.groups,',
' message.im, app_mention',
' • Save Changes',
' • Save, then reinstall the app when Slack prompts',
'',
' 3. In your Slack app → Interactivity & Shortcuts:',
' • Toggle "Interactivity" on',
` • Request URL: https://<your-public-host>/webhook/slack`,
' • Save Changes',
'',
' 4. Slack will prompt you to reinstall the app — do it to apply',
' the new settings',
` 3. DM @${info.botName} from Slack once — that bootstraps the`,
' messaging group. Then run `/manage-channels` in `claude` to',
' wire an agent to it.',
].join('\n'),
6,
),
+10 -34
@@ -40,7 +40,6 @@ import {
} from '../lib/claude-handoff.js';
import { ensureAnswer, fail, runQuietChild } from '../lib/runner.js';
import { buildTeamsAppPackage } from '../lib/teams-manifest.js';
import { note } from '../lib/theme.js';
import * as setupLog from '../logs.js';
const CHANNEL = 'teams';
@@ -60,28 +59,6 @@ export async function runTeamsChannel(_displayName: string): Promise<void> {
const collected: Collected = {};
const completed: string[] = [];
const existingAppId = process.env.TEAMS_APP_ID?.trim();
const existingPassword = process.env.TEAMS_APP_PASSWORD?.trim();
if (existingAppId && existingPassword) {
const reuse = ensureAnswer(await p.confirm({
message: `Found existing Teams credentials (App ID: ${existingAppId.slice(0, 8)}…). Use them?`,
initialValue: true,
}));
if (reuse) {
collected.appId = existingAppId;
collected.appPassword = existingPassword;
collected.appType = (process.env.TEAMS_APP_TYPE?.trim() as 'SingleTenant' | 'MultiTenant') || 'MultiTenant';
if (collected.appType === 'SingleTenant') {
collected.tenantId = process.env.TEAMS_APP_TENANT_ID?.trim();
}
setupLog.userInput('teams_credentials', 'reused-existing');
await installAdapter(collected);
completed.push('Adapter installed and service restarted (reused existing credentials).');
await finishWithHandoff(collected, completed);
return;
}
}
printIntro();
await confirmPrereqs({ collected, completed });
@@ -102,7 +79,7 @@ export async function runTeamsChannel(_displayName: string): Promise<void> {
// ─── step: intro / prereqs ──────────────────────────────────────────────
function printIntro(): void {
note(
p.note(
[
'Setting up Teams is more involved than the other channels — about',
'7 steps across the Azure portal and Teams admin.',
@@ -116,7 +93,7 @@ function printIntro(): void {
}
async function confirmPrereqs(args: { collected: Collected; completed: string[] }): Promise<void> {
note(
p.note(
[
'Before we start, confirm you have:',
'',
@@ -142,7 +119,7 @@ async function confirmPrereqs(args: { collected: Collected; completed: string[]
// ─── step: public URL ──────────────────────────────────────────────────
async function stepPublicUrl(args: { collected: Collected; completed: string[] }): Promise<void> {
note(
p.note(
[
"Azure Bot Service delivers messages to an HTTPS endpoint you",
"control. The endpoint needs to reach this machine's webhook",
@@ -198,7 +175,7 @@ async function stepAppRegistration(args: {
collected: Collected;
completed: string[];
}): Promise<void> {
note(
p.note(
[
`1. In ${AZURE_PORTAL_URL}, search "App registrations" → "New registration"`,
'2. Name it (e.g. "NanoClaw")',
@@ -282,7 +259,7 @@ async function stepClientSecret(args: {
collected: Collected;
completed: string[];
}): Promise<void> {
note(
p.note(
[
`1. In your app registration, open "Certificates & secrets"`,
'2. Click "New client secret"',
@@ -299,7 +276,6 @@ async function stepClientSecret(args: {
const answer = ensureAnswer(
await p.password({
message: 'Paste the client secret Value',
clearOnError: true,
validate: validateWithHelpEscape((v) => {
const t = (v ?? '').trim();
if (!t) return 'Required';
@@ -352,7 +328,7 @@ async function stepAzureBot(args: {
` --appid ${args.collected.appId} \\\n` +
` ${tenantFlag}--endpoint "${endpoint}"`;
note(
p.note(
[
`In ${AZURE_PORTAL_URL}, search "Azure Bot" → Create.`,
'',
@@ -389,7 +365,7 @@ async function stepEnableTeamsChannel(args: {
collected: Collected;
completed: string[];
}): Promise<void> {
note(
p.note(
[
'1. Open your Azure Bot resource → Channels',
'2. Click Microsoft Teams → Accept terms → Apply',
@@ -459,7 +435,7 @@ async function stepSideload(args: {
completed: string[];
zipPath: string;
}): Promise<void> {
note(
p.note(
[
'1. Open Microsoft Teams',
'2. Go to Apps → Manage your apps → Upload an app',
@@ -525,7 +501,7 @@ async function finishWithHandoff(
collected: Collected,
completed: string[],
): Promise<void> {
note(
p.note(
[
'The Teams adapter is live and the service is running.',
'',
@@ -554,7 +530,7 @@ async function finishWithHandoff(
);
if (choice === 'self') {
note(
p.note(
[
' 1. Find your bot in Teams (search by name, or via the sideloaded',
' app) and send it a message ("hi" is fine)',
+5 -18
@@ -33,7 +33,7 @@ import {
spawnStep,
writeStepEntry,
} from '../lib/runner.js';
import { accentGreen, brandBold, note } from '../lib/theme.js';
import { brandBold } from '../lib/theme.js';
const DEFAULT_AGENT_NAME = 'Nano';
@@ -47,7 +47,7 @@ export async function runTelegramChannel(displayName: string): Promise<void> {
// installed, or the bot's web profile if not. tg://resolve?domain= is
// more direct but silently fails when the scheme isn't registered.
const botUrl = `https://t.me/${botUsername}`;
note(
p.note(
[
`Opening @${botUsername} in Telegram so it's ready when the pairing code shows up.`,
'',
@@ -132,19 +132,7 @@ export async function runTelegramChannel(displayName: string): Promise<void> {
}
async function collectTelegramToken(): Promise<string> {
const existing = process.env.TELEGRAM_BOT_TOKEN?.trim();
if (existing && /^[0-9]+:[A-Za-z0-9_-]{35,}$/.test(existing)) {
const reuse = ensureAnswer(await p.confirm({
message: `Found an existing Telegram bot token (${existing.slice(0, 8)}…). Use it?`,
initialValue: true,
}));
if (reuse) {
setupLog.userInput('telegram_token', 'reused-existing');
return existing;
}
}
note(
p.note(
[
"Your assistant talks to you through a Telegram bot you create.",
"Here's how:",
@@ -162,7 +150,6 @@ async function collectTelegramToken(): Promise<string> {
const answer = ensureAnswer(
await p.password({
message: 'Paste your bot token',
clearOnError: true,
validate: (v) => {
if (!v || !v.trim()) return "Token is required";
if (!/^[0-9]+:[A-Za-z0-9_-]{35,}$/.test(v.trim())) {
@@ -253,7 +240,7 @@ async function runPairTelegram(): Promise<
} else {
stopSpinner("Old code expired. Here's a fresh one.");
}
note(formatCodeCard(block.fields.CODE ?? '????'), 'Secret code');
p.note(formatCodeCard(block.fields.CODE ?? '????'), 'Secret code');
s.start('Waiting for you to send the code from Telegram…');
spinnerActive = true;
} else if (block.type === 'PAIR_TELEGRAM_ATTEMPT') {
@@ -304,7 +291,7 @@ async function resolveAgentName(): Promise<string> {
}
const answer = ensureAnswer(
await p.text({
message: `What should your ${accentGreen('assistant')} be called?`,
message: 'What should your assistant be called?',
placeholder: DEFAULT_AGENT_NAME,
defaultValue: DEFAULT_AGENT_NAME,
}),
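The credential prompts in these channel drivers share one shape: trim the input, require a value, then check a format. The following is a sketch of that pattern, assuming the `validate` contract seen in the diff (return an error string, or `undefined` when the input is acceptable). The regexes and length checks are copied from the diff; `patternValidator` and the error wording are hypothetical.

```typescript
// Validator shape used by the prompts in the diff: error string or undefined.
type Validator = (value: string | undefined) => string | undefined;

// Hypothetical factory; the drivers inline this logic per prompt.
function patternValidator(label: string, test: (t: string) => boolean): Validator {
  return (value) => {
    const t = (value ?? '').trim();
    if (!t) return `${label} is required`;
    return test(t) ? undefined : `${label} has an unexpected format`;
  };
}

// Format checks as they appear in the Telegram and Slack drivers.
const validateTelegramToken = patternValidator('Telegram bot token', (t) =>
  /^[0-9]+:[A-Za-z0-9_-]{35,}$/.test(t),
);
const validateSlackBotToken = patternValidator(
  'Slack bot token',
  (t) => t.indexOf('xoxb-') === 0 && t.length >= 24,
);
const validateSlackSigningSecret = patternValidator('Slack signing secret', (t) =>
  /^[a-f0-9]{16,}$/i.test(t),
);

// Synthetic inputs, not real credentials.
console.log(validateTelegramToken('123456:' + 'A'.repeat(35)));
console.log(validateSlackBotToken('xoxb-short'));
console.log(validateSlackSigningSecret('  '));
```

These checks only catch paste mistakes. Whether a token is actually valid is still decided by the platform API call that follows the prompt.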
+6 -6
@@ -46,7 +46,7 @@ import {
writeStepEntry,
} from '../lib/runner.js';
import { askOperatorRole } from '../lib/role-prompt.js';
import { accentGreen, brandBody, brandBold, note } from '../lib/theme.js';
import { brandBold } from '../lib/theme.js';
const DEFAULT_AGENT_NAME = 'Nano';
const AUTH_CREDS_PATH = path.join(process.cwd(), 'store', 'auth', 'creds.json');
@@ -171,7 +171,7 @@ async function askAuthMethod(): Promise<AuthMethod> {
}
async function askPhoneNumber(): Promise<string> {
note(
p.note(
[
"Enter your phone number the way WhatsApp expects it:",
'',
@@ -249,7 +249,7 @@ async function runWhatsAppAuth(
} else if (block.type === 'WHATSAPP_AUTH_PAIRING_CODE') {
const code = block.fields.CODE ?? '????';
stopSpinner('Your pairing code is ready.');
note(formatPairingCard(code), 'Pairing code');
p.note(formatPairingCard(code), 'Pairing code');
s.start('Waiting for you to enter the code…');
spinnerActive = true;
} else if (block.type === 'WHATSAPP_AUTH') {
@@ -267,7 +267,7 @@ async function runWhatsAppAuth(
if (spinnerActive) {
stopSpinner('WhatsApp linked.');
} else {
p.log.success(brandBody('WhatsApp linked.'));
p.log.success('WhatsApp linked.');
}
} else if (status === 'failed') {
if (qrLinesPrinted > 0) {
@@ -395,7 +395,7 @@ async function restartService(): Promise<void> {
}
async function askChatPhone(authedPhone: string): Promise<string> {
note(
p.note(
[
`Authenticated with ${k.cyan('+' + authedPhone)}.`,
'',
@@ -462,7 +462,7 @@ async function resolveAgentName(): Promise<string> {
}
const answer = ensureAnswer(
await p.text({
message: `What should your ${accentGreen('assistant')} be called?`,
message: 'What should your assistant be called?',
placeholder: DEFAULT_AGENT_NAME,
defaultValue: DEFAULT_AGENT_NAME,
}),
-18
@@ -11,24 +11,6 @@ import { log } from '../src/log.js';
import { commandExists, getPlatform, isHeadless, isWSL } from './platform.js';
import { emitStatus } from './status.js';
export function detectExistingDisplayName(projectRoot: string): string | null {
const dbPath = path.join(projectRoot, 'data', 'v2.db');
if (!fs.existsSync(dbPath)) return null;
let db: Database.Database | null = null;
try {
db = new Database(dbPath, { readonly: true });
const row = db
.prepare(`SELECT display_name FROM users WHERE id = 'cli:local'`)
.get() as { display_name: string } | undefined;
return row?.display_name?.trim() || null;
} catch {
return null;
} finally {
db?.close();
}
}
export function detectRegisteredGroups(projectRoot: string): boolean {
if (fs.existsSync(path.join(projectRoot, 'data', 'registered_groups.json'))) {
return true;
+6 -9
@@ -18,8 +18,6 @@ import { SelectPrompt } from '@clack/core';
import { isCancel } from '@clack/prompts';
import { styleText } from 'node:util';
import { brandBody } from './theme.js';
const BULLET_ACTIVE = '●';
const BULLET_INACTIVE = '○';
const BAR = '│';
@@ -97,7 +95,7 @@ export function brightSelect<T>(
const shown =
st === 'cancel'
? styleText(['strikethrough', 'dim'], selected)
: styleText('dim', brandBody(selected));
: styleText('dim', selected);
lines.push(`${grayBar} ${shown}`);
return lines.join('\n');
}
@@ -106,12 +104,11 @@ export function brightSelect<T>(
options.forEach((opt, idx) => {
const label = opt.label ?? String(opt.value);
const hint = opt.hint ? ` ${styleText('dim', `(${opt.hint})`)}` : '';
const isActive = idx === cursor;
const marker = isActive
? styleText('green', BULLET_ACTIVE)
: styleText('dim', BULLET_INACTIVE);
const shownLabel = isActive ? brandBody(label) : label;
lines.push(`${bar} ${marker} ${shownLabel}${hint}`);
const marker =
idx === cursor
? styleText('green', BULLET_ACTIVE)
: styleText('dim', BULLET_INACTIVE);
lines.push(`${bar} ${marker} ${label}${hint}`);
});
lines.push(styleText(color, CAP_BOT));
return lines.join('\n');
+12 -102
@@ -2,11 +2,8 @@
* Offer Claude-assisted debugging when a setup step fails.
*
* Flow:
* 1. Check `claude` is on PATH — if not, offer to install it via
* setup/install-claude.sh. Then check auth via `claude auth status`
* — if not signed in, offer to run `claude setup-token` (browser
* OAuth with code-paste fallback for headless/remote systems).
* If either is declined or fails, silently skip.
* 1. Check `claude` is on PATH and has a working credential. If not,
* silently skip — pre-auth failures can't use this path.
* 2. Ask the user for consent ("Want me to ask Claude for a fix?").
* 3. Build a minimal prompt: the one-paragraph situation, the failing
* step's name/message/hint, and a short list of *file references*
@@ -19,16 +16,15 @@
*
* Skippable with NANOCLAW_SKIP_CLAUDE_ASSIST=1 for CI/scripted runs.
*/
import { execSync, spawn, spawnSync } from 'child_process';
import { execSync, spawn } from 'child_process';
import fs from 'fs';
import os from 'os';
import path from 'path';
import * as p from '@clack/prompts';
import k from 'kleur';
import { ensureAnswer } from './runner.js';
import { brandBody, fitToWidth, note } from './theme.js';
import { fitToWidth } from './theme.js';
export interface AssistContext {
stepName: string;
@@ -94,7 +90,7 @@ export async function offerClaudeAssist(
projectRoot: string = process.cwd(),
): Promise<boolean> {
if (process.env.NANOCLAW_SKIP_CLAUDE_ASSIST === '1') return false;
if (!(await ensureClaudeReady(projectRoot))) return false;
if (!isClaudeUsable()) return false;
const want = ensureAnswer(
await p.confirm({
@@ -110,12 +106,12 @@ export async function offerClaudeAssist(
const parsed = parseResponse(response);
if (!parsed) {
p.log.warn(brandBody("Claude responded but I couldn't parse a command out of it."));
p.log.warn("Claude responded but I couldn't parse a command out of it.");
p.log.message(k.dim(response.trim().slice(0, 500)));
return false;
}
note(
p.note(
`${parsed.reason}\n\n${k.cyan('$')} ${parsed.command}`,
"Claude's suggestion",
);
@@ -132,101 +128,15 @@ export async function offerClaudeAssist(
return true;
}
function isClaudeInstalled(): boolean {
function isClaudeUsable(): boolean {
try {
execSync('command -v claude', { stdio: 'ignore' });
return true;
} catch {
return false;
}
}
function isClaudeAuthenticated(): boolean {
try {
execSync('claude auth status', { stdio: 'ignore', timeout: 5_000 });
return true;
} catch {
return false;
}
}
async function ensureClaudeReady(projectRoot: string): Promise<boolean> {
if (!isClaudeInstalled()) {
const install = ensureAnswer(
await p.confirm({
message:
'Claude CLI is needed to diagnose this. Install it now?',
initialValue: true,
}),
);
if (!install) return false;
const code = spawnSync('bash', ['setup/install-claude.sh'], {
cwd: projectRoot,
stdio: 'inherit',
}).status;
if (code !== 0 || !isClaudeInstalled()) {
p.log.error("Couldn't install the Claude CLI.");
return false;
}
p.log.success('Claude CLI installed.');
}
if (!isClaudeAuthenticated()) {
const auth = ensureAnswer(
await p.confirm({
message:
"Claude CLI isn't signed in. Sign in now? (a browser will open)",
initialValue: true,
}),
);
if (!auth) return false;
// setup-token has an interactive TUI; reset terminal to cooked mode
// so its prompts render correctly after clack's raw-mode prompts.
spawnSync('stty', ['sane'], { stdio: 'inherit' });
// Run under script(1) to capture the OAuth token from PTY output
// while preserving interactive TTY for the browser OAuth flow.
// Same approach as register-claude-token.sh, but we set the env var
// instead of writing to OneCLI.
const tmpfile = path.join(os.tmpdir(), `claude-setup-token-${process.pid}`);
try {
const isUtilLinux = (() => {
try {
return execSync('script --version 2>&1', { encoding: 'utf-8' }).includes('util-linux');
} catch { return false; }
})();
const scriptArgs = isUtilLinux
? ['-q', '-c', 'claude setup-token', tmpfile]
: ['-q', tmpfile, 'claude', 'setup-token'];
spawnSync('script', scriptArgs, {
cwd: projectRoot,
stdio: 'inherit',
});
if (!isClaudeAuthenticated() && fs.existsSync(tmpfile)) {
const raw = fs.readFileSync(tmpfile, 'utf-8');
const stripped = raw
.replace(/\x1b\[[0-9;]*[a-zA-Z]/g, '')
.replace(/[\n\r]/g, '');
const matches = stripped.match(/(sk-ant-oat[A-Za-z0-9_-]{80,500}AA)/g);
if (matches) {
process.env.CLAUDE_CODE_OAUTH_TOKEN = matches[matches.length - 1];
}
}
} finally {
try { fs.unlinkSync(tmpfile); } catch {}
}
if (!isClaudeAuthenticated()) {
p.log.error("Couldn't complete Claude sign-in.");
return false;
}
p.log.success('Claude CLI signed in.');
}
// Availability without auth is half the story; a real query will still
// fail if the token isn't registered. We try first and surface the error
// rather than pre-checking auth with a separate round trip.
return true;
}
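The sign-in path in `ensureClaudeReady` above records `claude setup-token` under script(1) and then recovers the token from the captured transcript. The following is a sketch of that extraction step in isolation, using the same two cleanup passes and the same token pattern as the diff. The function name `extractOAuthToken` and the sample transcript are hypothetical, and the token is synthetic.

```typescript
// Hypothetical helper isolating the transcript-scraping step shown above.
function extractOAuthToken(rawTranscript: string): string | null {
  const stripped = rawTranscript
    .replace(/\x1b\[[0-9;]*[a-zA-Z]/g, '') // drop ANSI CSI sequences (colors, cursor moves)
    .replace(/[\n\r]/g, ''); // rejoin a token the terminal wrapped across lines
  const matches = stripped.match(/sk-ant-oat[A-Za-z0-9_-]{80,500}AA/g);
  // Take the last match, as the original does, in case the token is echoed twice.
  return matches ? (matches[matches.length - 1] ?? null) : null;
}

// Synthetic token, not a real credential.
const fakeToken = 'sk-ant-oat' + 'x'.repeat(90) + 'AA';
// Simulated PTY capture: colored label, then the token wrapped across two lines.
const transcript =
  `\x1b[32mToken:\x1b[0m \r\n` +
  `${fakeToken.slice(0, 40)}\r\n` +
  `${fakeToken.slice(40)}\r\n` +
  ` (keep it secret)`;

console.log(extractOAuthToken(transcript) === fakeToken);
console.log(extractOAuthToken('no token here'));
```

Removing newlines before matching is what lets a wrapped token match. It also means any text that directly follows the token with no separator could be absorbed into the match, so this approach relies on the CLI's output layout.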
@@ -358,7 +268,7 @@ async function queryClaudeUnderSpinner(
const elapsed = Math.round((Date.now() - start) / 1000);
const suffix = ` (${elapsed}s)`;
if (kind === 'ok') {
p.log.success(`${brandBody(fitToWidth('Claude replied.', suffix))}${k.dim(suffix)}`);
p.log.success(`${fitToWidth('Claude replied.', suffix)}${k.dim(suffix)}`);
resolve(payload);
} else {
p.log.error(
+3 -5
@@ -27,8 +27,6 @@ import { execSync, spawn } from 'child_process';
import * as p from '@clack/prompts';
import k from 'kleur';
import { brandBody, note } from './theme.js';
export interface HandoffContext {
/** Channel this handoff is happening in (e.g., 'teams'). */
channel: string;
@@ -64,14 +62,14 @@ export interface HandoffContext {
export async function offerClaudeHandoff(ctx: HandoffContext): Promise<boolean> {
if (!isClaudeUsable()) {
p.log.warn(
- brandBody("Claude isn't installed yet — can't hand you off here. Finish setup first, then retry."),
+ "Claude isn't installed yet — can't hand you off here. Finish setup first, then retry.",
);
return false;
}
const systemPrompt = buildSystemPrompt(ctx);
- note(
+ p.note(
[
"I'm handing you off to Claude in interactive mode.",
"It has the context of where you are in setup.",
@@ -93,7 +91,7 @@ export async function offerClaudeHandoff(ctx: HandoffContext): Promise<boolean>
{ stdio: 'inherit' },
);
child.on('close', () => {
- p.log.success(brandBody("Back from Claude. Let's continue."));
+ p.log.success("Back from Claude. Let's continue.");
resolve(true);
});
child.on('error', () => {
+2 -2
@@ -20,7 +20,7 @@ import k from 'kleur';
import * as setupLog from '../logs.js';
import { offerClaudeAssist } from './claude-assist.js';
import { emit as phEmit } from './diagnostics.js';
- import { brandBody, fitToWidth } from './theme.js';
+ import { fitToWidth } from './theme.js';
export type Fields = Record<string, string>;
export type Block = { type: string; fields: Fields };
@@ -390,7 +390,7 @@ export async function fail(
const skipList = [
...new Set([...existingSkip, ...setupLog.completedStepNames()]),
].join(',');
- p.log.step(brandBody(`Retrying from ${stepName}`));
+ p.log.step(`Retrying from ${stepName}`);
const result = spawnSync('pnpm', ['--silent', 'run', 'setup:auto'], {
stdio: 'inherit',
env: { ...process.env, NANOCLAW_SKIP: skipList },
+1 -1
@@ -115,7 +115,7 @@ async function promptOne(e: Entry, values: ConfigValues): Promise<void> {
};
const ans = ensureAnswer(
e.secret
- ? await p.password({ message: e.label, clearOnError: true, validate })
+ ? await p.password({ message: e.label, validate })
: await p.text({
message: e.label,
placeholder: e.placeholder ?? e.default,
-46
@@ -11,7 +11,6 @@
* - COLORTERM truecolor/24bit → 24-bit ANSI (exact brand cyan)
* - Otherwise → kleur's 16-color cyan (closest fallback)
*/
- import * as p from '@clack/prompts';
import k from 'kleur';
const USE_ANSI = Boolean(process.stdout.isTTY) && !process.env.NO_COLOR;
@@ -39,41 +38,6 @@ export function brandChip(s: string): string {
return k.bgCyan(k.black(k.bold(s)));
}
- /**
- * Accent green (#3fba50) for emphasizing a single word inside prompt
- * messages — currently the "you" in "What should your assistant call
- * you?" so the operator parses at a glance who the question is about.
- * Same TTY/NO_COLOR/truecolor gating as the rest of the palette.
- */
- export function accentGreen(s: string): string {
- if (!USE_ANSI) return s;
- if (TRUECOLOR) return `\x1b[38;2;63;186;80m${s}\x1b[39m`;
- return k.green(s);
- }
- /**
- * Brand body color for setup-flow prose. Used for card bodies (via the
- * `note()` formatter) and `p.log.*` body arguments — anywhere the
- * previous "dim" treatment was making prose hard to read or washing
- * out embedded brand emphasis.
- *
- * Multi-line input is colored line-by-line so embedded line breaks
- * don't bleed the SGR sequence across clack's gutter prefix.
- */
- export function brandBody(s: string): string {
- if (!USE_ANSI) return s;
- if (TRUECOLOR) {
- return s
- .split('\n')
- .map((line) => (line.length > 0 ? `\x1b[38;2;43;183;206m${line}\x1b[39m` : line))
- .join('\n');
- }
- return s
- .split('\n')
- .map((line) => (line.length > 0 ? k.cyan(line) : line))
- .join('\n');
- }
/**
* Wrap text so it fits inside clack's gutter without the terminal's soft
* wrap breaking the `│ …` bar on long lines. Works on a single string with
@@ -104,16 +68,6 @@ export function dimWrap(text: string, gutter: number): string {
return wrapForGutter(text, gutter);
}
- /**
- * Wrap clack's `p.note` so card bodies render in the brand body color
- * (#2b6fdc) instead of clack's default dim. Clack runs the formatter
- * on each line individually, so `brandBody` colors each line cleanly
- * without bleeding across the gutter prefix.
- */
- export function note(message: string, title?: string): void {
- p.note(message, title, { format: brandBody });
- }
const ANSI_RE = /\x1b\[[0-9;]*m/g;
function visibleLength(s: string): number {
+3 -3
@@ -23,7 +23,7 @@ import { emit as phEmit } from './diagnostics.js';
import type { StepResult, SpinnerLabels } from './runner.js';
import { dumpTranscriptOnFailure, spawnStep, writeStepEntry } from './runner.js';
import * as setupLog from '../logs.js';
- import { brandBody, fitToWidth } from './theme.js';
+ import { fitToWidth } from './theme.js';
const WINDOW_SIZE = 3;
const SPINNER_FRAMES = ['◒', '◐', '◓', '◑'];
@@ -169,7 +169,7 @@ async function runUnderWindow(
if (result.ok) {
const isSkipped = result.terminal?.fields.STATUS === 'skipped';
const msg = isSkipped && labels.skipped ? labels.skipped : labels.done;
- p.log.success(`${brandBody(fitToWidth(msg, suffix))}${k.dim(suffix)}`);
+ p.log.success(`${fitToWidth(msg, suffix)}${k.dim(suffix)}`);
} else {
const failMsg = labels.failed ?? labels.running.replace(/…$/, ' failed');
p.log.error(`${fitToWidth(failMsg, suffix)}${k.dim(suffix)}`);
@@ -185,7 +185,7 @@ async function handleStall(
): Promise<void> {
render.pauseRender();
p.log.warn(
- brandBody(`This looks stuck — no output from the ${stepName} step for the last 60 seconds.`),
+ `This looks stuck — no output from the ${stepName} step for the last 60 seconds.`,
);
phEmit('step_stalled', { step: stepName });
@@ -0,0 +1,22 @@
/**
* Setup-side registration guard for the codex provider (the third barrel of
* the multi-point archetype): imports the REAL setup/providers barrel and
* asserts the registry carries codex with its auth + install check. Red if
* the barrel line is deleted, the barrel fails to evaluate, or the payload
* module breaks. (Importing ./codex.js directly would self-register and stay
* green when the barrel line is deleted.)
*/
import { describe, expect, it } from 'vitest';
import { getSetupProvider } from './registry.js';
import './index.js'; // the real setup provider barrel
describe('codex setup registration', () => {
it('registers codex with auth + install check via the barrel', () => {
const codex = getSetupProvider('codex');
expect(codex).toBeDefined();
expect(typeof codex!.runAuth).toBe('function');
expect(typeof codex!.runInstallCheck).toBe('function');
expect(typeof codex!.offerFailureAssist).toBe('function');
});
});
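The guard above depends on self-registration: importing a provider module runs a top-level `registerSetupProvider(...)` call, and the barrel is the one place that import lives. A minimal sketch of that pattern follows, assuming a Map-backed store. The `SetupProvider` shape here is simplified and hypothetical, since the real `registry.ts` is not shown in this diff.

```typescript
// Hypothetical, simplified registry; the real registry.ts is not part of this diff.
interface SetupProvider {
  value: string;
  label: string;
  runAuth: () => Promise<void>;
  runInstallCheck: () => Promise<void>;
}

const providers = new Map<string, SetupProvider>();

function registerSetupProvider(provider: SetupProvider): void {
  providers.set(provider.value, provider);
}

function getSetupProvider(value: string): SetupProvider | undefined {
  return providers.get(value);
}

// What a provider module does at import time: a top-level side effect.
// If nothing imports the module, this call never runs and the lookup
// below returns undefined, which is the failure the barrel test catches.
registerSetupProvider({
  value: 'codex',
  label: 'Codex',
  runAuth: async () => {},
  runInstallCheck: async () => {},
});
```

Because registration is an import side effect, a test that imports the provider module directly would register it by itself and could not detect a missing barrel line, which is why the guard imports the barrel instead.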
+101
@@ -0,0 +1,101 @@
import { EventEmitter } from 'events';
import fs from 'fs';
import os from 'os';
import path from 'path';
import { describe, expect, it, vi } from 'vitest';
// Mock child_process so runCodexLoginAuth never spawns a real codex CLI; the
// spawn stand-in plays `codex login` writing auth.json into whatever
// CODEX_HOME it was handed.
const mockSpawn = vi.fn();
const mockSpawnSync = vi.fn();
const mockExecFileSync = vi.fn();
vi.mock('child_process', () => ({
spawn: (...args: unknown[]) => mockSpawn(...args),
spawnSync: (...args: unknown[]) => mockSpawnSync(...args),
execFileSync: (...args: unknown[]) => mockExecFileSync(...args),
}));
// Keep the auth flow's structured logging out of logs/setup.log.
vi.mock('../logs.js', () => ({ step: vi.fn(), userInput: vi.fn() }));
import { buildCodexFailurePrompt, runCodexLoginAuth, verifyCodexInstall } from './codex.js';
// Structural guard for the codex payload wiring: provider files, both barrel
// imports, and the pinned Dockerfile install. Goes red if any of them is
// removed without going through the /add-codex (or its REMOVE.md) path.
describe('verifyCodexInstall', () => {
it('passes on a tree with the codex payload wired', () => {
const { ok, problems } = verifyCodexInstall();
expect(problems).toEqual([]);
expect(ok).toBe(true);
});
});
// Pure prompt builder for the failure-assist hook — no spawning involved.
describe('buildCodexFailurePrompt', () => {
it('carries the failure context and the de-duped reference list', () => {
const projectRoot = '/repo';
const prompt = buildCodexFailurePrompt(
{
stepName: 'verify',
msg: 'first-chat ping timed out',
hint: 'check the container logs',
rawLogPath: '/repo/logs/setup-steps/verify.log',
},
projectRoot,
);
expect(prompt).toContain('Failed step: verify');
expect(prompt).toContain('Error: first-chat ping timed out');
expect(prompt).toContain('Hint: check the container logs');
expect(prompt).toContain('README.md'); // BIG_PICTURE_FILES
expect(prompt).toContain('setup/verify.ts'); // STEP_FILES['verify']
expect(prompt).toContain('logs/setup.log');
expect(prompt).toContain('logs/setup-steps/verify.log'); // relativized rawLogPath
});
it('falls back to the step-log directory when no raw log path is given', () => {
const prompt = buildCodexFailurePrompt({ stepName: 'verify', msg: 'boom' }, '/repo');
expect(prompt).toContain('logs/setup-steps/');
expect(prompt).not.toContain('Hint:');
});
});
// Session-isolation invariant: the ChatGPT session vaulted for the gateway
// must never be the user's personal ~/.codex session — sharing one OAuth
// session across two consumers gets the whole family invalidated server-side
// when refresh tokens rotate (see the header of codex.ts).
describe('runCodexLoginAuth', () => {
it('logs in under an isolated CODEX_HOME, vaults from it, and deletes it', async () => {
mockSpawnSync.mockReturnValue({ status: 0, stdout: '', stderr: '' });
mockExecFileSync.mockReturnValue('');
let loginEnv: NodeJS.ProcessEnv | undefined;
mockSpawn.mockImplementation((...args: unknown[]) => {
const opts = args[2] as { env?: NodeJS.ProcessEnv };
loginEnv = opts.env;
fs.writeFileSync(path.join(opts.env!.CODEX_HOME!, 'auth.json'), '{"tokens":{}}');
const child = new EventEmitter();
setImmediate(() => child.emit('close', 0));
return child;
});
await runCodexLoginAuth('browser');
// The login spawn ran under a CODEX_HOME that is not the personal one.
const codexHome = loginEnv?.CODEX_HOME;
expect(codexHome).toBeDefined();
expect(codexHome).not.toBe(path.join(os.homedir(), '.codex'));
// The vault snapshot was read from the isolated dir, not ~/.codex.
const vaultCall = mockExecFileSync.mock.calls.find((c) => c[0] === 'onecli');
expect(vaultCall).toBeDefined();
const vaultArgs = vaultCall![1] as string[];
expect(vaultArgs[vaultArgs.indexOf('--file') + 1]).toBe(path.join(codexHome!, 'auth.json'));
// The isolated dir holds a live credential — gone once vaulted.
expect(fs.existsSync(codexHome!)).toBe(false);
});
});
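The "de-duped reference list" asserted in the `buildCodexFailurePrompt` tests comes from an order-preserving filter. The sketch below isolates that step under stated assumptions: `buildReferences` is a hypothetical helper name, and it uses `path.posix.relative` so the result does not depend on the host platform, whereas the source calls `path.relative`.

```typescript
import * as path from 'path';

// Hypothetical helper isolating the list-building step of buildCodexFailurePrompt.
function buildReferences(
  bigPicture: string[],
  stepRefs: string[],
  projectRoot: string,
  rawLogPath?: string,
): string[] {
  // Fall back to the step-log directory when no specific log file is known.
  const logRef = rawLogPath ? path.posix.relative(projectRoot, rawLogPath) : 'logs/setup-steps/';
  // indexOf returns the first occurrence, so later duplicates are dropped
  // while the original ordering is kept.
  return [...bigPicture, ...stepRefs, 'logs/setup.log', logRef].filter((v, i, a) => a.indexOf(v) === i);
}
```

The `indexOf` filter is quadratic, which is fine for a list of a handful of file paths. A `Set` would also preserve insertion order if the list ever grew.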
+449
@@ -0,0 +1,449 @@
/**
* Codex provider setup — auth walk-through + install verification.
*
* Codex-owned payload code: when the codex provider moves to the `providers`
* branch, this file travels with it and `/add-codex` copies it back in. The
* only trunk reach-in is one import + one picker entry in setup/auto.ts.
*
* Auth honors the v2 credential invariant — everything lands in the OneCLI
* vault, nothing in .env, nothing in the container:
* - ChatGPT subscription (the common case): `codex login` (browser) or
* `codex login --device-auth` (URL + pairing code) runs with CODEX_HOME
* pointed at a throwaway dir; the auth.json written there is stored
* WHOLE in the vault (`--file … --host-pattern chatgpt.com`) and the dir
* is deleted. The gateway injects it in flight; the container only ever
* sees the `onecli-managed` placeholder.
* - API key: pasted once, stored as an `openai` secret for api.openai.com.
*
* Session-isolation invariant: the vaulted ChatGPT session must be DEDICATED
* to the gateway. Never vault a copy of the user's live ~/.codex/auth.json.
* OpenAI rotates refresh tokens, so two consumers sharing one OAuth session
* strand each other on refresh, and replaying the stale token trips reuse
* detection — which invalidates the whole session family server-side
* (`token_invalidated`) for the gateway AND the user's personal Codex CLI.
*/
import { execFileSync, spawn, spawnSync } from 'child_process';
import fs from 'fs';
import os from 'os';
import path from 'path';
import * as p from '@clack/prompts';
import k from 'kleur';
import { brightSelect } from '../lib/bright-select.js';
import { type AssistContext, BIG_PICTURE_FILES, STEP_FILES } from '../lib/claude-assist.js';
import { brandBody, note } from '../lib/theme.js';
import * as setupLog from '../logs.js';
import { type FailureAssistResult, registerSetupProvider } from './registry.js';
// ─── OneCLI vault helpers ────────────────────────────────────────────────
interface OnecliSecret {
id: string;
name: string;
type: string;
hostPattern: string | null;
}
function listSecrets(): OnecliSecret[] {
const out = execFileSync('onecli', ['secrets', 'list'], { encoding: 'utf-8' });
const parsed = JSON.parse(out) as { data?: unknown };
return Array.isArray(parsed.data) ? (parsed.data as OnecliSecret[]) : [];
}
function findOpenAISecret(secrets: OnecliSecret[]): OnecliSecret | undefined {
return secrets.find((s) => {
const name = s.name.toLowerCase();
const type = s.type.toLowerCase();
const hostPattern = (s.hostPattern ?? '').toLowerCase();
return (
name === 'codex' ||
name === 'openai' ||
type === 'openai' ||
hostPattern.includes('api.openai.com') ||
hostPattern.includes('chatgpt.com')
);
});
}
function openAISecretExists(): boolean {
try {
return findOpenAISecret(listSecrets()) !== undefined;
} catch {
return false;
}
}
// ─── auth step ───────────────────────────────────────────────────────────
function ensureAnswer<T>(value: T | symbol): T {
if (p.isCancel(value)) {
p.cancel('Setup cancelled.');
process.exit(1);
}
return value as T;
}
export async function runCodexAuthStep(): Promise<void> {
if (openAISecretExists()) {
p.log.success(brandBody('Your OpenAI account is already connected.'));
setupLog.step('auth', 'skipped', 0, { REASON: 'openai-secret-already-present', PROVIDER: 'codex' });
return;
}
const method = ensureAnswer(
await brightSelect<'browser' | 'device' | 'api' | 'skip'>({
message: 'How would you like to connect Codex?',
options: [
{
value: 'browser',
label: 'Sign in with my ChatGPT subscription',
hint: 'recommended if you have Plus or Pro — opens a browser',
},
{
value: 'device',
label: 'ChatGPT device pairing',
hint: 'no browser handoff — shows a URL and a code',
},
{
value: 'api',
label: 'Paste an OpenAI API key',
hint: 'pay-per-use; stored in OneCLI, never copied into the container',
},
{
value: 'skip',
label: "Skip — I'll connect later",
hint: 'Codex groups will start, but model calls will fail auth',
},
],
}),
);
setupLog.userInput('codex_auth_method', method);
if (method === 'skip') {
const confirmed = ensureAnswer(
await p.confirm({
message: "Skip Codex sign-in? Codex won't be able to answer until you connect an OpenAI account.",
initialValue: false,
}),
);
if (!confirmed) return runCodexAuthStep();
setupLog.step('auth', 'skipped', 0, { REASON: 'user-skipped', PROVIDER: 'codex' });
p.log.warn(brandBody('Codex sign-in skipped. Add an OpenAI account to OneCLI before using Codex groups.'));
return;
}
if (method === 'api') {
await runCodexApiKeyAuth();
return;
}
await runCodexLoginAuth(method);
}
async function runCodexApiKeyAuth(): Promise<void> {
const key = ensureAnswer(
await p.password({
message: 'Paste your OpenAI API key (sk-…)',
validate: (v) => (v && v.trim().startsWith('sk-') ? undefined : 'That does not look like an OpenAI API key.'),
}),
) as string;
try {
execFileSync(
'onecli',
[
'secrets',
'create',
'--name',
'Codex',
'--type',
'openai',
'--value',
key.trim(),
'--host-pattern',
'api.openai.com',
],
{ stdio: ['ignore', 'pipe', 'pipe'] },
);
} catch (err) {
setupLog.step('auth', 'failed', 0, { PROVIDER: 'codex', METHOD: 'api', ERROR: String(err) });
p.log.error(
brandBody(
"Couldn't save your OpenAI key to the vault. Make sure OneCLI is running (`onecli version`), then retry.",
),
);
process.exit(1);
}
setupLog.step('auth', 'success', 0, { PROVIDER: 'codex', METHOD: 'api' });
p.log.success(brandBody('OpenAI account connected.'));
}
export async function runCodexLoginAuth(method: 'browser' | 'device'): Promise<void> {
const codexCheck = spawnSync('codex', ['--version'], { encoding: 'utf-8', stdio: ['ignore', 'pipe', 'pipe'] });
if (codexCheck.status !== 0) {
p.log.error(
brandBody(
'The Codex CLI is not installed on this machine. Install it with `npm install -g @openai/codex`, then re-run setup — or choose the API key option instead.',
),
);
setupLog.step('auth', 'failed', 0, { PROVIDER: 'codex', METHOD: method, ERROR: 'codex_cli_missing' });
process.exit(1);
}
if (method === 'browser') {
p.log.step(brandBody('Opening the Codex sign-in flow…'));
console.log(k.dim(' (a browser will open for sign-in; this part is interactive)'));
} else {
p.log.step(brandBody('Starting Codex device-code pairing…'));
console.log(k.dim(' (a URL and code will appear below — open the URL and enter the code)'));
}
console.log();
// Session-isolation invariant (see file header): the login runs under a
// throwaway CODEX_HOME so the vaulted session is dedicated to the gateway
// and never shared with the user's personal ~/.codex.
const loginHome = fs.mkdtempSync(path.join(os.tmpdir(), 'codex-vault-login-'));
// Holds a live credential after login — must go on every exit path. The
// failure branches call process.exit, which skips finally blocks, so each
// removes it explicitly.
const removeLoginHome = (): void => fs.rmSync(loginHome, { recursive: true, force: true });
const args = method === 'device' ? ['login', '--device-auth'] : ['login'];
const start = Date.now();
const code = await runInherit('codex', args, { CODEX_HOME: loginHome });
const durationMs = Date.now() - start;
console.log();
if (code !== 0) {
removeLoginHome();
setupLog.step('auth', 'failed', durationMs, { PROVIDER: 'codex', METHOD: method, EXIT_CODE: String(code) });
p.log.error(
brandBody(
"Couldn't complete the Codex sign-in. Re-run setup and try again, or choose the API key option instead.",
),
);
process.exit(1);
}
const authJsonPath = path.join(loginHome, 'auth.json');
if (!fs.existsSync(authJsonPath)) {
removeLoginHome();
setupLog.step('auth', 'failed', durationMs, { PROVIDER: 'codex', METHOD: method, ERROR: 'auth_json_not_found' });
p.log.error(
brandBody('Codex login succeeded but no auth.json was written. Try again, or paste an API key instead.'),
);
process.exit(1);
}
try {
execFileSync(
'onecli',
[
'secrets',
'create',
'--name',
'Codex',
'--type',
'openai',
'--file',
authJsonPath,
'--host-pattern',
'chatgpt.com',
],
{ stdio: ['ignore', 'pipe', 'pipe'] },
);
} catch (err) {
removeLoginHome();
setupLog.step('auth', 'failed', durationMs, { PROVIDER: 'codex', METHOD: method, ERROR: String(err) });
p.log.error(
brandBody(
"Couldn't save your Codex credentials to the vault. Make sure OneCLI is running (`onecli version`), then retry.",
),
);
process.exit(1);
}
removeLoginHome();
setupLog.step('auth', 'success', durationMs, { PROVIDER: 'codex', METHOD: method });
p.log.success(brandBody('OpenAI account connected — credentials live in your OneCLI vault, never in the container.'));
}
function runInherit(cmd: string, args: string[], extraEnv?: Record<string, string>): Promise<number> {
return new Promise((resolve) => {
const child = spawn(cmd, args, {
stdio: 'inherit',
env: extraEnv ? { ...process.env, ...extraEnv } : process.env,
});
child.on('close', (code) => resolve(code ?? 1));
child.on('error', () => resolve(1));
});
}
// ─── failure assist ──────────────────────────────────────────────────────
/**
* The Codex CLI can debug a setup failure only if the binary runs AND
* ~/.codex/auth.json exists (API-key-only installs keep the key in the
* OneCLI vault, so the host-side CLI has nothing to authenticate with).
*/
export function isCodexCliUsable(): boolean {
const codexCheck = spawnSync('codex', ['--version'], { encoding: 'utf-8', stdio: ['ignore', 'pipe', 'pipe'] });
if (codexCheck.status !== 0) return false;
return fs.existsSync(path.join(os.homedir(), '.codex', 'auth.json'));
}
/**
* Failure prompt handed to the interactive Codex session — same content as
* the dispatcher's Claude system prompt: what failed, the job ("diagnose and
* fix, be concise, exit when done"), and a de-duped file reference list.
*/
export function buildCodexFailurePrompt(ctx: AssistContext, projectRoot: string): string {
const stepRefs = STEP_FILES[ctx.stepName] ?? [];
const references = [
...BIG_PICTURE_FILES,
...stepRefs,
'logs/setup.log',
ctx.rawLogPath ? path.relative(projectRoot, ctx.rawLogPath) : 'logs/setup-steps/',
].filter((v, i, a) => a.indexOf(v) === i);
const lines: string[] = [
"The user is running NanoClaw's interactive setup flow and hit a failure.",
'',
`Failed step: ${ctx.stepName}`,
`Error: ${ctx.msg}`,
];
if (ctx.hint) lines.push(`Hint: ${ctx.hint}`);
lines.push(
'',
'Your job: help them diagnose and fix this issue. Read the referenced files',
'and logs to understand what went wrong, then help them fix it. You can read',
'files, run commands, check logs, and explain what happened. Be concise.',
"When they're ready to resume setup, tell them to exit Codex.",
'',
'Relevant files (read as needed):',
);
for (const f of references) lines.push(` - ${f}`);
return lines.join('\n');
}
/**
* Registry hook: offer to debug a setup failure with the Codex CLI. Returns
* 'unavailable' when the CLI can't run here so the dispatcher can fall back
* to its guarded Claude offer.
*/
export async function offerCodexFailureAssist(ctx: AssistContext, projectRoot: string): Promise<FailureAssistResult> {
if (!isCodexCliUsable()) return 'unavailable';
const want = ensureAnswer(
await p.confirm({
message: 'Want to debug this with Codex?',
initialValue: true,
}),
);
if (!want) return 'declined';
const prompt = buildCodexFailurePrompt(ctx, projectRoot);
note(
[
'Launching Codex to help debug this failure.',
'It has the context of what went wrong.',
'',
k.dim("Exit Codex (Ctrl-C or /quit) when you're ready to come back to setup."),
].join('\n'),
'Handing off to Codex',
);
return new Promise<FailureAssistResult>((resolve) => {
// codex accepts a positional initial prompt for the interactive TUI.
const child = spawn('codex', [prompt], { cwd: projectRoot, stdio: 'inherit' });
child.on('close', () => {
p.log.success(brandBody("Back from Codex. Let's continue."));
resolve('launched');
});
child.on('error', () => {
p.log.error("Couldn't launch Codex.");
resolve('unavailable');
});
});
}
// ─── install verification ────────────────────────────────────────────────
/**
* Verify the codex provider payload is fully wired — the same pre-flight the
* /add-codex skill checks. While codex ships in trunk these always pass; once
* the payload moves to the providers branch, a failed check means the install
* step should run (or the user finishes via /add-codex).
*/
export function verifyCodexInstall(): { ok: boolean; problems: string[] } {
const problems: string[] = [];
const root = process.cwd();
const requiredFiles = [
'src/providers/codex.ts',
'src/providers/codex-agents-md.ts',
'container/agent-runner/src/providers/codex.ts',
'container/agent-runner/src/providers/codex-app-server.ts',
];
for (const file of requiredFiles) {
if (!fs.existsSync(path.join(root, file))) problems.push(`missing file: ${file}`);
}
for (const barrel of ['src/providers/index.ts', 'container/agent-runner/src/providers/index.ts']) {
const barrelPath = path.join(root, barrel);
if (!fs.existsSync(barrelPath) || !fs.readFileSync(barrelPath, 'utf-8').includes("import './codex.js';")) {
problems.push(`missing barrel import in ${barrel}`);
}
}
const manifestPath = path.join(root, 'container', 'cli-tools.json');
let hasCodexCli = false;
if (fs.existsSync(manifestPath)) {
try {
const tools = JSON.parse(fs.readFileSync(manifestPath, 'utf-8')) as Array<{ name?: string }>;
hasCodexCli = Array.isArray(tools) && tools.some((t) => t.name === '@openai/codex');
} catch {
hasCodexCli = false;
}
}
if (!hasCodexCli) {
problems.push('container/cli-tools.json missing the @openai/codex CLI entry');
}
return { ok: problems.length === 0, problems };
}
export async function runCodexInstallCheck(): Promise<void> {
p.log.step(brandBody('Checking the Codex provider install…'));
const { ok, problems } = verifyCodexInstall();
if (ok) {
setupLog.step('codex-install', 'success', 0, {});
p.log.success(brandBody('Codex installed properly.'));
return;
}
setupLog.step('codex-install', 'failed', 0, { PROBLEMS: problems.join('; ') });
p.log.warn(brandBody('The Codex provider is not fully installed:'));
for (const problem of problems) console.log(k.dim(`${problem}`));
p.log.warn(
brandBody(
'Finish it with your coding agent of choice: open Codex CLI or Claude Code in this repo and run the /add-codex skill. Setup will continue — Codex groups will work once the install completes.',
),
);
}
// Self-registration: the setup picker and the standalone `provider-auth` step
// render from the registry — this call is codex's only reach-in to the setup
// flow (guarded by the barrel-driven registration test).
registerSetupProvider({
value: 'codex',
label: 'Codex',
hint: 'OpenAI — ChatGPT subscription or API key',
runAuth: runCodexAuthStep,
runInstallCheck: runCodexInstallCheck,
offerFailureAssist: offerCodexFailureAssist,
});
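The session-isolation invariant in this file comes down to "create a throwaway directory, use it, always delete it". The sketch below shows one way to express that with `try`/`finally`. It is an alternative shape, not the code in the diff: `withThrowawayDir` is a hypothetical helper, and it only guarantees cleanup when failures are thrown rather than ended with `process.exit`, which skips `finally` blocks (the reason the source removes the directory explicitly on each failure branch).

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

// Hypothetical helper, not part of the diff.
async function withThrowawayDir<T>(prefix: string, fn: (dir: string) => Promise<T>): Promise<T> {
  // mkdtempSync appends random characters to the prefix, so concurrent
  // runs never share a directory.
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), prefix));
  try {
    return await fn(dir);
  } finally {
    // Runs on success and on thrown errors; force: true tolerates the
    // directory already being gone.
    fs.rmSync(dir, { recursive: true, force: true });
  }
}
```

A caller would pass the directory to the child process as `CODEX_HOME` inside the callback, read `auth.json` from it, and return once the credential has been vaulted.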
-197
@@ -1,197 +0,0 @@
/**
* Unit tests for the startup circuit breaker.
*
* Covers state transitions, the documented backoff schedule, and the
* fresh-install case where DATA_DIR doesn't exist yet (the breaker runs
* before initDb, so it has to create the dir itself).
*/
import fs from 'fs';
import os from 'os';
import path from 'path';
import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest';
// vi.mock factories are hoisted above imports, so they can't close over local
// consts. vi.hoisted is hoisted alongside the mock and runs before any
// `import` — so it can only use globals (no path/os modules). Use require()
// inside the callback to compute the test dir.
const { TEST_DIR } = vi.hoisted(() => {
const nodePath = require('path') as typeof import('path');
const nodeOs = require('os') as typeof import('os');
return { TEST_DIR: nodePath.join(nodeOs.tmpdir(), 'nanoclaw-cb-test') };
});
const CB_PATH = path.join(TEST_DIR, 'circuit-breaker.json');
vi.mock('./config.js', async () => {
const actual = await vi.importActual<typeof import('./config.js')>('./config.js');
return { ...actual, DATA_DIR: TEST_DIR };
});
vi.mock('./log.js', () => ({
log: {
debug: vi.fn(),
info: vi.fn(),
warn: vi.fn(),
error: vi.fn(),
fatal: vi.fn(),
},
}));
import { enforceStartupBackoff, resetCircuitBreaker } from './circuit-breaker.js';
function readState(): { attempt: number; timestamp: string } {
return JSON.parse(fs.readFileSync(CB_PATH, 'utf-8'));
}
function seedState(attempt: number, timestamp = new Date().toISOString()): void {
fs.writeFileSync(CB_PATH, JSON.stringify({ attempt, timestamp }));
}
beforeEach(() => {
if (fs.existsSync(TEST_DIR)) fs.rmSync(TEST_DIR, { recursive: true });
fs.mkdirSync(TEST_DIR, { recursive: true });
});
afterEach(() => {
vi.useRealTimers();
if (fs.existsSync(TEST_DIR)) fs.rmSync(TEST_DIR, { recursive: true });
});
describe('resetCircuitBreaker', () => {
it('deletes the state file', () => {
seedState(3);
expect(fs.existsSync(CB_PATH)).toBe(true);
resetCircuitBreaker();
expect(fs.existsSync(CB_PATH)).toBe(false);
});
it('is a no-op when the file does not exist', () => {
expect(fs.existsSync(CB_PATH)).toBe(false);
expect(() => resetCircuitBreaker()).not.toThrow();
});
});
describe('enforceStartupBackoff — state transitions', () => {
it('first run writes attempt=1 and does not delay', async () => {
vi.useFakeTimers();
const start = Date.now();
await enforceStartupBackoff();
// No timers should have been queued — clean first start is 0s.
expect(Date.now() - start).toBe(0);
expect(readState().attempt).toBe(1);
});
it('within reset window, attempt is incremented', async () => {
seedState(1);
vi.useFakeTimers();
const promise = enforceStartupBackoff();
await vi.runAllTimersAsync();
await promise;
expect(readState().attempt).toBe(2);
});
it('outside reset window (>1h), attempt resets to 1', async () => {
const longAgo = new Date(Date.now() - 2 * 60 * 60 * 1000).toISOString();
seedState(5, longAgo);
await enforceStartupBackoff();
expect(readState().attempt).toBe(1);
});
it('exactly at the reset window boundary still counts as "within"', async () => {
// RESET_WINDOW_MS = 60min. Use 59min59s to stay inside even if the test
// takes a few ms to execute.
const justInside = new Date(Date.now() - (60 * 60 * 1000 - 1000)).toISOString();
seedState(2, justInside);
vi.useFakeTimers();
const promise = enforceStartupBackoff();
await vi.runAllTimersAsync();
await promise;
expect(readState().attempt).toBe(3);
});
it('treats a malformed state file as no prior state', async () => {
fs.writeFileSync(CB_PATH, '{ this is not json');
await enforceStartupBackoff();
expect(readState().attempt).toBe(1);
});
it('resetCircuitBreaker after a startup actually clears the counter for the next startup', async () => {
// Simulate: crash, restart (attempt=2), graceful shutdown, restart again.
seedState(1);
vi.useFakeTimers();
const p1 = enforceStartupBackoff();
await vi.runAllTimersAsync();
await p1;
expect(readState().attempt).toBe(2);
resetCircuitBreaker();
expect(fs.existsSync(CB_PATH)).toBe(false);
await enforceStartupBackoff();
expect(readState().attempt).toBe(1);
});
});
describe('enforceStartupBackoff — backoff schedule', () => {
/**
* Documented schedule:
*
* clean start → 1 crash → 2 crash → 3 crash → 4 crash → 5 crash → 6+ crash
* 0s → 0s → 10s → 30s → 2min → 5min → 15min cap
*
* Each row is [priorAttempt seeded in the file, expected delay this run
* produces in seconds]. priorAttempt=null = no file = very first start.
*
* To assert the *requested* delay (not just observed elapsed real time),
* we spy on global.setTimeout and look at the longest call. runAllTimersAsync
* lets the function complete so we can move on.
*/
const cases: Array<{ label: string; priorAttempt: number | null; expectedDelaySec: number }> = [
{ label: 'clean first start (no file)', priorAttempt: null, expectedDelaySec: 0 },
{ label: 'first crash (attempt=2)', priorAttempt: 1, expectedDelaySec: 0 },
{ label: 'second crash (attempt=3)', priorAttempt: 2, expectedDelaySec: 10 },
{ label: 'third crash (attempt=4)', priorAttempt: 3, expectedDelaySec: 30 },
{ label: 'fourth crash (attempt=5)', priorAttempt: 4, expectedDelaySec: 120 },
{ label: 'fifth crash (attempt=6)', priorAttempt: 5, expectedDelaySec: 300 },
{ label: 'sixth crash (attempt=7) — cap', priorAttempt: 6, expectedDelaySec: 900 },
{ label: 'far past cap (attempt=20)', priorAttempt: 19, expectedDelaySec: 900 },
];
for (const { label, priorAttempt, expectedDelaySec } of cases) {
it(`${label}: delays ${expectedDelaySec}s`, async () => {
if (priorAttempt !== null) seedState(priorAttempt);
vi.useFakeTimers();
const setTimeoutSpy = vi.spyOn(global, 'setTimeout');
const promise = enforceStartupBackoff();
await vi.runAllTimersAsync();
await promise;
// enforceStartupBackoff only calls setTimeout when delaySec > 0. Pick
// the longest delay it requested (vitest may queue small internal
// timers we don't care about).
const requestedDelays = setTimeoutSpy.mock.calls.map((c) => c[1] ?? 0);
const maxDelayMs = requestedDelays.length ? Math.max(...requestedDelays) : 0;
expect(maxDelayMs).toBe(expectedDelaySec * 1000);
});
}
});
describe('enforceStartupBackoff — fresh install (DATA_DIR missing)', () => {
/**
* The breaker runs before initDb (which is what creates DATA_DIR). On a
* fresh checkout the dir doesn't exist yet, so write() must create it
* before writing the state file — otherwise the host crashes on its very
* first start.
*/
it('creates DATA_DIR on demand and does not throw', async () => {
fs.rmSync(TEST_DIR, { recursive: true });
expect(fs.existsSync(TEST_DIR)).toBe(false);
await expect(enforceStartupBackoff()).resolves.toBeUndefined();
expect(fs.existsSync(TEST_DIR)).toBe(true);
expect(fs.existsSync(CB_PATH)).toBe(true);
expect(readState().attempt).toBe(1);
});
});
-84
@@ -1,84 +0,0 @@
import fs from 'fs';
import path from 'path';
import { DATA_DIR } from './config.js';
import { log } from './log.js';
const CB_PATH = path.join(DATA_DIR, 'circuit-breaker.json');
const RESET_WINDOW_MS = 60 * 60 * 1000; // 1 hour
// Index = number of consecutive crashes (0 = clean start, attempt 1).
// 6+ crashes capped at 15min.
const BACKOFF_SCHEDULE_S = [0, 0, 10, 30, 120, 300, 900];
interface CircuitBreakerState {
attempt: number;
timestamp: string;
}
function read(): CircuitBreakerState | null {
try {
const raw = fs.readFileSync(CB_PATH, 'utf-8');
return JSON.parse(raw) as CircuitBreakerState;
} catch {
return null;
}
}
function write(state: CircuitBreakerState): void {
// The breaker runs before initDb (which is what creates DATA_DIR), so on a
// fresh checkout the dir may not exist yet.
fs.mkdirSync(DATA_DIR, { recursive: true });
fs.writeFileSync(CB_PATH, JSON.stringify(state, null, 2) + '\n');
}
function getDelay(attempt: number): number {
const idx = Math.min(attempt - 1, BACKOFF_SCHEDULE_S.length - 1);
return BACKOFF_SCHEDULE_S[idx];
}
export function resetCircuitBreaker(): void {
try {
fs.unlinkSync(CB_PATH);
log.info('Circuit breaker reset on clean shutdown');
} catch {}
}
export async function enforceStartupBackoff(): Promise<void> {
const now = new Date();
const prev = read();
let attempt: number;
if (!prev) {
attempt = 1;
} else {
const elapsedMs = now.getTime() - new Date(prev.timestamp).getTime();
if (elapsedMs < RESET_WINDOW_MS) {
attempt = prev.attempt + 1;
log.warn('Previous startup was not a clean shutdown', {
previousAttempt: prev.attempt,
previousTimestamp: prev.timestamp,
elapsedSec: Math.round(elapsedMs / 1000),
});
} else {
attempt = 1;
log.info('Circuit breaker reset — last startup was over 1h ago', {
previousAttempt: prev.attempt,
previousTimestamp: prev.timestamp,
});
}
}
write({ attempt, timestamp: now.toISOString() });
const delaySec = getDelay(attempt);
if (delaySec > 0) {
const resumeAt = new Date(now.getTime() + delaySec * 1000).toISOString();
log.warn('Circuit breaker: delaying startup due to repeated crashes', {
attempt,
delaySec,
resumeAt,
});
await new Promise((resolve) => setTimeout(resolve, delaySec * 1000));
log.info('Circuit breaker: backoff complete, resuming startup', { attempt });
}
}
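For readers checking the table-driven tests against the schedule, the attempt-to-delay lookup can be sketched on its own. This is an illustrative copy, not the module itself: `delayForAttempt` is a hypothetical name, and the lower clamp to attempt 1 is an added assumption that the original `getDelay` does not have.

```typescript
// Illustrative copy of the schedule from circuit-breaker.ts.
// Index = attempt - 1, so the first two starts are free and the 7th+ hit the cap.
const BACKOFF_SCHEDULE_S: readonly number[] = [0, 0, 10, 30, 120, 300, 900];

// Hypothetical helper: maps a 1-based attempt number to a delay in seconds.
function delayForAttempt(attempt: number): number {
  // Assumption (not in the original): treat attempt < 1 as attempt 1.
  const clamped = Math.max(attempt, 1);
  // Clamp to the last slot so every attempt past the schedule gets the 900s cap.
  const idx = Math.min(clamped - 1, BACKOFF_SCHEDULE_S.length - 1);
  return BACKOFF_SCHEDULE_S[idx] ?? 0;
}

// Print the schedule for the attempts the tests exercise.
for (const attempt of [1, 2, 3, 4, 5, 6, 7, 20]) {
  console.log(`attempt ${attempt}: ${delayForAttempt(attempt)}s`);
}
```

The clamp at the top of the schedule is what makes "far past cap (attempt=20)" resolve to the same 900 seconds as attempt 7.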
+19 -29
@@ -58,7 +58,7 @@ const activeContainers = new Map<string, { process: ChildProcess; containerName:
* a duplicate container against the same session directory, producing
* racy double-replies.
*/
const wakePromises = new Map<string, Promise<boolean>>();
const wakePromises = new Map<string, Promise<void>>();
export function getActiveContainerCount(): number {
return activeContainers.size;
@@ -73,32 +73,20 @@ export function isContainerRunning(sessionId: string): boolean {
* (the in-flight wake promise is reused).
*
* The container runs the v2 agent-runner which polls the session DB.
*
* Contract: never throws. Returns `true` on successful spawn, `false` on
* transient spawn failure (e.g. OneCLI gateway unreachable). Callers don't
* need to wrap — the inbound row stays pending and host-sweep retries on
* its next tick. Callers that care (e.g. the router's typing indicator)
* can branch on the boolean.
*/
export function wakeContainer(session: Session): Promise<boolean> {
export function wakeContainer(session: Session): Promise<void> {
if (activeContainers.has(session.id)) {
log.debug('Container already running', { sessionId: session.id });
return Promise.resolve(true);
return Promise.resolve();
}
const existing = wakePromises.get(session.id);
if (existing) {
log.debug('Container wake already in-flight — joining existing promise', { sessionId: session.id });
return existing;
}
const promise = spawnContainer(session)
.then(() => true)
.catch((err) => {
log.warn('wakeContainer failed — host-sweep will retry', { sessionId: session.id, err });
return false;
})
.finally(() => {
wakePromises.delete(session.id);
});
const promise = spawnContainer(session).finally(() => {
wakePromises.delete(session.id);
});
wakePromises.set(session.id, promise);
return promise;
}
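The in-flight map is what prevents a second spawn for the same session. A minimal sketch of that join-the-pending-promise pattern follows, assuming a caller-supplied `spawn` function; `wakeOnce`, `inFlight`, and `fakeSpawn` are hypothetical names used only for illustration and are not part of container-runner.ts.

```typescript
// Hypothetical registry of pending spawns, keyed by session id.
const inFlight = new Map<string, Promise<void>>();

// Start a spawn for `id`, or join the one that is already pending.
function wakeOnce(id: string, spawn: (id: string) => Promise<void>): Promise<void> {
  const existing = inFlight.get(id);
  if (existing) return existing; // concurrent caller: reuse the pending promise

  // Clear the entry whether the spawn resolves or rejects, so a later
  // call can try again instead of joining a settled promise.
  const promise = spawn(id).finally(() => {
    inFlight.delete(id);
  });
  inFlight.set(id, promise);
  return promise;
}

// Usage: two back-to-back wakes for one session trigger a single spawn.
let spawnCount = 0;
const fakeSpawn = async (_id: string): Promise<void> => {
  spawnCount += 1;
};
const first = wakeOnce('session-1', fakeSpawn);
const second = wakeOnce('session-1', fakeSpawn);
console.log(spawnCount, first === second);
```

Nothing in this sketch catches a rejection, so a failed spawn rejects for every caller that joined the promise. That matches the `Promise<void>` variant in this hunk, where handling the failure is left to the caller.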
@@ -447,18 +435,20 @@ async function buildContainerArgs(
}
// OneCLI gateway — injects HTTPS_PROXY + certs so container API calls
// are routed through the agent vault for credential injection. Treated as
// a transient hard failure: if we can't wire the gateway, we don't spawn.
// The caller (router or host-sweep) catches the throw, leaves the inbound
// message pending, and the next sweep tick retries.
if (agentIdentifier) {
await onecli.ensureAgent({ name: agentGroup.name, identifier: agentIdentifier });
// are routed through the agent vault for credential injection.
try {
if (agentIdentifier) {
await onecli.ensureAgent({ name: agentGroup.name, identifier: agentIdentifier });
}
const onecliApplied = await onecli.applyContainerConfig(args, { addHostMapping: false, agent: agentIdentifier });
if (onecliApplied) {
log.info('OneCLI gateway applied', { containerName });
} else {
log.warn('OneCLI gateway not applied — container will have no credentials', { containerName });
}
} catch (err) {
log.warn('OneCLI gateway error — container will have no credentials', { containerName, err });
}
const onecliApplied = await onecli.applyContainerConfig(args, { addHostMapping: false, agent: agentIdentifier });
if (!onecliApplied) {
throw new Error('OneCLI gateway not applied — refusing to spawn container without credentials');
}
log.info('OneCLI gateway applied', { containerName });
// Host gateway
args.push(...hostGatewayArgs());
-2
@@ -168,8 +168,6 @@ async function sweepSession(session: Session): Promise<void> {
const dueCount = countDueMessages(inDb);
if (dueCount > 0 && !isContainerRunning(session.id)) {
log.info('Waking container for due messages', { sessionId: session.id, count: dueCount });
// wakeContainer never throws — transient spawn failures (OneCLI down,
// etc.) return false and leave messages pending for the next tick.
await wakeContainer(session);
}
+2 -13
@@ -7,7 +7,6 @@
import path from 'path';
import { DATA_DIR } from './config.js';
import { enforceStartupBackoff, resetCircuitBreaker } from './circuit-breaker.js';
import { migrateGroupsToClaudeLocal } from './claude-md-compose.js';
import { initDb } from './db/connection.js';
import { runMigrations } from './db/migrations/index.js';
@@ -59,9 +58,6 @@ import { initChannelAdapters, teardownChannelAdapters, getChannelAdapter } from
async function main(): Promise<void> {
log.info('NanoClaw starting');
// 0. Circuit breaker — backoff on rapid restarts
await enforceStartupBackoff();
// 1. Init central DB
const dbPath = path.join(DATA_DIR, 'v2.db');
const db = initDb(dbPath);
@@ -178,15 +174,8 @@ async function shutdown(signal: string): Promise<void> {
}
stopDeliveryPolls();
stopHostSweep();
try {
await teardownChannelAdapters();
} finally {
// Always reset on graceful shutdown — even if teardown threw, we got here
// via SIGTERM/SIGINT, not a crash, so the next start shouldn't be counted
// as one.
resetCircuitBreaker();
process.exit(0);
}
await teardownChannelAdapters();
process.exit(0);
}
process.on('SIGTERM', () => shutdown('SIGTERM'));
+112
@@ -0,0 +1,112 @@
/**
* The 32KB Codex project-doc cap must DEGRADE, never throw: composeGroupAgentsMd
* runs inside the provider contribution at every spawn, and a throw there rides
* wakeContainer's transient-retry contract — host-sweep respawns every 60s
* forever and the group goes silently dark (a permanent condition disguised as
* a transient one). Oversized docs drop their largest optional instruction
* sections, keep the core contract, and say so in the doc.
*/
import fs from 'fs';
import os from 'os';
import path from 'path';
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
vi.mock('../config.js', async (importOriginal) => ({
...(await importOriginal<typeof import('../config.js')>()),
DATA_DIR: '/tmp/nanoclaw-agents-md-test/data',
}));
import { composeGroupAgentsMd, CODEX_PROJECT_DOC_MAX_BYTES } from './codex-agents-md.js';
import { closeDb, createAgentGroup, initTestDb, runMigrations } from '../db/index.js';
import { ensureContainerConfig, updateContainerConfigJson } from '../db/container-configs.js';
import type { AgentGroup } from '../types.js';
const TEST_ROOT = '/tmp/nanoclaw-agents-md-test';
function group(folder: string): AgentGroup {
return {
id: `ag-${folder}`,
name: folder,
folder,
agent_provider: null,
created_at: new Date().toISOString(),
} as AgentGroup;
}
describe('composeGroupAgentsMd cap handling', () => {
beforeEach(() => {
if (fs.existsSync(TEST_ROOT)) fs.rmSync(TEST_ROOT, { recursive: true });
fs.mkdirSync(path.join(TEST_ROOT, 'data'), { recursive: true });
const db = initTestDb();
runMigrations(db);
});
afterEach(() => {
closeDb();
if (fs.existsSync(TEST_ROOT)) fs.rmSync(TEST_ROOT, { recursive: true });
});
it('writes the doc untouched when under the cap', () => {
const g = group('small');
createAgentGroup(g);
ensureContainerConfig(g.id);
const groupDir = fs.mkdtempSync(path.join(os.tmpdir(), 'agents-md-'));
try {
composeGroupAgentsMd(g, groupDir);
const doc = fs.readFileSync(path.join(groupDir, 'AGENTS.md'), 'utf-8');
expect(doc).not.toContain('Omitted for size');
// Agent-authored skills must be told their persistent home — without
// this, authored skills land on ephemeral container paths and vanish.
expect(doc).toContain('/workspace/agent/skills');
expect(Buffer.byteLength(doc, 'utf-8')).toBeLessThanOrEqual(CODEX_PROJECT_DOC_MAX_BYTES);
} finally {
fs.rmSync(groupDir, { recursive: true, force: true });
}
});
it('inlines the memory index so recall does not depend on a file read', () => {
const g = group('with-memory');
createAgentGroup(g);
ensureContainerConfig(g.id);
const groupDir = fs.mkdtempSync(path.join(os.tmpdir(), 'agents-md-'));
try {
fs.mkdirSync(path.join(groupDir, 'memory'), { recursive: true });
fs.writeFileSync(
path.join(groupDir, 'memory', 'index.md'),
'# Memory Index\n- [People](memories/people/) - notes about people and their preferences\n',
);
composeGroupAgentsMd(g, groupDir);
const doc = fs.readFileSync(path.join(groupDir, 'AGENTS.md'), 'utf-8');
expect(doc).toContain('Current memory index');
expect(doc).toContain('notes about people and their preferences');
} finally {
fs.rmSync(groupDir, { recursive: true, force: true });
}
});
it('degrades instead of throwing when MCP instructions push the doc over the cap', () => {
const g = group('oversized');
createAgentGroup(g);
ensureContainerConfig(g.id);
updateContainerConfigJson(g.id, 'mcp_servers', {
bloated: { command: 'x', instructions: 'y'.repeat(CODEX_PROJECT_DOC_MAX_BYTES + 1024) },
lean: { command: 'x', instructions: 'short and useful' },
});
const groupDir = fs.mkdtempSync(path.join(os.tmpdir(), 'agents-md-'));
try {
composeGroupAgentsMd(g, groupDir); // must not throw
const doc = fs.readFileSync(path.join(groupDir, 'AGENTS.md'), 'utf-8');
expect(Buffer.byteLength(doc, 'utf-8')).toBeLessThanOrEqual(CODEX_PROJECT_DOC_MAX_BYTES);
// Largest optional section dropped, named in the doc; the rest survive.
expect(doc).toContain('Omitted for size');
expect(doc).toContain('MCP Server: bloated');
expect(doc).toContain('short and useful');
expect(doc).toContain('Memory System');
} finally {
fs.rmSync(groupDir, { recursive: true, force: true });
}
});
});
+188
@@ -0,0 +1,188 @@
/**
* AGENTS.md composition for codex agent groups — codex-owned payload code.
*
* AGENTS.md is Codex's project doc (its CLAUDE.md equivalent). Composed fresh
* on every spawn by the codex provider contribution (see ./codex.ts) from:
* - the shared base (`container/AGENTS.md`)
* - a pointer to the runner-scaffolded memory system (created container-side
* at boot via the `usesMemoryScaffold` capability — nothing is written here)
* - a pointer to codex-native skills under `.agents/skills`
* - each enabled NanoClaw module's `*.instructions.md` fragment
* - MCP server `instructions` from container.json
*
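* Codex hard-caps project-doc loading (`project_doc_max_bytes`, mirrored in
* the container provider's config.toml writer). Compose never throws on an
* oversized doc: it drops the largest optional instruction sections until the
* doc fits and logs what was dropped, rather than letting Codex truncate
* silently (see fitAgentsMdToCap below).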
*/
import fs from 'fs';
import path from 'path';
import type { McpServerConfig } from '../container-config.js';
import { getContainerConfig } from '../db/container-configs.js';
import { log } from '../log.js';
import type { AgentGroup } from '../types.js';
export const CODEX_PROJECT_DOC_MAX_BYTES = 32 * 1024;
export const CODEX_PROJECT_DOC_WARN_BYTES = 28 * 1024;
const HEADER = '<!-- Composed at spawn. Do not edit. Edit memory/system/definition.md for memory behavior. -->';
const MCP_TOOLS_HOST_SUBPATH = path.join('container', 'agent-runner', 'src', 'mcp-tools');
const MEMORY_POINTER = [
'Editable memory-system definition: `/workspace/agent/memory/system/definition.md`.',
'Top memory index: `/workspace/agent/memory/index.md`.',
'Read the definition and index, then use memories, data, and conversation archives when relevant.',
'Stored user preferences are binding: before your first reply in a session, check the index below and read any memory file relevant to the user or the request, and apply it without being asked.',
'Do not use `AGENTS.local.md` or `AGENTS.override.md` for memory.',
].join('\n\n');
/**
* Inline the group's current memory index into the composed doc. Recall must
* not depend on the model choosing to read a file before its first reply —
* with the map already in the system prompt, applying a stored preference is
* one hop (read the relevant memory file), not three. The index is small
* (hundreds of bytes); the 32KB fit logic in fitAgentsMdToCap (below) bounds
* the worst case.
*/
function memoryIndexInline(groupDir: string): string {
const indexPath = path.join(groupDir, 'memory', 'index.md');
if (!fs.existsSync(indexPath)) return '';
const content = fs.readFileSync(indexPath, 'utf-8').trim();
if (!content) return '';
return ['Current memory index (paths relative to `/workspace/agent/memory/`):', content].join('\n\n');
}
const NATIVE_RUNTIME_SKILLS_POINTER = [
'Selected NanoClaw runtime skills are available as Codex-native skills at `/workspace/agent/.agents/skills`.',
'Each skill directory contains a `SKILL.md` with its trigger description plus any supporting files, and points to the read-only shared skill source under `/app/skills`.',
'Use skill discovery to load these skills only when their descriptions match the task. Full skill instructions live in the skill directories, not in `AGENTS.md`.',
'Skills YOU author or install yourself go in `/workspace/agent/skills/<name>/SKILL.md` — persistent, provider-neutral (they load under any agent provider this group runs on), and yours to write and update over time. They are linked into `$CODEX_HOME/skills` automatically at boot. Never write skills anywhere else: paths outside your workspace and `$CODEX_HOME` are ephemeral.',
].join('\n\n');
interface AgentsMdSection {
name: string;
content: string;
}
export function composeGroupAgentsMd(group: AgentGroup, groupDir: string): void {
if (!fs.existsSync(groupDir)) fs.mkdirSync(groupDir, { recursive: true });
const configRow = getContainerConfig(group.id);
const mcpServers: Record<string, McpServerConfig> = configRow
? (JSON.parse(configRow.mcp_servers) as Record<string, McpServerConfig>)
: {};
const sections: AgentsMdSection[] = [{ name: 'header', content: HEADER }];
const pushSection = (name: string, ...content: string[]): void => {
const body = content
.map((part) => part.trim())
.filter(Boolean)
.join('\n\n');
if (body) sections.push({ name, content: `# ${name}\n\n${body}` });
};
const sharedBase = path.join(process.cwd(), 'container', 'AGENTS.md');
if (fs.existsSync(sharedBase)) {
pushSection('NanoClaw Runtime Contract', fs.readFileSync(sharedBase, 'utf-8'));
}
pushSection('Memory System', MEMORY_POINTER, memoryIndexInline(groupDir));
pushSection('Native Runtime Skills', NATIVE_RUNTIME_SKILLS_POINTER);
const cliDisabled = configRow?.cli_scope === 'disabled';
const mcpToolsHostDir = path.join(process.cwd(), MCP_TOOLS_HOST_SUBPATH);
if (fs.existsSync(mcpToolsHostDir)) {
for (const entry of fs.readdirSync(mcpToolsHostDir).sort()) {
const match = entry.match(/^(.+)\.instructions\.md$/);
if (!match) continue;
const moduleName = match[1];
if (moduleName === 'cli' && cliDisabled) continue;
pushSection(`NanoClaw Module: ${moduleName}`, fs.readFileSync(path.join(mcpToolsHostDir, entry), 'utf-8'));
}
}
for (const [name, mcp] of Object.entries(mcpServers)) {
if (mcp.instructions) {
pushSection(`MCP Server: ${name}`, mcp.instructions);
}
}
const content = fitAgentsMdToCap(group, sections);
writeAtomic(path.join(groupDir, 'AGENTS.md'), content);
}
function renderAgentsMd(sections: AgentsMdSection[]): string {
return (
sections
.map((section) => section.content.trim())
.filter(Boolean)
.join('\n\n') + '\n'
);
}
/**
* Fit the doc under Codex's 32KB project-doc cap by DEGRADING, never
* throwing: a per-spawn throw rides wakeContainer's transient-retry contract
* — host-sweep respawns every 60s forever and the group goes silently dark.
* Instead, drop the largest optional instruction sections (per-module and
* per-MCP-server) until the doc fits, log what was dropped at error level,
* and tell the agent in the doc itself. The core contract (header, runtime
* contract, memory, skills pointer) is never dropped.
*/
function fitAgentsMdToCap(group: AgentGroup, sections: AgentsMdSection[]): string {
const sectionBytes = (): { section: string; bytes: number }[] =>
sections.map((section) => ({ section: section.name, bytes: Buffer.byteLength(section.content, 'utf-8') }));
const isDroppable = (s: AgentsMdSection): boolean =>
s.name.startsWith('MCP Server: ') || s.name.startsWith('NanoClaw Module: ');
const dropped: string[] = [];
const render = (): string => {
const parts = [...sections];
if (dropped.length > 0) {
parts.push({
name: 'omitted',
content:
`# Omitted for size\n\nThese instruction sections were omitted to fit Codex's project-doc cap: ` +
`${dropped.join(', ')}. Their tools still work; consult each tool's own description.`,
});
}
return renderAgentsMd(parts);
};
let content = render();
while (Buffer.byteLength(content, 'utf-8') > CODEX_PROJECT_DOC_MAX_BYTES) {
const candidates = sections
.filter(isDroppable)
.sort((a, b) => Buffer.byteLength(b.content, 'utf-8') - Buffer.byteLength(a.content, 'utf-8'));
if (candidates.length === 0) break; // only core left — write oversized rather than brick the group
sections.splice(sections.indexOf(candidates[0]), 1);
dropped.push(candidates[0].name);
content = render();
}
const bytes = Buffer.byteLength(content, 'utf-8');
if (dropped.length > 0) {
log.error('AGENTS.md exceeded Codex project-doc cap — dropped largest instruction sections', {
group: group.name,
bytes,
maxBytes: CODEX_PROJECT_DOC_MAX_BYTES,
dropped,
sections: sectionBytes(),
});
} else if (bytes >= CODEX_PROJECT_DOC_WARN_BYTES) {
log.warn('AGENTS.md is near Codex project-doc cap', {
group: group.name,
bytes,
warnBytes: CODEX_PROJECT_DOC_WARN_BYTES,
maxBytes: CODEX_PROJECT_DOC_MAX_BYTES,
sections: sectionBytes(),
});
}
return content;
}
function writeAtomic(filePath: string, content: string): void {
const tmp = `${filePath}.tmp-${process.pid}`;
fs.writeFileSync(tmp, content);
fs.renameSync(tmp, filePath);
}
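The degrade-to-fit loop in `fitAgentsMdToCap` can be illustrated in isolation. The sketch below is a simplified stand-in under stated assumptions: `Section`, `fitToCap`, and the explicit `droppable` flag are hypothetical (the original decides droppability from the section name prefix), byte length is measured with `TextEncoder` instead of `Buffer`, and the "Omitted for size" notice and logging are left out.

```typescript
// Hypothetical section shape; the real code infers droppability from the name.
interface Section {
  name: string;
  content: string;
  droppable: boolean;
}

const utf8Bytes = (s: string): number => new TextEncoder().encode(s).length;

// Drop the largest droppable section until the rendered doc fits the cap.
function fitToCap(sections: Section[], maxBytes: number): { text: string; dropped: string[] } {
  const kept = [...sections];
  const dropped: string[] = [];
  const render = (): string => kept.map((s) => s.content).join('\n\n') + '\n';

  let text = render();
  while (utf8Bytes(text) > maxBytes) {
    const largest = kept
      .filter((s) => s.droppable)
      .sort((a, b) => utf8Bytes(b.content) - utf8Bytes(a.content))[0];
    if (!largest) break; // only core sections left: return oversized rather than throw
    kept.splice(kept.indexOf(largest), 1);
    dropped.push(largest.name);
    text = render();
  }
  return { text, dropped };
}

// Usage: a 100-byte optional section is dropped to fit a 40-byte cap.
const result = fitToCap(
  [
    { name: 'core', content: 'c'.repeat(10), droppable: false },
    { name: 'big', content: 'b'.repeat(100), droppable: true },
    { name: 'small', content: 'sssss', droppable: true },
  ],
  40,
);
console.log(result.dropped, utf8Bytes(result.text));
```

Dropping the largest candidate first tends to remove the fewest sections, and the early `break` preserves the "never throw" property when only core content remains.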
@@ -0,0 +1,98 @@
/**
* In-process seam test for the codex HOST contribution's runtime consumption
* of core (the "consumes core" leg the skill guidelines require): drive the
* REAL registered contribution — via the real barrel and registry, never by
* importing codex.ts's internals — against a real test DB and a temp
* GROUPS_DIR/DATA_DIR, then hand its result to the real buildMounts.
*
* This is what catches core drift that typecheck can't: the
* DATA_DIR/v2-sessions/<id>/.codex-shared session layout, the
* getAgentGroup/getContainerConfig reads, the mcp_servers JSON shape consumed
* by composeGroupAgentsMd, and the mount set buildMounts assembles for a
* surfaces-providing provider. (codex-registration.test.ts only guards that
* the name is registered; provider-surfaces.test.ts drives a FAKE provider to
* test the seam itself.)
*/
import fs from 'fs';
import path from 'path';
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
const TEST_ROOT = '/tmp/nanoclaw-codex-host-contribution-test';
const DATA_DIR = path.join(TEST_ROOT, 'data');
const GROUPS_DIR = path.join(TEST_ROOT, 'groups');
vi.mock('../config.js', async (importOriginal) => ({
...(await importOriginal<typeof import('../config.js')>()),
DATA_DIR: '/tmp/nanoclaw-codex-host-contribution-test/data',
GROUPS_DIR: '/tmp/nanoclaw-codex-host-contribution-test/groups',
}));
import { buildMounts } from '../container-runner.js';
import { closeDb, createAgentGroup, initTestDb, runMigrations } from '../db/index.js';
import { ensureContainerConfig, updateContainerConfigJson } from '../db/container-configs.js';
import { getProviderContainerConfig } from './provider-container-registry.js';
import './index.js'; // the real host provider barrel
import type { ContainerConfig } from '../container-config.js';
import type { AgentGroup, Session } from '../types.js';
function group(id: string, folder: string): AgentGroup {
return { id, name: folder, folder, agent_provider: null, created_at: new Date().toISOString() } as AgentGroup;
}
describe('codex host contribution against real core', () => {
beforeEach(() => {
fs.rmSync(TEST_ROOT, { recursive: true, force: true });
fs.mkdirSync(DATA_DIR, { recursive: true });
fs.mkdirSync(GROUPS_DIR, { recursive: true });
runMigrations(initTestDb());
});
afterEach(() => {
closeDb();
fs.rmSync(TEST_ROOT, { recursive: true, force: true });
});
it('creates the per-group state dir, composes AGENTS.md from the real config row, and mounts both', () => {
const ag = group('ag-codex', 'codex-group');
createAgentGroup(ag);
ensureContainerConfig(ag.id);
updateContainerConfigJson(ag.id, 'mcp_servers', {
tooling: { command: 'x', instructions: 'use the tooling server for builds' },
});
const groupDir = path.join(GROUPS_DIR, ag.folder);
const contributionFn = getProviderContainerConfig('codex');
expect(contributionFn).toBeDefined();
const contribution = contributionFn!({
sessionDir: path.join(DATA_DIR, 'v2-sessions', ag.id, 'session-1'),
agentGroupId: ag.id,
groupDir,
selectedSkills: [],
hostEnv: process.env,
});
// Per-group codex state dir exists and is mounted RW at ~/.codex.
const codexShared = path.join(DATA_DIR, 'v2-sessions', ag.id, '.codex-shared');
expect(fs.existsSync(codexShared)).toBe(true);
// OneCLI's auth-stub mountpoint is pre-created — on macOS Docker can't
// create a missing file mountpoint inside a virtiofs dir mount (exit 125
// on first spawn). Red here = the pre-create line was dropped.
expect(fs.existsSync(path.join(codexShared, 'auth.json'))).toBe(true);
const codexMount = contribution.mounts?.find((m) => m.containerPath === '/home/node/.codex');
expect(codexMount).toMatchObject({ hostPath: codexShared, readonly: false });
// AGENTS.md composed from the real DB row — MCP instructions included.
const agentsMd = fs.readFileSync(path.join(groupDir, 'AGENTS.md'), 'utf-8');
expect(agentsMd).toContain('MCP Server: tooling');
expect(agentsMd).toContain('use the tooling server for builds');
// The full mount set: codex surfaces in, default claude surfaces out.
const session = { id: 'session-1', agent_group_id: ag.id } as Session;
const config: ContainerConfig = { mcpServers: {}, packages: { apt: [], npm: [] }, additionalMounts: [], skills: [] };
const mounts = buildMounts(ag, session, config, 'codex', contribution);
const containerPaths = mounts.map((m) => m.containerPath);
expect(containerPaths).toContain('/home/node/.codex');
expect(containerPaths.some((p) => p.endsWith('AGENTS.md'))).toBe(true);
expect(containerPaths).not.toContain('/home/node/.claude');
});
});
+27
@@ -0,0 +1,27 @@
/**
* Integration test for the codex provider's HOST-side reach-in: the self-registration
* import in the src/providers/index.ts barrel. Importing the barrel runs codex.ts's
* top-level registerProviderContainerConfig('codex', …); without that import line the
* host never wires the provider's per-session mounts / env passthrough.
*
* Behavioral, not structural, and BARREL-ONLY: it imports the real barrel (./index.js),
* never ./codex.js directly, then asserts the registry actually contains the provider.
* Importing the provider module directly (as codex.factory.test.ts does) self-registers
* it and would stay GREEN even if the barrel line were deleted — that is a unit test,
* not a registration guard. This test goes red if the barrel import is deleted/drifts,
* or the barrel fails to evaluate.
*
* A provider is a MULTI-POINT integration: this guards the HOST barrel; the CONTAINER
* barrel is guarded by the sibling bun test; the SDK/CLI dependency + Dockerfile install
* are guarded by the build/container legs (see the skill's validate step).
*/
import { describe, it, expect } from 'vitest';
import { listProviderContainerConfigNames } from './provider-container-registry.js';
import './index.js'; // the real host provider barrel — triggers each provider's self-registration
describe('codex provider host registration', () => {
it('registers codex host container-config via the barrel', () => {
expect(listProviderContainerConfigNames()).toContain('codex');
});
});
+108
@@ -0,0 +1,108 @@
/**
* Host-side container config for the `codex` provider.
*
* Registers with `providesAgentSurfaces` — codex owns its agent-facing
* surfaces, so core skips the default (Claude) compose/mounts and this
* contribution supplies them instead:
*
* - AGENTS.md — codex's project doc, composed fresh every spawn
* (see ./codex-agents-md.ts), mounted RO over the RW group dir.
* - .agents/skills — codex-native skill links synced to the group's
* container.json selection, mounted RO.
* - ~/.codex — a per-GROUP private state dir (`.codex-shared`), persistent
* across sessions so thread metadata and config.toml survive respawns.
*
* Credentials: NONE here — v2's invariant is that containers never receive
* raw API keys; OneCLI is the sole credential path. The OpenAI key (or
* ChatGPT token) lives in the OneCLI vault with an api.openai.com /
* chatgpt.com host pattern; codex's traffic already rides the gateway proxy
* (every spawn applies it — see container-runner.ts), which injects the real
* credential in flight. The container only ever sees the `onecli-managed`
* placeholder. Model/effort come from container_config (`ncl groups config
* update --model/--effort`), not env.
*
* Memory and exchange archiving are NOT handled here either — the
* container-side provider declares `usesMemoryScaffold` (the runner
* scaffolds the memory tree) and implements `onExchangeComplete` (the
* provider's own exchange-archive.ts persists each exchange).
*/
import fs from 'fs';
import path from 'path';
import { DATA_DIR } from '../config.js';
import { getAgentGroup } from '../db/agent-groups.js';
import { composeGroupAgentsMd } from './codex-agents-md.js';
import { registerProviderContainerConfig } from './provider-container-registry.js';
registerProviderContainerConfig(
'codex',
(ctx) => {
// Per-group codex state (config.toml, thread metadata).
const codexDir = path.join(DATA_DIR, 'v2-sessions', ctx.agentGroupId, '.codex-shared');
fs.mkdirSync(codexDir, { recursive: true });
// OneCLI bind-mounts its auth stub at ~/.codex/auth.json, nested inside
// this dir mount — Docker on macOS can't create a missing mountpoint file
// inside a virtiofs bind mount (runc: "mountpoint is outside of rootfs",
// exit 125), so it must exist before first spawn. Re-created here per
// spawn because a group reset that wipes .codex-shared re-triggers it.
// The 'a' flag creates the file if missing, never truncates an existing one.
fs.closeSync(fs.openSync(path.join(codexDir, 'auth.json'), 'a'));
// Compose this group's AGENTS.md and sync codex-native skill links.
const group = getAgentGroup(ctx.agentGroupId);
if (group) composeGroupAgentsMd(group, ctx.groupDir);
syncCodexSkillLinks(ctx.groupDir, ctx.selectedSkills);
// No credential env here — OneCLI's container-config drives auth end to
// end: the gateway serves a sentinel auth.json stub into ~/.codex for
// BOTH auth modes (ChatGPT subscription and API key) and swaps the real
// credential on the wire. Note the runner's CODEX_ENV_ALLOWLIST
// deliberately strips OPENAI_API_KEY from the codex process env — auth
// never rides env vars, only the stub. Duplicating any of it here would
// be a second source of truth.
const mounts = [{ hostPath: codexDir, containerPath: '/home/node/.codex', readonly: false }];
const composedAgentsMd = path.join(ctx.groupDir, 'AGENTS.md');
if (fs.existsSync(composedAgentsMd)) {
// RO over the RW group dir — regenerated every spawn, agent edits would
// be clobbered anyway. Memory behavior is edited via memory/system/.
mounts.push({ hostPath: composedAgentsMd, containerPath: '/workspace/agent/AGENTS.md', readonly: true });
}
const agentsDir = path.join(ctx.groupDir, '.agents');
if (fs.existsSync(agentsDir)) {
mounts.push({ hostPath: agentsDir, containerPath: '/workspace/agent/.agents', readonly: true });
}
return { mounts };
},
{ providesAgentSurfaces: true },
);
/**
* Sync `.agents/skills/<name>` symlinks to the selected skill set. Targets are
* container paths (`/app/skills/<name>`) — dangling on the host, valid inside.
*/
function syncCodexSkillLinks(groupDir: string, selectedSkills: string[]): void {
const skillsDir = path.join(groupDir, '.agents', 'skills');
fs.mkdirSync(skillsDir, { recursive: true });
const desired = new Set(selectedSkills);
for (const entry of fs.readdirSync(skillsDir)) {
const entryPath = path.join(skillsDir, entry);
let isSymlink = false;
try {
isSymlink = fs.lstatSync(entryPath).isSymbolicLink();
} catch {
continue;
}
if (isSymlink && !desired.has(entry)) fs.unlinkSync(entryPath);
}
for (const skill of selectedSkills) {
const linkPath = path.join(skillsDir, skill);
try {
fs.lstatSync(linkPath);
} catch {
fs.symlinkSync(`/app/skills/${skill}`, linkPath);
}
}
}
+3
@@ -4,3 +4,6 @@
// needs (claude, mock) don't appear here.
//
// Skills add a new provider by appending one import line below.
import './codex.js';
import './opencode.js';
@@ -0,0 +1,27 @@
/**
* Integration test for the opencode provider's HOST-side reach-in: the self-registration
* import in the src/providers/index.ts barrel. Importing the barrel runs opencode.ts's
* top-level registerProviderContainerConfig('opencode', …); without that import line the
* host never wires the provider's per-session mounts / env passthrough.
*
* Behavioral, not structural, and BARREL-ONLY: it imports the real barrel (./index.js),
* never ./opencode.js directly, then asserts the registry actually contains the provider.
* Importing the provider module directly (as opencode.factory.test.ts does) self-registers
* it and would stay GREEN even if the barrel line were deleted — that is a unit test,
* not a registration guard. This test goes red if the barrel import is deleted/drifts,
* or the barrel fails to evaluate.
*
* A provider is a MULTI-POINT integration: this guards the HOST barrel; the CONTAINER
* barrel is guarded by the sibling bun test; the SDK/CLI dependency + Dockerfile install
* are guarded by the build/container legs (see the skill's validate step).
*/
import { describe, it, expect } from 'vitest';
import { listProviderContainerConfigNames } from './provider-container-registry.js';
import './index.js'; // the real host provider barrel — triggers each provider's self-registration
describe('opencode provider host registration', () => {
it('registers opencode host container-config via the barrel', () => {
expect(listProviderContainerConfigNames()).toContain('opencode');
});
});
+49
@@ -0,0 +1,49 @@
/**
* Host-side container config for the `opencode` provider.
*
* OpenCode's `opencode serve` process stores state under XDG_DATA_HOME, which
* we pin to a per-session host directory mounted at /opencode-xdg. The
* OPENCODE_* env vars tell the CLI which provider/model to use at runtime
* (read on the host, injected into the container). NO_PROXY / no_proxy are
* merged with host values so the in-container OpenCode client can talk to
* 127.0.0.1 even when HTTPS_PROXY is set by OneCLI.
*/
import fs from 'fs';
import path from 'path';
import { registerProviderContainerConfig } from './provider-container-registry.js';
function mergeNoProxy(current: string | undefined, additions: string): string {
if (!current?.trim()) return additions;
const parts = new Set(
current
.split(/[\s,]+/)
.map((s) => s.trim())
.filter(Boolean),
);
for (const addition of additions.split(',')) {
const trimmed = addition.trim();
if (trimmed) parts.add(trimmed);
}
return [...parts].join(',');
}
registerProviderContainerConfig('opencode', (ctx) => {
const opencodeDir = path.join(ctx.sessionDir, 'opencode-xdg');
fs.mkdirSync(opencodeDir, { recursive: true });
const env: Record<string, string> = {
XDG_DATA_HOME: '/opencode-xdg',
NO_PROXY: mergeNoProxy(ctx.hostEnv.NO_PROXY, '127.0.0.1,localhost'),
no_proxy: mergeNoProxy(ctx.hostEnv.no_proxy, '127.0.0.1,localhost'),
};
for (const key of ['OPENCODE_PROVIDER', 'OPENCODE_MODEL', 'OPENCODE_SMALL_MODEL'] as const) {
const value = ctx.hostEnv[key];
if (value) env[key] = value;
}
return {
mounts: [{ hostPath: opencodeDir, containerPath: '/opencode-xdg', readonly: false }],
env,
};
});
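To make the `NO_PROXY` merge concrete, here is a standalone copy of `mergeNoProxy` with two sample calls. The function body follows the one above, except that `Array.from` replaces the spread so the sketch does not depend on the compile target; the input values (`example.internal`) are made up for illustration.

```typescript
// Standalone copy of mergeNoProxy for illustration.
function mergeNoProxy(current: string | undefined, additions: string): string {
  // No host value (or only whitespace): the additions are the whole list.
  if (!current || !current.trim()) return additions;

  // Host lists may be comma- or whitespace-separated; a Set removes duplicates
  // while keeping first-seen order.
  const parts = new Set<string>(
    current
      .split(/[\s,]+/)
      .map((s) => s.trim())
      .filter(Boolean),
  );
  for (const addition of additions.split(',')) {
    const trimmed = addition.trim();
    if (trimmed) parts.add(trimmed);
  }
  return Array.from(parts).join(',');
}

// Host variable unset: only the loopback entries are emitted.
const unset = mergeNoProxy(undefined, '127.0.0.1,localhost');
// Host variable already lists 127.0.0.1: it is not duplicated.
const merged = mergeNoProxy('example.internal, 127.0.0.1', '127.0.0.1,localhost');
console.log(unset);
console.log(merged);
```

Host entries come first in the merged value, so whatever the operator configured is preserved and the loopback entries are only appended when missing.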
+2 -6
@@ -27,7 +27,7 @@ import {
getMessagingGroupWithAgentCount,
} from './db/messaging-groups.js';
import { findSessionForAgent } from './db/sessions.js';
import { startTypingRefresh, stopTypingRefresh } from './modules/typing/index.js';
import { startTypingRefresh } from './modules/typing/index.js';
import { log } from './log.js';
import { resolveSession, writeSessionMessage, writeOutboundDirect } from './session-manager.js';
import { wakeContainer } from './container-runner.js';
@@ -457,11 +457,7 @@ async function deliverToAgent(
startTypingRefresh(session.id, session.agent_group_id, event.channelType, event.platformId, event.threadId);
const freshSession = getSession(session.id);
if (freshSession) {
const woke = await wakeContainer(freshSession);
// wakeContainer never throws — it returns false on transient spawn
// failure (host-sweep retries). Stop the typing indicator we just
// started so it doesn't leak; the inbound row stays pending.
if (!woke) stopTypingRefresh(freshSession.id);
await wakeContainer(freshSession);
}
}
}