Compare commits

53 Commits

Author SHA1 Message Date
Moshe Krupper 401c16fe95 feat(approvals): reject-with-reason — relay an optional decline reason to the agent
Add a third "Reject with reason…" button to module approval cards. Plain
Reject stays the instant fast path; the new option holds the row
(status='awaiting_reason'), DM-prompts the approver, and captures their
next DM (≤280 chars, truncated) as a one-line reason relayed to the
requesting agent as a single combined message. A ghosted hold is
finalized as a plain reject by the host sweep after ~5 min — restart-safe
via the durable DB row.

- Generalize the router message-interceptor to a list
  (registerMessageInterceptor) so approvals can capture replies alongside
  the permissions agent-naming flow.
- Share reject finalization across the instant, captured, and swept paths
  via finalizeReject.
- Scope: all module approvals (create_agent, install_packages,
  add_mcp_server); OneCLI credential cards are unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 09:25:22 +03:00
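The interceptor list described in 401c16fe95 can be sketched as follows. This is a minimal illustration, not the router's code: the commit names `registerMessageInterceptor` but not its signature, so the `InboundMessage` shape, the boolean "claimed" return value, `routeMessage`, and the first-claim-wins ordering are all assumptions.

```typescript
// Hypothetical shapes: the commit names registerMessageInterceptor but not its signature.
interface InboundMessage {
  userId: string;
  text: string;
}

// An interceptor returns true when it has consumed the message.
type MessageInterceptor = (msg: InboundMessage) => boolean;

const interceptors: MessageInterceptor[] = [];

// Adds an interceptor and returns a function that removes it again.
function registerMessageInterceptor(fn: MessageInterceptor): () => void {
  interceptors.push(fn);
  return () => {
    const i = interceptors.indexOf(fn);
    if (i !== -1) interceptors.splice(i, 1);
  };
}

// Offers the message to each interceptor in registration order; the first claim wins.
function routeMessage(msg: InboundMessage): "intercepted" | "routed" {
  for (const fn of interceptors) {
    if (fn(msg)) return "intercepted";
  }
  return "routed";
}

// Reason capture: approvers on hold have their next DM taken as a one-line reason.
const MAX_REASON_CHARS = 280;
const awaitingReason = new Set<string>();
const capturedReasons = new Map<string, string>();

registerMessageInterceptor((msg) => {
  if (!awaitingReason.has(msg.userId)) return false; // not on hold: let others see it
  awaitingReason.delete(msg.userId);
  const oneLine = msg.text.replace(/\s+/g, " ").trim();
  capturedReasons.set(msg.userId, oneLine.slice(0, MAX_REASON_CHARS));
  return true;
});
```

Because each interceptor only claims messages it is waiting for, a second flow (such as agent naming) can register alongside this one without either seeing the other's replies.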
github-actions[bot] 3f39f57653 chore: bump version to 2.1.18 2026-06-18 06:15:49 +00:00
gavrielc 1b86950f10 Merge pull request #2803 from sturdy4days/refactor/remove-dead-resolvegroupipcpath
refactor: remove dead resolveGroupIpcPath
2026-06-18 09:15:36 +03:00
gavrielc 8b435eb02d Merge branch 'main' into refactor/remove-dead-resolvegroupipcpath 2026-06-18 09:15:23 +03:00
gavrielc 7e2004f945 Merge pull request #2806 from arkjun/docs/add-korean-readme
docs: add Korean README
2026-06-18 09:14:52 +03:00
gavrielc 63901d1bde Merge branch 'main' into docs/add-korean-readme 2026-06-18 09:14:35 +03:00
gavrielc e5d96e348f Merge pull request #2805 from amit-shafnir/fix/setup-token-pty-parsing
fix(setup): parse Claude OAuth token from wrapped PTY capture
2026-06-18 09:12:56 +03:00
Juntai Park 439c24f1b7 docs: link Korean README in language switchers 2026-06-18 11:21:51 +09:00
Juntai Park 2a144bb8d6 docs: add Korean README 2026-06-18 11:21:50 +09:00
Amit Shafnir 197faaaa14 fix(setup): parse Claude OAuth token from wrapped PTY capture
`claude setup-token` runs under script(1) so the browser OAuth flow keeps a
TTY while we capture the printed token. On terminals that wrap long lines
(e.g. sbx), the token lands split across lines with padding spaces, and the
old parser — which stripped only ANSI codes and newlines — matched just the
first fragment and failed the trailing `AA` check. Login succeeded; only our
parse of the human-oriented output failed (`No sk-ant-oat…AA token found`).

Add setup/lib/captured-token.ts: normalize the capture (strip ANSI/control
bytes and all whitespace, un-wrapping the token) then extract. The TS caller
(claude-assist.ts) and the bash registration script now share it, so the
normalization rules can't drift. Placeholder lines like
`export CLAUDE_CODE_OAUTH_TOKEN=<token>` are ignored.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 00:20:21 +03:00
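The normalize-then-extract approach in 197faaaa14 can be sketched roughly as below. This is not the contents of `setup/lib/captured-token.ts`: the regular expressions, the function names, and the assumed token alphabet (URL-safe characters between the `sk-ant-oat` prefix and the `AA` suffix) are illustrative guesses, and the token in the example is fake.

```typescript
// Hypothetical sketch; not the contents of setup/lib/captured-token.ts.
// CSI escape sequences such as "\x1b[32m" (color) or "\x1b[2K" (erase line).
const ANSI_CSI = /\x1b\[[0-9;?]*[ -\/]*[@-~]/g;
// Remaining C0 control bytes and DEL; tab and newline are removed as whitespace below.
const CONTROL = /[\x00-\x08\x0b-\x1f\x7f]/g;
// Assumed token shape: "sk-ant-oat" prefix, URL-safe characters, "AA" suffix.
const TOKEN = /sk-ant-oat[A-Za-z0-9_-]+AA/;

// Strips escape codes, control bytes, and all whitespace, which un-wraps a split token.
function normalizeCapture(raw: string): string {
  return raw.replace(ANSI_CSI, "").replace(CONTROL, "").replace(/\s+/g, "");
}

// Returns the first token-shaped match in the normalized capture, or null.
function extractToken(raw: string): string | null {
  const match = TOKEN.exec(normalizeCapture(raw));
  return match ? match[0] : null;
}

// A fake token split across two lines with padding, as a wrapping terminal prints it.
const wrapped = "\x1b[32mYour token:\x1b[0m\r\nsk-ant-oat01-abc\r\n   defAA\r\n";
console.log(extractToken(wrapped));
```

A placeholder line such as `export CLAUDE_CODE_OAUTH_TOKEN=<token>` yields no match here simply because it contains no `sk-ant-oat` prefix. One limitation of this sketch: removing all whitespace also glues any text printed after the token onto it, so a real parser may need a stricter end condition than this pattern provides.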
sturdy4days 3ffd6dde00 refactor: remove dead resolveGroupIpcPath
resolveGroupIpcPath has no production callers (only its own test); IPC was
removed in the v2 architecture (host<->container communicate solely via the two
session DBs). Drop the function, the now-unused DATA_DIR import, and its tests.
2026-06-17 15:19:29 -04:00
github-actions[bot] ee7f891698 docs: update token count to 196k tokens · 98% of context window 2026-06-16 11:15:10 +00:00
github-actions[bot] 7fde348e2b chore: bump version to 2.1.17 2026-06-16 11:15:04 +00:00
Gabi Simons 122135e6dc Merge pull request #2759 from assapin/fix/budget-error-surfaced-to-user
fix(agent-runner): deliver budget/billing error turns instead of dropping them
2026-06-16 14:14:48 +03:00
Gabi Simons 8563fb0681 Merge remote-tracking branch 'origin/main' into fix/budget-error-surfaced-to-user
# Conflicts:
#	CHANGELOG.md
2026-06-16 11:35:45 +03:00
omri-maya 0155ab1943 Merge pull request #2775 from nanocoai/docs/onecli-gateway-upgrade-notice
docs(changelog): clarify the OneCLI gateway is a separate, operator-driven upgrade
2026-06-16 09:55:25 +03:00
Koshkoshinsk d1f94fcd24 docs(changelog): clarify the OneCLI gateway is a separate, operator-driven upgrade
The breaking notice said the onecli setup step enforces the pinned versions, which is only true for fresh installs — on an existing install, updating does not upgrade the running gateway. Clarify that the gateway is separate: /update-nanoclaw upgrades it when the pin moves, otherwise upgrade manually per docs/onecli-upgrades.md.
2026-06-15 20:25:42 +03:00
gavrielc dd60983f7f Merge pull request #2774 from nanocoai/feat/update-nanoclaw-onecli-pin
feat(update-nanoclaw): upgrade OneCLI gateway when its pinned version moves
2026-06-15 20:09:01 +03:00
Koshkoshinsk 096b8bf589 feat(update-nanoclaw): upgrade OneCLI gateway when its pinned version moves
When an update moves the onecli-gateway/onecli-cli pin in versions.json, the running gateway must be upgraded to match — otherwise the new code's @onecli-sh/sdk calls fail (404 on /v1/agents) and agents can't spawn. update-nanoclaw never detected this, so the upgrade was silently skipped. Add a conditional step that follows docs/onecli-upgrades.md before restart when the pin moves.
2026-06-15 19:37:23 +03:00
Gabi Simons 59c4d33adc Merge branch 'main' into fix/budget-error-surfaced-to-user 2026-06-15 17:42:01 +03:00
omri-maya 5f5c28d18d Merge pull request #2773 from nanocoai/docs/codex-fix-docs
docs(add-codex): drop redundant TTY warning in auth note
2026-06-15 16:04:28 +03:00
Koshkoshinsk b92d1f9343 docs(add-codex): drop redundant TTY warning in auth note
The 'don't run via `!` prefix or Bash tool' sentence was redundant with
the leading 'Run this in a separate, real terminal — it is interactive.'

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 15:32:04 +03:00
Gabi Simons e03c5c194a Merge branch 'main' into fix/budget-error-surfaced-to-user 2026-06-15 12:17:20 +03:00
Daniel M acbb1144b7 Merge pull request #2769 from nanocoai/docs/codex-interactive-host-restart
docs(add-codex): flag interactive auth step + add host-restart step
2026-06-15 02:24:06 +03:00
Koshkoshinsk 028897f38f docs(add-codex): flag interactive auth step + add host-restart step
- Authenticate: run in a separate real terminal, not Claude Code's `!`
  prefix or an agent Bash tool — the provider-auth picker + browser/device
  login need an interactive TTY, so those prompts stall otherwise (CDX-002).
- add a "Restart the host" step after the image rebuild so the host
  reloads Codex's /home/node/.codex mount + env; skipping it left the dir
  root-owned and the container hit EACCES writing config.toml (CDX-003).

Refs CDX-002, CDX-003.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 01:58:30 +03:00
gavrielc ac0a799cbf refactor(add-codex): install Codex CLI via cli-tools.json, not the Dockerfile
adfae67 moved the agent's global Node CLIs into container/cli-tools.json so a
skill adds one with a json-merge instead of editing the Dockerfile. The Codex
provider install was left behind — add-codex.sh still awk'd an ARG + RUN into
the Dockerfile and its test guarded that shape.

Migrate add-codex to the seam:
- add-codex.sh appends { name: "@openai/codex", version } to cli-tools.json
  (idempotent json-merge); install/idempotency gates read the manifest.
- SKILL.md / REMOVE.md document the manifest append/removal, not Dockerfile edits.
- codex-dockerfile.test.ts -> codex-cli-tools.test.ts, asserting the manifest
  entry (skips when the manifest is absent, e.g. the bare providers branch).

Pairs with the providers-branch commit that drops the codex Dockerfile lines,
renames the payload test, and points the setup install-check at the manifest.

Verified end-to-end: full add-codex install into a clean worktree leaves the
Dockerfile codex-free, the manifest correctly appended and idempotent; vitest
cli-tools.test.ts (6) and bun codex-cli-tools.test.ts (2) green; host tsc clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 21:40:44 +03:00
github-actions[bot] e3986eb58c chore: bump version to 2.1.16 2026-06-14 18:29:28 +00:00
github-actions[bot] 6d0d48d585 docs: update token count to 195k tokens · 98% of context window 2026-06-14 18:29:25 +00:00
gavrielc a142c496f7 Merge pull request #2756 from nanocoai/provider-selection
feat(providers): operator-driven provider selection, switching, and memory migration
2026-06-14 21:29:12 +03:00
gavrielc c5b4d11536 Apply suggestion from @gavrielc 2026-06-14 21:16:19 +03:00
Daniel M ed8b4149e7 Merge pull request #2764 from glifocat/docs/fix-claude-md-relocated-paths
docs(CLAUDE.md): fix two relocated Key Files paths
2026-06-14 18:13:31 +03:00
glifocat d5ce02d1b8 docs(CLAUDE.md): fix two relocated Key Files paths
The Key Files table and the Secrets/OneCLI section referenced
src/onecli-approvals.ts and src/user-dm.ts, but both files were moved
under src/modules/ (src/modules/approvals/onecli-approvals.ts and
src/modules/permissions/user-dm.ts). onecli-approvals.ts is already
cited at its correct new path elsewhere in the same doc, so this was a
partial-rename miss. Docs only — no code changes.

Closes #2763

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 17:01:40 +02:00
omri-maya c8af599944 Merge branch 'main' into provider-selection 2026-06-14 15:17:13 +03:00
github-actions[bot] 435233a062 chore: bump version to 2.1.15 2026-06-14 11:04:33 +00:00
gavrielc 785fce3754 Merge pull request #2758 from nanocoai/feat/cli-tools-manifest
feat(container): data-drive global CLI installs from cli-tools.json
2026-06-14 14:04:16 +03:00
assafpin 01433bae32 fix(agent-runner): deliver budget/billing error turns instead of dropping them
A turn that ends in a non-retryable provider error (e.g. an Anthropic
403 billing_error) comes back from the streaming SDK as a result with
is_error=true and no <message> envelope. dispatchResultText treated it
as scratchpad and dropped it, then the poll-loop pushed a re-wrap nudge
-> new turn -> same error, re-hammering the gateway until idle-kill. The
user saw silence.

- providers/claude.ts: surface is_error on the result event, and fall
  back to errors[] for the message text (error subtypes carry no result).
- poll-loop.ts: when a result has no <message> blocks and is_error, deliver
  the notice verbatim to the originating channel and skip the nudge.

Verified live (real agent image + SDK, 403 mock): the notice is delivered
to the channel and the retry loop is gone.

Refs #2751
2026-06-14 12:56:02 +03:00
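The dispatch rule from 01433bae32 can be sketched as a small decision function. The field names `is_error` and `errors` echo the commit message, but the `TurnResult` and `Dispatch` types, the function names, and the plain `<message>…</message>` envelope format are assumptions for illustration, not the SDK's or the runner's actual types.

```typescript
// Hypothetical shapes; field names echo the commit but are not the SDK's types.
interface TurnResult {
  result?: string; // absent on error subtypes, per the commit message
  is_error: boolean;
  errors?: string[];
}

type Dispatch =
  | { kind: "deliver"; messages: string[] }
  | { kind: "error-notice"; notice: string }
  | { kind: "nudge" };

// Assumed envelope format: plain <message>...</message> blocks in the result text.
function extractMessages(text: string): string[] {
  const blocks: string[] = [];
  const re = /<message>([\s\S]*?)<\/message>/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) blocks.push((m[1] ?? "").trim());
  return blocks;
}

function dispatchResult(r: TurnResult): Dispatch {
  // Fall back to errors[] when the result text is missing.
  const text = r.result ? r.result : (r.errors ?? []).join("\n");
  const messages = extractMessages(text);
  if (messages.length > 0) return { kind: "deliver", messages };
  // No envelope and an error: deliver the notice verbatim, do not nudge.
  if (r.is_error) return { kind: "error-notice", notice: text };
  // No envelope and no error: treat as scratchpad and ask for a re-wrap.
  return { kind: "nudge" };
}

console.log(dispatchResult({ is_error: true, errors: ["403 billing_error"] }));
```

Checking `is_error` before falling through to the nudge is what breaks the retry loop: an error turn never produces a new turn.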
Omri Maya 6d521a9d8d refactor(memory): scope imported-memory doctrine to /migrate-memory
The "read imported-agent-memory.md, treat it as binding" doctrine sat in the
memory definition that every group loads, but it only matters when an import
actually happened. Move it into the /migrate-memory skill — the step that
writes the imported file and its index pointer (which the agent inlines into
its prompt each turn) — and drop the always-on block from definition.md.

Addresses review feedback on #2756.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 12:12:41 +03:00
gavrielc adfae67611 feat(container): data-drive global CLI installs from cli-tools.json
The agent's global Node CLIs (claude-code, agent-browser, vercel) were each
a hardcoded ARG + RUN layer in the Dockerfile, so adding or bumping one meant
editing the Dockerfile — a code reach-in every tool-installing skill had to make.

Move the tool list into container/cli-tools.json. A skill now adds a CLI by
appending a {name, version} entry (a json-merge) — the safest change shape:
deterministic, idempotent, removable. install-cli-tools.sh parses the manifest
with node (no new jq dep), writes the per-tool only-built-dependencies opt-ins,
and runs one pinned `pnpm install -g`, so the pnpm supply-chain path is unchanged.

Behavior is byte-for-byte: same opt-ins, same pinned installs. agent-browser is
now pinned (0.27.1, what `latest` last resolved to) instead of floating.

container/cli-tools.test.ts guards the seam: red if a baseline tool is dropped,
a version unpins, or the Dockerfile wiring / pnpm path is removed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 12:07:14 +03:00
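The manifest seam in adfae67611 can be sketched as two pure functions: an idempotent append and a translation from manifest entries to one pinned install command. This is an illustration, not `install-cli-tools.sh`: the `CliTool` type and the function names are hypothetical, and the version check is a simplification of whatever the real test guards.

```typescript
// Hypothetical sketch of the manifest seam; install-cli-tools.sh is the real implementation.
interface CliTool {
  name: string;
  version: string;
}

// Idempotent json-merge: appending an entry that already exists changes nothing.
function addTool(tools: CliTool[], tool: CliTool): CliTool[] {
  return tools.some((t) => t.name === tool.name) ? tools : [...tools, tool];
}

// Removal is the inverse and is equally safe to repeat.
function removeTool(tools: CliTool[], name: string): CliTool[] {
  return tools.filter((t) => t.name !== name);
}

// Turns entries into name@version specs, rejecting floating versions such as "latest".
function pinnedSpecs(tools: CliTool[]): string[] {
  return tools.map((t) => {
    if (!/^\d+\.\d+\.\d+/.test(t.version)) {
      throw new Error(`unpinned version for ${t.name}: ${t.version}`);
    }
    return `${t.name}@${t.version}`;
  });
}

// One global install covering every manifest entry.
function installCommand(tools: CliTool[]): string[] {
  return ["pnpm", "install", "-g", ...pinnedSpecs(tools)];
}

const manifest = addTool(
  [{ name: "agent-browser", version: "0.27.1" }],
  { name: "@openai/codex", version: "0.138.0" },
);
console.log(installCommand(manifest).join(" "));
```

Keeping the append and removal pure makes the "deterministic, idempotent, removable" property easy to check in a unit test without touching the Dockerfile.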
Omri Maya 13a37def89 feat(providers): operator-driven provider selection, switching, and memory migration
Make the agent provider a first-class, operator-chosen property instead of a
Claude-only assumption. Trunk gains the seams; the actual non-default payloads
(Codex first) install from the `providers` branch.

Setup
- A provider registry feeds a hard-wired setup picker (Claude | Codex). Picking
  a non-default provider installs its payload (setup/add-codex.sh, channel-style),
  runs a vault-only auth walkthrough (--step provider-auth), and records the pick
  on the first agent before its first spawn.
- Picking Claude changes nothing — default installs are byte-for-byte unaffected.

Provider as a DB property
- Provider lives on container_configs.provider (materialized to container.json,
  read by resolveProviderName). Creation stays provider-agnostic; the picked
  provider is applied via the picked-provider seam. The deprecated
  agent_groups.agent_provider path is not used.

Switching + memory
- Switch a live group with `ncl groups config update --provider` + restart.
- Memory never migrates at runtime — each provider keeps its own store. The
  /migrate-memory skill carries a group's memory across a switch in either
  direction (flat CLAUDE.local.md <-> memory/ scaffold). group-init seeds an
  imported-agent-memory note for non-default providers; the runner's memory
  definition reads it first turn. See docs/provider-migration.md.

No install-wide default, no runtime provider guard — switching is operator-by-
convention, consistent with the no-install-gating posture.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 07:49:39 +03:00
github-actions[bot] 03382e9dd7 chore: bump version to 2.1.14 2026-06-13 13:05:30 +00:00
github-actions[bot] 9763551656 docs: update token count to 194k tokens · 97% of context window 2026-06-13 13:05:27 +00:00
gavrielc a9c9cb300d Merge pull request #2754 from nanocoai/oss/exchange-hook
feat(runner): onExchangeComplete provider hook + slash-command interruption
2026-06-13 16:05:14 +03:00
gavrielc a619fc1aa2 Apply suggestion from @gavrielc 2026-06-13 16:03:02 +03:00
Omri Maya 3d2f3e58ca feat(runner): onExchangeComplete provider hook + slash-command interruption
Inverts conversation archiving into an optional onExchangeComplete provider
hook: the runner never archives on a provider's behalf, and the markdown
writer ships with the provider that needs it. Dormant for the default
provider.

Slash commands now interrupt an in-flight turn — a runner-handled command
(/clear, /compact, /cost, …) arriving mid-turn aborts the active stream and
runs immediately instead of waiting out the turn.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 15:56:43 +03:00
gavrielc 11afc64ba4 Merge pull request #2747 from nanocoai/oss/onecli-sdk-v2
feat(onecli): SDK 2.2.1 — credential-stub mounts + machine-checkable pins
2026-06-13 15:49:40 +03:00
github-actions[bot] 0ee75d393c chore: bump version to 2.1.13 2026-06-13 12:27:29 +00:00
github-actions[bot] 72b9cc7ed0 docs: update token count to 192k tokens · 96% of context window 2026-06-13 12:27:24 +00:00
gavrielc 5fcf234165 Merge pull request #2746 from nanocoai/oss/agent-surfaces
feat(providers): agent-surfaces capability seam
2026-06-13 15:27:12 +03:00
github-actions[bot] 9b1236505f chore: bump version to 2.1.12 2026-06-13 12:25:58 +00:00
github-actions[bot] 878cd68c1b docs: update token count to 191k tokens · 96% of context window 2026-06-13 12:25:52 +00:00
gavrielc fab1ebf2d6 Merge pull request #2745 from nanocoai/oss/memory-scaffold
feat(memory): opt-in persistent memory scaffold for providers
2026-06-13 15:25:39 +03:00
Omri Maya 3f9e89d345 feat(onecli): SDK 2.2.1 — credential-stub mounts + machine-checkable pins
Injects credentials as request-time stubs so no credential is ever written
into a container or to disk. Gateway and CLI versions move to versions.json
(machine-checkable pins); breaking upgrades are documented in
docs/onecli-upgrades.md as an agent-executable runbook (detect / why / fix /
verify / rollback), and the update flow follows linked docs and diffs the
pins.

BREAKING: requires a gateway upgrade; the doc carries the steps.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 11:30:11 +03:00
Omri Maya 2cfa86e570 feat(memory): opt-in persistent memory scaffold for providers
Adds a provider capability (usesMemoryScaffold) and a container-side boot
scaffold that materializes a persistent memory/ tree for providers that opt
in. Dormant for the default provider — the scaffold is only built when a
provider declares the capability, so existing installs are byte-identical
(asserted by a boot-gate wiring test).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 11:30:09 +03:00
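The capability gate in 2cfa86e570 amounts to a single check at boot. A minimal sketch, assuming a hypothetical `Provider` shape and an injected `build` callback in place of the real filesystem work; only the `usesMemoryScaffold` name comes from the commit.

```typescript
// Hypothetical provider shape; only the usesMemoryScaffold name comes from the commit.
interface Provider {
  name: string;
  usesMemoryScaffold?: boolean;
}

// Builds the scaffold only for providers that declare the capability.
// Returns true when the scaffold was built, false when the gate stayed closed.
function bootMemoryScaffold(
  provider: Provider,
  build: (dir: string) => void,
  dir: string = "memory",
): boolean {
  if (!provider.usesMemoryScaffold) return false; // dormant path: nothing is touched
  build(dir);
  return true;
}

const built: string[] = [];
const record = (dir: string): void => {
  built.push(dir);
};

bootMemoryScaffold({ name: "claude" }, record); // default provider: no scaffold
bootMemoryScaffold({ name: "codex", usesMemoryScaffold: true }, record);
console.log(built);
```

Treating a missing flag as "off" is what keeps existing installs unchanged: a provider that never mentions the capability never reaches the builder.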
82 changed files with 3008 additions and 429 deletions
+49 -54
@@ -1,83 +1,78 @@
# Remove Codex provider
# Remove the Codex agent provider
Idempotent — safe to run even if some steps were never applied. Reverses both the host (`src/providers/`) and container (`container/agent-runner/src/providers/`) trees, plus the Dockerfile CLI install.
Reverses every change `/add-codex` makes and returns every group to the default provider. Safe to run when partially installed — skip any step whose target is already absent.
## 1. Delete the barrel import lines (both trees)
## 1. Switch codex groups back to the default
Delete (do not comment out) the `import './codex.js';` line from each barrel:
List groups still on codex and switch each one (each group's `memory/` tree stays on disk and readable; run `/migrate-memory` per group if its memory should carry back to Claude — see [docs/provider-migration.md](../../docs/provider-migration.md)):
```bash
ncl groups list
# for each group whose config shows provider=codex:
ncl groups config update --id <group-id> --provider claude
ncl groups restart --id <group-id>
```
## 2. Delete the barrel imports
Delete (do not comment out) the `import './codex.js';` line from each of:
- `src/providers/index.ts`
- `container/agent-runner/src/providers/index.ts`
- `setup/providers/index.ts`
This unregisters the provider from both `listProviderContainerConfigNames()` (host) and `listProviderNames()` (container).
## 2. Delete the copied files (both trees)
## 3. Delete every copied file
```bash
rm -f src/providers/codex.ts \
src/providers/codex-agents-md.ts \
src/providers/codex-registration.test.ts \
src/providers/codex-host-contribution.test.ts \
src/providers/codex-agents-md.test.ts \
container/agent-runner/src/providers/codex.ts \
container/agent-runner/src/providers/codex-app-server.ts \
container/agent-runner/src/providers/codex.factory.test.ts \
container/agent-runner/src/providers/exchange-archive.ts \
container/agent-runner/src/providers/exchange-archive.test.ts \
container/agent-runner/src/providers/codex-registration.test.ts \
container/agent-runner/src/providers/codex-dockerfile.test.ts
container/agent-runner/src/providers/codex.factory.test.ts \
container/agent-runner/src/providers/codex.turns.test.ts \
container/agent-runner/src/providers/codex-app-server.test.ts \
container/agent-runner/src/providers/codex-cli-tools.test.ts \
setup/providers/codex.ts \
setup/providers/codex.test.ts \
setup/providers/codex-registration.test.ts
```
## 3. Revert the Dockerfile CLI install
This skill itself (`.claude/skills/add-codex/`) stays — it ships with trunk so the provider can be re-added later.
In `container/Dockerfile`, remove both Codex edits (skip whichever is already gone):
`container/AGENTS.md` stays only if another installed provider uses agent surfaces; otherwise remove it too.
**(a)** Delete the version ARG from the "Pin CLI versions" block:
## 4. Remove the CLI manifest entry
```dockerfile
ARG CODEX_VERSION=0.124.0
```
**(b)** Delete the standalone Codex install layer:
```dockerfile
RUN --mount=type=cache,target=/root/.cache/pnpm \
pnpm install -g "@openai/codex@${CODEX_VERSION}"
```
Leave the other per-CLI install layers (claude-code, agent-browser, vercel) untouched.
## 4. Dependency
Codex is a CLI binary installed via the Dockerfile — there is no agent-runner package dependency to uninstall. Step 3 removes the only install surface; no `bun remove` / `pnpm uninstall` is needed.
## 5. Unset Codex env vars
Remove any Codex-specific lines you added to `.env` (`OPENAI_API_KEY`, `OPENAI_BASE_URL`, `CODEX_MODEL`) if no other integration uses them, then re-sync to the container:
Delete the `@openai/codex` entry from `container/cli-tools.json`:
```bash
mkdir -p data/env && cp .env data/env/env
node -e '
const fs = require("fs");
const file = "container/cli-tools.json";
const tools = JSON.parse(fs.readFileSync(file, "utf8")).filter((t) => t.name !== "@openai/codex");
const fmt = (t) => " { " + Object.entries(t).map(([k, v]) => JSON.stringify(k) + ": " + JSON.stringify(v)).join(", ") + " }";
fs.writeFileSync(file, "[\n" + tools.map(fmt).join(",\n") + "\n]\n");
'
```
Switch any group still on Codex back to the default provider — set `"provider": "claude"` in `groups/<folder>/container.json` and clear `agent_provider` on the group/session in the DB.
## 5. Vault secret (optional)
## 6. Rebuild and restart
The ChatGPT/OpenAI secret in the OneCLI vault grants nothing once the provider is gone. To remove it: `onecli secrets list`, then `onecli secrets delete --id <id>` for the `chatgpt.com` / `api.openai.com` entry.
Run from your NanoClaw project root:
## 6. Rebuild and verify
```bash
pnpm run build && ./container/build.sh
source setup/lib/install-slug.sh
# macOS
launchctl kickstart -k gui/$(id -u)/$(launchd_label)
# Linux
systemctl --user restart $(systemd_unit)
pnpm run build
pnpm exec tsc -p container/agent-runner/tsconfig.json --noEmit
./container/build.sh
pnpm test
cd container/agent-runner && bun test
```
## Verification
After removal, the registration guards no longer apply (their files are gone). Confirm the provider is fully unwired:
```bash
grep -R "codex.js" src/providers/index.ts container/agent-runner/src/providers/index.ts # no output
grep "@openai/codex" container/Dockerfile # no output
```
In a wired agent, requesting `agent_provider = 'codex'` should fall back to the default provider since `codex` is no longer in the registry.
All suites green and `ncl groups list` showing no codex groups means the removal is complete. Restart the service (`launchctl kickstart -k gui/$(id -u)/<label>` on macOS, `systemctl --user restart <unit>` on Linux).
+92 -134
@@ -1,186 +1,144 @@
---
name: add-codex
description: Use Codex (CLI + AppServer) as the full agent provider — planning, tool orchestration, native compaction, MCP tools, session resume — in place of the Claude Agent SDK. ChatGPT subscription or OPENAI_API_KEY. Per-group via agent_provider. Distinct from using OpenAI as an MCP tool (where Claude remains the planner).
description: Use Codex (OpenAI's codex app-server) as a full agent provider — planning, tool orchestration, MCP tools, server-side history, session resume — alongside or instead of Claude. ChatGPT subscription or OpenAI API key, vault-only via OneCLI. Per-group via `ncl groups config update --provider codex`. Distinct from using OpenAI as an MCP tool (where Claude remains the planner).
---
# Codex agent provider
NanoClaw runs agents in a long-lived **poll loop** inside the container. The backend is selected with **`AGENT_PROVIDER`** (`claude` | `opencode` | `codex` | `mock`).
> Shortcut: `pnpm exec tsx setup/index.ts --step provider-auth codex` performs this whole install (manifest-driven from the providers branch: files, barrels, CLI manifest entry, image rebuild) plus auth in one command. The steps below are the same operations, for agent-driven or manual application.
Trunk ships with only the `claude` provider baked in. This skill copies the Codex provider files in from the `providers` branch, wires them into the host and container barrels, updates the Dockerfile to install the Codex CLI, and rebuilds the image.
NanoClaw selects each group's agent backend from `container_configs.provider` (default `claude`). This skill installs the Codex provider: copy the payload from the `providers` branch, append one import to each of the three provider barrels, add the pinned Codex CLI to the container manifest (`container/cli-tools.json`), rebuild, then run the vault auth walk-through.
The Codex provider runs `codex app-server` as a child process and speaks JSON-RPC over stdio. That gives it native session resume, streaming events, MCP tool access, and `thread/compact/start` compaction — same feature bar as the Claude Agent SDK, without the Anthropic-only lock-in.
The provider runs `codex app-server` as a child process speaking JSON-RPC over stdio: native streaming, MCP tools, server-side conversation history (the continuation is a thread id, no on-disk transcript). Credentials are **vault-only**: OneCLI serves a sentinel `auth.json` stub into the container and swaps the real ChatGPT token or API key on the wire — no key in `.env`, nothing readable in the container.
## Install
### Pre-flight
If all of the following are already present, skip to **Configuration**:
Check whether the payload is already wired (a prior apply, or a trunk that still carries it). All of these present means installed — skip to **Authenticate**:
- `src/providers/codex.ts`
- `src/providers/codex-registration.test.ts`
- `container/agent-runner/src/providers/codex.ts`
- `container/agent-runner/src/providers/codex-app-server.ts`
- `container/agent-runner/src/providers/codex.factory.test.ts`
- `container/agent-runner/src/providers/codex-registration.test.ts`
- `container/agent-runner/src/providers/codex-dockerfile.test.ts`
- `import './codex.js';` line in `src/providers/index.ts`
- `import './codex.js';` line in `container/agent-runner/src/providers/index.ts`
- `ARG CODEX_VERSION` and `"@openai/codex@${CODEX_VERSION}"` in the pnpm global-install block in `container/Dockerfile`
- `src/providers/codex.ts` and `src/providers/codex-agents-md.ts`
- `container/agent-runner/src/providers/codex.ts` and `codex-app-server.ts`
- `setup/providers/codex.ts`
- `import './codex.js';` in `src/providers/index.ts`, `container/agent-runner/src/providers/index.ts`, and `setup/providers/index.ts`
- an `@openai/codex` entry in `container/cli-tools.json`
Missing pieces — continue below. All steps are idempotent; re-running is safe.
### 1. Fetch the providers branch
### Fetch and copy
```bash
git fetch origin providers
```
### 2. Copy the Codex source files and tests
Copy each file with `git show origin/providers:<path> > <path>` (additive — never merge the branch):
Wholesale copies (owned entirely by this skill — user edits to these files won't survive a re-run, as designed):
Host (`src/providers/`):
- `codex.ts` — provider contribution: per-group `.codex-shared` state dir, AGENTS.md compose, skill links
- `codex-agents-md.ts` — AGENTS.md composition (32KB Codex cap: degrades by dropping the largest instruction sections, never blocks a spawn)
- `codex-registration.test.ts` — barrel-driven host registration guard
- `codex-host-contribution.test.ts` — drives the real contribution against a real test DB (the "consumes core" leg)
- `codex-agents-md.test.ts` — cap-degradation behavior
Container (`container/agent-runner/src/providers/`):
- `codex.ts` — the provider (turn loop, steering, memory scaffold + `onExchangeComplete` archiving)
- `codex-app-server.ts` — JSON-RPC child-process wrapper
- `exchange-archive.ts` — per-exchange markdown writer the `onExchangeComplete` hook uses (provider-owned, not runner code)
- `exchange-archive.test.ts` — writer behavior
- `codex-registration.test.ts` — barrel-driven container registration guard
- `codex.factory.test.ts`, `codex.turns.test.ts`, `codex-app-server.test.ts` — provider behavior
- `codex-cli-tools.test.ts` — structural guard for the Codex entry in `container/cli-tools.json`
Setup (`setup/providers/`):
- `codex.ts` — picker entry self-registration + the vault auth walk-through + install check
- `codex.test.ts` — install-check coverage
- `codex-registration.test.ts` — barrel-driven setup registration guard
Shared base (skip if present):
- `container/AGENTS.md` — the runtime-contract base the composed AGENTS.md embeds
### Wire the barrels
Append `import './codex.js';` to each of:
- `src/providers/index.ts`
- `container/agent-runner/src/providers/index.ts`
- `setup/providers/index.ts`
### CLI manifest
The agent's global Node CLIs install from `container/cli-tools.json` (a json-merge seam), not hand-edited Dockerfile layers. Add Codex by appending one entry — `@openai/codex` has no native postinstall, so no `onlyBuilt`:
```bash
git show origin/providers:src/providers/codex.ts > src/providers/codex.ts
git show origin/providers:src/providers/codex-registration.test.ts > src/providers/codex-registration.test.ts
git show origin/providers:container/agent-runner/src/providers/codex.ts > container/agent-runner/src/providers/codex.ts
git show origin/providers:container/agent-runner/src/providers/codex-app-server.ts > container/agent-runner/src/providers/codex-app-server.ts
git show origin/providers:container/agent-runner/src/providers/codex.factory.test.ts > container/agent-runner/src/providers/codex.factory.test.ts
git show origin/providers:container/agent-runner/src/providers/codex-registration.test.ts > container/agent-runner/src/providers/codex-registration.test.ts
node -e '
const fs = require("fs");
const file = "container/cli-tools.json";
const tools = JSON.parse(fs.readFileSync(file, "utf8"));
if (!tools.some((t) => t.name === "@openai/codex")) {
tools.push({ name: "@openai/codex", version: "0.138.0" });
const fmt = (t) => " { " + Object.entries(t).map(([k, v]) => JSON.stringify(k) + ": " + JSON.stringify(v)).join(", ") + " }";
fs.writeFileSync(file, "[\n" + tools.map(fmt).join(",\n") + "\n]\n");
}
'
```
The two `codex-registration.test.ts` files are the **registration guards**. Each imports only the real barrel — the host one calls `listProviderContainerConfigNames()` from `src/providers/index.ts`, the container one calls `listProviderNames()` from `container/agent-runner/src/providers/index.ts` — and asserts `codex` is present. They go red the instant a barrel import line is deleted or drifts. (`codex.factory.test.ts` imports `./codex.js` directly and self-registers, so it stays green even if the barrel line is gone — keep it as a unit test of provider behavior, but it is **not** the registration guard.)
The version (`0.138.0`) is the canonical pin — keep it in sync with `setup/add-codex.sh`. The Dockerfile already installs every manifest entry via pinned `pnpm install -g`; no Dockerfile edit is needed.
If `git show origin/providers:.../codex-registration.test.ts` errors with `path ... does not exist`, the registration tests have not landed on `origin/providers` yet. Run `git fetch origin providers` again; once the branch carries them, the copies above succeed. The rest of the install proceeds regardless — the Dockerfile and factory tests still run.
Copy the Dockerfile structural test that ships with this skill into the container provider tree:
### Build
```bash
cp .claude/skills/add-codex/codex-dockerfile.test.ts container/agent-runner/src/providers/codex-dockerfile.test.ts
pnpm run build
pnpm exec tsc -p container/agent-runner/tsconfig.json --noEmit
./container/build.sh
```
`codex-dockerfile.test.ts` reads the real `container/Dockerfile` and asserts the `ARG CODEX_VERSION=` line and the `pnpm install -g "@openai/codex@${CODEX_VERSION}"` line are both present. The Codex CLI is a binary, not an importable package, so the registration tests cannot see it — this structural test is what guards the Dockerfile edits in step 4.
### Restart the host
### 3. Append the self-registration imports
Each barrel gets one line — alphabetical placement keeps diffs small.
`src/providers/index.ts`:
```typescript
import './codex.js';
```
`container/agent-runner/src/providers/index.ts`:
```typescript
import './codex.js';
```
### 4. Add the Codex CLI to the container Dockerfile
Two edits to `container/Dockerfile`, both idempotent (skip if already present):
**(a)** In the "Pin CLI versions" ARG block (around line 18), add after `ARG CLAUDE_CODE_VERSION=...`:
```dockerfile
ARG CODEX_VERSION=0.124.0
```
**(b)** Add a new standalone `RUN` block for the Codex CLI, after the existing per-CLI install blocks (around line 106, right after the `@anthropic-ai/claude-code` block). The Dockerfile splits each global CLI into its own layer for cache granularity — keep that pattern; do not collapse them into a single combined `pnpm install -g` call:
```dockerfile
RUN --mount=type=cache,target=/root/.cache/pnpm \
pnpm install -g "@openai/codex@${CODEX_VERSION}"
```
Note: **no agent-runner package dependency** — Codex is a CLI binary, not a library. Unlike OpenCode, there's nothing to add to `container/agent-runner/package.json`.
### 5. Build and validate
The image rebuild does not reload the **host**. Codex's host contribution
(`src/providers/codex.ts`) registers the `/home/node/.codex` bind mount + env
passthrough, and the running host only picks it up on restart. Skip this and the
first Codex turn fails with `EACCES` writing `/home/node/.codex/config.toml`:
with no mount, Docker auto-creates the dir root-owned and the non-root container
user can't write to it.
```bash
pnpm run build # host
pnpm exec vitest run src/providers/codex-registration.test.ts # host registration guard
pnpm exec tsc -p container/agent-runner/tsconfig.json --noEmit # container typecheck
cd container/agent-runner && bun test src/providers/codex-registration.test.ts && cd - # container registration guard
cd container/agent-runner && bun test src/providers/codex-dockerfile.test.ts && cd - # Dockerfile structural guard
./container/build.sh # agent image
# macOS (launchd)
launchctl kickstart -k gui/$(id -u)/com.nanoclaw
# Linux (systemd)
systemctl --user restart nanoclaw
```
All must be clean before proceeding.
- The **host** `codex-registration.test.ts` imports the real host barrel (`src/providers/index.ts`) and asserts `listProviderContainerConfigNames()` contains `codex`. It goes red if the `import './codex.js';` line is deleted or drifts, or if the barrel fails to evaluate.
- The **container** `codex-registration.test.ts` imports the real container barrel (`container/agent-runner/src/providers/index.ts`) and asserts `listProviderNames()` contains `codex`. Same failure surface for the container-side import line.
- The **Dockerfile** `codex-dockerfile.test.ts` reads `container/Dockerfile` and asserts the `ARG CODEX_VERSION=` and `@openai/codex@${CODEX_VERSION}` install lines are present — red if either edit is dropped.
The `@openai/codex` CLI binary is guarded by the Dockerfile structural test plus the container build (`./container/build.sh` fails if the install line is bad), **not** by the registration test — Codex is a CLI binary, not an importable package, so nothing imports it for the registration guard to trip on. To confirm the binary is actually present after the image rebuild, probe it inside a running container with `docker exec <container> codex --version`.
The host-side provider also consumes core APIs (per-session `~/.codex` mount, env passthrough); that typed core-API consumption is guarded by `pnpm run build`.
## Configuration
Codex supports two primary auth paths and one experimental BYO-endpoint path. Pick the one that matches your setup.
### Option A — ChatGPT subscription (recommended for individuals)
On the host (not inside the container), run Codex's OAuth login:
### Validate
```bash
codex login
pnpm vitest run src/providers/codex-registration.test.ts src/providers/codex-host-contribution.test.ts src/providers/codex-agents-md.test.ts setup/providers/
cd container/agent-runner && bun test src/providers/
```
This writes `~/.codex/auth.json` with a subscription token. The host-side Codex provider ([src/providers/codex.ts](../../../src/providers/codex.ts)) copies `auth.json` into a per-session `~/.codex` directory mounted into the container — your host's own Codex CLI is never touched.
The registration tests import only the real barrels — they go red if a barrel line is missing, a barrel fails to evaluate, or the payload is broken.
No `.env` variables required for this mode.
## Authenticate
### Option B — API key (recommended for CI or API billing)
> **Run this in a separate, real terminal — it is interactive.** It prompts for ChatGPT-subscription vs OpenAI-API-key and then drives a browser/device login, so it needs a TTY to answer prompts.
```env
OPENAI_API_KEY=sk-...
CODEX_MODEL=gpt-5.4-mini
```bash
pnpm exec tsx setup/index.ts --step provider-auth codex
```
The host forwards both variables into the container. If both subscription (`auth.json`) and `OPENAI_API_KEY` are present, Codex prefers the subscription.
The same walk-through fresh installs get from the setup picker: ChatGPT subscription (browser login or device pairing) or an OpenAI API key, landed in the OneCLI vault. Idempotent — it short-circuits when a matching secret already exists. It finishes with the install check.
### Option C — BYO OpenAI-compatible endpoint (experimental)
## Use it
Codex's built-in `openai` provider honors the `OPENAI_BASE_URL` env var directly. Point it at any OpenAI-compatible endpoint — Groq, Together, self-hosted vLLM, an OpenAI proxy, etc.
Per group:
```env
OPENAI_API_KEY=...
OPENAI_BASE_URL=https://api.groq.com/openai/v1
CODEX_MODEL=llama-3.3-70b-versatile
```bash
ncl groups config update --id <group-id> --provider codex
ncl groups restart --id <group-id>
```
Codex also ships first-class local-runner flags — `codex --oss --local-provider ollama` or `--local-provider lmstudio` — that auto-detect a local server. To use those inside NanoClaw, set `CODEX_MODEL` to a model your local runner serves and add the corresponding base URL; see the Codex CLI docs for the full `model_provider = oss` configuration.
Switching is an operator action — run it from the host. Memory does NOT carry over automatically — each provider keeps its own store; run `/migrate-memory` to carry it across. See [docs/provider-migration.md](../../docs/provider-migration.md) for the carry-over table and rollback.
**Experimental caveat:** tool-calling quality depends on the model and endpoint. Not every OpenAI-compat provider implements the full function-calling spec, and smaller models (< 30B) often struggle with multi-step tool orchestration. Test before committing.
There is no install-wide default provider. Setup's provider picker sets codex on the first agent it creates; creation itself is provider-agnostic (no `--provider` flag — provider is a DB property). Any group switches afterward via `ncl groups config update --provider` as above.
### Per group / per session
## Troubleshooting
Set `"provider": "codex"` in the group's **`container.json`** (`groups/<folder>/container.json`); the in-container runner reads `provider` from there, not from the DB. The DB columns **`agent_groups.agent_provider`** and **`sessions.agent_provider`** (session overrides group) only drive host-side provider contribution (per-session `~/.codex` mount, `OPENAI_*` / `CODEX_MODEL` env passthrough) and do not propagate into `container.json` at spawn time. Set both, or just edit `container.json`; if they disagree, the runner uses `container.json` and the host-side resolver falls back through session → group → `container.json` → `'claude'`.
`CODEX_MODEL` applies process-wide via `.env`; if you need different models for different groups, set them via `container_config.env` on the group.
Extra MCP servers still come from **`NANOCLAW_MCP_SERVERS`** / `container_config.mcpServers` on the host. The runner merges them into the same `mcpServers` object passed to all providers.
## Operational notes
- **Spawn-per-query:** Codex's app-server is spawned fresh per query invocation, matching the OpenCode pattern. No long-lived daemon to keep healthy across sessions.
- **Per-session `~/.codex` isolation:** each group gets its own copy of the host's `auth.json`. The container can rewrite `config.toml` freely on every wake without touching the host's Codex config.
- **Native compaction:** kicks in automatically at 40K cumulative input tokens between turns, via `thread/compact/start`. If compaction fails, the provider logs and continues uncompacted — no fatal error.
- **Approvals:** auto-accepted inside the container (the container is the sandbox; same posture as Claude/OpenCode).
- **Mid-turn input:** Codex turns don't accept mid-turn messages. Follow-up `push()` calls queue and drain between turns, matching the OpenCode pattern. The poll-loop only pushes between turns anyway, so no messages are dropped.
- **Stale thread recovery:** `isSessionInvalid` matches on stale-thread-ID errors (`thread not found`, `unknown thread`, etc.) so a cold-started app-server can recover cleanly when it sees a stored continuation it no longer has.
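The stale-thread note above can be illustrated with a small matcher. This is only a sketch: `looksLikeStaleThread` and `continuationFor` are hypothetical names, the marker list contains just the two strings quoted in the note (the real list may be longer), and case-insensitive matching is an assumption rather than confirmed provider behavior.

```typescript
// Substrings quoted in the operational note; the provider's actual list may differ.
const STALE_THREAD_MARKERS: string[] = ['thread not found', 'unknown thread'];

// Hypothetical check: does this error look like a stale stored thread ID?
function looksLikeStaleThread(err: unknown): boolean {
  const message = (err instanceof Error ? err.message : String(err)).toLowerCase();
  return STALE_THREAD_MARKERS.some((marker) => message.includes(marker));
}

// Hypothetical recovery step: drop the stored continuation when it is stale,
// so the next turn starts a fresh thread instead of retrying the dead one.
function continuationFor(err: unknown, stored: string | undefined): string | undefined {
  return looksLikeStaleThread(err) ? undefined : stored;
}

console.log(continuationFor(new Error('Thread not found: abc'), 'abc'));
console.log(continuationFor(new Error('rate limited'), 'abc'));
```

Errors that do not match keep the stored continuation, so unrelated failures do not silently discard conversation history.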
## Next Steps
The registration and Dockerfile guards in **Build and validate** confirm the wiring. For a live end-to-end check, set `agent_provider = 'codex'` on a test group and send a message after the image rebuild. A successful round-trip looks like:
- `init` event with a stable thread ID as continuation
- One or more `activity` / `progress` events during the turn
- `result` event with the model's reply
If the agent hangs or errors, check `~/.codex/auth.json` exists on the host (Option A) or that `OPENAI_API_KEY` is forwarding correctly (Option B) — `docker exec` into a running container and `env | grep -i openai` to confirm. To confirm the CLI binary itself landed in the image, `docker exec <container> codex --version`.
To back this provider out, follow [REMOVE.md](REMOVE.md).
- **Container dies at boot, channel silent:** `grep 'Container exited non-zero' logs/nanoclaw.error.log` — the `stderrTail` carries the reason (e.g. `Unknown provider: codex. Registered: claude` means the barrels aren't wired in the running build).
- **In-channel `Error: spawn codex ENOENT` on every message:** the image predates the manifest entry — re-run `./container/build.sh`.
- **Auth errors mid-conversation:** the vault secret is missing or stale — re-run `pnpm exec tsx setup/index.ts --step provider-auth codex` (subscription re-login updates the vault copy).
@@ -0,0 +1,39 @@
// Structural guard for the Codex CLI install in container/cli-tools.json.
//
// @openai/codex is a CLI *binary* installed from the global-CLI manifest (a
// json-merge seam), not an importable package, so the barrel-driven
// registration tests cannot see it. This test reads the real cli-tools.json
// and asserts the @openai/codex entry is present and pinned to an exact
// version. It goes red if the manifest entry is dropped or unpins.
//
// Runs under bun (same suite as the container registration test):
// cd container/agent-runner && bun test src/providers/codex-cli-tools.test.ts
import { existsSync, readFileSync } from 'fs';
import path from 'path';
import { describe, it, expect } from 'bun:test';
// container/agent-runner/src/providers/ -> container/cli-tools.json
const MANIFEST = path.join(import.meta.dir, '..', '..', '..', 'cli-tools.json');
const manifestPresent = existsSync(MANIFEST);
// Read lazily — `describe.skipIf` still runs the body to register tests, so the
// read has to be guarded for the bare-branch (no manifest) case.
const tools: Array<{ name: string; version: string }> = manifestPresent
? JSON.parse(readFileSync(MANIFEST, 'utf8'))
: [];
const codex = tools.find((t) => t.name === '@openai/codex');
// cli-tools.json is a trunk file; on the bare providers branch it isn't present,
// so skip there. In an installed tree (trunk + this payload) it must carry the
// pinned @openai/codex entry.
describe.skipIf(!manifestPresent)('container/cli-tools.json codex CLI install', () => {
it('includes the @openai/codex entry', () => {
expect(codex).toBeDefined();
});
it('pins it to an exact semver (no latest, no ranges)', () => {
expect(codex?.version).toMatch(/^\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.-]+)?$/);
});
});
@@ -1,30 +0,0 @@
// Structural guard for the Codex CLI install in container/Dockerfile.
//
// @openai/codex is a CLI *binary* installed via the Dockerfile, not an
// importable package, so the barrel-driven registration tests cannot see it.
// This test reads the real Dockerfile and asserts the version ARG and the
// `pnpm install -g` line for @openai/codex are both present. It goes red if
// either Dockerfile edit is dropped or drifts.
//
// Runs under bun (same suite as the container registration test):
// cd container/agent-runner && bun test src/providers/codex-dockerfile.test.ts
import { readFileSync } from 'fs';
import path from 'path';
import { describe, it, expect } from 'bun:test';
// container/agent-runner/src/providers/ -> container/Dockerfile
const DOCKERFILE = path.join(import.meta.dir, '..', '..', '..', 'Dockerfile');
describe('container/Dockerfile codex CLI install', () => {
const dockerfile = readFileSync(DOCKERFILE, 'utf8');
it('declares the CODEX_VERSION ARG', () => {
expect(dockerfile).toMatch(/ARG\s+CODEX_VERSION=/);
});
it('installs the @openai/codex CLI pinned to that ARG', () => {
expect(dockerfile).toMatch(/pnpm install -g\s+"@openai\/codex@\$\{CODEX_VERSION\}"/);
});
});
+3 -1
@@ -71,6 +71,8 @@ Parse the `PAIR_TELEGRAM_ISSUED` status block for `CODE` and follow the `REMINDE
## 4. Run the init script
First, pick the agent provider. Read `src/providers/index.ts` and collect the installed providers from its `import './<name>.js';` lines — `claude` is always available as the built-in default. If a non-default provider is installed (e.g. codex), ask the user which one this agent should run on; if only claude is available, skip the question and omit the flag.
```bash
npx tsx scripts/init-first-agent.ts \
--channel "${CHANNEL}" \
@@ -80,7 +82,7 @@ npx tsx scripts/init-first-agent.ts \
--agent-name "${AGENT_NAME}"
```
Add `--welcome "System instruction: ..."` to override the default welcome prompt.
Add `--provider <name>` when the user picked a non-default provider (there is no install-wide default — the choice is explicit per group). Add `--welcome "System instruction: ..."` to override the default welcome prompt.
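The barrel scan described in this step can be sketched roughly as follows. `listInstalledProviders` is a hypothetical helper, not part of the repo; it assumes each installed provider appears in `src/providers/index.ts` as a line of the form `import './<name>.js';` and treats `claude` as always present, as the step states.

```typescript
// Hypothetical helper: collect provider names from a barrel's side-effect imports.
// Assumes one `import './<name>.js';` line per installed provider.
function listInstalledProviders(barrelSource: string): string[] {
  const names = new Set<string>(['claude']); // built-in default, always available
  const importLine = /^\s*import\s+['"]\.\/([A-Za-z0-9_-]+)\.js['"];?\s*$/gm;
  let match: RegExpExecArray | null;
  while ((match = importLine.exec(barrelSource)) !== null) {
    names.add(match[1]);
  }
  return Array.from(names).sort();
}

// Sample barrel text for illustration only.
const sampleBarrel = ["// provider barrel", "import './codex.js';", "export {};"].join('\n');
const installed = listInstalledProviders(sampleBarrel);

// Ask the user only when something other than the default is installed.
const shouldAsk = installed.some((name) => name !== 'claude');
console.log(installed, shouldAsk);
```

Commented-out import lines are ignored because the pattern is anchored to the start of the line.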
The script:
1. Upserts the `users` row and grants `owner` role if no owner exists.
+2
@@ -67,6 +67,8 @@ pnpm exec tsx setup/index.ts --step register -- \
The `register` step creates the agent group (reusing it if the folder already exists), the messaging group, and the wiring row. `createMessagingGroupAgent` auto-creates the companion `agent_destinations` row so the agent can address the channel by name.
When creating a NEW agent group on a non-default provider, append `--provider <name>` (e.g. `--provider codex`) — there is no install-wide default; existing groups switch via `ncl groups config update --provider` instead.
For separate agents, also ask for a folder name and optionally a different assistant name.
## Add Channel Group
+50
@@ -0,0 +1,50 @@
---
name: migrate-memory
description: Carry an agent group's memory across a provider switch, in either direction (e.g. Claude ↔ Codex, or any provider to/from another). Run after the operator switches a group's provider with `ncl groups config update --provider`. The coding agent reads the source provider's memory store, distills it into the target provider's store, and restarts the group. Triggers on "migrate memory", "carry memory over", "the agent forgot everything after the switch".
---
# Migrate memory across a provider switch
NanoClaw does not migrate memory at runtime — each provider keeps its own store, and carrying content across is the operator's move, executed by you (the coding agent). This skill is the whole mechanism: read the source store, **infer** what is durable, write it into the target store, restart.
You translate between **store shapes**, not provider names. There are two:
- **Flat file** — `CLAUDE.local.md` at the group workspace root (the Claude provider; may reference satellite files in the workspace).
- **Scaffold tree** — `memory/` (any provider with `usesMemoryScaffold`, e.g. Codex). `memory/index.md` is the index; durable notes live under `memory/memories/`; `memory/memories/imported-agent-memory.md` is the conventional landing file for imported memory.
A switch only needs migration when it **crosses shapes**. Two providers that both use the scaffold share the same `memory/` tree, so switching between them carries nothing — the memory is already there. The work is always one of: flat → scaffold, or scaffold → flat.
Principles: **copy, never move** (the source store stays intact — it IS the rollback), **idempotent** (re-running must not duplicate), **distill, don't dump** (you are the inference step: keep identity/seed instructions, user preferences, durable facts; drop conversational residue).
## Step 1: Identify the group, both providers, and the direction
- `ncl groups list`, then `ncl groups config get --id <group-id>` — note the current (target) `provider`. Ask the operator which group, and which provider it switched *from*, if either is ambiguous.
- Map each provider to its store shape (flat `CLAUDE.local.md` vs `memory/` scaffold), then inspect `groups/<folder>/`:
- **Same shape on both sides** (e.g. scaffold → scaffold) → the store is shared; nothing to migrate. Tell the operator and stop.
- **Flat → scaffold** (source has `CLAUDE.local.md` content, target uses the scaffold) → Step 2.
- **Scaffold → flat** (source has a `memory/` tree, target is Claude) → Step 3.
- Source missing or empty → nothing to migrate; tell the operator and stop.
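The decision in Step 1 reduces to a small pure function over store shapes. The sketch below is illustrative only: `shapeOf` and `planMigration` are hypothetical names, and the single boolean input stands in for the `usesMemoryScaffold` provider property mentioned above.

```typescript
type StoreShape = 'flat' | 'scaffold';
type MigrationPlan = 'nothing-to-migrate' | 'flat-to-scaffold' | 'scaffold-to-flat';

// Hypothetical mapping: a provider with usesMemoryScaffold uses the memory/ tree,
// otherwise it uses the flat CLAUDE.local.md file.
function shapeOf(usesMemoryScaffold: boolean): StoreShape {
  return usesMemoryScaffold ? 'scaffold' : 'flat';
}

// Decide which step (if any) applies for a given switch.
function planMigration(source: StoreShape, target: StoreShape, sourceHasContent: boolean): MigrationPlan {
  if (!sourceHasContent) return 'nothing-to-migrate'; // source missing or empty
  if (source === target) return 'nothing-to-migrate'; // same shape: store is shared
  return source === 'flat' ? 'flat-to-scaffold' : 'scaffold-to-flat';
}

// Example: Claude (flat) to Codex (scaffold) with existing memory maps to Step 2.
console.log(planMigration(shapeOf(false), shapeOf(true), true));
```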
## Step 2: flat → scaffold (`CLAUDE.local.md` → `memory/`)
1. Read `groups/<folder>/CLAUDE.local.md` and any workspace files it references.
2. If `memory/memories/imported-agent-memory.md` already exists, a previous import happened — show the operator what's there and ask before overwriting; integrate only what's new.
3. Distill the content into `groups/<folder>/memory/memories/imported-agent-memory.md` (create the directories if missing — the container scaffolds the rest of the tree at boot and never clobbers your files). Lead with anything that defines who the agent is or how it must behave; references to satellite files keep their workspace-root paths.
4. If `memory/index.md` exists, add the following: `- [Imported agent memory](memories/imported-agent-memory.md) — seed instructions and memory carried over from a previous provider. Read it first and treat it as binding; it may define who you are and how to behave. Integrate its facts into your memory as you work; never modify files that belong to another provider's memory system.`
5. Leave the source store exactly as it is.
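The idempotency principle matters most for the index entry in item 4: re-running the skill should not add a second link. A minimal sketch, assuming the link target `memories/imported-agent-memory.md` is a sufficient marker that the entry already exists (`addIndexEntry` is a hypothetical helper, and the entry text in the example is shortened):

```typescript
// Marker used to detect an existing entry; assumes the link path is unique in the index.
const IMPORT_LINK = '(memories/imported-agent-memory.md)';

// Hypothetical helper: append the entry once, return the text unchanged on re-runs.
function addIndexEntry(indexText: string, entryLine: string): string {
  if (indexText.includes(IMPORT_LINK)) return indexText; // already linked, no duplicate
  const needsNewline = indexText.length > 0 && !indexText.endsWith('\n');
  return indexText + (needsNewline ? '\n' : '') + entryLine + '\n';
}

const entry = '- [Imported agent memory](memories/imported-agent-memory.md)';
const once = addIndexEntry('# Memory index\n', entry);
const twice = addIndexEntry(once, entry);
console.log(once === twice);
```

The function only computes the new text; reading and writing `memory/index.md`, and skipping the step when the file does not exist, stay with the caller.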
## Step 3: scaffold → flat (`memory/` → `CLAUDE.local.md`)
1. Read `memory/index.md`, then the files it points to under `memory/memories/` (and `memory/data/` where durable).
2. Integrate the durable facts into `groups/<folder>/CLAUDE.local.md` under a clearly marked section (e.g. `## Imported from memory/ (<date>)`), deduplicating against what's already there. If the section already exists, update it instead of appending a second one.
3. Leave the source store exactly as it is.
## Step 4: Restart and verify
```bash
ncl groups restart --id <group-id>
```
Tell the operator to send the group a quick test message that depends on a migrated fact (a preference, a project name). If the agent doesn't know it, re-check that the target file landed in the right group folder.
Note: switching the provider is an operator action — `ncl groups config update --id <group-id> --provider <name>` from the host. See [docs/provider-migration.md](../../../docs/provider-migration.md) for what carries over automatically.
+6
@@ -121,6 +121,7 @@ Bucket the upstream changed files:
- **Host source** (`src/`): may conflict if user modified the same files
- **Container** (`container/`): triggers container rebuild (+ typecheck if `agent-runner/src/` changed)
- **Build/config** (`package.json`, `pnpm-lock.yaml`, `tsconfig*.json`): lockfile changes trigger dep install
- **Version pins** (`versions.json`): a changed `onecli-gateway` / `onecli-cli` value requires upgrading the OneCLI gateway/CLI to match — see Step 5.5
- **Other**: docs, tests, setup scripts, misc
**Large drift check:** If the upstream commit count and age suggest the user has a lot of catching up to do, mention that `/migrate-nanoclaw` might be a better fit — it extracts customizations and reapplies them on clean upstream instead of merging. Offer it as an option but don't push.
@@ -215,6 +216,11 @@ If build fails:
- Do not refactor unrelated code.
- If unclear, ask the user before making changes.
# Step 5.5: OneCLI upgrade (if pins moved)
The OneCLI gateway and CLI are external components pinned in `versions.json`; when a pin moves, the running version must be upgraded to match or the new code may fail against it.
If `git diff <backup-tag-from-step-1>..HEAD -- versions.json` shows the `onecli-gateway` or `onecli-cli` value changed, follow `docs/onecli-upgrades.md` before the service restart (Step 8). Otherwise skip.
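The pin comparison behind this step can be sketched as a plain diff over two parsed snapshots of `versions.json`. This assumes the file is a flat name-to-version map, which the text here does not confirm; `movedPins` and the version strings in the example are hypothetical.

```typescript
// Assumed shape: { "<component>": "<version>" }. The real file layout may differ.
type Pins = Record<string, string>;

// Hypothetical helper: return the watched components whose pin changed.
function movedPins(before: Pins, after: Pins, watched: string[]): string[] {
  return watched.filter((name) => before[name] !== after[name]);
}

// Illustrative values only, not real pinned versions.
const atBackupTag: Pins = { 'onecli-gateway': '1.0.0', 'onecli-cli': '1.0.0' };
const atHead: Pins = { 'onecli-gateway': '1.1.0', 'onecli-cli': '1.0.0' };

const moved = movedPins(atBackupTag, atHead, ['onecli-gateway', 'onecli-cli']);
console.log(moved.length > 0 ? `upgrade needed for: ${moved.join(', ')}` : 'pins unchanged, skip');
```

A pin that is newly added or removed also counts as moved, since its value differs from `undefined` on the other side.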
# Step 6: Breaking changes check
After validation succeeds, check if the update introduced any breaking changes.
+12
@@ -2,6 +2,18 @@
All notable changes to NanoClaw will be documented in this file.
## [Unreleased]
- **Budget/billing-exhausted LLM turns now reach the user instead of being silently dropped.** When a turn ends in a non-retryable provider error (e.g. an Anthropic `403 billing_error`) with no `<message>` wrapping, the agent-runner delivers the provider's notice to the originating channel and stops re-nudging the failing gateway. `providers/claude.ts` now surfaces the SDK's `is_error` flag (and the error subtype's `errors[]` text); `poll-loop.ts` delivers that text and skips the re-wrap retry. Fixes the case where a spend-limit notice produced silence plus a turn-after-turn retry loop.
- [BREAKING] **`@onecli-sh/sdk` 0.5.0 -> 2.2.1 — requires a OneCLI server with the `/v1` API** (older servers 404 every SDK call). The sanctioned gateway and CLI versions are pinned in `versions.json`. **The gateway is a separate component — updating NanoClaw does not upgrade it for you:** `/update-nanoclaw` upgrades it when the pin moves, otherwise upgrade manually. **Migration:** [docs/onecli-upgrades.md](docs/onecli-upgrades.md).
- **New agent provider: Codex (OpenAI) — run `/add-codex`.** Full runtime via `codex app-server` (planning, MCP tools, server-side history, resume). Trunk ships the seams and the skill; the payload installs from the `providers` branch (the skill, the setup picker, or `--step provider-auth codex`). Auth is vault-only — no credential ever enters a container.
- **Setup can now select, install, and authenticate a non-default agent provider.** A provider registry feeds the setup picker, an installer pulls the provider's payload from its branch, a vault auth walkthrough runs (`--step provider-auth`), and the picked provider is set on the first agent (a DB property) before its first spawn. Default (Claude) installs are unaffected — picking Claude changes nothing.
- **Provider choice is explicit per group — no install-wide default.** Provider is a DB property set via `ncl groups config update --provider` + restart; creation is provider-agnostic.
- **Memory migrates via `/migrate-memory`, never at runtime.** Each provider keeps its own store; fresh groups on a surfaces-owning provider see no stale `CLAUDE.*` files. See [docs/provider-migration.md](docs/provider-migration.md).
- **Per-exchange archiving is provider-owned** — the `onExchangeComplete` hook; the markdown writer ships with the codex payload.
- **Container boot failures now say why** — the last stderr lines are logged at `warn` on a non-zero exit instead of a silent crash loop.
- **Slash commands now interrupt an in-flight turn.** A runner-handled command (`/clear`, `/compact`, `/cost`, …) arriving mid-turn aborts the active stream and runs immediately instead of waiting out the turn.
## [2.1.0] - 2026-06-07
- [BREAKING] **Startup now requires an upgrade marker.** The host refuses to boot unless `data/upgrade-state.json` records that this install reached the current version through a sanctioned path (`/setup`, `/update-nanoclaw`, `/migrate-nanoclaw`). After this update completes — and before restarting the service — stamp the marker by running `pnpm exec tsx scripts/upgrade-state.ts set`. If the host has already tripped on restart with "update did not go through the supported path", that same command clears it. See [docs/upgrade-recovery.md](docs/upgrade-recovery.md).
+5 -3
@@ -69,8 +69,8 @@ For ad-hoc queries from skills or scripts, use the in-tree wrapper rather than t
| `src/modules/permissions/access.ts` | `canAccessAgentGroup` — owner / global admin / scoped admin / member resolution against `user_roles` + `agent_group_members` |
| `src/modules/approvals/primitive.ts` | `pickApprover`, `pickApprovalDelivery`, `requestApproval`, approval-handler registry |
| `src/command-gate.ts` | Router-side admin command gate — queries `user_roles` directly (no env var, no container-side check) |
| `src/onecli-approvals.ts` | OneCLI credentialed-action approval bridge |
| `src/user-dm.ts` | Cold-DM resolution + `user_dms` cache |
| `src/modules/approvals/onecli-approvals.ts` | OneCLI credentialed-action approval bridge |
| `src/modules/permissions/user-dm.ts` | Cold-DM resolution + `user_dms` cache |
| `src/group-init.ts` | Per-agent-group filesystem scaffold (CLAUDE.md, skills, agent-runner-src overlay) |
| `src/db/container-configs.ts` | CRUD for `container_configs` table (per-group container runtime config) |
| `src/backfill-container-configs.ts` | Migrates legacy `container.json` files into the DB on startup |
@@ -152,7 +152,7 @@ Key files: `src/container-restart.ts`, `src/container-runner.ts` (`killContainer
## Secrets / Credentials / OneCLI
API keys, OAuth tokens, and auth credentials are managed by the OneCLI gateway. Secrets are injected into per-agent containers at request time — none are passed in env vars or through chat context. The container agent sees this via the `onecli-gateway` container skill (`container/skills/onecli-gateway/SKILL.md`), which teaches it how the proxy works, how to handle auth errors, and to never ask for raw credentials. Host-side wiring: `src/onecli-approvals.ts`, `ensureAgent()` in `container-runner.ts`. Run `onecli --help`.
API keys, OAuth tokens, and auth credentials are managed by the OneCLI gateway. Secrets are injected into per-agent containers at request time — none are passed in env vars or through chat context. The container agent sees this via the `onecli-gateway` container skill (`container/skills/onecli-gateway/SKILL.md`), which teaches it how the proxy works, how to handle auth errors, and to never ask for raw credentials. Host-side wiring: `src/modules/approvals/onecli-approvals.ts`, `ensureAgent()` in `container-runner.ts`. Run `onecli --help`.
### Secret modes
@@ -193,6 +193,7 @@ Four types of skills. See [CONTRIBUTING.md](CONTRIBUTING.md) for the full taxono
| `/debug` | Container issues, logs, troubleshooting |
| `/update-nanoclaw` | Bring upstream updates into a customized install |
| `/init-onecli` | Install OneCLI Agent Vault and migrate `.env` credentials |
| `/migrate-memory` | Carry a group's agent memory across a provider switch (operator-run, both directions) |
## Contributing
@@ -275,6 +276,7 @@ This project uses pnpm with `minimumReleaseAge: 4320` (3 days) in `pnpm-workspac
| [docs/build-and-runtime.md](docs/build-and-runtime.md) | Runtime split (Node host + Bun container), lockfiles, image build surface, CI, key invariants |
| [docs/v1-to-v2-changes.md](docs/v1-to-v2-changes.md) | v1→v2 architecture diff — vocabulary for where v1 things moved |
| [docs/migration-dev.md](docs/migration-dev.md) | Migration development guide — testing, debugging, dev loop |
| [docs/provider-migration.md](docs/provider-migration.md) | Switching a live agent group between providers (e.g. Claude → Codex) — what carries over, rollback |
| [docs/customizing.md](docs/customizing.md) | Short intro to customizing via skills |
| [docs/skills-model.md](docs/skills-model.md) | The skills model in full: recipes, tests, upgrades, migrations |
| [docs/skill-guidelines.md](docs/skill-guidelines.md) | Authoritative checklist for writing a skill |
+7
@@ -19,6 +19,13 @@
**Not accepted:** Features, capabilities, compatibility, enhancements. These should be skills.
## Breaking Changes
Breaking changes are allowed; **silent** ones are not. NanoClaw does not migrate user installs at runtime — the user's coding agent is the migrator, so every breaking change must ship a migration path that agent can execute without a human reverse-engineering the diff:
1. **Every `[BREAKING]` CHANGELOG entry must reference its migration path** — either a skill to run (`Run /<skill-name> to <action>`) or a `docs/` page covering **detect / why / fix / verify / rollback** (see [docs/onecli-upgrades.md](docs/onecli-upgrades.md) for the shape). `/update-nanoclaw` surfaces these entries after every update and walks the user through them.
2. **If the change moves an external component's sanctioned version** (gateway, pinned CLI binary, …), update its pin in [`versions.json`](versions.json). The changelog stays human-narrative; `versions.json` is the machine-checkable signal — `/update-nanoclaw` diffs it across the update and routes the user to the linked doc for any pin that moved.
## Skills
NanoClaw uses [Claude Code skills](https://code.claude.com/docs/en/skills) — markdown files with optional supporting files that teach Claude how to do something. There are four types of skills in NanoClaw, each serving a different purpose.
+1
@@ -11,6 +11,7 @@
<a href="https://docs.nanoclaw.dev">docs</a>&nbsp; • &nbsp;
<a href="README_zh.md">中文</a>&nbsp; • &nbsp;
<a href="README_ja.md">日本語</a>&nbsp; • &nbsp;
<a href="README_ko.md">한국어</a>&nbsp; • &nbsp;
<a href="https://discord.gg/VDdww8qS42"><img src="https://img.shields.io/discord/1470188214710046894?label=Discord&logo=discord&v=2" alt="Discord" valign="middle"></a>&nbsp; • &nbsp;
<a href="repo-tokens"><img src="repo-tokens/badge.svg" alt="repo tokens" valign="middle"></a>
</p>
+1
@@ -11,6 +11,7 @@
<a href="https://docs.nanoclaw.dev">ドキュメント</a>&nbsp; • &nbsp;
<a href="README.md">English</a>&nbsp; • &nbsp;
<a href="README_zh.md">中文</a>&nbsp; • &nbsp;
<a href="README_ko.md">한국어</a>&nbsp; • &nbsp;
<a href="https://discord.gg/VDdww8qS42"><img src="https://img.shields.io/discord/1470188214710046894?label=Discord&logo=discord&v=2" alt="Discord" valign="middle"></a>&nbsp; • &nbsp;
<a href="repo-tokens"><img src="repo-tokens/badge.svg" alt="repo tokens" valign="middle"></a>
</p>
+228
@@ -0,0 +1,228 @@
<p align="center">
<img src="assets/nanoclaw-logo.png" alt="NanoClaw" width="400">
</p>
<p align="center">
에이전트를 각자의 컨테이너에서 안전하게 실행하는 AI 어시스턴트입니다. 가볍고, 쉽게 이해할 수 있으며, 여러분의 필요에 맞게 완전히 커스터마이즈할 수 있도록 만들어졌습니다.
</p>
<p align="center">
<a href="https://nanoclaw.dev">nanoclaw.dev</a>&nbsp; • &nbsp;
<a href="https://docs.nanoclaw.dev">문서</a>&nbsp; • &nbsp;
<a href="README.md">English</a>&nbsp; • &nbsp;
<a href="README_zh.md">中文</a>&nbsp; • &nbsp;
<a href="README_ja.md">日本語</a>&nbsp; • &nbsp;
<a href="https://discord.gg/VDdww8qS42"><img src="https://img.shields.io/discord/1470188214710046894?label=Discord&logo=discord&v=2" alt="Discord" valign="middle"></a>&nbsp; • &nbsp;
<a href="repo-tokens"><img src="repo-tokens/badge.svg" alt="repo tokens" valign="middle"></a>
</p>
---
## NanoClaw를 만든 이유
[OpenClaw](https://github.com/openclaw/openclaw)는 인상적인 프로젝트지만, 제가 이해하지 못하는 복잡한 소프트웨어에 제 삶 전체에 대한 접근 권한을 줬다면 저는 잠을 이루지 못했을 것입니다. OpenClaw는 거의 50만 줄에 달하는 코드, 53개의 설정 파일, 70개 이상의 의존성을 가지고 있습니다. 보안은 진정한 OS 수준의 격리가 아니라 애플리케이션 수준(허용 목록, 페어링 코드)에 의존합니다. 모든 것이 메모리를 공유하는 하나의 Node 프로세스에서 실행됩니다.
NanoClaw는 그와 동일한 핵심 기능을 제공하지만, 이해할 수 있을 만큼 작은 코드베이스로 구현합니다. 하나의 프로세스와 몇 개의 파일뿐입니다. Claude 에이전트는 단순한 권한 검사 뒤가 아니라, 파일시스템이 격리된 각자의 Linux 컨테이너에서 실행됩니다.
## 빠른 시작
```bash
git clone https://github.com/nanocoai/nanoclaw.git nanoclaw-v2
cd nanoclaw-v2
bash nanoclaw.sh
```
`nanoclaw.sh`는 갓 준비한 머신에서 시작해 메시지를 보낼 수 있는 이름 붙은 에이전트까지 안내합니다. 누락된 경우 Node, pnpm, Docker를 설치하고, Anthropic 자격 증명을 OneCLI에 등록하며, 에이전트 컨테이너를 빌드하고, 첫 채널(Telegram, Discord, WhatsApp 또는 로컬 CLI)을 페어링합니다. 어떤 단계가 실패하면 Claude Code가 자동으로 호출되어 원인을 진단하고 중단된 지점부터 재개합니다.
<details>
<summary><strong>NanoClaw v1에서 마이그레이션하시나요?</strong></summary>
기존 v1 설치 옆에 새로운 v2 체크아웃을 만들어 실행하세요:
```bash
git clone https://github.com/nanocoai/nanoclaw.git nanoclaw-v2
cd nanoclaw-v2
bash migrate-v2.sh
```
`migrate-v2.sh`는 v1 설치(형제 디렉터리, 또는 `NANOCLAW_V1_PATH=/path/to/nanoclaw`)를 찾아 상태를 v2 체크아웃으로 마이그레이션한 다음, 판단이 필요한 부분(소유자 시딩, CLAUDE.local.md 정리, 포크 커스터마이징 재적용)을 마무리하기 위해 Claude Code로 `exec`합니다.
이 스크립트는 Claude 세션 내부가 아니라 직접 실행하세요. 결정론적인 부분에서 Node/pnpm 부트스트랩, Docker, OneCLI, 컨테이너 빌드를 위해 대화형 프롬프트와 실제 셸 I/O가 필요합니다.
**무엇을 하는가:** `.env`를 병합하고, `registered_groups`로부터 v2 DB를 시딩하며, 그룹 폴더 + 세션 데이터 + 예약 작업을 복사하고, 선택한 채널 어댑터를 설치하며, 채널 인증 상태(WhatsApp의 Baileys 키스토어 + LID 매핑 포함)를 복사하고, 에이전트 컨테이너를 빌드합니다.
**무엇을 하지 않는가:** 시스템 서비스를 전환하지 않습니다. 프롬프트에서 *"switch to v2"*를 선택하거나, 테스트 후 수동으로 전환하세요. 기존 v1 설치는 그대로 유지됩니다.
무엇이 달라졌는지는 [docs/v1-to-v2-changes.md](docs/v1-to-v2-changes.md)를, 개발 노트는 [docs/migration-dev.md](docs/migration-dev.md)를 참고하세요.
</details>
## 철학
**이해할 수 있을 만큼 작게.** 하나의 프로세스, 몇 개의 소스 파일, 마이크로서비스 없음. NanoClaw 코드베이스 전체를 이해하고 싶다면 Claude Code에게 안내해 달라고 요청하기만 하면 됩니다.
**격리를 통한 보안.** 에이전트는 Linux 컨테이너에서 실행되며 명시적으로 마운트된 것만 볼 수 있습니다. 명령이 호스트가 아니라 컨테이너 안에서 실행되기 때문에 Bash 접근도 안전합니다.
**개별 사용자를 위해 설계.** NanoClaw는 거대한 단일 프레임워크가 아니라, 각 사용자의 정확한 필요에 맞는 소프트웨어입니다. 비대한 소프트웨어가 되는 대신, NanoClaw는 맞춤형이 되도록 설계되었습니다. 직접 포크를 만들고 Claude Code가 여러분의 필요에 맞게 수정하도록 합니다.
**커스터마이징 = 코드 변경.** 설정의 난립이 없습니다. 다른 동작을 원하시나요? 코드를 수정하세요. 코드베이스가 충분히 작아서 안전하게 변경할 수 있습니다.
**AI 네이티브, 설계상 하이브리드.** 설치와 온보딩 흐름은 최적화된 스크립트 경로로, 빠르고 결정론적입니다. 어떤 단계에 판단이 필요할 때 — 설치 실패, 안내가 필요한 결정, 커스터마이징 등 — 제어권이 Claude Code로 매끄럽게 넘어갑니다. 설정 이후에도 모니터링 대시보드나 디버깅 UI가 없습니다. 채팅으로 문제를 설명하면 Claude Code가 처리합니다.
**기능보다 스킬.** 트렁크는 특정 채널 어댑터나 대체 에이전트 프로바이더가 아니라 레지스트리와 인프라를 제공합니다. 채널(Discord, Slack, Telegram, WhatsApp, …)은 오래 유지되는 `channels` 브랜치에, 대체 프로바이더(OpenCode, Ollama)는 `providers` 브랜치에 있습니다. `/add-telegram`, `/add-opencode` 등을 실행하면 스킬이 여러분이 필요로 하는 모듈만 정확히 포크로 복사합니다. 요청하지 않은 기능은 없습니다.
**최고의 하니스, 최고의 모델.** NanoClaw는 Anthropic의 공식 Claude Agent SDK를 통해 Claude Code를 네이티브로 사용하므로, 최신 Claude 모델과 Claude Code의 전체 도구 세트를 누릴 수 있습니다. 여기에는 자신의 NanoClaw 포크를 직접 수정하고 확장하는 능력도 포함됩니다. 다른 프로바이더는 드롭인 옵션입니다. OpenAI의 Codex는 `/add-codex`(ChatGPT 구독 또는 API 키), OpenRouter·Google·DeepSeek 등은 OpenCode를 통한 `/add-opencode`, 로컬 오픈 웨이트 모델은 `/add-ollama-provider`로 추가합니다. 프로바이더는 에이전트 그룹별로 설정할 수 있습니다.
## 지원 기능
- **멀티 채널 메시징** — WhatsApp, Telegram, Discord, Slack, Microsoft Teams, iMessage, Matrix, Google Chat, Webex, Linear, GitHub, WeChat, 그리고 Resend를 통한 이메일. `/add-<channel>` 스킬로 필요할 때 설치합니다. 하나 또는 여러 개를 동시에 실행할 수 있습니다.
- **유연한 격리** — 완전한 프라이버시를 위해 각 채널을 자체 에이전트에 연결하거나, 대화는 분리하되 메모리는 통합하기 위해 하나의 에이전트를 여러 채널에서 공유하거나, 여러 채널을 하나의 공유 세션으로 묶어 하나의 대화가 여러 채널에 걸쳐 이어지도록 할 수 있습니다. `/manage-channels`로 채널별로 선택하세요. [docs/isolation-model.md](docs/isolation-model.md)를 참고하세요.
- **에이전트별 작업 공간** — 각 에이전트 그룹은 자체 `CLAUDE.md`, 자체 메모리, 자체 컨테이너, 그리고 여러분이 허용한 마운트만 갖습니다. 직접 연결하지 않는 한 경계를 넘는 것은 아무것도 없습니다.
- **예약 작업** — Claude를 실행하고 여러분에게 다시 메시지를 보낼 수 있는 반복 작업
- **웹 접근** — 웹에서 검색하고 콘텐츠를 가져오기
- **컨테이너 격리** — 에이전트는 Docker(macOS/Linux/WSL2)에서 샌드박스화되며, 선택적으로 [Docker Sandboxes](docs/docker-sandboxes.md) 마이크로 VM 격리나 macOS 네이티브 런타임인 Apple Container를 사용할 수 있습니다
- **자격 증명 보안** — 에이전트는 원시 API 키를 절대 보유하지 않습니다. 아웃바운드 요청은 [OneCLI의 Agent Vault](https://github.com/onecli/onecli)를 통해 라우팅되며, 요청 시점에 자격 증명을 주입하고 에이전트별 정책과 속도 제한을 적용합니다.
## 사용법
트리거 단어(기본값: `@Andy`)로 어시스턴트에게 말을 거세요:
```
@Andy 매주 평일 오전 9시에 영업 파이프라인 개요를 보내줘 (내 Obsidian 보관함 폴더에 접근 가능)
@Andy 매주 금요일에 지난 한 주간의 git 히스토리를 검토하고, 내용이 어긋나면 README를 업데이트해줘
@Andy 매주 월요일 오전 8시에 Hacker News와 TechCrunch에서 AI 관련 소식을 모아 브리핑을 보내줘
```
여러분이 소유하거나 관리하는 채널에서는 그룹과 작업을 관리할 수 있습니다:
```
@Andy 모든 그룹에 걸친 예약 작업을 전부 나열해줘
@Andy 월요일 브리핑 작업을 일시 정지해줘
@Andy Family Chat 그룹에 참여해줘
```
## 커스터마이징
NanoClaw는 설정 파일을 사용하지 않습니다. 변경하려면 Claude Code에게 원하는 것을 말하기만 하면 됩니다:
- "트리거 단어를 @Bob으로 바꿔줘"
- "앞으로는 응답을 더 짧고 직접적으로 하도록 기억해줘"
- "내가 좋은 아침이라고 인사하면 맞춤 인사를 추가해줘"
- "매주 대화 요약을 저장해줘"
또는 안내형 변경을 위해 `/customize`를 실행하세요.
코드베이스가 충분히 작아서 Claude가 안전하게 수정할 수 있습니다.
## 기여하기
**기능을 추가하지 마세요. 스킬을 추가하세요.**
새로운 채널이나 에이전트 프로바이더를 추가하고 싶다면 트렁크에 추가하지 마세요. 새 채널 어댑터는 `channels` 브랜치에, 새 에이전트 프로바이더는 `providers` 브랜치에 들어갑니다. 사용자는 `/add-<name>` 스킬로 자신의 포크에 설치하며, 이 스킬은 관련 모듈을 표준 경로로 복사하고, 등록을 연결하며, 의존성을 고정합니다.
이를 통해 트렁크는 순수한 레지스트리이자 인프라로 유지되고, 모든 포크는 가벼운 상태를 유지합니다. 사용자는 요청한 채널과 프로바이더만 얻고 그 외에는 아무것도 얻지 않습니다.
### RFS (Request for Skills)
저희가 보고 싶은 스킬:
**커뮤니케이션 채널**
- `/add-signal` — Signal을 채널로 추가
## 요구 사항
- macOS 또는 Linux (Windows는 WSL2 경유)
- Node.js 20+ 및 pnpm 10+ (설치 프로그램이 누락 시 둘 다 설치합니다)
- [Docker Desktop](https://docker.com/products/docker-desktop) (macOS/Windows) 또는 Docker Engine (Linux)
- `/customize`, `/debug`, 설정 중 오류 복구, 그리고 모든 `/add-<channel>` 스킬을 위한 [Claude Code](https://claude.ai/download)
## 아키텍처
```
메시징 앱 → 호스트 프로세스(라우터) → inbound.db → 컨테이너(Bun, Claude Agent SDK) → outbound.db → 호스트 프로세스(전송) → 메시징 앱
```
하나의 Node 호스트가 세션별 에이전트 컨테이너를 오케스트레이션합니다. 메시지가 도착하면 호스트는 엔티티 모델(사용자 → 메시징 그룹 → 에이전트 그룹 → 세션)을 통해 라우팅하고, 세션의 `inbound.db`에 기록한 뒤 컨테이너를 깨웁니다. 컨테이너 내부의 에이전트 러너는 `inbound.db`를 폴링하고, Claude를 실행하며, 응답을 `outbound.db`에 기록합니다. 호스트는 `outbound.db`를 폴링하여 채널 어댑터를 통해 다시 전송합니다.
세션당 두 개의 SQLite 파일이 있으며 각각 정확히 하나의 작성자만 갖습니다. 교차 마운트 경합이 없고, IPC가 없으며, stdin 파이핑이 없습니다. 채널과 대체 프로바이더는 시작 시 자체 등록됩니다. 트렁크는 레지스트리와 Chat SDK 브리지를 제공하고, 어댑터 자체는 포크별로 스킬을 통해 설치됩니다.
전체 아키텍처 설명은 [docs/architecture.md](docs/architecture.md)를, 3단계 격리 모델은 [docs/isolation-model.md](docs/isolation-model.md)를 참고하세요.
핵심 파일:
- `src/index.ts` — 진입점: DB 초기화, 채널 어댑터, 전송 폴링, 스윕
- `src/router.ts` — 인바운드 라우팅: 메시징 그룹 → 에이전트 그룹 → 세션 → `inbound.db`
- `src/delivery.ts` — `outbound.db` 폴링, 어댑터를 통한 전송, 시스템 액션 처리
- `src/host-sweep.ts` — 60초 스윕: 정체 감지, 예정 메시지 깨우기, 반복 처리
- `src/session-manager.ts` — 세션 확인, `inbound.db` / `outbound.db` 열기
- `src/container-runner.ts` — 에이전트 그룹별 컨테이너 생성, OneCLI 자격 증명 주입
- `src/db/` — 중앙 DB (사용자, 역할, 에이전트 그룹, 메시징 그룹, 연결, 마이그레이션)
- `src/channels/` — 채널 어댑터 인프라 (어댑터는 `/add-<channel>` 스킬로 설치)
- `src/providers/` — 호스트 측 프로바이더 설정 (`claude`는 기본 내장, 그 외는 스킬 경유)
- `container/agent-runner/` — Bun 에이전트 러너: 폴 루프, MCP 도구, 프로바이더 추상화
- `groups/<folder>/` — 에이전트 그룹별 파일시스템 (`CLAUDE.md`, 스킬, 컨테이너 설정)
## FAQ
**왜 Docker인가요?**
Docker는 크로스 플랫폼 지원(macOS, Linux, 그리고 WSL2 경유 Windows)과 성숙한 생태계를 제공합니다. macOS에서는 더 가벼운 네이티브 런타임인 Apple Container도 지원됩니다. 추가 격리를 위해 [Docker Sandboxes](docs/docker-sandboxes.md)는 각 컨테이너를 마이크로 VM 안에서 실행합니다.
**Linux나 Windows에서 실행할 수 있나요?**
네. Docker가 기본 런타임이며 macOS, Linux, Windows(WSL2 경유)에서 작동합니다. `bash nanoclaw.sh`를 실행하기만 하면 됩니다.
**이것은 안전한가요?**
에이전트는 애플리케이션 수준의 권한 검사 뒤가 아니라 컨테이너에서 실행됩니다. 명시적으로 마운트된 디렉터리만 접근할 수 있습니다. 자격 증명은 컨테이너에 들어가지 않습니다. 아웃바운드 API 요청은 [OneCLI의 Agent Vault](https://github.com/onecli/onecli)를 통해 라우팅되며, 프록시 수준에서 인증을 주입하고 속도 제한과 접근 정책을 지원합니다. 여전히 실행하는 것을 검토해야 하지만, 코드베이스가 충분히 작아서 실제로 검토할 수 있습니다. 전체 보안 모델은 [보안 문서](https://docs.nanoclaw.dev/concepts/security)를 참고하세요.
**왜 설정 파일이 없나요?**
설정의 난립을 원하지 않습니다. 모든 사용자는 일반적인 시스템을 설정하는 대신, 코드가 정확히 원하는 대로 동작하도록 NanoClaw를 커스터마이즈해야 합니다. 설정 파일을 선호한다면 Claude에게 추가해 달라고 할 수 있습니다.
**서드파티나 오픈소스 모델을 사용할 수 있나요?**
네. 지원되는 경로는 `/add-opencode`(OpenCode 설정을 통한 OpenRouter, OpenAI, Google, DeepSeek 등) 또는 `/add-ollama-provider`(Ollama를 통한 로컬 오픈 웨이트 모델)입니다. 둘 다 에이전트 그룹별로 설정할 수 있으므로, 같은 설치 내에서 서로 다른 에이전트가 서로 다른 백엔드에서 실행될 수 있습니다.
일회성 실험의 경우, Claude API 호환 엔드포인트라면 `.env`를 통해서도 작동합니다:
```bash
ANTHROPIC_BASE_URL=https://your-api-endpoint.com
ANTHROPIC_AUTH_TOKEN=your-token-here
```
**문제를 어떻게 디버깅하나요?**
Claude Code에게 물어보세요. "스케줄러가 왜 실행되지 않지?" "최근 로그에 뭐가 있지?" "이 메시지는 왜 응답을 받지 못했지?" 그것이 NanoClaw의 바탕에 깔린 AI 네이티브 접근 방식입니다.
**설정이 왜 작동하지 않나요?**
어떤 단계가 실패하면 `nanoclaw.sh`는 진단하고 재개하기 위해 Claude Code로 넘깁니다. 그래도 해결되지 않으면 `claude`를 실행한 뒤 `/debug`를 실행하세요. Claude가 다른 사용자에게도 영향을 줄 만한 문제를 발견하면, 관련 설정 단계나 스킬에 대한 PR을 열어주세요.
**NanoClaw를 어떻게 제거하나요?**
```bash
bash nanoclaw.sh --uninstall
```
모든 설치는 체크아웃별 ID로 태깅되므로, 제거 프로그램은 해당 사본에 속한 것만 제거합니다: 백그라운드 서비스, 컨테이너와 이미지, 앱 데이터와 로그, 에이전트 파일, 그리고 이 사본의 OneCLI 볼트 에이전트입니다. 공유되는 것 — OneCLI 앱과 여러분의 자격 증명, 머신의 다른 NanoClaw 사본 — 은 그대로 둡니다. 무엇을 발견했는지 정확히 보여주고 그룹별로 확인을 요청합니다. 여러분이 동의하기 전까지는 아무것도 삭제되지 않습니다. 변경 없이 미리 보려면 `--dry-run`을, 프롬프트를 건너뛰려면 `--yes`를 사용하세요. `.env`는 제거 전에 백업됩니다. 마무리하려면 체크아웃 폴더 자체를 삭제하세요.
**어떤 변경이 코드베이스에 받아들여지나요?**
기본 구성에는 보안 수정, 버그 수정, 명확한 개선만 받아들여집니다. 그게 전부입니다.
그 외의 모든 것(새로운 기능, OS 호환성, 하드웨어 지원, 향상)은 스킬로 기여해야 합니다. 채널과 프로바이더 코드는 `channels`/`providers` 레지스트리 브랜치에, 그 외에는 자체 완결형 스킬로 기여합니다. [docs/customizing.md](docs/customizing.md)와 [CONTRIBUTING.md](CONTRIBUTING.md)를 참고하세요.
이를 통해 기본 시스템을 최소한으로 유지하고, 모든 사용자가 원하지 않는 기능을 떠안지 않으면서 자신의 설치를 커스터마이즈할 수 있습니다.
## 커뮤니티
질문이 있나요? 아이디어가 있나요? [Discord에 참여하세요](https://discord.gg/VDdww8qS42).
## 변경 이력
호환성을 깨는 변경 사항은 [CHANGELOG.md](CHANGELOG.md)를, 또는 문서 사이트의 [전체 릴리스 히스토리](https://docs.nanoclaw.dev/changelog)를 참고하세요.
## 라이선스
MIT
<img referrerpolicy="no-referrer-when-downgrade" src="https://static.scarf.sh/a.png?x-pxid=47894bd5-353b-42fe-bb97-74144e6df0bf" />
+1
@@ -11,6 +11,7 @@
<a href="https://docs.nanoclaw.dev">文档</a>&nbsp; • &nbsp;
<a href="README.md">English</a>&nbsp; • &nbsp;
<a href="README_ja.md">日本語</a>&nbsp; • &nbsp;
<a href="README_ko.md">한국어</a>&nbsp; • &nbsp;
<a href="https://discord.gg/VDdww8qS42"><img src="https://img.shields.io/discord/1470188214710046894?label=Discord&logo=discord&v=2" alt="Discord" valign="middle"></a>&nbsp; • &nbsp;
<a href="repo-tokens"><img src="repo-tokens/badge.svg" alt="repo tokens" valign="middle"></a>
</p>
+11 -15
@@ -16,12 +16,11 @@ FROM node:22-slim
# CJK fonts add ~200MB. Opt in only if you render Chinese/Japanese/Korean text.
ARG INSTALL_CJK_FONTS=false
# Pin CLI versions for reproducibility. Bump deliberately — unpinned installs
# mean every rebuild silently picks up the latest and can break in lockstep
# across all users.
ARG CLAUDE_CODE_VERSION=2.1.170
ARG AGENT_BROWSER_VERSION=latest
ARG VERCEL_VERSION=52.2.1
# Pin versions for reproducibility. Bump deliberately — unpinned installs mean
# every rebuild silently picks up the latest and can break in lockstep across
# all users. The global Node CLIs (claude-code, agent-browser, vercel) are
# pinned in cli-tools.json so a skill can add one with a json-merge; Bun (the
# runtime) is pinned here because it installs from a different source.
ARG BUN_VERSION=1.3.12
# ---- System dependencies -----------------------------------------------------
@@ -99,16 +98,13 @@ ENV PATH="$PNPM_HOME:$PATH"
ARG PNPM_VERSION=10.33.0
RUN corepack enable && corepack prepare pnpm@${PNPM_VERSION} --activate
# Global Node CLIs the agent invokes at runtime live in cli-tools.json so a
# skill can add one with a json-merge instead of editing this Dockerfile.
# install-cli-tools.sh installs each via pnpm (pinned), writing the per-tool
# only-built-dependencies opt-ins it reads from the manifest.
COPY cli-tools.json install-cli-tools.sh /tmp/
RUN --mount=type=cache,target=/root/.cache/pnpm \
echo "only-built-dependencies[]=agent-browser" > /root/.npmrc && \
echo "only-built-dependencies[]=@anthropic-ai/claude-code" >> /root/.npmrc && \
pnpm install -g "vercel@${VERCEL_VERSION}"
RUN --mount=type=cache,target=/root/.cache/pnpm \
pnpm install -g "agent-browser@${AGENT_BROWSER_VERSION}"
RUN --mount=type=cache,target=/root/.cache/pnpm \
pnpm install -g "@anthropic-ai/claude-code@${CLAUDE_CODE_VERSION}"
sh /tmp/install-cli-tools.sh /tmp/cli-tools.json
# ---- ncl CLI wrapper ----------------------------------------------------------
# Actual script lives in the mounted source at /app/src/cli/ncl.ts.
+7
@@ -27,6 +27,7 @@ import { fileURLToPath } from 'url';
import { loadConfig } from './config.js';
import { buildSystemPromptAddendum } from './destinations.js';
import { ensureMemoryScaffold } from './memory-scaffold.js';
// Providers barrel — each enabled provider self-registers on import.
// Provider skills append imports to providers/index.ts.
import './providers/index.js';
@@ -95,6 +96,12 @@ async function main(): Promise<void> {
effort: config.effort,
});
// Providers that lack native memory opt in via `usesMemoryScaffold`; for them
// the runner creates a persistent memory/ tree in its host-backed workspace at
// boot (idempotent). Default off — the trunk default (Claude) omits the flag
// and keeps its native memory untouched.
if (provider.usesMemoryScaffold) ensureMemoryScaffold();
await runPollLoop({
provider,
providerName,
@@ -5,6 +5,7 @@ import { getUndeliveredMessages } from './db/messages-out.js';
import { getPendingMessages } from './db/messages-in.js';
import { getContinuation, setContinuation } from './db/session-state.js';
import { MockProvider } from './providers/mock.js';
import type { ProviderExchange } from './providers/types.js';
import { runPollLoop } from './poll-loop.js';
beforeEach(() => {
@@ -304,6 +305,7 @@ async function runPollLoopWithTimeout(provider: MockProvider, signal: AbortSigna
provider,
providerName: 'mock',
cwd: '/tmp',
signal,
}),
new Promise<void>((_, reject) => {
signal.addEventListener('abort', () => reject(new Error('aborted')));
@@ -324,6 +326,86 @@ function sleep(ms: number): Promise<void> {
return new Promise((resolve) => setTimeout(resolve, ms));
}
describe('poll loop — exchange hook (onExchangeComplete)', () => {
// A provider that declares the per-exchange hook. The hook call is the
// wiring under test — these tests go red if the poll-loop seam is severed.
// What the provider DOES with an exchange (e.g. write markdown into
// conversations/) ships with the provider, not the runner.
class HookedMockProvider extends MockProvider {
readonly exchanges: ProviderExchange[] = [];
onExchangeComplete(exchange: ProviderExchange): void {
this.exchanges.push(exchange);
}
}
it('reports each exchange to a provider that declares the hook', async () => {
insertMessage('m1', { sender: 'Alice', text: 'please archive this' }, { platformId: 'chan-1', channelType: 'discord' });
const provider = new HookedMockProvider({}, () => '<message to="discord-test">archived answer</message>');
const controller = new AbortController();
const loopPromise = runPollLoopWithTimeout(provider, controller.signal, 2000);
await waitFor(() => provider.exchanges.length > 0, 2000);
controller.abort();
expect(provider.exchanges.length).toBe(1);
const exchange = provider.exchanges[0];
expect(exchange.prompt).toContain('please archive this');
expect(exchange.result).toContain('archived answer');
expect(exchange.continuation).toStartWith('mock-session-');
expect(exchange.status).toBe('completed');
await loopPromise.catch(() => {});
});
it('does not report the internal wrapping-retry nudge as a user prompt', async () => {
insertMessage('m1', { sender: 'Alice', text: 'wrap this later' }, { platformId: 'chan-1', channelType: 'discord' });
let calls = 0;
const provider = new HookedMockProvider({}, () => {
calls += 1;
// First result is unwrapped (triggers the retry nudge), second is wrapped.
return calls === 1 ? 'unwrapped text' : '<message to="discord-test">wrapped now</message>';
});
const controller = new AbortController();
const loopPromise = runPollLoopWithTimeout(provider, controller.signal, 3000);
await waitFor(() => provider.exchanges.length >= 2, 3000);
controller.abort();
// Both exchanges attribute themselves to the real user prompt, never the nudge.
for (const exchange of provider.exchanges) {
expect(exchange.prompt).not.toContain('Your response was not delivered');
expect(exchange.prompt).toContain('wrap this later');
}
expect(provider.exchanges.map((e) => e.status)).toEqual(['undelivered', 'completed']);
await loopPromise.catch(() => {});
});
it('a throwing hook never breaks delivery', async () => {
insertMessage('m1', { sender: 'Alice', text: 'still deliver this' }, { platformId: 'chan-1', channelType: 'discord' });
class ThrowingHookProvider extends MockProvider {
onExchangeComplete(): void {
throw new Error('hook exploded');
}
}
const provider = new ThrowingHookProvider({}, () => '<message to="discord-test">delivered anyway</message>');
const controller = new AbortController();
const loopPromise = runPollLoopWithTimeout(provider, controller.signal, 2000);
await waitFor(() => getUndeliveredMessages().length > 0, 2000);
controller.abort();
const out = getUndeliveredMessages();
expect(out.length).toBe(1);
expect(out[0].content).toContain('delivered anyway');
await loopPromise.catch(() => {});
});
});
describe('poll loop — provider error recovery', () => {
it('writes error to outbound and continues loop on provider throw', async () => {
insertMessage('m1', { sender: 'Alice', text: 'trigger error' }, { platformId: 'chan-1', channelType: 'discord' });
@@ -462,3 +544,76 @@ class InvalidSessionProvider {
};
}
}
describe('poll loop — slash command during active query', () => {
it('aborts the active query when /clear arrives as a follow-up', async () => {
insertMessage('m-active', { sender: 'Alice', text: 'long running request' }, { platformId: 'chan-1', channelType: 'discord' });
const provider = new BlockingProvider();
const controller = new AbortController();
const loopPromise = runPollLoopWithTimeout(provider as unknown as MockProvider, controller.signal, 3000);
await waitFor(() => provider.queries === 1, 2000);
insertMessage('m-clear-active', { sender: 'Alice', text: '/clear' }, { platformId: 'chan-1', channelType: 'discord' });
await waitFor(() => provider.aborts === 1, 2000);
await waitFor(
() => getUndeliveredMessages().some((msg) => JSON.parse(msg.content).text === 'Session cleared.'),
2000,
);
controller.abort();
expect(provider.ends).toBe(0);
expect(getContinuation('mock')).toBeUndefined();
expect(getPendingMessages()).toHaveLength(0);
await loopPromise.catch(() => {});
});
});
/**
* Provider whose query never completes until ended/aborted; used for testing how
* the loop interrupts an active stream.
*/
class BlockingProvider {
readonly supportsNativeSlashCommands = false;
queries = 0;
aborts = 0;
ends = 0;
isSessionInvalid(): boolean {
return false;
}
query() {
const owner = this;
this.queries += 1;
let wake: (() => void) | null = null;
let ended = false;
let aborted = false;
return {
push() {},
end: () => {
owner.ends += 1;
ended = true;
wake?.();
},
abort: () => {
owner.aborts += 1;
aborted = true;
wake?.();
},
events: (async function* () {
yield { type: 'activity' as const };
yield { type: 'init' as const, continuation: 'blocking-session' };
while (!ended && !aborted) {
await new Promise<void>((resolve) => {
wake = resolve;
});
wake = null;
}
})(),
};
}
}
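The parking pattern used by `BlockingProvider` above can be sketched in isolation. This is a minimal illustration, not the runner's code: `ParkedStream`, `makeParkedStream`, and `demo` are hypothetical names, and the real `AgentQuery` type has more members (`push`, `end`) than this handle.

```typescript
// Hypothetical minimal handle; the real AgentQuery type has more members.
interface ParkedStream {
  abort(): void;
  events: AsyncGenerator<string>;
}

function makeParkedStream(): ParkedStream {
  let aborted = false;
  let wake: (() => void) | null = null;

  async function* events(): AsyncGenerator<string> {
    yield 'init';
    while (!aborted) {
      // Park here until abort() resolves the stored callback.
      await new Promise<void>((resolve) => {
        wake = resolve;
      });
      wake = null;
    }
  }

  return {
    abort: () => {
      aborted = true;
      wake?.(); // release the generator if it is currently parked
    },
    events: events(),
  };
}

// Consume the stream and abort it shortly after it parks.
async function demo(): Promise<string[]> {
  const stream = makeParkedStream();
  const seen: string[] = [];
  setTimeout(() => stream.abort(), 10);
  for await (const ev of stream.events) seen.push(ev);
  return seen;
}

demo().then((seen) => console.log(seen));
```

Setting the flag before calling `wake` means an abort that arrives before the generator parks is still observed by the next loop check, so the stream cannot hang in either ordering.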
@@ -0,0 +1,53 @@
import { describe, expect, it } from 'bun:test';
import fs from 'fs';
import os from 'os';
import path from 'path';
import { ensureMemoryScaffold } from './memory-scaffold.js';
describe('ensureMemoryScaffold', () => {
it('deterministically creates the memory tree', () => {
const base = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-mem-'));
try {
ensureMemoryScaffold(base);
expect(fs.existsSync(path.join(base, 'memory', 'index.md'))).toBe(true);
expect(fs.existsSync(path.join(base, 'memory', 'system', 'definition.md'))).toBe(true);
expect(fs.existsSync(path.join(base, 'memory', 'memories'))).toBe(true);
expect(fs.existsSync(path.join(base, 'memory', 'data'))).toBe(true);
} finally {
fs.rmSync(base, { recursive: true, force: true });
}
});
it('never touches workspace memory it did not create — CLAUDE.local.md stays untouched', () => {
const base = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-mem-'));
try {
fs.writeFileSync(path.join(base, 'CLAUDE.local.md'), '# group memory\nuser prefers terse replies\n');
ensureMemoryScaffold(base);
// Migration between memory stores is the operator's move (/migrate-memory),
// never a boot side effect.
expect(fs.existsSync(path.join(base, 'memory', 'memories', 'imported-agent-memory.md'))).toBe(false);
expect(fs.readFileSync(path.join(base, 'CLAUDE.local.md'), 'utf-8')).toContain('terse replies');
} finally {
fs.rmSync(base, { recursive: true, force: true });
}
});
it('is idempotent and never clobbers the agent edits', () => {
const base = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-mem-'));
try {
ensureMemoryScaffold(base);
const indexFile = path.join(base, 'memory', 'index.md');
fs.writeFileSync(indexFile, '# my own index\n');
ensureMemoryScaffold(base);
expect(fs.readFileSync(indexFile, 'utf-8')).toBe('# my own index\n');
} finally {
fs.rmSync(base, { recursive: true, force: true });
}
});
});
@@ -0,0 +1,39 @@
import fs from 'fs';
import path from 'path';
import { fileURLToPath } from 'url';
/**
* Create the agent's persistent memory scaffold, container-side, at boot.
*
* The runner owns its own workspace: it writes the memory tree straight into
* `/workspace/agent` (the host-backed, RW group dir, so it persists across the
* ephemeral container). No host-side step, nothing mounted in.
*
* The default `definition.md` / `index.md` live as real markdown templates next
* to this module (under `memory-templates/`), not as strings in code, so the
* doctrine is editable as markdown and the agent receives an unescaped copy.
* They ship in the mounted `/app/src` tree, so no image change is needed.
*
* Idempotent: only writes what's missing, so the agent's own edits and
* accumulated memory are never clobbered on a later wake. Provider-agnostic:
* the runner makes no assumption about which harness is running; a provider
* opts in via `usesMemoryScaffold`.
*/
const TEMPLATES_DIR = path.join(path.dirname(fileURLToPath(import.meta.url)), 'memory-templates');
export function ensureMemoryScaffold(baseDir = '/workspace/agent'): void {
const memoryDir = path.join(baseDir, 'memory');
const systemDir = path.join(memoryDir, 'system');
for (const dir of [systemDir, path.join(memoryDir, 'memories'), path.join(memoryDir, 'data')]) {
fs.mkdirSync(dir, { recursive: true });
}
copyTemplateIfMissing('definition.md', path.join(systemDir, 'definition.md'));
copyTemplateIfMissing('index.md', path.join(memoryDir, 'index.md'));
}
function copyTemplateIfMissing(template: string, dest: string): void {
if (fs.existsSync(dest)) return;
fs.copyFileSync(path.join(TEMPLATES_DIR, template), dest);
}
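The write-only-what-is-missing behaviour of the module above can be sketched without touching the disk. This is a rough model under stated assumptions: the `FileMap` type, `writeIfMissing`, `ensureScaffoldSketch`, and the one-line template bodies are all hypothetical stand-ins, and an in-memory map replaces the real filesystem calls.

```typescript
// Hypothetical in-memory stand-in for the filesystem: path -> file contents.
type FileMap = Map<string, string>;

// Illustrative template bodies; the real templates live under memory-templates/.
const TEMPLATES: Record<string, string> = {
  'definition.md': '# Agent Memory System\n',
  'index.md': '# Memory Index\n',
};

// Write a template only when the destination does not exist yet.
function writeIfMissing(files: FileMap, template: string, dest: string): void {
  if (files.has(dest)) return; // an existing (possibly agent-edited) file wins
  files.set(dest, TEMPLATES[template]);
}

function ensureScaffoldSketch(files: FileMap, baseDir: string): void {
  writeIfMissing(files, 'definition.md', `${baseDir}/memory/system/definition.md`);
  writeIfMissing(files, 'index.md', `${baseDir}/memory/index.md`);
}

const files: FileMap = new Map();
ensureScaffoldSketch(files, '/workspace/agent'); // first boot creates both files
files.set('/workspace/agent/memory/index.md', '# my own index\n'); // simulate an agent edit
ensureScaffoldSketch(files, '/workspace/agent'); // later boot leaves the edit alone
console.log(files.get('/workspace/agent/memory/index.md'));
```

Because the existence check comes first, calling the function on every wake is safe: it can only add files, never replace them.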
@@ -0,0 +1,22 @@
import { describe, expect, it } from 'bun:test';
import fs from 'fs';
import path from 'path';
// Wiring guard for the memory-scaffold seam: the boot gate in index.ts
// (`if (provider.usesMemoryScaffold) ensureMemoryScaffold()`) is the seam's
// single functional reach-in. The unit tests in memory-scaffold.test.ts drive
// ensureMemoryScaffold directly and stay green if the gate is deleted — this
// test goes red. main() can't be driven in-process (it reads
// /workspace/agent/container.json and enters the poll loop), so the guard is
// structural: gate + import must both be present in the real entry point.
describe('memory scaffold boot wiring', () => {
const indexSrc = fs.readFileSync(path.join(import.meta.dir, 'index.ts'), 'utf-8');
it('gates the scaffold on the provider capability in main()', () => {
expect(indexSrc).toContain('if (provider.usesMemoryScaffold) ensureMemoryScaffold()');
});
it('imports ensureMemoryScaffold from the seam module', () => {
expect(indexSrc).toContain("import { ensureMemoryScaffold } from './memory-scaffold.js'");
});
});
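The structural guard above amounts to a substring check over the entry point's source text. A minimal sketch of that idea follows; `missingWiring` and `sampleSource` are hypothetical, and the sample string stands in for reading the real `index.ts` from disk.

```typescript
// Hypothetical helper: returns the required snippets that are absent from a source string.
function missingWiring(source: string, required: string[]): string[] {
  return required.filter((snippet) => !source.includes(snippet));
}

// Illustrative stand-in for the entry point's source text.
const sampleSource = [
  "import { ensureMemoryScaffold } from './memory-scaffold.js';",
  'if (provider.usesMemoryScaffold) ensureMemoryScaffold();',
].join('\n');

const required = [
  "import { ensureMemoryScaffold } from './memory-scaffold.js'",
  'if (provider.usesMemoryScaffold) ensureMemoryScaffold()',
];

console.log(missingWiring(sampleSource, required)); // snippets missing from the wired source
console.log(missingWiring('', required)); // snippets missing once the wiring is deleted
```

A check like this is sensitive to formatting changes in the guarded line, which is the trade-off for not having to drive `main()` in-process.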
@@ -0,0 +1,23 @@
# Agent Memory System
This editable file defines how your persistent memory works. It is a starting
point, not a contract — reorganize it as the work demands. If the user or another
memory system replaces this definition, follow the replacement.
Start every memory task at `memory/index.md`, then follow the narrowest relevant index.
Treat indexes as core data: keep them accurate and concise.
Every folder of durable memory has its own `index.md` describing its contents.
When an index grows past roughly 20 entries, group related items into subfolders,
and give each new subfolder its own `index.md` linked from the parent.
Use `memory/memories/` for durable facts, project context, people, decisions, and entity notes.
Use `memory/data/` for structured reference data, datasets, tables, and reusable records.
Use entity folders for things that matter: projects, people, places, organizations, decisions.
When the user shares something that should survive future turns, store it in the
smallest useful file; prefer updating an existing file over creating duplicates.
Write concise, source-aware notes; include dates when timing matters.
If a fact is corrected, update the memory and keep only useful history.
When you add, move, or remove memory, update the nearest index.
Before answering from memory, read the relevant index or file instead of guessing;
if memory is missing or uncertain, say so and verify when it matters.
@@ -0,0 +1,5 @@
# Memory Index
- [Memory system definition](system/definition.md)
- [Memories](memories/) - durable facts, people, projects, decisions
- [Data](data/) - structured reference data
+60 -1
@@ -4,8 +4,9 @@ import { initTestSessionDb, closeSessionDb, getInboundDb, getOutboundDb } from '
import { getPendingMessages, markCompleted } from './db/messages-in.js';
import { getUndeliveredMessages } from './db/messages-out.js';
import { formatMessages, extractRouting } from './formatter.js';
import { isCorruptionError } from './poll-loop.js';
import { isCorruptionError, processQuery } from './poll-loop.js';
import { MockProvider } from './providers/mock.js';
import type { AgentQuery, ProviderEvent } from './providers/types.js';
beforeEach(() => {
initTestSessionDb();
@@ -379,6 +380,64 @@ describe('end-to-end with mock provider', () => {
});
});
/**
* Build a one-shot stub query that yields init + a single result event, then
* ends. `pushes` records any follow-ups the loop tried to inject (e.g. the
* re-wrap nudge), so a test can assert the loop did NOT re-hammer.
*/
function makeResultQuery(result: ProviderEvent): { query: AgentQuery; pushes: string[] } {
const pushes: string[] = [];
async function* events(): AsyncGenerator<ProviderEvent> {
yield { type: 'init', continuation: 'sess-1' };
yield result;
}
return {
pushes,
query: {
push: (m: string) => {
pushes.push(m);
},
end: () => {},
events: events(),
abort: () => {},
},
};
}
const ERR_ROUTING = {
platformId: 'chan-1',
channelType: 'discord',
threadId: null,
inReplyTo: 'm1',
};
describe('error result with no <message> envelope', () => {
it('delivers a budget/billing error to the triggering channel and does not nudge', async () => {
const budgetText = 'Spending limit reached. Add your own key at https://example.com/keys';
const { query, pushes } = makeResultQuery({ type: 'result', text: budgetText, isError: true });
await processQuery(query, ERR_ROUTING, ['m1'], 'claude', undefined, 'prompt', undefined);
const out = getUndeliveredMessages();
expect(out).toHaveLength(1);
expect(JSON.parse(out[0].content).text).toBe(budgetText);
expect(out[0].platform_id).toBe('chan-1');
expect(out[0].channel_type).toBe('discord');
// No re-wrap nudge — an error result must not re-hammer the gateway.
expect(pushes).toHaveLength(0);
});
it('still nudges (and does not deliver) a normal unwrapped result', async () => {
const { query, pushes } = makeResultQuery({ type: 'result', text: 'bare text, no envelope' });
await processQuery(query, ERR_ROUTING, ['m1'], 'claude', undefined, 'prompt', undefined);
expect(getUndeliveredMessages()).toHaveLength(0);
expect(pushes).toHaveLength(1);
expect(pushes[0]).toContain('was not delivered');
});
});
describe('isCorruptionError', () => {
it('matches the Docker Desktop macOS torn-read symptom', () => {
expect(isCorruptionError('database disk image is malformed')).toBe(true);
+114 -19
@@ -14,7 +14,7 @@ import {
type RoutingContext,
} from './formatter.js';
import { isUploadTraceCommand, uploadTrace } from './upload-trace.js';
import type { AgentProvider, AgentQuery, ProviderEvent } from './providers/types.js';
import type { AgentProvider, AgentQuery, ProviderEvent, ProviderExchange } from './providers/types.js';
const POLL_INTERVAL_MS = 1000;
const ACTIVE_POLL_INTERVAL_MS = 500;
@@ -63,6 +63,12 @@ export interface PollLoopConfig {
systemContext?: {
instructions?: string;
};
/**
* Optional stop signal. In production the loop runs until the container
* dies; tests pass a signal so an abandoned loop actually exits instead of
* polling forever and stealing messages from the next test's DB.
*/
signal?: AbortSignal;
}
/**
@@ -107,6 +113,7 @@ export async function runPollLoop(config: PollLoopConfig): Promise<void> {
let pollCount = 0;
let isFirstPoll = true;
while (true) {
if (config.signal?.aborted) return;
// Skip system messages — they're responses for MCP tools (e.g., ask_user_question)
const messages = getPendingMessages(isFirstPoll).filter((m) => m.kind !== 'system');
isFirstPoll = false;
@@ -232,7 +239,15 @@ export async function runPollLoop(config: PollLoopConfig): Promise<void> {
// can stamp it on outbound rows — needed for a2a return-path routing.
setCurrentInReplyTo(routing.inReplyTo);
try {
const result = await processQuery(query, routing, processingIds, config.providerName);
const result = await processQuery(
query,
routing,
processingIds,
config.providerName,
config.provider.onExchangeComplete?.bind(config.provider),
prompt,
continuation,
);
if (result.continuation && result.continuation !== continuation) {
continuation = result.continuation;
setContinuation(config.providerName, continuation);
@@ -308,15 +323,23 @@ interface QueryResult {
continuation?: string;
}
async function processQuery(
export async function processQuery(
query: AgentQuery,
routing: RoutingContext,
initialBatchIds: string[],
providerName: string,
onExchangeComplete: ((exchange: ProviderExchange) => void) | undefined,
initialPrompt: string,
initialContinuation: string | undefined,
): Promise<QueryResult> {
let queryContinuation: string | undefined;
let done = false;
let unwrappedNudged = false;
// Prompt queue for the exchange hook — each result event consumes the
// oldest unanswered prompt, except a wrapping-retry result, which answers
// the same prompt again. Unused (and unmaintained) when the provider
// doesn't implement `onExchangeComplete`.
const archivePrompts: string[] = [initialPrompt];
// Concurrent polling: push follow-ups into the active query as they arrive.
// We do NOT force-end the stream on silence — keeping the query open avoids
@@ -342,13 +365,16 @@ async function processQuery(
// resume id (fixed at sdkQuery() time); admin/passthrough commands
// (/compact, /cost, …) only dispatch when they're the first input
// of a query — pushed mid-stream they arrive as plain text and
// the SDK never runs them. End the stream and leave the rows
// pending; the outer loop handles them on next iteration via the
// canonical command path + formatMessagesWithCommands.
// the SDK never runs them. Abort the active stream and leave the
// rows pending; the outer loop handles them on next iteration via
// the canonical command path + formatMessagesWithCommands. Abort,
// not end: end() lets an in-flight turn run to completion, which
// can block the command (e.g. /clear during a long task) for as
// long as the turn takes.
if (pending.some((m) => isRunnerCommand(m))) {
log('Pending slash command — ending stream so outer loop can process');
log('Pending slash command — aborting active stream so outer loop can process');
endedForCommand = true;
query.end();
query.abort();
return;
}
@@ -393,6 +419,7 @@ async function processQuery(
log(`Pushing ${keep.length} follow-up message(s) into active query`);
unwrappedNudged = false;
query.push(prompt);
archivePrompts.push(prompt);
markCompleted(keptIds);
} catch (err) {
// Without this catch the rejection escapes the void IIFE and Node
@@ -455,21 +482,57 @@ async function processQuery(
// at all — either way the turn is finished.
markCompleted(initialBatchIds);
if (event.text) {
const { hasUnwrapped } = dispatchResultText(event.text, routing);
if (hasUnwrapped && !unwrappedNudged) {
unwrappedNudged = true;
const destinations = getAllDestinations();
const names = destinations.map((d) => d.name).join(', ');
query.push(
`<system>Your response was not delivered — it was not wrapped in <message to="name">...</message> blocks. ` +
`All output must be wrapped: use <message to="name"> for content to send, or <internal> for scratchpad. ` +
`Your destinations: ${names}. ` +
`Please re-send your response with the correct wrapping.</system>`,
);
const { sent, hasUnwrapped } = dispatchResultText(event.text, routing);
if (sent === 0 && event.isError === true) {
// Non-retryable error turn (e.g. a 403 billing_error) with no
// <message> envelope: deliver the notice instead of dropping it as
// scratchpad, and skip the re-wrap nudge — it would just re-hammer
// the failing gateway turn after turn.
deliverErrorResult(event.text, routing);
notifyExchangeComplete(onExchangeComplete, {
prompt: archivePrompts[0] ?? initialPrompt,
result: event.text,
continuation: queryContinuation ?? initialContinuation,
status: 'error',
});
archivePrompts.shift();
} else {
const willRetryWrapping = hasUnwrapped && !unwrappedNudged;
notifyExchangeComplete(onExchangeComplete, {
prompt: archivePrompts[0] ?? initialPrompt,
result: event.text,
continuation: queryContinuation ?? initialContinuation,
status: hasUnwrapped ? 'undelivered' : 'completed',
});
if (willRetryWrapping) {
unwrappedNudged = true;
const destinations = getAllDestinations();
const names = destinations.map((d) => d.name).join(', ');
query.push(
`<system>Your response was not delivered — it was not wrapped in <message to="name">...</message> blocks. ` +
`All output must be wrapped: use <message to="name"> for content to send, or <internal> for scratchpad. ` +
`Your destinations: ${names}. ` +
`Please re-send your response with the correct wrapping.</system>`,
);
}
// The wrapping-retry result answers the SAME user prompt — keep it
// queued so the retry archives against it, not the nudge text.
if (!willRetryWrapping) archivePrompts.shift();
}
} else {
archivePrompts.shift();
}
}
}
} catch (err) {
const errMsg = err instanceof Error ? err.message : String(err);
notifyExchangeComplete(onExchangeComplete, {
prompt: archivePrompts[0] ?? initialPrompt,
result: `Error: ${errMsg}`,
continuation: queryContinuation ?? initialContinuation,
status: 'error',
});
throw err;
} finally {
done = true;
clearInterval(pollHandle);
@@ -478,6 +541,18 @@ async function processQuery(
return { continuation: queryContinuation };
}
function notifyExchangeComplete(
hook: ((exchange: ProviderExchange) => void) | undefined,
exchange: ProviderExchange,
): void {
if (!hook) return;
try {
hook(exchange);
} catch (err) {
log(`onExchangeComplete failed: ${err instanceof Error ? err.message : String(err)}`);
}
}
function handleEvent(event: ProviderEvent, _routing: RoutingContext): void {
switch (event.type) {
case 'init':
@@ -497,6 +572,26 @@ function handleEvent(event: ProviderEvent, _routing: RoutingContext): void {
}
}
/**
* Deliver a turn's text straight to the channel the batch arrived on. Used when
* a turn ends in a provider error (e.g. a non-retryable 403 billing_error) with
* no <message> envelope: the notice would otherwise be dropped as scratchpad.
* This is the same user-facing write the outer catch block does, minus the
* `Error:` prefix, since the provider's text is already a user-facing message.
*/
function deliverErrorResult(text: string, routing: RoutingContext): void {
log('Error result with no <message> envelope — delivering to channel');
writeMessageOut({
id: generateId(),
in_reply_to: routing.inReplyTo,
kind: 'chat',
platform_id: routing.platformId,
channel_type: routing.channelType,
thread_id: routing.threadId,
content: JSON.stringify({ text }),
});
}
/**
* Parse the agent's final text for <message to="name">...</message> blocks
* and dispatch each one to its resolved destination. Text outside of blocks
@@ -440,8 +440,13 @@ export class ClaudeProvider implements AgentProvider {
if (message.type === 'system' && message.subtype === 'init') {
yield { type: 'init', continuation: message.session_id };
} else if (message.type === 'result') {
const text = 'result' in message ? (message as { result?: string }).result ?? null : null;
yield { type: 'result', text };
// `result` text exists only on subtype:"success"; error subtypes
// (e.g. a non-retryable 403 billing_error) carry their message in
// `errors[]` instead. Surface either so the poll-loop can deliver a
// billing/quota notice to the user rather than dropping the turn.
const m = message as { result?: string; is_error?: boolean; errors?: string[] };
const text = m.result ?? (m.errors && m.errors.length > 0 ? m.errors.join('\n') : null);
yield { type: 'result', text, isError: m.is_error === true };
} else if (message.type === 'system' && (message as { subtype?: string }).subtype === 'api_retry') {
yield { type: 'error', message: 'API retry', retryable: true };
} else if (message.type === 'system' && (message as { subtype?: string }).subtype === 'rate_limit_event') {
+36 -1
@@ -6,6 +6,25 @@ export interface AgentProvider {
*/
readonly supportsNativeSlashCommands: boolean;
/**
* Optional. When true, the runner scaffolds a persistent `memory/` tree in the
* agent's workspace at boot. Providers with their own native memory (e.g.
* Claude's `CLAUDE.local.md`) omit this and get nothing; memory is opt-in per
* provider, never gated on a provider name.
*/
readonly usesMemoryScaffold?: boolean;
/**
* Optional. Called by the poll-loop after each completed exchange (a
* result, a wrapping retry, or an error). Providers whose harness keeps no
* on-disk transcript implement this to persist exchanges themselves (e.g.
* markdown into the agent's `conversations/` dir); providers that persist
* and archive their own transcript (e.g. the Claude Agent SDK's `.jsonl`)
* omit it. Best-effort: the loop catches and logs anything it throws. The
* implementation lives with the provider, never in the runner.
*/
onExchangeComplete?(exchange: ProviderExchange): void;
/** Start a new query. Returns a handle for streaming input and output. */
query(input: QueryInput): AgentQuery;
@@ -31,6 +50,16 @@ export interface AgentProvider {
maybeRotateContinuation?(continuation: string, cwd: string): string | null;
}
/** One prompt/result round-trip, as reported to `onExchangeComplete`. */
export interface ProviderExchange {
/** The user prompt this exchange answers (never an internal retry nudge). */
prompt: string;
result: string | null;
/** Continuation/thread id in effect for the exchange, if any. */
continuation?: string;
status: 'completed' | 'undelivered' | 'error';
}
/**
* Options passed to provider constructors. Fields are common to most
* providers; individual providers may ignore any they don't need.
@@ -96,7 +125,13 @@ export interface AgentQuery {
export type ProviderEvent =
| { type: 'init'; continuation: string }
| { type: 'result'; text: string | null }
/**
* A completed turn. `isError` is set when the underlying SDK flagged the
* turn as an error (e.g. a non-retryable Anthropic 403 billing_error). The
* poll-loop uses it to surface the result text to the user instead of
* dropping it as un-wrapped scratchpad, and to skip the re-wrap nudge.
*/
| { type: 'result'; text: string | null; isError?: boolean }
| { type: 'error'; message: string; retryable: boolean; classification?: string }
| { type: 'progress'; message: string }
/**
+5
@@ -0,0 +1,5 @@
[
{ "name": "vercel", "version": "52.2.1" },
{ "name": "agent-browser", "version": "0.27.1", "onlyBuilt": true },
{ "name": "@anthropic-ai/claude-code", "version": "2.1.170", "onlyBuilt": true }
]
+61
@@ -0,0 +1,61 @@
import { describe, it, expect } from 'vitest';
import { readFileSync } from 'node:fs';
import { fileURLToPath } from 'node:url';
import { dirname, join } from 'node:path';
// Guards the cli-tools.json seam: the global CLIs the agent invokes at runtime
// are installed from the manifest (a skill adds one with a json-merge), not
// hand-edited into the Dockerfile. These go red on a bad merge that drops a
// baseline tool, or on dewiring the Dockerfile / switching the installer off
// the pnpm supply-chain path.
const here = dirname(fileURLToPath(import.meta.url));
const manifest = JSON.parse(readFileSync(join(here, 'cli-tools.json'), 'utf8')) as Array<{
name: string;
version: string;
onlyBuilt?: boolean;
}>;
const dockerfile = readFileSync(join(here, 'Dockerfile'), 'utf8');
const installer = readFileSync(join(here, 'install-cli-tools.sh'), 'utf8');
describe('cli-tools manifest', () => {
it('is a non-empty array of { name, version }', () => {
expect(Array.isArray(manifest)).toBe(true);
expect(manifest.length).toBeGreaterThan(0);
for (const tool of manifest) {
expect(typeof tool.name).toBe('string');
expect(tool.name.length).toBeGreaterThan(0);
expect(typeof tool.version).toBe('string');
expect(tool.version.length).toBeGreaterThan(0);
}
});
it('has unique tool names (json-merge is keyed on name)', () => {
const names = manifest.map((t) => t.name);
expect(new Set(names).size).toBe(names.length);
});
it('pins every version to an exact semver (no latest, no ranges — supply-chain policy)', () => {
for (const tool of manifest) {
expect(tool.version, `${tool.name} must be an exact semver, not "${tool.version}"`).toMatch(
/^\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.-]+)?$/,
);
}
});
it('keeps the baseline CLIs the agent depends on', () => {
const names = manifest.map((t) => t.name);
for (const required of ['vercel', 'agent-browser', '@anthropic-ai/claude-code']) {
expect(names).toContain(required);
}
});
it('is wired into the Dockerfile build (COPY manifest + run installer)', () => {
expect(dockerfile).toMatch(/COPY cli-tools\.json install-cli-tools\.sh/);
expect(dockerfile).toMatch(/install-cli-tools\.sh \/tmp\/cli-tools\.json/);
});
it('installs via pnpm and writes only-built opt-ins (preserves the supply-chain path)', () => {
expect(installer).toMatch(/pnpm install -g/);
expect(installer).toMatch(/only-built-dependencies\[\]=/);
});
});
+29
@@ -0,0 +1,29 @@
#!/bin/sh
# Install the global Node CLIs the agent invokes at runtime, from cli-tools.json.
#
# A skill adds a tool by appending a { "name", "version" } entry to that
# manifest (a json-merge) instead of editing the Dockerfile — the reach-in
# becomes the safest change shape, deterministic and removable.
#
# Every tool is installed via `pnpm install -g`, pinned to an exact version, so
# the pnpm supply-chain policy still applies. Tools with a native postinstall
# set "onlyBuilt": true to opt in to running build scripts (pnpm skips them by
# default). Run as root before `USER node`, so /root/.npmrc is the right home.
set -eu
MANIFEST="${1:-/tmp/cli-tools.json}"
# Write the per-tool only-built-dependencies opt-ins pnpm reads at install time.
node -e '
const tools = require(process.argv[1]);
const optIns = tools.filter((t) => t.onlyBuilt).map((t) => "only-built-dependencies[]=" + t.name);
require("fs").writeFileSync("/root/.npmrc", optIns.join("\n") + (optIns.length ? "\n" : ""));
' "$MANIFEST"
# Install every tool, pinned. name@version specs never contain spaces, so the
# unquoted expansion word-splits cleanly into positional args.
# shellcheck disable=SC2046
set -- $(node -e 'require(process.argv[1]).forEach((t) => console.log(t.name + "@" + t.version))' "$MANIFEST")
if [ "$#" -gt 0 ]; then
pnpm install -g "$@"
fi
+83
@@ -0,0 +1,83 @@
# Upgrading the OneCLI gateway
NanoClaw talks to the OneCLI gateway (credential vault + egress proxy) through `@onecli-sh/sdk`. The gateway is an external component with its own release line, so NanoClaw pins the **sanctioned gateway version** in [`versions.json`](../versions.json) under `onecli-gateway`. When an update moves that pin, the gateway must be upgraded — this doc is the migration path. It is written to be handed to a coding agent verbatim: detect → upgrade → verify → rollback.
There is deliberately **no runtime version check, and setup does not migrate the gateway for you**: the gateway is a separate out-of-band component, and the migrator is your coding agent running `/update-nanoclaw` — it diffs `versions.json` across the update and routes you here when the `onecli-gateway` pin moved. (Setup detects a pre-`/v1` gateway and points at this doc, but never upgrades it.) Run the steps below verbatim.
## 1. Detect
Find out what is running and what is required:
```bash
cat versions.json # the sanctioned pin
curl -s http://127.0.0.1:10254/api/health # reports the running gateway version
curl -s -o /dev/null -w '%{http_code}' http://127.0.0.1:10254/v1/health
```
If the last command prints `404`, the server predates the `/v1` API that `@onecli-sh/sdk` 2.x requires — every SDK call will fail with 404s that look transient but are permanent. If your gateway is remote, substitute its host for `127.0.0.1` (it's in `.env` as `ONECLI_URL` / `NANOCLAW_ONECLI_API_HOST`).
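The status-code check above can be expressed as a small classifier. This is a minimal sketch, assuming only what the paragraph states (a `404` on `/v1/health` means a pre-`/v1` server); `GatewayApiState` and `classifyV1Health` are hypothetical names, not part of NanoClaw or `@onecli-sh/sdk`.

```typescript
// Hypothetical helper for illustration; not NanoClaw or SDK code.
type GatewayApiState = "v1-ready" | "pre-v1" | "unknown";

// Map the HTTP status returned by GET /v1/health to a gateway state.
function classifyV1Health(httpStatus: number): GatewayApiState {
  if (httpStatus === 200) return "v1-ready";
  // A 404 here is permanent, not transient: the server has no /v1 routes.
  if (httpStatus === 404) return "pre-v1";
  // Anything else (connection errors, 5xx) needs a closer look.
  return "unknown";
}

console.log(classifyV1Health(404));
```

Treating `404` as its own state, rather than as a generic failure, keeps callers from retrying a condition that only an upgrade can fix.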
Why gateways fall behind: the OneCLI installer's docker-compose tracks the `latest` image tag, but Docker never re-pulls a tag — the server freezes at whatever `latest` meant on install day.
## 2. Upgrade
The gateway runs as a Docker service in `~/.onecli`. Upgrade just that container to the pinned `onecli-gateway` version — vault data lives in named Docker volumes and survives. This upgrades only the gateway; the CLI binary is pinned separately (see below).
**Local gateway (the common case):**
```bash
cd ~/.onecli && ONECLI_VERSION=<onecli-gateway pin from versions.json> docker compose pull onecli && docker compose up -d
```
**Remote gateway** — run the same command on the gateway's host (NanoClaw can't reach it over SSH).
## 3. Verify
Host-side health is necessary but **not sufficient**:
```bash
curl -s http://127.0.0.1:10254/v1/health # must return {"status":"ok",...}
```
**Verify the bind interface (container reachability).** Agent containers reach the gateway over the docker bridge (`host.docker.internal` → e.g. `172.17.0.1`), so a server bound only to `127.0.0.1` boots clean host-side while every credentialed call from containers dies at the proxy:
```bash
docker run --rm --add-host=host.docker.internal:host-gateway \
curlimages/curl -s -o /dev/null -w '%{http_code}' http://host.docker.internal:10254/v1/health
```
This must print `200`. If it can't connect while the host-side check passed, set the bind address in `~/.onecli/.env` to the docker-bridge IP (or `0.0.0.0` on a host with a closed firewall) and `cd ~/.onecli && docker compose up -d`. Symptom if skipped: host log clean, agents fail all API calls.
Finally, restart the NanoClaw service (per-install names — derive with `setup/lib/install-slug.sh`):
```bash
# macOS
source setup/lib/install-slug.sh && launchctl kickstart -k gui/$(id -u)/$(launchd_label)
# Linux
source setup/lib/install-slug.sh && systemctl --user restart $(systemd_unit)
```
## 4. Rollback
```bash
cd ~/.onecli && ONECLI_VERSION=<old-version> docker compose up -d
```
If the NanoClaw update itself is being rolled back, also pin `@onecli-sh/sdk` back to its previous version in `package.json` and run `pnpm install`. Vault data is unaffected in both directions.
## The CLI binary (`onecli-cli` pin)
The `onecli` host CLI is pinned the same way, under `onecli-cli` in `versions.json`. Setup installs exactly that version by direct release download — it never resolves "latest". When an update moves this pin, replace the binary with the pinned release:
```bash
onecli --version # detect: what is installed
V=<onecli-cli pin from versions.json>
OS=$(uname -s | tr '[:upper:]' '[:lower:]') # darwin | linux
ARCH=$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/') # amd64 | arm64
curl -fsSL -o /tmp/onecli.tgz \
"https://github.com/onecli/onecli-cli/releases/download/v${V}/onecli_${V}_${OS}_${ARCH}.tar.gz"
tar -xzf /tmp/onecli.tgz -C /tmp
install -m 0755 /tmp/onecli "$(command -v onecli || echo ~/.local/bin/onecli)"
onecli --version # verify: must match versions.json
```
To roll back, run the same block after reverting `versions.json` (or checking out the previous NanoClaw version). The CLI is stateless — vault data lives in the gateway, so swapping the binary in either direction loses nothing.
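For tooling that runs under Node rather than a shell, the download URL from the block above could be assembled as follows. This is a sketch under assumptions: `onecliReleaseUrl` is a hypothetical name, and the version in the usage line is a placeholder, not the real pin.

```typescript
// Hypothetical helper for illustration; mirrors the URL pattern in the shell block.
function onecliReleaseUrl(version: string, platform: string, arch: string): string {
  // Node reports "x64" where `uname -m` reports x86_64; the release asset uses "amd64".
  const cpu = arch === "x64" ? "amd64" : arch;
  return (
    "https://github.com/onecli/onecli-cli/releases/download/" +
    `v${version}/onecli_${version}_${platform}_${cpu}.tar.gz`
  );
}

// Placeholder version; read the real value from the `onecli-cli` pin in versions.json.
console.log(onecliReleaseUrl("0.0.0", "linux", "x64"));
```

Passing `process.platform` and `process.arch` as arguments keeps the function pure and easy to check for each OS and architecture pair.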
+44
@@ -0,0 +1,44 @@
# Switching an agent group between providers
How an **operator** moves a live agent group from one agent provider to another (e.g. Claude → Codex) and back. Switching is an operator action: it runs from the host via `ncl groups config update --provider` + restart.
NanoClaw's runtime does not migrate anything when you switch. Provider-neutral state simply stays where it is; provider-specific state (memory, in-flight context) stays with its provider, and carrying memory across is a separate, explicit operator step (`/migrate-memory`, executed by your coding agent).
## Preconditions
1. **The target provider is installed** — run its `/add-<provider>` skill and rebuild the container image (`./container/build.sh`). If the provider isn't installed (or the name is a typo), the container fails at boot and the host surfaces its last words in the logs: look for `Container exited non-zero` with a `stderrTail` like `Unknown provider: codexx. Registered: claude, codex`.
2. **Auth is configured** — each provider documents its own auth in its install skill (for Codex: a ChatGPT-subscription or API-key secret in the OneCLI vault).
## Switching
```bash
ncl groups config update --id <group-id> --provider codex
ncl groups restart --id <group-id>
```
Sessions resolve their provider at container spawn (`sessions.agent_provider` is only set when you've explicitly pinned a session), so existing sessions pick up the new provider on their next wake.
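The resolution order implied above can be sketched as a fallback chain. This is an illustration only: `resolveProvider` is a hypothetical name, and both the precedence of a pinned session over the group config and the `claude` default are assumptions drawn from this paragraph, not NanoClaw's actual code.

```typescript
// Hypothetical sketch of provider resolution at container spawn.
// sessionPin: sessions.agent_provider, set only when a session was explicitly pinned.
// groupProvider: the provider stored on the group's config, if any.
function resolveProvider(sessionPin: string | null, groupProvider: string | null): string {
  // Assumed order: explicit session pin, then group config, then the default.
  return sessionPin ?? groupProvider ?? "claude";
}

console.log(resolveProvider(null, "codex"));
```

Under this reading, an unpinned session has no provider of its own, which is why it picks up the group's new provider on its next wake.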
## What carries over automatically
| State | How |
|-------|-----|
| Group identity, wiring, members, roles, destinations | Provider-neutral, in the central DB — untouched |
| Container config (model aside), skills, MCP servers, packages, mounts, cli_scope | Provider-neutral — untouched |
| Workspace files (`groups/<folder>/` — notes, data files the agent created) | Same workspace, mounted for every provider |
| Conversation archives (`conversations/`) | Provider-neutral markdown — readable by the new provider |
| Agent surfaces (system instructions / project docs) | Composed fresh at every spawn from the same sources — nothing to migrate |
## What does NOT carry over
- **Agent memory.** Each provider keeps its own store: Claude's per-group memory is `CLAUDE.local.md` in the workspace; scaffold providers (e.g. Codex) keep a `memory/` tree. Neither is touched by a switch — the old store sits intact, the new provider starts with its own. To carry memory across, run **`/migrate-memory`**: your coding agent reads the source store, distills it into the target store (copy, never move), and restarts the group. Both directions work.
- **In-flight conversation context.** Continuations are provider-specific (a Claude SDK session, a Codex thread) and stored in separate per-provider slots — the new provider starts a fresh thread. The old slot is kept, not deleted. Recent context is recoverable from `conversations/` archives.
- **Provider state dirs** (`.claude-shared/`, `.codex-shared/`). Each provider keeps its own; they sit idle while unused and are reused if you switch back.
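The per-provider continuation slots described in the list above can be pictured as a map keyed by provider name. This is a minimal sketch; `ContinuationSlots` and its methods are hypothetical and do not reflect how NanoClaw actually stores continuations.

```typescript
// Hypothetical model of per-provider continuation slots.
class ContinuationSlots {
  private slots = new Map<string, string>();

  // Record the continuation (session or thread id) for one provider.
  set(provider: string, continuation: string): void {
    this.slots.set(provider, continuation);
  }

  // Look up a provider's continuation; undefined means "start a fresh thread".
  get(provider: string): string | undefined {
    return this.slots.get(provider);
  }
}

const slots = new ContinuationSlots();
slots.set("claude", "claude-session-1");
// After switching to codex there is no slot yet, so a fresh thread starts.
console.log(slots.get("codex"));
slots.set("codex", "codex-thread-1");
// The claude slot is untouched, so switching back can resume it.
console.log(slots.get("claude"));
```

Because a switch never writes to another provider's key, the old continuation survives for as long as the provider's own retention allows.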
## Rolling back
```bash
ncl groups config update --id <group-id> --provider claude
ncl groups restart --id <group-id>
```
Rollback is lossless by construction: the per-provider continuation slot means Claude resumes its previous session (subject to normal transcript-rotation age limits), and `CLAUDE.local.md` was never modified by the switch. Memory written **while on the other provider** lives in that provider's store — run `/migrate-memory` again if you want it carried back.
+2 -2
@@ -1,6 +1,6 @@
{
"name": "nanoclaw",
"version": "2.1.11",
"version": "2.1.18",
"description": "Personal Claude assistant. Lightweight, secure, customizable.",
"type": "module",
"packageManager": "pnpm@10.33.0",
@@ -30,7 +30,7 @@
"dependencies": {
"@clack/core": "^1.2.0",
"@clack/prompts": "^1.2.0",
"@onecli-sh/sdk": "^0.5.0",
"@onecli-sh/sdk": "2.2.1",
"better-sqlite3": "11.10.0",
"chat": "^4.24.0",
"cron-parser": "5.5.0",
+5 -5
@@ -15,8 +15,8 @@ importers:
specifier: ^1.2.0
version: 1.2.0
'@onecli-sh/sdk':
specifier: ^0.5.0
version: 0.5.0
specifier: 2.2.1
version: 2.2.1
better-sqlite3:
specifier: 11.10.0
version: 11.10.0
@@ -303,8 +303,8 @@ packages:
'@emnapi/core': ^1.7.1
'@emnapi/runtime': ^1.7.1
'@onecli-sh/sdk@0.5.0':
resolution: {integrity: sha512-oe5Yx9o98v6N1PgzcCR7nULHHqcqKWNJIDOHGOSNX+l20mLlZpFUqfKPeFmsojBNRQMoqbvZQKUlFMp6gVuYBA==}
'@onecli-sh/sdk@2.2.1':
resolution: {integrity: sha512-q2mCW4ZsARlLEoTxz/P0NQ4MiCh7Z2n28pxkSc7srS+tozyw40PdTnWYW7NI8hfSYplZTx5856Adq1iPi4KN3Q==}
engines: {node: '>=20'}
'@oxc-project/types@0.124.0':
@@ -1665,7 +1665,7 @@ snapshots:
'@tybys/wasm-util': 0.10.1
optional: true
'@onecli-sh/sdk@0.5.0': {}
'@onecli-sh/sdk@2.2.1': {}
'@oxc-project/types@0.124.0': {}
+4 -4
@@ -1,5 +1,5 @@
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="90" height="20" role="img" aria-label="190k tokens, 95% of context window">
<title>190k tokens, 95% of context window</title>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="90" height="20" role="img" aria-label="196k tokens, 98% of context window">
<title>196k tokens, 98% of context window</title>
<linearGradient id="s" x2="0" y2="100%">
<stop offset="0" stop-color="#bbb" stop-opacity=".1"/>
<stop offset="1" stop-opacity=".1"/>
@@ -15,8 +15,8 @@
<g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" font-size="11">
<text aria-hidden="true" x="26" y="15" fill="#010101" fill-opacity=".3">tokens</text>
<text x="26" y="14">tokens</text>
<text aria-hidden="true" x="71" y="15" fill="#010101" fill-opacity=".3">190k</text>
<text x="71" y="14">190k</text>
<text aria-hidden="true" x="71" y="15" fill="#010101" fill-opacity=".3">196k</text>
<text x="71" y="14">196k</text>
</g>
</g>
</a>


+6
@@ -21,6 +21,7 @@ import path from 'path';
import { DATA_DIR } from '../src/config.js';
import { createAgentGroup, getAgentGroupByFolder } from '../src/db/agent-groups.js';
import { updateContainerConfigScalars } from '../src/db/container-configs.js';
import { initDb } from '../src/db/connection.js';
import {
createMessagingGroup,
@@ -102,6 +103,7 @@ async function main(): Promise<void> {
// 2. Agent group + filesystem.
const folder = args.folder || `cli-with-${normalizeName(args.displayName)}`;
const pickedProvider = process.env.NANOCLAW_PICKED_PROVIDER?.trim().toLowerCase();
let ag: AgentGroup | undefined = getAgentGroupByFolder(folder);
if (!ag) {
const agId = generateId('ag');
@@ -123,6 +125,10 @@ async function main(): Promise<void> {
`You are ${args.agentName}, a personal NanoClaw agent for ${args.displayName}. ` +
'When the user first reaches out, introduce yourself briefly and invite them to chat. Keep replies concise.',
});
// Runtime provider lives on the config row, not the deprecated agent_provider.
if (pickedProvider && pickedProvider !== 'claude') {
updateContainerConfigScalars(ag.id, { provider: pickedProvider });
}
// 3. CLI messaging group + wiring.
let cliMg: MessagingGroup | undefined = getMessagingGroupByPlatform(CLI_CHANNEL, CLI_PLATFORM_ID);
+20 -8
@@ -30,10 +30,11 @@
* For direct-addressable channels (telegram, whatsapp, etc.), --platform-id
* is typically the same as the handle in --user-id, with the channel prefix.
*/
import fs from 'fs';
import net from 'net';
import path from 'path';
import { DATA_DIR } from '../src/config.js';
import { DATA_DIR, GROUPS_DIR } from '../src/config.js';
import { createAgentGroup, getAgentGroupByFolder } from '../src/db/agent-groups.js';
import { initDb } from '../src/db/connection.js';
import {
@@ -47,8 +48,7 @@ import { normalizeName } from '../src/modules/agent-to-agent/db/agent-destinatio
import { addMember } from '../src/modules/permissions/db/agent-group-members.js';
import { getUserRoles, grantRole } from '../src/modules/permissions/db/user-roles.js';
import { upsertUser } from '../src/modules/permissions/db/users.js';
import { updateContainerConfigScalars } from '../src/db/container-configs.js';
import { initGroupFilesystem } from '../src/group-init.js';
import { ensureContainerConfig, updateContainerConfigScalars } from '../src/db/container-configs.js';
import { namespacedPlatformId } from '../src/platform-id.js';
import type { AgentGroup, MessagingGroup } from '../src/types.js';
@@ -189,6 +189,7 @@ async function main(): Promise<void> {
// 2. Agent group + filesystem.
const folder = `dm-with-${normalizeName(args.displayName)}`;
const pickedProvider = process.env.NANOCLAW_PICKED_PROVIDER?.trim().toLowerCase();
let ag: AgentGroup | undefined = getAgentGroupByFolder(folder);
if (!ag) {
const agId = generateId('ag');
@@ -204,12 +205,23 @@ async function main(): Promise<void> {
} else {
console.log(`Reusing agent group: ${ag.id} (${folder})`);
}
initGroupFilesystem(ag, {
instructions:
`# ${args.agentName}\n\n` +
// Ensure the config row exists; defer workspace scaffolding to the first
// spawn (group-init), where the DB-resolved provider decides the surface
// (Claude: CLAUDE.local.md; a surfaces-owning provider: the memory scaffold)
// — so a non-Claude group never gets stale CLAUDE.* files written here.
ensureContainerConfig(ag.id);
// Runtime provider lives on the config row, not the deprecated agent_provider.
if (pickedProvider && pickedProvider !== 'claude') {
updateContainerConfigScalars(ag.id, { provider: pickedProvider });
}
const groupDir = path.resolve(GROUPS_DIR, folder);
fs.mkdirSync(groupDir, { recursive: true });
fs.writeFileSync(
path.join(groupDir, '.seed.md'),
`# ${args.agentName}\n\n` +
`You are ${args.agentName}, a personal NanoClaw agent for ${args.displayName}. ` +
'When the user first reaches out (or you receive a system welcome prompt), introduce yourself briefly and invite them to chat. Keep replies concise.',
});
'When the user first reaches out (or you receive a system welcome prompt), introduce yourself briefly and invite them to chat. Keep replies concise.\n',
);
// 2b. Assign the user a role for this agent group. The caller picks via
// --role; the channel drivers default to 'owner' for the self-host case.
+121
@@ -0,0 +1,121 @@
#!/usr/bin/env bash
#
# Install the Codex agent provider non-interactively: copy the payload from the
# `providers` branch, wire the three provider barrels, and add the Codex CLI to
# the container manifest (container/cli-tools.json). The image rebuild is the
# caller's job (the setup container step / `./container/build.sh`).
#
# Emits exactly one status block on stdout (ADD_CODEX); all chatty progress
# goes to stderr. Keep in sync with .claude/skills/add-codex/SKILL.md.
set -euo pipefail
PROJECT_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
cd "$PROJECT_ROOT"
# Keep in sync with add-codex SKILL.md. This is the canonical Codex CLI pin —
# it lands in container/cli-tools.json (the global-CLI manifest), not the Dockerfile.
CODEX_VERSION="0.138.0"
# Resolve the remote carrying the providers branch (same nanoclaw remote that
# carries channels — handles forks where it isn't `origin`).
# shellcheck source=setup/lib/channels-remote.sh
source "$PROJECT_ROOT/setup/lib/channels-remote.sh"
REMOTE=$(resolve_channels_remote)
BRANCH="${REMOTE}/providers"
# The codex payload — host provider, container runtime, setup module, doctrine.
# Barrels are appended to, not copied.
PAYLOAD_FILES=(
src/providers/codex.ts
src/providers/codex-agents-md.ts
src/providers/codex-registration.test.ts
src/providers/codex-host-contribution.test.ts
src/providers/codex-agents-md.test.ts
container/agent-runner/src/providers/codex.ts
container/agent-runner/src/providers/codex-app-server.ts
container/agent-runner/src/providers/exchange-archive.ts
container/agent-runner/src/providers/exchange-archive.test.ts
container/agent-runner/src/providers/codex-registration.test.ts
container/agent-runner/src/providers/codex.factory.test.ts
container/agent-runner/src/providers/codex.turns.test.ts
container/agent-runner/src/providers/codex-app-server.test.ts
container/agent-runner/src/providers/codex-cli-tools.test.ts
setup/providers/codex.ts
setup/providers/codex.test.ts
setup/providers/codex-registration.test.ts
container/AGENTS.md
)
BARRELS=(
src/providers/index.ts
container/agent-runner/src/providers/index.ts
setup/providers/index.ts
)
ALREADY_INSTALLED=true
emit_status() {
local status=$1 error=${2:-}
echo "=== NANOCLAW SETUP: ADD_CODEX ==="
echo "STATUS: ${status}"
echo "CODEX_VERSION: ${CODEX_VERSION}"
echo "ALREADY_INSTALLED: ${ALREADY_INSTALLED}"
[ -n "$error" ] && echo "ERROR: ${error}"
echo "=== END ==="
}
log() { echo "[add-codex] $*" >&2; }
# Idempotent: a complete install has the host provider file, the host barrel
# import, and the Codex CLI in the container manifest. Any missing → (re)install.
need_install() {
[ ! -f src/providers/codex.ts ] && return 0
! grep -q "^import './codex.js';" src/providers/index.ts 2>/dev/null && return 0
! grep -q '@openai/codex' container/cli-tools.json 2>/dev/null && return 0
return 1
}
if need_install; then
ALREADY_INSTALLED=false
log "Fetching providers branch from ${REMOTE}"
git fetch "$REMOTE" providers >&2 2>/dev/null || {
emit_status failed "git fetch ${REMOTE} providers failed"
exit 1
}
log "Copying Codex payload from ${BRANCH}"
for f in "${PAYLOAD_FILES[@]}"; do
mkdir -p "$(dirname "$f")"
git show "${BRANCH}:$f" > "$f" 2>/dev/null || {
emit_status failed "providers branch is missing ${f}"
exit 1
}
done
log "Wiring provider barrels…"
for b in "${BARRELS[@]}"; do
grep -q "^import './codex.js';" "$b" || printf "import './codex.js';\n" >> "$b"
done
log "Adding the Codex CLI to the container manifest (cli-tools.json)…"
# A json-merge: append { name, version } if absent. The Dockerfile installs
# every manifest entry via pinned `pnpm install -g` — no Dockerfile edit, no
# awk surgery. @openai/codex has no native postinstall, so no "onlyBuilt".
MANIFEST=container/cli-tools.json
node -e '
const fs = require("fs");
const [file, name, version] = process.argv.slice(1);
const tools = JSON.parse(fs.readFileSync(file, "utf8"));
if (!tools.some((t) => t.name === name)) {
tools.push({ name, version });
const fmt = (t) =>
" { " +
Object.entries(t).map(([k, v]) => JSON.stringify(k) + ": " + JSON.stringify(v)).join(", ") +
" }";
fs.writeFileSync(file, "[\n" + tools.map(fmt).join(",\n") + "\n]\n");
}
' "$MANIFEST" "@openai/codex" "${CODEX_VERSION}" || {
emit_status failed "failed to add @openai/codex to ${MANIFEST}"
exit 1
}
fi
emit_status ok
+85 -2
@@ -38,8 +38,12 @@ import { runTeamsChannel } from './channels/teams.js';
import { runTelegramChannel } from './channels/telegram.js';
import { runWhatsAppChannel } from './channels/whatsapp.js';
import { pingCliAgent, type PingResult } from './lib/agent-ping.js';
import { getSetupProvider, listSetupProviders } from './providers/registry.js';
// Provider payloads self-register their picker entry + auth on import.
import './providers/index.js';
import { brightSelect } from './lib/bright-select.js';
import { offerClaudeOnFailure } from './lib/claude-handoff.js';
import { setPickedProvider } from './lib/picked-provider.js';
import {
applyToEnv,
parseFlags,
@@ -321,8 +325,54 @@ async function main(): Promise<void> {
}
}
let agentProvider: string | undefined;
if (!skip.has('auth')) {
await runAuthStep();
// Agent runtime pick. Claude is the default and a no-op — choosing it
// runs the existing Claude auth flow unchanged. A branch provider walks
// its own auth (e.g. Codex: ChatGPT subscription or API key, vault-only)
// and verifies its payload is wired. The pick installs and authenticates
// the runtime; it is NOT an install-wide default — and it is NOT a
// creation flag. Provider is a DB property of a group: the creation flows
// create provider-agnostic groups, and setup sets the picked provider on
// each via `ncl groups config update --provider` right after creating it
// (the creation scripts inherit it and apply at create — see picked-provider). Existing groups switch the
// same way (docs/provider-migration.md).
agentProvider = await askAgentProviderChoice();
setPickedProvider(agentProvider);
let providerEntry = getSetupProvider(agentProvider);
if (agentProvider !== 'claude' && !providerEntry) {
// A non-claude provider picked from the hard-wired list isn't wired in
// this install yet — install it via its self-contained script (channel
// style, idempotent: self-skips if already installed), rebuild the image
// (the container step already ran, the Dockerfile just changed), then
// load the payload's setup module so it self-registers.
const install = await runQuietChild(
`add-${agentProvider}`,
'bash',
[`setup/add-${agentProvider}.sh`],
{
running: `Installing ${agentProvider}`,
done: `${agentProvider} installed.`,
},
);
if (!install.ok) {
await fail(
`add-${agentProvider}`,
`Couldn't install ${agentProvider}.`,
'See logs/setup-steps/ for details, then retry setup.',
);
}
p.log.info(brandBody('Rebuilding the container image with the new provider…'));
spawnSync('./container/build.sh', [], { stdio: 'inherit' });
await import(`./providers/${agentProvider}.js`);
providerEntry = getSetupProvider(agentProvider);
}
if (providerEntry?.runAuth) {
await providerEntry.runAuth();
await providerEntry.runInstallCheck?.();
} else {
await runAuthStep();
}
}
if (!skip.has('mounts')) {
@@ -748,6 +798,39 @@ function sendChatMessage(message: string): Promise<void> {
// ─── auth step (select → branch) ────────────────────────────────────────
// Providers offered for install are hard-wired in trunk — an audited control
// surface (no branch enumeration that anyone with write access could extend).
// Codex is the only one offered here; opencode/ollama install via their own
// /add-* skills. Each is installed by its self-contained setup/add-<name>.sh.
const INSTALLABLE_PROVIDERS = [
{ value: 'codex', label: 'Codex', hint: 'OpenAI — ChatGPT subscription or API key' },
] as const;
async function askAgentProviderChoice(): Promise<string> {
const installed = listSetupProviders();
const installedNames = new Set(installed.map((entry) => entry.value));
// Offer the hard-wired installable providers this install hasn't wired yet —
// selecting one installs it via setup/add-<name>.sh.
const available = INSTALLABLE_PROVIDERS.filter((prov) => !installedNames.has(prov.value));
const options = [
...installed.map(({ value, label, hint }) => ({ value, label, hint })),
...available.map((prov) => ({ value: prov.value, label: prov.label, hint: `${prov.hint} — installs now` })),
];
// The pick installs and authenticates a runtime — it is not an
// install-wide default, so re-runs safely Enter-through on claude (its
// auth flow short-circuits when the secret already exists).
const choice = ensureAnswer(
await brightSelect<string>({
message: 'Which agent runtime should power your assistant?',
options,
initialValue: 'claude',
}),
) as string;
setupLog.userInput('agent_provider', choice);
phEmit('agent_provider_chosen', { provider: choice });
return choice;
}
async function runAuthStep(): Promise<void> {
if (anthropicSecretExists()) {
p.log.success(brandBody('Your Claude account is already connected.'));
@@ -1261,7 +1344,7 @@ function detectExistingOnecli(): { version: string; apiHost: string } | null {
} catch {
// not JSON — try to extract a URL directly
}
const m = raw.match(/https?:\/\/[\w.\-]+(?::\d+)?/);
const m = raw.match(/https?:\/\/[\w.-]+(?::\d+)?/);
return m ? { version, apiHost: m[0] } : null;
} catch {
return null;
+8 -1
@@ -68,8 +68,12 @@ export async function run(args: string[]): Promise<void> {
log.info('Invoking init-cli-agent', { displayName, agentName });
// Provider-agnostic: init-cli-agent creates a default group and emits its id.
// Surface that id so the orchestrator can set the picked provider on it (via
// ncl) before the ping — provider is a DB property, never a creation flag.
let stdout = '';
try {
execFileSync('pnpm', scriptArgs, {
stdout = execFileSync('pnpm', scriptArgs, {
cwd: projectRoot,
stdio: ['ignore', 'pipe', 'pipe'],
encoding: 'utf-8',
@@ -90,10 +94,13 @@ export async function run(args: string[]): Promise<void> {
process.exit(1);
}
const agentGroupId = stdout.match(/^AGENT_GROUP_ID:\s*(\S+)/m)?.[1];
emitStatus('CLI_AGENT', {
DISPLAY_NAME: displayName,
AGENT_NAME: agentName || displayName,
CHANNEL: 'cli/local',
...(agentGroupId ? { AGENT_GROUP_ID: agentGroupId } : {}),
STATUS: 'success',
LOG: 'logs/setup.log',
});
+23
@@ -35,6 +35,29 @@ export function readEnvKey(key: string, projectRoot?: string): string | null {
return null;
}
/**
* Set (or replace) a single `KEY=value` line in `.env`, creating the file if
* needed. Non-secret config only; secrets belong in the OneCLI vault.
*/
export function upsertEnvKey(key: string, value: string, projectRoot?: string): void {
const envPath = path.join(projectRoot ?? process.cwd(), '.env');
let content = '';
try {
content = fs.readFileSync(envPath, 'utf-8');
} catch {
/* no .env yet */
}
const line = `${key}=${value}`;
const lines = content.split('\n');
const idx = lines.findIndex((l) => l.trim().startsWith(`${key}=`));
if (idx >= 0) lines[idx] = line;
else {
while (lines.length > 0 && lines[lines.length - 1].trim() === '') lines.pop();
lines.push(line);
}
fs.writeFileSync(envPath, lines.join('\n') + '\n');
}
export function detectExistingDisplayName(projectRoot: string): string | null {
const dbPath = path.join(projectRoot, 'data', 'v2.db');
if (!fs.existsSync(dbPath)) return null;
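The `upsertEnvKey` helper added in this hunk replaces the first `KEY=` line or appends a new one. Below is a pure-string sketch of that behaviour, useful for reasoning about edge cases without touching the filesystem. The function name `upsertEnvLine` is illustrative. One deliberate difference, stated as an assumption about intent: the sketch trims trailing blank lines in both branches, because as written the replace branch appears to keep the empty element left by the file's final newline and so would add one extra newline on each replace.

```typescript
// Pure-string sketch: replace the first `KEY=` line, else append one.
function upsertEnvLine(content: string, key: string, value: string): string {
  const lines = content.split("\n");
  // Trim trailing blank lines up front so repeated calls never accumulate them.
  while (lines.length > 0 && lines[lines.length - 1]?.trim() === "") lines.pop();
  const line = `${key}=${value}`;
  const idx = lines.findIndex((l) => l.trim().startsWith(`${key}=`));
  if (idx >= 0) lines[idx] = line;
  else lines.push(line);
  return lines.join("\n") + "\n";
}

const created = upsertEnvLine("", "ONECLI_URL", "http://localhost:1");
const appended = upsertEnvLine(created, "OTHER", "x");
const replaced = upsertEnvLine(appended, "ONECLI_URL", "http://localhost:2");
console.log(JSON.stringify(replaced));
```

The URLs and key names in the usage lines are placeholders chosen for the example only.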
+1
@@ -23,6 +23,7 @@ const STEPS: Record<
verify: () => import('./verify.js'),
onecli: () => import('./onecli.js'),
auth: () => import('./auth.js'),
'provider-auth': () => import('./provider-auth.js'),
'cli-agent': () => import('./cli-agent.js'),
};
+27 -1
@@ -66,17 +66,43 @@ export interface BrightSelectOptions<T> {
initialValue?: T;
}
/**
* Discard any stdin buffered while no prompt was reading. Keypresses made
* during spinners and installs otherwise get consumed by the next select the
* instant it opens, submitting it before it ever renders for the user (a
* stray `Down`+`Enter` silently picks option 2). Raw-mode reads only see kernel
* tty data via the event loop, so the drain needs a real (short) window.
*/
export function flushStdin(windowMs = 50): Promise<void> {
return new Promise((resolve) => {
const stdin = process.stdin;
if (!stdin.isTTY) return resolve();
const wasRaw = stdin.isRaw === true;
stdin.setRawMode?.(true);
const discard = (): void => {};
stdin.on('data', discard);
stdin.resume();
setTimeout(() => {
stdin.off('data', discard);
stdin.pause();
if (!wasRaw) stdin.setRawMode?.(false);
resolve();
}, windowMs);
});
}
/**
* Matches the return shape of `p.select`: resolves to the selected value
* on submit, or to clack's cancel symbol on Ctrl-C / Esc. Callers pass
* the result through `ensureAnswer(...)` the same way they do for
* `p.select`.
*/
export function brightSelect<T>(
export async function brightSelect<T>(
opts: BrightSelectOptions<T>,
): Promise<T | symbol> {
const { message, options, initialValue } = opts;
await flushStdin();
return new SelectPrompt({
options: options as Array<{ value: T; label?: string; hint?: string }>,
initialValue,
+44
@@ -0,0 +1,44 @@
import { describe, expect, it } from 'vitest';
import { extractClaudeOAuthToken } from './captured-token.js';
// A syntactically valid token: sk-ant-oat + 93 token chars + AA.
const TOKEN = `sk-ant-oat01-${'a'.repeat(90)}AA`;
describe('extractClaudeOAuthToken', () => {
it('extracts the token from clean single-line output (normal terminal)', () => {
const raw = `Login successful.\nYour token:\n${TOKEN}\n`;
expect(extractClaudeOAuthToken(raw)).toBe(TOKEN);
});
// The actual sbx failure shape: the real token wrapped across two lines AND
// the `export CLAUDE_CODE_OAUTH_TOKEN=<token>` placeholder in the same
// capture. The old parser returned null (matched only the first fragment);
// the normalizer must un-wrap the real token and never mistake the
// placeholder for it.
it('extracts the real wrapped token from sbx capture and ignores the placeholder export', () => {
const head = TOKEN.slice(0, 72);
const tail = TOKEN.slice(72);
const raw = `
\x1b[?2026h Long-lived authentication token created successfully!
Your OAuth token (valid for 1 year):
${head}
${tail}
Store this token securely. You won't be able to see it again.
Use this token by setting: export CLAUDE_CODE_OAUTH_TOKEN=<token>
`;
expect(extractClaudeOAuthToken(raw)).toBe(TOKEN);
});
it('returns null for the placeholder env-var line, not a real token', () => {
expect(extractClaudeOAuthToken('export CLAUDE_CODE_OAUTH_TOKEN=<token>\n')).toBeNull();
});
it('returns null when no token is present', () => {
expect(extractClaudeOAuthToken('claude: authentication cancelled\n')).toBeNull();
});
});
+73
@@ -0,0 +1,73 @@
/**
* Parse a provider auth token out of interactive CLI output captured through
* a PTY (`script(1)`).
*
* Secret this module hides: the menagerie of PTY-capture artifacts that
* corrupt an otherwise whitespace-free secret. A real terminal wraps long
* lines, pads with spaces, and interleaves ANSI/control sequences, so a token
* the CLI printed as one string lands in the capture split across lines with
* escape codes embedded. Provider login itself succeeds; only our parse of
* the human-oriented output fails.
*
* A normalize step strips the capture artifacts; the extractor matches the
* token shape against the clean string. A future provider adds its own
* extractor here rather than regexing raw `script(1)` output.
*
* Runnable as a CLI for the bash callers that can't import TS:
* tsx setup/lib/captured-token.ts claude <capture-file>
* Prints the token and exits 0, or exits 1 with nothing on stdout.
*/
import fs from 'fs';
import { pathToFileURL } from 'url';
/* eslint-disable no-control-regex -- these patterns exist precisely to match
the ESC/control bytes a PTY capture is full of. */
// CSI sequences (colors, cursor moves): ESC [ , optional private '?' /
// parameter bytes, optional intermediate bytes, one final byte. Stripped
// explicitly because a colour reset mid-token (sk…\x1b[0m…AA) would otherwise
// leave a `[` that breaks the token's character run.
const CSI = /\x1b\[[0-9;?]*[ -/]*[@-~]/g;
// Everything <= space (control bytes incl. any stray ESC, CR/LF, tabs, and the
// wrap-padding spaces inserted mid-token) plus DEL. Tokens contain none of these.
const CONTROL_AND_SPACE = /[\x00-\x20\x7f]/g;
/* eslint-enable no-control-regex */
/**
* Collapse PTY-capture artifacts so a whitespace-free secret printed across
* wrapped lines becomes a single contiguous string. Drops ALL whitespace by
* design: these captures exist only to recover a token, never prose.
*/
function normalizeCapturedTerminalOutput(raw: string): string {
return raw.replace(CSI, '').replace(CONTROL_AND_SPACE, '');
}
// Claude subscription OAuth tokens: sk-ant-oat<base64url>AA. Bounded length
// keeps a greedy match from running off the end of the token.
const CLAUDE_OAUTH_TOKEN = /sk-ant-oat[A-Za-z0-9_-]{80,500}AA/g;
/**
* Extract the Claude OAuth token from a PTY capture of `claude setup-token`,
* or `null` if none is present. Returns the LAST match because setup-token can echo
* partial/intermediate output before the final token. Placeholder strings like
* `<token>` never match (they lack the `sk-ant-oat` prefix).
*/
export function extractClaudeOAuthToken(raw: string): string | null {
const matches = normalizeCapturedTerminalOutput(raw).match(CLAUDE_OAUTH_TOKEN);
return matches ? matches[matches.length - 1] : null;
}
function runCli(argv: string[]): number {
const [provider, file] = argv;
if (provider !== 'claude' || !file) {
process.stderr.write('usage: captured-token.ts claude <capture-file>\n');
return 2;
}
const token = extractClaudeOAuthToken(fs.readFileSync(file, 'utf-8'));
if (!token) return 1;
process.stdout.write(token);
return 0;
}
if (import.meta.url === pathToFileURL(process.argv[1] ?? '').href) {
process.exit(runCli(process.argv.slice(2)));
}
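To make the normalize-then-match approach concrete, here is a self-contained sketch that repeats the three patterns from the module above and feeds them a simulated wrapped capture. The helper name `extractToken`, the wrap column, and the fabricated token are assumptions made for illustration; the token is a dummy of valid shape, not a real credential.

```typescript
// Same patterns as the module above, repeated so the sketch runs on its own.
const CSI_SEQ = /\x1b\[[0-9;?]*[ -/]*[@-~]/g;
const CONTROL_OR_SPACE = /[\x00-\x20\x7f]/g;
const OAUTH_TOKEN = /sk-ant-oat[A-Za-z0-9_-]{80,500}AA/g;

function extractToken(raw: string): string | null {
  // Strip escape sequences first, then every control byte and space.
  const clean = raw.replace(CSI_SEQ, "").replace(CONTROL_OR_SPACE, "");
  const matches = clean.match(OAUTH_TOKEN);
  // Prefer the last match, mirroring the module's choice.
  return matches ? (matches[matches.length - 1] ?? null) : null;
}

// Fabricated token of valid shape (not a real credential).
const fakeToken = `sk-ant-oat01-${"b".repeat(90)}AA`;

// Simulate a PTY wrap at column 40 with a colour reset and padding mid-token.
const wrapped =
  "Your OAuth token:\r\n" +
  fakeToken.slice(0, 40) + "\x1b[0m  \r\n" +
  fakeToken.slice(40) + "\r\n";

console.log(extractToken(wrapped) === fakeToken);
console.log(extractToken("export CLAUDE_CODE_OAUTH_TOKEN=<token>\n"));
```

Removing all whitespace is what lets the two halves of the token rejoin. The cost is that surrounding prose is also collapsed, which is acceptable only because the capture is never read for anything but the token.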
+4 -8
@@ -27,6 +27,7 @@ import path from 'path';
import * as p from '@clack/prompts';
import k from 'kleur';
import { extractClaudeOAuthToken } from './captured-token.js';
import { ensureAnswer } from './runner.js';
import { brandBody, fitToWidth, fmtDuration, note } from './theme.js';
@@ -207,16 +208,11 @@ export async function ensureClaudeReady(projectRoot: string): Promise<boolean> {
});
if (!isClaudeAuthenticated() && fs.existsSync(tmpfile)) {
const raw = fs.readFileSync(tmpfile, 'utf-8');
const stripped = raw
.replace(/\x1b\[[0-9;]*[a-zA-Z]/g, '')
.replace(/[\n\r]/g, '');
const matches = stripped.match(/(sk-ant-oat[A-Za-z0-9_-]{80,500}AA)/g);
if (matches) {
process.env.CLAUDE_CODE_OAUTH_TOKEN = matches[matches.length - 1];
}
const token = extractClaudeOAuthToken(fs.readFileSync(tmpfile, 'utf-8'));
if (token) process.env.CLAUDE_CODE_OAUTH_TOKEN = token;
}
} finally {
// eslint-disable-next-line no-empty -- best-effort temp cleanup
try { fs.unlinkSync(tmpfile); } catch {}
}
+28
@@ -0,0 +1,28 @@
/**
* The agent runtime the operator picked in THIS setup run.
*
* There is no install-wide default provider and no `--provider` in the
* creation contract; provider is a DB property of a group. Setup is the one
* orchestrator that knows the operator's pick, so it stashes it here (set once
* at the auth step). The group-creation scripts (`init-first-agent`,
* `init-cli-agent`) run as **child processes**, so the pick is carried over the
* process boundary via an environment variable they inherit; they apply it to
* the group at creation, before the welcome wakes the container. This is the
* only place the value lives: a setup-run-scoped global, NOT a persisted
* install default. `undefined` / `'claude'` means the built-in default and no
* provider write at all.
*/
const ENV_KEY = 'NANOCLAW_PICKED_PROVIDER';
export function setPickedProvider(provider: string | undefined): void {
const normalized = provider?.trim().toLowerCase() || undefined;
if (normalized && normalized !== 'claude') {
process.env[ENV_KEY] = normalized;
} else {
delete process.env[ENV_KEY];
}
}
export function getPickedProvider(): string | undefined {
return process.env[ENV_KEY]?.trim().toLowerCase() || undefined;
}
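The hand-off described in the module comment can be sketched without spawning a real child process. The version below takes the environment as a plain object so the parent and child sides are visible side by side; `Env`, `setPicked`, and `getPicked` are illustrative stand-ins for the module's functions, and copying the object stands in for environment inheritance.

```typescript
type Env = Record<string, string | undefined>;
const PICK_KEY = "NANOCLAW_PICKED_PROVIDER";

// Parent side: store a normalized non-default pick, clear it otherwise.
function setPicked(env: Env, provider: string | undefined): void {
  const normalized = provider?.trim().toLowerCase() || undefined;
  if (normalized && normalized !== "claude") env[PICK_KEY] = normalized;
  else delete env[PICK_KEY];
}

// Child side: read the pick; undefined means "built-in default, write nothing".
function getPicked(env: Env): string | undefined {
  return env[PICK_KEY]?.trim().toLowerCase() || undefined;
}

// Setup stashes the operator's pick in its own environment.
const parentEnv: Env = {};
setPicked(parentEnv, "  Codex ");

// A creation script started afterwards sees a copy of that environment.
const childEnv: Env = { ...parentEnv };
console.log(getPicked(childEnv));
```

Treating `claude` the same as an unset value keeps the default path free of any provider write, which is the property the module comment calls out.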
+48
@@ -0,0 +1,48 @@
/**
* versions.json is the machine-checkable source for sanctioned component
* versions: setup steps read it, /update-nanoclaw diffs it across updates.
* These tests go red if the file, the pin, or the onecli-step wiring is
* deleted; the pin moving back to a hardcoded constant is the regression
* this guards against.
*/
import fs from 'fs';
import path from 'path';
import { fileURLToPath } from 'url';
import { describe, expect, it } from 'vitest';
import { readVersionPin } from './version-pins.js';
const here = path.dirname(fileURLToPath(import.meta.url));
describe('readVersionPin', () => {
it('resolves the onecli-gateway pin from the real versions.json', () => {
expect(readVersionPin('onecli-gateway')).toMatch(/^\d+\.\d+\.\d+$/);
});
it('resolves the onecli-cli pin from the real versions.json', () => {
expect(readVersionPin('onecli-cli')).toMatch(/^\d+\.\d+\.\d+$/);
});
it('throws for a component with no pin', () => {
expect(() => readVersionPin('no-such-component')).toThrow(/no pin/);
});
});
describe('onecli step wiring', () => {
it('reads its gateway pin from versions.json, not a hardcoded constant', () => {
const source = fs.readFileSync(path.join(here, '..', 'onecli.ts'), 'utf-8');
expect(source).toContain("readVersionPin('onecli-gateway')");
expect(source).not.toMatch(/ONECLI_GATEWAY_VERSION = '\d/);
});
it('reads its CLI pin from versions.json and never resolves "latest"', () => {
const source = fs.readFileSync(path.join(here, '..', 'onecli.ts'), 'utf-8');
expect(source).toContain("readVersionPin('onecli-cli')");
expect(source).not.toMatch(/ONECLI_CLI(?:_FALLBACK)?_VERSION = '\d/);
// The upstream installer and the /releases/latest redirect probe both
// chase "latest" — reintroducing either bypasses the sanctioned pin.
expect(source).not.toContain('onecli.sh/cli/install');
expect(source).not.toContain('/releases/latest');
});
});
+31
@@ -0,0 +1,31 @@
/**
* Sanctioned version pins for external components (`versions.json` at the
* repo root), the single machine-checkable source. Setup steps read their
* pin here; `/update-nanoclaw` diffs the file across an update and routes
* the user to the migration doc for any pin that moved (see CONTRIBUTING.md,
* "Breaking changes").
*/
import fs from 'fs';
import path from 'path';
import { fileURLToPath } from 'url';
const VERSIONS_FILE = path.resolve(
path.dirname(fileURLToPath(import.meta.url)),
'..',
'..',
'versions.json',
);
/**
* Returns the pinned version for a component, e.g.
* `readVersionPin('onecli-gateway')`. Throws when the file or the pin is
* missing: a missing pin is an install-tree defect, not a runtime condition.
*/
export function readVersionPin(component: string): string {
const pins: unknown = JSON.parse(fs.readFileSync(VERSIONS_FILE, 'utf-8'));
const value = (pins as Record<string, unknown>)[component];
if (typeof value !== 'string' || value.length === 0) {
throw new Error(`versions.json has no pin for "${component}"`);
}
return value;
}
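The lookup in `readVersionPin` can be exercised without the real file by separating the parse from the validation. This is a minimal sketch under stated assumptions: `pinFrom` is an illustrative name, and the sample JSON is hypothetical, since the actual contents of `versions.json` are not shown in this diff.

```typescript
// Validate and return one pin from already-parsed JSON of unknown shape.
function pinFrom(pins: unknown, component: string): string {
  const value = (pins as Record<string, unknown> | null)?.[component];
  if (typeof value !== "string" || value.length === 0) {
    throw new Error(`versions.json has no pin for "${component}"`);
  }
  return value;
}

// Hypothetical file contents; the real versions.json may hold different values.
const sample: unknown = JSON.parse(
  '{ "onecli-gateway": "1.2.3", "onecli-cli": "4.5.6" }',
);

console.log(pinFrom(sample, "onecli-gateway"));

try {
  pinFrom(sample, "no-such-component");
} catch (err) {
  console.log((err as Error).message);
}
```

Throwing on a missing or empty pin, rather than returning a fallback, is what prevents a silent drift back to an unpinned install.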
+29
@@ -0,0 +1,29 @@
/**
* The step DETECTS gateway /v1 compatibility and warns (pointing at
* docs/onecli-upgrades.md). It does not migrate the gateway; that's the
* agent's job via /update-nanoclaw. The verify helper must distinguish
* incompatible (pre-/v1 server: warn) from unreachable (transient: nothing to
* say) so the warning only fires on a real pre-/v1 server.
*/
import { describe, expect, it } from 'vitest';
import { verifyGatewayV1 } from './onecli.js';
function fakeFetch(behavior: 'ok' | '404' | 'down'): typeof fetch {
return (async () => {
if (behavior === 'down') throw new Error('ECONNREFUSED');
return { ok: behavior === 'ok' } as Response;
}) as unknown as typeof fetch;
}
describe('verifyGatewayV1', () => {
it('ok when /v1/health answers', async () => {
expect(await verifyGatewayV1('http://x', fakeFetch('ok'))).toBe('ok');
});
it('incompatible when the server answers HTTP without /v1', async () => {
expect(await verifyGatewayV1('http://x', fakeFetch('404'))).toBe('incompatible');
});
it('unreachable on connection failure', async () => {
expect(await verifyGatewayV1('http://x', fakeFetch('down'))).toBe('unreachable');
});
});
+61 -54
@@ -17,6 +17,7 @@ import os from 'os';
import path from 'path';
import { log } from '../src/log.js';
import { readVersionPin } from './lib/version-pins.js';
import { emitStatus } from './status.js';
const LOCAL_BIN = path.join(os.homedir(), '.local', 'bin');
@@ -102,20 +103,18 @@ function writeEnvOnecliUrl(url: string): void {
writeEnvVar('ONECLI_URL', url);
}
// Last-known-good CLI release. Used only if BOTH the upstream installer
// and the redirect-based version probe fail. Bump deliberately when a
// new CLI release ships.
const ONECLI_GATEWAY_VERSION = '1.23.0';
const ONECLI_CLI_FALLBACK_VERSION = '1.3.0';
// The SANCTIONED gateway version: fresh installs pin to it. Upgrading an
// existing gateway is NOT done here — the gateway is a separate out-of-band
// component, and the migrator is the user's coding agent following
// docs/onecli-upgrades.md during /update-nanoclaw. The pin lives in
// versions.json ("onecli-gateway") so that flow can diff it across updates and
// route the agent to the doc; bump it there deliberately on a new release.
const ONECLI_GATEWAY_VERSION = readVersionPin('onecli-gateway');
// The CLI binary follows the same convention: installed at its pin
// ("onecli-cli" in versions.json), never at whatever "latest" means today.
const ONECLI_CLI_VERSION = readVersionPin('onecli-cli');
const ONECLI_CLI_REPO = 'onecli/onecli-cli';
function installOnecliCliOnly(): { stdout: string; ok: boolean } {
const upstream = runInstall('curl -fsSL onecli.sh/cli/install | sh');
if (upstream.ok) return { stdout: upstream.stdout, ok: true };
const fallback = installOnecliCliDirect();
return { stdout: upstream.stdout + (upstream.stderr ?? '') + '\n' + fallback.stdout, ok: fallback.ok };
}
// Remove containers in the "onecli" compose project whose service name isn't
// in the v2 set. Pre-v2 OneCLI used service "app" (container onecli-app-1);
// v2 uses "onecli". Compose flags the old container as an orphan but won't
@@ -161,24 +160,10 @@ function installOnecli(): { stdout: string; ok: boolean } {
return { stdout: stdout + (gw.stderr ?? ''), ok: false };
}
// CLI install. The upstream script calls the GitHub releases API
// (api.github.com) to resolve the latest tag — which 403s anonymous
// callers after 60 requests/hour per IP. Try upstream first; on failure
// resolve the version ourselves (via HTTP redirect, which isn't
// API-throttled) and download the release archive directly.
const upstream = runInstall('curl -fsSL onecli.sh/cli/install | sh');
stdout += upstream.stdout;
if (upstream.ok) return { stdout, ok: true };
log.warn('Upstream CLI installer failed — falling back to direct download', {
stderr: upstream.stderr,
});
stdout += (upstream.stderr ?? '') + '\n';
const fallback = installOnecliCliDirect();
stdout += fallback.stdout;
if (!fallback.ok) {
log.error('OneCLI CLI install failed (both upstream and direct fallback)');
const cli = installOnecliCliDirect();
stdout += cli.stdout;
if (!cli.ok) {
log.error('OneCLI CLI install failed');
return { stdout, ok: false };
}
return { stdout, ok: true };
@@ -198,11 +183,11 @@ function runInstall(cmd: string): { stdout: string; stderr?: string; ok: boolean
}
/**
* Reinstate the OneCLI CLI install without hitting GitHub's rate-limited
* releases API. Resolves the version via the HTTP redirect from
* /releases/latest to /releases/tag/vX.Y.Z, then downloads the archive
* directly. Falls back to ONECLI_CLI_FALLBACK_VERSION if the redirect
* probe also fails.
* Install the OneCLI CLI at the sanctioned pin by downloading the release
* archive straight from GitHub. Deliberately no "latest" resolution: the
* upstream installer script always chases the newest release, which would
* drift from the pin. PATH setup is not lost by skipping it:
* ensureShellProfilePath() in run() covers it.
*/
function installOnecliCliDirect(): { stdout: string; ok: boolean } {
const lines: string[] = [];
@@ -221,24 +206,7 @@ function installOnecliCliDirect(): { stdout: string; ok: boolean } {
return { stdout: lines.join('\n'), ok: false };
}
let version: string | null = null;
try {
const redirect = execSync(
`curl -fsSL -o /dev/null -w '%{url_effective}' https://github.com/${ONECLI_CLI_REPO}/releases/latest`,
{ encoding: 'utf-8', stdio: ['ignore', 'pipe', 'pipe'] },
).trim();
const m = redirect.match(/\/tag\/v?([^/]+)$/);
if (m) version = m[1];
} catch {
// redirect probe failed — we'll pin the fallback
}
if (!version) {
version = ONECLI_CLI_FALLBACK_VERSION;
append(`Version probe failed; installing pinned fallback ${version}.`);
} else {
append(`Resolved onecli CLI ${version} via release redirect.`);
}
const version = ONECLI_CLI_VERSION;
const archive = `onecli_${version}_${osName}_${arch}.tar.gz`;
const url = `https://github.com/${ONECLI_CLI_REPO}/releases/download/v${version}/${archive}`;
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'onecli-'));
@@ -275,6 +243,39 @@ function installOnecliCliDirect(): { stdout: string; ok: boolean } {
}
}
/**
* /v1 API compatibility check. @onecli-sh/sdk 2.x requires the server's /v1
* API; servers older than the cutover answer 404 on every SDK call (permanent,
* but presents as transient per-spawn failures). This is detect-only: setup
* does not migrate the gateway. The upgrade is an out-of-band action on a
* separate component that the agent runs via docs/onecli-upgrades.md during
* /update-nanoclaw, so this step only surfaces the condition and points there.
*/
export async function verifyGatewayV1(
url: string,
fetchImpl: typeof fetch = fetch,
): Promise<'ok' | 'incompatible' | 'unreachable'> {
try {
const res = await fetchImpl(`${url}/v1/health`, { signal: AbortSignal.timeout(5000) });
return res.ok ? 'ok' : 'incompatible';
} catch {
return 'unreachable';
}
}
/**
* Detect-and-warn helper: returns a status HINT (and logs) when the gateway is
* pre-/v1, else null. Never fails the step or auto-upgrades; the agent owns
* the upgrade via docs/onecli-upgrades.md.
*/
function gatewayV1Hint(result: 'ok' | 'incompatible' | 'unreachable'): string | null {
if (result !== 'incompatible') return null;
log.warn('OneCLI gateway lacks the /v1 API @onecli-sh/sdk 2.x requires', {
pin: ONECLI_GATEWAY_VERSION,
});
return 'OneCLI gateway lacks the /v1 API @onecli-sh/sdk 2.x requires — upgrade it: docs/onecli-upgrades.md';
}
export async function pollHealth(url: string, timeoutMs: number): Promise<boolean> {
// `/api/health` matches the path probe.sh uses — keep them aligned.
const deadline = Date.now() + timeoutMs;
@@ -300,7 +301,7 @@ export async function run(args: string[]): Promise<void> {
// Remote-mode: install only the CLI, point it at the remote gateway, and
// record the URL in .env. No local gateway is started.
log.info('Installing OneCLI CLI for remote gateway', { remoteUrl });
const res = installOnecliCliOnly();
const res = installOnecliCliDirect();
if (!res.ok || !onecliVersion()) {
emitStatus('ONECLI', {
INSTALLED: false,
@@ -339,12 +340,14 @@ export async function run(args: string[]): Promise<void> {
log.info('Wrote ONECLI_API_KEY to .env');
}
const healthy = await pollHealth(remoteUrl, 5000);
const v1Hint = healthy ? gatewayV1Hint(await verifyGatewayV1(remoteUrl)) : null;
emitStatus('ONECLI', {
INSTALLED: true,
REMOTE: true,
ONECLI_URL: remoteUrl,
HEALTHY: healthy,
STATUS: 'success',
...(v1Hint ? { GATEWAY_HINT: v1Hint } : {}),
LOG: 'logs/setup.log',
});
return;
@@ -378,12 +381,14 @@ export async function run(args: string[]): Promise<void> {
writeEnvOnecliUrl(url);
log.info('Reusing existing OneCLI', { url });
const healthy = await pollHealth(url, 5000);
const v1Hint = healthy ? gatewayV1Hint(await verifyGatewayV1(url)) : null;
emitStatus('ONECLI', {
INSTALLED: true,
REUSED: true,
ONECLI_URL: url,
HEALTHY: healthy,
STATUS: 'success',
...(v1Hint ? { GATEWAY_HINT: v1Hint } : {}),
LOG: 'logs/setup.log',
});
return;
@@ -436,6 +441,7 @@ export async function run(args: string[]): Promise<void> {
log.info('Wrote ONECLI_URL to .env', { url });
const healthy = await pollHealth(url, 15000);
const v1Hint = healthy ? gatewayV1Hint(await verifyGatewayV1(url)) : null;
emitStatus('ONECLI', {
INSTALLED: true,
@@ -446,6 +452,7 @@ export async function run(args: string[]): Promise<void> {
// The next step (auth) will surface a genuinely broken gateway via
// `onecli secrets list`, so don't trigger rescue attempts from here.
STATUS: 'success',
...(v1Hint ? { GATEWAY_HINT: v1Hint } : {}),
...(healthy
? {}
: {
+80
@@ -0,0 +1,80 @@
/**
* Standalone provider auth: the late-adopter entry point.
*
* Fresh installs reach a provider's auth walk-through via the setup picker;
* an existing install adding a provider later runs THIS instead:
*
* pnpm exec tsx setup/index.ts --step provider-auth codex
*
* Same walk-through, same vault-only invariant, idempotent (each provider's
* runAuth short-circuits when its secret already exists), and unlike
* re-running full setup, it touches nothing else: no install-wide default
* provider rewrite, no service changes. Provider install skills call this as
* their auth step so there is exactly one auth implementation per provider.
*/
import { execSync } from 'child_process';
import fs from 'fs';
import path from 'path';
import { getSetupProvider, listSetupProviders } from './providers/registry.js';
// Provider payloads self-register on import.
import './providers/index.js';
// Hard-wired install scripts — the audited control surface (no branch
// enumeration). Each setup/add-<name>.sh is idempotent and self-skips when the
// payload is already wired. Codex is the only manifest-style provider today.
const INSTALL_SCRIPTS: Record<string, string> = {
codex: 'setup/add-codex.sh',
};
export async function run(args: string[]): Promise<void> {
const name = args[0]?.trim().toLowerCase();
const withAuth = listSetupProviders().filter((entry) => entry.runAuth);
if (!name) {
console.error(
`Usage: pnpm exec tsx setup/index.ts --step provider-auth <provider>\n` +
`Providers with an auth step: ${withAuth.map((entry) => entry.value).join(', ') || '(none installed)'}`,
);
process.exit(1);
}
let entry = getSetupProvider(name);
const script = INSTALL_SCRIPTS[name];
if (script) {
// Install OR refresh: the script is idempotent and is also the upgrade
// path — payload files resync and a bumped Dockerfile pin replaces the
// local one. Rebuild the image only when the Dockerfile actually changed
// (payload code is mounted, not baked).
const dfPath = path.join(process.cwd(), 'container', 'Dockerfile');
const dfBefore = fs.readFileSync(dfPath, 'utf-8');
console.log(`${entry ? 'Refreshing' : 'Installing'} ${name}`);
execSync(`bash ${script}`, { stdio: 'inherit' });
if (fs.readFileSync(dfPath, 'utf-8') !== dfBefore) {
console.log('Dockerfile pin changed — rebuilding the container image…');
execSync('./container/build.sh', { stdio: 'inherit' });
}
if (!entry) {
await import(`./providers/${name}.js`);
entry = getSetupProvider(name);
}
if (!entry) {
console.error(`Install completed but ${name} did not register — check setup/providers/${name}.ts`);
process.exit(1);
}
} else if (!entry) {
console.error(
`Unknown provider: ${name}. Installed: ${listSetupProviders()
.map((e) => e.value)
.join(', ')}.`,
);
process.exit(1);
}
if (!entry.runAuth) {
console.error(`Provider "${name}" uses the standard auth flow — run the full setup, or /add-${name}'s steps.`);
process.exit(1);
}
await entry.runAuth();
await entry.runInstallCheck?.();
}
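The install-or-refresh branch above rebuilds the image only when the install script actually changed the Dockerfile. That compare-before-and-after pattern can be sketched in isolation; `refreshAndMaybeRebuild` and the in-memory `dockerfile` string are hypothetical stand-ins for the file reads and `execSync` calls in the step.

```typescript
// Run `install`, then `rebuild` only if the watched content changed.
// Returns true when a rebuild was triggered.
function refreshAndMaybeRebuild(
  readWatched: () => string,
  install: () => void,
  rebuild: () => void,
): boolean {
  const before = readWatched();
  install();
  const changed = readWatched() !== before;
  if (changed) rebuild();
  return changed;
}

// In-memory stand-in for container/Dockerfile.
let dockerfile = "FROM base\n";
const calls: string[] = [];

// First run: the install is a no-op, so no rebuild happens.
const first = refreshAndMaybeRebuild(
  () => dockerfile,
  () => { calls.push("install"); },
  () => { calls.push("rebuild"); },
);

// Second run: the install bumps a pin, so the rebuild fires.
const second = refreshAndMaybeRebuild(
  () => dockerfile,
  () => { calls.push("install"); dockerfile += "RUN pinned-tool\n"; },
  () => { calls.push("rebuild"); },
);

console.log(first, second, calls.join(","));
```

Comparing full content rather than modification time means an idempotent install that rewrites the file byte-for-byte still skips the rebuild.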
+83
@@ -0,0 +1,83 @@
import { describe, it, expect } from 'vitest';
import fs from 'fs';
import path from 'path';
import { fileURLToPath } from 'url';
/**
* Provider is a DB property of a group, set only via
* `ncl groups config update --provider`. The group-creation contract that a
* fork's coding agent and its skills depend on must carry zero provider
* vocabulary: no `--provider` flag passed to, parsed by, or threaded through
* any creation path. These guards go red if that flag creeps back in.
*
* (Prose references to the ncl surface in comments are fine; we assert the
* absence of the `'--provider'` arg *literal*, not the substring.)
*/
const repoRoot = path.resolve(path.dirname(fileURLToPath(import.meta.url)), '..');
function read(rel: string): string {
return fs.readFileSync(path.join(repoRoot, rel), 'utf-8');
}
const CREATION_FILES = [
'scripts/init-first-agent.ts',
'scripts/init-cli-agent.ts',
'setup/register.ts',
'setup/cli-agent.ts',
'setup/channels/telegram.ts',
'setup/channels/discord.ts',
'setup/channels/slack.ts',
'setup/channels/whatsapp.ts',
'setup/channels/signal.ts',
'setup/channels/imessage.ts',
'setup/channels/teams.ts',
];
describe('creation is provider-agnostic', () => {
for (const file of CREATION_FILES) {
it(`${file} passes/parses no --provider flag`, () => {
const src = read(file);
expect(src).not.toContain("'--provider'");
expect(src).not.toMatch(/case '--provider'/);
});
}
});
describe('setup carries the picked provider to creation via a setup-run env var', () => {
it('picked-provider stashes/reads the pick in the NANOCLAW_PICKED_PROVIDER env var', () => {
const src = read('setup/lib/picked-provider.ts');
expect(src).toContain('NANOCLAW_PICKED_PROVIDER');
// The pick is set into process.env so child creation scripts inherit it —
// an in-process module global can't cross the process boundary.
expect(src).toMatch(/process\.env\[/);
});
// The creation scripts run as child processes, inherit the env var, and apply
// it to the group's runtime config — container_configs.provider, the source of
// truth materialized into container.json (agent_provider is deprecated) — before
// the welcome wakes the container. No `--provider` flag in the contract (above).
for (const file of ['scripts/init-first-agent.ts', 'scripts/init-cli-agent.ts']) {
it(`${file} applies the env-carried provider to container_configs.provider`, () => {
const src = read(file);
expect(src).toContain('NANOCLAW_PICKED_PROVIDER');
expect(src).toMatch(/updateContainerConfigScalars\([^)]*provider:\s*pickedProvider/);
});
}
});
describe('codex installs from a hard-wired self-contained script', () => {
// The provider picker no longer enumerates a remote manifest branch (an
// unaudited control surface). Codex is offered in trunk and installed by its
// own setup/add-<name>.sh, exactly like a channel adapter.
it('setup/add-codex.sh exists', () => {
expect(fs.existsSync(path.join(repoRoot, 'setup/add-codex.sh'))).toBe(true);
});
it('setup/auto.ts installs the picked provider by running setup/add-<name>.sh', () => {
const src = read('setup/auto.ts');
expect(src).toContain('setup/add-${agentProvider}.sh');
// The removed branch-enumeration machinery must not creep back in.
expect(src).not.toContain('listBranchProviderManifests');
expect(src).not.toContain('installProviderFromBranch');
});
});
+3
@@ -0,0 +1,3 @@
// Setup-side provider barrel. Provider payloads with their own setup surface
// (picker entry, auth walk-through, install check) self-register on import.
// Skills add a provider by appending one import line below.
+43
@@ -0,0 +1,43 @@
/**
* Setup-side provider registration guards.
*
* Behavior (barrel-driven): imports the real setup/providers barrel and
* asserts the built-in default; red if the barrel fails to evaluate.
* Per-provider registration guards ship WITH each provider payload (the
* skill copies them in), same archetype as the host/container registration
* tests.
*
* Structural: the picker and the standalone provider-auth step are wiring
* inside non-invocable entry flows (setup main, STEPS map); assert their
* consumption of the registry in source, so deleting either reach-in goes red.
*/
import fs from 'fs';
import path from 'path';
import { describe, expect, it } from 'vitest';
import { getSetupProvider, listSetupProviders } from './registry.js';
import './index.js'; // the real setup provider barrel — triggers self-registration
describe('setup provider registry', () => {
it('always carries claude as the built-in default with the standard auth flow', () => {
const claude = getSetupProvider('claude');
expect(claude).toBeDefined();
expect(claude!.runAuth).toBeUndefined();
expect(listSetupProviders()[0]!.value).toBe('claude');
});
});
describe('setup flow consumes the registry (structural)', () => {
it('the picker renders options from listSetupProviders', () => {
const src = fs.readFileSync(path.join(process.cwd(), 'setup', 'auto.ts'), 'utf-8');
expect(src).toContain('listSetupProviders()');
expect(src).toContain("import './providers/index.js'");
// The capability-keyed branch — a provider's own auth runs iff it declares one.
expect(src).toMatch(/providerEntry\?\.runAuth/);
});
it('the standalone provider-auth step is reachable from the STEPS map', () => {
const src = fs.readFileSync(path.join(process.cwd(), 'setup', 'index.ts'), 'utf-8');
expect(src).toContain("'provider-auth'");
});
});
+59
@@ -0,0 +1,59 @@
/**
* Setup-side provider registry: the picker and the standalone `provider-auth`
* step render from this map instead of hardcoding provider names in the setup
* flow (same capability-not-name rule as the host provider-container registry).
*
* `claude` is the built-in default: it has no `runAuth` of its own, which the
* setup flow reads as "run the standard auth step". A provider payload adds
* itself by shipping a `setup/providers/<name>.ts` with a top-level
* `registerSetupProvider(...)` call and appending one import line to the
* `setup/providers/index.ts` barrel, the same shape as the host and container
* provider registries, guarded the same way (a barrel-driven registration test).
*/
import type { AssistContext } from '../lib/claude-assist.js'; // type-only — registry stays runtime-dependency-free
/**
* Outcome of a provider-owned failure-assist hook:
* - 'launched': the provider's debugger ran (user may have fixed things).
* - 'declined': the user said no; do NOT offer another debugger.
* - 'unavailable': the provider's CLI can't be used here; the dispatcher
* falls back to the guarded Claude offer (never install/sign-in).
*/
export type FailureAssistResult = 'launched' | 'declined' | 'unavailable';
export interface SetupProviderEntry {
value: string;
label: string;
hint: string;
/** Provider-owned auth walk-through (vault-only). Absent → standard auth step. */
runAuth?: () => Promise<void>;
/** Verifies the provider's payload is wired (files, barrels, Dockerfile pin). */
runInstallCheck?: () => Promise<void>;
/** Provider-owned interactive failure debugger. On 'unavailable' the dispatcher
* falls back to the guarded Claude offer (never install/sign-in). */
offerFailureAssist?: (ctx: AssistContext, projectRoot: string) => Promise<FailureAssistResult>;
}
const registry = new Map<string, SetupProviderEntry>();
registry.set('claude', {
value: 'claude',
label: 'Claude',
hint: 'default — Anthropic subscription or API key',
});
export function registerSetupProvider(entry: SetupProviderEntry): void {
if (registry.has(entry.value)) {
throw new Error(`Setup provider already registered: ${entry.value}`);
}
registry.set(entry.value, entry);
}
export function getSetupProvider(name: string): SetupProviderEntry | undefined {
return registry.get(name.toLowerCase());
}
/** Claude (the default) first, then the rest in registration order. */
export function listSetupProviders(): SetupProviderEntry[] {
return [...registry.values()];
}
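The registration shape described in the registry header can be sketched as follows. This is a minimal stand-in, not the project's code: `SketchProviderEntry`, `sketchRegister`, `authStepFor`, and the `example` provider are hypothetical names introduced only for illustration, and the real payload file would call `registerSetupProvider(...)` instead.

```typescript
// Minimal stand-in for the registry above so the sketch runs on its own.
interface SketchProviderEntry {
  value: string;
  label: string;
  hint: string;
  runAuth?: () => Promise<void>;
}

const sketchRegistry = new Map<string, SketchProviderEntry>();
// The built-in default is seeded first, so it is first in iteration order.
sketchRegistry.set('claude', { value: 'claude', label: 'Claude', hint: 'default' });

function sketchRegister(entry: SketchProviderEntry): void {
  if (sketchRegistry.has(entry.value)) {
    throw new Error(`Setup provider already registered: ${entry.value}`);
  }
  sketchRegistry.set(entry.value, entry);
}

// Hypothetical payload: roughly what a setup/providers/<name>.ts might do at
// top level, so that importing the file from the barrel registers it.
sketchRegister({
  value: 'example',
  label: 'Example',
  hint: 'hypothetical provider for illustration',
  runAuth: async () => {
    // A provider-owned auth walk-through would go here.
  },
});

// Capability-keyed branch: the flow asks whether the entry declares runAuth,
// never what the provider is called.
function authStepFor(name: string): 'provider-owned' | 'standard' {
  const entry = sketchRegistry.get(name.toLowerCase());
  return entry?.runAuth ? 'provider-owned' : 'standard';
}
```

Because a `Map` iterates in insertion order, seeding `claude` before any payload registers is what keeps it first in the picker.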
+7 -7
@@ -9,7 +9,8 @@ set -euo pipefail
# Flow:
# 1. Run `claude setup-token` under a PTY (via script(1)) so the browser
# OAuth dance works and its token is captured into a tempfile.
# 2. Regex the sk-ant-oat…AA token out of the ANSI-stripped capture.
# 2. Parse the sk-ant-oat…AA token out of the capture via the shared
# PTY-capture parser (setup/lib/captured-token.ts).
# 3. Register it with OneCLI.
#
# Env overrides:
@@ -99,12 +100,11 @@ else
script -q "$tmpfile" $cmd
fi
# Strip ANSI codes + newlines (TTY wraps the token mid-string), then match
# the sk-ant-oat…AA token. perl because BSD grep caps {n,m} at 255.
token=$(sed $'s/\x1b\\[[0-9;]*[a-zA-Z]//g' "$tmpfile" \
| tr -d '\n\r' \
| perl -ne 'print "$1\n" while /(sk-ant-oat[A-Za-z0-9_-]{80,500}AA)/g' \
| tail -1 || true)
# Extract the token via the shared PTY-capture parser (setup/lib/captured-token.ts),
# so this script and setup/lib/claude-assist.ts stay in lockstep on the
# normalization rules (ANSI/control stripping, un-wrapping the token).
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
token=$(pnpm exec tsx "$SCRIPT_DIR/lib/captured-token.ts" claude "$tmpfile" || true)
if [ -z "$token" ]; then
keep=$(mktemp -t claude-setup-token-log.XXXXXX)
+8 -4
@@ -11,6 +11,7 @@ import { DATA_DIR } from '../src/config.js';
import { initDb } from '../src/db/connection.js';
import { runMigrations } from '../src/db/migrations/index.js';
import { createAgentGroup, getAgentGroupByFolder } from '../src/db/agent-groups.js';
import { ensureContainerConfig } from '../src/db/container-configs.js';
import {
createMessagingGroup,
createMessagingGroupAgent,
@@ -18,7 +19,6 @@ import {
getMessagingGroupAgentByPair,
} from '../src/db/messaging-groups.js';
import { isValidGroupFolder } from '../src/group-folder.js';
import { initGroupFilesystem } from '../src/group-init.js';
import { log } from '../src/log.js';
import { namespacedPlatformId } from '../src/platform-id.js';
import { resolveSession, writeSessionMessage } from '../src/session-manager.js';
@@ -118,7 +118,7 @@ export async function run(args: string[]): Promise<void> {
// Chat SDK adapters prefix, native adapters (WhatsApp/iMessage/Signal) don't.
parsed.platformId = namespacedPlatformId(parsed.channel, parsed.platformId);
log.info('Registering channel', parsed);
log.info('Registering channel', { ...parsed });
// Init v2 central DB
fs.mkdirSync(path.join(projectRoot, 'data'), { recursive: true });
@@ -126,7 +126,11 @@ export async function run(args: string[]): Promise<void> {
const db = initDb(dbPath);
runMigrations(db);
// 1. Create or find agent group
// 1. Create or find agent group. Provider-agnostic: provider is a DB
// property set via `ncl groups config update --provider`, not a creation
// flag. The workspace is scaffolded at the first spawn (group-init), where
// the DB-resolved provider is known; here we only ensure the config row
// exists so that update has a row to write.
let agentGroup = getAgentGroupByFolder(parsed.folder);
if (!agentGroup) {
const agId = generateId('ag');
@@ -140,7 +144,7 @@ export async function run(args: string[]): Promise<void> {
agentGroup = getAgentGroupByFolder(parsed.folder)!;
log.info('Created agent group', { id: agId, folder: parsed.folder });
}
initGroupFilesystem(agentGroup);
ensureContainerConfig(agentGroup.id);
// 2. Create or find messaging group
let messagingGroup = getMessagingGroupByPlatform(parsed.channel, parsed.platformId);
+23
@@ -26,6 +26,12 @@ vi.mock('./db/sessions.js', () => ({
const mockWriteSessionMessage = vi.fn();
vi.mock('./session-manager.js', () => ({
writeSessionMessage: (...args: unknown[]) => mockWriteSessionMessage(...args),
openInboundDb: () => ({}),
}));
const mockCountDueMessages = vi.fn((..._args: unknown[]) => 0);
vi.mock('./db/session-db.js', () => ({
countDueMessages: (...args: unknown[]) => mockCountDueMessages(...args),
}));
import { restartAgentGroupContainers } from './container-restart.js';
@@ -148,4 +154,21 @@ describe('restartAgentGroupContainers', () => {
expect(mockWriteSessionMessage.mock.calls[0][1]).toBe('s1');
expect(mockWriteSessionMessage.mock.calls[1][1]).toBe('s2');
});
it('wakes even without a wake message when in-flight messages are pending', () => {
// A provider switch mid-conversation kills a container holding claimed
// messages — without an immediate respawn those messages stay dark until
// the next inbound or a slow sweep backoff.
mockGetSessionsByAgentGroup.mockReturnValue([makeSession('s1', 'ag1')]);
mockIsContainerRunning.mockReturnValue(true);
mockCountDueMessages.mockReturnValue(2);
restartAgentGroupContainers('ag1', 'provider switch');
const onExit = mockKillContainer.mock.calls[0][2] as () => void;
expect(typeof onExit).toBe('function');
mockGetSession.mockReturnValue(makeSession('s1', 'ag1'));
onExit();
expect(mockWakeContainer).toHaveBeenCalled();
});
});
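The respawn rule exercised by the last test can be reduced to a small decision function. This is a sketch under stated assumptions: `chooseOnExit` is a hypothetical helper, and passing `undefined` when there is nothing to process is assumed rather than confirmed by the visible hunk.

```typescript
type OnExit = (() => void) | undefined;

// Hypothetical helper mirroring the `wakeMessage || hasPending` condition:
// respawn after the kill only when there is something to process.
function chooseOnExit(
  wakeMessage: string | undefined,
  dueCount: number,
  wake: () => void,
): OnExit {
  const hasPending = dueCount > 0;
  return wakeMessage || hasPending ? wake : undefined;
}
```

With no wake message and two due messages, the callback is still returned, which is the case the new test pins down.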
+8 -2
@@ -5,9 +5,10 @@
* wakes a fresh container via the onExit callback, race-free.
*/
import { isContainerRunning, killContainer, wakeContainer } from './container-runner.js';
import { countDueMessages } from './db/session-db.js';
import { getSession, getSessionsByAgentGroup } from './db/sessions.js';
import { log } from './log.js';
import { writeSessionMessage } from './session-manager.js';
import { openInboundDb, writeSessionMessage } from './session-manager.js';
/**
* Kill all running containers for an agent group and respawn them.
@@ -40,10 +41,15 @@ export function restartAgentGroupContainers(agentGroupId: string, reason: string
onWake: 1,
});
}
// Always respawn after the kill when there is anything to process: an
// explicit wake message, or in-flight messages the dying container had
// claimed. Without this, a provider switch mid-conversation leaves the
// claimed messages dark until the next inbound or a slow sweep backoff.
const hasPending = countDueMessages(openInboundDb(session.agent_group_id, session.id)) > 0;
killContainer(
session.id,
reason,
wakeMessage
wakeMessage || hasPending
? () => {
const s = getSession(session.id);
if (s) wakeContainer(s);
+35
@@ -1,3 +1,5 @@
import fs from 'fs';
import path from 'path';
import { describe, expect, it } from 'vitest';
import { resolveProviderName } from './container-runner.js';
@@ -25,3 +27,36 @@ describe('resolveProviderName', () => {
expect(resolveProviderName(null, '')).toBe('claude');
});
});
describe('buildContainerArgs ordering invariant (structural)', () => {
// The OneCLI gateway apply (SDK applyContainerConfig) appends credential-stub
// mounts — e.g. the codex auth.json sentinel nested INSIDE our RW
// /home/node/.codex mount. Docker applies binds in argument order, so the
// stub must land AFTER its parent mount or the parent shadows it and the
// agent silently degrades to loginless auth. Driving the real
// buildContainerArgs needs a live gateway + container runtime, so this
// guards the invariant structurally: the gateway apply must appear after
// the volume-mounts loop in the source.
it('applies the OneCLI gateway after the volume mounts', () => {
const src = fs.readFileSync(path.join(process.cwd(), 'src', 'container-runner.ts'), 'utf-8');
const mountsLoop = src.indexOf('for (const mount of mounts)');
const gatewayApply = src.indexOf('onecli.applyContainerConfig');
expect(mountsLoop).toBeGreaterThan(-1);
expect(gatewayApply).toBeGreaterThan(-1);
expect(gatewayApply).toBeGreaterThan(mountsLoop);
});
});
describe('container boot-failure tripwire (structural)', () => {
// A container that dies at boot (unknown provider, missing CLI binary, bad
// config) explains itself only on stderr — which logs at debug, below the
// default level. The spawn handler must keep a stderr tail and surface it
// at warn on a non-zero exit, or the operator sees only "exited code 1" on
// repeat. Driving a real failing spawn needs a container runtime, so this
// guards the wiring structurally, matching the invariant test above.
it('surfaces the stderr tail when the container exits non-zero', () => {
const src = fs.readFileSync(path.join(process.cwd(), 'src', 'container-runner.ts'), 'utf-8');
expect(src).toContain('stderrTail.push(line)');
expect(src).toMatch(/Container exited non-zero.*stderrTail/s);
});
});
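The boot-failure tripwire combines a bounded stderr tail with an exit-level decision. The sketch below isolates both pieces so they can be checked without a container runtime; `pushTail`, `exitLogLevel`, and `TAIL_LIMIT` are hypothetical names, and the cap of 10 is taken from the spawn handler in this diff.

```typescript
const TAIL_LIMIT = 10; // same cap the spawn handler uses

// Append the non-empty lines of a stderr chunk, keeping only the newest
// TAIL_LIMIT lines.
function pushTail(tail: string[], chunk: string): void {
  for (const line of chunk.trim().split('\n')) {
    if (!line) continue;
    tail.push(line);
    if (tail.length > TAIL_LIMIT) tail.shift();
  }
}

// A null code means the process was killed by a signal (normal shutdown),
// so only a non-zero numeric exit with captured stderr is raised to warn.
function exitLogLevel(code: number | null, tail: string[]): 'warn' | 'info' {
  return code !== 0 && code !== null && tail.length > 0 ? 'warn' : 'info';
}
```

Keeping the level decision separate from the logging call is a choice made for this sketch; it lets the signal-kill case be asserted directly.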
+34 -19
@@ -21,7 +21,7 @@ import {
} from './config.js';
import { materializeContainerJson } from './container-config.js';
import { getContainerConfig } from './db/container-configs.js';
import { updateContainerConfigScalars, updateContainerConfigJson } from './db/container-configs.js';
import { updateContainerConfigScalars } from './db/container-configs.js';
import { CONTAINER_RUNTIME_BIN, hostGatewayArgs, readonlyMountArgs, stopContainer } from './container-runtime.js';
import { EGRESS_NETWORK, egressNetworkArgs, ensureEgressNetwork } from './egress-lockdown.js';
import { composeGroupClaudeMd } from './claude-md-compose.js';
@@ -168,10 +168,16 @@ async function spawnContainer(session: Session): Promise<void> {
activeContainers.set(session.id, { process: container, containerName });
markContainerRunning(session.id);
// Log stderr
// Log stderr. A container that dies at boot (unknown provider, missing
// binary, bad config) explains itself only here — and debug is below the
// default log level — so keep a tail to surface on a non-zero exit.
const stderrTail: string[] = [];
container.stderr?.on('data', (data) => {
for (const line of data.toString().trim().split('\n')) {
if (line) log.debug(line, { container: agentGroup.folder });
if (!line) continue;
log.debug(line, { container: agentGroup.folder });
stderrTail.push(line);
if (stderrTail.length > 10) stderrTail.shift();
}
});
@@ -187,7 +193,12 @@ async function spawnContainer(session: Session): Promise<void> {
activeContainers.delete(session.id);
markContainerStopped(session.id);
stopTypingRefresh(session.id);
log.info('Container exited', { sessionId: session.id, code, containerName });
// code null = killed by signal (normal shutdown path), not a boot failure.
if (code !== 0 && code !== null && stderrTail.length > 0) {
log.warn('Container exited non-zero', { sessionId: session.id, code, containerName, stderrTail });
} else {
log.info('Container exited', { sessionId: session.id, code, containerName });
}
});
container.on('error', (err) => {
@@ -417,7 +428,7 @@ async function buildContainerArgs(
containerName: string,
agentGroup: AgentGroup,
containerConfig: import('./container-config.js').ContainerConfig,
provider: string,
_provider: string,
providerContribution: ProviderContainerContribution,
agentIdentifier?: string,
): Promise<string[]> {
@@ -434,20 +445,6 @@ async function buildContainerArgs(
}
}
// OneCLI gateway — injects HTTPS_PROXY + certs so container API calls
// are routed through the agent vault for credential injection. Treated as
// a transient hard failure: if we can't wire the gateway, we don't spawn.
// The caller (router or host-sweep) catches the throw, leaves the inbound
// message pending, and the next sweep tick retries.
if (agentIdentifier) {
await onecli.ensureAgent({ name: agentGroup.name, identifier: agentIdentifier });
}
const onecliApplied = await onecli.applyContainerConfig(args, { addHostMapping: false, agent: agentIdentifier });
if (!onecliApplied) {
throw new Error('OneCLI gateway not applied — refusing to spawn container without credentials');
}
log.info('OneCLI gateway applied', { containerName });
// Egress lockdown when enabled — throws if it can't be established, aborting
// the spawn rather than running with open egress. Otherwise the host gateway.
if (ensureEgressNetwork()) {
@@ -474,6 +471,24 @@ async function buildContainerArgs(
}
}
// OneCLI gateway — injects HTTPS_PROXY + certs so container API calls
// are routed through the agent vault for credential injection, and mounts
// any credential stubs the gateway serves (e.g. a sentinel auth file).
// Runs AFTER the volume mounts so a stub nested inside one of our mounts
// (a parent dir mounted RW above it) lands later in the args and isn't
// shadowed by it. Treated as a transient hard failure: if we can't wire
// the gateway, we don't spawn. The caller (router or host-sweep) catches
// the throw, leaves the inbound message pending, and the next sweep tick
// retries.
if (agentIdentifier) {
await onecli.ensureAgent({ name: agentGroup.name, identifier: agentIdentifier });
}
const onecliApplied = await onecli.applyContainerConfig(args, { addHostMapping: false, agent: agentIdentifier });
if (!onecliApplied) {
throw new Error('OneCLI gateway not applied — refusing to spawn container without credentials');
}
log.info('OneCLI gateway applied', { containerName });
// Override entrypoint: run v2 entry point directly via Bun (no tsc, no stdin).
args.push('--entrypoint', 'bash');
+22
@@ -184,6 +184,28 @@ export function updatePendingApprovalStatus(approvalId: string, status: PendingA
getDb().prepare('UPDATE pending_approvals SET status = ? WHERE approval_id = ?').run(status, approvalId);
}
/**
* Park an approval in the "rejected, awaiting reason" hold: the admin clicked
* "Reject with reason…" and we're waiting for their one-line reply. `expiresAt`
* is the deadline after which the host sweep finalizes a plain reject (so a
* ghosted hold never strands the requesting agent). Reuses the otherwise-unused
* `expires_at` column on module-initiated rows.
*/
export function markApprovalAwaitingReason(approvalId: string, expiresAt: string): void {
getDb()
.prepare("UPDATE pending_approvals SET status = 'awaiting_reason', expires_at = ? WHERE approval_id = ?")
.run(expiresAt, approvalId);
}
/** Awaiting-reason approvals whose reply window has elapsed — the sweep's ghost set. */
export function getExpiredAwaitingReasonApprovals(nowIso: string): PendingApproval[] {
return getDb()
.prepare(
"SELECT * FROM pending_approvals WHERE status = 'awaiting_reason' AND expires_at IS NOT NULL AND expires_at <= ?",
)
.all(nowIso) as PendingApproval[];
}
export function deletePendingApproval(approvalId: string): void {
getDb().prepare('DELETE FROM pending_approvals WHERE approval_id = ?').run(approvalId);
}
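The hold-and-expire logic behind the two new queries can be sketched in memory. This is an illustration under assumptions, not the module's code: `HeldApproval`, `deadlineFrom`, `expiredHolds`, and the exact 5-minute window are hypothetical (the commit message says only "~5 min"). It relies on the fact that `Date.prototype.toISOString()` yields fixed-width UTC strings, so string comparison matches chronological order, which is what the SQL `expires_at <= ?` depends on.

```typescript
interface HeldApproval {
  approval_id: string;
  status: string;
  expires_at: string | null;
}

const REASON_WINDOW_MS = 5 * 60 * 1000; // assumption: the "~5 min" reply window

// Deadline stored in expires_at when the hold is armed.
function deadlineFrom(now: Date): string {
  return new Date(now.getTime() + REASON_WINDOW_MS).toISOString();
}

// In-memory equivalent of the sweep's ghost-set query.
function expiredHolds(rows: HeldApproval[], nowIso: string): HeldApproval[] {
  return rows.filter(
    (r) => r.status === 'awaiting_reason' && r.expires_at !== null && r.expires_at <= nowIso,
  );
}
```

Because the deadline lives in the row rather than in a timer, a host restart mid-capture loses nothing: the next sweep tick picks the row up by comparison alone.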
+1 -7
@@ -2,7 +2,7 @@ import path from 'path';
import { describe, expect, it } from 'vitest';
import { isValidGroupFolder, resolveGroupFolderPath, resolveGroupIpcPath } from './group-folder.js';
import { isValidGroupFolder, resolveGroupFolderPath } from './group-folder.js';
describe('group folder validation', () => {
it('accepts normal group folder names', () => {
@@ -23,13 +23,7 @@ describe('group folder validation', () => {
expect(resolved.endsWith(`${path.sep}groups${path.sep}family-chat`)).toBe(true);
});
it('resolves safe paths under data ipc directory', () => {
const resolved = resolveGroupIpcPath('family-chat');
expect(resolved.endsWith(`${path.sep}data${path.sep}ipc${path.sep}family-chat`)).toBe(true);
});
it('throws for unsafe folder names', () => {
expect(() => resolveGroupFolderPath('../../etc')).toThrow();
expect(() => resolveGroupIpcPath('/tmp')).toThrow();
});
});
+1 -9
@@ -1,6 +1,6 @@
import path from 'path';
import { DATA_DIR, GROUPS_DIR } from './config.js';
import { GROUPS_DIR } from './config.js';
const GROUP_FOLDER_PATTERN = /^[A-Za-z0-9][A-Za-z0-9_-]{0,63}$/;
const RESERVED_FOLDERS = new Set(['global']);
@@ -34,11 +34,3 @@ export function resolveGroupFolderPath(folder: string): string {
ensureWithinBase(GROUPS_DIR, groupPath);
return groupPath;
}
export function resolveGroupIpcPath(folder: string): string {
assertValidGroupFolder(folder);
const ipcBaseDir = path.resolve(DATA_DIR, 'ipc');
const ipcPath = path.resolve(ipcBaseDir, folder);
ensureWithinBase(ipcBaseDir, ipcPath);
return ipcPath;
}
+37 -7
@@ -66,13 +66,43 @@ export function initGroupFilesystem(
initialized.push('groupDir');
}
// groups/<folder>/CLAUDE.local.md — per-group agent memory, auto-loaded by
// Claude Code. Seeded with caller-provided instructions on first creation.
const claudeLocalFile = path.join(groupDir, 'CLAUDE.local.md');
if (defaultSurfaces && !fs.existsSync(claudeLocalFile)) {
const body = opts?.instructions ? opts.instructions + '\n' : '';
fs.writeFileSync(claudeLocalFile, body);
initialized.push('CLAUDE.local.md');
// Seed instructions land in the provider's OWN memory surface. Default
// (Claude) surfaces auto-load CLAUDE.local.md natively. A surfaces-owning
// provider must never see stale CLAUDE.* files in its workspace — its seed
// goes into the memory scaffold's conventional landing file instead
// (memory/memories/imported-agent-memory.md): the container-side scaffold
// preserves pre-existing files, and the doctrine tells the agent to read
// that file on its first turn.
//
// Creation stays provider-agnostic: a DM-agent creator drops the seed in a
// neutral `.seed.md`, and placement is deferred to here (the first spawn,
// where the DB-resolved provider is known). Once placed it's consumed.
// `opts.instructions` still wins for any caller that passes it inline.
const neutralSeedFile = path.join(groupDir, '.seed.md');
const seed =
opts?.instructions ??
(fs.existsSync(neutralSeedFile) ? fs.readFileSync(neutralSeedFile, 'utf-8').trimEnd() : undefined);
if (defaultSurfaces) {
const claudeLocalFile = path.join(groupDir, 'CLAUDE.local.md');
if (!fs.existsSync(claudeLocalFile)) {
fs.writeFileSync(claudeLocalFile, seed ? seed + '\n' : '');
initialized.push('CLAUDE.local.md');
}
} else if (seed) {
const seedFile = path.join(groupDir, 'memory', 'memories', 'imported-agent-memory.md');
if (!fs.existsSync(seedFile)) {
fs.mkdirSync(path.dirname(seedFile), { recursive: true });
fs.writeFileSync(seedFile, seed + '\n');
initialized.push('memory/memories/imported-agent-memory.md');
}
}
// The neutral seed is single-use — drop it once the surface it belonged in
// has been resolved, so it can't re-seed after the operator edits theirs.
if (fs.existsSync(neutralSeedFile)) {
fs.rmSync(neutralSeedFile);
initialized.push('.seed.md consumed');
}
// Ensure container_configs row exists in the DB. Idempotent — no-op if
+12
@@ -152,6 +152,18 @@ async function sweep(): Promise<void> {
log.error('Host sweep error', { err });
}
// Finalize any "Reject with reason…" holds whose reply window elapsed (admin
// ghosted, or the host restarted mid-capture). Central-DB scan, once per tick
// — not per session.
// MODULE-HOOK:approvals-reason-sweep:start
try {
const { sweepAwaitingReasonRejects } = await import('./modules/approvals/index.js');
await sweepAwaitingReasonRejects();
} catch (err) {
log.error('Reject-with-reason sweep failed', { err });
}
// MODULE-HOOK:approvals-reason-sweep:end
setTimeout(sweep, SWEEP_INTERVAL_MS);
}
@@ -16,6 +16,7 @@ const mockRequestApproval = vi.fn().mockResolvedValue(undefined);
const mockGetContainerConfig = vi.fn();
const mockCreateAgentGroup = vi.fn();
const mockInitGroupFilesystem = vi.fn();
const mockUpdateScalars = vi.fn();
const mockWriteDestinations = vi.fn();
const mockNotifyWrite = vi.fn();
@@ -24,6 +25,8 @@ vi.mock('../approvals/index.js', () => ({
}));
vi.mock('../../db/container-configs.js', () => ({
getContainerConfig: (...a: unknown[]) => mockGetContainerConfig(...a),
ensureContainerConfig: () => {},
updateContainerConfigScalars: (...a: unknown[]) => mockUpdateScalars(...a),
}));
vi.mock('../../db/agent-groups.js', () => ({
getAgentGroup: (id: string) => ({ id, name: id.toUpperCase(), folder: id, agent_provider: null, created_at: '' }),
@@ -75,6 +78,29 @@ describe('handleCreateAgent — scope-based authorization', () => {
expect(mockInitGroupFilesystem).toHaveBeenCalledTimes(1);
});
it('child inherits the creator provider (codex parent → codex child)', async () => {
// A subagent must run on the same authenticated runtime as its creator —
// on a codex-only install a claude default would 401. Red-on-delete:
// dropping the inheritance leaves the child provider-less (→ claude).
mockGetContainerConfig.mockReturnValue({ cli_scope: 'global', provider: 'codex' });
await handleCreateAgent({ name: 'Scout', instructions: 'help' }, SESSION);
expect(mockInitGroupFilesystem).toHaveBeenCalledWith(
expect.anything(),
expect.objectContaining({ provider: 'codex' }),
);
expect(mockUpdateScalars).toHaveBeenCalledWith(expect.any(String), { provider: 'codex' });
});
it('claude creator leaves the child provider unset (built-in default)', async () => {
mockGetContainerConfig.mockReturnValue({ cli_scope: 'global' }); // no provider
await handleCreateAgent({ name: 'Scout', instructions: 'help' }, SESSION);
expect(mockUpdateScalars).not.toHaveBeenCalled();
});
it('group scope (default): requires approval, does NOT create directly', async () => {
mockGetContainerConfig.mockReturnValue({ cli_scope: 'group' });
+12 -2
@@ -16,7 +16,7 @@ import path from 'path';
import { GROUPS_DIR } from '../../config.js';
import { createAgentGroup, getAgentGroup, getAgentGroupByFolder } from '../../db/agent-groups.js';
import { getContainerConfig } from '../../db/container-configs.js';
import { getContainerConfig, updateContainerConfigScalars } from '../../db/container-configs.js';
import { getSession } from '../../db/sessions.js';
import { wakeContainer } from '../../container-runner.js';
import { initGroupFilesystem } from '../../group-init.js';
@@ -163,7 +163,17 @@ async function performCreateAgent(
created_at: now,
};
createAgentGroup(newGroup);
initGroupFilesystem(newGroup, { instructions: instructions ?? undefined });
// A subagent inherits its creator's provider. Provider is a DB property; the
// child is created provider-agnostic, then stamped with the parent's runtime
// so a single-provider install (e.g. codex-only, where claude isn't
// authenticated) doesn't spawn a child on a runtime it can't reach. The
// operator can still flip a child later with `ncl groups config update
// --provider`. claude (the built-in default) leaves the column unset.
const parentProvider = getContainerConfig(sourceGroup.id)?.provider ?? undefined;
initGroupFilesystem(newGroup, { instructions: instructions ?? undefined, provider: parentProvider });
if (parentProvider) {
updateContainerConfigScalars(newGroup.id, { provider: parentProvider });
}
// Insert bidirectional destination rows (= ACL grants).
// Creator refers to child by the name it chose; child refers to creator as "parent".
+59
@@ -0,0 +1,59 @@
/**
* Shared "finalize a rejected approval" path.
*
* Three entry points land here so they relay one message and clean up
* identically:
* 1. The instant Reject button (response-handler.ts)
* 2. A captured Reject-with-reason reply (reason-capture.ts)
* 3. The host-sweep ghost finalizer (reason-capture.ts, via host-sweep)
*
* Kept in its own leaf file so both response-handler.ts and reason-capture.ts
* can import it without an import cycle (finalize primitive only).
*/
import { wakeContainer } from '../../container-runner.js';
import { deletePendingApproval } from '../../db/sessions.js';
import { log } from '../../log.js';
import { writeSessionMessage } from '../../session-manager.js';
import type { PendingApproval, Session } from '../../types.js';
import { notifyApprovalResolved } from './primitive.js';
/**
* Notify the requesting agent that its action was rejected, drop the pending
* row, fire approval-resolved callbacks, and wake the container.
*
* When `reason` is provided it's appended to the agent-facing note with generic
* attribution: the why, not the who (the rejecting admin may belong to a
* different owner than the requesting agent). Callers are responsible for
* clamping the reason length before passing it in.
*/
export async function finalizeReject(
approval: PendingApproval,
session: Session,
userId: string,
reason?: string,
): Promise<void> {
const text = reason
? `Your ${approval.action} request was rejected by admin: "${reason}"`
: `Your ${approval.action} request was rejected by admin.`;
writeSessionMessage(session.agent_group_id, session.id, {
id: `appr-note-${Date.now()}-${Math.random().toString(36).slice(2, 8)}`,
kind: 'chat',
timestamp: new Date().toISOString(),
platformId: session.agent_group_id,
channelType: 'agent',
threadId: null,
content: JSON.stringify({ text, sender: 'system', senderId: 'system' }),
});
log.info('Approval rejected', {
approvalId: approval.approval_id,
action: approval.action,
userId,
withReason: reason !== undefined,
});
deletePendingApproval(approval.approval_id);
await notifyApprovalResolved({ approval, session, outcome: 'reject', userId });
await wakeContainer(session);
}
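Since `finalizeReject` leaves clamping to its callers, a caller-side helper might look like the sketch below. `clampReason`, `rejectionText`, and `MAX_REASON_CHARS` are hypothetical names; the 280-character limit comes from the commit message, while collapsing whitespace into a single line is an assumption made here to match the "one-line reason" wording.

```typescript
const MAX_REASON_CHARS = 280; // limit stated in the commit message

// Collapse the reply to one line (assumption) and truncate it to the limit.
// Returns undefined for an empty reply so the caller can fall back to a
// plain reject.
function clampReason(raw: string): string | undefined {
  const oneLine = raw.replace(/\s+/g, ' ').trim();
  if (!oneLine) return undefined;
  return oneLine.length > MAX_REASON_CHARS ? oneLine.slice(0, MAX_REASON_CHARS) : oneLine;
}

// Same two message shapes finalizeReject builds, extracted for checking.
function rejectionText(action: string, reason?: string): string {
  return reason
    ? `Your ${action} request was rejected by admin: "${reason}"`
    : `Your ${action} request was rejected by admin.`;
}
```

Returning `undefined` for a blank reply keeps the instant, captured, and swept paths on the same two message shapes.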
+9
@@ -8,10 +8,16 @@
* - A response handler that claims pending_approvals rows and dispatches
* to whatever module registered for the row's `action` string. Also
* resolves in-memory OneCLI credential approvals.
* - A message-interceptor (via ./reason-capture.js) that captures an admin's
* one-line reply after they click "Reject with reason…".
* - An adapter-ready callback that starts the OneCLI manual-approval handler
* once the delivery adapter is set.
* - A shutdown callback that stops the OneCLI handler cleanly.
*
* Exposes `sweepAwaitingReasonRejects` for the host sweep to finalize ghosted
* reject-with-reason holds (re-exported here, which also loads reason-capture
* so its interceptor registers).
*
* Self-mod flows (install_packages, add_mcp_server) moved out to
* `src/modules/self-mod/` in PR #7; they now register delivery actions
* + approval handlers via this module's public API.
@@ -24,6 +30,9 @@ import { startOneCLIApprovalHandler, stopOneCLIApprovalHandler } from './onecli-
// Public API re-exports so consumers import from the module root.
export { requestApproval, registerApprovalHandler, notifyAgent } from './primitive.js';
export type { ApprovalHandler, ApprovalHandlerContext, RequestApprovalOptions } from './primitive.js';
// Host-sweep hook for ghosted "Reject with reason…" holds. The re-export also
// loads reason-capture.js, registering its message-interceptor on import.
export { sweepAwaitingReasonRejects } from './reason-capture.js';
registerResponseHandler(handleApprovalsResponse);
+14 -1
@@ -32,10 +32,23 @@ import type { MessagingGroup, PendingApproval, Session } from '../../types.js';
import { getAdminsOfAgentGroup, getGlobalAdmins, getOwners } from '../permissions/db/user-roles.js';
import { ensureUserDm } from '../permissions/user-dm.js';
/** Two-button approval UI — the only options the primitive supports today. */
/**
* Card value for the "Reject with reason…" button. Selecting it doesn't
* finalize the reject; it holds the row and captures the approver's next DM
* as a one-line reason relayed to the requesting agent. See reason-capture.ts.
*/
export const REJECT_WITH_REASON_VALUE = 'reject_with_reason';
/**
* Three-button approval UI. Plain Reject is the instant fast path; "Reject with
* reason" opts into the reason-capture flow. Shared by every module approval
* (create_agent, install_packages, add_mcp_server); OneCLI credential cards
* keep their own two-button set in onecli-approvals.ts.
*/
const APPROVAL_OPTIONS: RawOption[] = [
{ label: 'Approve', selectedLabel: '✅ Approved', value: 'approve' },
{ label: 'Reject', selectedLabel: '❌ Rejected', value: 'reject' },
{ label: 'Reject with reason…', selectedLabel: '📝 Rejected (awaiting reason)', value: REJECT_WITH_REASON_VALUE },
];
// ── Approval handler registry ──
@@ -0,0 +1,279 @@
/**
* "Reject with reason…" capture flow.
*
* Covers the three entry points end to end against the real central DB:
* - arming (handleApprovalsResponse with the third option) holds the row and
* prompts the admin instead of finalizing;
* - the captured reply relays one combined message, clamped to 280 chars;
* - the host sweep finalizes a ghosted hold as a plain reject.
*
* writeSessionMessage is mocked so the relayed agent-facing text can be read
* back directly; the delivery adapter is a fake that records prompt sends.
*/
import * as fs from 'fs';
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
import type { InboundEvent } from '../../channels/adapter.js';
import { initTestDb, closeDb, runMigrations } from '../../db/index.js';
import { createAgentGroup } from '../../db/agent-groups.js';
import { createMessagingGroup } from '../../db/messaging-groups.js';
import {
createSession,
createPendingApproval,
deletePendingApproval,
getPendingApproval,
markApprovalAwaitingReason,
} from '../../db/sessions.js';
import { setDeliveryAdapter, type ChannelDeliveryAdapter } from '../../delivery.js';
import { writeSessionMessage } from '../../session-manager.js';
import { upsertUser } from '../permissions/db/users.js';
import { upsertUserDm } from '../permissions/db/user-dms.js';
import { grantRole } from '../permissions/db/user-roles.js';
import { REJECT_WITH_REASON_VALUE } from './primitive.js';
vi.mock('../../container-runner.js', () => ({
wakeContainer: vi.fn().mockResolvedValue(undefined),
}));
vi.mock('../../config.js', async () => {
const actual = await vi.importActual('../../config.js');
return { ...actual, DATA_DIR: '/tmp/nanoclaw-test-reject-reason' };
});
vi.mock('../../session-manager.js', async () => {
const actual = await vi.importActual<typeof import('../../session-manager.js')>('../../session-manager.js');
return { ...actual, writeSessionMessage: vi.fn() };
});
const TEST_DIR = '/tmp/nanoclaw-test-reject-reason';
const DM_CHANNEL = 'slack';
const DM_PLATFORM = 'D-admin-1';
function now(): string {
return new Date().toISOString();
}
let delivered: Array<{ channelType: string; platformId: string; content: string }>;
const fakeAdapter: ChannelDeliveryAdapter = {
async deliver(channelType, platformId, _threadId, _kind, content) {
delivered.push({ channelType, platformId, content });
return 'pm-1';
},
};
function seedApproval(approvalId: string, action = 'create_agent'): void {
createPendingApproval({
approval_id: approvalId,
session_id: 'sess-1',
request_id: approvalId,
action,
payload: JSON.stringify({ name: 'child' }),
created_at: now(),
title: 'Approval',
options_json: JSON.stringify([]),
});
}
function dmReply(text?: string): InboundEvent {
const content: Record<string, unknown> = { sender: 'admin-1', senderId: 'admin-1' };
if (text !== undefined) content.text = text;
return {
channelType: DM_CHANNEL,
platformId: DM_PLATFORM,
threadId: null,
message: { id: 'm-1', kind: 'chat', content: JSON.stringify(content), timestamp: now() },
};
}
/** Click the "Reject with reason…" button as the seeded admin. */
async function clickRejectWithReason(approvalId: string): Promise<void> {
const { handleApprovalsResponse } = await import('./response-handler.js');
await handleApprovalsResponse({
questionId: approvalId,
value: REJECT_WITH_REASON_VALUE,
userId: 'admin-1',
channelType: DM_CHANNEL,
platformId: '', // not surfaced by the click payload — resolved via ensureUserDm
threadId: null,
});
}
/** The text of the most recent agent-facing note written via writeSessionMessage. */
function lastRelayedText(): string | undefined {
const call = vi.mocked(writeSessionMessage).mock.calls.at(-1);
if (!call) return undefined;
return (JSON.parse(call[2].content) as { text: string }).text;
}
beforeEach(() => {
vi.clearAllMocks();
if (fs.existsSync(TEST_DIR)) fs.rmSync(TEST_DIR, { recursive: true, force: true });
fs.mkdirSync(TEST_DIR, { recursive: true });
const db = initTestDb();
runMigrations(db);
delivered = [];
createAgentGroup({ id: 'ag-1', name: 'Agent', folder: 'agent', agent_provider: null, created_at: now() });
createSession({
id: 'sess-1',
agent_group_id: 'ag-1',
messaging_group_id: null,
thread_id: null,
agent_provider: null,
status: 'active',
container_status: 'stopped',
last_active: now(),
created_at: now(),
});
// Authorized approver + a cached DM so ensureUserDm resolves without a
// platform openDM call.
upsertUser({ id: 'slack:admin-1', kind: 'slack', display_name: 'Admin', created_at: now() });
grantRole({ user_id: 'slack:admin-1', role: 'owner', agent_group_id: null, granted_by: null, granted_at: now() });
createMessagingGroup({
id: 'mg-dm-1',
channel_type: DM_CHANNEL,
platform_id: DM_PLATFORM,
name: 'Admin DM',
is_group: 0,
unknown_sender_policy: 'strict',
created_at: now(),
});
upsertUserDm({
user_id: 'slack:admin-1',
channel_type: DM_CHANNEL,
messaging_group_id: 'mg-dm-1',
resolved_at: now(),
});
setDeliveryAdapter(fakeAdapter);
});
afterEach(() => {
closeDb();
if (fs.existsSync(TEST_DIR)) fs.rmSync(TEST_DIR, { recursive: true, force: true });
});
describe('reject with reason', () => {
it('holds the row and prompts the admin instead of finalizing', async () => {
seedApproval('appr-1');
await clickRejectWithReason('appr-1');
const row = getPendingApproval('appr-1');
expect(row?.status).toBe('awaiting_reason');
expect(row?.expires_at).toBeTruthy();
// Prompt went to the admin's resolved DM, not the (empty) click platformId.
expect(delivered).toHaveLength(1);
expect(delivered[0].channelType).toBe(DM_CHANNEL);
expect(delivered[0].platformId).toBe(DM_PLATFORM);
expect((JSON.parse(delivered[0].content) as { text: string }).text).toMatch(/reason/i);
// Agent is not notified yet — the hold is still open.
expect(vi.mocked(writeSessionMessage)).not.toHaveBeenCalled();
});
it('relays the captured reason as one combined message and clears the row', async () => {
const { captureReasonReply } = await import('./reason-capture.js');
seedApproval('appr-2', 'install_packages');
await clickRejectWithReason('appr-2');
const consumed = await captureReasonReply(dmReply('too risky for prod'));
expect(consumed).toBe(true);
expect(getPendingApproval('appr-2')).toBeUndefined();
expect(lastRelayedText()).toBe('Your install_packages request was rejected by admin: "too risky for prod"');
});
it('truncates an over-long reason to 280 chars with an ellipsis', async () => {
const { captureReasonReply } = await import('./reason-capture.js');
seedApproval('appr-3');
await clickRejectWithReason('appr-3');
await captureReasonReply(dmReply('x'.repeat(400)));
const reason = lastRelayedText()!.match(/: "(.*)"$/)![1];
expect(reason).toHaveLength(280);
expect(reason.endsWith('…')).toBe(true);
});
it('finalizes a plain reject when the captured reply carries no text', async () => {
const { captureReasonReply } = await import('./reason-capture.js');
seedApproval('appr-4');
await clickRejectWithReason('appr-4');
const consumed = await captureReasonReply(dmReply(undefined));
expect(consumed).toBe(true);
expect(getPendingApproval('appr-4')).toBeUndefined();
expect(lastRelayedText()).toBe('Your create_agent request was rejected by admin.');
});
it('does not swallow a later DM once the hold was already finalized', async () => {
const { captureReasonReply } = await import('./reason-capture.js');
seedApproval('appr-5');
await clickRejectWithReason('appr-5');
// Simulate the sweep (or any other path) finalizing first.
deletePendingApproval('appr-5');
const consumed = await captureReasonReply(dmReply('late reason'));
expect(consumed).toBe(false);
});
it('ignores DMs on channels with no armed reason capture', async () => {
const { captureReasonReply } = await import('./reason-capture.js');
const consumed = await captureReasonReply({
channelType: DM_CHANNEL,
platformId: 'D-someone-else',
threadId: null,
message: { id: 'm', kind: 'chat', content: JSON.stringify({ text: 'hi' }), timestamp: now() },
});
expect(consumed).toBe(false);
});
});
describe('reject-with-reason host sweep', () => {
it('finalizes a hold whose window elapsed as a plain reject', async () => {
const { sweepAwaitingReasonRejects } = await import('./reason-capture.js');
seedApproval('appr-ghost', 'add_mcp_server');
markApprovalAwaitingReason('appr-ghost', new Date(Date.now() - 1000).toISOString());
await sweepAwaitingReasonRejects();
expect(getPendingApproval('appr-ghost')).toBeUndefined();
expect(lastRelayedText()).toBe('Your add_mcp_server request was rejected by admin.');
});
it('leaves a still-open hold untouched', async () => {
const { sweepAwaitingReasonRejects } = await import('./reason-capture.js');
seedApproval('appr-open');
markApprovalAwaitingReason('appr-open', new Date(Date.now() + 60_000).toISOString());
await sweepAwaitingReasonRejects();
expect(getPendingApproval('appr-open')?.status).toBe('awaiting_reason');
expect(vi.mocked(writeSessionMessage)).not.toHaveBeenCalled();
});
});
describe('plain reject (regression)', () => {
it('finalizes immediately with no reason and no DM prompt', async () => {
const { handleApprovalsResponse } = await import('./response-handler.js');
seedApproval('appr-plain', 'install_packages');
await handleApprovalsResponse({
questionId: 'appr-plain',
value: 'reject',
userId: 'admin-1',
channelType: DM_CHANNEL,
platformId: '',
threadId: null,
});
expect(getPendingApproval('appr-plain')).toBeUndefined();
expect(delivered).toHaveLength(0);
expect(lastRelayedText()).toBe('Your install_packages request was rejected by admin.');
});
});
+174
@@ -0,0 +1,174 @@
/**
* "Reject with reason…" capture flow.
*
* When an admin clicks the third approval button, the reject is held instead of
* finalized: the row is parked at status='awaiting_reason' and the admin is
* prompted in their DM for a one-line reason. Their next DM (≤280 chars) is
* captured by a router message-interceptor and relayed to the requesting agent
* as one combined message: `Your <action> request was rejected by admin:
* "<reason>"`. A plain Reject never arms this, so an unrelated DM is never
* swallowed.
*
* Restart-safety: arming lives in an in-memory map (lost on restart, like the
* agent-naming capture it mirrors), but the hold is a durable DB row. If the
* admin never replies, or the host restarts mid-capture, the host sweep
* (sweepAwaitingReasonRejects, run each tick) finalizes a plain reject once the
* row's window elapses, so the requesting agent is never stranded.
*
* Reuses, not reinvents: the agent-naming prompt-then-capture pattern
* (in-memory map + next-DM interceptor) and the shared finalizeReject path.
*/
import type { InboundEvent } from '../../channels/adapter.js';
import { getDeliveryAdapter } from '../../delivery.js';
import {
deletePendingApproval,
getExpiredAwaitingReasonApprovals,
getPendingApproval,
getSession,
markApprovalAwaitingReason,
} from '../../db/sessions.js';
import { log } from '../../log.js';
import { registerMessageInterceptor } from '../../router.js';
import type { PendingApproval, Session } from '../../types.js';
import { ensureUserDm } from '../permissions/user-dm.js';
import { finalizeReject } from './finalize.js';
/** How long an awaiting-reason hold waits for the admin's reply before the sweep finalizes a plain reject. */
const REASON_CAPTURE_WINDOW_MS = 5 * 60 * 1000;
/** Cap on the relayed reason — one cheap guardrail against a wall of text landing in another team's agent context. */
const MAX_REASON_LEN = 280;
const PROMPT_TEXT =
"Reply with a one-line reason for the rejection — I'll relay it to the agent. " +
'No reply within ~5 min declines it without a reason.';
interface ReasonArming {
approvalId: string;
/** Namespaced id of the admin who clicked, for resolution attribution. */
userId: string;
}
/**
* Approvers waiting to type a rejection reason, keyed by their DM channel
* (`<channelType>:<dmPlatformId>`). A DM's platform id is unique per user, so
* the inbound reply matches by channel alone, with no sender re-parsing needed, and
* a group message can never collide with an armed DM. Cleared on receipt,
* staleness, or restart.
*/
const awaitingReason = new Map<string, ReasonArming>();
function dmKey(channelType: string, platformId: string): string {
return `${channelType}:${platformId}`;
}
function clampReason(raw: string): string {
const trimmed = raw.trim();
if (trimmed.length <= MAX_REASON_LEN) return trimmed;
return trimmed.slice(0, MAX_REASON_LEN - 1) + '…';
}
function extractText(event: InboundEvent): string {
try {
const parsed = JSON.parse(event.message.content) as Record<string, unknown>;
return typeof parsed.text === 'string' ? parsed.text : '';
} catch {
return '';
}
}
/**
* Begin the reject-with-reason hold for an approval the admin chose not to
* finalize outright. Prompts the admin's DM, then parks the row and arms
* capture. If we can't reach the admin (no DM, no adapter, delivery throws) we
* finalize a plain reject immediately rather than strand the requesting agent.
*/
export async function armReasonCapture(approval: PendingApproval, session: Session, userId: string): Promise<void> {
const dm = userId ? await ensureUserDm(userId) : null;
const adapter = getDeliveryAdapter();
if (!dm || !adapter) {
log.warn('reject-with-reason: cannot reach approver, finalizing plain reject', {
approvalId: approval.approval_id,
userId,
hasDm: Boolean(dm),
hasAdapter: Boolean(adapter),
});
await finalizeReject(approval, session, userId);
return;
}
try {
await adapter.deliver(dm.channel_type, dm.platform_id, null, 'chat-sdk', JSON.stringify({ text: PROMPT_TEXT }));
} catch (err) {
log.error('reject-with-reason: reason prompt delivery failed, finalizing plain reject', {
approvalId: approval.approval_id,
err,
});
await finalizeReject(approval, session, userId);
return;
}
// Prompt is out — now hold the row and arm capture. Order matters: a reply
// can't arrive before the prompt is read, so there's no lost-message window.
const expiresAt = new Date(Date.now() + REASON_CAPTURE_WINDOW_MS).toISOString();
markApprovalAwaitingReason(approval.approval_id, expiresAt);
awaitingReason.set(dmKey(dm.channel_type, dm.platform_id), { approvalId: approval.approval_id, userId });
log.info('reject-with-reason: awaiting reason reply', { approvalId: approval.approval_id, userId });
}
/**
* Router message-interceptor: capture the next DM from an admin who armed a
* reason. Returns true (consume the message) when this DM is an armed reason
* channel and still holds a live row; false otherwise so normal routing runs.
*
* Exported for tests; registered as the interceptor below.
*/
export async function captureReasonReply(event: InboundEvent): Promise<boolean> {
const arming = awaitingReason.get(dmKey(event.channelType, event.platformId));
if (!arming) return false;
// This DM is an armed reason channel — disarm regardless of outcome.
awaitingReason.delete(dmKey(event.channelType, event.platformId));
const approval = getPendingApproval(arming.approvalId);
if (!approval || approval.status !== 'awaiting_reason') {
// Already finalized (e.g. ghosted by the sweep). The reply is no longer a
// reason — let it route normally instead of swallowing it.
return false;
}
const session = approval.session_id ? getSession(approval.session_id) : null;
if (!session) {
deletePendingApproval(approval.approval_id);
return true;
}
const reason = clampReason(extractText(event));
await finalizeReject(approval, session, arming.userId, reason || undefined);
log.info('reject-with-reason: reason captured and relayed', {
approvalId: approval.approval_id,
hasReason: reason.length > 0,
});
return true;
}
registerMessageInterceptor(captureReasonReply);
/**
* Host-sweep finalizer: any reject-with-reason hold whose window elapsed (admin
* ghosted, or the host restarted mid-capture and lost the in-memory arming) is
* finalized as a plain reject. Restart-safe: the hold is a durable row, so the
* requesting agent always gets its decision. Called once per sweep tick.
*/
export async function sweepAwaitingReasonRejects(): Promise<void> {
const rows = getExpiredAwaitingReasonApprovals(new Date().toISOString());
for (const approval of rows) {
const session = approval.session_id ? getSession(approval.session_id) : null;
if (!session) {
deletePendingApproval(approval.approval_id);
continue;
}
// Plain reject, unknown resolver — the admin opted in but never typed.
await finalizeReject(approval, session, '');
log.info('reject-with-reason: window elapsed, finalized as plain reject', { approvalId: approval.approval_id });
}
}
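The truncation rule in `clampReason` can be checked in isolation. The following is a minimal sketch that copies the function body from the file above into a standalone snippet; the sample inputs and the `short`/`long` variable names are illustrative only.

```typescript
// Standalone copy of the clamp logic for illustration; the cap mirrors MAX_REASON_LEN.
const MAX_REASON_LEN = 280;

function clampReason(raw: string): string {
  const trimmed = raw.trim();
  if (trimmed.length <= MAX_REASON_LEN) return trimmed;
  // Keep the first 279 characters and append a one-character ellipsis,
  // so an over-long reason comes out exactly MAX_REASON_LEN characters long.
  return trimmed.slice(0, MAX_REASON_LEN - 1) + '…';
}

// Illustrative inputs (not taken from the source).
const short = clampReason('  too risky for prod  ');
const long = clampReason('x'.repeat(400));

console.log(short);
console.log(long.length, long.endsWith('…'));
```

Slicing to `MAX_REASON_LEN - 1` rather than `MAX_REASON_LEN` is what keeps the ellipsis inside the cap, which is the property the truncation test asserts.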
+22 -12
@@ -5,7 +5,10 @@
* 1. Module-initiated actions: the module called `requestApproval()` with
* some free-form `action` string and registered a handler via
* `registerApprovalHandler(action, handler)`. On approve, we look up the
* handler and call it; on reject, we notify the agent and move on.
* handler and call it; on plain reject we relay a decline to the agent; on
* "Reject with reason…" we hold the row and capture the admin's next DM as
* a one-line reason (see reason-capture.ts). Reject finalization is shared
* via finalizeReject.
* 2. OneCLI credential approvals (`action = 'onecli_credential'`). Resolved
* via an in-memory Promise; see onecli-approvals.ts.
*
@@ -19,8 +22,10 @@ import { log } from '../../log.js';
import { writeSessionMessage } from '../../session-manager.js';
import type { PendingApproval } from '../../types.js';
import { hasAdminPrivilege, isGlobalAdmin, isOwner } from '../permissions/db/user-roles.js';
import { finalizeReject } from './finalize.js';
import { ONECLI_ACTION, resolveOneCLIApproval } from './onecli-approvals.js';
import { getApprovalHandler, notifyApprovalResolved } from './primitive.js';
import { getApprovalHandler, notifyApprovalResolved, REJECT_WITH_REASON_VALUE } from './primitive.js';
import { armReasonCapture } from './reason-capture.js';
export async function handleApprovalsResponse(payload: ResponsePayload): Promise<boolean> {
const approval = getPendingApproval(payload.questionId);
@@ -65,6 +70,21 @@ async function handleRegisteredApproval(
return;
}
// "Reject with reason…" — hold the row and capture the admin's next DM
// instead of finalizing now. The agent is notified exactly once: after the
// reason arrives, or after the sweep's timeout if the admin ghosts.
if (selectedOption === REJECT_WITH_REASON_VALUE) {
await armReasonCapture(approval, session, userId);
return;
}
// Plain Reject (or any other non-approve value) — instant fast path.
if (selectedOption !== 'approve') {
await finalizeReject(approval, session, userId);
return;
}
// Approved — dispatch to the module that registered for this action.
const notify = (text: string): void => {
writeSessionMessage(session.agent_group_id, session.id, {
id: `appr-note-${Date.now()}-${Math.random().toString(36).slice(2, 8)}`,
@@ -77,16 +97,6 @@ async function handleRegisteredApproval(
});
};
if (selectedOption !== 'approve') {
notify(`Your ${approval.action} request was rejected by admin.`);
log.info('Approval rejected', { approvalId: approval.approval_id, action: approval.action, userId });
deletePendingApproval(approval.approval_id);
await notifyApprovalResolved({ approval, session, outcome: 'reject', userId });
await wakeContainer(session);
return;
}
// Approved — dispatch to the module that registered for this action.
const handler = getApprovalHandler(approval.action);
if (!handler) {
log.warn('No approval handler registered — row dropped', {
@@ -292,6 +292,8 @@ export function createNewAgentGroup(name: string): AgentGroup {
});
const ag = getAgentGroup(agId)!;
// Channel-approved groups get the built-in default provider (claude); the
// operator flips a group with `ncl groups config update --provider`.
initGroupFilesystem(ag);
return ag;
}
+2 -2
@@ -22,7 +22,7 @@ import {
routeInbound,
setAccessGate,
setChannelRequestGate,
setMessageInterceptor,
registerMessageInterceptor,
setSenderResolver,
setSenderScopeGate,
type AccessGateResult,
@@ -521,7 +521,7 @@ registerResponseHandler(handleChannelApprovalResponse);
// Captures the next DM from an approver who clicked "Create new agent",
// creates the agent immediately, wires the channel, and replays.
setMessageInterceptor(async (event: InboundEvent): Promise<boolean> => {
registerMessageInterceptor(async (event: InboundEvent): Promise<boolean> => {
const userId = extractAndUpsertUser(event);
if (!userId) return false;
+53 -1
@@ -71,7 +71,7 @@ describe('initGroupFilesystem agent surfaces', () => {
expect(fs.existsSync(path.join(claudeDir, 'skills'))).toBe(true);
});
it('skips the default surfaces for a provider that provides its own', () => {
it('writes the seed into the memory scaffold — never CLAUDE.* — for a provider with its own surfaces', () => {
const ag = group('ag-surfy', 'surfy-group');
createAgentGroup(ag);
@@ -80,10 +80,27 @@ describe('initGroupFilesystem agent surfaces', () => {
const groupDir = path.join(GROUPS_DIR, ag.folder);
const sessionRoot = path.join(DATA_DIR, 'v2-sessions', ag.id);
expect(fs.existsSync(groupDir)).toBe(true);
// A fresh group on a surfaces-owning provider must not contain stale
// Claude surfaces; its seed lands in the scaffold's conventional file,
// which the container-side scaffold preserves at boot.
expect(fs.existsSync(path.join(groupDir, 'CLAUDE.local.md'))).toBe(false);
expect(fs.readFileSync(path.join(groupDir, 'memory', 'memories', 'imported-agent-memory.md'), 'utf-8')).toBe(
'hello\n',
);
expect(fs.existsSync(path.join(sessionRoot, '.claude-shared'))).toBe(false);
});
it('writes nothing at all for a surfaces-owning provider without instructions', () => {
const ag = group('ag-surfy-bare', 'surfy-bare-group');
createAgentGroup(ag);
initGroupFilesystem(ag, { provider: 'surfaces-test-provider' });
const groupDir = path.join(GROUPS_DIR, ag.folder);
expect(fs.existsSync(path.join(groupDir, 'CLAUDE.local.md'))).toBe(false);
expect(fs.existsSync(path.join(groupDir, 'memory'))).toBe(false);
});
it('treats an unregistered provider name as default surfaces', () => {
const ag = group('ag-unknown', 'unknown-group');
createAgentGroup(ag);
@@ -94,6 +111,41 @@ describe('initGroupFilesystem agent surfaces', () => {
});
});
describe('initGroupFilesystem deferred seed (.seed.md)', () => {
// Creation is provider-agnostic: the DM-agent creators drop a neutral
// `.seed.md` and defer placement to the first spawn, where the DB-resolved
// provider is known. group-init places it into the right surface and
// consumes it. Red-on-delete: if that placement is removed, these fail.
it('places .seed.md into CLAUDE.local.md for the default provider, then consumes it', () => {
const ag = group('ag-seed-default', 'seed-default');
createAgentGroup(ag);
const groupDir = path.join(GROUPS_DIR, ag.folder);
fs.mkdirSync(groupDir, { recursive: true });
fs.writeFileSync(path.join(groupDir, '.seed.md'), 'seeded identity\n');
initGroupFilesystem(ag, {}); // no inline instructions — must read .seed.md
expect(fs.readFileSync(path.join(groupDir, 'CLAUDE.local.md'), 'utf-8')).toBe('seeded identity\n');
expect(fs.existsSync(path.join(groupDir, '.seed.md'))).toBe(false);
});
it('places .seed.md into the memory scaffold (never CLAUDE.*) for a surfaces-owning provider, then consumes it', () => {
const ag = group('ag-seed-surfy', 'seed-surfy');
createAgentGroup(ag);
const groupDir = path.join(GROUPS_DIR, ag.folder);
fs.mkdirSync(groupDir, { recursive: true });
fs.writeFileSync(path.join(groupDir, '.seed.md'), 'seeded identity\n');
initGroupFilesystem(ag, { provider: 'surfaces-test-provider' });
expect(fs.existsSync(path.join(groupDir, 'CLAUDE.local.md'))).toBe(false);
expect(fs.readFileSync(path.join(groupDir, 'memory', 'memories', 'imported-agent-memory.md'), 'utf-8')).toBe(
'seeded identity\n',
);
expect(fs.existsSync(path.join(groupDir, '.seed.md'))).toBe(false);
});
});
describe('buildMounts agent surfaces', () => {
it('mounts the default surfaces for an unregistered provider (todays behavior)', () => {
const ag = group('ag-mounts-default', 'mounts-default');
+17 -9
@@ -110,16 +110,20 @@ export function setSenderScopeGate(fn: SenderScopeGateFn): void {
/**
* Message-interceptor hook. Runs at the very top of routeInbound, before
* messaging-group resolution. When the interceptor returns true the message
* is consumed and routing stops. Used by the permissions module to capture
* free-text replies during multi-step approval flows (e.g. agent naming).
* messaging-group resolution. When an interceptor returns true the message is
* consumed and routing stops. Multiple interceptors may register; they run in
* registration order and the first to claim the message (return true) wins.
*
* Used by modules to capture free-text DM replies during multi-step approval
* flows: the permissions module (agent naming during channel registration)
* and the approvals module (reject-with-reason capture).
*/
export type MessageInterceptorFn = (event: InboundEvent) => Promise<boolean>;
let messageInterceptor: MessageInterceptorFn | null = null;
const messageInterceptors: MessageInterceptorFn[] = [];
export function setMessageInterceptor(fn: MessageInterceptorFn): void {
messageInterceptor = fn;
export function registerMessageInterceptor(fn: MessageInterceptorFn): void {
messageInterceptors.push(fn);
}
/**
@@ -156,9 +160,13 @@ function safeParseContent(raw: string): { text?: string; sender?: string; sender
* Creates messaging group + session if they don't exist yet.
*/
export async function routeInbound(event: InboundEvent): Promise<void> {
// Pre-route interceptor — lets modules consume messages before any routing
// (e.g. free-text replies during multi-step approval flows).
if (messageInterceptor && (await messageInterceptor(event))) return;
// Pre-route interceptors — let modules consume messages before any routing
// (e.g. free-text DM replies during multi-step approval flows). They run in
// registration order; the first to claim the message stops routing. The
// sequential await is intentional — first-to-claim is order-dependent.
for (const intercept of messageInterceptors) {
if (await intercept(event)) return;
}
// 0. Apply the adapter's thread policy. Non-threaded adapters (Telegram,
// WhatsApp, iMessage, email) collapse threads to the channel. Resolved
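The first-to-claim behavior of the interceptor list can be sketched outside the router. This is a simplified illustration, not the router's code: `SketchEvent`, `register`, `runInterceptors`, and the two sample interceptors are hypothetical stand-ins, and the real `InboundEvent` carries more fields.

```typescript
// Hypothetical, simplified event shape for illustration only.
interface SketchEvent {
  channelType: string;
  platformId: string;
}

type Interceptor = (event: SketchEvent) => Promise<boolean>;

const interceptors: Interceptor[] = [];

function register(fn: Interceptor): void {
  interceptors.push(fn);
}

// Runs interceptors in registration order; stops at the first one that claims the event.
async function runInterceptors(event: SketchEvent): Promise<boolean> {
  for (const intercept of interceptors) {
    if (await intercept(event)) return true;
  }
  return false; // nobody claimed it, so normal routing would continue
}

// Records which interceptors were consulted, in order.
const calls: string[] = [];

register(async (e) => {
  calls.push('naming');
  return e.platformId === 'D-naming';
});
register(async (e) => {
  calls.push('reason');
  return e.platformId === 'D-reason';
});

async function demo(): Promise<boolean[]> {
  return [
    await runInterceptors({ channelType: 'slack', platformId: 'D-naming' }),
    await runInterceptors({ channelType: 'slack', platformId: 'D-reason' }),
    await runInterceptors({ channelType: 'slack', platformId: 'C-group' }),
  ];
}

const results = demo();
results.then((r) => console.log(r, calls));
```

Because the loop awaits each interceptor before trying the next, a later interceptor is never consulted once an earlier one returns true, so registration order decides which module wins a contested message.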
+6 -1
@@ -200,8 +200,13 @@ export interface PendingApproval {
channel_type: string | null;
platform_id: string | null;
platform_message_id: string | null;
/**
* For OneCLI credential rows, the gateway's request TTL. For a module
* approval held by "Reject with reason…", the deadline after which the
* host sweep finalizes a plain reject (set by markApprovalAwaitingReason).
*/
expires_at: string | null;
status: 'pending' | 'approved' | 'rejected' | 'expired';
status: 'pending' | 'approved' | 'rejected' | 'expired' | 'awaiting_reason';
title: string;
options_json: string;
}
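The role of `expires_at` as a sweep deadline can be illustrated with an in-memory stand-in. This is a sketch under stated assumptions: `HoldRow`, `deadlineFrom`, and `expiredHolds` are hypothetical, and the selection rule (status `awaiting_reason` with a deadline at or before "now") is an assumption about what `getExpiredAwaitingReasonApprovals` does, not its confirmed query.

```typescript
// Hypothetical in-memory row; the real PendingApproval has more columns.
interface HoldRow {
  approval_id: string;
  status: 'pending' | 'awaiting_reason';
  expires_at: string | null;
}

// Mirrors the 5-minute capture window used when a hold is armed.
const WINDOW_MS = 5 * 60 * 1000;

function deadlineFrom(nowMs: number): string {
  return new Date(nowMs + WINDOW_MS).toISOString();
}

// Assumed rule: a hold is expired when its deadline is at or before `nowIso`.
// toISOString() yields fixed-width UTC strings, so plain string comparison
// orders them chronologically.
function expiredHolds(rows: HoldRow[], nowIso: string): HoldRow[] {
  return rows.filter(
    (r) => r.status === 'awaiting_reason' && r.expires_at !== null && r.expires_at <= nowIso,
  );
}

// Fixed reference time so the example is deterministic.
const base = Date.UTC(2026, 5, 18, 6, 0, 0);
const rows: HoldRow[] = [
  { approval_id: 'appr-ghost', status: 'awaiting_reason', expires_at: deadlineFrom(base - 10 * 60 * 1000) },
  { approval_id: 'appr-open', status: 'awaiting_reason', expires_at: deadlineFrom(base) },
  { approval_id: 'appr-pending', status: 'pending', expires_at: null },
];

const expiredIds = expiredHolds(rows, new Date(base).toISOString()).map((r) => r.approval_id);
console.log(expiredIds);
```

Storing the deadline on the row rather than in a timer is what lets a sweep pick the hold up after a restart, since nothing in memory is needed to decide that the window has elapsed.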
+4
@@ -0,0 +1,4 @@
{
"onecli-gateway": "1.36.0",
"onecli-cli": "2.2.5"
}
+3 -1
@@ -4,6 +4,8 @@ export default defineConfig({
test: {
// container/agent-runner tests run under Bun (they depend on bun:sqlite).
// See container/agent-runner/package.json "test" script.
include: ['src/**/*.test.ts', 'setup/**/*.test.ts', 'scripts/**/*.test.ts'],
// container/*.test.ts: top-level only — container/agent-runner tests run
// under Bun (they depend on bun:sqlite) and must not be picked up here.
include: ['src/**/*.test.ts', 'setup/**/*.test.ts', 'scripts/**/*.test.ts', 'container/*.test.ts'],
},
});