fix(codex): deliver harness file events + add file to ProviderEvent

Codex's built-in image generation yields { type: 'file', path } that the ProviderEvent union didn't declare (breaks tsc once codex.ts lands on trunk) and the poll-loop never consumed (the image was dropped and never reached chat). Adding the type alone clears the build but leaves delivery broken; this fixes both.

- add { type: 'file'; path: string } to ProviderEvent
- extract enqueueFileOut() owning the outbox-staging + messages_out {files:[]} contract so send_file and the poll-loop can't drift apart
- poll-loop delivers file events to the batch's reply destination, best-effort (missing dest / unreadable file logs, never fails the turn)
- tests for enqueueFileOut

Refs CDX-001.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
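
As a rough illustration of the two halves of this fix, the sketch below declares the new union member and delivers it best-effort. Only the `{ type: 'file'; path: string }` shape and the "log, never fail the turn" behavior come from the commit message; the `result` variant, the `FileSink` interface, the `deliverFileEvent` name, and the `enqueueFileOut` signature are assumptions made for the example, not the repository's actual code.

```typescript
// Hypothetical sketch. Only the 'file' variant is taken from the commit message.
type ProviderEvent =
  | { type: 'result'; text: string } // assumed pre-existing variant
  | { type: 'file'; path: string }; // the variant this commit adds

interface FileSink {
  // Stand-in for enqueueFileOut(); the real signature is not shown in the commit.
  enqueueFileOut(dest: string, path: string): void;
}

// Best-effort delivery: a missing destination or a staging failure is logged
// and swallowed, so the surrounding turn is never failed by a file event.
function deliverFileEvent(
  event: ProviderEvent,
  dest: string | undefined,
  sink: FileSink,
  log: (msg: string) => void,
): boolean {
  if (event.type !== 'file') return false; // not a file event, nothing to do
  if (!dest) {
    log(`file event dropped: no reply destination for ${event.path}`);
    return false;
  }
  try {
    sink.enqueueFileOut(dest, event.path);
    return true;
  } catch (err) {
    log(`file event dropped: ${String(err)}`);
    return false;
  }
}
```

Routing both callers through one staging function is what keeps `send_file` and the poll-loop from drifting apart; the boolean return here is only a convenience for the sketch.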
refactor(add-codex): install Codex CLI via cli-tools.json, not the Dockerfile
2026-06-15 18:21:47 +08:00 · 2026-06-15 01:50:32 +03:00 · 2026-06-14 21:40:44 +03:00 · 2026-06-14 18:29:28 +00:00 · 2026-06-14 18:29:25 +00:00 · 2026-06-14 21:29:12 +03:00
139 changed files with 8281 additions and 1348 deletions
@@ -1,83 +1,78 @@
-# Remove Codex provider
+# Remove the Codex agent provider

-Idempotent — safe to run even if some steps were never applied. Reverses both the host (`src/providers/`) and container (`container/agent-runner/src/providers/`) trees, plus the Dockerfile CLI install.
+Reverses every change `/add-codex` makes and returns every group to the default provider. Safe to run when partially installed — skip any step whose target is already absent.

-## 1. Delete the barrel import lines (both trees)
+## 1. Switch codex groups back to the default

-Delete (do not comment out) the `import './codex.js';` line from each barrel:
+List groups still on codex and switch each one. Each group's `memory/` tree stays on disk and readable; run `/migrate-memory` per group if its memory should carry back to Claude (see [docs/provider-migration.md](../../../docs/provider-migration.md)):
+
+```bash
+ncl groups list
+# for each group whose config shows provider=codex:
+ncl groups config update --id <group-id> --provider claude
+ncl groups restart --id <group-id>
+```
+
+## 2. Delete the barrel imports
+
+Delete (do not comment out) the `import './codex.js';` line from each of:

 - `src/providers/index.ts`
 - `container/agent-runner/src/providers/index.ts`
+- `setup/providers/index.ts`

-This unregisters the provider from both `listProviderContainerConfigNames()` (host) and `listProviderNames()` (container).
-
-## 2. Delete the copied files (both trees)
+## 3. Delete every copied file

 ```bash
 rm -f src/providers/codex.ts \
+      src/providers/codex-agents-md.ts \
      src/providers/codex-registration.test.ts \
+      src/providers/codex-host-contribution.test.ts \
+      src/providers/codex-agents-md.test.ts \
      container/agent-runner/src/providers/codex.ts \
      container/agent-runner/src/providers/codex-app-server.ts \
-      container/agent-runner/src/providers/codex.factory.test.ts \
+      container/agent-runner/src/providers/exchange-archive.ts \
+      container/agent-runner/src/providers/exchange-archive.test.ts \
      container/agent-runner/src/providers/codex-registration.test.ts \
-      container/agent-runner/src/providers/codex-dockerfile.test.ts
+      container/agent-runner/src/providers/codex.factory.test.ts \
+      container/agent-runner/src/providers/codex.turns.test.ts \
+      container/agent-runner/src/providers/codex-app-server.test.ts \
+      container/agent-runner/src/providers/codex-cli-tools.test.ts \
+      setup/providers/codex.ts \
+      setup/providers/codex.test.ts \
+      setup/providers/codex-registration.test.ts
 ```

-## 3. Revert the Dockerfile CLI install
+This skill itself (`.claude/skills/add-codex/`) stays — it ships with trunk so the provider can be re-added later.

-In `container/Dockerfile`, remove both Codex edits (skip whichever is already gone):
+`container/AGENTS.md` stays only if another installed provider uses agent surfaces; otherwise remove it too.

-**(a)** Delete the version ARG from the "Pin CLI versions" block:
+## 4. Remove the CLI manifest entry

-```dockerfile
-ARG CODEX_VERSION=0.124.0
-```
-
-**(b)** Delete the standalone Codex install layer:
-
-```dockerfile
-RUN --mount=type=cache,target=/root/.cache/pnpm \
-    pnpm install -g "@openai/codex@${CODEX_VERSION}"
-```
-
-Leave the other per-CLI install layers (claude-code, agent-browser, vercel) untouched.
-
-## 4. Dependency
-
-Codex is a CLI binary installed via the Dockerfile — there is no agent-runner package dependency to uninstall. Step 3 removes the only install surface; no `bun remove` / `pnpm uninstall` is needed.
-
-## 5. Unset Codex env vars
-
-Remove any Codex-specific lines you added to `.env` (`OPENAI_API_KEY`, `OPENAI_BASE_URL`, `CODEX_MODEL`) if no other integration uses them, then re-sync to the container:
+Delete the `@openai/codex` entry from `container/cli-tools.json`:

 ```bash
-mkdir -p data/env && cp .env data/env/env
+node -e '
+  const fs = require("fs");
+  const file = "container/cli-tools.json";
+  const tools = JSON.parse(fs.readFileSync(file, "utf8")).filter((t) => t.name !== "@openai/codex");
+  const fmt = (t) => "  { " + Object.entries(t).map(([k, v]) => JSON.stringify(k) + ": " + JSON.stringify(v)).join(", ") + " }";
+  fs.writeFileSync(file, "[\n" + tools.map(fmt).join(",\n") + "\n]\n");
+'
 ```

-Switch any group still on Codex back to the default provider — set `"provider": "claude"` in `groups/<folder>/container.json` and clear `agent_provider` on the group/session in the DB.
+## 5. Vault secret (optional)

-## 6. Rebuild and restart
+The ChatGPT/OpenAI secret in the OneCLI vault grants nothing once the provider is gone. To remove it: `onecli secrets list`, then `onecli secrets delete --id <id>` for the `chatgpt.com` / `api.openai.com` entry.

-Run from your NanoClaw project root:
+## 6. Rebuild and verify

 ```bash
-pnpm run build && ./container/build.sh
-source setup/lib/install-slug.sh
-
-# macOS
-launchctl kickstart -k gui/$(id -u)/$(launchd_label)
-
-# Linux
-systemctl --user restart $(systemd_unit)
+pnpm run build
+pnpm exec tsc -p container/agent-runner/tsconfig.json --noEmit
+./container/build.sh
+pnpm test
+cd container/agent-runner && bun test
 ```

-## Verification
-
-After removal, the registration guards no longer apply (their files are gone). Confirm the provider is fully unwired:
-
-```bash
-grep -R "codex.js" src/providers/index.ts container/agent-runner/src/providers/index.ts   # no output
-grep "@openai/codex" container/Dockerfile                                                  # no output
-```
-
-In a wired agent, requesting `agent_provider = 'codex'` should fall back to the default provider since `codex` is no longer in the registry.
+All suites green and `ncl groups list` showing no codex groups means the removal is complete. Restart the service (`launchctl kickstart -k gui/$(id -u)/<label>` on macOS, `systemctl --user restart <unit>` on Linux).
@@ -1,186 +1,126 @@
 ---
 name: add-codex
-description: Use Codex (CLI + AppServer) as the full agent provider — planning, tool orchestration, native compaction, MCP tools, session resume — in place of the Claude Agent SDK. ChatGPT subscription or OPENAI_API_KEY. Per-group via agent_provider. Distinct from using OpenAI as an MCP tool (where Claude remains the planner).
+description: Use Codex (OpenAI's codex app-server) as a full agent provider — planning, tool orchestration, MCP tools, server-side history, session resume — alongside or instead of Claude. ChatGPT subscription or OpenAI API key, vault-only via OneCLI. Per-group via `ncl groups config update --provider codex`. Distinct from using OpenAI as an MCP tool (where Claude remains the planner).
 ---

 # Codex agent provider

-NanoClaw runs agents in a long-lived **poll loop** inside the container. The backend is selected with **`AGENT_PROVIDER`** (`claude` | `opencode` | `codex` | `mock`).
+> Shortcut: `pnpm exec tsx setup/index.ts --step provider-auth codex` performs this whole install (manifest-driven from the providers branch: files, barrels, CLI manifest entry, image rebuild) plus auth in one command. The steps below are the same operations, for agent-driven or manual application.

-Trunk ships with only the `claude` provider baked in. This skill copies the Codex provider files in from the `providers` branch, wires them into the host and container barrels, updates the Dockerfile to install the Codex CLI, and rebuilds the image.
+NanoClaw selects each group's agent backend from `container_configs.provider` (default `claude`). This skill installs the Codex provider: copy the payload from the `providers` branch, append one import to each of the three provider barrels, add the pinned Codex CLI to the container manifest (`container/cli-tools.json`), rebuild, then run the vault auth walk-through.

-The Codex provider runs `codex app-server` as a child process and speaks JSON-RPC over stdio. That gives it native session resume, streaming events, MCP tool access, and `thread/compact/start` compaction — same feature bar as the Claude Agent SDK, without the Anthropic-only lock-in.
+The provider runs `codex app-server` as a child process speaking JSON-RPC over stdio: native streaming, MCP tools, server-side conversation history (the continuation is a thread id, no on-disk transcript). Credentials are **vault-only**: OneCLI serves a sentinel `auth.json` stub into the container and swaps the real ChatGPT token or API key on the wire — no key in `.env`, nothing readable in the container.

 ## Install

 ### Pre-flight

-If all of the following are already present, skip to **Configuration**:
+Check whether the payload is already wired (a prior apply, or a trunk that still carries it). All of these present means installed — skip to **Authenticate**:

- `src/providers/codex.ts`
- `src/providers/codex-registration.test.ts`
- `container/agent-runner/src/providers/codex.ts`
- `container/agent-runner/src/providers/codex-app-server.ts`
- `container/agent-runner/src/providers/codex.factory.test.ts`
- `container/agent-runner/src/providers/codex-registration.test.ts`
- `container/agent-runner/src/providers/codex-dockerfile.test.ts`
- `import './codex.js';` line in `src/providers/index.ts`
- `import './codex.js';` line in `container/agent-runner/src/providers/index.ts`
- `ARG CODEX_VERSION` and `"@openai/codex@${CODEX_VERSION}"` in the pnpm global-install block in `container/Dockerfile`
+- `src/providers/codex.ts` and `src/providers/codex-agents-md.ts`
+- `container/agent-runner/src/providers/codex.ts` and `codex-app-server.ts`
+- `setup/providers/codex.ts`
+- `import './codex.js';` in `src/providers/index.ts`, `container/agent-runner/src/providers/index.ts`, and `setup/providers/index.ts`
+- an `@openai/codex` entry in `container/cli-tools.json`

-Missing pieces — continue below. All steps are idempotent; re-running is safe.
-
-### 1. Fetch the providers branch
+### Fetch and copy

 ```bash
 git fetch origin providers
 ```

-### 2. Copy the Codex source files and tests
+Copy each file with `git show origin/providers:<path> > <path>` (additive — never merge the branch):

-Wholesale copies (owned entirely by this skill — user edits to these files won't survive a re-run, as designed):
+Host (`src/providers/`):
+- `codex.ts` — provider contribution: per-group `.codex-shared` state dir, AGENTS.md compose, skill links
+- `codex-agents-md.ts` — AGENTS.md composition (32KB Codex cap: degrades by dropping the largest instruction sections, never blocks a spawn)
+- `codex-registration.test.ts` — barrel-driven host registration guard
+- `codex-host-contribution.test.ts` — drives the real contribution against a real test DB (the "consumes core" leg)
+- `codex-agents-md.test.ts` — cap-degradation behavior
+
+Container (`container/agent-runner/src/providers/`):
+- `codex.ts` — the provider (turn loop, steering, memory scaffold + `onExchangeComplete` archiving)
+- `codex-app-server.ts` — JSON-RPC child-process wrapper
+- `exchange-archive.ts` — per-exchange markdown writer the `onExchangeComplete` hook uses (provider-owned, not runner code)
+- `exchange-archive.test.ts` — writer behavior
+- `codex-registration.test.ts` — barrel-driven container registration guard
+- `codex.factory.test.ts`, `codex.turns.test.ts`, `codex-app-server.test.ts` — provider behavior
+- `codex-cli-tools.test.ts` — structural guard for the Codex entry in `container/cli-tools.json`
+
+Setup (`setup/providers/`):
+- `codex.ts` — picker entry self-registration + the vault auth walk-through + install check
+- `codex.test.ts` — install-check coverage
+- `codex-registration.test.ts` — barrel-driven setup registration guard
+
+Shared base (skip if present):
+- `container/AGENTS.md` — the runtime-contract base the composed AGENTS.md embeds
+
+### Wire the barrels
+
+Append `import './codex.js';` to each of:
+- `src/providers/index.ts`
+- `container/agent-runner/src/providers/index.ts`
+- `setup/providers/index.ts`
+
+### CLI manifest
+
+The agent's global Node CLIs install from `container/cli-tools.json` (a json-merge seam), not hand-edited Dockerfile layers. Add Codex by appending one entry — `@openai/codex` has no native postinstall, so no `onlyBuilt`:

 ```bash
-git show origin/providers:src/providers/codex.ts                                         > src/providers/codex.ts
-git show origin/providers:src/providers/codex-registration.test.ts                       > src/providers/codex-registration.test.ts
-git show origin/providers:container/agent-runner/src/providers/codex.ts                  > container/agent-runner/src/providers/codex.ts
-git show origin/providers:container/agent-runner/src/providers/codex-app-server.ts       > container/agent-runner/src/providers/codex-app-server.ts
-git show origin/providers:container/agent-runner/src/providers/codex.factory.test.ts     > container/agent-runner/src/providers/codex.factory.test.ts
-git show origin/providers:container/agent-runner/src/providers/codex-registration.test.ts > container/agent-runner/src/providers/codex-registration.test.ts
+node -e '
+  const fs = require("fs");
+  const file = "container/cli-tools.json";
+  const tools = JSON.parse(fs.readFileSync(file, "utf8"));
+  if (!tools.some((t) => t.name === "@openai/codex")) {
+    tools.push({ name: "@openai/codex", version: "0.138.0" });
+    const fmt = (t) => "  { " + Object.entries(t).map(([k, v]) => JSON.stringify(k) + ": " + JSON.stringify(v)).join(", ") + " }";
+    fs.writeFileSync(file, "[\n" + tools.map(fmt).join(",\n") + "\n]\n");
+  }
+'
 ```

-The two `codex-registration.test.ts` files are the **registration guards**. Each imports only the real barrel — the host one calls `listProviderContainerConfigNames()` from `src/providers/index.ts`, the container one calls `listProviderNames()` from `container/agent-runner/src/providers/index.ts` — and asserts `codex` is present. They go red the instant a barrel import line is deleted or drifts. (`codex.factory.test.ts` imports `./codex.js` directly and self-registers, so it stays green even if the barrel line is gone — keep it as a unit test of provider behavior, but it is **not** the registration guard.)
+The version (`0.138.0`) is the canonical pin — keep it in sync with `setup/add-codex.sh`. The Dockerfile already installs every manifest entry via pinned `pnpm install -g`; no Dockerfile edit is needed.

-If `git show origin/providers:.../codex-registration.test.ts` errors with `path ... does not exist`, the registration tests have not landed on `origin/providers` yet. Run `git fetch origin providers` again; once the branch carries them, the copies above succeed. The rest of the install proceeds regardless — the Dockerfile and factory tests still run.
-
-Copy the Dockerfile structural test that ships with this skill into the container provider tree:
+### Build

 ```bash
-cp .claude/skills/add-codex/codex-dockerfile.test.ts container/agent-runner/src/providers/codex-dockerfile.test.ts
+pnpm run build
+pnpm exec tsc -p container/agent-runner/tsconfig.json --noEmit
+./container/build.sh
 ```

-`codex-dockerfile.test.ts` reads the real `container/Dockerfile` and asserts the `ARG CODEX_VERSION=` line and the `pnpm install -g "@openai/codex@${CODEX_VERSION}"` line are both present. The Codex CLI is a binary, not an importable package, so the registration tests cannot see it — this structural test is what guards the Dockerfile edits in step 4.
-
-### 3. Append the self-registration imports
-
-Each barrel gets one line — alphabetical placement keeps diffs small.
-
-`src/providers/index.ts`:
-
-```typescript
-import './codex.js';
-```
-
-`container/agent-runner/src/providers/index.ts`:
-
-```typescript
-import './codex.js';
-```
-
-### 4. Add the Codex CLI to the container Dockerfile
-
-Two edits to `container/Dockerfile`, both idempotent (skip if already present):
-
-**(a)** In the "Pin CLI versions" ARG block (around line 18), add after `ARG CLAUDE_CODE_VERSION=...`:
-
-```dockerfile
-ARG CODEX_VERSION=0.124.0
-```
-
-**(b)** Add a new standalone `RUN` block for the Codex CLI, after the existing per-CLI install blocks (around line 106, right after the `@anthropic-ai/claude-code` block). The Dockerfile splits each global CLI into its own layer for cache granularity — keep that pattern; do not collapse them into a single combined `pnpm install -g` call:
-
-```dockerfile
-RUN --mount=type=cache,target=/root/.cache/pnpm \
-    pnpm install -g "@openai/codex@${CODEX_VERSION}"
-```
-
-Note: **no agent-runner package dependency** — Codex is a CLI binary, not a library. Unlike OpenCode, there's nothing to add to `container/agent-runner/package.json`.
-
-### 5. Build and validate
+### Validate

 ```bash
-pnpm run build                                                          # host
-pnpm exec vitest run src/providers/codex-registration.test.ts          # host registration guard
-pnpm exec tsc -p container/agent-runner/tsconfig.json --noEmit         # container typecheck
-cd container/agent-runner && bun test src/providers/codex-registration.test.ts && cd -   # container registration guard
-cd container/agent-runner && bun test src/providers/codex-dockerfile.test.ts && cd -      # Dockerfile structural guard
-./container/build.sh                                                    # agent image
+pnpm vitest run src/providers/codex-registration.test.ts src/providers/codex-host-contribution.test.ts src/providers/codex-agents-md.test.ts setup/providers/
+cd container/agent-runner && bun test src/providers/
 ```

-All must be clean before proceeding.
+The registration tests import only the real barrels — they go red if a barrel line is missing, a barrel fails to evaluate, or the payload is broken.

- The **host** `codex-registration.test.ts` imports the real host barrel (`src/providers/index.ts`) and asserts `listProviderContainerConfigNames()` contains `codex`. It goes red if the `import './codex.js';` line is deleted or drifts, or if the barrel fails to evaluate.
- The **container** `codex-registration.test.ts` imports the real container barrel (`container/agent-runner/src/providers/index.ts`) and asserts `listProviderNames()` contains `codex`. Same failure surface for the container-side import line.
- The **Dockerfile** `codex-dockerfile.test.ts` reads `container/Dockerfile` and asserts the `ARG CODEX_VERSION=` and `@openai/codex@${CODEX_VERSION}` install lines are present — red if either edit is dropped.
-
-The `@openai/codex` CLI binary is guarded by the Dockerfile structural test plus the container build (`./container/build.sh` fails if the install line is bad), **not** by the registration test — Codex is a CLI binary, not an importable package, so nothing imports it for the registration guard to trip on. To confirm the binary is actually present after the image rebuild, probe it inside a running container with `docker exec <container> codex --version`.
-
-The host-side provider also consumes core APIs (per-session `~/.codex` mount, env passthrough); that typed core-API consumption is guarded by `pnpm run build`.
-
-## Configuration
-
-Codex supports two primary auth paths and one experimental BYO-endpoint path. Pick the one that matches your setup.
-
-### Option A — ChatGPT subscription (recommended for individuals)
-
-On the host (not inside the container), run Codex's OAuth login:
+## Authenticate

 ```bash
-codex login
+pnpm exec tsx setup/index.ts --step provider-auth codex
 ```

-This writes `~/.codex/auth.json` with a subscription token. The host-side Codex provider ([src/providers/codex.ts](../../../src/providers/codex.ts)) copies `auth.json` into a per-session `~/.codex` directory mounted into the container — your host's own Codex CLI is never touched.
+The same walk-through fresh installs get from the setup picker: ChatGPT subscription (browser login or device pairing) or an OpenAI API key, landed in the OneCLI vault. Idempotent — it short-circuits when a matching secret already exists. It finishes with the install check.

-No `.env` variables required for this mode.
+## Use it

-### Option B — API key (recommended for CI or API billing)
+Per group:

-```env
-OPENAI_API_KEY=sk-...
-CODEX_MODEL=gpt-5.4-mini
+```bash
+ncl groups config update --id <group-id> --provider codex
+ncl groups restart --id <group-id>
 ```

-The host forwards both variables into the container. If both subscription (`auth.json`) and `OPENAI_API_KEY` are present, Codex prefers the subscription.
+Switching is an operator action; run it from the host. Memory does NOT carry over automatically: each provider keeps its own store, so run `/migrate-memory` to carry it across. See [docs/provider-migration.md](../../../docs/provider-migration.md) for the carry-over table and rollback.

-### Option C — BYO OpenAI-compatible endpoint (experimental)
+There is no install-wide default provider. Setup's provider picker sets codex on the first agent it creates; creation itself is provider-agnostic (no `--provider` flag — provider is a DB property). Any group switches afterward via `ncl groups config update --provider` as above.

-Codex's built-in `openai` provider honors the `OPENAI_BASE_URL` env var directly. Point it at any OpenAI-compatible endpoint — Groq, Together, self-hosted vLLM, an OpenAI proxy, etc.
+## Troubleshooting

-```env
-OPENAI_API_KEY=...
-OPENAI_BASE_URL=https://api.groq.com/openai/v1
-CODEX_MODEL=llama-3.3-70b-versatile
-```
-
-Codex also ships first-class local-runner flags — `codex --oss --local-provider ollama` or `--local-provider lmstudio` — that auto-detect a local server. To use those inside NanoClaw, set `CODEX_MODEL` to a model your local runner serves and add the corresponding base URL; see the Codex CLI docs for the full `model_provider = oss` configuration.
-
-**Experimental caveat:** tool-calling quality depends on the model and endpoint. Not every OpenAI-compat provider implements the full function-calling spec, and smaller models (< 30B) often struggle with multi-step tool orchestration. Test before committing.
-
-### Per group / per session
-
-Set `"provider": "codex"` in the group's **`container.json`** (`groups/<folder>/container.json`) — the in-container runner reads `provider` from there, not from the DB. The DB columns **`agent_groups.agent_provider`** and **`sessions.agent_provider`** (session overrides group) only drive host-side provider contribution — per-session `~/.codex` mount, `OPENAI_*` / `CODEX_MODEL` env passthrough — and do not propagate into `container.json` at spawn time. Set both, or just edit `container.json`; if they disagree, the runner uses `container.json` and the host-side resolver falls back through session → group → `container.json` → `'claude'`.
-
-`CODEX_MODEL` applies process-wide via `.env`; if you need different models for different groups, set them via `container_config.env` on the group.
-
-Extra MCP servers still come from **`NANOCLAW_MCP_SERVERS`** / `container_config.mcpServers` on the host. The runner merges them into the same `mcpServers` object passed to all providers.
-
-## Operational notes
-
- **Spawn-per-query:** Codex's app-server is spawned fresh per query invocation, matching the OpenCode pattern. No long-lived daemon to keep healthy across sessions.
- **Per-session `~/.codex` isolation:** each group gets its own copy of the host's `auth.json`. The container can rewrite `config.toml` freely on every wake without touching the host's Codex config.
- **Native compaction:** kicks in automatically at 40K cumulative input tokens between turns, via `thread/compact/start`. If compaction fails, the provider logs and continues uncompacted — no fatal error.
- **Approvals:** auto-accepted inside the container (the container is the sandbox; same posture as Claude/OpenCode).
- **Mid-turn input:** Codex turns don't accept mid-turn messages. Follow-up `push()` calls queue and drain between turns, matching the OpenCode pattern. The poll-loop only pushes between turns anyway, so no messages are dropped.
- **Stale thread recovery:** `isSessionInvalid` matches on stale-thread-ID errors (`thread not found`, `unknown thread`, etc.) so a cold-started app-server can recover cleanly when it sees a stored continuation it no longer has.
-
-## Next Steps
-
-The registration and Dockerfile guards in **Build and validate** confirm the wiring. For a live end-to-end check, set `agent_provider = 'codex'` on a test group and send a message after the image rebuild. A successful round-trip looks like:
-
- `init` event with a stable thread ID as continuation
- One or more `activity` / `progress` events during the turn
- `result` event with the model's reply
-
-If the agent hangs or errors, check `~/.codex/auth.json` exists on the host (Option A) or that `OPENAI_API_KEY` is forwarding correctly (Option B) — `docker exec` into a running container and `env | grep -i openai` to confirm. To confirm the CLI binary itself landed in the image, `docker exec <container> codex --version`.
-
-To back this provider out, follow [REMOVE.md](REMOVE.md).
+- **Container dies at boot, channel silent:** `grep 'Container exited non-zero' logs/nanoclaw.error.log` — the `stderrTail` carries the reason (e.g. `Unknown provider: codex. Registered: claude` means the barrels aren't wired in the running build).
+- **In-channel `Error: spawn codex ENOENT` on every message:** the image predates the manifest entry — re-run `./container/build.sh`.
+- **Auth errors mid-conversation:** the vault secret is missing or stale — re-run `pnpm exec tsx setup/index.ts --step provider-auth codex` (subscription re-login updates the vault copy).
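
The inline `node -e` scripts in the CLI manifest steps (add in the install skill, delete in the removal guide) can be read as three small operations: an idempotent add, a filter-based remove, and a one-entry-per-line writer. The following is a minimal typed sketch of that logic, assuming the manifest is a JSON array of objects with `name` and `version` fields as the scripts imply; `CliTool`, `addTool`, `removeTool`, and `formatManifest` are hypothetical names and are not part of the payload.

```typescript
// Hypothetical typed restatement of the inline node scripts; illustrative only.
interface CliTool {
  name: string;
  version: string;
  [extra: string]: unknown; // other fields (e.g. onlyBuilt) pass through untouched
}

// Idempotent add: an entry with the same name is left as-is, matching the
// `if (!tools.some(...))` guard in the install script.
function addTool(tools: CliTool[], tool: CliTool): CliTool[] {
  return tools.some((t) => t.name === tool.name) ? tools : [...tools, tool];
}

// Remove by name; a no-op when the entry is already absent.
function removeTool(tools: CliTool[], name: string): CliTool[] {
  return tools.filter((t) => t.name !== name);
}

// Serialize one entry per line, the same layout the scripts write back.
function formatManifest(tools: CliTool[]): string {
  const fmt = (t: CliTool): string =>
    '  { ' +
    Object.entries(t)
      .map(([k, v]) => JSON.stringify(k) + ': ' + JSON.stringify(v))
      .join(', ') +
    ' }';
  return '[\n' + tools.map(fmt).join(',\n') + '\n]\n';
}
```

Because the add keeps an existing entry untouched, re-running the install does not bump an already-present pin; changing the version would be a separate, deliberate edit.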
@@ -0,0 +1,39 @@
+// Structural guard for the Codex CLI install in container/cli-tools.json.
+//
+// @openai/codex is a CLI *binary* installed from the global-CLI manifest (a
+// json-merge seam), not an importable package, so the barrel-driven
+// registration tests cannot see it. This test reads the real cli-tools.json
+// and asserts the @openai/codex entry is present and pinned to an exact
+// version. It goes red if the manifest entry is dropped or unpins.
+//
+// Runs under bun (same suite as the container registration test):
+//   cd container/agent-runner && bun test src/providers/codex-cli-tools.test.ts
+
+import { existsSync, readFileSync } from 'fs';
+import path from 'path';
+
+import { describe, it, expect } from 'bun:test';
+
+// container/agent-runner/src/providers/ -> container/cli-tools.json
+const MANIFEST = path.join(import.meta.dir, '..', '..', '..', 'cli-tools.json');
+const manifestPresent = existsSync(MANIFEST);
+
+// Guarded read: this runs at module load, before `describe.skipIf` can skip
+// anything, so the bare-branch (no manifest) case must not hit readFileSync.
+const tools: Array<{ name: string; version: string }> = manifestPresent
+  ? JSON.parse(readFileSync(MANIFEST, 'utf8'))
+  : [];
+const codex = tools.find((t) => t.name === '@openai/codex');
+
+// cli-tools.json is a trunk file; on the bare providers branch it isn't present,
+// so skip there. In an installed tree (trunk + this payload) it must carry the
+// pinned @openai/codex entry.
+describe.skipIf(!manifestPresent)('container/cli-tools.json codex CLI install', () => {
+  it('includes the @openai/codex entry', () => {
+    expect(codex).toBeDefined();
+  });
+
+  it('pins it to an exact semver (no latest, no ranges)', () => {
+    expect(codex?.version).toMatch(/^\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.-]+)?$/);
+  });
+});
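
To make the pin check above concrete, here is a small sketch that applies the same regular expression to a few version strings. The pattern is copied from the test; the `isExactPin` helper and the sample strings are illustrative assumptions.

```typescript
// Same pattern as the structural test; everything else here is illustrative.
const EXACT_SEMVER = /^\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.-]+)?$/;

// True only for a fully specified major.minor.patch, optionally followed by a
// single pre-release or build suffix.
function isExactPin(version: string): boolean {
  return EXACT_SEMVER.test(version);
}

// Exact versions pass; dist-tags, ranges, and partial versions do not.
const samples = ['0.138.0', '0.138.0-beta.1', 'latest', '^0.138.0', '~0.138.0', '0.138', '0.138.x'];
for (const v of samples) {
  console.log(`${v} -> ${isExactPin(v) ? 'pinned' : 'rejected'}`);
}
```

Anchoring the pattern with `^` and `$` is what rejects range operators such as `^` or `~`, so an entry cannot quietly unpin without turning the test red.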
@@ -1,30 +0,0 @@
-// Structural guard for the Codex CLI install in container/Dockerfile.
-//
-// @openai/codex is a CLI *binary* installed via the Dockerfile, not an
-// importable package, so the barrel-driven registration tests cannot see it.
-// This test reads the real Dockerfile and asserts the version ARG and the
-// `pnpm install -g` line for @openai/codex are both present. It goes red if
-// either Dockerfile edit is dropped or drifts.
-//
-// Runs under bun (same suite as the container registration test):
-//   cd container/agent-runner && bun test src/providers/codex-dockerfile.test.ts
-
-import { readFileSync } from 'fs';
-import path from 'path';
-
-import { describe, it, expect } from 'bun:test';
-
-// container/agent-runner/src/providers/ -> container/Dockerfile
-const DOCKERFILE = path.join(import.meta.dir, '..', '..', '..', 'Dockerfile');
-
-describe('container/Dockerfile codex CLI install', () => {
-  const dockerfile = readFileSync(DOCKERFILE, 'utf8');
-
-  it('declares the CODEX_VERSION ARG', () => {
-    expect(dockerfile).toMatch(/ARG\s+CODEX_VERSION=/);
-  });
-
-  it('installs the @openai/codex CLI pinned to that ARG', () => {
-    expect(dockerfile).toMatch(/pnpm install -g\s+"@openai\/codex@\$\{CODEX_VERSION\}"/);
-  });
-});
@@ -111,8 +111,8 @@ Run `/manage-channels` to wire the GitHub channel to an agent group, or insert m

 ```sql
 -- Create messaging group (one per repo)
-INSERT INTO messaging_groups (id, channel_type, platform_id, name, is_group, unknown_sender_policy, created_at)
-VALUES ('mg-github-myrepo', 'github', 'github:owner/repo', 'owner/repo', 1, '<policy>', datetime('now'));
+INSERT INTO messaging_groups (id, channel_type, platform_id, instance, name, is_group, unknown_sender_policy, created_at)
+VALUES ('mg-github-myrepo', 'github', 'github:owner/repo', 'github', 'owner/repo', 1, '<policy>', datetime('now'));

 -- Wire to agent group
 INSERT INTO messaging_group_agents (id, messaging_group_id, agent_group_id, trigger_rules, response_scope, session_mode, priority, created_at)
@@ -119,8 +119,8 @@ Run `/manage-channels` to wire the Linear channel to an agent group, or insert m

 ```sql
 -- Create messaging group (one per team)
-INSERT INTO messaging_groups (id, channel_type, platform_id, name, is_group, unknown_sender_policy, created_at)
-VALUES ('mg-linear-eng', 'linear', 'linear:ENG', 'Engineering', 1, 'public', datetime('now'));
+INSERT INTO messaging_groups (id, channel_type, platform_id, instance, name, is_group, unknown_sender_policy, created_at)
+VALUES ('mg-linear-eng', 'linear', 'linear:ENG', 'linear', 'Engineering', 1, 'public', datetime('now'));

 -- Wire to agent group
 INSERT INTO messaging_group_agents (id, messaging_group_id, agent_group_id, trigger_rules, response_scope, session_mode, priority, created_at)
@@ -71,6 +71,8 @@ Parse the `PAIR_TELEGRAM_ISSUED` status block for `CODE` and follow the `REMINDE

 ## 4. Run the init script

+First, pick the agent provider. Read `src/providers/index.ts` and collect the installed providers from its `import './<name>.js';` lines — `claude` is always available as the built-in default. If a non-default provider is installed (e.g. codex), ask the user which one this agent should run on; if only claude is available, skip the question and omit the flag.
+
 ```bash
 npx tsx scripts/init-first-agent.ts \
  --channel "${CHANNEL}" \
@@ -80,7 +82,7 @@ npx tsx scripts/init-first-agent.ts \
  --agent-name "${AGENT_NAME}"
 ```

-Add `--welcome "System instruction: ..."` to override the default welcome prompt.
+Add `--provider <name>` when the user picked a non-default provider (there is no install-wide default — the choice is explicit per group). Add `--welcome "System instruction: ..."` to override the default welcome prompt.

 The script:
 1. Upserts the `users` row and grants `owner` role if no owner exists.
@@ -67,6 +67,8 @@ pnpm exec tsx setup/index.ts --step register -- \

 The `register` step creates the agent group (reusing it if the folder already exists), the messaging group, and the wiring row. `createMessagingGroupAgent` auto-creates the companion `agent_destinations` row so the agent can address the channel by name.

+When creating a NEW agent group on a non-default provider, append `--provider <name>` (e.g. `--provider codex`) — there is no install-wide default; existing groups switch via `ncl groups config update --provider` instead.
+
 For separate agents, also ask for a folder name and optionally a different assistant name.

 ## Add Channel Group
@@ -0,0 +1,50 @@
+---
+name: migrate-memory
+description: Carry an agent group's memory across a provider switch, in either direction (e.g. Claude ↔ Codex, or any provider to/from another). Run after the operator switches a group's provider with `ncl groups config update --provider`. The coding agent reads the source provider's memory store, distills it into the target provider's store, and restarts the group. Triggers on "migrate memory", "carry memory over", "the agent forgot everything after the switch".
+---
+
+# Migrate memory across a provider switch
+
+NanoClaw does not migrate memory at runtime — each provider keeps its own store, and carrying content across is the operator's move, executed by you (the coding agent). This skill is the whole mechanism: read the source store, **infer** what is durable, write it into the target store, restart.
+
+You translate between **store shapes**, not provider names. There are two:
+
+- **Flat file** — `CLAUDE.local.md` at the group workspace root (the Claude provider; may reference satellite files in the workspace).
+- **Scaffold tree** — `memory/` (any provider with `usesMemoryScaffold`, e.g. Codex). `memory/index.md` is the index; durable notes live under `memory/memories/`; `memory/memories/imported-agent-memory.md` is the conventional landing file for imported memory.
+
+A switch only needs migration when it **crosses shapes**. Two providers that both use the scaffold share the same `memory/` tree, so switching between them carries nothing — the memory is already there. The work is always one of: flat → scaffold, or scaffold → flat.
+
+Principles: **copy, never move** (the source store stays intact — it IS the rollback), **idempotent** (re-running must not duplicate), **distill, don't dump** (you are the inference step: keep identity/seed instructions, user preferences, durable facts; drop conversational residue).
+
+## Step 1: Identify the group, both providers, and the direction
+
+- `ncl groups list`, then `ncl groups config get --id <group-id>` — note the current (target) `provider`. Ask the operator which group, and which provider it switched *from*, if either is ambiguous.
+- Map each provider to its store shape (flat `CLAUDE.local.md` vs `memory/` scaffold), then inspect `groups/<folder>/`:
+  - **Same shape on both sides** (e.g. scaffold → scaffold) → the store is shared; nothing to migrate. Tell the operator and stop.
+  - **Flat → scaffold** (source has `CLAUDE.local.md` content, target uses the scaffold) → Step 2.
+  - **Scaffold → flat** (source has a `memory/` tree, target is Claude) → Step 3.
+  - Source missing or empty → nothing to migrate; tell the operator and stop.
+
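The branching above reduces to a lookup on the two store shapes. The sketch below is illustrative only: `migration_step` and the `flat` / `scaffold` labels are hypothetical names, and it assumes the shapes were already determined from each provider. The "source missing or empty" case is a separate check on the group folder and is not modeled here.

```bash
# Hypothetical helper: map (source shape, target shape) to the step to run.
#   $1  source store shape: "flat" or "scaffold"
#   $2  target store shape: "flat" or "scaffold"
migration_step() {
  case "$1:$2" in
    flat:scaffold) echo "step-2" ;;                 # CLAUDE.local.md -> memory/
    scaffold:flat) echo "step-3" ;;                 # memory/ -> CLAUDE.local.md
    flat:flat|scaffold:scaffold) echo "none" ;;     # same shape: nothing to migrate
    *) echo "unknown"; return 1 ;;                  # unrecognized shape label
  esac
}
```

For example, `migration_step flat scaffold` prints `step-2`, and `migration_step scaffold scaffold` prints `none`.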
+## Step 2: flat → scaffold (`CLAUDE.local.md` → `memory/`)
+
+1. Read `groups/<folder>/CLAUDE.local.md` and any workspace files it references.
+2. If `memory/memories/imported-agent-memory.md` already exists, a previous import happened — show the operator what's there and ask before overwriting; integrate only what's new.
+3. Distill the content into `groups/<folder>/memory/memories/imported-agent-memory.md` (create the directories if missing — the container scaffolds the rest of the tree at boot and never clobbers your files). Lead with anything that defines who the agent is or how it must behave; references to satellite files keep their workspace-root paths.
+4. If `memory/index.md` exists, add the following: `- [Imported agent memory](memories/imported-agent-memory.md) — seed instructions and memory carried over from a previous provider. Read it first and treat it as binding; it may define who you are and how to behave. Integrate its facts into your memory as you work; never modify files that belong to another provider's memory system.`
+5. Leave the source store exactly as it is.
+
+## Step 3: scaffold → flat (`memory/` → `CLAUDE.local.md`)
+
+1. Read `memory/index.md`, then the files it points to under `memory/memories/` (and `memory/data/` where durable).
+2. Integrate the durable facts into `groups/<folder>/CLAUDE.local.md` under a clearly marked section (e.g. `## Imported from memory/ (<date>)`), deduplicating against what's already there. If the section already exists, update it instead of appending a second one.
+3. Leave the source store exactly as it is.
+
+## Step 4: Restart and verify
+
+```bash
+ncl groups restart --id <group-id>
+```
+
+Tell the operator to send the group a quick test message that depends on a migrated fact (a preference, a project name). If the agent doesn't know it, re-check that the target file landed in the right group folder.
+
+Note: switching the provider is an operator action — `ncl groups config update --id <group-id> --provider <name>` from the host. See [docs/provider-migration.md](../../../docs/provider-migration.md) for what carries over automatically.
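The idempotency principle matters most for the index entry in Step 2, since a naive append would duplicate the bullet on every re-run. A minimal sketch of one way to guard it, assuming the link target `(memories/imported-agent-memory.md)` is a sufficient marker that the entry is already present. The helper name `add_import_index_entry` is hypothetical.

```bash
# Hypothetical helper for Step 2, item 4. Arguments:
#   $1  group folder, e.g. groups/<folder>
#   $2  the full index bullet to add
add_import_index_entry() {
  index="$1/memory/index.md"
  entry="$2"
  # No index yet: the skill only adds the entry when memory/index.md exists.
  [ -f "$index" ] || return 0
  # Already linked: a re-run must not add a second entry.
  if grep -qF '(memories/imported-agent-memory.md)' "$index"; then
    return 0
  fi
  # Start on a fresh line if the file does not end with a newline.
  if [ -n "$(tail -c 1 "$index")" ]; then
    printf '\n' >> "$index"
  fi
  printf '%s\n' "$entry" >> "$index"
}
```

Calling the helper twice with the same arguments leaves exactly one entry in the index, and it never creates `memory/index.md` on its own.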
@@ -28,6 +28,15 @@ Two phases: **Extract** (build the migration guide) and **Upgrade** (use it). If

 ---

+# Phase 0: Refresh this skill first
+
+The migration process itself evolves, so run its newest version before doing anything else:
+- Ensure the `upstream` remote exists (default `https://github.com/nanocoai/nanoclaw.git`) and fetch: `git fetch upstream --prune`. Detect the upstream branch (`main` or `master`).
+- Refresh this skill from upstream: `git checkout upstream/<branch> -- .claude/skills/migrate-nanoclaw/`
+- Re-read `.claude/skills/migrate-nanoclaw/SKILL.md`. If it changed, **follow the updated version from the top** instead of this one.
+
+This is the only working-tree change expected before the preflight check below; changes limited to `.claude/skills/migrate-nanoclaw/` are this self-refresh — ignore them in the 1.0 clean-tree check and proceed.
+
 # Phase 1: Extract

 ## 1.0 Preflight
@@ -464,6 +473,11 @@ Point the branch at the upgraded state with `git reset --hard <upgrade-commit>`

 Run `pnpm install && pnpm run build` in the main tree to confirm.

+Stamp the upgrade marker (required — without it the startup tripwire stops the host on next start). Only do this after the build above succeeds:
+```bash
+pnpm exec tsx scripts/upgrade-state.ts set "" migrate-nanoclaw
+```
+
 Restart the service. Service labels are per-install — derive them from `setup/lib/install-slug.sh`:
 ```bash
 source setup/lib/install-slug.sh
@@ -60,11 +60,20 @@ Help a user with a customized NanoClaw install safely incorporate upstream chang
 - Default to MERGE (one-pass conflict resolution). Offer REBASE as an explicit option.
 - Keep token usage low: rely on `git status`, `git log`, `git diff`, and open only conflicted files.

+# Step 0a: Refresh this skill first
+The update process itself evolves, so run its newest version before doing anything else:
+- Ensure the `upstream` remote exists (default `https://github.com/nanocoai/nanoclaw.git`) and fetch: `git fetch upstream --prune`. Detect the upstream branch (`main` or `master`).
+- Refresh this skill from upstream: `git checkout upstream/<branch> -- .claude/skills/update-nanoclaw/`
+- Re-read `.claude/skills/update-nanoclaw/SKILL.md`. If it changed, **follow the updated version from the top** instead of this one.
+
+This is the only working-tree change expected before the preflight check; the full update commits it along with everything else.
+
 # Step 0: Preflight (stop early if unsafe)
 Run:
 - `git status --porcelain`
 If output is non-empty:
 - Tell the user to commit or stash first, then stop.
+- Exception: changes limited to `.claude/skills/update-nanoclaw/` are the Step 0a self-refresh — ignore those and proceed.

 Confirm remotes:
 - `git remote -v`
@@ -256,6 +265,16 @@ If any channels/providers are installed AND `upstream/channels` or `upstream/pro

 If no channels/providers are installed, skip silently.

+Proceed to Step 7.9.
+
+# Step 7.9: Stamp the upgrade marker (required)
+After validation has **succeeded**, record that this install reached the new version through the supported path. Without this, the startup tripwire stops the host on its next start.
+
+- `pnpm exec tsx scripts/upgrade-state.ts set "" update-nanoclaw`
+  - The empty version argument stamps the current `package.json` version.
+
+If validation did NOT succeed, do not stamp — leave the tripwire to catch the broken state.
+
 Proceed to Step 8.

 # Step 8: Summary + rollback instructions
@@ -18,12 +18,20 @@ jobs:

      - uses: actions/checkout@v4
        with:
+          fetch-depth: 0
          token: ${{ steps.app-token.outputs.token }}

      - uses: pnpm/action-setup@v4

      - name: Bump patch version
        run: |
+          # Skip the auto-bump when the pushed commits already touched
+          # package.json (e.g. a release PR that set a minor/major version).
+          # Otherwise the bot would patch a deliberate 2.1.0 up to 2.1.1.
+          if git diff --name-only "${{ github.event.before }}" "${{ github.sha }}" | grep -qx 'package.json'; then
+            echo "package.json already changed in this push; skipping auto-bump."
+            exit 0
+          fi
          pnpm version patch --no-git-tag-version
          git add package.json
          git diff --cached --quiet && exit 0
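The skip check in this hunk hinges on `grep -qx`, which matches whole lines only. A small sketch of that filter in isolation, fed the kind of path list `git diff --name-only` prints. The function name `root_package_json_changed` is hypothetical and used only for illustration.

```bash
# Reads changed paths on stdin, one per line, and succeeds only if a line is
# exactly "package.json". Nested paths such as
# container/agent-runner/package.json do not match because -x requires the
# pattern to cover the whole line.
root_package_json_changed() {
  grep -qx 'package.json'
}
```

Two consequences follow from this form of the check. Any edit to the root `package.json` in the push skips the bump, including edits that leave the version untouched. The `.` in the pattern is a regular-expression wildcard, so adding `-F` would make the match strictly literal.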
@@ -39,3 +39,10 @@ groups/*
 .nanoclaw/

 agents-sdk-docs
+.agents
+AGENTS.md
+
+# Internal working docs, never committed
+docs/maintainer-guide.md
+docs/drafts/
+forks.md
@@ -2,6 +2,21 @@

 All notable changes to NanoClaw will be documented in this file.

+## [Unreleased]
+
+- [BREAKING] **`@onecli-sh/sdk` 0.5.0 -> 2.2.1 — requires a OneCLI server with the `/v1` API** (older servers 404 every SDK call). The sanctioned gateway and CLI versions are pinned in `versions.json`; the `onecli` setup step enforces them. **Migration:** [docs/onecli-upgrades.md](docs/onecli-upgrades.md).
+- **New agent provider: Codex (OpenAI) — run `/add-codex`.** Full runtime via `codex app-server` (planning, MCP tools, server-side history, resume). Trunk ships the seams and the skill; the payload installs from the `providers` branch (the skill, the setup picker, or `--step provider-auth codex`). Auth is vault-only — no credential ever enters a container.
+- **Setup can now select, install, and authenticate a non-default agent provider.** A provider registry feeds the setup picker, an installer pulls the provider's payload from its branch, a vault auth walkthrough runs (`--step provider-auth`), and the picked provider is set on the first agent (a DB property) before its first spawn. Default (Claude) installs are unaffected — picking Claude changes nothing.
+- **Provider choice is explicit per group — no install-wide default.** Provider is a DB property set via `ncl groups config update --provider` + restart; creation is provider-agnostic.
+- **Memory migrates via `/migrate-memory`, never at runtime.** Each provider keeps its own store; fresh groups on a surfaces-owning provider see no stale `CLAUDE.*` files. See [docs/provider-migration.md](docs/provider-migration.md).
+- **Per-exchange archiving is provider-owned** — the `onExchangeComplete` hook; the markdown writer ships with the codex payload.
+- **Container boot failures now say why** — the last stderr lines are logged at `warn` on a non-zero exit instead of a silent crash loop.
+- **Slash commands now interrupt an in-flight turn.** A runner-handled command (`/clear`, `/compact`, `/cost`, …) arriving mid-turn aborts the active stream and runs immediately instead of waiting out the turn.
+
+## [2.1.0] - 2026-06-07
+
+- [BREAKING] **Startup now requires an upgrade marker.** The host refuses to boot unless `data/upgrade-state.json` records that this install reached the current version through a sanctioned path (`/setup`, `/update-nanoclaw`, `/migrate-nanoclaw`). After this update completes — and before restarting the service — stamp the marker by running `pnpm exec tsx scripts/upgrade-state.ts set`. If the host has already tripped on restart with "update did not go through the supported path", that same command clears it. See [docs/upgrade-recovery.md](docs/upgrade-recovery.md).
+
 ## [2.0.64] - 2026-05-18

 - **`ncl destinations add` and `remove` through the approval flow now reach the receiver immediately.** Approved destinations weren't being projected into the receiving agent's local session state, so a freshly-added destination silently failed at `send_message` with `unknown destination`, and a removed destination stayed resolvable until the next container restart. Both now take effect the moment the approval executes. Direct (non-approval) calls were unaffected.
@@ -33,7 +33,7 @@ user_dms (user_id, channel_type, messaging_group_id) — cold-DM cache

 agent_groups (workspace, memory, CLAUDE.md, personality, container config)
    ↕ many-to-many via messaging_group_agents (session_mode, trigger_rules, priority)
-messaging_groups (one chat/channel on one platform; unknown_sender_policy)
+messaging_groups (one chat/channel on one platform; instance = adapter-instance name, defaults to channel_type; unknown_sender_policy)

 sessions (agent_group_id + messaging_group_id + thread_id → per-session container)
 ```
@@ -69,8 +69,8 @@ For ad-hoc queries from skills or scripts, use the in-tree wrapper rather than t
 | `src/modules/permissions/access.ts` | `canAccessAgentGroup` — owner / global admin / scoped admin / member resolution against `user_roles` + `agent_group_members` |
 | `src/modules/approvals/primitive.ts` | `pickApprover`, `pickApprovalDelivery`, `requestApproval`, approval-handler registry |
 | `src/command-gate.ts` | Router-side admin command gate — queries `user_roles` directly (no env var, no container-side check) |
-| `src/onecli-approvals.ts` | OneCLI credentialed-action approval bridge |
-| `src/user-dm.ts` | Cold-DM resolution + `user_dms` cache |
+| `src/modules/approvals/onecli-approvals.ts` | OneCLI credentialed-action approval bridge |
+| `src/modules/permissions/user-dm.ts` | Cold-DM resolution + `user_dms` cache |
 | `src/group-init.ts` | Per-agent-group filesystem scaffold (CLAUDE.md, skills, agent-runner-src overlay) |
 | `src/db/container-configs.ts` | CRUD for `container_configs` table (per-group container runtime config) |
 | `src/backfill-container-configs.ts` | Migrates legacy `container.json` files into the DB on startup |
@@ -83,6 +83,7 @@ For ad-hoc queries from skills or scripts, use the in-tree wrapper rather than t
 | `groups/<folder>/` | Per-agent-group filesystem (CLAUDE.md, skills, per-group `agent-runner-src/` overlay) |
 | `scripts/init-first-agent.ts` | Bootstrap the first DM-wired agent (used by `/init-first-agent` skill) |
 | `migrate-v2.sh` + `setup/migrate-v2/` | v1→v2 migration. Standalone script: `bash migrate-v2.sh`. Seeds DB, copies groups/sessions, installs channels, builds container, offers service switchover, then hands off to `/migrate-from-v1` skill for owner setup and CLAUDE.md cleanup. See [docs/migration-dev.md](docs/migration-dev.md). |
+| `nanoclaw.sh --uninstall` + `setup/uninstall/` | Uninstall this copy only (slug-scoped): service, containers + image, `data/`, `logs/`, `groups/`, this copy's OneCLI agents. Confirms per group; `--dry-run` previews, `--yes` skips prompts. Other copies and the shared OneCLI app are untouched. Bypasses bootstrap entirely; `uninstall.sh` is a pointer that execs it. |

 ## Admin CLI (`ncl`)

@@ -151,7 +152,7 @@ Key files: `src/container-restart.ts`, `src/container-runner.ts` (`killContainer

 ## Secrets / Credentials / OneCLI

-API keys, OAuth tokens, and auth credentials are managed by the OneCLI gateway. Secrets are injected into per-agent containers at request time — none are passed in env vars or through chat context. The container agent sees this via the `onecli-gateway` container skill (`container/skills/onecli-gateway/SKILL.md`), which teaches it how the proxy works, how to handle auth errors, and to never ask for raw credentials. Host-side wiring: `src/onecli-approvals.ts`, `ensureAgent()` in `container-runner.ts`. Run `onecli --help`.
+API keys, OAuth tokens, and auth credentials are managed by the OneCLI gateway. Secrets are injected into per-agent containers at request time — none are passed in env vars or through chat context. The container agent sees this via the `onecli-gateway` container skill (`container/skills/onecli-gateway/SKILL.md`), which teaches it how the proxy works, how to handle auth errors, and to never ask for raw credentials. Host-side wiring: `src/modules/approvals/onecli-approvals.ts`, `ensureAgent()` in `container-runner.ts`. Run `onecli --help`.

 ### Secret modes

@@ -192,6 +193,7 @@ Four types of skills. See [CONTRIBUTING.md](CONTRIBUTING.md) for the full taxono
 | `/debug` | Container issues, logs, troubleshooting |
 | `/update-nanoclaw` | Bring upstream updates into a customized install |
 | `/init-onecli` | Install OneCLI Agent Vault and migrate `.env` credentials |
+| `/migrate-memory` | Carry a group's agent memory across a provider switch (operator-run, both directions) |

 ## Contributing

@@ -274,6 +276,10 @@ This project uses pnpm with `minimumReleaseAge: 4320` (3 days) in `pnpm-workspac
 | [docs/build-and-runtime.md](docs/build-and-runtime.md) | Runtime split (Node host + Bun container), lockfiles, image build surface, CI, key invariants |
 | [docs/v1-to-v2-changes.md](docs/v1-to-v2-changes.md) | v1→v2 architecture diff — vocabulary for where v1 things moved |
 | [docs/migration-dev.md](docs/migration-dev.md) | Migration development guide — testing, debugging, dev loop |
+| [docs/provider-migration.md](docs/provider-migration.md) | Switching a live agent group between providers (e.g. Claude → Codex) — what carries over, rollback |
+| [docs/customizing.md](docs/customizing.md) | Short intro to customizing via skills |
+| [docs/skills-model.md](docs/skills-model.md) | The skills model in full: recipes, tests, upgrades, migrations |
+| [docs/skill-guidelines.md](docs/skill-guidelines.md) | Authoritative checklist for writing a skill |

 ## Container Build Cache

@@ -19,6 +19,13 @@

 **Not accepted:** Features, capabilities, compatibility, enhancements. These should be skills.

+## Breaking Changes
+
+Breaking changes are allowed; **silent** ones are not. NanoClaw does not migrate user installs at runtime — the user's coding agent is the migrator, so every breaking change must ship a migration path that agent can execute without a human reverse-engineering the diff:
+
+1. **Every `[BREAKING]` CHANGELOG entry must reference its migration path** — either a skill to run (`Run /<skill-name> to <action>`) or a `docs/` page covering **detect / why / fix / verify / rollback** (see [docs/onecli-upgrades.md](docs/onecli-upgrades.md) for the shape). `/update-nanoclaw` surfaces these entries after every update and walks the user through them.
+2. **If the change moves an external component's sanctioned version** (gateway, pinned CLI binary, …), update its pin in [`versions.json`](versions.json). The changelog stays human-narrative; `versions.json` is the machine-checkable signal — `/update-nanoclaw` diffs it across the update and routes the user to the linked doc for any pin that moved.
+
 ## Skills

 NanoClaw uses [Claude Code skills](https://code.claude.com/docs/en/skills) — markdown files with optional supporting files that teach Claude how to do something. There are four types of skills in NanoClaw, each serving a different purpose.
@@ -29,26 +36,27 @@ Every user should have clean and minimal code that does exactly what they need.

 ### Skill types

-#### 1. Feature skills (branch-based)
+#### 1. Channel and provider skills (registry branches)

-Add capabilities to NanoClaw by merging a git branch. The SKILL.md contains setup instructions; the actual code lives on a `skill/*` branch.
+Add a messaging channel or an agent provider. The SKILL.md contains the install steps; the actual code lives on a long-lived registry branch (`channels` or `providers`) that we keep in sync with `main`.

-**Location:** `.claude/skills/` on `main` (instructions only), code on `skill/*` branch
+**Location:** `.claude/skills/` on `main` (instructions only), code on the `channels` or `providers` branch

-**Examples:** `/add-telegram`, `/add-slack`, `/add-discord`, `/add-gmail`
+**Examples:** `/add-telegram`, `/add-slack`, `/add-discord`, `/add-opencode`

 **How they work:**
 1. User runs `/add-telegram`
-2. Claude follows the SKILL.md: fetches and merges the `skill/telegram` branch
-3. Claude walks through interactive setup (env vars, bot creation, etc.)
+2. Claude follows the SKILL.md: `git fetch origin channels`, then copies each file in with `git show origin/channels:<path> > <path>`. Install is an additive fetch, never a `git merge`.
+3. The adapter's registration test is fetched the same way and run as verification
+4. Claude walks through interactive setup (tokens, bot creation, etc.)

-**Contributing a feature skill:**
+**Contributing a channel or provider skill:**
 1. Fork `nanocoai/nanoclaw` and branch from `main`
-2. Make the code changes (new files, modified source, updated `package.json`, etc.)
-3. Add a SKILL.md in `.claude/skills/<name>/` with setup instructions — step 1 should be merging the branch
-4. Open a PR. We'll create the `skill/<name>` branch from your work
+2. Build the adapter following [docs/skill-guidelines.md](docs/skill-guidelines.md): a self-registering module, one appended barrel import, and a registration test that imports the real barrel
+3. Add a SKILL.md in `.claude/skills/<name>/` with the fetch-and-copy steps, and a REMOVE.md that reverses every change
+4. Open a PR. We'll land the code on the registry branch from your work

-See `/add-telegram` for a good example. See [docs/skills-as-branches.md](docs/skills-as-branches.md) for the full system design.
+See `/add-slack` for a good example. See [docs/skills-model.md](docs/skills-model.md) for why install is a fetch, never a merge.

 #### 2. Utility skills (with code files)

@@ -58,7 +66,7 @@ Standalone tools that ship code files alongside the SKILL.md. The SKILL.md tells

 **Examples:** a self-contained CLI or helper shipped in a `scripts/` subfolder of the skill.

-**Key difference from feature skills:** No branch merge needed. The code is self-contained in the skill directory and gets copied into place during installation.
+**Key difference from channel/provider skills:** the code is self-contained in the skill directory and gets copied into place during installation; nothing is fetched from a registry branch.

 **Guidelines:**
 - Put code in separate files, not inline in the SKILL.md
@@ -93,6 +101,10 @@ Skills that run inside the agent container, not on the host. These teach the con
 - Use `allowed-tools` frontmatter to scope tool permissions
 - Keep them focused — the agent's context window is shared across all container skills

+### Writing a good skill
+
+The authoring bar is [docs/skill-guidelines.md](docs/skill-guidelines.md): mostly adds, minimal reach-ins into existing code, a test for every functional integration point, and a REMOVE.md whenever apply leaves anything behind. [docs/skills-model.md](docs/skills-model.md) explains the model behind it.
+
 ### SKILL.md format

 All skills use the [Claude Code skills standard](https://code.claude.com/docs/en/skills):
@@ -196,11 +196,19 @@ Ask Claude Code. "Why isn't the scheduler running?" "What's in the recent logs?"

 If a step fails, `nanoclaw.sh` hands off to Claude Code to diagnose and resume. If that doesn't resolve it, run `claude`, then `/debug`. If Claude identifies an issue likely to affect other users, open a PR against the relevant setup step or skill.

+**How do I uninstall NanoClaw?**
+
+```bash
+bash nanoclaw.sh --uninstall
+```
+
+Every install is tagged with a per-checkout id, so the uninstaller removes only what belongs to that copy: the background service, containers and image, app data and logs, your agents' files, and this copy's OneCLI vault agents. Shared things — the OneCLI app and your credentials, other NanoClaw copies on the machine — are left alone. It shows exactly what it found and asks for confirmation per group; nothing is deleted until you say yes. Use `--dry-run` to preview without changing anything, or `--yes` to skip the prompts. Your `.env` is backed up before removal. To finish, delete the checkout folder itself.
+
 **What changes will be accepted into the codebase?**

 Only security fixes, bug fixes, and clear improvements will be accepted to the base configuration. That's all.

-Everything else (new capabilities, OS compatibility, hardware support, enhancements) should be contributed as skills on the `channels` or `providers` branch.
+Everything else (new capabilities, OS compatibility, hardware support, enhancements) should be contributed as skills: channel and provider code on the `channels`/`providers` registry branches, everything else as a self-contained skill. See [docs/customizing.md](docs/customizing.md) and [CONTRIBUTING.md](CONTRIBUTING.md).

 This keeps the base system minimal and lets every user customize their installation without inheriting features they don't want.

@@ -16,12 +16,11 @@ FROM node:22-slim
 # CJK fonts add ~200MB. Opt in only if you render Chinese/Japanese/Korean text.
 ARG INSTALL_CJK_FONTS=false

-# Pin CLI versions for reproducibility. Bump deliberately — unpinned installs
-# mean every rebuild silently picks up the latest and can break in lockstep
-# across all users.
-ARG CLAUDE_CODE_VERSION=2.1.154
-ARG AGENT_BROWSER_VERSION=latest
-ARG VERCEL_VERSION=52.2.1
+# Pin versions for reproducibility. Bump deliberately — unpinned installs mean
+# every rebuild silently picks up the latest and can break in lockstep across
+# all users. The global Node CLIs (claude-code, agent-browser, vercel) are
+# pinned in cli-tools.json so a skill can add one with a json-merge; Bun (the
+# runtime) is pinned here because it installs from a different source.
 ARG BUN_VERSION=1.3.12

 # ---- System dependencies -----------------------------------------------------
@@ -99,16 +98,13 @@ ENV PATH="$PNPM_HOME:$PATH"
 ARG PNPM_VERSION=10.33.0
 RUN corepack enable && corepack prepare pnpm@${PNPM_VERSION} --activate

+# Global Node CLIs the agent invokes at runtime live in cli-tools.json so a
+# skill can add one with a json-merge instead of editing this Dockerfile.
+# install-cli-tools.sh installs each via pnpm (pinned), writing the per-tool
+# only-built-dependencies opt-ins it reads from the manifest.
+COPY cli-tools.json install-cli-tools.sh /tmp/
 RUN --mount=type=cache,target=/root/.cache/pnpm \
-    echo "only-built-dependencies[]=agent-browser" > /root/.npmrc && \
-    echo "only-built-dependencies[]=@anthropic-ai/claude-code" >> /root/.npmrc && \
-    pnpm install -g "vercel@${VERCEL_VERSION}"
-
-RUN --mount=type=cache,target=/root/.cache/pnpm \
-    pnpm install -g "agent-browser@${AGENT_BROWSER_VERSION}"
-
-RUN --mount=type=cache,target=/root/.cache/pnpm \
-    pnpm install -g "@anthropic-ai/claude-code@${CLAUDE_CODE_VERSION}"
+    sh /tmp/install-cli-tools.sh /tmp/cli-tools.json

 # ---- ncl CLI wrapper ----------------------------------------------------------
 # Actual script lives in the mounted source at /app/src/cli/ncl.ts.
@@ -5,7 +5,7 @@
    "": {
      "name": "nanoclaw-agent-runner",
      "dependencies": {
-        "@anthropic-ai/claude-agent-sdk": "^0.3.154",
+        "@anthropic-ai/claude-agent-sdk": "^0.3.170",
        "@anthropic-ai/sdk": "^0.100.0",
        "@modelcontextprotocol/sdk": "^1.29.0",
        "cron-parser": "^5.0.0",
@@ -19,23 +19,23 @@
    },
  },
  "packages": {
-    "@anthropic-ai/claude-agent-sdk": ["@anthropic-ai/claude-agent-sdk@0.3.154", "", { "optionalDependencies": { "@anthropic-ai/claude-agent-sdk-darwin-arm64": "0.3.154", "@anthropic-ai/claude-agent-sdk-darwin-x64": "0.3.154", "@anthropic-ai/claude-agent-sdk-linux-arm64": "0.3.154", "@anthropic-ai/claude-agent-sdk-linux-arm64-musl": "0.3.154", "@anthropic-ai/claude-agent-sdk-linux-x64": "0.3.154", "@anthropic-ai/claude-agent-sdk-linux-x64-musl": "0.3.154", "@anthropic-ai/claude-agent-sdk-win32-arm64": "0.3.154", "@anthropic-ai/claude-agent-sdk-win32-x64": "0.3.154" }, "peerDependencies": { "@anthropic-ai/sdk": ">=0.93.0", "@modelcontextprotocol/sdk": "^1.29.0", "zod": "^4.0.0" } }, "sha512-iEn25urI2QrMPFIhId3h7v/7EG5gsmF7ooe+6EvsAosePeLmpVVerp5nXtHnlmBkMinLecurcPA+OddKw76jYw=="],
+    "@anthropic-ai/claude-agent-sdk": ["@anthropic-ai/claude-agent-sdk@0.3.170", "", { "optionalDependencies": { "@anthropic-ai/claude-agent-sdk-darwin-arm64": "0.3.170", "@anthropic-ai/claude-agent-sdk-darwin-x64": "0.3.170", "@anthropic-ai/claude-agent-sdk-linux-arm64": "0.3.170", "@anthropic-ai/claude-agent-sdk-linux-arm64-musl": "0.3.170", "@anthropic-ai/claude-agent-sdk-linux-x64": "0.3.170", "@anthropic-ai/claude-agent-sdk-linux-x64-musl": "0.3.170", "@anthropic-ai/claude-agent-sdk-win32-arm64": "0.3.170", "@anthropic-ai/claude-agent-sdk-win32-x64": "0.3.170" }, "peerDependencies": { "@anthropic-ai/sdk": ">=0.93.0", "@modelcontextprotocol/sdk": "^1.29.0", "zod": "^4.0.0" } }, "sha512-pAvhfk+iTodXZ6RF18Kz7BEUWFjL7EcR3tKuhUNdPpE1NAYCR3mSHGbafi72JsrNwKEDIs7FU31z3fqhwy8QzA=="],

-    "@anthropic-ai/claude-agent-sdk-darwin-arm64": ["@anthropic-ai/claude-agent-sdk-darwin-arm64@0.3.154", "", { "os": "darwin", "cpu": "arm64" }, "sha512-oFW3LD5lYrKAU+AKu27Z8hrzqkrh362qQrwi/i3DxGcud9BXUycsXYjShpDj3D3JZu169UzZuSPhx1Wajmbiwg=="],
+    "@anthropic-ai/claude-agent-sdk-darwin-arm64": ["@anthropic-ai/claude-agent-sdk-darwin-arm64@0.3.170", "", { "os": "darwin", "cpu": "arm64" }, "sha512-rwfgArIa5WI0QPNqFsRBgvtSI0mrtpynUm0oK6+l6/KX4hcgnYGEzciZR1bOeD9/7sSZlTdIgt+T9alKeZmXcg=="],

-    "@anthropic-ai/claude-agent-sdk-darwin-x64": ["@anthropic-ai/claude-agent-sdk-darwin-x64@0.3.154", "", { "os": "darwin", "cpu": "x64" }, "sha512-5BgWEueP+cqoctWjZYhCbyltuaV/N2DmKDXD3/69cKaVmJp8XL9OCzlq/HEirA/+Ssjskx6hDUBaOcpuZ3iwQA=="],
+    "@anthropic-ai/claude-agent-sdk-darwin-x64": ["@anthropic-ai/claude-agent-sdk-darwin-x64@0.3.170", "", { "os": "darwin", "cpu": "x64" }, "sha512-0e58h8UQMtsQxLGIv9r4foxfBFWKZ7NeDtoplLhuD7EwQonehomw1sBXCch77t/IfUS+q5vQ5zv+fOGmap5nLQ=="],

-    "@anthropic-ai/claude-agent-sdk-linux-arm64": ["@anthropic-ai/claude-agent-sdk-linux-arm64@0.3.154", "", { "os": "linux", "cpu": "arm64" }, "sha512-rRkW4SBL3W7zQvKscCIfIGlmoeuTbMV6dXFbPdmpRGvmYZIs79RpzO6xrGBnnhmm+B7znQ9oHAnffi/2FBgJbA=="],
+    "@anthropic-ai/claude-agent-sdk-linux-arm64": ["@anthropic-ai/claude-agent-sdk-linux-arm64@0.3.170", "", { "os": "linux", "cpu": "arm64" }, "sha512-gLbaFqcGppFJQd4DLNV4IXoeahejT/p2/M8bSSvRDbla9GOsBr1AxV5XLRyBn1e7xFGozZIAIQr3+1chp7NJgQ=="],

-    "@anthropic-ai/claude-agent-sdk-linux-arm64-musl": ["@anthropic-ai/claude-agent-sdk-linux-arm64-musl@0.3.154", "", { "os": "linux", "cpu": "arm64" }, "sha512-o2bCQN4Xn3UqCLErC5m4T7u0yYArJYmgFCUFnA6K96DdW2RERvx+gTKXxWuHEBkDO+eMoHLHLxk0u2jGES00Ng=="],
+    "@anthropic-ai/claude-agent-sdk-linux-arm64-musl": ["@anthropic-ai/claude-agent-sdk-linux-arm64-musl@0.3.170", "", { "os": "linux", "cpu": "arm64" }, "sha512-SRYfQcsXlOq+CD/FqkQBTSHbaD++w73GnnO+NUV9adLYrca3kfetRwWT1iguY1cNS0l34dCR3rlzCPq78vg1Jg=="],

-    "@anthropic-ai/claude-agent-sdk-linux-x64": ["@anthropic-ai/claude-agent-sdk-linux-x64@0.3.154", "", { "os": "linux", "cpu": "x64" }, "sha512-GpiFF8Ez6PbM3m0gqtCo/FKM346qyRdP7VhbmJzdnbNKTiiUZ66vDQyEUPZPCG24ZkrG4m96KpRIUwY08rHiNg=="],
+    "@anthropic-ai/claude-agent-sdk-linux-x64": ["@anthropic-ai/claude-agent-sdk-linux-x64@0.3.170", "", { "os": "linux", "cpu": "x64" }, "sha512-Xl/m7TaSC3T5IDBdHrZQ9fCQYyDmPELN34CL+MoyPIf7uSmuZnjE9fUOqDh2Rv26JxWssi1M6X+BBvVuKd6Cpg=="],

-    "@anthropic-ai/claude-agent-sdk-linux-x64-musl": ["@anthropic-ai/claude-agent-sdk-linux-x64-musl@0.3.154", "", { "os": "linux", "cpu": "x64" }, "sha512-zA7S8Lm6O4QBsUpbhiOht8BgiXHOBBFUIo8ZLK6r5wAatK3Q44syWVxICeyCnR6wqfnkf3cugCw27ycS6vVgaA=="],
+    "@anthropic-ai/claude-agent-sdk-linux-x64-musl": ["@anthropic-ai/claude-agent-sdk-linux-x64-musl@0.3.170", "", { "os": "linux", "cpu": "x64" }, "sha512-m4+I0qBEk7cxRKS+pL+eoWXbXTFOAo83fQ0tQvap4z/mDMm06IWJtEPoYTaMBwsp32GJWLkHWKbZSBCHZnp2DQ=="],

-    "@anthropic-ai/claude-agent-sdk-win32-arm64": ["@anthropic-ai/claude-agent-sdk-win32-arm64@0.3.154", "", { "os": "win32", "cpu": "arm64" }, "sha512-cDW1YFbU/PJFlrGXhlAGcbkXt80sEO6WtnH8nN8YHXLn5NWduy2q7o/qC6i8XozgvRGf6t/eMoH7IasGIEDhDw=="],
+    "@anthropic-ai/claude-agent-sdk-win32-arm64": ["@anthropic-ai/claude-agent-sdk-win32-arm64@0.3.170", "", { "os": "win32", "cpu": "arm64" }, "sha512-IG+8isJNNJKbnnhO7m+PGhfVCg+XoQ/MDxGde5eigFI0WsEfitjuWSWwx82bT9ghxI1aa6qNvI+UPgPcZuo5Fg=="],

-    "@anthropic-ai/claude-agent-sdk-win32-x64": ["@anthropic-ai/claude-agent-sdk-win32-x64@0.3.154", "", { "os": "win32", "cpu": "x64" }, "sha512-tSKaIIpL72OPg3WfzZTCIl8OJgcbq4qieu8/fDWjsdeQuari9gQMIuEflFphk9HqNsxpSmDqKi8Sm5mW2V566Q=="],
+    "@anthropic-ai/claude-agent-sdk-win32-x64": ["@anthropic-ai/claude-agent-sdk-win32-x64@0.3.170", "", { "os": "win32", "cpu": "x64" }, "sha512-7cuqSKbHVItPGVwRbd3A0BEJwcNtc7Fhoh6qHN4C6yrmjSrvdYYx3MLvq/VI768/RoG7mAMDxb+j7WfEfoP9BA=="],

    "@anthropic-ai/sdk": ["@anthropic-ai/sdk@0.100.0", "", { "dependencies": { "json-schema-to-ts": "^3.1.1", "standardwebhooks": "^1.0.0" }, "peerDependencies": { "zod": "^3.25.0 || ^4.0.0" }, "optionalPeers": ["zod"], "bin": { "anthropic-ai-sdk": "bin/cli" } }, "sha512-cAm3aXm6qAiHIvHxyIIGd6tVmsD2gDqlc2h0R20ijNUzGgVnIN822bit4mKbF6CkuV7qIrLQIPoAepHEpanrQQ=="],

@@ -9,7 +9,7 @@
    "test": "bun test"
  },
  "dependencies": {
-    "@anthropic-ai/claude-agent-sdk": "^0.3.154",
+    "@anthropic-ai/claude-agent-sdk": "^0.3.170",
    "@anthropic-ai/sdk": "^0.100.0",
    "@modelcontextprotocol/sdk": "^1.29.0",
    "cron-parser": "^5.0.0",
@@ -27,6 +27,7 @@ import { fileURLToPath } from 'url';

 import { loadConfig } from './config.js';
 import { buildSystemPromptAddendum } from './destinations.js';
+import { ensureMemoryScaffold } from './memory-scaffold.js';
 // Providers barrel — each enabled provider self-registers on import.
 // Provider skills append imports to providers/index.ts.
 import './providers/index.js';
@@ -95,6 +96,12 @@ async function main(): Promise<void> {
    effort: config.effort,
  });

+  // Providers that lack native memory opt in via `usesMemoryScaffold`; for them
+  // the runner creates a persistent memory/ tree in its host-backed workspace at
+  // boot (idempotent). Default off — the trunk default (Claude) omits the flag
+  // and keeps its native memory untouched.
+  if (provider.usesMemoryScaffold) ensureMemoryScaffold();
+
  await runPollLoop({
    provider,
    providerName,
@@ -5,6 +5,7 @@ import { getUndeliveredMessages } from './db/messages-out.js';
 import { getPendingMessages } from './db/messages-in.js';
 import { getContinuation, setContinuation } from './db/session-state.js';
 import { MockProvider } from './providers/mock.js';
+import type { ProviderExchange } from './providers/types.js';
 import { runPollLoop } from './poll-loop.js';

 beforeEach(() => {
@@ -304,6 +305,7 @@ async function runPollLoopWithTimeout(provider: MockProvider, signal: AbortSigna
      provider,
      providerName: 'mock',
      cwd: '/tmp',
+      signal,
    }),
    new Promise<void>((_, reject) => {
      signal.addEventListener('abort', () => reject(new Error('aborted')));
@@ -324,6 +326,86 @@ function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
 }

+describe('poll loop — exchange hook (onExchangeComplete)', () => {
+  // A provider that declares the per-exchange hook. The hook call is the
+  // wiring under test — these tests go red if the poll-loop seam is severed.
+  // What the provider DOES with an exchange (e.g. write markdown into
+  // conversations/) ships with the provider, not the runner.
+  class HookedMockProvider extends MockProvider {
+    readonly exchanges: ProviderExchange[] = [];
+    onExchangeComplete(exchange: ProviderExchange): void {
+      this.exchanges.push(exchange);
+    }
+  }
+
+  it('reports each exchange to a provider that declares the hook', async () => {
+    insertMessage('m1', { sender: 'Alice', text: 'please archive this' }, { platformId: 'chan-1', channelType: 'discord' });
+
+    const provider = new HookedMockProvider({}, () => '<message to="discord-test">archived answer</message>');
+    const controller = new AbortController();
+    const loopPromise = runPollLoopWithTimeout(provider, controller.signal, 2000);
+
+    await waitFor(() => provider.exchanges.length > 0, 2000);
+    controller.abort();
+
+    expect(provider.exchanges.length).toBe(1);
+    const exchange = provider.exchanges[0];
+    expect(exchange.prompt).toContain('please archive this');
+    expect(exchange.result).toContain('archived answer');
+    expect(exchange.continuation).toStartWith('mock-session-');
+    expect(exchange.status).toBe('completed');
+
+    await loopPromise.catch(() => {});
+  });
+
+  it('does not report the internal wrapping-retry nudge as a user prompt', async () => {
+    insertMessage('m1', { sender: 'Alice', text: 'wrap this later' }, { platformId: 'chan-1', channelType: 'discord' });
+
+    let calls = 0;
+    const provider = new HookedMockProvider({}, () => {
+      calls += 1;
+      // First result is unwrapped (triggers the retry nudge), second is wrapped.
+      return calls === 1 ? 'unwrapped text' : '<message to="discord-test">wrapped now</message>';
+    });
+    const controller = new AbortController();
+    const loopPromise = runPollLoopWithTimeout(provider, controller.signal, 3000);
+
+    await waitFor(() => provider.exchanges.length >= 2, 3000);
+    controller.abort();
+
+    // Both exchanges attribute themselves to the real user prompt, never the nudge.
+    for (const exchange of provider.exchanges) {
+      expect(exchange.prompt).not.toContain('Your response was not delivered');
+      expect(exchange.prompt).toContain('wrap this later');
+    }
+    expect(provider.exchanges.map((e) => e.status)).toEqual(['undelivered', 'completed']);
+
+    await loopPromise.catch(() => {});
+  });
+
+  it('a throwing hook never breaks delivery', async () => {
+    insertMessage('m1', { sender: 'Alice', text: 'still deliver this' }, { platformId: 'chan-1', channelType: 'discord' });
+
+    class ThrowingHookProvider extends MockProvider {
+      onExchangeComplete(): void {
+        throw new Error('hook exploded');
+      }
+    }
+    const provider = new ThrowingHookProvider({}, () => '<message to="discord-test">delivered anyway</message>');
+    const controller = new AbortController();
+    const loopPromise = runPollLoopWithTimeout(provider, controller.signal, 2000);
+
+    await waitFor(() => getUndeliveredMessages().length > 0, 2000);
+    controller.abort();
+
+    const out = getUndeliveredMessages();
+    expect(out.length).toBe(1);
+    expect(out[0].content).toContain('delivered anyway');
+
+    await loopPromise.catch(() => {});
+  });
+});
+
 describe('poll loop — provider error recovery', () => {
  it('writes error to outbound and continues loop on provider throw', async () => {
    insertMessage('m1', { sender: 'Alice', text: 'trigger error' }, { platformId: 'chan-1', channelType: 'discord' });
@@ -462,3 +544,76 @@ class InvalidSessionProvider {
    };
  }
 }
+
+describe('poll loop — slash command during active query', () => {
+  it('aborts the active query when /clear arrives as a follow-up', async () => {
+    insertMessage('m-active', { sender: 'Alice', text: 'long running request' }, { platformId: 'chan-1', channelType: 'discord' });
+
+    const provider = new BlockingProvider();
+    const controller = new AbortController();
+    const loopPromise = runPollLoopWithTimeout(provider as unknown as MockProvider, controller.signal, 3000);
+
+    await waitFor(() => provider.queries === 1, 2000);
+    insertMessage('m-clear-active', { sender: 'Alice', text: '/clear' }, { platformId: 'chan-1', channelType: 'discord' });
+
+    await waitFor(() => provider.aborts === 1, 2000);
+    await waitFor(
+      () => getUndeliveredMessages().some((msg) => JSON.parse(msg.content).text === 'Session cleared.'),
+      2000,
+    );
+    controller.abort();
+
+    expect(provider.ends).toBe(0);
+    expect(getContinuation('mock')).toBeUndefined();
+    expect(getPendingMessages()).toHaveLength(0);
+
+    await loopPromise.catch(() => {});
+  });
+});
+
+/**
+ * Provider whose query never completes until ended/aborted — for testing how
+ * the loop interrupts an active stream.
+ */
+class BlockingProvider {
+  readonly supportsNativeSlashCommands = false;
+  queries = 0;
+  aborts = 0;
+  ends = 0;
+
+  isSessionInvalid(): boolean {
+    return false;
+  }
+
+  query() {
+    const owner = this;
+    this.queries += 1;
+    let wake: (() => void) | null = null;
+    let ended = false;
+    let aborted = false;
+
+    return {
+      push() {},
+      end: () => {
+        owner.ends += 1;
+        ended = true;
+        wake?.();
+      },
+      abort: () => {
+        owner.aborts += 1;
+        aborted = true;
+        wake?.();
+      },
+      events: (async function* () {
+        yield { type: 'activity' as const };
+        yield { type: 'init' as const, continuation: 'blocking-session' };
+        while (!ended && !aborted) {
+          await new Promise<void>((resolve) => {
+            wake = resolve;
+          });
+          wake = null;
+        }
+      })(),
+    };
+  }
+}
@@ -5,8 +5,11 @@
 * send_message(to="agent-name") since agents and channels share the
 * unified destinations namespace.
 *
- * create_agent is admin-only. Non-admin containers never see this tool
- * (see mcp-tools/index.ts). The host re-checks permission on receive.
+ * create_agent writes central-DB state. The host authorizes it by CLI scope:
+ * trusted owner agent groups (scope 'global') create directly; confined groups
+ * require admin approval (see src/modules/agent-to-agent/create-agent.ts). This
+ * tool just writes the outbound request; authorization is enforced host-side,
+ * not here — the container is untrusted and cannot be relied on to gate itself.
 */
 import { writeMessageOut } from '../db/messages-out.js';
 import { registerTools } from './server.js';
@@ -32,7 +35,7 @@ export const createAgent: McpToolDefinition = {
  tool: {
    name: 'create_agent',
    description:
-      'Create a long-lived companion sub-agent (research assistant, task manager, specialist) — the name becomes your destination for it. Admin-only. Fire-and-forget.',
+      'Create a long-lived companion sub-agent (research assistant, task manager, specialist) — the name becomes your destination for it. May require admin approval before the agent is created. Fire-and-forget.',
    inputSchema: {
      type: 'object' as const,
      properties: {
@@ -13,6 +13,7 @@ import { getCurrentInReplyTo } from '../current-batch.js';
 import { findByName, getAllDestinations } from '../destinations.js';
 import { getMessageIdBySeq, getRoutingBySeq, writeMessageOut } from '../db/messages-out.js';
 import { getSessionRouting } from '../db/session-routing.js';
+import { enqueueFileOut } from '../outbox.js';
 import { registerTools } from './server.js';
 import type { McpToolDefinition } from './types.js';

@@ -156,21 +157,16 @@ export const sendFile: McpToolDefinition = {
    const resolvedPath = path.isAbsolute(filePath) ? filePath : path.resolve('/workspace/agent', filePath);
    if (!fs.existsSync(resolvedPath)) return err(`File not found: ${filePath}`);

-    const id = generateId();
-    const filename = (args.filename as string) || path.basename(resolvedPath);
-
-    const outboxDir = path.join('/workspace/outbox', id);
-    fs.mkdirSync(outboxDir, { recursive: true });
-    fs.copyFileSync(resolvedPath, path.join(outboxDir, filename));
-
-    writeMessageOut({
-      id,
-      in_reply_to: getCurrentInReplyTo(),
-      kind: 'chat',
-      platform_id: routing.platform_id,
-      channel_type: routing.channel_type,
-      thread_id: routing.thread_id,
-      content: JSON.stringify({ text: (args.text as string) || '', files: [filename] }),
+    const { id, filename } = enqueueFileOut({
+      srcPath: resolvedPath,
+      routing: {
+        platform_id: routing.platform_id,
+        channel_type: routing.channel_type,
+        thread_id: routing.thread_id,
+        in_reply_to: getCurrentInReplyTo(),
+      },
+      text: (args.text as string) || '',
+      filename: (args.filename as string) || undefined,
    });

    log(`send_file: ${id} → ${routing.resolvedName} (${filename})`);
@@ -0,0 +1,53 @@
+import { describe, expect, it } from 'bun:test';
+import fs from 'fs';
+import os from 'os';
+import path from 'path';
+
+import { ensureMemoryScaffold } from './memory-scaffold.js';
+
+describe('ensureMemoryScaffold', () => {
+  it('deterministically creates the memory tree', () => {
+    const base = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-mem-'));
+    try {
+      ensureMemoryScaffold(base);
+
+      expect(fs.existsSync(path.join(base, 'memory', 'index.md'))).toBe(true);
+      expect(fs.existsSync(path.join(base, 'memory', 'system', 'definition.md'))).toBe(true);
+      expect(fs.existsSync(path.join(base, 'memory', 'memories'))).toBe(true);
+      expect(fs.existsSync(path.join(base, 'memory', 'data'))).toBe(true);
+    } finally {
+      fs.rmSync(base, { recursive: true, force: true });
+    }
+  });
+
+  it('never touches workspace memory it did not create — CLAUDE.local.md stays untouched', () => {
+    const base = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-mem-'));
+    try {
+      fs.writeFileSync(path.join(base, 'CLAUDE.local.md'), '# group memory\nuser prefers terse replies\n');
+
+      ensureMemoryScaffold(base);
+
+      // Migration between memory stores is the operator's move (/migrate-memory),
+      // never a boot side effect.
+      expect(fs.existsSync(path.join(base, 'memory', 'memories', 'imported-agent-memory.md'))).toBe(false);
+      expect(fs.readFileSync(path.join(base, 'CLAUDE.local.md'), 'utf-8')).toContain('terse replies');
+    } finally {
+      fs.rmSync(base, { recursive: true, force: true });
+    }
+  });
+
+  it("is idempotent and never clobbers the agent's edits", () => {
+    const base = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-mem-'));
+    try {
+      ensureMemoryScaffold(base);
+      const indexFile = path.join(base, 'memory', 'index.md');
+      fs.writeFileSync(indexFile, '# my own index\n');
+
+      ensureMemoryScaffold(base);
+
+      expect(fs.readFileSync(indexFile, 'utf-8')).toBe('# my own index\n');
+    } finally {
+      fs.rmSync(base, { recursive: true, force: true });
+    }
+  });
+});
@@ -0,0 +1,39 @@
+import fs from 'fs';
+import path from 'path';
+import { fileURLToPath } from 'url';
+
+/**
+ * Create the agent's persistent memory scaffold, container-side, at boot.
+ *
+ * The runner owns its own workspace: it writes the memory tree straight into
+ * `/workspace/agent` (the host-backed, RW group dir, so it outlives the
+ * ephemeral container). No host-side step, nothing mounted in.
+ *
+ * The default `definition.md` / `index.md` live as real markdown templates next
+ * to this module (under `memory-templates/`) — not as strings in code — so the
+ * doctrine is editable as markdown and the agent receives an unescaped copy.
+ * They ship in the mounted `/app/src` tree, so no image change is needed.
+ *
+ * Idempotent — only writes what's missing, so the agent's own edits and
+ * accumulated memory are never clobbered on a later wake. Provider-agnostic:
+ * the runner makes no assumption about which harness is running — a provider
+ * opts in via `usesMemoryScaffold`.
+ */
+const TEMPLATES_DIR = path.join(path.dirname(fileURLToPath(import.meta.url)), 'memory-templates');
+
+export function ensureMemoryScaffold(baseDir = '/workspace/agent'): void {
+  const memoryDir = path.join(baseDir, 'memory');
+  const systemDir = path.join(memoryDir, 'system');
+
+  for (const dir of [systemDir, path.join(memoryDir, 'memories'), path.join(memoryDir, 'data')]) {
+    fs.mkdirSync(dir, { recursive: true });
+  }
+
+  copyTemplateIfMissing('definition.md', path.join(systemDir, 'definition.md'));
+  copyTemplateIfMissing('index.md', path.join(memoryDir, 'index.md'));
+}
+
+function copyTemplateIfMissing(template: string, dest: string): void {
+  if (fs.existsSync(dest)) return;
+  fs.copyFileSync(path.join(TEMPLATES_DIR, template), dest);
+}
@@ -0,0 +1,22 @@
+import { describe, expect, it } from 'bun:test';
+import fs from 'fs';
+import path from 'path';
+
+// Wiring guard for the memory-scaffold seam: the boot gate in index.ts
+// (`if (provider.usesMemoryScaffold) ensureMemoryScaffold()`) is the seam's
+// single functional reach-in. The unit tests in memory-scaffold.test.ts drive
+// ensureMemoryScaffold directly and stay green if the gate is deleted — this
+// test goes red. main() can't be driven in-process (it reads
+// /workspace/agent/container.json and enters the poll loop), so the guard is
+// structural: gate + import must both be present in the real entry point.
+describe('memory scaffold boot wiring', () => {
+  const indexSrc = fs.readFileSync(path.join(import.meta.dir, 'index.ts'), 'utf-8');
+
+  it('gates the scaffold on the provider capability in main()', () => {
+    expect(indexSrc).toContain('if (provider.usesMemoryScaffold) ensureMemoryScaffold()');
+  });
+
+  it('imports ensureMemoryScaffold from the seam module', () => {
+    expect(indexSrc).toContain("import { ensureMemoryScaffold } from './memory-scaffold.js'");
+  });
+});
@@ -0,0 +1,23 @@
+# Agent Memory System
+
+This editable file defines how your persistent memory works. It is a starting
+point, not a contract — reorganize it as the work demands. If the user or another
+memory system replaces this definition, follow the replacement.
+
+Start every memory task at `memory/index.md`, then follow the narrowest relevant index.
+Treat indexes as core data: keep them accurate and concise.
+Every folder of durable memory has its own `index.md` describing its contents.
+When an index grows past roughly 20 entries, group related items into subfolders,
+and give each new subfolder its own `index.md` linked from the parent.
+
+Use `memory/memories/` for durable facts, project context, people, decisions, and entity notes.
+Use `memory/data/` for structured reference data, datasets, tables, and reusable records.
+Use entity folders for things that matter: projects, people, places, organizations, decisions.
+
+When the user shares something that should survive future turns, store it in the
+smallest useful file; prefer updating an existing file over creating duplicates.
+Write concise, source-aware notes; include dates when timing matters.
+If a fact is corrected, update the memory and keep only useful history.
+When you add, move, or remove memory, update the nearest index.
+Before answering from memory, read the relevant index or file instead of guessing;
+if memory is missing or uncertain, say so and verify when it matters.
@@ -0,0 +1,5 @@
+# Memory Index
+
+- [Memory system definition](system/definition.md)
+- [Memories](memories/) - durable facts, people, projects, decisions
+- [Data](data/) - structured reference data
@@ -0,0 +1,87 @@
+import { describe, it, expect, beforeEach, afterEach } from 'bun:test';
+import fs from 'fs';
+import os from 'os';
+import path from 'path';
+
+import { initTestSessionDb, closeSessionDb } from './db/connection.js';
+import { getUndeliveredMessages } from './db/messages-out.js';
+import { enqueueFileOut } from './outbox.js';
+
+let outboxDir: string;
+let srcDir: string;
+
+beforeEach(() => {
+  initTestSessionDb();
+  outboxDir = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-outbox-'));
+  srcDir = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-src-'));
+  process.env.NANOCLAW_OUTBOX_DIR = outboxDir;
+});
+
+afterEach(() => {
+  closeSessionDb();
+  delete process.env.NANOCLAW_OUTBOX_DIR;
+  fs.rmSync(outboxDir, { recursive: true, force: true });
+  fs.rmSync(srcDir, { recursive: true, force: true });
+});
+
+function writeSrc(name: string, bytes: string): string {
+  const p = path.join(srcDir, name);
+  fs.writeFileSync(p, bytes);
+  return p;
+}
+
+describe('enqueueFileOut', () => {
+  it('stages the file under the outbox and enqueues a messages_out row with files[]', () => {
+    const src = writeSrc('ig_abc.png', 'PNGDATA');
+
+    const { id, filename } = enqueueFileOut({
+      srcPath: src,
+      routing: { platform_id: 'chan-1', channel_type: 'discord', thread_id: 'thr-9', in_reply_to: 'm1' },
+      text: 'here you go',
+    });
+
+    // Bytes staged at <outbox>/<id>/<filename> for the host to read.
+    const staged = path.join(outboxDir, id, filename);
+    expect(fs.existsSync(staged)).toBe(true);
+    expect(fs.readFileSync(staged, 'utf8')).toBe('PNGDATA');
+
+    // Exactly one outbound row, carrying the file reference + routing.
+    const out = getUndeliveredMessages();
+    expect(out).toHaveLength(1);
+    const row = out[0];
+    expect(row.platform_id).toBe('chan-1');
+    expect(row.channel_type).toBe('discord');
+    expect(row.thread_id).toBe('thr-9');
+    expect(row.in_reply_to).toBe('m1');
+    const content = JSON.parse(row.content);
+    expect(content.files).toEqual(['ig_abc.png']);
+    expect(content.text).toBe('here you go');
+  });
+
+  it('defaults filename to the basename and text to empty', () => {
+    const src = writeSrc('chart.png', 'X');
+
+    const { filename } = enqueueFileOut({
+      srcPath: src,
+      routing: { platform_id: 'C-1', channel_type: 'slack', thread_id: null },
+    });
+
+    expect(filename).toBe('chart.png');
+    const row = getUndeliveredMessages()[0];
+    expect(row.in_reply_to).toBeNull();
+    const content = JSON.parse(row.content);
+    expect(content.text).toBe('');
+    expect(content.files).toEqual(['chart.png']);
+  });
+
+  it('throws when the source file is missing — callers decide how to surface it', () => {
+    expect(() =>
+      enqueueFileOut({
+        srcPath: path.join(srcDir, 'does-not-exist.png'),
+        routing: { platform_id: 'C-1', channel_type: 'slack', thread_id: null },
+      }),
+    ).toThrow();
+    // Nothing enqueued on failure.
+    expect(getUndeliveredMessages()).toHaveLength(0);
+  });
+});
@@ -0,0 +1,68 @@
+/**
+ * File delivery via the outbox.
+ *
+ * A file is delivered in two parts that must stay in lockstep: the bytes are
+ * staged under `/workspace/outbox/<id>/<filename>` (the host reads them from
+ * there after polling), and a `messages_out` row carries `{ files: [name] }`
+ * so the host knows to attach them. This helper owns that contract so the two
+ * callers — the `send_file` MCP tool (model-driven) and the poll-loop's `file`
+ * event consumer (harness-generated images) — can't drift apart.
+ */
+import fs from 'fs';
+import path from 'path';
+
+import { writeMessageOut } from './db/messages-out.js';
+
+/** Where staged files live. Overridable for tests; production is always the mount. */
+function outboxBase(): string {
+  return process.env.NANOCLAW_OUTBOX_DIR ?? '/workspace/outbox';
+}
+
+function generateId(): string {
+  return `msg-${Date.now()}-${Math.random().toString(36).slice(2, 8)}`;
+}
+
+export interface FileOutRouting {
+  platform_id: string;
+  channel_type: string;
+  thread_id: string | null;
+  in_reply_to?: string | null;
+}
+
+export interface EnqueueFileOut {
+  /** Absolute or already-resolved path to the file to deliver. Must exist. */
+  srcPath: string;
+  routing: FileOutRouting;
+  /** Optional accompanying message text. */
+  text?: string;
+  /** Display name; defaults to the basename of `srcPath`. */
+  filename?: string;
+}
+
+/**
+ * Stage a file into the outbox and enqueue its `messages_out` row.
+ *
+ * Throws if `srcPath` cannot be read/copied — callers decide whether that
+ * should surface to the user (the MCP tool validates existence first; the
+ * poll-loop consumer logs and moves on so one bad image can't fail the turn).
+ */
+export function enqueueFileOut(opts: EnqueueFileOut): { id: string; filename: string; seq: number } {
+  const id = generateId();
+  const filename = opts.filename ?? path.basename(opts.srcPath);
+
+  const outboxDir = path.join(outboxBase(), id);
+  fs.mkdirSync(outboxDir, { recursive: true });
+  fs.copyFileSync(opts.srcPath, path.join(outboxDir, filename));
+
+  const seq = writeMessageOut({
+    id,
+    in_reply_to: opts.routing.in_reply_to ?? null,
+    kind: 'chat',
+    platform_id: opts.routing.platform_id,
+    channel_type: opts.routing.channel_type,
+    thread_id: opts.routing.thread_id,
+    content: JSON.stringify({ text: opts.text ?? '', files: [filename] }),
+  });
+
+  return { id, filename, seq };
+}
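
The header comment above describes a two-part contract: bytes staged at `<outbox>/<id>/<filename>`, plus a `messages_out` row whose JSON content carries `files: [name]`. As a rough illustration of the consumer side of that contract, the sketch below shows how a reader could map one row back to its staged paths. It is not part of this change: `OutboundRow` and `stagedFilePaths` are hypothetical names, and the host's real lookup (including where it mounts the outbox) may differ.

```typescript
// Hypothetical consumer-side helper, for illustration only.
// Assumes the row shape implied by enqueueFileOut: `id` names the staging
// directory and `content` is JSON of the form { text, files?: string[] }.
export interface OutboundRow {
  id: string;
  content: string;
}

export function stagedFilePaths(outboxBase: string, row: OutboundRow): string[] {
  const parsed = JSON.parse(row.content) as { files?: unknown };
  // Rows without attachments (plain chat messages) simply yield no paths.
  const files = Array.isArray(parsed.files) ? parsed.files : [];
  return files
    .filter((name): name is string => typeof name === 'string')
    .map((name) => `${outboxBase}/${row.id}/${name}`);
}
```

Because the row stores only the filename and the directory is derived from the row id, both callers of `enqueueFileOut` produce rows that a single lookup like this can resolve, which is the drift the shared helper is meant to prevent.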
@@ -14,7 +14,8 @@ import {
  type RoutingContext,
 } from './formatter.js';
 import { isUploadTraceCommand, uploadTrace } from './upload-trace.js';
-import type { AgentProvider, AgentQuery, ProviderEvent } from './providers/types.js';
+import { enqueueFileOut } from './outbox.js';
+import type { AgentProvider, AgentQuery, ProviderEvent, ProviderExchange } from './providers/types.js';

 const POLL_INTERVAL_MS = 1000;
 const ACTIVE_POLL_INTERVAL_MS = 500;
@@ -63,6 +64,12 @@ export interface PollLoopConfig {
  systemContext?: {
    instructions?: string;
  };
+  /**
+   * Optional stop signal. In production the loop runs until the container
+   * dies; tests pass a signal so an abandoned loop actually exits instead of
+   * polling forever and stealing messages from the next test's DB.
+   */
+  signal?: AbortSignal;
 }

 /**
@@ -107,6 +114,7 @@ export async function runPollLoop(config: PollLoopConfig): Promise<void> {
  let pollCount = 0;
  let isFirstPoll = true;
  while (true) {
+    if (config.signal?.aborted) return;
    // Skip system messages — they're responses for MCP tools (e.g., ask_user_question)
    const messages = getPendingMessages(isFirstPoll).filter((m) => m.kind !== 'system');
    isFirstPoll = false;
@@ -232,7 +240,15 @@ export async function runPollLoop(config: PollLoopConfig): Promise<void> {
    // can stamp it on outbound rows — needed for a2a return-path routing.
    setCurrentInReplyTo(routing.inReplyTo);
    try {
-      const result = await processQuery(query, routing, processingIds, config.providerName);
+      const result = await processQuery(
+        query,
+        routing,
+        processingIds,
+        config.providerName,
+        config.provider.onExchangeComplete?.bind(config.provider),
+        prompt,
+        continuation,
+      );
      if (result.continuation && result.continuation !== continuation) {
        continuation = result.continuation;
        setContinuation(config.providerName, continuation);
@@ -313,10 +329,18 @@ async function processQuery(
  routing: RoutingContext,
  initialBatchIds: string[],
  providerName: string,
+  onExchangeComplete: ((exchange: ProviderExchange) => void) | undefined,
+  initialPrompt: string,
+  initialContinuation: string | undefined,
 ): Promise<QueryResult> {
  let queryContinuation: string | undefined;
  let done = false;
  let unwrappedNudged = false;
+  // Prompt queue for the exchange hook — each result event consumes the
+  // oldest unanswered prompt, except a wrapping-retry result, which answers
+  // the same prompt again. Always maintained, but only read when the provider
+  // implements `onExchangeComplete`.
+  const archivePrompts: string[] = [initialPrompt];

  // Concurrent polling: push follow-ups into the active query as they arrive.
  // We do NOT force-end the stream on silence — keeping the query open avoids
@@ -342,13 +366,16 @@ async function processQuery(
        // resume id (fixed at sdkQuery() time); admin/passthrough commands
        // (/compact, /cost, …) only dispatch when they're the first input
        // of a query — pushed mid-stream they arrive as plain text and
-        // the SDK never runs them. End the stream and leave the rows
-        // pending; the outer loop handles them on next iteration via the
-        // canonical command path + formatMessagesWithCommands.
+        // the SDK never runs them. Abort the active stream and leave the
+        // rows pending; the outer loop handles them on next iteration via
+        // the canonical command path + formatMessagesWithCommands. Abort,
+        // not end: end() lets an in-flight turn run to completion, which
+        // can block the command (e.g. /clear during a long task) for as
+        // long as the turn takes.
        if (pending.some((m) => isRunnerCommand(m))) {
-          log('Pending slash command — ending stream so outer loop can process');
+          log('Pending slash command — aborting active stream so outer loop can process');
          endedForCommand = true;
-          query.end();
+          query.abort();
          return;
        }

@@ -393,6 +420,7 @@ async function processQuery(
        log(`Pushing ${keep.length} follow-up message(s) into active query`);
        unwrappedNudged = false;
        query.push(prompt);
+        archivePrompts.push(prompt);
        markCompleted(keptIds);
      } catch (err) {
        // Without this catch the rejection escapes the void IIFE and Node
@@ -456,7 +484,14 @@ async function processQuery(
        markCompleted(initialBatchIds);
        if (event.text) {
          const { hasUnwrapped } = dispatchResultText(event.text, routing);
-          if (hasUnwrapped && !unwrappedNudged) {
+          const willRetryWrapping = hasUnwrapped && !unwrappedNudged;
+          notifyExchangeComplete(onExchangeComplete, {
+            prompt: archivePrompts[0] ?? initialPrompt,
+            result: event.text,
+            continuation: queryContinuation ?? initialContinuation,
+            status: hasUnwrapped ? 'undelivered' : 'completed',
+          });
+          if (willRetryWrapping) {
            unwrappedNudged = true;
            const destinations = getAllDestinations();
            const names = destinations.map((d) => d.name).join(', ');
@@ -467,9 +502,25 @@ async function processQuery(
                `Please re-send your response with the correct wrapping.</system>`,
            );
          }
+          // The wrapping-retry result answers the SAME user prompt — keep it
+          // queued so the retry archives against it, not the nudge text.
+          if (!willRetryWrapping) archivePrompts.shift();
+        } else {
+          archivePrompts.shift();
        }
+      } else if (event.type === 'file') {
+        deliverHarnessFile(event.path, routing);
      }
    }
+  } catch (err) {
+    const errMsg = err instanceof Error ? err.message : String(err);
+    notifyExchangeComplete(onExchangeComplete, {
+      prompt: archivePrompts[0] ?? initialPrompt,
+      result: `Error: ${errMsg}`,
+      continuation: queryContinuation ?? initialContinuation,
+      status: 'error',
+    });
+    throw err;
  } finally {
    done = true;
    clearInterval(pollHandle);
@@ -478,6 +529,18 @@ async function processQuery(
  return { continuation: queryContinuation };
 }

+function notifyExchangeComplete(
+  hook: ((exchange: ProviderExchange) => void) | undefined,
+  exchange: ProviderExchange,
+): void {
+  if (!hook) return;
+  try {
+    hook(exchange);
+  } catch (err) {
+    log(`onExchangeComplete failed: ${err instanceof Error ? err.message : String(err)}`);
+  }
+}
+
 function handleEvent(event: ProviderEvent, _routing: RoutingContext): void {
  switch (event.type) {
    case 'init':
@@ -497,6 +560,34 @@ function handleEvent(event: ProviderEvent, _routing: RoutingContext): void {
  }
 }

+/**
+ * Deliver a harness-generated file (e.g. a Codex-rendered image) to the
+ * batch's reply destination. The model never sends these itself — its native
+ * client already rendered them — so the loop delivers them via the same outbox
+ * path send_file uses. Best-effort: a missing reply destination or an
+ * unreadable file logs and is skipped rather than failing the whole turn.
+ */
+function deliverHarnessFile(filePath: string, routing: RoutingContext): void {
+  if (!routing.platformId || !routing.channelType) {
+    log(`Dropping harness file ${filePath}: batch has no reply destination`);
+    return;
+  }
+  try {
+    const { filename, seq } = enqueueFileOut({
+      srcPath: filePath,
+      routing: {
+        platform_id: routing.platformId,
+        channel_type: routing.channelType,
+        thread_id: routing.threadId,
+        in_reply_to: routing.inReplyTo,
+      },
+    });
+    log(`Delivered harness file #${seq} → ${routing.channelType}:${routing.platformId} (${filename})`);
+  } catch (err) {
+    log(`Failed to deliver harness file ${filePath}: ${err instanceof Error ? err.message : String(err)}`);
+  }
+}
+
 /**
 * Parse the agent's final text for <message to="name">...</message> blocks
 * and dispatch each one to its resolved destination. Text outside of blocks
@@ -6,6 +6,25 @@ export interface AgentProvider {
   */
  readonly supportsNativeSlashCommands: boolean;

+  /**
+   * Optional. When true, the runner scaffolds a persistent `memory/` tree in the
+   * agent's workspace at boot. Providers with their own native memory (e.g.
+   * Claude's `CLAUDE.local.md`) omit this and get nothing — memory is opt-in per
+   * provider, never gated on a provider name.
+   */
+  readonly usesMemoryScaffold?: boolean;
+
+  /**
+   * Optional. Called by the poll-loop after each completed exchange (a
+   * result, a wrapping retry, or an error). Providers whose harness keeps no
+   * on-disk transcript implement this to persist exchanges themselves (e.g.
+   * markdown into the agent's `conversations/` dir); providers that persist
+   * and archive their own transcript (e.g. the Claude Agent SDK's `.jsonl`)
+   * omit it. Best-effort: the loop catches and logs anything it throws. The
+   * implementation lives with the provider, never in the runner.
+   */
+  onExchangeComplete?(exchange: ProviderExchange): void;
+
  /** Start a new query. Returns a handle for streaming input and output. */
  query(input: QueryInput): AgentQuery;

@@ -31,6 +50,16 @@ export interface AgentProvider {
  maybeRotateContinuation?(continuation: string, cwd: string): string | null;
 }

+/** One prompt/result round-trip, as reported to `onExchangeComplete`. */
+export interface ProviderExchange {
+  /** The user prompt this exchange answers (never an internal retry nudge). */
+  prompt: string;
+  result: string | null;
+  /** Continuation/thread id in effect for the exchange, if any. */
+  continuation?: string;
+  status: 'completed' | 'undelivered' | 'error';
+}
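
A minimal sketch of what a provider-side `onExchangeComplete` could do with a `ProviderExchange`, assuming the provider archives each exchange as a markdown entry. `formatExchange`, the entry layout, and the in-memory `archive` sink are hypothetical stand-ins for illustration, not code from this change; a real provider would presumably append to a file under the agent's `conversations/` dir.

```typescript
// Local copy of the ProviderExchange shape declared in the diff above.
interface ProviderExchange {
  prompt: string;
  result: string | null;
  continuation?: string;
  status: 'completed' | 'undelivered' | 'error';
}

// Hypothetical helper: render one exchange as a markdown entry.
function formatExchange(exchange: ProviderExchange, at: Date): string {
  const lines = [
    `## ${at.toISOString()} (${exchange.status})`,
    '',
    `**Prompt:** ${exchange.prompt}`,
    '',
    `**Result:** ${exchange.result ?? '(no text)'}`,
  ];
  if (exchange.continuation) {
    lines.push('', `_continuation: ${exchange.continuation}_`);
  }
  return lines.join('\n') + '\n';
}

// Hypothetical in-memory sink standing in for a markdown file on disk.
const archive: string[] = [];

// Shape matches the optional hook on AgentProvider; the timestamp is fixed
// here only to keep the sketch deterministic.
function onExchangeComplete(exchange: ProviderExchange): void {
  archive.push(formatExchange(exchange, new Date(0)));
}
```

Because the poll-loop catches and logs anything the hook throws, a sketch like this can stay simple and let write failures propagate.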
+
 /**
 * Options passed to provider constructors. Fields are common to most
 * providers; individual providers may ignore any they don't need.
@@ -99,6 +128,13 @@ export type ProviderEvent =
  | { type: 'result'; text: string | null }
  | { type: 'error'; message: string; retryable: boolean; classification?: string }
  | { type: 'progress'; message: string }
+  /**
+   * A file the harness produced that the model won't deliver itself (e.g.
+   * Codex's built-in image generation renders to its native client, so the
+   * model believes delivery already happened). The poll-loop delivers it to
+   * the batch's reply destination. `path` is absolute inside the container.
+   */
+  | { type: 'file'; path: string }
  /**
   * Liveness signal. Providers MUST yield this on every underlying SDK
   * event (tool call, thinking, partial message, anything) so the
@@ -0,0 +1,5 @@
+[
+  { "name": "vercel", "version": "52.2.1" },
+  { "name": "agent-browser", "version": "0.27.1", "onlyBuilt": true },
+  { "name": "@anthropic-ai/claude-code", "version": "2.1.170", "onlyBuilt": true }
+]
@@ -0,0 +1,61 @@
+import { describe, it, expect } from 'vitest';
+import { readFileSync } from 'node:fs';
+import { fileURLToPath } from 'node:url';
+import { dirname, join } from 'node:path';
+
+// Guards the cli-tools.json seam: the global CLIs the agent invokes at runtime
+// are installed from the manifest (a skill adds one with a json-merge), not
+// hand-edited into the Dockerfile. These go red on a bad merge that drops a
+// baseline tool, or on dewiring the Dockerfile / switching the installer off
+// the pnpm supply-chain path.
+const here = dirname(fileURLToPath(import.meta.url));
+const manifest = JSON.parse(readFileSync(join(here, 'cli-tools.json'), 'utf8')) as Array<{
+  name: string;
+  version: string;
+  onlyBuilt?: boolean;
+}>;
+const dockerfile = readFileSync(join(here, 'Dockerfile'), 'utf8');
+const installer = readFileSync(join(here, 'install-cli-tools.sh'), 'utf8');
+
+describe('cli-tools manifest', () => {
+  it('is a non-empty array of { name, version }', () => {
+    expect(Array.isArray(manifest)).toBe(true);
+    expect(manifest.length).toBeGreaterThan(0);
+    for (const tool of manifest) {
+      expect(typeof tool.name).toBe('string');
+      expect(tool.name.length).toBeGreaterThan(0);
+      expect(typeof tool.version).toBe('string');
+      expect(tool.version.length).toBeGreaterThan(0);
+    }
+  });
+
+  it('has unique tool names (json-merge is keyed on name)', () => {
+    const names = manifest.map((t) => t.name);
+    expect(new Set(names).size).toBe(names.length);
+  });
+
+  it('pins every version to an exact semver (no latest, no ranges — supply-chain policy)', () => {
+    for (const tool of manifest) {
+      expect(tool.version, `${tool.name} must be an exact semver, not "${tool.version}"`).toMatch(
+        /^\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.-]+)?$/,
+      );
+    }
+  });
+
+  it('keeps the baseline CLIs the agent depends on', () => {
+    const names = manifest.map((t) => t.name);
+    for (const required of ['vercel', 'agent-browser', '@anthropic-ai/claude-code']) {
+      expect(names).toContain(required);
+    }
+  });
+
+  it('is wired into the Dockerfile build (COPY manifest + run installer)', () => {
+    expect(dockerfile).toMatch(/COPY cli-tools\.json install-cli-tools\.sh/);
+    expect(dockerfile).toMatch(/install-cli-tools\.sh \/tmp\/cli-tools\.json/);
+  });
+
+  it('installs via pnpm and writes only-built opt-ins (preserves the supply-chain path)', () => {
+    expect(installer).toMatch(/pnpm install -g/);
+    expect(installer).toMatch(/only-built-dependencies\[\]=/);
+  });
+});
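
To make the pinning rule concrete, here is a small sketch that reuses the regex from the test above and applies it to a few version strings. `isPinned` is a hypothetical helper name introduced for illustration only.

```typescript
// Same pattern the manifest test uses: MAJOR.MINOR.PATCH with an optional
// pre-release or build suffix, and nothing else.
const EXACT_SEMVER = /^\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.-]+)?$/;

// Hypothetical helper: true only for an exact version, never a tag or range.
function isPinned(version: string): boolean {
  return EXACT_SEMVER.test(version);
}

// Exact versions pass; dist-tags, ranges, and partial versions do not.
const accepted = ['52.2.1', '1.0.0-beta.1', '1.2.3+build.5'].filter(isPinned);
const rejected = ['latest', '^1.2.3', '~1.2.3', '1.2', '1.2.x'].filter((v) => !isPinned(v));
```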
@@ -0,0 +1,29 @@
+#!/bin/sh
+# Install the global Node CLIs the agent invokes at runtime, from cli-tools.json.
+#
+# A skill adds a tool by appending a { "name", "version" } entry to that
+# manifest (a json-merge) instead of editing the Dockerfile — the reach-in
+# becomes the safest change shape, deterministic and removable.
+#
+# Every tool is installed via `pnpm install -g`, pinned to an exact version, so
+# the pnpm supply-chain policy still applies. Tools with a native postinstall
+# set "onlyBuilt": true to opt in to running build scripts (pnpm skips them by
+# default). Run as root before `USER node`, so /root/.npmrc is the right home.
+set -eu
+
+MANIFEST="${1:-/tmp/cli-tools.json}"
+
+# Write the per-tool only-built-dependencies opt-ins pnpm reads at install time.
+node -e '
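+  // Resolve first so a relative manifest path is not treated as a module name.
+  const tools = require(require("path").resolve(process.argv[1]));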
+  const optIns = tools.filter((t) => t.onlyBuilt).map((t) => "only-built-dependencies[]=" + t.name);
+  require("fs").writeFileSync("/root/.npmrc", optIns.join("\n") + (optIns.length ? "\n" : ""));
+' "$MANIFEST"
+
+# Install every tool, pinned. name@version specs never contain spaces, so the
+# unquoted expansion word-splits cleanly into positional args.
+# shellcheck disable=SC2046
+set -- $(node -e 'require(require("path").resolve(process.argv[1])).forEach((t) => console.log(t.name + "@" + t.version))' "$MANIFEST")
+if [ "$#" -gt 0 ]; then
+  pnpm install -g "$@"
+fi
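
The two transformations the installer performs on the manifest can be sketched in TypeScript as follows. This mirrors the inline `node -e` snippets rather than replacing them; `CliTool`, `npmrcOptIns`, and `installSpecs` are hypothetical names introduced here.

```typescript
// Shape of one cli-tools.json entry, as typed in the manifest test.
interface CliTool {
  name: string;
  version: string;
  onlyBuilt?: boolean;
}

// Hypothetical helper: the .npmrc body with one opt-in line per onlyBuilt tool.
// An empty opt-in list yields an empty file, matching the shell script.
function npmrcOptIns(tools: CliTool[]): string {
  const lines = tools
    .filter((t) => t.onlyBuilt)
    .map((t) => `only-built-dependencies[]=${t.name}`);
  return lines.join('\n') + (lines.length ? '\n' : '');
}

// Hypothetical helper: the pinned name@version specs passed to pnpm.
function installSpecs(tools: CliTool[]): string[] {
  return tools.map((t) => `${t.name}@${t.version}`);
}

// The baseline manifest from this change.
const manifest: CliTool[] = [
  { name: 'vercel', version: '52.2.1' },
  { name: 'agent-browser', version: '0.27.1', onlyBuilt: true },
  { name: '@anthropic-ai/claude-code', version: '2.1.170', onlyBuilt: true },
];

const npmrc = npmrcOptIns(manifest);
const specs = installSpecs(manifest);
```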
@@ -9,6 +9,5 @@ The files in this directory are original design documents and developer referenc
 | [SPEC.md](SPEC.md) | [Architecture](https://docs.nanoclaw.dev/concepts/architecture) |
 | [SECURITY.md](SECURITY.md) | [Security model](https://docs.nanoclaw.dev/concepts/security) |
 | [REQUIREMENTS.md](REQUIREMENTS.md) | [Introduction](https://docs.nanoclaw.dev/introduction) |
-| [skills-as-branches.md](skills-as-branches.md) | [Skills system](https://docs.nanoclaw.dev/integrations/skills-system) |
 | [docker-sandboxes.md](docker-sandboxes.md) | [Docker Sandboxes](https://docs.nanoclaw.dev/advanced/docker-sandboxes) |
 | [APPLE-CONTAINER-NETWORKING.md](APPLE-CONTAINER-NETWORKING.md) | [Container runtime](https://docs.nanoclaw.dev/advanced/container-runtime) |
@@ -83,6 +83,48 @@ Each NanoClaw group gets its own OneCLI agent identity. This allows different cr
 - Any credentials matching blocked patterns
 - `.env` is shadowed with `/dev/null` in the project root mount

+### 6. Egress Lockdown (Forced Proxy)
+
+The `HTTPS_PROXY` env var only redirects *proxy-aware* clients — a tool that
+ignores it (or a raw socket) could reach the internet directly and bypass
+credential injection, approvals, and audit. Egress lockdown closes that hole at
+the network layer.
+
+**How it works:** agents are placed on a Docker `--internal` network
+(`nanoclaw-egress`) that has **no route to the internet**. The OneCLI gateway
+container is attached to that network, aliased as `host.docker.internal`, so the
+injected proxy URL (`…@host.docker.internal:10255`) resolves to the gateway
+*container-to-container*. The gateway is therefore the **only reachable hop** —
+anything else has nowhere to go. The agent is non-root with no `NET_ADMIN`, so
+it cannot undo this. Identical mechanism on macOS and Linux (no host firewall,
+no `host-gateway` route).
+
+- **Self-healing:** the gateway is re-attached to the network at every spawn and
+  on each host-sweep tick, so an out-of-band detach (e.g. `docker compose up` on
+  the OneCLI stack — its compose lives in `~/.onecli`, not this repo) recovers
+  automatically.
+- **Fail-fast:** if lockdown is on but the network can't be created or the
+  gateway can't be attached (e.g. a non-standard gateway container name, or the
+  gateway isn't running), nanoclaw **refuses to spawn the agent** and surfaces a
+  clear error — it never silently falls back to open egress. Fix the cause (or
+  set `NANOCLAW_EGRESS_LOCKDOWN=false`) and retry. The host-sweep re-heal is the
+  exception: a heal failure there is logged but not fatal, since already-running
+  agents stay on the internal net (no leak) until the gateway returns.
+
+**Configuration:**
+
+| Env | Default | Meaning |
+| --- | --- | --- |
+| `NANOCLAW_EGRESS_LOCKDOWN` | `false` | Set `true` to opt in (otherwise the host-gateway path is used). Enabled automatically by `/add-golden-registry`. |
+| `NANOCLAW_EGRESS_NETWORK` | `nanoclaw-egress` | Network name. |
+| `ONECLI_GATEWAY_CONTAINER` | `onecli` | Gateway container to attach. |
+
+**⚠ Behavior when enabled:** with lockdown on, agents have **no direct
+internet** — all traffic must go through OneCLI. Proxy-aware clients (npm, pnpm,
+pip, curl, node/bun with the proxy env) are unaffected. Any workflow that relies
+on a **non-proxy-aware** tool reaching the internet directly will fail by design.
+Lockdown is **off by default**; opt in with `NANOCLAW_EGRESS_LOCKDOWN=true`.
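
As an illustration of the configuration table above, the defaults could be resolved along these lines. This is a sketch under assumptions: `readEgressConfig` and `EgressConfig` are hypothetical names, and treating only the literal string `true` as "on" is a guess at the parsing rule, not something this document specifies.

```typescript
// Hypothetical resolved form of the three env vars in the table.
interface EgressConfig {
  lockdown: boolean;
  network: string;
  gatewayContainer: string;
}

// Hypothetical helper: apply the documented defaults to an env map.
// Assumption: lockdown is enabled only by the exact value "true".
function readEgressConfig(env: Record<string, string | undefined>): EgressConfig {
  return {
    lockdown: env.NANOCLAW_EGRESS_LOCKDOWN === 'true',
    network: env.NANOCLAW_EGRESS_NETWORK || 'nanoclaw-egress',
    gatewayContainer: env.ONECLI_GATEWAY_CONTAINER || 'onecli',
  };
}

// With nothing set, lockdown stays off and the default names apply.
const defaults = readEgressConfig({});
// Opting in changes only the flag; the names keep their defaults.
const optedIn = readEgressConfig({ NANOCLAW_EGRESS_LOCKDOWN: 'true' });
```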
+
 ## Privilege Comparison

 | Capability | Main Group | Non-Main Group |
@@ -668,15 +668,19 @@ CREATE TABLE agent_groups (
 );

 -- Platform groups/channels (WhatsApp group, Slack channel, Discord channel, email thread, etc.)
+-- One row per chat PER ADAPTER INSTANCE. instance defaults to channel_type
+-- (the "default instance"), so single-instance installs never see it.
 CREATE TABLE messaging_groups (
  id                     TEXT PRIMARY KEY,
  channel_type           TEXT NOT NULL,     -- 'whatsapp', 'slack', 'discord', 'telegram', 'email'
  platform_id            TEXT NOT NULL,     -- platform-specific ID (JID, channel ID, etc.)
+  instance               TEXT NOT NULL,     -- adapter-instance name; default = channel_type
  name                   TEXT,
  is_group               INTEGER DEFAULT 0,
  unknown_sender_policy  TEXT NOT NULL DEFAULT 'strict',  -- 'strict' | 'request_approval' | 'public'
  created_at             TEXT NOT NULL,
-  UNIQUE(channel_type, platform_id)
+  denied_at              TEXT,
+  UNIQUE(channel_type, platform_id, instance)
 );

 -- Users (messaging platform identities, namespaced "<channel_type>:<handle>")
@@ -0,0 +1,36 @@
+# Customizing NanoClaw
+
+NanoClaw is made to be forked and changed. The catch with most projects is that once you edit the code, every upstream update turns into a merge fight, and the more you customized, the worse it gets.
+
+NanoClaw avoids that with one simple idea: **every change you make is a skill.**
+
+## The idea in a minute
+
+- A **skill** is a small, self-contained add-on. It brings its own code and knows how to install itself.
+- Your **fork is just a list of skills**, plus one "recipe" that says which skills you have and how they fit together.
+- Because your changes live beside the core instead of tangled into it, **pulling in updates stays easy**.
+
+## What makes it work
+
+A good skill mostly **adds** things: new files, a line appended to an existing file, a dependency. It avoids rewriting existing code in place.
+
+And it ships a test for each spot where it touches the rest of the system. When an update moves something your skill depends on, that test fails and points at the fix, instead of you finding out when things break in production.
+
+## How you actually work
+
+You don't have to think in skills while you're building. **Edit the code directly, get it working, then turn your changes into skills afterward.** A coding agent does the conversion for you, following [skill-guidelines.md](skill-guidelines.md).
+
+The only rule worth remembering: **a change isn't really part of your fork until it's a skill**, because that's the form that survives an upgrade.
+
+## Upgrading
+
+Always upgrade by running `/update-nanoclaw`. **Don't just `git pull`.** The command sets a rollback point, pulls the upstream changes, runs your tests, and walks you through anything that needs fixing, usually a small, local fix in one skill.
+
+## The deal
+
+We keep the core small and stable, and every breaking change ships with its migration. You keep your changes as skills, with tests. Do that, and upgrades won't break you. Changes edited directly into the core are the one thing the model can't protect.
+
+## Go deeper
+
+- **[The skills model in full](skills-model.md)**: how skills, recipes, tests, and upgrades work under the hood.
+- **[Skill guidelines](skill-guidelines.md)**: the authoritative checklist for writing one.
@@ -27,21 +27,24 @@ CREATE TABLE agent_groups (

 ### 1.2 `messaging_groups`

-One row per platform chat (one WhatsApp group, one Slack channel, one 1:1 DM, etc.).
+One row per platform chat (one WhatsApp group, one Slack channel, one 1:1 DM, etc.) per adapter instance.

 ```sql
 CREATE TABLE messaging_groups (
  id                    TEXT PRIMARY KEY,
  channel_type          TEXT NOT NULL,
  platform_id           TEXT NOT NULL,
+  instance              TEXT NOT NULL,
  name                  TEXT,
  is_group              INTEGER DEFAULT 0,
  unknown_sender_policy TEXT NOT NULL DEFAULT 'strict',
  created_at            TEXT NOT NULL,
-  UNIQUE(channel_type, platform_id)
+  denied_at             TEXT,
+  UNIQUE(channel_type, platform_id, instance)
 );
 ```

+- `instance`: adapter-instance name — N adapters of one platform (e.g. three Slack apps in one workspace) each own their rows. The default instance IS the channel type: migration 016 backfills `instance = channel_type` and `createMessagingGroup` stamps the same default, so single-instance installs never see the dimension. Inbound lookups are exact-on-instance (an unknown named instance auto-creates its own row); outbound lookups resolve default-instance-first.
 - `unknown_sender_policy`: `strict` (drop), `request_approval` (ask admin), `public` (allow).
 - **Readers:** `src/router.ts`, `src/delivery.ts`, `src/session-manager.ts`
 - **Writers:** `src/db/messaging-groups.ts`, channel setup flows
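
The two lookup rules described for `instance` can be sketched over an in-memory list of rows. This is an illustration only: `findInbound` and `findOutbound` are hypothetical names, the real code queries the database, and falling back to any other instance when no default-instance row exists is an assumption about what "default-instance-first" means.

```typescript
// Subset of the messaging_groups columns relevant to instance resolution.
interface MessagingGroupRow {
  id: string;
  channel_type: string;
  platform_id: string;
  instance: string;
}

// Inbound: exact match on instance. An omitted instance means the default,
// which is the channel type itself.
function findInbound(
  rows: MessagingGroupRow[],
  channelType: string,
  platformId: string,
  instance?: string,
): MessagingGroupRow | undefined {
  const wanted = instance ?? channelType;
  return rows.find(
    (r) => r.channel_type === channelType && r.platform_id === platformId && r.instance === wanted,
  );
}

// Outbound: prefer the default-instance row; otherwise (assumption) take
// the first row for that chat under any other instance.
function findOutbound(
  rows: MessagingGroupRow[],
  channelType: string,
  platformId: string,
): MessagingGroupRow | undefined {
  const matches = rows.filter((r) => r.channel_type === channelType && r.platform_id === platformId);
  return matches.find((r) => r.instance === channelType) ?? matches[0];
}

const rows: MessagingGroupRow[] = [
  { id: 'a', channel_type: 'slack', platform_id: 'C1', instance: 'slack-ops' },
  { id: 'b', channel_type: 'slack', platform_id: 'C1', instance: 'slack' },
];
```

In this sketch an inbound miss returns `undefined`; per the description above, that is the point where the real system auto-creates a row for the unknown named instance.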
@@ -134,7 +137,7 @@ CREATE TABLE user_dms (
 );
 ```

-Populated lazily by `ensureUserDm()` in `src/user-dm.ts`.
+Populated lazily by `ensureUserDm()` in `src/user-dm.ts`. Cold DMs resolve via the channel's default adapter instance — `PRIMARY KEY (user_id, channel_type)` is per-platform, not per-instance.

 ### 1.8 `sessions`

@@ -53,6 +53,80 @@ Model selection considerations for Apple Silicon:

 The agent uses tool calls extensively (read/write files, shell commands). Models that support tool use reliably work best. Gemma 4 and Qwen 3 Coder both handle structured tool calls well.

+## Enabling Prompt Caching (filter the cache-busting hash)
+
+Out of the box this path is slow — every reply re-reads the whole multi-thousand-token system prompt from scratch, even for a one-word answer. Ollama has a prompt cache that should skip that repeated work, but on this path it never kicks in.
+
+**Cause.** The Claude Agent SDK adds a per-request hash to the front of every prompt — `x-anthropic-billing-header: ...; cch=<hash>;`. It changes on every request, and Ollama's cache only reuses a prompt whose start is unchanged. So that one shifting value at the front makes Ollama treat every prompt as new and re-read all of it. (Ollama ignores the hash itself, so filtering it has no effect on output.)
+
+**Fix.** Run a tiny proxy between the container and Ollama that filters the hash out (pins `cch=<hash>` to a constant). The start of the prompt is now stable, so the cache kicks in and only the new message gets processed. In our setup — a 31B model on Apple Silicon — follow-up replies dropped from ~80s to ~4s; your numbers will vary with model size and hardware. Output is unchanged, since Ollama ignores the value anyway.
+
+Point the agent group's `ANTHROPIC_BASE_URL` at the proxy instead of Ollama directly (everything else from the sections above is unchanged):
+
+```
+ANTHROPIC_BASE_URL=http://host.docker.internal:11999   # the proxy
+# proxy forwards to http://127.0.0.1:11434 (Ollama)
+```
+
+The proxy is about 30 lines of dependency-free Node:
+
+```js
+// ollama-cch-proxy.mjs — normalize the SDK's per-request cch nonce so Ollama's
+// prefix cache survives across turns. Listens on :11999, forwards to Ollama.
+import http from 'node:http';
+
+const TARGET_HOST = process.env.OLLAMA_HOST || '127.0.0.1';
+const TARGET_PORT = Number(process.env.OLLAMA_PORT || 11434);
+const LISTEN_PORT = Number(process.env.PROXY_PORT || 11999);
+
+const server = http.createServer((req, res) => {
+  const chunks = [];
+  req.on('data', (c) => chunks.push(c));
+  req.on('end', () => {
+    let body = Buffer.concat(chunks);
+    if (req.method === 'POST' && body.length) {
+      body = Buffer.from(body.toString('utf8').replace(/cch=[0-9a-f]+;/g, 'cch=00000;'), 'utf8');
+    }
+    const headers = { ...req.headers, host: `${TARGET_HOST}:${TARGET_PORT}`, 'content-length': String(body.length) };
+    const proxyReq = http.request(
+      { host: TARGET_HOST, port: TARGET_PORT, method: req.method, path: req.url, headers },
+      (proxyRes) => {
+        res.writeHead(proxyRes.statusCode || 502, proxyRes.headers);
+        proxyRes.pipe(res);
+      },
+    );
+    proxyReq.on('error', (e) => { res.writeHead(502); res.end(String(e)); });
+    proxyReq.end(body);
+  });
+});
+server.listen(LISTEN_PORT, '0.0.0.0', () => console.log(`cch-proxy :${LISTEN_PORT} -> ${TARGET_HOST}:${TARGET_PORT}`));
+```
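
The normalization step can be checked in isolation. The sketch below applies the same regex the proxy uses to two request bodies that differ only in their `cch` value; the hash values and the trailing prompt text are made up for illustration, and the `...` stands for the header fields this document elides.

```typescript
// Same replacement the proxy applies to POST bodies.
function normalizeCch(body: string): string {
  return body.replace(/cch=[0-9a-f]+;/g, 'cch=00000;');
}

// Two turns with different (made-up) per-request hashes.
const turnOne = 'x-anthropic-billing-header: ...; cch=3fa9c1; system prompt text';
const turnTwo = 'x-anthropic-billing-header: ...; cch=b07e22; system prompt text';

// After normalization the bodies are byte-identical, so the start of the
// prompt no longer changes between requests.
const stable = normalizeCch(turnOne) === normalizeCch(turnTwo);

// Bodies without a cch value pass through untouched.
const untouched = normalizeCch('no hash here');
```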
+
+Run it durably so it survives reboots. On Linux, a systemd user service:
+
+```ini
+# ~/.config/systemd/user/ollama-cch-proxy.service
+[Unit]
+Description=Ollama cch-normalizing proxy for NanoClaw
+After=network-online.target
+
+[Service]
+ExecStart=/usr/bin/node %h/.config/nanoclaw/ollama-cch-proxy.mjs
+Restart=always
+
+[Install]
+WantedBy=default.target
+```
+
+```bash
+systemctl --user enable --now ollama-cch-proxy
+loginctl enable-linger "$USER"   # so it runs without an active login session
+```
+
+On macOS use a `launchd` user agent (`~/Library/LaunchAgents/`) running the same script.
+
+**Scope.** This only affects the Claude-Code-CLI → Ollama path described here. Codex and OpenCode don't use the Claude Agent SDK, so they never emit the `cch` hash and get prompt caching for free.
+
 ## What Changes at the Code Level

 Three files need to support this feature. See `/add-ollama-provider` for the exact changes.
@@ -0,0 +1,83 @@
+# Upgrading the OneCLI gateway
+
+NanoClaw talks to the OneCLI gateway (credential vault + egress proxy) through `@onecli-sh/sdk`. The gateway is an external component with its own release line, so NanoClaw pins the **sanctioned gateway version** in [`versions.json`](../versions.json) under `onecli-gateway`. When an update moves that pin, the gateway must be upgraded — this doc is the migration path. It is written to be handed to a coding agent verbatim: detect → upgrade → verify → rollback.
+
+There is deliberately **no runtime version check, and setup does not migrate the gateway for you**: the gateway is a separate out-of-band component, and the migrator is your coding agent running `/update-nanoclaw` — it diffs `versions.json` across the update and routes you here when the `onecli-gateway` pin moved. (Setup detects a pre-`/v1` gateway and points at this doc, but never upgrades it.) Run the steps below verbatim.
+
+## 1. Detect
+
+Find out what is running and what is required:
+
+```bash
+cat versions.json                                   # the sanctioned pin
+curl -s http://127.0.0.1:10254/api/health           # reports the running gateway version
+curl -s -o /dev/null -w '%{http_code}' http://127.0.0.1:10254/v1/health
+```
+
+If the last command prints `404`, the server predates the `/v1` API that `@onecli-sh/sdk` 2.x requires — every SDK call will fail with 404s that look transient but are permanent. If your gateway is remote, substitute its host for `127.0.0.1` (it's in `.env` as `ONECLI_URL` / `NANOCLAW_ONECLI_API_HOST`).
+
+Why gateways fall behind: the OneCLI installer's docker-compose tracks the `latest` image tag, but Docker does not re-pull a tag that is already present locally, so the server freezes at whatever `latest` meant on install day.
+
+## 2. Upgrade
+
+The gateway runs as a Docker service in `~/.onecli`. Upgrade just that container to the pinned `onecli-gateway` version — vault data lives in named Docker volumes and survives. This upgrades only the gateway; the CLI binary is pinned separately (see below).
+
+**Local gateway (the common case):**
+
+```bash
+cd ~/.onecli && ONECLI_VERSION=<onecli-gateway pin from versions.json> docker compose pull onecli && docker compose up -d
+```
+
+**Remote gateway** — run the same command on the gateway's host (NanoClaw can't reach it over SSH).
+
+## 3. Verify
+
+Host-side health is necessary but **not sufficient**:
+
+```bash
+curl -s http://127.0.0.1:10254/v1/health     # must return {"status":"ok",...}
+```
+
+**Verify the bind interface (container reachability).** Agent containers reach the gateway over the docker bridge (`host.docker.internal` → e.g. `172.17.0.1`), so a server bound only to `127.0.0.1` boots clean host-side while every credentialed call from containers dies at the proxy:
+
+```bash
+docker run --rm --add-host=host.docker.internal:host-gateway \
+  curlimages/curl -s -o /dev/null -w '%{http_code}' http://host.docker.internal:10254/v1/health
+```
+
+This must print `200`. If it can't connect while the host-side check passed, set the bind address in `~/.onecli/.env` to the docker-bridge IP (or `0.0.0.0` on a host with a closed firewall) and `cd ~/.onecli && docker compose up -d`. Symptom if skipped: host log clean, agents fail all API calls.
+
+Finally, restart the NanoClaw service (per-install names — derive with `setup/lib/install-slug.sh`):
+
+```bash
+# macOS
+source setup/lib/install-slug.sh && launchctl kickstart -k gui/$(id -u)/$(launchd_label)
+# Linux
+source setup/lib/install-slug.sh && systemctl --user restart $(systemd_unit)
+```
+
+## 4. Rollback
+
+```bash
+cd ~/.onecli && ONECLI_VERSION=<old-version> docker compose up -d
+```
+
+If the NanoClaw update itself is being rolled back, also pin `@onecli-sh/sdk` back to its previous version in `package.json` and run `pnpm install`. Vault data is unaffected in both directions.
+
+## The CLI binary (`onecli-cli` pin)
+
+The `onecli` host CLI is pinned the same way, under `onecli-cli` in `versions.json`. Setup installs exactly that version by direct release download — it never resolves "latest". When an update moves this pin, replace the binary with the pinned release:
+
+```bash
+onecli --version                                            # detect: what is installed
+V=<onecli-cli pin from versions.json>
+OS=$(uname -s | tr '[:upper:]' '[:lower:]')                 # darwin | linux
+ARCH=$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/')   # amd64 | arm64
+curl -fsSL -o /tmp/onecli.tgz \
+  "https://github.com/onecli/onecli-cli/releases/download/v${V}/onecli_${V}_${OS}_${ARCH}.tar.gz"
+tar -xzf /tmp/onecli.tgz -C /tmp
+install -m 0755 /tmp/onecli "$(command -v onecli || echo ~/.local/bin/onecli)"
+onecli --version                                            # verify: must match versions.json
+```
+
+To roll back, run the same block after reverting `versions.json` (or checking out the previous NanoClaw version). The CLI is stateless — vault data lives in the gateway, so swapping the binary in either direction loses nothing.
@@ -0,0 +1,44 @@
+# Switching an agent group between providers
+
+How an **operator** moves a live agent group from one agent provider to another (e.g. Claude → Codex) and back. Switching is an operator action: it runs from the host via `ncl groups config update --provider` + restart.
+
+NanoClaw's runtime does not migrate anything when you switch. Provider-neutral state simply stays where it is; provider-specific state (memory, in-flight context) stays with its provider, and carrying memory across is a separate, explicit operator step (`/migrate-memory`, executed by your coding agent).
+
+## Preconditions
+
+1. **The target provider is installed** — run its `/add-<provider>` skill and rebuild the container image (`./container/build.sh`). If the provider isn't installed (or the name is a typo), the container fails at boot and the host surfaces its last words in the logs: look for `Container exited non-zero` with a `stderrTail` like `Unknown provider: codexx. Registered: claude, codex`.
+2. **Auth is configured** — each provider documents its own auth in its install skill (for Codex: a ChatGPT-subscription or API-key secret in the OneCLI vault).
+
+## Switching
+
+```bash
+ncl groups config update --id <group-id> --provider codex
+ncl groups restart --id <group-id>
+```
+
+Sessions resolve their provider at container spawn (`sessions.agent_provider` is only set when you've explicitly pinned a session), so existing sessions pick up the new provider on their next wake.
+
+## What carries over automatically
+
+| State | How |
+|-------|-----|
+| Group identity, wiring, members, roles, destinations | Provider-neutral, in the central DB — untouched |
+| Container config (model aside), skills, MCP servers, packages, mounts, cli_scope | Provider-neutral — untouched |
+| Workspace files (`groups/<folder>/` — notes, data files the agent created) | Same workspace, mounted for every provider |
+| Conversation archives (`conversations/`) | Provider-neutral markdown — readable by the new provider |
+| Agent surfaces (system instructions / project docs) | Composed fresh at every spawn from the same sources — nothing to migrate |
+
+## What does NOT carry over
+
+- **Agent memory.** Each provider keeps its own store: Claude's per-group memory is `CLAUDE.local.md` in the workspace; scaffold providers (e.g. Codex) keep a `memory/` tree. Neither is touched by a switch — the old store sits intact, the new provider starts with its own. To carry memory across, run **`/migrate-memory`**: your coding agent reads the source store, distills it into the target store (copy, never move), and restarts the group. Both directions work.
+- **In-flight conversation context.** Continuations are provider-specific (a Claude SDK session, a Codex thread) and stored in separate per-provider slots — the new provider starts a fresh thread. The old slot is kept, not deleted. Recent context is recoverable from `conversations/` archives.
+- **Provider state dirs** (`.claude-shared/`, `.codex-shared/`). Each provider keeps its own; they sit idle while unused and are reused if you switch back.
+
+## Rolling back
+
+```bash
+ncl groups config update --id <group-id> --provider claude
+ncl groups restart --id <group-id>
+```
+
+Rollback is lossless by construction: the per-provider continuation slot means Claude resumes its previous session (subject to normal transcript-rotation age limits), and `CLAUDE.local.md` was never modified by the switch. Memory written **while on the other provider** lives in that provider's store — run `/migrate-memory` again if you want it carried back.
@@ -187,7 +187,7 @@ leaking the token to disk outweighs the debugging value.

 | File | Role |
 |---|---|
-| `nanoclaw.sh` | Top-level wrapper. Phase 1 (bootstrap) and phase 2 (setup:auto) orchestration. Writes bootstrap's raw log + progression entry. |
+| `nanoclaw.sh` | Top-level wrapper. Phase 1 (bootstrap) and phase 2 (setup:auto) orchestration. Writes bootstrap's raw log + progression entry. `--uninstall` bypasses bootstrap entirely — it execs setup:auto directly (the flow lives in `setup/uninstall/`), or prints manual-cleanup guidance and exits 1 when the TS toolchain is missing. |
 | `setup.sh` | Phase 1 bootstrap: Node, pnpm, native-module verify. Emits its own `BOOTSTRAP` status block (historically printed to stdout; now goes to the bootstrap raw log). |
 | `setup/auto.ts` | Phase 2 driver. Orchestrates the clack UI, step execution, user prompts, and writes to all three log levels for every step it spawns. |
 | `setup/logs.ts` | The logging primitives (`logStep`, `logUserInput`, `logComplete`, `stepRawLog`, `initSetupLog`). Single source of truth for level 2/3 formatting and file paths. |
@@ -0,0 +1,168 @@
+# Skill guidelines
+
+The authoritative checklist for writing a NanoClaw skill: the bar that conformance tooling and registry review will hold every skill to. [customizing.md](customizing.md) is the short introduction; [skills-model.md](skills-model.md) explains why the model works this way. This document evolves with the system; when a rule here proves wrong, fix the rule.
+
+---
+
+## Principles
+
+Every customization is an additive **skill**: not an edit buried in core, but a skill that carries its own code and knows how to install and remove itself. Two principles make a skill *maintainable*; everything else in this document follows from them.
+
+### 1. Minimal integration surface
+
+A skill adds files and makes the **smallest possible reach-ins** into existing code. Adding a file or a dependency never breaks on upgrade; reaching into existing code is the only thing that does, so the integration surface *is* the upgrade risk. Keep reach-ins few, tiny, and ideally a single line that *calls* into the skill's own code.
+
+Follows from this:
+
+- **Mostly add.** See the change shapes below, in safety order.
+- **Push logic into skill-owned files** so the core edit is one call, not an inlined block. This shrinks the surface *and* makes the point testable.
+- **Prefer colocated, self-contained edits** over edits in two places.
+- **Use an existing registry or hook when there is one**: appending to a registry is a smaller surface than reaching into code. When none exists, a true code-level edit is fine and first-class. (Whether to *add* a hook because a spot has become a hotspot is the maintainer's call, not the skill's.)
+
+### 2. A test for every functional integration point
+
+Every reach-in with a **functional consequence** gets a test that goes **red if the wiring is deleted or drifts**. That's what protects the fork from upstream changes. The tests are also the verification: there is no separate "verify" step.
+
+Follows from this:
+
+- **Tests target integration with core, not internal correctness.** Unit tests of a skill's own logic, or its behavior against an external service, are the creator's call: fine, just not required.
+- **A direct unit test doesn't count**: calling the skill's own function bypasses the wiring and stays green when the reach-in is deleted. Drive the real entry, or assert the wiring structurally.
+- **Build / typecheck is an always-on leg**: drift (moved imports, renamed fields) is the main enemy and slips past runtime tests.
+- **The test lives where the point runs**: host code uses vitest under `src/`; container code uses `bun:test` under `container/agent-runner/`.
+- **"Functional" is the filter**: weigh a reach-in by what breaks if it's gone. A cosmetic one (raising a log line's level) gets no test.
+
+The two interlock: a minimal surface keeps the integration points few and testable; a test per point keeps the surface safe. *Maintainable = small surface, every functional point guarded.*
+
+---
+
+## Skill anatomy
+
+A skill carries everything it needs:
+
+- **Code**: the files it adds. They live in the skill's own folder, or, for large registry-backed skills like channels and providers, on a registry branch the skill fetches from. Apply copies them in.
+- **Apply**: the steps in `SKILL.md`, written as prose an agent can run. Apply must be safe to re-run: upgrades re-run it, and a skill that half-applies twice is a bug.
+- **Remove**: a separate `REMOVE.md` that reverses *every* change apply made: barrel lines deleted (not commented out), every copied file removed including tests, dependencies uninstalled, Dockerfile edits reverted, env lines removed. **REMOVE.md is required exactly when apply leaves anything behind.** A pure instruction-only skill that copies nothing needs none, and an empty one is noise.
+- **Tests**: files that ship with the skill and are copied into the project's test tree on apply, so they run against the *composed* system.
+- **Recipe entry**: how it composes with the fork's other skills (ordering, dependencies).
+
+---
+
+## Change shapes
+
+In rough order of safety:
+
+- **Add a file**: safest. New code in the skill's own files, or fetched from a registry branch (`git show origin/<branch>:path > path`).
+- **Append to a file**: an import in a barrel, a line in `.env`, an entry at the end of a list.
+- **Edit a value in JSON**: e.g. a `package.json` field.
+- **Add a dependency**, pinned to an exact version.
+- **Insert into existing code (an "integration point")**: the one risky move. Keep it to a line or two that *calls* code living in the skill's own files, never an inlined block of logic. A skill full of these is a smell.
+
+Fetching from a registry branch is **additive, never a merge**. `git fetch origin <branch>` then `git show origin/<branch>:path > path` per file. Never `git merge` a registry branch into an install.
+
+---
+
+## Integration points
+
+The integration point is wherever the skill reaches into existing code. Make it **minimal, colocated, and self-contained**:
+
+- All real logic lives in the skill's own file behind a single entry function; the edit to core is just the call.
+- **Prefer one colocated block** over edits in two places. For an inserted call, a dynamic import at the call site keeps the import and call together and avoids touching the top-of-file import block (itself a merge hotspot):
+
+  ```typescript
+  const { startDashboard } = await import('./dashboard-pusher.js');
+  await startDashboard();
+  ```
+
+  A static import + call is acceptable too; this is a recommendation, not a mandate.
+- Keep any gating (feature flags, env checks) *inside* the skill's function, so the core edit stays a single call.
+- When the reach-in lands inside an entangled function, extract a tiny skill-owned helper so the core touch is one line, like `args.push(...mySkillEnvArgs())`, rather than exporting the whole function or inlining the logic.
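+
+A minimal sketch of the helper shape in the last bullet. The names `mySkillEnvArgs`, `MY_SKILL_ENABLED`, and `MY_SKILL_LOG_LEVEL` are hypothetical, and the value passed is assumed to be a non-secret setting (credentials go through the OneCLI gateway, never through env args):
+
+```typescript
+// Skill-owned file (hypothetical): all gating lives here, not in core.
+// In a real skill this function would be exported from the skill's own module.
+type Env = Record<string, string | undefined>;
+
+function mySkillEnvArgs(env: Env): string[] {
+  // The feature flag is checked inside the skill, so core never branches on it.
+  if (env.MY_SKILL_ENABLED !== '1') return [];
+  // Non-secret setting only (an assumption of this sketch).
+  return ['-e', `MY_SKILL_LOG_LEVEL=${env.MY_SKILL_LOG_LEVEL ?? 'info'}`];
+}
+
+// Core side: the entire reach-in is the single push line.
+const args: string[] = ['run', '--rm'];
+args.push(...mySkillEnvArgs({ MY_SKILL_ENABLED: '1' }));
+```
+
+With the flag unset the helper returns an empty array, so the core line is a no-op and removing the skill means deleting one line.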
+
+---
+
+## Testing
+
+**What the standard requires: integration with the NanoClaw system.**
+
+- **Required:** a test for every functional integration point, and, where an added file consumes core (core APIs, data shapes, registries), a test that exercises that consumption against the real core. That's the leg that catches core drift.
+- **Optional, the creator's call:** unit tests of the skill's own internal logic, or its behavior against an external service. Often good practice; not what defines a maintainable skill, because they don't protect against upstream changes.
+
+### Choosing the test type
+
+For a code-edit integration point, how you test the wiring depends on whether you can invoke the function the edit lives in. **Prefer behavior; fall back to structure.**
+
+- **If the edit lives in an invocable function, test that function's behavior.** Calling it exercises the edit; remove or break the edit and the test goes red. This is the strongest option, and usually available, because a minimal integration point pushes the logic into the skill's own exported function anyway.
+- **If the edit lives in a non-invocable entry point** (e.g. `main()` or boot), **use a structural / AST test.** Use the TypeScript compiler API and assert not just that the symbol exists but its **placement**: awaited, a direct statement of the right function, importing the right module path, correctly ordered. A present-but-misplaced call must go red.
+
+Two more legs apply when relevant:
+
+- **Build / typecheck** always applies: it catches a renamed symbol, a moved module, a bad signature.
+- **A behavior test of how added code consumes core**, required when the added file reaches into core APIs or data at runtime. When the consumption is a *typed* call into a core API (a Chat SDK adapter calling `createChatSdkBridge`), the build leg already guards it and no separate behavior test is required. The behavior-test requirement targets runtime consumption: core DB state, data shapes, registries.
+
+Together these cover deletion, misplacement, drift, and core consumption. Only true runtime-reachability (a call stranded behind a dead branch) needs the heavy option of booting the real entry point, a rare "real run" reserved for critical wiring.
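+
+A structural test for a non-invocable entry point might look like the sketch below. It assumes the `typescript` package is installed and checks only two placement properties (the call is awaited and is a direct statement of the named function); import-path and ordering checks would extend the same walk. The helper name `hasAwaitedCall` and the inline sources are hypothetical; a real test would read the entry file from disk.
+
+```typescript
+import * as ts from 'typescript';
+
+// True only if `fnName` has, as a direct statement of its body,
+// an awaited call to the identifier `callee`.
+function hasAwaitedCall(sourceText: string, fnName: string, callee: string): boolean {
+  const file = ts.createSourceFile('main.ts', sourceText, ts.ScriptTarget.Latest, true);
+  for (const stmt of file.statements) {
+    if (!ts.isFunctionDeclaration(stmt) || stmt.name?.text !== fnName || !stmt.body) continue;
+    for (const inner of stmt.body.statements) {
+      // Only top-level expression statements count; calls nested in `if` are skipped.
+      if (!ts.isExpressionStatement(inner)) continue;
+      const expr = inner.expression;
+      if (!ts.isAwaitExpression(expr)) continue;
+      const call = expr.expression;
+      if (ts.isCallExpression(call) && ts.isIdentifier(call.expression) && call.expression.text === callee) {
+        return true;
+      }
+    }
+  }
+  return false;
+}
+
+// Correctly wired entry point.
+const wired = `
+async function main() {
+  const { startDashboard } = await import('./dashboard-pusher.js');
+  await startDashboard();
+}
+`;
+
+// Present but misplaced: the call is not awaited, so the check must fail.
+const notAwaited = `
+async function main() {
+  const { startDashboard } = await import('./dashboard-pusher.js');
+  startDashboard();
+}
+`;
+```
+
+Asserting `hasAwaitedCall(wired, 'main', 'startDashboard')` and the negation for `notAwaited` gives a test that goes red on deletion and on misplacement, not only on a missing symbol.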
+
+### Registration reach-ins: behavior, not structural
+
+A registry queryable at runtime gets a **behavior** test: import the real barrel, assert the registry contains the entry. A structural parse only proves the *source line* exists. It stays green when the barrel can't evaluate or the package isn't installed, which is exactly when the thing is actually broken. The behavior test goes red on a deleted barrel line, a barrel that won't evaluate, *and* an uninstalled package (the unmocked import throws), so it covers the dependency integration point for free.
+
+Two consequences. First, **don't mock the adapter's package in the shipped test**: that would defeat the dependency check, and the test runs in the composed install where the package is present. Second, the only reason to fall back to a structural parse is an adapter with real import-time side effects (spawns a process, opens a socket, needs creds at load), which is an adapter smell to fix, not a reason to weaken the test. Conformant adapters do all side-effectful work in the factory or `setup()`, never at import.
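+
+The gap between a barrel-driven test and a direct-import test can be modeled in a single file. This is a sketch only: the registry, the provider module, and the barrel are collapsed into hypothetical functions, whereas in a real install each is its own module and the barrel entry is an import line.
+
+```typescript
+// Hypothetical in-file registry standing in for core's provider registry.
+const registry = new Map<string, () => string>();
+
+// Stand-in for evaluating the provider module: it self-registers on load.
+function loadProviderModule(): void {
+  registry.set('codex', () => 'codex-provider');
+}
+
+// Stand-in for the barrel: each entry models one import line.
+function loadBarrel(barrelLines: Array<() => void>): void {
+  for (const line of barrelLines) line();
+}
+
+// Barrel-driven check: load only the barrel, then query the registry.
+function registeredViaBarrel(barrelLines: Array<() => void>): boolean {
+  registry.clear();
+  loadBarrel(barrelLines);
+  return registry.has('codex');
+}
+
+// Direct-import check (the trap): loading the module itself registers it,
+// so the result does not depend on the barrel at all.
+function registeredViaDirectImport(): boolean {
+  registry.clear();
+  loadProviderModule();
+  return registry.has('codex');
+}
+```
+
+With the barrel line deleted (an empty `barrelLines`), `registeredViaBarrel` returns false while `registeredViaDirectImport` still returns true, which is why only the former guards the wiring.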
+
+### Test archetypes
+
+The test matches the kind of integration point:
+
+- **In-process seam with core** (a channel into the router, a pusher into the central DB): drive the real added component against the **real core collaborators** (DB, registry, router), faking only the external edge. The highest-value archetype: it exercises the added file's consumption of core, which is what catches core drift.
+- **Wiring / registration** (a barrel import, a `main()` call, an entry in an `mcpServers` map): behavior test via the registry where queryable (see above); structural / AST test where not.
+- **Config / container probe** (mounts, Dockerfile, a tool installed in the image): run the change where you can. Spin up a container to confirm a mount or binary. Checking that a line exists in a file is the last resort.
+- **Agentic run** (operational, instruction-only skills): run the workflow with a small model and check that it completes.
+- **Patch behavior** (a patch skill that changes core logic): a behavior test of the changed behavior.
+- **Provider (multi-point)**: a non-default agent backend reaches into *two* barrels (host `src/providers/index.ts`; container `container/agent-runner/src/providers/index.ts`), plus Dockerfile edits and a CLI or SDK dependency. Each is a separate way to break, and each needs its own guard. Ship a **barrel-driven registration test per tree** that imports *only* the real barrel and asserts the registry contains the provider. **The trap:** a `*.factory.test.ts` that imports the provider module directly self-registers it and stays green when the barrel line is deleted; that's a unit test, not a registration guard. REMOVE.md must reverse both barrel lines, all copied files in both trees, the dependency, and the Dockerfile edits.
+- **Content / instruction-only** (a reference wiki, a pure workflow): makes no functional reach-in, so it owes no integration test. Conformance is anatomy: idempotent apply, plus REMOVE.md iff apply leaves anything behind.
+
+### Dependencies are integration points
+
+A skill that installs a package has made a reach-in: the code now assumes it's there. Guard it so a missing package goes red, in order of preference:
+
+1. **An unmocked import in a behavior test**: the test imports real code that imports the package, so a missing package throws. Covers presence *and* exercises the real dependency.
+2. **The build leg**: a typed import of a missing module fails typecheck. The fallback when the package genuinely can't be imported in a test (e.g. it binds a port on import). Only works if the validate step runs the build before or alongside the tests, so verify the order.
+3. **A Dockerfile-installed CLI binary** is the case most often left unguarded: it isn't importable, so neither guard above sees it. Use a **structural test** asserting the Dockerfile `ARG <X>_VERSION=` and install line are present, optionally backed by a `<bin> --version` container probe. Pin the version; reject `latest`.
+
+You do *not* need to test the dependency's own API contract; that's optional external-service coverage.
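+
+For the Dockerfile-installed CLI case, the structural check can be sketched as a pure function over the Dockerfile text. The function name `pinnedCliVersion`, the `EXAMPLE_CLI` prefix, and the Dockerfile fragment are hypothetical; the sketch assumes the ARG name is a plain identifier and that the install line consumes it as `${<X>_VERSION}`.
+
+```typescript
+// Returns the pinned version for `ARG <name>_VERSION=`, or null when the ARG
+// is missing, set to `latest`, or never referenced by another line.
+function pinnedCliVersion(dockerfile: string, name: string): string | null {
+  const arg = `${name}_VERSION`;
+  const match = dockerfile.match(new RegExp(`^ARG ${arg}=(\\S+)\\s*$`, 'm'));
+  if (!match || match[1] === 'latest') return null;
+  // The install line must actually consume the ARG.
+  if (!dockerfile.includes('${' + arg + '}')) return null;
+  return match[1];
+}
+
+// Hypothetical Dockerfile fragment; a real test would read the file from disk.
+const dockerfile = [
+  'FROM node:22-slim',
+  'ARG EXAMPLE_CLI_VERSION=1.2.3',
+  'RUN npm install -g example-cli@${EXAMPLE_CLI_VERSION}',
+].join('\n');
+
+const pinned = pinnedCliVersion(dockerfile, 'EXAMPLE_CLI');
+```
+
+A test then asserts that `pinned` is non-null; deleting the ARG, switching it to `latest`, or dropping the install line each turns it red.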
+
+### When there is genuinely nothing to test in-tree
+
+Some skills' only functional integration is a runtime operator action with no source footprint: registering an MCP server through `ncl`, or a mount through the sanctioned query wrapper (until the `ncl` add-mount verb lands). There's no line in the tree whose deletion a test could catch, so a registration test is structurally inapplicable. **State this explicitly in SKILL.md** rather than inventing a hollow test; conformance is then anatomy plus the dependency guard. This is a conformant outcome, valid only when the reach-in has no in-tree representation. (A raw-SQL write into core's schema to achieve the same thing is a smell, not a workaround.)
+
+### Test rules
+
+- **Hermetic at the external edge.** Mock genuinely external services (a fake HTTP server, stubbed creds), never the package under guard (see "Registration reach-ins").
+- **Exercise the real entry, or assert it structurally.** A test that imports the skill's function directly does not test the integration.
+- **Tests travel with the skill** and are copied in on apply; an integration test only means anything against the composed project.
+- **Robustness check.** Apply the skill with a small, cheap model. If a small model fumbles the instructions, they're too vague. Fix the instructions, don't blame the model. (Small models also keep applying skills cheap.)
+
+---
+
+## Anti-patterns
+
+Each with its fix. These are patterns to remove, not to test around: a drift-prone, untestable reach-in is usually a symptom of a bad pattern, not a missing test. Reviewers reject them; the conformance linter will flag them automatically.
+
+1. **A separate VERIFY.md.** Delete it; tests are the verification. Fold any genuinely useful manual smoke check into SKILL.md's next steps.
+2. **REMOVE.md soft-disable** (comments out an import; leaves copied files behind). DELETE the import line and `rm` every file the skill copied.
+3. **REMOVE.md incomplete** (misses env vars, the package uninstall, copied tests). Reverse *every* change; read the env vars from the skill's own credentials section, don't guess.
+4. **Raw SQL against a core DB** (read or write). Use a core helper or an `ncl` verb; the in-tree query wrapper is the sanctioned last resort. Never the `sqlite3` binary.
+5. **Credential threading** (`-e KEY=…` or a stdin secrets payload into the container). OneCLI gateway only; it injects credentials per request.
+6. **Branch-merge install** (`git merge` of a registry branch or any code branch). Install by additive fetch: `git fetch origin <branch>`, then `git show origin/<branch>:path > path` per file. For an update/reapply workflow, re-run each installed skill's additive apply, never merge.
+7. **Diff-against-past framing** ("earlier versions…", "this is now redundant") and **documenting non-steps** ("no X needed"). Write present-tense DO steps only. A skill reads as a standalone artifact with no memory of its own edits.
+8. **Stale reach-in targets** (an edit aimed at code that no longer exists; a reach-in already shipped in trunk). Verify the target exists *before* instructing the edit; reconcile already-in-trunk ones to a no-op. Before appending to an allowlist or list, check how it's consumed; the entry may already be derived from a registry, making the edit dead.
+9. **Hand-maintained duplicate copies** (a mirror directory kept in sync by hand or sed). Generate the mirror from a single canonical source.
+
+---
+
+## Worked examples
+
+In-tree exemplars for the code archetypes. (Two carry known smells, kept deliberately pending architectural fixes; they demonstrate the test shapes, not perfection.)
+
+- `add-dashboard`: in-process seam with core (the pusher against the central DB), plus an AST wiring test for its `main()` call.
+- `add-slack`: Chat SDK channel registration; the template for the whole channel family.
+- `add-deltachat`: native channel registration.
+- `add-atomic-chat-tool`: MCP-tool wiring across both runtimes (container registration and host env-helper call).
+- `add-opencode` / `add-codex`: the provider multi-point archetype, with two barrels, Dockerfile pins, and per-tree registration tests.
@@ -1,677 +0,0 @@
-# Skills as Branches
-
-## Overview
-
-This document covers **feature skills** — skills that add capabilities via git branch merges. This is the most complex skill type and the primary way NanoClaw is extended.
-
-NanoClaw has four types of skills overall. See [CONTRIBUTING.md](../CONTRIBUTING.md) for the full taxonomy:
-
-| Type | Location | How it works |
-|------|----------|-------------|
-| **Feature** (this doc) | `.claude/skills/` + `skill/*` branch | SKILL.md has instructions; code lives on a branch, applied via `git merge` |
-| **Utility** | `.claude/skills/<name>/` with code files | Self-contained tools; code in skill directory, copied into place on install |
-| **Operational** | `.claude/skills/` on `main` | Instruction-only workflows (setup, debug, update) |
-| **Container** | `container/skills/` | Loaded inside agent containers at runtime |
-
---
-
-Feature skills are distributed as git branches on the upstream repository. Applying a skill is a `git merge`. Updating core is a `git merge`. Everything is standard git.
-
-This replaces the previous `skills-engine/` system (three-way file merging, `.nanoclaw/` state, manifest files, replay, backup/restore) with plain git operations and Claude for conflict resolution.
-
-## How It Works
-
-### Repository structure
-
-The upstream repo (`nanocoai/nanoclaw`) maintains:
-
- `main` — core NanoClaw (no skill code)
- `skill/discord` — main + Discord integration
- `skill/telegram` — main + Telegram integration
- `skill/slack` — main + Slack integration
- `skill/gmail` — main + Gmail integration
- etc.
-
-Each skill branch contains all the code changes for that skill: new files, modified source files, updated `package.json` dependencies, `.env.example` additions — everything. No manifest, no structured operations, no separate `add/` and `modify/` directories.
-
-### Skill discovery and installation
-
-Skills are split into two categories:
-
-**Operational skills** (on `main`, always available):
- `/setup`, `/debug`, `/update-nanoclaw`, `/customize`, `/update-skills`
- These are instruction-only SKILL.md files — no code changes, just workflows
- Live in `.claude/skills/` on `main`, immediately available to every user
-
-**Feature skills** (in marketplace, installed on demand):
- `/add-discord`, `/add-telegram`, `/add-slack`, `/add-gmail`, etc.
- Each has a SKILL.md with setup instructions and a corresponding `skill/*` branch with code
- Live in the marketplace repo (`nanocoai/nanoclaw-skills`)
-
-Users never interact with the marketplace directly. The operational skills `/setup` and `/customize` handle plugin installation transparently:
-
-```bash
-# Claude runs this behind the scenes — users don't see it
-claude plugin install nanoclaw-skills@nanoclaw-skills --scope project
-```
-
-Skills are hot-loaded after `claude plugin install` — no restart needed. This means `/setup` can install the marketplace plugin, then immediately run any feature skill, all in one session.
-
-### Selective skill installation
-
-`/setup` asks users what channels they want, then only offers relevant skills:
-
-1. "Which messaging channels do you want to use?" → Discord, Telegram, Slack, WhatsApp
-2. User picks Telegram → Claude installs the plugin and runs `/add-telegram`
-3. After Telegram is set up: "Want to add Agent Swarm support for Telegram?" → offers `/add-telegram-swarm`
-4. "Want to enable community skills?" → installs community marketplace plugins
-
-Dependent skills (e.g., `telegram-swarm` depends on `telegram`) are only offered after their parent is installed. `/customize` follows the same pattern for post-setup additions.
-
-### Marketplace configuration
-
-NanoClaw's `.claude/settings.json` registers the official marketplace:
-
-```json
-{
-  "extraKnownMarketplaces": {
-    "nanoclaw-skills": {
-      "source": {
-        "source": "github",
-        "repo": "nanocoai/nanoclaw-skills"
-      }
-    }
-  }
-}
-```
-
-The marketplace repo uses Claude Code's plugin structure:
-
-```
-nanocoai/nanoclaw-skills/
-  .claude-plugin/
-    marketplace.json              # Plugin catalog
-  plugins/
-    nanoclaw-skills/              # Single plugin bundling all official skills
-      .claude-plugin/
-        plugin.json               # Plugin manifest
-      skills/
-        add-discord/
-          SKILL.md                # Setup instructions; step 1 is "merge the branch"
-        add-telegram/
-          SKILL.md
-        add-slack/
-          SKILL.md
-        ...
-```
-
-Multiple skills are bundled in one plugin — installing `nanoclaw-skills` makes all feature skills available at once. Individual skills don't need separate installation.
-
-Each SKILL.md tells Claude to merge the corresponding skill branch as step 1, then walks through interactive setup (env vars, bot creation, etc.).
-
-### Applying a skill
-
-User runs `/add-discord` (discovered via marketplace). Claude follows the SKILL.md:
-
-1. `git fetch upstream skill/discord`
-2. `git merge upstream/skill/discord`
-3. Interactive setup (create bot, get token, configure env vars, etc.)
-
-Or manually:
-
-```bash
-git fetch upstream skill/discord
-git merge upstream/skill/discord
-```
-
-### Applying multiple skills
-
-```bash
-git merge upstream/skill/discord
-git merge upstream/skill/telegram
-```
-
-Git handles the composition. If both skills modify the same lines, it's a real conflict and Claude resolves it.
-
-### Updating core
-
-```bash
-git fetch upstream main
-git merge upstream/main
-```
-
-Since skill branches are kept merged-forward with main (see CI section), the user's merged-in skill changes and upstream changes have proper common ancestors.
-
-### Checking for skill updates
-
-Users who previously merged a skill branch can check for updates. For each `upstream/skill/*` branch, check whether the branch has commits that aren't in the user's HEAD:
-
-```bash
-git fetch upstream
-for branch in $(git branch -r | grep 'upstream/skill/'); do
-  # Check if user has merged this skill at some point
-  merge_base=$(git merge-base HEAD "$branch" 2>/dev/null) || continue
-  # Check if the skill branch has new commits beyond what the user has
-  if ! git merge-base --is-ancestor "$branch" HEAD 2>/dev/null; then
-    echo "$branch has updates available"
-  fi
-done
-```
-
-This requires no state — it uses git history to determine which skills were previously merged and whether they have new commits.
-
-This logic is available in two ways:
- Built into `/update-nanoclaw` — after merging main, optionally check for skill updates
- Standalone `/update-skills` — check and merge skill updates independently
-
-### Conflict resolution
-
-At any merge step, conflicts may arise. Claude resolves them — reading the conflicted files, understanding the intent of both sides, and producing the correct result. This is what makes the branch approach viable at scale: conflict resolution that previously required human judgment is now automated.
-
-### Skill dependencies
-
-Some skills depend on other skills. E.g., `skill/telegram-swarm` requires `skill/telegram`. Dependent skill branches are branched from their parent skill branch, not from `main`.
-
-This means `skill/telegram-swarm` includes all of telegram's changes plus its own additions. When a user merges `skill/telegram-swarm`, they get both — no need to merge telegram separately.
-
-Dependencies are implicit in git history — `git merge-base --is-ancestor` determines whether one skill branch is an ancestor of another. No separate dependency file is needed.
-
-### Uninstalling a skill
-
-```bash
-# Find the merge commit
-git log --merges --oneline | grep discord
-
-# Revert it
-git revert -m 1 <merge-commit>
-```
-
-This creates a new commit that undoes the skill's changes. Claude can handle the whole flow.
-
-If the user has modified the skill's code since merging (custom changes on top), the revert might conflict — Claude resolves it.
-
-If the user later wants to re-apply the skill, they need to revert the revert first (git treats reverted changes as "already applied and undone"). Claude handles this too.
-
-## CI: Keeping Skill Branches Current
-
-A GitHub Action runs on every push to `main`:
-
-1. List all `skill/*` branches
-2. For each skill branch, merge `main` into it (merge-forward, not rebase)
-3. Run build and tests on the merged result
-4. If tests pass, push the updated skill branch
-5. If a skill fails (conflict, build error, test failure), open a GitHub issue for manual resolution
-
-**Why merge-forward instead of rebase:**
- No force-push — preserves history for users who already merged the skill
- Users can re-merge a skill branch to pick up skill updates (bug fixes, improvements)
- Git has proper common ancestors throughout the merge graph
-
-**Why this scales:** With a few hundred skills and a few commits to main per day, the CI cost is trivial. Haiku is fast and cheap. The approach that wouldn't have been feasible a year or two ago is now practical because Claude can resolve conflicts at scale.
-
-## Installation Flow
-
-### New users (recommended)
-
-1. Fork `nanocoai/nanoclaw` on GitHub (click the Fork button)
-2. Clone your fork:
-   ```bash
-   git clone https://github.com/<you>/nanoclaw.git
-   cd nanoclaw
-   ```
-3. Run Claude Code:
-   ```bash
-   claude
-   ```
-4. Run `/setup` — Claude handles dependencies, authentication, container setup, service configuration, and adds `upstream` remote if not present
-
-Forking is recommended because it gives users a remote to push their customizations to. Clone-only works for trying things out but provides no remote backup.
-
-### Existing users migrating from clone
-
-Users who previously ran `git clone https://github.com/nanocoai/nanoclaw.git` and have local customizations:
-
-1. Fork `nanocoai/nanoclaw` on GitHub
-2. Reroute remotes:
-   ```bash
-   git remote rename origin upstream
-   git remote add origin https://github.com/<you>/nanoclaw.git
-   git push --force origin main
-   ```
-   The `--force` is needed because the fresh fork's main is at upstream's latest, but the user wants their (possibly behind) version. The fork was just created so there's nothing to lose.
-3. From this point, `origin` = their fork, `upstream` = nanocoai/nanoclaw
-
-### Existing users migrating from the old skills engine
-
-Users who previously applied skills via the `skills-engine/` system have skill code in their tree but no merge commits linking to skill branches. Git doesn't know these changes came from a skill, so merging a skill branch on top would conflict or duplicate.
-
-**For new skills going forward:** just merge skill branches as normal. No issue.
-
-**For existing old-engine skills**, two migration paths:
-
-**Option A: Per-skill reapply (keep your fork)**
-1. For each old-engine skill: identify and revert the old changes, then merge the skill branch fresh
-2. Claude assists with identifying what to revert and resolving any conflicts
-3. Custom modifications (non-skill changes) are preserved
-
-**Option B: Fresh start (cleanest)**
-1. Create a new fork from upstream
-2. Merge the skill branches you want
-3. Manually re-apply your custom (non-skill) changes
-4. Claude assists by diffing your old fork against the new one to identify custom changes
-
-In both cases:
- Delete the `.nanoclaw/` directory (no longer needed)
- The `skills-engine/` code will be removed from upstream once all skills are migrated
- `/update-skills` only tracks skills applied via branch merge — old-engine skills won't appear in update checks
-
-## User Workflows
-
-### Custom changes
-
-Users make custom changes directly on their main branch. This is the standard fork workflow — their `main` IS their customized version.
-
-```bash
-# Make changes
-vim src/config.ts
-git commit -am "change trigger word to @Bob"
-git push origin main
-```
-
-Custom changes, skills, and core updates all coexist on their main branch. Git handles the three-way merging at each merge step because it can trace common ancestors through the merge history.
-
-### Applying a skill
-
-Run `/add-discord` in Claude Code (discovered via the marketplace plugin), or manually:
-
-```bash
-git fetch upstream skill/discord
-git merge upstream/skill/discord
-# Follow setup instructions for configuration
-git push origin main
-```
-
-If the user is behind upstream's main when they merge a skill branch, the merge might bring in some core changes too (since skill branches are merged-forward with main). This is generally fine — they get a compatible version of everything.
-
-### Updating core
-
-```bash
-git fetch upstream main
-git merge upstream/main
-git push origin main
-```
-
-This is the same as the existing `/update-nanoclaw` skill's merge path.
-
-### Updating skills
-
-Run `/update-skills` or let `/update-nanoclaw` check after a core update. For each previously-merged skill branch that has new commits, Claude offers to merge the updates.
-
-### Contributing back to upstream
-
-Users who want to submit a PR to upstream:
-
-```bash
-git fetch upstream main
-git checkout -b my-fix upstream/main
-# Make changes
-git push origin my-fix
-# Create PR from my-fix to nanocoai/nanoclaw:main
-```
-
-Standard fork contribution workflow. Their custom changes stay on their main and don't leak into the PR.
-
-## Contributing a Skill
-
-The flow below is for **feature skills** (branch-based). For utility skills (self-contained tools) and container skills, the contributor opens a PR that adds files directly to `.claude/skills/<name>/` or `container/skills/<name>/` — no branch extraction needed. See [CONTRIBUTING.md](../CONTRIBUTING.md) for all skill types.
-
-### Contributor flow (feature skills)
-
-1. Fork `nanocoai/nanoclaw`
-2. Branch from `main`
-3. Make the code changes (new channel file, modified integration points, updated package.json, .env.example additions, etc.)
-4. Open a PR to `main`
-
-The contributor opens a normal PR — they don't need to know about skill branches or marketplace repos. They just make code changes and submit.
-
-### Maintainer flow
-
-When a skill PR is reviewed and approved:
-
-1. Create a `skill/<name>` branch from the PR's commits:
-   ```bash
-   git fetch origin pull/<PR_NUMBER>/head:skill/<name>
-   git push origin skill/<name>
-   ```
-2. Force-push to the contributor's PR branch, replacing it with a single commit that adds the contributor to `CONTRIBUTORS.md` (removing all code changes)
-3. Merge the slimmed PR into `main` (just the contributor addition)
-4. Add the skill's SKILL.md to the marketplace repo (`nanocoai/nanoclaw-skills`)
-
-This way:
- The contributor gets merge credit (their PR is merged)
- They're added to CONTRIBUTORS.md automatically by the maintainer
- The skill branch is created from their work
- `main` stays clean (no skill code)
- The contributor only had to do one thing: open a PR with code changes
-
-**Note:** GitHub PRs from forks have "Allow edits from maintainers" checked by default, so the maintainer can push to the contributor's PR branch.
-
-### Skill SKILL.md
-
-The contributor can optionally provide a SKILL.md (either in the PR or separately). This goes into the marketplace repo and contains:
-
-1. Frontmatter (name, description, triggers)
-2. Step 1: Merge the skill branch
-3. Steps 2-N: Interactive setup (create bot, get token, configure env vars, verify)
-
-If the contributor doesn't provide a SKILL.md, the maintainer writes one based on the PR.
-
-## Community Marketplaces
-
-Anyone can maintain their own fork with skill branches and their own marketplace repo. This enables a community-driven skill ecosystem without requiring write access to the upstream repo.
-
-### How it works
-
-A community contributor:
-
-1. Maintains a fork of NanoClaw (e.g., `alice/nanoclaw`)
-2. Creates `skill/*` branches on their fork with their custom skills
-3. Creates a marketplace repo (e.g., `alice/nanoclaw-skills`) with a `.claude-plugin/marketplace.json` and plugin structure
-
-### Adding a community marketplace
-
-If the community contributor is trusted, they can open a PR to add their marketplace to NanoClaw's `.claude/settings.json`:
-
-```json
-{
-  "extraKnownMarketplaces": {
-    "nanoclaw-skills": {
-      "source": {
-        "source": "github",
-        "repo": "nanocoai/nanoclaw-skills"
-      }
-    },
-    "alice-nanoclaw-skills": {
-      "source": {
-        "source": "github",
-        "repo": "alice/nanoclaw-skills"
-      }
-    }
-  }
-}
-```
-
-Once merged, all NanoClaw users automatically discover the community marketplace alongside the official one.
-
-### Installing community skills
-
-`/setup` and `/customize` ask users whether they want to enable community skills. If yes, Claude installs community marketplace plugins via `claude plugin install`:
-
-```bash
-claude plugin install alice-skills@alice-nanoclaw-skills --scope project
-```
-
-Community skills are hot-loaded and immediately available — no restart needed. Dependent skills are only offered after their prerequisites are met (e.g., community Telegram add-ons only after Telegram is installed).
-
-Users can also browse and install community plugins manually via `/plugin`.
-
-### Properties of this system
-
- **No gatekeeping required.** Anyone can create skills on their fork without permission. They only need approval to be listed in the auto-discovered marketplaces.
- **Multiple marketplaces coexist.** Users see skills from all trusted marketplaces in `/plugin`.
- **Community skills use the same merge pattern.** The SKILL.md just points to a different remote:
-  ```bash
-  git remote add alice https://github.com/alice/nanoclaw.git
-  git fetch alice skill/my-cool-feature
-  git merge alice/skill/my-cool-feature
-  ```
- **Users can also add marketplaces manually.** Even without being listed in settings.json, users can run `/plugin marketplace add alice/nanoclaw-skills` to discover skills from any source.
- **CI is per-fork.** Each community maintainer runs their own CI to keep their skill branches merged-forward. They can use the same GitHub Action as the upstream repo.
-
-## Flavors
-
-A flavor is a curated fork of NanoClaw — a combination of skills, custom changes, and configuration tailored for a specific use case (e.g., "NanoClaw for Sales," "NanoClaw Minimal," "NanoClaw for Developers").
-
-### Creating a flavor
-
-1. Fork `nanocoai/nanoclaw`
-2. Merge in the skills you want
-3. Make custom changes (trigger word, prompts, integrations, etc.)
-4. Your fork's `main` IS the flavor
-
-### Installing a flavor
-
-During `/setup`, users are offered a choice of flavors before any configuration happens. The setup skill reads `flavors.yaml` from the repo (shipped with upstream, always up to date) and presents options:
-
-AskUserQuestion: "Start with a flavor or default NanoClaw?"
- Default NanoClaw
- NanoClaw for Sales — Gmail + Slack + CRM (maintained by alice)
- NanoClaw Minimal — Telegram-only, lightweight (maintained by bob)
-
-If a flavor is chosen:
-
-```bash
-git remote add <flavor-name> https://github.com/alice/nanoclaw.git
-git fetch <flavor-name> main
-git merge <flavor-name>/main
-```
-
-Then setup continues normally (dependencies, auth, container, service).
-
-**This choice is only offered on a fresh fork** — when the user's main matches or is close to upstream's main with no local commits. If `/setup` detects significant local changes (re-running setup on an existing install), it skips the flavor selection and goes straight to configuration.
-
-After installation, the user's fork has three remotes:
- `origin` — their fork (push customizations here)
- `upstream` — `nanocoai/nanoclaw` (core updates)
- `<flavor-name>` — the flavor fork (flavor updates)
-
-### Updating a flavor
-
-```bash
-git fetch <flavor-name> main
-git merge <flavor-name>/main
-```
-
-The flavor maintainer keeps their fork updated (merging upstream, updating skills). Users pull flavor updates the same way they pull core updates.
-
-### Flavors registry
-
-`flavors.yaml` lives in the upstream repo:
-
-```yaml
-flavors:
-  - name: NanoClaw for Sales
-    repo: alice/nanoclaw
-    description: Gmail + Slack + CRM integration, daily pipeline summaries
-    maintainer: alice
-
-  - name: NanoClaw Minimal
-    repo: bob/nanoclaw
-    description: Telegram-only, no container overhead
-    maintainer: bob
-```
-
-Anyone can PR to add their flavor. The file is available locally when `/setup` runs since it's part of the cloned repo.
-
-### Discoverability
-
- **During setup** — flavor selection is offered as part of the initial setup flow
- **`/browse-flavors` skill** — reads `flavors.yaml` and presents options at any time
- **GitHub topics** — flavor forks can tag themselves with `nanoclaw-flavor` for searchability
- **Discord / website** — community-curated lists
-
-## Migration
-
-Migration from the old skills engine to branches is complete. All feature skills now live on `skill/*` branches, and the skills engine has been removed.
-
-### Skill branches
-
-| Branch | Base | Description |
-|--------|------|-------------|
-| `skill/whatsapp` | `main` | WhatsApp channel |
-| `skill/telegram` | `main` | Telegram channel |
-| `skill/slack` | `main` | Slack channel |
-| `skill/discord` | `main` | Discord channel |
-| `skill/gmail` | `main` | Gmail channel |
-| `skill/voice-transcription` | `skill/whatsapp` | OpenAI Whisper voice transcription |
-| `skill/image-vision` | `skill/whatsapp` | Image attachment processing |
-| `skill/pdf-reader` | `skill/whatsapp` | PDF attachment reading |
-| `skill/local-whisper` | `skill/voice-transcription` | Local whisper.cpp transcription |
-| `skill/ollama-tool` | `main` | Ollama MCP server for local models |
-| `skill/apple-container` | `main` | Apple Container runtime |
-| `skill/reactions` | `main` | WhatsApp emoji reactions |
-
-### What was removed
-
- `skills-engine/` directory (entire engine)
- `scripts/apply-skill.ts`, `scripts/uninstall-skill.ts`, `scripts/rebase.ts`
- `scripts/fix-skill-drift.ts`, `scripts/validate-all-skills.ts`
- `.github/workflows/skill-drift.yml`, `.github/workflows/skill-pr.yml`
- All `add/`, `modify/`, `tests/`, and `manifest.yaml` from skill directories
- `.nanoclaw/` state directory
-
-Operational skills (`setup`, `debug`, `update-nanoclaw`, `customize`, `update-skills`) remain on main in `.claude/skills/`.
-
-## What Changes
-
-### README Quick Start
-
-Before:
-```bash
-git clone https://github.com/nanocoai/NanoClaw.git
-cd NanoClaw
-claude
-```
-
-After:
-```
-1. Fork nanocoai/nanoclaw on GitHub
-2. git clone https://github.com/<you>/nanoclaw.git
-3. cd nanoclaw
-4. claude
-5. /setup
-```
-
-### Setup skill (`/setup`)
-
-Updates to the setup flow:
-
- Check if `upstream` remote exists; if not, add it: `git remote add upstream https://github.com/nanocoai/nanoclaw.git`
- Check if `origin` points to the user's fork (not nanocoai). If it points to nanocoai, guide them through the fork migration.
- **Install marketplace plugin:** `claude plugin install nanoclaw-skills@nanoclaw-skills --scope project` — makes all feature skills available (hot-loaded, no restart)
- **Ask which channels to add:** present channel options (Discord, Telegram, Slack, WhatsApp, Gmail), run corresponding `/add-*` skills for selected channels
- **Offer dependent skills:** after a channel is set up, offer relevant add-ons (e.g., Agent Swarm after Telegram, voice transcription after WhatsApp)
- **Optionally enable community marketplaces:** ask if the user wants community skills, install those marketplace plugins too
-
-### `.claude/settings.json`
-
-Marketplace configuration so the official marketplace is auto-registered:
-
-```json
-{
-  "extraKnownMarketplaces": {
-    "nanoclaw-skills": {
-      "source": {
-        "source": "github",
-        "repo": "nanocoai/nanoclaw-skills"
-      }
-    }
-  }
-}
-```
-
-### Skills directory on main
-
-The `.claude/skills/` directory on `main` retains only operational skills (setup, debug, update-nanoclaw, customize, update-skills). Feature skills (add-discord, add-telegram, etc.) live in the marketplace repo, installed via `claude plugin install` during `/setup` or `/customize`.
-
-### Skills engine removal
-
-The following can be removed:
-
- `skills-engine/` — entire directory (apply, merge, replay, state, backup, etc.)
- `scripts/apply-skill.ts`
- `scripts/uninstall-skill.ts`
- `scripts/fix-skill-drift.ts`
- `scripts/validate-all-skills.ts`
- `.nanoclaw/` — state directory
- `add/` and `modify/` subdirectories from all skill directories
- Feature skill SKILL.md files from `.claude/skills/` on main (they now live in the marketplace)
-
-Operational skills (`setup`, `debug`, `update-nanoclaw`, `customize`, `update-skills`) remain on main in `.claude/skills/`.
-
-### New infrastructure
-
- **Marketplace repo** (`nanocoai/nanoclaw-skills`) — single Claude Code plugin bundling SKILL.md files for all feature skills
- **CI GitHub Action** — merge-forward `main` into all `skill/*` branches on every push to `main`, using Claude (Haiku) for conflict resolution
- **`/update-skills` skill** — checks for and applies skill branch updates using git history
- **`CONTRIBUTORS.md`** — tracks skill contributors
-
-### Update skill (`/update-nanoclaw`)
-
-The update skill gets simpler with the branch-based approach. The old skills engine required replaying all applied skills after merging core updates — that entire step disappears. Skill changes are already in the user's git history, so `git merge upstream/main` just works.
-
-**What stays the same:**
- Preflight (clean working tree, upstream remote)
- Backup branch + tag
- Preview (git log, git diff, file buckets)
- Merge/cherry-pick/rebase options
- Conflict preview (dry-run merge)
- Conflict resolution
- Build + test validation
- Rollback instructions
-
-**What's removed:**
- Skill replay step (was needed by the old skills engine to re-apply skills after core update)
- Re-running structured operations (npm deps, env vars — these are part of git history now)
-
-**What's added:**
- Optional step at the end: "Check for skill updates?" which runs the `/update-skills` logic
- This checks whether any previously-merged skill branches have new commits (bug fixes, improvements to the skill itself — not just merge-forwards from main)
-
-**Why users don't need to re-merge skills after a core update:**
-When the user merged a skill branch, those changes became part of their git history. When they later merge `upstream/main`, git performs a normal three-way merge — the skill changes in their tree are untouched, and only core changes are brought in. The merge-forward CI ensures skill branches stay compatible with latest main, but that's for new users applying the skill fresh. Existing users who already merged the skill don't need to do anything.
-
-Users only need to re-merge a skill branch if the skill itself was updated (not just merged-forward with main). The `/update-skills` check detects this.
-
-## Discord Announcement
-
-### For existing users
-
-> **Skills are now git branches**
->
-> We've simplified how skills work in NanoClaw. Instead of a custom skills engine, skills are now git branches that you merge in.
->
-> **What this means for you:**
-> - Applying a skill: `git fetch upstream skill/discord && git merge upstream/skill/discord`
-> - Updating core: `git fetch upstream main && git merge upstream/main`
-> - Checking for skill updates: `/update-skills`
-> - No more `.nanoclaw/` state directory or skills engine
->
-> **We now recommend forking instead of cloning.** This gives you a remote to push your customizations to.
->
-> **If you currently have a clone with local changes**, migrate to a fork:
-> 1. Fork `nanocoai/nanoclaw` on GitHub
-> 2. Run:
->    ```
->    git remote rename origin upstream
->    git remote add origin https://github.com/<you>/nanoclaw.git
->    git push --force origin main
->    ```
->    This works even if you're way behind — just push your current state.
->
-> **If you previously applied skills via the old system**, your code changes are already in your working tree — nothing to redo. You can delete the `.nanoclaw/` directory. Future skills and updates use the branch-based approach.
->
-> **Discovering skills:** Skills are now available through Claude Code's plugin marketplace. Run `/plugin` in Claude Code to browse and install available skills.
-
-### For skill contributors
-
-> **Contributing skills**
->
-> To contribute a skill:
-> 1. Fork `nanocoai/nanoclaw`
-> 2. Branch from `main` and make your code changes
-> 3. Open a regular PR
->
-> That's it. We'll create a `skill/<name>` branch from your PR, add you to CONTRIBUTORS.md, and add the SKILL.md to the marketplace. CI automatically keeps skill branches merged-forward with `main` using Claude to resolve any conflicts.
->
-> **Want to run your own skill marketplace?** Maintain skill branches on your fork and create a marketplace repo. Open a PR to add it to NanoClaw's auto-discovered marketplaces — or users can add it manually via `/plugin marketplace add`.
@@ -0,0 +1,150 @@
+# The skills model
+
+How NanoClaw stays customizable without breaking its forks. This is the full version; [customizing.md](customizing.md) is the short one, and [skill-guidelines.md](skill-guidelines.md) is the authoritative checklist for writing a skill.
+
+## The problem
+
+People fork NanoClaw and change the code. When we ship updates, their changes collide with ours and `git merge` turns into a fight. The more someone customized, the worse it gets. We can't grow the core without breaking everyone downstream.
+
+## The bet
+
+Every customization is a skill: not an edit buried in the core, but a skill that adds the change on top.
+
+The core stays small and stable. Everything else composes on top as skills. Adding your 1st skill and your 500th skill is the same amount of work.
+
+This works for any fork: a personal install with three tweaks, a company build with fifty.
+
+## A fork is a recipe of skills
+
+You don't track your changes as a pile of edits. You track them as skills.
+
+- Each customization = one small skill.
+- One "recipe" skill lists all your skills and how they fit together: the order, and any dependencies between them.
+
+So a fork is defined by its recipe. Most upgrades don't need to run it (see "Upgrading"), but it's what lets you rebuild the fork from scratch on clean upstream, and it's how you hand your whole fork to someone else. It replaces every "what did I change" artifact you'd otherwise keep (a migration guide, a manifest, a pile of notes) with one runnable thing.
+
+The recipe is the one fork-specific thing. It lives in your fork, never upstream. (A recipe is itself a skill: a SKILL.md listing the fork's skills in apply order.)
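The ordering rule a recipe encodes can be sketched as a small check: every skill must be listed after the skills it depends on. This is an illustration only. The `RecipeEntry` shape, the `checkRecipeOrder` function, and the skill names are assumptions, not NanoClaw's actual recipe format, which is a SKILL.md.

```typescript
// Hypothetical recipe entry: a skill name plus the skills it must come after.
type RecipeEntry = { skill: string; after: string[] };

// Return one message per dependency that is missing or listed too late.
function checkRecipeOrder(recipe: RecipeEntry[]): string[] {
  const seen = new Set<string>();
  const problems: string[] = [];
  for (const entry of recipe) {
    for (const dep of entry.after) {
      if (!seen.has(dep)) {
        problems.push(`${entry.skill} must be applied after ${dep}`);
      }
    }
    seen.add(entry.skill);
  }
  return problems;
}

// Hypothetical skill names, listed in apply order.
const recipe: RecipeEntry[] = [
  { skill: "add-whatsapp", after: [] },
  { skill: "voice-transcription", after: ["add-whatsapp"] },
];
console.log(checkRecipeOrder(recipe));
```

An empty result means the listed order respects every declared dependency; swapping the two entries would report one problem.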
+
+## What's in a skill
+
+A skill carries everything it needs:
+
+- **Its code**: the files it adds (see "Where a skill's files live").
+- **Apply and remove.** Apply installs it; remove uninstalls it. Uninstall isn't a separate problem; it ships with the skill. (Remove is required exactly when apply leaves anything behind. A pure instruction-only skill that changes nothing needs none.)
+- **Its tests**: see "A test for every integration point." The tests *are* the verification. If they pass against the composed project, the skill applied correctly and works; there is no separate "verify" step.
+- **Its recipe entry**: how it composes with the others.
+
+Apply must be safe to re-run. Upgrades re-run skills, so a skill that half-applies twice is a bug.
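A minimal sketch of "safe to re-run" for the most common add, appending one line to an existing file. `ensureLine` and `removeLine` are hypothetical helpers, not part of NanoClaw, and `./claude.js` is an assumed existing barrel entry. They operate on the file's text, so the sketch does not depend on any file API.

```typescript
// Hypothetical helper: append `line` to `content` only if it is not already there.
// Applying it twice gives the same result as applying it once.
function ensureLine(content: string, line: string): string {
  const lines = content.split("\n");
  if (lines.some((existing) => existing.trim() === line.trim())) {
    return content; // already applied: change nothing
  }
  // Make sure the appended line starts on its own line.
  const separator = content === "" || content.endsWith("\n") ? "" : "\n";
  return content + separator + line + "\n";
}

// Hypothetical matching remove: drop every copy of `line`.
function removeLine(content: string, line: string): string {
  return content
    .split("\n")
    .filter((existing) => existing.trim() !== line.trim())
    .join("\n");
}

const barrel = "import './claude.js';\n"; // assumed starting content
const once = ensureLine(barrel, "import './codex.js';");
const twice = ensureLine(once, "import './codex.js';");
console.log(once === twice);
```

Pairing the add with its remove in the same place keeps the two from drifting apart, which is the property the apply/remove contract asks for.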
+
+## Two kinds of skills
+
+- **Capability skills** add something new: a channel, a provider, a tool, a dashboard.
+- **Patch skills** make small tweaks or bug fixes to existing behavior, instead of adding a capability.
+
+Patch skills follow the same rules: a test for every edit, and code pushed into independent files wherever possible instead of inline. To keep the overhead down, bundle several small patches into a single patch skill rather than making one skill per one-line fix.
+
+One honest exception: a bug fix that genuinely changes an existing line can't always be moved into a new file. That single line is the one place an upgrade can still hard-conflict. If upstream touched the same line, the fix has to be re-derived against the new code. That's fine when it's small and tested; just don't pretend it's free.
+
+(Packaging is a separate axis: some skills fetch code from a registry branch, some ship files in their own folder, some are pure instructions.)
+
+## What makes a good skill
+
+A good skill mostly just *adds* things:
+
+- Adds new files.
+- Adds a line to an existing file (an import, an entry, a line in `.env`).
+- Adds a dependency.
+- Changes a value in a JSON file like `package.json`.
+
+These almost never conflict with an upstream change.
+
+The one risky move is when a skill has to *reach into* existing code and wire something in at a specific spot. That's the only part that breaks when we change the code later. Keep these rare, and keep them to a line or two that just *calls* code living in the skill's own files, not big chunks of logic inline.
+
+Rule of thumb: aim for skills that are almost all "adds." Not 100%; some reach-ins are fine. But a skill full of reach-ins is a smell, and a sign that spot in the core should become a proper hook.
+
+## Where a skill's files live
+
+The files a skill adds live in the skill's own folder, and the skill copies them into the project when it runs. The skill is self-contained.
+
+The exception is skills that plug into a registry: channels and providers. Their code is larger, multi-file, and has to stay in sync with the core as it changes over time. That code lives on a long-lived **registry branch** (`channels`, `providers`) that we forward-merge against main, and the skill fetches it from there (`git show origin/channels:path > path`). A frozen copy in a skill folder would go stale.
+
+This fetch is **additive, never a merge**. The skill copies in the files it needs; it does *not* `git merge` the branch. Merging a registry branch into a customized install is exactly the conflict fight this model exists to avoid. A skill's **tests live on the branch alongside its code** and are fetched the same way; a channel's adapter travels with its registration test. A provider is the multi-point case: its code spans the host *and* container trees plus a Dockerfile edit, so it fetches files into both trees and ships a registration test per tree. See the provider archetype in [skill-guidelines.md](skill-guidelines.md).
+
+Either way the skill brings its own code, from its folder or from its branch.
+
+## A test for every integration point
+
+The tests a skill *must* ship are the ones that prove it integrates with the core and keeps working as the core changes. That's the whole point. Tests of a skill's own internal logic, or of its behavior against an external service, are fine but optional: the creator's call, because they don't guard against upstream changes. A pure-add skill that touches nothing existing needs no required integration test at all.
+
+The places that break on upgrade are the **integration points**: wherever a skill reaches into the existing system. That's not just the obvious code edit. An appended import, a config entry, a Dockerfile change, a mount, an installed dependency, and a direct read of the core's data all count. Each gets a guard that goes **red if it breaks or goes missing**:
+
+- **A behavior or structural test of the wiring.** Prefer behavior when the seam is queryable at runtime: a channel's registration test imports the real barrel and asserts the registry contains it. Fall back to a structural test only for wiring with no invocable seam.
+- **The build / typecheck.** Always on. It catches the drift a runtime test can't: a renamed symbol, a moved module, a changed signature.
+- **Coverage of how an added file consumes the core.** When a skill's own file reaches into core APIs or data, a test must exercise that consumption against the *real* core. That's the leg that catches core drift.
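The first guard above, a behavior test of the wiring, can be sketched as follows. Everything here is a stand-in: the `registry`, `registerChannel`, and the `telegram` adapter are assumptions made so the example runs on its own, not NanoClaw's real registry API. In a real skill the registration would happen as a side effect of the barrel import and the test would import the real barrel.

```typescript
// Hypothetical stand-ins for the core registry and a skill's adapter.
type ChannelAdapter = { name: string; send: (text: string) => string };

const registry = new Map<string, ChannelAdapter>();

function registerChannel(adapter: ChannelAdapter): void {
  registry.set(adapter.name, adapter);
}

// In a real skill this call would live in the skill's own file and run
// when the barrel imports that file.
registerChannel({ name: "telegram", send: (text) => `telegram:${text}` });

// Behavior test of the wiring: query the registry at runtime instead of
// searching the source for an import line.
function assertRegistered(name: string): ChannelAdapter {
  const adapter = registry.get(name);
  if (adapter === undefined) {
    throw new Error(`channel "${name}" is not registered; is its barrel import missing?`);
  }
  return adapter;
}

const telegram = assertRegistered("telegram");
console.log(telegram.send("ping"));
```

The check goes red both when the import line is deleted and when the registry's shape changes, which a structural test of the import line alone would miss.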
+
+Why points and not whole skills: a skill can have several, and each is a separate way to break. The count is honest signal: a skill's integration points are exactly its upgrade risk. Pure-add skills have zero and stay cheap.
+
+This is what makes upgrades cheap to fix: when we move something in the core, the integration-point tests are exactly what fail, and that failing list *is* the set of skills to update.
+
+**Tests travel with the skill.** They're files kept with the skill, in its folder or on its branch, and applying the skill copies them into the project's test tree. An integration-point test has to run against the *composed* system, so it only means anything once the skill is applied.
+
+**The recipe tests the stack.** A single skill's tests prove that skill works alone. The recipe carries tests that run the skills *together*, in order. That's where you catch two skills that collide.
+
+The full testing doctrine (how to pick the test type per point, the archetypes, the dependency cases) is in [skill-guidelines.md](skill-guidelines.md).
+
+## How you actually work
+
+You don't have to write a skill before you touch anything. Edit the code directly, get it working, then turn those edits into skills afterward; a coding agent does that conversion. Good authoring guidelines and a good recipe make skillifying-after-the-fact close to trivial.
+
+The point isn't to slow you down at edit time. It's that nothing counts as part of your fork until it's a skill, because that's the only form that survives an upgrade.
+
+## Upgrading
+
+**Every update goes through `/update-nanoclaw`, never a raw `git pull`.** You don't know what an update contains until it lands; it might carry a breaking change with a migration. So the command inspects what's coming and runs the proper process: back up, pull the changes in, apply migrations, run tests, fix what broke, and flag when a fresh rebuild is needed instead.
+
+Two different moves, two different rules. Your **fork pulls trunk**: that's a normal pull, run by the update command, and it's safe precisely because your changes live beside the core as skills rather than inside it. A **skill never merges**: it installs by fetching files and copying them in. If a skill's instructions say `git merge`, it isn't built to this model.
+
+The update takes one of two paths:
+
+**Normal upgrade: pull and fix what breaks.** Most of the time it pulls the latest upstream, resolves the occasional small conflict, runs the tests, and fixes whatever they flag. This stays cheap *because* the changes are small self-contained skills with tests: conflicts are rare, and when something does break, the failing test points at the exact skill and the fix is local.
+
+**Rebuild from the recipe: the rare path.** Take fresh upstream and apply every skill from scratch. The command flags this when you've fallen far behind across many breaking changes (a clean rebuild beats catching up step by step). It's also how you hand your entire fork to someone else.
+
+Around both:
+
+- **The update skill updates itself first.** The first thing it does is fetch the latest version of the upgrade process. Otherwise you're upgrading with stale instructions.
+- **Snapshot first, restore on failure.** The upgrade sets a rollback point before it starts: today a git backup branch and tag; the model calls for a full project snapshot (code, database, data, files) so anything that fails rolls back and retries. Until that snapshot lands, a migration that touches data makes its own data backup. Nothing in the upgrade needs its own undo logic.
+- **Broken skills don't block you.** If a core change broke a skill, its test tells you, but the skill is usually still usable, and an agent fixes it at apply time. Skills are fixed lazily, when applied, not ahead of time for every core version.
+
+## Migrations
+
+Migrations are core, not an afterthought. Every breaking change ships with its migration, packaged together. A "migration" is broad: upgrading dependencies, a database change, a data backfill, moving files to new locations, whatever the change requires.
+
+Migrations are **forward-only**. They don't need reverse scripts; the rollback point in front of the upgrade is the undo. If one fails, restore and retry.
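A sketch of the forward-only rule, assuming an in-memory `state` array in place of the real code, database, and files. The `Migration` type and `runMigrations` are hypothetical; NanoClaw's actual migration runner is not shown in this document.

```typescript
// Hypothetical shapes used only for this illustration.
type Migration = { name: string; up: (state: string[]) => void };

// Run migrations forward only. On any failure, restore the snapshot taken
// before the first migration; no migration carries a reverse script.
function runMigrations(
  state: string[],
  migrations: Migration[],
): { ok: boolean; failed?: string } {
  const snapshot = [...state]; // rollback point in front of the upgrade
  for (const migration of migrations) {
    try {
      migration.up(state);
    } catch {
      state.length = 0;
      state.push(...snapshot); // restore; the caller can then retry
      return { ok: false, failed: migration.name };
    }
  }
  return { ok: true };
}

const state = ["schema-v1"];
const result = runMigrations(state, [
  { name: "add-files-table", up: (s) => { s.push("files-table"); } },
  { name: "backfill", up: () => { throw new Error("disk full"); } },
]);
console.log(result, state);
```

Because the restore undoes everything since the snapshot, the first migration's partial work disappears along with the failed one, and neither needs its own undo logic.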
+
+A **startup tripwire** keeps installs on the supported path. Every sanctioned update path (install, update, migrate) stamps a marker with the version it reached; at startup the host checks that marker against the running code. If it's missing or doesn't match, because someone pulled by hand, the host stops, loudly, with the exact command to fix it instead of silently breaking.
+
+The tripwire doesn't reason about *which* changes are breaking; it just enforces that the path was used. (DB schema migrations already run automatically at startup, so they aren't its concern; it guards everything else a raw `git pull` leaves undone.) To override, you stamp the marker yourself: an explicit "I know what I'm doing," not a deletion. If you have your **own** upgrade flow (a deploy script, a CI job), make stamping the last step after it succeeds: `pnpm exec tsx scripts/upgrade-state.ts set`. See [upgrade-recovery.md](upgrade-recovery.md).
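The startup check described above reduces to a comparison between the stamped marker and the running code's version. The sketch below is an assumption about its shape, not the real implementation: `UpgradeMarker`, `TripwireResult`, and `checkTripwire` are hypothetical names, and the document only states that the marker records the version reached.

```typescript
// Hypothetical marker shape.
type UpgradeMarker = { version: string };

type TripwireResult =
  | { ok: true }
  | { ok: false; reason: "missing" | "mismatch"; fix: string };

// Compare the stamped marker with the running code's version. It does not
// decide which changes are breaking; it only checks that a supported path ran.
function checkTripwire(marker: UpgradeMarker | null, codeVersion: string): TripwireResult {
  const fix = "/update-nanoclaw"; // the command the host would print
  if (marker === null) return { ok: false, reason: "missing", fix };
  if (marker.version !== codeVersion) return { ok: false, reason: "mismatch", fix };
  return { ok: true };
}

console.log(checkTripwire({ version: "2.1.16" }, "2.1.16"));
console.log(checkTripwire({ version: "2.0.76" }, "2.1.16"));
console.log(checkTripwire(null, "2.1.16"));
```

Keeping the check this narrow is the point of the design: the host never has to know what an update contained, only whether a sanctioned path stamped it.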
+
+## The maintainer's side of the deal
+
+This is a two-sided contract. Users keep their changes as skills. In return, the maintainer keeps the core stable and owns the breakage.
+
+As maintainer:
+
+1. **Keep the core small and stable.** Resist hardwiring features into the core. Push them to skills too.
+2. **Before shipping a core change, run the skills against it.** That tells you what you broke before users find out.
+3. **When you break a skill, you fix it; the users don't.** If a refactor moves something, update the affected skills or ship a migration. Don't make every user rediscover the same fix.
+4. **Ship the migration with the breaking change.** Packaged together: code, DB, files. Not a separate "good luck" note.
+5. **Watch for hotspots.** When lots of skills reach into the same spot in the core, that's the signal to add a proper hook there, so those reach-ins become clean adds.
+6. **Test against real forks.** Every core change and migration runs against a fleet of real, skill-built forks before shipping. Real proof on real installs.
+
+## The public registry
+
+Skills will be shared and composed; that's the whole point. A skill runs real code when it applies (copies files, installs dependencies, edits the Dockerfile). So a public registry of skills is a trust surface.
+
+The rule: **every skill is reviewed and approved before it goes into the public registry, and every new version is re-reviewed.** Approving once and trusting forever is how supply chains get poisoned. Automated checks (linting against the guidelines, plus a harness that applies the skill on fresh upstream, runs its tests, removes it, and applies it twice) will clear the mechanical part so human review can focus on intent and safety. First-party skills are trusted by where they come from; the gate is for the public registry.
+
+## The promise
+
+Build your changes as skills following this, and we won't break you. It's a promise we can only make for skills: changes edited directly into the core are beyond what we can protect.
@@ -0,0 +1,51 @@
+# Recovering from the upgrade tripwire
+
+If NanoClaw refuses to start with a message like *"update did not go through the supported path"*, this page explains what happened and how to clear it.
+
+## What happened
+
+NanoClaw records the version it reached each time you upgrade through a supported path — `/setup`, `/update-nanoclaw`, or `/migrate-nanoclaw`. That record lives in `data/upgrade-state.json`.
+
+At startup the host checks that record against the running code. If it's **missing** or its version **doesn't match** the code, the host stops. This almost always means the code was updated by a raw `git pull` instead of the supported flow — so migrations, dependency installs, or container rebuilds that the flow would have run may not have happened.
+
+## If you just ran a supported upgrade
+
+If you reached this point by running `/update-nanoclaw`, `/migrate-nanoclaw`, or `/setup` and it **completed successfully**, this is expected the first time an existing install meets the tripwire (your previous version predated it). Clear it by stamping the current version:
+
+```bash
+pnpm exec tsx scripts/upgrade-state.ts set
+```
+
+Then restart NanoClaw. It won't trip again.
+
+**Coding agents:** if the upgrade you just ran reported success, run the command above automatically and restart. Only stop to ask a human if the upgrade itself failed or you can't confirm it finished.
+
+## If you ran `git pull` yourself
+
+Don't just clear the tripwire — that skips the work the supported flow does. Instead, run the supported upgrade so backups, migrations, and rebuilds happen:
+
+```
+/update-nanoclaw
+```
+
+Once it finishes it stamps the marker for you, and the next start is clean.
+
+## If you have your own upgrade flow
+
+If you've built your own way to upgrade — a custom skill, a deploy script, a CI job, a service that pulls and restarts — it won't stamp the marker, so the host will trip on the next start. Add the stamp as the **last step** of that flow, after the upgrade succeeds and before the restart:
+
+```bash
+pnpm exec tsx scripts/upgrade-state.ts set
+```
+
+That's the same thing `/setup`, `/update-nanoclaw`, and `/migrate-nanoclaw` do at the end. Do it only when the upgrade actually completed — the marker is your assertion that this install reached the current version through a path you trust.
+
+## The override
+
+`pnpm exec tsx scripts/upgrade-state.ts set` is the override: it declares "this install is good at the current version." Use it when you know the install is actually in a good state (e.g. you completed the steps manually). It's safe to re-run.
+
+To inspect the current marker:
+
+```bash
+pnpm exec tsx scripts/upgrade-state.ts get
+```
@@ -25,6 +25,44 @@ set -euo pipefail
 PROJECT_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 cd "$PROJECT_ROOT"

+# ─── --uninstall: short-circuit before any setup work ──────────────────
+# Never install dependencies just to uninstall. With the TS toolchain
+# present, hand straight off to setup:auto (the flow lives in
+# setup/uninstall/); without it, print manual cleanup guidance. Runs
+# before diagnostics.sh is sourced so a pure uninstall doesn't emit
+# setup_launched, and before all pre-flights/bootstrap.
+for arg in "$@"; do
+  if [ "$arg" = "--uninstall" ]; then
+    # exec tsx directly rather than `pnpm run -- …`: pnpm passes the `--`
+    # separator through to the script, where the flag parser treats
+    # everything after it as positional args and the flags get dropped.
+    # Gate on node (tsx's shebang interpreter) — pnpm isn't used here.
+    if command -v node >/dev/null 2>&1 && [ -x "$PROJECT_ROOT/node_modules/.bin/tsx" ]; then
+      exec "$PROJECT_ROOT/node_modules/.bin/tsx" "$PROJECT_ROOT/setup/auto.ts" "$@"
+    fi
+    export NANOCLAW_PROJECT_ROOT="$PROJECT_ROOT"
+    # shellcheck source=setup/lib/install-slug.sh
+    source "$PROJECT_ROOT/setup/lib/install-slug.sh"
+    UNINSTALL_RUNTIME="${CONTAINER_RUNTIME:-docker}"
+    echo "Can't run the uninstaller: node or the installed dependencies (node_modules/) are missing."
+    echo "Either re-run 'bash nanoclaw.sh' once to restore them, or clean up manually:"
+    echo ""
+    if [ "$(uname -s)" = "Darwin" ]; then
+      echo "  launchctl unload ~/Library/LaunchAgents/$(launchd_label).plist"
+      echo "  rm -f ~/Library/LaunchAgents/$(launchd_label).plist"
+    else
+      echo "  systemctl --user disable --now $(systemd_unit).service"
+      echo "  rm -f ~/.config/systemd/user/$(systemd_unit).service && systemctl --user daemon-reload"
+    fi
+    echo "  $UNINSTALL_RUNTIME ps -aq --filter label=nanoclaw-install=$(_nanoclaw_install_slug) | xargs -r $UNINSTALL_RUNTIME rm -f"
+    echo "  $UNINSTALL_RUNTIME rmi $(container_image_base):latest"
+    echo "  rm -f ~/.local/bin/ncl    # only if it points at this folder"
+    echo ""
+    echo "Then back up $PROJECT_ROOT/.env if you need the keys, and delete the folder."
+    exit 1
+  fi
+done
+
 LOGS_DIR="$PROJECT_ROOT/logs"
 STEPS_DIR="$LOGS_DIR/setup-steps"
 PROGRESS_LOG="$LOGS_DIR/setup.log"
@@ -1,6 +1,6 @@
 {
  "name": "nanoclaw",
-  "version": "2.0.76",
+  "version": "2.1.16",
  "description": "Personal Claude assistant. Lightweight, secure, customizable.",
  "type": "module",
  "packageManager": "pnpm@10.33.0",
@@ -30,7 +30,7 @@
  "dependencies": {
    "@clack/core": "^1.2.0",
    "@clack/prompts": "^1.2.0",
-    "@onecli-sh/sdk": "^0.5.0",
+    "@onecli-sh/sdk": "2.2.1",
    "better-sqlite3": "11.10.0",
    "chat": "^4.24.0",
    "cron-parser": "5.5.0",
@@ -15,8 +15,8 @@ importers:
        specifier: ^1.2.0
        version: 1.2.0
      '@onecli-sh/sdk':
-        specifier: ^0.5.0
-        version: 0.5.0
+        specifier: 2.2.1
+        version: 2.2.1
      better-sqlite3:
        specifier: 11.10.0
        version: 11.10.0
@@ -303,8 +303,8 @@ packages:
      '@emnapi/core': ^1.7.1
      '@emnapi/runtime': ^1.7.1

-  '@onecli-sh/sdk@0.5.0':
-    resolution: {integrity: sha512-oe5Yx9o98v6N1PgzcCR7nULHHqcqKWNJIDOHGOSNX+l20mLlZpFUqfKPeFmsojBNRQMoqbvZQKUlFMp6gVuYBA==}
+  '@onecli-sh/sdk@2.2.1':
+    resolution: {integrity: sha512-q2mCW4ZsARlLEoTxz/P0NQ4MiCh7Z2n28pxkSc7srS+tozyw40PdTnWYW7NI8hfSYplZTx5856Adq1iPi4KN3Q==}
    engines: {node: '>=20'}

  '@oxc-project/types@0.124.0':
@@ -1665,7 +1665,7 @@ snapshots:
      '@tybys/wasm-util': 0.10.1
    optional: true

-  '@onecli-sh/sdk@0.5.0': {}
+  '@onecli-sh/sdk@2.2.1': {}

  '@oxc-project/types@0.124.0': {}

@@ -1,5 +1,5 @@
-<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="90" height="20" role="img" aria-label="181k tokens, 91% of context window">
-  <title>181k tokens, 91% of context window</title>
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="90" height="20" role="img" aria-label="195k tokens, 98% of context window">
+  <title>195k tokens, 98% of context window</title>
  <linearGradient id="s" x2="0" y2="100%">
    <stop offset="0" stop-color="#bbb" stop-opacity=".1"/>
    <stop offset="1" stop-opacity=".1"/>
@@ -15,8 +15,8 @@
      <g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" font-size="11">
        <text aria-hidden="true" x="26" y="15" fill="#010101" fill-opacity=".3">tokens</text>
        <text x="26" y="14">tokens</text>
-        <text aria-hidden="true" x="71" y="15" fill="#010101" fill-opacity=".3">181k</text>
-        <text x="71" y="14">181k</text>
+        <text aria-hidden="true" x="71" y="15" fill="#010101" fill-opacity=".3">195k</text>
+        <text x="71" y="14">195k</text>
      </g>
    </g>
  </a>
@@ -21,6 +21,7 @@ import path from 'path';

 import { DATA_DIR } from '../src/config.js';
 import { createAgentGroup, getAgentGroupByFolder } from '../src/db/agent-groups.js';
+import { updateContainerConfigScalars } from '../src/db/container-configs.js';
 import { initDb } from '../src/db/connection.js';
 import {
  createMessagingGroup,
@@ -102,6 +103,7 @@ async function main(): Promise<void> {

  // 2. Agent group + filesystem.
  const folder = args.folder || `cli-with-${normalizeName(args.displayName)}`;
+  const pickedProvider = process.env.NANOCLAW_PICKED_PROVIDER?.trim().toLowerCase();
  let ag: AgentGroup | undefined = getAgentGroupByFolder(folder);
  if (!ag) {
    const agId = generateId('ag');
@@ -123,6 +125,10 @@ async function main(): Promise<void> {
      `You are ${args.agentName}, a personal NanoClaw agent for ${args.displayName}. ` +
      'When the user first reaches out, introduce yourself briefly and invite them to chat. Keep replies concise.',
  });
+  // Runtime provider lives on the config row, not the deprecated agent_provider.
+  if (pickedProvider && pickedProvider !== 'claude') {
+    updateContainerConfigScalars(ag.id, { provider: pickedProvider });
+  }

  // 3. CLI messaging group + wiring.
  let cliMg: MessagingGroup | undefined = getMessagingGroupByPlatform(CLI_CHANNEL, CLI_PLATFORM_ID);
@@ -30,10 +30,11 @@
 * For direct-addressable channels (telegram, whatsapp, etc.), --platform-id
 * is typically the same as the handle in --user-id, with the channel prefix.
 */
+import fs from 'fs';
 import net from 'net';
 import path from 'path';

-import { DATA_DIR } from '../src/config.js';
+import { DATA_DIR, GROUPS_DIR } from '../src/config.js';
 import { createAgentGroup, getAgentGroupByFolder } from '../src/db/agent-groups.js';
 import { initDb } from '../src/db/connection.js';
 import {
@@ -47,8 +48,7 @@ import { normalizeName } from '../src/modules/agent-to-agent/db/agent-destinatio
 import { addMember } from '../src/modules/permissions/db/agent-group-members.js';
 import { getUserRoles, grantRole } from '../src/modules/permissions/db/user-roles.js';
 import { upsertUser } from '../src/modules/permissions/db/users.js';
-import { updateContainerConfigScalars } from '../src/db/container-configs.js';
-import { initGroupFilesystem } from '../src/group-init.js';
+import { ensureContainerConfig, updateContainerConfigScalars } from '../src/db/container-configs.js';
 import { namespacedPlatformId } from '../src/platform-id.js';
 import type { AgentGroup, MessagingGroup } from '../src/types.js';

@@ -189,6 +189,7 @@ async function main(): Promise<void> {

  // 2. Agent group + filesystem.
  const folder = `dm-with-${normalizeName(args.displayName)}`;
+  const pickedProvider = process.env.NANOCLAW_PICKED_PROVIDER?.trim().toLowerCase();
  let ag: AgentGroup | undefined = getAgentGroupByFolder(folder);
  if (!ag) {
    const agId = generateId('ag');
@@ -204,12 +205,23 @@ async function main(): Promise<void> {
  } else {
    console.log(`Reusing agent group: ${ag.id} (${folder})`);
  }
-  initGroupFilesystem(ag, {
-    instructions:
-      `# ${args.agentName}\n\n` +
+  // Ensure the config row exists; defer workspace scaffolding to the first
+  // spawn (group-init), where the DB-resolved provider decides the surface
+  // (Claude: CLAUDE.local.md; a surfaces-owning provider: the memory scaffold)
+  // — so a non-Claude group never gets stale CLAUDE.* files written here.
+  ensureContainerConfig(ag.id);
+  // Runtime provider lives on the config row, not the deprecated agent_provider.
+  if (pickedProvider && pickedProvider !== 'claude') {
+    updateContainerConfigScalars(ag.id, { provider: pickedProvider });
+  }
+  const groupDir = path.resolve(GROUPS_DIR, folder);
+  fs.mkdirSync(groupDir, { recursive: true });
+  fs.writeFileSync(
+    path.join(groupDir, '.seed.md'),
+    `# ${args.agentName}\n\n` +
      `You are ${args.agentName}, a personal NanoClaw agent for ${args.displayName}. ` +
-      'When the user first reaches out (or you receive a system welcome prompt), introduce yourself briefly and invite them to chat. Keep replies concise.',
-  });
+      'When the user first reaches out (or you receive a system welcome prompt), introduce yourself briefly and invite them to chat. Keep replies concise.\n',
+  );

  // 2b. Assign the user a role for this agent group. The caller picks via
  // --role; the channel drivers default to 'owner' for the self-host case.
@@ -0,0 +1,26 @@
+/**
+ * scripts/upgrade-state.ts — read or stamp the upgrade marker.
+ *
+ * Usage:
+ *   pnpm exec tsx scripts/upgrade-state.ts get
+ *   pnpm exec tsx scripts/upgrade-state.ts set [version] [via]
+ *
+ * `set` with no version stamps the current package.json version. The
+ * sanctioned upgrade paths (setup / update / migrate) call `set` on
+ * success; running it by hand is also the documented way to clear the
+ * startup tripwire — see docs/upgrade-recovery.md.
+ */
+import { getCodeVersion, markerPath, readUpgradeState, writeUpgradeState } from '../src/upgrade-state.js';
+
+const [, , cmd, versionArg, viaArg] = process.argv;
+
+if (cmd === 'get') {
+  const state = readUpgradeState();
+  console.log(state ? JSON.stringify(state) : 'none');
+} else if (cmd === 'set') {
+  const state = writeUpgradeState({ version: versionArg || getCodeVersion(), via: viaArg || 'manual' });
+  console.log(`Stamped ${markerPath()}: ${JSON.stringify(state)}`);
+} else {
+  console.error('Usage: pnpm exec tsx scripts/upgrade-state.ts get | set [version] [via]');
+  process.exit(2);
+}
@@ -0,0 +1,121 @@
+#!/usr/bin/env bash
+#
+# Install the Codex agent provider non-interactively: copy the payload from the
+# `providers` branch, wire the three provider barrels, and add the Codex CLI to
+# the container manifest (container/cli-tools.json). The image rebuild is the
+# caller's job (the setup container step / `./container/build.sh`).
+#
+# Emits exactly one status block on stdout (ADD_CODEX); all chatty progress
+# goes to stderr. Keep in sync with .claude/skills/add-codex/SKILL.md.
+set -euo pipefail
+
+PROJECT_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
+cd "$PROJECT_ROOT"
+
+# Keep in sync with add-codex SKILL.md. This is the canonical Codex CLI pin —
+# it lands in container/cli-tools.json (the global-CLI manifest), not the Dockerfile.
+CODEX_VERSION="0.138.0"
+
+# Resolve the remote carrying the providers branch (same nanoclaw remote that
+# carries channels — handles forks where it isn't `origin`).
+# shellcheck source=setup/lib/channels-remote.sh
+source "$PROJECT_ROOT/setup/lib/channels-remote.sh"
+REMOTE=$(resolve_channels_remote)
+BRANCH="${REMOTE}/providers"
+
+# The codex payload — host provider, container runtime, setup module, doctrine.
+# Barrels are appended to, not copied.
+PAYLOAD_FILES=(
+  src/providers/codex.ts
+  src/providers/codex-agents-md.ts
+  src/providers/codex-registration.test.ts
+  src/providers/codex-host-contribution.test.ts
+  src/providers/codex-agents-md.test.ts
+  container/agent-runner/src/providers/codex.ts
+  container/agent-runner/src/providers/codex-app-server.ts
+  container/agent-runner/src/providers/exchange-archive.ts
+  container/agent-runner/src/providers/exchange-archive.test.ts
+  container/agent-runner/src/providers/codex-registration.test.ts
+  container/agent-runner/src/providers/codex.factory.test.ts
+  container/agent-runner/src/providers/codex.turns.test.ts
+  container/agent-runner/src/providers/codex-app-server.test.ts
+  container/agent-runner/src/providers/codex-cli-tools.test.ts
+  setup/providers/codex.ts
+  setup/providers/codex.test.ts
+  setup/providers/codex-registration.test.ts
+  container/AGENTS.md
+)
+BARRELS=(
+  src/providers/index.ts
+  container/agent-runner/src/providers/index.ts
+  setup/providers/index.ts
+)
+
+ALREADY_INSTALLED=true
+emit_status() {
+  local status=$1 error=${2:-}
+  echo "=== NANOCLAW SETUP: ADD_CODEX ==="
+  echo "STATUS: ${status}"
+  echo "CODEX_VERSION: ${CODEX_VERSION}"
+  echo "ALREADY_INSTALLED: ${ALREADY_INSTALLED}"
+  [ -n "$error" ] && echo "ERROR: ${error}"
+  echo "=== END ==="
+}
+log() { echo "[add-codex] $*" >&2; }
+
+# Idempotent: a complete install has the host provider file, the host barrel
+# import, and the Codex CLI in the container manifest. Any missing → (re)install.
+need_install() {
+  [ ! -f src/providers/codex.ts ] && return 0
+  ! grep -q "^import './codex.js';" src/providers/index.ts 2>/dev/null && return 0
+  ! grep -q '@openai/codex' container/cli-tools.json 2>/dev/null && return 0
+  return 1
+}
+
+if need_install; then
+  ALREADY_INSTALLED=false
+
+  log "Fetching providers branch from ${REMOTE}…"
+  git fetch "$REMOTE" providers >&2 2>/dev/null || {
+    emit_status failed "git fetch ${REMOTE} providers failed"
+    exit 1
+  }
+
+  log "Copying Codex payload from ${BRANCH}…"
+  for f in "${PAYLOAD_FILES[@]}"; do
+    mkdir -p "$(dirname "$f")"
+    git show "${BRANCH}:$f" > "$f" 2>/dev/null || {
+      emit_status failed "providers branch is missing ${f}"
+      exit 1
+    }
+  done
+
+  log "Wiring provider barrels…"
+  for b in "${BARRELS[@]}"; do
+    grep -q "^import './codex.js';" "$b" || printf "import './codex.js';\n" >> "$b"
+  done
+
+  log "Adding the Codex CLI to the container manifest (cli-tools.json)…"
+  # A json-merge: append { name, version } if absent. The Dockerfile installs
+  # every manifest entry via pinned `pnpm install -g` — no Dockerfile edit, no
+  # awk surgery. @openai/codex has no native postinstall, so no "onlyBuilt".
+  MANIFEST=container/cli-tools.json
+  node -e '
+    const fs = require("fs");
+    const [file, name, version] = process.argv.slice(1);
+    const tools = JSON.parse(fs.readFileSync(file, "utf8"));
+    if (!tools.some((t) => t.name === name)) {
+      tools.push({ name, version });
+      const fmt = (t) =>
+        "  { " +
+        Object.entries(t).map(([k, v]) => JSON.stringify(k) + ": " + JSON.stringify(v)).join(", ") +
+        " }";
+      fs.writeFileSync(file, "[\n" + tools.map(fmt).join(",\n") + "\n]\n");
+    }
+  ' "$MANIFEST" "@openai/codex" "${CODEX_VERSION}" || {
+    emit_status failed "failed to add @openai/codex to ${MANIFEST}"
+    exit 1
+  }
+fi
+
+emit_status ok
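The manifest merge performed by the inline `node -e` script above can be read in isolation as a pure function. A minimal TypeScript sketch, assuming the manifest is a JSON array of `{ name, version }` objects, as the inline script implies. `CliTool`, `mergeCliTool` and `formatManifest` are hypothetical names used for illustration only; they are not part of the repository.

```typescript
// Hypothetical helper names; the shape mirrors the inline node script above.
interface CliTool {
  name: string;
  version: string;
  [key: string]: unknown;
}

// Append { name, version } only when no entry with that name exists.
// An existing entry is left untouched, even if its version differs.
function mergeCliTool(tools: CliTool[], name: string, version: string): CliTool[] {
  if (tools.some((t) => t.name === name)) return tools;
  return [...tools, { name, version }];
}

// One entry per line, matching the layout the script writes back to disk.
function formatManifest(tools: CliTool[]): string {
  const fmt = (t: CliTool): string =>
    '  { ' +
    Object.entries(t)
      .map(([k, v]) => JSON.stringify(k) + ': ' + JSON.stringify(v))
      .join(', ') +
    ' }';
  return '[\n' + tools.map(fmt).join(',\n') + '\n]\n';
}
```

Because an existing entry is never rewritten, a re-run with a different pin would leave the old version in place; under this reading, bumping the pin needs a separate edit to the manifest.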
@@ -38,8 +38,12 @@ import { runTeamsChannel } from './channels/teams.js';
 import { runTelegramChannel } from './channels/telegram.js';
 import { runWhatsAppChannel } from './channels/whatsapp.js';
 import { pingCliAgent, type PingResult } from './lib/agent-ping.js';
+import { getSetupProvider, listSetupProviders } from './providers/registry.js';
+// Provider payloads self-register their picker entry + auth on import.
+import './providers/index.js';
 import { brightSelect } from './lib/bright-select.js';
 import { offerClaudeOnFailure } from './lib/claude-handoff.js';
+import { setPickedProvider } from './lib/picked-provider.js';
 import {
  applyToEnv,
  parseFlags,
@@ -48,6 +52,8 @@ import {
 } from './lib/setup-config-parse.js';
 import { runAdvancedScreen } from './lib/setup-config-screen.js';
 import { runWindowedStep } from './lib/windowed-runner.js';
+import { runUninstallFlow } from './uninstall/flow.js';
+import { detectExistingInstall } from './uninstall/scan.js';
 import { detectRegisteredGroups, detectExistingDisplayName } from './environment.js';
 import { pollHealth } from './onecli.js';
 import { getLaunchdLabel, getSystemdUnit } from '../src/install-slug.js';
@@ -88,6 +94,17 @@ async function main(): Promise<void> {
  let configValues = { ...readFromEnv(), ...flagResult.values };
  applyToEnv(configValues);

+  // --uninstall routes to the uninstall flow before any setup side effects —
+  // in particular before initProgressionLog(), so an uninstall never resets
+  // logs/setup.log on its way to (possibly) deleting logs/ entirely.
+  if (configValues.uninstall === true) {
+    await runUninstallFlow({
+      dryRun: configValues.dryRun === true,
+      yes: configValues.yes === true,
+      invokedFrom: 'flag',
+    });
+  }
+
  printIntro();
  initProgressionLog();
  phEmit('auto_started');
@@ -121,6 +138,37 @@ async function main(): Promise<void> {
      .filter(Boolean),
  );

+  // Offer removal when setup lands on an existing install. Skipped on every
+  // resume path — both the fail() retry and the sg-docker re-exec pass
+  // NANOCLAW_SKIP (and the latter sets NANOCLAW_REEXEC_SG) — so the prompt
+  // appears at most once per fresh run.
+  const isResume = process.env.NANOCLAW_REEXEC_SG === '1' || skip.size > 0;
+  if (!isResume && detectExistingInstall(process.cwd())) {
+    const action = ensureAnswer(
+      await brightSelect<'keep' | 'uninstall'>({
+        message: 'NanoClaw is already installed in this folder. What would you like to do?',
+        options: [
+          {
+            value: 'keep',
+            label: 'Keep it & continue setup',
+            hint: 'recommended — re-running setup is safe',
+          },
+          {
+            value: 'uninstall',
+            label: 'Uninstall NanoClaw & exit',
+            hint: 'removes service, data, and agent files — asks before each step',
+          },
+        ],
+        initialValue: 'keep',
+      }),
+    ) as 'keep' | 'uninstall';
+    setupLog.userInput('existing_install', action);
+    phEmit('existing_install_detected', { action });
+    if (action === 'uninstall') {
+      await runUninstallFlow({ dryRun: false, yes: false, invokedFrom: 'setup-detection' });
+    }
+  }
+
  if (!skip.has('environment')) {
    const res = await runQuietStep('environment', {
      running: 'Checking your system…',
@@ -277,8 +325,54 @@ async function main(): Promise<void> {
    }
  }

+  let agentProvider: string | undefined;
  if (!skip.has('auth')) {
-    await runAuthStep();
+    // Agent runtime pick. Claude is the default and a no-op — choosing it
+    // runs the existing Claude auth flow unchanged. A branch provider walks
+    // its own auth (e.g. Codex: ChatGPT subscription or API key, vault-only)
+    // and verifies its payload is wired. The pick installs and authenticates
+    // the runtime; it is NOT an install-wide default — and it is NOT a
+    // creation flag. Provider is a DB property of a group: the creation flows
+    // create provider-agnostic groups, and setup sets the picked provider on
+    // each via `ncl groups config update --provider` right after creating it
+    // (the creation scripts inherit the pick and apply it at create; see
+    // picked-provider). Existing groups switch the same way (docs/provider-migration.md).
+    agentProvider = await askAgentProviderChoice();
+    setPickedProvider(agentProvider);
+    let providerEntry = getSetupProvider(agentProvider);
+    if (agentProvider !== 'claude' && !providerEntry) {
+      // A non-claude provider picked from the hard-wired list isn't wired in
+      // this install yet — install it via its self-contained script (channel
+      // style, idempotent: self-skips if already installed), rebuild the image
+      // (the container step already ran; container/cli-tools.json just changed), then
+      // load the payload's setup module so it self-registers.
+      const install = await runQuietChild(
+        `add-${agentProvider}`,
+        'bash',
+        [`setup/add-${agentProvider}.sh`],
+        {
+          running: `Installing ${agentProvider}…`,
+          done: `${agentProvider} installed.`,
+        },
+      );
+      if (!install.ok) {
+        await fail(
+          `add-${agentProvider}`,
+          `Couldn't install ${agentProvider}.`,
+          'See logs/setup-steps/ for details, then retry setup.',
+        );
+      }
+      p.log.info(brandBody('Rebuilding the container image with the new provider…'));
+      spawnSync('./container/build.sh', [], { stdio: 'inherit' });
+      await import(`./providers/${agentProvider}.js`);
+      providerEntry = getSetupProvider(agentProvider);
+    }
+    if (providerEntry?.runAuth) {
+      await providerEntry.runAuth();
+      await providerEntry.runInstallCheck?.();
+    } else {
+      await runAuthStep();
+    }
  }

  if (!skip.has('mounts')) {
@@ -704,6 +798,39 @@ function sendChatMessage(message: string): Promise<void> {

 // ─── auth step (select → branch) ────────────────────────────────────────

+// Providers offered for install are hard-wired in trunk — an audited control
+// surface (no branch enumeration that anyone with write access could extend).
+// Codex is the only one offered here; opencode/ollama install via their own
+// /add-* skills. Each provider offered here installs via setup/add-<name>.sh.
+const INSTALLABLE_PROVIDERS = [
+  { value: 'codex', label: 'Codex', hint: 'OpenAI — ChatGPT subscription or API key' },
+] as const;
+
+async function askAgentProviderChoice(): Promise<string> {
+  const installed = listSetupProviders();
+  const installedNames = new Set(installed.map((entry) => entry.value));
+  // Offer the hard-wired installable providers this install hasn't wired yet —
+  // selecting one installs it via setup/add-<name>.sh.
+  const available = INSTALLABLE_PROVIDERS.filter((prov) => !installedNames.has(prov.value));
+  const options = [
+    ...installed.map(({ value, label, hint }) => ({ value, label, hint })),
+    ...available.map((prov) => ({ value: prov.value, label: prov.label, hint: `${prov.hint} — installs now` })),
+  ];
+  // The pick installs and authenticates a runtime. It is not an install-wide
+  // default, so on a re-run it is safe to press Enter and accept claude (its
+  // auth flow short-circuits when the secret already exists).
+  const choice = ensureAnswer(
+    await brightSelect<string>({
+      message: 'Which agent runtime should power your assistant?',
+      options,
+      initialValue: 'claude',
+    }),
+  ) as string;
+  setupLog.userInput('agent_provider', choice);
+  phEmit('agent_provider_chosen', { provider: choice });
+  return choice;
+}
+
 async function runAuthStep(): Promise<void> {
  if (anthropicSecretExists()) {
    p.log.success(brandBody('Your Claude account is already connected.'));
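The option list that `askAgentProviderChoice` builds can be sketched as a pure function, which makes the "installed first, then not-yet-wired installables" rule easy to check. This is an illustrative sketch only: `ProviderOption` and `buildProviderOptions` are hypothetical names, and the hint suffix is simplified rather than copied from the code above.

```typescript
// Hypothetical types and names; only the merge logic mirrors the picker above.
interface ProviderOption {
  value: string;
  label: string;
  hint: string;
}

// Installed providers come first; installable ones not yet wired follow,
// with their hint marked so the user knows selecting one triggers an install.
function buildProviderOptions(
  installed: ProviderOption[],
  installable: readonly ProviderOption[],
): ProviderOption[] {
  const installedNames = new Set(installed.map((entry) => entry.value));
  const available = installable.filter((prov) => !installedNames.has(prov.value));
  return [
    ...installed,
    ...available.map((prov) => ({ ...prov, hint: `${prov.hint} (installs now)` })),
  ];
}
```

Once a provider is wired, it appears only in the installed list, so it is never offered twice.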
@@ -1217,7 +1344,7 @@ function detectExistingOnecli(): { version: string; apiHost: string } | null {
    } catch {
      // not JSON — try to extract a URL directly
    }
-    const m = raw.match(/https?:\/\/[\w.\-]+(?::\d+)?/);
+    const m = raw.match(/https?:\/\/[\w.-]+(?::\d+)?/);
    return m ? { version, apiHost: m[0] } : null;
  } catch {
    return null;
@@ -68,8 +68,12 @@ export async function run(args: string[]): Promise<void> {

  log.info('Invoking init-cli-agent', { displayName, agentName });

+  // Provider-agnostic: init-cli-agent creates a default group and emits its id.
+  // Surface that id so the orchestrator can set the picked provider on it (via
+  // ncl) before the ping — provider is a DB property, never a creation flag.
+  let stdout = '';
  try {
-    execFileSync('pnpm', scriptArgs, {
+    stdout = execFileSync('pnpm', scriptArgs, {
      cwd: projectRoot,
      stdio: ['ignore', 'pipe', 'pipe'],
      encoding: 'utf-8',
@@ -90,10 +94,13 @@ export async function run(args: string[]): Promise<void> {
    process.exit(1);
  }

+  const agentGroupId = stdout.match(/^AGENT_GROUP_ID:\s*(\S+)/m)?.[1];
+
  emitStatus('CLI_AGENT', {
    DISPLAY_NAME: displayName,
    AGENT_NAME: agentName || displayName,
    CHANNEL: 'cli/local',
+    ...(agentGroupId ? { AGENT_GROUP_ID: agentGroupId } : {}),
    STATUS: 'success',
    LOG: 'logs/setup.log',
  });
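The id extraction in the step above relies on the child script printing an `AGENT_GROUP_ID: <id>` line somewhere in its stdout. A small sketch of that parse in isolation, using the same regex; `parseAgentGroupId` is a hypothetical name, and the sample ids in the usage note are made up.

```typescript
// Hypothetical helper; the regex is the one used in the step above.
function parseAgentGroupId(stdout: string): string | undefined {
  // The `m` flag lets `^` match at the start of any line, so the marker is
  // found even when other output precedes it. The marker must start the line.
  return stdout.match(/^AGENT_GROUP_ID:\s*(\S+)/m)?.[1];
}
```

For example, `parseAgentGroupId('Created\nAGENT_GROUP_ID: ag_123\n')` yields `'ag_123'`, while output without the marker yields `undefined`, which is why the status block spreads the field in conditionally.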
@@ -35,6 +35,29 @@ export function readEnvKey(key: string, projectRoot?: string): string | null {
  return null;
 }

+/**
+ * Set (or replace) a single `KEY=value` line in `.env`, creating the file if
+ * needed. Non-secret config only — secrets belong in the OneCLI vault.
+ */
+export function upsertEnvKey(key: string, value: string, projectRoot?: string): void {
+  const envPath = path.join(projectRoot ?? process.cwd(), '.env');
+  let content = '';
+  try {
+    content = fs.readFileSync(envPath, 'utf-8');
+  } catch {
+    /* no .env yet */
+  }
+  const line = `${key}=${value}`;
+  const lines = content.split('\n');
+  // Trim trailing blank lines up front so a replace doesn't grow the file by
+  // one newline on every call (the final join re-adds exactly one).
+  while (lines.length > 0 && lines[lines.length - 1].trim() === '') lines.pop();
+  const idx = lines.findIndex((l) => l.trim().startsWith(`${key}=`));
+  if (idx >= 0) lines[idx] = line;
+  else lines.push(line);
+  fs.writeFileSync(envPath, lines.join('\n') + '\n');
+}
+
 export function detectExistingDisplayName(projectRoot: string): string | null {
  const dbPath = path.join(projectRoot, 'data', 'v2.db');
  if (!fs.existsSync(dbPath)) return null;
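The `.env` upsert above is easiest to reason about when the text transform is separated from the file I/O. A minimal sketch under that assumption: `upsertEnvLine` is a hypothetical pure variant, not a function in the repository. This variant trims trailing blank lines before searching, so repeated replaces keep exactly one trailing newline.

```typescript
// Hypothetical pure variant of the upsert: text in, text out, no file I/O.
function upsertEnvLine(content: string, key: string, value: string): string {
  const lines = content.split('\n');
  // Drop trailing blank lines; exactly one newline is re-added on return.
  while (lines.length > 0 && lines[lines.length - 1].trim() === '') lines.pop();
  const line = `${key}=${value}`;
  const idx = lines.findIndex((l) => l.trim().startsWith(`${key}=`));
  if (idx >= 0) lines[idx] = line; // replace the existing assignment in place
  else lines.push(line); // otherwise append as the last line
  return lines.join('\n') + '\n';
}
```

Applying the function twice with the same arguments returns the same text, which is the property a setup step that may be re-run needs.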
@@ -23,6 +23,7 @@ const STEPS: Record<
  verify: () => import('./verify.js'),
  onecli: () => import('./onecli.js'),
  auth: () => import('./auth.js'),
+  'provider-auth': () => import('./provider-auth.js'),
  'cli-agent': () => import('./cli-agent.js'),
 };

@@ -66,17 +66,43 @@ export interface BrightSelectOptions<T> {
  initialValue?: T;
 }

+/**
+ * Discard any stdin buffered while no prompt was reading — keypresses made
+ * during spinners and installs otherwise get consumed by the next select the
+ * instant it opens, submitting it before it ever renders for the user (a
+ * stray `↓`+`Enter` silently picks option 2). Raw-mode reads only see kernel
+ * tty data via the event loop, so the drain needs a real (short) window.
+ */
+export function flushStdin(windowMs = 50): Promise<void> {
+  return new Promise((resolve) => {
+    const stdin = process.stdin;
+    if (!stdin.isTTY) return resolve();
+    const wasRaw = stdin.isRaw === true;
+    stdin.setRawMode?.(true);
+    const discard = (): void => {};
+    stdin.on('data', discard);
+    stdin.resume();
+    setTimeout(() => {
+      stdin.off('data', discard);
+      stdin.pause();
+      if (!wasRaw) stdin.setRawMode?.(false);
+      resolve();
+    }, windowMs);
+  });
+}
+
 /**
 * Matches the return shape of `p.select` — resolves to the selected value
 * on submit, or to clack's cancel symbol on Ctrl-C / Esc. Callers pass
 * the result through `ensureAnswer(...)` the same way they do for
 * `p.select`.
 */
-export function brightSelect<T>(
+export async function brightSelect<T>(
  opts: BrightSelectOptions<T>,
 ): Promise<T | symbol> {
  const { message, options, initialValue } = opts;

+  await flushStdin();
  return new SelectPrompt({
    options: options as Array<{ value: T; label?: string; hint?: string }>,
    initialValue,
@@ -11,9 +11,17 @@
 *   1. Build a handoff prompt from the caller's context: channel, current
 *      step, completed steps, collected values (secrets redacted), relevant
 *      files to read.
- *   2. Spawn `claude --append-system-prompt "<context>"
- *      --permission-mode acceptEdits` with `stdio: 'inherit'` so Claude owns
- *      the terminal.
+ *   2. Spawn `claude "<prompt>" --permission-mode auto` with
+ *      `stdio: 'inherit'` so Claude owns the terminal. The positional prompt
+ *      is auto-submitted as the first user message, so Claude starts
+ *      orienting immediately instead of sitting at an empty prompt — and the
+ *      context stays visible in the transcript and survives `--resume`,
+ *      which an --append-system-prompt would not.
+ *   2a. All handoffs in one setup run share a single session: the first
+ *      spawn pins a generated UUID via `--session-id`, later spawns pass
+ *      `--resume <uuid>` so Claude keeps the context of earlier handoffs.
+ *      (stdio is inherited, so we can't *read* the session id Claude picks —
+ *      pinning our own is the only way to find the session again.)
 *   3. When Claude exits (user types /exit, Ctrl-D, or closes the session),
 *      control returns to the setup driver. The driver can then re-offer the
 *      same step (e.g., "How did that go?" select).
@@ -23,6 +31,7 @@
 * attempting to parse it as a real answer.
 */
 import { execSync, spawn } from 'child_process';
+import { randomUUID } from 'crypto';
 import path from 'path';

 import * as p from '@clack/prompts';
@@ -61,8 +70,8 @@ export interface HandoffContext {
 }

 /**
- * Spawn interactive Claude with context pre-loaded as a system-prompt
- * append. Returns when Claude exits.
+ * Spawn interactive Claude with the handoff context as an auto-submitted
+ * first prompt. Returns when Claude exits.
 *
 * Silently no-ops (returns `false`) if `claude` isn't on PATH — setup runs
 * where the binary is guaranteed to exist (we install it in the auth step),
@@ -78,8 +87,6 @@ export async function offerClaudeHandoff(ctx: HandoffContext): Promise<boolean>
    return false;
  }

-  const systemPrompt = buildSystemPrompt(ctx);
-
  note(
    [
      "I'm handing you off to Claude in interactive mode.",
@@ -90,18 +97,39 @@ export async function offerClaudeHandoff(ctx: HandoffContext): Promise<boolean>
    'Handing off to Claude',
  );

+  return spawnInteractiveClaude(buildHandoffPrompt(ctx));
+}
+
+// One session shared by every interactive handoff in this setup-driver
+// process. We pin the id ourselves (--session-id) on the first spawn because
+// stdio is inherited and Claude's own id is never visible to us; subsequent
+// spawns --resume it so Claude remembers earlier handoffs. Separate from
+// claude-assist's non-interactive session — the two formats don't mix.
+const handoffSessionId = randomUUID();
+let handoffSessionStarted = false;
+
+/**
+ * Spawn interactive Claude with the handoff context auto-submitted as the
+ * first user message. Resolves when Claude exits and control returns to
+ * the setup driver.
+ */
+function spawnInteractiveClaude(prompt: string): Promise<boolean> {
+  const sessionArgs = handoffSessionStarted
+    ? ['--resume', handoffSessionId]
+    : ['--session-id', handoffSessionId];
  return new Promise<boolean>((resolve) => {
    const child = spawn(
      'claude',
      [
-        '--append-system-prompt',
-        systemPrompt,
+        prompt,
        '--permission-mode',
-        'acceptEdits',
+        'auto',
+        ...sessionArgs,
      ],
      { stdio: 'inherit' },
    );
    child.on('close', () => {
+      handoffSessionStarted = true;
      p.log.success(brandBody("Back from Claude. Let's continue."));
      resolve(true);
    });
@@ -164,20 +192,20 @@ function isClaudeUsable(): boolean {
  }
 }

-function buildSystemPrompt(ctx: HandoffContext): string {
+function buildHandoffPrompt(ctx: HandoffContext): string {
  const lines: string[] = [
-    `The user is running NanoClaw's interactive \`setup:auto\` flow to wire the ${ctx.channel} channel.`,
-    `They got stuck at the step: "${ctx.step}" (${ctx.stepDescription}) and asked for help.`,
+    `I'm running NanoClaw's interactive \`setup:auto\` flow to wire the ${ctx.channel} channel`,
+    `and got stuck at the step: "${ctx.step}" (${ctx.stepDescription}).`,
    '',
-    "Your job: help them complete this specific step and get back to setup.",
-    "You can read files, run commands (with acceptEdits permissions), search the web,",
-    "and explain concepts. Be concise. When they're ready to resume, tell them to type",
-    "/exit and they'll return to the setup flow at the same step.",
+    'Help me complete this specific step and get back to setup.',
+    'You can read files, run commands, search the web,',
+    "and explain concepts. Be concise. When I'm ready to resume, remind me to type",
+    "/exit and I'll return to the setup flow at the same step.",
    '',
  ];

  if (ctx.completedSteps && ctx.completedSteps.length > 0) {
-    lines.push('Steps they have already completed:');
+    lines.push("Steps I've already completed:");
    for (const s of ctx.completedSteps) lines.push(`  ✓ ${s}`);
    lines.push('');
  }
@@ -243,8 +271,6 @@ async function offerFailureHandoff(
  );
  if (!want) return false;

-  const systemPrompt = buildFailureSystemPrompt(ctx, projectRoot);
-
  note(
    [
      "Launching Claude to help debug this failure.",
@@ -255,29 +281,10 @@ async function offerFailureHandoff(
    'Handing off to Claude',
  );

-  return new Promise<boolean>((resolve) => {
-    const child = spawn(
-      'claude',
-      [
-        '--append-system-prompt',
-        systemPrompt,
-        '--permission-mode',
-        'acceptEdits',
-      ],
-      { stdio: 'inherit' },
-    );
-    child.on('close', () => {
-      p.log.success(brandBody("Back from Claude. Let's continue."));
-      resolve(true);
-    });
-    child.on('error', () => {
-      p.log.error("Couldn't launch Claude. Continuing without handoff.");
-      resolve(false);
-    });
-  });
+  return spawnInteractiveClaude(buildFailurePrompt(ctx, projectRoot));
 }

-function buildFailureSystemPrompt(ctx: AssistContext, projectRoot: string): string {
+function buildFailurePrompt(ctx: AssistContext, projectRoot: string): string {
  const stepRefs = STEP_FILES[ctx.stepName] ?? [];
  const references = [
    ...BIG_PICTURE_FILES,
@@ -289,20 +296,20 @@ function buildFailureSystemPrompt(ctx: AssistContext, projectRoot: string): stri
  ].filter((v, i, a) => a.indexOf(v) === i);

  const lines: string[] = [
-    "The user is running NanoClaw's interactive setup flow and hit a failure.",
+    "I'm running NanoClaw's interactive setup flow and hit a failure.",
    '',
    `Failed step: ${ctx.stepName}`,
    `Error: ${ctx.msg}`,
  ];

-  if (ctx.hint) lines.push(`Hint: ${ctx.hint}`);
+  if (ctx.hint) lines.push(`Hint shown to me: ${ctx.hint}`);

  lines.push(
    '',
-    'Your job: help them diagnose and fix this issue. Read the referenced files',
-    'and logs to understand what went wrong, then help them fix it. You can read',
-    'files, run commands, check logs, and explain what happened. Be concise.',
-    "When they're ready to resume setup, tell them to type /exit.",
+    'Help me diagnose and fix this issue. Read the referenced files and logs',
+    'to understand what went wrong, then help me fix it. You can read files,',
+    'run commands, check logs, and explain what happened. Be concise.',
+    "When I'm ready to resume setup, remind me to type /exit.",
    '',
    'Relevant files (read as needed with the Read tool):',
  );
@@ -16,7 +16,13 @@ const INSTALL_ID_PATH = path.join('data', 'install-id');

 let cached: string | null = null;

-export function installId(): string {
+/**
+ * `persist: false` reads an existing id but never creates `data/install-id`.
+ * The uninstall path requires this: it must not write to the filesystem
+ * before (or instead of) removing the install. Events emitted in one process
+ * still share a distinct_id, because the generated id is cached in memory.
+ */
+export function installId(persist = true): string {
  if (cached) return cached;
  try {
    const existing = fs.readFileSync(INSTALL_ID_PATH, 'utf-8').trim();
@@ -28,11 +34,13 @@ export function installId(): string {
    // fall through to create
  }
  const id = randomUUID().toLowerCase();
-  try {
-    fs.mkdirSync(path.dirname(INSTALL_ID_PATH), { recursive: true });
-    fs.writeFileSync(INSTALL_ID_PATH, id);
-  } catch {
-    // best-effort; still return the id so the event fires
+  if (persist) {
+    try {
+      fs.mkdirSync(path.dirname(INSTALL_ID_PATH), { recursive: true });
+      fs.writeFileSync(INSTALL_ID_PATH, id);
+    } catch {
+      // best-effort; still return the id so the event fires
+    }
  }
  cached = id;
  return id;
@@ -41,6 +49,7 @@ export function installId(): string {
 export function emit(
  event: string,
  props: Record<string, string | number | boolean | undefined> = {},
+  opts: { persistId?: boolean } = {},
 ): void {
  if (process.env.NANOCLAW_NO_DIAGNOSTICS === '1') return;

@@ -53,7 +62,7 @@ export function emit(
  const body = JSON.stringify({
    api_key: POSTHOG_KEY,
    event,
-    distinct_id: installId(),
+    distinct_id: installId(opts.persistId !== false),
    properties: cleaned,
  });

@@ -0,0 +1,28 @@
+/**
+ * The agent runtime the operator picked in THIS setup run.
+ *
+ * There is no install-wide default provider and no `--provider` in the
+ * creation contract — provider is a DB property of a group. Setup is the one
+ * orchestrator that knows the operator's pick, so it stashes it here (set once
+ * at the auth step). The group-creation scripts (`init-first-agent`,
+ * `init-cli-agent`) run as **child processes**, so the pick is carried over the
+ * process boundary via an environment variable they inherit; they apply it to
+ * the group at creation, before the welcome wakes the container. This is the
+ * only place the value lives — a setup-run-scoped global, NOT a persisted
+ * install default. `undefined` / `'claude'` means the built-in default and no
+ * provider write at all.
+ */
+const ENV_KEY = 'NANOCLAW_PICKED_PROVIDER';
+
+export function setPickedProvider(provider: string | undefined): void {
+  const normalized = provider?.trim().toLowerCase() || undefined;
+  if (normalized && normalized !== 'claude') {
+    process.env[ENV_KEY] = normalized;
+  } else {
+    delete process.env[ENV_KEY];
+  }
+}
+
+export function getPickedProvider(): string | undefined {
+  return process.env[ENV_KEY]?.trim().toLowerCase() || undefined;
+}
@@ -132,6 +132,32 @@ export const CONFIG: Entry[] = [
    type: 'boolean',
    default: false,
  },
+
+  // Uninstall route — handled in auto.ts before any setup work begins.
+  {
+    key: 'uninstall',
+    label: 'Uninstall',
+    help: 'Remove this NanoClaw copy (service, containers, data, vault agents). Asks per group.',
+    surface: 'flag',
+    type: 'boolean',
+    default: false,
+  },
+  {
+    key: 'dryRun',
+    label: 'Uninstall dry run',
+    help: 'With --uninstall: preview what would be removed without changing anything.',
+    surface: 'flag',
+    type: 'boolean',
+    default: false,
+  },
+  {
+    key: 'yes',
+    label: 'Uninstall without prompts',
+    help: 'With --uninstall: delete everything found without asking (orphan vault agents are still kept).',
+    surface: 'flag',
+    type: 'boolean',
+    default: false,
+  },
 ];

 // ─── name derivation ───────────────────────────────────────────────────
@@ -0,0 +1,48 @@
+/**
+ * versions.json is the machine-checkable source for sanctioned component
+ * versions: setup steps read it, /update-nanoclaw diffs it across updates.
+ * These tests go red if the file, the pin, or the onecli-step wiring is
+ * deleted — the pin moving back to a hardcoded constant is the regression
+ * this guards against.
+ */
+import fs from 'fs';
+import path from 'path';
+import { fileURLToPath } from 'url';
+
+import { describe, expect, it } from 'vitest';
+
+import { readVersionPin } from './version-pins.js';
+
+const here = path.dirname(fileURLToPath(import.meta.url));
+
+describe('readVersionPin', () => {
+  it('resolves the onecli-gateway pin from the real versions.json', () => {
+    expect(readVersionPin('onecli-gateway')).toMatch(/^\d+\.\d+\.\d+$/);
+  });
+
+  it('resolves the onecli-cli pin from the real versions.json', () => {
+    expect(readVersionPin('onecli-cli')).toMatch(/^\d+\.\d+\.\d+$/);
+  });
+
+  it('throws for a component with no pin', () => {
+    expect(() => readVersionPin('no-such-component')).toThrow(/no pin/);
+  });
+});
+
+describe('onecli step wiring', () => {
+  it('reads its gateway pin from versions.json, not a hardcoded constant', () => {
+    const source = fs.readFileSync(path.join(here, '..', 'onecli.ts'), 'utf-8');
+    expect(source).toContain("readVersionPin('onecli-gateway')");
+    expect(source).not.toMatch(/ONECLI_GATEWAY_VERSION = '\d/);
+  });
+
+  it('reads its CLI pin from versions.json and never resolves "latest"', () => {
+    const source = fs.readFileSync(path.join(here, '..', 'onecli.ts'), 'utf-8');
+    expect(source).toContain("readVersionPin('onecli-cli')");
+    expect(source).not.toMatch(/ONECLI_CLI(?:_FALLBACK)?_VERSION = '\d/);
+    // The upstream installer and the /releases/latest redirect probe both
+    // chase "latest" — reintroducing either bypasses the sanctioned pin.
+    expect(source).not.toContain('onecli.sh/cli/install');
+    expect(source).not.toContain('/releases/latest');
+  });
+});
@@ -0,0 +1,31 @@
+/**
+ * Sanctioned version pins for external components (`versions.json` at the
+ * repo root) — the single machine-checkable source. Setup steps read their
+ * pin here; `/update-nanoclaw` diffs the file across an update and routes
+ * the user to the migration doc for any pin that moved (see CONTRIBUTING.md,
+ * "Breaking changes").
+ */
+import fs from 'fs';
+import path from 'path';
+import { fileURLToPath } from 'url';
+
+const VERSIONS_FILE = path.resolve(
+  path.dirname(fileURLToPath(import.meta.url)),
+  '..',
+  '..',
+  'versions.json',
+);
+
+/**
+ * Returns the pinned version for a component, e.g.
+ * `readVersionPin('onecli-gateway')`. Throws when the file or the pin is
+ * missing — a missing pin is an install-tree defect, not a runtime condition.
+ */
+export function readVersionPin(component: string): string {
+  const pins: unknown = JSON.parse(fs.readFileSync(VERSIONS_FILE, 'utf-8'));
+  const value = (pins as Record<string, unknown>)[component];
+  if (typeof value !== 'string' || value.length === 0) {
+    throw new Error(`versions.json has no pin for "${component}"`);
+  }
+  return value;
+}
@@ -0,0 +1,29 @@
+/**
+ * The step DETECTS gateway /v1 compatibility and warns (pointing at
+ * docs/onecli-upgrades.md) — it does not migrate the gateway; that's the
+ * agent's job via /update-nanoclaw. The verify helper must distinguish
+ * incompatible (pre-/v1 server: warn) from unreachable (transient: nothing to
+ * say) so the warning only fires on a real pre-/v1 server.
+ */
+import { describe, expect, it } from 'vitest';
+
+import { verifyGatewayV1 } from './onecli.js';
+
+function fakeFetch(behavior: 'ok' | '404' | 'down'): typeof fetch {
+  return (async () => {
+    if (behavior === 'down') throw new Error('ECONNREFUSED');
+    return { ok: behavior === 'ok' } as Response;
+  }) as unknown as typeof fetch;
+}
+
+describe('verifyGatewayV1', () => {
+  it('ok when /v1/health answers', async () => {
+    expect(await verifyGatewayV1('http://x', fakeFetch('ok'))).toBe('ok');
+  });
+  it('incompatible when the server answers HTTP without /v1', async () => {
+    expect(await verifyGatewayV1('http://x', fakeFetch('404'))).toBe('incompatible');
+  });
+  it('unreachable on connection failure', async () => {
+    expect(await verifyGatewayV1('http://x', fakeFetch('down'))).toBe('unreachable');
+  });
+});
@@ -17,6 +17,7 @@ import os from 'os';
 import path from 'path';

 import { log } from '../src/log.js';
+import { readVersionPin } from './lib/version-pins.js';
 import { emitStatus } from './status.js';

 const LOCAL_BIN = path.join(os.homedir(), '.local', 'bin');
@@ -102,20 +103,18 @@ function writeEnvOnecliUrl(url: string): void {
  writeEnvVar('ONECLI_URL', url);
 }

-// Last-known-good CLI release. Used only if BOTH the upstream installer
-// and the redirect-based version probe fail. Bump deliberately when a
-// new CLI release ships.
-const ONECLI_GATEWAY_VERSION = '1.23.0';
-const ONECLI_CLI_FALLBACK_VERSION = '1.3.0';
+// The SANCTIONED gateway version: fresh installs pin to it. Upgrading an
+// existing gateway is NOT done here — the gateway is a separate out-of-band
+// component, and the migrator is the user's coding agent following
+// docs/onecli-upgrades.md during /update-nanoclaw. The pin lives in
+// versions.json ("onecli-gateway") so that flow can diff it across updates and
+// route the agent to the doc; bump it there deliberately on a new release.
+const ONECLI_GATEWAY_VERSION = readVersionPin('onecli-gateway');
+// The CLI binary follows the same convention: installed at its pin
+// ("onecli-cli" in versions.json), never at whatever "latest" means today.
+const ONECLI_CLI_VERSION = readVersionPin('onecli-cli');
 const ONECLI_CLI_REPO = 'onecli/onecli-cli';

-function installOnecliCliOnly(): { stdout: string; ok: boolean } {
-  const upstream = runInstall('curl -fsSL onecli.sh/cli/install | sh');
-  if (upstream.ok) return { stdout: upstream.stdout, ok: true };
-  const fallback = installOnecliCliDirect();
-  return { stdout: upstream.stdout + (upstream.stderr ?? '') + '\n' + fallback.stdout, ok: fallback.ok };
-}
-
 // Remove containers in the "onecli" compose project whose service name isn't
 // in the v2 set. Pre-v2 OneCLI used service "app" (container onecli-app-1);
 // v2 uses "onecli". Compose flags the old container as an orphan but won't
@@ -161,24 +160,10 @@ function installOnecli(): { stdout: string; ok: boolean } {
    return { stdout: stdout + (gw.stderr ?? ''), ok: false };
  }

-  // CLI install. The upstream script calls the GitHub releases API
-  // (api.github.com) to resolve the latest tag — which 403s anonymous
-  // callers after 60 requests/hour per IP. Try upstream first; on failure
-  // resolve the version ourselves (via HTTP redirect, which isn't
-  // API-throttled) and download the release archive directly.
-  const upstream = runInstall('curl -fsSL onecli.sh/cli/install | sh');
-  stdout += upstream.stdout;
-  if (upstream.ok) return { stdout, ok: true };
-
-  log.warn('Upstream CLI installer failed — falling back to direct download', {
-    stderr: upstream.stderr,
-  });
-  stdout += (upstream.stderr ?? '') + '\n';
-
-  const fallback = installOnecliCliDirect();
-  stdout += fallback.stdout;
-  if (!fallback.ok) {
-    log.error('OneCLI CLI install failed (both upstream and direct fallback)');
+  const cli = installOnecliCliDirect();
+  stdout += cli.stdout;
+  if (!cli.ok) {
+    log.error('OneCLI CLI install failed');
    return { stdout, ok: false };
  }
  return { stdout, ok: true };
@@ -198,11 +183,11 @@ function runInstall(cmd: string): { stdout: string; stderr?: string; ok: boolean
 }

 /**
- * Reinstate the OneCLI CLI install without hitting GitHub's rate-limited
- * releases API. Resolves the version via the HTTP redirect from
- * /releases/latest → /releases/tag/vX.Y.Z, then downloads the archive
- * directly. Falls back to ONECLI_CLI_FALLBACK_VERSION if the redirect
- * probe also fails.
+ * Install the OneCLI CLI at the sanctioned pin by downloading the release
+ * archive straight from GitHub. Deliberately no "latest" resolution — the
+ * upstream installer script always chases the newest release, which would
+ * drift from the pin. PATH setup is not lost by skipping it:
+ * ensureShellProfilePath() in run() covers it.
 */
 function installOnecliCliDirect(): { stdout: string; ok: boolean } {
  const lines: string[] = [];
@@ -221,24 +206,7 @@ function installOnecliCliDirect(): { stdout: string; ok: boolean } {
    return { stdout: lines.join('\n'), ok: false };
  }

-  let version: string | null = null;
-  try {
-    const redirect = execSync(
-      `curl -fsSL -o /dev/null -w '%{url_effective}' https://github.com/${ONECLI_CLI_REPO}/releases/latest`,
-      { encoding: 'utf-8', stdio: ['ignore', 'pipe', 'pipe'] },
-    ).trim();
-    const m = redirect.match(/\/tag\/v?([^/]+)$/);
-    if (m) version = m[1];
-  } catch {
-    // redirect probe failed — we'll pin the fallback
-  }
-  if (!version) {
-    version = ONECLI_CLI_FALLBACK_VERSION;
-    append(`Version probe failed; installing pinned fallback ${version}.`);
-  } else {
-    append(`Resolved onecli CLI ${version} via release redirect.`);
-  }
-
+  const version = ONECLI_CLI_VERSION;
  const archive = `onecli_${version}_${osName}_${arch}.tar.gz`;
  const url = `https://github.com/${ONECLI_CLI_REPO}/releases/download/v${version}/${archive}`;
  const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'onecli-'));
@@ -275,6 +243,39 @@ function installOnecliCliDirect(): { stdout: string; ok: boolean } {
  }
 }

+/**
+ * /v1 API compatibility check. @onecli-sh/sdk 2.x requires the server's /v1
+ * API; servers older than the cutover answer 404 on every SDK call (permanent,
+ * but presents as transient per-spawn failures). This is detect-only — setup
+ * does not migrate the gateway. The upgrade is an out-of-band action on a
+ * separate component that the agent runs via docs/onecli-upgrades.md during
+ * /update-nanoclaw, so this step only surfaces the condition and points there.
+ */
+export async function verifyGatewayV1(
+  url: string,
+  fetchImpl: typeof fetch = fetch,
+): Promise<'ok' | 'incompatible' | 'unreachable'> {
+  try {
+    const res = await fetchImpl(`${url}/v1/health`, { signal: AbortSignal.timeout(5000) });
+    return res.ok ? 'ok' : 'incompatible';
+  } catch {
+    return 'unreachable';
+  }
+}
+
+/**
+ * Detect-and-warn helper: returns a status HINT (and logs) when the gateway is
+ * pre-/v1, else null. Never fails the step or auto-upgrades — the agent owns
+ * the upgrade via docs/onecli-upgrades.md.
+ */
+function gatewayV1Hint(result: 'ok' | 'incompatible' | 'unreachable'): string | null {
+  if (result !== 'incompatible') return null;
+  log.warn('OneCLI gateway lacks the /v1 API @onecli-sh/sdk 2.x requires', {
+    pin: ONECLI_GATEWAY_VERSION,
+  });
+  return 'OneCLI gateway lacks the /v1 API @onecli-sh/sdk 2.x requires — upgrade it: docs/onecli-upgrades.md';
+}
+
 export async function pollHealth(url: string, timeoutMs: number): Promise<boolean> {
  // `/api/health` matches the path probe.sh uses — keep them aligned.
  const deadline = Date.now() + timeoutMs;
@@ -300,7 +301,7 @@ export async function run(args: string[]): Promise<void> {
    // Remote-mode: install only the CLI, point it at the remote gateway, and
    // record the URL in .env. No local gateway is started.
    log.info('Installing OneCLI CLI for remote gateway', { remoteUrl });
-    const res = installOnecliCliOnly();
+    const res = installOnecliCliDirect();
    if (!res.ok || !onecliVersion()) {
      emitStatus('ONECLI', {
        INSTALLED: false,
@@ -339,12 +340,14 @@ export async function run(args: string[]): Promise<void> {
      log.info('Wrote ONECLI_API_KEY to .env');
    }
    const healthy = await pollHealth(remoteUrl, 5000);
+    const v1Hint = healthy ? gatewayV1Hint(await verifyGatewayV1(remoteUrl)) : null;
    emitStatus('ONECLI', {
      INSTALLED: true,
      REMOTE: true,
      ONECLI_URL: remoteUrl,
      HEALTHY: healthy,
      STATUS: 'success',
+      ...(v1Hint ? { GATEWAY_HINT: v1Hint } : {}),
      LOG: 'logs/setup.log',
    });
    return;
@@ -378,12 +381,14 @@ export async function run(args: string[]): Promise<void> {
    writeEnvOnecliUrl(url);
    log.info('Reusing existing OneCLI', { url });
    const healthy = await pollHealth(url, 5000);
+    const v1Hint = healthy ? gatewayV1Hint(await verifyGatewayV1(url)) : null;
    emitStatus('ONECLI', {
      INSTALLED: true,
      REUSED: true,
      ONECLI_URL: url,
      HEALTHY: healthy,
      STATUS: 'success',
+      ...(v1Hint ? { GATEWAY_HINT: v1Hint } : {}),
      LOG: 'logs/setup.log',
    });
    return;
@@ -436,6 +441,7 @@ export async function run(args: string[]): Promise<void> {
  log.info('Wrote ONECLI_URL to .env', { url });

  const healthy = await pollHealth(url, 15000);
+  const v1Hint = healthy ? gatewayV1Hint(await verifyGatewayV1(url)) : null;

  emitStatus('ONECLI', {
    INSTALLED: true,
@@ -446,6 +452,7 @@ export async function run(args: string[]): Promise<void> {
    // The next step (auth) will surface a genuinely broken gateway via
    // `onecli secrets list`, so don't trigger rescue attempts from here.
    STATUS: 'success',
+    ...(v1Hint ? { GATEWAY_HINT: v1Hint } : {}),
    ...(healthy
      ? {}
      : {
@@ -0,0 +1,80 @@
+/**
+ * Standalone provider auth — the late-adopter entry point.
+ *
+ * Fresh installs reach a provider's auth walk-through via the setup picker;
+ * an existing install adding a provider later runs THIS instead:
+ *
+ *   pnpm exec tsx setup/index.ts --step provider-auth codex
+ *
+ * Same walk-through, same vault-only invariant, idempotent (each provider's
+ * runAuth short-circuits when its secret already exists) — and unlike
+ * re-running full setup, it touches nothing else: no install-wide default
+ * provider rewrite, no service changes. Provider install skills call this as
+ * their auth step so there is exactly one auth implementation per provider.
+ */
+import { execSync } from 'child_process';
+import fs from 'fs';
+import path from 'path';
+
+import { getSetupProvider, listSetupProviders } from './providers/registry.js';
+// Provider payloads self-register on import.
+import './providers/index.js';
+
+// Hard-wired install scripts — the audited control surface (no branch
+// enumeration). Each setup/add-<name>.sh is idempotent and self-skips when the
+// payload is already wired. Codex is the only manifest-style provider today.
+const INSTALL_SCRIPTS: Record<string, string> = {
+  codex: 'setup/add-codex.sh',
+};
+
+export async function run(args: string[]): Promise<void> {
+  const name = args[0]?.trim().toLowerCase();
+  const withAuth = listSetupProviders().filter((entry) => entry.runAuth);
+
+  if (!name) {
+    console.error(
+      `Usage: pnpm exec tsx setup/index.ts --step provider-auth <provider>\n` +
+        `Providers with an auth step: ${withAuth.map((entry) => entry.value).join(', ') || '(none installed)'}`,
+    );
+    process.exit(1);
+  }
+
+  let entry = getSetupProvider(name);
+  const script = INSTALL_SCRIPTS[name];
+  if (script) {
+    // Install OR refresh: the script is idempotent and is also the upgrade
+    // path — payload files resync and a bumped Dockerfile pin replaces the
+    // local one. Rebuild the image only when the Dockerfile actually changed
+    // (payload code is mounted, not baked).
+    const dfPath = path.join(process.cwd(), 'container', 'Dockerfile');
+    const dfBefore = fs.readFileSync(dfPath, 'utf-8');
+    console.log(`${entry ? 'Refreshing' : 'Installing'} ${name}…`);
+    execSync(`bash ${script}`, { stdio: 'inherit' });
+    if (fs.readFileSync(dfPath, 'utf-8') !== dfBefore) {
+      console.log('Dockerfile pin changed — rebuilding the container image…');
+      execSync('./container/build.sh', { stdio: 'inherit' });
+    }
+    if (!entry) {
+      await import(`./providers/${name}.js`);
+      entry = getSetupProvider(name);
+    }
+    if (!entry) {
+      console.error(`Install completed but ${name} did not register — check setup/providers/${name}.ts`);
+      process.exit(1);
+    }
+  } else if (!entry) {
+    console.error(
+      `Unknown provider: ${name}. Installed: ${listSetupProviders()
+        .map((e) => e.value)
+        .join(', ')}.`,
+    );
+    process.exit(1);
+  }
+  if (!entry.runAuth) {
+    console.error(`Provider "${name}" uses the standard auth flow — run the full setup, or /add-${name}'s steps.`);
+    process.exit(1);
+  }
+
+  await entry.runAuth();
+  await entry.runInstallCheck?.();
+}
@@ -0,0 +1,83 @@
+import { describe, it, expect } from 'vitest';
+import fs from 'fs';
+import path from 'path';
+import { fileURLToPath } from 'url';
+
+/**
+ * Provider is a DB property of a group, set only via
+ * `ncl groups config update --provider`. The group-creation contract that a
+ * fork's coding agent and its skills depend on must carry zero provider
+ * vocabulary — no `--provider` flag passed to, parsed by, or threaded through
+ * any creation path. These guards go red if that flag creeps back in.
+ *
+ * (Prose references to the ncl surface in comments are fine — we assert the
+ * absence of the `'--provider'` arg *literal*, not the substring.)
+ */
+const repoRoot = path.resolve(path.dirname(fileURLToPath(import.meta.url)), '..');
+
+function read(rel: string): string {
+  return fs.readFileSync(path.join(repoRoot, rel), 'utf-8');
+}
+
+const CREATION_FILES = [
+  'scripts/init-first-agent.ts',
+  'scripts/init-cli-agent.ts',
+  'setup/register.ts',
+  'setup/cli-agent.ts',
+  'setup/channels/telegram.ts',
+  'setup/channels/discord.ts',
+  'setup/channels/slack.ts',
+  'setup/channels/whatsapp.ts',
+  'setup/channels/signal.ts',
+  'setup/channels/imessage.ts',
+  'setup/channels/teams.ts',
+];
+
+describe('creation is provider-agnostic', () => {
+  for (const file of CREATION_FILES) {
+    it(`${file} passes/parses no --provider flag`, () => {
+      const src = read(file);
+      expect(src).not.toContain("'--provider'");
+      expect(src).not.toMatch(/case '--provider'/);
+    });
+  }
+});
+
+describe('setup carries the picked provider to creation via a setup-run env var', () => {
+  it('picked-provider stashes/reads the pick in the NANOCLAW_PICKED_PROVIDER env var', () => {
+    const src = read('setup/lib/picked-provider.ts');
+    expect(src).toContain('NANOCLAW_PICKED_PROVIDER');
+    // The pick is set into process.env so child creation scripts inherit it —
+    // an in-process module global can't cross the process boundary.
+    expect(src).toMatch(/process\.env\[/);
+  });
+
+  // The creation scripts run as child processes, inherit the env var, and apply
+  // it to the group's runtime config — container_configs.provider, the source of
+  // truth materialized into container.json (agent_provider is deprecated) — before
+  // the welcome wakes the container. No `--provider` flag in the contract (above).
+  for (const file of ['scripts/init-first-agent.ts', 'scripts/init-cli-agent.ts']) {
+    it(`${file} applies the env-carried provider to container_configs.provider`, () => {
+      const src = read(file);
+      expect(src).toContain('NANOCLAW_PICKED_PROVIDER');
+      expect(src).toMatch(/updateContainerConfigScalars\([^)]*provider:\s*pickedProvider/);
+    });
+  }
+});
+
+describe('codex installs from a hard-wired self-contained script', () => {
+  // The provider picker no longer enumerates a remote manifest branch (an
+  // unaudited control surface). Codex is offered in trunk and installed by its
+  // own setup/add-<name>.sh, exactly like a channel adapter.
+  it('setup/add-codex.sh exists', () => {
+    expect(fs.existsSync(path.join(repoRoot, 'setup/add-codex.sh'))).toBe(true);
+  });
+
+  it('setup/auto.ts installs the picked provider by running setup/add-<name>.sh', () => {
+    const src = read('setup/auto.ts');
+    expect(src).toContain('setup/add-${agentProvider}.sh');
+    // The removed branch-enumeration machinery must not creep back in.
+    expect(src).not.toContain('listBranchProviderManifests');
+    expect(src).not.toContain('installProviderFromBranch');
+  });
+});
@@ -0,0 +1,3 @@
+// Setup-side provider barrel. Provider payloads with their own setup surface
+// (picker entry, auth walk-through, install check) self-register on import.
+// Skills add a provider by appending one import line below.
@@ -0,0 +1,43 @@
+/**
+ * Setup-side provider registration guards.
+ *
+ * Behavior (barrel-driven): imports the real setup/providers barrel and
+ * asserts the built-in default — red if the barrel fails to evaluate.
+ * Per-provider registration guards ship WITH each provider payload (the
+ * skill copies them in), same archetype as the host/container registration
+ * tests.
+ *
+ * Structural: the picker and the standalone provider-auth step are wiring
+ * inside non-invocable entry flows (setup main, STEPS map) — assert their
+ * consumption of the registry in source, so deleting either reach-in goes red.
+ */
+import fs from 'fs';
+import path from 'path';
+import { describe, expect, it } from 'vitest';
+
+import { getSetupProvider, listSetupProviders } from './registry.js';
+import './index.js'; // the real setup provider barrel — triggers self-registration
+
+describe('setup provider registry', () => {
+  it('always carries claude as the built-in default with the standard auth flow', () => {
+    const claude = getSetupProvider('claude');
+    expect(claude).toBeDefined();
+    expect(claude!.runAuth).toBeUndefined();
+    expect(listSetupProviders()[0]!.value).toBe('claude');
+  });
+});
+
+describe('setup flow consumes the registry (structural)', () => {
+  it('the picker renders options from listSetupProviders', () => {
+    const src = fs.readFileSync(path.join(process.cwd(), 'setup', 'auto.ts'), 'utf-8');
+    expect(src).toContain('listSetupProviders()');
+    expect(src).toContain("import './providers/index.js'");
+    // The capability-keyed branch — a provider's own auth runs iff it declares one.
+    expect(src).toMatch(/providerEntry\?\.runAuth/);
+  });
+
+  it('the standalone provider-auth step is reachable from the STEPS map', () => {
+    const src = fs.readFileSync(path.join(process.cwd(), 'setup', 'index.ts'), 'utf-8');
+    expect(src).toContain("'provider-auth'");
+  });
+});
@@ -0,0 +1,59 @@
+/**
+ * Setup-side provider registry — the picker and the standalone `provider-auth`
+ * step render from this map instead of hardcoding provider names in the setup
+ * flow (same capability-not-name rule as the host provider-container registry).
+ *
+ * `claude` is the built-in default: it has no `runAuth` of its own, which the
+ * setup flow reads as "run the standard auth step". A provider payload adds
+ * itself by shipping a `setup/providers/<name>.ts` with a top-level
+ * `registerSetupProvider(...)` call and appending one import line to the
+ * `setup/providers/index.ts` barrel — the same shape as the host and container
+ * provider registries, guarded the same way (a barrel-driven registration test).
+ */
+import type { AssistContext } from '../lib/claude-assist.js'; // type-only — registry stays runtime-dependency-free
+
+/**
+ * Outcome of a provider-owned failure-assist hook:
+ *   - 'launched'    — the provider's debugger ran (user may have fixed things).
+ *   - 'declined'    — the user said no; do NOT offer another debugger.
+ *   - 'unavailable' — the provider's CLI can't be used here; the dispatcher
+ *                     falls back to the guarded Claude offer (never install/sign-in).
+ */
+export type FailureAssistResult = 'launched' | 'declined' | 'unavailable';
+
+export interface SetupProviderEntry {
+  value: string;
+  label: string;
+  hint: string;
+  /** Provider-owned auth walk-through (vault-only). Absent → standard auth step. */
+  runAuth?: () => Promise<void>;
+  /** Verifies the provider's payload is wired (files, barrels, Dockerfile pin). */
+  runInstallCheck?: () => Promise<void>;
+  /** Provider-owned interactive failure debugger. 'unavailable' → dispatcher
+   *  falls back to the guarded Claude offer (never install/sign-in). */
+  offerFailureAssist?: (ctx: AssistContext, projectRoot: string) => Promise<FailureAssistResult>;
+}
+
+const registry = new Map<string, SetupProviderEntry>();
+
+registry.set('claude', {
+  value: 'claude',
+  label: 'Claude',
+  hint: 'default — Anthropic subscription or API key',
+});
+
+export function registerSetupProvider(entry: SetupProviderEntry): void {
+  if (registry.has(entry.value)) {
+    throw new Error(`Setup provider already registered: ${entry.value}`);
+  }
+  registry.set(entry.value, entry);
+}
+
+export function getSetupProvider(name: string): SetupProviderEntry | undefined {
+  return registry.get(name.toLowerCase());
+}
+
+/** Claude (the default) first, then the rest in registration order; a Map iterates in insertion order. */
+export function listSetupProviders(): SetupProviderEntry[] {
+  return [...registry.values()];
+}
@@ -11,6 +11,7 @@ import { DATA_DIR } from '../src/config.js';
 import { initDb } from '../src/db/connection.js';
 import { runMigrations } from '../src/db/migrations/index.js';
 import { createAgentGroup, getAgentGroupByFolder } from '../src/db/agent-groups.js';
+import { ensureContainerConfig } from '../src/db/container-configs.js';
 import {
  createMessagingGroup,
  createMessagingGroupAgent,
@@ -18,7 +19,6 @@ import {
  getMessagingGroupAgentByPair,
 } from '../src/db/messaging-groups.js';
 import { isValidGroupFolder } from '../src/group-folder.js';
-import { initGroupFilesystem } from '../src/group-init.js';
 import { log } from '../src/log.js';
 import { namespacedPlatformId } from '../src/platform-id.js';
 import { resolveSession, writeSessionMessage } from '../src/session-manager.js';
@@ -118,7 +118,7 @@ export async function run(args: string[]): Promise<void> {
  // Chat SDK adapters prefix, native adapters (WhatsApp/iMessage/Signal) don't.
  parsed.platformId = namespacedPlatformId(parsed.channel, parsed.platformId);

-  log.info('Registering channel', parsed);
+  log.info('Registering channel', { ...parsed });

  // Init v2 central DB
  fs.mkdirSync(path.join(projectRoot, 'data'), { recursive: true });
@@ -126,7 +126,11 @@ export async function run(args: string[]): Promise<void> {
  const db = initDb(dbPath);
  runMigrations(db);

-  // 1. Create or find agent group
+  // 1. Create or find agent group. Provider-agnostic: provider is a DB
+  // property set via `ncl groups config update --provider`, not a creation
+  // flag. The workspace is scaffolded at the first spawn (group-init), where
+  // the DB-resolved provider is known; here we only ensure the config row
+  // exists so that update has a row to write.
  let agentGroup = getAgentGroupByFolder(parsed.folder);
  if (!agentGroup) {
    const agId = generateId('ag');
@@ -140,7 +144,7 @@ export async function run(args: string[]): Promise<void> {
    agentGroup = getAgentGroupByFolder(parsed.folder)!;
    log.info('Created agent group', { id: agId, folder: parsed.folder });
  }
-  initGroupFilesystem(agentGroup);
+  ensureContainerConfig(agentGroup.id);

  // 2. Create or find messaging group
  let messagingGroup = getMessagingGroupByPlatform(parsed.channel, parsed.platformId);
@@ -11,6 +11,7 @@ import path from 'path';

 import { log } from '../src/log.js';
 import { getLaunchdLabel, getSystemdUnit } from '../src/install-slug.js';
+import { writeUpgradeState } from '../src/upgrade-state.js';
 import { cleanupUnhealthyPeers } from './peer-cleanup.js';
 import {
  commandExists,
@@ -54,6 +55,11 @@ export async function run(_args: string[]): Promise<void> {

  fs.mkdirSync(path.join(projectRoot, 'logs'), { recursive: true });

+  // Stamp the upgrade marker before the host first starts, so the startup
+  // tripwire (enforceUpgradeTripwire) sees this as a sanctioned install.
+  const stamped = writeUpgradeState({ via: 'setup' });
+  log.info('Stamped upgrade marker', { version: stamped.version });
+
  // Peer preflight — a crash-looping peer install (most often the legacy v1
  // `com.nanoclaw` plist) will keep trashing this install's containers on
  // every respawn via its own cleanupOrphans. Detect and unload any peer
@@ -0,0 +1,365 @@
+/**
+ * Uninstall flow — clack UI orchestration over scan/plan/remove.
+ *
+ * Self-deletion constraint: this flow runs on tsx out of the node_modules
+ * it deletes. All imports are static (loaded before any deletion), dist/
+ * and node_modules/ are removed last (the runtime tail), and once execution
+ * starts nothing here writes to logs/ (which would recreate it) or does a
+ * dynamic import. After the runtime tail, the only output is console.log.
+ *
+ * Removes ONLY what belongs to this checkout (per-checkout install slug).
+ * Each non-empty group shows a WHAT/WHERE table and asks a default-No
+ * confirm. Nothing is deleted until every decision has been made, so
+ * Ctrl-C anywhere in the confirm phase leaves the install untouched.
+ */
+import { spawnSync } from 'child_process';
+import fs from 'fs';
+import os from 'os';
+import path from 'path';
+
+import * as p from '@clack/prompts';
+import k from 'kleur';
+
+import { emit as phEmit } from '../lib/diagnostics.js';
+import { note } from '../lib/theme.js';
+import * as setupLog from '../logs.js';
+import {
+  resolveOnecliDeletions,
+  type RunCommand,
+  type VaultAgent,
+} from './onecli-agents.js';
+import { buildRemovalPlan, type Decisions } from './plan.js';
+import { executePlan, type ExecDeps } from './remove.js';
+import { scanInstall, tilde, type Inventory } from './scan.js';
+
+const GROUPS = {
+  service: {
+    title: '1) App & background service',
+    desc: 'Runs NanoClaw in the background. Removing this stops the assistant. None of your data lives here.',
+    prompt: 'Delete the app & background service shown above?',
+  },
+  data: {
+    title: '2) App data, logs & secrets',
+    desc: 'Message database, conversation history, logs, build files, and your .env (API keys / tokens). Removing this erases stored conversations and saved credentials.',
+    prompt: 'Delete app data, logs & secrets shown above? (erases conversations + API keys)',
+  },
+  user: {
+    title: "3) Your agents' memory & files",
+    desc: 'Notes and memory your agents created (groups/) and any migrated data (store/). Content you made — it cannot be recovered after deletion.',
+    prompt: "Delete your agents' memory & files shown above? (cannot be undone)",
+  },
+  onecli: {
+    title: '4) OneCLI credential agents',
+    desc: 'Per-agent entries this copy registered in the OneCLI vault. The OneCLI app, your credentials, and the gateway are NOT touched.',
+  },
+} as const;
+
+const runCommand: RunCommand = (cmd, args) => {
+  const res = spawnSync(cmd, args, { encoding: 'utf-8' });
+  return { status: res.status, stdout: res.stdout ?? '' };
+};
+
+export async function runUninstallFlow(opts: {
+  dryRun: boolean;
+  yes: boolean;
+  invokedFrom: 'flag' | 'setup-detection';
+}): Promise<never> {
+  const { dryRun, yes } = opts;
+
+  if (!process.stdin.isTTY && !yes && !dryRun) {
+    console.error(
+      'Uninstall needs an interactive terminal. Re-run with --yes to delete everything found without prompts, or --dry-run to preview.',
+    );
+    process.exit(1);
+  }
+
+  const projectRoot = process.cwd();
+  const home = os.homedir();
+
+  p.intro(k.bold('Uninstall NanoClaw'));
+  // persistId: false — the emit must not create data/install-id, which would
+  // both break --dry-run's "changes nothing" promise and resurrect a data/
+  // row in the very inventory we are about to scan.
+  phEmit('uninstall_started', { invokedFrom: opts.invokedFrom, dryRun, yes }, { persistId: false });
+
+  const spinner = p.spinner();
+  spinner.start('Checking what exists for this copy…');
+  const inv = scanInstall({
+    projectRoot,
+    home,
+    platform: process.platform,
+    runCommand,
+  });
+  spinner.stop(`Scanned copy ${inv.slug} at ${tilde(projectRoot, home)}.`);
+
+  const svcRows = serviceRows(inv, home);
+  const dataRows = [...inv.data, ...inv.runtime].map(({ what, where }) => ({ what, where }));
+  const userRows = inv.user.map(({ what, where }) => ({ what, where }));
+  const totalFound =
+    svcRows.length +
+    dataRows.length +
+    userRows.length +
+    inv.onecli.mine.length +
+    inv.onecli.orphans.length;
+
+  if (totalFound === 0) {
+    p.outro(
+      `✓ Nothing to uninstall — this copy (${inv.slug}) is already clean.\n` +
+        k.dim('   (No service, containers, image, data, or OneCLI agents found for this folder.)'),
+    );
+    process.exit(0);
+  }
+
+  if (dryRun) {
+    p.log.message(
+      k.cyan('PREVIEW ONLY — this shows what would be deleted and changes nothing.'),
+    );
+    if (svcRows.length > 0) note(groupBody(GROUPS.service.desc, svcRows), GROUPS.service.title);
+    if (dataRows.length > 0) note(groupBody(GROUPS.data.desc, dataRows), GROUPS.data.title);
+    if (userRows.length > 0) note(groupBody(GROUPS.user.desc, userRows), GROUPS.user.title);
+    if (inv.onecli.mine.length > 0 || inv.onecli.orphans.length > 0) {
+      const lines = [GROUPS.onecli.desc, ''];
+      lines.push('Would be deleted (after confirmation):');
+      for (const a of inv.onecli.mine) lines.push(`  ● ${a.name} — ${a.identifier}`);
+      if (inv.onecli.mine.length === 0) lines.push('  (none)');
+      lines.push('Left in place — may belong to another copy:');
+      for (const a of inv.onecli.orphans) lines.push(`  ○ ${a.name} — ${a.identifier}`);
+      if (inv.onecli.orphans.length === 0) lines.push('  (none)');
+      note(lines.join('\n'), GROUPS.onecli.title);
+    }
+    const empty = emptyGroupTitles(svcRows.length, dataRows.length, userRows.length, inv);
+    if (empty.length > 0) p.log.message(k.dim(`Nothing found for: ${empty.join(', ')}`));
+    for (const n of inv.notes) p.log.message(k.dim(`• ${n}`));
+    p.outro('Preview complete. Nothing was changed.');
+    process.exit(0);
+  }
+
+  if (yes) {
+    p.log.warn('--yes given: deleting everything found below without asking.');
+  } else {
+    p.log.message(
+      k.dim(
+        'You will be asked about each group that has something. Default is to keep\n(just press Enter). Type "y" to delete a group.',
+      ),
+    );
+  }
+
+  // ── confirm phase — nothing is deleted until every decision is made ──
+
+  let serviceYes = false;
+  if (svcRows.length > 0) {
+    note(groupBody(GROUPS.service.desc, svcRows), GROUPS.service.title);
+    serviceYes = await confirmGroup(GROUPS.service.prompt, yes);
+  }
+
+  let dataYes = false;
+  if (dataRows.length > 0) {
+    note(groupBody(GROUPS.data.desc, dataRows), GROUPS.data.title);
+    dataYes = await confirmGroup(GROUPS.data.prompt, yes);
+  }
+
+  let userYes = false;
+  if (userRows.length > 0) {
+    note(groupBody(GROUPS.user.desc, userRows), GROUPS.user.title);
+    userYes = await confirmGroup(GROUPS.user.prompt, yes);
+  }
+
+  const keptNotes: string[] = [];
+  if (!serviceYes && svcRows.length > 0) keptNotes.push(`${GROUPS.service.title}: kept by your choice.`);
+  if (!dataYes && dataRows.length > 0) keptNotes.push(`${GROUPS.data.title}: kept by your choice.`);
+  if (!userYes && userRows.length > 0) keptNotes.push(`${GROUPS.user.title}: kept by your choice.`);
+
+  const onecliDelete = await decideOnecli(inv, yes, keptNotes);
+
+  // Record the decisions before execution can delete logs/ — but only into
+  // an existing logs/ (userInput would otherwise mkdir it back into
+  // existence, leaving a fresh logs/setup.log behind after the uninstall).
+  if (fs.existsSync(path.join(projectRoot, 'logs'))) {
+    setupLog.userInput(
+      'uninstall_decisions',
+      JSON.stringify({
+        service: serviceYes,
+        data: dataYes,
+        user: userYes,
+        onecliAgentsDeleted: onecliDelete.length,
+      }),
+    );
+  }
+
+  const decisions: Decisions = {
+    service: serviceYes,
+    data: dataYes,
+    user: userYes,
+    onecliDelete,
+  };
+  const actions = buildRemovalPlan(inv, decisions);
+
+  if (actions.length === 0) {
+    printLeftAlone([...inv.notes, ...keptNotes]);
+    p.outro('Nothing selected — nothing was changed.');
+    process.exit(0);
+  }
+
+  phEmit(
+    'uninstall_executed',
+    {
+      invokedFrom: opts.invokedFrom,
+      service: serviceYes,
+      data: dataYes,
+      user: userYes,
+      onecliAgentsDeleted: onecliDelete.length,
+    },
+    { persistId: false },
+  );
+
+  // The runtime tail (dist/, node_modules/) runs after every other action
+  // AND after the summary — nothing but console.log may happen once the
+  // modules we're running from are gone.
+  const head = actions.filter((a) => a.kind !== 'delete-runtime-path');
+  const tail = actions.filter((a) => a.kind === 'delete-runtime-path');
+
+  const deps: ExecDeps = {
+    runCommand,
+    log: (line) => p.log.message(line),
+    isRoot: process.getuid?.() === 0,
+  };
+  const { notes: execNotes } = executePlan(head, deps);
+
+  printLeftAlone([...inv.notes, ...keptNotes, ...execNotes]);
+
+  const { notes: tailNotes } = executePlan(tail, {
+    ...deps,
+    log: (line) => console.log(`  ${line}`),
+  });
+  for (const n of tailNotes) console.log(`  • ${n}`);
+  console.log(`\n✓ Done. NanoClaw copy ${inv.slug} has been uninstalled.`);
+  process.exit(0);
+}
+
+/** Unwrap a confirm result; Ctrl-C / Esc cancels the whole uninstall — nothing deleted. */
+function answered<T>(value: T | symbol): T {
+  if (p.isCancel(value)) {
+    p.cancel('Uninstall cancelled. Nothing was deleted.');
+    process.exit(0);
+  }
+  return value as T;
+}
+
+async function confirmGroup(prompt: string, yes: boolean): Promise<boolean> {
+  if (yes) return true;
+  return answered(await p.confirm({ message: prompt, initialValue: false }));
+}
+
+/**
+ * Group 4 has two sub-decisions that confirmGroup's single prompt can't express:
+ * MINE is one yes/no; ORPHANS get a separate default-No prompt with an
+ * explicit cross-copy warning. --yes deletes MINE but never ORPHANS
+ * (enforced in resolveOnecliDeletions); anything kept is reported with
+ * the exact manual delete command (by vault uuid).
+ */
+async function decideOnecli(
+  inv: Inventory,
+  yes: boolean,
+  keptNotes: string[],
+): Promise<VaultAgent[]> {
+  const { mine, orphans } = inv.onecli;
+  if (mine.length === 0 && orphans.length === 0) return [];
+
+  const rows = [
+    ...mine.map((a) => ({ what: 'OneCLI agent', where: `${a.name} — ${a.identifier}` })),
+    ...orphans.map((a) => ({ what: 'OneCLI agent (orphan)', where: `${a.name} — ${a.identifier}` })),
+  ];
+  note(groupBody(GROUPS.onecli.desc, rows), GROUPS.onecli.title);
+
+  let deleteMine = false;
+  if (mine.length > 0 && !yes) {
+    deleteMine = answered(
+      await p.confirm({
+        message: `Delete this copy's ${mine.length} OneCLI agent(s)?`,
+        initialValue: false,
+      }),
+    );
+    if (!deleteMine) keptNotes.push('OneCLI agents (this copy): kept by your choice.');
+  }
+
+  let deleteOrphans = false;
+  if (orphans.length > 0) {
+    if (yes) {
+      p.log.warn(
+        `${orphans.length} other NanoClaw-style agent(s) in the vault are not linked to this copy;\n--yes does NOT delete them (they may belong to another copy).`,
+      );
+    } else {
+      p.log.warn(
+        `Found ${orphans.length} other NanoClaw-style agent(s) in the vault not linked to this copy —\nthey may belong to ANOTHER NanoClaw copy on this machine.`,
+      );
+      deleteOrphans = answered(
+        await p.confirm({ message: 'Delete them too?', initialValue: false }),
+      );
+    }
+    if (yes || !deleteOrphans) {
+      keptNotes.push(
+        `OneCLI orphan agents (${orphans.length}): left in place — remove manually if they're yours:`,
+      );
+      for (const a of orphans) {
+        keptNotes.push(`  onecli agents delete --id ${a.uuid}   # ${a.name} — ${a.identifier}`);
+      }
+    }
+  }
+
+  return resolveOnecliDeletions({
+    mine,
+    orphans,
+    assumeYes: yes,
+    deleteMine,
+    deleteOrphans,
+  });
+}
+
+function serviceRows(inv: Inventory, home: string): { what: string; where: string }[] {
+  const s = inv.service;
+  const rows: { what: string; where: string }[] = [];
+  if (s.launchdPlist) rows.push({ what: 'Background service', where: tilde(s.launchdPlist, home) });
+  if (s.systemdUserUnit) rows.push({ what: 'Background service', where: tilde(s.systemdUserUnit, home) });
+  if (s.systemdSystemUnit) rows.push({ what: 'Background service (system)', where: s.systemdSystemUnit });
+  if (s.pidFile) rows.push({ what: 'Running process', where: 'nanoclaw.pid' });
+  if (s.containerIds.length > 0) {
+    rows.push({ what: 'Running containers', where: `${s.containerIds.length} container(s)` });
+  }
+  if (s.image) rows.push({ what: 'Container image', where: s.image });
+  if (s.nclSymlink) rows.push({ what: 'Command-line tool (ncl)', where: tilde(s.nclSymlink, home) });
+  return rows;
+}
+
+function groupBody(desc: string, rows: { what: string; where: string }[]): string {
+  const width = Math.max(...rows.map((r) => r.what.length), 'WHAT'.length);
+  const lines = [desc, '', `${'WHAT'.padEnd(width + 2)}WHERE`];
+  for (const r of rows) lines.push(`${r.what.padEnd(width + 2)}${r.where}`);
+  return lines.join('\n');
+}
+
+function emptyGroupTitles(
+  svcCount: number,
+  dataCount: number,
+  userCount: number,
+  inv: Inventory,
+): string[] {
+  const empty: string[] = [];
+  if (svcCount === 0) empty.push(GROUPS.service.title);
+  if (dataCount === 0) empty.push(GROUPS.data.title);
+  if (userCount === 0) empty.push(GROUPS.user.title);
+  if (inv.onecli.mine.length === 0 && inv.onecli.orphans.length === 0) {
+    empty.push(GROUPS.onecli.title);
+  }
+  return empty;
+}
+
+function printLeftAlone(notes: string[]): void {
+  const lines = [
+    '• OneCLI app, vault & credentials: ~/.local/share/onecli, ~/.local/bin/onecli',
+    '• Host-wide config: ~/.config/nanoclaw/ (mount/sender allowlists)',
+    '• PATH line in ~/.bashrc and ~/.zshrc',
+    '• Other NanoClaw copies on this machine',
+    ...notes.map((n) => `• ${n}`),
+  ];
+  note(lines.join('\n'), 'Left alone (shared / not ours)');
+}
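Review note: the head/tail split in `runUninstallFlow` is easier to reason about in isolation. The sketch below is an assumption-laden stand-in, not the shipped code: `SketchAction`, `splitRuntimeTail` and `runSketch` are hypothetical names, and the sketch records the order of side effects instead of deleting anything. It only illustrates that the summary is emitted between the head actions and the runtime tail.

```typescript
// Illustrative stand-in: the real RemovalAction union lives in plan.ts.
type SketchAction =
  | { kind: 'delete-path'; path: string }
  | { kind: 'delete-runtime-path'; path: string };

// Split a plan into actions that may still rely on loaded modules (head)
// and the deletions of dist/ and node_modules/ (tail). filter() keeps the
// relative order inside each half.
function splitRuntimeTail(actions: SketchAction[]): {
  head: SketchAction[];
  tail: SketchAction[];
} {
  return {
    head: actions.filter((a) => a.kind !== 'delete-runtime-path'),
    tail: actions.filter((a) => a.kind === 'delete-runtime-path'),
  };
}

// Records the order of side effects instead of touching the filesystem.
function runSketch(actions: SketchAction[]): string[] {
  const events: string[] = [];
  const { head, tail } = splitRuntimeTail(actions);
  for (const a of head) events.push(`run ${a.path}`);
  events.push('summary'); // printed while the UI modules are still on disk
  for (const a of tail) events.push(`run ${a.path}`);
  events.push('done'); // only plain console output from here on
  return events;
}

const sketchEvents = runSketch([
  { kind: 'delete-runtime-path', path: 'dist' },
  { kind: 'delete-path', path: 'data' },
  { kind: 'delete-runtime-path', path: 'node_modules' },
  { kind: 'delete-path', path: 'logs' },
]);
console.log(sketchEvents.join(' -> '));
```

Even when the input interleaves runtime and non-runtime actions, the recorded order keeps both runtime deletions after the summary, which is the property the flow depends on.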
@@ -0,0 +1,150 @@
+import { describe, it, expect, beforeEach, afterEach } from 'vitest';
+import fs from 'fs';
+import os from 'os';
+import path from 'path';
+
+import Database from 'better-sqlite3';
+
+import {
+  listVaultAgents,
+  readAgentGroupIds,
+  resolveOnecliDeletions,
+  splitVaultAgents,
+  type VaultAgent,
+} from './onecli-agents.js';
+
+const agent = (uuid: string, identifier: string, name = identifier): VaultAgent => ({
+  uuid,
+  identifier,
+  name,
+});
+
+describe('listVaultAgents', () => {
+  it('parses non-default agents from onecli JSON output', () => {
+    const payload = JSON.stringify({
+      data: [
+        { id: 'u-1', identifier: 'ag-main', name: 'Main', isDefault: false },
+        { id: 'u-2', identifier: 'default', name: 'Default', isDefault: false },
+        { id: 'u-3', identifier: 'ag-dev', name: 'Dev', isDefault: true },
+      ],
+    });
+    const result = listVaultAgents(() => ({ status: 0, stdout: payload }));
+    expect(result.available).toBe(true);
+    expect(result.agents).toEqual([agent('u-1', 'ag-main', 'Main')]);
+  });
+
+  it('reports unavailable when the command fails', () => {
+    expect(listVaultAgents(() => ({ status: 1, stdout: '' })).available).toBe(false);
+  });
+
+  it('reports unavailable when the command cannot be spawned', () => {
+    const result = listVaultAgents(() => {
+      throw new Error('ENOENT');
+    });
+    expect(result.available).toBe(false);
+    expect(result.agents).toEqual([]);
+  });
+
+  it('reports unavailable on unparseable output', () => {
+    expect(listVaultAgents(() => ({ status: 0, stdout: 'not json' })).available).toBe(false);
+    expect(listVaultAgents(() => ({ status: 0, stdout: '{"nope":1}' })).available).toBe(false);
+  });
+});
+
+describe('readAgentGroupIds', () => {
+  let tempDir: string;
+
+  beforeEach(() => {
+    tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-uninstall-test-'));
+  });
+
+  afterEach(() => {
+    fs.rmSync(tempDir, { recursive: true, force: true });
+  });
+
+  it('reads ids from a real DB', () => {
+    const dbPath = path.join(tempDir, 'v2.db');
+    const db = new Database(dbPath);
+    db.exec('CREATE TABLE agent_groups (id TEXT PRIMARY KEY)');
+    db.prepare('INSERT INTO agent_groups (id) VALUES (?)').run('ag-one');
+    db.prepare('INSERT INTO agent_groups (id) VALUES (?)').run('ag-two');
+    db.close();
+
+    const result = readAgentGroupIds(dbPath);
+    expect(result.known).toBe(true);
+    expect(result.ids).toEqual(new Set(['ag-one', 'ag-two']));
+  });
+
+  it('returns known:false for a missing file', () => {
+    const result = readAgentGroupIds(path.join(tempDir, 'missing.db'));
+    expect(result.known).toBe(false);
+    expect(result.ids.size).toBe(0);
+  });
+
+  it('returns known:false for a corrupt file', () => {
+    const dbPath = path.join(tempDir, 'corrupt.db');
+    fs.writeFileSync(dbPath, 'this is not a sqlite database at all');
+    const result = readAgentGroupIds(dbPath);
+    expect(result.known).toBe(false);
+    expect(result.ids.size).toBe(0);
+  });
+});
+
+describe('splitVaultAgents', () => {
+  it('splits mine vs ag-* orphans and ignores foreign identifiers', () => {
+    const agents = [
+      agent('u-1', 'ag-mine'),
+      agent('u-2', 'ag-other'),
+      agent('u-3', 'some-tool'),
+    ];
+    const { mine, orphans } = splitVaultAgents(agents, new Set(['ag-mine']), true);
+    expect(mine).toEqual([agent('u-1', 'ag-mine')]);
+    expect(orphans).toEqual([agent('u-2', 'ag-other')]);
+  });
+
+  it('forces all ag-* agents into orphans when ids are unknown', () => {
+    const agents = [agent('u-1', 'ag-mine'), agent('u-2', 'ag-other')];
+    // The ids set does contain ag-mine; known:false must still override it.
+    const { mine, orphans } = splitVaultAgents(agents, new Set(['ag-mine']), false);
+    expect(mine).toEqual([]);
+    expect(orphans).toEqual(agents);
+  });
+});
+
+describe('resolveOnecliDeletions', () => {
+  const mine = [agent('u-1', 'ag-mine')];
+  const orphans = [agent('u-2', 'ag-other')];
+
+  it('never deletes orphans under --yes, even if asked to', () => {
+    const deletions = resolveOnecliDeletions({
+      mine,
+      orphans,
+      assumeYes: true,
+      deleteMine: false,
+      deleteOrphans: true,
+    });
+    expect(deletions).toEqual(mine);
+  });
+
+  it('deletes orphans only on explicit interactive consent', () => {
+    expect(
+      resolveOnecliDeletions({
+        mine,
+        orphans,
+        assumeYes: false,
+        deleteMine: true,
+        deleteOrphans: true,
+      }),
+    ).toEqual([...mine, ...orphans]);
+
+    expect(
+      resolveOnecliDeletions({
+        mine,
+        orphans,
+        assumeYes: false,
+        deleteMine: false,
+        deleteOrphans: false,
+      }),
+    ).toEqual([]);
+  });
+});
@@ -0,0 +1,141 @@
+/**
+ * OneCLI vault-agent inventory for the uninstaller.
+ *
+ * Vault agents split into two sets: MINE (identifier matches an agent-group
+ * id in this copy's data/v2.db) and ORPHANS (NanoClaw-style `ag-*`
+ * identifiers not in our DB — possibly another copy's). Deletion is always
+ * by the vault's internal uuid: the agent-group id is NOT a valid
+ * `onecli agents delete --id` value (see src/container-runner.ts).
+ */
+import fs from 'fs';
+
+import Database from 'better-sqlite3';
+
+export interface VaultAgent {
+  /** Internal vault uuid — the only valid `onecli agents delete --id` value. */
+  uuid: string;
+  /** What the agent was registered under, e.g. a NanoClaw agent-group id (`ag-*`). */
+  identifier: string;
+  name: string;
+}
+
+export type RunCommand = (
+  cmd: string,
+  args: string[],
+) => { status: number | null; stdout: string };
+
+/**
+ * List non-default vault agents via `onecli agents list`. `available: false`
+ * means the vault couldn't be read at all (binary missing, command failed,
+ * or unparseable output) — distinct from an empty vault.
+ */
+export function listVaultAgents(run: RunCommand): {
+  available: boolean;
+  agents: VaultAgent[];
+} {
+  let result: { status: number | null; stdout: string };
+  try {
+    result = run('onecli', ['agents', 'list']);
+  } catch {
+    return { available: false, agents: [] };
+  }
+  if (result.status !== 0) return { available: false, agents: [] };
+
+  let parsed: unknown;
+  try {
+    parsed = JSON.parse(result.stdout);
+  } catch {
+    return { available: false, agents: [] };
+  }
+
+  const data =
+    parsed !== null && typeof parsed === 'object' && 'data' in parsed
+      ? (parsed as { data: unknown }).data
+      : null;
+  if (!Array.isArray(data)) return { available: false, agents: [] };
+
+  const agents: VaultAgent[] = [];
+  for (const entry of data) {
+    if (entry === null || typeof entry !== 'object') continue;
+    const a = entry as Record<string, unknown>;
+    if (a.isDefault === true) continue;
+    const identifier = typeof a.identifier === 'string' ? a.identifier : '';
+    const uuid = typeof a.id === 'string' ? a.id : '';
+    if (!identifier || identifier === 'default' || !uuid) continue;
+    agents.push({
+      uuid,
+      identifier,
+      name: typeof a.name === 'string' ? a.name : '',
+    });
+  }
+  return { available: true, agents };
+}
+
+/**
+ * Read this copy's agent-group ids from data/v2.db (readonly).
+ *
+ * `known: false` distinguishes "we couldn't read the DB at all" from "this
+ * copy has zero agent groups". Without that flag an unreadable DB would look
+ * like an empty one: every ag-* vault agent would be labeled an orphan with
+ * no warning, and --yes would silently leave this copy's agents behind.
+ */
+export function readAgentGroupIds(dbPath: string): {
+  ids: Set<string>;
+  known: boolean;
+} {
+  if (!fs.existsSync(dbPath)) return { ids: new Set(), known: false };
+
+  let db: Database.Database | null = null;
+  try {
+    db = new Database(dbPath, { readonly: true });
+    const rows = db.prepare('SELECT id FROM agent_groups').all() as {
+      id: string;
+    }[];
+    return { ids: new Set(rows.map((r) => r.id)), known: true };
+  } catch {
+    return { ids: new Set(), known: false };
+  } finally {
+    db?.close();
+  }
+}
+
+/**
+ * Split vault agents into MINE (identifier ∈ ids) and ORPHANS (ag-* not in
+ * ids). Non-NanoClaw identifiers are ignored entirely. With `known: false`
+ * nothing can be MINE, so every ag-* agent lands in ORPHANS — the caller is
+ * responsible for warning that the labels are unreliable.
+ */
+export function splitVaultAgents(
+  agents: VaultAgent[],
+  ids: Set<string>,
+  known: boolean,
+): { mine: VaultAgent[]; orphans: VaultAgent[] } {
+  const mine: VaultAgent[] = [];
+  const orphans: VaultAgent[] = [];
+  for (const agent of agents) {
+    if (known && ids.has(agent.identifier)) {
+      mine.push(agent);
+    } else if (agent.identifier.startsWith('ag-')) {
+      orphans.push(agent);
+    }
+  }
+  return { mine, orphans };
+}
+
+/**
+ * Resolve the vault-agent delete set from the user's answers. Under --yes
+ * (`assumeYes`) MINE is always deleted but ORPHANS never are — deleting
+ * what may be another copy's agents requires explicit human intent.
+ */
+export function resolveOnecliDeletions(input: {
+  mine: VaultAgent[];
+  orphans: VaultAgent[];
+  assumeYes: boolean;
+  deleteMine: boolean;
+  deleteOrphans: boolean;
+}): VaultAgent[] {
+  const out: VaultAgent[] = [];
+  if (input.assumeYes || input.deleteMine) out.push(...input.mine);
+  if (!input.assumeYes && input.deleteOrphans) out.push(...input.orphans);
+  return out;
+}
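Review note: the interaction between `known`, the MINE/ORPHANS split and `--yes` is the subtle part of this module, so here is a small worked example. It is a sketch under stated assumptions: `SketchAgent`, `splitSketch` and `resolveSketch` are hypothetical condensed copies of the shapes and logic above, and the sample vault contents are made up.

```typescript
// Condensed stand-ins for VaultAgent, splitVaultAgents and
// resolveOnecliDeletions; the names are hypothetical.
interface SketchAgent {
  uuid: string;
  identifier: string;
  name: string;
}

function splitSketch(
  agents: SketchAgent[],
  ids: Set<string>,
  known: boolean,
): { mine: SketchAgent[]; orphans: SketchAgent[] } {
  const mine: SketchAgent[] = [];
  const orphans: SketchAgent[] = [];
  for (const a of agents) {
    if (known && ids.has(a.identifier)) mine.push(a);
    else if (a.identifier.startsWith('ag-')) orphans.push(a);
    // Identifiers without the ag- prefix are ignored entirely.
  }
  return { mine, orphans };
}

function resolveSketch(input: {
  mine: SketchAgent[];
  orphans: SketchAgent[];
  assumeYes: boolean;
  deleteMine: boolean;
  deleteOrphans: boolean;
}): SketchAgent[] {
  const out: SketchAgent[] = [];
  if (input.assumeYes || input.deleteMine) out.push(...input.mine);
  if (!input.assumeYes && input.deleteOrphans) out.push(...input.orphans);
  return out;
}

// Made-up vault contents for illustration.
const vault: SketchAgent[] = [
  { uuid: 'u-1', identifier: 'ag-mine', name: 'Mine' },
  { uuid: 'u-2', identifier: 'ag-other', name: 'Other' },
  { uuid: 'u-3', identifier: 'some-tool', name: 'Tool' },
];

const readable = splitSketch(vault, new Set(['ag-mine']), true);
const unreadable = splitSketch(vault, new Set<string>(), false);

// --yes with a readable DB: only this copy's agent is selected.
const yesReadable = resolveSketch({ ...readable, assumeYes: true, deleteMine: false, deleteOrphans: false });
// --yes with an unreadable DB: nothing can be MINE, so nothing is selected.
const yesUnreadable = resolveSketch({ ...unreadable, assumeYes: true, deleteMine: false, deleteOrphans: false });
// Interactive run with both prompts answered yes.
const interactiveBoth = resolveSketch({ ...readable, assumeYes: false, deleteMine: true, deleteOrphans: true });
```

The `yesUnreadable` case is the one worth a warning in the UI: with an unreadable DB, `--yes` selects no vault agents at all, because every `ag-*` entry is treated as a possible orphan.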
@@ -0,0 +1,156 @@
+import { describe, it, expect } from 'vitest';
+
+import type { VaultAgent } from './onecli-agents.js';
+import { buildRemovalPlan, type Decisions, type RemovalAction } from './plan.js';
+import type { Inventory, PathItem } from './scan.js';
+
+const item = (p: string, what: string): PathItem => ({ what, where: p, path: p });
+
+const agent = (uuid: string, identifier: string): VaultAgent => ({
+  uuid,
+  identifier,
+  name: identifier,
+});
+
+function inventory(overrides: Partial<Inventory> = {}): Inventory {
+  return {
+    slug: 'abcd1234',
+    projectRoot: '/proj',
+    containerRuntime: 'docker',
+    service: {
+      launchdPlist: '/home/u/Library/LaunchAgents/com.nanoclaw-v2-abcd1234.plist',
+      containerIds: ['c1', 'c2'],
+      image: 'nanoclaw-agent-v2-abcd1234:latest',
+      nclSymlink: '/home/u/.local/bin/ncl',
+    },
+    data: [
+      item('/proj/data', 'Database & conversations'),
+      item('/proj/logs', 'Logs'),
+      item('/proj/.env', 'Secrets / API keys (.env)'),
+      item('/proj/start-nanoclaw.sh', 'Start script'),
+    ],
+    runtime: [
+      // node_modules deliberately FIRST — the planner must still order it last.
+      item('/proj/node_modules', 'Installed dependencies'),
+      item('/proj/dist', 'Build output'),
+    ],
+    user: [item('/proj/groups', 'Agent memory & files'), item('/proj/store', 'Migrated data store')],
+    onecli: { mine: [], orphans: [], idsKnown: true },
+    notes: [],
+    ...overrides,
+  };
+}
+
+const allYes = (onecliDelete: VaultAgent[] = []): Decisions => ({
+  service: true,
+  data: true,
+  user: true,
+  onecliDelete,
+});
+
+const kinds = (actions: RemovalAction[]) => actions.map((a) => a.kind);
+
+describe('buildRemovalPlan ordering invariants', () => {
+  it('removes .env only via the atomic backup action, never a bare delete', () => {
+    const actions = buildRemovalPlan(inventory(), allYes());
+    expect(actions.filter((a) => a.kind === 'backup-env')).toHaveLength(1);
+    expect(
+      actions.some((a) => a.kind === 'delete-path' && a.item.path === '/proj/.env'),
+    ).toBe(false);
+  });
+
+  it('puts the runtime tail strictly last, with node_modules final', () => {
+    const actions = buildRemovalPlan(inventory(), allYes([agent('u-1', 'ag-mine')]));
+    const tail = actions.slice(-2);
+    expect(tail.map((a) => a.kind)).toEqual(['delete-runtime-path', 'delete-runtime-path']);
+    expect(tail.map((a) => (a.kind === 'delete-runtime-path' ? a.item.path : ''))).toEqual([
+      '/proj/dist',
+      '/proj/node_modules',
+    ]);
+    // No non-tail action after the first runtime delete.
+    const firstTailIdx = actions.findIndex((a) => a.kind === 'delete-runtime-path');
+    expect(
+      actions.slice(firstTailIdx).every((a) => a.kind === 'delete-runtime-path'),
+    ).toBe(true);
+  });
+
+  it('deletes OneCLI agents before the data group (which removes data/v2.db)', () => {
+    const actions = buildRemovalPlan(inventory(), allYes([agent('u-1', 'ag-mine')]));
+    const onecliIdx = actions.findIndex((a) => a.kind === 'delete-onecli-agent');
+    const dataIdx = actions.findIndex(
+      (a) => a.kind === 'delete-path' && a.item.path === '/proj/data',
+    );
+    expect(onecliIdx).toBeGreaterThanOrEqual(0);
+    expect(dataIdx).toBeGreaterThan(onecliIdx);
+  });
+
+  it('runs service teardown before container removal so the host cannot respawn them', () => {
+    const actions = buildRemovalPlan(inventory(), allYes());
+    const unloadIdx = actions.findIndex((a) => a.kind === 'unload-service');
+    const pkillIdx = actions.findIndex((a) => a.kind === 'pkill-host');
+    const rmContainersIdx = actions.findIndex((a) => a.kind === 'rm-containers');
+    expect(unloadIdx).toBeLessThan(rmContainersIdx);
+    expect(pkillIdx).toBeLessThan(rmContainersIdx);
+  });
+});
+
+describe('buildRemovalPlan declined groups', () => {
+  it('declined data yields no data deletes and no runtime tail', () => {
+    const actions = buildRemovalPlan(inventory(), {
+      service: true,
+      data: false,
+      user: true,
+      onecliDelete: [],
+    });
+    expect(kinds(actions)).not.toContain('backup-env');
+    expect(kinds(actions)).not.toContain('delete-runtime-path');
+    expect(
+      actions.some((a) => a.kind === 'delete-path' && a.item.path.startsWith('/proj/data')),
+    ).toBe(false);
+  });
+
+  it('all declined yields an empty plan', () => {
+    const actions = buildRemovalPlan(inventory(), {
+      service: false,
+      data: false,
+      user: false,
+      onecliDelete: [],
+    });
+    expect(actions).toEqual([]);
+  });
+
+  it('declined service yields no service actions', () => {
+    const actions = buildRemovalPlan(inventory(), {
+      service: false,
+      data: true,
+      user: false,
+      onecliDelete: [],
+    });
+    for (const kind of ['unload-service', 'pkill-host', 'rm-containers', 'rmi', 'rm-ncl-symlink']) {
+      expect(kinds(actions)).not.toContain(kind);
+    }
+  });
+});
+
+describe('buildRemovalPlan conditional actions', () => {
+  it('skips backup-env when there is no .env', () => {
+    const inv = inventory({ data: [item('/proj/data', 'Database & conversations')] });
+    expect(kinds(buildRemovalPlan(inv, allYes()))).not.toContain('backup-env');
+  });
+
+  it('always re-sweeps containers and processes with a confirmed service group', () => {
+    const inv = inventory({ service: { containerIds: [] } });
+    const actions = buildRemovalPlan(inv, allYes());
+    const actionKinds = kinds(actions);
+    expect(actionKinds).not.toContain('rmi');
+    expect(actionKinds).not.toContain('unload-service');
+    // pkill and rm-containers run unconditionally — a manually started host
+    // has no plist/unit, and the live host may have spawned containers the
+    // scan never saw. Removal re-lists by install label, not scan-time ids.
+    expect(actionKinds).toContain('pkill-host');
+    const rm = actions.find((a) => a.kind === 'rm-containers');
+    expect(rm && rm.kind === 'rm-containers' ? rm.labelFilter : '').toBe(
+      'nanoclaw-install=abcd1234',
+    );
+  });
+});
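Review note: the ordering tests above assert the runtime-tail invariant inline. If more call sites need it, the same check could be factored into a small predicate. The following is only a sketch: `checkRuntimeTail` and `PlanStep` are hypothetical names that do not exist in this change, and the step shape is simplified to a `kind` plus an optional `path`.

```typescript
// Simplified, hypothetical view of a plan step.
interface PlanStep {
  kind: string;
  path?: string;
}

// Returns a description of the first violation, or null when the plan
// satisfies the runtime-tail invariant.
function checkRuntimeTail(plan: PlanStep[]): string | null {
  const first = plan.findIndex((s) => s.kind === 'delete-runtime-path');
  if (first === -1) return null; // no runtime tail in this plan

  // Everything from the first runtime delete onward must be a runtime delete.
  const tail = plan.slice(first);
  if (!tail.every((s) => s.kind === 'delete-runtime-path')) {
    return 'a non-runtime action follows the first runtime delete';
  }

  // If node_modules is in the tail, it must be the final step.
  const nmIdx = tail.findIndex((s) => (s.path ?? '').endsWith('node_modules'));
  if (nmIdx !== -1 && nmIdx !== tail.length - 1) {
    return 'node_modules is not the final action';
  }
  return null;
}

const goodPlan: PlanStep[] = [
  { kind: 'delete-path', path: '/proj/data' },
  { kind: 'delete-runtime-path', path: '/proj/dist' },
  { kind: 'delete-runtime-path', path: '/proj/node_modules' },
];
const interleavedPlan: PlanStep[] = [
  { kind: 'delete-runtime-path', path: '/proj/dist' },
  { kind: 'delete-path', path: '/proj/data' },
];
const wrongOrderPlan: PlanStep[] = [
  { kind: 'delete-runtime-path', path: '/proj/node_modules' },
  { kind: 'delete-runtime-path', path: '/proj/dist' },
];

const goodResult = checkRuntimeTail(goodPlan);
const interleavedResult = checkRuntimeTail(interleavedPlan);
const wrongOrderResult = checkRuntimeTail(wrongOrderPlan);
```

Returning a message rather than a boolean keeps test failures readable, at the cost of a slightly less conventional predicate signature.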
@@ -0,0 +1,130 @@
+/**
+ * Pure removal planner: inventory + per-group decisions → ordered actions.
+ *
+ * The order is load-bearing:
+ *   1. Service / processes / containers / image / symlink — stop the host
+ *      first so it can't respawn containers mid-removal.
+ *   2. OneCLI agent deletions — before the data group, which removes the
+ *      data/v2.db the mine/orphan split was computed from.
+ *   3. Data group, with the .env backup strictly before its deletion.
+ *   4. User group (groups/, store/).
+ *   5. Runtime tail: dist/ then node_modules/ — ALWAYS last. The uninstaller
+ *      runs on tsx out of node_modules; nothing may load after this.
+ */
+import path from 'path';
+
+import type { VaultAgent } from './onecli-agents.js';
+import type { Inventory, PathItem } from './scan.js';
+
+export interface Decisions {
+  service: boolean;
+  data: boolean;
+  user: boolean;
+  onecliDelete: VaultAgent[];
+}
+
+export type RemovalAction =
+  | {
+      kind: 'unload-service';
+      flavor: 'launchd' | 'systemd-user' | 'systemd-system';
+      unitPath: string;
+      /** systemd unit name without .service (unused for launchd). */
+      unitName: string;
+    }
+  | { kind: 'kill-pid'; pidFile: string }
+  | { kind: 'pkill-host'; pattern: string }
+  /**
+   * Containers are re-listed by label at removal time, not removed from
+   * scan-time ids — the host stays alive through the whole confirm phase
+   * and can spawn new containers after the scan.
+   */
+  | { kind: 'rm-containers'; runtime: string; labelFilter: string }
+  | { kind: 'rmi'; runtime: string; image: string }
+  | { kind: 'rm-ncl-symlink'; linkPath: string }
+  | { kind: 'delete-onecli-agent'; agent: VaultAgent }
+  /**
+   * Backs up AND removes .env as one atomic action: a failed backup must
+   * never be followed by the deletion (the backup is the user's only copy
+   * of their API keys). .env is deliberately excluded from `delete-path`.
+   */
+  | { kind: 'backup-env'; envPath: string }
+  | { kind: 'delete-path'; item: PathItem }
+  | { kind: 'delete-runtime-path'; item: PathItem };
+
+export function buildRemovalPlan(inv: Inventory, d: Decisions): RemovalAction[] {
+  const actions: RemovalAction[] = [];
+
+  if (d.service) {
+    const s = inv.service;
+    if (s.launchdPlist) {
+      actions.push({
+        kind: 'unload-service',
+        flavor: 'launchd',
+        unitPath: s.launchdPlist,
+        unitName: path.basename(s.launchdPlist, '.plist'),
+      });
+    }
+    if (s.systemdUserUnit) {
+      actions.push({
+        kind: 'unload-service',
+        flavor: 'systemd-user',
+        unitPath: s.systemdUserUnit,
+        unitName: path.basename(s.systemdUserUnit, '.service'),
+      });
+    }
+    if (s.systemdSystemUnit) {
+      actions.push({
+        kind: 'unload-service',
+        flavor: 'systemd-system',
+        unitPath: s.systemdSystemUnit,
+        unitName: path.basename(s.systemdSystemUnit, '.service'),
+      });
+    }
+    if (s.pidFile) actions.push({ kind: 'kill-pid', pidFile: s.pidFile });
+    actions.push({
+      kind: 'pkill-host',
+      pattern: `${inv.projectRoot}/dist/index.js`,
+    });
+    // Unconditional (like pkill): the scan may have found zero containers
+    // while the still-running host spawned one since.
+    actions.push({
+      kind: 'rm-containers',
+      runtime: inv.containerRuntime,
+      labelFilter: `nanoclaw-install=${inv.slug}`,
+    });
+    if (s.image) {
+      actions.push({ kind: 'rmi', runtime: inv.containerRuntime, image: s.image });
+    }
+    if (s.nclSymlink) {
+      actions.push({ kind: 'rm-ncl-symlink', linkPath: s.nclSymlink });
+    }
+  }
+
+  for (const agent of d.onecliDelete) {
+    actions.push({ kind: 'delete-onecli-agent', agent });
+  }
+
+  if (d.data) {
+    const env = inv.data.find((i) => path.basename(i.path) === '.env');
+    if (env) actions.push({ kind: 'backup-env', envPath: env.path });
+    for (const item of inv.data) {
+      if (item === env) continue; // removed by backup-env, never a bare delete
+      actions.push({ kind: 'delete-path', item });
+    }
+  }
+
+  if (d.user) {
+    for (const item of inv.user) actions.push({ kind: 'delete-path', item });
+  }
+
+  if (d.data) {
+    const tail = [...inv.runtime].sort(
+      (a, b) =>
+        Number(path.basename(a.path) === 'node_modules') -
+        Number(path.basename(b.path) === 'node_modules'),
+    );
+    for (const item of tail) actions.push({ kind: 'delete-runtime-path', item });
+  }
+
+  return actions;
+}
@@ -0,0 +1,212 @@
+import { describe, it, expect, beforeEach, afterEach } from 'vitest';
+import fs from 'fs';
+import os from 'os';
+import path from 'path';
+
+import type { RunCommand } from './onecli-agents.js';
+import type { RemovalAction } from './plan.js';
+import { backupEnv, executePlan, type ExecDeps } from './remove.js';
+
+let tempDir: string;
+
+beforeEach(() => {
+  tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-remove-test-'));
+});
+
+afterEach(() => {
+  fs.rmSync(tempDir, { recursive: true, force: true });
+});
+
+function deps(overrides: Partial<ExecDeps> = {}): ExecDeps {
+  return {
+    runCommand: () => ({ status: 0, stdout: '' }),
+    log: () => {},
+    isRoot: false,
+    ...overrides,
+  };
+}
+
+describe('backupEnv', () => {
+  it('backs up to .env.bak', () => {
+    const envPath = path.join(tempDir, '.env');
+    fs.writeFileSync(envPath, 'KEY=secret');
+
+    const backup = backupEnv(envPath);
+
+    expect(backup).toBe(path.join(tempDir, '.env.bak'));
+    expect(fs.readFileSync(backup, 'utf-8')).toBe('KEY=secret');
+  });
+
+  it('falls back to a timestamped name when .env.bak exists', () => {
+    const envPath = path.join(tempDir, '.env');
+    fs.writeFileSync(envPath, 'KEY=new');
+    fs.writeFileSync(path.join(tempDir, '.env.bak'), 'KEY=old');
+
+    const backup = backupEnv(envPath);
+
+    expect(path.basename(backup)).toMatch(/^\.env\.bak\.\d{8}-\d{6}$/);
+    expect(fs.readFileSync(backup, 'utf-8')).toBe('KEY=new');
+    // The earlier backup is never clobbered.
+    expect(fs.readFileSync(path.join(tempDir, '.env.bak'), 'utf-8')).toBe('KEY=old');
+  });
+});
+
+describe('executePlan', () => {
+  it('deletes paths recursively', () => {
+    const dir = path.join(tempDir, 'data');
+    fs.mkdirSync(path.join(dir, 'nested'), { recursive: true });
+    fs.writeFileSync(path.join(dir, 'nested', 'f.txt'), 'x');
+
+    const { notes } = executePlan(
+      [{ kind: 'delete-path', item: { what: 'Data', where: dir, path: dir } }],
+      deps(),
+    );
+
+    expect(fs.existsSync(dir)).toBe(false);
+    expect(notes).toEqual([]);
+  });
+
+  it('continues past a failing action and records a note', () => {
+    const dir = path.join(tempDir, 'logs');
+    fs.mkdirSync(dir);
+    const actions: RemovalAction[] = [
+      {
+        kind: 'unload-service',
+        flavor: 'launchd',
+        unitPath: path.join(tempDir, 'svc.plist'),
+        unitName: 'com.nanoclaw-v2-test',
+      },
+      { kind: 'delete-path', item: { what: 'Logs', where: dir, path: dir } },
+    ];
+    const failing: RunCommand = () => {
+      throw new Error('launchctl exploded');
+    };
+
+    const { notes } = executePlan(actions, deps({ runCommand: failing }));
+
+    expect(notes).toHaveLength(1);
+    expect(notes[0]).toContain('unload-service');
+    expect(notes[0]).toContain('launchctl exploded');
+    // Later actions still ran.
+    expect(fs.existsSync(dir)).toBe(false);
+  });
+
+  it('leaves a system unit in place without root and notes the sudo command', () => {
+    const unitPath = path.join(tempDir, 'nanoclaw-v2-test.service');
+    fs.writeFileSync(unitPath, '[Unit]');
+    const calls: string[] = [];
+    const recorder: RunCommand = (cmd) => {
+      calls.push(cmd);
+      return { status: 0, stdout: '' };
+    };
+
+    const { notes } = executePlan(
+      [
+        {
+          kind: 'unload-service',
+          flavor: 'systemd-system',
+          unitPath,
+          unitName: 'nanoclaw-v2-test',
+        },
+      ],
+      deps({ runCommand: recorder, isRoot: false }),
+    );
+
+    expect(fs.existsSync(unitPath)).toBe(true);
+    expect(calls).toEqual([]);
+    expect(notes.some((n) => n.includes('re-run with sudo'))).toBe(true);
+  });
+
+  it('notes a failed image removal with the retry command', () => {
+    const { notes } = executePlan(
+      [{ kind: 'rmi', runtime: 'docker', image: 'img:latest' }],
+      deps({ runCommand: () => ({ status: 1, stdout: '' }) }),
+    );
+    expect(notes.some((n) => n.includes('docker rmi img:latest'))).toBe(true);
+  });
+
+  it('removes .env only after a successful backup', () => {
+    const envPath = path.join(tempDir, '.env');
+    fs.writeFileSync(envPath, 'KEY=secret');
+
+    const { notes } = executePlan([{ kind: 'backup-env', envPath }], deps());
+
+    expect(fs.existsSync(envPath)).toBe(false);
+    expect(fs.readFileSync(path.join(tempDir, '.env.bak'), 'utf-8')).toBe('KEY=secret');
+    expect(notes).toEqual([]);
+  });
+
+  it('keeps .env when the backup fails', () => {
+    const envPath = path.join(tempDir, '.env');
+    fs.writeFileSync(envPath, 'KEY=secret');
+    fs.chmodSync(tempDir, 0o555); // backup destination unwritable (not enforced for root)
+
+    try {
+      const { notes } = executePlan([{ kind: 'backup-env', envPath }], deps());
+      expect(fs.existsSync(envPath)).toBe(true);
+      expect(notes.some((n) => n.includes('backup-env'))).toBe(true);
+    } finally {
+      fs.chmodSync(tempDir, 0o755);
+    }
+  });
+
+  it('re-lists containers by label at removal time instead of using scan-time ids', () => {
+    const calls: string[][] = [];
+    const docker: RunCommand = (cmd, args) => {
+      calls.push([cmd, ...args]);
+      if (args[0] === 'ps') return { status: 0, stdout: 'fresh1\nfresh2\n' };
+      return { status: 0, stdout: '' };
+    };
+
+    executePlan(
+      [{ kind: 'rm-containers', runtime: 'docker', labelFilter: 'nanoclaw-install=abcd1234' }],
+      deps({ runCommand: docker }),
+    );
+
+    expect(calls).toEqual([
+      ['docker', 'ps', '-aq', '--filter', 'label=nanoclaw-install=abcd1234'],
+      ['docker', 'rm', '-f', 'fresh1', 'fresh2'],
+    ]);
+  });
+
+  it('notes a manual command when the container runtime is unavailable', () => {
+    const { notes } = executePlan(
+      [{ kind: 'rm-containers', runtime: 'docker', labelFilter: 'nanoclaw-install=x' }],
+      deps({ runCommand: () => ({ status: null, stdout: '' }) }),
+    );
+    expect(notes.some((n) => n.includes('xargs -r docker rm -f'))).toBe(true);
+  });
+
+  it('notes a manual delete when onecli itself cannot be run', () => {
+    const { notes } = executePlan(
+      [
+        {
+          kind: 'delete-onecli-agent',
+          agent: { uuid: 'u-123', identifier: 'ag-mine', name: 'Mine' },
+        },
+      ],
+      deps({ runCommand: () => ({ status: null, stdout: '' }) }),
+    );
+    expect(notes.some((n) => n.includes('onecli agents delete --id u-123'))).toBe(true);
+  });
+
+  it('deletes OneCLI agents by vault uuid, never by identifier', () => {
+    const calls: string[][] = [];
+    const recorder: RunCommand = (cmd, args) => {
+      calls.push([cmd, ...args]);
+      return { status: 0, stdout: '' };
+    };
+
+    executePlan(
+      [
+        {
+          kind: 'delete-onecli-agent',
+          agent: { uuid: 'u-123', identifier: 'ag-mine', name: 'Mine' },
+        },
+      ],
+      deps({ runCommand: recorder }),
+    );
+
+    expect(calls).toEqual([['onecli', 'agents', 'delete', '--id', 'u-123']]);
+  });
+});
@@ -0,0 +1,193 @@
+/**
+ * Removal-plan executor. Each action runs in its own try/catch: a failure
+ * becomes a summary note and execution continues (re-running the
+ * uninstaller is idempotent — the next scan only finds what's left).
+ *
+ * Must stay safe to run after logs/ and node_modules/ are gone: only static
+ * imports, no dynamic import(), no setup-log writes. Output goes through
+ * the injected `log` callback.
+ */
+import fs from 'fs';
+import path from 'path';
+
+import type { RunCommand } from './onecli-agents.js';
+import type { RemovalAction } from './plan.js';
+
+export interface ExecDeps {
+  runCommand: RunCommand;
+  log: (line: string) => void;
+  /** True when running as root — required to remove a system-level unit. */
+  isRoot: boolean;
+}
+
+export function executePlan(
+  actions: RemovalAction[],
+  deps: ExecDeps,
+): { notes: string[] } {
+  const notes: string[] = [];
+  for (const action of actions) {
+    try {
+      runAction(action, deps, notes);
+    } catch (err) {
+      const msg = err instanceof Error ? err.message : String(err);
+      notes.push(
+        `${action.kind}: failed (${msg}) — re-run the uninstaller to retry.`,
+      );
+    }
+  }
+  return { notes };
+}
+
+/**
+ * Copy .env aside before deletion. Never clobbers an existing backup —
+ * falls back to a timestamped name on collision. Returns the backup path.
+ */
+export function backupEnv(envPath: string): string {
+  const dir = path.dirname(envPath);
+  let backup = path.join(dir, '.env.bak');
+  if (fs.existsSync(backup)) {
+    const stamp = new Date()
+      .toISOString()
+      .replace(/[-:]/g, '')
+      .replace('T', '-')
+      .slice(0, 15);
+    backup = path.join(dir, `.env.bak.${stamp}`);
+  }
+  fs.copyFileSync(envPath, backup, fs.constants.COPYFILE_EXCL); // throw rather than overwrite
+  return backup;
+}
+
+function runAction(action: RemovalAction, deps: ExecDeps, notes: string[]): void {
+  const { runCommand, log } = deps;
+  switch (action.kind) {
+    case 'unload-service':
+      switch (action.flavor) {
+        case 'launchd':
+          runCommand('launchctl', ['unload', action.unitPath]);
+          fs.rmSync(action.unitPath, { force: true });
+          log('✓ background service removed');
+          break;
+        case 'systemd-user':
+          runCommand('systemctl', [
+            '--user',
+            'disable',
+            '--now',
+            `${action.unitName}.service`,
+          ]);
+          fs.rmSync(action.unitPath, { force: true });
+          runCommand('systemctl', ['--user', 'daemon-reload']);
+          log('✓ background service removed');
+          break;
+        case 'systemd-system':
+          if (!deps.isRoot) {
+            log('! system service needs root — left in place');
+            notes.push(
+              `System service ${action.unitPath} — re-run with sudo to remove.`,
+            );
+            break;
+          }
+          runCommand('systemctl', ['disable', '--now', `${action.unitName}.service`]);
+          fs.rmSync(action.unitPath, { force: true });
+          runCommand('systemctl', ['daemon-reload']);
+          log('✓ system service removed');
+          break;
+      }
+      break;
+    case 'kill-pid': {
+      let pid = NaN;
+      try {
+        pid = Number(fs.readFileSync(action.pidFile, 'utf-8').trim());
+      } catch {
+        // pidfile already gone
+      }
+      if (Number.isInteger(pid) && pid > 0) {
+        try {
+          process.kill(pid);
+          log('✓ stopped host process');
+        } catch {
+          // not running
+        }
+      }
+      break;
+    }
+    case 'pkill-host':
+      // Exit 1 = no matching process — not a failure.
+      runCommand('pkill', ['-f', action.pattern]);
+      break;
+    case 'rm-containers': {
+      // Re-list at removal time: the host was alive during the confirm
+      // phase and may have spawned containers the scan never saw.
+      const ps = runCommand(action.runtime, [
+        'ps',
+        '-aq',
+        '--filter',
+        `label=${action.labelFilter}`,
+      ]);
+      if (ps.status !== 0) {
+        notes.push(
+          `Containers: '${action.runtime}' unavailable — remove later with: ` +
+            `${action.runtime} ps -aq --filter label=${action.labelFilter} | xargs -r ${action.runtime} rm -f`,
+        );
+        break;
+      }
+      const ids = ps.stdout
+        .split('\n')
+        .map((s) => s.trim())
+        .filter(Boolean);
+      if (ids.length === 0) break;
+      runCommand(action.runtime, ['rm', '-f', ...ids]);
+      log(`✓ removed ${ids.length} container(s)`);
+      break;
+    }
+    case 'rmi': {
+      const res = runCommand(action.runtime, ['rmi', action.image]);
+      if (res.status === 0) {
+        log('✓ removed container image');
+      } else {
+        log('! could not remove image (in use?)');
+        notes.push(
+          `Image ${action.image}: not removed — retry with: ${action.runtime} rmi ${action.image}`,
+        );
+      }
+      break;
+    }
+    case 'rm-ncl-symlink':
+      fs.rmSync(action.linkPath, { force: true });
+      log('✓ removed ncl command');
+      break;
+    case 'delete-onecli-agent': {
+      const res = runCommand('onecli', [
+        'agents',
+        'delete',
+        '--id',
+        action.agent.uuid,
+      ]);
+      if (res.status === 0) {
+        log(`✓ deleted OneCLI agent ${action.agent.name} (${action.agent.identifier})`);
+      } else if (res.status === null) {
+        // spawn failure (binary gone since the scan), not a missing agent
+        log(`! couldn't run onecli for ${action.agent.identifier}`);
+        notes.push(
+          `OneCLI agent ${action.agent.name} (${action.agent.identifier}): couldn't run onecli — ` +
+            `delete manually with: onecli agents delete --id ${action.agent.uuid}`,
+        );
+      } else {
+        log(`! OneCLI agent ${action.agent.identifier} already gone`);
+      }
+      break;
+    }
+    case 'backup-env': {
+      // Backup and removal are one action so a failed backup (which throws
+      // into executePlan's catch) can never be followed by the deletion.
+      const backup = backupEnv(action.envPath);
+      fs.rmSync(action.envPath, { force: true });
+      log(`✓ removed .env (backup at ${backup})`);
+      break;
+    }
+    case 'delete-path':
+    case 'delete-runtime-path':
+      fs.rmSync(action.item.path, { recursive: true, force: true });
+      log(`✓ removed ${action.item.what}`);
+      break;
+  }
+}
@@ -0,0 +1,196 @@
+import { describe, it, expect, beforeEach, afterEach } from 'vitest';
+import fs from 'fs';
+import os from 'os';
+import path from 'path';
+
+import Database from 'better-sqlite3';
+
+import { getLaunchdLabel, getSystemdUnit } from '../../src/install-slug.js';
+import type { RunCommand } from './onecli-agents.js';
+import { detectExistingInstall, scanInstall, type ScanDeps } from './scan.js';
+
+let root: string;
+let home: string;
+
+beforeEach(() => {
+  root = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-scan-root-'));
+  home = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-scan-home-'));
+});
+
+afterEach(() => {
+  fs.rmSync(root, { recursive: true, force: true });
+  fs.rmSync(home, { recursive: true, force: true });
+});
+
+/** Fake runCommand: unhandled commands fail (binary missing / daemon down). */
+function fakeRun(
+  handlers: Record<string, (args: string[]) => { status: number | null; stdout: string }>,
+): RunCommand {
+  return (cmd, args) => (handlers[cmd] ?? (() => ({ status: 1, stdout: '' })))(args);
+}
+
+function deps(overrides: Partial<ScanDeps> = {}): ScanDeps {
+  return {
+    projectRoot: root,
+    home,
+    platform: 'darwin',
+    runCommand: fakeRun({}),
+    ...overrides,
+  };
+}
+
+const dockerUp = (containerIds: string[], hasImage: boolean) =>
+  fakeRun({
+    docker: (args) => {
+      if (args[0] === 'ps') return { status: 0, stdout: containerIds.join('\n') + '\n' };
+      if (args[0] === 'image') return { status: hasImage ? 0 : 1, stdout: '' };
+      return { status: 1, stdout: '' };
+    },
+  });
+
+describe('scanInstall path groups', () => {
+  it('puts dist and node_modules in runtime, not data', () => {
+    for (const dir of ['data', 'logs', 'dist', 'node_modules', 'groups', 'store']) {
+      fs.mkdirSync(path.join(root, dir));
+    }
+    fs.writeFileSync(path.join(root, '.env'), 'KEY=v');
+    fs.writeFileSync(path.join(root, 'start-nanoclaw.sh'), '#!/bin/bash');
+
+    const inv = scanInstall(deps());
+
+    expect(inv.data.map((i) => path.basename(i.path))).toEqual([
+      'data',
+      'logs',
+      '.env',
+      'start-nanoclaw.sh',
+    ]);
+    expect(inv.runtime.map((i) => path.basename(i.path))).toEqual([
+      'dist',
+      'node_modules',
+    ]);
+    expect(inv.user.map((i) => path.basename(i.path))).toEqual(['groups', 'store']);
+  });
+
+  it('finds nothing in an empty checkout', () => {
+    const inv = scanInstall(deps());
+    expect(inv.data).toEqual([]);
+    expect(inv.runtime).toEqual([]);
+    expect(inv.user).toEqual([]);
+    expect(inv.service.containerIds).toEqual([]);
+    expect(inv.service.image).toBeUndefined();
+  });
+});
+
+describe('scanInstall service artifacts', () => {
+  it('detects the launchd plist on macOS', () => {
+    const plist = path.join(
+      home,
+      'Library',
+      'LaunchAgents',
+      `${getLaunchdLabel(root)}.plist`,
+    );
+    fs.mkdirSync(path.dirname(plist), { recursive: true });
+    fs.writeFileSync(plist, '<plist/>');
+
+    const inv = scanInstall(deps());
+    expect(inv.service.launchdPlist).toBe(plist);
+    expect(inv.service.systemdUserUnit).toBeUndefined();
+  });
+
+  it('detects systemd user unit and pidfile on Linux', () => {
+    const unit = path.join(
+      home,
+      '.config',
+      'systemd',
+      'user',
+      `${getSystemdUnit(root)}.service`,
+    );
+    fs.mkdirSync(path.dirname(unit), { recursive: true });
+    fs.writeFileSync(unit, '[Unit]');
+    fs.writeFileSync(path.join(root, 'nanoclaw.pid'), '12345');
+
+    const inv = scanInstall(deps({ platform: 'linux' }));
+    expect(inv.service.systemdUserUnit).toBe(unit);
+    expect(inv.service.pidFile).toBe(path.join(root, 'nanoclaw.pid'));
+    expect(inv.service.launchdPlist).toBeUndefined();
+  });
+
+  it('captures container ids and image when docker is up', () => {
+    const inv = scanInstall(deps({ runCommand: dockerUp(['abc123', 'def456'], true) }));
+    expect(inv.service.containerIds).toEqual(['abc123', 'def456']);
+    expect(inv.service.image).toMatch(/^nanoclaw-agent-v2-[0-9a-f]{8}:latest$/);
+    expect(inv.notes).toEqual([]);
+  });
+
+  it('degrades with a manual-cleanup note when docker is unavailable', () => {
+    const inv = scanInstall(deps());
+    expect(inv.service.containerIds).toEqual([]);
+    expect(inv.service.image).toBeUndefined();
+    expect(inv.notes.some((n) => n.includes("'docker' unavailable"))).toBe(true);
+  });
+});
+
+describe('scanInstall ncl symlink', () => {
+  const link = () => path.join(home, '.local', 'bin', 'ncl');
+
+  it('includes the symlink only when it targets this checkout', () => {
+    fs.mkdirSync(path.dirname(link()), { recursive: true });
+    fs.symlinkSync(path.join(root, 'bin', 'ncl'), link());
+
+    const inv = scanInstall(deps());
+    expect(inv.service.nclSymlink).toBe(link());
+  });
+
+  it('leaves a symlink pointing at another copy, with a note', () => {
+    fs.mkdirSync(path.dirname(link()), { recursive: true });
+    fs.symlinkSync('/some/other/copy/bin/ncl', link());
+
+    const inv = scanInstall(deps());
+    expect(inv.service.nclSymlink).toBeUndefined();
+    expect(inv.notes.some((n) => n.includes('points to another NanoClaw copy'))).toBe(true);
+  });
+});
+
+describe('scanInstall OneCLI agents', () => {
+  const vault = JSON.stringify({
+    data: [
+      { id: 'u-1', identifier: 'ag-mine', name: 'Mine', isDefault: false },
+      { id: 'u-2', identifier: 'ag-other', name: 'Other', isDefault: false },
+    ],
+  });
+  const onecliUp = fakeRun({ onecli: () => ({ status: 0, stdout: vault }) });
+
+  it('splits mine vs orphans against the central DB', () => {
+    fs.mkdirSync(path.join(root, 'data'));
+    const db = new Database(path.join(root, 'data', 'v2.db'));
+    db.exec('CREATE TABLE agent_groups (id TEXT PRIMARY KEY)');
+    db.prepare('INSERT INTO agent_groups (id) VALUES (?)').run('ag-mine');
+    db.close();
+
+    const inv = scanInstall(deps({ runCommand: onecliUp }));
+    expect(inv.onecli.idsKnown).toBe(true);
+    expect(inv.onecli.mine.map((a) => a.identifier)).toEqual(['ag-mine']);
+    expect(inv.onecli.orphans.map((a) => a.identifier)).toEqual(['ag-other']);
+  });
+
+  it('flags orphan labels as unreliable when the DB is unreadable', () => {
+    const inv = scanInstall(deps({ runCommand: onecliUp }));
+    expect(inv.onecli.idsKnown).toBe(false);
+    expect(inv.onecli.mine).toEqual([]);
+    expect(inv.onecli.orphans.map((a) => a.identifier)).toEqual(['ag-mine', 'ag-other']);
+    expect(inv.notes.some((n) => n.includes("Couldn't read agent_groups"))).toBe(true);
+  });
+});
+
+describe('detectExistingInstall', () => {
+  it('is false for an empty checkout', () => {
+    expect(detectExistingInstall(root)).toBe(false);
+  });
+
+  it('is true when the central DB exists', () => {
+    fs.mkdirSync(path.join(root, 'data'));
+    const db = new Database(path.join(root, 'data', 'v2.db'));
+    db.close();
+    expect(detectExistingInstall(root)).toBe(true);
+  });
+});
@@ -0,0 +1,278 @@
+/**
+ * Uninstall inventory scan — find every artifact this checkout created.
+ *
+ * Everything NanoClaw creates is tagged with the per-checkout install slug
+ * (sha1(projectRoot)[:8]), so several copies can coexist on one machine.
+ * The scan reports ONLY things belonging to the given project root; shared
+ * tools (the OneCLI app/vault, shell PATH lines, host-wide config) are
+ * never inventoried.
+ *
+ * External commands (docker, onecli) go through the injected `runCommand`
+ * so tests can fake them; filesystem checks are real — tests use temp dirs.
+ * A missing/down docker daemon degrades to an empty result plus a note with
+ * manual cleanup commands; it never throws.
+ *
+ * Deliberately does NOT import src/config.ts (import-time side effects).
+ */
+import fs from 'fs';
+import os from 'os';
+import path from 'path';
+
+import {
+  getContainerImageBase,
+  getInstallSlug,
+  getLaunchdLabel,
+  getSystemdUnit,
+} from '../../src/install-slug.js';
+import {
+  listVaultAgents,
+  readAgentGroupIds,
+  splitVaultAgents,
+  type RunCommand,
+  type VaultAgent,
+} from './onecli-agents.js';
+
+export interface PathItem {
+  /** Human label, e.g. "Database & conversations". */
+  what: string;
+  /** Display location (tilde-abbreviated). */
+  where: string;
+  /** Absolute path to remove. */
+  path: string;
+}
+
+export interface ServiceInventory {
+  launchdPlist?: string;
+  systemdUserUnit?: string;
+  systemdSystemUnit?: string;
+  pidFile?: string;
+  containerIds: string[];
+  image?: string;
+  nclSymlink?: string;
+}
+
+export interface OnecliInventory {
+  mine: VaultAgent[];
+  orphans: VaultAgent[];
+  /** False when agent_groups couldn't be read — orphan labels are then unreliable. */
+  idsKnown: boolean;
+}
+
+export interface Inventory {
+  slug: string;
+  projectRoot: string;
+  containerRuntime: string;
+  service: ServiceInventory;
+  /** Group 2: app data, logs & secrets. */
+  data: PathItem[];
+  /**
+   * dist/ + node_modules/ — displayed with the data group but removed dead
+   * last: the uninstaller itself runs on tsx out of node_modules.
+   */
+  runtime: PathItem[];
+  /** Group 3: groups/ and store/ — user content, unrecoverable. */
+  user: PathItem[];
+  onecli: OnecliInventory;
+  notes: string[];
+}
+
+export interface ScanDeps {
+  projectRoot: string;
+  home: string;
+  platform: NodeJS.Platform;
+  runCommand: RunCommand;
+}
+
+export function tilde(p: string, home: string): string {
+  return p === home || p.startsWith(home + path.sep) ? `~${p.slice(home.length)}` : p;
+}
+
+export function scanInstall(deps: ScanDeps): Inventory {
+  const { projectRoot, home, runCommand } = deps;
+  const slug = getInstallSlug(projectRoot);
+  const containerRuntime = process.env.CONTAINER_RUNTIME ?? 'docker';
+  const notes: string[] = [];
+
+  const service = scanService(deps, slug, containerRuntime, notes);
+
+  const data = existingItems(projectRoot, home, [
+    { rel: 'data', what: 'Database & conversations' },
+    { rel: 'logs', what: 'Logs' },
+    { rel: '.env', what: 'Secrets / API keys (.env)', where: 'backed up before removal' },
+    { rel: 'start-nanoclaw.sh', what: 'Start script', where: 'start-nanoclaw.sh' },
+    { rel: 'nanoclaw.pid', what: 'PID file', where: 'nanoclaw.pid' },
+  ]);
+
+  const runtime = existingItems(projectRoot, home, [
+    { rel: 'dist', what: 'Build output' },
+    { rel: 'node_modules', what: 'Installed dependencies' },
+  ]);
+
+  const user = existingItems(projectRoot, home, [
+    { rel: 'groups', what: 'Agent memory & files' },
+    { rel: 'store', what: 'Migrated data store' },
+  ]);
+
+  const onecli = scanOnecli(projectRoot, runCommand, notes);
+
+  return {
+    slug,
+    projectRoot,
+    containerRuntime,
+    service,
+    data,
+    runtime,
+    user,
+    onecli,
+    notes,
+  };
+}
+
+/**
+ * Cheap existing-install probe for mid-setup detection: service registration
+ * (per-platform) or a central DB. No docker or onecli calls.
+ */
+export function detectExistingInstall(projectRoot: string): boolean {
+  if (fs.existsSync(path.join(projectRoot, 'data', 'v2.db'))) return true;
+  const home = os.homedir();
+  if (process.platform === 'darwin') {
+    return fs.existsSync(
+      path.join(home, 'Library', 'LaunchAgents', `${getLaunchdLabel(projectRoot)}.plist`),
+    );
+  }
+  if (process.platform === 'linux') {
+    const unit = getSystemdUnit(projectRoot);
+    return (
+      fs.existsSync(path.join(home, '.config', 'systemd', 'user', `${unit}.service`)) ||
+      fs.existsSync(`/etc/systemd/system/${unit}.service`)
+    );
+  }
+  return false;
+}
+
+function scanService(
+  deps: ScanDeps,
+  slug: string,
+  containerRuntime: string,
+  notes: string[],
+): ServiceInventory {
+  const { projectRoot, home, platform, runCommand } = deps;
+  const service: ServiceInventory = { containerIds: [] };
+
+  if (platform === 'darwin') {
+    const plist = path.join(
+      home,
+      'Library',
+      'LaunchAgents',
+      `${getLaunchdLabel(projectRoot)}.plist`,
+    );
+    if (fs.existsSync(plist)) service.launchdPlist = plist;
+  } else if (platform === 'linux') {
+    const unit = getSystemdUnit(projectRoot);
+    const userUnit = path.join(home, '.config', 'systemd', 'user', `${unit}.service`);
+    const systemUnit = `/etc/systemd/system/${unit}.service`;
+    if (fs.existsSync(userUnit)) service.systemdUserUnit = userUnit;
+    if (fs.existsSync(systemUnit)) service.systemdSystemUnit = systemUnit;
+    const pidFile = path.join(projectRoot, 'nanoclaw.pid');
+    if (fs.existsSync(pidFile)) service.pidFile = pidFile;
+  }
+
+  // Container label matches what container-runner.ts stamps at spawn time.
+  const installLabel = `nanoclaw-install=${slug}`;
+  const image = `${getContainerImageBase(projectRoot)}:latest`;
+  let runtimeOk = true;
+  try {
+    const ps = runCommand(containerRuntime, [
+      'ps',
+      '-aq',
+      '--filter',
+      `label=${installLabel}`,
+    ]);
+    if (ps.status === 0) {
+      service.containerIds = ps.stdout
+        .split('\n')
+        .map((s) => s.trim())
+        .filter(Boolean);
+    } else {
+      runtimeOk = false;
+    }
+  } catch {
+    runtimeOk = false;
+  }
+  if (runtimeOk) {
+    try {
+      const inspect = runCommand(containerRuntime, ['image', 'inspect', image]);
+      if (inspect.status === 0) service.image = image;
+    } catch {
+      runtimeOk = false;
+    }
+  }
+  if (!runtimeOk) {
+    notes.push(
+      `Containers/image: '${containerRuntime}' unavailable; remove later with: ` +
+        `${containerRuntime} ps -aq --filter label=${installLabel} | xargs -r ${containerRuntime} rm -f; ` +
+        `${containerRuntime} rmi ${image}`,
+    );
+  }
+
+  const link = path.join(home, '.local', 'bin', 'ncl');
+  let linkStat: fs.Stats | null = null;
+  try {
+    linkStat = fs.lstatSync(link);
+  } catch {
+    linkStat = null;
+  }
+  if (linkStat?.isSymbolicLink()) {
+    let target = fs.readlinkSync(link);
+    if (!path.isAbsolute(target)) {
+      target = path.resolve(path.dirname(link), target);
+    }
+    if (path.resolve(target) === path.join(projectRoot, 'bin', 'ncl')) {
+      service.nclSymlink = link;
+    } else {
+      notes.push(
+        `ncl command ${tilde(link, home)} points to another NanoClaw copy; left untouched.`,
+      );
+    }
+  }
+
+  return service;
+}
+
+function scanOnecli(
+  projectRoot: string,
+  runCommand: RunCommand,
+  notes: string[],
+): OnecliInventory {
+  const vault = listVaultAgents(runCommand);
+  if (!vault.available || vault.agents.length === 0) {
+    return { mine: [], orphans: [], idsKnown: false };
+  }
+
+  const { ids, known } = readAgentGroupIds(path.join(projectRoot, 'data', 'v2.db'));
+  const { mine, orphans } = splitVaultAgents(vault.agents, ids, known);
+  if (!known && orphans.length > 0) {
+    notes.push(
+      "Couldn't read agent_groups from data/v2.db; OneCLI agents shown as 'orphan' may actually belong to this copy.",
+    );
+  }
+  return { mine, orphans, idsKnown: known };
+}
+
+function existingItems(
+  projectRoot: string,
+  home: string,
+  specs: { rel: string; what: string; where?: string }[],
+): PathItem[] {
+  const items: PathItem[] = [];
+  for (const spec of specs) {
+    const p = path.join(projectRoot, spec.rel);
+    if (!fs.existsSync(p)) continue;
+    items.push({
+      what: spec.what,
+      where: spec.where ?? `${tilde(p, home)}/`,
+      path: p,
+    });
+  }
+  return items;
+}
@@ -44,6 +44,9 @@ export interface DeliveryAddress {
 */
 export interface InboundEvent {
  channelType: string;
+  /** Receiving adapter instance; stamped host-side (src/index.ts onInbound).
+   *  Absent (e.g. CLI onInboundEvent) means the default instance (= channelType). */
+  instance?: string;
  platformId: string;
  threadId: string | null;
  message: {
@@ -112,6 +115,15 @@ export interface ChannelAdapter {
  name: string;
  channelType: string;

+  /**
+   * Adapter-instance name — distinguishes N adapters of one platform
+   * (e.g. three Slack apps in one workspace). Defaults to channelType.
+   * channelType stays the SEMANTIC platform key (user ids '<channelType>:<handle>',
+   * formatting, container config); instance is a host-side routing key only.
+   * Must be unique across active adapters and URL-safe (no '/', '?', ':').
+   */
+  instance?: string;
+
  /**
   * Whether this adapter models conversations as threads.
   *
@@ -30,19 +30,24 @@ function now() {
 /** Create a mock ChannelAdapter for testing. */
 function createMockAdapter(
  channelType: string,
-): ChannelAdapter & { delivered: OutboundMessage[]; inbound: InboundMessage[] } {
+  instance?: string,
+): ChannelAdapter & { delivered: OutboundMessage[]; inbound: InboundMessage[]; setupTimes: number[] } {
  const delivered: OutboundMessage[] = [];
  const inbound: InboundMessage[] = [];
+  const setupTimes: number[] = [];
  let setupConfig: ChannelSetup | null = null;

  return {
-    name: channelType,
+    name: instance ?? channelType,
    channelType,
+    instance,
    supportsThreads: false,
    delivered,
    inbound,
+    setupTimes,

    async setup(config: ChannelSetup) {
+      setupTimes.push(Date.now());
      setupConfig = config;
    },

@@ -117,6 +122,117 @@ describe('channel registry', () => {
  });
 });

+describe('channel registry — instance keying', () => {
+  // Fresh module per test: the registry and activeAdapters maps are
+  // module-level, and these cases register conflicting same-channelType
+  // adapters that must not leak across tests.
+  beforeEach(() => {
+    vi.resetModules();
+  });
+
+  afterEach(async () => {
+    const { teardownChannelAdapters } = await import('./channel-registry.js');
+    await teardownChannelAdapters();
+    // Drop this test's registrations so later describe blocks (which import
+    // the registry without resetting) start from an empty registry instead
+    // of inheriting same-channelType pairs.
+    vi.resetModules();
+  });
+
+  const mockSetup = () => ({
+    onInbound: () => {},
+    onInboundEvent: () => {},
+    onMetadata: () => {},
+    onAction: () => {},
+  });
+
+  it('keys two same-channelType adapters by instance — both resolvable', async () => {
+    const reg = await import('./channel-registry.js');
+    const worker = createMockAdapter('slack', 'slack-worker');
+    const tester = createMockAdapter('slack', 'slack-tester');
+    reg.registerChannelAdapter('slack-worker', { factory: () => worker });
+    reg.registerChannelAdapter('slack-tester', { factory: () => tester });
+
+    await reg.initChannelAdapters(mockSetup);
+
+    expect(reg.getChannelAdapter('slack-worker')).toBe(worker);
+    expect(reg.getChannelAdapter('slack-tester')).toBe(tester);
+    expect(reg.getActiveAdapters()).toHaveLength(2);
+  });
+
+  it('resolves channelType to the default-instance adapter when one exists, else first-registered', async () => {
+    const reg = await import('./channel-registry.js');
+    const named = createMockAdapter('slack', 'slack-tester');
+    const unnamed = createMockAdapter('slack');
+    reg.registerChannelAdapter('slack-tester', { factory: () => named });
+    reg.registerChannelAdapter('slack', { factory: () => unnamed });
+
+    await reg.initChannelAdapters(mockSetup);
+
+    // Exact key (default instance keyed by channelType) beats the fallback
+    // scan, even though the named sibling registered first.
+    expect(reg.getChannelAdapter('slack')).toBe(unnamed);
+
+    // With ONLY named instances active, channelType still resolves —
+    // deterministic first-registered fallback.
+    await reg.teardownChannelAdapters();
+    vi.resetModules();
+    const reg2 = await import('./channel-registry.js');
+    const first = createMockAdapter('slack', 'slack-tester');
+    const second = createMockAdapter('slack', 'slack-worker');
+    reg2.registerChannelAdapter('slack-tester', { factory: () => first });
+    reg2.registerChannelAdapter('slack-worker', { factory: () => second });
+    await reg2.initChannelAdapters(mockSetup);
+    expect(reg2.getChannelAdapter('slack')).toBe(first);
+  });
+
+  it('does NOT reroute default-instance outbound through a named sibling when the default adapter is missing', async () => {
+    // The default Slack app is offline (token rotated, factory returned
+    // null, …) while a named sibling boots fine. Outbound for the default
+    // instance must get the offline-adapter handling (drop into the retry
+    // path) — NEVER a cross-identity send through the sibling bot.
+    const reg = await import('./channel-registry.js');
+    const tester = createMockAdapter('slack', 'slack-tester');
+    reg.registerChannelAdapter('slack-tester', { factory: () => tester });
+    reg.registerChannelAdapter('slack', { factory: () => null });
+
+    await reg.initChannelAdapters(mockSetup);
+
+    // Exact lookup (delivery/typing path): the default key resolves nothing.
+    expect(reg.getChannelAdapterExact('slack')).toBeUndefined();
+    // Fallback-capable lookup (channelType-only callers) still resolves.
+    expect(reg.getChannelAdapter('slack')).toBe(tester);
+
+    // The delivery bridge dispatches by exact key: a default-instance
+    // message (instance === channelType after backfill) is dropped, not
+    // delivered through the sibling's identity.
+    const bridge = reg.createChannelDeliveryAdapter();
+    const result = await bridge.deliver(
+      'slack',
+      'slack:C1',
+      null,
+      'chat',
+      JSON.stringify({ text: 'to the default bot' }),
+      undefined,
+      'slack',
+    );
+    expect(result).toBeUndefined();
+    expect(tester.delivered).toHaveLength(0);
+
+    // Sanity: the same bridge DOES deliver when the exact instance is live.
+    await bridge.deliver(
+      'slack',
+      'slack:C1',
+      null,
+      'chat',
+      JSON.stringify({ text: 'to the tester bot' }),
+      undefined,
+      'slack-tester',
+    );
+    expect(tester.delivered).toHaveLength(1);
+  });
+});
+
 describe('channel + router integration', () => {
  beforeEach(async () => {
    if (fs.existsSync(TEST_DIR)) fs.rmSync(TEST_DIR, { recursive: true });
@@ -4,7 +4,8 @@
 * Channels self-register on import. The host calls initChannelAdapters() at startup
 * to instantiate and set up all registered adapters.
 */
-import type { ChannelAdapter, ChannelRegistration, ChannelSetup } from './adapter.js';
+import type { ChannelAdapter, ChannelRegistration, ChannelSetup, OutboundFile } from './adapter.js';
+import type { ChannelDeliveryAdapter } from '../delivery.js';
 import { log } from '../log.js';

 const SETUP_RETRY_DELAYS_MS = [2000, 5000, 10000];
@@ -26,9 +27,79 @@ export function registerChannelAdapter(name: string, registration: ChannelRegist
  registry.set(name, registration);
 }

-/** Get a live adapter by channel type. */
-export function getChannelAdapter(channelType: string): ChannelAdapter | undefined {
-  return activeAdapters.get(channelType);
+/** Get a live adapter by its EXACT registry key (instance name; default
+ *  instances are keyed by channelType itself). No channelType fallback —
+ *  callers that address a specific instance (outbound delivery, typing)
+ *  must never be rerouted through a sibling instance: that would send
+ *  through the wrong bot identity with the wrong token. A missing key
+ *  means the owning adapter is offline; callers apply their normal
+ *  offline-adapter handling. */
+export function getChannelAdapterExact(key: string): ChannelAdapter | undefined {
+  return activeAdapters.get(key);
+}
+
+/** Get a live adapter by instance name, falling back to any adapter of the
+ *  given channel type. The fallback exists ONLY for channelType-only callers
+ *  (user-id prefix resolution and cold DMs in user-dm.ts, approval delivery
+ *  in channel-approval.ts, the router's thread-policy probe when an event
+ *  carries no instance) — they must still resolve when every instance of a
+ *  platform is named. First registered wins (Map insertion order,
+ *  deterministic). Default instances are keyed by channelType itself, so
+ *  single-instance installs always hit the exact-key path. Instance-addressed
+ *  dispatch (delivery, typing) must use getChannelAdapterExact instead. */
+export function getChannelAdapter(key: string): ChannelAdapter | undefined {
+  const exact = activeAdapters.get(key);
+  if (exact) return exact;
+  for (const [registryKey, adapter] of activeAdapters) {
+    if (adapter.channelType === key) {
+      log.warn('Channel adapter fallback: requested key resolved through a differently-keyed instance', {
+        requested: key,
+        resolvedKey: registryKey,
+      });
+      return adapter;
+    }
+  }
+  return undefined;
+}
+
+/**
+ * Build the host's outbound delivery bridge: dispatches delivery-poll and
+ * typing traffic into the adapter registry. Resolution is EXACT-key only —
+ * `instance ?? channelType`. For default-instance messaging_groups rows the
+ * stored instance IS the channelType, which matches default-registered
+ * adapters, so single-instance behavior is unchanged. A named instance whose
+ * adapter is offline gets the normal offline-adapter handling (warn + drop
+ * into the delivery retry path) — never a cross-identity send through a
+ * sibling bot of the same platform.
+ */
+export function createChannelDeliveryAdapter(): ChannelDeliveryAdapter {
+  return {
+    async deliver(
+      channelType: string,
+      platformId: string,
+      threadId: string | null,
+      kind: string,
+      content: string,
+      files?: OutboundFile[],
+      instance?: string,
+    ): Promise<string | undefined> {
+      const adapter = getChannelAdapterExact(instance ?? channelType);
+      if (!adapter) {
+        log.warn('No adapter for channel type', { channelType, instance });
+        return;
+      }
+      return adapter.deliver(platformId, threadId, { kind, content: JSON.parse(content), files });
+    },
+    async setTyping(
+      channelType: string,
+      platformId: string,
+      threadId: string | null,
+      instance?: string,
+    ): Promise<void> {
+      const adapter = getChannelAdapterExact(instance ?? channelType);
+      await adapter?.setTyping?.(platformId, threadId);
+    },
+  };
 }

 /** Get all active adapters. */
@@ -85,8 +156,16 @@ export async function initChannelAdapters(setupFn: (adapter: ChannelAdapter) =>
          throw err;
        }
      }
-      activeAdapters.set(adapter.channelType, adapter);
-      log.info('Channel adapter started', { channel: name, type: adapter.channelType });
+      // Adapters are keyed by instance (default instance = channelType), so N
+      // instances of one platform can coexist. A duplicate key logs a warning
+      // instead of throwing: boot stays resilient and keeps the historical
+      // last-write-wins behavior, which is now visible rather than silent.
+      const key = adapter.instance ?? adapter.channelType;
+      if (activeAdapters.has(key)) {
+        log.warn('Duplicate adapter instance key — overwriting previous adapter', { key, channel: name });
+      }
+      activeAdapters.set(key, adapter);
+      log.info('Channel adapter started', { channel: name, type: adapter.channelType, instance: key });
    } catch (err) {
      log.error('Failed to start channel adapter', { channel: name, err });
    }
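
The two lookup modes in this file can be modelled in isolation. The following is a minimal sketch, not the registry itself: `MiniAdapter` and `makeRegistry` are hypothetical names introduced for illustration, and the warning log on fallback is omitted. It shows that the exact lookup never crosses instances, while the fallback lookup scans by channelType in Map insertion order.

```typescript
// Hypothetical stand-in for ChannelAdapter: only the fields the lookups read.
interface MiniAdapter {
  channelType: string;
  instance?: string;
}

// Hypothetical helper modelling the module-level activeAdapters map.
function makeRegistry() {
  const active = new Map<string, MiniAdapter>();
  return {
    // Key by instance; a default instance is keyed by its channelType.
    add(adapter: MiniAdapter): void {
      active.set(adapter.instance ?? adapter.channelType, adapter);
    },
    // Exact key only: a missing key means "this instance is offline".
    exact(key: string): MiniAdapter | undefined {
      return active.get(key);
    },
    // Exact key first, then the first-registered adapter of that channelType.
    withFallback(key: string): MiniAdapter | undefined {
      const hit = active.get(key);
      if (hit) return hit;
      return Array.from(active.values()).find((a) => a.channelType === key);
    },
  };
}

const registry = makeRegistry();
const tester: MiniAdapter = { channelType: 'slack', instance: 'slack-tester' };
const worker: MiniAdapter = { channelType: 'slack', instance: 'slack-worker' };
registry.add(tester);
registry.add(worker);

// No default 'slack' instance is live, so the exact lookup finds nothing,
// while the fallback resolves to the first-registered sibling.
const exactHit = registry.exact('slack');
const fallbackHit = registry.withFallback('slack');
```

In this model, delivery-style callers would use `exact` and treat `undefined` as an offline adapter, which is the behavior the "does NOT reroute" test above pins down.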
@@ -0,0 +1,112 @@
+/**
+ * Approval-card actor byline in the Chat SDK bridge.
+ *
+ * Drives the bridge's real onAction handler through the real Chat SDK
+ * dispatch (`chat.processAction`): `bridge.setup()` registers the handler on
+ * a real Chat instance, which the test captures from the webhook-server
+ * registration (mocked so no HTTP server binds a port). After a button click
+ * the bridge edits the card; the edit must append " — <actor>" so shared
+ * channels see who resolved an approval. Goes red if the byLine concatenation
+ * is removed from the edited markdown.
+ */
+import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
+
+import type { Adapter, Chat } from 'chat';
+
+const captured = vi.hoisted(() => ({ chat: null as unknown }));
+
+vi.mock('../webhook-server.js', () => ({
+  registerWebhookAdapter: vi.fn((chat: unknown) => {
+    captured.chat = chat;
+  }),
+}));
+
+import { closeDb, initTestDb, runMigrations } from '../db/index.js';
+import type { ChannelSetup } from './adapter.js';
+import { createChatSdkBridge } from './chat-sdk-bridge.js';
+
+interface CapturedEdit {
+  threadId: string;
+  messageId: string;
+  markdown: string;
+}
+
+function makeAdapter(edits: CapturedEdit[]): Adapter {
+  return {
+    name: 'stub',
+    initialize: async () => {},
+    channelIdFromThreadId: (threadId: string) => `stub:${threadId}`,
+    editMessage: async (threadId: string, messageId: string, content: { markdown: string }) => {
+      edits.push({ threadId, messageId, markdown: content.markdown });
+    },
+  } as unknown as Adapter;
+}
+
+async function fireAction(user: Record<string, unknown>): Promise<{ edits: CapturedEdit[]; actions: string[] }> {
+  const edits: CapturedEdit[] = [];
+  const actions: string[] = [];
+  const adapter = makeAdapter(edits);
+  const bridge = createChatSdkBridge({ adapter, supportsThreads: false });
+
+  await bridge.setup({
+    onInbound: async () => {},
+    onInboundEvent: async () => {},
+    onMetadata: () => {},
+    onAction: (questionId: string, selectedOption: string, userId: string) => {
+      actions.push(`${questionId}:${selectedOption}:${userId}`);
+    },
+  } as ChannelSetup);
+
+  const chat = captured.chat as Chat;
+  expect(chat).toBeTruthy();
+  await chat.processAction(
+    {
+      actionId: 'ncq:q-1:approve',
+      adapter,
+      messageId: 'msg-1',
+      raw: {},
+      threadId: 'T-1',
+      user: user as never,
+      value: 'approve',
+    },
+    undefined,
+  );
+  return { edits, actions };
+}
+
+beforeEach(() => {
+  captured.chat = null;
+  const db = initTestDb();
+  runMigrations(db);
+});
+
+afterEach(() => {
+  closeDb();
+});
+
+describe('chat-sdk-bridge approval-card byline', () => {
+  it('appends the acting user to the edited card markdown', async () => {
+    const { edits, actions } = await fireAction({ userId: 'U1', userName: 'gavriel', fullName: 'Gavriel C' });
+
+    expect(edits).toHaveLength(1);
+    expect(edits[0].threadId).toBe('T-1');
+    expect(edits[0].messageId).toBe('msg-1');
+    expect(edits[0].markdown).toContain('approve — gavriel');
+    expect(actions).toEqual(['q-1:approve:U1']);
+  });
+
+  it('falls back to fullName when userName is missing', async () => {
+    const { edits } = await fireAction({ userId: 'U2', fullName: 'Gavriel C' });
+
+    expect(edits).toHaveLength(1);
+    expect(edits[0].markdown).toContain('— Gavriel C');
+  });
+
+  it('omits the byline when the actor has no name', async () => {
+    const { edits } = await fireAction({ userId: 'U3' });
+
+    expect(edits).toHaveLength(1);
+    expect(edits[0].markdown).not.toContain('—');
+    expect(edits[0].markdown).toContain('approve');
+  });
+});
@@ -1,9 +1,13 @@
-import { describe, expect, it } from 'vitest';
+import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';

 import type { Adapter, AdapterPostableMessage, RawMessage } from 'chat';

 import { createChatSdkBridge, splitForLimit } from './chat-sdk-bridge.js';

+vi.mock('../webhook-server.js', () => ({
+  registerWebhookAdapter: vi.fn(),
+}));
+
 function stubAdapter(partial: Partial<Adapter>): Adapter {
  return { name: 'stub', ...partial } as unknown as Adapter;
 }
@@ -93,6 +97,147 @@ describe('createChatSdkBridge', () => {
  });
 });

+describe('createChatSdkBridge — instance identity', () => {
+  it('default: name === channelType === adapter.name, instance undefined', () => {
+    const bridge = createChatSdkBridge({
+      adapter: stubAdapter({ name: 'slack' }),
+      supportsThreads: true,
+    });
+    expect(bridge.name).toBe('slack');
+    expect(bridge.channelType).toBe('slack');
+    expect(bridge.instance).toBeUndefined();
+  });
+
+  it('named instance: name follows the instance, channelType stays the platform', () => {
+    const bridge = createChatSdkBridge({
+      adapter: stubAdapter({ name: 'slack' }),
+      instance: 'slack-tester',
+      supportsThreads: true,
+    });
+    expect(bridge.name).toBe('slack-tester');
+    expect(bridge.channelType).toBe('slack');
+    expect(bridge.instance).toBe('slack-tester');
+  });
+
+  it('rejects instance names that would break the webhook route or state delimiter', () => {
+    for (const bad of ['a/b', 'a:b', 'a?b', 'a b']) {
+      expect(() =>
+        createChatSdkBridge({ adapter: stubAdapter({ name: 'slack' }), instance: bad, supportsThreads: true }),
+      ).toThrow(/URL-safe/);
+    }
+  });
+
+  it('rejects empty and whitespace-only instance names (config bug — fail loud)', () => {
+    // '' is falsy: a truthiness guard would skip it, dead-ending the
+    // webhook route ('/webhook/' + '') and collapsing the state namespace
+    // into the default instance's unprefixed keyspace — the exact
+    // cross-bot dedupe/lock collisions the namespace exists to prevent.
+    for (const bad of ['', ' ', '   ', '\t']) {
+      expect(() =>
+        createChatSdkBridge({ adapter: stubAdapter({ name: 'slack' }), instance: bad, supportsThreads: true }),
+      ).toThrow(/URL-safe/);
+    }
+  });
+});
+
+describe('createChatSdkBridge.setup — webhook route and state namespace', () => {
+  // Real setup() over a stub adapter: Chat.initialize() needs a working
+  // StateAdapter (chat_sdk_* tables) and an adapter.initialize — nothing
+  // platform-side. registerWebhookAdapter is mocked at module level so we
+  // can assert the (chat, adapterName, routingPath) triple.
+  function setupStubAdapter(): Adapter {
+    return stubAdapter({
+      name: 'slack',
+      initialize: async () => {},
+    } as unknown as Partial<Adapter>);
+  }
+
+  beforeEach(async () => {
+    const { initTestDb } = await import('../db/connection.js');
+    const { runMigrations } = await import('../db/migrations/index.js');
+    runMigrations(initTestDb());
+    const { registerWebhookAdapter } = await import('../webhook-server.js');
+    vi.mocked(registerWebhookAdapter).mockClear();
+  });
+
+  afterEach(async () => {
+    const { closeDb } = await import('../db/connection.js');
+    closeDb();
+  });
+
+  const hostConfig = {
+    onInbound: () => {},
+    onInboundEvent: () => {},
+    onMetadata: () => {},
+    onAction: () => {},
+  };
+
+  it('named instance registers the webhook with adapterName as handler key and instance as route', async () => {
+    const { registerWebhookAdapter } = await import('../webhook-server.js');
+    const bridge = createChatSdkBridge({
+      adapter: setupStubAdapter(),
+      instance: 'slack-tester',
+      supportsThreads: true,
+    });
+    await bridge.setup(hostConfig);
+    expect(registerWebhookAdapter).toHaveBeenCalledTimes(1);
+    const [, adapterName, routingPath] = vi.mocked(registerWebhookAdapter).mock.calls[0];
+    expect(adapterName).toBe('slack');
+    expect(routingPath).toBe('slack-tester');
+    await bridge.teardown();
+  });
+
+  it('default instance registers the historical route', async () => {
+    const { registerWebhookAdapter } = await import('../webhook-server.js');
+    const bridge = createChatSdkBridge({ adapter: setupStubAdapter(), supportsThreads: true });
+    await bridge.setup(hostConfig);
+    const [, adapterName, routingPath] = vi.mocked(registerWebhookAdapter).mock.calls[0];
+    expect(adapterName).toBe('slack');
+    expect(routingPath ?? adapterName).toBe('slack');
+    await bridge.teardown();
+  });
+
+  it('named instance namespaces Chat SDK state; default stays unprefixed (live-install constraint)', async () => {
+    const { getDb } = await import('../db/connection.js');
+
+    const named = createChatSdkBridge({
+      adapter: setupStubAdapter(),
+      instance: 'slack-tester',
+      supportsThreads: true,
+    });
+    await named.setup(hostConfig);
+    await named.subscribe!('slack:C1', 'slack:T1');
+
+    const def = createChatSdkBridge({ adapter: setupStubAdapter(), supportsThreads: true });
+    await def.setup(hostConfig);
+    await def.subscribe!('slack:C1', 'slack:T1');
+
+    const rows = getDb().prepare('SELECT thread_id FROM chat_sdk_subscriptions ORDER BY thread_id').all() as Array<{
+      thread_id: string;
+    }>;
+    expect(rows.map((r) => r.thread_id)).toEqual(['slack-tester:slack:T1', 'slack:T1']);
+
+    await named.teardown();
+    await def.teardown();
+  });
+
+  it('explicitly naming the primary instance after the platform stays on the unprefixed keyspace', async () => {
+    const { getDb } = await import('../db/connection.js');
+    const bridge = createChatSdkBridge({
+      adapter: setupStubAdapter(),
+      instance: 'slack', // explicit, but equal to adapter.name ⇒ default keyspace
+      supportsThreads: true,
+    });
+    await bridge.setup(hostConfig);
+    await bridge.subscribe!('slack:C1', 'slack:T9');
+    const rows = getDb().prepare('SELECT thread_id FROM chat_sdk_subscriptions').all() as Array<{
+      thread_id: string;
+    }>;
+    expect(rows.map((r) => r.thread_id)).toEqual(['slack:T9']);
+    await bridge.teardown();
+  });
+});
+
 describe('createChatSdkBridge.deliver — display cards (send_card)', () => {
  // The send_card MCP tool writes outbound rows with `{ type: 'card', card, fallbackText }`.
  // Before this branch existed the bridge silently dropped them: cards have no
@@ -47,6 +47,15 @@ export type ReplyContextExtractor = (raw: Record<string, any>) => ReplyContext |

 export interface ChatSdkBridgeConfig {
  adapter: Adapter;
+  /**
+   * Adapter-instance name for running multiple bridges of one platform
+   * (e.g. several Slack apps in one workspace). Defaults to the platform
+   * name. Drives the registry key, the webhook route (/webhook/<instance>),
+   * and the Chat SDK state namespace. channelType is NOT affected — user
+   * identity, formatting, and container config stay keyed on the platform.
+   * Must be URL-safe: non-empty, only letters, digits, '.', '_' or '-'.
+   */
+  instance?: string;
  concurrency?: ConcurrencyStrategy;
  /** Bot token for authenticating forwarded Gateway events (required for interaction handling). */
  botToken?: string;
@@ -121,6 +130,19 @@ export function splitForLimit(text: string, limit: number): string[] {

 export function createChatSdkBridge(config: ChatSdkBridgeConfig): ChannelAdapter {
  const { adapter } = config;
+  // The instance name becomes a webhook route segment (the route regex is
+  // [^/?]+) and ':' is the state-namespace delimiter — reject anything that
+  // would break either, at construction time rather than at first webhook.
+  // Positive allow-list (not a deny-list): also rejects '' and
+  // whitespace-only names, which are config bugs — '' is falsy, so it
+  // would skip a truthiness guard, dead-end the webhook route, and
+  // collapse the state namespace into the default instance's keyspace.
+  if (config.instance !== undefined && !/^[A-Za-z0-9._-]+$/.test(config.instance)) {
+    throw new Error(
+      `chat-sdk bridge instance ${JSON.stringify(config.instance)} must be URL-safe: ` +
+        `non-empty, only letters, digits, '.', '_' or '-'`,
+    );
+  }
  const transformText = (t: string): string => (config.transformOutboundText ? config.transformOutboundText(t) : t);
  let chat: Chat;
  let state: SqliteStateAdapter;
@@ -193,14 +215,21 @@ export function createChatSdkBridge(config: ChatSdkBridgeConfig): ChannelAdapter
  }

  const bridge: ChannelAdapter = {
-    name: adapter.name,
-    channelType: adapter.name,
+    name: config.instance ?? adapter.name,
+    channelType: adapter.name, // unchanged — semantic platform key
+    instance: config.instance, // undefined ⇒ default instance
+
    supportsThreads: config.supportsThreads,

    async setup(hostConfig: ChannelSetup) {
      setupConfig = hostConfig;

-      state = new SqliteStateAdapter();
+      // State namespace: ONLY for a named non-default instance. A skill
+      // that explicitly names the primary instance after the platform
+      // (instance === adapter.name) still lands on the legacy UNPREFIXED
+      // keyspace — prefixing the default would orphan every live install's
+      // chat_sdk_subscriptions/kv/locks/lists rows.
+      state = new SqliteStateAdapter(config.instance && config.instance !== adapter.name ? config.instance : undefined);

      chat = new Chat({
        adapters: { [adapter.name]: adapter },
@@ -284,11 +313,13 @@ export function createChatSdkBridge(config: ChatSdkBridgeConfig): ChannelAdapter
        const matched = render?.options.find((o) => o.value === selectedOption);
        const selectedLabel = matched?.selectedLabel ?? selectedOption ?? '(clicked)';

-        // Update the card to show the selected answer and remove buttons
+        // Update the card to show the selected answer, who acted, and remove buttons
+        const actorName = event.user?.userName || event.user?.fullName || '';
+        const byLine = actorName ? ` — ${actorName}` : '';
        try {
          const tid = event.threadId;
          await adapter.editMessage(tid, event.messageId, {
-            markdown: `${title}\n\n${selectedLabel}`,
+            markdown: `${title}\n\n${selectedLabel}${byLine}`,
          });
        } catch (err) {
          log.warn('Failed to update card after action', { err });
@@ -358,8 +389,12 @@ export function createChatSdkBridge(config: ChatSdkBridgeConfig): ChannelAdapter
        startGateway();
        log.info('Gateway listener started', { adapter: adapter.name });
      } else {
-        // Non-gateway adapters (Slack, Teams, GitHub, etc.) — register on the shared webhook server
-        registerWebhookAdapter(chat, adapter.name);
+        // Non-gateway adapters (Slack, Teams, GitHub, etc.) — register on the
+        // shared webhook server. The handler key stays adapter.name (the
+        // Chat instance's webhooks map is keyed by it); the route segment is
+        // the instance, so each same-platform bridge gets its own URL (and
+        // its own signing secret — platforms sign per-app).
+        registerWebhookAdapter(chat, adapter.name, config.instance ?? adapter.name);
      }

      log.info('Chat SDK bridge initialized', { adapter: adapter.name });
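
How the instance name feeds the registry key, the webhook route, and the state namespace can be summarized in a small sketch. This is an illustration under assumptions, not the bridge code: `assertValidInstance` and `deriveKeys` are hypothetical helpers, and only the allow-list pattern and the `/webhook/<instance>` route shape are taken from the diff.

```typescript
// Same allow-list as the bridge: non-empty, letters, digits, '.', '_' or '-'.
const INSTANCE_PATTERN = /^[A-Za-z0-9._-]+$/;

// Hypothetical helper: throws for names that would break the route or namespace.
function assertValidInstance(instance: string | undefined): void {
  if (instance !== undefined && !INSTANCE_PATTERN.test(instance)) {
    throw new Error(`instance ${JSON.stringify(instance)} must be URL-safe`);
  }
}

// Hypothetical helper: the values a bridge derives from (platform, instance).
function deriveKeys(platform: string, instance?: string) {
  assertValidInstance(instance);
  return {
    registryKey: instance ?? platform,
    webhookRoute: `/webhook/${instance ?? platform}`,
    // Only a named, non-default instance gets a prefix; the default keeps
    // the legacy unprefixed keyspace so existing rows stay reachable.
    stateNamespace: instance && instance !== platform ? instance : undefined,
  };
}

// Hypothetical helper: reports whether a name is rejected.
function rejects(instance: string): boolean {
  try {
    assertValidInstance(instance);
    return false;
  } catch {
    return true;
  }
}

const named = deriveKeys('slack', 'slack-tester');
const explicitDefault = deriveKeys('slack', 'slack');
const implicitDefault = deriveKeys('slack');
```

Using a positive allow-list rather than a deny-list means the empty string and whitespace-only names fail the same check as `'/'`, `'?'` and `':'`, with no separate truthiness guard.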
@@ -90,8 +90,8 @@ describe('groups CLI delete cascades dependent rows (#2525)', () => {
      now(),
    );
    db.prepare(
-      `INSERT INTO messaging_groups (id, channel_type, platform_id, name, is_group, unknown_sender_policy, created_at)
-       VALUES (?, 'telegram', 'tg-1', 'chat', 1, 'strict', ?)`,
+      `INSERT INTO messaging_groups (id, channel_type, platform_id, instance, name, is_group, unknown_sender_policy, created_at)
+       VALUES (?, 'telegram', 'tg-1', 'telegram', 'chat', 1, 'strict', ?)`,
    ).run(MGID, now());

    db.prepare(
@@ -26,6 +26,12 @@ vi.mock('./db/sessions.js', () => ({
 const mockWriteSessionMessage = vi.fn();
 vi.mock('./session-manager.js', () => ({
  writeSessionMessage: (...args: unknown[]) => mockWriteSessionMessage(...args),
+  openInboundDb: () => ({}),
+}));
+
+const mockCountDueMessages = vi.fn((..._args: unknown[]) => 0);
+vi.mock('./db/session-db.js', () => ({
+  countDueMessages: (...args: unknown[]) => mockCountDueMessages(...args),
 }));

 import { restartAgentGroupContainers } from './container-restart.js';
@@ -148,4 +154,21 @@ describe('restartAgentGroupContainers', () => {
    expect(mockWriteSessionMessage.mock.calls[0][1]).toBe('s1');
    expect(mockWriteSessionMessage.mock.calls[1][1]).toBe('s2');
  });
+
+  it('wakes even without a wake message when in-flight messages are pending', () => {
+    // A provider switch mid-conversation kills a container holding claimed
+    // messages — without an immediate respawn those messages stay dark until
+    // the next inbound or a slow sweep backoff.
+    mockGetSessionsByAgentGroup.mockReturnValue([makeSession('s1', 'ag1')]);
+    mockIsContainerRunning.mockReturnValue(true);
+    mockCountDueMessages.mockReturnValue(2);
+
+    restartAgentGroupContainers('ag1', 'provider switch');
+
+    const onExit = mockKillContainer.mock.calls[0][2] as () => void;
+    expect(typeof onExit).toBe('function');
+    mockGetSession.mockReturnValue(makeSession('s1', 'ag1'));
+    onExit();
+    expect(mockWakeContainer).toHaveBeenCalled();
+  });
 });
@@ -5,9 +5,10 @@
 * wakes a fresh container via the onExit callback — race-free.
 */
 import { isContainerRunning, killContainer, wakeContainer } from './container-runner.js';
+import { countDueMessages } from './db/session-db.js';
 import { getSession, getSessionsByAgentGroup } from './db/sessions.js';
 import { log } from './log.js';
-import { writeSessionMessage } from './session-manager.js';
+import { openInboundDb, writeSessionMessage } from './session-manager.js';

 /**
 * Kill all running containers for an agent group and respawn them.
@@ -40,10 +41,15 @@ export function restartAgentGroupContainers(agentGroupId: string, reason: string
        onWake: 1,
      });
    }
+    // Always respawn after the kill when there is anything to process: an
+    // explicit wake message, or in-flight messages the dying container had
+    // claimed. Without this, a provider switch mid-conversation leaves the
+    // claimed messages dark until the next inbound or a slow sweep backoff.
+    const hasPending = countDueMessages(openInboundDb(session.agent_group_id, session.id)) > 0;
    killContainer(
      session.id,
      reason,
-      wakeMessage
+      wakeMessage || hasPending
        ? () => {
            const s = getSession(session.id);
            if (s) wakeContainer(s);
@@ -1,3 +1,5 @@
+import fs from 'fs';
+import path from 'path';
 import { describe, expect, it } from 'vitest';

 import { resolveProviderName } from './container-runner.js';
@@ -25,3 +27,36 @@ describe('resolveProviderName', () => {
    expect(resolveProviderName(null, '')).toBe('claude');
  });
 });
+
+describe('buildContainerArgs ordering invariant (structural)', () => {
+  // The OneCLI gateway apply (SDK applyContainerConfig) appends credential-stub
+  // mounts — e.g. the codex auth.json sentinel nested INSIDE our RW
+  // /home/node/.codex mount. Docker applies binds in argument order, so the
+  // stub must land AFTER its parent mount or the parent shadows it and the
+  // agent silently degrades to loginless auth. Driving the real
+  // buildContainerArgs needs a live gateway + container runtime, so this
+  // guards the invariant structurally: the gateway apply must appear after
+  // the volume-mounts loop in the source.
+  it('applies the OneCLI gateway after the volume mounts', () => {
+    const src = fs.readFileSync(path.join(process.cwd(), 'src', 'container-runner.ts'), 'utf-8');
+    const mountsLoop = src.indexOf('for (const mount of mounts)');
+    const gatewayApply = src.indexOf('onecli.applyContainerConfig');
+    expect(mountsLoop).toBeGreaterThan(-1);
+    expect(gatewayApply).toBeGreaterThan(-1);
+    expect(gatewayApply).toBeGreaterThan(mountsLoop);
+  });
+});
+
+describe('container boot-failure tripwire (structural)', () => {
+  // A container that dies at boot (unknown provider, missing CLI binary, bad
+  // config) explains itself only on stderr — which logs at debug, below the
+  // default level. The spawn handler must keep a stderr tail and surface it
+  // at warn on a non-zero exit, or the operator sees only "exited code 1" on
+  // repeat. Driving a real failing spawn needs a container runtime, so this
+  // guards the wiring structurally, matching the invariant test above.
+  it('surfaces the stderr tail when the container exits non-zero', () => {
+    const src = fs.readFileSync(path.join(process.cwd(), 'src', 'container-runner.ts'), 'utf-8');
+    expect(src).toContain('stderrTail.push(line)');
+    expect(src).toMatch(/Container exited non-zero.*stderrTail/s);
+  });
+});
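
Review note: the tripwire test above only checks that the wiring exists in the source. The bounded-tail pattern it guards can be sketched in isolation as follows. `pushTail` is a hypothetical helper introduced here for illustration; the diff inlines this logic in the `stderr` data handler with a cap of 10 lines.

```typescript
// Hypothetical helper, not part of the diff: keeps only the last `max`
// non-empty lines seen so far.
function pushTail(tail: string[], chunk: string, max = 10): string[] {
  for (const line of chunk.trim().split('\n')) {
    if (!line) continue; // skip blank lines, as the handler does
    tail.push(line);
    if (tail.length > max) tail.shift(); // drop the oldest line
  }
  return tail;
}

const tail: string[] = [];
pushTail(tail, 'a\nb\n\nc\n', 2);
console.log(tail); // the two most recent non-empty lines
```

Capping the buffer keeps memory flat for chatty containers while still leaving enough context to log at warn on a non-zero exit.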
@@ -21,8 +21,9 @@ import {
 } from './config.js';
 import { materializeContainerJson } from './container-config.js';
 import { getContainerConfig } from './db/container-configs.js';
-import { updateContainerConfigScalars, updateContainerConfigJson } from './db/container-configs.js';
+import { updateContainerConfigScalars } from './db/container-configs.js';
 import { CONTAINER_RUNTIME_BIN, hostGatewayArgs, readonlyMountArgs, stopContainer } from './container-runtime.js';
+import { EGRESS_NETWORK, egressNetworkArgs, ensureEgressNetwork } from './egress-lockdown.js';
 import { composeGroupClaudeMd } from './claude-md-compose.js';
 import { getAgentGroup } from './db/agent-groups.js';
 import { getDb, hasTable } from './db/connection.js';
@@ -35,6 +36,7 @@ import { validateAdditionalMounts } from './modules/mount-security/index.js';
 import './providers/index.js';
 import {
  getProviderContainerConfig,
+  providerProvidesAgentSurfaces,
  type ProviderContainerContribution,
  type VolumeMount,
 } from './providers/provider-container-registry.js';
@@ -126,12 +128,19 @@ async function spawnContainer(session: Session): Promise<void> {
  // and buildContainerArgs so we don't re-read.
  const containerConfig = materializeContainerJson(agentGroup.id);

+  // Per-group filesystem state lives forever after first creation. Init is
+  // idempotent: it only writes paths that don't already exist, so this call
+  // is a no-op for groups that have spawned before. Runs before the provider
+  // contribution so a surfaces-providing provider finds the group dir ready.
+  const providerName = resolveProviderName(session.agent_provider, containerConfig.provider);
+  initGroupFilesystem(agentGroup, { provider: providerName });
+
  // Resolve the effective provider + any host-side contribution it declares
  // (extra mounts, env passthrough). Computed once and threaded through both
  // buildMounts and buildContainerArgs so side effects (mkdir, etc.) fire once.
  const { provider, contribution } = resolveProviderContribution(session, agentGroup, containerConfig);

-  const mounts = buildMounts(agentGroup, session, containerConfig, contribution);
+  const mounts = buildMounts(agentGroup, session, containerConfig, provider, contribution);
  const containerName = `nanoclaw-v2-${agentGroup.folder}-${Date.now()}`;
  // OneCLI agent identifier is always the agent group id — stable across
  // sessions and reversible via getAgentGroup() for approval routing.
@@ -159,10 +168,16 @@ async function spawnContainer(session: Session): Promise<void> {
  activeContainers.set(session.id, { process: container, containerName });
  markContainerRunning(session.id);

-  // Log stderr
+  // Log stderr. A container that dies at boot (unknown provider, missing
+  // binary, bad config) explains itself only here — and debug is below the
+  // default log level — so keep a tail to surface on a non-zero exit.
+  const stderrTail: string[] = [];
  container.stderr?.on('data', (data) => {
    for (const line of data.toString().trim().split('\n')) {
-      if (line) log.debug(line, { container: agentGroup.folder });
+      if (!line) continue;
+      log.debug(line, { container: agentGroup.folder });
+      stderrTail.push(line);
+      if (stderrTail.length > 10) stderrTail.shift();
    }
  });

@@ -178,7 +193,12 @@ async function spawnContainer(session: Session): Promise<void> {
    activeContainers.delete(session.id);
    markContainerStopped(session.id);
    stopTypingRefresh(session.id);
-    log.info('Container exited', { sessionId: session.id, code, containerName });
+    // code null = killed by signal (normal shutdown path), not a boot failure.
+    if (code !== 0 && code !== null && stderrTail.length > 0) {
+      log.warn('Container exited non-zero', { sessionId: session.id, code, containerName, stderrTail });
+    } else {
+      log.info('Container exited', { sessionId: session.id, code, containerName });
+    }
  });

  container.on('error', (err) => {
@@ -233,32 +253,37 @@ function resolveProviderContribution(
    ? fn({
        sessionDir: sessionDir(agentGroup.id, session.id),
        agentGroupId: agentGroup.id,
+        groupDir: path.resolve(GROUPS_DIR, agentGroup.folder),
+        selectedSkills: selectedSkillNames(containerConfig),
        hostEnv: process.env,
      })
    : {};
  return { provider, contribution };
 }

-function buildMounts(
+export function buildMounts(
  agentGroup: AgentGroup,
  session: Session,
  containerConfig: import('./container-config.js').ContainerConfig,
+  provider: string,
  providerContribution: ProviderContainerContribution,
 ): VolumeMount[] {
  const projectRoot = process.cwd();

-  // Per-group filesystem state lives forever after first creation. Init is
-  // idempotent: it only writes paths that don't already exist, so this call
-  // is a no-op for groups that have spawned before.
-  initGroupFilesystem(agentGroup);
+  // Default agent surfaces (composed project doc, skill links, provider state
+  // dir) apply unless the provider's registration declares it provides its
+  // own — a capability, never a provider name. See provider-container-registry.
+  const defaultSurfaces = !providerProvidesAgentSurfaces(provider);

-  // Sync skill symlinks based on container.json selection before mounting.
  const claudeDir = path.join(DATA_DIR, 'v2-sessions', agentGroup.id, '.claude-shared');
-  syncSkillSymlinks(claudeDir, containerConfig);
+  if (defaultSurfaces) {
+    // Sync skill symlinks based on container.json selection before mounting.
+    syncSkillSymlinks(claudeDir, containerConfig);

-  // Compose CLAUDE.md fresh every spawn from the shared base, enabled skill
-  // fragments, and MCP server instructions. See `claude-md-compose.ts`.
-  composeGroupClaudeMd(agentGroup);
+    // Compose CLAUDE.md fresh every spawn from the shared base, enabled skill
+    // fragments, and MCP server instructions. See `claude-md-compose.ts`.
+    composeGroupClaudeMd(agentGroup);
+  }

  const mounts: VolumeMount[] = [];
  const sessDir = sessionDir(agentGroup.id, session.id);
@@ -285,11 +310,11 @@ function buildMounts(
  // already RO-mounted, so writes through it fail regardless — no need for
  // a nested mount there.
  const composedClaudeMd = path.join(groupDir, 'CLAUDE.md');
-  if (fs.existsSync(composedClaudeMd)) {
+  if (defaultSurfaces && fs.existsSync(composedClaudeMd)) {
    mounts.push({ hostPath: composedClaudeMd, containerPath: '/workspace/agent/CLAUDE.md', readonly: true });
  }
  const fragmentsDir = path.join(groupDir, '.claude-fragments');
-  if (fs.existsSync(fragmentsDir)) {
+  if (defaultSurfaces && fs.existsSync(fragmentsDir)) {
    mounts.push({ hostPath: fragmentsDir, containerPath: '/workspace/agent/.claude-fragments', readonly: true });
  }

@@ -302,13 +327,15 @@ function buildMounts(
  // Shared CLAUDE.md — read-only, imported by the composed entry point via
  // the `.claude-shared.md` symlink inside the group dir.
  const sharedClaudeMd = path.join(process.cwd(), 'container', 'CLAUDE.md');
-  if (fs.existsSync(sharedClaudeMd)) {
+  if (defaultSurfaces && fs.existsSync(sharedClaudeMd)) {
    mounts.push({ hostPath: sharedClaudeMd, containerPath: '/app/CLAUDE.md', readonly: true });
  }

  // Per-group .claude-shared at /home/node/.claude (Claude state, settings,
  // skill symlinks)
-  mounts.push({ hostPath: claudeDir, containerPath: '/home/node/.claude', readonly: false });
+  if (defaultSurfaces) {
+    mounts.push({ hostPath: claudeDir, containerPath: '/home/node/.claude', readonly: false });
+  }

  // Shared agent-runner source — read-only, same code for all groups.
  const agentRunnerSrc = path.join(projectRoot, 'container', 'agent-runner', 'src');
@@ -345,25 +372,7 @@ function syncSkillSymlinks(claudeDir: string, containerConfig: import('./contain
    fs.mkdirSync(skillsDir, { recursive: true });
  }

-  // Determine desired skill set
-  const projectRoot = process.cwd();
-  const sharedSkillsDir = path.join(projectRoot, 'container', 'skills');
-  let desired: string[];
-  if (containerConfig.skills === 'all') {
-    // Recompute from shared dir — newly-added upstream skills appear automatically
-    desired = fs.existsSync(sharedSkillsDir)
-      ? fs.readdirSync(sharedSkillsDir).filter((e) => {
-          try {
-            return fs.statSync(path.join(sharedSkillsDir, e)).isDirectory();
-          } catch {
-            return false;
-          }
-        })
-      : [];
-  } else {
-    desired = containerConfig.skills;
-  }
-
+  const desired = selectedSkillNames(containerConfig);
  const desiredSet = new Set(desired);

  // Remove symlinks not in the desired set
@@ -396,12 +405,30 @@ function syncSkillSymlinks(claudeDir: string, containerConfig: import('./contain
  }
 }

+/**
+ * Resolve the group's skill selection to concrete names — `'all'` recomputes
+ * from `container/skills/` so newly-added upstream skills appear automatically.
+ */
+function selectedSkillNames(containerConfig: import('./container-config.js').ContainerConfig): string[] {
+  if (containerConfig.skills !== 'all') return containerConfig.skills;
+  const sharedSkillsDir = path.join(process.cwd(), 'container', 'skills');
+  return fs.existsSync(sharedSkillsDir)
+    ? fs.readdirSync(sharedSkillsDir).filter((e) => {
+        try {
+          return fs.statSync(path.join(sharedSkillsDir, e)).isDirectory();
+        } catch {
+          return false;
+        }
+      })
+    : [];
+}
+
 async function buildContainerArgs(
  mounts: VolumeMount[],
  containerName: string,
  agentGroup: AgentGroup,
  containerConfig: import('./container-config.js').ContainerConfig,
-  provider: string,
+  _provider: string,
  providerContribution: ProviderContainerContribution,
  agentIdentifier?: string,
 ): Promise<string[]> {
@@ -418,22 +445,14 @@ async function buildContainerArgs(
    }
  }

-  // OneCLI gateway — injects HTTPS_PROXY + certs so container API calls
-  // are routed through the agent vault for credential injection. Treated as
-  // a transient hard failure: if we can't wire the gateway, we don't spawn.
-  // The caller (router or host-sweep) catches the throw, leaves the inbound
-  // message pending, and the next sweep tick retries.
-  if (agentIdentifier) {
-    await onecli.ensureAgent({ name: agentGroup.name, identifier: agentIdentifier });
+  // Egress lockdown when enabled — throws if it can't be established, aborting
+  // the spawn rather than running with open egress. Otherwise use the host gateway.
+  if (ensureEgressNetwork()) {
+    args.push(...egressNetworkArgs());
+    log.info('Egress lockdown active', { containerName, network: EGRESS_NETWORK });
+  } else {
+    args.push(...hostGatewayArgs());
  }
-  const onecliApplied = await onecli.applyContainerConfig(args, { addHostMapping: false, agent: agentIdentifier });
-  if (!onecliApplied) {
-    throw new Error('OneCLI gateway not applied — refusing to spawn container without credentials');
-  }
-  log.info('OneCLI gateway applied', { containerName });
-
-  // Host gateway
-  args.push(...hostGatewayArgs());

  // User mapping
  const hostUid = process.getuid?.();
@@ -452,6 +471,24 @@ async function buildContainerArgs(
    }
  }

+  // OneCLI gateway — injects HTTPS_PROXY + certs so container API calls
+  // are routed through the agent vault for credential injection, and mounts
+  // any credential stubs the gateway serves (e.g. a sentinel auth file).
+  // Runs AFTER the volume mounts so a stub nested inside one of our mounts
+  // (a parent dir mounted RW above it) lands later in the args and isn't
+  // shadowed by it. Treated as a transient hard failure: if we can't wire
+  // the gateway, we don't spawn. The caller (router or host-sweep) catches
+  // the throw, leaves the inbound message pending, and the next sweep tick
+  // retries.
+  if (agentIdentifier) {
+    await onecli.ensureAgent({ name: agentGroup.name, identifier: agentIdentifier });
+  }
+  const onecliApplied = await onecli.applyContainerConfig(args, { addHostMapping: false, agent: agentIdentifier });
+  if (!onecliApplied) {
+    throw new Error('OneCLI gateway not applied — refusing to spawn container without credentials');
+  }
+  log.info('OneCLI gateway applied', { containerName });
+
  // Override entrypoint: run v2 entry point directly via Bun (no tsc, no stdin).
  args.push('--entrypoint', 'bash');
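
Review note: the reordering above rests on the premise stated in the comments, that binds are applied in argument order, so a parent mount placed after a nested stub would shadow it. A minimal sketch of a check for that condition, assuming the container paths have already been extracted from the args in order. `shadowedMounts` is a hypothetical name and is not part of this change.

```typescript
// Hypothetical check, not part of the diff. Input: container paths in the
// order their bind mounts appear in the container runtime args.
function shadowedMounts(containerPaths: string[]): Array<{ child: string; parent: string }> {
  const problems: Array<{ child: string; parent: string }> = [];
  containerPaths.forEach((child, i) => {
    for (const parent of containerPaths.slice(i + 1)) {
      // A later mount whose path is an ancestor of an earlier one hides it.
      const prefix = parent.endsWith('/') ? parent : parent + '/';
      if (child.startsWith(prefix)) problems.push({ child, parent });
    }
  });
  return problems;
}

// Stub before its parent: reported. Parent before stub: nothing reported.
console.log(shadowedMounts(['/home/node/.codex/auth.json', '/home/node/.codex']));
console.log(shadowedMounts(['/home/node/.codex', '/home/node/.codex/auth.json']));
```

A check like this could assert on the built args directly instead of on source text, at the cost of needing a way to build args without a live gateway.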

@@ -0,0 +1,255 @@
+/**
+ * Channel-instance dimension tests (migration 016 + messaging-groups queries).
+ *
+ * Covers the three load-bearing rules:
+ *   1. Backfill/default — instance = channel_type everywhere it isn't set,
+ *      so single-instance installs behave byte-identically.
+ *   2. UNIQUE(channel_type, platform_id, instance) — siblings coexist,
+ *      single-bot pair-uniqueness is preserved via the default value.
+ *   3. Lookup asymmetry — inbound (getMessagingGroupWithAgentCount) is
+ *      exact-on-instance with NO fallback (unknown named instance ⇒ null ⇒
+ *      router auto-creates instead of hijacking a sibling's row); outbound
+ *      (getMessagingGroupByPlatform) is default-instance-first.
+ *
+ * The wired-DB arm reproduces the failure mode that bit migration 011: a
+ * table recreate on a live DB with FK children. It must pass with
+ * disableForeignKeys: true and fail without it.
+ */
+import { describe, it, expect, beforeEach, afterEach } from 'vitest';
+
+import { initTestDb, closeDb, getDb } from './connection.js';
+import { runMigrations, migrations, type Migration } from './migrations/index.js';
+import {
+  createMessagingGroup,
+  getMessagingGroupByPlatform,
+  getMessagingGroupWithAgentCount,
+} from './messaging-groups.js';
+import type { MessagingGroup } from '../types.js';
+
+function now(): string {
+  return new Date().toISOString();
+}
+
+function mg(overrides: Partial<MessagingGroup> & { id: string }): MessagingGroup {
+  return {
+    channel_type: 'slack',
+    platform_id: 'slack:C1',
+    name: null,
+    is_group: 1,
+    unknown_sender_policy: 'public',
+    created_at: now(),
+    ...overrides,
+  };
+}
+
+afterEach(() => {
+  closeDb();
+});
+
+describe('migration 016 — fresh DB', () => {
+  beforeEach(() => {
+    const db = initTestDb();
+    runMigrations(db);
+  });
+
+  it('adds a NOT NULL instance column', () => {
+    const cols = getDb().prepare("PRAGMA table_info('messaging_groups')").all() as Array<{
+      name: string;
+      notnull: number;
+    }>;
+    const instance = cols.find((c) => c.name === 'instance');
+    expect(instance).toBeDefined();
+    expect(instance!.notnull).toBe(1);
+  });
+
+  it('createMessagingGroup without instance stamps instance = channel_type', () => {
+    createMessagingGroup(mg({ id: 'mg-default' }));
+    const row = getDb().prepare("SELECT instance FROM messaging_groups WHERE id = 'mg-default'").get() as {
+      instance: string;
+    };
+    expect(row.instance).toBe('slack');
+  });
+
+  it('allows sibling instances on the same (channel_type, platform_id)', () => {
+    createMessagingGroup(mg({ id: 'mg-default' }));
+    createMessagingGroup(mg({ id: 'mg-tester', instance: 'slack-tester' }));
+    const count = getDb().prepare('SELECT COUNT(*) AS c FROM messaging_groups').get() as { c: number };
+    expect(count.c).toBe(2);
+  });
+
+  it('rejects a duplicate (channel_type, platform_id, instance) triple', () => {
+    createMessagingGroup(mg({ id: 'mg-a', instance: 'slack-tester' }));
+    expect(() => createMessagingGroup(mg({ id: 'mg-b', instance: 'slack-tester' }))).toThrow();
+  });
+
+  it('rejects a duplicate default pair (single-bot uniqueness preserved)', () => {
+    createMessagingGroup(mg({ id: 'mg-a' }));
+    expect(() => createMessagingGroup(mg({ id: 'mg-b' }))).toThrow();
+  });
+});
+
+describe('migration 016 — wired legacy DB upgrade (the FK recreate arm)', () => {
+  it('recreates messaging_groups under FK children without violations and backfills instance', () => {
+    const db = initTestDb();
+    // Bring the DB to the pre-016 schema.
+    runMigrations(
+      db,
+      migrations.filter((m) => m.name !== 'messaging-group-instance'),
+    );
+    const preCols = db.prepare("PRAGMA table_info('messaging_groups')").all() as Array<{ name: string }>;
+    expect(preCols.some((c) => c.name === 'instance')).toBe(false);
+
+    // Seed a wired install: messaging_groups with live FK children
+    // (messaging_group_agents + sessions reference messaging_groups.id).
+    // Raw SQL — the new createMessagingGroup expects the instance column.
+    db.prepare("INSERT INTO agent_groups (id, name, folder, created_at) VALUES ('ag-1', 'A', 'a', ?)").run(now());
+    db.prepare(
+      `INSERT INTO messaging_groups (id, channel_type, platform_id, name, is_group, unknown_sender_policy, created_at)
+       VALUES ('mg-1', 'telegram', 'telegram:123', 'Chat', 0, 'public', ?)`,
+    ).run(now());
+    db.prepare(
+      `INSERT INTO messaging_group_agents (id, messaging_group_id, agent_group_id, engage_mode, sender_scope, ignored_message_policy, created_at)
+       VALUES ('mga-1', 'mg-1', 'ag-1', 'pattern', 'all', 'drop', ?)`,
+    ).run(now());
+    db.prepare(
+      `INSERT INTO sessions (id, agent_group_id, messaging_group_id, created_at)
+       VALUES ('sess-1', 'ag-1', 'mg-1', ?)`,
+    ).run(now());
+
+    // Upgrade: only 016 is pending now. Without disableForeignKeys this
+    // throws 'FOREIGN KEY constraint failed' at DROP TABLE.
+    expect(() => runMigrations(db)).not.toThrow();
+
+    // Backfill: existing row got instance = channel_type.
+    const row = db.prepare("SELECT instance FROM messaging_groups WHERE id = 'mg-1'").get() as { instance: string };
+    expect(row.instance).toBe('telegram');
+
+    // Children intact and pointing at the recreated parent.
+    expect(
+      db.prepare("SELECT COUNT(*) AS c FROM messaging_group_agents WHERE messaging_group_id = 'mg-1'").get(),
+    ).toEqual({ c: 1 });
+    expect(db.prepare("SELECT COUNT(*) AS c FROM sessions WHERE messaging_group_id = 'mg-1'").get()).toEqual({ c: 1 });
+
+    // Full-DB FK integrity (FK enforcement was restored by the runner).
+    expect(db.pragma('foreign_key_check')).toEqual([]);
+    expect(db.pragma('foreign_keys', { simple: true })).toBe(1);
+  });
+
+  it('tolerates pre-existing FK orphans: the migration still applies (no boot crash-loop)', () => {
+    const db = initTestDb();
+    runMigrations(
+      db,
+      migrations.filter((m) => m.name !== 'messaging-group-instance'),
+    );
+
+    // Seed the orphan class that demonstrably exists on live installs
+    // (ensureUserDm tolerates it at runtime): a user_dms row whose
+    // messaging_group was deleted through a FK-OFF connection — the
+    // sqlite3 CLI ships with foreign_keys OFF, and operators are told to
+    // poke v2.db when troubleshooting.
+    db.prepare("INSERT INTO users (id, kind, created_at) VALUES ('slack:U1', 'slack', ?)").run(now());
+    db.pragma('foreign_keys = OFF');
+    db.prepare(
+      `INSERT INTO user_dms (user_id, channel_type, messaging_group_id, resolved_at)
+       VALUES ('slack:U1', 'slack', 'mg-deleted-via-cli', ?)`,
+    ).run(now());
+    db.pragma('foreign_keys = ON');
+    expect(db.pragma('foreign_key_check')).toHaveLength(1);
+
+    // 016 did not create this violation — it must still apply (the runner
+    // diffs post-up violations against a pre-up snapshot and only throws
+    // on NEW ones; pre-existing ones are warned about and carried through).
+    expect(() => runMigrations(db)).not.toThrow();
+    const cols = db.prepare("PRAGMA table_info('messaging_groups')").all() as Array<{ name: string }>;
+    expect(cols.some((c) => c.name === 'instance')).toBe(true);
+
+    // The orphan is untouched: still present, still the only violation.
+    expect(db.pragma('foreign_key_check')).toHaveLength(1);
+  });
+
+  it('still rejects a migration that ITSELF introduces FK violations', () => {
+    const db = initTestDb();
+    runMigrations(db);
+
+    const rogue: Migration = {
+      version: 999,
+      name: 'test-rogue-fk-violation',
+      disableForeignKeys: true,
+      up: (d) => {
+        d.prepare("INSERT INTO users (id, kind, created_at) VALUES ('slack:U-rogue', 'slack', datetime('now'))").run();
+        d.prepare(
+          `INSERT INTO user_dms (user_id, channel_type, messaging_group_id, resolved_at)
+           VALUES ('slack:U-rogue', 'slack', 'mg-never-existed', datetime('now'))`,
+        ).run();
+      },
+    };
+
+    expect(() => runMigrations(db, [...migrations, rogue])).toThrow(/left FK violations/);
+
+    // Rolled back atomically: not recorded as applied, nothing committed.
+    expect(db.prepare("SELECT 1 FROM schema_version WHERE name = 'test-rogue-fk-violation'").get()).toBeUndefined();
+    expect(db.pragma('foreign_key_check')).toEqual([]);
+  });
+
+  it('is idempotent — re-running the full barrel is a no-op', () => {
+    const db = initTestDb();
+    runMigrations(db);
+    createMessagingGroup(mg({ id: 'mg-keep', instance: 'slack-tester' }));
+    expect(() => runMigrations(db)).not.toThrow();
+    const row = db.prepare("SELECT instance FROM messaging_groups WHERE id = 'mg-keep'").get() as {
+      instance: string;
+    };
+    expect(row.instance).toBe('slack-tester');
+  });
+});
+
+describe('lookup asymmetry — inbound exact-only vs outbound default-first', () => {
+  beforeEach(() => {
+    const db = initTestDb();
+    runMigrations(db);
+    // The named instance ('alpha-tester') sorts lexically BEFORE the
+    // channel type ('slack') and is inserted first — so both rowid order
+    // and the triple-autoindex order put it ahead of the default row.
+    // A query missing the `(instance = channel_type) DESC` ORDER BY would
+    // return it; only the deterministic default-first ordering picks
+    // mg-default.
+    createMessagingGroup(mg({ id: 'mg-tester', instance: 'alpha-tester' }));
+    createMessagingGroup(mg({ id: 'mg-default' }));
+  });
+
+  it('getMessagingGroupWithAgentCount without instance resolves the default-instance row', () => {
+    const found = getMessagingGroupWithAgentCount('slack', 'slack:C1');
+    expect(found).not.toBeNull();
+    expect(found!.mg.id).toBe('mg-default');
+  });
+
+  it('getMessagingGroupWithAgentCount with a named instance resolves exactly that row', () => {
+    const found = getMessagingGroupWithAgentCount('slack', 'slack:C1', 'alpha-tester');
+    expect(found).not.toBeNull();
+    expect(found!.mg.id).toBe('mg-tester');
+  });
+
+  it('getMessagingGroupWithAgentCount with an unknown instance returns null (no-hijack rule)', () => {
+    expect(getMessagingGroupWithAgentCount('slack', 'slack:C1', 'slack-unknown')).toBeNull();
+  });
+
+  it('getMessagingGroupByPlatform without instance prefers the default-instance row', () => {
+    const found = getMessagingGroupByPlatform('slack', 'slack:C1');
+    expect(found).toBeDefined();
+    expect(found!.id).toBe('mg-default');
+  });
+
+  it('getMessagingGroupByPlatform with explicit instance is exact', () => {
+    expect(getMessagingGroupByPlatform('slack', 'slack:C1', 'alpha-tester')!.id).toBe('mg-tester');
+    expect(getMessagingGroupByPlatform('slack', 'slack:C1', 'slack-unknown')).toBeUndefined();
+  });
+
+  it('getMessagingGroupByPlatform falls back deterministically when only named instances exist', () => {
+    const db = getDb();
+    db.prepare("DELETE FROM messaging_groups WHERE id = 'mg-default'").run();
+    createMessagingGroup(mg({ id: 'mg-zeta', instance: 'zeta' }));
+    const found = getMessagingGroupByPlatform('slack', 'slack:C1');
+    // Lexically-first named instance: 'alpha-tester' < 'zeta'.
+    expect(found!.id).toBe('mg-tester');
+  });
+});
@@ -21,19 +21,43 @@ import { getDb, hasTable } from './connection.js';
 export function createMessagingGroup(group: MessagingGroup): void {
  getDb()
    .prepare(
-      `INSERT INTO messaging_groups (id, channel_type, platform_id, name, is_group, unknown_sender_policy, created_at)
-       VALUES (@id, @channel_type, @platform_id, @name, @is_group, @unknown_sender_policy, @created_at)`,
+      `INSERT INTO messaging_groups (id, channel_type, platform_id, instance, name, is_group, unknown_sender_policy, created_at)
+       VALUES (@id, @channel_type, @platform_id, @instance, @name, @is_group, @unknown_sender_policy, @created_at)`,
    )
-    .run(group);
+    .run({ ...group, instance: group.instance ?? group.channel_type });
 }

 export function getMessagingGroup(id: string): MessagingGroup | undefined {
  return getDb().prepare('SELECT * FROM messaging_groups WHERE id = ?').get(id) as MessagingGroup | undefined;
 }

-export function getMessagingGroupByPlatform(channelType: string, platformId: string): MessagingGroup | undefined {
+/**
+ * Outbound / cold-DM / setup lookup by platform address.
+ *
+ * Instance semantics are deliberately ASYMMETRIC with the router's
+ * `getMessagingGroupWithAgentCount` (exact-only): outbound callers usually
+ * don't know (or care) which adapter instance owns a chat, so an unset
+ * `instance` resolves the default instance first (instance = channel_type),
+ * falling back deterministically to the lexically-first named instance.
+ * A set `instance` is exact-only — unknown instance returns undefined.
+ */
+export function getMessagingGroupByPlatform(
+  channelType: string,
+  platformId: string,
+  instance?: string,
+): MessagingGroup | undefined {
+  if (instance !== undefined) {
+    return getDb()
+      .prepare('SELECT * FROM messaging_groups WHERE channel_type = ? AND platform_id = ? AND instance = ?')
+      .get(channelType, platformId, instance) as MessagingGroup | undefined;
+  }
  return getDb()
-    .prepare('SELECT * FROM messaging_groups WHERE channel_type = ? AND platform_id = ?')
+    .prepare(
+      `SELECT * FROM messaging_groups
+        WHERE channel_type = ? AND platform_id = ?
+     ORDER BY (instance = channel_type) DESC, instance ASC
+        LIMIT 1`,
+    )
    .get(channelType, platformId) as MessagingGroup | undefined;
 }

@@ -46,23 +70,31 @@ export function getMessagingGroupByPlatform(channelType: string, platformId: str
 *
 * Returns `null` when no messaging_groups row exists for this channel.
 * Returns `{ mg, agentCount: 0 }` when the row exists but has no wired
- * agents. Uses the `UNIQUE(channel_type, platform_id)` index plus the
- * `UNIQUE(messaging_group_id, agent_group_id)` index for the JOIN — both
+ * agents. Uses the `UNIQUE(channel_type, platform_id, instance)` index plus
+ * the `UNIQUE(messaging_group_id, agent_group_id)` index for the JOIN — both
 * covered by existing SQLite auto-indexes from the UNIQUE constraints.
+ *
+ * `instance` is EXACT-ONLY, with no fallback — deliberately asymmetric with
+ * `getMessagingGroupByPlatform`'s default-instance-first resolution. An
+ * unknown named instance must return null so the router auto-creates a
+ * per-instance group instead of hijacking a sibling instance's row. The
+ * default param (= channelType) keeps instance-less callers resolving the
+ * default instance, identical to pre-instance behavior.
 */
 export function getMessagingGroupWithAgentCount(
  channelType: string,
  platformId: string,
+  instance: string = channelType,
 ): { mg: MessagingGroup; agentCount: number } | null {
  const row = getDb()
    .prepare(
      `SELECT mg.*, COUNT(mga.id) AS agent_count
         FROM messaging_groups mg
    LEFT JOIN messaging_group_agents mga ON mga.messaging_group_id = mg.id
-        WHERE mg.channel_type = ? AND mg.platform_id = ?
+        WHERE mg.channel_type = ? AND mg.platform_id = ? AND mg.instance = ?
     GROUP BY mg.id`,
    )
-    .get(channelType, platformId) as (MessagingGroup & { agent_count: number }) | undefined;
+    .get(channelType, platformId, instance) as (MessagingGroup & { agent_count: number }) | undefined;
  if (!row) return null;
  const { agent_count, ...mg } = row;
  return { mg: mg as MessagingGroup, agentCount: agent_count };
@@ -72,6 +104,12 @@ export function getAllMessagingGroups(): MessagingGroup[] {
  return getDb().prepare('SELECT * FROM messaging_groups ORDER BY name').all() as MessagingGroup[];
 }

+/**
+ * All messaging groups on a platform, across every adapter instance.
+ * Semantics intentionally unchanged by the instance dimension — channel_type
+ * stays the semantic platform key. No live caller today; if a caller needs
+ * a single instance's rows, filter on the `instance` column.
+ */
 export function getMessagingGroupsByChannel(channelType: string): MessagingGroup[] {
  return getDb().prepare('SELECT * FROM messaging_groups WHERE channel_type = ?').all(channelType) as MessagingGroup[];
 }
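
Review note: the outbound lookup in `getMessagingGroupByPlatform` picks one row via `ORDER BY (instance = channel_type) DESC, instance ASC LIMIT 1`. The same selection rule, sketched in plain TypeScript over in-memory rows to make the ordering explicit. `Row` and `pickOutbound` are hypothetical names for illustration, and the string comparison only matches SQLite's default BINARY collation for ASCII instance names.

```typescript
// Hypothetical in-memory mirror of the outbound ordering; not part of the diff.
interface Row {
  id: string;
  channel_type: string;
  instance: string;
}

function pickOutbound(rows: Row[]): Row | undefined {
  return [...rows].sort((a, b) => {
    const aDefault = a.instance === a.channel_type ? 1 : 0;
    const bDefault = b.instance === b.channel_type ? 1 : 0;
    if (aDefault !== bDefault) return bDefault - aDefault; // default instance first
    if (a.instance < b.instance) return -1; // then lexical order on instance
    return a.instance > b.instance ? 1 : 0;
  })[0];
}

const rows: Row[] = [
  { id: 'mg-tester', channel_type: 'slack', instance: 'alpha-tester' },
  { id: 'mg-default', channel_type: 'slack', instance: 'slack' },
];
console.log(pickOutbound(rows)?.id);
```

The inbound lookup has no equivalent of this fallback: it matches on the exact instance or returns null.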
@@ -0,0 +1,61 @@
+/**
+ * Channel-instance dimension on messaging_groups.
+ *
+ * `instance` names the adapter instance that owns a chat — N adapters of one
+ * platform (e.g. three Slack apps in one workspace) each get their own
+ * messaging_groups rows. The default instance IS the channel type: every
+ * existing row is backfilled with `instance = channel_type`, so all existing
+ * lookups keep resolving the same rows with zero operator action. NOT NULL
+ * (instead of nullable + partial unique index) keeps every lookup two-state:
+ * "default instance" is just the literal value `channel_type`.
+ *
+ * Uniqueness relaxes from UNIQUE(channel_type, platform_id) to
+ * UNIQUE(channel_type, platform_id, instance). SQLite cannot relax a
+ * table-level UNIQUE in place — this requires the documented 12-step
+ * recreate (new table → copy → DROP → RENAME, sqlite.org/lang_altertable.html).
+ * DROP TABLE fails `FOREIGN KEY constraint failed` on live DBs because five
+ * child tables REFERENCE messaging_groups(id) (messaging_group_agents,
+ * user_dms, sessions, pending_sender_approvals, pending_channel_approvals) —
+ * the exact failure that forced migration 011 to abandon its rebuild (see
+ * its header). Hence `disableForeignKeys: true`: the runner toggles
+ * foreign_keys=OFF around the transaction (the pragma is a no-op inside one)
+ * and runs PRAGMA foreign_key_check inside it so violations roll back.
+ *
+ * Column list mirrors the live tip schema exactly (001 columns + 012's
+ * denied_at) — verified against PRAGMA table_info on a freshly-migrated DB.
+ * A recreate with a stale column list silently drops data.
+ */
+import type Database from 'better-sqlite3';
+import type { Migration } from './index.js';
+
+export const migration016: Migration = {
+  version: 16,
+  name: 'messaging-group-instance',
+  disableForeignKeys: true,
+  up: (db: Database.Database) => {
+    // Idempotency guard per the 012 pattern.
+    const cols = db.prepare("PRAGMA table_info('messaging_groups')").all() as Array<{ name: string }>;
+    if (cols.some((c) => c.name === 'instance')) return;
+
+    db.exec(`
+      CREATE TABLE messaging_groups_new (
+        id                    TEXT PRIMARY KEY,
+        channel_type          TEXT NOT NULL,
+        platform_id           TEXT NOT NULL,
+        instance              TEXT NOT NULL,
+        name                  TEXT,
+        is_group              INTEGER DEFAULT 0,
+        unknown_sender_policy TEXT NOT NULL DEFAULT 'strict',
+        created_at            TEXT NOT NULL,
+        denied_at             TEXT,
+        UNIQUE(channel_type, platform_id, instance)
+      );
+      INSERT INTO messaging_groups_new
+        (id, channel_type, platform_id, instance, name, is_group, unknown_sender_policy, created_at, denied_at)
+        SELECT id, channel_type, platform_id, channel_type, name, is_group, unknown_sender_policy, created_at, denied_at
+          FROM messaging_groups;
+      DROP TABLE messaging_groups;
+      ALTER TABLE messaging_groups_new RENAME TO messaging_groups;
+    `);
+  },
+};
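
Review note: the test comments in this change describe the runner as diffing post-`up` FK violations against a pre-`up` snapshot and throwing only on new ones. The runner itself is not shown in this section, so the following is only a sketch of that comparison under stated assumptions: `newViolations` and `FkViolation` are hypothetical names, and the row shape follows the columns `PRAGMA foreign_key_check` reports (`table`, `rowid`, `parent`, `fkid`).

```typescript
// Hypothetical helper, not the runner's actual code.
interface FkViolation {
  table: string; // child table holding the dangling reference
  rowid: number | null; // null for WITHOUT ROWID tables
  parent: string; // referenced parent table
  fkid: number; // index of the violated FK constraint
}

const violationKey = (v: FkViolation): string => `${v.table}|${v.rowid}|${v.parent}|${v.fkid}`;

// Violations present after `up` that were not present before it.
function newViolations(before: FkViolation[], after: FkViolation[]): FkViolation[] {
  const seen = new Set(before.map(violationKey));
  return after.filter((v) => !seen.has(violationKey(v)));
}

const orphan: FkViolation = { table: 'user_dms', rowid: 1, parent: 'messaging_groups', fkid: 0 };
const rogue: FkViolation = { table: 'user_dms', rowid: 2, parent: 'messaging_groups', fkid: 0 };
console.log(newViolations([orphan], [orphan, rogue]).length);
```

Under this scheme a pre-existing orphan passes through untouched, while a violation introduced by the migration itself still aborts the transaction.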