Compare commits

95 Commits

Author SHA1 Message Date
Koshkoshinsk a6a46621dd fix(codex): deliver harness file events + add file to ProviderEvent
Codex's built-in image generation yields { type: 'file', path } that the
ProviderEvent union didn't declare (breaks tsc once codex.ts lands on
trunk) and the poll-loop never consumed (the image was dropped and never
reached chat). Adding the type alone clears the build but leaves delivery
broken — this fixes both.

- add { type: 'file'; path: string } to ProviderEvent
- extract enqueueFileOut() owning the outbox-staging + messages_out
  {files:[]} contract so send_file and the poll-loop can't drift apart
- poll-loop delivers file events to the batch's reply destination,
  best-effort (missing dest / unreadable file logs, never fails the turn)
- tests for enqueueFileOut

Refs CDX-001.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 01:50:32 +03:00
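
A minimal sketch of the shape the commit above describes, not the project's code. Only the `{ type: 'file'; path: string }` variant and the name `enqueueFileOut` come from the commit message; the `text` variant, the `Destination` type, the helper's signature, and `deliverFileEvent` are assumptions made for illustration.

```typescript
// Only the 'file' variant is taken from the commit; 'text' is an assumed sibling variant.
type ProviderEvent =
  | { type: 'text'; text: string }
  | { type: 'file'; path: string };

// Hypothetical reply destination.
type Destination = { channel: string };

// Stand-in for the shared staging helper; the real contract is not shown in the log.
function enqueueFileOut(dest: Destination, path: string, outbox: string[]): void {
  if (path.length === 0) throw new Error('unreadable file');
  outbox.push(`${dest.channel}:${path}`);
}

// Best-effort delivery: a missing destination or a failing enqueue is logged, never thrown,
// so the surrounding turn cannot fail because of a file event.
function deliverFileEvent(
  event: ProviderEvent,
  dest: Destination | undefined,
  outbox: string[],
  log: string[],
): void {
  if (event.type !== 'file') return;
  if (!dest) {
    log.push('file event dropped: no reply destination');
    return;
  }
  try {
    enqueueFileOut(dest, event.path, outbox);
  } catch (err) {
    log.push(`file event dropped: ${(err as Error).message}`);
  }
}
```

Routing both the explicit send path and the poll-loop through one helper is what keeps the two from drifting apart, as the commit notes.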
gavrielc ac0a799cbf refactor(add-codex): install Codex CLI via cli-tools.json, not the Dockerfile
adfae67 moved the agent's global Node CLIs into container/cli-tools.json so a
skill adds one with a json-merge instead of editing the Dockerfile. The Codex
provider install was left behind — add-codex.sh still awk'd an ARG + RUN into
the Dockerfile and its test guarded that shape.

Migrate add-codex to the seam:
- add-codex.sh appends { name: "@openai/codex", version } to cli-tools.json
  (idempotent json-merge); install/idempotency gates read the manifest.
- SKILL.md / REMOVE.md document the manifest append/removal, not Dockerfile edits.
- codex-dockerfile.test.ts -> codex-cli-tools.test.ts, asserting the manifest
  entry (skips when the manifest is absent, e.g. the bare providers branch).

Pairs with the providers-branch commit that drops the codex Dockerfile lines,
renames the payload test, and points the setup install-check at the manifest.

Verified end-to-end: full add-codex install into a clean worktree leaves the
Dockerfile codex-free, the manifest correctly appended and idempotent; vitest
cli-tools.test.ts (6) and bun codex-cli-tools.test.ts (2) green; host tsc clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 21:40:44 +03:00
github-actions[bot] e3986eb58c chore: bump version to 2.1.16 2026-06-14 18:29:28 +00:00
github-actions[bot] 6d0d48d585 docs: update token count to 195k tokens · 98% of context window 2026-06-14 18:29:25 +00:00
gavrielc a142c496f7 Merge pull request #2756 from nanocoai/provider-selection
feat(providers): operator-driven provider selection, switching, and memory migration
2026-06-14 21:29:12 +03:00
gavrielc c5b4d11536 Apply suggestion from @gavrielc 2026-06-14 21:16:19 +03:00
Daniel M ed8b4149e7 Merge pull request #2764 from glifocat/docs/fix-claude-md-relocated-paths
docs(CLAUDE.md): fix two relocated Key Files paths
2026-06-14 18:13:31 +03:00
glifocat d5ce02d1b8 docs(CLAUDE.md): fix two relocated Key Files paths
The Key Files table and the Secrets/OneCLI section referenced
src/onecli-approvals.ts and src/user-dm.ts, but both files were moved
under src/modules/ (src/modules/approvals/onecli-approvals.ts and
src/modules/permissions/user-dm.ts). onecli-approvals.ts is already
cited at its correct new path elsewhere in the same doc, so this was a
partial-rename miss. Docs only — no code changes.

Closes #2763

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 17:01:40 +02:00
omri-maya c8af599944 Merge branch 'main' into provider-selection 2026-06-14 15:17:13 +03:00
github-actions[bot] 435233a062 chore: bump version to 2.1.15 2026-06-14 11:04:33 +00:00
gavrielc 785fce3754 Merge pull request #2758 from nanocoai/feat/cli-tools-manifest
feat(container): data-drive global CLI installs from cli-tools.json
2026-06-14 14:04:16 +03:00
Omri Maya 6d521a9d8d refactor(memory): scope imported-memory doctrine to /migrate-memory
The "read imported-agent-memory.md, treat it as binding" doctrine sat in the
memory definition that every group loads, but it only matters when an import
actually happened. Move it into the /migrate-memory skill — the step that
writes the imported file and its index pointer (which the agent inlines into
its prompt each turn) — and drop the always-on block from definition.md.

Addresses review feedback on #2756.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 12:12:41 +03:00
gavrielc adfae67611 feat(container): data-drive global CLI installs from cli-tools.json
The agent's global Node CLIs (claude-code, agent-browser, vercel) were each
a hardcoded ARG + RUN layer in the Dockerfile, so adding or bumping one meant
editing the Dockerfile — a code reach-in every tool-installing skill had to make.

Move the tool list into container/cli-tools.json. A skill now adds a CLI by
appending a {name, version} entry (a json-merge) — the safest change shape:
deterministic, idempotent, removable. install-cli-tools.sh parses the manifest
with node (no new jq dep), writes the per-tool only-built-dependencies opt-ins,
and runs one pinned `pnpm install -g`, so the pnpm supply-chain path is unchanged.

Behavior is byte-for-byte: same opt-ins, same pinned installs. agent-browser is
now pinned (0.27.1, what `latest` last resolved to) instead of floating.

container/cli-tools.test.ts guards the seam: red if a baseline tool is dropped,
a version unpins, or the Dockerfile wiring / pnpm path is removed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 12:07:14 +03:00
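
The json-merge change shape described above could look roughly like the following. This is a sketch under stated assumptions: the manifest is modeled as a flat array of `{name, version}` entries (the real layout of container/cli-tools.json is not shown in the log), the function names are hypothetical, and `0.0.0` is a placeholder version.

```typescript
interface CliTool {
  name: string;
  version: string;
}

// Assumption: the manifest is a flat array; the actual file may wrap it differently.
// Appending is idempotent: an entry whose name is already present is left untouched.
function mergeCliTool(manifest: CliTool[], entry: CliTool): CliTool[] {
  const existing = manifest.find((tool) => tool.name === entry.name);
  if (existing) return manifest;
  return [...manifest, entry];
}

// Removal is the inverse operation, which is what makes the change shape reversible.
function removeCliTool(manifest: CliTool[], name: string): CliTool[] {
  return manifest.filter((tool) => tool.name !== name);
}

// The agent-browser pin is the one named in the commit; the codex version is a placeholder.
const baseline: CliTool[] = [{ name: 'agent-browser', version: '0.27.1' }];
const withCodex = mergeCliTool(baseline, { name: '@openai/codex', version: '0.0.0' });
const mergedAgain = mergeCliTool(withCodex, { name: '@openai/codex', version: '0.0.0' });
```

Running the merge twice yields the same manifest, which is the property an installer needs in order to be safely re-run.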
Omri Maya 13a37def89 feat(providers): operator-driven provider selection, switching, and memory migration
Make the agent provider a first-class, operator-chosen property instead of a
Claude-only assumption. Trunk gains the seams; the actual non-default payloads
(Codex first) install from the `providers` branch.

Setup
- A provider registry feeds a hard-wired setup picker (Claude | Codex). Picking
  a non-default provider installs its payload (setup/add-codex.sh, channel-style),
  runs a vault-only auth walkthrough (--step provider-auth), and records the pick
  on the first agent before its first spawn.
- Picking Claude changes nothing — default installs are byte-for-byte unaffected.

Provider as a DB property
- Provider lives on container_configs.provider (materialized to container.json,
  read by resolveProviderName). Creation stays provider-agnostic; the picked
  provider is applied via the picked-provider seam. The deprecated
  agent_groups.agent_provider path is not used.

Switching + memory
- Switch a live group with `ncl groups config update --provider` + restart.
- Memory never migrates at runtime — each provider keeps its own store. The
  /migrate-memory skill carries a group's memory across a switch in either
  direction (flat CLAUDE.local.md <-> memory/ scaffold). group-init seeds an
  imported-agent-memory note for non-default providers; the runner's memory
  definition reads it first turn. See docs/provider-migration.md.

No install-wide default, no runtime provider guard — switching is operator-by-
convention, consistent with the no-install-gating posture.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 07:49:39 +03:00
github-actions[bot] 03382e9dd7 chore: bump version to 2.1.14 2026-06-13 13:05:30 +00:00
github-actions[bot] 9763551656 docs: update token count to 194k tokens · 97% of context window 2026-06-13 13:05:27 +00:00
gavrielc a9c9cb300d Merge pull request #2754 from nanocoai/oss/exchange-hook
feat(runner): onExchangeComplete provider hook + slash-command interruption
2026-06-13 16:05:14 +03:00
gavrielc a619fc1aa2 Apply suggestion from @gavrielc 2026-06-13 16:03:02 +03:00
Omri Maya 3d2f3e58ca feat(runner): onExchangeComplete provider hook + slash-command interruption
Inverts conversation archiving into an optional onExchangeComplete provider
hook: the runner never archives on a provider's behalf, and the markdown
writer ships with the provider that needs it. Dormant for the default
provider.

Slash commands now interrupt an in-flight turn — a runner-handled command
(/clear, /compact, /cost, …) arriving mid-turn aborts the active stream and
runs immediately instead of waiting out the turn.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 15:56:43 +03:00
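
One way to picture the inverted archiving hook is an optional method on the provider that the runner calls only when present. The hook name `onExchangeComplete` is from the commit; the `Exchange` and `Provider` shapes and `completeExchange` are hypothetical.

```typescript
// Hypothetical payload; the real hook arguments are not shown in the log.
interface Exchange {
  prompt: string;
  reply: string;
}

interface Provider {
  name: string;
  // Optional: a provider that needs archiving implements it; the default provider omits it.
  onExchangeComplete?: (exchange: Exchange) => void;
}

// The runner never archives on its own; it only invokes the hook if the provider declared one.
function completeExchange(provider: Provider, exchange: Exchange): void {
  provider.onExchangeComplete?.(exchange);
}
```

With no hook declared the call is a no-op, which matches the "dormant for the default provider" behavior described above.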
gavrielc 11afc64ba4 Merge pull request #2747 from nanocoai/oss/onecli-sdk-v2
feat(onecli): SDK 2.2.1 — credential-stub mounts + machine-checkable pins
2026-06-13 15:49:40 +03:00
github-actions[bot] 0ee75d393c chore: bump version to 2.1.13 2026-06-13 12:27:29 +00:00
github-actions[bot] 72b9cc7ed0 docs: update token count to 192k tokens · 96% of context window 2026-06-13 12:27:24 +00:00
gavrielc 5fcf234165 Merge pull request #2746 from nanocoai/oss/agent-surfaces
feat(providers): agent-surfaces capability seam
2026-06-13 15:27:12 +03:00
github-actions[bot] 9b1236505f chore: bump version to 2.1.12 2026-06-13 12:25:58 +00:00
github-actions[bot] 878cd68c1b docs: update token count to 191k tokens · 96% of context window 2026-06-13 12:25:52 +00:00
gavrielc fab1ebf2d6 Merge pull request #2745 from nanocoai/oss/memory-scaffold
feat(memory): opt-in persistent memory scaffold for providers
2026-06-13 15:25:39 +03:00
Omri Maya 3f9e89d345 feat(onecli): SDK 2.2.1 — credential-stub mounts + machine-checkable pins
Injects credentials as request-time stubs so no credential is ever written
into a container or to disk. Gateway and CLI versions move to versions.json
(machine-checkable pins); breaking upgrades are documented in
docs/onecli-upgrades.md as an agent-executable runbook (detect / why / fix /
verify / rollback), and the update flow follows linked docs and diffs the
pins.

BREAKING: requires a gateway upgrade; the doc carries the steps.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 11:30:11 +03:00
Omri Maya 14810a5090 feat(providers): agent-surfaces capability seam
Host-side registry where a provider can declare, by capability rather than
by name, that it owns its agent surfaces (project doc, skills). Default
providers keep the standard surfaces; a surfaces-owning provider suppresses
them. Dormant until a provider registers — no change for existing installs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 11:30:10 +03:00
Omri Maya 2cfa86e570 feat(memory): opt-in persistent memory scaffold for providers
Adds a provider capability (usesMemoryScaffold) and a container-side boot
scaffold that materializes a persistent memory/ tree for providers that opt
in. Dormant for the default provider — the scaffold is only built when a
provider declares the capability, so existing installs are byte-identical
(asserted by a boot-gate wiring test).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 11:30:09 +03:00
github-actions[bot] 36cbf17e10 chore: bump version to 2.1.11 2026-06-11 17:16:51 +00:00
gavrielc 4459ab2e54 Merge pull request #2739 from nanocoai/feat/raw-webhook-registry
feat(webhook-server): raw-route registry — non-Chat-SDK webhooks become an append
2026-06-11 20:16:33 +03:00
gavrielc 9e6238d28f Merge main (channel instances): keep both webhook suites as separate files
The instance route-split suite (from #2733) keeps src/webhook-server.test.ts;
this branch's raw-route suite moves to src/webhook-server-raw.test.ts —
incompatible lifecycle setups (fixed port + afterEach vs random port +
afterAll) make a single merged file wrong. webhook-server.ts auto-merge
verified: raw routes take dispatch priority, stop clears both maps.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 20:07:30 +03:00
github-actions[bot] d1bda5d15b chore: bump version to 2.1.10 2026-06-11 16:42:59 +00:00
gavrielc 7eddc7d8c9 Merge pull request #2738 from nanocoai/fix/write-outbound-direct-rw
fix(session-manager): writeOutboundDirect opens outbound.db read-only — command-gate denials never deliver
2026-06-11 19:42:39 +03:00
github-actions[bot] 991ef986f8 docs: update token count to 190k tokens · 95% of context window 2026-06-11 16:42:36 +00:00
github-actions[bot] 0f2557e2bc chore: bump version to 2.1.9 2026-06-11 16:42:32 +00:00
gavrielc 4e6552ed55 Merge pull request #2737 from nanocoai/feat/approval-resolved-hook
feat(approvals): approval-resolved callback registry — modules observe resolution additively
2026-06-11 19:42:12 +03:00
github-actions[bot] 978b998ee6 chore: bump version to 2.1.8 2026-06-11 16:41:56 +00:00
gavrielc 83951d7c01 Merge pull request #2736 from nanocoai/fix/host-sweep-wake-grace
fix(host-sweep): grace period for freshly-woken containers with stale processing claims
2026-06-11 19:41:38 +03:00
github-actions[bot] 76ef097521 chore: bump version to 2.1.7 2026-06-11 16:41:14 +00:00
gavrielc 1c85fd6e50 Merge pull request #2735 from nanocoai/fix/approval-card-actor-byline
fix(chat-sdk-bridge): record the acting user on resolved approval cards
2026-06-11 19:40:59 +03:00
github-actions[bot] 42275ede1f chore: bump version to 2.1.6 2026-06-11 16:40:40 +00:00
gavrielc 53e1989529 Merge pull request #2734 from nanocoai/feat/delivery-action-getter
feat(delivery): getDeliveryAction read side for the action registry
2026-06-11 19:40:20 +03:00
github-actions[bot] 6f2142d7c7 docs: update token count to 189k tokens · 95% of context window 2026-06-11 16:39:51 +00:00
github-actions[bot] 79a0226962 chore: bump version to 2.1.5 2026-06-11 16:39:41 +00:00
gavrielc 0b31695e92 Merge pull request #2733 from nanocoai/feat/channel-instances
feat(channels): native channel-instance dimension — multi-bot substrate
2026-06-11 19:39:19 +03:00
gavrielc 421f8707d2 Merge pull request #2741 from nanocoai/setup-handoff-kickoff-prompt
fix(setup): auto-submit handoff context as Claude's first prompt
2026-06-11 17:36:04 +03:00
gavrielc 67ccd9e74c fix(setup): auto-submit handoff context as Claude's first prompt
Interactive setup handoffs (mid-flow `?` escape and on-failure) spawned
claude with all context in --append-system-prompt and no user message,
so Claude sat at an empty REPL until the user re-explained themselves.

Move the context into a positional prompt that auto-submits as the
first user message: Claude starts orienting immediately, the context
stays visible in the transcript, and it survives --resume.

Also:
- Share one session across all handoffs in a setup run: pin a
  generated UUID via --session-id on the first spawn, --resume it on
  later ones (stdio is inherited, so Claude's own id is never visible).
- Switch --permission-mode from acceptEdits to auto.
- Dedupe the two spawn blocks into spawnInteractiveClaude().

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 15:14:48 +03:00
gavrielc f69af07c57 feat(webhook-server): raw-route registry — non-Chat-SDK webhooks become an append
Add a RawWebhookHandler registry alongside the Chat SDK adapter routes
so modules can mount plain Node handlers at /webhook/{path} on the
shared server instead of editing webhook-server.ts or standing up a
second HTTP server on another port. Raw routes dispatch ahead of
adapter routes, handler throws surface as a 500, and stopWebhookServer
clears the registry.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 13:58:51 +03:00
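
The dispatch order described above can be sketched without an HTTP server. `RawWebhookHandler` and `stopWebhookServer` are names from the log; the handler signature (body in, status out), the two maps, `registerRawWebhook`, and `dispatch` are simplifications assumed for illustration. Real handlers are plain Node request handlers.

```typescript
// Hypothetical simplified signature: takes the request body, returns an HTTP status.
type RawWebhookHandler = (body: string) => number;

const rawRoutes = new Map<string, RawWebhookHandler>();
const adapterRoutes = new Map<string, RawWebhookHandler>();

function registerRawWebhook(path: string, handler: RawWebhookHandler): void {
  rawRoutes.set(path, handler);
}

// Raw routes are consulted before adapter routes; a throwing handler surfaces as a 500.
function dispatch(url: string, body: string): number {
  const match = /^\/webhook\/([^/?]+)/.exec(url);
  const key = match?.[1];
  if (key === undefined) return 404;
  const handler = rawRoutes.get(key) ?? adapterRoutes.get(key);
  if (!handler) return 404;
  try {
    return handler(body);
  } catch {
    return 500;
  }
}

// Stopping the server clears both registries.
function stopWebhookServer(): void {
  rawRoutes.clear();
  adapterRoutes.clear();
}
```

Because registration is a map insert, a module adds a route by appending a call rather than editing the server file.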
gavrielc 93a302b5db feat(approvals): add approval-resolved callback registry
Modules can already register to handle an approval (registerApprovalHandler),
but nothing lets a module observe that an approval was resolved — e.g. to
clear an "awaiting approval" status indicator it set when the card went out.
Today that observation is only possible by core importing module code.

Add registerApprovalResolvedHandler/notifyApprovalResolved to the approvals
primitive and fire it at the three resolution exits in the response handler
(reject, approve-with-no-handler, approve-after-handler). Callback errors are
logged and isolated so one bad callback never blocks resolution or other
callbacks. The hook only fires for authorized clicks (it sits behind the
isAuthorizedApprovalClick gate) and carries the namespaced user id.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 13:54:21 +03:00
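
A rough sketch of an observer registry with isolated callback errors, as the commit above describes. `registerApprovalResolvedHandler` and `notifyApprovalResolved` are the names from the log; the `ApprovalResolution` shape and the in-memory error log are assumptions.

```typescript
// Hypothetical payload; the commit only says it carries the namespaced user id.
interface ApprovalResolution {
  approvalId: string;
  outcome: 'approve' | 'reject';
  userId: string;
}

type ApprovalResolvedHandler = (resolution: ApprovalResolution) => void;

const resolvedHandlers: ApprovalResolvedHandler[] = [];
const callbackErrors: string[] = []; // stands in for the real logger

function registerApprovalResolvedHandler(handler: ApprovalResolvedHandler): void {
  resolvedHandlers.push(handler);
}

// Each callback runs in its own try/catch so one failure cannot block the others
// or the resolution itself.
function notifyApprovalResolved(resolution: ApprovalResolution): void {
  for (const handler of resolvedHandlers) {
    try {
      handler(resolution);
    } catch (err) {
      callbackErrors.push(String(err));
    }
  }
}
```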
gavrielc eef285ba3b fix(session-manager): open outbound.db read-write in writeOutboundDirect
writeOutboundDirect opened the session's outbound DB through
openOutboundDb, which sets readonly: true. The INSERT it then runs threw
SQLITE_READONLY on every call, so the command-gate denial path
(router.ts) never delivered its 'Permission denied' response — the
sender just got silence, and the throw aborted routing for that inbound
event.

Switch to the openOutboundDbRw wrapper, which opens the same path with
write access (DELETE journal + busy_timeout). The host-side write to the
container-owned outbound.db is safe: both sides use DELETE journal mode,
and the even host seq stays out of the container's odd-seq space.

Adds a guard test that drives writeOutboundDirect against a real session
folder and asserts the denial rows land in messages_out with even seqs;
it goes red if the open call reverts to the readonly form.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 13:54:17 +03:00
gavrielc a806534199 fix(host-sweep): grace period for freshly-woken containers with stale processing claims
The sweep tick that wakes a container for due messages also ran the
running-container SLA check in the same iteration. A fresh container that
inherits stale processing_ack rows from a previous crash hasn't had a chance
to run its startup cleanup (clearStaleProcessingAcks) yet, so the per-claim
stuck rule saw an hours-old claim, concluded the just-spawned container was
stuck, and SIGKILL'd it — an immediate spawn-kill loop.

Carry a justWoke flag from the wake step into the SLA gate and skip the
check for that one tick. The next tick (60s later) enforces the SLA
normally, so a genuinely stuck container is still killed.

Guarded by src/host-sweep-grace.test.ts, which drives two real sweep ticks
against on-disk session DBs: the wake tick must not kill, a later tick with
the claim still stale must kill claim-stuck.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 13:53:15 +03:00
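
The one-tick grace period can be illustrated with a toy sweep function. Only the `justWoke` flag is from the commit; the container shape, the threshold value, and the return labels are hypothetical.

```typescript
interface SweepContainer {
  running: boolean;
  oldestClaimAgeMs: number; // age of the oldest processing claim
}

const CLAIM_STUCK_MS = 60 * 60 * 1000; // hypothetical stuck threshold

// One sweep tick: wake the container if messages are due, then run the SLA check,
// skipping the check on the tick that performed the wake.
function sweepTick(c: SweepContainer, hasDueMessages: boolean): 'woke' | 'killed' | 'ok' {
  let justWoke = false;
  if (!c.running && hasDueMessages) {
    c.running = true;
    justWoke = true;
  }
  if (c.running && !justWoke && c.oldestClaimAgeMs > CLAIM_STUCK_MS) {
    c.running = false;
    return 'killed';
  }
  return justWoke ? 'woke' : 'ok';
}
```

A container that inherits a stale claim survives the tick that woke it, and is still killed on a later tick if the claim has not been cleared by then.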
gavrielc 0ac8073e34 fix(chat-sdk-bridge): record the acting user on resolved approval cards
When a button on an approval/question card is clicked, the bridge edits
the card down to the title and the selected answer — but not who clicked
it. In shared channels every member sees the same resolved card, so the
audit trail of which user approved or rejected is lost the moment the
buttons disappear.

Append an actor byline (" — <userName>", falling back to fullName) to
the edited card markdown. The shared chat.onAction handler covers every
Chat SDK webhook platform; cards edited for actors with no resolvable
name stay byline-free.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 13:52:45 +03:00
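
The byline rule above is small enough to sketch directly. The `Actor` shape and the function name are assumptions; the separator and the userName-then-fullName fallback follow the commit text.

```typescript
// Hypothetical actor shape; both names may be missing.
interface Actor {
  userName?: string;
  fullName?: string;
}

// Appends " \u2014 <name>" to the edited card, preferring userName and falling back to fullName.
// Cards for actors with no resolvable name are returned unchanged.
function withActorByline(cardMarkdown: string, actor: Actor): string {
  const name = actor.userName || actor.fullName;
  return name ? `${cardMarkdown} \u2014 ${name}` : cardMarkdown;
}
```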
gavrielc 539a2b3c63 feat(delivery): getDeliveryAction read side for the action registry
registerDeliveryAction had no read side, so module registrations could
not be verified through the registry itself. Add a getter beside it and
a guard test covering lookup, miss, and overwrite.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 13:52:00 +03:00
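
For illustration, a registry with a read side might be as small as the following. `registerDeliveryAction` and `getDeliveryAction` are the names from the log; the action signature is an assumption.

```typescript
// Hypothetical action signature.
type DeliveryAction = (payload: string) => string;

const deliveryActions = new Map<string, DeliveryAction>();

// Registering under an existing name overwrites the previous action.
function registerDeliveryAction(name: string, action: DeliveryAction): void {
  deliveryActions.set(name, action);
}

// Read side: returns the registered action, or undefined on a miss.
function getDeliveryAction(name: string): DeliveryAction | undefined {
  return deliveryActions.get(name);
}
```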
gavrielc fccaadf24c fix(channels,db): exact instance dispatch, FK-check scoping, migration-safe skill snippets
Review-round fixes on the instance dimension:
- delivery/typing resolve adapters by exact registry key, never the
  channelType fallback — a named instance with an offline adapter gets
  offline handling, not a cross-identity send through a sibling bot;
  the fallback scan (channelType-only callers) now warns when it
  resolves through a differently-keyed instance
- migration runner only fails on FK violations a migration introduced:
  pre-existing latent orphans (FK-OFF CLI surgery) are logged and
  carried, not turned into a boot crash-loop
- typing re-trigger updates the full address (channelType, platformId,
  threadId, instance) together — no torn entries on agent-shared
  sessions spanning instances
- bridge rejects empty/whitespace instance names (URL-route and
  state-namespace safety)
- add-github / add-linear SKILL.md wiring inserts include the NOT NULL
  instance column
- drop the 10s same-platform boot stagger: operational policy, not
  substrate — reintroducible skill-side for gateway-mode installs

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 11:43:18 +03:00
github-actions[bot] 3329270c67 docs: update token count to 185k tokens · 93% of context window 2026-06-10 20:02:28 +00:00
Daniel M f16ea0c783 Merge pull request #2719 from amit-shafnir/feat/uninstall-script
feat: add uninstall.sh — per-copy uninstaller with confirmation, dry-run, and OneCLI agent cleanup
2026-06-10 23:02:11 +03:00
gavrielc 1c024bc976 docs: document the channel-instance dimension
- CLAUDE.md entity model: instance on messaging_groups.
- db-central.md: updated messaging_groups DDL (instance NOT NULL, triple
  UNIQUE, denied_at), instance semantics (default = channel_type via
  migration 016 backfill; inbound exact-on-instance, outbound
  default-first), and the user_dms per-platform (not per-instance)
  cold-DM note.
- architecture.md: same DDL update in the schema appendix.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 22:07:22 +03:00
gavrielc 6c26f3ef08 feat(host): thread the channel instance through router, delivery, and typing
Inbound: src/index.ts onInbound stamps `instance: adapter.instance ??
adapter.channelType` — the single host-side stamping seam; adapters stay
instance-blind and onInboundEvent (CLI) passes events through unchanged.
The router resolves the thread-policy adapter and the messaging group by
the receiving instance (exact-only — an unknown named instance auto-creates
its own group, persisting the instance, instead of hijacking a sibling's
row).

Outbound: ChannelDeliveryAdapter.deliver/setTyping grow a trailing
`instance` param (host-internal interface only — messages_out, destinations
and session_routing schemas are untouched; containers never see instance).
deliverMessage resolves the messaging group ORIGIN-SESSION-FIRST, so a
named instance's session replies through its own adapter even when a
sibling default row shares the same (channel_type, platform_id); dispatch
goes through getChannelAdapter(instance ?? channelType).

Typing: TypingTarget stores the instance and all three tick sites
(immediate, 4s interval, re-trigger) forward it, so the indicator fires
through the bot that owns the chat.

Also updates a raw-SQL fixture in groups.test.ts for the NOT NULL instance
column.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 22:05:50 +03:00
gavrielc ab6ab6936c feat(channels): per-instance Chat SDK state namespaces and webhook routes
ChatSdkBridgeConfig gains `instance`. The bridge keeps channelType =
adapter.name (semantic platform identity is untouched) and threads the
instance into three places:

- Registry identity: bridge.name / bridge.instance follow config.instance.
- Chat SDK state: SqliteStateAdapter takes an optional namespace and
  prefixes every key at a single choke point (k()). All bridges share the
  chat_sdk_* tables and two same-platform instances see identical
  thread/message ids — without the namespace, the SDK's
  dedupe:${adapter.name}:${message.id} key makes the second bot silently
  drop every message the first processed, locks serialize across bots, and
  subscriptions leak engagement. The namespace applies ONLY when instance
  is set AND differs from adapter.name: the default instance stays on the
  legacy UNPREFIXED keyspace byte-identically, so live installs' existing
  subscriptions/kv/locks/lists rows are never orphaned. enqueue does not
  prefix (appendToList does) — layout is ns:queue:<tid>; acquireLock
  returns the raw threadId and release/extend re-apply k() at their SQL
  sites.
- Webhook route: registerWebhookAdapter(chat, adapterName, routingPath =
  adapterName) splits the URL segment from the chat.webhooks handler key,
  so each same-platform instance gets its own URL (and signing secret).
  Signature adopted verbatim from PR #2617 (credit @davekim917's #1804
  prototype); the handler body needed zero change — dispatch already read
  entry.adapterName, not the route key.

Instance names are validated URL-safe (no '/', '?', ':' or whitespace) at
bridge construction: the route regex is [^/?]+ and ':' is the namespace
delimiter. The Chat instance's inner adapters map stays keyed adapter.name
(the SDK resolves adapters via channelId.split(':')[0] and serializes by
adapter.name) — instance identity lives entirely outside the Chat.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 21:57:33 +03:00
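
The namespacing rule above (prefix only when the instance is set and differs from the adapter name) and the URL-safety check could be sketched like this. Both function names are hypothetical, and the empty-name rejection follows the later review-round commit rather than this one.

```typescript
// Rejects names that would break the route regex ([^/?]+) or the ':' namespace delimiter.
function assertValidInstanceName(instance: string): void {
  if (instance.length === 0 || /[\/?:\s]/.test(instance)) {
    throw new Error(`invalid instance name: "${instance}"`);
  }
}

// Returns the single key-prefixing function. The default instance (unset, or equal to the
// adapter name) keeps the legacy unprefixed keyspace so existing rows are never orphaned.
function makeKeyFn(adapterName: string, instance?: string): (key: string) => string {
  if (instance !== undefined) assertValidInstanceName(instance);
  const namespaced = instance !== undefined && instance !== adapterName;
  return (key: string) => (namespaced ? `${instance}:${key}` : key);
}
```

Keeping the prefix decision in one function is what makes the "byte-identical for the default instance" guarantee easy to check.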
gavrielc 501afb4beb feat(channels): key the adapter registry by instance with channelType fallback
ChannelAdapter and InboundEvent gain an optional `instance` field — the
host-side routing identity for N adapters of one platform. channelType
stays the semantic platform key (user ids, formatting, container config).

Registry changes:
- activeAdapters keys by `adapter.instance ?? adapter.channelType`, so the
  default instance keeps today's channelType key byte-identically. A
  duplicate instance key warns loudly and overwrites (today's boot
  semantics, made visible).
- getChannelAdapter(key) resolves the exact instance key first, then falls
  back to the first-registered adapter of that channel type — channelType-
  only callers (cold DMs, user-id prefix resolution, approval delivery)
  still resolve deterministically when every instance of a platform is
  named.
- initChannelAdapters staggers same-channelType setups by 10s so two
  gateway bots of one platform don't identify simultaneously from one IP.
  Inert when no two registrations share a channelType.

No adapter sets `instance` today, so every existing install boots
identically.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 21:53:01 +03:00
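
The registry keying and fallback described above, as a minimal sketch. `activeAdapters` and `getChannelAdapter` are names from the log; the trimmed `ChannelAdapter` shape, `registerAdapter`, and the in-memory warning list are assumptions.

```typescript
// Trimmed to the two fields that matter for keying.
interface ChannelAdapter {
  channelType: string;
  instance?: string;
}

const activeAdapters = new Map<string, ChannelAdapter>();
const registryWarnings: string[] = []; // stands in for the real logger

// Keyed by instance when set, else by channelType; a duplicate key warns and overwrites.
function registerAdapter(adapter: ChannelAdapter): void {
  const key = adapter.instance ?? adapter.channelType;
  if (activeAdapters.has(key)) registryWarnings.push(`duplicate adapter key: ${key}`);
  activeAdapters.set(key, adapter);
}

// Exact key first, then the first-registered adapter of that channel type.
// Map preserves insertion order, so the fallback is deterministic.
function getChannelAdapter(key: string): ChannelAdapter | undefined {
  const exact = activeAdapters.get(key);
  if (exact) return exact;
  return Array.from(activeAdapters.values()).find((a) => a.channelType === key);
}
```

Note that the later review-round commit in this log restricts delivery and typing to the exact key; the fallback here matches only the behavior this commit describes for channelType-only callers.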
gavrielc 9040dbb86e feat(db): add messaging_groups.instance with FK-safe recreate migration
Adds the channel-instance dimension to the schema: an `instance` column
(NOT NULL, default instance = channel_type) on messaging_groups, relaxing
UNIQUE(channel_type, platform_id) to the triple so N adapter instances of
one platform can each own a row per chat.

SQLite can't relax a table-level UNIQUE in place, and DROP TABLE fails FK
integrity on live DBs with child rows (the failure that forced migration
011 to abandon its rebuild) — so the migration runner grows an opt-in
`disableForeignKeys` flag: foreign_keys=OFF around the transaction (the
pragma is a no-op inside one), PRAGMA foreign_key_check inside it so a
violating recreate rolls back atomically.

Query semantics (deliberately asymmetric, both documented):
- getMessagingGroupWithAgentCount (router fast path): exact-on-instance,
  no fallback — an unknown named instance returns null so the router
  auto-creates a per-instance group instead of hijacking a sibling's row.
  Default param (= channelType) keeps existing callers identical.
- getMessagingGroupByPlatform (outbound/cold-DM/setup): unset instance
  resolves default-instance-first with a deterministic ORDER BY; set
  instance is exact-only.

Existing rows are backfilled instance = channel_type, so single-instance
installs see zero behavior change and need no operator action.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 21:49:02 +03:00
Amit Shafnir d8748e3a45 fix: address uninstaller review findings
- .env backup and removal are now one atomic action: a failed backup
  throws into executePlan's catch and the deletion never runs (the bash
  original's set -e gave the same guarantee; the port had lost it)
- containers are re-listed by install label at removal time instead of
  removed from scan-time ids — the live host can spawn containers during
  the confirm phase
- uninstall telemetry no longer creates data/install-id (persistId:false
  on emit), so --dry-run truly changes nothing and the already-clean
  exit can fire
- runtime-tail failure notes are printed before the Done line instead
  of being discarded
- uninstall.sh translates the old short flags (-n/-y) instead of
  silently dropping them (-n used to fall through to a real interactive
  uninstall)
- nanoclaw.sh gates the TS uninstaller on node (tsx's interpreter), not
  pnpm, which the direct-exec path never uses
- detectExistingInstall also checks the system-level systemd unit
- a delete-onecli-agent spawn failure now notes the manual command
  instead of claiming the agent was already gone
- setupLog.userInput is skipped when logs/ is absent so the uninstall
  doesn't recreate it

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 15:50:12 +03:00
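
The first fix above (backup and removal as one atomic action) can be pictured as follows. `executePlan` is the name from the log; `PlanAction`, `backupThenRemoveEnv`, and the notes format are hypothetical.

```typescript
interface PlanAction {
  label: string;
  run: () => void;
}

// Per-action executor: a failing action becomes a note and the remaining actions still run.
function executePlan(actions: PlanAction[]): string[] {
  const notes: string[] = [];
  for (const action of actions) {
    try {
      action.run();
    } catch (err) {
      notes.push(`${action.label}: ${(err as Error).message}`);
    }
  }
  return notes;
}

// Backup and removal live in one action, so a failed backup throws before the delete runs.
function backupThenRemoveEnv(backup: () => void, remove: () => void): PlanAction {
  return {
    label: '.env',
    run: () => {
      backup();
      remove();
    },
  };
}
```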
Amit Shafnir 41a720dd59 feat: port uninstaller to TS, wire nanoclaw.sh --uninstall, detect existing installs in setup
Replaces the standalone bash uninstall.sh with a TypeScript flow inside the
setup driver (setup/uninstall/): scan (slug-scoped inventory), plan (pure
ordered removal actions), remove (per-action executor that absorbs failures
into notes), and flow (clack UI). uninstall.sh is now a 3-line pointer that
execs nanoclaw.sh --uninstall.

- nanoclaw.sh --uninstall short-circuits before diagnostics/bootstrap; with
  no node_modules it prints manual cleanup commands and exits 1
- setup:auto routes --uninstall before initProgressionLog so an uninstall
  never resets logs/setup.log
- fresh setup runs detect an existing install (service registration or
  data/v2.db) and offer keep-and-continue (default) or uninstall-and-exit;
  suppressed on fail()-retry and sg re-exec resumes
- self-deletion safety: static imports only, dist/ + node_modules/ removed
  dead last, nothing but console.log after the runtime tail
- --yes never deletes orphan ag-* vault agents; their manual delete
  commands (by vault uuid) are printed instead

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 15:50:12 +03:00
Amit Shafnir 6ae83f48ac feat: add uninstall.sh — per-copy uninstaller with confirmation, dry-run, and OneCLI agent cleanup
Removes only what belongs to this checkout (slug-scoped): background
service, containers + image, data/, logs/, groups/, ncl symlink, and
this copy's OneCLI vault agents. Shared tools (OneCLI app, credentials,
other copies) are left alone. Interactive per-group confirmation with
--dry-run and --yes modes; .env is backed up before removal.

Documented in README FAQ and the CLAUDE.md key-files table.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 15:50:12 +03:00
gavrielc dc34ceb83d Merge pull request #2721 from nanocoai/docs/skills-model
docs: customizing intro, skills model, and skill guidelines
2026-06-10 11:41:40 +03:00
gavrielc ad3dfad3f5 docs: align CONTRIBUTING and README with the registry-branch install model
CONTRIBUTING still described feature skills as installed by merging a
skill/* branch, a design the shipped skills no longer use: /add-slack,
/add-telegram and the rest install by additive fetch from the channels
and providers registry branches (git fetch + git show per file), with
registration tests and a REMOVE.md. Rewrite the skill-type section to
match, point the authoring bar at docs/skill-guidelines.md, fix the
README FAQ line that sent every contribution to the registry branches,
and delete docs/skills-as-branches.md (the superseded merge-based
design, including a marketplace flow that was never the shipped path).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 23:27:51 +03:00
gavrielc 0bdc6d2bb2 docs: customizing intro, skills model, and skill guidelines
Three public docs establishing the skills-based customization contract:

- docs/customizing.md: the short doorway. The problem (merge fights on
  update), the idea (every change is a skill), how to work (edit first,
  skillify after), the one rule (/update-nanoclaw, never raw git pull),
  and the two-sided deal.
- docs/skills-model.md: the full model. Recipes, skill anatomy, the
  two kinds of skills, registry branches (additive fetch, never merge),
  a test for every integration point, upgrading, migrations and the
  startup tripwire, the maintainer commitments, and the registry
  review rule.
- docs/skill-guidelines.md: the authoritative checklist for writing a
  skill. Two principles (minimal integration surface; a test per
  functional integration point), anatomy, change shapes, testing
  doctrine with archetypes, anti-patterns, worked examples.

Also: CLAUDE.md docs index rows for the three docs, and .gitignore
entries for local-only working artifacts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 23:12:17 +03:00
github-actions[bot] 820cd8ece6 docs: update token count to 185k tokens · 92% of context window 2026-06-09 19:31:53 +00:00
github-actions[bot] e44d497cdf chore: bump version to 2.1.4 2026-06-09 19:31:49 +00:00
gavrielc ac37ecbfd6 Merge pull request #2720 from nanocoai/security/authorize-create-agent
security: authorize create_agent host-side (approval for confined groups)
2026-06-09 22:31:36 +03:00
gavrielc c6627d32e2 security: authorize create_agent host-side (approval for confined groups)
create_agent writes central-DB state (agent_groups, container_configs,
agent_destinations) and scaffolds host filesystem state, but the only
gate lived inside the untrusted container and is bypassed by writing the
outbound system row directly (the "host re-checks permission" comment was
false). Authorize host-side by CLI scope: trusted owner agent groups
(global scope) create sub-agents directly; confined groups require admin
approval via requestApproval. Adds regression tests for the branch.

Alternative to #2383 (which denies confined groups outright); co-authored
from that work.

Co-Authored-By: hinotoi-agent <paperlantern.agent@gmail.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-09 22:29:57 +03:00
github-actions[bot] 51bf403b22 chore: bump version to 2.1.3 2026-06-09 19:29:33 +00:00
github-actions[bot] 265953ffec docs: update token count to 184k tokens · 92% of context window 2026-06-09 19:29:29 +00:00
gavrielc 6227bd1a5b Merge pull request #2478 from Hinotoi-agent/security/approval-response-admin-authz
[security] fix(approvals): require admin for approval responses
2026-06-09 22:29:07 +03:00
gavrielc 28032bc0ec Merge pull request #2468 from Hinotoi-agent/security/a2a-attachment-symlink-guard
[security] fix(agent-route): reject unsafe forwarded attachments
2026-06-09 22:29:03 +03:00
github-actions[bot] 3e3a2945a5 chore: bump version to 2.1.2 2026-06-09 18:04:39 +00:00
gavrielc f3fc18e56e chore: bump claude-code to 2.1.170 and agent SDK to 0.3.170
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-09 21:04:13 +03:00
github-actions[bot] d85efea229 chore: bump version to 2.1.1 2026-06-08 12:10:39 +00:00
github-actions[bot] c5b22cb308 docs: update token count to 183k tokens · 92% of context window 2026-06-08 12:10:36 +00:00
gavrielc 1592369201 Merge pull request #2713 from nanocoai/feat/egress-lockdown
feat(security): egress lockdown (opt-in, off by default)
2026-06-08 15:10:22 +03:00
Omri Maya 6420c0e254 feat(security): egress lockdown (opt-in) — agent egress only via OneCLI
Place agent containers on a Docker `--internal` network (no internet route)
with the OneCLI gateway attached, aliased host.docker.internal. The injected
proxy URL resolves only to the gateway, so a non-proxy-aware client or raw
socket has nowhere to go — closing the HTTPS_PROXY-bypass hole. The agent is
non-root with no NET_ADMIN, so it cannot undo this. Self-healing: the gateway
is re-attached at every spawn and on each host-sweep tick.

Fail-fast: when lockdown is enabled but the network/gateway can't be
established, refuse to spawn and surface a clear EgressLockdownError rather
than silently falling back to open egress. The host-sweep re-heal is the lone
exception — a heal failure there is logged, not fatal, since running agents
stay on the internal net (no leak) until the gateway returns.

Off by default — opt in with NANOCLAW_EGRESS_LOCKDOWN=true (so OSS users get
the prior behavior unchanged on pull). Also NANOCLAW_EGRESS_NETWORK and
ONECLI_GATEWAY_CONTAINER.

The lockdown logic lives in its own src/egress-lockdown.ts; container-runtime.ts
keeps only the generic runtime surface. Documented in docs/SECURITY.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 11:23:17 +03:00
gavrielc aef8d38b36 Merge pull request #2710 from markbala/docs/ollama-prefix-cache
docs(ollama): allow prompt caching by filtering the cache-busting hash
2026-06-07 23:21:45 +03:00
gavrielc 6d6f813deb Merge branch 'main' into docs/ollama-prefix-cache 2026-06-07 22:01:26 +03:00
markbala f9c86d0af2 docs(ollama): allow prompt caching by filtering the cache-busting hash
The Claude Agent SDK adds a per-request cch=<hash> to the front of every
prompt; it changes each turn, and Ollama's prompt cache only reuses a
prompt whose start is unchanged, so it re-reads the whole prompt every
time (slow). A tiny proxy filters the hash out (pins cch to a constant) so
caching kicks in. In our setup (31B on Apple Silicon) follow-up replies
went ~80s -> ~4s; numbers vary by model/hardware. Ollama ignores the hash,
so output is unchanged.

Scope: only the Claude-Code-CLI -> Ollama path; Codex/OpenCode emit no cch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-07 23:20:11 +08:00
github-actions[bot] 9edb33dd3a docs: update token count to 182k tokens · 91% of context window 2026-06-07 14:06:19 +00:00
gavrielc 8ba5261ae8 Merge pull request #2707 from nanocoai/feat/upgrade-tripwire
feat(upgrade): startup tripwire + upgrade marker
2026-06-07 17:06:03 +03:00
gavrielc 8c84dec8e9 Merge remote-tracking branch 'origin/main' into feat/upgrade-tripwire
# Conflicts:
#	.claude/skills/migrate-nanoclaw/SKILL.md
2026-06-07 17:05:24 +03:00
gavrielc 092487d7ad chore: release 2.1.0; guard auto-bump against deliberate version changes
Set package.json to 2.1.0 to match the CHANGELOG entry for the upgrade
tripwire (a [BREAKING] change warrants a minor bump). The startup
tripwire reads package.json as the source of truth, so this is the
version the gate will enforce.

bump-version.yml previously ran `pnpm version patch` on every push to
main, which would patch a deliberate 2.1.0 up to 2.1.1. It now skips the
auto-bump when the pushed commits already changed package.json
themselves. fetch-depth: 0 so the before/after diff has both tips.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 17:03:02 +03:00
gavrielc 87850aa7f8 docs(changelog): release the upgrade-tripwire entry as 2.1.0
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 16:59:30 +03:00
gavrielc 526170fd47 feat(upgrade): add human-addressed guidance to tripwire banner
The startup tripwire message was written for a coding agent and gave a
human no direction — only the bare `set` override (which skips the
migrations the gate guards). Add one human-addressed stanza pointing to
/update-nanoclaw as the correct fix. The tested CODING AGENT block is
left byte-for-byte unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 16:57:13 +03:00
gavrielc 2d9375531b Merge pull request #2698 from nanocoai/feat/skill-exemplars
Skills conformance: exemplars + fleet retrofit (upgrade-maintainable skills)
2026-06-06 20:16:24 +03:00
gavrielc e734e5cddd feat(upgrade): startup tripwire + upgrade marker
Refuse to start unless this install reached the current version through a
sanctioned path (setup / update / migrate). A raw `git pull` that skips
migrations now fails loudly with a self-healing message instead of
silently breaking.

- src/upgrade-state.ts: marker at data/upgrade-state.json, getCodeVersion,
  isUpgradeCurrent, enforceUpgradeTripwire (fails closed on missing /
  corrupt / mismatched marker)
- src/index.ts: gate wired in at startup step 0.5, before DB init
- scripts/upgrade-state.ts: get/set CLI (also the override / recovery cmd)
- setup/service.ts, /update-nanoclaw, /migrate-nanoclaw: stamp on success;
  update/migrate also self-update their own skill first
- CHANGELOG [BREAKING] entry bridges existing installs via the skills'
  breaking-change check
- docs/upgrade-recovery.md: clearing the tripwire

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 13:02:12 +03:00
hinotoi-agent 728c6a641b fix(approvals): require admin for approval responses 2026-05-15 10:34:46 +08:00
hinotoi-agent 8385236c30 fix(agent-route): reject unsafe forwarded attachments 2026-05-14 21:04:04 +08:00
139 changed files with 8281 additions and 1348 deletions
+49 -54
@@ -1,83 +1,78 @@
# Remove Codex provider
# Remove the Codex agent provider
Idempotent — safe to run even if some steps were never applied. Reverses both the host (`src/providers/`) and container (`container/agent-runner/src/providers/`) trees, plus the Dockerfile CLI install.
Reverses every change `/add-codex` makes and returns every group to the default provider. Safe to run when partially installed — skip any step whose target is already absent.
## 1. Delete the barrel import lines (both trees)
## 1. Switch codex groups back to the default
Delete (do not comment out) the `import './codex.js';` line from each barrel:
List groups still on codex and switch each one (each group's `memory/` tree stays on disk and readable; run `/migrate-memory` per group if its memory should carry back to Claude — see [docs/provider-migration.md](../../docs/provider-migration.md)):
```bash
ncl groups list
# for each group whose config shows provider=codex:
ncl groups config update --id <group-id> --provider claude
ncl groups restart --id <group-id>
```
## 2. Delete the barrel imports
Delete (do not comment out) the `import './codex.js';` line from each of:
- `src/providers/index.ts`
- `container/agent-runner/src/providers/index.ts`
- `setup/providers/index.ts`
This unregisters the provider from both `listProviderContainerConfigNames()` (host) and `listProviderNames()` (container).
## 2. Delete the copied files (both trees)
## 3. Delete every copied file
```bash
rm -f src/providers/codex.ts \
src/providers/codex-agents-md.ts \
src/providers/codex-registration.test.ts \
src/providers/codex-host-contribution.test.ts \
src/providers/codex-agents-md.test.ts \
container/agent-runner/src/providers/codex.ts \
container/agent-runner/src/providers/codex-app-server.ts \
container/agent-runner/src/providers/codex.factory.test.ts \
container/agent-runner/src/providers/exchange-archive.ts \
container/agent-runner/src/providers/exchange-archive.test.ts \
container/agent-runner/src/providers/codex-registration.test.ts \
container/agent-runner/src/providers/codex-dockerfile.test.ts
container/agent-runner/src/providers/codex.factory.test.ts \
container/agent-runner/src/providers/codex.turns.test.ts \
container/agent-runner/src/providers/codex-app-server.test.ts \
container/agent-runner/src/providers/codex-cli-tools.test.ts \
setup/providers/codex.ts \
setup/providers/codex.test.ts \
setup/providers/codex-registration.test.ts
```
## 3. Revert the Dockerfile CLI install
This skill itself (`.claude/skills/add-codex/`) stays — it ships with trunk so the provider can be re-added later.
In `container/Dockerfile`, remove both Codex edits (skip whichever is already gone):
`container/AGENTS.md` stays only if another installed provider uses agent surfaces; otherwise remove it too.
**(a)** Delete the version ARG from the "Pin CLI versions" block:
## 4. Remove the CLI manifest entry
```dockerfile
ARG CODEX_VERSION=0.124.0
```
**(b)** Delete the standalone Codex install layer:
```dockerfile
RUN --mount=type=cache,target=/root/.cache/pnpm \
pnpm install -g "@openai/codex@${CODEX_VERSION}"
```
Leave the other per-CLI install layers (claude-code, agent-browser, vercel) untouched.
## 4. Dependency
Codex is a CLI binary installed via the Dockerfile — there is no agent-runner package dependency to uninstall. Step 3 removes the only install surface; no `bun remove` / `pnpm uninstall` is needed.
## 5. Unset Codex env vars
Remove any Codex-specific lines you added to `.env` (`OPENAI_API_KEY`, `OPENAI_BASE_URL`, `CODEX_MODEL`) if no other integration uses them, then re-sync to the container:
Delete the `@openai/codex` entry from `container/cli-tools.json`:
```bash
mkdir -p data/env && cp .env data/env/env
node -e '
const fs = require("fs");
const file = "container/cli-tools.json";
const tools = JSON.parse(fs.readFileSync(file, "utf8")).filter((t) => t.name !== "@openai/codex");
const fmt = (t) => " { " + Object.entries(t).map(([k, v]) => JSON.stringify(k) + ": " + JSON.stringify(v)).join(", ") + " }";
fs.writeFileSync(file, "[\n" + tools.map(fmt).join(",\n") + "\n]\n");
'
```
Switch any group still on Codex back to the default provider — set `"provider": "claude"` in `groups/<folder>/container.json` and clear `agent_provider` on the group/session in the DB.
## 5. Vault secret (optional)
## 6. Rebuild and restart
The ChatGPT/OpenAI secret in the OneCLI vault grants nothing once the provider is gone. To remove it: `onecli secrets list`, then `onecli secrets delete --id <id>` for the `chatgpt.com` / `api.openai.com` entry.
Run from your NanoClaw project root:
## 6. Rebuild and verify
```bash
pnpm run build && ./container/build.sh
source setup/lib/install-slug.sh
# macOS
launchctl kickstart -k gui/$(id -u)/$(launchd_label)
# Linux
systemctl --user restart $(systemd_unit)
pnpm run build
pnpm exec tsc -p container/agent-runner/tsconfig.json --noEmit
./container/build.sh
pnpm test
cd container/agent-runner && bun test
```
## Verification
After removal, the registration guards no longer apply (their files are gone). Confirm the provider is fully unwired:
```bash
grep -R "codex.js" src/providers/index.ts container/agent-runner/src/providers/index.ts # no output
grep "@openai/codex" container/Dockerfile # no output
```
In a wired agent, requesting `agent_provider = 'codex'` should fall back to the default provider since `codex` is no longer in the registry.
All suites green and `ncl groups list` showing no codex groups means the removal is complete. Restart the service (`launchctl kickstart -k gui/$(id -u)/<label>` on macOS, `systemctl --user restart <unit>` on Linux).
+79 -139
@@ -1,186 +1,126 @@
---
name: add-codex
description: Use Codex (CLI + AppServer) as the full agent provider — planning, tool orchestration, native compaction, MCP tools, session resume — in place of the Claude Agent SDK. ChatGPT subscription or OPENAI_API_KEY. Per-group via agent_provider. Distinct from using OpenAI as an MCP tool (where Claude remains the planner).
description: Use Codex (OpenAI's codex app-server) as a full agent provider — planning, tool orchestration, MCP tools, server-side history, session resume — alongside or instead of Claude. ChatGPT subscription or OpenAI API key, vault-only via OneCLI. Per-group via `ncl groups config update --provider codex`. Distinct from using OpenAI as an MCP tool (where Claude remains the planner).
---
# Codex agent provider
NanoClaw runs agents in a long-lived **poll loop** inside the container. The backend is selected with **`AGENT_PROVIDER`** (`claude` | `opencode` | `codex` | `mock`).
> Shortcut: `pnpm exec tsx setup/index.ts --step provider-auth codex` performs this whole install (manifest-driven from the providers branch: files, barrels, CLI manifest entry, image rebuild) plus auth in one command. The steps below are the same operations, for agent-driven or manual application.
Trunk ships with only the `claude` provider baked in. This skill copies the Codex provider files in from the `providers` branch, wires them into the host and container barrels, updates the Dockerfile to install the Codex CLI, and rebuilds the image.
NanoClaw selects each group's agent backend from `container_configs.provider` (default `claude`). This skill installs the Codex provider: copy the payload from the `providers` branch, append one import to each of the three provider barrels, add the pinned Codex CLI to the container manifest (`container/cli-tools.json`), rebuild, then run the vault auth walk-through.
The Codex provider runs `codex app-server` as a child process and speaks JSON-RPC over stdio. That gives it native session resume, streaming events, MCP tool access, and `thread/compact/start` compaction — same feature bar as the Claude Agent SDK, without the Anthropic-only lock-in.
The provider runs `codex app-server` as a child process speaking JSON-RPC over stdio: native streaming, MCP tools, server-side conversation history (the continuation is a thread id, no on-disk transcript). Credentials are **vault-only**: OneCLI serves a sentinel `auth.json` stub into the container and swaps the real ChatGPT token or API key on the wire — no key in `.env`, nothing readable in the container.
## Install
### Pre-flight
If all of the following are already present, skip to **Configuration**:
Check whether the payload is already wired (a prior apply, or a trunk that still carries it). All of these present means installed — skip to **Authenticate**:
- `src/providers/codex.ts`
- `src/providers/codex-registration.test.ts`
- `container/agent-runner/src/providers/codex.ts`
- `container/agent-runner/src/providers/codex-app-server.ts`
- `container/agent-runner/src/providers/codex.factory.test.ts`
- `container/agent-runner/src/providers/codex-registration.test.ts`
- `container/agent-runner/src/providers/codex-dockerfile.test.ts`
- `import './codex.js';` line in `src/providers/index.ts`
- `import './codex.js';` line in `container/agent-runner/src/providers/index.ts`
- `ARG CODEX_VERSION` and `"@openai/codex@${CODEX_VERSION}"` in the pnpm global-install block in `container/Dockerfile`
- `src/providers/codex.ts` and `src/providers/codex-agents-md.ts`
- `container/agent-runner/src/providers/codex.ts` and `codex-app-server.ts`
- `setup/providers/codex.ts`
- `import './codex.js';` in `src/providers/index.ts`, `container/agent-runner/src/providers/index.ts`, and `setup/providers/index.ts`
- an `@openai/codex` entry in `container/cli-tools.json`
Missing pieces — continue below. All steps are idempotent; re-running is safe.
### 1. Fetch the providers branch
### Fetch and copy
```bash
git fetch origin providers
```
### 2. Copy the Codex source files and tests
Copy each file with `git show origin/providers:<path> > <path>` (additive — never merge the branch):
Wholesale copies (owned entirely by this skill — user edits to these files won't survive a re-run, as designed):
Host (`src/providers/`):
- `codex.ts` — provider contribution: per-group `.codex-shared` state dir, AGENTS.md compose, skill links
- `codex-agents-md.ts` — AGENTS.md composition (32KB Codex cap: degrades by dropping the largest instruction sections, never blocks a spawn)
- `codex-registration.test.ts` — barrel-driven host registration guard
- `codex-host-contribution.test.ts` — drives the real contribution against a real test DB (the "consumes core" leg)
- `codex-agents-md.test.ts` — cap-degradation behavior
Container (`container/agent-runner/src/providers/`):
- `codex.ts` — the provider (turn loop, steering, memory scaffold + `onExchangeComplete` archiving)
- `codex-app-server.ts` — JSON-RPC child-process wrapper
- `exchange-archive.ts` — per-exchange markdown writer the `onExchangeComplete` hook uses (provider-owned, not runner code)
- `exchange-archive.test.ts` — writer behavior
- `codex-registration.test.ts` — barrel-driven container registration guard
- `codex.factory.test.ts`, `codex.turns.test.ts`, `codex-app-server.test.ts` — provider behavior
- `codex-cli-tools.test.ts` — structural guard for the Codex entry in `container/cli-tools.json`
Setup (`setup/providers/`):
- `codex.ts` — picker entry self-registration + the vault auth walk-through + install check
- `codex.test.ts` — install-check coverage
- `codex-registration.test.ts` — barrel-driven setup registration guard
Shared base (skip if present):
- `container/AGENTS.md` — the runtime-contract base the composed AGENTS.md embeds
### Wire the barrels
Append `import './codex.js';` to each of:
- `src/providers/index.ts`
- `container/agent-runner/src/providers/index.ts`
- `setup/providers/index.ts`
### CLI manifest
The agent's global Node CLIs install from `container/cli-tools.json` (a json-merge seam), not hand-edited Dockerfile layers. Add Codex by appending one entry — `@openai/codex` has no native postinstall, so no `onlyBuilt`:
```bash
git show origin/providers:src/providers/codex.ts > src/providers/codex.ts
git show origin/providers:src/providers/codex-registration.test.ts > src/providers/codex-registration.test.ts
git show origin/providers:container/agent-runner/src/providers/codex.ts > container/agent-runner/src/providers/codex.ts
git show origin/providers:container/agent-runner/src/providers/codex-app-server.ts > container/agent-runner/src/providers/codex-app-server.ts
git show origin/providers:container/agent-runner/src/providers/codex.factory.test.ts > container/agent-runner/src/providers/codex.factory.test.ts
git show origin/providers:container/agent-runner/src/providers/codex-registration.test.ts > container/agent-runner/src/providers/codex-registration.test.ts
node -e '
const fs = require("fs");
const file = "container/cli-tools.json";
const tools = JSON.parse(fs.readFileSync(file, "utf8"));
if (!tools.some((t) => t.name === "@openai/codex")) {
tools.push({ name: "@openai/codex", version: "0.138.0" });
const fmt = (t) => " { " + Object.entries(t).map(([k, v]) => JSON.stringify(k) + ": " + JSON.stringify(v)).join(", ") + " }";
fs.writeFileSync(file, "[\n" + tools.map(fmt).join(",\n") + "\n]\n");
}
'
```
The two `codex-registration.test.ts` files are the **registration guards**. Each imports only the real barrel — the host one calls `listProviderContainerConfigNames()` from `src/providers/index.ts`, the container one calls `listProviderNames()` from `container/agent-runner/src/providers/index.ts` — and asserts `codex` is present. They go red the instant a barrel import line is deleted or drifts. (`codex.factory.test.ts` imports `./codex.js` directly and self-registers, so it stays green even if the barrel line is gone — keep it as a unit test of provider behavior, but it is **not** the registration guard.)
The version (`0.138.0`) is the canonical pin — keep it in sync with `setup/add-codex.sh`. The Dockerfile already installs every manifest entry via pinned `pnpm install -g`; no Dockerfile edit is needed.
If `git show origin/providers:.../codex-registration.test.ts` errors with `path ... does not exist`, the registration tests have not landed on `origin/providers` yet. Run `git fetch origin providers` again; once the branch carries them, the copies above succeed. The rest of the install proceeds regardless — the Dockerfile and factory tests still run.
Copy the Dockerfile structural test that ships with this skill into the container provider tree:
### Build
```bash
cp .claude/skills/add-codex/codex-dockerfile.test.ts container/agent-runner/src/providers/codex-dockerfile.test.ts
pnpm run build
pnpm exec tsc -p container/agent-runner/tsconfig.json --noEmit
./container/build.sh
```
`codex-dockerfile.test.ts` reads the real `container/Dockerfile` and asserts the `ARG CODEX_VERSION=` line and the `pnpm install -g "@openai/codex@${CODEX_VERSION}"` line are both present. The Codex CLI is a binary, not an importable package, so the registration tests cannot see it — this structural test is what guards the Dockerfile edits in step 4.
### 3. Append the self-registration imports
Each barrel gets one line — alphabetical placement keeps diffs small.
`src/providers/index.ts`:
```typescript
import './codex.js';
```
`container/agent-runner/src/providers/index.ts`:
```typescript
import './codex.js';
```
### 4. Add the Codex CLI to the container Dockerfile
Two edits to `container/Dockerfile`, both idempotent (skip if already present):
**(a)** In the "Pin CLI versions" ARG block (around line 18), add after `ARG CLAUDE_CODE_VERSION=...`:
```dockerfile
ARG CODEX_VERSION=0.124.0
```
**(b)** Add a new standalone `RUN` block for the Codex CLI, after the existing per-CLI install blocks (around line 106, right after the `@anthropic-ai/claude-code` block). The Dockerfile splits each global CLI into its own layer for cache granularity — keep that pattern; do not collapse them into a single combined `pnpm install -g` call:
```dockerfile
RUN --mount=type=cache,target=/root/.cache/pnpm \
pnpm install -g "@openai/codex@${CODEX_VERSION}"
```
Note: **no agent-runner package dependency** — Codex is a CLI binary, not a library. Unlike OpenCode, there's nothing to add to `container/agent-runner/package.json`.
### 5. Build and validate
### Validate
```bash
pnpm run build # host
pnpm exec vitest run src/providers/codex-registration.test.ts # host registration guard
pnpm exec tsc -p container/agent-runner/tsconfig.json --noEmit # container typecheck
cd container/agent-runner && bun test src/providers/codex-registration.test.ts && cd - # container registration guard
cd container/agent-runner && bun test src/providers/codex-dockerfile.test.ts && cd - # Dockerfile structural guard
./container/build.sh # agent image
pnpm vitest run src/providers/codex-registration.test.ts src/providers/codex-host-contribution.test.ts src/providers/codex-agents-md.test.ts setup/providers/
cd container/agent-runner && bun test src/providers/
```
All must be clean before proceeding.
The registration tests import only the real barrels — they go red if a barrel line is missing, a barrel fails to evaluate, or the payload is broken.
- The **host** `codex-registration.test.ts` imports the real host barrel (`src/providers/index.ts`) and asserts `listProviderContainerConfigNames()` contains `codex`. It goes red if the `import './codex.js';` line is deleted or drifts, or if the barrel fails to evaluate.
- The **container** `codex-registration.test.ts` imports the real container barrel (`container/agent-runner/src/providers/index.ts`) and asserts `listProviderNames()` contains `codex`. Same failure surface for the container-side import line.
- The **Dockerfile** `codex-dockerfile.test.ts` reads `container/Dockerfile` and asserts the `ARG CODEX_VERSION=` and `@openai/codex@${CODEX_VERSION}` install lines are present — red if either edit is dropped.
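To illustrate why a barrel-driven guard catches a deleted import line while a test that imports the provider module directly does not, here is a self-contained sketch of the self-registration pattern. The registry, `registerProvider`, the loader functions, and `registrationGuard` are hypothetical stand-ins modelled in one file; `listProviderNames` reuses the name mentioned above, but its body here is an assumption.

```typescript
// Hypothetical stand-ins for the registry, provider modules, and the barrel.
type ProviderFactory = () => { name: string };

const registry = new Map<string, ProviderFactory>();

function registerProvider(name: string, factory: ProviderFactory): void {
  registry.set(name, factory);
}

function listProviderNames(): string[] {
  return [...registry.keys()].sort();
}

// What a provider module does at import time
// (the side effect of `import './codex.js';`).
const loadCodexModule = (): void =>
  registerProvider('codex', () => ({ name: 'codex' }));
const loadClaudeModule = (): void =>
  registerProvider('claude', () => ({ name: 'claude' }));

// What the barrel does: evaluate one entry per installed provider.
function loadBarrel(entries: Array<() => void>): void {
  for (const load of entries) load();
}

// The guard goes through the barrel only, then inspects the registry.
function registrationGuard(entries: Array<() => void>, expected: string): boolean {
  registry.clear();
  loadBarrel(entries);
  return listProviderNames().includes(expected);
}

console.log(registrationGuard([loadClaudeModule, loadCodexModule], 'codex')); // true: barrel wired
console.log(registrationGuard([loadClaudeModule], 'codex')); // false: barrel line deleted
```

A test that called `loadCodexModule()` itself would pass in both cases, which is the reason the factory test is not treated as a registration guard.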
The `@openai/codex` CLI binary is guarded by the Dockerfile structural test plus the container build (`./container/build.sh` fails if the install line is bad), **not** by the registration test — Codex is a CLI binary, not an importable package, so nothing imports it for the registration guard to trip on. To confirm the binary is actually present after the image rebuild, probe it inside a running container with `docker exec <container> codex --version`.
The host-side provider also consumes core APIs (per-session `~/.codex` mount, env passthrough); that typed core-API consumption is guarded by `pnpm run build`.
## Configuration
Codex supports two primary auth paths and one experimental BYO-endpoint path. Pick the one that matches your setup.
### Option A — ChatGPT subscription (recommended for individuals)
On the host (not inside the container), run Codex's OAuth login:
## Authenticate
```bash
codex login
pnpm exec tsx setup/index.ts --step provider-auth codex
```
This writes `~/.codex/auth.json` with a subscription token. The host-side Codex provider ([src/providers/codex.ts](../../../src/providers/codex.ts)) copies `auth.json` into a per-session `~/.codex` directory mounted into the container — your host's own Codex CLI is never touched.
The same walk-through fresh installs get from the setup picker: ChatGPT subscription (browser login or device pairing) or an OpenAI API key, landed in the OneCLI vault. Idempotent — it short-circuits when a matching secret already exists. It finishes with the install check.
No `.env` variables required for this mode.
## Use it
### Option B — API key (recommended for CI or API billing)
Per group:
```env
OPENAI_API_KEY=sk-...
CODEX_MODEL=gpt-5.4-mini
```bash
ncl groups config update --id <group-id> --provider codex
ncl groups restart --id <group-id>
```
The host forwards both variables into the container. If both subscription (`auth.json`) and `OPENAI_API_KEY` are present, Codex prefers the subscription.
Switching is an operator action — run it from the host. Memory does NOT carry over automatically — each provider keeps its own store; run `/migrate-memory` to carry it across. See [docs/provider-migration.md](../../docs/provider-migration.md) for the carry-over table and rollback.
### Option C — BYO OpenAI-compatible endpoint (experimental)
There is no install-wide default provider. Setup's provider picker sets codex on the first agent it creates; creation itself is provider-agnostic (no `--provider` flag — provider is a DB property). Any group switches afterward via `ncl groups config update --provider` as above.
Codex's built-in `openai` provider honors the `OPENAI_BASE_URL` env var directly. Point it at any OpenAI-compatible endpoint — Groq, Together, self-hosted vLLM, an OpenAI proxy, etc.
## Troubleshooting
```env
OPENAI_API_KEY=...
OPENAI_BASE_URL=https://api.groq.com/openai/v1
CODEX_MODEL=llama-3.3-70b-versatile
```
Codex also ships first-class local-runner flags — `codex --oss --local-provider ollama` or `--local-provider lmstudio` — that auto-detect a local server. To use those inside NanoClaw, set `CODEX_MODEL` to a model your local runner serves and add the corresponding base URL; see the Codex CLI docs for the full `model_provider = oss` configuration.
**Experimental caveat:** tool-calling quality depends on the model and endpoint. Not every OpenAI-compat provider implements the full function-calling spec, and smaller models (< 30B) often struggle with multi-step tool orchestration. Test before committing.
### Per group / per session
Set `"provider": "codex"` in the group's **`container.json`** (`groups/<folder>/container.json`); the in-container runner reads `provider` from there, not from the DB. The DB columns **`agent_groups.agent_provider`** and **`sessions.agent_provider`** (session overrides group) only drive host-side provider contribution (per-session `~/.codex` mount, `OPENAI_*` / `CODEX_MODEL` env passthrough) and do not propagate into `container.json` at spawn time. Set both, or just edit `container.json`; if they disagree, the runner uses `container.json` and the host-side resolver falls back through session → group → `container.json` → `'claude'`.
`CODEX_MODEL` applies process-wide via `.env`; if you need different models for different groups, set them via `container_config.env` on the group.
Extra MCP servers still come from **`NANOCLAW_MCP_SERVERS`** / `container_config.mcpServers` on the host. The runner merges them into the same `mcpServers` object passed to all providers.
## Operational notes
- **Spawn-per-query:** Codex's app-server is spawned fresh per query invocation, matching the OpenCode pattern. No long-lived daemon to keep healthy across sessions.
- **Per-session `~/.codex` isolation:** each group gets its own copy of the host's `auth.json`. The container can rewrite `config.toml` freely on every wake without touching the host's Codex config.
- **Native compaction:** kicks in automatically at 40K cumulative input tokens between turns, via `thread/compact/start`. If compaction fails, the provider logs and continues uncompacted — no fatal error.
- **Approvals:** auto-accepted inside the container (the container is the sandbox; same posture as Claude/OpenCode).
- **Mid-turn input:** Codex turns don't accept mid-turn messages. Follow-up `push()` calls queue and drain between turns, matching the OpenCode pattern. The poll-loop only pushes between turns anyway, so no messages are dropped.
- **Stale thread recovery:** `isSessionInvalid` matches on stale-thread-ID errors (`thread not found`, `unknown thread`, etc.) so a cold-started app-server can recover cleanly when it sees a stored continuation it no longer has.
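The stale-thread recovery note above can be pictured with a short sketch. It is an assumption-laden illustration: the two patterns come from the examples quoted in the note, the real `isSessionInvalid` in `codex.ts` may match other messages, and `runWithRecovery` is a hypothetical helper showing one way a caller might react.

```typescript
// Hypothetical sketch: the real isSessionInvalid may match different text.
const STALE_THREAD_PATTERNS: RegExp[] = [/thread not found/i, /unknown thread/i];

function isSessionInvalid(error: unknown): boolean {
  const message = error instanceof Error ? error.message : String(error);
  return STALE_THREAD_PATTERNS.some((pattern) => pattern.test(message));
}

// Resume with the stored thread id; if the app-server no longer knows that
// thread, retry once with no continuation so a fresh thread is started.
async function runWithRecovery<T>(
  continuation: string | undefined,
  runTurn: (threadId: string | undefined) => Promise<T>,
): Promise<T> {
  try {
    return await runTurn(continuation);
  } catch (error) {
    if (continuation !== undefined && isSessionInvalid(error)) {
      return runTurn(undefined);
    }
    throw error; // unrelated failures are not swallowed
  }
}
```

Matching on message text is brittle by nature, so a sketch like this would need its pattern list kept in step with whatever errors the app-server actually emits.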
## Next Steps
The registration and Dockerfile guards in **Build and validate** confirm the wiring. For a live end-to-end check, set `agent_provider = 'codex'` on a test group and send a message after the image rebuild. A successful round-trip looks like:
- `init` event with a stable thread ID as continuation
- One or more `activity` / `progress` events during the turn
- `result` event with the model's reply
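If you capture the provider events during that test message, the expected shape can be checked mechanically. The sketch below assumes events are objects with a `type` field; `ProviderEventLike` and `looksLikeSuccessfulRoundTrip` are hypothetical names, and other event types between `init` and `result` are tolerated rather than rejected.

```typescript
// Hypothetical event shape: only the type names come from the list above.
type ProviderEventLike = { type: string };

function looksLikeSuccessfulRoundTrip(events: ProviderEventLike[]): boolean {
  if (events.length < 3) return false;
  const first = events[0];
  const last = events[events.length - 1];
  const middle = events.slice(1, -1);
  return (
    first !== undefined &&
    last !== undefined &&
    first.type === 'init' &&
    last.type === 'result' &&
    // At least one activity/progress event must appear during the turn.
    middle.some((e) => e.type === 'activity' || e.type === 'progress')
  );
}
```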
If the agent hangs or errors, check `~/.codex/auth.json` exists on the host (Option A) or that `OPENAI_API_KEY` is forwarding correctly (Option B) — `docker exec` into a running container and `env | grep -i openai` to confirm. To confirm the CLI binary itself landed in the image, `docker exec <container> codex --version`.
To back this provider out, follow [REMOVE.md](REMOVE.md).
- **Container dies at boot, channel silent:** `grep 'Container exited non-zero' logs/nanoclaw.error.log` — the `stderrTail` carries the reason (e.g. `Unknown provider: codex. Registered: claude` means the barrels aren't wired in the running build).
- **In-channel `Error: spawn codex ENOENT` on every message:** the image predates the manifest entry — re-run `./container/build.sh`.
- **Auth errors mid-conversation:** the vault secret is missing or stale — re-run `pnpm exec tsx setup/index.ts --step provider-auth codex` (subscription re-login updates the vault copy).
@@ -0,0 +1,39 @@
// Structural guard for the Codex CLI install in container/cli-tools.json.
//
// @openai/codex is a CLI *binary* installed from the global-CLI manifest (a
// json-merge seam), not an importable package, so the barrel-driven
// registration tests cannot see it. This test reads the real cli-tools.json
// and asserts the @openai/codex entry is present and pinned to an exact
// version. It goes red if the manifest entry is dropped or unpins.
//
// Runs under bun (same suite as the container registration test):
// cd container/agent-runner && bun test src/providers/codex-cli-tools.test.ts
import { existsSync, readFileSync } from 'fs';
import path from 'path';
import { describe, it, expect } from 'bun:test';
// container/agent-runner/src/providers/ -> container/cli-tools.json
const MANIFEST = path.join(import.meta.dir, '..', '..', '..', 'cli-tools.json');
const manifestPresent = existsSync(MANIFEST);
// Read lazily — `describe.skipIf` still runs the body to register tests, so the
// read has to be guarded for the bare-branch (no manifest) case.
const tools: Array<{ name: string; version: string }> = manifestPresent
? JSON.parse(readFileSync(MANIFEST, 'utf8'))
: [];
const codex = tools.find((t) => t.name === '@openai/codex');
// cli-tools.json is a trunk file; on the bare providers branch it isn't present,
// so skip there. In an installed tree (trunk + this payload) it must carry the
// pinned @openai/codex entry.
describe.skipIf(!manifestPresent)('container/cli-tools.json codex CLI install', () => {
it('includes the @openai/codex entry', () => {
expect(codex).toBeDefined();
});
it('pins it to an exact semver (no latest, no ranges)', () => {
expect(codex?.version).toMatch(/^\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.-]+)?$/);
});
});
@@ -1,30 +0,0 @@
// Structural guard for the Codex CLI install in container/Dockerfile.
//
// @openai/codex is a CLI *binary* installed via the Dockerfile, not an
// importable package, so the barrel-driven registration tests cannot see it.
// This test reads the real Dockerfile and asserts the version ARG and the
// `pnpm install -g` line for @openai/codex are both present. It goes red if
// either Dockerfile edit is dropped or drifts.
//
// Runs under bun (same suite as the container registration test):
// cd container/agent-runner && bun test src/providers/codex-dockerfile.test.ts
import { readFileSync } from 'fs';
import path from 'path';
import { describe, it, expect } from 'bun:test';
// container/agent-runner/src/providers/ -> container/Dockerfile
const DOCKERFILE = path.join(import.meta.dir, '..', '..', '..', 'Dockerfile');
describe('container/Dockerfile codex CLI install', () => {
const dockerfile = readFileSync(DOCKERFILE, 'utf8');
it('declares the CODEX_VERSION ARG', () => {
expect(dockerfile).toMatch(/ARG\s+CODEX_VERSION=/);
});
it('installs the @openai/codex CLI pinned to that ARG', () => {
expect(dockerfile).toMatch(/pnpm install -g\s+"@openai\/codex@\$\{CODEX_VERSION\}"/);
});
});
+2 -2
@@ -111,8 +111,8 @@ Run `/manage-channels` to wire the GitHub channel to an agent group, or insert m
```sql
-- Create messaging group (one per repo)
INSERT INTO messaging_groups (id, channel_type, platform_id, name, is_group, unknown_sender_policy, created_at)
VALUES ('mg-github-myrepo', 'github', 'github:owner/repo', 'owner/repo', 1, '<policy>', datetime('now'));
INSERT INTO messaging_groups (id, channel_type, platform_id, instance, name, is_group, unknown_sender_policy, created_at)
VALUES ('mg-github-myrepo', 'github', 'github:owner/repo', 'github', 'owner/repo', 1, '<policy>', datetime('now'));
-- Wire to agent group
INSERT INTO messaging_group_agents (id, messaging_group_id, agent_group_id, trigger_rules, response_scope, session_mode, priority, created_at)
+2 -2
@@ -119,8 +119,8 @@ Run `/manage-channels` to wire the Linear channel to an agent group, or insert m
```sql
-- Create messaging group (one per team)
INSERT INTO messaging_groups (id, channel_type, platform_id, name, is_group, unknown_sender_policy, created_at)
VALUES ('mg-linear-eng', 'linear', 'linear:ENG', 'Engineering', 1, 'public', datetime('now'));
INSERT INTO messaging_groups (id, channel_type, platform_id, instance, name, is_group, unknown_sender_policy, created_at)
VALUES ('mg-linear-eng', 'linear', 'linear:ENG', 'linear', 'Engineering', 1, 'public', datetime('now'));
-- Wire to agent group
INSERT INTO messaging_group_agents (id, messaging_group_id, agent_group_id, trigger_rules, response_scope, session_mode, priority, created_at)
+3 -1
@@ -71,6 +71,8 @@ Parse the `PAIR_TELEGRAM_ISSUED` status block for `CODE` and follow the `REMINDE
## 4. Run the init script
First, pick the agent provider. Read `src/providers/index.ts` and collect the installed providers from its `import './<name>.js';` lines — `claude` is always available as the built-in default. If a non-default provider is installed (e.g. codex), ask the user which one this agent should run on; if only claude is available, skip the question and omit the flag.
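Collecting the installed providers from the barrel can be sketched like this. It assumes the barrel uses side-effect imports of exactly the form `import './<name>.js';` on their own lines; `installedProviders` is a hypothetical helper, not part of the repository, and whether the barrel itself imports `claude` is not known, so the sketch seeds it explicitly.

```typescript
// Collect provider names from side-effect imports such as `import './codex.js';`.
function installedProviders(barrelSource: string): string[] {
  // claude is the built-in default and is always available.
  const names = new Set<string>(['claude']);
  const importLine = /^\s*import\s+['"]\.\/([^'"/]+)\.js['"];?\s*$/gm;
  let match: RegExpExecArray | null;
  while ((match = importLine.exec(barrelSource)) !== null) {
    const name = match[1];
    if (name !== undefined) names.add(name);
  }
  return Array.from(names);
}
```

If the result has a single entry, only `claude` is available and the question can be skipped; otherwise the list is what to offer the user.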
```bash
npx tsx scripts/init-first-agent.ts \
--channel "${CHANNEL}" \
@@ -80,7 +82,7 @@ npx tsx scripts/init-first-agent.ts \
--agent-name "${AGENT_NAME}"
```
Add `--welcome "System instruction: ..."` to override the default welcome prompt.
Add `--provider <name>` when the user picked a non-default provider (there is no install-wide default — the choice is explicit per group). Add `--welcome "System instruction: ..."` to override the default welcome prompt.
The script:
1. Upserts the `users` row and grants `owner` role if no owner exists.
+2
@@ -67,6 +67,8 @@ pnpm exec tsx setup/index.ts --step register -- \
The `register` step creates the agent group (reusing it if the folder already exists), the messaging group, and the wiring row. `createMessagingGroupAgent` auto-creates the companion `agent_destinations` row so the agent can address the channel by name.
When creating a NEW agent group on a non-default provider, append `--provider <name>` (e.g. `--provider codex`) — there is no install-wide default; existing groups switch via `ncl groups config update --provider` instead.
For separate agents, also ask for a folder name and optionally a different assistant name.
## Add Channel Group
+50
@@ -0,0 +1,50 @@
---
name: migrate-memory
description: Carry an agent group's memory across a provider switch, in either direction (e.g. Claude ↔ Codex, or any provider to/from another). Run after the operator switches a group's provider with `ncl groups config update --provider`. The coding agent reads the source provider's memory store, distills it into the target provider's store, and restarts the group. Triggers on "migrate memory", "carry memory over", "the agent forgot everything after the switch".
---
# Migrate memory across a provider switch
NanoClaw does not migrate memory at runtime — each provider keeps its own store, and carrying content across is the operator's move, executed by you (the coding agent). This skill is the whole mechanism: read the source store, **infer** what is durable, write it into the target store, restart.
You translate between **store shapes**, not provider names. There are two:
- **Flat file** — `CLAUDE.local.md` at the group workspace root (the Claude provider; may reference satellite files in the workspace).
- **Scaffold tree** — `memory/` (any provider with `usesMemoryScaffold`, e.g. Codex). `memory/index.md` is the index; durable notes live under `memory/memories/`; `memory/memories/imported-agent-memory.md` is the conventional landing file for imported memory.
A switch only needs migration when it **crosses shapes**. Two providers that both use the scaffold share the same `memory/` tree, so switching between them carries nothing — the memory is already there. The work is always one of: flat → scaffold, or scaffold → flat.
Principles: **copy, never move** (the source store stays intact — it IS the rollback), **idempotent** (re-running must not duplicate), **distill, don't dump** (you are the inference step: keep identity/seed instructions, user preferences, durable facts; drop conversational residue).
## Step 1: Identify the group, both providers, and the direction
- `ncl groups list`, then `ncl groups config get --id <group-id>` — note the current (target) `provider`. Ask the operator which group, and which provider it switched *from*, if either is ambiguous.
- Map each provider to its store shape (flat `CLAUDE.local.md` vs `memory/` scaffold), then inspect `groups/<folder>/`:
- **Same shape on both sides** (e.g. scaffold → scaffold) → the store is shared; nothing to migrate. Tell the operator and stop.
- **Flat → scaffold** (source has `CLAUDE.local.md` content, target uses the scaffold) → Step 2.
- **Scaffold → flat** (source has a `memory/` tree, target is Claude) → Step 3.
- Source missing or empty → nothing to migrate; tell the operator and stop.
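The decision in this step reduces to comparing store shapes. A minimal sketch, assuming every provider without `usesMemoryScaffold` uses the flat file (the text only states this for Claude); `ProviderInfo`, `storeShape`, and `planMigration` are hypothetical names for illustration.

```typescript
type StoreShape = 'flat' | 'scaffold';
type MigrationPlan = 'nothing-to-migrate' | 'flat-to-scaffold' | 'scaffold-to-flat';

// Hypothetical descriptor; `usesMemoryScaffold` is the flag named above.
interface ProviderInfo {
  usesMemoryScaffold: boolean;
}

// Assumption: no scaffold flag means the flat CLAUDE.local.md store.
function storeShape(provider: ProviderInfo): StoreShape {
  return provider.usesMemoryScaffold ? 'scaffold' : 'flat';
}

function planMigration(
  source: ProviderInfo,
  target: ProviderInfo,
  sourceHasContent: boolean,
): MigrationPlan {
  // Source missing or empty: stop.
  if (!sourceHasContent) return 'nothing-to-migrate';
  const from = storeShape(source);
  const to = storeShape(target);
  // Same shape on both sides: the store is shared.
  if (from === to) return 'nothing-to-migrate';
  return from === 'flat' ? 'flat-to-scaffold' : 'scaffold-to-flat';
}
```

`flat-to-scaffold` corresponds to Step 2 and `scaffold-to-flat` to Step 3.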
## Step 2: flat → scaffold (`CLAUDE.local.md` → `memory/`)
1. Read `groups/<folder>/CLAUDE.local.md` and any workspace files it references.
2. If `memory/memories/imported-agent-memory.md` already exists, a previous import happened — show the operator what's there and ask before overwriting; integrate only what's new.
3. Distill the content into `groups/<folder>/memory/memories/imported-agent-memory.md` (create the directories if missing — the container scaffolds the rest of the tree at boot and never clobbers your files). Lead with anything that defines who the agent is or how it must behave; references to satellite files keep their workspace-root paths.
4. If `memory/index.md` exists, add the following: `- [Imported agent memory](memories/imported-agent-memory.md) — seed instructions and memory carried over from a previous provider. Read it first and treat it as binding; it may define who you are and how to behave. Integrate its facts into your memory as you work; never modify files that belong to another provider's memory system.`
5. Leave the source store exactly as it is.
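The idempotency principle applies to the index entry as well: re-running the import should not add a second link. A sketch of one way to do that, assuming the entry is appended at the end of `memory/index.md` (the text does not say where it goes) and that an existing link to the same target means the entry is already present; `addIndexEntry` is a hypothetical helper.

```typescript
// Append `entry` to the index text only if `linkTarget` is not already linked.
function addIndexEntry(indexText: string, entry: string, linkTarget: string): string {
  // A markdown link to the target already exists: leave the index untouched.
  if (indexText.includes(`(${linkTarget})`)) return indexText;
  const separator = indexText.length === 0 || indexText.endsWith('\n') ? '' : '\n';
  return `${indexText}${separator}${entry}\n`;
}
```

Calling it twice with the same arguments returns the same text the second time, so a repeated migration leaves a single entry.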
## Step 3: scaffold → flat (`memory/` → `CLAUDE.local.md`)
1. Read `memory/index.md`, then the files it points to under `memory/memories/` (and `memory/data/` where durable).
2. Integrate the durable facts into `groups/<folder>/CLAUDE.local.md` under a clearly marked section (e.g. `## Imported from memory/ (<date>)`), deduplicating against what's already there. If the section already exists, update it instead of appending a second one.
3. Leave the source store exactly as it is.
## Step 4: Restart and verify
```bash
ncl groups restart --id <group-id>
```
Tell the operator to send the group a quick test message that depends on a migrated fact (a preference, a project name). If the agent doesn't know it, re-check that the target file landed in the right group folder.
Note: switching the provider is an operator action — `ncl groups config update --id <group-id> --provider <name>` from the host. See [docs/provider-migration.md](../../../docs/provider-migration.md) for what carries over automatically.
+14
@@ -28,6 +28,15 @@ Two phases: **Extract** (build the migration guide) and **Upgrade** (use it). If
---
# Phase 0: Refresh this skill first
The migration process itself evolves, so run its newest version before doing anything else:
- Ensure the `upstream` remote exists (default `https://github.com/nanocoai/nanoclaw.git`) and fetch: `git fetch upstream --prune`. Detect the upstream branch (`main` or `master`).
- Refresh this skill from upstream: `git checkout upstream/<branch> -- .claude/skills/migrate-nanoclaw/`
- Re-read `.claude/skills/migrate-nanoclaw/SKILL.md`. If it changed, **follow the updated version from the top** instead of this one.
This is the only working-tree change expected before the preflight check below; changes limited to `.claude/skills/migrate-nanoclaw/` are this self-refresh — ignore them in the 1.0 clean-tree check and proceed.
# Phase 1: Extract
## 1.0 Preflight
@@ -464,6 +473,11 @@ Point the branch at the upgraded state with `git reset --hard <upgrade-commit>`
Run `pnpm install && pnpm run build` in the main tree to confirm.
Stamp the upgrade marker (required — without it the startup tripwire stops the host on next start). Only do this after the build above succeeds:
```bash
pnpm exec tsx scripts/upgrade-state.ts set "" migrate-nanoclaw
```
Restart the service. Service labels are per-install — derive them from `setup/lib/install-slug.sh`:
```bash
source setup/lib/install-slug.sh
+19
@@ -60,11 +60,20 @@ Help a user with a customized NanoClaw install safely incorporate upstream chang
- Default to MERGE (one-pass conflict resolution). Offer REBASE as an explicit option.
- Keep token usage low: rely on `git status`, `git log`, `git diff`, and open only conflicted files.
# Step 0a: Refresh this skill first
The update process itself evolves, so run its newest version before doing anything else:
- Ensure the `upstream` remote exists (default `https://github.com/nanocoai/nanoclaw.git`) and fetch: `git fetch upstream --prune`. Detect the upstream branch (`main` or `master`).
- Refresh this skill from upstream: `git checkout upstream/<branch> -- .claude/skills/update-nanoclaw/`
- Re-read `.claude/skills/update-nanoclaw/SKILL.md`. If it changed, **follow the updated version from the top** instead of this one.
This is the only working-tree change expected before the preflight check; the full update commits it along with everything else.
# Step 0: Preflight (stop early if unsafe)
Run:
- `git status --porcelain`
If output is non-empty:
- Tell the user to commit or stash first, then stop.
- Exception: changes limited to `.claude/skills/update-nanoclaw/` are the Step 0a self-refresh — ignore those and proceed.
Confirm remotes:
- `git remote -v`
@@ -256,6 +265,16 @@ If any channels/providers are installed AND `upstream/channels` or `upstream/pro
If no channels/providers are installed, skip silently.
Proceed to Step 7.9.
# Step 7.9: Stamp the upgrade marker (required)
After validation has **succeeded**, record that this install reached the new version through the supported path. Without this, the startup tripwire stops the host on its next start.
- `pnpm exec tsx scripts/upgrade-state.ts set "" update-nanoclaw`
- The empty version argument stamps the current `package.json` version.
If validation did NOT succeed, do not stamp — leave the tripwire to catch the broken state.
Proceed to Step 8.
# Step 8: Summary + rollback instructions
+8
@@ -18,12 +18,20 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
token: ${{ steps.app-token.outputs.token }}
- uses: pnpm/action-setup@v4
- name: Bump patch version
run: |
# Skip the auto-bump when the pushed commits already changed the
# version themselves (e.g. a release PR that set a minor/major).
# Otherwise the bot would patch a deliberate 2.1.0 up to 2.1.1.
if git diff --name-only "${{ github.event.before }}" "${{ github.sha }}" | grep -qx 'package.json'; then
echo "package.json already changed in this push; skipping auto-bump."
exit 0
fi
pnpm version patch --no-git-tag-version
git add package.json
git diff --cached --quiet && exit 0
+7
@@ -39,3 +39,10 @@ groups/*
.nanoclaw/
agents-sdk-docs
.agents
AGENTS.md
# Internal working docs, never committed
docs/maintainer-guide.md
docs/drafts/
forks.md
+15
@@ -2,6 +2,21 @@
All notable changes to NanoClaw will be documented in this file.
## [Unreleased]
- [BREAKING] **`@onecli-sh/sdk` 0.5.0 -> 2.2.1 — requires a OneCLI server with the `/v1` API** (older servers 404 every SDK call). The sanctioned gateway and CLI versions are pinned in `versions.json`; the `onecli` setup step enforces them. **Migration:** [docs/onecli-upgrades.md](docs/onecli-upgrades.md).
- **New agent provider: Codex (OpenAI) — run `/add-codex`.** Full runtime via `codex app-server` (planning, MCP tools, server-side history, resume). Trunk ships the seams and the skill; the payload installs from the `providers` branch (the skill, the setup picker, or `--step provider-auth codex`). Auth is vault-only — no credential ever enters a container.
- **Setup can now select, install, and authenticate a non-default agent provider.** A provider registry feeds the setup picker, an installer pulls the provider's payload from its branch, a vault auth walkthrough runs (`--step provider-auth`), and the picked provider is set on the first agent (a DB property) before its first spawn. Default (Claude) installs are unaffected — picking Claude changes nothing.
- **Provider choice is explicit per group — no install-wide default.** Provider is a DB property set via `ncl groups config update --provider` + restart; creation is provider-agnostic.
- **Memory migrates via `/migrate-memory`, never at runtime.** Each provider keeps its own store; fresh groups on a surfaces-owning provider see no stale `CLAUDE.*` files. See [docs/provider-migration.md](docs/provider-migration.md).
- **Per-exchange archiving is provider-owned** — the `onExchangeComplete` hook; the markdown writer ships with the codex payload.
- **Container boot failures now say why** — the last stderr lines are logged at `warn` on a non-zero exit instead of a silent crash loop.
- **Slash commands now interrupt an in-flight turn.** A runner-handled command (`/clear`, `/compact`, `/cost`, …) arriving mid-turn aborts the active stream and runs immediately instead of waiting out the turn.
## [2.1.0] - 2026-06-07
- [BREAKING] **Startup now requires an upgrade marker.** The host refuses to boot unless `data/upgrade-state.json` records that this install reached the current version through a sanctioned path (`/setup`, `/update-nanoclaw`, `/migrate-nanoclaw`). After this update completes — and before restarting the service — stamp the marker by running `pnpm exec tsx scripts/upgrade-state.ts set`. If the host has already tripped on restart with "update did not go through the supported path", that same command clears it. See [docs/upgrade-recovery.md](docs/upgrade-recovery.md).
## [2.0.64] - 2026-05-18
- **`ncl destinations add` and `remove` through the approval flow now reach the receiver immediately.** Approved destinations weren't being projected into the receiving agent's local session state, so a freshly-added destination silently failed at `send_message` with `unknown destination`, and a removed destination stayed resolvable until the next container restart. Both now take effect the moment the approval executes. Direct (non-approval) calls were unaffected.
+10 -4
@@ -33,7 +33,7 @@ user_dms (user_id, channel_type, messaging_group_id) — cold-DM cache
agent_groups (workspace, memory, CLAUDE.md, personality, container config)
↕ many-to-many via messaging_group_agents (session_mode, trigger_rules, priority)
messaging_groups (one chat/channel on one platform; unknown_sender_policy)
messaging_groups (one chat/channel on one platform; instance = adapter-instance name, defaults to channel_type; unknown_sender_policy)
sessions (agent_group_id + messaging_group_id + thread_id → per-session container)
```
@@ -69,8 +69,8 @@ For ad-hoc queries from skills or scripts, use the in-tree wrapper rather than t
| `src/modules/permissions/access.ts` | `canAccessAgentGroup` — owner / global admin / scoped admin / member resolution against `user_roles` + `agent_group_members` |
| `src/modules/approvals/primitive.ts` | `pickApprover`, `pickApprovalDelivery`, `requestApproval`, approval-handler registry |
| `src/command-gate.ts` | Router-side admin command gate — queries `user_roles` directly (no env var, no container-side check) |
| `src/onecli-approvals.ts` | OneCLI credentialed-action approval bridge |
| `src/user-dm.ts` | Cold-DM resolution + `user_dms` cache |
| `src/modules/approvals/onecli-approvals.ts` | OneCLI credentialed-action approval bridge |
| `src/modules/permissions/user-dm.ts` | Cold-DM resolution + `user_dms` cache |
| `src/group-init.ts` | Per-agent-group filesystem scaffold (CLAUDE.md, skills, agent-runner-src overlay) |
| `src/db/container-configs.ts` | CRUD for `container_configs` table (per-group container runtime config) |
| `src/backfill-container-configs.ts` | Migrates legacy `container.json` files into the DB on startup |
@@ -83,6 +83,7 @@ For ad-hoc queries from skills or scripts, use the in-tree wrapper rather than t
| `groups/<folder>/` | Per-agent-group filesystem (CLAUDE.md, skills, per-group `agent-runner-src/` overlay) |
| `scripts/init-first-agent.ts` | Bootstrap the first DM-wired agent (used by `/init-first-agent` skill) |
| `migrate-v2.sh` + `setup/migrate-v2/` | v1→v2 migration. Standalone script: `bash migrate-v2.sh`. Seeds DB, copies groups/sessions, installs channels, builds container, offers service switchover, then hands off to `/migrate-from-v1` skill for owner setup and CLAUDE.md cleanup. See [docs/migration-dev.md](docs/migration-dev.md). |
| `nanoclaw.sh --uninstall` + `setup/uninstall/` | Uninstall this copy only (slug-scoped): service, containers + image, `data/`, `logs/`, `groups/`, this copy's OneCLI agents. Confirms per group; `--dry-run` previews, `--yes` skips prompts. Other copies and the shared OneCLI app are untouched. Bypasses bootstrap entirely; `uninstall.sh` is a pointer that execs it. |
## Admin CLI (`ncl`)
@@ -151,7 +152,7 @@ Key files: `src/container-restart.ts`, `src/container-runner.ts` (`killContainer
## Secrets / Credentials / OneCLI
API keys, OAuth tokens, and auth credentials are managed by the OneCLI gateway. Secrets are injected into per-agent containers at request time — none are passed in env vars or through chat context. The container agent sees this via the `onecli-gateway` container skill (`container/skills/onecli-gateway/SKILL.md`), which teaches it how the proxy works, how to handle auth errors, and to never ask for raw credentials. Host-side wiring: `src/onecli-approvals.ts`, `ensureAgent()` in `container-runner.ts`. Run `onecli --help`.
API keys, OAuth tokens, and auth credentials are managed by the OneCLI gateway. Secrets are injected into per-agent containers at request time — none are passed in env vars or through chat context. The container agent sees this via the `onecli-gateway` container skill (`container/skills/onecli-gateway/SKILL.md`), which teaches it how the proxy works, how to handle auth errors, and to never ask for raw credentials. Host-side wiring: `src/modules/approvals/onecli-approvals.ts`, `ensureAgent()` in `container-runner.ts`. Run `onecli --help`.
### Secret modes
@@ -192,6 +193,7 @@ Four types of skills. See [CONTRIBUTING.md](CONTRIBUTING.md) for the full taxono
| `/debug` | Container issues, logs, troubleshooting |
| `/update-nanoclaw` | Bring upstream updates into a customized install |
| `/init-onecli` | Install OneCLI Agent Vault and migrate `.env` credentials |
| `/migrate-memory` | Carry a group's agent memory across a provider switch (operator-run, both directions) |
## Contributing
@@ -274,6 +276,10 @@ This project uses pnpm with `minimumReleaseAge: 4320` (3 days) in `pnpm-workspac
| [docs/build-and-runtime.md](docs/build-and-runtime.md) | Runtime split (Node host + Bun container), lockfiles, image build surface, CI, key invariants |
| [docs/v1-to-v2-changes.md](docs/v1-to-v2-changes.md) | v1→v2 architecture diff — vocabulary for where v1 things moved |
| [docs/migration-dev.md](docs/migration-dev.md) | Migration development guide — testing, debugging, dev loop |
| [docs/provider-migration.md](docs/provider-migration.md) | Switching a live agent group between providers (e.g. Claude → Codex) — what carries over, rollback |
| [docs/customizing.md](docs/customizing.md) | Short intro to customizing via skills |
| [docs/skills-model.md](docs/skills-model.md) | The skills model in full: recipes, tests, upgrades, migrations |
| [docs/skill-guidelines.md](docs/skill-guidelines.md) | Authoritative checklist for writing a skill |
## Container Build Cache
+24 -12
@@ -19,6 +19,13 @@
**Not accepted:** Features, capabilities, compatibility, enhancements. These should be skills.
## Breaking Changes
Breaking changes are allowed; **silent** ones are not. NanoClaw does not migrate user installs at runtime — the user's coding agent is the migrator, so every breaking change must ship a migration path that agent can execute without a human reverse-engineering the diff:
1. **Every `[BREAKING]` CHANGELOG entry must reference its migration path** — either a skill to run (`Run /<skill-name> to <action>`) or a `docs/` page covering **detect / why / fix / verify / rollback** (see [docs/onecli-upgrades.md](docs/onecli-upgrades.md) for the shape). `/update-nanoclaw` surfaces these entries after every update and walks the user through them.
2. **If the change moves an external component's sanctioned version** (gateway, pinned CLI binary, …), update its pin in [`versions.json`](versions.json). The changelog stays human-narrative; `versions.json` is the machine-checkable signal — `/update-nanoclaw` diffs it across the update and routes the user to the linked doc for any pin that moved.
## Skills
NanoClaw uses [Claude Code skills](https://code.claude.com/docs/en/skills) — markdown files with optional supporting files that teach Claude how to do something. There are four types of skills in NanoClaw, each serving a different purpose.
@@ -29,26 +36,27 @@ Every user should have clean and minimal code that does exactly what they need.
### Skill types
#### 1. Feature skills (branch-based)
#### 1. Channel and provider skills (registry branches)
Add capabilities to NanoClaw by merging a git branch. The SKILL.md contains setup instructions; the actual code lives on a `skill/*` branch.
Add a messaging channel or an agent provider. The SKILL.md contains the install steps; the actual code lives on a long-lived registry branch (`channels` or `providers`) that we keep in sync with `main`.
**Location:** `.claude/skills/` on `main` (instructions only), code on `skill/*` branch
**Location:** `.claude/skills/` on `main` (instructions only), code on the `channels` or `providers` branch
**Examples:** `/add-telegram`, `/add-slack`, `/add-discord`, `/add-gmail`
**Examples:** `/add-telegram`, `/add-slack`, `/add-discord`, `/add-opencode`
**How they work:**
1. User runs `/add-telegram`
2. Claude follows the SKILL.md: fetches and merges the `skill/telegram` branch
3. Claude walks through interactive setup (env vars, bot creation, etc.)
2. Claude follows the SKILL.md: `git fetch origin channels`, then copies each file in with `git show origin/channels:<path> > <path>`. Install is an additive fetch, never a `git merge`.
3. The adapter's registration test is fetched the same way and run as verification
4. Claude walks through interactive setup (tokens, bot creation, etc.)
**Contributing a feature skill:**
**Contributing a channel or provider skill:**
1. Fork `nanocoai/nanoclaw` and branch from `main`
2. Make the code changes (new files, modified source, updated `package.json`, etc.)
3. Add a SKILL.md in `.claude/skills/<name>/` with setup instructions — step 1 should be merging the branch
4. Open a PR. We'll create the `skill/<name>` branch from your work
2. Build the adapter following [docs/skill-guidelines.md](docs/skill-guidelines.md): a self-registering module, one appended barrel import, and a registration test that imports the real barrel
3. Add a SKILL.md in `.claude/skills/<name>/` with the fetch-and-copy steps, and a REMOVE.md that reverses every change
4. Open a PR. We'll land the code on the registry branch from your work
See `/add-telegram` for a good example. See [docs/skills-as-branches.md](docs/skills-as-branches.md) for the full system design.
See `/add-slack` for a good example. See [docs/skills-model.md](docs/skills-model.md) for why install is a fetch, never a merge.
#### 2. Utility skills (with code files)
@@ -58,7 +66,7 @@ Standalone tools that ship code files alongside the SKILL.md. The SKILL.md tells
**Examples:** a self-contained CLI or helper shipped in a `scripts/` subfolder of the skill.
**Key difference from feature skills:** No branch merge needed. The code is self-contained in the skill directory and gets copied into place during installation.
**Key difference from channel/provider skills:** the code is self-contained in the skill directory and gets copied into place during installation; nothing is fetched from a registry branch.
**Guidelines:**
- Put code in separate files, not inline in the SKILL.md
@@ -93,6 +101,10 @@ Skills that run inside the agent container, not on the host. These teach the con
- Use `allowed-tools` frontmatter to scope tool permissions
- Keep them focused — the agent's context window is shared across all container skills
### Writing a good skill
The authoring bar is [docs/skill-guidelines.md](docs/skill-guidelines.md): mostly adds, minimal reach-ins into existing code, a test for every functional integration point, and a REMOVE.md whenever apply leaves anything behind. [docs/skills-model.md](docs/skills-model.md) explains the model behind it.
### SKILL.md format
All skills use the [Claude Code skills standard](https://code.claude.com/docs/en/skills):
+9 -1
@@ -196,11 +196,19 @@ Ask Claude Code. "Why isn't the scheduler running?" "What's in the recent logs?"
If a step fails, `nanoclaw.sh` hands off to Claude Code to diagnose and resume. If that doesn't resolve it, run `claude`, then `/debug`. If Claude identifies an issue likely to affect other users, open a PR against the relevant setup step or skill.
**How do I uninstall NanoClaw?**
```bash
bash nanoclaw.sh --uninstall
```
Every install is tagged with a per-checkout id, so the uninstaller removes only what belongs to that copy: the background service, containers and image, app data and logs, your agents' files, and this copy's OneCLI vault agents. Shared things — the OneCLI app and your credentials, other NanoClaw copies on the machine — are left alone. It shows exactly what it found and asks for confirmation per group; nothing is deleted until you say yes. Use `--dry-run` to preview without changing anything, or `--yes` to skip the prompts. Your `.env` is backed up before removal. To finish, delete the checkout folder itself.
**What changes will be accepted into the codebase?**
Only security fixes, bug fixes, and clear improvements will be accepted to the base configuration. That's all.
Everything else (new capabilities, OS compatibility, hardware support, enhancements) should be contributed as skills on the `channels` or `providers` branch.
Everything else (new capabilities, OS compatibility, hardware support, enhancements) should be contributed as skills: channel and provider code on the `channels`/`providers` registry branches, everything else as a self-contained skill. See [docs/customizing.md](docs/customizing.md) and [CONTRIBUTING.md](CONTRIBUTING.md).
This keeps the base system minimal and lets every user customize their installation without inheriting features they don't want.
+11 -15
@@ -16,12 +16,11 @@ FROM node:22-slim
# CJK fonts add ~200MB. Opt in only if you render Chinese/Japanese/Korean text.
ARG INSTALL_CJK_FONTS=false
# Pin CLI versions for reproducibility. Bump deliberately — unpinned installs
# mean every rebuild silently picks up the latest and can break in lockstep
# across all users.
ARG CLAUDE_CODE_VERSION=2.1.154
ARG AGENT_BROWSER_VERSION=latest
ARG VERCEL_VERSION=52.2.1
# Pin versions for reproducibility. Bump deliberately — unpinned installs mean
# every rebuild silently picks up the latest and can break in lockstep across
# all users. The global Node CLIs (claude-code, agent-browser, vercel) are
# pinned in cli-tools.json so a skill can add one with a json-merge; Bun (the
# runtime) is pinned here because it installs from a different source.
ARG BUN_VERSION=1.3.12
# ---- System dependencies -----------------------------------------------------
@@ -99,16 +98,13 @@ ENV PATH="$PNPM_HOME:$PATH"
ARG PNPM_VERSION=10.33.0
RUN corepack enable && corepack prepare pnpm@${PNPM_VERSION} --activate
# Global Node CLIs the agent invokes at runtime live in cli-tools.json so a
# skill can add one with a json-merge instead of editing this Dockerfile.
# install-cli-tools.sh installs each via pnpm (pinned), writing the per-tool
# only-built-dependencies opt-ins it reads from the manifest.
COPY cli-tools.json install-cli-tools.sh /tmp/
RUN --mount=type=cache,target=/root/.cache/pnpm \
echo "only-built-dependencies[]=agent-browser" > /root/.npmrc && \
echo "only-built-dependencies[]=@anthropic-ai/claude-code" >> /root/.npmrc && \
pnpm install -g "vercel@${VERCEL_VERSION}"
RUN --mount=type=cache,target=/root/.cache/pnpm \
pnpm install -g "agent-browser@${AGENT_BROWSER_VERSION}"
RUN --mount=type=cache,target=/root/.cache/pnpm \
pnpm install -g "@anthropic-ai/claude-code@${CLAUDE_CODE_VERSION}"
sh /tmp/install-cli-tools.sh /tmp/cli-tools.json
# ---- ncl CLI wrapper ----------------------------------------------------------
# Actual script lives in the mounted source at /app/src/cli/ncl.ts.
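The Dockerfile comments above say a skill adds a global CLI with a json-merge into cli-tools.json instead of editing the Dockerfile. A minimal sketch of what an idempotent merge could look like follows. The array-of-`{ name, version }` manifest shape, the `mergeCliTool` helper, and the placeholder version string are assumptions for illustration, not the repository's actual schema or script.

```typescript
// Hypothetical manifest entry; the real cli-tools.json schema may differ.
interface CliTool {
  name: string;
  version: string;
}

// Return a new manifest in which `tool` appears exactly once.
// Applying the same merge again yields an identical manifest (idempotent).
function mergeCliTool(manifest: CliTool[], tool: CliTool): CliTool[] {
  const exists = manifest.some((t) => t.name === tool.name);
  if (!exists) return [...manifest, tool];
  // Already present: update the pin in place instead of appending a duplicate.
  return manifest.map((t) => (t.name === tool.name ? { ...t, version: tool.version } : t));
}

const base: CliTool[] = [{ name: 'vercel', version: '52.2.1' }];
// '0.0.0-example' is a placeholder pin, not a real release.
const codex: CliTool = { name: '@openai/codex', version: '0.0.0-example' };

const once = mergeCliTool(base, codex);
const twice = mergeCliTool(once, codex);
console.log(once.length, twice.length);
```

Keying on `name` is what keeps a repeated install from growing the manifest; whether a re-run should also overwrite an existing pin is a policy choice this sketch makes arbitrarily.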
+10 -10
@@ -5,7 +5,7 @@
"": {
"name": "nanoclaw-agent-runner",
"dependencies": {
"@anthropic-ai/claude-agent-sdk": "^0.3.154",
"@anthropic-ai/claude-agent-sdk": "^0.3.170",
"@anthropic-ai/sdk": "^0.100.0",
"@modelcontextprotocol/sdk": "^1.29.0",
"cron-parser": "^5.0.0",
@@ -19,23 +19,23 @@
},
},
"packages": {
"@anthropic-ai/claude-agent-sdk": ["@anthropic-ai/claude-agent-sdk@0.3.154", "", { "optionalDependencies": { "@anthropic-ai/claude-agent-sdk-darwin-arm64": "0.3.154", "@anthropic-ai/claude-agent-sdk-darwin-x64": "0.3.154", "@anthropic-ai/claude-agent-sdk-linux-arm64": "0.3.154", "@anthropic-ai/claude-agent-sdk-linux-arm64-musl": "0.3.154", "@anthropic-ai/claude-agent-sdk-linux-x64": "0.3.154", "@anthropic-ai/claude-agent-sdk-linux-x64-musl": "0.3.154", "@anthropic-ai/claude-agent-sdk-win32-arm64": "0.3.154", "@anthropic-ai/claude-agent-sdk-win32-x64": "0.3.154" }, "peerDependencies": { "@anthropic-ai/sdk": ">=0.93.0", "@modelcontextprotocol/sdk": "^1.29.0", "zod": "^4.0.0" } }, "sha512-iEn25urI2QrMPFIhId3h7v/7EG5gsmF7ooe+6EvsAosePeLmpVVerp5nXtHnlmBkMinLecurcPA+OddKw76jYw=="],
"@anthropic-ai/claude-agent-sdk": ["@anthropic-ai/claude-agent-sdk@0.3.170", "", { "optionalDependencies": { "@anthropic-ai/claude-agent-sdk-darwin-arm64": "0.3.170", "@anthropic-ai/claude-agent-sdk-darwin-x64": "0.3.170", "@anthropic-ai/claude-agent-sdk-linux-arm64": "0.3.170", "@anthropic-ai/claude-agent-sdk-linux-arm64-musl": "0.3.170", "@anthropic-ai/claude-agent-sdk-linux-x64": "0.3.170", "@anthropic-ai/claude-agent-sdk-linux-x64-musl": "0.3.170", "@anthropic-ai/claude-agent-sdk-win32-arm64": "0.3.170", "@anthropic-ai/claude-agent-sdk-win32-x64": "0.3.170" }, "peerDependencies": { "@anthropic-ai/sdk": ">=0.93.0", "@modelcontextprotocol/sdk": "^1.29.0", "zod": "^4.0.0" } }, "sha512-pAvhfk+iTodXZ6RF18Kz7BEUWFjL7EcR3tKuhUNdPpE1NAYCR3mSHGbafi72JsrNwKEDIs7FU31z3fqhwy8QzA=="],
"@anthropic-ai/claude-agent-sdk-darwin-arm64": ["@anthropic-ai/claude-agent-sdk-darwin-arm64@0.3.154", "", { "os": "darwin", "cpu": "arm64" }, "sha512-oFW3LD5lYrKAU+AKu27Z8hrzqkrh362qQrwi/i3DxGcud9BXUycsXYjShpDj3D3JZu169UzZuSPhx1Wajmbiwg=="],
"@anthropic-ai/claude-agent-sdk-darwin-arm64": ["@anthropic-ai/claude-agent-sdk-darwin-arm64@0.3.170", "", { "os": "darwin", "cpu": "arm64" }, "sha512-rwfgArIa5WI0QPNqFsRBgvtSI0mrtpynUm0oK6+l6/KX4hcgnYGEzciZR1bOeD9/7sSZlTdIgt+T9alKeZmXcg=="],
"@anthropic-ai/claude-agent-sdk-darwin-x64": ["@anthropic-ai/claude-agent-sdk-darwin-x64@0.3.154", "", { "os": "darwin", "cpu": "x64" }, "sha512-5BgWEueP+cqoctWjZYhCbyltuaV/N2DmKDXD3/69cKaVmJp8XL9OCzlq/HEirA/+Ssjskx6hDUBaOcpuZ3iwQA=="],
"@anthropic-ai/claude-agent-sdk-darwin-x64": ["@anthropic-ai/claude-agent-sdk-darwin-x64@0.3.170", "", { "os": "darwin", "cpu": "x64" }, "sha512-0e58h8UQMtsQxLGIv9r4foxfBFWKZ7NeDtoplLhuD7EwQonehomw1sBXCch77t/IfUS+q5vQ5zv+fOGmap5nLQ=="],
"@anthropic-ai/claude-agent-sdk-linux-arm64": ["@anthropic-ai/claude-agent-sdk-linux-arm64@0.3.154", "", { "os": "linux", "cpu": "arm64" }, "sha512-rRkW4SBL3W7zQvKscCIfIGlmoeuTbMV6dXFbPdmpRGvmYZIs79RpzO6xrGBnnhmm+B7znQ9oHAnffi/2FBgJbA=="],
"@anthropic-ai/claude-agent-sdk-linux-arm64": ["@anthropic-ai/claude-agent-sdk-linux-arm64@0.3.170", "", { "os": "linux", "cpu": "arm64" }, "sha512-gLbaFqcGppFJQd4DLNV4IXoeahejT/p2/M8bSSvRDbla9GOsBr1AxV5XLRyBn1e7xFGozZIAIQr3+1chp7NJgQ=="],
"@anthropic-ai/claude-agent-sdk-linux-arm64-musl": ["@anthropic-ai/claude-agent-sdk-linux-arm64-musl@0.3.154", "", { "os": "linux", "cpu": "arm64" }, "sha512-o2bCQN4Xn3UqCLErC5m4T7u0yYArJYmgFCUFnA6K96DdW2RERvx+gTKXxWuHEBkDO+eMoHLHLxk0u2jGES00Ng=="],
"@anthropic-ai/claude-agent-sdk-linux-arm64-musl": ["@anthropic-ai/claude-agent-sdk-linux-arm64-musl@0.3.170", "", { "os": "linux", "cpu": "arm64" }, "sha512-SRYfQcsXlOq+CD/FqkQBTSHbaD++w73GnnO+NUV9adLYrca3kfetRwWT1iguY1cNS0l34dCR3rlzCPq78vg1Jg=="],
"@anthropic-ai/claude-agent-sdk-linux-x64": ["@anthropic-ai/claude-agent-sdk-linux-x64@0.3.154", "", { "os": "linux", "cpu": "x64" }, "sha512-GpiFF8Ez6PbM3m0gqtCo/FKM346qyRdP7VhbmJzdnbNKTiiUZ66vDQyEUPZPCG24ZkrG4m96KpRIUwY08rHiNg=="],
"@anthropic-ai/claude-agent-sdk-linux-x64": ["@anthropic-ai/claude-agent-sdk-linux-x64@0.3.170", "", { "os": "linux", "cpu": "x64" }, "sha512-Xl/m7TaSC3T5IDBdHrZQ9fCQYyDmPELN34CL+MoyPIf7uSmuZnjE9fUOqDh2Rv26JxWssi1M6X+BBvVuKd6Cpg=="],
"@anthropic-ai/claude-agent-sdk-linux-x64-musl": ["@anthropic-ai/claude-agent-sdk-linux-x64-musl@0.3.154", "", { "os": "linux", "cpu": "x64" }, "sha512-zA7S8Lm6O4QBsUpbhiOht8BgiXHOBBFUIo8ZLK6r5wAatK3Q44syWVxICeyCnR6wqfnkf3cugCw27ycS6vVgaA=="],
"@anthropic-ai/claude-agent-sdk-linux-x64-musl": ["@anthropic-ai/claude-agent-sdk-linux-x64-musl@0.3.170", "", { "os": "linux", "cpu": "x64" }, "sha512-m4+I0qBEk7cxRKS+pL+eoWXbXTFOAo83fQ0tQvap4z/mDMm06IWJtEPoYTaMBwsp32GJWLkHWKbZSBCHZnp2DQ=="],
"@anthropic-ai/claude-agent-sdk-win32-arm64": ["@anthropic-ai/claude-agent-sdk-win32-arm64@0.3.154", "", { "os": "win32", "cpu": "arm64" }, "sha512-cDW1YFbU/PJFlrGXhlAGcbkXt80sEO6WtnH8nN8YHXLn5NWduy2q7o/qC6i8XozgvRGf6t/eMoH7IasGIEDhDw=="],
"@anthropic-ai/claude-agent-sdk-win32-arm64": ["@anthropic-ai/claude-agent-sdk-win32-arm64@0.3.170", "", { "os": "win32", "cpu": "arm64" }, "sha512-IG+8isJNNJKbnnhO7m+PGhfVCg+XoQ/MDxGde5eigFI0WsEfitjuWSWwx82bT9ghxI1aa6qNvI+UPgPcZuo5Fg=="],
"@anthropic-ai/claude-agent-sdk-win32-x64": ["@anthropic-ai/claude-agent-sdk-win32-x64@0.3.154", "", { "os": "win32", "cpu": "x64" }, "sha512-tSKaIIpL72OPg3WfzZTCIl8OJgcbq4qieu8/fDWjsdeQuari9gQMIuEflFphk9HqNsxpSmDqKi8Sm5mW2V566Q=="],
"@anthropic-ai/claude-agent-sdk-win32-x64": ["@anthropic-ai/claude-agent-sdk-win32-x64@0.3.170", "", { "os": "win32", "cpu": "x64" }, "sha512-7cuqSKbHVItPGVwRbd3A0BEJwcNtc7Fhoh6qHN4C6yrmjSrvdYYx3MLvq/VI768/RoG7mAMDxb+j7WfEfoP9BA=="],
"@anthropic-ai/sdk": ["@anthropic-ai/sdk@0.100.0", "", { "dependencies": { "json-schema-to-ts": "^3.1.1", "standardwebhooks": "^1.0.0" }, "peerDependencies": { "zod": "^3.25.0 || ^4.0.0" }, "optionalPeers": ["zod"], "bin": { "anthropic-ai-sdk": "bin/cli" } }, "sha512-cAm3aXm6qAiHIvHxyIIGd6tVmsD2gDqlc2h0R20ijNUzGgVnIN822bit4mKbF6CkuV7qIrLQIPoAepHEpanrQQ=="],
+1 -1
@@ -9,7 +9,7 @@
"test": "bun test"
},
"dependencies": {
"@anthropic-ai/claude-agent-sdk": "^0.3.154",
"@anthropic-ai/claude-agent-sdk": "^0.3.170",
"@anthropic-ai/sdk": "^0.100.0",
"@modelcontextprotocol/sdk": "^1.29.0",
"cron-parser": "^5.0.0",
+7
@@ -27,6 +27,7 @@ import { fileURLToPath } from 'url';
import { loadConfig } from './config.js';
import { buildSystemPromptAddendum } from './destinations.js';
import { ensureMemoryScaffold } from './memory-scaffold.js';
// Providers barrel — each enabled provider self-registers on import.
// Provider skills append imports to providers/index.ts.
import './providers/index.js';
@@ -95,6 +96,12 @@ async function main(): Promise<void> {
effort: config.effort,
});
// Providers that lack native memory opt in via `usesMemoryScaffold`; for them
// the runner creates a persistent memory/ tree in its host-backed workspace at
// boot (idempotent). Default off — the trunk default (Claude) omits the flag
// and keeps its native memory untouched.
if (provider.usesMemoryScaffold) ensureMemoryScaffold();
await runPollLoop({
provider,
providerName,
@@ -5,6 +5,7 @@ import { getUndeliveredMessages } from './db/messages-out.js';
import { getPendingMessages } from './db/messages-in.js';
import { getContinuation, setContinuation } from './db/session-state.js';
import { MockProvider } from './providers/mock.js';
import type { ProviderExchange } from './providers/types.js';
import { runPollLoop } from './poll-loop.js';
beforeEach(() => {
@@ -304,6 +305,7 @@ async function runPollLoopWithTimeout(provider: MockProvider, signal: AbortSigna
provider,
providerName: 'mock',
cwd: '/tmp',
signal,
}),
new Promise<void>((_, reject) => {
signal.addEventListener('abort', () => reject(new Error('aborted')));
@@ -324,6 +326,86 @@ function sleep(ms: number): Promise<void> {
return new Promise((resolve) => setTimeout(resolve, ms));
}
describe('poll loop — exchange hook (onExchangeComplete)', () => {
// A provider that declares the per-exchange hook. The hook call is the
// wiring under test — these tests go red if the poll-loop seam is severed.
// What the provider DOES with an exchange (e.g. write markdown into
// conversations/) ships with the provider, not the runner.
class HookedMockProvider extends MockProvider {
readonly exchanges: ProviderExchange[] = [];
onExchangeComplete(exchange: ProviderExchange): void {
this.exchanges.push(exchange);
}
}
it('reports each exchange to a provider that declares the hook', async () => {
insertMessage('m1', { sender: 'Alice', text: 'please archive this' }, { platformId: 'chan-1', channelType: 'discord' });
const provider = new HookedMockProvider({}, () => '<message to="discord-test">archived answer</message>');
const controller = new AbortController();
const loopPromise = runPollLoopWithTimeout(provider, controller.signal, 2000);
await waitFor(() => provider.exchanges.length > 0, 2000);
controller.abort();
expect(provider.exchanges.length).toBe(1);
const exchange = provider.exchanges[0];
expect(exchange.prompt).toContain('please archive this');
expect(exchange.result).toContain('archived answer');
expect(exchange.continuation).toStartWith('mock-session-');
expect(exchange.status).toBe('completed');
await loopPromise.catch(() => {});
});
it('does not report the internal wrapping-retry nudge as a user prompt', async () => {
insertMessage('m1', { sender: 'Alice', text: 'wrap this later' }, { platformId: 'chan-1', channelType: 'discord' });
let calls = 0;
const provider = new HookedMockProvider({}, () => {
calls += 1;
// First result is unwrapped (triggers the retry nudge), second is wrapped.
return calls === 1 ? 'unwrapped text' : '<message to="discord-test">wrapped now</message>';
});
const controller = new AbortController();
const loopPromise = runPollLoopWithTimeout(provider, controller.signal, 3000);
await waitFor(() => provider.exchanges.length >= 2, 3000);
controller.abort();
// Both exchanges attribute themselves to the real user prompt, never the nudge.
for (const exchange of provider.exchanges) {
expect(exchange.prompt).not.toContain('Your response was not delivered');
expect(exchange.prompt).toContain('wrap this later');
}
expect(provider.exchanges.map((e) => e.status)).toEqual(['undelivered', 'completed']);
await loopPromise.catch(() => {});
});
it('a throwing hook never breaks delivery', async () => {
insertMessage('m1', { sender: 'Alice', text: 'still deliver this' }, { platformId: 'chan-1', channelType: 'discord' });
class ThrowingHookProvider extends MockProvider {
onExchangeComplete(): void {
throw new Error('hook exploded');
}
}
const provider = new ThrowingHookProvider({}, () => '<message to="discord-test">delivered anyway</message>');
const controller = new AbortController();
const loopPromise = runPollLoopWithTimeout(provider, controller.signal, 2000);
await waitFor(() => getUndeliveredMessages().length > 0, 2000);
controller.abort();
const out = getUndeliveredMessages();
expect(out.length).toBe(1);
expect(out[0].content).toContain('delivered anyway');
await loopPromise.catch(() => {});
});
});
describe('poll loop — provider error recovery', () => {
it('writes error to outbound and continues loop on provider throw', async () => {
insertMessage('m1', { sender: 'Alice', text: 'trigger error' }, { platformId: 'chan-1', channelType: 'discord' });
@@ -462,3 +544,76 @@ class InvalidSessionProvider {
};
}
}
describe('poll loop — slash command during active query', () => {
it('aborts the active query when /clear arrives as a follow-up', async () => {
insertMessage('m-active', { sender: 'Alice', text: 'long running request' }, { platformId: 'chan-1', channelType: 'discord' });
const provider = new BlockingProvider();
const controller = new AbortController();
const loopPromise = runPollLoopWithTimeout(provider as unknown as MockProvider, controller.signal, 3000);
await waitFor(() => provider.queries === 1, 2000);
insertMessage('m-clear-active', { sender: 'Alice', text: '/clear' }, { platformId: 'chan-1', channelType: 'discord' });
await waitFor(() => provider.aborts === 1, 2000);
await waitFor(
() => getUndeliveredMessages().some((msg) => JSON.parse(msg.content).text === 'Session cleared.'),
2000,
);
controller.abort();
expect(provider.ends).toBe(0);
expect(getContinuation('mock')).toBeUndefined();
expect(getPendingMessages()).toHaveLength(0);
await loopPromise.catch(() => {});
});
});
/**
* Provider whose query never completes until ended or aborted; used to test
* how the loop interrupts an active stream.
*/
class BlockingProvider {
readonly supportsNativeSlashCommands = false;
queries = 0;
aborts = 0;
ends = 0;
isSessionInvalid(): boolean {
return false;
}
query() {
const owner = this;
this.queries += 1;
let wake: (() => void) | null = null;
let ended = false;
let aborted = false;
return {
push() {},
end: () => {
owner.ends += 1;
ended = true;
wake?.();
},
abort: () => {
owner.aborts += 1;
aborted = true;
wake?.();
},
events: (async function* () {
yield { type: 'activity' as const };
yield { type: 'init' as const, continuation: 'blocking-session' };
while (!ended && !aborted) {
await new Promise<void>((resolve) => {
wake = resolve;
});
wake = null;
}
})(),
};
}
}
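The "a throwing hook never breaks delivery" test above implies that the loop guards the `onExchangeComplete` call. One way such a guard could be written is sketched below. The `Exchange` interface is a local stand-in whose field names come from the test assertions (exact types are assumed), and `safeNotify` is a hypothetical name, not the runner's own helper.

```typescript
// Local stand-in for ProviderExchange; field names follow the tests above,
// the exact types are assumptions.
interface Exchange {
  prompt: string;
  result: string;
  continuation?: string;
  status: 'completed' | 'undelivered';
}

type ExchangeHook = (exchange: Exchange) => void;

// Hypothetical guard: call the hook if the provider declares one, and turn
// any throw into a log line so delivery continues regardless.
function safeNotify(
  hook: ExchangeHook | undefined,
  exchange: Exchange,
  log: (msg: string) => void,
): boolean {
  if (!hook) return false;
  try {
    hook(exchange);
    return true;
  } catch (err) {
    log(`exchange hook failed: ${err instanceof Error ? err.message : String(err)}`);
    return false;
  }
}

const hookLogs: string[] = [];
const seen: Exchange[] = [];
const sample: Exchange = { prompt: 'please archive this', result: 'archived answer', status: 'completed' };
const record = (msg: string): void => {
  hookLogs.push(msg);
};

const reported = safeNotify((e) => { seen.push(e); }, sample, record);
const survived = safeNotify(() => { throw new Error('hook exploded'); }, sample, record);
const absent = safeNotify(undefined, sample, record);
console.log(reported, survived, absent, hookLogs);
```

Returning a boolean instead of rethrowing keeps the hook strictly advisory: the caller can record the failure but has nothing to handle.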
@@ -5,8 +5,11 @@
* send_message(to="agent-name") since agents and channels share the
* unified destinations namespace.
*
* create_agent is admin-only. Non-admin containers never see this tool
* (see mcp-tools/index.ts). The host re-checks permission on receive.
* create_agent writes central-DB state. The host authorizes it by CLI scope:
* trusted owner agent groups (scope 'global') create directly; confined groups
* require admin approval (see src/modules/agent-to-agent/create-agent.ts). This
* tool just writes the outbound request; authorization is enforced host-side,
* not here: the container is untrusted and cannot be relied on to gate itself.
*/
import { writeMessageOut } from '../db/messages-out.js';
import { registerTools } from './server.js';
@@ -32,7 +35,7 @@ export const createAgent: McpToolDefinition = {
tool: {
name: 'create_agent',
description:
'Create a long-lived companion sub-agent (research assistant, task manager, specialist) — the name becomes your destination for it. Admin-only. Fire-and-forget.',
'Create a long-lived companion sub-agent (research assistant, task manager, specialist) — the name becomes your destination for it. May require admin approval before the agent is created. Fire-and-forget.',
inputSchema: {
type: 'object' as const,
properties: {
+11 -15
@@ -13,6 +13,7 @@ import { getCurrentInReplyTo } from '../current-batch.js';
import { findByName, getAllDestinations } from '../destinations.js';
import { getMessageIdBySeq, getRoutingBySeq, writeMessageOut } from '../db/messages-out.js';
import { getSessionRouting } from '../db/session-routing.js';
import { enqueueFileOut } from '../outbox.js';
import { registerTools } from './server.js';
import type { McpToolDefinition } from './types.js';
@@ -156,21 +157,16 @@ export const sendFile: McpToolDefinition = {
const resolvedPath = path.isAbsolute(filePath) ? filePath : path.resolve('/workspace/agent', filePath);
if (!fs.existsSync(resolvedPath)) return err(`File not found: ${filePath}`);
const id = generateId();
const filename = (args.filename as string) || path.basename(resolvedPath);
const outboxDir = path.join('/workspace/outbox', id);
fs.mkdirSync(outboxDir, { recursive: true });
fs.copyFileSync(resolvedPath, path.join(outboxDir, filename));
writeMessageOut({
id,
in_reply_to: getCurrentInReplyTo(),
kind: 'chat',
platform_id: routing.platform_id,
channel_type: routing.channel_type,
thread_id: routing.thread_id,
content: JSON.stringify({ text: (args.text as string) || '', files: [filename] }),
const { id, filename } = enqueueFileOut({
srcPath: resolvedPath,
routing: {
platform_id: routing.platform_id,
channel_type: routing.channel_type,
thread_id: routing.thread_id,
in_reply_to: getCurrentInReplyTo(),
},
text: (args.text as string) || '',
filename: (args.filename as string) || undefined,
});
log(`send_file: ${id} -> ${routing.resolvedName} (${filename})`);
@@ -0,0 +1,53 @@
import { describe, expect, it } from 'bun:test';
import fs from 'fs';
import os from 'os';
import path from 'path';
import { ensureMemoryScaffold } from './memory-scaffold.js';
describe('ensureMemoryScaffold', () => {
it('deterministically creates the memory tree', () => {
const base = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-mem-'));
try {
ensureMemoryScaffold(base);
expect(fs.existsSync(path.join(base, 'memory', 'index.md'))).toBe(true);
expect(fs.existsSync(path.join(base, 'memory', 'system', 'definition.md'))).toBe(true);
expect(fs.existsSync(path.join(base, 'memory', 'memories'))).toBe(true);
expect(fs.existsSync(path.join(base, 'memory', 'data'))).toBe(true);
} finally {
fs.rmSync(base, { recursive: true, force: true });
}
});
it('never touches workspace memory it did not create — CLAUDE.local.md stays untouched', () => {
const base = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-mem-'));
try {
fs.writeFileSync(path.join(base, 'CLAUDE.local.md'), '# group memory\nuser prefers terse replies\n');
ensureMemoryScaffold(base);
// Migration between memory stores is the operator's move (/migrate-memory),
// never a boot side effect.
expect(fs.existsSync(path.join(base, 'memory', 'memories', 'imported-agent-memory.md'))).toBe(false);
expect(fs.readFileSync(path.join(base, 'CLAUDE.local.md'), 'utf-8')).toContain('terse replies');
} finally {
fs.rmSync(base, { recursive: true, force: true });
}
});
it("is idempotent and never clobbers the agent's edits", () => {
const base = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-mem-'));
try {
ensureMemoryScaffold(base);
const indexFile = path.join(base, 'memory', 'index.md');
fs.writeFileSync(indexFile, '# my own index\n');
ensureMemoryScaffold(base);
expect(fs.readFileSync(indexFile, 'utf-8')).toBe('# my own index\n');
} finally {
fs.rmSync(base, { recursive: true, force: true });
}
});
});
@@ -0,0 +1,39 @@
import fs from 'fs';
import path from 'path';
import { fileURLToPath } from 'url';
/**
* Create the agent's persistent memory scaffold, container-side, at boot.
*
* The runner owns its own workspace: it writes the memory tree straight into
* `/workspace/agent` (the host-backed, RW group dir, so it persists across the
* ephemeral container). No host-side step, nothing mounted in.
*
* The default `definition.md` / `index.md` live as real markdown templates next
* to this module (under `memory-templates/`), not as strings in code, so the
* doctrine is editable as markdown and the agent receives an unescaped copy.
* They ship in the mounted `/app/src` tree, so no image change is needed.
*
* Idempotent: only writes what's missing, so the agent's own edits and
* accumulated memory are never clobbered on a later wake. Provider-agnostic:
* the runner makes no assumption about which harness is running; a provider
* opts in via `usesMemoryScaffold`.
*/
const TEMPLATES_DIR = path.join(path.dirname(fileURLToPath(import.meta.url)), 'memory-templates');
export function ensureMemoryScaffold(baseDir = '/workspace/agent'): void {
const memoryDir = path.join(baseDir, 'memory');
const systemDir = path.join(memoryDir, 'system');
for (const dir of [systemDir, path.join(memoryDir, 'memories'), path.join(memoryDir, 'data')]) {
fs.mkdirSync(dir, { recursive: true });
}
copyTemplateIfMissing('definition.md', path.join(systemDir, 'definition.md'));
copyTemplateIfMissing('index.md', path.join(memoryDir, 'index.md'));
}
function copyTemplateIfMissing(template: string, dest: string): void {
if (fs.existsSync(dest)) return;
fs.copyFileSync(path.join(TEMPLATES_DIR, template), dest);
}
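The module above combines two ideas: a provider opts in through `usesMemoryScaffold`, and defaults are written only where nothing exists yet. The sketch below illustrates both against an in-memory map instead of the real filesystem. `ScaffoldProvider`, `FakeDisk`, `writeIfMissing`, and `bootScaffold` are hypothetical names for illustration; the actual module copies template files with `fs`.

```typescript
// Hypothetical stand-ins: a provider capability flag and an in-memory "disk".
interface ScaffoldProvider {
  readonly usesMemoryScaffold?: boolean;
}

type FakeDisk = Map<string, string>;

// Write a default only when the path is absent, so existing edits survive.
function writeIfMissing(disk: FakeDisk, dest: string, contents: string): void {
  if (disk.has(dest)) return;
  disk.set(dest, contents);
}

// Boot gate: providers that omit the flag are left untouched.
function bootScaffold(provider: ScaffoldProvider, disk: FakeDisk): boolean {
  if (!provider.usesMemoryScaffold) return false;
  writeIfMissing(disk, 'memory/index.md', '# Memory Index\n');
  writeIfMissing(disk, 'memory/system/definition.md', '# Agent Memory System\n');
  return true;
}

const disk: FakeDisk = new Map();
bootScaffold({ usesMemoryScaffold: true }, disk); // first wake creates defaults
disk.set('memory/index.md', '# my own index\n'); // the agent edits its index
bootScaffold({ usesMemoryScaffold: true }, disk); // a later wake changes nothing

const untouched: FakeDisk = new Map();
const skipped = bootScaffold({}, untouched); // provider with native memory
console.log(disk.get('memory/index.md'), skipped, untouched.size);
```

Because the check is per file, a later template addition would still reach existing workspaces without overwriting anything the agent has already changed.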
@@ -0,0 +1,22 @@
import { describe, expect, it } from 'bun:test';
import fs from 'fs';
import path from 'path';
// Wiring guard for the memory-scaffold seam: the boot gate in index.ts
// (`if (provider.usesMemoryScaffold) ensureMemoryScaffold()`) is the seam's
// single functional reach-in. The unit tests in memory-scaffold.test.ts drive
// ensureMemoryScaffold directly and stay green if the gate is deleted — this
// test goes red. main() can't be driven in-process (it reads
// /workspace/agent/container.json and enters the poll loop), so the guard is
// structural: gate + import must both be present in the real entry point.
describe('memory scaffold boot wiring', () => {
const indexSrc = fs.readFileSync(path.join(import.meta.dir, 'index.ts'), 'utf-8');
it('gates the scaffold on the provider capability in main()', () => {
expect(indexSrc).toContain('if (provider.usesMemoryScaffold) ensureMemoryScaffold()');
});
it('imports ensureMemoryScaffold from the seam module', () => {
expect(indexSrc).toContain("import { ensureMemoryScaffold } from './memory-scaffold.js'");
});
});
@@ -0,0 +1,23 @@
# Agent Memory System
This editable file defines how your persistent memory works. It is a starting
point, not a contract — reorganize it as the work demands. If the user or another
memory system replaces this definition, follow the replacement.
Start every memory task at `memory/index.md`, then follow the narrowest relevant index.
Treat indexes as core data: keep them accurate and concise.
Every folder of durable memory has its own `index.md` describing its contents.
When an index grows past roughly 20 entries, group related items into subfolders,
and give each new subfolder its own `index.md` linked from the parent.
Use `memory/memories/` for durable facts, project context, people, decisions, and entity notes.
Use `memory/data/` for structured reference data, datasets, tables, and reusable records.
Use entity folders for things that matter: projects, people, places, organizations, decisions.
When the user shares something that should survive future turns, store it in the
smallest useful file; prefer updating an existing file over creating duplicates.
Write concise, source-aware notes; include dates when timing matters.
If a fact is corrected, update the memory and keep only useful history.
When you add, move, or remove memory, update the nearest index.
Before answering from memory, read the relevant index or file instead of guessing;
if memory is missing or uncertain, say so and verify when it matters.
@@ -0,0 +1,5 @@
# Memory Index
- [Memory system definition](system/definition.md)
- [Memories](memories/) - durable facts, people, projects, decisions
- [Data](data/) - structured reference data
+87
@@ -0,0 +1,87 @@
import { describe, it, expect, beforeEach, afterEach } from 'bun:test';
import fs from 'fs';
import os from 'os';
import path from 'path';
import { initTestSessionDb, closeSessionDb } from './db/connection.js';
import { getUndeliveredMessages } from './db/messages-out.js';
import { enqueueFileOut } from './outbox.js';
let outboxDir: string;
let srcDir: string;
beforeEach(() => {
initTestSessionDb();
outboxDir = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-outbox-'));
srcDir = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-src-'));
process.env.NANOCLAW_OUTBOX_DIR = outboxDir;
});
afterEach(() => {
closeSessionDb();
delete process.env.NANOCLAW_OUTBOX_DIR;
fs.rmSync(outboxDir, { recursive: true, force: true });
fs.rmSync(srcDir, { recursive: true, force: true });
});
function writeSrc(name: string, bytes: string): string {
const p = path.join(srcDir, name);
fs.writeFileSync(p, bytes);
return p;
}
describe('enqueueFileOut', () => {
it('stages the file under the outbox and enqueues a messages_out row with files[]', () => {
const src = writeSrc('ig_abc.png', 'PNGDATA');
const { id, filename } = enqueueFileOut({
srcPath: src,
routing: { platform_id: 'chan-1', channel_type: 'discord', thread_id: 'thr-9', in_reply_to: 'm1' },
text: 'here you go',
});
// Bytes staged at <outbox>/<id>/<filename> for the host to read.
const staged = path.join(outboxDir, id, filename);
expect(fs.existsSync(staged)).toBe(true);
expect(fs.readFileSync(staged, 'utf8')).toBe('PNGDATA');
// Exactly one outbound row, carrying the file reference + routing.
const out = getUndeliveredMessages();
expect(out).toHaveLength(1);
const row = out[0];
expect(row.platform_id).toBe('chan-1');
expect(row.channel_type).toBe('discord');
expect(row.thread_id).toBe('thr-9');
expect(row.in_reply_to).toBe('m1');
const content = JSON.parse(row.content);
expect(content.files).toEqual(['ig_abc.png']);
expect(content.text).toBe('here you go');
});
it('defaults filename to the basename and text to empty', () => {
const src = writeSrc('chart.png', 'X');
const { filename } = enqueueFileOut({
srcPath: src,
routing: { platform_id: 'C-1', channel_type: 'slack', thread_id: null },
});
expect(filename).toBe('chart.png');
const row = getUndeliveredMessages()[0];
expect(row.in_reply_to).toBeNull();
const content = JSON.parse(row.content);
expect(content.text).toBe('');
expect(content.files).toEqual(['chart.png']);
});
it('throws when the source file is missing — callers decide how to surface it', () => {
expect(() =>
enqueueFileOut({
srcPath: path.join(srcDir, 'does-not-exist.png'),
routing: { platform_id: 'C-1', channel_type: 'slack', thread_id: null },
}),
).toThrow();
// Nothing enqueued on failure.
expect(getUndeliveredMessages()).toHaveLength(0);
});
});
+68
@@ -0,0 +1,68 @@
/**
* File delivery via the outbox.
*
* A file is delivered in two parts that must stay in lockstep: the bytes are
* staged under `/workspace/outbox/<id>/<filename>` (the host reads them from
* there after polling), and a `messages_out` row carries `{ files: [name] }`
* so the host knows to attach them. This helper owns that contract so the two
* callers, the `send_file` MCP tool (model-driven) and the poll-loop's `file`
* event consumer (harness-generated images), can't drift apart.
*/
import fs from 'fs';
import path from 'path';
import { writeMessageOut } from './db/messages-out.js';
/** Where staged files live. Overridable for tests; production is always the mount. */
function outboxBase(): string {
return process.env.NANOCLAW_OUTBOX_DIR ?? '/workspace/outbox';
}
function generateId(): string {
return `msg-${Date.now()}-${Math.random().toString(36).slice(2, 8)}`;
}
export interface FileOutRouting {
platform_id: string;
channel_type: string;
thread_id: string | null;
in_reply_to?: string | null;
}
export interface EnqueueFileOut {
/** Absolute or already-resolved path to the file to deliver. Must exist. */
srcPath: string;
routing: FileOutRouting;
/** Optional accompanying message text. */
text?: string;
/** Display name; defaults to the basename of `srcPath`. */
filename?: string;
}
/**
* Stage a file into the outbox and enqueue its `messages_out` row.
*
* Throws if `srcPath` cannot be read or copied; callers decide whether that
* should surface to the user (the MCP tool validates existence first; the
* poll-loop consumer logs and moves on so one bad image can't fail the turn).
*/
export function enqueueFileOut(opts: EnqueueFileOut): { id: string; filename: string; seq: number } {
const id = generateId();
const filename = opts.filename ?? path.basename(opts.srcPath);
const outboxDir = path.join(outboxBase(), id);
fs.mkdirSync(outboxDir, { recursive: true });
fs.copyFileSync(opts.srcPath, path.join(outboxDir, filename));
const seq = writeMessageOut({
id,
in_reply_to: opts.routing.in_reply_to ?? null,
kind: 'chat',
platform_id: opts.routing.platform_id,
channel_type: opts.routing.channel_type,
thread_id: opts.routing.thread_id,
content: JSON.stringify({ text: opts.text ?? '', files: [filename] }),
});
return { id, filename, seq };
}
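The doc comment on `enqueueFileOut` says the poll-loop consumer logs and moves on when a file cannot be staged, so one bad image cannot fail the turn. A rough sketch of such a best-effort consumer is shown below. `deliverFileEvent` and its log messages are hypothetical, and the enqueue function is injected (standing in for `enqueueFileOut`) so the example runs without the outbox or database.

```typescript
interface FileRouting {
  platform_id: string;
  channel_type: string;
  thread_id: string | null;
  in_reply_to?: string | null;
}

// Shape of the harness event described in the commit message.
interface FileEvent {
  type: 'file';
  path: string;
}

// Injected stand-in for enqueueFileOut, to keep the sketch self-contained.
type EnqueueFn = (opts: { srcPath: string; routing: FileRouting }) => { id: string; filename: string };

// Hypothetical consumer: never throws; reports success as a boolean.
function deliverFileEvent(
  event: FileEvent,
  routing: FileRouting | undefined,
  enqueue: EnqueueFn,
  log: (msg: string) => void,
): boolean {
  if (!routing) {
    log(`file event dropped, no reply destination: ${event.path}`);
    return false;
  }
  try {
    const { id, filename } = enqueue({ srcPath: event.path, routing });
    log(`file event delivered: ${id} (${filename})`);
    return true;
  } catch (err) {
    log(`file event failed: ${err instanceof Error ? err.message : String(err)}`);
    return false;
  }
}

const fileLogs: string[] = [];
const recordLog = (msg: string): void => {
  fileLogs.push(msg);
};
const dest: FileRouting = { platform_id: 'chan-1', channel_type: 'discord', thread_id: null };
// Fake enqueue: fails for one path to simulate an unreadable file.
const fakeEnqueue: EnqueueFn = ({ srcPath }) => {
  if (srcPath.endsWith('missing.png')) throw new Error('ENOENT');
  return { id: 'msg-1', filename: srcPath.split('/').pop() ?? srcPath };
};

const delivered = deliverFileEvent({ type: 'file', path: '/tmp/ig_abc.png' }, dest, fakeEnqueue, recordLog);
const unreadable = deliverFileEvent({ type: 'file', path: '/tmp/missing.png' }, dest, fakeEnqueue, recordLog);
const noDest = deliverFileEvent({ type: 'file', path: '/tmp/ig_abc.png' }, undefined, fakeEnqueue, recordLog);
console.log(delivered, unreadable, noDest, fileLogs);
```

Keeping the throw inside `enqueueFileOut` and the catch in the consumer lets the `send_file` tool surface the same failure to the model while the poll loop stays silent about it.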
+99 -8
@@ -14,7 +14,8 @@ import {
type RoutingContext,
} from './formatter.js';
import { isUploadTraceCommand, uploadTrace } from './upload-trace.js';
import type { AgentProvider, AgentQuery, ProviderEvent } from './providers/types.js';
import { enqueueFileOut } from './outbox.js';
import type { AgentProvider, AgentQuery, ProviderEvent, ProviderExchange } from './providers/types.js';
const POLL_INTERVAL_MS = 1000;
const ACTIVE_POLL_INTERVAL_MS = 500;
@@ -63,6 +64,12 @@ export interface PollLoopConfig {
systemContext?: {
instructions?: string;
};
/**
* Optional stop signal. In production the loop runs until the container
* dies; tests pass a signal so an abandoned loop actually exits instead of
* polling forever and stealing messages from the next test's DB.
*/
signal?: AbortSignal;
}
/**
@@ -107,6 +114,7 @@ export async function runPollLoop(config: PollLoopConfig): Promise<void> {
let pollCount = 0;
let isFirstPoll = true;
while (true) {
if (config.signal?.aborted) return;
// Skip system messages — they're responses for MCP tools (e.g., ask_user_question)
const messages = getPendingMessages(isFirstPoll).filter((m) => m.kind !== 'system');
isFirstPoll = false;
@@ -232,7 +240,15 @@ export async function runPollLoop(config: PollLoopConfig): Promise<void> {
// can stamp it on outbound rows — needed for a2a return-path routing.
setCurrentInReplyTo(routing.inReplyTo);
try {
const result = await processQuery(query, routing, processingIds, config.providerName);
const result = await processQuery(
query,
routing,
processingIds,
config.providerName,
config.provider.onExchangeComplete?.bind(config.provider),
prompt,
continuation,
);
if (result.continuation && result.continuation !== continuation) {
continuation = result.continuation;
setContinuation(config.providerName, continuation);
@@ -313,10 +329,18 @@ async function processQuery(
routing: RoutingContext,
initialBatchIds: string[],
providerName: string,
onExchangeComplete: ((exchange: ProviderExchange) => void) | undefined,
initialPrompt: string,
initialContinuation: string | undefined,
): Promise<QueryResult> {
let queryContinuation: string | undefined;
let done = false;
let unwrappedNudged = false;
// Prompt queue for the exchange hook — each result event consumes the
// oldest unanswered prompt, except a wrapping-retry result, which answers
// the same prompt again. Unused (and unmaintained) when the provider
// doesn't implement `onExchangeComplete`.
const archivePrompts: string[] = [initialPrompt];
// Concurrent polling: push follow-ups into the active query as they arrive.
// We do NOT force-end the stream on silence — keeping the query open avoids
@@ -342,13 +366,16 @@ async function processQuery(
// resume id (fixed at sdkQuery() time); admin/passthrough commands
// (/compact, /cost, …) only dispatch when they're the first input
// of a query — pushed mid-stream they arrive as plain text and
// the SDK never runs them. End the stream and leave the rows
// pending; the outer loop handles them on next iteration via the
// canonical command path + formatMessagesWithCommands.
// the SDK never runs them. Abort the active stream and leave the
// rows pending; the outer loop handles them on next iteration via
// the canonical command path + formatMessagesWithCommands. Abort,
// not end: end() lets an in-flight turn run to completion, which
// can block the command (e.g. /clear during a long task) for as
// long as the turn takes.
if (pending.some((m) => isRunnerCommand(m))) {
log('Pending slash command — ending stream so outer loop can process');
log('Pending slash command — aborting active stream so outer loop can process');
endedForCommand = true;
query.end();
query.abort();
return;
}
@@ -393,6 +420,7 @@ async function processQuery(
log(`Pushing ${keep.length} follow-up message(s) into active query`);
unwrappedNudged = false;
query.push(prompt);
archivePrompts.push(prompt);
markCompleted(keptIds);
} catch (err) {
// Without this catch the rejection escapes the void IIFE and Node
@@ -456,7 +484,14 @@ async function processQuery(
markCompleted(initialBatchIds);
if (event.text) {
const { hasUnwrapped } = dispatchResultText(event.text, routing);
if (hasUnwrapped && !unwrappedNudged) {
const willRetryWrapping = hasUnwrapped && !unwrappedNudged;
notifyExchangeComplete(onExchangeComplete, {
prompt: archivePrompts[0] ?? initialPrompt,
result: event.text,
continuation: queryContinuation ?? initialContinuation,
status: hasUnwrapped ? 'undelivered' : 'completed',
});
if (willRetryWrapping) {
unwrappedNudged = true;
const destinations = getAllDestinations();
const names = destinations.map((d) => d.name).join(', ');
@@ -467,9 +502,25 @@ async function processQuery(
`Please re-send your response with the correct wrapping.</system>`,
);
}
// The wrapping-retry result answers the SAME user prompt — keep it
// queued so the retry archives against it, not the nudge text.
if (!willRetryWrapping) archivePrompts.shift();
} else {
archivePrompts.shift();
}
} else if (event.type === 'file') {
deliverHarnessFile(event.path, routing);
}
}
} catch (err) {
const errMsg = err instanceof Error ? err.message : String(err);
notifyExchangeComplete(onExchangeComplete, {
prompt: archivePrompts[0] ?? initialPrompt,
result: `Error: ${errMsg}`,
continuation: queryContinuation ?? initialContinuation,
status: 'error',
});
throw err;
} finally {
done = true;
clearInterval(pollHandle);
@@ -478,6 +529,18 @@ async function processQuery(
return { continuation: queryContinuation };
}
function notifyExchangeComplete(
hook: ((exchange: ProviderExchange) => void) | undefined,
exchange: ProviderExchange,
): void {
if (!hook) return;
try {
hook(exchange);
} catch (err) {
log(`onExchangeComplete failed: ${err instanceof Error ? err.message : String(err)}`);
}
}
function handleEvent(event: ProviderEvent, _routing: RoutingContext): void {
switch (event.type) {
case 'init':
@@ -497,6 +560,34 @@ function handleEvent(event: ProviderEvent, _routing: RoutingContext): void {
}
}
/**
* Deliver a harness-generated file (e.g. a Codex-rendered image) to the
* batch's reply destination. The model never sends these itself (its native
* client already rendered them), so the loop delivers them via the same outbox
* path send_file uses. Best-effort: a missing reply destination or an
* unreadable file logs and is skipped rather than failing the whole turn.
*/
function deliverHarnessFile(filePath: string, routing: RoutingContext): void {
if (!routing.platformId || !routing.channelType) {
log(`Dropping harness file ${filePath}: batch has no reply destination`);
return;
}
try {
const { filename, seq } = enqueueFileOut({
srcPath: filePath,
routing: {
platform_id: routing.platformId,
channel_type: routing.channelType,
thread_id: routing.threadId,
in_reply_to: routing.inReplyTo,
},
});
log(`Delivered harness file #${seq} -> ${routing.channelType}:${routing.platformId} (${filename})`);
} catch (err) {
log(`Failed to deliver harness file ${filePath}: ${err instanceof Error ? err.message : String(err)}`);
}
}
/**
* Parse the agent's final text for <message to="name">...</message> blocks
* and dispatch each one to its resolved destination. Text outside of blocks
@@ -6,6 +6,25 @@ export interface AgentProvider {
*/
readonly supportsNativeSlashCommands: boolean;
/**
* Optional. When true, the runner scaffolds a persistent `memory/` tree in the
* agent's workspace at boot. Providers with their own native memory (e.g.
* Claude's `CLAUDE.local.md`) omit this and get nothing; memory is opt-in per
* provider, never gated on a provider name.
*/
readonly usesMemoryScaffold?: boolean;
/**
* Optional. Called by the poll-loop after each completed exchange (a
* result, a wrapping retry, or an error). Providers whose harness keeps no
* on-disk transcript implement this to persist exchanges themselves (e.g.
* markdown into the agent's `conversations/` dir); providers that persist
* and archive their own transcript (e.g. the Claude Agent SDK's `.jsonl`)
* omit it. Best-effort: the loop catches and logs anything it throws. The
* implementation lives with the provider, never in the runner.
*/
onExchangeComplete?(exchange: ProviderExchange): void;
/** Start a new query. Returns a handle for streaming input and output. */
query(input: QueryInput): AgentQuery;
@@ -31,6 +50,16 @@ export interface AgentProvider {
maybeRotateContinuation?(continuation: string, cwd: string): string | null;
}
/** One prompt/result round-trip, as reported to `onExchangeComplete`. */
export interface ProviderExchange {
/** The user prompt this exchange answers (never an internal retry nudge). */
prompt: string;
result: string | null;
/** Continuation/thread id in effect for the exchange, if any. */
continuation?: string;
status: 'completed' | 'undelivered' | 'error';
}
/**
* Options passed to provider constructors. Fields are common to most
* providers; individual providers may ignore any they don't need.
@@ -99,6 +128,13 @@ export type ProviderEvent =
| { type: 'result'; text: string | null }
| { type: 'error'; message: string; retryable: boolean; classification?: string }
| { type: 'progress'; message: string }
/**
* A file the harness produced that the model won't deliver itself (e.g.
* Codex's built-in image generation renders to its native client, so the
* model believes delivery already happened). The poll-loop delivers it to
* the batch's reply destination. `path` is absolute inside the container.
*/
| { type: 'file'; path: string }
/**
* Liveness signal. Providers MUST yield this on every underlying SDK
* event (tool call, thinking, partial message, anything) so the
@@ -0,0 +1,5 @@
[
{ "name": "vercel", "version": "52.2.1" },
{ "name": "agent-browser", "version": "0.27.1", "onlyBuilt": true },
{ "name": "@anthropic-ai/claude-code", "version": "2.1.170", "onlyBuilt": true }
]
@@ -0,0 +1,61 @@
import { describe, it, expect } from 'vitest';
import { readFileSync } from 'node:fs';
import { fileURLToPath } from 'node:url';
import { dirname, join } from 'node:path';
// Guards the cli-tools.json seam: the global CLIs the agent invokes at runtime
// are installed from the manifest (a skill adds one with a json-merge), not
// hand-edited into the Dockerfile. These go red on a bad merge that drops a
// baseline tool, or on dewiring the Dockerfile / switching the installer off
// the pnpm supply-chain path.
const here = dirname(fileURLToPath(import.meta.url));
const manifest = JSON.parse(readFileSync(join(here, 'cli-tools.json'), 'utf8')) as Array<{
name: string;
version: string;
onlyBuilt?: boolean;
}>;
const dockerfile = readFileSync(join(here, 'Dockerfile'), 'utf8');
const installer = readFileSync(join(here, 'install-cli-tools.sh'), 'utf8');
describe('cli-tools manifest', () => {
it('is a non-empty array of { name, version }', () => {
expect(Array.isArray(manifest)).toBe(true);
expect(manifest.length).toBeGreaterThan(0);
for (const tool of manifest) {
expect(typeof tool.name).toBe('string');
expect(tool.name.length).toBeGreaterThan(0);
expect(typeof tool.version).toBe('string');
expect(tool.version.length).toBeGreaterThan(0);
}
});
it('has unique tool names (json-merge is keyed on name)', () => {
const names = manifest.map((t) => t.name);
expect(new Set(names).size).toBe(names.length);
});
it('pins every version to an exact semver (no latest, no ranges — supply-chain policy)', () => {
for (const tool of manifest) {
expect(tool.version, `${tool.name} must be an exact semver, not "${tool.version}"`).toMatch(
/^\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.-]+)?$/,
);
}
});
it('keeps the baseline CLIs the agent depends on', () => {
const names = manifest.map((t) => t.name);
for (const required of ['vercel', 'agent-browser', '@anthropic-ai/claude-code']) {
expect(names).toContain(required);
}
});
it('is wired into the Dockerfile build (COPY manifest + run installer)', () => {
expect(dockerfile).toMatch(/COPY cli-tools\.json install-cli-tools\.sh/);
expect(dockerfile).toMatch(/install-cli-tools\.sh \/tmp\/cli-tools\.json/);
});
it('installs via pnpm and writes only-built opt-ins (preserves the supply-chain path)', () => {
expect(installer).toMatch(/pnpm install -g/);
expect(installer).toMatch(/only-built-dependencies\[\]=/);
});
});
@@ -0,0 +1,29 @@
#!/bin/sh
# Install the global Node CLIs the agent invokes at runtime, from cli-tools.json.
#
# A skill adds a tool by appending a { "name", "version" } entry to that
# manifest (a json-merge) instead of editing the Dockerfile — the reach-in
# becomes the safest change shape, deterministic and removable.
#
# Every tool is installed via `pnpm install -g`, pinned to an exact version, so
# the pnpm supply-chain policy still applies. Tools with a native postinstall
# set "onlyBuilt": true to opt in to running build scripts (pnpm skips them by
# default). Run as root before `USER node`, so /root/.npmrc is the right home.
set -eu
MANIFEST="${1:-/tmp/cli-tools.json}"
# Write the per-tool only-built-dependencies opt-ins pnpm reads at install time.
node -e '
const tools = require(process.argv[1]);
const optIns = tools.filter((t) => t.onlyBuilt).map((t) => "only-built-dependencies[]=" + t.name);
require("fs").writeFileSync("/root/.npmrc", optIns.join("\n") + (optIns.length ? "\n" : ""));
' "$MANIFEST"
# Install every tool, pinned. name@version specs never contain spaces, so the
# unquoted expansion word-splits cleanly into positional args.
# shellcheck disable=SC2046
set -- $(node -e 'require(process.argv[1]).forEach((t) => console.log(t.name + "@" + t.version))' "$MANIFEST")
if [ "$#" -gt 0 ]; then
pnpm install -g "$@"
fi
@@ -9,6 +9,5 @@ The files in this directory are original design documents and developer referenc
| [SPEC.md](SPEC.md) | [Architecture](https://docs.nanoclaw.dev/concepts/architecture) |
| [SECURITY.md](SECURITY.md) | [Security model](https://docs.nanoclaw.dev/concepts/security) |
| [REQUIREMENTS.md](REQUIREMENTS.md) | [Introduction](https://docs.nanoclaw.dev/introduction) |
| [skills-as-branches.md](skills-as-branches.md) | [Skills system](https://docs.nanoclaw.dev/integrations/skills-system) |
| [docker-sandboxes.md](docker-sandboxes.md) | [Docker Sandboxes](https://docs.nanoclaw.dev/advanced/docker-sandboxes) |
| [APPLE-CONTAINER-NETWORKING.md](APPLE-CONTAINER-NETWORKING.md) | [Container runtime](https://docs.nanoclaw.dev/advanced/container-runtime) |
@@ -83,6 +83,48 @@ Each NanoClaw group gets its own OneCLI agent identity. This allows different cr
- Any credentials matching blocked patterns
- `.env` is shadowed with `/dev/null` in the project root mount
### 6. Egress Lockdown (Forced Proxy)
The `HTTPS_PROXY` env var only redirects *proxy-aware* clients — a tool that
ignores it (or a raw socket) could reach the internet directly and bypass
credential injection, approvals, and audit. Egress lockdown closes that hole at
the network layer.
**How it works:** agents are placed on a Docker `--internal` network
(`nanoclaw-egress`) that has **no route to the internet**. The OneCLI gateway
container is attached to that network, aliased as `host.docker.internal`, so the
injected proxy URL (`…@host.docker.internal:10255`) resolves to the gateway
*container-to-container*. The gateway is therefore the **only reachable hop**;
anything else has nowhere to go. The agent is non-root with no `NET_ADMIN`, so
it cannot undo this. Identical mechanism on macOS and Linux (no host firewall,
no `host-gateway` route).
- **Self-healing:** the gateway is re-attached to the network at every spawn and
on each host-sweep tick, so an out-of-band detach (e.g. `docker compose up` on
the OneCLI stack — its compose lives in `~/.onecli`, not this repo) recovers
automatically.
- **Fail-fast:** if lockdown is on but the network can't be created or the
gateway can't be attached (e.g. a non-standard gateway container name, or the
gateway isn't running), nanoclaw **refuses to spawn the agent** and surfaces a
clear error — it never silently falls back to open egress. Fix the cause (or
set `NANOCLAW_EGRESS_LOCKDOWN=false`) and retry. The host-sweep re-heal is the
exception: a heal failure there is logged but not fatal, since already-running
agents stay on the internal net (no leak) until the gateway returns.
**Configuration:**
| Env | Default | Meaning |
| --- | --- | --- |
| `NANOCLAW_EGRESS_LOCKDOWN` | `false` | Set `true` to opt in (otherwise the host-gateway path is used). Enabled automatically by `/add-golden-registry`. |
| `NANOCLAW_EGRESS_NETWORK` | `nanoclaw-egress` | Network name. |
| `ONECLI_GATEWAY_CONTAINER` | `onecli` | Gateway container to attach. |
**⚠ Behavior when enabled:** with lockdown on, agents have **no direct
internet** — all traffic must go through OneCLI. Proxy-aware clients (npm, pnpm,
pip, curl, node/bun with the proxy env) are unaffected. Any workflow that relies
on a **non-proxy-aware** tool reaching the internet directly will fail by design.
Lockdown is **off by default**; opt in with `NANOCLAW_EGRESS_LOCKDOWN=true`.
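
The configuration table above can be read as a small resolver. The following is a minimal sketch, assuming a hypothetical `resolveEgressConfig` helper (not a function taken from the NanoClaw codebase). Only the variable names and defaults come from the table; treating the literal string `true` as the sole opt-in value is an assumption.

```typescript
// Hypothetical helper: illustrates the env table, not NanoClaw's actual code.
interface EgressConfig {
  lockdown: boolean;
  network: string;
  gatewayContainer: string;
}

function resolveEgressConfig(env: Record<string, string | undefined>): EgressConfig {
  return {
    // Assumption: only the literal string "true" opts in; anything else is off.
    lockdown: env.NANOCLAW_EGRESS_LOCKDOWN === 'true',
    // Defaults below are the ones listed in the configuration table.
    network: env.NANOCLAW_EGRESS_NETWORK || 'nanoclaw-egress',
    gatewayContainer: env.ONECLI_GATEWAY_CONTAINER || 'onecli',
  };
}
```

With an empty environment this resolves to lockdown off, network `nanoclaw-egress`, and gateway container `onecli`, matching the documented defaults.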
## Privilege Comparison
| Capability | Main Group | Non-Main Group |
@@ -668,15 +668,19 @@ CREATE TABLE agent_groups (
);
-- Platform groups/channels (WhatsApp group, Slack channel, Discord channel, email thread, etc.)
-- One row per chat PER ADAPTER INSTANCE. instance defaults to channel_type
-- (the "default instance"), so single-instance installs never see it.
CREATE TABLE messaging_groups (
id TEXT PRIMARY KEY,
channel_type TEXT NOT NULL, -- 'whatsapp', 'slack', 'discord', 'telegram', 'email'
platform_id TEXT NOT NULL, -- platform-specific ID (JID, channel ID, etc.)
instance TEXT NOT NULL, -- adapter-instance name; default = channel_type
name TEXT,
is_group INTEGER DEFAULT 0,
unknown_sender_policy TEXT NOT NULL DEFAULT 'strict', -- 'strict' | 'request_approval' | 'public'
created_at TEXT NOT NULL,
UNIQUE(channel_type, platform_id)
denied_at TEXT,
UNIQUE(channel_type, platform_id, instance)
);
-- Users (messaging platform identities, namespaced "<channel_type>:<handle>")
@@ -0,0 +1,36 @@
# Customizing NanoClaw
NanoClaw is made to be forked and changed. The catch with most projects is that once you edit the code, every upstream update turns into a merge fight, and the more you customized, the worse it gets.
NanoClaw avoids that with one simple idea: **every change you make is a skill.**
## The idea in a minute
- A **skill** is a small, self-contained add-on. It brings its own code and knows how to install itself.
- Your **fork is just a list of skills**, plus one "recipe" that says which skills you have and how they fit together.
- Because your changes live beside the core instead of tangled into it, **pulling in updates stays easy**.
## What makes it work
A good skill mostly **adds** things: new files, a line appended to an existing file, a dependency. It avoids rewriting existing code in place.
And it ships a test for each spot where it touches the rest of the system. When an update moves something your skill depends on, that test fails and points at the fix, instead of you finding out when things break in production.
## How you actually work
You don't have to think in skills while you're building. **Edit the code directly, get it working, then turn your changes into skills afterward.** A coding agent does the conversion for you, following [skill-guidelines.md](skill-guidelines.md).
The only rule worth remembering: **a change isn't really part of your fork until it's a skill**, because that's the form that survives an upgrade.
## Upgrading
Always upgrade by running `/update-nanoclaw`. **Don't just `git pull`.** The command sets a rollback point, pulls the upstream changes, runs your tests, and walks you through anything that needs fixing, usually a small, local fix in one skill.
## The deal
We keep the core small and stable, and every breaking change ships with its migration. You keep your changes as skills, with tests. Do that, and upgrades won't break you. Changes edited directly into the core are the one thing the model can't protect.
## Go deeper
- **[The skills model in full](skills-model.md)**: how skills, recipes, tests, and upgrades work under the hood.
- **[Skill guidelines](skill-guidelines.md)**: the authoritative checklist for writing one.
@@ -27,21 +27,24 @@ CREATE TABLE agent_groups (
### 1.2 `messaging_groups`
One row per platform chat (one WhatsApp group, one Slack channel, one 1:1 DM, etc.).
One row per platform chat (one WhatsApp group, one Slack channel, one 1:1 DM, etc.) per adapter instance.
```sql
CREATE TABLE messaging_groups (
id TEXT PRIMARY KEY,
channel_type TEXT NOT NULL,
platform_id TEXT NOT NULL,
instance TEXT NOT NULL,
name TEXT,
is_group INTEGER DEFAULT 0,
unknown_sender_policy TEXT NOT NULL DEFAULT 'strict',
created_at TEXT NOT NULL,
UNIQUE(channel_type, platform_id)
denied_at TEXT,
UNIQUE(channel_type, platform_id, instance)
);
```
- `instance`: adapter-instance name — N adapters of one platform (e.g. three Slack apps in one workspace) each own their rows. The default instance IS the channel type: migration 016 backfills `instance = channel_type` and `createMessagingGroup` stamps the same default, so single-instance installs never see the dimension. Inbound lookups are exact-on-instance (an unknown named instance auto-creates its own row); outbound lookups resolve default-instance-first.
- `unknown_sender_policy`: `strict` (drop), `request_approval` (ask admin), `public` (allow).
- **Readers:** `src/router.ts`, `src/delivery.ts`, `src/session-manager.ts`
- **Writers:** `src/db/messaging-groups.ts`, channel setup flows
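
The two lookup rules for `instance` described above can be sketched against an in-memory list of rows. This is an illustration under stated assumptions, not the code in `src/db/messaging-groups.ts`: `MessagingGroupRow`, `findInbound`, and `findOutbound` are hypothetical names, and the outbound fallback to a non-default instance is an assumption of the sketch.

```typescript
// Hypothetical in-memory model of messaging_groups lookups (illustration only).
interface MessagingGroupRow {
  channelType: string;
  platformId: string;
  instance: string; // the default instance equals channelType
}

// Inbound: exact match on (channel_type, platform_id, instance).
function findInbound(
  rows: MessagingGroupRow[],
  channelType: string,
  platformId: string,
  instance: string = channelType,
): MessagingGroupRow | undefined {
  return rows.find(
    (r) => r.channelType === channelType && r.platformId === platformId && r.instance === instance,
  );
}

// Outbound: prefer the default instance. Falling back to another instance's
// row when no default row exists is an assumption of this sketch.
function findOutbound(
  rows: MessagingGroupRow[],
  channelType: string,
  platformId: string,
): MessagingGroupRow | undefined {
  const matches = rows.filter((r) => r.channelType === channelType && r.platformId === platformId);
  return matches.find((r) => r.instance === channelType) ?? matches[0];
}
```

An inbound lookup for an unknown named instance returns nothing here; per the text above, the real system would then auto-create a row for that instance.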
@@ -134,7 +137,7 @@ CREATE TABLE user_dms (
);
```
Populated lazily by `ensureUserDm()` in `src/user-dm.ts`.
Populated lazily by `ensureUserDm()` in `src/user-dm.ts`. Cold DMs resolve via the channel's default adapter instance — `PRIMARY KEY (user_id, channel_type)` is per-platform, not per-instance.
### 1.8 `sessions`
@@ -53,6 +53,80 @@ Model selection considerations for Apple Silicon:
The agent uses tool calls extensively (read/write files, shell commands). Models that support tool use reliably work best. Gemma 4 and Qwen 3 Coder both handle structured tool calls well.
## Allowing Prompt Caching (filter the cache-busting hash)
Out of the box this path is slow — every reply re-reads the whole multi-thousand-token system prompt from scratch, even for a one-word answer. Ollama has a prompt cache that should skip that repeated work, but on this path it never kicks in.
**Cause.** The Claude Agent SDK adds a per-request hash to the front of every prompt — `x-anthropic-billing-header: ...; cch=<hash>;`. It changes on every request, and Ollama's cache only reuses a prompt whose start is unchanged. So that one shifting value at the front makes Ollama treat every prompt as new and re-read all of it. (Ollama ignores the hash itself, so filtering it has no effect on output.)
**Fix.** Run a tiny proxy between the container and Ollama that filters the hash out (pins `cch=<hash>` to a constant). The start of the prompt is now stable, so the cache kicks in and only the new message gets processed. In our setup — a 31B model on Apple Silicon — follow-up replies dropped from ~80s to ~4s; your numbers will vary with model size and hardware. Output is unchanged, since Ollama ignores the value anyway.
Point the agent group's `ANTHROPIC_BASE_URL` at the proxy instead of Ollama directly (everything else from the sections above is unchanged):
```
ANTHROPIC_BASE_URL=http://host.docker.internal:11999 # the proxy
# proxy forwards to http://127.0.0.1:11434 (Ollama)
```
The proxy is ~40 lines of dependency-free Node:
```js
// ollama-cch-proxy.mjs — normalize the SDK's per-request cch nonce so Ollama's
// prefix cache survives across turns. Listens on :11999, forwards to Ollama.
import http from 'node:http';
const TARGET_HOST = process.env.OLLAMA_HOST || '127.0.0.1';
const TARGET_PORT = Number(process.env.OLLAMA_PORT || 11434);
const LISTEN_PORT = Number(process.env.PROXY_PORT || 11999);
const server = http.createServer((req, res) => {
const chunks = [];
req.on('data', (c) => chunks.push(c));
req.on('end', () => {
let body = Buffer.concat(chunks);
if (req.method === 'POST' && body.length) {
body = Buffer.from(body.toString('utf8').replace(/cch=[0-9a-f]+;/g, 'cch=00000;'), 'utf8');
}
const headers = { ...req.headers, host: `${TARGET_HOST}:${TARGET_PORT}`, 'content-length': String(body.length) };
const proxyReq = http.request(
{ host: TARGET_HOST, port: TARGET_PORT, method: req.method, path: req.url, headers },
(proxyRes) => {
res.writeHead(proxyRes.statusCode || 502, proxyRes.headers);
proxyRes.pipe(res);
},
);
proxyReq.on('error', (e) => { res.writeHead(502); res.end(String(e)); });
proxyReq.end(body);
});
});
server.listen(LISTEN_PORT, '0.0.0.0', () => console.log(`cch-proxy :${LISTEN_PORT} -> ${TARGET_HOST}:${TARGET_PORT}`));
```
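
The rewrite step inside the proxy can be checked in isolation. The sketch below lifts the same regular expression into a pure function; `normalizeCch` is a name introduced here for illustration, and the two sample bodies (including their hash values and the `<other fields>` placeholder) are invented, not captured SDK output.

```typescript
// Same rewrite the proxy applies to POST bodies, isolated as a pure function.
function normalizeCch(body: string): string {
  return body.replace(/cch=[0-9a-f]+;/g, 'cch=00000;');
}

// Two made-up request bodies that differ only in the per-request hash.
const turn1 = 'x-anthropic-billing-header: <other fields>; cch=3fa9c1;\nSYSTEM PROMPT';
const turn2 = 'x-anthropic-billing-header: <other fields>; cch=b07e44;\nSYSTEM PROMPT';

// Before normalization the prefixes differ; afterwards they are identical,
// which is what lets a prefix cache reuse the prompt.
const prefixesMatch = normalizeCch(turn1) === normalizeCch(turn2);
```

Note that the pattern only matches lowercase hexadecimal hashes followed by a semicolon, exactly as in the proxy; a hash in another format would pass through unchanged.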
Run it durably so it survives reboots. On Linux, a systemd user service:
```ini
# ~/.config/systemd/user/ollama-cch-proxy.service
[Unit]
Description=Ollama cch-normalizing proxy for NanoClaw
After=network-online.target
[Service]
ExecStart=/usr/bin/node %h/.config/nanoclaw/ollama-cch-proxy.mjs
Restart=always
[Install]
WantedBy=default.target
```
```bash
systemctl --user enable --now ollama-cch-proxy
loginctl enable-linger "$USER" # so it runs without an active login session
```
On macOS use a `launchd` user agent (`~/Library/LaunchAgents/`) running the same script.
**Scope.** This only affects the Claude-Code-CLI → Ollama path described here. Codex and OpenCode don't use the Claude Agent SDK, so they never emit the `cch` hash and get prompt caching for free.
## What Changes at the Code Level
Three files need to support this feature. See `/add-ollama-provider` for the exact changes.
@@ -0,0 +1,83 @@
# Upgrading the OneCLI gateway
NanoClaw talks to the OneCLI gateway (credential vault + egress proxy) through `@onecli-sh/sdk`. The gateway is an external component with its own release line, so NanoClaw pins the **sanctioned gateway version** in [`versions.json`](../versions.json) under `onecli-gateway`. When an update moves that pin, the gateway must be upgraded — this doc is the migration path. It is written to be handed to a coding agent verbatim: detect → upgrade → verify → rollback.
There is deliberately **no runtime version check, and setup does not migrate the gateway for you**: the gateway is a separate out-of-band component, and the migrator is your coding agent running `/update-nanoclaw` — it diffs `versions.json` across the update and routes you here when the `onecli-gateway` pin moved. (Setup detects a pre-`/v1` gateway and points at this doc, but never upgrades it.) Run the steps below verbatim.
## 1. Detect
Find out what is running and what is required:
```bash
cat versions.json # the sanctioned pin
curl -s http://127.0.0.1:10254/api/health # reports the running gateway version
curl -s -o /dev/null -w '%{http_code}' http://127.0.0.1:10254/v1/health
```
If the last command prints `404`, the server predates the `/v1` API that `@onecli-sh/sdk` 2.x requires — every SDK call will fail with 404s that look transient but are permanent. If your gateway is remote, substitute its host for `127.0.0.1` (it's in `.env` as `ONECLI_URL` / `NANOCLAW_ONECLI_API_HOST`).
Why gateways fall behind: the OneCLI installer's docker-compose tracks the `latest` image tag, but Docker never re-pulls a tag — the server freezes at whatever `latest` meant on install day.
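
The detection rule can be written down as a small decision function. This is a minimal sketch, assuming a hypothetical `classifyV1Health` helper that takes the status code printed by the last `curl` command above; treating any code other than `200` or `404` as inconclusive is an assumption of the sketch, not documented gateway behavior.

```typescript
type GatewayApi = 'v1-ready' | 'pre-v1' | 'unknown';

// Hypothetical helper: interprets the HTTP status of GET /v1/health.
function classifyV1Health(status: number): GatewayApi {
  if (status === 200) return 'v1-ready';
  // 404 means the server predates the /v1 API and must be upgraded.
  if (status === 404) return 'pre-v1';
  // Assumption: any other status is inconclusive and needs manual inspection.
  return 'unknown';
}
```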
## 2. Upgrade
The gateway runs as a Docker service in `~/.onecli`. Upgrade just that container to the pinned `onecli-gateway` version — vault data lives in named Docker volumes and survives. This upgrades only the gateway; the CLI binary is pinned separately (see below).
**Local gateway (the common case):**
```bash
cd ~/.onecli && ONECLI_VERSION=<onecli-gateway pin from versions.json> docker compose pull onecli && docker compose up -d
```
**Remote gateway** — run the same command on the gateway's host (NanoClaw can't reach it over SSH).
## 3. Verify
Host-side health is necessary but **not sufficient**:
```bash
curl -s http://127.0.0.1:10254/v1/health # must return {"status":"ok",...}
```
**Verify the bind interface (container reachability).** Agent containers reach the gateway over the docker bridge (`host.docker.internal` → e.g. `172.17.0.1`), so a server bound only to `127.0.0.1` boots clean host-side while every credentialed call from containers dies at the proxy:
```bash
docker run --rm --add-host=host.docker.internal:host-gateway \
curlimages/curl -s -o /dev/null -w '%{http_code}' http://host.docker.internal:10254/v1/health
```
This must print `200`. If it can't connect while the host-side check passed, set the bind address in `~/.onecli/.env` to the docker-bridge IP (or `0.0.0.0` on a host with a closed firewall) and `cd ~/.onecli && docker compose up -d`. Symptom if skipped: host log clean, agents fail all API calls.
Finally, restart the NanoClaw service (per-install names — derive with `setup/lib/install-slug.sh`):
```bash
# macOS
source setup/lib/install-slug.sh && launchctl kickstart -k gui/$(id -u)/$(launchd_label)
# Linux
source setup/lib/install-slug.sh && systemctl --user restart $(systemd_unit)
```
## 4. Rollback
```bash
cd ~/.onecli && ONECLI_VERSION=<old-version> docker compose up -d
```
If the NanoClaw update itself is being rolled back, also pin `@onecli-sh/sdk` back to its previous version in `package.json` and run `pnpm install`. Vault data is unaffected in both directions.
## The CLI binary (`onecli-cli` pin)
The `onecli` host CLI is pinned the same way, under `onecli-cli` in `versions.json`. Setup installs exactly that version by direct release download — it never resolves "latest". When an update moves this pin, replace the binary with the pinned release:
```bash
onecli --version # detect: what is installed
V=<onecli-cli pin from versions.json>
OS=$(uname -s | tr '[:upper:]' '[:lower:]') # darwin | linux
ARCH=$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/') # amd64 | arm64
curl -fsSL -o /tmp/onecli.tgz \
"https://github.com/onecli/onecli-cli/releases/download/v${V}/onecli_${V}_${OS}_${ARCH}.tar.gz"
tar -xzf /tmp/onecli.tgz -C /tmp
install -m 0755 /tmp/onecli "$(command -v onecli || echo ~/.local/bin/onecli)"
onecli --version # verify: must match versions.json
```
To roll back, run the same block after reverting `versions.json` (or checking out the previous NanoClaw version). The CLI is stateless — vault data lives in the gateway, so swapping the binary in either direction loses nothing.
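
The asset URL used in the shell block above follows a fixed pattern. The sketch below rebuilds it in TypeScript, for instance to check a URL before downloading. `normalizeArch` and `cliReleaseUrl` are hypothetical names, and the version `1.2.3` in the usage note is a placeholder, not a real pin.

```typescript
// Mirrors the sed expression 's/x86_64/amd64/;s/aarch64/arm64/' on `uname -m` output.
function normalizeArch(unameM: string): string {
  return unameM.replace('x86_64', 'amd64').replace('aarch64', 'arm64');
}

// Builds the release asset URL from the pinned version, OS, and architecture.
function cliReleaseUrl(version: string, os: 'darwin' | 'linux', arch: string): string {
  return (
    'https://github.com/onecli/onecli-cli/releases/download/' +
    `v${version}/onecli_${version}_${os}_${arch}.tar.gz`
  );
}
```

For example, `cliReleaseUrl('1.2.3', 'linux', normalizeArch('x86_64'))` yields a URL ending in `v1.2.3/onecli_1.2.3_linux_amd64.tar.gz`.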
@@ -0,0 +1,44 @@
# Switching an agent group between providers
How an **operator** moves a live agent group from one agent provider to another (e.g. Claude → Codex) and back. Switching is an operator action: it runs from the host via `ncl groups config update --provider` + restart.
NanoClaw's runtime does not migrate anything when you switch. Provider-neutral state simply stays where it is; provider-specific state (memory, in-flight context) stays with its provider, and carrying memory across is a separate, explicit operator step (`/migrate-memory`, executed by your coding agent).
## Preconditions
1. **The target provider is installed** — run its `/add-<provider>` skill and rebuild the container image (`./container/build.sh`). If the provider isn't installed (or the name is a typo), the container fails at boot and the host surfaces its last words in the logs: look for `Container exited non-zero` with a `stderrTail` like `Unknown provider: codexx. Registered: claude, codex`.
2. **Auth is configured** — each provider documents its own auth in its install skill (for Codex: a ChatGPT-subscription or API-key secret in the OneCLI vault).
## Switching
```bash
ncl groups config update --id <group-id> --provider codex
ncl groups restart --id <group-id>
```
Sessions resolve their provider at container spawn (`sessions.agent_provider` is only set when you've explicitly pinned a session), so existing sessions pick up the new provider on their next wake.
## What carries over automatically
| State | How |
|-------|-----|
| Group identity, wiring, members, roles, destinations | Provider-neutral, in the central DB — untouched |
| Container config (model aside), skills, MCP servers, packages, mounts, cli_scope | Provider-neutral — untouched |
| Workspace files (`groups/<folder>/` — notes, data files the agent created) | Same workspace, mounted for every provider |
| Conversation archives (`conversations/`) | Provider-neutral markdown — readable by the new provider |
| Agent surfaces (system instructions / project docs) | Composed fresh at every spawn from the same sources — nothing to migrate |
## What does NOT carry over
- **Agent memory.** Each provider keeps its own store: Claude's per-group memory is `CLAUDE.local.md` in the workspace; scaffold providers (e.g. Codex) keep a `memory/` tree. Neither is touched by a switch — the old store sits intact, the new provider starts with its own. To carry memory across, run **`/migrate-memory`**: your coding agent reads the source store, distills it into the target store (copy, never move), and restarts the group. Both directions work.
- **In-flight conversation context.** Continuations are provider-specific (a Claude SDK session, a Codex thread) and stored in separate per-provider slots — the new provider starts a fresh thread. The old slot is kept, not deleted. Recent context is recoverable from `conversations/` archives.
- **Provider state dirs** (`.claude-shared/`, `.codex-shared/`). Each provider keeps its own; they sit idle while unused and are reused if you switch back.
## Rolling back
```bash
ncl groups config update --id <group-id> --provider claude
ncl groups restart --id <group-id>
```
Rollback is lossless by construction: the per-provider continuation slot means Claude resumes its previous session (subject to normal transcript-rotation age limits), and `CLAUDE.local.md` was never modified by the switch. Memory written **while on the other provider** lives in that provider's store — run `/migrate-memory` again if you want it carried back.
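
Why per-provider continuation slots make rollback lossless can be seen in a toy model. This is a sketch under stated assumptions: `ContinuationSlots` is a hypothetical class, the ids are invented, and the real storage sits behind the runner's continuation store rather than an in-memory map.

```typescript
// Hypothetical model of per-provider continuation slots (illustration only).
class ContinuationSlots {
  private slots = new Map<string, string>();

  set(provider: string, continuation: string): void {
    this.slots.set(provider, continuation);
  }

  get(provider: string): string | undefined {
    return this.slots.get(provider);
  }
}

const slots = new ContinuationSlots();
slots.set('claude', 'claude-session-1'); // running on Claude

// Switch to Codex: its slot is empty, so it starts a fresh thread.
const codexBefore = slots.get('codex');
slots.set('codex', 'codex-thread-1');

// Roll back to Claude: the old slot was never overwritten.
const claudeAfterRollback = slots.get('claude');
```

Because each provider writes only to its own key, switching never clobbers the other provider's continuation, which is the property the rollback section relies on.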
@@ -187,7 +187,7 @@ leaking the token to disk outweighs the debugging value.
| File | Role |
|---|---|
| `nanoclaw.sh` | Top-level wrapper. Phase 1 (bootstrap) and phase 2 (setup:auto) orchestration. Writes bootstrap's raw log + progression entry. |
| `nanoclaw.sh` | Top-level wrapper. Phase 1 (bootstrap) and phase 2 (setup:auto) orchestration. Writes bootstrap's raw log + progression entry. `--uninstall` bypasses bootstrap entirely — it execs setup:auto directly (the flow lives in `setup/uninstall/`), or prints manual-cleanup guidance and exits 1 when the TS toolchain is missing. |
| `setup.sh` | Phase 1 bootstrap: Node, pnpm, native-module verify. Emits its own `BOOTSTRAP` status block (historically printed to stdout; now goes to the bootstrap raw log). |
| `setup/auto.ts` | Phase 2 driver. Orchestrates the clack UI, step execution, user prompts, and writes to all three log levels for every step it spawns. |
| `setup/logs.ts` | The logging primitives (`logStep`, `logUserInput`, `logComplete`, `stepRawLog`, `initSetupLog`). Single source of truth for level 2/3 formatting and file paths. |
+168
@@ -0,0 +1,168 @@
# Skill guidelines
The authoritative checklist for writing a NanoClaw skill: the bar that conformance tooling and registry review will hold every skill to. [customizing.md](customizing.md) is the short introduction; [skills-model.md](skills-model.md) explains why the model works this way. This document evolves with the system; when a rule here proves wrong, fix the rule.
---
## Principles
Every customization is an additive **skill**: not an edit buried in core, but a skill that carries its own code and knows how to install and remove itself. Two principles make a skill *maintainable*; everything else in this document follows from them.
### 1. Minimal integration surface
A skill adds files and makes the **smallest possible reach-ins** into existing code. Adding a file or a dependency never breaks on upgrade; reaching into existing code is the only thing that does, so the integration surface *is* the upgrade risk. Keep reach-ins few, tiny, and ideally a single line that *calls* into the skill's own code.
Follows from this:
- **Mostly add.** See the change shapes below, in safety order.
- **Push logic into skill-owned files** so the core edit is one call, not an inlined block. This shrinks the surface *and* makes the point testable.
- **Prefer colocated, self-contained edits** over edits in two places.
- **Use an existing registry or hook when there is one**: appending to a registry is a smaller surface than reaching into code. When none exists, a true code-level edit is fine and first-class. (Whether to *add* a hook because a spot has become a hotspot is the maintainer's call, not the skill's.)
### 2. A test for every functional integration point
Every reach-in with a **functional consequence** gets a test that goes **red if the wiring is deleted or drifts**. That's what protects the fork from upstream changes. The tests are also the verification: there is no separate "verify" step.
Follows from this:
- **Tests target integration with core, not internal correctness.** Unit tests of a skill's own logic, or its behavior against an external service, are the creator's call: fine, just not required.
- **A direct unit test doesn't count**: calling the skill's own function bypasses the wiring and stays green when the reach-in is deleted. Drive the real entry, or assert the wiring structurally.
- **Build / typecheck is an always-on leg**: drift (moved imports, renamed fields) is the main enemy and slips past runtime tests.
- **The test lives where the point runs**: host code uses vitest under `src/`; container code uses `bun:test` under `container/agent-runner/`.
- **"Functional" is the filter**: weigh a reach-in by what breaks if it's gone. A cosmetic one (raising a log line's level) gets no test.
The two interlock: a minimal surface keeps the integration points few and testable; a test per point keeps the surface safe. *Maintainable = small surface, every functional point guarded.*
---
## Skill anatomy
A skill carries everything it needs:
- **Code**: the files it adds. They live in the skill's own folder, or, for large registry-backed skills like channels and providers, on a registry branch the skill fetches from. Apply copies them in.
- **Apply**: the steps in `SKILL.md`, written as prose an agent can run. Apply must be safe to re-run: upgrades re-run it, and a skill that half-applies twice is a bug.
- **Remove**: a separate `REMOVE.md` that reverses *every* change apply made: barrel lines deleted (not commented out), every copied file removed including tests, dependencies uninstalled, Dockerfile edits reverted, env lines removed. **REMOVE.md is required exactly when apply leaves anything behind.** A pure instruction-only skill that copies nothing needs none, and an empty one is noise.
- **Tests**: files that ship with the skill and are copied into the project's test tree on apply, so they run against the *composed* system.
- **Recipe entry**: how it composes with the fork's other skills (ordering, dependencies).
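The "safe to re-run" and "deleted, not commented out" requirements above can be sketched for the simplest case, a barrel line. This is an illustrative sketch only: `appendLineOnce`, `removeLine`, and the barrel contents are hypothetical, and a real apply step is prose an agent runs rather than a library call.

```typescript
// Append `line` to `content` only if it is not already present as a whole line.
function appendLineOnce(content: string, line: string): string {
  const lines = content.split('\n');
  if (lines.some((existing) => existing.trim() === line.trim())) {
    return content; // already applied: re-running is a no-op
  }
  const separator = content === '' || content.endsWith('\n') ? '' : '\n';
  return `${content}${separator}${line}\n`;
}

// Reverse of apply: delete the line outright rather than commenting it out.
function removeLine(content: string, line: string): string {
  return content
    .split('\n')
    .filter((existing) => existing.trim() !== line.trim())
    .join('\n');
}

const barrel = "import './slack.js';\n"; // hypothetical barrel contents
const entry = "import './my-channel.js';"; // hypothetical line the skill adds
const once = appendLineOnce(barrel, entry);
const twice = appendLineOnce(once, entry); // second apply changes nothing
const removed = removeLine(twice, entry); // back to the original barrel
```

Applying twice yields the same file as applying once, and remove restores the original text exactly.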
---
## Change shapes
In rough order of safety:
- **Add a file**: safest. New code in the skill's own files, or fetched from a registry branch (`git show origin/<branch>:path > path`).
- **Append to a file**: an import in a barrel, a line in `.env`, an entry at the end of a list.
- **Edit a value in JSON**: e.g. a `package.json` field.
- **Add a dependency**, pinned to an exact version.
- **Insert into existing code (an "integration point")**: the one risky move. Keep it to a line or two that *calls* code living in the skill's own files, never an inlined block of logic. A skill full of these is a smell.
Fetching from a registry branch is **additive, never a merge**. `git fetch origin <branch>` then `git show origin/<branch>:path > path` per file. Never `git merge` a registry branch into an install.
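The "pinned to an exact version" shape in the list above lends itself to a mechanical check. The sketch below assumes dependencies are given as a plain name-to-spec map, as in a `package.json` `dependencies` field; the helper names and the sample package names are hypothetical.

```typescript
// Accept only exact versions such as "1.2.3" or "1.2.3-beta.1".
// Ranges ("^1.2.3", "~1.2.3", ">=1"), tags ("latest") and wildcards ("*", "1.x") fail.
function isExactVersion(spec: string): boolean {
  return /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/.test(spec);
}

// Return the names of dependencies whose spec is not an exact version.
function unpinnedDependencies(deps: Record<string, string>): string[] {
  return Object.entries(deps)
    .filter(([, spec]) => !isExactVersion(spec))
    .map(([name]) => name);
}

const sampleDeps = {
  'pinned-pkg': '1.2.3', // exact: accepted
  'caret-pkg': '^1.2.3', // range: rejected
  'tag-pkg': 'latest', // tag: rejected
};
const offenders = unpinnedDependencies(sampleDeps);
```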
---
## Integration points
The integration point is wherever the skill reaches into existing code. Make it **minimal, colocated, and self-contained**:
- All real logic lives in the skill's own file behind a single entry function; the edit to core is just the call.
- **Prefer one colocated block** over edits in two places. For an inserted call, a dynamic import at the call site keeps the import and call together and avoids touching the top-of-file import block (itself a merge hotspot):
```typescript
const { startDashboard } = await import('./dashboard-pusher.js');
await startDashboard();
```
A static import + call is acceptable too; this is a recommendation, not a mandate.
- Keep any gating (feature flags, env checks) *inside* the skill's function, so the core edit stays a single call.
- When the reach-in lands inside an entangled function, extract a tiny skill-owned helper so the core touch is one line, like `args.push(...mySkillEnvArgs())`, rather than exporting the whole function or inlining the logic.
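A minimal sketch of the last two points, under stated assumptions: `buildContainerArgs` is a stand-in for an entangled core function, the `MY_SKILL_*` variables are hypothetical, and the helper takes `env` as a parameter here (unlike the no-argument call shown above) only so the sketch stays self-contained.

```typescript
type Env = Record<string, string | undefined>;

// Skill-owned helper (hypothetical): all logic and gating live here.
function mySkillEnvArgs(env: Env): string[] {
  // Gating stays inside the skill, so core never grows an `if`.
  if (env['MY_SKILL_ENABLED'] !== '1') return [];
  return ['-e', `MY_SKILL_MODE=${env['MY_SKILL_MODE'] ?? 'default'}`];
}

// Stand-in for the entangled core function: the only skill-related line is the push.
function buildContainerArgs(env: Env): string[] {
  const args = ['run', '--rm'];
  args.push(...mySkillEnvArgs(env)); // the one-line reach-in
  args.push('agent-image');
  return args;
}

const disabled = buildContainerArgs({});
const enabled = buildContainerArgs({ MY_SKILL_ENABLED: '1', MY_SKILL_MODE: 'fast' });
```

Calling `buildContainerArgs` exercises the reach-in, so a behavior test of that function goes red if the `push` line is deleted.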
---
## Testing
**What the standard requires: integration with the NanoClaw system.**
- **Required:** a test for every functional integration point, and, where an added file consumes core (core APIs, data shapes, registries), a test that exercises that consumption against the real core. That's the leg that catches core drift.
- **Optional, the creator's call:** unit tests of the skill's own internal logic, or its behavior against an external service. Often good practice; not what defines a maintainable skill, because they don't protect against upstream changes.
### Choosing the test type
For a code-edit integration point, how you test the wiring depends on whether you can invoke the function the edit lives in. **Prefer behavior; fall back to structure.**
- **If the edit lives in an invocable function, test that function's behavior.** Calling it exercises the edit; remove or break the edit and the test goes red. This is the strongest option, and usually available, because a minimal integration point pushes the logic into the skill's own exported function anyway.
- **If the edit lives in a non-invocable entry point** (e.g. `main()` or boot), **use a structural / AST test.** Use the TypeScript compiler API and assert not just that the symbol exists but its **placement**: awaited, a direct statement of the right function, importing the right module path, correctly ordered. A present-but-misplaced call must go red.
Two more legs apply when relevant:
- **Build / typecheck** always applies: it catches a renamed symbol, a moved module, a bad signature.
- **A behavior test of how added code consumes core**, required when the added file reaches into core APIs or data at runtime. When the consumption is a *typed* call into a core API (a Chat SDK adapter calling `createChatSdkBridge`), the build leg already guards it and no separate behavior test is required. The behavior-test requirement targets runtime consumption: core DB state, data shapes, registries.
Together these cover deletion, misplacement, drift, and core consumption. Only true runtime-reachability (a call stranded behind a dead branch) needs the heavy option of booting the real entry point, a rare "real run" reserved for critical wiring.
### Registration reach-ins: behavior, not structural
A registry queryable at runtime gets a **behavior** test: import the real barrel, assert the registry contains the entry. A structural parse only proves the *source line* exists. It stays green when the barrel can't evaluate or the package isn't installed, which is exactly when the thing is actually broken. The behavior test goes red on a deleted barrel line, a barrel that won't evaluate, *and* an uninstalled package (the unmocked import throws), so it covers the dependency integration point for free.
Two consequences. First, **don't mock the adapter's package in the shipped test**: that would defeat the dependency check, and the test runs in the composed install where the package is present. Second, the only reason to fall back to a structural parse is an adapter with real import-time side effects (spawns a process, opens a socket, needs creds at load), which is an adapter smell to fix, not a reason to weaken the test. Conformant adapters do all side-effectful work in the factory or `setup()`, never at import.
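The difference between a registration guard and a direct import can be sketched in one file. Everything here is a hypothetical stand-in: the `registry` map, the `loadCodexModule` function (modelling a module that self-registers when loaded), and a barrel modelled as a list of loaders rather than real `import` lines.

```typescript
// A runtime-queryable registry (stand-in for the real one).
const registry = new Map<string, () => string>();

// Models a provider module that self-registers when it is loaded.
function loadCodexModule(): void {
  registry.set('codex', () => 'codex provider');
}

// Models the barrel: each entry corresponds to one import line.
function loadBarrel(barrelLines: Array<() => void>): void {
  for (const load of barrelLines) load();
}

// Registration guard: evaluate only the barrel, then query the registry.
function barrelRegisters(barrelLines: Array<() => void>, name: string): boolean {
  registry.clear();
  loadBarrel(barrelLines);
  return registry.has(name);
}

// The trap: loading the provider module directly self-registers it,
// so this check passes whether or not the barrel line exists.
function directImportRegisters(name: string): boolean {
  registry.clear();
  loadCodexModule();
  return registry.has(name);
}

const wired: Array<() => void> = [loadCodexModule];
const barrelLineDeleted: Array<() => void> = [];
```

With the barrel line deleted, `barrelRegisters` turns false while `directImportRegisters` stays true, which is why only the barrel-driven check counts as a guard.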
### Test archetypes
The test matches the kind of integration point:
- **In-process seam with core** (a channel into the router, a pusher into the central DB): drive the real added component against the **real core collaborators** (DB, registry, router), faking only the external edge. The highest-value archetype: it exercises the added file's consumption of core, which is what catches core drift.
- **Wiring / registration** (a barrel import, a `main()` call, an entry in an `mcpServers` map): behavior test via the registry where queryable (see above); structural / AST test where not.
- **Config / container probe** (mounts, Dockerfile, a tool installed in the image): run the change where you can. Spin up a container to confirm a mount or binary. Checking that a line exists in a file is the last resort.
- **Agentic run** (operational, instruction-only skills): run the workflow with a small model; did it complete?
- **Patch behavior** (a patch skill that changes core logic): a behavior test of the changed behavior.
- **Provider (multi-point)**: a non-default agent backend reaches into *two* barrels (host `src/providers/index.ts`; container `container/agent-runner/src/providers/index.ts`), plus Dockerfile edits and a CLI or SDK dependency. Each is a separate way to break, and each needs its own guard. Ship a **barrel-driven registration test per tree** that imports *only* the real barrel and asserts the registry contains the provider. **The trap:** a `*.factory.test.ts` that imports the provider module directly self-registers it and stays green when the barrel line is deleted; that's a unit test, not a registration guard. REMOVE.md must reverse both barrel lines, all copied files in both trees, the dependency, and the Dockerfile edits.
- **Content / instruction-only** (a reference wiki, a pure workflow): makes no functional reach-in, so it owes no integration test. Conformance is anatomy: idempotent apply, plus REMOVE.md iff apply leaves anything behind.
### Dependencies are integration points
A skill that installs a package has made a reach-in: the code now assumes it's there. Guard it so a missing package goes red, in order of preference:
1. **An unmocked import in a behavior test**: the test imports real code that imports the package, so a missing package throws. Covers presence *and* exercises the real dependency.
2. **The build leg**: a typed import of a missing module fails typecheck. The fallback when the package genuinely can't be imported in a test (e.g. it binds a port on import). Only works if the validate step runs the build before or alongside the tests, so verify the order.
3. **A Dockerfile-installed CLI binary** is the case most often left unguarded: it isn't importable, so neither guard above sees it. Use a **structural test** asserting the Dockerfile `ARG <X>_VERSION=` and install line are present, optionally backed by a `<bin> --version` container probe. Pin the version; reject `latest`.
You do *not* need to test the dependency's own API contract; that's optional external-service coverage.
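For the Dockerfile-installed binary case, a structural check might look like the sketch below. It is an assumption-laden illustration: `checkDockerfilePin`, the `MYTOOL_VERSION` name, and the sample Dockerfile are hypothetical, and the check only looks at single-line `RUN` instructions (a `RUN` continued with backslashes would need joining first).

```typescript
interface PinCheck {
  hasArg: boolean; // `ARG <NAME>=<value>` is present
  pinned: boolean; // the value is an exact x.y.z version, not `latest` or empty
  usedInInstall: boolean; // some RUN line references the ARG
}

function checkDockerfilePin(dockerfile: string, argName: string): PinCheck {
  const lines = dockerfile.split('\n').map((line) => line.trim());
  const prefix = `ARG ${argName}=`;
  const argLine = lines.find((line) => line.startsWith(prefix));
  const value = argLine !== undefined ? argLine.slice(prefix.length) : '';
  const braced = '${' + argName + '}';
  const bare = '$' + argName;
  const usedInInstall = lines.some(
    (line) => line.startsWith('RUN ') && (line.includes(braced) || line.includes(bare)),
  );
  return {
    hasArg: argLine !== undefined,
    pinned: /^\d+\.\d+\.\d+$/.test(value),
    usedInInstall,
  };
}

// Hypothetical Dockerfile fragments for illustration.
const pinnedDockerfile = [
  'FROM node:22-slim',
  'ARG MYTOOL_VERSION=1.4.2',
  'RUN npm install -g mytool@${MYTOOL_VERSION}',
].join('\n');
const latestDockerfile = pinnedDockerfile.replace('1.4.2', 'latest');

const good = checkDockerfilePin(pinnedDockerfile, 'MYTOOL_VERSION');
const bad = checkDockerfilePin(latestDockerfile, 'MYTOOL_VERSION');
```

A shipped test would assert all three fields on the real Dockerfile, so a deleted `ARG`, a deleted install line, or a switch to `latest` each goes red.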
### When there is genuinely nothing to test in-tree
Some skills' only functional integration is a runtime operator action with no source footprint: registering an MCP server through `ncl`, or a mount through the sanctioned query wrapper (until the `ncl` add-mount verb lands). There's no line in the tree whose deletion a test could catch, so a registration test is structurally inapplicable. **State this explicitly in SKILL.md** rather than inventing a hollow test; conformance is then anatomy plus the dependency guard. This is a conformant outcome, valid only when the reach-in has no in-tree representation. (A raw-SQL write into core's schema to achieve the same thing is a smell, not a workaround.)
### Test rules
- **Hermetic at the external edge.** Mock genuinely external services (a fake HTTP server, stubbed creds), never the package under guard (see "Registration reach-ins").
- **Exercise the real entry, or assert it structurally.** A test that imports the skill's function directly does not test the integration.
- **Tests travel with the skill** and are copied in on apply; an integration test only means anything against the composed project.
- **Robustness check.** Apply the skill with a small, cheap model. If a small model fumbles the instructions, they're too vague. Fix the instructions, don't blame the model. (Small models also keep applying skills cheap.)
---
## Anti-patterns
Each with its fix. These are patterns to remove, not to test around: a drift-prone, untestable reach-in is usually a symptom of a bad pattern, not a missing test. Reviewers reject them; the conformance linter will flag them automatically.
1. **A separate VERIFY.md.** Delete it; tests are the verification. Fold any genuinely useful manual smoke check into SKILL.md's next steps.
2. **REMOVE.md soft-disable** (comments out an import; leaves copied files behind). DELETE the import line and `rm` every file the skill copied.
3. **REMOVE.md incomplete** (misses env vars, the package uninstall, copied tests). Reverse *every* change; read the env vars from the skill's own credentials section, don't guess.
4. **Raw SQL against a core DB** (read or write). Use a core helper or an `ncl` verb; the in-tree query wrapper is the sanctioned last resort. Never the `sqlite3` binary.
5. **Credential threading** (`-e KEY=…` or a stdin secrets payload into the container). OneCLI gateway only; it injects credentials per request.
6. **Branch-merge install** (`git merge` of a registry branch or any code branch). Install by additive fetch: `git fetch origin <branch>`, then `git show origin/<branch>:path > path` per file. For an update/reapply workflow, re-run each installed skill's additive apply, never merge.
7. **Diff-against-past framing** ("earlier versions…", "this is now redundant") and **documenting non-steps** ("no X needed"). Write present-tense DO steps only. A skill reads as a standalone artifact with no memory of its own edits.
8. **Stale reach-in targets** (an edit aimed at code that no longer exists; a reach-in already shipped in trunk). Verify the target exists *before* instructing the edit; reconcile already-in-trunk ones to a no-op. Before appending to an allowlist or list, check how it's consumed; the entry may already be derived from a registry, making the edit dead.
9. **Hand-maintained duplicate copies** (a mirror directory kept in sync by hand or sed). Generate the mirror from a single canonical source.
---
## Worked examples
In-tree exemplars for the code archetypes. (Two carry known smells, kept deliberately pending architectural fixes; they demonstrate the test shapes, not perfection.)
- `add-dashboard`: in-process seam with core (the pusher against the central DB), plus an AST wiring test for its `main()` call.
- `add-slack`: Chat SDK channel registration; the template for the whole channel family.
- `add-deltachat`: native channel registration.
- `add-atomic-chat-tool`: MCP-tool wiring across both runtimes (container registration and host env-helper call).
- `add-opencode` / `add-codex`: the provider multi-point archetype, with two barrels, Dockerfile pins, and per-tree registration tests.
-677
@@ -1,677 +0,0 @@
# Skills as Branches
## Overview
This document covers **feature skills** — skills that add capabilities via git branch merges. This is the most complex skill type and the primary way NanoClaw is extended.
NanoClaw has four types of skills overall. See [CONTRIBUTING.md](../CONTRIBUTING.md) for the full taxonomy:
| Type | Location | How it works |
|------|----------|-------------|
| **Feature** (this doc) | `.claude/skills/` + `skill/*` branch | SKILL.md has instructions; code lives on a branch, applied via `git merge` |
| **Utility** | `.claude/skills/<name>/` with code files | Self-contained tools; code in skill directory, copied into place on install |
| **Operational** | `.claude/skills/` on `main` | Instruction-only workflows (setup, debug, update) |
| **Container** | `container/skills/` | Loaded inside agent containers at runtime |
---
Feature skills are distributed as git branches on the upstream repository. Applying a skill is a `git merge`. Updating core is a `git merge`. Everything is standard git.
This replaces the previous `skills-engine/` system (three-way file merging, `.nanoclaw/` state, manifest files, replay, backup/restore) with plain git operations and Claude for conflict resolution.
## How It Works
### Repository structure
The upstream repo (`nanocoai/nanoclaw`) maintains:
- `main` — core NanoClaw (no skill code)
- `skill/discord` — main + Discord integration
- `skill/telegram` — main + Telegram integration
- `skill/slack` — main + Slack integration
- `skill/gmail` — main + Gmail integration
- etc.
Each skill branch contains all the code changes for that skill: new files, modified source files, updated `package.json` dependencies, `.env.example` additions — everything. No manifest, no structured operations, no separate `add/` and `modify/` directories.
### Skill discovery and installation
Skills are split into two categories:
**Operational skills** (on `main`, always available):
- `/setup`, `/debug`, `/update-nanoclaw`, `/customize`, `/update-skills`
- These are instruction-only SKILL.md files — no code changes, just workflows
- Live in `.claude/skills/` on `main`, immediately available to every user
**Feature skills** (in marketplace, installed on demand):
- `/add-discord`, `/add-telegram`, `/add-slack`, `/add-gmail`, etc.
- Each has a SKILL.md with setup instructions and a corresponding `skill/*` branch with code
- Live in the marketplace repo (`nanocoai/nanoclaw-skills`)
Users never interact with the marketplace directly. The operational skills `/setup` and `/customize` handle plugin installation transparently:
```bash
# Claude runs this behind the scenes — users don't see it
claude plugin install nanoclaw-skills@nanoclaw-skills --scope project
```
Skills are hot-loaded after `claude plugin install` — no restart needed. This means `/setup` can install the marketplace plugin, then immediately run any feature skill, all in one session.
### Selective skill installation
`/setup` asks users what channels they want, then only offers relevant skills:
1. "Which messaging channels do you want to use?" → Discord, Telegram, Slack, WhatsApp
2. User picks Telegram → Claude installs the plugin and runs `/add-telegram`
3. After Telegram is set up: "Want to add Agent Swarm support for Telegram?" → offers `/add-telegram-swarm`
4. "Want to enable community skills?" → installs community marketplace plugins
Dependent skills (e.g., `telegram-swarm` depends on `telegram`) are only offered after their parent is installed. `/customize` follows the same pattern for post-setup additions.
### Marketplace configuration
NanoClaw's `.claude/settings.json` registers the official marketplace:
```json
{
  "extraKnownMarketplaces": {
    "nanoclaw-skills": {
      "source": {
        "source": "github",
        "repo": "nanocoai/nanoclaw-skills"
      }
    }
  }
}
```
The marketplace repo uses Claude Code's plugin structure:
```
nanocoai/nanoclaw-skills/
  .claude-plugin/
    marketplace.json        # Plugin catalog
  plugins/
    nanoclaw-skills/        # Single plugin bundling all official skills
      .claude-plugin/
        plugin.json         # Plugin manifest
      skills/
        add-discord/
          SKILL.md          # Setup instructions; step 1 is "merge the branch"
        add-telegram/
          SKILL.md
        add-slack/
          SKILL.md
        ...
```
Multiple skills are bundled in one plugin — installing `nanoclaw-skills` makes all feature skills available at once. Individual skills don't need separate installation.
Each SKILL.md tells Claude to merge the corresponding skill branch as step 1, then walks through interactive setup (env vars, bot creation, etc.).
### Applying a skill
User runs `/add-discord` (discovered via marketplace). Claude follows the SKILL.md:
1. `git fetch upstream skill/discord`
2. `git merge upstream/skill/discord`
3. Interactive setup (create bot, get token, configure env vars, etc.)
Or manually:
```bash
git fetch upstream skill/discord
git merge upstream/skill/discord
```
### Applying multiple skills
```bash
git merge upstream/skill/discord
git merge upstream/skill/telegram
```
Git handles the composition. If both skills modify the same lines, it's a real conflict and Claude resolves it.
### Updating core
```bash
git fetch upstream main
git merge upstream/main
```
Since skill branches are kept merged-forward with main (see CI section), the user's merged-in skill changes and upstream changes have proper common ancestors.
### Checking for skill updates
Users who previously merged a skill branch can check for updates. For each `upstream/skill/*` branch, check whether the branch has commits that aren't in the user's HEAD:
```bash
git fetch upstream
for branch in $(git branch -r | grep 'upstream/skill/'); do
  # Check if user has merged this skill at some point
  merge_base=$(git merge-base HEAD "$branch" 2>/dev/null) || continue
  # Check if the skill branch has new commits beyond what the user has
  if ! git merge-base --is-ancestor "$branch" HEAD 2>/dev/null; then
    echo "$branch has updates available"
  fi
done
```
This requires no state — it uses git history to determine which skills were previously merged and whether they have new commits.
This logic is available in two ways:
- Built into `/update-nanoclaw` — after merging main, optionally check for skill updates
- Standalone `/update-skills` — check and merge skill updates independently
### Conflict resolution
At any merge step, conflicts may arise. Claude resolves them — reading the conflicted files, understanding the intent of both sides, and producing the correct result. This is what makes the branch approach viable at scale: conflict resolution that previously required human judgment is now automated.
### Skill dependencies
Some skills depend on other skills. E.g., `skill/telegram-swarm` requires `skill/telegram`. Dependent skill branches are branched from their parent skill branch, not from `main`.
This means `skill/telegram-swarm` includes all of telegram's changes plus its own additions. When a user merges `skill/telegram-swarm`, they get both — no need to merge telegram separately.
Dependencies are implicit in git history — `git merge-base --is-ancestor` determines whether one skill branch is an ancestor of another. No separate dependency file is needed.
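To make the ancestry idea concrete, here is a toy model of the reachability question that `git merge-base --is-ancestor` answers. The commit ids and graph are hypothetical, and this is a plain graph walk for illustration, not how git implements it.

```typescript
// Toy commit graph: each commit maps to its parent commits.
type CommitGraph = Record<string, string[]>;

// True when `ancestor` is reachable from `descendant` by following parents.
// As with git, a commit counts as an ancestor of itself.
function isAncestor(graph: CommitGraph, ancestor: string, descendant: string): boolean {
  const seen = new Set<string>();
  const stack = [descendant];
  while (stack.length > 0) {
    const commit = stack.pop() as string;
    if (commit === ancestor) return true;
    if (seen.has(commit)) continue;
    seen.add(commit);
    stack.push(...(graph[commit] ?? []));
  }
  return false;
}

const graph: CommitGraph = {
  main1: [],
  telegram1: ['main1'], // skill/telegram branches from main
  swarm1: ['telegram1'], // skill/telegram-swarm branches from skill/telegram
  discord1: ['main1'], // skill/discord branches from main
};

const swarmDependsOnTelegram = isAncestor(graph, 'telegram1', 'swarm1');
const discordDependsOnTelegram = isAncestor(graph, 'telegram1', 'discord1');
```

In this model the dependency of `telegram-swarm` on `telegram` falls out of the graph shape alone, with no separate dependency file.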
### Uninstalling a skill
```bash
# Find the merge commit
git log --merges --oneline | grep discord
# Revert it
git revert -m 1 <merge-commit>
```
This creates a new commit that undoes the skill's changes. Claude can handle the whole flow.
If the user has modified the skill's code since merging (custom changes on top), the revert might conflict — Claude resolves it.
If the user later wants to re-apply the skill, they need to revert the revert first (git treats reverted changes as "already applied and undone"). Claude handles this too.
## CI: Keeping Skill Branches Current
A GitHub Action runs on every push to `main`:
1. List all `skill/*` branches
2. For each skill branch, merge `main` into it (merge-forward, not rebase)
3. Run build and tests on the merged result
4. If tests pass, push the updated skill branch
5. If a skill fails (conflict, build error, test failure), open a GitHub issue for manual resolution
**Why merge-forward instead of rebase:**
- No force-push — preserves history for users who already merged the skill
- Users can re-merge a skill branch to pick up skill updates (bug fixes, improvements)
- Git has proper common ancestors throughout the merge graph
**Why this scales:** With a few hundred skills and a few commits to main per day, the CI cost is trivial. Haiku is fast and cheap. The approach that wouldn't have been feasible a year or two ago is now practical because Claude can resolve conflicts at scale.
## Installation Flow
### New users (recommended)
1. Fork `nanocoai/nanoclaw` on GitHub (click the Fork button)
2. Clone your fork:
```bash
git clone https://github.com/<you>/nanoclaw.git
cd nanoclaw
```
3. Run Claude Code:
```bash
claude
```
4. Run `/setup` — Claude handles dependencies, authentication, container setup, service configuration, and adds `upstream` remote if not present
Forking is recommended because it gives users a remote to push their customizations to. Clone-only works for trying things out but provides no remote backup.
### Existing users migrating from clone
Users who previously ran `git clone https://github.com/nanocoai/nanoclaw.git` and have local customizations:
1. Fork `nanocoai/nanoclaw` on GitHub
2. Reroute remotes:
```bash
git remote rename origin upstream
git remote add origin https://github.com/<you>/nanoclaw.git
git push --force origin main
```
The `--force` is needed because the fresh fork's main is at upstream's latest, but the user wants their (possibly behind) version. The fork was just created so there's nothing to lose.
3. From this point, `origin` = their fork, `upstream` = nanocoai/nanoclaw
### Existing users migrating from the old skills engine
Users who previously applied skills via the `skills-engine/` system have skill code in their tree but no merge commits linking to skill branches. Git doesn't know these changes came from a skill, so merging a skill branch on top would conflict or duplicate.
**For new skills going forward:** just merge skill branches as normal. No issue.
**For existing old-engine skills**, two migration paths:
**Option A: Per-skill reapply (keep your fork)**
1. For each old-engine skill: identify and revert the old changes, then merge the skill branch fresh
2. Claude assists with identifying what to revert and resolving any conflicts
3. Custom modifications (non-skill changes) are preserved
**Option B: Fresh start (cleanest)**
1. Create a new fork from upstream
2. Merge the skill branches you want
3. Manually re-apply your custom (non-skill) changes
4. Claude assists by diffing your old fork against the new one to identify custom changes
In both cases:
- Delete the `.nanoclaw/` directory (no longer needed)
- The `skills-engine/` code will be removed from upstream once all skills are migrated
- `/update-skills` only tracks skills applied via branch merge — old-engine skills won't appear in update checks
## User Workflows
### Custom changes
Users make custom changes directly on their main branch. This is the standard fork workflow — their `main` IS their customized version.
```bash
# Make changes
vim src/config.ts
git commit -am "change trigger word to @Bob"
git push origin main
```
Custom changes, skills, and core updates all coexist on their main branch. Git handles the three-way merging at each merge step because it can trace common ancestors through the merge history.
### Applying a skill
Run `/add-discord` in Claude Code (discovered via the marketplace plugin), or manually:
```bash
git fetch upstream skill/discord
git merge upstream/skill/discord
# Follow setup instructions for configuration
git push origin main
```
If the user is behind upstream's main when they merge a skill branch, the merge might bring in some core changes too (since skill branches are merged-forward with main). This is generally fine — they get a compatible version of everything.
### Updating core
```bash
git fetch upstream main
git merge upstream/main
git push origin main
```
This is the same as the existing `/update-nanoclaw` skill's merge path.
### Updating skills
Run `/update-skills` or let `/update-nanoclaw` check after a core update. For each previously-merged skill branch that has new commits, Claude offers to merge the updates.
### Contributing back to upstream
Users who want to submit a PR to upstream:
```bash
git fetch upstream main
git checkout -b my-fix upstream/main
# Make changes
git push origin my-fix
# Create PR from my-fix to nanocoai/nanoclaw:main
```
Standard fork contribution workflow. Their custom changes stay on their main and don't leak into the PR.
## Contributing a Skill
The flow below is for **feature skills** (branch-based). For utility skills (self-contained tools) and container skills, the contributor opens a PR that adds files directly to `.claude/skills/<name>/` or `container/skills/<name>/` — no branch extraction needed. See [CONTRIBUTING.md](../CONTRIBUTING.md) for all skill types.
### Contributor flow (feature skills)
1. Fork `nanocoai/nanoclaw`
2. Branch from `main`
3. Make the code changes (new channel file, modified integration points, updated package.json, .env.example additions, etc.)
4. Open a PR to `main`
The contributor opens a normal PR — they don't need to know about skill branches or marketplace repos. They just make code changes and submit.
### Maintainer flow
When a skill PR is reviewed and approved:
1. Create a `skill/<name>` branch from the PR's commits:
```bash
git fetch origin pull/<PR_NUMBER>/head:skill/<name>
git push origin skill/<name>
```
2. Force-push to the contributor's PR branch, replacing it with a single commit that adds the contributor to `CONTRIBUTORS.md` (removing all code changes)
3. Merge the slimmed PR into `main` (just the contributor addition)
4. Add the skill's SKILL.md to the marketplace repo (`nanocoai/nanoclaw-skills`)
This way:
- The contributor gets merge credit (their PR is merged)
- They're added to CONTRIBUTORS.md automatically by the maintainer
- The skill branch is created from their work
- `main` stays clean (no skill code)
- The contributor only had to do one thing: open a PR with code changes
**Note:** GitHub PRs from forks have "Allow edits from maintainers" checked by default, so the maintainer can push to the contributor's PR branch.
### Skill SKILL.md
The contributor can optionally provide a SKILL.md (either in the PR or separately). This goes into the marketplace repo and contains:
1. Frontmatter (name, description, triggers)
2. Step 1: Merge the skill branch
3. Steps 2-N: Interactive setup (create bot, get token, configure env vars, verify)
If the contributor doesn't provide a SKILL.md, the maintainer writes one based on the PR.
## Community Marketplaces
Anyone can maintain their own fork with skill branches and their own marketplace repo. This enables a community-driven skill ecosystem without requiring write access to the upstream repo.
### How it works
A community contributor:
1. Maintains a fork of NanoClaw (e.g., `alice/nanoclaw`)
2. Creates `skill/*` branches on their fork with their custom skills
3. Creates a marketplace repo (e.g., `alice/nanoclaw-skills`) with a `.claude-plugin/marketplace.json` and plugin structure
### Adding a community marketplace
If the community contributor is trusted, they can open a PR to add their marketplace to NanoClaw's `.claude/settings.json`:
```json
{
"extraKnownMarketplaces": {
"nanoclaw-skills": {
"source": {
"source": "github",
"repo": "nanocoai/nanoclaw-skills"
}
},
"alice-nanoclaw-skills": {
"source": {
"source": "github",
"repo": "alice/nanoclaw-skills"
}
}
}
}
```
Once merged, all NanoClaw users automatically discover the community marketplace alongside the official one.
### Installing community skills
`/setup` and `/customize` ask users whether they want to enable community skills. If yes, Claude installs community marketplace plugins via `claude plugin install`:
```bash
claude plugin install alice-skills@alice-nanoclaw-skills --scope project
```
Community skills are hot-loaded and immediately available — no restart needed. Dependent skills are only offered after their prerequisites are met (e.g., community Telegram add-ons only after Telegram is installed).
Users can also browse and install community plugins manually via `/plugin`.
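The prerequisite gating described above ("dependent skills are only offered after their prerequisites are met") could be sketched roughly as follows. This is an illustration only: the `SkillOffer` shape, its `requires` field, and the skill names are hypothetical and are not taken from NanoClaw's actual data model.

```typescript
// Hypothetical catalog entry; the `requires` field is an assumption for illustration.
interface SkillOffer {
  name: string;
  requires: string[];
}

// Returns the names of skills that are not yet installed and whose
// prerequisites are all installed.
function offerableSkills(catalog: SkillOffer[], installed: Set<string>): string[] {
  return catalog
    .filter((skill) => !installed.has(skill.name))
    .filter((skill) => skill.requires.every((dep) => installed.has(dep)))
    .map((skill) => skill.name);
}

const exampleCatalog: SkillOffer[] = [
  { name: "telegram", requires: [] },
  { name: "telegram-addon", requires: ["telegram"] },
];

// With nothing installed, only the base channel is offered.
console.log(offerableSkills(exampleCatalog, new Set<string>()));
// Once Telegram is installed, its add-on becomes available.
console.log(offerableSkills(exampleCatalog, new Set<string>(["telegram"])));
```

Keeping the check a pure function over the installed set means the same logic can serve both `/setup` and `/customize`.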
### Properties of this system
- **No gatekeeping required.** Anyone can create skills on their fork without permission. They only need approval to be listed in the auto-discovered marketplaces.
- **Multiple marketplaces coexist.** Users see skills from all trusted marketplaces in `/plugin`.
- **Community skills use the same merge pattern.** The SKILL.md just points to a different remote:
```bash
git remote add alice https://github.com/alice/nanoclaw.git
git fetch alice skill/my-cool-feature
git merge alice/skill/my-cool-feature
```
- **Users can also add marketplaces manually.** Even without being listed in settings.json, users can run `/plugin marketplace add alice/nanoclaw-skills` to discover skills from any source.
- **CI is per-fork.** Each community maintainer runs their own CI to keep their skill branches merged-forward. They can use the same GitHub Action as the upstream repo.
## Flavors
A flavor is a curated fork of NanoClaw — a combination of skills, custom changes, and configuration tailored for a specific use case (e.g., "NanoClaw for Sales," "NanoClaw Minimal," "NanoClaw for Developers").
### Creating a flavor
1. Fork `nanocoai/nanoclaw`
2. Merge in the skills you want
3. Make custom changes (trigger word, prompts, integrations, etc.)
4. Your fork's `main` IS the flavor
### Installing a flavor
During `/setup`, users are offered a choice of flavors before any configuration happens. The setup skill reads `flavors.yaml` from the repo (shipped with upstream, always up to date) and presents options:
AskUserQuestion: "Start with a flavor or default NanoClaw?"
- Default NanoClaw
- NanoClaw for Sales — Gmail + Slack + CRM (maintained by alice)
- NanoClaw Minimal — Telegram-only, lightweight (maintained by bob)
If a flavor is chosen:
```bash
git remote add <flavor-name> https://github.com/alice/nanoclaw.git
git fetch <flavor-name> main
git merge <flavor-name>/main
```
Then setup continues normally (dependencies, auth, container, service).
**This choice is only offered on a fresh fork** — when the user's main matches or is close to upstream's main with no local commits. If `/setup` detects significant local changes (re-running setup on an existing install), it skips the flavor selection and goes straight to configuration.
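The "fresh fork" check could look something like the sketch below. It assumes the number of local commits is obtained separately (for example with `git rev-list --count upstream/main..HEAD`); the function name and the threshold parameter are hypothetical, and the real `/setup` logic may weigh other signals.

```typescript
// Hypothetical helper: decides whether /setup should offer flavor selection.
// `localCommitsAhead` is assumed to be the count of commits on the user's main
// that are not on upstream's main. `maxLocalCommits` is an illustrative
// tolerance for "close to upstream"; 0 means "no local commits at all".
function shouldOfferFlavor(localCommitsAhead: number, maxLocalCommits = 0): boolean {
  if (!Number.isInteger(localCommitsAhead) || localCommitsAhead < 0) {
    throw new Error(`invalid commit count: ${localCommitsAhead}`);
  }
  return localCommitsAhead <= maxLocalCommits;
}

console.log(shouldOfferFlavor(0)); // fresh fork: flavor selection is offered
console.log(shouldOfferFlavor(12)); // existing install: selection is skipped
```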
After installation, the user's fork has three remotes:
- `origin`: their fork (push customizations here)
- `upstream`: `nanocoai/nanoclaw` (core updates)
- `<flavor-name>`: the flavor fork (flavor updates)
### Updating a flavor
```bash
git fetch <flavor-name> main
git merge <flavor-name>/main
```
The flavor maintainer keeps their fork updated (merging upstream, updating skills). Users pull flavor updates the same way they pull core updates.
### Flavors registry
`flavors.yaml` lives in the upstream repo:
```yaml
flavors:
- name: NanoClaw for Sales
repo: alice/nanoclaw
description: Gmail + Slack + CRM integration, daily pipeline summaries
maintainer: alice
- name: NanoClaw Minimal
repo: bob/nanoclaw
description: Telegram-only, no container overhead
maintainer: bob
```
Anyone can PR to add their flavor. The file is available locally when `/setup` runs since it's part of the cloned repo.
### Discoverability
- **During setup** — flavor selection is offered as part of the initial setup flow
- **`/browse-flavors` skill** — reads `flavors.yaml` and presents options at any time
- **GitHub topics** — flavor forks can tag themselves with `nanoclaw-flavor` for searchability
- **Discord / website** — community-curated lists
## Migration
Migration from the old skills engine to branches is complete. All feature skills now live on `skill/*` branches, and the skills engine has been removed.
### Skill branches
| Branch | Base | Description |
|--------|------|-------------|
| `skill/whatsapp` | `main` | WhatsApp channel |
| `skill/telegram` | `main` | Telegram channel |
| `skill/slack` | `main` | Slack channel |
| `skill/discord` | `main` | Discord channel |
| `skill/gmail` | `main` | Gmail channel |
| `skill/voice-transcription` | `skill/whatsapp` | OpenAI Whisper voice transcription |
| `skill/image-vision` | `skill/whatsapp` | Image attachment processing |
| `skill/pdf-reader` | `skill/whatsapp` | PDF attachment reading |
| `skill/local-whisper` | `skill/voice-transcription` | Local whisper.cpp transcription |
| `skill/ollama-tool` | `main` | Ollama MCP server for local models |
| `skill/apple-container` | `main` | Apple Container runtime |
| `skill/reactions` | `main` | WhatsApp emoji reactions |
### What was removed
- `skills-engine/` directory (entire engine)
- `scripts/apply-skill.ts`, `scripts/uninstall-skill.ts`, `scripts/rebase.ts`
- `scripts/fix-skill-drift.ts`, `scripts/validate-all-skills.ts`
- `.github/workflows/skill-drift.yml`, `.github/workflows/skill-pr.yml`
- All `add/`, `modify/`, `tests/`, and `manifest.yaml` from skill directories
- `.nanoclaw/` state directory
Operational skills (`setup`, `debug`, `update-nanoclaw`, `customize`, `update-skills`) remain on main in `.claude/skills/`.
## What Changes
### README Quick Start
Before:
```bash
git clone https://github.com/nanocoai/NanoClaw.git
cd NanoClaw
claude
```
After:
```
1. Fork nanocoai/nanoclaw on GitHub
2. git clone https://github.com/<you>/nanoclaw.git
3. cd nanoclaw
4. claude
5. /setup
```
### Setup skill (`/setup`)
Updates to the setup flow:
- Check if `upstream` remote exists; if not, add it: `git remote add upstream https://github.com/nanocoai/nanoclaw.git`
- Check if `origin` points to the user's fork (not nanocoai). If it points to nanocoai, guide them through the fork migration.
- **Install marketplace plugin:** `claude plugin install nanoclaw-skills@nanoclaw-skills --scope project` — makes all feature skills available (hot-loaded, no restart)
- **Ask which channels to add:** present channel options (Discord, Telegram, Slack, WhatsApp, Gmail), run corresponding `/add-*` skills for selected channels
- **Offer dependent skills:** after a channel is set up, offer relevant add-ons (e.g., Agent Swarm after Telegram, voice transcription after WhatsApp)
- **Optionally enable community marketplaces:** ask if the user wants community skills, install those marketplace plugins too
### `.claude/settings.json`
Marketplace configuration so the official marketplace is auto-registered:
```json
{
"extraKnownMarketplaces": {
"nanoclaw-skills": {
"source": {
"source": "github",
"repo": "nanocoai/nanoclaw-skills"
}
}
}
}
```
### Skills directory on main
The `.claude/skills/` directory on `main` retains only operational skills (setup, debug, update-nanoclaw, customize, update-skills). Feature skills (add-discord, add-telegram, etc.) live in the marketplace repo, installed via `claude plugin install` during `/setup` or `/customize`.
### Skills engine removal
The following have been removed (see "What was removed" under Migration):
- `skills-engine/` — entire directory (apply, merge, replay, state, backup, etc.)
- `scripts/apply-skill.ts`
- `scripts/uninstall-skill.ts`
- `scripts/fix-skill-drift.ts`
- `scripts/validate-all-skills.ts`
- `.nanoclaw/` — state directory
- `add/` and `modify/` subdirectories from all skill directories
- Feature skill SKILL.md files from `.claude/skills/` on main (they now live in the marketplace)
Operational skills (`setup`, `debug`, `update-nanoclaw`, `customize`, `update-skills`) remain on main in `.claude/skills/`.
### New infrastructure
- **Marketplace repo** (`nanocoai/nanoclaw-skills`) — single Claude Code plugin bundling SKILL.md files for all feature skills
- **CI GitHub Action** — merge-forward `main` into all `skill/*` branches on every push to `main`, using Claude (Haiku) for conflict resolution
- **`/update-skills` skill** — checks for and applies skill branch updates using git history
- **`CONTRIBUTORS.md`** — tracks skill contributors
### Update skill (`/update-nanoclaw`)
The update skill gets simpler with the branch-based approach. The old skills engine required replaying all applied skills after merging core updates — that entire step disappears. Skill changes are already in the user's git history, so `git merge upstream/main` just works.
**What stays the same:**
- Preflight (clean working tree, upstream remote)
- Backup branch + tag
- Preview (git log, git diff, file buckets)
- Merge/cherry-pick/rebase options
- Conflict preview (dry-run merge)
- Conflict resolution
- Build + test validation
- Rollback instructions
**What's removed:**
- Skill replay step (was needed by the old skills engine to re-apply skills after core update)
- Re-running structured operations (npm deps, env vars — these are part of git history now)
**What's added:**
- Optional step at the end: "Check for skill updates?" which runs the `/update-skills` logic
- This checks whether any previously-merged skill branches have new commits (bug fixes, improvements to the skill itself — not just merge-forwards from main)
**Why users don't need to re-merge skills after a core update:**
When the user merged a skill branch, those changes became part of their git history. When they later merge `upstream/main`, git performs a normal three-way merge — the skill changes in their tree are untouched, and only core changes are brought in. The merge-forward CI ensures skill branches stay compatible with latest main, but that's for new users applying the skill fresh. Existing users who already merged the skill don't need to do anything.
Users only need to re-merge a skill branch if the skill itself was updated (not just merged-forward with main). The `/update-skills` check detects this.
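One way to picture the distinction between "the skill itself was updated" and "merged-forward with main" is sketched below. This is not the actual `/update-skills` implementation, which is not shown here. It assumes the unmerged commits were listed beforehand (for instance with something like `git rev-list --parents upstream/skill/<name> ^HEAD ^upstream/main`) and that merge-forwards appear as merge commits; `SkillCommit` and both function names are hypothetical.

```typescript
// Hypothetical shape: one commit on the skill branch that the user has not merged yet.
interface SkillCommit {
  sha: string;
  parentCount: number; // 2 or more parents means a merge commit
}

// Assumption: merge-forwards from main show up as merge commits, so any
// non-merge commit is treated as a change to the skill itself.
function skillChanges(unmerged: SkillCommit[]): SkillCommit[] {
  return unmerged.filter((commit) => commit.parentCount < 2);
}

// A re-merge is only suggested when at least one real skill change exists.
function needsRemerge(unmerged: SkillCommit[]): boolean {
  return skillChanges(unmerged).length > 0;
}

const onlyMergeForwards: SkillCommit[] = [{ sha: "aaa111", parentCount: 2 }];
const withBugFix: SkillCommit[] = [
  { sha: "aaa111", parentCount: 2 },
  { sha: "bbb222", parentCount: 1 },
];

console.log(needsRemerge(onlyMergeForwards)); // nothing for the user to do
console.log(needsRemerge(withBugFix)); // the skill itself changed
```

This is a simplification: a merge commit that carries a hand-written conflict resolution would be ignored by this filter, so a real check may need to inspect the merge's diff as well.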
## Discord Announcement
### For existing users
> **Skills are now git branches**
>
> We've simplified how skills work in NanoClaw. Instead of a custom skills engine, skills are now git branches that you merge in.
>
> **What this means for you:**
> - Applying a skill: `git fetch upstream skill/discord && git merge upstream/skill/discord`
> - Updating core: `git fetch upstream main && git merge upstream/main`
> - Checking for skill updates: `/update-skills`
> - No more `.nanoclaw/` state directory or skills engine
>
> **We now recommend forking instead of cloning.** This gives you a remote to push your customizations to.
>
> **If you currently have a clone with local changes**, migrate to a fork:
> 1. Fork `nanocoai/nanoclaw` on GitHub
> 2. Run:
> ```
> git remote rename origin upstream
> git remote add origin https://github.com/<you>/nanoclaw.git
> git push --force origin main
> ```
> This works even if you're way behind — just push your current state.
>
> **If you previously applied skills via the old system**, your code changes are already in your working tree — nothing to redo. You can delete the `.nanoclaw/` directory. Future skills and updates use the branch-based approach.
>
> **Discovering skills:** Skills are now available through Claude Code's plugin marketplace. Run `/plugin` in Claude Code to browse and install available skills.
### For skill contributors
> **Contributing skills**
>
> To contribute a skill:
> 1. Fork `nanocoai/nanoclaw`
> 2. Branch from `main` and make your code changes
> 3. Open a regular PR
>
> That's it. We'll create a `skill/<name>` branch from your PR, add you to CONTRIBUTORS.md, and add the SKILL.md to the marketplace. CI automatically keeps skill branches merged-forward with `main` using Claude to resolve any conflicts.
>
> **Want to run your own skill marketplace?** Maintain skill branches on your fork and create a marketplace repo. Open a PR to add it to NanoClaw's auto-discovered marketplaces — or users can add it manually via `/plugin marketplace add`.
+150
@@ -0,0 +1,150 @@
# The skills model
How NanoClaw stays customizable without breaking its forks. This is the full version; [customizing.md](customizing.md) is the short one, and [skill-guidelines.md](skill-guidelines.md) is the authoritative checklist for writing a skill.
## The problem
People fork NanoClaw and change the code. When we ship updates, their changes collide with ours and `git merge` turns into a fight. The more someone customized, the worse it gets. We can't grow the core without breaking everyone downstream.
## The bet
Every customization is a skill: not an edit buried in the core, but a skill that adds the change on top.
The core stays small and stable. Everything else composes on top as skills. Adding your 1st skill and your 500th skill is the same amount of work.
This works for any fork: a personal install with three tweaks, a company build with fifty.
## A fork is a recipe of skills
You don't track your changes as a pile of edits. You track them as skills.
- Each customization = one small skill.
- One "recipe" skill lists all your skills and how they fit together: the order, and any dependencies between them.
So a fork is defined by its recipe. Most upgrades don't need to run it (see "Upgrading"), but it's what lets you rebuild the fork from scratch on clean upstream, and it's how you hand your whole fork to someone else. It replaces every "what did I change" artifact you'd otherwise keep (a migration guide, a manifest, a pile of notes) with one runnable thing.
The recipe is the one fork-specific thing. It lives in your fork, never upstream. (A recipe is itself a skill: a SKILL.md listing the fork's skills in apply order.)
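Since a recipe records apply order and dependencies, one useful sanity check is that every skill's dependencies appear earlier in the list. The sketch below shows that idea only; the `RecipeEntry` shape and the skill names are hypothetical, and a real recipe is a SKILL.md rather than a typed array.

```typescript
// Hypothetical in-memory form of a recipe: skills in apply order,
// each naming the skills it depends on.
interface RecipeEntry {
  skill: string;
  dependsOn: string[];
}

// Walks the recipe in order and reports every dependency that has not
// been applied yet at the point where it is needed.
function findOrderViolations(recipe: RecipeEntry[]): string[] {
  const applied = new Set<string>();
  const violations: string[] = [];
  for (const entry of recipe) {
    for (const dep of entry.dependsOn) {
      if (!applied.has(dep)) {
        violations.push(`${entry.skill} needs ${dep} applied first`);
      }
    }
    applied.add(entry.skill);
  }
  return violations;
}

const goodRecipe: RecipeEntry[] = [
  { skill: "add-telegram", dependsOn: [] },
  { skill: "telegram-patches", dependsOn: ["add-telegram"] },
];
const badRecipe: RecipeEntry[] = [goodRecipe[1], goodRecipe[0]];

console.log(findOrderViolations(goodRecipe)); // empty: order is valid
console.log(findOrderViolations(badRecipe)); // reports the misplaced patch skill
```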
## What's in a skill
A skill carries everything it needs:
- **Its code**: the files it adds (see "Where a skill's files live").
- **Apply and remove.** Apply installs it; remove uninstalls it. Uninstall isn't a separate problem; it ships with the skill. (Remove is required exactly when apply leaves anything behind. A pure instruction-only skill that changes nothing needs none.)
- **Its tests**: see "A test for every integration point." The tests *are* the verification. If they pass against the composed project, the skill applied correctly and works; there is no separate "verify" step.
- **Its recipe entry**: how it composes with the others.
Apply must be safe to re-run. Upgrades re-run skills, so a skill that half-applies twice is a bug.
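As a small illustration of a re-run-safe apply step, here is a sketch of adding a line to a file's contents only if it is not already present. The helper name is hypothetical and the function works on strings rather than touching disk, so it shows the idempotency idea rather than any actual NanoClaw apply code.

```typescript
// Hypothetical helper: returns the content with `line` appended exactly once.
// Calling it again on its own output returns that output unchanged.
function appendLineOnce(fileContent: string, line: string): string {
  const existingLines = fileContent.split("\n");
  if (existingLines.some((existing) => existing.trim() === line.trim())) {
    return fileContent; // already applied; re-running changes nothing
  }
  // Make sure the new line starts on its own line.
  const separator = fileContent === "" || fileContent.endsWith("\n") ? "" : "\n";
  return `${fileContent}${separator}${line}\n`;
}

const firstRun = appendLineOnce("TRIGGER=andy\n", "TELEGRAM_TOKEN=");
const secondRun = appendLineOnce(firstRun, "TELEGRAM_TOKEN=");
console.log(firstRun === secondRun); // the second apply is a no-op
```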
## Two kinds of skills
- **Capability skills** add something new: a channel, a provider, a tool, a dashboard.
- **Patch skills** make small tweaks or bug fixes to existing behavior, instead of adding a capability.
Patch skills follow the same rules: a test for every edit, and code pushed into independent files wherever possible instead of inline. To keep the overhead down, bundle several small patches into a single patch skill rather than making one skill per one-line fix.
One honest exception: a bug fix that genuinely changes an existing line can't always be moved into a new file. That single line is the one place an upgrade can still hard-conflict. If upstream touched the same line, the fix has to be re-derived against the new code. That's fine when it's small and tested; just don't pretend it's free.
(Packaging is a separate axis: some skills fetch code from a registry branch, some ship files in their own folder, some are pure instructions.)
## What makes a good skill
A good skill mostly just *adds* things:
- Adds new files.
- Adds a line to an existing file (an import, an entry, a line in `.env`).
- Adds a dependency.
- Changes a value in a JSON file like `package.json`.
These never really break.
The one risky move is when a skill has to *reach into* existing code and wire something in at a specific spot. That's the only part that breaks when we change the code later. Keep these rare, and keep them to a line or two that just *calls* code living in the skill's own files, not big chunks of logic inline.
Rule of thumb: aim for skills that are almost all "adds." Not 100%; some reach-ins are fine. But a skill full of reach-ins is a smell, and a sign that spot in the core should become a proper hook.
## Where a skill's files live
The files a skill adds live in the skill's own folder, and the skill copies them into the project when it runs. The skill is self-contained.
The exception is skills that plug into a registry: channels and providers. Their code is larger, multi-file, and has to stay in sync with the core as it changes over time. That code lives on a long-lived **registry branch** (`channels`, `providers`) that we forward-merge against main, and the skill fetches it from there (`git show origin/channels:path > path`). A frozen copy in a skill folder would go stale.
This fetch is **additive, never a merge**. The skill copies in the files it needs; it does *not* `git merge` the branch. Merging a registry branch into a customized install is exactly the conflict fight this model exists to avoid. A skill's **tests live on the branch alongside its code** and are fetched the same way; a channel's adapter travels with its registration test. A provider is the multi-point case: its code spans the host *and* container trees plus a Dockerfile edit, so it fetches files into both trees and ships a registration test per tree. See the provider archetype in [skill-guidelines.md](skill-guidelines.md).
Either way the skill brings its own code, from its folder or from its branch.
## A test for every integration point
The tests a skill *must* ship are the ones that prove it integrates with the core and keeps working as the core changes. That's the whole point. Tests of a skill's own internal logic, or of its behavior against an external service, are fine but optional: the creator's call, because they don't guard against upstream changes. A pure-add skill that touches nothing existing needs no required integration test at all.
The places that break on upgrade are the **integration points**: wherever a skill reaches into the existing system. That's not just the obvious code edit. An appended import, a config entry, a Dockerfile change, a mount, an installed dependency, and a direct read of the core's data all count. Each gets a guard that goes **red if it breaks or goes missing**:
- **A behavior or structural test of the wiring.** Prefer behavior when the seam is queryable at runtime: a channel's registration test imports the real barrel and asserts the registry contains it. Fall back to a structural test only for wiring with no invocable seam.
- **The build / typecheck.** Always on. It catches the drift a runtime test can't: a renamed symbol, a moved module, a changed signature.
- **Coverage of how an added file consumes the core.** When a skill's own file reaches into core APIs or data, a test must exercise that consumption against the *real* core. That's the leg that catches core drift.
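A behavior-style guard for a registration seam might look like the sketch below. The core's real registry and barrel are not shown in this document, so `channelRegistry`, `registerChannel`, and `assertChannelRegistered` are stand-ins invented for illustration; a real test would import the actual barrel instead of defining its own registry.

```typescript
// Stand-in for the core's channel registry (hypothetical; the real API may differ).
type ChannelFactory = () => { channelName: string };
const channelRegistry = new Map<string, ChannelFactory>();

function registerChannel(channelName: string, factory: ChannelFactory): void {
  channelRegistry.set(channelName, factory);
}

// Stand-in for the one-line reach-in a channel skill adds to the barrel.
registerChannel("telegram", () => ({ channelName: "telegram" }));

// The guard: goes red if the wiring breaks or goes missing.
function assertChannelRegistered(channelName: string): void {
  const factory = channelRegistry.get(channelName);
  if (factory === undefined) {
    throw new Error(`channel "${channelName}" is not registered`);
  }
  // Exercise the factory so a broken adapter also fails the guard.
  if (factory().channelName !== channelName) {
    throw new Error(`channel "${channelName}" factory returned the wrong adapter`);
  }
}

assertChannelRegistered("telegram");
console.log("telegram registration guard passed");
```

The point of querying the registry at runtime, rather than grepping for the import line, is that the guard keeps passing if the core moves the barrel but fails if the registration stops taking effect.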
Why points and not whole skills: a skill can have several, and each is a separate way to break. The count is honest signal: a skill's integration points are exactly its upgrade risk. Pure-add skills have zero and stay cheap.
This is what makes upgrades cheap to fix: when we move something in the core, the integration-point tests are exactly what fail, and that failing list *is* the set of skills to update.
**Tests travel with the skill.** They're files kept with the skill, in its folder or on its branch, and applying the skill copies them into the project's test tree. An integration-point test has to run against the *composed* system, so it only means anything once the skill is applied.
**The recipe tests the stack.** A single skill's tests prove that skill works alone. The recipe carries tests that run the skills *together*, in order. That's where you catch two skills that collide.
The full testing doctrine (how to pick the test type per point, the archetypes, the dependency cases) is in [skill-guidelines.md](skill-guidelines.md).
## How you actually work
You don't have to write a skill before you touch anything. Edit the code directly, get it working, then turn those edits into skills afterward; a coding agent does that conversion. Good authoring guidelines and a good recipe make skillifying-after-the-fact close to trivial.
The point isn't to slow you down at edit time. It's that nothing counts as part of your fork until it's a skill, because that's the only form that survives an upgrade.
## Upgrading
**Every update goes through `/update-nanoclaw`, never a raw `git pull`.** You don't know what an update contains until it lands; it might carry a breaking change with a migration. So the command inspects what's coming and runs the proper process: back up, pull the changes in, apply migrations, run tests, fix what broke, and flag when a fresh rebuild is needed instead.
Two different moves, two different rules. Your **fork pulls trunk**: that's a normal pull, run by the update command, and it's safe precisely because your changes live beside the core as skills rather than inside it. A **skill never merges**: it installs by fetching files and copying them in. If a skill's instructions say `git merge`, it isn't built to this model.
The update takes one of two paths:
**Normal upgrade: pull and fix what breaks.** Most of the time it pulls the latest upstream, resolves the occasional small conflict, runs the tests, and fixes whatever they flag. This stays cheap *because* the changes are small self-contained skills with tests: conflicts are rare, and when something does break, the failing test points at the exact skill and the fix is local.
**Rebuild from the recipe: the rare path.** Take fresh upstream and apply every skill from scratch. The command flags this when you've fallen far behind across many breaking changes (a clean rebuild beats catching up step by step). It's also how you hand your entire fork to someone else.
Around both:
- **The update skill updates itself first.** The first thing it does is fetch the latest version of the upgrade process. Otherwise you're upgrading with stale instructions.
- **Snapshot first, restore on failure.** The upgrade sets a rollback point before it starts: today a git backup branch and tag; the model calls for a full project snapshot (code, database, data, files) so anything that fails rolls back and retries. Until that snapshot lands, a migration that touches data makes its own data backup. Nothing in the upgrade needs its own undo logic.
- **Broken skills don't block you.** If a core change broke a skill, its test tells you, but the skill is usually still usable, and an agent fixes it at apply time. Skills are fixed lazily, when applied, not ahead of time for every core version.
## Migrations
Migrations are core, not an afterthought. Every breaking change ships with its migration, packaged together. A "migration" is broad: upgrading dependencies, a database change, a data backfill, moving files to new locations, whatever the change requires.
Migrations are **forward-only**. They don't need reverse scripts; the rollback point in front of the upgrade is the undo. If one fails, restore and retry.
A **startup tripwire** keeps installs on the supported path. Every sanctioned update path (install, update, migrate) stamps a marker with the version it reached; at startup the host checks that marker against the running code. If it's missing or doesn't match, because someone pulled by hand, the host stops, loudly, with the exact command to fix it instead of silently breaking.
The tripwire doesn't reason about *which* changes are breaking; it just enforces that the path was used. (DB schema migrations already run automatically at startup, so they aren't its concern; it guards everything else a raw `git pull` leaves undone.) To override, you stamp the marker yourself: an explicit "I know what I'm doing," not a deletion. If you have your **own** upgrade flow (a deploy script, a CI job), make stamping the last step after it succeeds: `pnpm exec tsx scripts/upgrade-state.ts set`. See [upgrade-recovery.md](upgrade-recovery.md).
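The startup check amounts to comparing a stamped version against the running code's version. A minimal sketch follows, assuming the marker is a small JSON object with a `version` field; the actual format written by `scripts/upgrade-state.ts` is not shown here, so `UpgradeMarker`, `TripwireResult`, and `checkUpgradeMarker` are hypothetical.

```typescript
// Hypothetical marker shape; the real upgrade-state file may hold more fields.
interface UpgradeMarker {
  version: string;
}

type TripwireResult =
  | { ok: true }
  | { ok: false; reason: "missing" | "mismatch" };

// `marker` is null when no marker file exists. `codeVersion` is assumed to
// be the version of the running code (for example, from package.json).
function checkUpgradeMarker(marker: UpgradeMarker | null, codeVersion: string): TripwireResult {
  if (marker === null) {
    return { ok: false, reason: "missing" };
  }
  if (marker.version !== codeVersion) {
    return { ok: false, reason: "mismatch" };
  }
  return { ok: true };
}

console.log(checkUpgradeMarker({ version: "2.1.16" }, "2.1.16")); // start normally
console.log(checkUpgradeMarker({ version: "2.0.76" }, "2.1.16")); // stop: mismatch
console.log(checkUpgradeMarker(null, "2.1.16")); // stop: missing
```

Note that the check only compares versions. As the text says, it does not try to work out which changes were breaking.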
## The maintainer's side of the deal
This is a two-sided contract. Users keep their changes as skills. In return, the maintainer keeps the core stable and owns the breakage.
As maintainer:
1. **Keep the core small and stable.** Resist hardwiring features into the core. Push them to skills too.
2. **Before shipping a core change, run the skills against it.** That tells you what you broke before users find out.
3. **When you break a skill, you fix it, not the users.** If a refactor moves something, update the affected skills or ship a migration. Don't make every user rediscover the same fix.
4. **Ship the migration with the breaking change.** Packaged together: code, DB, files. Not a separate "good luck" note.
5. **Watch for hotspots.** When lots of skills reach into the same spot in the core, that's the signal to add a proper hook there, so those reach-ins become clean adds.
6. **Test against real forks.** Every core change and migration runs against a fleet of real, skill-built forks before shipping. Real proof on real installs.
## The public registry
Skills will be shared and composed; that's the whole point. A skill runs real code when it applies (copies files, installs dependencies, edits the Dockerfile). So a public registry of skills is a trust surface.
The rule: **every skill is reviewed and approved before it goes into the public registry, and every new version is re-reviewed.** Approving once and trusting forever is how supply chains get poisoned. Automated checks (linting against the guidelines, plus a harness that applies the skill on fresh upstream, runs its tests, removes it, and applies it twice) will clear the mechanical part so human review can focus on intent and safety. First-party skills are trusted by where they come from; the gate is for the public registry.
## The promise
Build your changes as skills following this, and we won't break you. It's a promise we can only make for skills: changes edited directly into the core are beyond what we can protect.
+51
@@ -0,0 +1,51 @@
# Recovering from the upgrade tripwire
If NanoClaw refuses to start with a message like *"update did not go through the supported path"*, this page explains what happened and how to clear it.
## What happened
NanoClaw records the version it reached each time you upgrade through a supported path — `/setup`, `/update-nanoclaw`, or `/migrate-nanoclaw`. That record lives in `data/upgrade-state.json`.
At startup the host checks that record against the running code. If it's **missing** or its version **doesn't match** the code, the host stops. This almost always means the code was updated by a raw `git pull` instead of the supported flow — so migrations, dependency installs, or container rebuilds that the flow would have run may not have happened.
## If you just ran a supported upgrade
If you reached this point by running `/update-nanoclaw`, `/migrate-nanoclaw`, or `/setup` and it **completed successfully**, this is expected the first time an existing install meets the tripwire (your previous version predated it). Clear it by stamping the current version:
```bash
pnpm exec tsx scripts/upgrade-state.ts set
```
Then restart NanoClaw. It won't trip again.
**Coding agents:** if the upgrade you just ran reported success, run the command above automatically and restart. Only stop to ask a human if the upgrade itself failed or you can't confirm it finished.
## If you ran `git pull` yourself
Don't just clear the tripwire — that skips the work the supported flow does. Instead, run the supported upgrade so backups, migrations, and rebuilds happen:
```
/update-nanoclaw
```
Once it finishes it stamps the marker for you, and the next start is clean.
## If you have your own upgrade flow
If you've built your own way to upgrade — a custom skill, a deploy script, a CI job, a service that pulls and restarts — it won't stamp the marker, so the host will trip on the next start. Add the stamp as the **last step** of that flow, after the upgrade succeeds and before the restart:
```bash
pnpm exec tsx scripts/upgrade-state.ts set
```
That's the same thing `/setup`, `/update-nanoclaw`, and `/migrate-nanoclaw` do at the end. Do it only when the upgrade actually completed — the marker is your assertion that this install reached the current version through a path you trust.
## The override
`pnpm exec tsx scripts/upgrade-state.ts set` is the override: it declares "this install is good at the current version." Use it when you know the install is actually in a good state (e.g. you completed the steps manually). It's safe to re-run.
To inspect the current marker:
```bash
pnpm exec tsx scripts/upgrade-state.ts get
```
+38
@@ -25,6 +25,44 @@ set -euo pipefail
PROJECT_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$PROJECT_ROOT"
# ─── --uninstall: short-circuit before any setup work ──────────────────
# Never install dependencies just to uninstall. With the TS toolchain
# present, hand straight off to setup:auto (the flow lives in
# setup/uninstall/); without it, print manual cleanup guidance. Runs
# before diagnostics.sh is sourced so a pure uninstall doesn't emit
# setup_launched, and before all pre-flights/bootstrap.
for arg in "$@"; do
if [ "$arg" = "--uninstall" ]; then
# exec tsx directly rather than `pnpm run -- …`: pnpm passes the `--`
# separator through to the script, where the flag parser treats
# everything after it as positional args and the flags get dropped.
# Gate on node (tsx's shebang interpreter) — pnpm isn't used here.
if command -v node >/dev/null 2>&1 && [ -x "$PROJECT_ROOT/node_modules/.bin/tsx" ]; then
exec "$PROJECT_ROOT/node_modules/.bin/tsx" "$PROJECT_ROOT/setup/auto.ts" "$@"
fi
export NANOCLAW_PROJECT_ROOT="$PROJECT_ROOT"
# shellcheck source=setup/lib/install-slug.sh
source "$PROJECT_ROOT/setup/lib/install-slug.sh"
UNINSTALL_RUNTIME="${CONTAINER_RUNTIME:-docker}"
echo "Can't run the uninstaller: dependencies are missing (node_modules/)."
echo "Either re-run 'bash nanoclaw.sh' once to restore them, or clean up manually:"
echo ""
if [ "$(uname -s)" = "Darwin" ]; then
echo " launchctl unload ~/Library/LaunchAgents/$(launchd_label).plist"
echo " rm -f ~/Library/LaunchAgents/$(launchd_label).plist"
else
echo " systemctl --user disable --now $(systemd_unit).service"
echo " rm -f ~/.config/systemd/user/$(systemd_unit).service && systemctl --user daemon-reload"
fi
echo " $UNINSTALL_RUNTIME ps -aq --filter label=nanoclaw-install=$(_nanoclaw_install_slug) | xargs -r $UNINSTALL_RUNTIME rm -f"
echo " $UNINSTALL_RUNTIME rmi $(container_image_base):latest"
echo " rm -f ~/.local/bin/ncl # only if it points at this folder"
echo ""
echo "Then back up $PROJECT_ROOT/.env if you need the keys, and delete the folder."
exit 1
fi
done
LOGS_DIR="$PROJECT_ROOT/logs"
STEPS_DIR="$LOGS_DIR/setup-steps"
PROGRESS_LOG="$LOGS_DIR/setup.log"
+2 -2
@@ -1,6 +1,6 @@
{
"name": "nanoclaw",
-"version": "2.0.76",
+"version": "2.1.16",
"description": "Personal Claude assistant. Lightweight, secure, customizable.",
"type": "module",
"packageManager": "pnpm@10.33.0",
@@ -30,7 +30,7 @@
"dependencies": {
"@clack/core": "^1.2.0",
"@clack/prompts": "^1.2.0",
-"@onecli-sh/sdk": "^0.5.0",
+"@onecli-sh/sdk": "2.2.1",
"better-sqlite3": "11.10.0",
"chat": "^4.24.0",
"cron-parser": "5.5.0",
+5 -5
@@ -15,8 +15,8 @@ importers:
specifier: ^1.2.0
version: 1.2.0
'@onecli-sh/sdk':
-specifier: ^0.5.0
-version: 0.5.0
+specifier: 2.2.1
+version: 2.2.1
better-sqlite3:
specifier: 11.10.0
version: 11.10.0
@@ -303,8 +303,8 @@ packages:
'@emnapi/core': ^1.7.1
'@emnapi/runtime': ^1.7.1
'@onecli-sh/sdk@0.5.0':
resolution: {integrity: sha512-oe5Yx9o98v6N1PgzcCR7nULHHqcqKWNJIDOHGOSNX+l20mLlZpFUqfKPeFmsojBNRQMoqbvZQKUlFMp6gVuYBA==}
'@onecli-sh/sdk@2.2.1':
resolution: {integrity: sha512-q2mCW4ZsARlLEoTxz/P0NQ4MiCh7Z2n28pxkSc7srS+tozyw40PdTnWYW7NI8hfSYplZTx5856Adq1iPi4KN3Q==}
engines: {node: '>=20'}
'@oxc-project/types@0.124.0':
@@ -1665,7 +1665,7 @@ snapshots:
'@tybys/wasm-util': 0.10.1
optional: true
'@onecli-sh/sdk@0.5.0': {}
'@onecli-sh/sdk@2.2.1': {}
'@oxc-project/types@0.124.0': {}
+4 -4
View File
@@ -1,5 +1,5 @@
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="90" height="20" role="img" aria-label="181k tokens, 91% of context window">
<title>181k tokens, 91% of context window</title>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="90" height="20" role="img" aria-label="195k tokens, 98% of context window">
<title>195k tokens, 98% of context window</title>
<linearGradient id="s" x2="0" y2="100%">
<stop offset="0" stop-color="#bbb" stop-opacity=".1"/>
<stop offset="1" stop-opacity=".1"/>
@@ -15,8 +15,8 @@
<g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" font-size="11">
<text aria-hidden="true" x="26" y="15" fill="#010101" fill-opacity=".3">tokens</text>
<text x="26" y="14">tokens</text>
<text aria-hidden="true" x="71" y="15" fill="#010101" fill-opacity=".3">181k</text>
<text x="71" y="14">181k</text>
<text aria-hidden="true" x="71" y="15" fill="#010101" fill-opacity=".3">195k</text>
<text x="71" y="14">195k</text>
</g>
</g>
</a>

+6
View File
@@ -21,6 +21,7 @@ import path from 'path';
import { DATA_DIR } from '../src/config.js';
import { createAgentGroup, getAgentGroupByFolder } from '../src/db/agent-groups.js';
import { updateContainerConfigScalars } from '../src/db/container-configs.js';
import { initDb } from '../src/db/connection.js';
import {
createMessagingGroup,
@@ -102,6 +103,7 @@ async function main(): Promise<void> {
// 2. Agent group + filesystem.
const folder = args.folder || `cli-with-${normalizeName(args.displayName)}`;
const pickedProvider = process.env.NANOCLAW_PICKED_PROVIDER?.trim().toLowerCase();
let ag: AgentGroup | undefined = getAgentGroupByFolder(folder);
if (!ag) {
const agId = generateId('ag');
@@ -123,6 +125,10 @@ async function main(): Promise<void> {
`You are ${args.agentName}, a personal NanoClaw agent for ${args.displayName}. ` +
'When the user first reaches out, introduce yourself briefly and invite them to chat. Keep replies concise.',
});
// Runtime provider lives on the config row, not the deprecated agent_provider.
if (pickedProvider && pickedProvider !== 'claude') {
updateContainerConfigScalars(ag.id, { provider: pickedProvider });
}
// 3. CLI messaging group + wiring.
let cliMg: MessagingGroup | undefined = getMessagingGroupByPlatform(CLI_CHANNEL, CLI_PLATFORM_ID);
+20 -8
View File
@@ -30,10 +30,11 @@
* For direct-addressable channels (telegram, whatsapp, etc.), --platform-id
* is typically the same as the handle in --user-id, with the channel prefix.
*/
import fs from 'fs';
import net from 'net';
import path from 'path';
import { DATA_DIR } from '../src/config.js';
import { DATA_DIR, GROUPS_DIR } from '../src/config.js';
import { createAgentGroup, getAgentGroupByFolder } from '../src/db/agent-groups.js';
import { initDb } from '../src/db/connection.js';
import {
@@ -47,8 +48,7 @@ import { normalizeName } from '../src/modules/agent-to-agent/db/agent-destinatio
import { addMember } from '../src/modules/permissions/db/agent-group-members.js';
import { getUserRoles, grantRole } from '../src/modules/permissions/db/user-roles.js';
import { upsertUser } from '../src/modules/permissions/db/users.js';
import { updateContainerConfigScalars } from '../src/db/container-configs.js';
import { initGroupFilesystem } from '../src/group-init.js';
import { ensureContainerConfig, updateContainerConfigScalars } from '../src/db/container-configs.js';
import { namespacedPlatformId } from '../src/platform-id.js';
import type { AgentGroup, MessagingGroup } from '../src/types.js';
@@ -189,6 +189,7 @@ async function main(): Promise<void> {
// 2. Agent group + filesystem.
const folder = `dm-with-${normalizeName(args.displayName)}`;
const pickedProvider = process.env.NANOCLAW_PICKED_PROVIDER?.trim().toLowerCase();
let ag: AgentGroup | undefined = getAgentGroupByFolder(folder);
if (!ag) {
const agId = generateId('ag');
@@ -204,12 +205,23 @@ async function main(): Promise<void> {
} else {
console.log(`Reusing agent group: ${ag.id} (${folder})`);
}
initGroupFilesystem(ag, {
instructions:
`# ${args.agentName}\n\n` +
// Ensure the config row exists; defer workspace scaffolding to the first
// spawn (group-init), where the DB-resolved provider decides the surface
// (Claude: CLAUDE.local.md; a surfaces-owning provider: the memory scaffold)
// — so a non-Claude group never gets stale CLAUDE.* files written here.
ensureContainerConfig(ag.id);
// Runtime provider lives on the config row, not the deprecated agent_provider.
if (pickedProvider && pickedProvider !== 'claude') {
updateContainerConfigScalars(ag.id, { provider: pickedProvider });
}
const groupDir = path.resolve(GROUPS_DIR, folder);
fs.mkdirSync(groupDir, { recursive: true });
fs.writeFileSync(
path.join(groupDir, '.seed.md'),
`# ${args.agentName}\n\n` +
`You are ${args.agentName}, a personal NanoClaw agent for ${args.displayName}. ` +
'When the user first reaches out (or you receive a system welcome prompt), introduce yourself briefly and invite them to chat. Keep replies concise.',
});
'When the user first reaches out (or you receive a system welcome prompt), introduce yourself briefly and invite them to chat. Keep replies concise.\n',
);
// 2b. Assign the user a role for this agent group. The caller picks via
// --role; the channel drivers default to 'owner' for the self-host case.
+26
View File
@@ -0,0 +1,26 @@
/**
* scripts/upgrade-state.ts: read or stamp the upgrade marker.
*
* Usage:
* pnpm exec tsx scripts/upgrade-state.ts get
* pnpm exec tsx scripts/upgrade-state.ts set [version] [via]
*
* `set` with no version stamps the current package.json version. The
* sanctioned upgrade paths (setup / update / migrate) call `set` on
* success; running it by hand is also the documented way to clear the
* startup tripwire (see docs/upgrade-recovery.md).
*/
import { getCodeVersion, markerPath, readUpgradeState, writeUpgradeState } from '../src/upgrade-state.js';
const [, , cmd, versionArg, viaArg] = process.argv;
if (cmd === 'get') {
const state = readUpgradeState();
console.log(state ? JSON.stringify(state) : 'none');
} else if (cmd === 'set') {
const state = writeUpgradeState({ version: versionArg || getCodeVersion(), via: viaArg || 'manual' });
console.log(`Stamped ${markerPath()}: ${JSON.stringify(state)}`);
} else {
console.error('Usage: pnpm exec tsx scripts/upgrade-state.ts get | set [version] [via]');
process.exit(2);
}
+121
View File
@@ -0,0 +1,121 @@
#!/usr/bin/env bash
#
# Install the Codex agent provider non-interactively: copy the payload from the
# `providers` branch, wire the three provider barrels, and add the Codex CLI to
# the container manifest (container/cli-tools.json). The image rebuild is the
# caller's job (the setup container step / `./container/build.sh`).
#
# Emits exactly one status block on stdout (ADD_CODEX); all chatty progress
# goes to stderr. Keep in sync with .claude/skills/add-codex/SKILL.md.
set -euo pipefail
PROJECT_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
cd "$PROJECT_ROOT"
# Keep in sync with add-codex SKILL.md. This is the canonical Codex CLI pin —
# it lands in container/cli-tools.json (the global-CLI manifest), not the Dockerfile.
CODEX_VERSION="0.138.0"
# Resolve the remote carrying the providers branch (same nanoclaw remote that
# carries channels — handles forks where it isn't `origin`).
# shellcheck source=setup/lib/channels-remote.sh
source "$PROJECT_ROOT/setup/lib/channels-remote.sh"
REMOTE=$(resolve_channels_remote)
BRANCH="${REMOTE}/providers"
# The codex payload — host provider, container runtime, setup module, doctrine.
# Barrels are appended to, not copied.
PAYLOAD_FILES=(
src/providers/codex.ts
src/providers/codex-agents-md.ts
src/providers/codex-registration.test.ts
src/providers/codex-host-contribution.test.ts
src/providers/codex-agents-md.test.ts
container/agent-runner/src/providers/codex.ts
container/agent-runner/src/providers/codex-app-server.ts
container/agent-runner/src/providers/exchange-archive.ts
container/agent-runner/src/providers/exchange-archive.test.ts
container/agent-runner/src/providers/codex-registration.test.ts
container/agent-runner/src/providers/codex.factory.test.ts
container/agent-runner/src/providers/codex.turns.test.ts
container/agent-runner/src/providers/codex-app-server.test.ts
container/agent-runner/src/providers/codex-cli-tools.test.ts
setup/providers/codex.ts
setup/providers/codex.test.ts
setup/providers/codex-registration.test.ts
container/AGENTS.md
)
BARRELS=(
src/providers/index.ts
container/agent-runner/src/providers/index.ts
setup/providers/index.ts
)
ALREADY_INSTALLED=true
emit_status() {
local status=$1 error=${2:-}
echo "=== NANOCLAW SETUP: ADD_CODEX ==="
echo "STATUS: ${status}"
echo "CODEX_VERSION: ${CODEX_VERSION}"
echo "ALREADY_INSTALLED: ${ALREADY_INSTALLED}"
[ -n "$error" ] && echo "ERROR: ${error}"
echo "=== END ==="
}
log() { echo "[add-codex] $*" >&2; }
# Idempotent: a complete install has the host provider file, the host barrel
# import, and the Codex CLI in the container manifest. Any missing → (re)install.
need_install() {
[ ! -f src/providers/codex.ts ] && return 0
! grep -q "^import './codex.js';" src/providers/index.ts 2>/dev/null && return 0
! grep -q '@openai/codex' container/cli-tools.json 2>/dev/null && return 0
return 1
}
if need_install; then
ALREADY_INSTALLED=false
log "Fetching providers branch from ${REMOTE}"
git fetch "$REMOTE" providers >&2 2>/dev/null || {
emit_status failed "git fetch ${REMOTE} providers failed"
exit 1
}
log "Copying Codex payload from ${BRANCH}"
for f in "${PAYLOAD_FILES[@]}"; do
mkdir -p "$(dirname "$f")"
git show "${BRANCH}:$f" > "$f" 2>/dev/null || {
emit_status failed "providers branch is missing ${f}"
exit 1
}
done
log "Wiring provider barrels…"
for b in "${BARRELS[@]}"; do
grep -q "^import './codex.js';" "$b" || printf "import './codex.js';\n" >> "$b"
done
log "Adding the Codex CLI to the container manifest (cli-tools.json)…"
# A json-merge: append { name, version } if absent. The Dockerfile installs
# every manifest entry via pinned `pnpm install -g` — no Dockerfile edit, no
# awk surgery. @openai/codex has no native postinstall, so no "onlyBuilt".
MANIFEST=container/cli-tools.json
node -e '
const fs = require("fs");
const [file, name, version] = process.argv.slice(1);
const tools = JSON.parse(fs.readFileSync(file, "utf8"));
if (!tools.some((t) => t.name === name)) {
tools.push({ name, version });
const fmt = (t) =>
" { " +
Object.entries(t).map(([k, v]) => JSON.stringify(k) + ": " + JSON.stringify(v)).join(", ") +
" }";
fs.writeFileSync(file, "[\n" + tools.map(fmt).join(",\n") + "\n]\n");
}
' "$MANIFEST" "@openai/codex" "${CODEX_VERSION}" || {
emit_status failed "failed to add @openai/codex to ${MANIFEST}"
exit 1
}
fi
emit_status ok
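
The manifest step above is an idempotent merge keyed on the package name. A minimal TypeScript sketch of that idea follows, assuming (as the inline `node -e` script implies) that `container/cli-tools.json` is a JSON array of `{ name, version }` objects. `CliTool`, `mergeCliTool`, and `formatManifest` are hypothetical names used only for illustration; they are not part of the repository.

```typescript
// Hypothetical types and helpers; they mirror the inline merge in add-codex.sh.
interface CliTool {
  name: string;
  version: string;
  [extra: string]: unknown; // any other manifest keys pass through untouched
}

// Returns the same array when `name` is already present (no version bump),
// otherwise a new array with the entry appended.
function mergeCliTool(tools: CliTool[], name: string, version: string): CliTool[] {
  if (tools.some((t) => t.name === name)) return tools;
  return [...tools, { name, version }];
}

// Serialises one entry per line, similar in spirit to the script's formatter.
function formatManifest(tools: CliTool[]): string {
  const fmt = (t: CliTool): string =>
    '  { ' +
    Object.entries(t)
      .map(([k, v]) => `${JSON.stringify(k)}: ${JSON.stringify(v)}`)
      .join(', ') +
    ' }';
  return '[\n' + tools.map(fmt).join(',\n') + '\n]\n';
}
```

Note that, like the script's `some` check, this sketch leaves an existing entry alone even when its version differs from the requested pin.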
+129 -2
View File
@@ -38,8 +38,12 @@ import { runTeamsChannel } from './channels/teams.js';
import { runTelegramChannel } from './channels/telegram.js';
import { runWhatsAppChannel } from './channels/whatsapp.js';
import { pingCliAgent, type PingResult } from './lib/agent-ping.js';
import { getSetupProvider, listSetupProviders } from './providers/registry.js';
// Provider payloads self-register their picker entry + auth on import.
import './providers/index.js';
import { brightSelect } from './lib/bright-select.js';
import { offerClaudeOnFailure } from './lib/claude-handoff.js';
import { setPickedProvider } from './lib/picked-provider.js';
import {
applyToEnv,
parseFlags,
@@ -48,6 +52,8 @@ import {
} from './lib/setup-config-parse.js';
import { runAdvancedScreen } from './lib/setup-config-screen.js';
import { runWindowedStep } from './lib/windowed-runner.js';
import { runUninstallFlow } from './uninstall/flow.js';
import { detectExistingInstall } from './uninstall/scan.js';
import { detectRegisteredGroups, detectExistingDisplayName } from './environment.js';
import { pollHealth } from './onecli.js';
import { getLaunchdLabel, getSystemdUnit } from '../src/install-slug.js';
@@ -88,6 +94,17 @@ async function main(): Promise<void> {
let configValues = { ...readFromEnv(), ...flagResult.values };
applyToEnv(configValues);
// --uninstall routes to the uninstall flow before any setup side effects —
// in particular before initProgressionLog(), so an uninstall never resets
// logs/setup.log on its way to (possibly) deleting logs/ entirely.
if (configValues.uninstall === true) {
await runUninstallFlow({
dryRun: configValues.dryRun === true,
yes: configValues.yes === true,
invokedFrom: 'flag',
});
}
printIntro();
initProgressionLog();
phEmit('auto_started');
@@ -121,6 +138,37 @@ async function main(): Promise<void> {
.filter(Boolean),
);
// Offer removal when setup lands on an existing install. Skipped on every
// resume path — both the fail() retry and the sg-docker re-exec pass
// NANOCLAW_SKIP (and the latter sets NANOCLAW_REEXEC_SG) — so the prompt
// appears at most once per fresh run.
const isResume = process.env.NANOCLAW_REEXEC_SG === '1' || skip.size > 0;
if (!isResume && detectExistingInstall(process.cwd())) {
const action = ensureAnswer(
await brightSelect<'keep' | 'uninstall'>({
message: 'NanoClaw is already installed in this folder. What would you like to do?',
options: [
{
value: 'keep',
label: 'Keep it & continue setup',
hint: 'recommended — re-running setup is safe',
},
{
value: 'uninstall',
label: 'Uninstall NanoClaw & exit',
hint: 'removes service, data, and agent files — asks before each step',
},
],
initialValue: 'keep',
}),
) as 'keep' | 'uninstall';
setupLog.userInput('existing_install', action);
phEmit('existing_install_detected', { action });
if (action === 'uninstall') {
await runUninstallFlow({ dryRun: false, yes: false, invokedFrom: 'setup-detection' });
}
}
if (!skip.has('environment')) {
const res = await runQuietStep('environment', {
running: 'Checking your system…',
@@ -277,8 +325,54 @@ async function main(): Promise<void> {
}
}
let agentProvider: string | undefined;
if (!skip.has('auth')) {
await runAuthStep();
// Agent runtime pick. Claude is the default and a no-op — choosing it
// runs the existing Claude auth flow unchanged. A branch provider walks
// its own auth (e.g. Codex: ChatGPT subscription or API key, vault-only)
// and verifies its payload is wired. The pick installs and authenticates
// the runtime; it is NOT an install-wide default — and it is NOT a
// creation flag. Provider is a DB property of a group: the creation flows
// create provider-agnostic groups, and setup sets the picked provider on
// each via `ncl groups config update --provider` right after creating it
// (the creation scripts inherit it and apply at create — see picked-provider). Existing groups switch the
// same way (docs/provider-migration.md).
agentProvider = await askAgentProviderChoice();
setPickedProvider(agentProvider);
let providerEntry = getSetupProvider(agentProvider);
if (agentProvider !== 'claude' && !providerEntry) {
// A non-claude provider picked from the hard-wired list isn't wired in
// this install yet — install it via its self-contained script (channel
// style, idempotent: self-skips if already installed), rebuild the image
// (the container step already ran, the Dockerfile just changed), then
// load the payload's setup module so it self-registers.
const install = await runQuietChild(
`add-${agentProvider}`,
'bash',
[`setup/add-${agentProvider}.sh`],
{
running: `Installing ${agentProvider}`,
done: `${agentProvider} installed.`,
},
);
if (!install.ok) {
await fail(
`add-${agentProvider}`,
`Couldn't install ${agentProvider}.`,
'See logs/setup-steps/ for details, then retry setup.',
);
}
p.log.info(brandBody('Rebuilding the container image with the new provider…'));
spawnSync('./container/build.sh', [], { stdio: 'inherit' });
await import(`./providers/${agentProvider}.js`);
providerEntry = getSetupProvider(agentProvider);
}
if (providerEntry?.runAuth) {
await providerEntry.runAuth();
await providerEntry.runInstallCheck?.();
} else {
await runAuthStep();
}
}
if (!skip.has('mounts')) {
@@ -704,6 +798,39 @@ function sendChatMessage(message: string): Promise<void> {
// ─── auth step (select → branch) ────────────────────────────────────────
// Providers offered for install are hard-wired in trunk — an audited control
// surface (no branch enumeration that anyone with write access could extend).
// Codex is the only one offered here; opencode/ollama install via their own
// /add-* skills. Each is installed by its self-contained setup/add-<name>.sh.
const INSTALLABLE_PROVIDERS = [
{ value: 'codex', label: 'Codex', hint: 'OpenAI — ChatGPT subscription or API key' },
] as const;
async function askAgentProviderChoice(): Promise<string> {
const installed = listSetupProviders();
const installedNames = new Set(installed.map((entry) => entry.value));
// Offer the hard-wired installable providers this install hasn't wired yet —
// selecting one installs it via setup/add-<name>.sh.
const available = INSTALLABLE_PROVIDERS.filter((prov) => !installedNames.has(prov.value));
const options = [
...installed.map(({ value, label, hint }) => ({ value, label, hint })),
...available.map((prov) => ({ value: prov.value, label: prov.label, hint: `${prov.hint} — installs now` })),
];
// The pick installs and authenticates a runtime — it is not an
// install-wide default, so re-runs safely Enter-through on claude (its
// auth flow short-circuits when the secret already exists).
const choice = ensureAnswer(
await brightSelect<string>({
message: 'Which agent runtime should power your assistant?',
options,
initialValue: 'claude',
}),
) as string;
setupLog.userInput('agent_provider', choice);
phEmit('agent_provider_chosen', { provider: choice });
return choice;
}
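
The option list built in `askAgentProviderChoice` is the installed providers followed by any hard-wired installable ones that are not yet wired. A small sketch of just that merge, under the assumption that each entry carries `value`, `label`, and `hint`; `PickerOption` and `buildProviderOptions` are hypothetical names, and the hint suffix is worded slightly differently from the diff.

```typescript
// Hypothetical shape for a picker entry.
interface PickerOption {
  value: string;
  label: string;
  hint: string;
}

// Installed providers come first; installable ones are appended only when
// they are not already installed, with a hint that selecting them installs.
function buildProviderOptions(
  installed: readonly PickerOption[],
  installable: readonly PickerOption[],
): PickerOption[] {
  const installedNames = new Set(installed.map((entry) => entry.value));
  const available = installable.filter((prov) => !installedNames.has(prov.value));
  return [
    ...installed,
    ...available.map((prov) => ({ ...prov, hint: `${prov.hint} (installs now)` })),
  ];
}
```

Once a provider has been installed it shows up through the registry instead, so it is listed once and without the install hint.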
async function runAuthStep(): Promise<void> {
if (anthropicSecretExists()) {
p.log.success(brandBody('Your Claude account is already connected.'));
@@ -1217,7 +1344,7 @@ function detectExistingOnecli(): { version: string; apiHost: string } | null {
} catch {
// not JSON — try to extract a URL directly
}
const m = raw.match(/https?:\/\/[\w.\-]+(?::\d+)?/);
const m = raw.match(/https?:\/\/[\w.-]+(?::\d+)?/);
return m ? { version, apiHost: m[0] } : null;
} catch {
return null;
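
The regex change above only drops a redundant escape: a hyphen placed last in a character class is already literal. A quick check that both forms match identically, using an invented sample string (not real OneCLI output):

```typescript
// Old and new patterns from the diff.
const escaped = /https?:\/\/[\w.\-]+(?::\d+)?/;
const unescaped = /https?:\/\/[\w.-]+(?::\d+)?/;

// Invented sample line, for illustration only.
const sample = 'gateway listening on http://my-host.local:10254 (ready)';

const before = sample.match(escaped)?.[0];
const after = sample.match(unescaped)?.[0];
console.log(before === after, after);
```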
+8 -1
View File
@@ -68,8 +68,12 @@ export async function run(args: string[]): Promise<void> {
log.info('Invoking init-cli-agent', { displayName, agentName });
// Provider-agnostic: init-cli-agent creates a default group and emits its id.
// Surface that id so the orchestrator can set the picked provider on it (via
// ncl) before the ping — provider is a DB property, never a creation flag.
let stdout = '';
try {
execFileSync('pnpm', scriptArgs, {
stdout = execFileSync('pnpm', scriptArgs, {
cwd: projectRoot,
stdio: ['ignore', 'pipe', 'pipe'],
encoding: 'utf-8',
@@ -90,10 +94,13 @@ export async function run(args: string[]): Promise<void> {
process.exit(1);
}
const agentGroupId = stdout.match(/^AGENT_GROUP_ID:\s*(\S+)/m)?.[1];
emitStatus('CLI_AGENT', {
DISPLAY_NAME: displayName,
AGENT_NAME: agentName || displayName,
CHANNEL: 'cli/local',
...(agentGroupId ? { AGENT_GROUP_ID: agentGroupId } : {}),
STATUS: 'success',
LOG: 'logs/setup.log',
});
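
The step above recovers the new group's id by scanning the child's stdout for a line that starts with `AGENT_GROUP_ID:`. A minimal sketch of that extraction, using the same regex; `extractAgentGroupId` is a hypothetical wrapper and the sample output is invented.

```typescript
// Hypothetical wrapper around the regex used in the diff.
// The `m` flag lets `^` match at the start of any line, not just the string.
function extractAgentGroupId(stdout: string): string | undefined {
  return stdout.match(/^AGENT_GROUP_ID:\s*(\S+)/m)?.[1];
}

// Invented sample output, for illustration only.
const sampleStdout = 'Creating group\nAGENT_GROUP_ID: ag_123\nDone\n';
console.log(extractAgentGroupId(sampleStdout));
```

Because `\s*` also matches newlines, a line with an empty value would capture the first token of the following line, so the emitting script should always print a non-empty id.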
+23
View File
@@ -35,6 +35,29 @@ export function readEnvKey(key: string, projectRoot?: string): string | null {
return null;
}
/**
* Set (or replace) a single `KEY=value` line in `.env`, creating the file if
* needed. Non-secret config only; secrets belong in the OneCLI vault.
*/
export function upsertEnvKey(key: string, value: string, projectRoot?: string): void {
const envPath = path.join(projectRoot ?? process.cwd(), '.env');
let content = '';
try {
content = fs.readFileSync(envPath, 'utf-8');
} catch {
/* no .env yet */
}
const line = `${key}=${value}`;
const lines = content.split('\n');
const idx = lines.findIndex((l) => l.trim().startsWith(`${key}=`));
// Trim trailing blank lines up front: otherwise every replace appends one more newline.
while (lines.length > 0 && lines[lines.length - 1].trim() === '') lines.pop();
if (idx >= 0) lines[idx] = line;
else lines.push(line);
fs.writeFileSync(envPath, lines.join('\n') + '\n');
}
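
A pure-string sketch of the same upsert can make the line handling easier to reason about and test without touching the filesystem. `upsertEnvLine` is a hypothetical helper, not part of the diff. This variant trims trailing blank lines before either branch, so repeated replacements keep exactly one trailing newline.

```typescript
// Hypothetical pure variant of the .env upsert: takes file content, returns new content.
function upsertEnvLine(content: string, key: string, value: string): string {
  const line = `${key}=${value}`;
  const lines = content.split('\n');
  // Normalise the tail first so the output always ends with a single newline.
  while (lines.length > 0 && lines[lines.length - 1].trim() === '') lines.pop();
  const idx = lines.findIndex((l) => l.trim().startsWith(`${key}=`));
  if (idx >= 0) lines[idx] = line; // replace the existing assignment in place
  else lines.push(line); // or append a new one
  return lines.join('\n') + '\n';
}
```

The `startsWith` check includes the `=`, so a key such as `A` does not accidentally match a line for `AB`.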
export function detectExistingDisplayName(projectRoot: string): string | null {
const dbPath = path.join(projectRoot, 'data', 'v2.db');
if (!fs.existsSync(dbPath)) return null;
+1
View File
@@ -23,6 +23,7 @@ const STEPS: Record<
verify: () => import('./verify.js'),
onecli: () => import('./onecli.js'),
auth: () => import('./auth.js'),
'provider-auth': () => import('./provider-auth.js'),
'cli-agent': () => import('./cli-agent.js'),
};
+27 -1
View File
@@ -66,17 +66,43 @@ export interface BrightSelectOptions<T> {
initialValue?: T;
}
/**
* Discard any stdin buffered while no prompt was reading. Keypresses made
* during spinners and installs otherwise get consumed by the next select the
* instant it opens, submitting it before it ever renders for the user (a
* stray Down-arrow + `Enter` silently picks option 2). Raw-mode reads only see kernel
* tty data via the event loop, so the drain needs a real (short) window.
*/
export function flushStdin(windowMs = 50): Promise<void> {
return new Promise((resolve) => {
const stdin = process.stdin;
if (!stdin.isTTY) return resolve();
const wasRaw = stdin.isRaw === true;
stdin.setRawMode?.(true);
const discard = (): void => {};
stdin.on('data', discard);
stdin.resume();
setTimeout(() => {
stdin.off('data', discard);
stdin.pause();
if (!wasRaw) stdin.setRawMode?.(false);
resolve();
}, windowMs);
});
}
/**
* Matches the return shape of `p.select`: resolves to the selected value
* on submit, or to clack's cancel symbol on Ctrl-C / Esc. Callers pass
* the result through `ensureAnswer(...)` the same way they do for
* `p.select`.
*/
export function brightSelect<T>(
export async function brightSelect<T>(
opts: BrightSelectOptions<T>,
): Promise<T | symbol> {
const { message, options, initialValue } = opts;
await flushStdin();
return new SelectPrompt({
options: options as Array<{ value: T; label?: string; hint?: string }>,
initialValue,
+54 -47
View File
@@ -11,9 +11,17 @@
* 1. Build a handoff prompt from the caller's context: channel, current
* step, completed steps, collected values (secrets redacted), relevant
* files to read.
* 2. Spawn `claude --append-system-prompt "<context>"
* --permission-mode acceptEdits` with `stdio: 'inherit'` so Claude owns
* the terminal.
* 2. Spawn `claude "<prompt>" --permission-mode auto` with
* `stdio: 'inherit'` so Claude owns the terminal. The positional prompt
* is auto-submitted as the first user message, so Claude starts
* orienting immediately instead of sitting at an empty prompt, and the
* context stays visible in the transcript and survives `--resume`,
* which an --append-system-prompt would not.
* 2a. All handoffs in one setup run share a single session: the first
* spawn pins a generated UUID via `--session-id`, later spawns pass
* `--resume <uuid>` so Claude keeps the context of earlier handoffs.
* (stdio is inherited, so we can't *read* the session id Claude picks;
* pinning our own is the only way to find the session again.)
* 3. When Claude exits (user types /exit, Ctrl-D, or closes the session),
* control returns to the setup driver. The driver can then re-offer the
* same step (e.g., "How did that go?" select).
@@ -23,6 +31,7 @@
* attempting to parse it as a real answer.
*/
import { execSync, spawn } from 'child_process';
import { randomUUID } from 'crypto';
import path from 'path';
import * as p from '@clack/prompts';
@@ -61,8 +70,8 @@ export interface HandoffContext {
}
/**
* Spawn interactive Claude with context pre-loaded as a system-prompt
* append. Returns when Claude exits.
* Spawn interactive Claude with the handoff context as an auto-submitted
* first prompt. Returns when Claude exits.
*
* Silently no-ops (returns `false`) if `claude` isn't on PATH; setup runs
* where the binary is guaranteed to exist (we install it in the auth step),
@@ -78,8 +87,6 @@ export async function offerClaudeHandoff(ctx: HandoffContext): Promise<boolean>
return false;
}
const systemPrompt = buildSystemPrompt(ctx);
note(
[
"I'm handing you off to Claude in interactive mode.",
@@ -90,18 +97,39 @@ export async function offerClaudeHandoff(ctx: HandoffContext): Promise<boolean>
'Handing off to Claude',
);
return spawnInteractiveClaude(buildHandoffPrompt(ctx));
}
// One session shared by every interactive handoff in this setup-driver
// process. We pin the id ourselves (--session-id) on the first spawn because
// stdio is inherited and Claude's own id is never visible to us; subsequent
// spawns --resume it so Claude remembers earlier handoffs. Separate from
// claude-assist's non-interactive session — the two formats don't mix.
const handoffSessionId = randomUUID();
let handoffSessionStarted = false;
/**
* Spawn interactive Claude with the handoff context auto-submitted as the
* first user message. Resolves when Claude exits and control returns to
* the setup driver.
*/
function spawnInteractiveClaude(prompt: string): Promise<boolean> {
const sessionArgs = handoffSessionStarted
? ['--resume', handoffSessionId]
: ['--session-id', handoffSessionId];
return new Promise<boolean>((resolve) => {
const child = spawn(
'claude',
[
'--append-system-prompt',
systemPrompt,
prompt,
'--permission-mode',
'acceptEdits',
'auto',
...sessionArgs,
],
{ stdio: 'inherit' },
);
child.on('close', () => {
handoffSessionStarted = true;
p.log.success(brandBody("Back from Claude. Let's continue."));
resolve(true);
});
@@ -164,20 +192,20 @@ function isClaudeUsable(): boolean {
}
}
function buildSystemPrompt(ctx: HandoffContext): string {
function buildHandoffPrompt(ctx: HandoffContext): string {
const lines: string[] = [
`The user is running NanoClaw's interactive \`setup:auto\` flow to wire the ${ctx.channel} channel.`,
`They got stuck at the step: "${ctx.step}" (${ctx.stepDescription}) and asked for help.`,
`I'm running NanoClaw's interactive \`setup:auto\` flow to wire the ${ctx.channel} channel`,
`and got stuck at the step: "${ctx.step}" (${ctx.stepDescription}).`,
'',
"Your job: help them complete this specific step and get back to setup.",
"You can read files, run commands (with acceptEdits permissions), search the web,",
"and explain concepts. Be concise. When they're ready to resume, tell them to type",
"/exit and they'll return to the setup flow at the same step.",
'Help me complete this specific step and get back to setup.',
'You can read files, run commands, search the web,',
"and explain concepts. Be concise. When I'm ready to resume, remind me to type",
"/exit and I'll return to the setup flow at the same step.",
'',
];
if (ctx.completedSteps && ctx.completedSteps.length > 0) {
lines.push('Steps they have already completed:');
lines.push("Steps I've already completed:");
for (const s of ctx.completedSteps) lines.push(`${s}`);
lines.push('');
}
@@ -243,8 +271,6 @@ async function offerFailureHandoff(
);
if (!want) return false;
const systemPrompt = buildFailureSystemPrompt(ctx, projectRoot);
note(
[
"Launching Claude to help debug this failure.",
@@ -255,29 +281,10 @@ async function offerFailureHandoff(
'Handing off to Claude',
);
return new Promise<boolean>((resolve) => {
const child = spawn(
'claude',
[
'--append-system-prompt',
systemPrompt,
'--permission-mode',
'acceptEdits',
],
{ stdio: 'inherit' },
);
child.on('close', () => {
p.log.success(brandBody("Back from Claude. Let's continue."));
resolve(true);
});
child.on('error', () => {
p.log.error("Couldn't launch Claude. Continuing without handoff.");
resolve(false);
});
});
return spawnInteractiveClaude(buildFailurePrompt(ctx, projectRoot));
}
function buildFailureSystemPrompt(ctx: AssistContext, projectRoot: string): string {
function buildFailurePrompt(ctx: AssistContext, projectRoot: string): string {
const stepRefs = STEP_FILES[ctx.stepName] ?? [];
const references = [
...BIG_PICTURE_FILES,
@@ -289,20 +296,20 @@ function buildFailureSystemPrompt(ctx: AssistContext, projectRoot: string): stri
].filter((v, i, a) => a.indexOf(v) === i);
const lines: string[] = [
"The user is running NanoClaw's interactive setup flow and hit a failure.",
"I'm running NanoClaw's interactive setup flow and hit a failure.",
'',
`Failed step: ${ctx.stepName}`,
`Error: ${ctx.msg}`,
];
if (ctx.hint) lines.push(`Hint: ${ctx.hint}`);
if (ctx.hint) lines.push(`Hint shown to me: ${ctx.hint}`);
lines.push(
'',
'Your job: help them diagnose and fix this issue. Read the referenced files',
'and logs to understand what went wrong, then help them fix it. You can read',
'files, run commands, check logs, and explain what happened. Be concise.',
"When they're ready to resume setup, tell them to type /exit.",
'Help me diagnose and fix this issue. Read the referenced files and logs',
'to understand what went wrong, then help me fix it. You can read files,',
'run commands, check logs, and explain what happened. Be concise.',
"When I'm ready to resume setup, remind me to type /exit.",
'',
'Relevant files (read as needed with the Read tool):',
);
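
The shared-session behaviour described in this file's diff (pin an id on the first spawn, resume it on later ones) can be isolated from the process spawning. The following is a sketch only; `HandoffSession` is a hypothetical class, whereas the diff keeps the same state in two module-level variables.

```typescript
import { randomUUID } from 'node:crypto';

// Hypothetical wrapper around the session state kept at module level in the diff.
class HandoffSession {
  private readonly id = randomUUID();
  private started = false;

  // Args for the next spawn: pin our own id first, resume it afterwards.
  nextArgs(): string[] {
    return this.started ? ['--resume', this.id] : ['--session-id', this.id];
  }

  // Call once a spawned process has exited, mirroring the `close` handler.
  markStarted(): void {
    this.started = true;
  }
}

const session = new HandoffSession();
const firstArgs = session.nextArgs();
session.markStarted();
const laterArgs = session.nextArgs();
console.log(firstArgs[0], laterArgs[0]);
```

Keeping the flag flip in the exit handler, as the diff does, means a spawn that fails to launch does not switch later attempts to `--resume` for a session that was never created.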
+16 -7
View File
@@ -16,7 +16,13 @@ const INSTALL_ID_PATH = path.join('data', 'install-id');
let cached: string | null = null;
export function installId(): string {
/**
* `persist: false` reads an existing id but never creates `data/install-id`;
* this is required by the uninstall path, which must not mutate the filesystem
* before (or instead of) removing it. Events in one process still join:
* the generated id is cached.
*/
export function installId(persist = true): string {
if (cached) return cached;
try {
const existing = fs.readFileSync(INSTALL_ID_PATH, 'utf-8').trim();
@@ -28,11 +34,13 @@ export function installId(): string {
// fall through to create
}
const id = randomUUID().toLowerCase();
try {
fs.mkdirSync(path.dirname(INSTALL_ID_PATH), { recursive: true });
fs.writeFileSync(INSTALL_ID_PATH, id);
} catch {
// best-effort; still return the id so the event fires
if (persist) {
try {
fs.mkdirSync(path.dirname(INSTALL_ID_PATH), { recursive: true });
fs.writeFileSync(INSTALL_ID_PATH, id);
} catch {
// best-effort; still return the id so the event fires
}
}
cached = id;
return id;
@@ -41,6 +49,7 @@ export function installId(): string {
export function emit(
event: string,
props: Record<string, string | number | boolean | undefined> = {},
opts: { persistId?: boolean } = {},
): void {
if (process.env.NANOCLAW_NO_DIAGNOSTICS === '1') return;
@@ -53,7 +62,7 @@ export function emit(
const body = JSON.stringify({
api_key: POSTHOG_KEY,
event,
distinct_id: installId(),
distinct_id: installId(opts.persistId !== false),
properties: cleaned,
});
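
The `persist` flag above can be exercised in isolation. The sketch below is an assumption-laden reconstruction: `makeInstallId` is a hypothetical factory that takes the id path as a parameter (the real module uses a fixed `data/install-id`), and the handling of an existing file is guessed because that part of the function is elided from the diff.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';
import { randomUUID } from 'node:crypto';

// Hypothetical factory: one cached id per returned function.
function makeInstallId(idPath: string): (persist?: boolean) => string {
  let cached: string | null = null;
  return (persist = true) => {
    if (cached) return cached;
    try {
      // Assumed behaviour: reuse a non-empty id already on disk.
      const existing = fs.readFileSync(idPath, 'utf-8').trim();
      if (existing) {
        cached = existing;
        return existing;
      }
    } catch {
      // no file yet; fall through and generate one
    }
    const id = randomUUID().toLowerCase();
    if (persist) {
      try {
        fs.mkdirSync(path.dirname(idPath), { recursive: true });
        fs.writeFileSync(idPath, id);
      } catch {
        // best-effort; still return the id
      }
    }
    cached = id;
    return id;
  };
}

// Demo in a throwaway temp directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'install-id-'));
const demoPath = path.join(demoDir, 'data', 'install-id');
const readOnlyId = makeInstallId(demoPath);
const generated = readOnlyId(false); // generates and caches, writes nothing
```

With `persist` set to `false` the id is still stable within the process because of the cache, which is what lets several events from one uninstall run share a `distinct_id`.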
+28
View File
@@ -0,0 +1,28 @@
/**
* The agent runtime the operator picked in THIS setup run.
*
* There is no install-wide default provider and no `--provider` in the
* creation contract: provider is a DB property of a group. Setup is the one
* orchestrator that knows the operator's pick, so it stashes it here (set once
* at the auth step). The group-creation scripts (`init-first-agent`,
* `init-cli-agent`) run as **child processes**, so the pick is carried over the
* process boundary via an environment variable they inherit; they apply it to
* the group at creation, before the welcome wakes the container. This is the
* only place the value lives: a setup-run-scoped global, NOT a persisted
* install default. `undefined` / `'claude'` means the built-in default and no
* provider write at all.
*/
const ENV_KEY = 'NANOCLAW_PICKED_PROVIDER';
export function setPickedProvider(provider: string | undefined): void {
const normalized = provider?.trim().toLowerCase() || undefined;
if (normalized && normalized !== 'claude') {
process.env[ENV_KEY] = normalized;
} else {
delete process.env[ENV_KEY];
}
}
export function getPickedProvider(): string | undefined {
return process.env[ENV_KEY]?.trim().toLowerCase() || undefined;
}
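The process-boundary point in the comment above can be illustrated with a short sketch. It assumes Node's default behavior that a spawned child inherits `process.env`; the inline `-e` script stands in for a creation script such as `init-first-agent` and is purely illustrative.

```typescript
import { spawnSync } from 'node:child_process';

const ENV_KEY = 'NANOCLAW_PICKED_PROVIDER';

// Same normalization as the module above: trim, lowercase, and treat the
// built-in default ('claude') as "no pick".
function setPickedProvider(provider: string | undefined): void {
  const normalized = provider?.trim().toLowerCase() || undefined;
  if (normalized && normalized !== 'claude') {
    process.env[ENV_KEY] = normalized;
  } else {
    delete process.env[ENV_KEY];
  }
}

// Illustrative stand-in for a child creation script: it only reports what it
// inherited from the parent's environment.
function readPickInChild(): string {
  const child = spawnSync(
    process.execPath,
    ['-e', `process.stdout.write(process.env.${ENV_KEY} ?? '(unset)')`],
    { encoding: 'utf-8' },
  );
  return child.stdout;
}

setPickedProvider('  Codex ');
const seenWithPick = readPickInChild();

setPickedProvider('claude');
const seenWithDefault = readPickInChild();
```

A module-level variable would not survive this hop, which is presumably why the pick is stored in the environment rather than in memory.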
+26
@@ -132,6 +132,32 @@ export const CONFIG: Entry[] = [
type: 'boolean',
default: false,
},
// Uninstall route — handled in auto.ts before any setup work begins.
{
key: 'uninstall',
label: 'Uninstall',
help: 'Remove this NanoClaw copy (service, containers, data, vault agents). Asks per group.',
surface: 'flag',
type: 'boolean',
default: false,
},
{
key: 'dryRun',
label: 'Uninstall dry run',
help: 'With --uninstall: preview what would be removed without changing anything.',
surface: 'flag',
type: 'boolean',
default: false,
},
{
key: 'yes',
label: 'Uninstall without prompts',
help: 'With --uninstall: delete everything found without asking (orphan vault agents are still kept).',
surface: 'flag',
type: 'boolean',
default: false,
},
];
// ─── name derivation ───────────────────────────────────────────────────
+48
@@ -0,0 +1,48 @@
/**
* versions.json is the machine-checkable source for sanctioned component
* versions: setup steps read it, /update-nanoclaw diffs it across updates.
* These tests go red if the file, the pin, or the onecli-step wiring is
* deleted; the pin moving back to a hardcoded constant is the regression
* this guards against.
*/
import fs from 'fs';
import path from 'path';
import { fileURLToPath } from 'url';
import { describe, expect, it } from 'vitest';
import { readVersionPin } from './version-pins.js';
const here = path.dirname(fileURLToPath(import.meta.url));
describe('readVersionPin', () => {
it('resolves the onecli-gateway pin from the real versions.json', () => {
expect(readVersionPin('onecli-gateway')).toMatch(/^\d+\.\d+\.\d+$/);
});
it('resolves the onecli-cli pin from the real versions.json', () => {
expect(readVersionPin('onecli-cli')).toMatch(/^\d+\.\d+\.\d+$/);
});
it('throws for a component with no pin', () => {
expect(() => readVersionPin('no-such-component')).toThrow(/no pin/);
});
});
describe('onecli step wiring', () => {
it('reads its gateway pin from versions.json, not a hardcoded constant', () => {
const source = fs.readFileSync(path.join(here, '..', 'onecli.ts'), 'utf-8');
expect(source).toContain("readVersionPin('onecli-gateway')");
expect(source).not.toMatch(/ONECLI_GATEWAY_VERSION = '\d/);
});
it('reads its CLI pin from versions.json and never resolves "latest"', () => {
const source = fs.readFileSync(path.join(here, '..', 'onecli.ts'), 'utf-8');
expect(source).toContain("readVersionPin('onecli-cli')");
expect(source).not.toMatch(/ONECLI_CLI(?:_FALLBACK)?_VERSION = '\d/);
// The upstream installer and the /releases/latest redirect probe both
// chase "latest" — reintroducing either bypasses the sanctioned pin.
expect(source).not.toContain('onecli.sh/cli/install');
expect(source).not.toContain('/releases/latest');
});
});
+31
@@ -0,0 +1,31 @@
/**
* Sanctioned version pins for external components (`versions.json` at the
* repo root): the single machine-checkable source. Setup steps read their
* pin here; `/update-nanoclaw` diffs the file across an update and routes
* the user to the migration doc for any pin that moved (see CONTRIBUTING.md,
* "Breaking changes").
*/
import fs from 'fs';
import path from 'path';
import { fileURLToPath } from 'url';
const VERSIONS_FILE = path.resolve(
path.dirname(fileURLToPath(import.meta.url)),
'..',
'..',
'versions.json',
);
/**
* Returns the pinned version for a component, e.g.
* `readVersionPin('onecli-gateway')`. Throws when the file or the pin is
* missing; a missing pin is an install-tree defect, not a runtime condition.
*/
export function readVersionPin(component: string): string {
const pins: unknown = JSON.parse(fs.readFileSync(VERSIONS_FILE, 'utf-8'));
const value = (pins as Record<string, unknown>)[component];
if (typeof value !== 'string' || value.length === 0) {
throw new Error(`versions.json has no pin for "${component}"`);
}
return value;
}
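The header comment says `/update-nanoclaw` diffs `versions.json` across an update. How that skill actually performs the diff is not shown here, so the following is only a sketch under assumptions: `diffPins` and `MovedPin` are hypothetical names, and the "after" version `1.24.0` is invented for illustration.

```typescript
type Pins = Record<string, string>;

interface MovedPin {
  component: string;
  from: string | undefined; // undefined: pin did not exist before the update
  to: string | undefined; // undefined: pin was removed by the update
}

// Hypothetical helper: lists every component whose pin differs between the
// pre-update and post-update versions.json contents.
function diffPins(before: Pins, after: Pins): MovedPin[] {
  const names = Object.keys(before)
    .concat(Object.keys(after))
    .filter((name, i, all) => all.indexOf(name) === i)
    .sort();
  return names
    .filter((name) => before[name] !== after[name])
    .map((name) => ({ component: name, from: before[name], to: after[name] }));
}

// Illustrative data only; '1.24.0' is not a real pin from this change.
const moved = diffPins(
  { 'onecli-gateway': '1.23.0', 'onecli-cli': '1.3.0' },
  { 'onecli-gateway': '1.24.0', 'onecli-cli': '1.3.0' },
);
```

Each entry in `moved` would then be a candidate for routing the user to the matching migration doc.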
+29
@@ -0,0 +1,29 @@
/**
* The step DETECTS gateway /v1 compatibility and warns (pointing at
* docs/onecli-upgrades.md). It does not migrate the gateway; that's the
* agent's job via /update-nanoclaw. The verify helper must distinguish
* incompatible (pre-/v1 server: warn) from unreachable (transient: nothing to
* say) so the warning only fires on a real pre-/v1 server.
*/
import { describe, expect, it } from 'vitest';
import { verifyGatewayV1 } from './onecli.js';
function fakeFetch(behavior: 'ok' | '404' | 'down'): typeof fetch {
return (async () => {
if (behavior === 'down') throw new Error('ECONNREFUSED');
return { ok: behavior === 'ok' } as Response;
}) as unknown as typeof fetch;
}
describe('verifyGatewayV1', () => {
it('ok when /v1/health answers', async () => {
expect(await verifyGatewayV1('http://x', fakeFetch('ok'))).toBe('ok');
});
it('incompatible when the server answers HTTP without /v1', async () => {
expect(await verifyGatewayV1('http://x', fakeFetch('404'))).toBe('incompatible');
});
it('unreachable on connection failure', async () => {
expect(await verifyGatewayV1('http://x', fakeFetch('down'))).toBe('unreachable');
});
});
+61 -54
@@ -17,6 +17,7 @@ import os from 'os';
import path from 'path';
import { log } from '../src/log.js';
import { readVersionPin } from './lib/version-pins.js';
import { emitStatus } from './status.js';
const LOCAL_BIN = path.join(os.homedir(), '.local', 'bin');
@@ -102,20 +103,18 @@ function writeEnvOnecliUrl(url: string): void {
writeEnvVar('ONECLI_URL', url);
}
// Last-known-good CLI release. Used only if BOTH the upstream installer
// and the redirect-based version probe fail. Bump deliberately when a
// new CLI release ships.
const ONECLI_GATEWAY_VERSION = '1.23.0';
const ONECLI_CLI_FALLBACK_VERSION = '1.3.0';
// The SANCTIONED gateway version: fresh installs pin to it. Upgrading an
// existing gateway is NOT done here — the gateway is a separate out-of-band
// component, and the migrator is the user's coding agent following
// docs/onecli-upgrades.md during /update-nanoclaw. The pin lives in
// versions.json ("onecli-gateway") so that flow can diff it across updates and
// route the agent to the doc; bump it there deliberately on a new release.
const ONECLI_GATEWAY_VERSION = readVersionPin('onecli-gateway');
// The CLI binary follows the same convention: installed at its pin
// ("onecli-cli" in versions.json), never at whatever "latest" means today.
const ONECLI_CLI_VERSION = readVersionPin('onecli-cli');
const ONECLI_CLI_REPO = 'onecli/onecli-cli';
function installOnecliCliOnly(): { stdout: string; ok: boolean } {
const upstream = runInstall('curl -fsSL onecli.sh/cli/install | sh');
if (upstream.ok) return { stdout: upstream.stdout, ok: true };
const fallback = installOnecliCliDirect();
return { stdout: upstream.stdout + (upstream.stderr ?? '') + '\n' + fallback.stdout, ok: fallback.ok };
}
// Remove containers in the "onecli" compose project whose service name isn't
// in the v2 set. Pre-v2 OneCLI used service "app" (container onecli-app-1);
// v2 uses "onecli". Compose flags the old container as an orphan but won't
@@ -161,24 +160,10 @@ function installOnecli(): { stdout: string; ok: boolean } {
return { stdout: stdout + (gw.stderr ?? ''), ok: false };
}
// CLI install. The upstream script calls the GitHub releases API
// (api.github.com) to resolve the latest tag — which 403s anonymous
// callers after 60 requests/hour per IP. Try upstream first; on failure
// resolve the version ourselves (via HTTP redirect, which isn't
// API-throttled) and download the release archive directly.
const upstream = runInstall('curl -fsSL onecli.sh/cli/install | sh');
stdout += upstream.stdout;
if (upstream.ok) return { stdout, ok: true };
log.warn('Upstream CLI installer failed — falling back to direct download', {
stderr: upstream.stderr,
});
stdout += (upstream.stderr ?? '') + '\n';
const fallback = installOnecliCliDirect();
stdout += fallback.stdout;
if (!fallback.ok) {
log.error('OneCLI CLI install failed (both upstream and direct fallback)');
const cli = installOnecliCliDirect();
stdout += cli.stdout;
if (!cli.ok) {
log.error('OneCLI CLI install failed');
return { stdout, ok: false };
}
return { stdout, ok: true };
@@ -198,11 +183,11 @@ function runInstall(cmd: string): { stdout: string; stderr?: string; ok: boolean
}
/**
* Reinstate the OneCLI CLI install without hitting GitHub's rate-limited
* releases API. Resolves the version via the HTTP redirect from
* /releases/latest to /releases/tag/vX.Y.Z, then downloads the archive
* directly. Falls back to ONECLI_CLI_FALLBACK_VERSION if the redirect
* probe also fails.
* Install the OneCLI CLI at the sanctioned pin by downloading the release
* archive straight from GitHub. Deliberately no "latest" resolution: the
* upstream installer script always chases the newest release, which would
* drift from the pin. PATH setup is not lost by skipping it:
* ensureShellProfilePath() in run() covers it.
*/
function installOnecliCliDirect(): { stdout: string; ok: boolean } {
const lines: string[] = [];
@@ -221,24 +206,7 @@ function installOnecliCliDirect(): { stdout: string; ok: boolean } {
return { stdout: lines.join('\n'), ok: false };
}
let version: string | null = null;
try {
const redirect = execSync(
`curl -fsSL -o /dev/null -w '%{url_effective}' https://github.com/${ONECLI_CLI_REPO}/releases/latest`,
{ encoding: 'utf-8', stdio: ['ignore', 'pipe', 'pipe'] },
).trim();
const m = redirect.match(/\/tag\/v?([^/]+)$/);
if (m) version = m[1];
} catch {
// redirect probe failed — we'll pin the fallback
}
if (!version) {
version = ONECLI_CLI_FALLBACK_VERSION;
append(`Version probe failed; installing pinned fallback ${version}.`);
} else {
append(`Resolved onecli CLI ${version} via release redirect.`);
}
const version = ONECLI_CLI_VERSION;
const archive = `onecli_${version}_${osName}_${arch}.tar.gz`;
const url = `https://github.com/${ONECLI_CLI_REPO}/releases/download/v${version}/${archive}`;
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'onecli-'));
@@ -275,6 +243,39 @@ function installOnecliCliDirect(): { stdout: string; ok: boolean } {
}
}
/**
* /v1 API compatibility check. @onecli-sh/sdk 2.x requires the server's /v1
* API; servers older than the cutover answer 404 on every SDK call (permanent,
* but presents as transient per-spawn failures). This is detect-only: setup
* does not migrate the gateway. The upgrade is an out-of-band action on a
* separate component that the agent runs via docs/onecli-upgrades.md during
* /update-nanoclaw, so this step only surfaces the condition and points there.
*/
export async function verifyGatewayV1(
url: string,
fetchImpl: typeof fetch = fetch,
): Promise<'ok' | 'incompatible' | 'unreachable'> {
try {
const res = await fetchImpl(`${url}/v1/health`, { signal: AbortSignal.timeout(5000) });
return res.ok ? 'ok' : 'incompatible';
} catch {
return 'unreachable';
}
}
/**
* Detect-and-warn helper: returns a status HINT (and logs) when the gateway is
* pre-/v1, else null. Never fails the step or auto-upgrades; the agent owns
* the upgrade via docs/onecli-upgrades.md.
*/
function gatewayV1Hint(result: 'ok' | 'incompatible' | 'unreachable'): string | null {
if (result !== 'incompatible') return null;
log.warn('OneCLI gateway lacks the /v1 API @onecli-sh/sdk 2.x requires', {
pin: ONECLI_GATEWAY_VERSION,
});
return 'OneCLI gateway lacks the /v1 API @onecli-sh/sdk 2.x requires — upgrade it: docs/onecli-upgrades.md';
}
export async function pollHealth(url: string, timeoutMs: number): Promise<boolean> {
// `/api/health` matches the path probe.sh uses — keep them aligned.
const deadline = Date.now() + timeoutMs;
@@ -300,7 +301,7 @@ export async function run(args: string[]): Promise<void> {
// Remote-mode: install only the CLI, point it at the remote gateway, and
// record the URL in .env. No local gateway is started.
log.info('Installing OneCLI CLI for remote gateway', { remoteUrl });
const res = installOnecliCliOnly();
const res = installOnecliCliDirect();
if (!res.ok || !onecliVersion()) {
emitStatus('ONECLI', {
INSTALLED: false,
@@ -339,12 +340,14 @@ export async function run(args: string[]): Promise<void> {
log.info('Wrote ONECLI_API_KEY to .env');
}
const healthy = await pollHealth(remoteUrl, 5000);
const v1Hint = healthy ? gatewayV1Hint(await verifyGatewayV1(remoteUrl)) : null;
emitStatus('ONECLI', {
INSTALLED: true,
REMOTE: true,
ONECLI_URL: remoteUrl,
HEALTHY: healthy,
STATUS: 'success',
...(v1Hint ? { GATEWAY_HINT: v1Hint } : {}),
LOG: 'logs/setup.log',
});
return;
@@ -378,12 +381,14 @@ export async function run(args: string[]): Promise<void> {
writeEnvOnecliUrl(url);
log.info('Reusing existing OneCLI', { url });
const healthy = await pollHealth(url, 5000);
const v1Hint = healthy ? gatewayV1Hint(await verifyGatewayV1(url)) : null;
emitStatus('ONECLI', {
INSTALLED: true,
REUSED: true,
ONECLI_URL: url,
HEALTHY: healthy,
STATUS: 'success',
...(v1Hint ? { GATEWAY_HINT: v1Hint } : {}),
LOG: 'logs/setup.log',
});
return;
@@ -436,6 +441,7 @@ export async function run(args: string[]): Promise<void> {
log.info('Wrote ONECLI_URL to .env', { url });
const healthy = await pollHealth(url, 15000);
const v1Hint = healthy ? gatewayV1Hint(await verifyGatewayV1(url)) : null;
emitStatus('ONECLI', {
INSTALLED: true,
@@ -446,6 +452,7 @@ export async function run(args: string[]): Promise<void> {
// The next step (auth) will surface a genuinely broken gateway via
// `onecli secrets list`, so don't trigger rescue attempts from here.
STATUS: 'success',
...(v1Hint ? { GATEWAY_HINT: v1Hint } : {}),
...(healthy
? {}
: {
+80
@@ -0,0 +1,80 @@
/**
* Standalone provider auth: the late-adopter entry point.
*
* Fresh installs reach a provider's auth walk-through via the setup picker;
* an existing install adding a provider later runs THIS instead:
*
* pnpm exec tsx setup/index.ts --step provider-auth codex
*
* Same walk-through, same vault-only invariant, idempotent (each provider's
* runAuth short-circuits when its secret already exists), and unlike
* re-running full setup, it touches nothing else: no install-wide default
* provider rewrite, no service changes. Provider install skills call this as
* their auth step so there is exactly one auth implementation per provider.
*/
import { execSync } from 'child_process';
import fs from 'fs';
import path from 'path';
import { getSetupProvider, listSetupProviders } from './providers/registry.js';
// Provider payloads self-register on import.
import './providers/index.js';
// Hard-wired install scripts — the audited control surface (no branch
// enumeration). Each setup/add-<name>.sh is idempotent and self-skips when the
// payload is already wired. Codex is the only manifest-style provider today.
const INSTALL_SCRIPTS: Record<string, string> = {
codex: 'setup/add-codex.sh',
};
export async function run(args: string[]): Promise<void> {
const name = args[0]?.trim().toLowerCase();
const withAuth = listSetupProviders().filter((entry) => entry.runAuth);
if (!name) {
console.error(
`Usage: pnpm exec tsx setup/index.ts --step provider-auth <provider>\n` +
`Providers with an auth step: ${withAuth.map((entry) => entry.value).join(', ') || '(none installed)'}`,
);
process.exit(1);
}
let entry = getSetupProvider(name);
const script = INSTALL_SCRIPTS[name];
if (script) {
// Install OR refresh: the script is idempotent and is also the upgrade
// path — payload files resync and a bumped Dockerfile pin replaces the
// local one. Rebuild the image only when the Dockerfile actually changed
// (payload code is mounted, not baked).
const dfPath = path.join(process.cwd(), 'container', 'Dockerfile');
const dfBefore = fs.readFileSync(dfPath, 'utf-8');
console.log(`${entry ? 'Refreshing' : 'Installing'} ${name}`);
execSync(`bash ${script}`, { stdio: 'inherit' });
if (fs.readFileSync(dfPath, 'utf-8') !== dfBefore) {
console.log('Dockerfile pin changed — rebuilding the container image…');
execSync('./container/build.sh', { stdio: 'inherit' });
}
if (!entry) {
await import(`./providers/${name}.js`);
entry = getSetupProvider(name);
}
if (!entry) {
console.error(`Install completed but ${name} did not register — check setup/providers/${name}.ts`);
process.exit(1);
}
} else if (!entry) {
console.error(
`Unknown provider: ${name}. Installed: ${listSetupProviders()
.map((e) => e.value)
.join(', ')}.`,
);
process.exit(1);
}
if (!entry.runAuth) {
console.error(`Provider "${name}" uses the standard auth flow — run the full setup, or /add-${name}'s steps.`);
process.exit(1);
}
await entry.runAuth();
await entry.runInstallCheck?.();
}
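The "rebuild only when the Dockerfile actually changed" step above is a snapshot-and-compare pattern. A minimal sketch of it follows; `changedBy` is a hypothetical helper name, and the temp file stands in for `container/Dockerfile`.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical helper: snapshot a file, run an action that may rewrite it,
// and report whether its contents differ afterwards.
function changedBy(filePath: string, action: () => void): boolean {
  const before = fs.readFileSync(filePath, 'utf-8');
  action();
  return fs.readFileSync(filePath, 'utf-8') !== before;
}

const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'rebuild-check-'));
const tmpFile = path.join(tmpDir, 'Dockerfile');
fs.writeFileSync(tmpFile, 'FROM scratch\n');

// An idempotent re-run that leaves the file alone: no rebuild needed.
const unchanged = changedBy(tmpFile, () => {});

// A run that bumps something in the file: a rebuild would be triggered.
const bumped = changedBy(tmpFile, () => fs.appendFileSync(tmpFile, 'ARG EXAMPLE=1\n'));
```

Comparing full contents keeps the check independent of timestamps, so an install script that rewrites the file with identical bytes does not force a rebuild.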
+83
@@ -0,0 +1,83 @@
import { describe, it, expect } from 'vitest';
import fs from 'fs';
import path from 'path';
import { fileURLToPath } from 'url';
/**
* Provider is a DB property of a group, set only via
* `ncl groups config update --provider`. The group-creation contract that a
* fork's coding agent and its skills depend on must carry zero provider
* vocabulary: no `--provider` flag passed to, parsed by, or threaded through
* any creation path. These guards go red if that flag creeps back in.
*
* (Prose references to the ncl surface in comments are fine; we assert the
* absence of the `'--provider'` arg *literal*, not the substring.)
*/
const repoRoot = path.resolve(path.dirname(fileURLToPath(import.meta.url)), '..');
function read(rel: string): string {
return fs.readFileSync(path.join(repoRoot, rel), 'utf-8');
}
const CREATION_FILES = [
'scripts/init-first-agent.ts',
'scripts/init-cli-agent.ts',
'setup/register.ts',
'setup/cli-agent.ts',
'setup/channels/telegram.ts',
'setup/channels/discord.ts',
'setup/channels/slack.ts',
'setup/channels/whatsapp.ts',
'setup/channels/signal.ts',
'setup/channels/imessage.ts',
'setup/channels/teams.ts',
];
describe('creation is provider-agnostic', () => {
for (const file of CREATION_FILES) {
it(`${file} passes/parses no --provider flag`, () => {
const src = read(file);
expect(src).not.toContain("'--provider'");
expect(src).not.toMatch(/case '--provider'/);
});
}
});
describe('setup carries the picked provider to creation via a setup-run env var', () => {
it('picked-provider stashes/reads the pick in the NANOCLAW_PICKED_PROVIDER env var', () => {
const src = read('setup/lib/picked-provider.ts');
expect(src).toContain('NANOCLAW_PICKED_PROVIDER');
// The pick is set into process.env so child creation scripts inherit it —
// an in-process module global can't cross the process boundary.
expect(src).toMatch(/process\.env\[/);
});
// The creation scripts run as child processes, inherit the env var, and apply
// it to the group's runtime config — container_configs.provider, the source of
// truth materialized into container.json (agent_provider is deprecated) — before
// the welcome wakes the container. No `--provider` flag in the contract (above).
for (const file of ['scripts/init-first-agent.ts', 'scripts/init-cli-agent.ts']) {
it(`${file} applies the env-carried provider to container_configs.provider`, () => {
const src = read(file);
expect(src).toContain('NANOCLAW_PICKED_PROVIDER');
expect(src).toMatch(/updateContainerConfigScalars\([^)]*provider:\s*pickedProvider/);
});
}
});
describe('codex installs from a hard-wired self-contained script', () => {
// The provider picker no longer enumerates a remote manifest branch (an
// unaudited control surface). Codex is offered in trunk and installed by its
// own setup/add-<name>.sh, exactly like a channel adapter.
it('setup/add-codex.sh exists', () => {
expect(fs.existsSync(path.join(repoRoot, 'setup/add-codex.sh'))).toBe(true);
});
it('setup/auto.ts installs the picked provider by running setup/add-<name>.sh', () => {
const src = read('setup/auto.ts');
expect(src).toContain('setup/add-${agentProvider}.sh');
// The removed branch-enumeration machinery must not creep back in.
expect(src).not.toContain('listBranchProviderManifests');
expect(src).not.toContain('installProviderFromBranch');
});
});
+3
@@ -0,0 +1,3 @@
// Setup-side provider barrel. Provider payloads with their own setup surface
// (picker entry, auth walk-through, install check) self-register on import.
// Skills add a provider by appending one import line below.
+43
@@ -0,0 +1,43 @@
/**
* Setup-side provider registration guards.
*
* Behavior (barrel-driven): imports the real setup/providers barrel and
* asserts the built-in default; red if the barrel fails to evaluate.
* Per-provider registration guards ship WITH each provider payload (the
* skill copies them in), same archetype as the host/container registration
* tests.
*
* Structural: the picker and the standalone provider-auth step are wiring
* inside non-invocable entry flows (setup main, STEPS map); assert their
* consumption of the registry in source, so deleting either reach-in goes red.
*/
import fs from 'fs';
import path from 'path';
import { describe, expect, it } from 'vitest';
import { getSetupProvider, listSetupProviders } from './registry.js';
import './index.js'; // the real setup provider barrel — triggers self-registration
describe('setup provider registry', () => {
it('always carries claude as the built-in default with the standard auth flow', () => {
const claude = getSetupProvider('claude');
expect(claude).toBeDefined();
expect(claude!.runAuth).toBeUndefined();
expect(listSetupProviders()[0]!.value).toBe('claude');
});
});
describe('setup flow consumes the registry (structural)', () => {
it('the picker renders options from listSetupProviders', () => {
const src = fs.readFileSync(path.join(process.cwd(), 'setup', 'auto.ts'), 'utf-8');
expect(src).toContain('listSetupProviders()');
expect(src).toContain("import './providers/index.js'");
// The capability-keyed branch — a provider's own auth runs iff it declares one.
expect(src).toMatch(/providerEntry\?\.runAuth/);
});
it('the standalone provider-auth step is reachable from the STEPS map', () => {
const src = fs.readFileSync(path.join(process.cwd(), 'setup', 'index.ts'), 'utf-8');
expect(src).toContain("'provider-auth'");
});
});
+59
@@ -0,0 +1,59 @@
/**
* Setup-side provider registry: the picker and the standalone `provider-auth`
* step render from this map instead of hardcoding provider names in the setup
* flow (same capability-not-name rule as the host provider-container registry).
*
* `claude` is the built-in default: it has no `runAuth` of its own, which the
* setup flow reads as "run the standard auth step". A provider payload adds
* itself by shipping a `setup/providers/<name>.ts` with a top-level
* `registerSetupProvider(...)` call and appending one import line to the
* `setup/providers/index.ts` barrel: the same shape as the host and container
* provider registries, guarded the same way (a barrel-driven registration test).
*/
import type { AssistContext } from '../lib/claude-assist.js'; // type-only — registry stays runtime-dependency-free
/**
* Outcome of a provider-owned failure-assist hook:
* - 'launched': the provider's debugger ran (user may have fixed things).
* - 'declined': the user said no; do NOT offer another debugger.
* - 'unavailable': the provider's CLI can't be used here; the dispatcher
* falls back to the guarded Claude offer (never install/sign-in).
*/
export type FailureAssistResult = 'launched' | 'declined' | 'unavailable';
export interface SetupProviderEntry {
value: string;
label: string;
hint: string;
/** Provider-owned auth walk-through (vault-only). Absent → standard auth step. */
runAuth?: () => Promise<void>;
/** Verifies the provider's payload is wired (files, barrels, Dockerfile pin). */
runInstallCheck?: () => Promise<void>;
/** Provider-owned interactive failure debugger. 'unavailable' → dispatcher
* falls back to the guarded Claude offer (never install/sign-in). */
offerFailureAssist?: (ctx: AssistContext, projectRoot: string) => Promise<FailureAssistResult>;
}
const registry = new Map<string, SetupProviderEntry>();
registry.set('claude', {
value: 'claude',
label: 'Claude',
hint: 'default — Anthropic subscription or API key',
});
export function registerSetupProvider(entry: SetupProviderEntry): void {
if (registry.has(entry.value)) {
throw new Error(`Setup provider already registered: ${entry.value}`);
}
registry.set(entry.value, entry);
}
export function getSetupProvider(name: string): SetupProviderEntry | undefined {
return registry.get(name.toLowerCase());
}
/** Claude (the default) first, then the rest in registration order. */
export function listSetupProviders(): SetupProviderEntry[] {
return [...registry.values()];
}
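The capability-not-name rule described in the registry header can be shown with a trimmed-down copy of the map. This is a sketch only: the `example` provider and the `authPlan` helper are hypothetical, and the entry shape is reduced to the fields the illustration needs.

```typescript
interface SetupProviderEntry {
  value: string;
  label: string;
  hint: string;
  runAuth?: () => Promise<void>;
}

const registry = new Map<string, SetupProviderEntry>();
registry.set('claude', {
  value: 'claude',
  label: 'Claude',
  hint: 'default',
});

function registerSetupProvider(entry: SetupProviderEntry): void {
  if (registry.has(entry.value)) {
    throw new Error(`Setup provider already registered: ${entry.value}`);
  }
  registry.set(entry.value, entry);
}

// Hypothetical payload: what a setup/providers/<name>.ts file would do at
// import time. 'example' is not a real provider.
registerSetupProvider({
  value: 'example',
  label: 'Example',
  hint: 'hypothetical provider for illustration',
  runAuth: async () => {
    /* provider-owned walk-through would run here */
  },
});

// Hypothetical helper: the flow branches on whether runAuth exists, never on
// the provider's name.
function authPlan(name: string): 'provider-owned' | 'standard' | 'unknown' {
  const entry = registry.get(name.toLowerCase());
  if (!entry) return 'unknown';
  return entry.runAuth ? 'provider-owned' : 'standard';
}

// A Map iterates in insertion order, so the built-in default stays first.
const listed = Array.from(registry.values()).map((e) => e.value);
```

Because `claude` is inserted when the module is evaluated, any later registration lands after it, which is what lets `listSetupProviders()` promise "default first" without sorting.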
+8 -4
@@ -11,6 +11,7 @@ import { DATA_DIR } from '../src/config.js';
import { initDb } from '../src/db/connection.js';
import { runMigrations } from '../src/db/migrations/index.js';
import { createAgentGroup, getAgentGroupByFolder } from '../src/db/agent-groups.js';
import { ensureContainerConfig } from '../src/db/container-configs.js';
import {
createMessagingGroup,
createMessagingGroupAgent,
@@ -18,7 +19,6 @@ import {
getMessagingGroupAgentByPair,
} from '../src/db/messaging-groups.js';
import { isValidGroupFolder } from '../src/group-folder.js';
import { initGroupFilesystem } from '../src/group-init.js';
import { log } from '../src/log.js';
import { namespacedPlatformId } from '../src/platform-id.js';
import { resolveSession, writeSessionMessage } from '../src/session-manager.js';
@@ -118,7 +118,7 @@ export async function run(args: string[]): Promise<void> {
// Chat SDK adapters prefix, native adapters (WhatsApp/iMessage/Signal) don't.
parsed.platformId = namespacedPlatformId(parsed.channel, parsed.platformId);
log.info('Registering channel', parsed);
log.info('Registering channel', { ...parsed });
// Init v2 central DB
fs.mkdirSync(path.join(projectRoot, 'data'), { recursive: true });
@@ -126,7 +126,11 @@ export async function run(args: string[]): Promise<void> {
const db = initDb(dbPath);
runMigrations(db);
// 1. Create or find agent group
// 1. Create or find agent group. Provider-agnostic: provider is a DB
// property set via `ncl groups config update --provider`, not a creation
// flag. The workspace is scaffolded at the first spawn (group-init), where
// the DB-resolved provider is known; here we only ensure the config row
// exists so that update has a row to write.
let agentGroup = getAgentGroupByFolder(parsed.folder);
if (!agentGroup) {
const agId = generateId('ag');
@@ -140,7 +144,7 @@ export async function run(args: string[]): Promise<void> {
agentGroup = getAgentGroupByFolder(parsed.folder)!;
log.info('Created agent group', { id: agId, folder: parsed.folder });
}
initGroupFilesystem(agentGroup);
ensureContainerConfig(agentGroup.id);
// 2. Create or find messaging group
let messagingGroup = getMessagingGroupByPlatform(parsed.channel, parsed.platformId);
+6
@@ -11,6 +11,7 @@ import path from 'path';
import { log } from '../src/log.js';
import { getLaunchdLabel, getSystemdUnit } from '../src/install-slug.js';
import { writeUpgradeState } from '../src/upgrade-state.js';
import { cleanupUnhealthyPeers } from './peer-cleanup.js';
import {
commandExists,
@@ -54,6 +55,11 @@ export async function run(_args: string[]): Promise<void> {
fs.mkdirSync(path.join(projectRoot, 'logs'), { recursive: true });
// Stamp the upgrade marker before the host first starts, so the startup
// tripwire (enforceUpgradeTripwire) sees this as a sanctioned install.
const stamped = writeUpgradeState({ via: 'setup' });
log.info('Stamped upgrade marker', { version: stamped.version });
// Peer preflight — a crash-looping peer install (most often the legacy v1
// `com.nanoclaw` plist) will keep trashing this install's containers on
// every respawn via its own cleanupOrphans. Detect and unload any peer
+365
@@ -0,0 +1,365 @@
/**
* Uninstall flow: clack UI orchestration over scan/plan/remove.
*
* Self-deletion constraint: this flow runs on tsx out of the node_modules
* it deletes. All imports are static (loaded before any deletion), dist/
* and node_modules/ are removed last (the runtime tail), and once execution
* starts nothing here writes to logs/ (which would recreate it) or does a
* dynamic import. After the runtime tail, the only output is console.log.
*
* Removes ONLY what belongs to this checkout (per-checkout install slug).
* Each non-empty group shows a WHAT/WHERE table and asks a default-No
* confirm. Nothing is deleted until every decision has been made, so
* Ctrl-C anywhere in the confirm phase leaves the install untouched.
*/
import { spawnSync } from 'child_process';
import fs from 'fs';
import os from 'os';
import path from 'path';
import * as p from '@clack/prompts';
import k from 'kleur';
import { emit as phEmit } from '../lib/diagnostics.js';
import { note } from '../lib/theme.js';
import * as setupLog from '../logs.js';
import {
resolveOnecliDeletions,
type RunCommand,
type VaultAgent,
} from './onecli-agents.js';
import { buildRemovalPlan, type Decisions } from './plan.js';
import { executePlan, type ExecDeps } from './remove.js';
import { scanInstall, tilde, type Inventory } from './scan.js';
const GROUPS = {
service: {
title: '1) App & background service',
desc: 'Runs NanoClaw in the background. Removing this stops the assistant. None of your data lives here.',
prompt: 'Delete the app & background service shown above?',
},
data: {
title: '2) App data, logs & secrets',
desc: 'Message database, conversation history, logs, build files, and your .env (API keys / tokens). Removing this erases stored conversations and saved credentials.',
prompt: 'Delete app data, logs & secrets shown above? (erases conversations + API keys)',
},
user: {
title: "3) Your agents' memory & files",
desc: 'Notes and memory your agents created (groups/) and any migrated data (store/). Content you made — it cannot be recovered after deletion.',
prompt: "Delete your agents' memory & files shown above? (cannot be undone)",
},
onecli: {
title: '4) OneCLI credential agents',
desc: 'Per-agent entries this copy registered in the OneCLI vault. The OneCLI app, your credentials, and the gateway are NOT touched.',
},
} as const;
const runCommand: RunCommand = (cmd, args) => {
const res = spawnSync(cmd, args, { encoding: 'utf-8' });
return { status: res.status, stdout: res.stdout ?? '' };
};
export async function runUninstallFlow(opts: {
dryRun: boolean;
yes: boolean;
invokedFrom: 'flag' | 'setup-detection';
}): Promise<never> {
const { dryRun, yes } = opts;
if (!process.stdin.isTTY && !yes && !dryRun) {
console.error(
'Uninstall needs an interactive terminal. Re-run with --yes to delete everything found without prompts, or --dry-run to preview.',
);
process.exit(1);
}
const projectRoot = process.cwd();
const home = os.homedir();
p.intro(k.bold(`Uninstall NanoClaw`));
// persistId: false — the emit must not create data/install-id, which would
// both break --dry-run's "changes nothing" promise and resurrect a data/
// row in the very inventory we are about to scan.
phEmit('uninstall_started', { invokedFrom: opts.invokedFrom, dryRun, yes }, { persistId: false });
const spinner = p.spinner();
spinner.start('Checking what exists for this copy…');
const inv = scanInstall({
projectRoot,
home,
platform: process.platform,
runCommand,
});
spinner.stop(`Scanned copy ${inv.slug} at ${tilde(projectRoot, home)}.`);
const svcRows = serviceRows(inv, home);
const dataRows = [...inv.data, ...inv.runtime].map(({ what, where }) => ({ what, where }));
const userRows = inv.user.map(({ what, where }) => ({ what, where }));
const totalFound =
svcRows.length +
dataRows.length +
userRows.length +
inv.onecli.mine.length +
inv.onecli.orphans.length;
if (totalFound === 0) {
p.outro(
`✓ Nothing to uninstall — this copy (${inv.slug}) is already clean.\n` +
k.dim(' (No service, containers, image, data, or OneCLI agents found for this folder.)'),
);
process.exit(0);
}
if (dryRun) {
p.log.message(
k.cyan('PREVIEW ONLY — this shows what would be deleted and changes nothing.'),
);
if (svcRows.length > 0) note(groupBody(GROUPS.service.desc, svcRows), GROUPS.service.title);
if (dataRows.length > 0) note(groupBody(GROUPS.data.desc, dataRows), GROUPS.data.title);
if (userRows.length > 0) note(groupBody(GROUPS.user.desc, userRows), GROUPS.user.title);
if (inv.onecli.mine.length > 0 || inv.onecli.orphans.length > 0) {
const lines = [GROUPS.onecli.desc, ''];
lines.push('Would be deleted (after confirmation):');
for (const a of inv.onecli.mine) lines.push(` ${a.name} (${a.identifier})`);
if (inv.onecli.mine.length === 0) lines.push(' (none)');
lines.push('Left in place — may belong to another copy:');
for (const a of inv.onecli.orphans) lines.push(` ${a.name} (${a.identifier})`);
if (inv.onecli.orphans.length === 0) lines.push(' (none)');
note(lines.join('\n'), GROUPS.onecli.title);
}
const empty = emptyGroupTitles(svcRows.length, dataRows.length, userRows.length, inv);
if (empty.length > 0) p.log.message(k.dim(`Nothing found for: ${empty.join(', ')}`));
for (const n of inv.notes) p.log.message(k.dim(`${n}`));
p.outro('Preview complete. Nothing was changed.');
process.exit(0);
}
if (yes) {
p.log.warn('--yes given: deleting everything found below without asking.');
} else {
p.log.message(
k.dim(
'You will be asked about each group that has something. Default is to keep\n(just press Enter). Type "y" to delete a group.',
),
);
}
// ── confirm phase — nothing is deleted until every decision is made ──
let serviceYes = false;
if (svcRows.length > 0) {
note(groupBody(GROUPS.service.desc, svcRows), GROUPS.service.title);
serviceYes = await confirmGroup(GROUPS.service.prompt, yes);
}
let dataYes = false;
if (dataRows.length > 0) {
note(groupBody(GROUPS.data.desc, dataRows), GROUPS.data.title);
dataYes = await confirmGroup(GROUPS.data.prompt, yes);
}
let userYes = false;
if (userRows.length > 0) {
note(groupBody(GROUPS.user.desc, userRows), GROUPS.user.title);
userYes = await confirmGroup(GROUPS.user.prompt, yes);
}
const keptNotes: string[] = [];
if (!serviceYes && svcRows.length > 0) keptNotes.push(`${GROUPS.service.title}: kept by your choice.`);
if (!dataYes && dataRows.length > 0) keptNotes.push(`${GROUPS.data.title}: kept by your choice.`);
if (!userYes && userRows.length > 0) keptNotes.push(`${GROUPS.user.title}: kept by your choice.`);
const onecliDelete = await decideOnecli(inv, yes, keptNotes);
// Record the decisions before execution can delete logs/ — but only into
// an existing logs/ (userInput would otherwise mkdir it back into
// existence, leaving a fresh logs/setup.log behind after the uninstall).
if (fs.existsSync(path.join(projectRoot, 'logs'))) {
setupLog.userInput(
'uninstall_decisions',
JSON.stringify({
service: serviceYes,
data: dataYes,
user: userYes,
onecliAgentsDeleted: onecliDelete.length,
}),
);
}
const decisions: Decisions = {
service: serviceYes,
data: dataYes,
user: userYes,
onecliDelete,
};
const actions = buildRemovalPlan(inv, decisions);
if (actions.length === 0) {
printLeftAlone([...inv.notes, ...keptNotes]);
p.outro('Nothing selected — nothing was changed.');
process.exit(0);
}
phEmit(
'uninstall_executed',
{
invokedFrom: opts.invokedFrom,
service: serviceYes,
data: dataYes,
user: userYes,
onecliAgentsDeleted: onecliDelete.length,
},
{ persistId: false },
);
// The runtime tail (dist/, node_modules/) runs after every other action
// AND after the summary — nothing but console.log may happen once the
// modules we're running from are gone.
const head = actions.filter((a) => a.kind !== 'delete-runtime-path');
const tail = actions.filter((a) => a.kind === 'delete-runtime-path');
const deps: ExecDeps = {
runCommand,
log: (line) => p.log.message(line),
isRoot: process.getuid?.() === 0,
};
const { notes: execNotes } = executePlan(head, deps);
printLeftAlone([...inv.notes, ...keptNotes, ...execNotes]);
const { notes: tailNotes } = executePlan(tail, {
...deps,
log: (line) => console.log(` ${line}`),
});
for (const n of tailNotes) console.log(`${n}`);
console.log(`\n✓ Done. NanoClaw copy ${inv.slug} has been uninstalled.`);
process.exit(0);
}
/** Unwrap a confirm result; Ctrl-C / Esc cancels the whole uninstall — nothing deleted. */
function answered<T>(value: T | symbol): T {
if (p.isCancel(value)) {
p.cancel('Uninstall cancelled. Nothing was deleted.');
process.exit(0);
}
return value as T;
}
async function confirmGroup(prompt: string, yes: boolean): Promise<boolean> {
if (yes) return true;
return answered(await p.confirm({ message: prompt, initialValue: false }));
}
/**
* Group 4 has two sub-decisions the single-prompt loop can't express:
* MINE is one yes/no; ORPHANS get a separate default-No prompt with an
* explicit cross-copy warning. --yes deletes MINE but never ORPHANS
* (enforced in resolveOnecliDeletions); anything kept is reported with
* the exact manual delete command (by vault uuid).
*/
async function decideOnecli(
inv: Inventory,
yes: boolean,
keptNotes: string[],
): Promise<VaultAgent[]> {
const { mine, orphans } = inv.onecli;
if (mine.length === 0 && orphans.length === 0) return [];
const rows = [
...mine.map((a) => ({ what: 'OneCLI agent', where: `${a.name} (${a.identifier})` })),
...orphans.map((a) => ({ what: 'OneCLI agent (orphan)', where: `${a.name} (${a.identifier})` })),
];
note(groupBody(GROUPS.onecli.desc, rows), GROUPS.onecli.title);
let deleteMine = false;
if (mine.length > 0 && !yes) {
deleteMine = answered(
await p.confirm({
message: `Delete this copy's ${mine.length} OneCLI agent(s)?`,
initialValue: false,
}),
);
if (!deleteMine) keptNotes.push('OneCLI agents (this copy): kept by your choice.');
}
let deleteOrphans = false;
if (orphans.length > 0) {
if (yes) {
p.log.warn(
`${orphans.length} other NanoClaw-style agent(s) in the vault are not linked to this copy;\n--yes does NOT delete them (they may belong to another copy).`,
);
} else {
p.log.warn(
`Found ${orphans.length} other NanoClaw-style agent(s) in the vault not linked to this copy —\nthey may belong to ANOTHER NanoClaw copy on this machine.`,
);
deleteOrphans = answered(
await p.confirm({ message: 'Delete them too?', initialValue: false }),
);
}
if (yes || !deleteOrphans) {
keptNotes.push(
`OneCLI orphan agents (${orphans.length}): left in place — remove manually if they're yours:`,
);
for (const a of orphans) {
keptNotes.push(` onecli agents delete --id ${a.uuid} # ${a.name} (${a.identifier})`);
}
}
}
return resolveOnecliDeletions({
mine,
orphans,
assumeYes: yes,
deleteMine,
deleteOrphans,
});
}
function serviceRows(inv: Inventory, home: string): { what: string; where: string }[] {
const s = inv.service;
const rows: { what: string; where: string }[] = [];
if (s.launchdPlist) rows.push({ what: 'Background service', where: tilde(s.launchdPlist, home) });
if (s.systemdUserUnit) rows.push({ what: 'Background service', where: tilde(s.systemdUserUnit, home) });
if (s.systemdSystemUnit) rows.push({ what: 'Background service (system)', where: s.systemdSystemUnit });
if (s.pidFile) rows.push({ what: 'Running process', where: 'nanoclaw.pid' });
if (s.containerIds.length > 0) {
rows.push({ what: 'Running containers', where: `${s.containerIds.length} container(s)` });
}
if (s.image) rows.push({ what: 'Container image', where: s.image });
if (s.nclSymlink) rows.push({ what: 'Command-line tool (ncl)', where: tilde(s.nclSymlink, home) });
return rows;
}
function groupBody(desc: string, rows: { what: string; where: string }[]): string {
const width = Math.max(...rows.map((r) => r.what.length), 'WHAT'.length);
const lines = [desc, '', `${'WHAT'.padEnd(width + 2)}WHERE`];
for (const r of rows) lines.push(`${r.what.padEnd(width + 2)}${r.where}`);
return lines.join('\n');
}
function emptyGroupTitles(
svcCount: number,
dataCount: number,
userCount: number,
inv: Inventory,
): string[] {
const empty: string[] = [];
if (svcCount === 0) empty.push(GROUPS.service.title);
if (dataCount === 0) empty.push(GROUPS.data.title);
if (userCount === 0) empty.push(GROUPS.user.title);
if (inv.onecli.mine.length === 0 && inv.onecli.orphans.length === 0) {
empty.push(GROUPS.onecli.title);
}
return empty;
}
function printLeftAlone(notes: string[]): void {
const lines = [
'• OneCLI app, vault & credentials: ~/.local/share/onecli, ~/.local/bin/onecli',
'• Host-wide config: ~/.config/nanoclaw/ (mount/sender allowlists)',
'• PATH line in ~/.bashrc and ~/.zshrc',
'• Other NanoClaw copies on this machine',
...notes.map((n) => `${n}`),
];
note(lines.join('\n'), 'Left alone (shared / not ours)');
}
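The flow above calls `tilde(path, home)` whenever it prints a location, but the helper itself is not part of this chunk. A minimal sketch, assuming it only abbreviates a leading home directory to `~` (the real implementation may differ):

```typescript
// Hypothetical stand-in for the `tilde` helper used by runUninstallFlow and
// serviceRows; its real definition is not shown in this diff.
function tilde(p: string, home: string): string {
  if (home === '') return p;
  if (p === home) return '~';
  // Only abbreviate a whole leading path segment, so /home/user2 is not
  // mistaken for a path under /home/user.
  return p.startsWith(home + '/') ? '~' + p.slice(home.length) : p;
}

console.log(tilde('/home/u/.local/bin/ncl', '/home/u')); // prints ~/.local/bin/ncl
console.log(tilde('/opt/nanoclaw', '/home/u')); // prints /opt/nanoclaw
```

Paths outside the home directory pass through unchanged, which matches how `serviceRows` prints the system-level unit path without calling `tilde` at all.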
@@ -0,0 +1,150 @@
import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import fs from 'fs';
import os from 'os';
import path from 'path';
import Database from 'better-sqlite3';
import {
listVaultAgents,
readAgentGroupIds,
resolveOnecliDeletions,
splitVaultAgents,
type VaultAgent,
} from './onecli-agents.js';
const agent = (uuid: string, identifier: string, name = identifier): VaultAgent => ({
uuid,
identifier,
name,
});
describe('listVaultAgents', () => {
it('parses non-default agents from onecli JSON output', () => {
const payload = JSON.stringify({
data: [
{ id: 'u-1', identifier: 'ag-main', name: 'Main', isDefault: false },
{ id: 'u-2', identifier: 'default', name: 'Default', isDefault: false },
{ id: 'u-3', identifier: 'ag-dev', name: 'Dev', isDefault: true },
],
});
const result = listVaultAgents(() => ({ status: 0, stdout: payload }));
expect(result.available).toBe(true);
expect(result.agents).toEqual([agent('u-1', 'ag-main', 'Main')]);
});
it('reports unavailable when the command fails', () => {
expect(listVaultAgents(() => ({ status: 1, stdout: '' })).available).toBe(false);
});
it('reports unavailable when the command cannot be spawned', () => {
const result = listVaultAgents(() => {
throw new Error('ENOENT');
});
expect(result.available).toBe(false);
expect(result.agents).toEqual([]);
});
it('reports unavailable on unparseable output', () => {
expect(listVaultAgents(() => ({ status: 0, stdout: 'not json' })).available).toBe(false);
expect(listVaultAgents(() => ({ status: 0, stdout: '{"nope":1}' })).available).toBe(false);
});
});
describe('readAgentGroupIds', () => {
let tempDir: string;
beforeEach(() => {
tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-uninstall-test-'));
});
afterEach(() => {
fs.rmSync(tempDir, { recursive: true, force: true });
});
it('reads ids from a real DB', () => {
const dbPath = path.join(tempDir, 'v2.db');
const db = new Database(dbPath);
db.exec('CREATE TABLE agent_groups (id TEXT PRIMARY KEY)');
db.prepare('INSERT INTO agent_groups (id) VALUES (?)').run('ag-one');
db.prepare('INSERT INTO agent_groups (id) VALUES (?)').run('ag-two');
db.close();
const result = readAgentGroupIds(dbPath);
expect(result.known).toBe(true);
expect(result.ids).toEqual(new Set(['ag-one', 'ag-two']));
});
it('returns known:false for a missing file', () => {
const result = readAgentGroupIds(path.join(tempDir, 'missing.db'));
expect(result.known).toBe(false);
expect(result.ids.size).toBe(0);
});
it('returns known:false for a corrupt file', () => {
const dbPath = path.join(tempDir, 'corrupt.db');
fs.writeFileSync(dbPath, 'this is not a sqlite database at all');
const result = readAgentGroupIds(dbPath);
expect(result.known).toBe(false);
expect(result.ids.size).toBe(0);
});
});
describe('splitVaultAgents', () => {
it('splits mine vs ag-* orphans and ignores foreign identifiers', () => {
const agents = [
agent('u-1', 'ag-mine'),
agent('u-2', 'ag-other'),
agent('u-3', 'some-tool'),
];
const { mine, orphans } = splitVaultAgents(agents, new Set(['ag-mine']), true);
expect(mine).toEqual([agent('u-1', 'ag-mine')]);
expect(orphans).toEqual([agent('u-2', 'ag-other')]);
});
it('forces all ag-* agents into orphans when ids are unknown', () => {
const agents = [agent('u-1', 'ag-mine'), agent('u-2', 'ag-other')];
// ids set even contains ag-mine — known:false must override.
const { mine, orphans } = splitVaultAgents(agents, new Set(['ag-mine']), false);
expect(mine).toEqual([]);
expect(orphans).toEqual(agents);
});
});
describe('resolveOnecliDeletions', () => {
const mine = [agent('u-1', 'ag-mine')];
const orphans = [agent('u-2', 'ag-other')];
it('never deletes orphans under --yes, even if asked to', () => {
const deletions = resolveOnecliDeletions({
mine,
orphans,
assumeYes: true,
deleteMine: false,
deleteOrphans: true,
});
expect(deletions).toEqual(mine);
});
it('deletes orphans only on explicit interactive consent', () => {
expect(
resolveOnecliDeletions({
mine,
orphans,
assumeYes: false,
deleteMine: true,
deleteOrphans: true,
}),
).toEqual([...mine, ...orphans]);
expect(
resolveOnecliDeletions({
mine,
orphans,
assumeYes: false,
deleteMine: false,
deleteOrphans: false,
}),
).toEqual([]);
});
});
@@ -0,0 +1,141 @@
/**
* OneCLI vault-agent inventory for the uninstaller.
*
* Vault agents split into two sets: MINE (identifier matches an agent-group
* id in this copy's data/v2.db) and ORPHANS (NanoClaw-style `ag-*`
* identifiers not in our DB, possibly another copy's). Deletion is always
* by the vault's internal uuid: the agent-group id is NOT a valid
* `onecli agents delete --id` value (see src/container-runner.ts).
*/
import fs from 'fs';
import Database from 'better-sqlite3';
export interface VaultAgent {
/** Internal vault uuid — the only valid `onecli agents delete --id` value. */
uuid: string;
/** What the agent was registered under, e.g. a NanoClaw agent-group id (`ag-*`). */
identifier: string;
name: string;
}
export type RunCommand = (
cmd: string,
args: string[],
) => { status: number | null; stdout: string };
/**
* List non-default vault agents via `onecli agents list`. `available: false`
* means the vault couldn't be read at all (binary missing, command failed,
* or unparseable output), as distinct from an empty vault.
*/
export function listVaultAgents(run: RunCommand): {
available: boolean;
agents: VaultAgent[];
} {
let result: { status: number | null; stdout: string };
try {
result = run('onecli', ['agents', 'list']);
} catch {
return { available: false, agents: [] };
}
if (result.status !== 0) return { available: false, agents: [] };
let parsed: unknown;
try {
parsed = JSON.parse(result.stdout);
} catch {
return { available: false, agents: [] };
}
const data =
parsed !== null && typeof parsed === 'object' && 'data' in parsed
? (parsed as { data: unknown }).data
: null;
if (!Array.isArray(data)) return { available: false, agents: [] };
const agents: VaultAgent[] = [];
for (const entry of data) {
if (entry === null || typeof entry !== 'object') continue;
const a = entry as Record<string, unknown>;
if (a.isDefault === true) continue;
const identifier = typeof a.identifier === 'string' ? a.identifier : '';
const uuid = typeof a.id === 'string' ? a.id : '';
if (!identifier || identifier === 'default' || !uuid) continue;
agents.push({
uuid,
identifier,
name: typeof a.name === 'string' ? a.name : '',
});
}
return { available: true, agents };
}
/**
* Read this copy's agent-group ids from data/v2.db (readonly).
*
* `known: false` distinguishes "we couldn't read the DB at all" from "this
* copy has zero agent groups". Without it, every ag-* vault agent would be
* mislabeled an orphan and --yes would silently leave this copy's agents
* behind.
*/
export function readAgentGroupIds(dbPath: string): {
ids: Set<string>;
known: boolean;
} {
if (!fs.existsSync(dbPath)) return { ids: new Set(), known: false };
let db: Database.Database | null = null;
try {
db = new Database(dbPath, { readonly: true });
const rows = db.prepare('SELECT id FROM agent_groups').all() as {
id: string;
}[];
return { ids: new Set(rows.map((r) => r.id)), known: true };
} catch {
return { ids: new Set(), known: false };
} finally {
db?.close();
}
}
/**
* Split vault agents into MINE (identifier in ids) and ORPHANS (ag-* not in
* ids). Non-NanoClaw identifiers are ignored entirely. With `known: false`
* nothing can be MINE, so every ag-* agent lands in ORPHANS; the caller is
* responsible for warning that the labels are unreliable.
*/
export function splitVaultAgents(
agents: VaultAgent[],
ids: Set<string>,
known: boolean,
): { mine: VaultAgent[]; orphans: VaultAgent[] } {
const mine: VaultAgent[] = [];
const orphans: VaultAgent[] = [];
for (const agent of agents) {
if (known && ids.has(agent.identifier)) {
mine.push(agent);
} else if (agent.identifier.startsWith('ag-')) {
orphans.push(agent);
}
}
return { mine, orphans };
}
/**
* Resolve the vault-agent delete set from the user's answers. Under --yes
* (`assumeYes`) MINE is always deleted but ORPHANS never are: deleting
* what may be another copy's agents requires explicit human intent.
*/
export function resolveOnecliDeletions(input: {
mine: VaultAgent[];
orphans: VaultAgent[];
assumeYes: boolean;
deleteMine: boolean;
deleteOrphans: boolean;
}): VaultAgent[] {
const out: VaultAgent[] = [];
if (input.assumeYes || input.deleteMine) out.push(...input.mine);
if (!input.assumeYes && input.deleteOrphans) out.push(...input.orphans);
return out;
}
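The two rules in this file compose in a way that is easy to miss: when the DB cannot be read, every `ag-*` agent becomes an orphan, and because `--yes` never deletes orphans, `--yes` then deletes no vault agents at all. The sketch below re-declares simplified copies of the two functions so it runs on its own; the sample agents and the `deletionsFor` wrapper are made up for illustration.

```typescript
interface Agent {
  uuid: string;
  identifier: string;
}

// Simplified copy of splitVaultAgents: unknown ids mean nothing can be MINE.
function split(agents: Agent[], ids: Set<string>, known: boolean) {
  const mine: Agent[] = [];
  const orphans: Agent[] = [];
  for (const a of agents) {
    if (known && ids.has(a.identifier)) mine.push(a);
    else if (a.identifier.startsWith('ag-')) orphans.push(a);
  }
  return { mine, orphans };
}

// Simplified copy of resolveOnecliDeletions.
function resolve(
  mine: Agent[],
  orphans: Agent[],
  assumeYes: boolean,
  deleteMine: boolean,
  deleteOrphans: boolean,
): Agent[] {
  const out: Agent[] = [];
  if (assumeYes || deleteMine) out.push(...mine);
  if (!assumeYes && deleteOrphans) out.push(...orphans);
  return out;
}

// Hypothetical vault contents: one agent of this copy, one of another copy,
// and one unrelated tool that is never touched.
const vault: Agent[] = [
  { uuid: 'u-1', identifier: 'ag-mine' },
  { uuid: 'u-2', identifier: 'ag-other' },
  { uuid: 'u-3', identifier: 'some-tool' },
];
const dbIds = new Set(['ag-mine']);

// Hypothetical wrapper: which uuids get deleted for a given situation?
function deletionsFor(
  known: boolean,
  assumeYes: boolean,
  deleteMine: boolean,
  deleteOrphans: boolean,
): string[] {
  const { mine, orphans } = split(vault, dbIds, known);
  return resolve(mine, orphans, assumeYes, deleteMine, deleteOrphans).map((a) => a.uuid);
}

console.log(deletionsFor(true, true, false, false)); // DB readable, --yes: only u-1
console.log(deletionsFor(false, true, false, false)); // DB unreadable, --yes: nothing
console.log(deletionsFor(false, false, false, true)); // DB unreadable, interactive yes: u-1 and u-2
```

The third call shows why the caller has to warn about unreliable labels: with an unreadable DB, interactive consent to delete "orphans" also removes this copy's own agent.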
@@ -0,0 +1,156 @@
import { describe, it, expect } from 'vitest';
import type { VaultAgent } from './onecli-agents.js';
import { buildRemovalPlan, type Decisions, type RemovalAction } from './plan.js';
import type { Inventory, PathItem } from './scan.js';
const item = (p: string, what: string): PathItem => ({ what, where: p, path: p });
const agent = (uuid: string, identifier: string): VaultAgent => ({
uuid,
identifier,
name: identifier,
});
function inventory(overrides: Partial<Inventory> = {}): Inventory {
return {
slug: 'abcd1234',
projectRoot: '/proj',
containerRuntime: 'docker',
service: {
launchdPlist: '/home/u/Library/LaunchAgents/com.nanoclaw-v2-abcd1234.plist',
containerIds: ['c1', 'c2'],
image: 'nanoclaw-agent-v2-abcd1234:latest',
nclSymlink: '/home/u/.local/bin/ncl',
},
data: [
item('/proj/data', 'Database & conversations'),
item('/proj/logs', 'Logs'),
item('/proj/.env', 'Secrets / API keys (.env)'),
item('/proj/start-nanoclaw.sh', 'Start script'),
],
runtime: [
// node_modules deliberately FIRST — the planner must still order it last.
item('/proj/node_modules', 'Installed dependencies'),
item('/proj/dist', 'Build output'),
],
user: [item('/proj/groups', 'Agent memory & files'), item('/proj/store', 'Migrated data store')],
onecli: { mine: [], orphans: [], idsKnown: true },
notes: [],
...overrides,
};
}
const allYes = (onecliDelete: VaultAgent[] = []): Decisions => ({
service: true,
data: true,
user: true,
onecliDelete,
});
const kinds = (actions: RemovalAction[]) => actions.map((a) => a.kind);
describe('buildRemovalPlan ordering invariants', () => {
it('removes .env only via the atomic backup action, never a bare delete', () => {
const actions = buildRemovalPlan(inventory(), allYes());
expect(actions.filter((a) => a.kind === 'backup-env')).toHaveLength(1);
expect(
actions.some((a) => a.kind === 'delete-path' && a.item.path === '/proj/.env'),
).toBe(false);
});
it('puts the runtime tail strictly last, with node_modules final', () => {
const actions = buildRemovalPlan(inventory(), allYes([agent('u-1', 'ag-mine')]));
const tail = actions.slice(-2);
expect(tail.map((a) => a.kind)).toEqual(['delete-runtime-path', 'delete-runtime-path']);
expect(tail.map((a) => (a.kind === 'delete-runtime-path' ? a.item.path : ''))).toEqual([
'/proj/dist',
'/proj/node_modules',
]);
// No non-tail action after the first runtime delete.
const firstTailIdx = actions.findIndex((a) => a.kind === 'delete-runtime-path');
expect(
actions.slice(firstTailIdx).every((a) => a.kind === 'delete-runtime-path'),
).toBe(true);
});
it('deletes OneCLI agents before the data group (which removes data/v2.db)', () => {
const actions = buildRemovalPlan(inventory(), allYes([agent('u-1', 'ag-mine')]));
const onecliIdx = actions.findIndex((a) => a.kind === 'delete-onecli-agent');
const dataIdx = actions.findIndex(
(a) => a.kind === 'delete-path' && a.item.path === '/proj/data',
);
expect(onecliIdx).toBeGreaterThanOrEqual(0);
expect(dataIdx).toBeGreaterThan(onecliIdx);
});
it('runs service teardown before container removal so the host cannot respawn them', () => {
const actions = buildRemovalPlan(inventory(), allYes());
const unloadIdx = actions.findIndex((a) => a.kind === 'unload-service');
const pkillIdx = actions.findIndex((a) => a.kind === 'pkill-host');
const rmContainersIdx = actions.findIndex((a) => a.kind === 'rm-containers');
expect(unloadIdx).toBeLessThan(rmContainersIdx);
expect(pkillIdx).toBeLessThan(rmContainersIdx);
});
});
describe('buildRemovalPlan declined groups', () => {
it('declined data yields no data deletes and no runtime tail', () => {
const actions = buildRemovalPlan(inventory(), {
service: true,
data: false,
user: true,
onecliDelete: [],
});
expect(kinds(actions)).not.toContain('backup-env');
expect(kinds(actions)).not.toContain('delete-runtime-path');
expect(
actions.some((a) => a.kind === 'delete-path' && a.item.path.startsWith('/proj/data')),
).toBe(false);
});
it('all declined yields an empty plan', () => {
const actions = buildRemovalPlan(inventory(), {
service: false,
data: false,
user: false,
onecliDelete: [],
});
expect(actions).toEqual([]);
});
it('declined service yields no service actions', () => {
const actions = buildRemovalPlan(inventory(), {
service: false,
data: true,
user: false,
onecliDelete: [],
});
for (const kind of ['unload-service', 'pkill-host', 'rm-containers', 'rmi', 'rm-ncl-symlink']) {
expect(kinds(actions)).not.toContain(kind);
}
});
});
describe('buildRemovalPlan conditional actions', () => {
it('skips backup-env when there is no .env', () => {
const inv = inventory({ data: [item('/proj/data', 'Database & conversations')] });
expect(kinds(buildRemovalPlan(inv, allYes()))).not.toContain('backup-env');
});
it('always re-sweeps containers and processes with a confirmed service group', () => {
const inv = inventory({ service: { containerIds: [] } });
const actions = buildRemovalPlan(inv, allYes());
const actionKinds = kinds(actions);
expect(actionKinds).not.toContain('rmi');
expect(actionKinds).not.toContain('unload-service');
// pkill and rm-containers run unconditionally — a manually started host
// has no plist/unit, and the live host may have spawned containers the
// scan never saw. Removal re-lists by install label, not scan-time ids.
expect(actionKinds).toContain('pkill-host');
const rm = actions.find((a) => a.kind === 'rm-containers');
expect(rm && rm.kind === 'rm-containers' ? rm.labelFilter : '').toBe(
'nanoclaw-install=abcd1234',
);
});
});
@@ -0,0 +1,130 @@
/**
* Pure removal planner: inventory + per-group decisions -> ordered actions.
*
* The order is load-bearing:
* 1. Service / processes / containers / image / symlink: stop the host
* first so it can't respawn containers mid-removal.
* 2. OneCLI agent deletions before the data group, which removes the
* data/v2.db the mine/orphan split was computed from.
* 3. Data group, with the .env backup strictly before its deletion.
* 4. User group (groups/, store/).
* 5. Runtime tail: dist/ then node_modules/, ALWAYS last. The uninstaller
* runs on tsx out of node_modules; nothing may load after this.
*/
import path from 'path';
import type { VaultAgent } from './onecli-agents.js';
import type { Inventory, PathItem } from './scan.js';
export interface Decisions {
service: boolean;
data: boolean;
user: boolean;
onecliDelete: VaultAgent[];
}
export type RemovalAction =
| {
kind: 'unload-service';
flavor: 'launchd' | 'systemd-user' | 'systemd-system';
unitPath: string;
/** systemd unit name without .service (unused for launchd). */
unitName: string;
}
| { kind: 'kill-pid'; pidFile: string }
| { kind: 'pkill-host'; pattern: string }
/**
* Containers are re-listed by label at removal time, not removed from
* scan-time ids: the host stays alive through the whole confirm phase
* and can spawn new containers after the scan.
*/
| { kind: 'rm-containers'; runtime: string; labelFilter: string }
| { kind: 'rmi'; runtime: string; image: string }
| { kind: 'rm-ncl-symlink'; linkPath: string }
| { kind: 'delete-onecli-agent'; agent: VaultAgent }
/**
* Backs up AND removes .env as one atomic action: a failed backup must
* never be followed by the deletion (the backup is the user's only copy
* of their API keys). .env is deliberately excluded from `delete-path`.
*/
| { kind: 'backup-env'; envPath: string }
| { kind: 'delete-path'; item: PathItem }
| { kind: 'delete-runtime-path'; item: PathItem };
export function buildRemovalPlan(inv: Inventory, d: Decisions): RemovalAction[] {
const actions: RemovalAction[] = [];
if (d.service) {
const s = inv.service;
if (s.launchdPlist) {
actions.push({
kind: 'unload-service',
flavor: 'launchd',
unitPath: s.launchdPlist,
unitName: path.basename(s.launchdPlist, '.plist'),
});
}
if (s.systemdUserUnit) {
actions.push({
kind: 'unload-service',
flavor: 'systemd-user',
unitPath: s.systemdUserUnit,
unitName: path.basename(s.systemdUserUnit, '.service'),
});
}
if (s.systemdSystemUnit) {
actions.push({
kind: 'unload-service',
flavor: 'systemd-system',
unitPath: s.systemdSystemUnit,
unitName: path.basename(s.systemdSystemUnit, '.service'),
});
}
if (s.pidFile) actions.push({ kind: 'kill-pid', pidFile: s.pidFile });
actions.push({
kind: 'pkill-host',
pattern: `${inv.projectRoot}/dist/index.js`,
});
// Unconditional (like pkill): the scan may have found zero containers
// while the still-running host spawned one since.
actions.push({
kind: 'rm-containers',
runtime: inv.containerRuntime,
labelFilter: `nanoclaw-install=${inv.slug}`,
});
if (s.image) {
actions.push({ kind: 'rmi', runtime: inv.containerRuntime, image: s.image });
}
if (s.nclSymlink) {
actions.push({ kind: 'rm-ncl-symlink', linkPath: s.nclSymlink });
}
}
for (const agent of d.onecliDelete) {
actions.push({ kind: 'delete-onecli-agent', agent });
}
if (d.data) {
const env = inv.data.find((i) => path.basename(i.path) === '.env');
if (env) actions.push({ kind: 'backup-env', envPath: env.path });
for (const item of inv.data) {
if (item === env) continue; // removed by backup-env, never a bare delete
actions.push({ kind: 'delete-path', item });
}
}
if (d.user) {
for (const item of inv.user) actions.push({ kind: 'delete-path', item });
}
if (d.data) {
const tail = [...inv.runtime].sort(
(a, b) =>
Number(path.basename(a.path) === 'node_modules') -
Number(path.basename(b.path) === 'node_modules'),
);
for (const item of tail) actions.push({ kind: 'delete-runtime-path', item });
}
return actions;
}
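The runtime-tail ordering at the end of `buildRemovalPlan` relies on a boolean-to-number comparator plus a stable sort. A small standalone sketch of that comparator, using a hand-rolled `basename` and a hypothetical `orderRuntimeTail` name so it runs without imports:

```typescript
// Hypothetical helper names; only the comparator mirrors the planner.
const basename = (p: string): string => p.slice(p.lastIndexOf('/') + 1);

function orderRuntimeTail(paths: string[]): string[] {
  // Number(false) is 0 and Number(true) is 1, so node_modules compares
  // greater than every other path and all other pairs compare equal.
  // Array.prototype.sort is stable (ES2019+), so equal items keep their
  // input order. The copy keeps the caller's array untouched.
  return [...paths].sort(
    (a, b) =>
      Number(basename(a) === 'node_modules') - Number(basename(b) === 'node_modules'),
  );
}

// node_modules listed first on purpose, as in the planner's test fixture.
console.log(orderRuntimeTail(['/proj/node_modules', '/proj/dist']));
// dist comes first, node_modules last
```

Sorting on a single "is it node_modules" key, rather than on a full priority list, keeps any other runtime paths in scan order while still guaranteeing that the directory the uninstaller runs from is removed last.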
@@ -0,0 +1,212 @@
import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import fs from 'fs';
import os from 'os';
import path from 'path';
import type { RunCommand } from './onecli-agents.js';
import type { RemovalAction } from './plan.js';
import { backupEnv, executePlan, type ExecDeps } from './remove.js';
let tempDir: string;
beforeEach(() => {
tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-remove-test-'));
});
afterEach(() => {
fs.rmSync(tempDir, { recursive: true, force: true });
});
function deps(overrides: Partial<ExecDeps> = {}): ExecDeps {
return {
runCommand: () => ({ status: 0, stdout: '' }),
log: () => {},
isRoot: false,
...overrides,
};
}
describe('backupEnv', () => {
it('backs up to .env.bak', () => {
const envPath = path.join(tempDir, '.env');
fs.writeFileSync(envPath, 'KEY=secret');
const backup = backupEnv(envPath);
expect(backup).toBe(path.join(tempDir, '.env.bak'));
expect(fs.readFileSync(backup, 'utf-8')).toBe('KEY=secret');
});
it('falls back to a timestamped name when .env.bak exists', () => {
const envPath = path.join(tempDir, '.env');
fs.writeFileSync(envPath, 'KEY=new');
fs.writeFileSync(path.join(tempDir, '.env.bak'), 'KEY=old');
const backup = backupEnv(envPath);
expect(path.basename(backup)).toMatch(/^\.env\.bak\.\d{8}-\d{6}$/);
expect(fs.readFileSync(backup, 'utf-8')).toBe('KEY=new');
// The earlier backup is never clobbered.
expect(fs.readFileSync(path.join(tempDir, '.env.bak'), 'utf-8')).toBe('KEY=old');
});
});
describe('executePlan', () => {
it('deletes paths recursively', () => {
const dir = path.join(tempDir, 'data');
fs.mkdirSync(path.join(dir, 'nested'), { recursive: true });
fs.writeFileSync(path.join(dir, 'nested', 'f.txt'), 'x');
const { notes } = executePlan(
[{ kind: 'delete-path', item: { what: 'Data', where: dir, path: dir } }],
deps(),
);
expect(fs.existsSync(dir)).toBe(false);
expect(notes).toEqual([]);
});
it('continues past a failing action and records a note', () => {
const dir = path.join(tempDir, 'logs');
fs.mkdirSync(dir);
const actions: RemovalAction[] = [
{
kind: 'unload-service',
flavor: 'launchd',
unitPath: path.join(tempDir, 'svc.plist'),
unitName: 'com.nanoclaw-v2-test',
},
{ kind: 'delete-path', item: { what: 'Logs', where: dir, path: dir } },
];
const failing: RunCommand = () => {
throw new Error('launchctl exploded');
};
const { notes } = executePlan(actions, deps({ runCommand: failing }));
expect(notes).toHaveLength(1);
expect(notes[0]).toContain('unload-service');
expect(notes[0]).toContain('launchctl exploded');
// Later actions still ran.
expect(fs.existsSync(dir)).toBe(false);
});
it('leaves a system unit in place without root and notes the sudo command', () => {
const unitPath = path.join(tempDir, 'nanoclaw-v2-test.service');
fs.writeFileSync(unitPath, '[Unit]');
const calls: string[] = [];
const recorder: RunCommand = (cmd) => {
calls.push(cmd);
return { status: 0, stdout: '' };
};
const { notes } = executePlan(
[
{
kind: 'unload-service',
flavor: 'systemd-system',
unitPath,
unitName: 'nanoclaw-v2-test',
},
],
deps({ runCommand: recorder, isRoot: false }),
);
expect(fs.existsSync(unitPath)).toBe(true);
expect(calls).toEqual([]);
expect(notes.some((n) => n.includes('re-run with sudo'))).toBe(true);
});
it('notes a failed image removal with the retry command', () => {
const { notes } = executePlan(
[{ kind: 'rmi', runtime: 'docker', image: 'img:latest' }],
deps({ runCommand: () => ({ status: 1, stdout: '' }) }),
);
expect(notes.some((n) => n.includes('docker rmi img:latest'))).toBe(true);
});
it('removes .env only after a successful backup', () => {
const envPath = path.join(tempDir, '.env');
fs.writeFileSync(envPath, 'KEY=secret');
const { notes } = executePlan([{ kind: 'backup-env', envPath }], deps());
expect(fs.existsSync(envPath)).toBe(false);
expect(fs.readFileSync(path.join(tempDir, '.env.bak'), 'utf-8')).toBe('KEY=secret');
expect(notes).toEqual([]);
});
it('keeps .env when the backup fails', () => {
const envPath = path.join(tempDir, '.env');
fs.writeFileSync(envPath, 'KEY=secret');
fs.chmodSync(tempDir, 0o555); // backup destination unwritable
try {
const { notes } = executePlan([{ kind: 'backup-env', envPath }], deps());
expect(fs.existsSync(envPath)).toBe(true);
expect(notes.some((n) => n.includes('backup-env'))).toBe(true);
} finally {
fs.chmodSync(tempDir, 0o755);
}
});
it('re-lists containers by label at removal time instead of using scan-time ids', () => {
const calls: string[][] = [];
const docker: RunCommand = (cmd, args) => {
calls.push([cmd, ...args]);
if (args[0] === 'ps') return { status: 0, stdout: 'fresh1\nfresh2\n' };
return { status: 0, stdout: '' };
};
executePlan(
[{ kind: 'rm-containers', runtime: 'docker', labelFilter: 'nanoclaw-install=abcd1234' }],
deps({ runCommand: docker }),
);
expect(calls).toEqual([
['docker', 'ps', '-aq', '--filter', 'label=nanoclaw-install=abcd1234'],
['docker', 'rm', '-f', 'fresh1', 'fresh2'],
]);
});
it('notes a manual command when the container runtime is unavailable', () => {
const { notes } = executePlan(
[{ kind: 'rm-containers', runtime: 'docker', labelFilter: 'nanoclaw-install=x' }],
deps({ runCommand: () => ({ status: null, stdout: '' }) }),
);
expect(notes.some((n) => n.includes('xargs -r docker rm -f'))).toBe(true);
});
it('notes a manual delete when onecli itself cannot be run', () => {
const { notes } = executePlan(
[
{
kind: 'delete-onecli-agent',
agent: { uuid: 'u-123', identifier: 'ag-mine', name: 'Mine' },
},
],
deps({ runCommand: () => ({ status: null, stdout: '' }) }),
);
expect(notes.some((n) => n.includes('onecli agents delete --id u-123'))).toBe(true);
});
it('deletes OneCLI agents by vault uuid, never by identifier', () => {
const calls: string[][] = [];
const recorder: RunCommand = (cmd, args) => {
calls.push([cmd, ...args]);
return { status: 0, stdout: '' };
};
executePlan(
[
{
kind: 'delete-onecli-agent',
agent: { uuid: 'u-123', identifier: 'ag-mine', name: 'Mine' },
},
],
deps({ runCommand: recorder }),
);
expect(calls).toEqual([['onecli', 'agents', 'delete', '--id', 'u-123']]);
});
});
+193
@@ -0,0 +1,193 @@
/**
* Removal-plan executor. Each action runs in its own try/catch: a failure
* becomes a summary note and execution continues (re-running the
* uninstaller is idempotent: the next scan only finds what's left).
*
* Must stay safe to run after logs/ and node_modules/ are gone: only static
* imports, no dynamic import(), no setup-log writes. Output goes through
* the injected `log` callback.
*/
import fs from 'fs';
import path from 'path';
import type { RunCommand } from './onecli-agents.js';
import type { RemovalAction } from './plan.js';
export interface ExecDeps {
runCommand: RunCommand;
log: (line: string) => void;
/** True when running as root — required to remove a system-level unit. */
isRoot: boolean;
}
export function executePlan(
actions: RemovalAction[],
deps: ExecDeps,
): { notes: string[] } {
const notes: string[] = [];
for (const action of actions) {
try {
runAction(action, deps, notes);
} catch (err) {
const msg = err instanceof Error ? err.message : String(err);
notes.push(
`${action.kind}: failed (${msg}) — re-run the uninstaller to retry.`,
);
}
}
return { notes };
}
/**
* Copy .env aside before deletion. Never clobbers an existing .env.bak:
* falls back to a timestamped name on collision. Returns the backup path.
*/
export function backupEnv(envPath: string): string {
const dir = path.dirname(envPath);
let backup = path.join(dir, '.env.bak');
if (fs.existsSync(backup)) {
const stamp = new Date()
.toISOString()
.replace(/[-:]/g, '')
.replace('T', '-')
.slice(0, 15);
backup = path.join(dir, `.env.bak.${stamp}`);
}
fs.copyFileSync(envPath, backup);
return backup;
}
function runAction(action: RemovalAction, deps: ExecDeps, notes: string[]): void {
const { runCommand, log } = deps;
switch (action.kind) {
case 'unload-service':
switch (action.flavor) {
case 'launchd':
runCommand('launchctl', ['unload', action.unitPath]);
fs.rmSync(action.unitPath, { force: true });
log('✓ background service removed');
break;
case 'systemd-user':
runCommand('systemctl', [
'--user',
'disable',
'--now',
`${action.unitName}.service`,
]);
fs.rmSync(action.unitPath, { force: true });
runCommand('systemctl', ['--user', 'daemon-reload']);
log('✓ background service removed');
break;
case 'systemd-system':
if (!deps.isRoot) {
log('! system service needs root — left in place');
notes.push(
`System service ${action.unitPath} — re-run with sudo to remove.`,
);
break;
}
runCommand('systemctl', ['disable', '--now', `${action.unitName}.service`]);
fs.rmSync(action.unitPath, { force: true });
runCommand('systemctl', ['daemon-reload']);
log('✓ system service removed');
break;
}
break;
case 'kill-pid': {
let pid = NaN;
try {
pid = Number(fs.readFileSync(action.pidFile, 'utf-8').trim());
} catch {
// pidfile already gone
}
if (Number.isInteger(pid) && pid > 0) {
try {
process.kill(pid);
log('✓ stopped host process');
} catch {
// not running
}
}
break;
}
case 'pkill-host':
// Exit 1 = no matching process — not a failure.
runCommand('pkill', ['-f', action.pattern]);
break;
case 'rm-containers': {
// Re-list at removal time: the host was alive during the confirm
// phase and may have spawned containers the scan never saw.
const ps = runCommand(action.runtime, [
'ps',
'-aq',
'--filter',
`label=${action.labelFilter}`,
]);
if (ps.status !== 0) {
notes.push(
`Containers: '${action.runtime}' unavailable — remove later with: ` +
`${action.runtime} ps -aq --filter label=${action.labelFilter} | xargs -r ${action.runtime} rm -f`,
);
break;
}
const ids = ps.stdout
.split('\n')
.map((s) => s.trim())
.filter(Boolean);
if (ids.length === 0) break;
runCommand(action.runtime, ['rm', '-f', ...ids]);
log(`✓ removed ${ids.length} container(s)`);
break;
}
case 'rmi': {
const res = runCommand(action.runtime, ['rmi', action.image]);
if (res.status === 0) {
log('✓ removed container image');
} else {
log('! could not remove image (in use?)');
notes.push(
`Image ${action.image}: not removed — retry with: ${action.runtime} rmi ${action.image}`,
);
}
break;
}
case 'rm-ncl-symlink':
fs.rmSync(action.linkPath, { force: true });
log('✓ removed ncl command');
break;
case 'delete-onecli-agent': {
const res = runCommand('onecli', [
'agents',
'delete',
'--id',
action.agent.uuid,
]);
if (res.status === 0) {
log(`✓ deleted OneCLI agent ${action.agent.name} (${action.agent.identifier})`);
} else if (res.status === null) {
// spawn failure (binary gone since the scan), not a missing agent
log(`! couldn't run onecli for ${action.agent.identifier}`);
notes.push(
`OneCLI agent ${action.agent.name} (${action.agent.identifier}): couldn't run onecli — ` +
`delete manually with: onecli agents delete --id ${action.agent.uuid}`,
);
} else {
log(`! OneCLI agent ${action.agent.identifier} already gone`);
}
break;
}
case 'backup-env': {
// Backup and removal are one action so a failed backup (which throws
// into executePlan's catch) can never be followed by the deletion.
const backup = backupEnv(action.envPath);
fs.rmSync(action.envPath, { force: true });
log(`✓ removed .env (backup at ${backup})`);
break;
}
case 'delete-path':
case 'delete-runtime-path':
fs.rmSync(action.item.path, { recursive: true, force: true });
log(`✓ removed ${action.item.what}`);
break;
}
}
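The timestamped fallback name used by `backupEnv` above can be traced step by step. The sketch below is illustrative only: `backupStamp` is a hypothetical helper (the diff performs this formatting inline), and the date is an arbitrary fixed value chosen so the intermediate strings are easy to follow.

```typescript
// Hypothetical helper, not part of the diff: isolates the timestamp
// formatting that backupEnv performs inline.
function backupStamp(date: Date): string {
  return date
    .toISOString() // "2026-06-15T01:50:32.000Z"
    .replace(/[-:]/g, '') // "20260615T015032.000Z"
    .replace('T', '-') // "20260615-015032.000Z"
    .slice(0, 15); // "20260615-015032" (8 date digits, '-', 6 time digits)
}

// Fixed UTC instant so the result does not depend on the local clock.
const stamp = backupStamp(new Date(Date.UTC(2026, 5, 15, 1, 50, 32)));
const fallbackName = `.env.bak.${stamp}`; // ".env.bak.20260615-015032"
```

The stamp has one-second resolution and is derived from UTC. Because `fs.copyFileSync` overwrites its destination by default, two fallback backups taken within the same second would share a name; passing `fs.constants.COPYFILE_EXCL` as the third argument would make the copy fail instead, if that case ever matters.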
+196
@@ -0,0 +1,196 @@
import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import fs from 'fs';
import os from 'os';
import path from 'path';
import Database from 'better-sqlite3';
import { getLaunchdLabel, getSystemdUnit } from '../../src/install-slug.js';
import type { RunCommand } from './onecli-agents.js';
import { detectExistingInstall, scanInstall, type ScanDeps } from './scan.js';
let root: string;
let home: string;
beforeEach(() => {
root = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-scan-root-'));
home = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-scan-home-'));
});
afterEach(() => {
fs.rmSync(root, { recursive: true, force: true });
fs.rmSync(home, { recursive: true, force: true });
});
/** Fake runCommand: unhandled commands fail (binary missing / daemon down). */
function fakeRun(
handlers: Record<string, (args: string[]) => { status: number | null; stdout: string }>,
): RunCommand {
return (cmd, args) => (handlers[cmd] ?? (() => ({ status: 1, stdout: '' })))(args);
}
function deps(overrides: Partial<ScanDeps> = {}): ScanDeps {
return {
projectRoot: root,
home,
platform: 'darwin',
runCommand: fakeRun({}),
...overrides,
};
}
const dockerUp = (containerIds: string[], hasImage: boolean) =>
fakeRun({
docker: (args) => {
if (args[0] === 'ps') return { status: 0, stdout: containerIds.join('\n') + '\n' };
if (args[0] === 'image') return { status: hasImage ? 0 : 1, stdout: '' };
return { status: 1, stdout: '' };
},
});
describe('scanInstall path groups', () => {
it('puts dist and node_modules in runtime, not data', () => {
for (const dir of ['data', 'logs', 'dist', 'node_modules', 'groups', 'store']) {
fs.mkdirSync(path.join(root, dir));
}
fs.writeFileSync(path.join(root, '.env'), 'KEY=v');
fs.writeFileSync(path.join(root, 'start-nanoclaw.sh'), '#!/bin/bash');
const inv = scanInstall(deps());
expect(inv.data.map((i) => path.basename(i.path))).toEqual([
'data',
'logs',
'.env',
'start-nanoclaw.sh',
]);
expect(inv.runtime.map((i) => path.basename(i.path))).toEqual([
'dist',
'node_modules',
]);
expect(inv.user.map((i) => path.basename(i.path))).toEqual(['groups', 'store']);
});
it('finds nothing in an empty checkout', () => {
const inv = scanInstall(deps());
expect(inv.data).toEqual([]);
expect(inv.runtime).toEqual([]);
expect(inv.user).toEqual([]);
expect(inv.service.containerIds).toEqual([]);
expect(inv.service.image).toBeUndefined();
});
});
describe('scanInstall service artifacts', () => {
it('detects the launchd plist on macOS', () => {
const plist = path.join(
home,
'Library',
'LaunchAgents',
`${getLaunchdLabel(root)}.plist`,
);
fs.mkdirSync(path.dirname(plist), { recursive: true });
fs.writeFileSync(plist, '<plist/>');
const inv = scanInstall(deps());
expect(inv.service.launchdPlist).toBe(plist);
expect(inv.service.systemdUserUnit).toBeUndefined();
});
it('detects systemd user unit and pidfile on Linux', () => {
const unit = path.join(
home,
'.config',
'systemd',
'user',
`${getSystemdUnit(root)}.service`,
);
fs.mkdirSync(path.dirname(unit), { recursive: true });
fs.writeFileSync(unit, '[Unit]');
fs.writeFileSync(path.join(root, 'nanoclaw.pid'), '12345');
const inv = scanInstall(deps({ platform: 'linux' }));
expect(inv.service.systemdUserUnit).toBe(unit);
expect(inv.service.pidFile).toBe(path.join(root, 'nanoclaw.pid'));
expect(inv.service.launchdPlist).toBeUndefined();
});
it('captures container ids and image when docker is up', () => {
const inv = scanInstall(deps({ runCommand: dockerUp(['abc123', 'def456'], true) }));
expect(inv.service.containerIds).toEqual(['abc123', 'def456']);
expect(inv.service.image).toMatch(/^nanoclaw-agent-v2-[0-9a-f]{8}:latest$/);
expect(inv.notes).toEqual([]);
});
it('degrades with a manual-cleanup note when docker is unavailable', () => {
const inv = scanInstall(deps());
expect(inv.service.containerIds).toEqual([]);
expect(inv.service.image).toBeUndefined();
expect(inv.notes.some((n) => n.includes("'docker' unavailable"))).toBe(true);
});
});
describe('scanInstall ncl symlink', () => {
const link = () => path.join(home, '.local', 'bin', 'ncl');
it('includes the symlink only when it targets this checkout', () => {
fs.mkdirSync(path.dirname(link()), { recursive: true });
fs.symlinkSync(path.join(root, 'bin', 'ncl'), link());
const inv = scanInstall(deps());
expect(inv.service.nclSymlink).toBe(link());
});
it('leaves a symlink pointing at another copy, with a note', () => {
fs.mkdirSync(path.dirname(link()), { recursive: true });
fs.symlinkSync('/some/other/copy/bin/ncl', link());
const inv = scanInstall(deps());
expect(inv.service.nclSymlink).toBeUndefined();
expect(inv.notes.some((n) => n.includes('points to another NanoClaw copy'))).toBe(true);
});
});
describe('scanInstall OneCLI agents', () => {
const vault = JSON.stringify({
data: [
{ id: 'u-1', identifier: 'ag-mine', name: 'Mine', isDefault: false },
{ id: 'u-2', identifier: 'ag-other', name: 'Other', isDefault: false },
],
});
const onecliUp = fakeRun({ onecli: () => ({ status: 0, stdout: vault }) });
it('splits mine vs orphans against the central DB', () => {
fs.mkdirSync(path.join(root, 'data'));
const db = new Database(path.join(root, 'data', 'v2.db'));
db.exec('CREATE TABLE agent_groups (id TEXT PRIMARY KEY)');
db.prepare('INSERT INTO agent_groups (id) VALUES (?)').run('ag-mine');
db.close();
const inv = scanInstall(deps({ runCommand: onecliUp }));
expect(inv.onecli.idsKnown).toBe(true);
expect(inv.onecli.mine.map((a) => a.identifier)).toEqual(['ag-mine']);
expect(inv.onecli.orphans.map((a) => a.identifier)).toEqual(['ag-other']);
});
it('flags orphan labels as unreliable when the DB is unreadable', () => {
const inv = scanInstall(deps({ runCommand: onecliUp }));
expect(inv.onecli.idsKnown).toBe(false);
expect(inv.onecli.mine).toEqual([]);
expect(inv.onecli.orphans.map((a) => a.identifier)).toEqual(['ag-mine', 'ag-other']);
expect(inv.notes.some((n) => n.includes("Couldn't read agent_groups"))).toBe(true);
});
});
describe('detectExistingInstall', () => {
it('is false for an empty checkout', () => {
expect(detectExistingInstall(root)).toBe(false);
});
it('is true when the central DB exists', () => {
fs.mkdirSync(path.join(root, 'data'));
const db = new Database(path.join(root, 'data', 'v2.db'));
db.close();
expect(detectExistingInstall(root)).toBe(true);
});
});
+278
@@ -0,0 +1,278 @@
/**
* Uninstall inventory scan: find every artifact this checkout created.
*
* Everything NanoClaw creates is tagged with the per-checkout install slug
* (sha1(projectRoot)[:8]), so several copies can coexist on one machine.
* The scan reports ONLY things belonging to the given project root; shared
* tools (the OneCLI app/vault, shell PATH lines, host-wide config) are
* never inventoried.
*
* External commands (docker, onecli) go through the injected `runCommand`
* so tests can fake them; filesystem checks are real (tests use temp dirs).
* A missing/down docker daemon degrades to an empty result plus a note with
* manual cleanup commands; it never throws.
*
* Deliberately does NOT import src/config.ts (import-time side effects).
*/
import fs from 'fs';
import os from 'os';
import path from 'path';
import {
getContainerImageBase,
getInstallSlug,
getLaunchdLabel,
getSystemdUnit,
} from '../../src/install-slug.js';
import {
listVaultAgents,
readAgentGroupIds,
splitVaultAgents,
type RunCommand,
type VaultAgent,
} from './onecli-agents.js';
export interface PathItem {
/** Human label, e.g. "Database & conversations". */
what: string;
/** Display location (tilde-abbreviated). */
where: string;
/** Absolute path to remove. */
path: string;
}
export interface ServiceInventory {
launchdPlist?: string;
systemdUserUnit?: string;
systemdSystemUnit?: string;
pidFile?: string;
containerIds: string[];
image?: string;
nclSymlink?: string;
}
export interface OnecliInventory {
mine: VaultAgent[];
orphans: VaultAgent[];
/** False when agent_groups couldn't be read — orphan labels are then unreliable. */
idsKnown: boolean;
}
export interface Inventory {
slug: string;
projectRoot: string;
containerRuntime: string;
service: ServiceInventory;
/** Group 2: app data, logs & secrets. */
data: PathItem[];
/**
* dist/ + node_modules/: displayed with the data group but removed dead
* last, because the uninstaller itself runs on tsx out of node_modules.
*/
runtime: PathItem[];
/** Group 3: groups/ and store/ — user content, unrecoverable. */
user: PathItem[];
onecli: OnecliInventory;
notes: string[];
}
export interface ScanDeps {
projectRoot: string;
home: string;
platform: NodeJS.Platform;
runCommand: RunCommand;
}
export function tilde(p: string, home: string): string {
return p.startsWith(home) ? `~${p.slice(home.length)}` : p;
}
export function scanInstall(deps: ScanDeps): Inventory {
const { projectRoot, home, runCommand } = deps;
const slug = getInstallSlug(projectRoot);
const containerRuntime = process.env.CONTAINER_RUNTIME ?? 'docker';
const notes: string[] = [];
const service = scanService(deps, slug, containerRuntime, notes);
const data = existingItems(projectRoot, home, [
{ rel: 'data', what: 'Database & conversations' },
{ rel: 'logs', what: 'Logs' },
{ rel: '.env', what: 'Secrets / API keys (.env)', where: 'backed up before removal' },
{ rel: 'start-nanoclaw.sh', what: 'Start script', where: 'start-nanoclaw.sh' },
{ rel: 'nanoclaw.pid', what: 'PID file', where: 'nanoclaw.pid' },
]);
const runtime = existingItems(projectRoot, home, [
{ rel: 'dist', what: 'Build output' },
{ rel: 'node_modules', what: 'Installed dependencies' },
]);
const user = existingItems(projectRoot, home, [
{ rel: 'groups', what: 'Agent memory & files' },
{ rel: 'store', what: 'Migrated data store' },
]);
const onecli = scanOnecli(projectRoot, runCommand, notes);
return {
slug,
projectRoot,
containerRuntime,
service,
data,
runtime,
user,
onecli,
notes,
};
}
/**
* Cheap existing-install probe for mid-setup detection: service registration
* (per-platform) or a central DB. No docker or onecli calls.
*/
export function detectExistingInstall(projectRoot: string): boolean {
if (fs.existsSync(path.join(projectRoot, 'data', 'v2.db'))) return true;
const home = os.homedir();
if (process.platform === 'darwin') {
return fs.existsSync(
path.join(home, 'Library', 'LaunchAgents', `${getLaunchdLabel(projectRoot)}.plist`),
);
}
if (process.platform === 'linux') {
const unit = getSystemdUnit(projectRoot);
return (
fs.existsSync(path.join(home, '.config', 'systemd', 'user', `${unit}.service`)) ||
fs.existsSync(`/etc/systemd/system/${unit}.service`)
);
}
return false;
}
function scanService(
deps: ScanDeps,
slug: string,
containerRuntime: string,
notes: string[],
): ServiceInventory {
const { projectRoot, home, platform, runCommand } = deps;
const service: ServiceInventory = { containerIds: [] };
if (platform === 'darwin') {
const plist = path.join(
home,
'Library',
'LaunchAgents',
`${getLaunchdLabel(projectRoot)}.plist`,
);
if (fs.existsSync(plist)) service.launchdPlist = plist;
} else if (platform === 'linux') {
const unit = getSystemdUnit(projectRoot);
const userUnit = path.join(home, '.config', 'systemd', 'user', `${unit}.service`);
const systemUnit = `/etc/systemd/system/${unit}.service`;
if (fs.existsSync(userUnit)) service.systemdUserUnit = userUnit;
if (fs.existsSync(systemUnit)) service.systemdSystemUnit = systemUnit;
const pidFile = path.join(projectRoot, 'nanoclaw.pid');
if (fs.existsSync(pidFile)) service.pidFile = pidFile;
}
// Container label matches what container-runner.ts stamps at spawn time.
const installLabel = `nanoclaw-install=${slug}`;
const image = `${getContainerImageBase(projectRoot)}:latest`;
let runtimeOk = true;
try {
const ps = runCommand(containerRuntime, [
'ps',
'-aq',
'--filter',
`label=${installLabel}`,
]);
if (ps.status === 0) {
service.containerIds = ps.stdout
.split('\n')
.map((s) => s.trim())
.filter(Boolean);
} else {
runtimeOk = false;
}
} catch {
runtimeOk = false;
}
if (runtimeOk) {
try {
const inspect = runCommand(containerRuntime, ['image', 'inspect', image]);
if (inspect.status === 0) service.image = image;
} catch {
runtimeOk = false;
}
}
if (!runtimeOk) {
notes.push(
`Containers/image: '${containerRuntime}' unavailable; remove later with: ` +
`${containerRuntime} ps -aq --filter label=${installLabel} | xargs -r ${containerRuntime} rm -f; ` +
`${containerRuntime} rmi ${image}`,
);
}
const link = path.join(home, '.local', 'bin', 'ncl');
let linkStat: fs.Stats | null = null;
try {
linkStat = fs.lstatSync(link);
} catch {
linkStat = null;
}
if (linkStat?.isSymbolicLink()) {
let target = fs.readlinkSync(link);
if (!path.isAbsolute(target)) {
target = path.resolve(path.dirname(link), target);
}
if (path.resolve(target) === path.join(projectRoot, 'bin', 'ncl')) {
service.nclSymlink = link;
} else {
notes.push(
`ncl command ${tilde(link, home)} points to another NanoClaw copy; left untouched.`,
);
}
}
return service;
}
function scanOnecli(
projectRoot: string,
runCommand: RunCommand,
notes: string[],
): OnecliInventory {
const vault = listVaultAgents(runCommand);
if (!vault.available || vault.agents.length === 0) {
return { mine: [], orphans: [], idsKnown: false };
}
const { ids, known } = readAgentGroupIds(path.join(projectRoot, 'data', 'v2.db'));
const { mine, orphans } = splitVaultAgents(vault.agents, ids, known);
if (!known && orphans.length > 0) {
notes.push(
"Couldn't read agent_groups from data/v2.db; OneCLI agents shown as 'orphan' may actually belong to this copy.",
);
}
return { mine, orphans, idsKnown: known };
}
function existingItems(
projectRoot: string,
home: string,
specs: { rel: string; what: string; where?: string }[],
): PathItem[] {
const items: PathItem[] = [];
for (const spec of specs) {
const p = path.join(projectRoot, spec.rel);
if (!fs.existsSync(p)) continue;
items.push({
what: spec.what,
where: spec.where ?? `${tilde(p, home)}/`,
path: p,
});
}
return items;
}
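One edge case in the `tilde` display helper above: a plain `startsWith(home)` test also matches a sibling directory whose name merely begins with the home path. The sketch below reproduces `tilde` as it appears in the diff and compares it with `tildeStrict`, a hypothetical variant that is not part of the change. POSIX separators are assumed.

```typescript
// Shape taken from the diff: plain string-prefix test.
function tilde(p: string, home: string): string {
  return p.startsWith(home) ? `~${p.slice(home.length)}` : p;
}

// Hypothetical stricter variant (an assumption, not in the diff): only
// abbreviates when `home` is a whole leading path segment of `p`.
function tildeStrict(p: string, home: string): string {
  if (p === home) return '~';
  const prefix = home.endsWith('/') ? home : `${home}/`;
  return p.startsWith(prefix) ? `~/${p.slice(prefix.length)}` : p;
}

const inside = '/home/al/nanoclaw/data';
const sibling = '/home/alice/nanoclaw/data';

const a = tilde(inside, '/home/al'); // "~/nanoclaw/data"
const b = tilde(sibling, '/home/al'); // "~ice/nanoclaw/data" (misleading)
const c = tildeStrict(sibling, '/home/al'); // left unabbreviated
```

Since `where` is display-only and removal uses the absolute `path`, the prefix collision affects what the user is shown, not what gets deleted.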
+12
@@ -44,6 +44,9 @@ export interface DeliveryAddress {
*/
export interface InboundEvent {
channelType: string;
/** Receiving adapter instance; stamped host-side (src/index.ts onInbound).
* Absent (e.g. CLI onInboundEvent) means the default instance (= channelType). */
instance?: string;
platformId: string;
threadId: string | null;
message: {
@@ -112,6 +115,15 @@ export interface ChannelAdapter {
name: string;
channelType: string;
/**
* Adapter-instance name: distinguishes N adapters of one platform
* (e.g. three Slack apps in one workspace). Defaults to channelType.
* channelType stays the SEMANTIC platform key (user ids '<channelType>:<handle>',
* formatting, container config); instance is a host-side routing key only.
* Must be unique across active adapters and URL-safe (no '/', '?', ':').
*/
instance?: string;
/**
* Whether this adapter models conversations as threads.
*
+118 -2
@@ -30,19 +30,24 @@ function now() {
/** Create a mock ChannelAdapter for testing. */
function createMockAdapter(
channelType: string,
): ChannelAdapter & { delivered: OutboundMessage[]; inbound: InboundMessage[] } {
instance?: string,
): ChannelAdapter & { delivered: OutboundMessage[]; inbound: InboundMessage[]; setupTimes: number[] } {
const delivered: OutboundMessage[] = [];
const inbound: InboundMessage[] = [];
const setupTimes: number[] = [];
let setupConfig: ChannelSetup | null = null;
return {
name: channelType,
name: instance ?? channelType,
channelType,
instance,
supportsThreads: false,
delivered,
inbound,
setupTimes,
async setup(config: ChannelSetup) {
setupTimes.push(Date.now());
setupConfig = config;
},
@@ -117,6 +122,117 @@ describe('channel registry', () => {
});
});
describe('channel registry — instance keying', () => {
// Fresh module per test: the registry and activeAdapters maps are
// module-level, and these arms register conflicting same-channelType
// adapters that must not leak across tests.
beforeEach(() => {
vi.resetModules();
});
afterEach(async () => {
const { teardownChannelAdapters } = await import('./channel-registry.js');
await teardownChannelAdapters();
// Drop this test's registrations so later describe blocks (which import
// the registry without resetting) start from an empty registry instead
// of inheriting same-channelType pairs.
vi.resetModules();
});
const mockSetup = () => ({
onInbound: () => {},
onInboundEvent: () => {},
onMetadata: () => {},
onAction: () => {},
});
it('keys two same-channelType adapters by instance — both resolvable', async () => {
const reg = await import('./channel-registry.js');
const worker = createMockAdapter('slack', 'slack-worker');
const tester = createMockAdapter('slack', 'slack-tester');
reg.registerChannelAdapter('slack-worker', { factory: () => worker });
reg.registerChannelAdapter('slack-tester', { factory: () => tester });
await reg.initChannelAdapters(mockSetup);
expect(reg.getChannelAdapter('slack-worker')).toBe(worker);
expect(reg.getChannelAdapter('slack-tester')).toBe(tester);
expect(reg.getActiveAdapters()).toHaveLength(2);
});
it('resolves channelType to the default-instance adapter when one exists, else first-registered', async () => {
const reg = await import('./channel-registry.js');
const named = createMockAdapter('slack', 'slack-tester');
const unnamed = createMockAdapter('slack');
reg.registerChannelAdapter('slack-tester', { factory: () => named });
reg.registerChannelAdapter('slack', { factory: () => unnamed });
await reg.initChannelAdapters(mockSetup);
// Exact key (default instance keyed by channelType) beats the fallback
// scan, even though the named sibling registered first.
expect(reg.getChannelAdapter('slack')).toBe(unnamed);
// With ONLY named instances active, channelType still resolves —
// deterministic first-registered fallback.
await reg.teardownChannelAdapters();
vi.resetModules();
const reg2 = await import('./channel-registry.js');
const first = createMockAdapter('slack', 'slack-tester');
const second = createMockAdapter('slack', 'slack-worker');
reg2.registerChannelAdapter('slack-tester', { factory: () => first });
reg2.registerChannelAdapter('slack-worker', { factory: () => second });
await reg2.initChannelAdapters(mockSetup);
expect(reg2.getChannelAdapter('slack')).toBe(first);
});
it('does NOT reroute default-instance outbound through a named sibling when the default adapter is missing', async () => {
// The default Slack app is offline (token rotated, factory returned
// null, …) while a named sibling boots fine. Outbound for the default
// instance must get the offline-adapter handling (drop into the retry
// path) — NEVER a cross-identity send through the sibling bot.
const reg = await import('./channel-registry.js');
const tester = createMockAdapter('slack', 'slack-tester');
reg.registerChannelAdapter('slack-tester', { factory: () => tester });
reg.registerChannelAdapter('slack', { factory: () => null });
await reg.initChannelAdapters(mockSetup);
// Exact lookup (delivery/typing path): the default key resolves nothing.
expect(reg.getChannelAdapterExact('slack')).toBeUndefined();
// Fallback-capable lookup (channelType-only callers) still resolves.
expect(reg.getChannelAdapter('slack')).toBe(tester);
// The delivery bridge dispatches by exact key: a default-instance
// message (instance === channelType after backfill) is dropped, not
// delivered through the sibling's identity.
const bridge = reg.createChannelDeliveryAdapter();
const result = await bridge.deliver(
'slack',
'slack:C1',
null,
'chat',
JSON.stringify({ text: 'to the default bot' }),
undefined,
'slack',
);
expect(result).toBeUndefined();
expect(tester.delivered).toHaveLength(0);
// Sanity: the same bridge DOES deliver when the exact instance is live.
await bridge.deliver(
'slack',
'slack:C1',
null,
'chat',
JSON.stringify({ text: 'to the tester bot' }),
undefined,
'slack-tester',
);
expect(tester.delivered).toHaveLength(1);
});
});
describe('channel + router integration', () => {
beforeEach(async () => {
if (fs.existsSync(TEST_DIR)) fs.rmSync(TEST_DIR, { recursive: true });
+85 -6
@@ -4,7 +4,8 @@
* Channels self-register on import. The host calls initChannelAdapters() at startup
* to instantiate and set up all registered adapters.
*/
import type { ChannelAdapter, ChannelRegistration, ChannelSetup } from './adapter.js';
import type { ChannelAdapter, ChannelRegistration, ChannelSetup, OutboundFile } from './adapter.js';
import type { ChannelDeliveryAdapter } from '../delivery.js';
import { log } from '../log.js';
const SETUP_RETRY_DELAYS_MS = [2000, 5000, 10000];
@@ -26,9 +27,79 @@ export function registerChannelAdapter(name: string, registration: ChannelRegist
registry.set(name, registration);
}
/** Get a live adapter by channel type. */
export function getChannelAdapter(channelType: string): ChannelAdapter | undefined {
return activeAdapters.get(channelType);
/** Get a live adapter by its EXACT registry key (instance name; default
* instances are keyed by channelType itself). No channelType fallback:
* callers that address a specific instance (outbound delivery, typing)
* must never be rerouted through a sibling instance: that would send
* through the wrong bot identity with the wrong token. A missing key
* means the owning adapter is offline; callers apply their normal
* offline-adapter handling. */
export function getChannelAdapterExact(key: string): ChannelAdapter | undefined {
return activeAdapters.get(key);
}
/** Get a live adapter by instance name, falling back to any adapter of the
* given channel type. The fallback exists ONLY for channelType-only callers
* (user-id prefix resolution and cold DMs in user-dm.ts, approval delivery
* in channel-approval.ts, the router's thread-policy probe when an event
* carries no instance); they must still resolve when every instance of a
* platform is named. First registered wins (Map insertion order,
* deterministic). Default instances are keyed by channelType itself, so
* single-instance installs always hit the exact-key path. Instance-addressed
* dispatch (delivery, typing) must use getChannelAdapterExact instead. */
export function getChannelAdapter(key: string): ChannelAdapter | undefined {
const exact = activeAdapters.get(key);
if (exact) return exact;
for (const [registryKey, adapter] of activeAdapters) {
if (adapter.channelType === key) {
log.warn('Channel adapter fallback: requested key resolved through a differently-keyed instance', {
requested: key,
resolvedKey: registryKey,
});
return adapter;
}
}
return undefined;
}
/**
* Build the host's outbound delivery bridge: dispatches delivery-poll and
* typing traffic into the adapter registry. Resolution is EXACT-key only:
* `instance ?? channelType`. For default-instance messaging_groups rows the
* stored instance IS the channelType, which matches default-registered
* adapters, so single-instance behavior is unchanged. A named instance whose
* adapter is offline gets the normal offline-adapter handling (warn + drop
* into the delivery retry path), never a cross-identity send through a
* sibling bot of the same platform.
*/
export function createChannelDeliveryAdapter(): ChannelDeliveryAdapter {
return {
async deliver(
channelType: string,
platformId: string,
threadId: string | null,
kind: string,
content: string,
files?: OutboundFile[],
instance?: string,
): Promise<string | undefined> {
const adapter = getChannelAdapterExact(instance ?? channelType);
if (!adapter) {
log.warn('No adapter for channel type', { channelType, instance });
return;
}
return adapter.deliver(platformId, threadId, { kind, content: JSON.parse(content), files });
},
async setTyping(
channelType: string,
platformId: string,
threadId: string | null,
instance?: string,
): Promise<void> {
const adapter = getChannelAdapterExact(instance ?? channelType);
await adapter?.setTyping?.(platformId, threadId);
},
};
}
/** Get all active adapters. */
@@ -85,8 +156,16 @@ export async function initChannelAdapters(setupFn: (adapter: ChannelAdapter) =>
throw err;
}
}
activeAdapters.set(adapter.channelType, adapter);
log.info('Channel adapter started', { channel: name, type: adapter.channelType });
// Adapters key by instance (default instance = channelType), so N
// instances of one platform coexist. Duplicate keys warn instead of
// throwing — boot stays resilient, matching the historical silent
// last-write-wins, but now visibly.
const key = adapter.instance ?? adapter.channelType;
if (activeAdapters.has(key)) {
log.warn('Duplicate adapter instance key — overwriting previous adapter', { key, channel: name });
}
activeAdapters.set(key, adapter);
log.info('Channel adapter started', { channel: name, type: adapter.channelType, instance: key });
} catch (err) {
log.error('Failed to start channel adapter', { channel: name, err });
}
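The difference between the exact and fallback lookups in this file can be seen in isolation with a plain `Map`. This is a minimal sketch under stated assumptions, not the registry itself: `MiniAdapter`, `register`, `getExact`, and `getWithFallback` are hypothetical stand-ins for `ChannelAdapter`, the keying step in `initChannelAdapters`, `getChannelAdapterExact`, and `getChannelAdapter`, and the warning log is omitted.

```typescript
// Hypothetical stand-in for ChannelAdapter: only the two routing fields.
interface MiniAdapter {
  channelType: string;
  instance?: string;
}

const active = new Map<string, MiniAdapter>();

// Key by instance, defaulting to channelType (same rule as the diff).
function register(adapter: MiniAdapter): void {
  active.set(adapter.instance ?? adapter.channelType, adapter);
}

// Exact-key lookup: a missing key means "that adapter is offline".
function getExact(key: string): MiniAdapter | undefined {
  return active.get(key);
}

// Exact key first, then the first registered adapter of that channelType
// (Map iteration follows insertion order).
function getWithFallback(key: string): MiniAdapter | undefined {
  const exact = active.get(key);
  if (exact) return exact;
  for (const adapter of active.values()) {
    if (adapter.channelType === key) return adapter;
  }
  return undefined;
}

// Only named Slack instances are live; there is no default 'slack' adapter.
register({ channelType: 'slack', instance: 'slack-tester' });
register({ channelType: 'slack', instance: 'slack-worker' });

const exactHit = getExact('slack'); // undefined: no default instance
const fallbackHit = getWithFallback('slack'); // the 'slack-tester' adapter
```

With only named instances live, the exact lookup resolves nothing for `'slack'`, which is what keeps delivery from going out through a sibling bot, while the fallback lookup still gives channelType-only callers a deterministic answer.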
+112
@@ -0,0 +1,112 @@
/**
* Approval-card actor byline in the Chat SDK bridge.
*
* Drives the bridge's real onAction handler through the real Chat SDK
* dispatch (`chat.processAction`): `bridge.setup()` registers the handler on
* a real Chat instance, which the test captures from the webhook-server
* registration (mocked so no HTTP server binds a port). After a button click
* the bridge edits the card; the edit must append " — <actor>" so shared
* channels see who resolved an approval. Goes red if the byLine concatenation
* is removed from the edited markdown.
*/
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
import type { Adapter, Chat } from 'chat';
const captured = vi.hoisted(() => ({ chat: null as unknown }));
vi.mock('../webhook-server.js', () => ({
registerWebhookAdapter: vi.fn((chat: unknown) => {
captured.chat = chat;
}),
}));
import { closeDb, initTestDb, runMigrations } from '../db/index.js';
import type { ChannelSetup } from './adapter.js';
import { createChatSdkBridge } from './chat-sdk-bridge.js';
interface CapturedEdit {
threadId: string;
messageId: string;
markdown: string;
}
function makeAdapter(edits: CapturedEdit[]): Adapter {
return {
name: 'stub',
initialize: async () => {},
channelIdFromThreadId: (threadId: string) => `stub:${threadId}`,
editMessage: async (threadId: string, messageId: string, content: { markdown: string }) => {
edits.push({ threadId, messageId, markdown: content.markdown });
},
} as unknown as Adapter;
}
async function fireAction(user: Record<string, unknown>): Promise<{ edits: CapturedEdit[]; actions: string[] }> {
const edits: CapturedEdit[] = [];
const actions: string[] = [];
const adapter = makeAdapter(edits);
const bridge = createChatSdkBridge({ adapter, supportsThreads: false });
await bridge.setup({
onInbound: async () => {},
onInboundEvent: async () => {},
onMetadata: () => {},
onAction: (questionId: string, selectedOption: string, userId: string) => {
actions.push(`${questionId}:${selectedOption}:${userId}`);
},
} as ChannelSetup);
const chat = captured.chat as Chat;
expect(chat).toBeTruthy();
await chat.processAction(
{
actionId: 'ncq:q-1:approve',
adapter,
messageId: 'msg-1',
raw: {},
threadId: 'T-1',
user: user as never,
value: 'approve',
},
undefined,
);
return { edits, actions };
}
beforeEach(() => {
captured.chat = null;
const db = initTestDb();
runMigrations(db);
});
afterEach(() => {
closeDb();
});
describe('chat-sdk-bridge approval-card byline', () => {
it('appends the acting user to the edited card markdown', async () => {
const { edits, actions } = await fireAction({ userId: 'U1', userName: 'gavriel', fullName: 'Gavriel C' });
expect(edits).toHaveLength(1);
expect(edits[0].threadId).toBe('T-1');
expect(edits[0].messageId).toBe('msg-1');
expect(edits[0].markdown).toContain('approve — gavriel');
expect(actions).toEqual(['q-1:approve:U1']);
});
it('falls back to fullName when userName is missing', async () => {
const { edits } = await fireAction({ userId: 'U2', fullName: 'Gavriel C' });
expect(edits).toHaveLength(1);
expect(edits[0].markdown).toContain('— Gavriel C');
});
it('omits the byline when the actor has no name', async () => {
const { edits } = await fireAction({ userId: 'U3' });
expect(edits).toHaveLength(1);
expect(edits[0].markdown).not.toContain('—');
expect(edits[0].markdown).toContain('approve');
});
});
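The byline rule these tests exercise can be sketched in isolation. This is a minimal illustration, assuming a hypothetical `resolvedCardMarkdown` helper and `ActorSketch` type that are not part of the bridge. Only the fallback order (userName, then fullName, then no byline) and the separator are taken from the tests above.

```typescript
// Hypothetical helper and type, for illustration only; not part of chat-sdk-bridge.ts.
interface ActorSketch {
  userName?: string;
  fullName?: string;
}

function resolvedCardMarkdown(title: string, selectedLabel: string, user?: ActorSketch): string {
  // Prefer the handle, fall back to the display name, otherwise omit the byline.
  const actorName = user?.userName || user?.fullName || '';
  const byLine = actorName ? ` — ${actorName}` : '';
  return `${title}\n\n${selectedLabel}${byLine}`;
}
```

Keeping the separator inside `byLine` means a nameless actor yields exactly the pre-change markdown, which is what the third test case relies on.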
+146 -1
@@ -1,9 +1,13 @@
import { describe, expect, it } from 'vitest';
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
import type { Adapter, AdapterPostableMessage, RawMessage } from 'chat';
import { createChatSdkBridge, splitForLimit } from './chat-sdk-bridge.js';
vi.mock('../webhook-server.js', () => ({
registerWebhookAdapter: vi.fn(),
}));
function stubAdapter(partial: Partial<Adapter>): Adapter {
return { name: 'stub', ...partial } as unknown as Adapter;
}
@@ -93,6 +97,147 @@ describe('createChatSdkBridge', () => {
});
});
describe('createChatSdkBridge — instance identity', () => {
it('default: name === channelType === adapter.name, instance undefined', () => {
const bridge = createChatSdkBridge({
adapter: stubAdapter({ name: 'slack' }),
supportsThreads: true,
});
expect(bridge.name).toBe('slack');
expect(bridge.channelType).toBe('slack');
expect(bridge.instance).toBeUndefined();
});
it('named instance: name follows the instance, channelType stays the platform', () => {
const bridge = createChatSdkBridge({
adapter: stubAdapter({ name: 'slack' }),
instance: 'slack-tester',
supportsThreads: true,
});
expect(bridge.name).toBe('slack-tester');
expect(bridge.channelType).toBe('slack');
expect(bridge.instance).toBe('slack-tester');
});
it('rejects instance names that would break the webhook route or state delimiter', () => {
for (const bad of ['a/b', 'a:b', 'a?b', 'a b']) {
expect(() =>
createChatSdkBridge({ adapter: stubAdapter({ name: 'slack' }), instance: bad, supportsThreads: true }),
).toThrow(/URL-safe/);
}
});
it('rejects empty and whitespace-only instance names (config bug — fail loud)', () => {
// '' is falsy: a truthiness guard would skip it, dead-ending the
// webhook route ('/webhook/' + '') and collapsing the state namespace
// into the default instance's unprefixed keyspace — the exact
// cross-bot dedupe/lock collisions the namespace exists to prevent.
for (const bad of ['', ' ', '   ', '\t']) {
expect(() =>
createChatSdkBridge({ adapter: stubAdapter({ name: 'slack' }), instance: bad, supportsThreads: true }),
).toThrow(/URL-safe/);
}
});
});
describe('createChatSdkBridge.setup — webhook route and state namespace', () => {
// Real setup() over a stub adapter: Chat.initialize() needs a working
// StateAdapter (chat_sdk_* tables) and an adapter.initialize — nothing
// platform-side. registerWebhookAdapter is mocked at module level so we
// can assert the (chat, adapterName, routingPath) triple.
function setupStubAdapter(): Adapter {
return stubAdapter({
name: 'slack',
initialize: async () => {},
} as unknown as Partial<Adapter>);
}
beforeEach(async () => {
const { initTestDb } = await import('../db/connection.js');
const { runMigrations } = await import('../db/migrations/index.js');
runMigrations(initTestDb());
const { registerWebhookAdapter } = await import('../webhook-server.js');
vi.mocked(registerWebhookAdapter).mockClear();
});
afterEach(async () => {
const { closeDb } = await import('../db/connection.js');
closeDb();
});
const hostConfig = {
onInbound: () => {},
onInboundEvent: () => {},
onMetadata: () => {},
onAction: () => {},
};
it('named instance registers the webhook with adapterName as handler key and instance as route', async () => {
const { registerWebhookAdapter } = await import('../webhook-server.js');
const bridge = createChatSdkBridge({
adapter: setupStubAdapter(),
instance: 'slack-tester',
supportsThreads: true,
});
await bridge.setup(hostConfig);
expect(registerWebhookAdapter).toHaveBeenCalledTimes(1);
const [, adapterName, routingPath] = vi.mocked(registerWebhookAdapter).mock.calls[0];
expect(adapterName).toBe('slack');
expect(routingPath).toBe('slack-tester');
await bridge.teardown();
});
it('default instance registers the historical route', async () => {
const { registerWebhookAdapter } = await import('../webhook-server.js');
const bridge = createChatSdkBridge({ adapter: setupStubAdapter(), supportsThreads: true });
await bridge.setup(hostConfig);
const [, adapterName, routingPath] = vi.mocked(registerWebhookAdapter).mock.calls[0];
expect(adapterName).toBe('slack');
expect(routingPath ?? adapterName).toBe('slack');
await bridge.teardown();
});
it('named instance namespaces Chat SDK state; default stays unprefixed (live-install constraint)', async () => {
const { getDb } = await import('../db/connection.js');
const named = createChatSdkBridge({
adapter: setupStubAdapter(),
instance: 'slack-tester',
supportsThreads: true,
});
await named.setup(hostConfig);
await named.subscribe!('slack:C1', 'slack:T1');
const def = createChatSdkBridge({ adapter: setupStubAdapter(), supportsThreads: true });
await def.setup(hostConfig);
await def.subscribe!('slack:C1', 'slack:T1');
const rows = getDb().prepare('SELECT thread_id FROM chat_sdk_subscriptions ORDER BY thread_id').all() as Array<{
thread_id: string;
}>;
expect(rows.map((r) => r.thread_id)).toEqual(['slack-tester:slack:T1', 'slack:T1']);
await named.teardown();
await def.teardown();
});
it('explicitly naming the primary instance after the platform stays on the unprefixed keyspace', async () => {
const { getDb } = await import('../db/connection.js');
const bridge = createChatSdkBridge({
adapter: setupStubAdapter(),
instance: 'slack', // explicit, but equal to adapter.name ⇒ default keyspace
supportsThreads: true,
});
await bridge.setup(hostConfig);
await bridge.subscribe!('slack:C1', 'slack:T9');
const rows = getDb().prepare('SELECT thread_id FROM chat_sdk_subscriptions').all() as Array<{
thread_id: string;
}>;
expect(rows.map((r) => r.thread_id)).toEqual(['slack:T9']);
await bridge.teardown();
});
});
describe('createChatSdkBridge.deliver — display cards (send_card)', () => {
// The send_card MCP tool writes outbound rows with `{ type: 'card', card, fallbackText }`.
// Before this branch existed the bridge silently dropped them: cards have no
+42 -7
@@ -47,6 +47,15 @@ export type ReplyContextExtractor = (raw: Record<string, any>) => ReplyContext |
export interface ChatSdkBridgeConfig {
adapter: Adapter;
/**
* Adapter-instance name for running multiple bridges of one platform
* (e.g. several Slack apps in one workspace). Defaults to the platform
* name. Drives the registry key, the webhook route (/webhook/<instance>),
* and the Chat SDK state namespace. channelType is NOT affected: user
* identity, formatting, and container config stay keyed on the platform.
* Must be URL-safe: non-empty, only letters, digits, '.', '_' or '-'.
*/
instance?: string;
concurrency?: ConcurrencyStrategy;
/** Bot token for authenticating forwarded Gateway events (required for interaction handling). */
botToken?: string;
@@ -121,6 +130,19 @@ export function splitForLimit(text: string, limit: number): string[] {
export function createChatSdkBridge(config: ChatSdkBridgeConfig): ChannelAdapter {
const { adapter } = config;
// The instance name becomes a webhook route segment (the route regex is
// [^/?]+) and ':' is the state-namespace delimiter — reject anything that
// would break either, at construction time rather than at first webhook.
// Positive allow-list (not a deny-list): also rejects '' and
// whitespace-only names, which are config bugs — '' is falsy, so it
// would skip a truthiness guard, dead-end the webhook route, and
// collapse the state namespace into the default instance's keyspace.
if (config.instance !== undefined && !/^[A-Za-z0-9._-]+$/.test(config.instance)) {
throw new Error(
`chat-sdk bridge instance ${JSON.stringify(config.instance)} must be URL-safe: ` +
`non-empty, only letters, digits, '.', '_' or '-'`,
);
}
const transformText = (t: string): string => (config.transformOutboundText ? config.transformOutboundText(t) : t);
let chat: Chat;
let state: SqliteStateAdapter;
@@ -193,14 +215,21 @@ export function createChatSdkBridge(config: ChatSdkBridgeConfig): ChannelAdapter
}
const bridge: ChannelAdapter = {
name: adapter.name,
channelType: adapter.name,
name: config.instance ?? adapter.name,
channelType: adapter.name, // unchanged — semantic platform key
instance: config.instance, // undefined ⇒ default instance
supportsThreads: config.supportsThreads,
async setup(hostConfig: ChannelSetup) {
setupConfig = hostConfig;
state = new SqliteStateAdapter();
// State namespace: ONLY for a named non-default instance. A skill
// that explicitly names the primary instance after the platform
// (instance === adapter.name) still lands on the legacy UNPREFIXED
// keyspace — prefixing the default would orphan every live install's
// chat_sdk_subscriptions/kv/locks/lists rows.
state = new SqliteStateAdapter(config.instance && config.instance !== adapter.name ? config.instance : undefined);
chat = new Chat({
adapters: { [adapter.name]: adapter },
@@ -284,11 +313,13 @@ export function createChatSdkBridge(config: ChatSdkBridgeConfig): ChannelAdapter
const matched = render?.options.find((o) => o.value === selectedOption);
const selectedLabel = matched?.selectedLabel ?? selectedOption ?? '(clicked)';
// Update the card to show the selected answer and remove buttons
// Update the card to show the selected answer, who acted, and remove buttons
const actorName = event.user?.userName || event.user?.fullName || '';
const byLine = actorName ? ` — ${actorName}` : '';
try {
const tid = event.threadId;
await adapter.editMessage(tid, event.messageId, {
markdown: `${title}\n\n${selectedLabel}`,
markdown: `${title}\n\n${selectedLabel}${byLine}`,
});
} catch (err) {
log.warn('Failed to update card after action', { err });
@@ -358,8 +389,12 @@ export function createChatSdkBridge(config: ChatSdkBridgeConfig): ChannelAdapter
startGateway();
log.info('Gateway listener started', { adapter: adapter.name });
} else {
// Non-gateway adapters (Slack, Teams, GitHub, etc.) — register on the shared webhook server
registerWebhookAdapter(chat, adapter.name);
// Non-gateway adapters (Slack, Teams, GitHub, etc.) — register on the
// shared webhook server. The handler key stays adapter.name (the
// Chat instance's webhooks map is keyed by it); the route segment is
// the instance, so each same-platform bridge gets its own URL (and
// its own signing secret — platforms sign per-app).
registerWebhookAdapter(chat, adapter.name, config.instance ?? adapter.name);
}
log.info('Chat SDK bridge initialized', { adapter: adapter.name });
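The instance-name rules in this file fit in a few lines. The sketch below is illustrative only: the helper names `assertUrlSafeInstance`, `stateNamespaceFor`, and `webhookRouteFor` are hypothetical and do not exist in the bridge. The allow-list regex and the two fallback rules are copied from the diff above.

```typescript
// Hypothetical helpers, for illustration only; not part of chat-sdk-bridge.ts.
const URL_SAFE_INSTANCE = /^[A-Za-z0-9._-]+$/;

// Positive allow-list: rejects '', whitespace, '/', ':' and '?' in one check.
function assertUrlSafeInstance(instance: string | undefined): void {
  if (instance !== undefined && !URL_SAFE_INSTANCE.test(instance)) {
    throw new Error(`instance ${JSON.stringify(instance)} must be URL-safe`);
  }
}

// Only a named, non-default instance gets a state prefix; naming the
// instance after the platform stays on the unprefixed keyspace.
function stateNamespaceFor(instance: string | undefined, platform: string): string | undefined {
  return instance && instance !== platform ? instance : undefined;
}

// The webhook route segment is the instance, defaulting to the platform name.
function webhookRouteFor(instance: string | undefined, platform: string): string {
  return instance ?? platform;
}
```

Validating once at construction time lets both later helpers assume a non-empty, delimiter-free name.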
+2 -2
@@ -90,8 +90,8 @@ describe('groups CLI delete cascades dependent rows (#2525)', () => {
now(),
);
db.prepare(
`INSERT INTO messaging_groups (id, channel_type, platform_id, name, is_group, unknown_sender_policy, created_at)
VALUES (?, 'telegram', 'tg-1', 'chat', 1, 'strict', ?)`,
`INSERT INTO messaging_groups (id, channel_type, platform_id, instance, name, is_group, unknown_sender_policy, created_at)
VALUES (?, 'telegram', 'tg-1', 'telegram', 'chat', 1, 'strict', ?)`,
).run(MGID, now());
db.prepare(
+23
@@ -26,6 +26,12 @@ vi.mock('./db/sessions.js', () => ({
const mockWriteSessionMessage = vi.fn();
vi.mock('./session-manager.js', () => ({
writeSessionMessage: (...args: unknown[]) => mockWriteSessionMessage(...args),
openInboundDb: () => ({}),
}));
const mockCountDueMessages = vi.fn((..._args: unknown[]) => 0);
vi.mock('./db/session-db.js', () => ({
countDueMessages: (...args: unknown[]) => mockCountDueMessages(...args),
}));
import { restartAgentGroupContainers } from './container-restart.js';
@@ -148,4 +154,21 @@ describe('restartAgentGroupContainers', () => {
expect(mockWriteSessionMessage.mock.calls[0][1]).toBe('s1');
expect(mockWriteSessionMessage.mock.calls[1][1]).toBe('s2');
});
it('wakes even without a wake message when in-flight messages are pending', () => {
// A provider switch mid-conversation kills a container holding claimed
// messages — without an immediate respawn those messages stay dark until
// the next inbound or a slow sweep backoff.
mockGetSessionsByAgentGroup.mockReturnValue([makeSession('s1', 'ag1')]);
mockIsContainerRunning.mockReturnValue(true);
mockCountDueMessages.mockReturnValue(2);
restartAgentGroupContainers('ag1', 'provider switch');
const onExit = mockKillContainer.mock.calls[0][2] as () => void;
expect(typeof onExit).toBe('function');
mockGetSession.mockReturnValue(makeSession('s1', 'ag1'));
onExit();
expect(mockWakeContainer).toHaveBeenCalled();
});
});
+8 -2
@@ -5,9 +5,10 @@
* wakes a fresh container via the onExit callback race-free.
*/
import { isContainerRunning, killContainer, wakeContainer } from './container-runner.js';
import { countDueMessages } from './db/session-db.js';
import { getSession, getSessionsByAgentGroup } from './db/sessions.js';
import { log } from './log.js';
import { writeSessionMessage } from './session-manager.js';
import { openInboundDb, writeSessionMessage } from './session-manager.js';
/**
* Kill all running containers for an agent group and respawn them.
@@ -40,10 +41,15 @@ export function restartAgentGroupContainers(agentGroupId: string, reason: string
onWake: 1,
});
}
// Always respawn after the kill when there is anything to process: an
// explicit wake message, or in-flight messages the dying container had
// claimed. Without this, a provider switch mid-conversation leaves the
// claimed messages dark until the next inbound or a slow sweep backoff.
const hasPending = countDueMessages(openInboundDb(session.agent_group_id, session.id)) > 0;
killContainer(
session.id,
reason,
wakeMessage
wakeMessage || hasPending
? () => {
const s = getSession(session.id);
if (s) wakeContainer(s);
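The respawn decision in this hunk reduces to one predicate. A minimal sketch, assuming a hypothetical `onExitFor` helper that stands in for the inline ternary; the real code passes the callback straight to `killContainer`.

```typescript
// Hypothetical stand-in for the inline decision in restartAgentGroupContainers.
function onExitFor(
  wakeMessage: string | undefined,
  dueCount: number,
  respawn: () => void,
): (() => void) | undefined {
  // Respawn when there is an explicit wake message OR the dying container
  // still had claimed, unprocessed messages.
  const hasPending = dueCount > 0;
  return wakeMessage || hasPending ? respawn : undefined;
}
```

Returning `undefined` when neither condition holds keeps the old behaviour for an idle session: the container is killed and stays down until the next inbound message.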
+35
@@ -1,3 +1,5 @@
import fs from 'fs';
import path from 'path';
import { describe, expect, it } from 'vitest';
import { resolveProviderName } from './container-runner.js';
@@ -25,3 +27,36 @@ describe('resolveProviderName', () => {
expect(resolveProviderName(null, '')).toBe('claude');
});
});
describe('buildContainerArgs ordering invariant (structural)', () => {
// The OneCLI gateway apply (SDK applyContainerConfig) appends credential-stub
// mounts — e.g. the codex auth.json sentinel nested INSIDE our RW
// /home/node/.codex mount. Docker applies binds in argument order, so the
// stub must land AFTER its parent mount or the parent shadows it and the
// agent silently degrades to loginless auth. Driving the real
// buildContainerArgs needs a live gateway + container runtime, so this
// guards the invariant structurally: the gateway apply must appear after
// the volume-mounts loop in the source.
it('applies the OneCLI gateway after the volume mounts', () => {
const src = fs.readFileSync(path.join(process.cwd(), 'src', 'container-runner.ts'), 'utf-8');
const mountsLoop = src.indexOf('for (const mount of mounts)');
const gatewayApply = src.indexOf('onecli.applyContainerConfig');
expect(mountsLoop).toBeGreaterThan(-1);
expect(gatewayApply).toBeGreaterThan(-1);
expect(gatewayApply).toBeGreaterThan(mountsLoop);
});
});
describe('container boot-failure tripwire (structural)', () => {
// A container that dies at boot (unknown provider, missing CLI binary, bad
// config) explains itself only on stderr — which logs at debug, below the
// default level. The spawn handler must keep a stderr tail and surface it
// at warn on a non-zero exit, or the operator sees only "exited code 1" on
// repeat. Driving a real failing spawn needs a container runtime, so this
// guards the wiring structurally, matching the invariant test above.
it('surfaces the stderr tail when the container exits non-zero', () => {
const src = fs.readFileSync(path.join(process.cwd(), 'src', 'container-runner.ts'), 'utf-8');
expect(src).toContain('stderrTail.push(line)');
expect(src).toMatch(/Container exited non-zero.*stderrTail/s);
});
});
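The structural test above only checks that the wiring exists in the source. The behaviour it guards can be sketched as a small bounded buffer plus a log-level decision. This is an illustration under assumptions: `createStderrTail` and `exitLogLevel` are hypothetical names, and the real handler keeps the tail inline in `spawnContainer` with a fixed limit of 10.

```typescript
// Hypothetical helpers, for illustration only; not part of container-runner.ts.
function createStderrTail(limit: number): { push(chunk: string): void; snapshot(): string[] } {
  const lines: string[] = [];
  return {
    push(chunk: string): void {
      for (const line of chunk.trim().split('\n')) {
        if (!line) continue;
        lines.push(line);
        // Keep only the most recent `limit` lines.
        if (lines.length > limit) lines.shift();
      }
    },
    snapshot(): string[] {
      return [...lines];
    },
  };
}

// code === null means killed by signal (normal shutdown), not a boot failure.
function exitLogLevel(code: number | null, tailLength: number): 'warn' | 'info' {
  return code !== 0 && code !== null && tailLength > 0 ? 'warn' : 'info';
}
```

Bounding the tail keeps memory flat for long-lived containers while still surfacing the last few lines when a boot fails.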
+91 -54
@@ -21,8 +21,9 @@ import {
} from './config.js';
import { materializeContainerJson } from './container-config.js';
import { getContainerConfig } from './db/container-configs.js';
import { updateContainerConfigScalars, updateContainerConfigJson } from './db/container-configs.js';
import { updateContainerConfigScalars } from './db/container-configs.js';
import { CONTAINER_RUNTIME_BIN, hostGatewayArgs, readonlyMountArgs, stopContainer } from './container-runtime.js';
import { EGRESS_NETWORK, egressNetworkArgs, ensureEgressNetwork } from './egress-lockdown.js';
import { composeGroupClaudeMd } from './claude-md-compose.js';
import { getAgentGroup } from './db/agent-groups.js';
import { getDb, hasTable } from './db/connection.js';
@@ -35,6 +36,7 @@ import { validateAdditionalMounts } from './modules/mount-security/index.js';
import './providers/index.js';
import {
getProviderContainerConfig,
providerProvidesAgentSurfaces,
type ProviderContainerContribution,
type VolumeMount,
} from './providers/provider-container-registry.js';
@@ -126,12 +128,19 @@ async function spawnContainer(session: Session): Promise<void> {
// and buildContainerArgs so we don't re-read.
const containerConfig = materializeContainerJson(agentGroup.id);
// Per-group filesystem state lives forever after first creation. Init is
// idempotent: it only writes paths that don't already exist, so this call
// is a no-op for groups that have spawned before. Runs before the provider
// contribution so a surfaces-providing provider finds the group dir ready.
const providerName = resolveProviderName(session.agent_provider, containerConfig.provider);
initGroupFilesystem(agentGroup, { provider: providerName });
// Resolve the effective provider + any host-side contribution it declares
// (extra mounts, env passthrough). Computed once and threaded through both
// buildMounts and buildContainerArgs so side effects (mkdir, etc.) fire once.
const { provider, contribution } = resolveProviderContribution(session, agentGroup, containerConfig);
const mounts = buildMounts(agentGroup, session, containerConfig, contribution);
const mounts = buildMounts(agentGroup, session, containerConfig, provider, contribution);
const containerName = `nanoclaw-v2-${agentGroup.folder}-${Date.now()}`;
// OneCLI agent identifier is always the agent group id — stable across
// sessions and reversible via getAgentGroup() for approval routing.
@@ -159,10 +168,16 @@ async function spawnContainer(session: Session): Promise<void> {
activeContainers.set(session.id, { process: container, containerName });
markContainerRunning(session.id);
// Log stderr
// Log stderr. A container that dies at boot (unknown provider, missing
// binary, bad config) explains itself only here — and debug is below the
// default log level — so keep a tail to surface on a non-zero exit.
const stderrTail: string[] = [];
container.stderr?.on('data', (data) => {
for (const line of data.toString().trim().split('\n')) {
if (line) log.debug(line, { container: agentGroup.folder });
if (!line) continue;
log.debug(line, { container: agentGroup.folder });
stderrTail.push(line);
if (stderrTail.length > 10) stderrTail.shift();
}
});
@@ -178,7 +193,12 @@ async function spawnContainer(session: Session): Promise<void> {
activeContainers.delete(session.id);
markContainerStopped(session.id);
stopTypingRefresh(session.id);
log.info('Container exited', { sessionId: session.id, code, containerName });
// code null = killed by signal (normal shutdown path), not a boot failure.
if (code !== 0 && code !== null && stderrTail.length > 0) {
log.warn('Container exited non-zero', { sessionId: session.id, code, containerName, stderrTail });
} else {
log.info('Container exited', { sessionId: session.id, code, containerName });
}
});
container.on('error', (err) => {
@@ -233,32 +253,37 @@ function resolveProviderContribution(
? fn({
sessionDir: sessionDir(agentGroup.id, session.id),
agentGroupId: agentGroup.id,
groupDir: path.resolve(GROUPS_DIR, agentGroup.folder),
selectedSkills: selectedSkillNames(containerConfig),
hostEnv: process.env,
})
: {};
return { provider, contribution };
}
function buildMounts(
export function buildMounts(
agentGroup: AgentGroup,
session: Session,
containerConfig: import('./container-config.js').ContainerConfig,
provider: string,
providerContribution: ProviderContainerContribution,
): VolumeMount[] {
const projectRoot = process.cwd();
// Per-group filesystem state lives forever after first creation. Init is
// idempotent: it only writes paths that don't already exist, so this call
// is a no-op for groups that have spawned before.
initGroupFilesystem(agentGroup);
// Default agent surfaces (composed project doc, skill links, provider state
// dir) apply unless the provider's registration declares it provides its
// own — a capability, never a provider name. See provider-container-registry.
const defaultSurfaces = !providerProvidesAgentSurfaces(provider);
// Sync skill symlinks based on container.json selection before mounting.
const claudeDir = path.join(DATA_DIR, 'v2-sessions', agentGroup.id, '.claude-shared');
syncSkillSymlinks(claudeDir, containerConfig);
if (defaultSurfaces) {
// Sync skill symlinks based on container.json selection before mounting.
syncSkillSymlinks(claudeDir, containerConfig);
// Compose CLAUDE.md fresh every spawn from the shared base, enabled skill
// fragments, and MCP server instructions. See `claude-md-compose.ts`.
composeGroupClaudeMd(agentGroup);
// Compose CLAUDE.md fresh every spawn from the shared base, enabled skill
// fragments, and MCP server instructions. See `claude-md-compose.ts`.
composeGroupClaudeMd(agentGroup);
}
const mounts: VolumeMount[] = [];
const sessDir = sessionDir(agentGroup.id, session.id);
@@ -285,11 +310,11 @@ function buildMounts(
// already RO-mounted, so writes through it fail regardless — no need for
// a nested mount there.
const composedClaudeMd = path.join(groupDir, 'CLAUDE.md');
if (fs.existsSync(composedClaudeMd)) {
if (defaultSurfaces && fs.existsSync(composedClaudeMd)) {
mounts.push({ hostPath: composedClaudeMd, containerPath: '/workspace/agent/CLAUDE.md', readonly: true });
}
const fragmentsDir = path.join(groupDir, '.claude-fragments');
if (fs.existsSync(fragmentsDir)) {
if (defaultSurfaces && fs.existsSync(fragmentsDir)) {
mounts.push({ hostPath: fragmentsDir, containerPath: '/workspace/agent/.claude-fragments', readonly: true });
}
@@ -302,13 +327,15 @@ function buildMounts(
// Shared CLAUDE.md — read-only, imported by the composed entry point via
// the `.claude-shared.md` symlink inside the group dir.
const sharedClaudeMd = path.join(process.cwd(), 'container', 'CLAUDE.md');
if (fs.existsSync(sharedClaudeMd)) {
if (defaultSurfaces && fs.existsSync(sharedClaudeMd)) {
mounts.push({ hostPath: sharedClaudeMd, containerPath: '/app/CLAUDE.md', readonly: true });
}
// Per-group .claude-shared at /home/node/.claude (Claude state, settings,
// skill symlinks)
mounts.push({ hostPath: claudeDir, containerPath: '/home/node/.claude', readonly: false });
if (defaultSurfaces) {
mounts.push({ hostPath: claudeDir, containerPath: '/home/node/.claude', readonly: false });
}
// Shared agent-runner source — read-only, same code for all groups.
const agentRunnerSrc = path.join(projectRoot, 'container', 'agent-runner', 'src');
@@ -345,25 +372,7 @@ function syncSkillSymlinks(claudeDir: string, containerConfig: import('./contain
fs.mkdirSync(skillsDir, { recursive: true });
}
// Determine desired skill set
const projectRoot = process.cwd();
const sharedSkillsDir = path.join(projectRoot, 'container', 'skills');
let desired: string[];
if (containerConfig.skills === 'all') {
// Recompute from shared dir — newly-added upstream skills appear automatically
desired = fs.existsSync(sharedSkillsDir)
? fs.readdirSync(sharedSkillsDir).filter((e) => {
try {
return fs.statSync(path.join(sharedSkillsDir, e)).isDirectory();
} catch {
return false;
}
})
: [];
} else {
desired = containerConfig.skills;
}
const desired = selectedSkillNames(containerConfig);
const desiredSet = new Set(desired);
// Remove symlinks not in the desired set
@@ -396,12 +405,30 @@ function syncSkillSymlinks(claudeDir: string, containerConfig: import('./contain
}
}
/**
* Resolve the group's skill selection to concrete names — `'all'` recomputes
* from `container/skills/` so newly-added upstream skills appear automatically.
*/
function selectedSkillNames(containerConfig: import('./container-config.js').ContainerConfig): string[] {
if (containerConfig.skills !== 'all') return containerConfig.skills;
const sharedSkillsDir = path.join(process.cwd(), 'container', 'skills');
return fs.existsSync(sharedSkillsDir)
? fs.readdirSync(sharedSkillsDir).filter((e) => {
try {
return fs.statSync(path.join(sharedSkillsDir, e)).isDirectory();
} catch {
return false;
}
})
: [];
}
async function buildContainerArgs(
mounts: VolumeMount[],
containerName: string,
agentGroup: AgentGroup,
containerConfig: import('./container-config.js').ContainerConfig,
provider: string,
_provider: string,
providerContribution: ProviderContainerContribution,
agentIdentifier?: string,
): Promise<string[]> {
@@ -418,22 +445,14 @@ async function buildContainerArgs(
}
}
// OneCLI gateway — injects HTTPS_PROXY + certs so container API calls
// are routed through the agent vault for credential injection. Treated as
// a transient hard failure: if we can't wire the gateway, we don't spawn.
// The caller (router or host-sweep) catches the throw, leaves the inbound
// message pending, and the next sweep tick retries.
if (agentIdentifier) {
await onecli.ensureAgent({ name: agentGroup.name, identifier: agentIdentifier });
// Egress lockdown when enabled — throws if it can't be established, aborting
// the spawn rather than running with open egress. Otherwise the host gateway.
if (ensureEgressNetwork()) {
args.push(...egressNetworkArgs());
log.info('Egress lockdown active', { containerName, network: EGRESS_NETWORK });
} else {
args.push(...hostGatewayArgs());
}
const onecliApplied = await onecli.applyContainerConfig(args, { addHostMapping: false, agent: agentIdentifier });
if (!onecliApplied) {
throw new Error('OneCLI gateway not applied — refusing to spawn container without credentials');
}
log.info('OneCLI gateway applied', { containerName });
// Host gateway
args.push(...hostGatewayArgs());
// User mapping
const hostUid = process.getuid?.();
@@ -452,6 +471,24 @@ async function buildContainerArgs(
}
}
// OneCLI gateway — injects HTTPS_PROXY + certs so container API calls
// are routed through the agent vault for credential injection, and mounts
// any credential stubs the gateway serves (e.g. a sentinel auth file).
// Runs AFTER the volume mounts so a stub nested inside one of our mounts
// (a parent dir mounted RW above it) lands later in the args and isn't
// shadowed by it. Treated as a transient hard failure: if we can't wire
// the gateway, we don't spawn. The caller (router or host-sweep) catches
// the throw, leaves the inbound message pending, and the next sweep tick
// retries.
if (agentIdentifier) {
await onecli.ensureAgent({ name: agentGroup.name, identifier: agentIdentifier });
}
const onecliApplied = await onecli.applyContainerConfig(args, { addHostMapping: false, agent: agentIdentifier });
if (!onecliApplied) {
throw new Error('OneCLI gateway not applied — refusing to spawn container without credentials');
}
log.info('OneCLI gateway applied', { containerName });
// Override entrypoint: run v2 entry point directly via Bun (no tsc, no stdin).
args.push('--entrypoint', 'bash');
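The ordering invariant that motivates moving the gateway apply can be shown with a toy argument builder. This is a sketch under stated assumptions: `MountSketch`, `bindArgs`, and `buildArgsSketch` are hypothetical, the host paths are made up, and the plain `-v host:container[:ro]` form stands in for whatever the real `readonlyMountArgs` and `applyContainerConfig` emit. The container paths come from the test comment earlier in this diff.

```typescript
// Hypothetical shapes, for illustration only; the real VolumeMount type and
// argument builders live elsewhere in the codebase.
interface MountSketch {
  hostPath: string;
  containerPath: string;
  readonly: boolean;
}

// Assumption: one plain `-v host:container[:ro]` bind per mount.
function bindArgs(mounts: MountSketch[]): string[] {
  const args: string[] = [];
  for (const m of mounts) {
    args.push('-v', `${m.hostPath}:${m.containerPath}${m.readonly ? ':ro' : ''}`);
  }
  return args;
}

function buildArgsSketch(mounts: MountSketch[], gatewayStubs: MountSketch[]): string[] {
  const args: string[] = ['run', '--rm'];
  args.push(...bindArgs(mounts)); // our volume mounts first
  args.push(...bindArgs(gatewayStubs)); // gateway stubs last, after any parent mount
  return args;
}
```

With a parent mount at `/home/node/.codex` and a stub at `/home/node/.codex/auth.json`, the stub's bind appears later in the argument list, which is the property the structural test checks in the source.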
+255
@@ -0,0 +1,255 @@
/**
* Channel-instance dimension tests (migration 016 + messaging-groups queries).
*
* Covers the three load-bearing rules:
* 1. Backfill/default instance = channel_type everywhere it isn't set,
* so single-instance installs behave byte-identically.
* 2. UNIQUE(channel_type, platform_id, instance): siblings coexist,
* single-bot pair-uniqueness is preserved via the default value.
* 3. Lookup asymmetry: inbound (getMessagingGroupWithAgentCount) is
* exact-on-instance with NO fallback (unknown named instance -> null ->
* router auto-creates instead of hijacking a sibling's row); outbound
* (getMessagingGroupByPlatform) is default-instance-first.
*
* The wired-DB arm reproduces the failure mode that bit migration 011: a
* table recreate on a live DB with FK children. It must pass with
* disableForeignKeys: true and fail without it.
*/
import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import { initTestDb, closeDb, getDb } from './connection.js';
import { runMigrations, migrations, type Migration } from './migrations/index.js';
import {
createMessagingGroup,
getMessagingGroupByPlatform,
getMessagingGroupWithAgentCount,
} from './messaging-groups.js';
import type { MessagingGroup } from '../types.js';
function now(): string {
return new Date().toISOString();
}
function mg(overrides: Partial<MessagingGroup> & { id: string }): MessagingGroup {
return {
channel_type: 'slack',
platform_id: 'slack:C1',
name: null,
is_group: 1,
unknown_sender_policy: 'public',
created_at: now(),
...overrides,
};
}
afterEach(() => {
closeDb();
});
describe('migration 016 — fresh DB', () => {
beforeEach(() => {
const db = initTestDb();
runMigrations(db);
});
it('adds a NOT NULL instance column', () => {
const cols = getDb().prepare("PRAGMA table_info('messaging_groups')").all() as Array<{
name: string;
notnull: number;
}>;
const instance = cols.find((c) => c.name === 'instance');
expect(instance).toBeDefined();
expect(instance!.notnull).toBe(1);
});
it('createMessagingGroup without instance stamps instance = channel_type', () => {
createMessagingGroup(mg({ id: 'mg-default' }));
const row = getDb().prepare("SELECT instance FROM messaging_groups WHERE id = 'mg-default'").get() as {
instance: string;
};
expect(row.instance).toBe('slack');
});
it('allows sibling instances on the same (channel_type, platform_id)', () => {
createMessagingGroup(mg({ id: 'mg-default' }));
createMessagingGroup(mg({ id: 'mg-tester', instance: 'slack-tester' }));
const count = getDb().prepare('SELECT COUNT(*) AS c FROM messaging_groups').get() as { c: number };
expect(count.c).toBe(2);
});
it('rejects a duplicate (channel_type, platform_id, instance) triple', () => {
createMessagingGroup(mg({ id: 'mg-a', instance: 'slack-tester' }));
expect(() => createMessagingGroup(mg({ id: 'mg-b', instance: 'slack-tester' }))).toThrow();
});
it('rejects a duplicate default pair (single-bot uniqueness preserved)', () => {
createMessagingGroup(mg({ id: 'mg-a' }));
expect(() => createMessagingGroup(mg({ id: 'mg-b' }))).toThrow();
});
});
describe('migration 016 — wired legacy DB upgrade (the FK recreate arm)', () => {
it('recreates messaging_groups under FK children without violations and backfills instance', () => {
const db = initTestDb();
// Bring the DB to the pre-016 schema.
runMigrations(
db,
migrations.filter((m) => m.name !== 'messaging-group-instance'),
);
const preCols = db.prepare("PRAGMA table_info('messaging_groups')").all() as Array<{ name: string }>;
expect(preCols.some((c) => c.name === 'instance')).toBe(false);
// Seed a wired install: messaging_groups with live FK children
// (messaging_group_agents + sessions reference messaging_groups.id).
// Raw SQL — the new createMessagingGroup expects the instance column.
db.prepare("INSERT INTO agent_groups (id, name, folder, created_at) VALUES ('ag-1', 'A', 'a', ?)").run(now());
db.prepare(
`INSERT INTO messaging_groups (id, channel_type, platform_id, name, is_group, unknown_sender_policy, created_at)
VALUES ('mg-1', 'telegram', 'telegram:123', 'Chat', 0, 'public', ?)`,
).run(now());
db.prepare(
`INSERT INTO messaging_group_agents (id, messaging_group_id, agent_group_id, engage_mode, sender_scope, ignored_message_policy, created_at)
VALUES ('mga-1', 'mg-1', 'ag-1', 'pattern', 'all', 'drop', ?)`,
).run(now());
db.prepare(
`INSERT INTO sessions (id, agent_group_id, messaging_group_id, created_at)
VALUES ('sess-1', 'ag-1', 'mg-1', ?)`,
).run(now());
// Upgrade: only 016 is pending now. Without disableForeignKeys this
// throws 'FOREIGN KEY constraint failed' at DROP TABLE.
expect(() => runMigrations(db)).not.toThrow();
// Backfill: existing row got instance = channel_type.
const row = db.prepare("SELECT instance FROM messaging_groups WHERE id = 'mg-1'").get() as { instance: string };
expect(row.instance).toBe('telegram');
// Children intact and pointing at the recreated parent.
expect(
db.prepare("SELECT COUNT(*) AS c FROM messaging_group_agents WHERE messaging_group_id = 'mg-1'").get(),
).toEqual({ c: 1 });
expect(db.prepare("SELECT COUNT(*) AS c FROM sessions WHERE messaging_group_id = 'mg-1'").get()).toEqual({ c: 1 });
// Full-DB FK integrity (FK enforcement was restored by the runner).
expect(db.pragma('foreign_key_check')).toEqual([]);
expect(db.pragma('foreign_keys', { simple: true })).toBe(1);
});
it('tolerates pre-existing FK orphans: the migration still applies (no boot crash-loop)', () => {
const db = initTestDb();
runMigrations(
db,
migrations.filter((m) => m.name !== 'messaging-group-instance'),
);
// Seed the orphan class that demonstrably exists on live installs
// (ensureUserDm tolerates it at runtime): a user_dms row whose
// messaging_group was deleted through a FK-OFF connection — the
// sqlite3 CLI ships with foreign_keys OFF, and operators are told to
// poke v2.db when troubleshooting.
db.prepare("INSERT INTO users (id, kind, created_at) VALUES ('slack:U1', 'slack', ?)").run(now());
db.pragma('foreign_keys = OFF');
db.prepare(
`INSERT INTO user_dms (user_id, channel_type, messaging_group_id, resolved_at)
VALUES ('slack:U1', 'slack', 'mg-deleted-via-cli', ?)`,
).run(now());
db.pragma('foreign_keys = ON');
expect(db.pragma('foreign_key_check')).toHaveLength(1);
// 016 did not create this violation — it must still apply (the runner
// diffs post-up violations against a pre-up snapshot and only throws
// on NEW ones; pre-existing ones are warned about and carried through).
expect(() => runMigrations(db)).not.toThrow();
const cols = db.prepare("PRAGMA table_info('messaging_groups')").all() as Array<{ name: string }>;
expect(cols.some((c) => c.name === 'instance')).toBe(true);
// The orphan is untouched: still present, still the only violation.
expect(db.pragma('foreign_key_check')).toHaveLength(1);
});
it('still rejects a migration that ITSELF introduces FK violations', () => {
const db = initTestDb();
runMigrations(db);
const rogue: Migration = {
version: 999,
name: 'test-rogue-fk-violation',
disableForeignKeys: true,
up: (d) => {
d.prepare("INSERT INTO users (id, kind, created_at) VALUES ('slack:U-rogue', 'slack', datetime('now'))").run();
d.prepare(
`INSERT INTO user_dms (user_id, channel_type, messaging_group_id, resolved_at)
VALUES ('slack:U-rogue', 'slack', 'mg-never-existed', datetime('now'))`,
).run();
},
};
expect(() => runMigrations(db, [...migrations, rogue])).toThrow(/left FK violations/);
// Rolled back atomically: not recorded as applied, nothing committed.
expect(db.prepare("SELECT 1 FROM schema_version WHERE name = 'test-rogue-fk-violation'").get()).toBeUndefined();
expect(db.pragma('foreign_key_check')).toEqual([]);
});
it('is idempotent — re-running the full barrel is a no-op', () => {
const db = initTestDb();
runMigrations(db);
createMessagingGroup(mg({ id: 'mg-keep', instance: 'slack-tester' }));
expect(() => runMigrations(db)).not.toThrow();
const row = db.prepare("SELECT instance FROM messaging_groups WHERE id = 'mg-keep'").get() as {
instance: string;
};
expect(row.instance).toBe('slack-tester');
});
});
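Reviewer sketch of the snapshot-diff rule the orphan and rogue-migration tests above rely on: violations that existed before `up` are carried through, and only violations that appear afterwards fail the migration. The runner's actual code is not shown in this diff, so everything here is an assumption: `FkViolation` is loosely modeled on the four columns of a `PRAGMA foreign_key_check` row, and `newViolations` / `assertNoNewViolations` are hypothetical helpers. The error text is chosen only to match the `/left FK violations/` pattern the test asserts.

```typescript
// Assumed shape of one foreign_key_check row (child table, child rowid,
// referenced table, FK constraint index).
interface FkViolation {
  table: string;
  rowid: number | null;
  parent: string;
  fkid: number;
}

const violationKey = (v: FkViolation): string =>
  JSON.stringify([v.table, v.rowid, v.parent, v.fkid]);

// Violations present after the migration that were not in the pre-up snapshot.
function newViolations(before: FkViolation[], after: FkViolation[]): FkViolation[] {
  const seen = new Set(before.map(violationKey));
  return after.filter((v) => !seen.has(violationKey(v)));
}

// Throws only when the migration itself introduced violations; the caller is
// assumed to run this inside the transaction so the throw rolls it back.
function assertNoNewViolations(name: string, before: FkViolation[], after: FkViolation[]): void {
  const fresh = newViolations(before, after);
  if (fresh.length > 0) {
    throw new Error(`migration ${name} left FK violations: ${fresh.length} new`);
  }
}
```

Diffing against a snapshot rather than requiring a clean check is what keeps an install with a pre-existing orphan from crash-looping on boot, while still catching a migration that breaks integrity itself.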
describe('lookup asymmetry — inbound exact-only vs outbound default-first', () => {
beforeEach(() => {
const db = initTestDb();
runMigrations(db);
// The named instance ('alpha-tester') sorts lexically BEFORE the
// channel type ('slack') and is inserted first — so both rowid order
// and the triple-autoindex order put it ahead of the default row.
// A query missing the `(instance = channel_type) DESC` ORDER BY would
// return it; only the deterministic default-first ordering picks
// mg-default.
createMessagingGroup(mg({ id: 'mg-tester', instance: 'alpha-tester' }));
createMessagingGroup(mg({ id: 'mg-default' }));
});
it('getMessagingGroupWithAgentCount without instance resolves the default-instance row', () => {
const found = getMessagingGroupWithAgentCount('slack', 'slack:C1');
expect(found).not.toBeNull();
expect(found!.mg.id).toBe('mg-default');
});
it('getMessagingGroupWithAgentCount with a named instance resolves exactly that row', () => {
const found = getMessagingGroupWithAgentCount('slack', 'slack:C1', 'alpha-tester');
expect(found).not.toBeNull();
expect(found!.mg.id).toBe('mg-tester');
});
it('getMessagingGroupWithAgentCount with an unknown instance returns null (no-hijack rule)', () => {
expect(getMessagingGroupWithAgentCount('slack', 'slack:C1', 'slack-unknown')).toBeNull();
});
it('getMessagingGroupByPlatform without instance prefers the default-instance row', () => {
const found = getMessagingGroupByPlatform('slack', 'slack:C1');
expect(found).toBeDefined();
expect(found!.id).toBe('mg-default');
});
it('getMessagingGroupByPlatform with explicit instance is exact', () => {
expect(getMessagingGroupByPlatform('slack', 'slack:C1', 'alpha-tester')!.id).toBe('mg-tester');
expect(getMessagingGroupByPlatform('slack', 'slack:C1', 'slack-unknown')).toBeUndefined();
});
it('getMessagingGroupByPlatform falls back deterministically when only named instances exist', () => {
const db = getDb();
db.prepare("DELETE FROM messaging_groups WHERE id = 'mg-default'").run();
createMessagingGroup(mg({ id: 'mg-zeta', instance: 'zeta' }));
const found = getMessagingGroupByPlatform('slack', 'slack:C1');
// Lexically-first named instance: 'alpha-tester' < 'zeta'.
expect(found!.id).toBe('mg-tester');
});
});
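Reviewer sketch of the lookup asymmetry exercised above, modeled on plain arrays instead of SQL. `pickOutbound` mirrors `ORDER BY (instance = channel_type) DESC, instance ASC LIMIT 1`; `pickInbound` mirrors the exact-only match with `instance` defaulting to the channel type. Both function names and the `Row` type are hypothetical. String comparison here uses JavaScript code-unit order, which agrees with SQLite's default BINARY collation for ASCII instance names only.

```typescript
interface Row {
  id: string;
  channel_type: string;
  platform_id: string;
  instance: string;
}

// Outbound: exact when an instance is given, otherwise default-first with a
// deterministic fallback to the lexically-first named instance.
function pickOutbound(rows: Row[], channelType: string, platformId: string, instance?: string): Row | undefined {
  const matches = rows.filter((r) => r.channel_type === channelType && r.platform_id === platformId);
  if (instance !== undefined) {
    return matches.find((r) => r.instance === instance);
  }
  return [...matches].sort((a, b) => {
    const aDefault = a.instance === a.channel_type ? 1 : 0;
    const bDefault = b.instance === b.channel_type ? 1 : 0;
    if (aDefault !== bDefault) return bDefault - aDefault; // default row first
    return a.instance < b.instance ? -1 : a.instance > b.instance ? 1 : 0;
  })[0];
}

// Inbound: exact-only, no fallback. An unknown instance yields null so the
// caller can create a per-instance row instead of reusing a sibling's.
function pickInbound(rows: Row[], channelType: string, platformId: string, instance: string = channelType): Row | null {
  const hit = rows.find(
    (r) => r.channel_type === channelType && r.platform_id === platformId && r.instance === instance,
  );
  return hit ?? null;
}
```

The explicit sort key matters because insertion order and index order would both put a named instance such as 'alpha-tester' ahead of 'slack'; without it the unordered query could return the wrong row.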
@@ -21,19 +21,43 @@ import { getDb, hasTable } from './connection.js';
export function createMessagingGroup(group: MessagingGroup): void {
getDb()
.prepare(
`INSERT INTO messaging_groups (id, channel_type, platform_id, name, is_group, unknown_sender_policy, created_at)
VALUES (@id, @channel_type, @platform_id, @name, @is_group, @unknown_sender_policy, @created_at)`,
`INSERT INTO messaging_groups (id, channel_type, platform_id, instance, name, is_group, unknown_sender_policy, created_at)
VALUES (@id, @channel_type, @platform_id, @instance, @name, @is_group, @unknown_sender_policy, @created_at)`,
)
.run(group);
.run({ ...group, instance: group.instance ?? group.channel_type });
}
export function getMessagingGroup(id: string): MessagingGroup | undefined {
return getDb().prepare('SELECT * FROM messaging_groups WHERE id = ?').get(id) as MessagingGroup | undefined;
}
export function getMessagingGroupByPlatform(channelType: string, platformId: string): MessagingGroup | undefined {
/**
* Outbound / cold-DM / setup lookup by platform address.
*
* Instance semantics are deliberately ASYMMETRIC with the router's
* `getMessagingGroupWithAgentCount` (exact-only): outbound callers usually
* don't know (or care) which adapter instance owns a chat, so an unset
* `instance` resolves the default instance first (instance = channel_type),
* falling back deterministically to the lexically-first named instance.
* A set `instance` is exact-only: an unknown instance returns undefined.
*/
export function getMessagingGroupByPlatform(
channelType: string,
platformId: string,
instance?: string,
): MessagingGroup | undefined {
if (instance !== undefined) {
return getDb()
.prepare('SELECT * FROM messaging_groups WHERE channel_type = ? AND platform_id = ? AND instance = ?')
.get(channelType, platformId, instance) as MessagingGroup | undefined;
}
return getDb()
.prepare('SELECT * FROM messaging_groups WHERE channel_type = ? AND platform_id = ?')
.prepare(
`SELECT * FROM messaging_groups
WHERE channel_type = ? AND platform_id = ?
ORDER BY (instance = channel_type) DESC, instance ASC
LIMIT 1`,
)
.get(channelType, platformId) as MessagingGroup | undefined;
}
@@ -46,23 +70,31 @@ export function getMessagingGroupByPlatform(channelType: string, platformId: str
*
* Returns `null` when no messaging_groups row exists for this channel.
* Returns `{ mg, agentCount: 0 }` when the row exists but has no wired
* agents. Uses the `UNIQUE(channel_type, platform_id)` index plus the
* `UNIQUE(messaging_group_id, agent_group_id)` index for the JOIN, both
* agents. Uses the `UNIQUE(channel_type, platform_id, instance)` index plus
* the `UNIQUE(messaging_group_id, agent_group_id)` index for the JOIN, both
* covered by existing SQLite auto-indexes from the UNIQUE constraints.
*
* `instance` is EXACT-ONLY, with no fallback, and is deliberately asymmetric with
* `getMessagingGroupByPlatform`'s default-instance-first resolution. An
* unknown named instance must return null so the router auto-creates a
* per-instance group instead of hijacking a sibling instance's row. The
* default param (= channelType) keeps instance-less callers resolving the
* default instance, identical to pre-instance behavior.
*/
export function getMessagingGroupWithAgentCount(
channelType: string,
platformId: string,
instance: string = channelType,
): { mg: MessagingGroup; agentCount: number } | null {
const row = getDb()
.prepare(
`SELECT mg.*, COUNT(mga.id) AS agent_count
FROM messaging_groups mg
LEFT JOIN messaging_group_agents mga ON mga.messaging_group_id = mg.id
WHERE mg.channel_type = ? AND mg.platform_id = ?
WHERE mg.channel_type = ? AND mg.platform_id = ? AND mg.instance = ?
GROUP BY mg.id`,
)
.get(channelType, platformId) as (MessagingGroup & { agent_count: number }) | undefined;
.get(channelType, platformId, instance) as (MessagingGroup & { agent_count: number }) | undefined;
if (!row) return null;
const { agent_count, ...mg } = row;
return { mg: mg as MessagingGroup, agentCount: agent_count };
@@ -72,6 +104,12 @@ export function getAllMessagingGroups(): MessagingGroup[] {
return getDb().prepare('SELECT * FROM messaging_groups ORDER BY name').all() as MessagingGroup[];
}
/**
* All messaging groups on a platform, across every adapter instance.
* Semantics intentionally unchanged by the instance dimension: channel_type
* stays the semantic platform key. No live caller today; if a caller needs
* a single instance's rows, filter on `mg.instance`.
*/
export function getMessagingGroupsByChannel(channelType: string): MessagingGroup[] {
return getDb().prepare('SELECT * FROM messaging_groups WHERE channel_type = ?').all(channelType) as MessagingGroup[];
}
@@ -0,0 +1,61 @@
/**
* Channel-instance dimension on messaging_groups.
*
* `instance` names the adapter instance that owns a chat: N adapters of one
* platform (e.g. three Slack apps in one workspace) each get their own
* messaging_groups rows. The default instance IS the channel type: every
* existing row is backfilled with `instance = channel_type`, so all existing
* lookups keep resolving the same rows with zero operator action. NOT NULL
* (instead of nullable + partial unique index) keeps every lookup two-state:
* "default instance" is just the literal value `channel_type`.
*
* Uniqueness relaxes from UNIQUE(channel_type, platform_id) to
* UNIQUE(channel_type, platform_id, instance). SQLite cannot relax a
* table-level UNIQUE in place; this requires the documented 12-step
* recreate (new table, copy, DROP, RENAME; sqlite.org/lang_altertable.html).
* DROP TABLE fails with `FOREIGN KEY constraint failed` on live DBs because five
* child tables REFERENCE messaging_groups(id) (messaging_group_agents,
* user_dms, sessions, pending_sender_approvals, pending_channel_approvals).
* This is the exact failure that forced migration 011 to abandon its rebuild (see
* its header). Hence `disableForeignKeys: true`: the runner toggles
* foreign_keys=OFF around the transaction (the pragma is a no-op inside one)
* and runs PRAGMA foreign_key_check inside it so violations roll back.
*
* Column list mirrors the live tip schema exactly (001 columns + 012's
* denied_at), verified against PRAGMA table_info on a freshly-migrated DB.
* A recreate with a stale column list silently drops data.
*/
import type Database from 'better-sqlite3';
import type { Migration } from './index.js';
export const migration016: Migration = {
version: 16,
name: 'messaging-group-instance',
disableForeignKeys: true,
up: (db: Database.Database) => {
// Idempotency guard per the 012 pattern.
const cols = db.prepare("PRAGMA table_info('messaging_groups')").all() as Array<{ name: string }>;
if (cols.some((c) => c.name === 'instance')) return;
db.exec(`
CREATE TABLE messaging_groups_new (
id TEXT PRIMARY KEY,
channel_type TEXT NOT NULL,
platform_id TEXT NOT NULL,
instance TEXT NOT NULL,
name TEXT,
is_group INTEGER DEFAULT 0,
unknown_sender_policy TEXT NOT NULL DEFAULT 'strict',
created_at TEXT NOT NULL,
denied_at TEXT,
UNIQUE(channel_type, platform_id, instance)
);
INSERT INTO messaging_groups_new
(id, channel_type, platform_id, instance, name, is_group, unknown_sender_policy, created_at, denied_at)
SELECT id, channel_type, platform_id, channel_type, name, is_group, unknown_sender_policy, created_at, denied_at
FROM messaging_groups;
DROP TABLE messaging_groups;
ALTER TABLE messaging_groups_new RENAME TO messaging_groups;
`);
},
};
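Reviewer sketch of a guard against the stale-column-list hazard the header warns about. The idea, assumed here and not part of this PR, is to compare the live column names (as `PRAGMA table_info` would report them) against the column list the recreate copies, and refuse to proceed if any live column would be left behind. `missingFromRecreate` is a hypothetical helper; the column names are taken from the INSERT ... SELECT above.

```typescript
// Returns the live columns that the recreate's column list does not carry
// over. A non-empty result means the copy would silently drop that data.
function missingFromRecreate(liveColumns: string[], recreateColumns: string[]): string[] {
  const carried = new Set(recreateColumns);
  return liveColumns.filter((c) => !carried.has(c));
}

// Pre-016 columns, per the SELECT list in the migration body.
const pre016Columns = [
  'id',
  'channel_type',
  'platform_id',
  'name',
  'is_group',
  'unknown_sender_policy',
  'created_at',
  'denied_at',
];

// Columns of messaging_groups_new as declared in the migration body.
const recreate016Columns = [
  'id',
  'channel_type',
  'platform_id',
  'instance',
  'name',
  'is_group',
  'unknown_sender_policy',
  'created_at',
  'denied_at',
];
```

With the lists above the result is empty; dropping `denied_at` from the recreate list would surface it as missing before any data is copied.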
