Apply suggestion from @gavrielc

feat(runner): onExchangeComplete provider hook + slash-command interruption
Inverts conversation archiving into an optional onExchangeComplete provider hook: the runner never archives on a provider's behalf, and the markdown writer ships with the provider that needs it. Dormant for the default provider.

Slash commands now interrupt an in-flight turn: a runner-handled command (/clear, /compact, /cost, …) arriving mid-turn aborts the active stream and runs immediately instead of waiting out the turn.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 18:21:47 +08:00 · 2026-06-13 16:03:02 +03:00 · 2026-06-13 15:56:43 +03:00 · 2026-06-13 15:49:40 +03:00 · 2026-06-13 12:27:29 +00:00 · 2026-06-13 12:27:24 +00:00
23 changed files with 719 additions and 96 deletions
@@ -33,7 +33,7 @@ Run `/update-nanoclaw` in Claude Code.

 **Validation**: runs `pnpm run build` and `pnpm test`. If container files changed, also runs the container typecheck and `./container/build.sh`.

-**Breaking changes check**: after validation, reads CHANGELOG.md for any `[BREAKING]` entries introduced by the update. If found, shows each breaking change and offers to run the recommended skill to migrate.
+**Breaking changes check**: after validation, reads CHANGELOG.md for any `[BREAKING]` entries introduced by the update and diffs `versions.json` for moved component pins. Each entry carries its migration path — a skill to run or a `docs/` page to follow (per CONTRIBUTING.md, "Breaking Changes") — and the skill walks you through them.

 ## Rollback

@@ -221,24 +221,31 @@ After validation succeeds, check if the update introduced any breaking changes.
 Determine which CHANGELOG entries are new by diffing against the backup tag:
 - `git diff <backup-tag-from-step-1>..HEAD -- CHANGELOG.md`

-Parse the diff output for lines that contain `[BREAKING]` anywhere in the line. Each such line is one breaking change entry. The format is:
+Parse the diff output for lines that contain `[BREAKING]` anywhere in the line. Each such line is one breaking change entry, and per CONTRIBUTING.md ("Breaking Changes") it references its migration path in one of two forms:
 ```
 [BREAKING] <description>. Run `/<skill-name>` to <action>.
+[BREAKING] <description>. **Migration:** follow [docs/<page>.md](docs/<page>.md) ...
 ```

-If no `[BREAKING]` lines are found:
+Also diff the component version pins:
+- `git diff <backup-tag-from-step-1>..HEAD -- versions.json`
+
+Each changed pin is a breaking component update (e.g. `onecli-gateway` moving means the OneCLI gateway must be upgraded). Its migration path is the `[BREAKING]` CHANGELOG entry covering it; if no new entry mentions it, search `docs/` for the pin name (convention: `docs/<component>-upgrades.md`) and treat that doc as the migration path.
+
+If no `[BREAKING]` lines are found and `versions.json` did not change:
 - Skip this step silently. Proceed to Step 7 (skill updates check).

-If one or more `[BREAKING]` lines are found:
+Otherwise:
 - Display a warning header to the user: "This update includes breaking changes that may require action:"
-- For each breaking change, display the full description.
-- Collect all skill names referenced in the breaking change entries (the `/<skill-name>` part).
-- Use AskUserQuestion to ask the user which migration skills they want to run now. Options:
+- For each breaking change, display the full description (for a moved pin without its own entry: the component name, old → new version, and the doc that covers it).
+- Use AskUserQuestion to ask the user which migrations to run now. Options:
  - One option per referenced skill (e.g., "Run /add-whatsapp to re-add WhatsApp channel")
+  - One option per referenced doc (e.g., "Upgrade the OneCLI gateway (docs/onecli-upgrades.md)")
  - "Skip — I'll handle these manually"
-- Set `multiSelect: true` so the user can pick multiple skills if there are several breaking changes.
+- Set `multiSelect: true` so the user can pick multiple migrations if there are several breaking changes.
 - For each skill the user selects, invoke it using the Skill tool.
-- After all selected skills complete (or if user chose Skip), proceed to Step 7 (skill updates check).
+- For each doc the user selects, read the doc and execute it top to bottom — these docs are written to be executed verbatim by a coding agent (detect → fix → verify → rollback). Stop and report if a verify step fails.
+- After all selected migrations complete (or if user chose Skip), proceed to Step 7 (skill updates check).

 # Step 7: Check for skill and channel/provider updates
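
The breaking-change detection described in Step 6 can be sketched as two small pure functions. This is a minimal illustration, not the skill's actual implementation: the names `parseBreakingEntries` and `movedPins`, the `Migration` type, and the assumption that `versions.json` parses to a flat name-to-version map are all hypothetical. Only the `[BREAKING]` marker and the two entry forms come from the text above.

```typescript
// Hypothetical helpers: the function names, the Migration type, and the flat
// name -> version shape assumed for versions.json are illustrative only.
type Migration =
  | { kind: 'skill'; skill: string; entry: string }
  | { kind: 'doc'; doc: string; entry: string }
  | { kind: 'unknown'; entry: string };

// Scan `git diff ... -- CHANGELOG.md` output for newly added [BREAKING] lines.
export function parseBreakingEntries(changelogDiff: string): Migration[] {
  const found: Migration[] = [];
  for (const line of changelogDiff.split('\n')) {
    // Only added lines are new entries; `+++` is the diff's file header.
    if (!line.startsWith('+') || line.startsWith('+++')) continue;
    if (!line.includes('[BREAKING]')) continue;
    const entry = line.slice(1).trim();
    const skill = /Run `\/([\w-]+)`/.exec(entry);
    const doc = /\]\((docs\/[^)\s]+\.md)\)/.exec(entry);
    if (skill) found.push({ kind: 'skill', skill: skill[1], entry });
    else if (doc) found.push({ kind: 'doc', doc: doc[1], entry });
    else found.push({ kind: 'unknown', entry });
  }
  return found;
}

// Compare two parsed versions.json snapshots and list the pins that moved.
export function movedPins(
  before: Record<string, string>,
  after: Record<string, string>,
): Array<{ name: string; from: string | undefined; to: string | undefined }> {
  const names = Array.from(new Set([...Object.keys(before), ...Object.keys(after)]));
  return names
    .filter((name) => before[name] !== after[name])
    .map((name) => ({ name, from: before[name], to: after[name] }));
}
```

An entry that matches neither form is reported as `unknown` rather than dropped, so a `[BREAKING]` line that violates the CONTRIBUTING.md rule is still shown to the user.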

@@ -2,6 +2,11 @@

 All notable changes to NanoClaw will be documented in this file.

+## [Unreleased]
+
+- [BREAKING] **`@onecli-sh/sdk` 0.5.0 -> 2.2.1 — requires a OneCLI server with the `/v1` API** (older servers 404 every SDK call). The sanctioned gateway and CLI versions are pinned in `versions.json`; the `onecli` setup step enforces them. **Migration:** [docs/onecli-upgrades.md](docs/onecli-upgrades.md).
+- **Slash commands now interrupt an in-flight turn.** A runner-handled command (`/clear`, `/compact`, `/cost`, …) arriving mid-turn aborts the active stream and runs immediately instead of waiting out the turn.
+
 ## [2.1.0] - 2026-06-07

 - [BREAKING] **Startup now requires an upgrade marker.** The host refuses to boot unless `data/upgrade-state.json` records that this install reached the current version through a sanctioned path (`/setup`, `/update-nanoclaw`, `/migrate-nanoclaw`). After this update completes — and before restarting the service — stamp the marker by running `pnpm exec tsx scripts/upgrade-state.ts set`. If the host has already tripped on restart with "update did not go through the supported path", that same command clears it. See [docs/upgrade-recovery.md](docs/upgrade-recovery.md).
@@ -19,6 +19,13 @@

 **Not accepted:** Features, capabilities, compatibility, enhancements. These should be skills.

+## Breaking Changes
+
+Breaking changes are allowed; **silent** ones are not. NanoClaw does not migrate user installs at runtime — the user's coding agent is the migrator, so every breaking change must ship a migration path that agent can execute without a human reverse-engineering the diff:
+
+1. **Every `[BREAKING]` CHANGELOG entry must reference its migration path** — either a skill to run (`Run /<skill-name> to <action>`) or a `docs/` page covering **detect / why / fix / verify / rollback** (see [docs/onecli-upgrades.md](docs/onecli-upgrades.md) for the shape). `/update-nanoclaw` surfaces these entries after every update and walks the user through them.
+2. **If the change moves an external component's sanctioned version** (gateway, pinned CLI binary, …), update its pin in [`versions.json`](versions.json). The changelog stays human-narrative; `versions.json` is the machine-checkable signal — `/update-nanoclaw` diffs it across the update and routes the user to the linked doc for any pin that moved.
+
 ## Skills

 NanoClaw uses [Claude Code skills](https://code.claude.com/docs/en/skills) — markdown files with optional supporting files that teach Claude how to do something. There are four types of skills in NanoClaw, each serving a different purpose.
@@ -27,6 +27,7 @@ import { fileURLToPath } from 'url';

 import { loadConfig } from './config.js';
 import { buildSystemPromptAddendum } from './destinations.js';
+import { ensureMemoryScaffold } from './memory-scaffold.js';
 // Providers barrel — each enabled provider self-registers on import.
 // Provider skills append imports to providers/index.ts.
 import './providers/index.js';
@@ -95,6 +96,12 @@ async function main(): Promise<void> {
    effort: config.effort,
  });

+  // Providers that lack native memory opt in via `usesMemoryScaffold`; for them
+  // the runner creates a persistent memory/ tree in its host-backed workspace at
+  // boot (idempotent). Default off — the trunk default (Claude) omits the flag
+  // and keeps its native memory untouched.
+  if (provider.usesMemoryScaffold) ensureMemoryScaffold();
+
  await runPollLoop({
    provider,
    providerName,
@@ -5,6 +5,7 @@ import { getUndeliveredMessages } from './db/messages-out.js';
 import { getPendingMessages } from './db/messages-in.js';
 import { getContinuation, setContinuation } from './db/session-state.js';
 import { MockProvider } from './providers/mock.js';
+import type { ProviderExchange } from './providers/types.js';
 import { runPollLoop } from './poll-loop.js';

 beforeEach(() => {
@@ -304,6 +305,7 @@ async function runPollLoopWithTimeout(provider: MockProvider, signal: AbortSigna
      provider,
      providerName: 'mock',
      cwd: '/tmp',
+      signal,
    }),
    new Promise<void>((_, reject) => {
      signal.addEventListener('abort', () => reject(new Error('aborted')));
@@ -324,6 +326,86 @@ function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
 }

+describe('poll loop — exchange hook (onExchangeComplete)', () => {
+  // A provider that declares the per-exchange hook. The hook call is the
+  // wiring under test — these tests go red if the poll-loop seam is severed.
+  // What the provider DOES with an exchange (e.g. write markdown into
+  // conversations/) ships with the provider, not the runner.
+  class HookedMockProvider extends MockProvider {
+    readonly exchanges: ProviderExchange[] = [];
+    onExchangeComplete(exchange: ProviderExchange): void {
+      this.exchanges.push(exchange);
+    }
+  }
+
+  it('reports each exchange to a provider that declares the hook', async () => {
+    insertMessage('m1', { sender: 'Alice', text: 'please archive this' }, { platformId: 'chan-1', channelType: 'discord' });
+
+    const provider = new HookedMockProvider({}, () => '<message to="discord-test">archived answer</message>');
+    const controller = new AbortController();
+    const loopPromise = runPollLoopWithTimeout(provider, controller.signal, 2000);
+
+    await waitFor(() => provider.exchanges.length > 0, 2000);
+    controller.abort();
+
+    expect(provider.exchanges.length).toBe(1);
+    const exchange = provider.exchanges[0];
+    expect(exchange.prompt).toContain('please archive this');
+    expect(exchange.result).toContain('archived answer');
+    expect(exchange.continuation).toStartWith('mock-session-');
+    expect(exchange.status).toBe('completed');
+
+    await loopPromise.catch(() => {});
+  });
+
+  it('does not report the internal wrapping-retry nudge as a user prompt', async () => {
+    insertMessage('m1', { sender: 'Alice', text: 'wrap this later' }, { platformId: 'chan-1', channelType: 'discord' });
+
+    let calls = 0;
+    const provider = new HookedMockProvider({}, () => {
+      calls += 1;
+      // First result is unwrapped (triggers the retry nudge), second is wrapped.
+      return calls === 1 ? 'unwrapped text' : '<message to="discord-test">wrapped now</message>';
+    });
+    const controller = new AbortController();
+    const loopPromise = runPollLoopWithTimeout(provider, controller.signal, 3000);
+
+    await waitFor(() => provider.exchanges.length >= 2, 3000);
+    controller.abort();
+
+    // Both exchanges attribute themselves to the real user prompt, never the nudge.
+    for (const exchange of provider.exchanges) {
+      expect(exchange.prompt).not.toContain('Your response was not delivered');
+      expect(exchange.prompt).toContain('wrap this later');
+    }
+    expect(provider.exchanges.map((e) => e.status)).toEqual(['undelivered', 'completed']);
+
+    await loopPromise.catch(() => {});
+  });
+
+  it('a throwing hook never breaks delivery', async () => {
+    insertMessage('m1', { sender: 'Alice', text: 'still deliver this' }, { platformId: 'chan-1', channelType: 'discord' });
+
+    class ThrowingHookProvider extends MockProvider {
+      onExchangeComplete(): void {
+        throw new Error('hook exploded');
+      }
+    }
+    const provider = new ThrowingHookProvider({}, () => '<message to="discord-test">delivered anyway</message>');
+    const controller = new AbortController();
+    const loopPromise = runPollLoopWithTimeout(provider, controller.signal, 2000);
+
+    await waitFor(() => getUndeliveredMessages().length > 0, 2000);
+    controller.abort();
+
+    const out = getUndeliveredMessages();
+    expect(out.length).toBe(1);
+    expect(out[0].content).toContain('delivered anyway');
+
+    await loopPromise.catch(() => {});
+  });
+});
+
 describe('poll loop — provider error recovery', () => {
  it('writes error to outbound and continues loop on provider throw', async () => {
    insertMessage('m1', { sender: 'Alice', text: 'trigger error' }, { platformId: 'chan-1', channelType: 'discord' });
@@ -462,3 +544,76 @@ class InvalidSessionProvider {
    };
  }
 }
+
+describe('poll loop — slash command during active query', () => {
+  it('aborts the active query when /clear arrives as a follow-up', async () => {
+    insertMessage('m-active', { sender: 'Alice', text: 'long running request' }, { platformId: 'chan-1', channelType: 'discord' });
+
+    const provider = new BlockingProvider();
+    const controller = new AbortController();
+    const loopPromise = runPollLoopWithTimeout(provider as unknown as MockProvider, controller.signal, 3000);
+
+    await waitFor(() => provider.queries === 1, 2000);
+    insertMessage('m-clear-active', { sender: 'Alice', text: '/clear' }, { platformId: 'chan-1', channelType: 'discord' });
+
+    await waitFor(() => provider.aborts === 1, 2000);
+    await waitFor(
+      () => getUndeliveredMessages().some((msg) => JSON.parse(msg.content).text === 'Session cleared.'),
+      2000,
+    );
+    controller.abort();
+
+    expect(provider.ends).toBe(0);
+    expect(getContinuation('mock')).toBeUndefined();
+    expect(getPendingMessages()).toHaveLength(0);
+
+    await loopPromise.catch(() => {});
+  });
+});
+
+/**
+ * Provider whose query never completes until ended/aborted — for testing how
+ * the loop interrupts an active stream.
+ */
+class BlockingProvider {
+  readonly supportsNativeSlashCommands = false;
+  queries = 0;
+  aborts = 0;
+  ends = 0;
+
+  isSessionInvalid(): boolean {
+    return false;
+  }
+
+  query() {
+    const owner = this;
+    this.queries += 1;
+    let wake: (() => void) | null = null;
+    let ended = false;
+    let aborted = false;
+
+    return {
+      push() {},
+      end: () => {
+        owner.ends += 1;
+        ended = true;
+        wake?.();
+      },
+      abort: () => {
+        owner.aborts += 1;
+        aborted = true;
+        wake?.();
+      },
+      events: (async function* () {
+        yield { type: 'activity' as const };
+        yield { type: 'init' as const, continuation: 'blocking-session' };
+        while (!ended && !aborted) {
+          await new Promise<void>((resolve) => {
+            wake = resolve;
+          });
+          wake = null;
+        }
+      })(),
+    };
+  }
+}
@@ -0,0 +1,37 @@
+import { describe, expect, it } from 'bun:test';
+import fs from 'fs';
+import os from 'os';
+import path from 'path';
+
+import { ensureMemoryScaffold } from './memory-scaffold.js';
+
+describe('ensureMemoryScaffold', () => {
+  it('deterministically creates the memory tree', () => {
+    const base = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-mem-'));
+    try {
+      ensureMemoryScaffold(base);
+
+      expect(fs.existsSync(path.join(base, 'memory', 'index.md'))).toBe(true);
+      expect(fs.existsSync(path.join(base, 'memory', 'system', 'definition.md'))).toBe(true);
+      expect(fs.existsSync(path.join(base, 'memory', 'memories'))).toBe(true);
+      expect(fs.existsSync(path.join(base, 'memory', 'data'))).toBe(true);
+    } finally {
+      fs.rmSync(base, { recursive: true, force: true });
+    }
+  });
+
+  it('is idempotent and never clobbers agent edits', () => {
+    const base = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-mem-'));
+    try {
+      ensureMemoryScaffold(base);
+      const indexFile = path.join(base, 'memory', 'index.md');
+      fs.writeFileSync(indexFile, '# my own index\n');
+
+      ensureMemoryScaffold(base);
+
+      expect(fs.readFileSync(indexFile, 'utf-8')).toBe('# my own index\n');
+    } finally {
+      fs.rmSync(base, { recursive: true, force: true });
+    }
+  });
+});
@@ -0,0 +1,39 @@
+import fs from 'fs';
+import path from 'path';
+import { fileURLToPath } from 'url';
+
+/**
+ * Create the agent's persistent memory scaffold, container-side, at boot.
+ *
+ * The runner owns its own workspace: it writes the memory tree straight into
+ * `/workspace/agent` (the host-backed, RW group dir, so it persists across the
+ * ephemeral container). No host-side step, nothing mounted in.
+ *
+ * The default `definition.md` / `index.md` live as real markdown templates next
+ * to this module (under `memory-templates/`) — not as strings in code — so the
+ * doctrine is editable as markdown and the agent receives an unescaped copy.
+ * They ship in the mounted `/app/src` tree, so no image change is needed.
+ *
+ * Idempotent — only writes what's missing, so the agent's own edits and
+ * accumulated memory are never clobbered on a later wake. Provider-agnostic:
+ * the runner makes no assumption about which harness is running — a provider
+ * opts in via `usesMemoryScaffold`.
+ */
+const TEMPLATES_DIR = path.join(path.dirname(fileURLToPath(import.meta.url)), 'memory-templates');
+
+export function ensureMemoryScaffold(baseDir = '/workspace/agent'): void {
+  const memoryDir = path.join(baseDir, 'memory');
+  const systemDir = path.join(memoryDir, 'system');
+
+  for (const dir of [systemDir, path.join(memoryDir, 'memories'), path.join(memoryDir, 'data')]) {
+    fs.mkdirSync(dir, { recursive: true });
+  }
+
+  copyTemplateIfMissing('definition.md', path.join(systemDir, 'definition.md'));
+  copyTemplateIfMissing('index.md', path.join(memoryDir, 'index.md'));
+}
+
+function copyTemplateIfMissing(template: string, dest: string): void {
+  if (fs.existsSync(dest)) return;
+  fs.copyFileSync(path.join(TEMPLATES_DIR, template), dest);
+}
@@ -0,0 +1,22 @@
+import { describe, expect, it } from 'bun:test';
+import fs from 'fs';
+import path from 'path';
+
+// Wiring guard for the memory-scaffold seam: the boot gate in index.ts
+// (`if (provider.usesMemoryScaffold) ensureMemoryScaffold()`) is the seam's
+// single functional reach-in. The unit tests in memory-scaffold.test.ts drive
+// ensureMemoryScaffold directly and stay green if the gate is deleted — this
+// test goes red. main() can't be driven in-process (it reads
+// /workspace/agent/container.json and enters the poll loop), so the guard is
+// structural: gate + import must both be present in the real entry point.
+describe('memory scaffold boot wiring', () => {
+  const indexSrc = fs.readFileSync(path.join(import.meta.dir, 'index.ts'), 'utf-8');
+
+  it('gates the scaffold on the provider capability in main()', () => {
+    expect(indexSrc).toContain('if (provider.usesMemoryScaffold) ensureMemoryScaffold()');
+  });
+
+  it('imports ensureMemoryScaffold from the seam module', () => {
+    expect(indexSrc).toContain("import { ensureMemoryScaffold } from './memory-scaffold.js'");
+  });
+});
@@ -0,0 +1,23 @@
+# Agent Memory System
+
+This editable file defines how your persistent memory works. It is a starting
+point, not a contract — reorganize it as the work demands. If the user or another
+memory system replaces this definition, follow the replacement.
+
+Start every memory task at `memory/index.md`, then follow the narrowest relevant index.
+Treat indexes as core data: keep them accurate and concise.
+Every folder of durable memory has its own `index.md` describing its contents.
+When an index grows past roughly 20 entries, group related items into subfolders,
+and give each new subfolder its own `index.md` linked from the parent.
+
+Use `memory/memories/` for durable facts, project context, people, decisions, and entity notes.
+Use `memory/data/` for structured reference data, datasets, tables, and reusable records.
+Use entity folders for things that matter: projects, people, places, organizations, decisions.
+
+When the user shares something that should survive future turns, store it in the
+smallest useful file; prefer updating an existing file over creating duplicates.
+Write concise, source-aware notes; include dates when timing matters.
+If a fact is corrected, update the memory and keep only useful history.
+When you add, move, or remove memory, update the nearest index.
+Before answering from memory, read the relevant index or file instead of guessing;
+if memory is missing or uncertain, say so and verify when it matters.
@@ -0,0 +1,5 @@
+# Memory Index
+
+- [Memory system definition](system/definition.md)
+- [Memories](memories/) - durable facts, people, projects, decisions
+- [Data](data/) - structured reference data
@@ -14,7 +14,7 @@ import {
  type RoutingContext,
 } from './formatter.js';
 import { isUploadTraceCommand, uploadTrace } from './upload-trace.js';
-import type { AgentProvider, AgentQuery, ProviderEvent } from './providers/types.js';
+import type { AgentProvider, AgentQuery, ProviderEvent, ProviderExchange } from './providers/types.js';

 const POLL_INTERVAL_MS = 1000;
 const ACTIVE_POLL_INTERVAL_MS = 500;
@@ -63,6 +63,12 @@ export interface PollLoopConfig {
  systemContext?: {
    instructions?: string;
  };
+  /**
+   * Optional stop signal. In production the loop runs until the container
+   * dies; tests pass a signal so an abandoned loop actually exits instead of
+   * polling forever and stealing messages from the next test's DB.
+   */
+  signal?: AbortSignal;
 }

 /**
@@ -107,6 +113,7 @@ export async function runPollLoop(config: PollLoopConfig): Promise<void> {
  let pollCount = 0;
  let isFirstPoll = true;
  while (true) {
+    if (config.signal?.aborted) return;
    // Skip system messages — they're responses for MCP tools (e.g., ask_user_question)
    const messages = getPendingMessages(isFirstPoll).filter((m) => m.kind !== 'system');
    isFirstPoll = false;
@@ -232,7 +239,15 @@ export async function runPollLoop(config: PollLoopConfig): Promise<void> {
    // can stamp it on outbound rows — needed for a2a return-path routing.
    setCurrentInReplyTo(routing.inReplyTo);
    try {
-      const result = await processQuery(query, routing, processingIds, config.providerName);
+      const result = await processQuery(
+        query,
+        routing,
+        processingIds,
+        config.providerName,
+        config.provider.onExchangeComplete?.bind(config.provider),
+        prompt,
+        continuation,
+      );
      if (result.continuation && result.continuation !== continuation) {
        continuation = result.continuation;
        setContinuation(config.providerName, continuation);
@@ -313,10 +328,18 @@ async function processQuery(
  routing: RoutingContext,
  initialBatchIds: string[],
  providerName: string,
+  onExchangeComplete: ((exchange: ProviderExchange) => void) | undefined,
+  initialPrompt: string,
+  initialContinuation: string | undefined,
 ): Promise<QueryResult> {
  let queryContinuation: string | undefined;
  let done = false;
  let unwrappedNudged = false;
+  // Prompt queue for the exchange hook: each result event consumes the
+  // oldest unanswered prompt, except a wrapping-retry result, which answers
+  // the same prompt again. The queue is maintained unconditionally, but it
+  // only has an effect when the provider implements `onExchangeComplete`.
+  const archivePrompts: string[] = [initialPrompt];

  // Concurrent polling: push follow-ups into the active query as they arrive.
  // We do NOT force-end the stream on silence — keeping the query open avoids
@@ -342,13 +365,16 @@ async function processQuery(
        // resume id (fixed at sdkQuery() time); admin/passthrough commands
        // (/compact, /cost, …) only dispatch when they're the first input
        // of a query — pushed mid-stream they arrive as plain text and
-        // the SDK never runs them. End the stream and leave the rows
-        // pending; the outer loop handles them on next iteration via the
-        // canonical command path + formatMessagesWithCommands.
+        // the SDK never runs them. Abort the active stream and leave the
+        // rows pending; the outer loop handles them on next iteration via
+        // the canonical command path + formatMessagesWithCommands. Abort,
+        // not end: end() lets an in-flight turn run to completion, which
+        // can block the command (e.g. /clear during a long task) for as
+        // long as the turn takes.
        if (pending.some((m) => isRunnerCommand(m))) {
-          log('Pending slash command — ending stream so outer loop can process');
+          log('Pending slash command — aborting active stream so outer loop can process');
          endedForCommand = true;
-          query.end();
+          query.abort();
          return;
        }

@@ -393,6 +419,7 @@ async function processQuery(
        log(`Pushing ${keep.length} follow-up message(s) into active query`);
        unwrappedNudged = false;
        query.push(prompt);
+        archivePrompts.push(prompt);
        markCompleted(keptIds);
      } catch (err) {
        // Without this catch the rejection escapes the void IIFE and Node
@@ -456,7 +483,14 @@ async function processQuery(
        markCompleted(initialBatchIds);
        if (event.text) {
          const { hasUnwrapped } = dispatchResultText(event.text, routing);
-          if (hasUnwrapped && !unwrappedNudged) {
+          const willRetryWrapping = hasUnwrapped && !unwrappedNudged;
+          notifyExchangeComplete(onExchangeComplete, {
+            prompt: archivePrompts[0] ?? initialPrompt,
+            result: event.text,
+            continuation: queryContinuation ?? initialContinuation,
+            status: hasUnwrapped ? 'undelivered' : 'completed',
+          });
+          if (willRetryWrapping) {
            unwrappedNudged = true;
            const destinations = getAllDestinations();
            const names = destinations.map((d) => d.name).join(', ');
@@ -467,9 +501,23 @@ async function processQuery(
                `Please re-send your response with the correct wrapping.</system>`,
            );
          }
+          // The wrapping-retry result answers the SAME user prompt — keep it
+          // queued so the retry archives against it, not the nudge text.
+          if (!willRetryWrapping) archivePrompts.shift();
+        } else {
+          archivePrompts.shift();
        }
      }
    }
+  } catch (err) {
+    const errMsg = err instanceof Error ? err.message : String(err);
+    notifyExchangeComplete(onExchangeComplete, {
+      prompt: archivePrompts[0] ?? initialPrompt,
+      result: `Error: ${errMsg}`,
+      continuation: queryContinuation ?? initialContinuation,
+      status: 'error',
+    });
+    throw err;
  } finally {
    done = true;
    clearInterval(pollHandle);
@@ -478,6 +526,18 @@ async function processQuery(
  return { continuation: queryContinuation };
 }

+function notifyExchangeComplete(
+  hook: ((exchange: ProviderExchange) => void) | undefined,
+  exchange: ProviderExchange,
+): void {
+  if (!hook) return;
+  try {
+    hook(exchange);
+  } catch (err) {
+    log(`onExchangeComplete failed: ${err instanceof Error ? err.message : String(err)}`);
+  }
+}
+
 function handleEvent(event: ProviderEvent, _routing: RoutingContext): void {
  switch (event.type) {
    case 'init':
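
The prompt-queue attribution in `processQuery` can be modeled in isolation. The sketch below is a simplified stand-in, not the runner's code: `attributeResults`, `ResultEvent`, and `Attributed` are hypothetical names, all prompts are queued up front instead of being pushed as follow-ups arrive, the nudge flag is never reset, and the empty-text and error paths are left out.

```typescript
// Simplified, hypothetical model of the archivePrompts queue from the hunks above.
type Status = 'completed' | 'undelivered';

interface ResultEvent {
  text: string;
  hasUnwrapped: boolean; // true when the result lacked <message> wrapping
}

interface Attributed {
  prompt: string;
  result: string;
  status: Status;
}

export function attributeResults(prompts: string[], results: ResultEvent[]): Attributed[] {
  const queue = [...prompts];
  const fallback = prompts[0] ?? '';
  let nudged = false;
  const out: Attributed[] = [];
  for (const event of results) {
    // Only the first unwrapped result triggers a wrapping retry.
    const willRetryWrapping = event.hasUnwrapped && !nudged;
    out.push({
      prompt: queue[0] ?? fallback,
      result: event.text,
      status: event.hasUnwrapped ? 'undelivered' : 'completed',
    });
    // A retry answers the same prompt again, so the prompt stays queued.
    if (willRetryWrapping) nudged = true;
    else queue.shift();
  }
  return out;
}
```

Keeping the prompt at the head of the queue during a retry is what lets both exchanges report the real user prompt instead of the internal nudge text.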
@@ -6,6 +6,25 @@ export interface AgentProvider {
   */
  readonly supportsNativeSlashCommands: boolean;

+  /**
+   * Optional. When true, the runner scaffolds a persistent `memory/` tree in the
+   * agent's workspace at boot. Providers with their own native memory (e.g.
+   * Claude's `CLAUDE.local.md`) omit this and get nothing — memory is opt-in per
+   * provider, never gated on a provider name.
+   */
+  readonly usesMemoryScaffold?: boolean;
+
+  /**
+   * Optional. Called by the poll-loop after each completed exchange (a
+   * result, a wrapping retry, or an error). Providers whose harness keeps no
+   * on-disk transcript implement this to persist exchanges themselves (e.g.
+   * markdown into the agent's `conversations/` dir); providers that persist
+   * and archive their own transcript (e.g. the Claude Agent SDK's `.jsonl`)
+   * omit it. Best-effort: the loop catches and logs anything it throws. The
+   * implementation lives with the provider, never in the runner.
+   */
+  onExchangeComplete?(exchange: ProviderExchange): void;
+
  /** Start a new query. Returns a handle for streaming input and output. */
  query(input: QueryInput): AgentQuery;

@@ -31,6 +50,16 @@ export interface AgentProvider {
  maybeRotateContinuation?(continuation: string, cwd: string): string | null;
 }

+/** One prompt/result round-trip, as reported to `onExchangeComplete`. */
+export interface ProviderExchange {
+  /** The user prompt this exchange answers (never an internal retry nudge). */
+  prompt: string;
+  result: string | null;
+  /** Continuation/thread id in effect for the exchange, if any. */
+  continuation?: string;
+  status: 'completed' | 'undelivered' | 'error';
+}
+
 /**
 * Options passed to provider constructors. Fields are common to most
 * providers; individual providers may ignore any they don't need.
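
A provider-side implementation of the hook might look like the following. This is a sketch under stated assumptions, not a provider from this PR: the class name `MarkdownArchivingProvider`, the one-file-per-continuation layout, and the markdown shape are invented for illustration, and the class omits `query()` and the other `AgentProvider` members. The `ProviderExchange` interface is copied from the hunk above so the block stands alone.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Mirrors the ProviderExchange interface added in this diff.
interface ProviderExchange {
  prompt: string;
  result: string | null;
  continuation?: string;
  status: 'completed' | 'undelivered' | 'error';
}

// Hypothetical provider fragment: only the two opt-in members are shown.
export class MarkdownArchivingProvider {
  readonly usesMemoryScaffold = true;
  private readonly conversationsDir: string;

  constructor(conversationsDir: string) {
    this.conversationsDir = conversationsDir;
  }

  // Append one markdown section per exchange; file layout is an assumption.
  onExchangeComplete(exchange: ProviderExchange): void {
    fs.mkdirSync(this.conversationsDir, { recursive: true });
    const name = (exchange.continuation ?? 'no-session').replace(/[^\w.-]/g, '_');
    const file = path.join(this.conversationsDir, `${name}.md`);
    const section = [
      `## ${exchange.status}`,
      '',
      `**Prompt:** ${exchange.prompt}`,
      '',
      `**Result:** ${exchange.result ?? '(none)'}`,
      '',
      '',
    ].join('\n');
    fs.appendFileSync(file, section);
  }
}

// Usage: archive one exchange into a throwaway directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'exchange-demo-'));
const demoProvider = new MarkdownArchivingProvider(path.join(demoDir, 'conversations'));
demoProvider.onExchangeComplete({
  prompt: 'please archive this',
  result: 'archived answer',
  continuation: 'mock-session-1',
  status: 'completed',
});
```

Because the poll loop catches and logs anything the hook throws, a failing write here would cost the archive entry but not the message delivery.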
@@ -0,0 +1,83 @@
+# Upgrading the OneCLI gateway
+
+NanoClaw talks to the OneCLI gateway (credential vault + egress proxy) through `@onecli-sh/sdk`. The gateway is an external component with its own release line, so NanoClaw pins the **sanctioned gateway version** in [`versions.json`](../versions.json) under `onecli-gateway`. When an update moves that pin, the gateway must be upgraded — this doc is the migration path. It is written to be handed to a coding agent verbatim: detect → upgrade → verify → rollback.
+
+There is deliberately **no runtime version check, and setup does not migrate the gateway for you**: the gateway is a separate out-of-band component, and the migrator is your coding agent running `/update-nanoclaw` — it diffs `versions.json` across the update and routes you here when the `onecli-gateway` pin moved. (Setup detects a pre-`/v1` gateway and points at this doc, but never upgrades it.) Run the steps below verbatim.
+
+## 1. Detect
+
+Find out what is running and what is required:
+
+```bash
+cat versions.json                                   # the sanctioned pin
+curl -s http://127.0.0.1:10254/api/health           # reports the running gateway version
+curl -s -o /dev/null -w '%{http_code}' http://127.0.0.1:10254/v1/health
+```
+
+If the last command prints `404`, the server predates the `/v1` API that `@onecli-sh/sdk` 2.x requires — every SDK call will fail with 404s that look transient but are permanent. If your gateway is remote, substitute its host for `127.0.0.1` (it's in `.env` as `ONECLI_URL` / `NANOCLAW_ONECLI_API_HOST`).
+
+Why gateways fall behind: the OneCLI installer's docker-compose tracks the `latest` image tag, but Docker never re-pulls a tag — the server freezes at whatever `latest` meant on install day.
+
+## 2. Upgrade
+
+The gateway runs as a Docker service in `~/.onecli`. Upgrade just that container to the pinned `onecli-gateway` version — vault data lives in named Docker volumes and survives. This upgrades only the gateway; the CLI binary is pinned separately (see below).
+
+**Local gateway (the common case):**
+
+```bash
+cd ~/.onecli && export ONECLI_VERSION=<onecli-gateway pin from versions.json> && docker compose pull onecli && docker compose up -d
+```
+
+**Remote gateway** — run the same command on the gateway's host (NanoClaw can't reach it over SSH).
+
+## 3. Verify
+
+Host-side health is necessary but **not sufficient**:
+
+```bash
+curl -s http://127.0.0.1:10254/v1/health     # must return {"status":"ok",...}
+```
+
+**Verify the bind interface (container reachability).** Agent containers reach the gateway over the docker bridge (`host.docker.internal` → e.g. `172.17.0.1`), so a server bound only to `127.0.0.1` boots clean host-side while every credentialed call from containers dies at the proxy:
+
+```bash
+docker run --rm --add-host=host.docker.internal:host-gateway \
+  curlimages/curl -s -o /dev/null -w '%{http_code}' http://host.docker.internal:10254/v1/health
+```
+
+This must print `200`. If it can't connect while the host-side check passed, set the bind address in `~/.onecli/.env` to the docker-bridge IP (or `0.0.0.0` on a host with a closed firewall) and `cd ~/.onecli && docker compose up -d`. Symptom if skipped: host log clean, agents fail all API calls.
+
+Finally, restart the NanoClaw service (per-install names — derive with `setup/lib/install-slug.sh`):
+
+```bash
+# macOS
+source setup/lib/install-slug.sh && launchctl kickstart -k gui/$(id -u)/$(launchd_label)
+# Linux
+source setup/lib/install-slug.sh && systemctl --user restart $(systemd_unit)
+```
+
+## 4. Rollback
+
+```bash
+cd ~/.onecli && ONECLI_VERSION=<old-version> docker compose up -d
+```
+
+If the NanoClaw update itself is being rolled back, also pin `@onecli-sh/sdk` back to its previous version in `package.json` and run `pnpm install`. Vault data is unaffected in both directions.
+
+## The CLI binary (`onecli-cli` pin)
+
+The `onecli` host CLI is pinned the same way, under `onecli-cli` in `versions.json`. Setup installs exactly that version by direct release download — it never resolves "latest". When an update moves this pin, replace the binary with the pinned release:
+
+```bash
+onecli --version                                            # detect: what is installed
+V=<onecli-cli pin from versions.json>
+OS=$(uname -s | tr '[:upper:]' '[:lower:]')                 # darwin | linux
+ARCH=$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/')   # amd64 | arm64
+curl -fsSL -o /tmp/onecli.tgz \
+  "https://github.com/onecli/onecli-cli/releases/download/v${V}/onecli_${V}_${OS}_${ARCH}.tar.gz"
+tar -xzf /tmp/onecli.tgz -C /tmp
+install -m 0755 /tmp/onecli "$(command -v onecli || echo ~/.local/bin/onecli)"
+onecli --version                                            # verify: must match versions.json
+```
+
+To roll back, run the same block after reverting `versions.json` (or checking out the previous NanoClaw version). The CLI is stateless — vault data lives in the gateway, so swapping the binary in either direction loses nothing.
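
The asset naming used in the shell block above can also be written in TypeScript. This is a minimal sketch, assuming the inputs follow Node's `process.platform` and `process.arch` naming (`x64`, `arm64`); `releaseArchiveUrl`, `normalizeOs`, and `normalizeArch` are hypothetical helper names for illustration, not functions from the repository.

```typescript
// Hypothetical helpers: mirror the archive naming in the shell block above
// (onecli_<version>_<os>_<arch>.tar.gz under the v<version> release tag).
type SupportedOs = 'darwin' | 'linux';
type SupportedArch = 'amd64' | 'arm64';

// Assumption: the platform string follows Node's process.platform naming.
function normalizeOs(nodePlatform: string): SupportedOs {
  if (nodePlatform === 'darwin') return 'darwin';
  if (nodePlatform === 'linux') return 'linux';
  throw new Error(`unsupported platform: ${nodePlatform}`);
}

// Assumption: the arch string follows Node's process.arch naming.
function normalizeArch(nodeArch: string): SupportedArch {
  if (nodeArch === 'x64') return 'amd64';
  if (nodeArch === 'arm64') return 'arm64';
  throw new Error(`unsupported architecture: ${nodeArch}`);
}

// Builds the direct release download URL for a pinned version.
// No "latest" lookup happens here: the caller supplies the pin.
function releaseArchiveUrl(version: string, nodePlatform: string, nodeArch: string): string {
  const os = normalizeOs(nodePlatform);
  const arch = normalizeArch(nodeArch);
  const archive = `onecli_${version}_${os}_${arch}.tar.gz`;
  return `https://github.com/onecli/onecli-cli/releases/download/v${version}/${archive}`;
}
```

Throwing on an unknown platform or architecture, rather than guessing an asset name, keeps a failed download from being mistaken for a missing release.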
@@ -1,6 +1,6 @@
 {
  "name": "nanoclaw",
-  "version": "2.1.11",
+  "version": "2.1.13",
  "description": "Personal Claude assistant. Lightweight, secure, customizable.",
  "type": "module",
  "packageManager": "pnpm@10.33.0",
@@ -30,7 +30,7 @@
  "dependencies": {
    "@clack/core": "^1.2.0",
    "@clack/prompts": "^1.2.0",
-    "@onecli-sh/sdk": "^0.5.0",
+    "@onecli-sh/sdk": "2.2.1",
    "better-sqlite3": "11.10.0",
    "chat": "^4.24.0",
    "cron-parser": "5.5.0",
@@ -15,8 +15,8 @@ importers:
        specifier: ^1.2.0
        version: 1.2.0
      '@onecli-sh/sdk':
-        specifier: ^0.5.0
-        version: 0.5.0
+        specifier: 2.2.1
+        version: 2.2.1
      better-sqlite3:
        specifier: 11.10.0
        version: 11.10.0
@@ -303,8 +303,8 @@ packages:
      '@emnapi/core': ^1.7.1
      '@emnapi/runtime': ^1.7.1

-  '@onecli-sh/sdk@0.5.0':
-    resolution: {integrity: sha512-oe5Yx9o98v6N1PgzcCR7nULHHqcqKWNJIDOHGOSNX+l20mLlZpFUqfKPeFmsojBNRQMoqbvZQKUlFMp6gVuYBA==}
+  '@onecli-sh/sdk@2.2.1':
+    resolution: {integrity: sha512-q2mCW4ZsARlLEoTxz/P0NQ4MiCh7Z2n28pxkSc7srS+tozyw40PdTnWYW7NI8hfSYplZTx5856Adq1iPi4KN3Q==}
    engines: {node: '>=20'}

  '@oxc-project/types@0.124.0':
@@ -1665,7 +1665,7 @@ snapshots:
      '@tybys/wasm-util': 0.10.1
    optional: true

-  '@onecli-sh/sdk@0.5.0': {}
+  '@onecli-sh/sdk@2.2.1': {}

  '@oxc-project/types@0.124.0': {}

@@ -1,5 +1,5 @@
-<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="90" height="20" role="img" aria-label="190k tokens, 95% of context window">
-  <title>190k tokens, 95% of context window</title>
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="90" height="20" role="img" aria-label="192k tokens, 96% of context window">
+  <title>192k tokens, 96% of context window</title>
  <linearGradient id="s" x2="0" y2="100%">
    <stop offset="0" stop-color="#bbb" stop-opacity=".1"/>
    <stop offset="1" stop-opacity=".1"/>
@@ -15,8 +15,8 @@
      <g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" font-size="11">
        <text aria-hidden="true" x="26" y="15" fill="#010101" fill-opacity=".3">tokens</text>
        <text x="26" y="14">tokens</text>
-        <text aria-hidden="true" x="71" y="15" fill="#010101" fill-opacity=".3">190k</text>
-        <text x="71" y="14">190k</text>
+        <text aria-hidden="true" x="71" y="15" fill="#010101" fill-opacity=".3">192k</text>
+        <text x="71" y="14">192k</text>
      </g>
    </g>
  </a>
@@ -0,0 +1,48 @@
+/**
+ * versions.json is the machine-checkable source for sanctioned component
+ * versions: setup steps read it, /update-nanoclaw diffs it across updates.
+ * These tests go red if the file, the pin, or the onecli-step wiring is
+ * deleted — the pin moving back to a hardcoded constant is the regression
+ * this guards against.
+ */
+import fs from 'fs';
+import path from 'path';
+import { fileURLToPath } from 'url';
+
+import { describe, expect, it } from 'vitest';
+
+import { readVersionPin } from './version-pins.js';
+
+const here = path.dirname(fileURLToPath(import.meta.url));
+
+describe('readVersionPin', () => {
+  it('resolves the onecli-gateway pin from the real versions.json', () => {
+    expect(readVersionPin('onecli-gateway')).toMatch(/^\d+\.\d+\.\d+$/);
+  });
+
+  it('resolves the onecli-cli pin from the real versions.json', () => {
+    expect(readVersionPin('onecli-cli')).toMatch(/^\d+\.\d+\.\d+$/);
+  });
+
+  it('throws for a component with no pin', () => {
+    expect(() => readVersionPin('no-such-component')).toThrow(/no pin/);
+  });
+});
+
+describe('onecli step wiring', () => {
+  it('reads its gateway pin from versions.json, not a hardcoded constant', () => {
+    const source = fs.readFileSync(path.join(here, '..', 'onecli.ts'), 'utf-8');
+    expect(source).toContain("readVersionPin('onecli-gateway')");
+    expect(source).not.toMatch(/ONECLI_GATEWAY_VERSION = '\d/);
+  });
+
+  it('reads its CLI pin from versions.json and never resolves "latest"', () => {
+    const source = fs.readFileSync(path.join(here, '..', 'onecli.ts'), 'utf-8');
+    expect(source).toContain("readVersionPin('onecli-cli')");
+    expect(source).not.toMatch(/ONECLI_CLI(?:_FALLBACK)?_VERSION = '\d/);
+    // The upstream installer and the /releases/latest redirect probe both
+    // chase "latest" — reintroducing either bypasses the sanctioned pin.
+    expect(source).not.toContain('onecli.sh/cli/install');
+    expect(source).not.toContain('/releases/latest');
+  });
+});
@@ -0,0 +1,31 @@
+/**
+ * Sanctioned version pins for external components (`versions.json` at the
+ * repo root) — the single machine-checkable source. Setup steps read their
+ * pin here; `/update-nanoclaw` diffs the file across an update and routes
+ * the user to the migration doc for any pin that moved (see CONTRIBUTING.md,
+ * "Breaking Changes").
+ */
+import fs from 'fs';
+import path from 'path';
+import { fileURLToPath } from 'url';
+
+const VERSIONS_FILE = path.resolve(
+  path.dirname(fileURLToPath(import.meta.url)),
+  '..',
+  '..',
+  'versions.json',
+);
+
+/**
+ * Returns the pinned version for a component, e.g.
+ * `readVersionPin('onecli-gateway')`. Throws when the file or the pin is
+ * missing — a missing pin is an install-tree defect, not a runtime condition.
+ */
+export function readVersionPin(component: string): string {
+  const pins: unknown = JSON.parse(fs.readFileSync(VERSIONS_FILE, 'utf-8'));
+  const value = (pins as Record<string, unknown>)[component];
+  if (typeof value !== 'string' || value.length === 0) {
+    throw new Error(`versions.json has no pin for "${component}"`);
+  }
+  return value;
+}
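
The comparison that `/update-nanoclaw` runs on `versions.json` is described above but is not part of this change. A minimal sketch of that diff, assuming both snapshots are already parsed into flat string maps; `movedPins` and `PinMove` are hypothetical names used only for illustration.

```typescript
// Hypothetical sketch: compare two parsed versions.json snapshots and
// report every component whose pin was added, removed, or changed.
type Pins = Record<string, string>;

interface PinMove {
  component: string;
  from: string | null; // null: the component had no pin before the update
  to: string | null; // null: the update removed the pin
}

function movedPins(before: Pins, after: Pins): PinMove[] {
  // Union of component names from both snapshots, sorted for stable output.
  const components = Object.keys(before)
    .concat(Object.keys(after).filter((name) => !(name in before)))
    .sort();
  const moves: PinMove[] = [];
  for (const component of components) {
    const from = component in before ? before[component] : null;
    const to = component in after ? after[component] : null;
    if (from !== to) moves.push({ component, from, to });
  }
  return moves;
}

// Illustrative input: the "after" side uses the pins added in this change;
// the "before" gateway value echoes the constant removed from the setup step.
const moves = movedPins(
  { 'onecli-gateway': '1.23.0' },
  { 'onecli-gateway': '1.36.0', 'onecli-cli': '2.2.5' },
);
console.log(moves);
```

Each reported entry could then be mapped to its migration doc; that mapping is left out here because the source does not specify how it is stored.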
@@ -0,0 +1,29 @@
+/**
+ * The step DETECTS gateway /v1 compatibility and warns (pointing at
+ * docs/onecli-upgrades.md) — it does not migrate the gateway; that's the
+ * agent's job via /update-nanoclaw. The verify helper must distinguish
+ * incompatible (pre-/v1 server: warn) from unreachable (transient: nothing to
+ * say) so the warning only fires on a real pre-/v1 server.
+ */
+import { describe, expect, it } from 'vitest';
+
+import { verifyGatewayV1 } from './onecli.js';
+
+function fakeFetch(behavior: 'ok' | '404' | 'down'): typeof fetch {
+  return (async () => {
+    if (behavior === 'down') throw new Error('ECONNREFUSED');
+    return { ok: behavior === 'ok' } as Response;
+  }) as unknown as typeof fetch;
+}
+
+describe('verifyGatewayV1', () => {
+  it('ok when /v1/health answers', async () => {
+    expect(await verifyGatewayV1('http://x', fakeFetch('ok'))).toBe('ok');
+  });
+  it('incompatible when the server answers HTTP without /v1', async () => {
+    expect(await verifyGatewayV1('http://x', fakeFetch('404'))).toBe('incompatible');
+  });
+  it('unreachable on connection failure', async () => {
+    expect(await verifyGatewayV1('http://x', fakeFetch('down'))).toBe('unreachable');
+  });
+});
@@ -17,6 +17,7 @@ import os from 'os';
 import path from 'path';

 import { log } from '../src/log.js';
+import { readVersionPin } from './lib/version-pins.js';
 import { emitStatus } from './status.js';

 const LOCAL_BIN = path.join(os.homedir(), '.local', 'bin');
@@ -102,20 +103,18 @@ function writeEnvOnecliUrl(url: string): void {
  writeEnvVar('ONECLI_URL', url);
 }

-// Last-known-good CLI release. Used only if BOTH the upstream installer
-// and the redirect-based version probe fail. Bump deliberately when a
-// new CLI release ships.
-const ONECLI_GATEWAY_VERSION = '1.23.0';
-const ONECLI_CLI_FALLBACK_VERSION = '1.3.0';
+// The SANCTIONED gateway version: fresh installs pin to it. Upgrading an
+// existing gateway is NOT done here — the gateway is a separate out-of-band
+// component, and the migrator is the user's coding agent following
+// docs/onecli-upgrades.md during /update-nanoclaw. The pin lives in
+// versions.json ("onecli-gateway") so that flow can diff it across updates and
+// route the agent to the doc; bump it there deliberately on a new release.
+const ONECLI_GATEWAY_VERSION = readVersionPin('onecli-gateway');
+// The CLI binary follows the same convention: installed at its pin
+// ("onecli-cli" in versions.json), never at whatever "latest" means today.
+const ONECLI_CLI_VERSION = readVersionPin('onecli-cli');
 const ONECLI_CLI_REPO = 'onecli/onecli-cli';

-function installOnecliCliOnly(): { stdout: string; ok: boolean } {
-  const upstream = runInstall('curl -fsSL onecli.sh/cli/install | sh');
-  if (upstream.ok) return { stdout: upstream.stdout, ok: true };
-  const fallback = installOnecliCliDirect();
-  return { stdout: upstream.stdout + (upstream.stderr ?? '') + '\n' + fallback.stdout, ok: fallback.ok };
-}
-
 // Remove containers in the "onecli" compose project whose service name isn't
 // in the v2 set. Pre-v2 OneCLI used service "app" (container onecli-app-1);
 // v2 uses "onecli". Compose flags the old container as an orphan but won't
@@ -161,24 +160,10 @@ function installOnecli(): { stdout: string; ok: boolean } {
    return { stdout: stdout + (gw.stderr ?? ''), ok: false };
  }

-  // CLI install. The upstream script calls the GitHub releases API
-  // (api.github.com) to resolve the latest tag — which 403s anonymous
-  // callers after 60 requests/hour per IP. Try upstream first; on failure
-  // resolve the version ourselves (via HTTP redirect, which isn't
-  // API-throttled) and download the release archive directly.
-  const upstream = runInstall('curl -fsSL onecli.sh/cli/install | sh');
-  stdout += upstream.stdout;
-  if (upstream.ok) return { stdout, ok: true };
-
-  log.warn('Upstream CLI installer failed — falling back to direct download', {
-    stderr: upstream.stderr,
-  });
-  stdout += (upstream.stderr ?? '') + '\n';
-
-  const fallback = installOnecliCliDirect();
-  stdout += fallback.stdout;
-  if (!fallback.ok) {
-    log.error('OneCLI CLI install failed (both upstream and direct fallback)');
+  const cli = installOnecliCliDirect();
+  stdout += cli.stdout;
+  if (!cli.ok) {
+    log.error('OneCLI CLI install failed');
    return { stdout, ok: false };
  }
  return { stdout, ok: true };
@@ -198,11 +183,11 @@ function runInstall(cmd: string): { stdout: string; stderr?: string; ok: boolean
 }

 /**
- * Reinstate the OneCLI CLI install without hitting GitHub's rate-limited
- * releases API. Resolves the version via the HTTP redirect from
- * /releases/latest → /releases/tag/vX.Y.Z, then downloads the archive
- * directly. Falls back to ONECLI_CLI_FALLBACK_VERSION if the redirect
- * probe also fails.
+ * Install the OneCLI CLI at the sanctioned pin by downloading the release
+ * archive straight from GitHub. Deliberately no "latest" resolution — the
+ * upstream installer script always chases the newest release, which would
+ * drift from the pin. PATH setup is not lost by skipping it:
+ * ensureShellProfilePath() in run() covers it.
 */
 function installOnecliCliDirect(): { stdout: string; ok: boolean } {
  const lines: string[] = [];
@@ -221,24 +206,7 @@ function installOnecliCliDirect(): { stdout: string; ok: boolean } {
    return { stdout: lines.join('\n'), ok: false };
  }

-  let version: string | null = null;
-  try {
-    const redirect = execSync(
-      `curl -fsSL -o /dev/null -w '%{url_effective}' https://github.com/${ONECLI_CLI_REPO}/releases/latest`,
-      { encoding: 'utf-8', stdio: ['ignore', 'pipe', 'pipe'] },
-    ).trim();
-    const m = redirect.match(/\/tag\/v?([^/]+)$/);
-    if (m) version = m[1];
-  } catch {
-    // redirect probe failed — we'll pin the fallback
-  }
-  if (!version) {
-    version = ONECLI_CLI_FALLBACK_VERSION;
-    append(`Version probe failed; installing pinned fallback ${version}.`);
-  } else {
-    append(`Resolved onecli CLI ${version} via release redirect.`);
-  }
-
+  const version = ONECLI_CLI_VERSION;
  const archive = `onecli_${version}_${osName}_${arch}.tar.gz`;
  const url = `https://github.com/${ONECLI_CLI_REPO}/releases/download/v${version}/${archive}`;
  const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'onecli-'));
@@ -275,6 +243,39 @@ function installOnecliCliDirect(): { stdout: string; ok: boolean } {
  }
 }

+/**
+ * /v1 API compatibility check. @onecli-sh/sdk 2.x requires the server's /v1
+ * API; servers older than the cutover answer 404 on every SDK call (permanent,
+ * but presents as transient per-spawn failures). This is detect-only — setup
+ * does not migrate the gateway. The upgrade is an out-of-band action on a
+ * separate component that the agent runs via docs/onecli-upgrades.md during
+ * /update-nanoclaw, so this step only surfaces the condition and points there.
+ */
+export async function verifyGatewayV1(
+  url: string,
+  fetchImpl: typeof fetch = fetch,
+): Promise<'ok' | 'incompatible' | 'unreachable'> {
+  try {
+    const res = await fetchImpl(`${url}/v1/health`, { signal: AbortSignal.timeout(5000) });
+    return res.ok ? 'ok' : 'incompatible';
+  } catch {
+    return 'unreachable';
+  }
+}
+
+/**
+ * Detect-and-warn helper: returns a status HINT (and logs) when the gateway is
+ * pre-/v1, else null. Never fails the step or auto-upgrades — the agent owns
+ * the upgrade via docs/onecli-upgrades.md.
+ */
+function gatewayV1Hint(result: 'ok' | 'incompatible' | 'unreachable'): string | null {
+  if (result !== 'incompatible') return null;
+  log.warn('OneCLI gateway lacks the /v1 API @onecli-sh/sdk 2.x requires', {
+    pin: ONECLI_GATEWAY_VERSION,
+  });
+  return 'OneCLI gateway lacks the /v1 API @onecli-sh/sdk 2.x requires — upgrade it: docs/onecli-upgrades.md';
+}
+
 export async function pollHealth(url: string, timeoutMs: number): Promise<boolean> {
  // `/api/health` matches the path probe.sh uses — keep them aligned.
  const deadline = Date.now() + timeoutMs;
@@ -300,7 +301,7 @@ export async function run(args: string[]): Promise<void> {
    // Remote-mode: install only the CLI, point it at the remote gateway, and
    // record the URL in .env. No local gateway is started.
    log.info('Installing OneCLI CLI for remote gateway', { remoteUrl });
-    const res = installOnecliCliOnly();
+    const res = installOnecliCliDirect();
    if (!res.ok || !onecliVersion()) {
      emitStatus('ONECLI', {
        INSTALLED: false,
@@ -339,12 +340,14 @@ export async function run(args: string[]): Promise<void> {
      log.info('Wrote ONECLI_API_KEY to .env');
    }
    const healthy = await pollHealth(remoteUrl, 5000);
+    const v1Hint = healthy ? gatewayV1Hint(await verifyGatewayV1(remoteUrl)) : null;
    emitStatus('ONECLI', {
      INSTALLED: true,
      REMOTE: true,
      ONECLI_URL: remoteUrl,
      HEALTHY: healthy,
      STATUS: 'success',
+      ...(v1Hint ? { GATEWAY_HINT: v1Hint } : {}),
      LOG: 'logs/setup.log',
    });
    return;
@@ -378,12 +381,14 @@ export async function run(args: string[]): Promise<void> {
    writeEnvOnecliUrl(url);
    log.info('Reusing existing OneCLI', { url });
    const healthy = await pollHealth(url, 5000);
+    const v1Hint = healthy ? gatewayV1Hint(await verifyGatewayV1(url)) : null;
    emitStatus('ONECLI', {
      INSTALLED: true,
      REUSED: true,
      ONECLI_URL: url,
      HEALTHY: healthy,
      STATUS: 'success',
+      ...(v1Hint ? { GATEWAY_HINT: v1Hint } : {}),
      LOG: 'logs/setup.log',
    });
    return;
@@ -436,6 +441,7 @@ export async function run(args: string[]): Promise<void> {
  log.info('Wrote ONECLI_URL to .env', { url });

  const healthy = await pollHealth(url, 15000);
+  const v1Hint = healthy ? gatewayV1Hint(await verifyGatewayV1(url)) : null;

  emitStatus('ONECLI', {
    INSTALLED: true,
@@ -446,6 +452,7 @@ export async function run(args: string[]): Promise<void> {
    // The next step (auth) will surface a genuinely broken gateway via
    // `onecli secrets list`, so don't trigger rescue attempts from here.
    STATUS: 'success',
+    ...(v1Hint ? { GATEWAY_HINT: v1Hint } : {}),
    ...(healthy
      ? {}
      : {
@@ -1,3 +1,5 @@
+import fs from 'fs';
+import path from 'path';
 import { describe, expect, it } from 'vitest';

 import { resolveProviderName } from './container-runner.js';
@@ -25,3 +27,22 @@ describe('resolveProviderName', () => {
    expect(resolveProviderName(null, '')).toBe('claude');
  });
 });
+
+describe('buildContainerArgs ordering invariant (structural)', () => {
+  // The OneCLI gateway apply (SDK applyContainerConfig) appends credential-stub
+  // mounts — e.g. the codex auth.json sentinel nested INSIDE our RW
+  // /home/node/.codex mount. Docker applies binds in argument order, so the
+  // stub must land AFTER its parent mount or the parent shadows it and the
+  // agent silently degrades to loginless auth. Driving the real
+  // buildContainerArgs needs a live gateway + container runtime, so this
+  // guards the invariant structurally: the gateway apply must appear after
+  // the volume-mounts loop in the source.
+  it('applies the OneCLI gateway after the volume mounts', () => {
+    const src = fs.readFileSync(path.join(process.cwd(), 'src', 'container-runner.ts'), 'utf-8');
+    const mountsLoop = src.indexOf('for (const mount of mounts)');
+    const gatewayApply = src.indexOf('onecli.applyContainerConfig');
+    expect(mountsLoop).toBeGreaterThan(-1);
+    expect(gatewayApply).toBeGreaterThan(-1);
+    expect(gatewayApply).toBeGreaterThan(mountsLoop);
+  });
+});
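
The structural test above checks source order rather than the resulting mount list. As a complementary illustration, here is a minimal sketch of the same invariant applied to a list of container-side mount targets, assuming they have already been extracted in argument order. `shadowedMounts` and `isInside` are hypothetical helpers, not part of the repository, and the shadowing rule they encode is the one stated in the comment above.

```typescript
// Hypothetical sketch: given mount targets in argument order, return every
// nested target that appears BEFORE a parent directory mounted later.

// True when `child` is a path strictly inside the directory `parent`.
function isInside(child: string, parent: string): boolean {
  const base = parent.charAt(parent.length - 1) === '/' ? parent : parent + '/';
  // indexOf(...) === 0 is a prefix check on a directory boundary.
  return child !== parent && child.indexOf(base) === 0;
}

function shadowedMounts(targetsInOrder: string[]): string[] {
  const shadowed: string[] = [];
  targetsInOrder.forEach((target, i) => {
    const hasLaterParent = targetsInOrder
      .slice(i + 1)
      .some((later) => isInside(target, later));
    if (hasLaterParent) shadowed.push(target);
  });
  return shadowed;
}

// Parent first, stub second: nothing is shadowed.
console.log(shadowedMounts(['/home/node/.codex', '/home/node/.codex/auth.json']));
// Stub first, parent second: the stub is reported.
console.log(shadowedMounts(['/home/node/.codex/auth.json', '/home/node/.codex']));
```

A check like this would run on the built argument list, so unlike the source-order test it would still need a way to build those arguments without a live gateway.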
@@ -434,20 +434,6 @@ async function buildContainerArgs(
    }
  }

-  // OneCLI gateway — injects HTTPS_PROXY + certs so container API calls
-  // are routed through the agent vault for credential injection. Treated as
-  // a transient hard failure: if we can't wire the gateway, we don't spawn.
-  // The caller (router or host-sweep) catches the throw, leaves the inbound
-  // message pending, and the next sweep tick retries.
-  if (agentIdentifier) {
-    await onecli.ensureAgent({ name: agentGroup.name, identifier: agentIdentifier });
-  }
-  const onecliApplied = await onecli.applyContainerConfig(args, { addHostMapping: false, agent: agentIdentifier });
-  if (!onecliApplied) {
-    throw new Error('OneCLI gateway not applied — refusing to spawn container without credentials');
-  }
-  log.info('OneCLI gateway applied', { containerName });
-
  // Egress lockdown when enabled — throws if it can't be established, aborting
  // the spawn rather than running with open egress. Otherwise the host gateway.
  if (ensureEgressNetwork()) {
@@ -474,6 +460,24 @@ async function buildContainerArgs(
    }
  }

+  // OneCLI gateway — injects HTTPS_PROXY + certs so container API calls
+  // are routed through the agent vault for credential injection, and mounts
+  // any credential stubs the gateway serves (e.g. a sentinel auth file).
+  // Runs AFTER the volume mounts so a stub nested inside one of our mounts
+  // (a parent dir mounted RW above it) lands later in the args and isn't
+  // shadowed by it. Treated as a transient hard failure: if we can't wire
+  // the gateway, we don't spawn. The caller (router or host-sweep) catches
+  // the throw, leaves the inbound message pending, and the next sweep tick
+  // retries.
+  if (agentIdentifier) {
+    await onecli.ensureAgent({ name: agentGroup.name, identifier: agentIdentifier });
+  }
+  const onecliApplied = await onecli.applyContainerConfig(args, { addHostMapping: false, agent: agentIdentifier });
+  if (!onecliApplied) {
+    throw new Error('OneCLI gateway not applied — refusing to spawn container without credentials');
+  }
+  log.info('OneCLI gateway applied', { containerName });
+
  // Override entrypoint: run v2 entry point directly via Bun (no tsc, no stdin).
  args.push('--entrypoint', 'bash');

@@ -0,0 +1,4 @@
+{
+  "onecli-gateway": "1.36.0",
+  "onecli-cli": "2.2.5"
+}
Author	SHA1	Message	Date
gavrielc	a619fc1aa2	Apply suggestion from @gavrielc	2026-06-13 16:03:02 +03:00
Omri Maya	3d2f3e58ca	feat(runner): onExchangeComplete provider hook + slash-command interruption Inverts conversation archiving into an optional onExchangeComplete provider hook: the runner never archives on a provider's behalf, and the markdown writer ships with the provider that needs it. Dormant for the default provider. Slash commands now interrupt an in-flight turn — a runner-handled command (/clear, /compact, /cost, …) arriving mid-turn aborts the active stream and runs immediately instead of waiting out the turn. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-13 15:56:43 +03:00
gavrielc	11afc64ba4	Merge pull request #2747 from nanocoai/oss/onecli-sdk-v2 feat(onecli): SDK 2.2.1 — credential-stub mounts + machine-checkable pins	2026-06-13 15:49:40 +03:00
github-actions[bot]	0ee75d393c	chore: bump version to 2.1.13	2026-06-13 12:27:29 +00:00
github-actions[bot]	72b9cc7ed0	docs: update token count to 192k tokens · 96% of context window	2026-06-13 12:27:24 +00:00
gavrielc	5fcf234165	Merge pull request #2746 from nanocoai/oss/agent-surfaces feat(providers): agent-surfaces capability seam	2026-06-13 15:27:12 +03:00
github-actions[bot]	9b1236505f	chore: bump version to 2.1.12	2026-06-13 12:25:58 +00:00
github-actions[bot]	878cd68c1b	docs: update token count to 191k tokens · 96% of context window	2026-06-13 12:25:52 +00:00
gavrielc	fab1ebf2d6	Merge pull request #2745 from nanocoai/oss/memory-scaffold feat(memory): opt-in persistent memory scaffold for providers	2026-06-13 15:25:39 +03:00
Omri Maya	3f9e89d345	feat(onecli): SDK 2.2.1 — credential-stub mounts + machine-checkable pins Injects credentials as request-time stubs so no credential is ever written into a container or to disk. Gateway and CLI versions move to versions.json (machine-checkable pins); breaking upgrades are documented in docs/onecli-upgrades.md as an agent-executable runbook (detect / why / fix / verify / rollback), and the update flow follows linked docs and diffs the pins. BREAKING: requires a gateway upgrade; the doc carries the steps. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-12 11:30:11 +03:00
Omri Maya	2cfa86e570	feat(memory): opt-in persistent memory scaffold for providers Adds a provider capability (usesMemoryScaffold) and a container-side boot scaffold that materializes a persistent memory/ tree for providers that opt in. Dormant for the default provider — the scaffold is only built when a provider declares the capability, so existing installs are byte-identical (asserted by a boot-gate wiring test). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-12 11:30:09 +03:00