chore: release 2.1.0; guard auto-bump against deliberate version changes

Set package.json to 2.1.0 to match the CHANGELOG entry for the upgrade tripwire (a [BREAKING] change warrants a minor bump). The startup tripwire reads package.json as the source of truth, so this is the version the gate will enforce.

bump-version.yml previously ran `pnpm version patch` on every push to main, which would patch a deliberate 2.1.0 up to 2.1.1. It now skips the auto-bump when the pushed commits already changed package.json themselves. The checkout now uses fetch-depth: 0 so that both tips of the before/after diff are available.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
docs(changelog): release the upgrade-tripwire entry as 2.1.0
2026-06-18 18:29:35 +08:00 · 2026-06-07 17:03:02 +03:00 · 2026-06-07 16:59:30 +03:00 · 2026-06-07 16:57:13 +03:00 · 2026-06-06 13:02:12 +03:00 · 2026-06-05 08:04:24 +00:00
19 changed files with 636 additions and 31 deletions
@@ -98,13 +98,13 @@ for i in $(seq 1 15); do
 done
 ```

-If it never becomes healthy, check if the gateway process is running:
+If it never becomes healthy, check the gateway containers. The gateway is a Docker Compose stack (project `onecli`, compose file at `~/.onecli/docker-compose.yml`), **not** a host process — `ps aux | grep onecli` will not find it, and there is no `onecli start` command (removed in OneCLI 1.4.x).

 ```bash
-ps aux | grep -i onecli | grep -v grep
+docker ps -a --filter "label=com.docker.compose.project=onecli" --format '{{.Names}}\t{{.Status}}'
 ```

-If it's not running, try starting it manually: `onecli start`. If that fails, show the error and stop — the user needs to debug their OneCLI installation.
+Both services have `restart: unless-stopped`, so they come back automatically once the Docker daemon is up. If Docker isn't running, start it (`open -a Docker` on macOS) and they'll restart on their own. To bring the stack up manually: `docker compose -f ~/.onecli/docker-compose.yml up -d`. If that fails, show the error and stop — the user needs to debug their OneCLI installation.

 ## Phase 3: Migrate existing credentials

@@ -299,7 +299,7 @@ If an agent uses `git` or `gh`, add to `data/v2-sessions/<agent-group-id>/.claud

 ## Troubleshooting

-**"OneCLI gateway not reachable" in logs:** The gateway isn't running. Check with `curl -sf ${ONECLI_URL}/health`. Start it with `onecli start` if needed.
+**"OneCLI gateway not reachable" in logs:** The gateway isn't running. Check with `curl -sf ${ONECLI_URL}/health`. The most common cause is that Docker itself is down (the gateway is a Compose stack) — start Docker (`open -a Docker` on macOS) and the containers restart automatically. To bring them up manually: `docker compose -f ~/.onecli/docker-compose.yml up -d`.

 **Container gets no credentials:** Verify `ONECLI_URL` is set in `.env` and the gateway has an Anthropic secret (`onecli secrets list`).

@@ -28,6 +28,15 @@ Two phases: **Extract** (build the migration guide) and **Upgrade** (use it). If

 ---

+# Phase 0: Refresh this skill first
+
+The migration process itself evolves, so run its newest version before doing anything else:
+- Ensure the `upstream` remote exists (default `https://github.com/nanocoai/nanoclaw.git`) and fetch: `git fetch upstream --prune`. Detect the upstream branch (`main` or `master`).
+- Refresh this skill from upstream: `git checkout upstream/<branch> -- .claude/skills/migrate-nanoclaw/`
+- Re-read `.claude/skills/migrate-nanoclaw/SKILL.md`. If it changed, **follow the updated version from the top** instead of this one.
+
+This is the only working-tree change expected before the preflight check below; changes limited to `.claude/skills/migrate-nanoclaw/` are this self-refresh — ignore them in the 1.0 clean-tree check and proceed.
+
 # Phase 1: Extract

 ## 1.0 Preflight
@@ -463,6 +472,11 @@ Do NOT use `git checkout -B` to create an intermediate branch — this caused is

 Run `npm install && pnpm run build` in the main tree to confirm.

+Stamp the upgrade marker (required — without it the startup tripwire stops the host on next start). Only do this after the build above succeeds:
+```bash
+pnpm exec tsx scripts/upgrade-state.ts set "" migrate-nanoclaw
+```
+
 Restart the service:
 ```bash
 launchctl load ~/Library/LaunchAgents/com.nanoclaw.plist
@@ -60,11 +60,20 @@ Help a user with a customized NanoClaw install safely incorporate upstream chang
 - Default to MERGE (one-pass conflict resolution). Offer REBASE as an explicit option.
 - Keep token usage low: rely on `git status`, `git log`, `git diff`, and open only conflicted files.

+# Step 0a: Refresh this skill first
+The update process itself evolves, so run its newest version before doing anything else:
+- Ensure the `upstream` remote exists (default `https://github.com/nanocoai/nanoclaw.git`) and fetch: `git fetch upstream --prune`. Detect the upstream branch (`main` or `master`).
+- Refresh this skill from upstream: `git checkout upstream/<branch> -- .claude/skills/update-nanoclaw/`
+- Re-read `.claude/skills/update-nanoclaw/SKILL.md`. If it changed, **follow the updated version from the top** instead of this one.
+
+This is the only working-tree change expected before the preflight check; the full update commits it along with everything else.
+
 # Step 0: Preflight (stop early if unsafe)
 Run:
 - `git status --porcelain`
 If output is non-empty:
 - Tell the user to commit or stash first, then stop.
+- Exception: changes limited to `.claude/skills/update-nanoclaw/` are the Step 0a self-refresh — ignore those and proceed.

 Confirm remotes:
 - `git remote -v`
@@ -256,6 +265,16 @@ If any channels/providers are installed AND `upstream/channels` or `upstream/pro

 If no channels/providers are installed, skip silently.

+Proceed to Step 7.9.
+
+# Step 7.9: Stamp the upgrade marker (required)
+After validation has **succeeded**, record that this install reached the new version through the supported path. Without this, the startup tripwire stops the host on its next start.
+
+- `pnpm exec tsx scripts/upgrade-state.ts set "" update-nanoclaw`
+  - The empty version argument stamps the current `package.json` version.
+
+If validation did NOT succeed, do not stamp — leave the tripwire to catch the broken state.
+
 Proceed to Step 8.

 # Step 8: Summary + rollback instructions
@@ -18,12 +18,20 @@ jobs:

      - uses: actions/checkout@v4
        with:
+          fetch-depth: 0
          token: ${{ steps.app-token.outputs.token }}

      - uses: pnpm/action-setup@v4

      - name: Bump patch version
        run: |
+          # Skip the auto-bump when the pushed commits already touched
+          # package.json (e.g. a release PR that set a minor/major version);
+          # otherwise the bot would patch a deliberate 2.1.0 up to 2.1.1.
+          if git diff --name-only "${{ github.event.before }}" "${{ github.sha }}" | grep -qx 'package.json'; then
+            echo "package.json already changed in this push; skipping auto-bump."
+            exit 0
+          fi
          pnpm version patch --no-git-tag-version
          git add package.json
          git diff --cached --quiet && exit 0
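
The guard's decision can be restated outside the workflow. The following is a minimal TypeScript sketch, not code from this repository: `shouldSkipAutoBump` and `bumpPatch` are hypothetical names, and the sketch assumes its input is the raw output of `git diff --name-only`. It mirrors `grep -qx 'package.json'`, which matches whole lines only, so a nested path such as `packages/foo/package.json` would not suppress the bump.

```typescript
// Hypothetical helper: mirrors `grep -qx 'package.json'`, an exact
// whole-line match against the output of `git diff --name-only`.
function shouldSkipAutoBump(diffNameOnlyOutput: string): boolean {
  return diffNameOnlyOutput.split('\n').some((line) => line === 'package.json');
}

// Hypothetical helper: what a patch bump does to a plain x.y.z version.
function bumpPatch(version: string): string {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) throw new Error(`not a plain x.y.z version: ${version}`);
  const [, major, minor, patch] = match;
  return `${major}.${minor}.${Number(patch) + 1}`;
}

// A release push that already edited package.json is left alone.
console.log(shouldSkipAutoBump('CHANGELOG.md\npackage.json\n')); // true
// Without the guard, the deliberate 2.1.0 would become:
console.log(bumpPatch('2.1.0')); // 2.1.1
```

Note that the check fires on any change to the root `package.json`, not only on a change to its `version` field, so a push that only edits dependencies also skips the bump.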
@@ -2,6 +2,10 @@

 All notable changes to NanoClaw will be documented in this file.

+## [2.1.0] - 2026-06-07
+
+- [BREAKING] **Startup now requires an upgrade marker.** The host refuses to boot unless `data/upgrade-state.json` records that this install reached the current version through a sanctioned path (`/setup`, `/update-nanoclaw`, `/migrate-nanoclaw`). After this update completes — and before restarting the service — stamp the marker by running `pnpm exec tsx scripts/upgrade-state.ts set`. If the host has already tripped on restart with "update did not go through the supported path", that same command clears it. See [docs/upgrade-recovery.md](docs/upgrade-recovery.md).
+
 ## [2.0.64] - 2026-05-18

 - **`ncl destinations add` and `remove` through the approval flow now reach the receiver immediately.** Approved destinations weren't being projected into the receiving agent's local session state, so a freshly-added destination silently failed at `send_message` with `unknown destination`, and a removed destination stayed resolvable until the next container restart. Both now take effect the moment the approval executes. Direct (non-approval) calls were unaffected.
@@ -153,31 +153,17 @@ Key files: `src/container-restart.ts`, `src/container-runner.ts` (`killContainer

 API keys, OAuth tokens, and auth credentials are managed by the OneCLI gateway. Secrets are injected into per-agent containers at request time — none are passed in env vars or through chat context. The container agent sees this via the `onecli-gateway` container skill (`container/skills/onecli-gateway/SKILL.md`), which teaches it how the proxy works, how to handle auth errors, and to never ask for raw credentials. Host-side wiring: `src/onecli-approvals.ts`, `ensureAgent()` in `container-runner.ts`. Run `onecli --help`.

-### Gotcha: auto-created agents start in `selective` secret mode
+### Secret modes

-When the host first spawns a session for a new agent group, `container-runner.ts:385` calls `onecli.ensureAgent({ name, identifier })`. The OneCLI `POST /api/agents` endpoint creates the agent in **`selective`** secret mode — meaning **no secrets are assigned to it by default**, even if the secrets exist in the vault and have host patterns that would otherwise match.
-
-Symptom: container starts, the proxy + CA cert are wired correctly, but the agent gets `401 Unauthorized` (or similar) from APIs whose credentials *are* in the vault. The credential just isn't in this agent's allow-list.
-
-The SDK does not expose `setSecretMode` — the only fix is the CLI (or the web UI at `http://127.0.0.1:10254`).
+Auto-created agents default to `all` secret mode — every vault secret whose host pattern matches is injected automatically, so the common case needs no per-agent setup. If an agent is in `selective` mode it gets no secrets until you assign them, which shows up as a `401` from an API whose credential *is* in the vault. The SDK can't change this; use the CLI (or the web UI at `http://127.0.0.1:10254`):

 ```bash
-# Find the agent (identifier is the agent group id)
-onecli agents list
-
-# Flip to "all" so every vault secret with a matching host pattern gets injected
-onecli agents set-secret-mode --id <agent-id> --mode all
-
-# Or, stay selective and assign specific secrets
-onecli secrets list                                    # find secret ids
-onecli agents set-secrets --id <agent-id> --secret-ids <id1>,<id2>
-
-# Inspect what an agent currently has
-onecli agents secrets --id <agent-id>                  # secrets assigned to this agent
-onecli secrets list                                    # all vault secrets (with host patterns)
+onecli agents list                                          # check secretMode
+onecli agents set-secret-mode --id <agent-id> --mode all    # inject all matching secrets
+onecli agents set-secrets --id <agent-id> --secret-ids ...  # or stay selective, assign specific ones
 ```

-If you've just enabled `mode all`, no container restart is needed — the gateway looks up secrets per request, so the next API call from the running container will see the new credentials.
+No container restart needed — the gateway looks up secrets per request.

 ### Requiring approval for credential use

@@ -11,7 +11,7 @@ import { TIMEZONE, formatLocalTime } from './timezone.js';
 */
 export type CommandCategory = 'admin' | 'filtered' | 'passthrough' | 'none';

-const ADMIN_COMMANDS = new Set(['/remote-control', '/clear', '/compact', '/context', '/cost', '/files']);
+const ADMIN_COMMANDS = new Set(['/remote-control', '/clear', '/compact', '/context', '/cost', '/files', '/upload-trace']);
 const FILTERED_COMMANDS = new Set(['/help', '/login', '/logout', '/doctor', '/config', '/start']);

 export interface CommandInfo {
@@ -13,6 +13,7 @@ import {
  stripInternalTags,
  type RoutingContext,
 } from './formatter.js';
+import { isUploadTraceCommand, uploadTrace } from './upload-trace.js';
 import type { AgentProvider, AgentQuery, ProviderEvent } from './providers/types.js';

 const POLL_INTERVAL_MS = 1000;
@@ -161,6 +162,19 @@ export async function runPollLoop(config: PollLoopConfig): Promise<void> {
        commandIds.push(msg.id);
        continue;
      }
+      if ((msg.kind === 'chat' || msg.kind === 'chat-sdk') && isUploadTraceCommand(msg)) {
+        log('Uploading session trace to Hugging Face');
+        writeMessageOut({
+          id: generateId(),
+          kind: 'chat',
+          platform_id: routing.platformId,
+          channel_type: routing.channelType,
+          thread_id: routing.threadId,
+          content: JSON.stringify({ text: uploadTrace() }),
+        });
+        commandIds.push(msg.id);
+        continue;
+      }
      normalMessages.push(msg);
    }

@@ -0,0 +1,84 @@
+import { describe, it, expect, beforeEach, afterEach } from 'bun:test';
+
+import { initTestSessionDb, closeSessionDb, getInboundDb } from './db/connection.js';
+import { getUndeliveredMessages } from './db/messages-out.js';
+import { getPendingMessages } from './db/messages-in.js';
+import type { MessageInRow } from './db/messages-in.js';
+import { MockProvider } from './providers/mock.js';
+import { runPollLoop } from './poll-loop.js';
+import { isUploadTraceCommand } from './upload-trace.js';
+
+beforeEach(() => {
+  initTestSessionDb();
+});
+
+afterEach(() => {
+  closeSessionDb();
+});
+
+describe('isUploadTraceCommand', () => {
+  const make = (text: unknown) => ({ content: JSON.stringify({ text }) }) as MessageInRow;
+
+  it('matches /upload-trace (case-insensitive, with args)', () => {
+    expect(isUploadTraceCommand(make('/upload-trace'))).toBe(true);
+    expect(isUploadTraceCommand(make('/UPLOAD-TRACE'))).toBe(true);
+    expect(isUploadTraceCommand(make('  /upload-trace now '))).toBe(true);
+  });
+
+  it('does not match other text or commands', () => {
+    expect(isUploadTraceCommand(make('hello'))).toBe(false);
+    expect(isUploadTraceCommand(make('/upload'))).toBe(false);
+    expect(isUploadTraceCommand(make('/clear'))).toBe(false);
+    expect(isUploadTraceCommand({ content: 'not json' } as MessageInRow)).toBe(false);
+  });
+});
+
+describe('poll loop — /upload-trace command', () => {
+  it('handles the command in the runner, writes a status, skips query', async () => {
+    getInboundDb()
+      .prepare(
+        `INSERT INTO messages_in (id, kind, timestamp, status, platform_id, channel_type, content)
+         VALUES ('m-upload-trace', 'chat', datetime('now'), 'pending', 'chan-1', 'discord', ?)`,
+      )
+      .run(JSON.stringify({ text: '/upload-trace' }));
+
+    // If the provider were ever queried it would emit this — asserting its
+    // absence proves the runner intercepted /upload-trace instead of the LLM.
+    const provider = new MockProvider({}, () => '<message to="discord-test">should not run</message>');
+    const controller = new AbortController();
+    const loopPromise = runPollLoopWithTimeout(provider, controller.signal, 5000);
+
+    await waitFor(() => getUndeliveredMessages().length > 0, 5000);
+    controller.abort();
+
+    const out = getUndeliveredMessages();
+    expect(out).toHaveLength(1);
+    // A status line from uploadTrace() — never the provider's reply.
+    const text = JSON.parse(out[0].content).text as string;
+    expect(text.length).toBeGreaterThan(0);
+    expect(text).not.toBe('should not run');
+
+    // Command message was completed (not left pending).
+    expect(getPendingMessages()).toHaveLength(0);
+
+    await loopPromise.catch(() => {});
+  });
+});
+
+async function runPollLoopWithTimeout(provider: MockProvider, signal: AbortSignal, timeoutMs: number): Promise<void> {
+  return Promise.race([
+    runPollLoop({ provider, providerName: 'mock', cwd: '/tmp' }),
+    new Promise<void>((_, reject) => {
+      signal.addEventListener('abort', () => reject(new Error('aborted')));
+    }),
+    new Promise<void>((_, reject) => setTimeout(() => reject(new Error('timeout')), timeoutMs)),
+  ]);
+}
+
+async function waitFor(condition: () => boolean, timeoutMs: number): Promise<void> {
+  const start = Date.now();
+  while (!condition()) {
+    if (Date.now() - start > timeoutMs) throw new Error('waitFor timeout');
+    await new Promise((resolve) => setTimeout(resolve, 50));
+  }
+}
@@ -0,0 +1,172 @@
+import { spawnSync } from 'node:child_process';
+import fs from 'node:fs';
+import os from 'node:os';
+import path from 'node:path';
+
+import type { MessageInRow } from './db/messages-in.js';
+
+/**
+ * `/upload-trace` command: upload this session's Claude Code transcript to the user's
+ * own private `{hf_user}/nanoclaw-traces` dataset, browsable in the HF Agent
+ * Trace Viewer. The transcript the Claude provider keeps under
+ * `~/.claude/projects/<dir>/<sessionId>.jsonl` is already in the format the
+ * viewer auto-detects, so this just locates the newest one and pushes it.
+ *
+ * Auth is the OneCLI gateway's job: curl goes out through the injected
+ * HTTPS_PROXY, which adds the user's HF token. We never see the raw token, and
+ * a 401 from `whoami` is our "not signed in" signal.
+ */
+
+/**
+ * Narrow check for /upload-trace — the runner handles this command directly
+ * (no LLM turn). Admin-gated by the host router before it reaches the container.
+ */
+export function isUploadTraceCommand(msg: MessageInRow): boolean {
+  let text = '';
+  try {
+    text = (JSON.parse(msg.content)?.text ?? '').trim();
+  } catch {
+    return false; // non-JSON content is never a command
+  }
+  return /^\/upload-trace(?:\s|$)/i.test(text); // whole command word only, so /upload-tracefoo is not matched
+}
+
+/** Newest Claude Code transcript jsonl (the current session). */
+function newestTranscript(): string | null {
+  const projects = path.join(os.homedir(), '.claude', 'projects');
+  let best: { p: string; m: number } | null = null;
+  let dirs: string[];
+  try {
+    dirs = fs.readdirSync(projects);
+  } catch {
+    return null;
+  }
+  for (const dir of dirs) {
+    let files: string[];
+    try {
+      files = fs.readdirSync(path.join(projects, dir));
+    } catch {
+      continue;
+    }
+    for (const f of files) {
+      if (!f.endsWith('.jsonl')) continue;
+      const p = path.join(projects, dir, f);
+      const m = fs.statSync(p).mtimeMs;
+      if (!best || m > best.m) best = { p, m };
+    }
+  }
+  return best?.p ?? null;
+}
+
+function curl(args: string[], input?: string): { ok: boolean; out: string } {
+  const r = spawnSync('curl', args, { input, encoding: 'utf-8' });
+  return { ok: r.status === 0, out: (r.stdout ?? '') + (r.stderr ?? '') };
+}
+
+/**
+ * Setup instructions for when whoami fails. `body` is the gateway's error
+ * JSON (when the request was proxied through OneCLI). We surface the URL it
+ * hands back — `secret_url` for an unknown host (HF's case), `connect_url`
+ * for an OAuth app, `manage_url` when the secret exists but this agent lacks
+ * access — so the link always points at the right gateway (local or hosted).
+ */
+function notSignedInMessage(body: string): string {
+  let setupUrl: string | undefined;
+  try {
+    const e = JSON.parse(body) as { secret_url?: string; connect_url?: string; manage_url?: string };
+    if (e.secret_url) {
+      // The pre-filled `path` defaults to the failing request path
+      // (/api/whoami-v2), which scopes the secret to that one endpoint. Blank
+      // it so the secret matches all of huggingface.co — the upload endpoints
+      // included, not just whoami.
+      setupUrl = e.secret_url.replace(/([?&]path=)[^&]*/, '$1');
+    } else {
+      setupUrl = e.connect_url ?? e.manage_url;
+    }
+  } catch {
+    /* non-JSON body (e.g. HF's own error, or no gateway) — generic fallback */
+  }
+  const lines = [
+    "Can't upload — no Hugging Face token is available to this agent. To set it up:",
+    '',
+    '1. Create a token with WRITE access at https://huggingface.co/settings/tokens',
+    '   (New token → type "Write" → copy it).',
+    '',
+    setupUrl
+      ? `2. Add it to OneCLI here: ${setupUrl}`
+      : '2. Add it to the OneCLI vault as a secret with host pattern  huggingface.co',
+    '',
+    'Then run /upload-trace again.',
+  ];
+  return lines.join('\n');
+}
+
+/** Returns a user-facing status line. Never throws. */
+export function uploadTrace(): string {
+  const file = newestTranscript();
+  if (!file) return 'No transcript to upload for this session yet.';
+
+  // whoami, capturing the body + HTTP status (no -f, so the gateway's error
+  // JSON survives a 401). When no token is available the OneCLI gateway
+  // returns a setup URL pre-filled for *this* gateway — so we never hardcode
+  // local-vs-hosted dashboard links, and never have to know which it is.
+  const who = curl(['-s', '-w', '\n%{http_code}', 'https://huggingface.co/api/whoami-v2']);
+  const nl = who.out.lastIndexOf('\n');
+  const body = nl === -1 ? '' : who.out.slice(0, nl);
+  const status = nl === -1 ? who.out.trim() : who.out.slice(nl + 1).trim();
+
+  if (status !== '200') {
+    return notSignedInMessage(body);
+  }
+  let user: string | undefined;
+  try {
+    user = JSON.parse(body)?.name;
+  } catch {
+    /* fall through */
+  }
+  if (!user) return 'Could not resolve your Hugging Face username.';
+
+  const repo = `${user}/nanoclaw-traces`;
+  // Idempotent create — ignore failure (already exists / no-op). The
+  // Content-Type header is required: without it curl sends form-encoding and
+  // the Hub rejects the body with 400 (expected string at "name").
+  curl([
+    '-sf',
+    '-X',
+    'POST',
+    'https://huggingface.co/api/repos/create',
+    '-H',
+    'Content-Type: application/json',
+    '-d',
+    JSON.stringify({ type: 'dataset', name: 'nanoclaw-traces', private: true }),
+  ]);
+
+  const content = fs.readFileSync(file).toString('base64');
+  const repoPath = `sessions/${path.basename(file)}`;
+  const ndjson =
+    JSON.stringify({ key: 'header', value: { summary: 'add session trace' } }) +
+    '\n' +
+    JSON.stringify({
+      key: 'file',
+      value: { path: repoPath, encoding: 'base64', content },
+    }) +
+    '\n';
+
+  const commit = curl(
+    [
+      '-sf',
+      '-X',
+      'POST',
+      `https://huggingface.co/api/datasets/${repo}/commit/main`,
+      '-H',
+      'Content-Type: application/x-ndjson',
+      '--data-binary',
+      '@-',
+    ],
+    ndjson,
+  );
+  if (!commit.ok) {
+    return 'Upload to Hugging Face failed (the transcript may be too large for an inline commit).';
+  }
+  return `Uploaded → https://huggingface.co/datasets/${repo}/blob/main/${repoPath}`;
+}
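
Two pieces of string handling in `uploadTrace()` and `notSignedInMessage()` are easy to misread, so here is a standalone sketch of both. `splitBodyAndStatus` and `blankPathParam` are hypothetical names introduced for illustration, and the gateway URL in the usage lines is made up; the logic follows the lines in the hunk above.

```typescript
// Hypothetical helper: curl is invoked with `-w '\n%{http_code}'`, so the
// status code is whatever follows the last newline of the output.
function splitBodyAndStatus(out: string): { body: string; status: string } {
  const nl = out.lastIndexOf('\n');
  return nl === -1
    ? { body: '', status: out.trim() }
    : { body: out.slice(0, nl), status: out.slice(nl + 1).trim() };
}

// Hypothetical helper: blank the pre-filled `path` query parameter so the
// secret is not scoped to a single endpoint.
function blankPathParam(secretUrl: string): string {
  return secretUrl.replace(/([?&]path=)[^&]*/, '$1');
}

const parsed = splitBodyAndStatus('{"name":"alice"}\n200');
console.log(parsed.status); // 200
console.log(JSON.parse(parsed.body).name); // alice

// Made-up URL, for illustration only.
console.log(blankPathParam('https://gateway.example/secrets/new?host=huggingface.co&path=/api/whoami-v2'));
// https://gateway.example/secrets/new?host=huggingface.co&path=
```

Splitting on the last newline rather than the first keeps a multi-line response body intact.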
@@ -0,0 +1,51 @@
+# Recovering from the upgrade tripwire
+
+If NanoClaw refuses to start with a message like *"update did not go through the supported path"*, this page explains what happened and how to clear it.
+
+## What happened
+
+NanoClaw records the version it reached each time you upgrade through a supported path — `/setup`, `/update-nanoclaw`, or `/migrate-nanoclaw`. That record lives in `data/upgrade-state.json`.
+
+At startup the host checks that record against the running code. If it's **missing** or its version **doesn't match** the code, the host stops. This almost always means the code was updated by a raw `git pull` instead of the supported flow — so migrations, dependency installs, or container rebuilds that the flow would have run may not have happened.
+
+## If you just ran a supported upgrade
+
+If you reached this point by running `/update-nanoclaw`, `/migrate-nanoclaw`, or `/setup` and it **completed successfully**, this is expected the first time an existing install meets the tripwire (your previous version predated it). Clear it by stamping the current version:
+
+```bash
+pnpm exec tsx scripts/upgrade-state.ts set
+```
+
+Then restart NanoClaw. It won't trip again.
+
+**Coding agents:** if the upgrade you just ran reported success, run the command above automatically and restart. Only stop to ask a human if the upgrade itself failed or you can't confirm it finished.
+
+## If you ran `git pull` yourself
+
+Don't just clear the tripwire — that skips the work the supported flow does. Instead, run the supported upgrade so backups, migrations, and rebuilds happen:
+
+```
+/update-nanoclaw
+```
+
+Once it finishes it stamps the marker for you, and the next start is clean.
+
+## If you have your own upgrade flow
+
+If you've built your own way to upgrade — a custom skill, a deploy script, a CI job, a service that pulls and restarts — it won't stamp the marker, so the host will trip on the next start. Add the stamp as the **last step** of that flow, after the upgrade succeeds and before the restart:
+
+```bash
+pnpm exec tsx scripts/upgrade-state.ts set
+```
+
+That's the same thing `/setup`, `/update-nanoclaw`, and `/migrate-nanoclaw` do at the end. Do it only when the upgrade actually completed — the marker is your assertion that this install reached the current version through a path you trust.
+
+## The override
+
+`pnpm exec tsx scripts/upgrade-state.ts set` is the override: it declares "this install is good at the current version." Use it when you know the install is actually in a good state (e.g. you completed the steps manually). It's safe to re-run.
+
+To inspect the current marker:
+
+```bash
+pnpm exec tsx scripts/upgrade-state.ts get
+```
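
For a custom upgrade flow, the ordering described above (upgrade, stamp only on success, then restart) can be sketched as a small driver. This is an illustration under assumptions, not part of NanoClaw: `runUpgradeFlow` and the `Step` type are hypothetical, and in a real flow each step would shell out to your own commands, with the stamp step running `pnpm exec tsx scripts/upgrade-state.ts set`.

```typescript
// Hypothetical types and driver, for illustration only.
type Step = { name: string; run: () => boolean }; // run() returns true on success

function runUpgradeFlow(upgradeSteps: Step[], stamp: Step, restart: Step): string[] {
  const executed: string[] = [];
  for (const step of upgradeSteps) {
    executed.push(step.name);
    if (!step.run()) return executed; // failed: no stamp, no restart
  }
  // Stamp last, after every upgrade step succeeded and before the restart.
  executed.push(stamp.name);
  if (!stamp.run()) return executed;
  executed.push(restart.name);
  restart.run();
  return executed;
}

const ok = (name: string): Step => ({ name, run: () => true });
const fail = (name: string): Step => ({ name, run: () => false });

console.log(runUpgradeFlow([ok('pull'), ok('build')], ok('stamp'), ok('restart')).join(' > '));
// pull > build > stamp > restart
console.log(runUpgradeFlow([ok('pull'), fail('build')], ok('stamp'), ok('restart')).join(' > '));
// pull > build
```

Leaving the marker unstamped after a failed step is deliberate: the tripwire then stops the host on the next start instead of letting a half-upgraded install run.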
@@ -1,6 +1,6 @@
 {
  "name": "nanoclaw",
-  "version": "2.0.70",
+  "version": "2.1.0",
  "description": "Personal Claude assistant. Lightweight, secure, customizable.",
  "type": "module",
  "packageManager": "pnpm@10.33.0",
@@ -1,5 +1,5 @@
-<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="90" height="20" role="img" aria-label="179k tokens, 89% of context window">
-  <title>179k tokens, 89% of context window</title>
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="90" height="20" role="img" aria-label="181k tokens, 91% of context window">
+  <title>181k tokens, 91% of context window</title>
  <linearGradient id="s" x2="0" y2="100%">
    <stop offset="0" stop-color="#bbb" stop-opacity=".1"/>
    <stop offset="1" stop-opacity=".1"/>
@@ -15,8 +15,8 @@
      <g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" font-size="11">
        <text aria-hidden="true" x="26" y="15" fill="#010101" fill-opacity=".3">tokens</text>
        <text x="26" y="14">tokens</text>
-        <text aria-hidden="true" x="71" y="15" fill="#010101" fill-opacity=".3">179k</text>
-        <text x="71" y="14">179k</text>
+        <text aria-hidden="true" x="71" y="15" fill="#010101" fill-opacity=".3">181k</text>
+        <text x="71" y="14">181k</text>
      </g>
    </g>
  </a>
@@ -0,0 +1,26 @@
+/**
+ * scripts/upgrade-state.ts — read or stamp the upgrade marker.
+ *
+ * Usage:
+ *   pnpm exec tsx scripts/upgrade-state.ts get
+ *   pnpm exec tsx scripts/upgrade-state.ts set [version] [via]
+ *
+ * `set` with no version stamps the current package.json version. The
+ * sanctioned upgrade paths (setup / update / migrate) call `set` on
+ * success; running it by hand is also the documented way to clear the
+ * startup tripwire — see docs/upgrade-recovery.md.
+ */
+import { getCodeVersion, markerPath, readUpgradeState, writeUpgradeState } from '../src/upgrade-state.js';
+
+const [, , cmd, versionArg, viaArg] = process.argv;
+
+if (cmd === 'get') {
+  const state = readUpgradeState();
+  console.log(state ? JSON.stringify(state) : 'none');
+} else if (cmd === 'set') {
+  const state = writeUpgradeState({ version: versionArg || getCodeVersion(), via: viaArg || 'manual' });
+  console.log(`Stamped ${markerPath()}: ${JSON.stringify(state)}`);
+} else {
+  console.error('Usage: pnpm exec tsx scripts/upgrade-state.ts get | set [version] [via]');
+  process.exit(2);
+}
@@ -11,6 +11,7 @@ import path from 'path';

 import { log } from '../src/log.js';
 import { getLaunchdLabel, getSystemdUnit } from '../src/install-slug.js';
+import { writeUpgradeState } from '../src/upgrade-state.js';
 import { cleanupUnhealthyPeers } from './peer-cleanup.js';
 import {
  commandExists,
@@ -54,6 +55,11 @@ export async function run(_args: string[]): Promise<void> {

  fs.mkdirSync(path.join(projectRoot, 'logs'), { recursive: true });

+  // Stamp the upgrade marker before the host first starts, so the startup
+  // tripwire (enforceUpgradeTripwire) sees this as a sanctioned install.
+  const stamped = writeUpgradeState({ via: 'setup' });
+  log.info('Stamped upgrade marker', { version: stamped.version });
+
  // Peer preflight — a crash-looping peer install (most often the legacy v1
  // `com.nanoclaw` plist) will keep trashing this install's containers on
  // every respawn via its own cleanupOrphans. Detect and unload any peer
@@ -12,7 +12,7 @@ import { getDb, hasTable } from './db/connection.js';
 export type GateResult = { action: 'pass' } | { action: 'filter' } | { action: 'deny'; command: string };

 const FILTERED_COMMANDS = new Set(['/help', '/login', '/logout', '/doctor', '/config', '/remote-control']);
-const ADMIN_COMMANDS = new Set(['/clear', '/compact', '/context', '/cost', '/files']);
+const ADMIN_COMMANDS = new Set(['/clear', '/compact', '/context', '/cost', '/files', '/upload-trace']);

 /**
 * Classify a message and decide whether it should reach the container.
@@ -17,6 +17,7 @@ import { startActiveDeliveryPoll, startSweepDeliveryPoll, setDeliveryAdapter, st
 import { startHostSweep, stopHostSweep } from './host-sweep.js';
 import { routeInbound } from './router.js';
 import { log } from './log.js';
+import { enforceUpgradeTripwire } from './upgrade-state.js';

 // Response + shutdown registries live in response-registry.ts to break the
 // circular import cycle: src/index.ts imports src/modules/index.js for side
@@ -69,6 +70,10 @@ async function main(): Promise<void> {
  // 0. Circuit breaker — backoff on rapid restarts
  await enforceStartupBackoff();

+  // 0.5 Upgrade tripwire — refuse to start if this install was updated
+  // outside the sanctioned path (raw `git pull` instead of /update-nanoclaw).
+  enforceUpgradeTripwire();
+
  // 1. Init central DB
  const dbPath = path.join(DATA_DIR, 'v2.db');
  const db = initDb(dbPath);
@@ -0,0 +1,90 @@
+import fs from 'fs';
+import path from 'path';
+
+import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest';
+
+vi.mock('./config.js', async () => {
+  const actual = await vi.importActual<typeof import('./config.js')>('./config.js');
+  return { ...actual, DATA_DIR: '/tmp/nanoclaw-test-upgrade-state' };
+});
+
+const TEST_DIR = '/tmp/nanoclaw-test-upgrade-state';
+
+import {
+  enforceUpgradeTripwire,
+  getCodeVersion,
+  isUpgradeCurrent,
+  markerPath,
+  readUpgradeState,
+  writeUpgradeState,
+} from './upgrade-state.js';
+
+beforeEach(() => {
+  fs.rmSync(TEST_DIR, { recursive: true, force: true });
+});
+afterEach(() => {
+  fs.rmSync(TEST_DIR, { recursive: true, force: true });
+});
+
+describe('upgrade-state', () => {
+  it('getCodeVersion reads the package.json version', () => {
+    const pkg = JSON.parse(fs.readFileSync(path.join(process.cwd(), 'package.json'), 'utf8'));
+    expect(getCodeVersion()).toBe(pkg.version);
+  });
+
+  it('readUpgradeState returns null when the marker is absent', () => {
+    expect(readUpgradeState()).toBeNull();
+  });
+
+  it('write then read round-trips, with version/via/updatedAt', () => {
+    const written = writeUpgradeState({ version: '9.9.9', via: 'test' });
+    expect(written).toMatchObject({ version: '9.9.9', via: 'test' });
+    expect(written.updatedAt).toBeTruthy();
+    expect(readUpgradeState()).toEqual(written);
+  });
+
+  it('write defaults the version to the code version', () => {
+    expect(writeUpgradeState({ via: 'test' }).version).toBe(getCodeVersion());
+  });
+
+  it('isUpgradeCurrent: false when absent, false on mismatch, true on match', () => {
+    expect(isUpgradeCurrent()).toBe(false);
+    writeUpgradeState({ version: '0.0.0-nope', via: 'test' });
+    expect(isUpgradeCurrent()).toBe(false);
+    writeUpgradeState({ version: getCodeVersion(), via: 'test' });
+    expect(isUpgradeCurrent()).toBe(true);
+  });
+
+  it('treats a corrupt marker as absent (fails closed, never throws)', () => {
+    fs.mkdirSync(TEST_DIR, { recursive: true });
+    fs.writeFileSync(path.join(TEST_DIR, 'upgrade-state.json'), '{ this is not json');
+    expect(() => readUpgradeState()).not.toThrow();
+    expect(readUpgradeState()).toBeNull();
+    expect(isUpgradeCurrent()).toBe(false);
+  });
+
+  it('markerPath is upgrade-state.json under the data dir', () => {
+    expect(markerPath()).toBe(path.join(TEST_DIR, 'upgrade-state.json'));
+  });
+
+  it('enforceUpgradeTripwire exits when not current and passes when current', () => {
+    const exitSpy = vi.spyOn(process, 'exit').mockImplementation(((code?: number) => {
+      throw new Error(`exit:${code}`);
+    }) as never);
+    const errSpy = vi.spyOn(console, 'error').mockImplementation(() => {});
+
+    // No marker → trips.
+    expect(() => enforceUpgradeTripwire()).toThrow('exit:1');
+
+    // Stale marker → trips.
+    writeUpgradeState({ version: '0.0.0-nope', via: 'test' });
+    expect(() => enforceUpgradeTripwire()).toThrow('exit:1');
+
+    // Matching marker → passes.
+    writeUpgradeState({ version: getCodeVersion(), via: 'test' });
+    expect(() => enforceUpgradeTripwire()).not.toThrow();
+
+    exitSpy.mockRestore();
+    errSpy.mockRestore();
+  });
+});
@@ -0,0 +1,126 @@
+/**
+ * Upgrade marker — the record that an install reached its current version
+ * through a sanctioned path (setup / `/update-nanoclaw` / `/migrate-nanoclaw`).
+ *
+ * The startup tripwire (enforceUpgradeTripwire) refuses to run if the marker
+ * is missing or its version doesn't match the running code — i.e. if the
+ * install was updated by a raw `git pull` instead of the supported flow.
+ *
+ * The marker lives in `data/` (gitignored), so a `git pull` can't touch it.
+ * Only the sanctioned paths call writeUpgradeState(); clearing the tripwire
+ * by hand uses the same `set` command (FIX_COMMAND below); see docs/upgrade-recovery.md.
+ */
+import fs from 'fs';
+import path from 'path';
+
+import { DATA_DIR } from './config.js';
+import { log } from './log.js';
+
+export interface UpgradeState {
+  version: string;
+  updatedAt: string;
+  via: string;
+}
+
+const MARKER_PATH = path.join(DATA_DIR, 'upgrade-state.json');
+const FIX_COMMAND = 'pnpm exec tsx scripts/upgrade-state.ts set';
+
+/** Version the running code declares, read from package.json. */
+export function getCodeVersion(): string {
+  const pkgPath = path.join(process.cwd(), 'package.json');
+  const pkg = JSON.parse(fs.readFileSync(pkgPath, 'utf8')) as { version?: string };
+  if (!pkg.version) throw new Error(`No version field in ${pkgPath}`);
+  return pkg.version;
+}
+
+/**
+ * Read the upgrade marker, or null if it's absent, unreadable, or corrupt.
+ * Never throws — a boot gate must fail closed (treat anything it can't trust
+ * as "no valid marker" → trip), not crash with a stack trace.
+ */
+export function readUpgradeState(): UpgradeState | null {
+  let raw: string;
+  try {
+    raw = fs.readFileSync(MARKER_PATH, 'utf8');
+  } catch (e: unknown) {
+    if ((e as NodeJS.ErrnoException).code === 'ENOENT') return null;
+    log.warn('Could not read upgrade marker; treating as absent', { path: MARKER_PATH, err: String(e) });
+    return null;
+  }
+  try {
+    return JSON.parse(raw) as UpgradeState;
+  } catch {
+    log.warn('Upgrade marker is corrupt; treating as absent', { path: MARKER_PATH });
+    return null;
+  }
+}
+
+/**
+ * Stamp the marker. Only the sanctioned paths (setup / update / migrate)
+ * call this on success; `version` defaults to the current code version.
+ */
+export function writeUpgradeState(opts: { version?: string; via: string }): UpgradeState {
+  const state: UpgradeState = {
+    version: opts.version ?? getCodeVersion(),
+    updatedAt: new Date().toISOString(),
+    via: opts.via,
+  };
+  fs.mkdirSync(DATA_DIR, { recursive: true });
+  fs.writeFileSync(MARKER_PATH, JSON.stringify(state, null, 2) + '\n');
+  return state;
+}
+
+/** True when the marker exists and matches the running code version. */
+export function isUpgradeCurrent(): boolean {
+  const state = readUpgradeState();
+  return state !== null && state.version === getCodeVersion();
+}
+
+/** Absolute path to the marker file. */
+export function markerPath(): string {
+  return MARKER_PATH;
+}
+
+/**
+ * Startup gate. If the install didn't reach the current version through a
+ * sanctioned path, stop with a banner that points a human to /update-nanoclaw
+ * and tells the coding agent that just ran the upgrade how to clear the stop.
+ */
+export function enforceUpgradeTripwire(): void {
+  if (isUpgradeCurrent()) return;
+
+  const code = getCodeVersion();
+  const recorded = readUpgradeState()?.version ?? 'none';
+
+  console.error(
+    [
+      '',
+      '='.repeat(64),
+      'NanoClaw stopped: update did not go through the supported path',
+      '='.repeat(64),
+      'This install was updated outside the sanctioned upgrade flow',
+      '(no upgrade marker, or the marker version does not match the code).',
+      '',
+      `  code version:     ${code}`,
+      `  recorded version: ${recorded}`,
+      '',
+      "IF YOU'RE A HUMAN: you most likely ran `git pull` directly. Don't",
+      'fix this by hand — open Claude Code in this folder and run',
+      '/update-nanoclaw, which repairs the install and clears this stop.',
+      '',
+      'CODING AGENT: if you just ran /update-nanoclaw, /migrate-nanoclaw, or',
+      'setup and it completed successfully, clear this automatically by running:',
+      '',
+      `  ${FIX_COMMAND}`,
+      '',
+      'then restart NanoClaw. If the upgrade did NOT complete, stop and',
+      'investigate instead of clearing.',
+      '',
+      'Details: docs/upgrade-recovery.md',
+      '='.repeat(64),
+      '',
+    ].join('\n'),
+  );
+  log.error('Upgrade tripwire: install not on the sanctioned path', { code, recorded });
+  process.exit(1);
+}
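
The fail-closed decision in `readUpgradeState` / `isUpgradeCurrent` can be sketched on its own as a small classifier. This is a minimal illustration, not the code in the diff: `classifyMarker`, `shouldStart`, and `MarkerStatus` are hypothetical names, the version strings are only examples, and the extra shape check on the parsed JSON is an assumption that goes slightly beyond what `readUpgradeState` does (the diff casts the parsed value directly).

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical status type, not part of the diff.
type MarkerStatus = 'missing' | 'corrupt' | 'mismatch' | 'current';

// Hypothetical helper: classify a marker file against the version the code
// declares. It never throws; anything it cannot trust is a non-current status.
function classifyMarker(markerFile: string, codeVersion: string): MarkerStatus {
  let raw: string;
  try {
    raw = fs.readFileSync(markerFile, 'utf8');
  } catch {
    return 'missing'; // absent or unreadable
  }

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return 'corrupt'; // not valid JSON
  }

  // Assumption: also reject valid JSON that lacks a string `version` field.
  if (typeof parsed !== 'object' || parsed === null) return 'corrupt';
  const version = (parsed as { version?: unknown }).version;
  if (typeof version !== 'string') return 'corrupt';

  return version === codeVersion ? 'current' : 'mismatch';
}

// Only a matching marker lets startup proceed; every other status trips.
function shouldStart(status: MarkerStatus): boolean {
  return status === 'current';
}

// Usage against a throwaway directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'marker-sketch-'));
const markerFile = path.join(dir, 'upgrade-state.json');

const missing = classifyMarker(markerFile, '2.1.0');

fs.writeFileSync(markerFile, '{ this is not json');
const corrupt = classifyMarker(markerFile, '2.1.0');

fs.writeFileSync(markerFile, JSON.stringify({ version: '2.0.76', via: 'sketch' }));
const mismatch = classifyMarker(markerFile, '2.1.0');

fs.writeFileSync(markerFile, JSON.stringify({ version: '2.1.0', via: 'sketch' }));
const current = classifyMarker(markerFile, '2.1.0');

fs.rmSync(dir, { recursive: true, force: true });

console.log([missing, corrupt, mismatch, current].join(' '));
// prints: missing corrupt mismatch current
```

Separating the status from the start/stop decision keeps the gate easy to test without spying on `process.exit`, at the cost of one more type to maintain.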
Author	SHA1	Message	Date
gavrielc	092487d7ad	chore: release 2.1.0; guard auto-bump against deliberate version changes Set package.json to 2.1.0 to match the CHANGELOG entry for the upgrade tripwire (a [BREAKING] change warrants a minor bump). The startup tripwire reads package.json as the source of truth, so this is the version the gate will enforce. bump-version.yml previously ran `pnpm version patch` on every push to main, which would patch a deliberate 2.1.0 up to 2.1.1. It now skips the auto-bump when the pushed commits already changed package.json themselves. fetch-depth: 0 so the before/after diff has both tips. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 17:03:02 +03:00
gavrielc	87850aa7f8	docs(changelog): release the upgrade-tripwire entry as 2.1.0 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 16:59:30 +03:00
gavrielc	526170fd47	feat(upgrade): add human-addressed guidance to tripwire banner The startup tripwire message was written for a coding agent and gave a human no direction — only the bare `set` override (which skips the migrations the gate guards). Add one human-addressed stanza pointing to /update-nanoclaw as the correct fix. The tested CODING AGENT block is left byte-for-byte unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 16:57:13 +03:00
gavrielc	e734e5cddd	feat(upgrade): startup tripwire + upgrade marker Refuse to start unless this install reached the current version through a sanctioned path (setup / update / migrate). A raw `git pull` that skips migrations now fails loudly with a self-healing message instead of silently breaking. - src/upgrade-state.ts: marker at data/upgrade-state.json, getCodeVersion, isUpgradeCurrent, enforceUpgradeTripwire (fails closed on missing / corrupt / mismatched marker) - src/index.ts: gate wired in at startup step 0.5, before DB init - scripts/upgrade-state.ts: get/set CLI (also the override / recovery cmd) - setup/service.ts, /update-nanoclaw, /migrate-nanoclaw: stamp on success; update/migrate also self-update their own skill first - CHANGELOG [BREAKING] entry bridges existing installs via the skills' breaking-change check - docs/upgrade-recovery.md: clearing the tripwire Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 13:02:12 +03:00
github-actions[bot]	d14472142d	chore: bump version to 2.0.76	2026-06-05 08:04:24 +00:00
gavrielc	0c1897ad12	fix: blank the secret_url path instead of /* A bare * in the pre-filled secret_url path doesn't survive (the gateway URL-encodes everything, so an unencoded * collapses to just /, which only exact-matches the path /). Leave the path blank instead so the created secret matches all of huggingface.co, not a single endpoint. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-05 11:04:06 +03:00
github-actions[bot]	d16b24d5b4	docs: update token count to 181k tokens · 91% of context window	2026-06-05 07:56:50 +00:00
github-actions[bot]	d0de64b999	chore: bump version to 2.0.75	2026-06-05 07:56:45 +00:00
gavrielc	f3fde69536	fix: trim the upload-trace not-signed-in message Drop "(host pattern pre-filled)" and "— no restart needed" from the HF setup instructions. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-05 10:56:30 +03:00
github-actions[bot]	20140c84be	chore: bump version to 2.0.74	2026-06-05 07:55:13 +00:00
gavrielc	33c36842fa	Merge pull request #2691 from nanocoai/upload-trace-gateway-setup-url feat: show OneCLI's own setup URL when HF token is missing	2026-06-05 10:54:58 +03:00
gavrielc	0435736314	feat: show OneCLI's own setup URL when HF token is missing The not-signed-in message hardcoded both a local and a hosted OneCLI dashboard URL because the container can't tell which gateway it's behind. But the gateway already tells us: a credential-less proxied request comes back with the right URL in its error body — - credential_not_found → secret_url (pre-filled "new secret" form) - access_restricted → manage_url (grant this agent access) - app_not_connected → connect_url Capture whoami's body + status (drop -f so the JSON survives the 401), extract that URL, and present it. It's always the correct gateway, local or hosted, with zero extra wiring. The secret_url's pre-filled `path` defaults to the failing request path (/api/whoami-v2), so broaden it to /* — otherwise the created secret wouldn't cover the upload endpoints. Falls back to generic text when there's no gateway JSON to read. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-05 10:48:02 +03:00
github-actions[bot]	384f9c29e3	chore: bump version to 2.0.73	2026-06-05 07:37:40 +00:00
github-actions[bot]	aa8597acf8	docs: update token count to 181k tokens · 90% of context window	2026-06-05 07:37:35 +00:00
gavrielc	3ae4ba18c3	Merge pull request #2690 from nanocoai/fix-upload-trace-secret-mode-docs fix: simplify HF token setup + correct secret-mode docs	2026-06-05 10:37:20 +03:00
gavrielc	de88be8a7a	fix: simplify HF token setup + correct secret-mode docs The default OneCLI secret mode for auto-created agents is `all`, not `selective` — a fresh agent created via ensureAgent({name, identifier}) comes back with secretMode "all", so matching vault secrets inject automatically. Drop the now-unnecessary per-agent assignment step. - upload-trace.ts: remove step 3 (set-secret-mode) from the not-authed message; creating the token and adding it to the vault is enough - CLAUDE.md: trim the secret-mode gotcha to reflect `all` as the default - init-onecli skill: replace stale `onecli start` (gone in 1.4.x) and the `ps aux | grep onecli` check with the real Docker Compose start path Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-05 10:35:42 +03:00
github-actions[bot]	b9141218ad	docs: update token count to 181k tokens · 91% of context window	2026-05-31 20:17:59 +00:00
github-actions[bot]	341b5950e1	chore: bump version to 2.0.72	2026-05-31 20:17:55 +00:00
gavrielc	8cb4ed27ef	Merge pull request #2648 from nanocoai/share-session-command	2026-05-31 23:17:43 +03:00
gavrielc	729cd8d2a6	feat: add /upload-trace command to upload session trace to Hugging Face Adds a runner-handled /upload-trace slash command (admin-gated, like /clear) that uploads the current session's Claude Code transcript to the user's own private {hf_user}/nanoclaw-traces dataset, browsable in the HF Agent Trace Viewer. The transcript is already in the format the viewer auto-detects, so the command just locates the newest one and pushes it via the Hub commit API. Auth is handled by the OneCLI gateway: curl goes out through the injected HTTPS_PROXY, which adds the user's HF token — no credential ever touches agent code. A missing/unassigned token yields a clear setup message. - container/agent-runner/src/upload-trace.ts: isUploadTraceCommand() + uploadTrace() - poll-loop.ts: recognize and handle /upload-trace in the runner - command-gate.ts: admin-gate /upload-trace on the host - upload-trace.test.ts: unit + integration coverage for the command Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 10:42:36 +03:00
github-actions[bot]	3601a8a1fe	chore: bump version to 2.0.71	2026-05-28 19:41:34 +00:00
gavrielc	991969085e	Merge pull request #2637 from nanocoai/bump-claude-code-2.1.154 chore: bump claude-code to 2.1.154 and claude-agent-sdk to 0.3.154	2026-05-28 22:41:19 +03:00