mirror of
https://github.com/qwibitai/nanoclaw.git
synced 2026-06-18 18:29:35 +08:00
Compare commits
22 Commits
| SHA1 |
|---|
| 092487d7ad |
| 87850aa7f8 |
| 526170fd47 |
| e734e5cddd |
| d14472142d |
| 0c1897ad12 |
| d16b24d5b4 |
| d0de64b999 |
| f3fde69536 |
| 20140c84be |
| 33c36842fa |
| 0435736314 |
| 384f9c29e3 |
| aa8597acf8 |
| 3ae4ba18c3 |
| de88be8a7a |
| b9141218ad |
| 341b5950e1 |
| 8cb4ed27ef |
| 729cd8d2a6 |
| 3601a8a1fe |
| 991969085e |
@@ -98,13 +98,13 @@ for i in $(seq 1 15); do
 done
 ```

-If it never becomes healthy, check if the gateway process is running:
+If it never becomes healthy, check the gateway containers. The gateway is a Docker Compose stack (project `onecli`, compose file at `~/.onecli/docker-compose.yml`), **not** a host process — `ps aux | grep onecli` will not find it, and there is no `onecli start` command (removed in OneCLI 1.4.x).

 ```bash
-ps aux | grep -i onecli | grep -v grep
+docker ps -a --filter "label=com.docker.compose.project=onecli" --format '{{.Names}}\t{{.Status}}'
 ```

-If it's not running, try starting it manually: `onecli start`. If that fails, show the error and stop — the user needs to debug their OneCLI installation.
+Both services have `restart: unless-stopped`, so they come back automatically once the Docker daemon is up. If Docker isn't running, start it (`open -a Docker` on macOS) and they'll restart on their own. To bring the stack up manually: `docker compose -f ~/.onecli/docker-compose.yml up -d`. If that fails, show the error and stop — the user needs to debug their OneCLI installation.

 ## Phase 3: Migrate existing credentials

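The `docker ps` check in the hunk above prints one `name<TAB>status` line per container. As a minimal sketch of how a script might interpret that output, the helper below treats a container as running when its status starts with `Up`, which is how Docker normally reports running containers. The function names and the container names used in any example input are hypothetical, not part of OneCLI.

```typescript
// Sketch only: interprets the output of
//   docker ps -a --filter "label=com.docker.compose.project=onecli" --format '{{.Names}}\t{{.Status}}'
// Assumption: a running container has a status beginning with "Up".

interface ContainerState {
  name: string;
  status: string;
  running: boolean;
}

function parseDockerPs(output: string): ContainerState[] {
  return output
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      // First tab-separated field is the name, the rest is the status text.
      const [name, ...rest] = line.split('\t');
      const status = rest.join('\t');
      return { name, status, running: status.startsWith('Up') };
    });
}

// The stack counts as running only if at least one container exists
// and every listed container is up.
function gatewayStackRunning(output: string): boolean {
  const containers = parseDockerPs(output);
  return containers.length > 0 && containers.every((c) => c.running);
}
```

An empty listing is reported as not running, which matches the case where the Compose stack was never created or Docker has no record of it.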
@@ -299,7 +299,7 @@ If an agent uses `git` or `gh`, add to `data/v2-sessions/<agent-group-id>/.claud

 ## Troubleshooting

-**"OneCLI gateway not reachable" in logs:** The gateway isn't running. Check with `curl -sf ${ONECLI_URL}/health`. Start it with `onecli start` if needed.
+**"OneCLI gateway not reachable" in logs:** The gateway isn't running. Check with `curl -sf ${ONECLI_URL}/health`. The most common cause is that Docker itself is down (the gateway is a Compose stack) — start Docker (`open -a Docker` on macOS) and the containers restart automatically. To bring them up manually: `docker compose -f ~/.onecli/docker-compose.yml up -d`.

 **Container gets no credentials:** Verify `ONECLI_URL` is set in `.env` and the gateway has an Anthropic secret (`onecli secrets list`).

@@ -28,6 +28,15 @@ Two phases: **Extract** (build the migration guide) and **Upgrade** (use it). If

 ---

+# Phase 0: Refresh this skill first
+
+The migration process itself evolves, so run its newest version before doing anything else:
+- Ensure the `upstream` remote exists (default `https://github.com/nanocoai/nanoclaw.git`) and fetch: `git fetch upstream --prune`. Detect the upstream branch (`main` or `master`).
+- Refresh this skill from upstream: `git checkout upstream/<branch> -- .claude/skills/migrate-nanoclaw/`
+- Re-read `.claude/skills/migrate-nanoclaw/SKILL.md`. If it changed, **follow the updated version from the top** instead of this one.
+
+This is the only working-tree change expected before the preflight check below; changes limited to `.claude/skills/migrate-nanoclaw/` are this self-refresh — ignore them in the 1.0 clean-tree check and proceed.
+
 # Phase 1: Extract

 ## 1.0 Preflight
@@ -463,6 +472,11 @@ Do NOT use `git checkout -B` to create an intermediate branch — this caused is

 Run `npm install && pnpm run build` in the main tree to confirm.

+Stamp the upgrade marker (required — without it the startup tripwire stops the host on next start). Only do this after the build above succeeds:
+```bash
+pnpm exec tsx scripts/upgrade-state.ts set "" migrate-nanoclaw
+```
+
 Restart the service:
 ```bash
 launchctl load ~/Library/LaunchAgents/com.nanoclaw.plist
@@ -60,11 +60,20 @@ Help a user with a customized NanoClaw install safely incorporate upstream chang
 - Default to MERGE (one-pass conflict resolution). Offer REBASE as an explicit option.
 - Keep token usage low: rely on `git status`, `git log`, `git diff`, and open only conflicted files.

+# Step 0a: Refresh this skill first
+The update process itself evolves, so run its newest version before doing anything else:
+- Ensure the `upstream` remote exists (default `https://github.com/nanocoai/nanoclaw.git`) and fetch: `git fetch upstream --prune`. Detect the upstream branch (`main` or `master`).
+- Refresh this skill from upstream: `git checkout upstream/<branch> -- .claude/skills/update-nanoclaw/`
+- Re-read `.claude/skills/update-nanoclaw/SKILL.md`. If it changed, **follow the updated version from the top** instead of this one.
+
+This is the only working-tree change expected before the preflight check; the full update commits it along with everything else.
+
 # Step 0: Preflight (stop early if unsafe)
 Run:
 - `git status --porcelain`
 If output is non-empty:
 - Tell the user to commit or stash first, then stop.
+- Exception: changes limited to `.claude/skills/update-nanoclaw/` are the Step 0a self-refresh — ignore those and proceed.

 Confirm remotes:
 - `git remote -v`
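The preflight exception in the hunk above ("changes limited to `.claude/skills/update-nanoclaw/`") can be read as a filter over `git status --porcelain` output. The sketch below assumes the short v1 porcelain format (two status characters, a space, then the path, with renames printed as `old -> new`). The function names are illustrative, and quoted paths containing unusual characters are not handled.

```typescript
// Sketch only: decide whether a dirty tree is "clean enough" to proceed,
// i.e. every changed path belongs to the skill's own self-refresh.

const SELF_REFRESH_PREFIX = '.claude/skills/update-nanoclaw/';

function changedPaths(porcelain: string): string[] {
  return porcelain
    .split('\n')
    .filter((line) => line.length > 3)
    // Drop the two status characters and the separating space.
    // A rename line yields both the old and the new path.
    .flatMap((line) => line.slice(3).split(' -> '));
}

function treeIsCleanEnough(porcelain: string): boolean {
  // An empty status trivially passes; otherwise every path must be inside the skill directory.
  return changedPaths(porcelain).every((p) => p.startsWith(SELF_REFRESH_PREFIX));
}
```

Checking both sides of a rename is a deliberately conservative choice here: a file moved into or out of the skill directory still counts as an unrelated change.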
@@ -256,6 +265,16 @@ If any channels/providers are installed AND `upstream/channels` or `upstream/pro

 If no channels/providers are installed, skip silently.

+Proceed to Step 7.9.
+
+# Step 7.9: Stamp the upgrade marker (required)
+After validation has **succeeded**, record that this install reached the new version through the supported path. Without this, the startup tripwire stops the host on its next start.
+
+- `pnpm exec tsx scripts/upgrade-state.ts set "" update-nanoclaw`
+- The empty version argument stamps the current `package.json` version.
+
+If validation did NOT succeed, do not stamp — leave the tripwire to catch the broken state.
+
 Proceed to Step 8.

 # Step 8: Summary + rollback instructions
@@ -18,12 +18,20 @@ jobs:

       - uses: actions/checkout@v4
         with:
+          fetch-depth: 0
           token: ${{ steps.app-token.outputs.token }}

       - uses: pnpm/action-setup@v4

       - name: Bump patch version
         run: |
+          # Skip the auto-bump when the pushed commits already changed the
+          # version themselves (e.g. a release PR that set a minor/major).
+          # Otherwise the bot would patch a deliberate 2.1.0 up to 2.1.1.
+          if git diff --name-only "${{ github.event.before }}" "${{ github.sha }}" | grep -qx 'package.json'; then
+            echo "package.json already changed in this push; skipping auto-bump."
+            exit 0
+          fi
           pnpm version patch --no-git-tag-version
           git add package.json
           git diff --cached --quiet && exit 0
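The guard added in the workflow hunk above skips the bump whenever the root `package.json` appears in the push's changed-file list, which is slightly broader than "the version changed" (a dependency-only edit would also skip it). The sketch below restates that guard and, as an assumption-labelled alternative, shows what a stricter version-field comparison could look like. Both function names are hypothetical, and `changedFiles` stands in for the output of `git diff --name-only <before> <after>`.

```typescript
// Sketch only: TypeScript restatement of the shell guard, plus a stricter variant.

// Equivalent of `grep -qx 'package.json'`: -x matches whole lines, so only the
// root package.json counts, not nested/package.json or package.json.bak.
function shouldSkipAutoBump(changedFiles: string[]): boolean {
  return changedFiles.includes('package.json');
}

// Hypothetical stricter check: compare the version fields of the two file contents.
function versionChanged(beforeJson: string, afterJson: string): boolean {
  const before = JSON.parse(beforeJson) as { version?: string };
  const after = JSON.parse(afterJson) as { version?: string };
  return before.version !== after.version;
}
```

The workflow as shown uses the simpler file-level check; the second function is only an illustration of the trade-off.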
@@ -2,6 +2,10 @@

 All notable changes to NanoClaw will be documented in this file.

+## [2.1.0] - 2026-06-07
+
+- [BREAKING] **Startup now requires an upgrade marker.** The host refuses to boot unless `data/upgrade-state.json` records that this install reached the current version through a sanctioned path (`/setup`, `/update-nanoclaw`, `/migrate-nanoclaw`). After this update completes — and before restarting the service — stamp the marker by running `pnpm exec tsx scripts/upgrade-state.ts set`. If the host has already tripped on restart with "update did not go through the supported path", that same command clears it. See [docs/upgrade-recovery.md](docs/upgrade-recovery.md).
+
 ## [2.0.64] - 2026-05-18

 - **`ncl destinations add` and `remove` through the approval flow now reach the receiver immediately.** Approved destinations weren't being projected into the receiving agent's local session state, so a freshly-added destination silently failed at `send_message` with `unknown destination`, and a removed destination stayed resolvable until the next container restart. Both now take effect the moment the approval executes. Direct (non-approval) calls were unaffected.
@@ -153,31 +153,17 @@ Key files: `src/container-restart.ts`, `src/container-runner.ts` (`killContainer

 API keys, OAuth tokens, and auth credentials are managed by the OneCLI gateway. Secrets are injected into per-agent containers at request time — none are passed in env vars or through chat context. The container agent sees this via the `onecli-gateway` container skill (`container/skills/onecli-gateway/SKILL.md`), which teaches it how the proxy works, how to handle auth errors, and to never ask for raw credentials. Host-side wiring: `src/onecli-approvals.ts`, `ensureAgent()` in `container-runner.ts`. Run `onecli --help`.

-### Gotcha: auto-created agents start in `selective` secret mode
+### Secret modes

-When the host first spawns a session for a new agent group, `container-runner.ts:385` calls `onecli.ensureAgent({ name, identifier })`. The OneCLI `POST /api/agents` endpoint creates the agent in **`selective`** secret mode — meaning **no secrets are assigned to it by default**, even if the secrets exist in the vault and have host patterns that would otherwise match.
-
-Symptom: container starts, the proxy + CA cert are wired correctly, but the agent gets `401 Unauthorized` (or similar) from APIs whose credentials *are* in the vault. The credential just isn't in this agent's allow-list.
-
-The SDK does not expose `setSecretMode` — the only fix is the CLI (or the web UI at `http://127.0.0.1:10254`).
+Auto-created agents default to `all` secret mode — every vault secret whose host pattern matches is injected automatically, so the common case needs no per-agent setup. If an agent is in `selective` mode it gets no secrets until you assign them, which shows up as a `401` from an API whose credential *is* in the vault. The SDK can't change this; use the CLI (or the web UI at `http://127.0.0.1:10254`):

 ```bash
-# Find the agent (identifier is the agent group id)
-onecli agents list
-
-# Flip to "all" so every vault secret with a matching host pattern gets injected
-onecli agents set-secret-mode --id <agent-id> --mode all
-
-# Or, stay selective and assign specific secrets
-onecli secrets list # find secret ids
-onecli agents set-secrets --id <agent-id> --secret-ids <id1>,<id2>
-
-# Inspect what an agent currently has
-onecli agents secrets --id <agent-id> # secrets assigned to this agent
-onecli secrets list # all vault secrets (with host patterns)
+onecli agents list # check secretMode
+onecli agents set-secret-mode --id <agent-id> --mode all # inject all matching secrets
+onecli agents set-secrets --id <agent-id> --secret-ids ... # or stay selective, assign specific ones
 ```

-If you've just enabled `mode all`, no container restart is needed — the gateway looks up secrets per request, so the next API call from the running container will see the new credentials.
+No container restart needed — the gateway looks up secrets per request.

 ### Requiring approval for credential use

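To make the difference between the two secret modes in the hunk above concrete, here is a small model of a per-request lookup. Everything in it is an illustrative assumption rather than the OneCLI implementation: the `Agent` and `Secret` shapes, the field names, and especially the host-matching rule (exact host, or `*.domain` for subdomains).

```typescript
// Sketch only: a toy model of `all` vs `selective` secret modes.

type SecretMode = 'all' | 'selective';

interface Secret {
  id: string;
  hostPattern: string;
}

interface Agent {
  id: string;
  secretMode: SecretMode;
  assignedSecretIds: string[];
}

// Assumed matching rule: exact host, or "*.example.com" for any subdomain.
function hostMatches(pattern: string, host: string): boolean {
  if (pattern.startsWith('*.')) return host.endsWith(pattern.slice(1));
  return pattern === host;
}

// Evaluated on every request, which is why a mode change needs no container restart.
function secretsForRequest(agent: Agent, vault: Secret[], host: string): Secret[] {
  const candidates = vault.filter((s) => hostMatches(s.hostPattern, host));
  if (agent.secretMode === 'all') return candidates;
  return candidates.filter((s) => agent.assignedSecretIds.includes(s.id));
}
```

In this model a `selective` agent with an empty assignment list gets nothing even when the vault holds a matching secret, which corresponds to the `401` symptom described in the text.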
@@ -11,7 +11,7 @@ import { TIMEZONE, formatLocalTime } from './timezone.js';
  */
 export type CommandCategory = 'admin' | 'filtered' | 'passthrough' | 'none';

-const ADMIN_COMMANDS = new Set(['/remote-control', '/clear', '/compact', '/context', '/cost', '/files']);
+const ADMIN_COMMANDS = new Set(['/remote-control', '/clear', '/compact', '/context', '/cost', '/files', '/upload-trace']);
 const FILTERED_COMMANDS = new Set(['/help', '/login', '/logout', '/doctor', '/config', '/start']);

 export interface CommandInfo {
@@ -13,6 +13,7 @@ import {
   stripInternalTags,
   type RoutingContext,
 } from './formatter.js';
+import { isUploadTraceCommand, uploadTrace } from './upload-trace.js';
 import type { AgentProvider, AgentQuery, ProviderEvent } from './providers/types.js';

 const POLL_INTERVAL_MS = 1000;
@@ -161,6 +162,19 @@ export async function runPollLoop(config: PollLoopConfig): Promise<void> {
           commandIds.push(msg.id);
           continue;
         }
+        if ((msg.kind === 'chat' || msg.kind === 'chat-sdk') && isUploadTraceCommand(msg)) {
+          log('Uploading session trace to Hugging Face');
+          writeMessageOut({
+            id: generateId(),
+            kind: 'chat',
+            platform_id: routing.platformId,
+            channel_type: routing.channelType,
+            thread_id: routing.threadId,
+            content: JSON.stringify({ text: uploadTrace() }),
+          });
+          commandIds.push(msg.id);
+          continue;
+        }
         normalMessages.push(msg);
       }

@@ -0,0 +1,84 @@
+import { describe, it, expect, beforeEach, afterEach } from 'bun:test';
+
+import { initTestSessionDb, closeSessionDb, getInboundDb } from './db/connection.js';
+import { getUndeliveredMessages } from './db/messages-out.js';
+import { getPendingMessages } from './db/messages-in.js';
+import type { MessageInRow } from './db/messages-in.js';
+import { MockProvider } from './providers/mock.js';
+import { runPollLoop } from './poll-loop.js';
+import { isUploadTraceCommand } from './upload-trace.js';
+
+beforeEach(() => {
+  initTestSessionDb();
+});
+
+afterEach(() => {
+  closeSessionDb();
+});
+
+describe('isUploadTraceCommand', () => {
+  const make = (text: unknown) => ({ content: JSON.stringify({ text }) }) as MessageInRow;
+
+  it('matches /upload-trace (case-insensitive, with args)', () => {
+    expect(isUploadTraceCommand(make('/upload-trace'))).toBe(true);
+    expect(isUploadTraceCommand(make('/UPLOAD-TRACE'))).toBe(true);
+    expect(isUploadTraceCommand(make(' /upload-trace now '))).toBe(true);
+  });
+
+  it('does not match other text or commands', () => {
+    expect(isUploadTraceCommand(make('hello'))).toBe(false);
+    expect(isUploadTraceCommand(make('/upload'))).toBe(false);
+    expect(isUploadTraceCommand(make('/clear'))).toBe(false);
+    expect(isUploadTraceCommand({ content: 'not json' } as MessageInRow)).toBe(false);
+  });
+});
+
+describe('poll loop — /upload-trace command', () => {
+  it('handles the command in the runner, writes a status, skips query', async () => {
+    getInboundDb()
+      .prepare(
+        `INSERT INTO messages_in (id, kind, timestamp, status, platform_id, channel_type, content)
+         VALUES ('m-upload-trace', 'chat', datetime('now'), 'pending', 'chan-1', 'discord', ?)`,
+      )
+      .run(JSON.stringify({ text: '/upload-trace' }));
+
+    // If the provider were ever queried it would emit this — asserting its
+    // absence proves the runner intercepted /upload-trace instead of the LLM.
+    const provider = new MockProvider({}, () => '<message to="discord-test">should not run</message>');
+    const controller = new AbortController();
+    const loopPromise = runPollLoopWithTimeout(provider, controller.signal, 5000);
+
+    await waitFor(() => getUndeliveredMessages().length > 0, 5000);
+    controller.abort();
+
+    const out = getUndeliveredMessages();
+    expect(out).toHaveLength(1);
+    // A status line from uploadTrace() — never the provider's reply.
+    const text = JSON.parse(out[0].content).text as string;
+    expect(text.length).toBeGreaterThan(0);
+    expect(text).not.toBe('should not run');
+
+    // Command message was completed (not left pending).
+    expect(getPendingMessages()).toHaveLength(0);
+
+    await loopPromise.catch(() => {});
+  });
+});
+
+async function runPollLoopWithTimeout(provider: MockProvider, signal: AbortSignal, timeoutMs: number): Promise<void> {
+  return Promise.race([
+    runPollLoop({ provider, providerName: 'mock', cwd: '/tmp' }),
+    new Promise<void>((_, reject) => {
+      signal.addEventListener('abort', () => reject(new Error('aborted')));
+    }),
+    new Promise<void>((_, reject) => setTimeout(() => reject(new Error('timeout')), timeoutMs)),
+  ]);
+}
+
+async function waitFor(condition: () => boolean, timeoutMs: number): Promise<void> {
+  const start = Date.now();
+  while (!condition()) {
+    if (Date.now() - start > timeoutMs) throw new Error('waitFor timeout');
+    await new Promise((resolve) => setTimeout(resolve, 50));
+  }
+}
@@ -0,0 +1,172 @@
+import { spawnSync } from 'node:child_process';
+import fs from 'node:fs';
+import os from 'node:os';
+import path from 'node:path';
+
+import type { MessageInRow } from './db/messages-in.js';
+
+/**
+ * `/upload-trace` command: upload this session's Claude Code transcript to the user's
+ * own private `{hf_user}/nanoclaw-traces` dataset, browsable in the HF Agent
+ * Trace Viewer. The transcript the Claude provider keeps under
+ * `~/.claude/projects/<dir>/<sessionId>.jsonl` is already in the format the
+ * viewer auto-detects, so this just locates the newest one and pushes it.
+ *
+ * Auth is the OneCLI gateway's job: curl goes out through the injected
+ * HTTPS_PROXY, which adds the user's HF token. We never see the raw token, and
+ * a 401 from `whoami` is our "not signed in" signal.
+ */
+
+/**
+ * Narrow check for /upload-trace — the runner handles this command directly
+ * (no LLM turn). Admin-gated by the host router before it reaches the container.
+ */
+export function isUploadTraceCommand(msg: MessageInRow): boolean {
+  let text = '';
+  try {
+    text = (JSON.parse(msg.content)?.text ?? '').trim();
+  } catch {
+    return false; // non-JSON content is never a command
+  }
+  return text.toLowerCase().startsWith('/upload-trace');
+}
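One property of the `startsWith('/upload-trace')` check above is that it also accepts longer tokens such as `/upload-tracer`. If that ever matters, a stricter variant could compare only the first whitespace-delimited token. The helper below is a hypothetical sketch, not part of the commit; it takes the raw JSON `content` string rather than a `MessageInRow` so that it stays self-contained.

```typescript
// Sketch only: exact-token command matching for a JSON message body of the form {"text": "..."}.

function isExactCommand(content: string, command: string): boolean {
  let text: unknown;
  try {
    text = JSON.parse(content)?.text;
  } catch {
    return false; // non-JSON content is never a command
  }
  if (typeof text !== 'string') return false;
  // Compare only the first token, case-insensitively, so arguments are still allowed.
  const firstToken = text.trim().split(/\s+/)[0] ?? '';
  return firstToken.toLowerCase() === command;
}
```

Arguments after the command are still accepted, so the behaviour covered by the existing tests (case-insensitive, with args) would be preserved under this variant.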
+
+/** Newest Claude Code transcript jsonl (the current session). */
+function newestTranscript(): string | null {
+  const projects = path.join(os.homedir(), '.claude', 'projects');
+  let best: { p: string; m: number } | null = null;
+  let dirs: string[];
+  try {
+    dirs = fs.readdirSync(projects);
+  } catch {
+    return null;
+  }
+  for (const dir of dirs) {
+    let files: string[];
+    try {
+      files = fs.readdirSync(path.join(projects, dir));
+    } catch {
+      continue;
+    }
+    for (const f of files) {
+      if (!f.endsWith('.jsonl')) continue;
+      const p = path.join(projects, dir, f);
+      const m = fs.statSync(p).mtimeMs;
+      if (!best || m > best.m) best = { p, m };
+    }
+  }
+  return best?.p ?? null;
+}
+
+function curl(args: string[], input?: string): { ok: boolean; out: string } {
+  const r = spawnSync('curl', args, { input, encoding: 'utf-8' });
+  return { ok: r.status === 0, out: (r.stdout ?? '') + (r.stderr ?? '') };
+}
+
+/**
+ * Setup instructions for when whoami fails. `body` is the gateway's error
+ * JSON (when the request was proxied through OneCLI). We surface the URL it
+ * hands back — `secret_url` for an unknown host (HF's case), `connect_url`
+ * for an OAuth app, `manage_url` when the secret exists but this agent lacks
+ * access — so the link always points at the right gateway (local or hosted).
+ */
+function notSignedInMessage(body: string): string {
+  let setupUrl: string | undefined;
+  try {
+    const e = JSON.parse(body) as { secret_url?: string; connect_url?: string; manage_url?: string };
+    if (e.secret_url) {
+      // The pre-filled `path` defaults to the failing request path
+      // (/api/whoami-v2), which scopes the secret to that one endpoint. Blank
+      // it so the secret matches all of huggingface.co — the upload endpoints
+      // included, not just whoami.
+      setupUrl = e.secret_url.replace(/([?&]path=)[^&]*/, '$1');
+    } else {
+      setupUrl = e.connect_url ?? e.manage_url;
+    }
+  } catch {
+    /* non-JSON body (e.g. HF's own error, or no gateway) — generic fallback */
+  }
+  const lines = [
+    "Can't upload — no Hugging Face token is available to this agent. To set it up:",
+    '',
+    '1. Create a token with WRITE access at https://huggingface.co/settings/tokens',
+    ' (New token → type "Write" → copy it).',
+    '',
+    setupUrl
+      ? `2. Add it to OneCLI here: ${setupUrl}`
+      : '2. Add it to the OneCLI vault as a secret with host pattern huggingface.co',
+    '',
+    'Then run /upload-trace again.',
+  ];
+  return lines.join('\n');
+}
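The `path=` blanking in `notSignedInMessage` is easiest to follow on a concrete URL. The sketch below isolates the same regular expression in a standalone function; the function name is an assumption, and the gateway URLs used to exercise it are made up, since the real shape of `secret_url` is not shown in this diff.

```typescript
// Sketch only: blank the value of the `path` query parameter while keeping the key,
// so a pre-filled secret form is no longer scoped to a single endpoint.

function blankPathParam(secretUrl: string): string {
  // Group 1 captures "?path=" or "&path="; the value up to the next "&" is dropped.
  return secretUrl.replace(/([?&]path=)[^&]*/, '$1');
}
```

Because the regex has no `g` flag, only the first `path` parameter is rewritten, and a URL without one is returned unchanged.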
+
+/** Returns a user-facing status line. Never throws. */
+export function uploadTrace(): string {
+  const file = newestTranscript();
+  if (!file) return 'No transcript to upload for this session yet.';
+
+  // whoami, capturing the body + HTTP status (no -f, so the gateway's error
+  // JSON survives a 401). When no token is available the OneCLI gateway
+  // returns a setup URL pre-filled for *this* gateway — so we never hardcode
+  // local-vs-hosted dashboard links, and never have to know which it is.
+  const who = curl(['-s', '-w', '\n%{http_code}', 'https://huggingface.co/api/whoami-v2']);
+  const nl = who.out.lastIndexOf('\n');
+  const body = nl === -1 ? '' : who.out.slice(0, nl);
+  const status = nl === -1 ? who.out.trim() : who.out.slice(nl + 1).trim();
+
+  if (status !== '200') {
+    return notSignedInMessage(body);
+  }
+  let user: string | undefined;
+  try {
+    user = JSON.parse(body)?.name;
+  } catch {
+    /* fall through */
+  }
+  if (!user) return 'Could not resolve your Hugging Face username.';
+
+  const repo = `${user}/nanoclaw-traces`;
+  // Idempotent create — ignore failure (already exists / no-op). The
+  // Content-Type header is required: without it curl sends form-encoding and
+  // the Hub rejects the body with 400 (expected string at "name").
+  curl([
+    '-sf',
+    '-X',
+    'POST',
+    'https://huggingface.co/api/repos/create',
+    '-H',
+    'Content-Type: application/json',
+    '-d',
+    JSON.stringify({ type: 'dataset', name: 'nanoclaw-traces', private: true }),
+  ]);
+
+  const content = fs.readFileSync(file).toString('base64');
+  const repoPath = `sessions/${path.basename(file)}`;
+  const ndjson =
+    JSON.stringify({ key: 'header', value: { summary: 'add session trace' } }) +
+    '\n' +
+    JSON.stringify({
+      key: 'file',
+      value: { path: repoPath, encoding: 'base64', content },
+    }) +
+    '\n';
+
+  const commit = curl(
+    [
+      '-sf',
+      '-X',
+      'POST',
+      `https://huggingface.co/api/datasets/${repo}/commit/main`,
+      '-H',
+      'Content-Type: application/x-ndjson',
+      '--data-binary',
+      '@-',
+    ],
+    ndjson,
+  );
+  if (!commit.ok) {
+    return 'Upload to Hugging Face failed (the transcript may be too large for an inline commit).';
+  }
+  return `Uploaded → https://huggingface.co/datasets/${repo}/blob/main/${repoPath}`;
+}
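`uploadTrace()` relies on `curl -s -w '\n%{http_code}'`, which appends the HTTP status on its own final line after the response body. The sketch below isolates that split in a pure function so the edge cases are visible; the function and type names are hypothetical and not part of the commit.

```typescript
// Sketch only: separate the response body from the status code that
// `curl -w '\n%{http_code}'` appends as the last line of its output.

interface CurlResponse {
  body: string;
  status: string;
}

function splitCurlOutput(out: string): CurlResponse {
  const nl = out.lastIndexOf('\n');
  // No newline at all: treat the whole output as the status and the body as empty.
  if (nl === -1) return { body: '', status: out.trim() };
  return { body: out.slice(0, nl), status: out.slice(nl + 1).trim() };
}
```

Splitting on the last newline rather than the first keeps multi-line bodies intact, which matters when the gateway returns pretty-printed error JSON.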
@@ -0,0 +1,51 @@
+# Recovering from the upgrade tripwire
+
+If NanoClaw refuses to start with a message like *"update did not go through the supported path"*, this page explains what happened and how to clear it.
+
+## What happened
+
+NanoClaw records the version it reached each time you upgrade through a supported path — `/setup`, `/update-nanoclaw`, or `/migrate-nanoclaw`. That record lives in `data/upgrade-state.json`.
+
+At startup the host checks that record against the running code. If it's **missing** or its version **doesn't match** the code, the host stops. This almost always means the code was updated by a raw `git pull` instead of the supported flow — so migrations, dependency installs, or container rebuilds that the flow would have run may not have happened.
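The startup check described above reduces to two conditions: the marker is missing, or its version differs from the code version. A minimal sketch follows. The `UpgradeState` fields follow the names mentioned in the test file (`version`, `via`, `updatedAt`), but the function name and the result type are assumptions, since `src/upgrade-state.ts` itself is not shown in this diff.

```typescript
// Sketch only: the decision behind the upgrade tripwire, as a pure function.

interface UpgradeState {
  version: string;
  via: string;
  updatedAt: string;
}

type TripwireResult = { ok: true } | { ok: false; reason: 'missing' | 'mismatch' };

function checkTripwire(state: UpgradeState | null, codeVersion: string): TripwireResult {
  if (state === null) return { ok: false, reason: 'missing' };
  if (state.version !== codeVersion) return { ok: false, reason: 'mismatch' };
  return { ok: true };
}
```

Note that the comparison is plain equality: a marker stamped for an older version trips just like an absent marker, which is why the stamp has to be refreshed after every upgrade.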
+
+## If you just ran a supported upgrade
+
+If you reached this point by running `/update-nanoclaw`, `/migrate-nanoclaw`, or `/setup` and it **completed successfully**, this is expected the first time an existing install meets the tripwire (your previous version predated it). Clear it by stamping the current version:
+
+```bash
+pnpm exec tsx scripts/upgrade-state.ts set
+```
+
+Then restart NanoClaw. It won't trip again.
+
+**Coding agents:** if the upgrade you just ran reported success, run the command above automatically and restart. Only stop to ask a human if the upgrade itself failed or you can't confirm it finished.
+
+## If you ran `git pull` yourself
+
+Don't just clear the tripwire — that skips the work the supported flow does. Instead, run the supported upgrade so backups, migrations, and rebuilds happen:
+
+```
+/update-nanoclaw
+```
+
+Once it finishes it stamps the marker for you, and the next start is clean.
+
+## If you have your own upgrade flow
+
+If you've built your own way to upgrade — a custom skill, a deploy script, a CI job, a service that pulls and restarts — it won't stamp the marker, so the host will trip on the next start. Add the stamp as the **last step** of that flow, after the upgrade succeeds and before the restart:
+
+```bash
+pnpm exec tsx scripts/upgrade-state.ts set
+```
+
+That's the same thing `/setup`, `/update-nanoclaw`, and `/migrate-nanoclaw` do at the end. Do it only when the upgrade actually completed — the marker is your assertion that this install reached the current version through a path you trust.
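For a custom flow, the ordering is the important part: stamp only after every upgrade step succeeded, and restart only after the stamp. The sketch below shows that ordering with an injected command runner so nothing is executed for real. The `Runner` type, the function name, and the `custom-deploy` label passed as the `via` argument are all hypothetical; the upgrade steps and the restart command are supplied by the caller.

```typescript
// Sketch only: ordering of a custom upgrade flow around the marker stamp.

// Runs a shell command and reports whether it succeeded (injected for testability).
type Runner = (command: string) => boolean;

const STAMP_COMMAND = 'pnpm exec tsx scripts/upgrade-state.ts set "" custom-deploy';

function customUpgrade(run: Runner, upgradeSteps: string[], restartCommand: string): boolean {
  for (const step of upgradeSteps) {
    // Any failure aborts before the stamp, leaving the tripwire armed.
    if (!run(step)) return false;
  }
  if (!run(STAMP_COMMAND)) return false;
  return run(restartCommand);
}
```

If a step fails, the function returns without stamping or restarting, so the next start still trips and the broken state is not silently accepted.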
+
+## The override
+
+`pnpm exec tsx scripts/upgrade-state.ts set` is the override: it declares "this install is good at the current version." Use it when you know the install is actually in a good state (e.g. you completed the steps manually). It's safe to re-run.
+
+To inspect the current marker:
+
+```bash
+pnpm exec tsx scripts/upgrade-state.ts get
+```
@@ -1,6 +1,6 @@
 {
   "name": "nanoclaw",
-  "version": "2.0.70",
+  "version": "2.1.0",
   "description": "Personal Claude assistant. Lightweight, secure, customizable.",
   "type": "module",
   "packageManager": "pnpm@10.33.0",
@@ -1,5 +1,5 @@
-<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="90" height="20" role="img" aria-label="179k tokens, 89% of context window">
-  <title>179k tokens, 89% of context window</title>
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="90" height="20" role="img" aria-label="181k tokens, 91% of context window">
+  <title>181k tokens, 91% of context window</title>
   <linearGradient id="s" x2="0" y2="100%">
     <stop offset="0" stop-color="#bbb" stop-opacity=".1"/>
     <stop offset="1" stop-opacity=".1"/>
@@ -15,8 +15,8 @@
       <g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" font-size="11">
         <text aria-hidden="true" x="26" y="15" fill="#010101" fill-opacity=".3">tokens</text>
         <text x="26" y="14">tokens</text>
-        <text aria-hidden="true" x="71" y="15" fill="#010101" fill-opacity=".3">179k</text>
-        <text x="71" y="14">179k</text>
+        <text aria-hidden="true" x="71" y="15" fill="#010101" fill-opacity=".3">181k</text>
+        <text x="71" y="14">181k</text>
       </g>
     </g>
   </a>
@@ -0,0 +1,26 @@
+/**
+ * scripts/upgrade-state.ts — read or stamp the upgrade marker.
+ *
+ * Usage:
+ * pnpm exec tsx scripts/upgrade-state.ts get
+ * pnpm exec tsx scripts/upgrade-state.ts set [version] [via]
+ *
+ * `set` with no version stamps the current package.json version. The
+ * sanctioned upgrade paths (setup / update / migrate) call `set` on
+ * success; running it by hand is also the documented way to clear the
+ * startup tripwire — see docs/upgrade-recovery.md.
+ */
+import { getCodeVersion, markerPath, readUpgradeState, writeUpgradeState } from '../src/upgrade-state.js';
+
+const [, , cmd, versionArg, viaArg] = process.argv;
+
+if (cmd === 'get') {
+  const state = readUpgradeState();
+  console.log(state ? JSON.stringify(state) : 'none');
+} else if (cmd === 'set') {
+  const state = writeUpgradeState({ version: versionArg || getCodeVersion(), via: viaArg || 'manual' });
+  console.log(`Stamped ${markerPath()}: ${JSON.stringify(state)}`);
+} else {
+  console.error('Usage: pnpm exec tsx scripts/upgrade-state.ts get | set [version] [via]');
+  process.exit(2);
+}
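The skills call this script as `set "" update-nanoclaw`, passing an empty string as the version. That works because the script uses `||` rather than `??`: an empty string is falsy, so it falls back to the code version. The sketch below reproduces only that argument handling; `resolveStampArgs` is a hypothetical name and `codeVersion` stands in for the result of `getCodeVersion()`.

```typescript
// Sketch only: how `set [version] [via]` resolves its arguments.

function resolveStampArgs(argv: string[], codeVersion: string): { version: string; via: string } {
  // argv[0] is the runtime, argv[1] the script path, argv[2] the subcommand.
  const [, , , versionArg, viaArg] = argv;
  // `||` treats both undefined and '' as "not provided".
  return { version: versionArg || codeVersion, via: viaArg || 'manual' };
}
```

With `??` instead of `||`, the empty string would be stamped literally as the version, and the marker would never match the code.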
@@ -11,6 +11,7 @@ import path from 'path';

 import { log } from '../src/log.js';
 import { getLaunchdLabel, getSystemdUnit } from '../src/install-slug.js';
+import { writeUpgradeState } from '../src/upgrade-state.js';
 import { cleanupUnhealthyPeers } from './peer-cleanup.js';
 import {
   commandExists,
@@ -54,6 +55,11 @@ export async function run(_args: string[]): Promise<void> {

   fs.mkdirSync(path.join(projectRoot, 'logs'), { recursive: true });

+  // Stamp the upgrade marker before the host first starts, so the startup
+  // tripwire (enforceUpgradeTripwire) sees this as a sanctioned install.
+  const stamped = writeUpgradeState({ via: 'setup' });
+  log.info('Stamped upgrade marker', { version: stamped.version });
+
   // Peer preflight — a crash-looping peer install (most often the legacy v1
   // `com.nanoclaw` plist) will keep trashing this install's containers on
   // every respawn via its own cleanupOrphans. Detect and unload any peer
@@ -12,7 +12,7 @@ import { getDb, hasTable } from './db/connection.js';
 export type GateResult = { action: 'pass' } | { action: 'filter' } | { action: 'deny'; command: string };

 const FILTERED_COMMANDS = new Set(['/help', '/login', '/logout', '/doctor', '/config', '/remote-control']);
-const ADMIN_COMMANDS = new Set(['/clear', '/compact', '/context', '/cost', '/files']);
+const ADMIN_COMMANDS = new Set(['/clear', '/compact', '/context', '/cost', '/files', '/upload-trace']);

 /**
  * Classify a message and decide whether it should reach the container.
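The classifier that consumes these two sets is outside this hunk, so the following is only a guess at its shape, written to show what adding `/upload-trace` to `ADMIN_COMMANDS` would plausibly change. The assumed policy (filtered commands are dropped, admin commands are denied for non-admin senders, everything else passes) and the `gateCommand` signature are illustrative, not taken from the source.

```typescript
// Sketch only: a possible gate built on the two command sets from the hunk above.

type GateResult = { action: 'pass' } | { action: 'filter' } | { action: 'deny'; command: string };

const FILTERED_COMMANDS = new Set(['/help', '/login', '/logout', '/doctor', '/config', '/remote-control']);
const ADMIN_COMMANDS = new Set(['/clear', '/compact', '/context', '/cost', '/files', '/upload-trace']);

function gateCommand(text: string, senderIsAdmin: boolean): GateResult {
  // Only the first token is the command; anything after it is an argument.
  const command = (text.trim().split(/\s+/)[0] ?? '').toLowerCase();
  if (FILTERED_COMMANDS.has(command)) return { action: 'filter' };
  if (ADMIN_COMMANDS.has(command) && !senderIsAdmin) return { action: 'deny', command };
  return { action: 'pass' };
}
```

Under this assumed policy, a non-admin `/upload-trace` never reaches the container, which is consistent with the "admin-gated by the host router" note in `upload-trace.ts`.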
@@ -17,6 +17,7 @@ import { startActiveDeliveryPoll, startSweepDeliveryPoll, setDeliveryAdapter, st
 import { startHostSweep, stopHostSweep } from './host-sweep.js';
 import { routeInbound } from './router.js';
 import { log } from './log.js';
+import { enforceUpgradeTripwire } from './upgrade-state.js';

 // Response + shutdown registries live in response-registry.ts to break the
 // circular import cycle: src/index.ts imports src/modules/index.js for side
@@ -69,6 +70,10 @@ async function main(): Promise<void> {
   // 0. Circuit breaker — backoff on rapid restarts
   await enforceStartupBackoff();

+  // 0.5 Upgrade tripwire — refuse to start if this install was updated
+  // outside the sanctioned path (raw `git pull` instead of /update-nanoclaw).
+  enforceUpgradeTripwire();
+
   // 1. Init central DB
   const dbPath = path.join(DATA_DIR, 'v2.db');
   const db = initDb(dbPath);
@@ -0,0 +1,90 @@
import fs from 'fs';
import path from 'path';

import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest';

vi.mock('./config.js', async () => {
  const actual = await vi.importActual<typeof import('./config.js')>('./config.js');
  return { ...actual, DATA_DIR: '/tmp/nanoclaw-test-upgrade-state' };
});

const TEST_DIR = '/tmp/nanoclaw-test-upgrade-state';

import {
  enforceUpgradeTripwire,
  getCodeVersion,
  isUpgradeCurrent,
  markerPath,
  readUpgradeState,
  writeUpgradeState,
} from './upgrade-state.js';

beforeEach(() => {
  fs.rmSync(TEST_DIR, { recursive: true, force: true });
});
afterEach(() => {
  fs.rmSync(TEST_DIR, { recursive: true, force: true });
});

describe('upgrade-state', () => {
  it('getCodeVersion reads the package.json version', () => {
    const pkg = JSON.parse(fs.readFileSync(path.join(process.cwd(), 'package.json'), 'utf8'));
    expect(getCodeVersion()).toBe(pkg.version);
  });

  it('readUpgradeState returns null when the marker is absent', () => {
    expect(readUpgradeState()).toBeNull();
  });

  it('write then read round-trips, with version/via/updatedAt', () => {
    const written = writeUpgradeState({ version: '9.9.9', via: 'test' });
    expect(written).toMatchObject({ version: '9.9.9', via: 'test' });
    expect(written.updatedAt).toBeTruthy();
    expect(readUpgradeState()).toEqual(written);
  });

  it('write defaults the version to the code version', () => {
    expect(writeUpgradeState({ via: 'test' }).version).toBe(getCodeVersion());
  });

  it('isUpgradeCurrent: false when absent, false on mismatch, true on match', () => {
    expect(isUpgradeCurrent()).toBe(false);
    writeUpgradeState({ version: '0.0.0-nope', via: 'test' });
    expect(isUpgradeCurrent()).toBe(false);
    writeUpgradeState({ version: getCodeVersion(), via: 'test' });
    expect(isUpgradeCurrent()).toBe(true);
  });

  it('treats a corrupt marker as absent (fails closed, never throws)', () => {
    fs.mkdirSync(TEST_DIR, { recursive: true });
    fs.writeFileSync(path.join(TEST_DIR, 'upgrade-state.json'), '{ this is not json');
    expect(() => readUpgradeState()).not.toThrow();
    expect(readUpgradeState()).toBeNull();
    expect(isUpgradeCurrent()).toBe(false);
  });

  it('markerPath is upgrade-state.json under the data dir', () => {
    expect(markerPath()).toBe(path.join(TEST_DIR, 'upgrade-state.json'));
  });

  it('enforceUpgradeTripwire exits when not current and passes when current', () => {
    const exitSpy = vi.spyOn(process, 'exit').mockImplementation(((code?: number) => {
      throw new Error(`exit:${code}`);
    }) as never);
    const errSpy = vi.spyOn(console, 'error').mockImplementation(() => {});

    // No marker → trips.
    expect(() => enforceUpgradeTripwire()).toThrow('exit:1');

    // Stale marker → trips.
    writeUpgradeState({ version: '0.0.0-nope', via: 'test' });
    expect(() => enforceUpgradeTripwire()).toThrow('exit:1');

    // Matching marker → passes.
    writeUpgradeState({ version: getCodeVersion(), via: 'test' });
    expect(() => enforceUpgradeTripwire()).not.toThrow();

    exitSpy.mockRestore();
    errSpy.mockRestore();
  });
});
@@ -0,0 +1,126 @@
/**
 * Upgrade marker — the record that an install reached its current version
 * through a sanctioned path (setup / `/update-nanoclaw` / `/migrate-nanoclaw`).
 *
 * The startup tripwire (enforceUpgradeTripwire) refuses to run if the marker
 * is missing or its version doesn't match the running code — i.e. if the
 * install was updated by a raw `git pull` instead of the supported flow.
 *
 * The marker lives in `data/` (gitignored), so a `git pull` can't touch it.
 * Only the sanctioned paths call writeUpgradeState(); clearing the tripwire
 * by hand is the same `set` — see docs/upgrade-recovery.md.
 */
import fs from 'fs';
import path from 'path';

import { DATA_DIR } from './config.js';
import { log } from './log.js';

export interface UpgradeState {
  version: string;
  updatedAt: string;
  via: string;
}

const MARKER_PATH = path.join(DATA_DIR, 'upgrade-state.json');
const FIX_COMMAND = 'pnpm exec tsx scripts/upgrade-state.ts set';

/** Version the running code declares, read from package.json. */
export function getCodeVersion(): string {
  const pkgPath = path.join(process.cwd(), 'package.json');
  const pkg = JSON.parse(fs.readFileSync(pkgPath, 'utf8')) as { version?: string };
  if (!pkg.version) throw new Error(`No version field in ${pkgPath}`);
  return pkg.version;
}

/**
 * Read the upgrade marker, or null if it's absent, unreadable, or corrupt.
 * Never throws — a boot gate must fail closed (treat anything it can't trust
 * as "no valid marker" → trip), not crash with a stack trace.
 */
export function readUpgradeState(): UpgradeState | null {
  let raw: string;
  try {
    raw = fs.readFileSync(MARKER_PATH, 'utf8');
  } catch (e: unknown) {
    if ((e as NodeJS.ErrnoException).code === 'ENOENT') return null;
    log.warn('Could not read upgrade marker; treating as absent', { path: MARKER_PATH, err: String(e) });
    return null;
  }
  try {
    return JSON.parse(raw) as UpgradeState;
  } catch {
    log.warn('Upgrade marker is corrupt; treating as absent', { path: MARKER_PATH });
    return null;
  }
}

/**
 * Stamp the marker. Only the sanctioned paths (setup / update / migrate)
 * call this on success; `version` defaults to the current code version.
 */
export function writeUpgradeState(opts: { version?: string; via: string }): UpgradeState {
  const state: UpgradeState = {
    version: opts.version ?? getCodeVersion(),
    updatedAt: new Date().toISOString(),
    via: opts.via,
  };
  fs.mkdirSync(DATA_DIR, { recursive: true });
  fs.writeFileSync(MARKER_PATH, JSON.stringify(state, null, 2) + '\n');
  return state;
}

/** True when the marker exists and matches the running code version. */
export function isUpgradeCurrent(): boolean {
  const state = readUpgradeState();
  return state !== null && state.version === getCodeVersion();
}

/** Absolute path to the marker file. */
export function markerPath(): string {
  return MARKER_PATH;
}

/**
 * Startup gate. If the install didn't reach the current version through a
 * sanctioned path, stop with a message written for the coding agent that
 * just ran the upgrade to act on automatically.
 */
export function enforceUpgradeTripwire(): void {
  if (isUpgradeCurrent()) return;

  const code = getCodeVersion();
  const recorded = readUpgradeState()?.version ?? 'none';

  console.error(
    [
      '',
      '='.repeat(64),
      'NanoClaw stopped: update did not go through the supported path',
      '='.repeat(64),
      'This install was updated outside the sanctioned upgrade flow',
      '(no upgrade marker, or the marker version does not match the code).',
      '',
      ` code version: ${code}`,
      ` recorded version: ${recorded}`,
      '',
      "IF YOU'RE A HUMAN: you most likely ran `git pull` directly. Don't",
      'fix this by hand — open Claude Code in this folder and run',
      '/update-nanoclaw, which repairs the install and clears this stop.',
      '',
      'CODING AGENT: if you just ran /update-nanoclaw, /migrate-nanoclaw, or',
      'setup and it completed successfully, clear this automatically by running:',
      '',
      ` ${FIX_COMMAND}`,
      '',
      'then restart NanoClaw. If the upgrade did NOT complete, stop and',
      'investigate instead of clearing.',
      '',
      'Details: docs/upgrade-recovery.md',
      '='.repeat(64),
      '',
    ].join('\n'),
  );
  log.error('Upgrade tripwire: install not on the sanctioned path', { code, recorded });
  process.exit(1);
}
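
The marker check in this file reduces to three cases: no marker, a marker recorded for a different version, and a marker that matches the running code. What follows is a minimal, self-contained sketch of those cases, not the repository's code. `stampMarker` and `isCurrent` are hypothetical stand-ins for `writeUpgradeState` and `isUpgradeCurrent`; they take the data directory and code version as explicit arguments instead of reading `DATA_DIR` and `package.json`, and the version strings are made up for illustration.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Same shape as the UpgradeState interface in the diff above.
interface UpgradeState {
  version: string;
  updatedAt: string;
  via: string;
}

// Hypothetical stand-in for writeUpgradeState(): stamps
// <dataDir>/upgrade-state.json with an explicit version.
function stampMarker(dataDir: string, version: string, via: string): UpgradeState {
  const state: UpgradeState = { version, updatedAt: new Date().toISOString(), via };
  fs.mkdirSync(dataDir, { recursive: true });
  fs.writeFileSync(path.join(dataDir, 'upgrade-state.json'), JSON.stringify(state, null, 2) + '\n');
  return state;
}

// Hypothetical stand-in for isUpgradeCurrent(): fails closed, so a missing,
// unreadable, or corrupt marker is treated the same as a stale one.
function isCurrent(dataDir: string, codeVersion: string): boolean {
  try {
    const raw = fs.readFileSync(path.join(dataDir, 'upgrade-state.json'), 'utf8');
    return (JSON.parse(raw) as UpgradeState).version === codeVersion;
  } catch {
    return false;
  }
}

// Walk through the three states using a throwaway directory.
const dataDir = fs.mkdtempSync(path.join(os.tmpdir(), 'upgrade-marker-'));
const codeVersion = '2.0.0'; // illustrative only

const missing = isCurrent(dataDir, codeVersion); // no marker yet

stampMarker(dataDir, '1.9.0', 'demo'); // marker from an older version
const stale = isCurrent(dataDir, codeVersion);

stampMarker(dataDir, codeVersion, 'demo'); // marker re-stamped by a sanctioned path
const current = isCurrent(dataDir, codeVersion);

// missing and stale are false; current is true.
console.log({ missing, stale, current });

fs.rmSync(dataDir, { recursive: true, force: true });
```

Only the last case would let `enforceUpgradeTripwire()` return normally; the other two lead to the stop message and `process.exit(1)`.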