mirror of
https://github.com/qwibitai/nanoclaw.git
synced 2026-06-12 18:11:51 +08:00
Compare commits
29 Commits
| Author | SHA1 | Date |
|---|---|---|
| | ad3dfad3f5 | |
| | 0bdc6d2bb2 | |
| | 820cd8ece6 | |
| | e44d497cdf | |
| | ac37ecbfd6 | |
| | c6627d32e2 | |
| | 51bf403b22 | |
| | 265953ffec | |
| | 6227bd1a5b | |
| | 28032bc0ec | |
| | 3e3a2945a5 | |
| | f3fc18e56e | |
| | d85efea229 | |
| | c5b22cb308 | |
| | 1592369201 | |
| | 6420c0e254 | |
| | aef8d38b36 | |
| | 6d6f813deb | |
| | f9c86d0af2 | |
| | 9edb33dd3a | |
| | 8ba5261ae8 | |
| | 8c84dec8e9 | |
| | 092487d7ad | |
| | 87850aa7f8 | |
| | 526170fd47 | |
| | 2d9375531b | |
| | e734e5cddd | |
| | 728c6a641b | |
| | 8385236c30 | |

@@ -28,6 +28,15 @@ Two phases: **Extract** (build the migration guide) and **Upgrade** (use it). If

---

# Phase 0: Refresh this skill first

The migration process itself evolves, so run its newest version before doing anything else:
- Ensure the `upstream` remote exists (default `https://github.com/nanocoai/nanoclaw.git`) and fetch: `git fetch upstream --prune`. Detect the upstream branch (`main` or `master`).
- Refresh this skill from upstream: `git checkout upstream/<branch> -- .claude/skills/migrate-nanoclaw/`
- Re-read `.claude/skills/migrate-nanoclaw/SKILL.md`. If it changed, **follow the updated version from the top** instead of this one.

This is the only working-tree change expected before the preflight check below; changes limited to `.claude/skills/migrate-nanoclaw/` are this self-refresh — ignore them in the 1.0 clean-tree check and proceed.

# Phase 1: Extract

## 1.0 Preflight

@@ -464,6 +473,11 @@ Point the branch at the upgraded state with `git reset --hard <upgrade-commit>`

Run `pnpm install && pnpm run build` in the main tree to confirm.

Stamp the upgrade marker (required — without it the startup tripwire stops the host on next start). Only do this after the build above succeeds:
```bash
pnpm exec tsx scripts/upgrade-state.ts set "" migrate-nanoclaw
```

Restart the service. Service labels are per-install — derive them from `setup/lib/install-slug.sh`:
```bash
source setup/lib/install-slug.sh
```

@@ -60,11 +60,20 @@ Help a user with a customized NanoClaw install safely incorporate upstream chang
- Default to MERGE (one-pass conflict resolution). Offer REBASE as an explicit option.
- Keep token usage low: rely on `git status`, `git log`, `git diff`, and open only conflicted files.

# Step 0a: Refresh this skill first
The update process itself evolves, so run its newest version before doing anything else:
- Ensure the `upstream` remote exists (default `https://github.com/nanocoai/nanoclaw.git`) and fetch: `git fetch upstream --prune`. Detect the upstream branch (`main` or `master`).
- Refresh this skill from upstream: `git checkout upstream/<branch> -- .claude/skills/update-nanoclaw/`
- Re-read `.claude/skills/update-nanoclaw/SKILL.md`. If it changed, **follow the updated version from the top** instead of this one.

This is the only working-tree change expected before the preflight check; the full update commits it along with everything else.

# Step 0: Preflight (stop early if unsafe)
Run:
- `git status --porcelain`
If output is non-empty:
- Tell the user to commit or stash first, then stop.
- Exception: changes limited to `.claude/skills/update-nanoclaw/` are the Step 0a self-refresh — ignore those and proceed.
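
The clean-tree decision in Step 0 can be sketched as a small pure function. This is only an illustration, not code from the skill: the name `classifyWorkingTree` and its three return labels are hypothetical, and it assumes plain porcelain v1 output (`XY <path>`, with renames as `<old> -> <new>`) without quoted paths.

```javascript
// Hypothetical helper (not part of the skill): classify the output of
// `git status --porcelain` for the Step 0 preflight.
function classifyWorkingTree(porcelain, skillDir = '.claude/skills/update-nanoclaw/') {
  const paths = porcelain
    .split('\n')
    .filter((line) => line.trim() !== '')
    .flatMap((line) => {
      // Porcelain v1: two status characters, one space, then the path(s).
      const rest = line.slice(3);
      return rest.includes(' -> ') ? rest.split(' -> ') : [rest];
    });

  if (paths.length === 0) return 'clean';
  // Every touched path must sit inside the skill directory to count as
  // the Step 0a self-refresh.
  return paths.every((p) => p.startsWith(skillDir)) ? 'self-refresh-only' : 'dirty';
}
```

Under these assumptions, `'clean'` and `'self-refresh-only'` both mean "proceed", while `'dirty'` corresponds to telling the user to commit or stash first.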

Confirm remotes:
- `git remote -v`

@@ -256,6 +265,16 @@ If any channels/providers are installed AND `upstream/channels` or `upstream/pro

If no channels/providers are installed, skip silently.

Proceed to Step 7.9.

# Step 7.9: Stamp the upgrade marker (required)
After validation has **succeeded**, record that this install reached the new version through the supported path. Without this, the startup tripwire stops the host on its next start.

- `pnpm exec tsx scripts/upgrade-state.ts set "" update-nanoclaw`
- The empty version argument stamps the current `package.json` version.

If validation did NOT succeed, do not stamp — leave the tripwire to catch the broken state.
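
To make the stamp-then-tripwire relationship concrete, here is a rough sketch under stated assumptions. The marker shape `{ version, via }` and both function names are hypothetical: the diff does not show the real schema of `data/upgrade-state.json` or the actual tripwire logic, only that an empty version argument stamps the current `package.json` version.

```javascript
// Hypothetical marker shape: { version: string, via: string }.
// The real schema of data/upgrade-state.json is not shown in this diff.
function stampMarker(versionArg, via, packageVersion) {
  // An empty version argument falls back to the current package.json version.
  return { version: versionArg === '' ? packageVersion : versionArg, via };
}

// Assumed tripwire rule: start only when a marker exists and records the
// version that is about to run.
function tripwireAllowsStart(marker, packageVersion) {
  return marker != null && marker.version === packageVersion;
}
```

In this sketch, skipping the stamp after a failed validation leaves the old marker in place, so the version mismatch is what stops the host.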

Proceed to Step 8.

# Step 8: Summary + rollback instructions

@@ -18,12 +18,20 @@ jobs:

      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
          token: ${{ steps.app-token.outputs.token }}

      - uses: pnpm/action-setup@v4

      - name: Bump patch version
        run: |
          # Skip the auto-bump when the pushed commits already changed the
          # version themselves (e.g. a release PR that set a minor/major).
          # Otherwise the bot would patch a deliberate 2.1.0 up to 2.1.1.
          if git diff --name-only "${{ github.event.before }}" "${{ github.sha }}" | grep -qx 'package.json'; then
            echo "package.json already changed in this push; skipping auto-bump."
            exit 0
          fi
          pnpm version patch --no-git-tag-version
          git add package.json
          git diff --cached --quiet && exit 0
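
The skip-or-bump decision in the step above can be paraphrased in a few lines of JavaScript. This is a sketch for illustration only: `shouldAutoBump` and `bumpPatch` are hypothetical names, and `bumpPatch` handles plain `X.Y.Z` versions only, which is narrower than what `pnpm version patch` accepts.

```javascript
// Mirrors `grep -qx 'package.json'`: only an exact whole-line match counts,
// so a nested path such as "container/package.json" (a made-up example)
// does not suppress the bump.
function shouldAutoBump(changedFiles) {
  return !changedFiles.includes('package.json');
}

// Plain X.Y.Z only; prerelease or build suffixes are out of scope here.
function bumpPatch(version) {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) throw new Error(`unsupported version: ${version}`);
  const [, major, minor, patch] = match;
  return `${major}.${minor}.${Number(patch) + 1}`;
}
```

Note that the shell check keys on the file having changed, not on the `version` field specifically, so any root `package.json` edit in the push skips the auto-bump.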
@@ -39,3 +39,10 @@ groups/*
.nanoclaw/

agents-sdk-docs
.agents
AGENTS.md

# Internal working docs, never committed
docs/maintainer-guide.md
docs/drafts/
forks.md

@@ -2,6 +2,10 @@

All notable changes to NanoClaw will be documented in this file.

## [2.1.0] - 2026-06-07

- [BREAKING] **Startup now requires an upgrade marker.** The host refuses to boot unless `data/upgrade-state.json` records that this install reached the current version through a sanctioned path (`/setup`, `/update-nanoclaw`, `/migrate-nanoclaw`). After this update completes — and before restarting the service — stamp the marker by running `pnpm exec tsx scripts/upgrade-state.ts set`. If the host has already tripped on restart with "update did not go through the supported path", that same command clears it. See [docs/upgrade-recovery.md](docs/upgrade-recovery.md).

## [2.0.64] - 2026-05-18

- **`ncl destinations add` and `remove` through the approval flow now reach the receiver immediately.** Approved destinations weren't being projected into the receiving agent's local session state, so a freshly-added destination silently failed at `send_message` with `unknown destination`, and a removed destination stayed resolvable until the next container restart. Both now take effect the moment the approval executes. Direct (non-approval) calls were unaffected.

@@ -274,6 +274,9 @@ This project uses pnpm with `minimumReleaseAge: 4320` (3 days) in `pnpm-workspac
| [docs/build-and-runtime.md](docs/build-and-runtime.md) | Runtime split (Node host + Bun container), lockfiles, image build surface, CI, key invariants |
| [docs/v1-to-v2-changes.md](docs/v1-to-v2-changes.md) | v1→v2 architecture diff — vocabulary for where v1 things moved |
| [docs/migration-dev.md](docs/migration-dev.md) | Migration development guide — testing, debugging, dev loop |
| [docs/customizing.md](docs/customizing.md) | Short intro to customizing via skills |
| [docs/skills-model.md](docs/skills-model.md) | The skills model in full: recipes, tests, upgrades, migrations |
| [docs/skill-guidelines.md](docs/skill-guidelines.md) | Authoritative checklist for writing a skill |

## Container Build Cache

@@ -29,26 +29,27 @@ Every user should have clean and minimal code that does exactly what they need.

### Skill types

#### 1. Feature skills (branch-based)
#### 1. Channel and provider skills (registry branches)

Add capabilities to NanoClaw by merging a git branch. The SKILL.md contains setup instructions; the actual code lives on a `skill/*` branch.
Add a messaging channel or an agent provider. The SKILL.md contains the install steps; the actual code lives on a long-lived registry branch (`channels` or `providers`) that we keep in sync with `main`.

**Location:** `.claude/skills/` on `main` (instructions only), code on `skill/*` branch
**Location:** `.claude/skills/` on `main` (instructions only), code on the `channels` or `providers` branch

**Examples:** `/add-telegram`, `/add-slack`, `/add-discord`, `/add-gmail`
**Examples:** `/add-telegram`, `/add-slack`, `/add-discord`, `/add-opencode`

**How they work:**
1. User runs `/add-telegram`
2. Claude follows the SKILL.md: fetches and merges the `skill/telegram` branch
3. Claude walks through interactive setup (env vars, bot creation, etc.)
2. Claude follows the SKILL.md: `git fetch origin channels`, then copies each file in with `git show origin/channels:<path> > <path>`. Install is an additive fetch, never a `git merge`.
3. The adapter's registration test is fetched the same way and run as verification
4. Claude walks through interactive setup (tokens, bot creation, etc.)

**Contributing a feature skill:**
**Contributing a channel or provider skill:**
1. Fork `nanocoai/nanoclaw` and branch from `main`
2. Make the code changes (new files, modified source, updated `package.json`, etc.)
3. Add a SKILL.md in `.claude/skills/<name>/` with setup instructions — step 1 should be merging the branch
4. Open a PR. We'll create the `skill/<name>` branch from your work
2. Build the adapter following [docs/skill-guidelines.md](docs/skill-guidelines.md): a self-registering module, one appended barrel import, and a registration test that imports the real barrel
3. Add a SKILL.md in `.claude/skills/<name>/` with the fetch-and-copy steps, and a REMOVE.md that reverses every change
4. Open a PR. We'll land the code on the registry branch from your work

See `/add-telegram` for a good example. See [docs/skills-as-branches.md](docs/skills-as-branches.md) for the full system design.
See `/add-slack` for a good example. See [docs/skills-model.md](docs/skills-model.md) for why install is a fetch, never a merge.

#### 2. Utility skills (with code files)

@@ -58,7 +59,7 @@ Standalone tools that ship code files alongside the SKILL.md. The SKILL.md tells

**Examples:** a self-contained CLI or helper shipped in a `scripts/` subfolder of the skill.

**Key difference from feature skills:** No branch merge needed. The code is self-contained in the skill directory and gets copied into place during installation.
**Key difference from channel/provider skills:** the code is self-contained in the skill directory and gets copied into place during installation; nothing is fetched from a registry branch.

**Guidelines:**
- Put code in separate files, not inline in the SKILL.md

@@ -93,6 +94,10 @@ Skills that run inside the agent container, not on the host. These teach the con
- Use `allowed-tools` frontmatter to scope tool permissions
- Keep them focused — the agent's context window is shared across all container skills

### Writing a good skill

The authoring bar is [docs/skill-guidelines.md](docs/skill-guidelines.md): mostly adds, minimal reach-ins into existing code, a test for every functional integration point, and a REMOVE.md whenever apply leaves anything behind. [docs/skills-model.md](docs/skills-model.md) explains the model behind it.

### SKILL.md format

All skills use the [Claude Code skills standard](https://code.claude.com/docs/en/skills):

@@ -200,7 +200,7 @@ If a step fails, `nanoclaw.sh` hands off to Claude Code to diagnose and resume.

Only security fixes, bug fixes, and clear improvements will be accepted to the base configuration. That's all.

Everything else (new capabilities, OS compatibility, hardware support, enhancements) should be contributed as skills on the `channels` or `providers` branch.
Everything else (new capabilities, OS compatibility, hardware support, enhancements) should be contributed as skills: channel and provider code on the `channels`/`providers` registry branches, everything else as a self-contained skill. See [docs/customizing.md](docs/customizing.md) and [CONTRIBUTING.md](CONTRIBUTING.md).

This keeps the base system minimal and lets every user customize their installation without inheriting features they don't want.

@@ -19,7 +19,7 @@ ARG INSTALL_CJK_FONTS=false
# Pin CLI versions for reproducibility. Bump deliberately — unpinned installs
# mean every rebuild silently picks up the latest and can break in lockstep
# across all users.
ARG CLAUDE_CODE_VERSION=2.1.154
ARG CLAUDE_CODE_VERSION=2.1.170
ARG AGENT_BROWSER_VERSION=latest
ARG VERCEL_VERSION=52.2.1
ARG BUN_VERSION=1.3.12

@@ -5,7 +5,7 @@
"": {
"name": "nanoclaw-agent-runner",
"dependencies": {
"@anthropic-ai/claude-agent-sdk": "^0.3.154",
"@anthropic-ai/claude-agent-sdk": "^0.3.170",
"@anthropic-ai/sdk": "^0.100.0",
"@modelcontextprotocol/sdk": "^1.29.0",
"cron-parser": "^5.0.0",
@@ -19,23 +19,23 @@
},
},
"packages": {
"@anthropic-ai/claude-agent-sdk": ["@anthropic-ai/claude-agent-sdk@0.3.154", "", { "optionalDependencies": { "@anthropic-ai/claude-agent-sdk-darwin-arm64": "0.3.154", "@anthropic-ai/claude-agent-sdk-darwin-x64": "0.3.154", "@anthropic-ai/claude-agent-sdk-linux-arm64": "0.3.154", "@anthropic-ai/claude-agent-sdk-linux-arm64-musl": "0.3.154", "@anthropic-ai/claude-agent-sdk-linux-x64": "0.3.154", "@anthropic-ai/claude-agent-sdk-linux-x64-musl": "0.3.154", "@anthropic-ai/claude-agent-sdk-win32-arm64": "0.3.154", "@anthropic-ai/claude-agent-sdk-win32-x64": "0.3.154" }, "peerDependencies": { "@anthropic-ai/sdk": ">=0.93.0", "@modelcontextprotocol/sdk": "^1.29.0", "zod": "^4.0.0" } }, "sha512-iEn25urI2QrMPFIhId3h7v/7EG5gsmF7ooe+6EvsAosePeLmpVVerp5nXtHnlmBkMinLecurcPA+OddKw76jYw=="],
"@anthropic-ai/claude-agent-sdk": ["@anthropic-ai/claude-agent-sdk@0.3.170", "", { "optionalDependencies": { "@anthropic-ai/claude-agent-sdk-darwin-arm64": "0.3.170", "@anthropic-ai/claude-agent-sdk-darwin-x64": "0.3.170", "@anthropic-ai/claude-agent-sdk-linux-arm64": "0.3.170", "@anthropic-ai/claude-agent-sdk-linux-arm64-musl": "0.3.170", "@anthropic-ai/claude-agent-sdk-linux-x64": "0.3.170", "@anthropic-ai/claude-agent-sdk-linux-x64-musl": "0.3.170", "@anthropic-ai/claude-agent-sdk-win32-arm64": "0.3.170", "@anthropic-ai/claude-agent-sdk-win32-x64": "0.3.170" }, "peerDependencies": { "@anthropic-ai/sdk": ">=0.93.0", "@modelcontextprotocol/sdk": "^1.29.0", "zod": "^4.0.0" } }, "sha512-pAvhfk+iTodXZ6RF18Kz7BEUWFjL7EcR3tKuhUNdPpE1NAYCR3mSHGbafi72JsrNwKEDIs7FU31z3fqhwy8QzA=="],
"@anthropic-ai/claude-agent-sdk-darwin-arm64": ["@anthropic-ai/claude-agent-sdk-darwin-arm64@0.3.154", "", { "os": "darwin", "cpu": "arm64" }, "sha512-oFW3LD5lYrKAU+AKu27Z8hrzqkrh362qQrwi/i3DxGcud9BXUycsXYjShpDj3D3JZu169UzZuSPhx1Wajmbiwg=="],
"@anthropic-ai/claude-agent-sdk-darwin-arm64": ["@anthropic-ai/claude-agent-sdk-darwin-arm64@0.3.170", "", { "os": "darwin", "cpu": "arm64" }, "sha512-rwfgArIa5WI0QPNqFsRBgvtSI0mrtpynUm0oK6+l6/KX4hcgnYGEzciZR1bOeD9/7sSZlTdIgt+T9alKeZmXcg=="],
"@anthropic-ai/claude-agent-sdk-darwin-x64": ["@anthropic-ai/claude-agent-sdk-darwin-x64@0.3.154", "", { "os": "darwin", "cpu": "x64" }, "sha512-5BgWEueP+cqoctWjZYhCbyltuaV/N2DmKDXD3/69cKaVmJp8XL9OCzlq/HEirA/+Ssjskx6hDUBaOcpuZ3iwQA=="],
"@anthropic-ai/claude-agent-sdk-darwin-x64": ["@anthropic-ai/claude-agent-sdk-darwin-x64@0.3.170", "", { "os": "darwin", "cpu": "x64" }, "sha512-0e58h8UQMtsQxLGIv9r4foxfBFWKZ7NeDtoplLhuD7EwQonehomw1sBXCch77t/IfUS+q5vQ5zv+fOGmap5nLQ=="],
"@anthropic-ai/claude-agent-sdk-linux-arm64": ["@anthropic-ai/claude-agent-sdk-linux-arm64@0.3.154", "", { "os": "linux", "cpu": "arm64" }, "sha512-rRkW4SBL3W7zQvKscCIfIGlmoeuTbMV6dXFbPdmpRGvmYZIs79RpzO6xrGBnnhmm+B7znQ9oHAnffi/2FBgJbA=="],
"@anthropic-ai/claude-agent-sdk-linux-arm64": ["@anthropic-ai/claude-agent-sdk-linux-arm64@0.3.170", "", { "os": "linux", "cpu": "arm64" }, "sha512-gLbaFqcGppFJQd4DLNV4IXoeahejT/p2/M8bSSvRDbla9GOsBr1AxV5XLRyBn1e7xFGozZIAIQr3+1chp7NJgQ=="],
"@anthropic-ai/claude-agent-sdk-linux-arm64-musl": ["@anthropic-ai/claude-agent-sdk-linux-arm64-musl@0.3.154", "", { "os": "linux", "cpu": "arm64" }, "sha512-o2bCQN4Xn3UqCLErC5m4T7u0yYArJYmgFCUFnA6K96DdW2RERvx+gTKXxWuHEBkDO+eMoHLHLxk0u2jGES00Ng=="],
"@anthropic-ai/claude-agent-sdk-linux-arm64-musl": ["@anthropic-ai/claude-agent-sdk-linux-arm64-musl@0.3.170", "", { "os": "linux", "cpu": "arm64" }, "sha512-SRYfQcsXlOq+CD/FqkQBTSHbaD++w73GnnO+NUV9adLYrca3kfetRwWT1iguY1cNS0l34dCR3rlzCPq78vg1Jg=="],
"@anthropic-ai/claude-agent-sdk-linux-x64": ["@anthropic-ai/claude-agent-sdk-linux-x64@0.3.154", "", { "os": "linux", "cpu": "x64" }, "sha512-GpiFF8Ez6PbM3m0gqtCo/FKM346qyRdP7VhbmJzdnbNKTiiUZ66vDQyEUPZPCG24ZkrG4m96KpRIUwY08rHiNg=="],
"@anthropic-ai/claude-agent-sdk-linux-x64": ["@anthropic-ai/claude-agent-sdk-linux-x64@0.3.170", "", { "os": "linux", "cpu": "x64" }, "sha512-Xl/m7TaSC3T5IDBdHrZQ9fCQYyDmPELN34CL+MoyPIf7uSmuZnjE9fUOqDh2Rv26JxWssi1M6X+BBvVuKd6Cpg=="],
"@anthropic-ai/claude-agent-sdk-linux-x64-musl": ["@anthropic-ai/claude-agent-sdk-linux-x64-musl@0.3.154", "", { "os": "linux", "cpu": "x64" }, "sha512-zA7S8Lm6O4QBsUpbhiOht8BgiXHOBBFUIo8ZLK6r5wAatK3Q44syWVxICeyCnR6wqfnkf3cugCw27ycS6vVgaA=="],
"@anthropic-ai/claude-agent-sdk-linux-x64-musl": ["@anthropic-ai/claude-agent-sdk-linux-x64-musl@0.3.170", "", { "os": "linux", "cpu": "x64" }, "sha512-m4+I0qBEk7cxRKS+pL+eoWXbXTFOAo83fQ0tQvap4z/mDMm06IWJtEPoYTaMBwsp32GJWLkHWKbZSBCHZnp2DQ=="],
"@anthropic-ai/claude-agent-sdk-win32-arm64": ["@anthropic-ai/claude-agent-sdk-win32-arm64@0.3.154", "", { "os": "win32", "cpu": "arm64" }, "sha512-cDW1YFbU/PJFlrGXhlAGcbkXt80sEO6WtnH8nN8YHXLn5NWduy2q7o/qC6i8XozgvRGf6t/eMoH7IasGIEDhDw=="],
"@anthropic-ai/claude-agent-sdk-win32-arm64": ["@anthropic-ai/claude-agent-sdk-win32-arm64@0.3.170", "", { "os": "win32", "cpu": "arm64" }, "sha512-IG+8isJNNJKbnnhO7m+PGhfVCg+XoQ/MDxGde5eigFI0WsEfitjuWSWwx82bT9ghxI1aa6qNvI+UPgPcZuo5Fg=="],
"@anthropic-ai/claude-agent-sdk-win32-x64": ["@anthropic-ai/claude-agent-sdk-win32-x64@0.3.154", "", { "os": "win32", "cpu": "x64" }, "sha512-tSKaIIpL72OPg3WfzZTCIl8OJgcbq4qieu8/fDWjsdeQuari9gQMIuEflFphk9HqNsxpSmDqKi8Sm5mW2V566Q=="],
"@anthropic-ai/claude-agent-sdk-win32-x64": ["@anthropic-ai/claude-agent-sdk-win32-x64@0.3.170", "", { "os": "win32", "cpu": "x64" }, "sha512-7cuqSKbHVItPGVwRbd3A0BEJwcNtc7Fhoh6qHN4C6yrmjSrvdYYx3MLvq/VI768/RoG7mAMDxb+j7WfEfoP9BA=="],
"@anthropic-ai/sdk": ["@anthropic-ai/sdk@0.100.0", "", { "dependencies": { "json-schema-to-ts": "^3.1.1", "standardwebhooks": "^1.0.0" }, "peerDependencies": { "zod": "^3.25.0 || ^4.0.0" }, "optionalPeers": ["zod"], "bin": { "anthropic-ai-sdk": "bin/cli" } }, "sha512-cAm3aXm6qAiHIvHxyIIGd6tVmsD2gDqlc2h0R20ijNUzGgVnIN822bit4mKbF6CkuV7qIrLQIPoAepHEpanrQQ=="],
@@ -9,7 +9,7 @@
"test": "bun test"
},
"dependencies": {
"@anthropic-ai/claude-agent-sdk": "^0.3.154",
"@anthropic-ai/claude-agent-sdk": "^0.3.170",
"@anthropic-ai/sdk": "^0.100.0",
"@modelcontextprotocol/sdk": "^1.29.0",
"cron-parser": "^5.0.0",

@@ -5,8 +5,11 @@
* send_message(to="agent-name") since agents and channels share the
* unified destinations namespace.
*
* create_agent is admin-only. Non-admin containers never see this tool
* (see mcp-tools/index.ts). The host re-checks permission on receive.
* create_agent writes central-DB state. The host authorizes it by CLI scope:
* trusted owner agent groups (scope 'global') create directly; confined groups
* require admin approval (see src/modules/agent-to-agent/create-agent.ts). This
* tool just writes the outbound request; authorization is enforced host-side,
* not here — the container is untrusted and cannot be relied on to gate itself.
*/
import { writeMessageOut } from '../db/messages-out.js';
import { registerTools } from './server.js';
@@ -32,7 +35,7 @@ export const createAgent: McpToolDefinition = {
tool: {
name: 'create_agent',
description:
'Create a long-lived companion sub-agent (research assistant, task manager, specialist) — the name becomes your destination for it. Admin-only. Fire-and-forget.',
'Create a long-lived companion sub-agent (research assistant, task manager, specialist) — the name becomes your destination for it. May require admin approval before the agent is created. Fire-and-forget.',
inputSchema: {
type: 'object' as const,
properties: {

@@ -9,6 +9,5 @@ The files in this directory are original design documents and developer referenc
| [SPEC.md](SPEC.md) | [Architecture](https://docs.nanoclaw.dev/concepts/architecture) |
| [SECURITY.md](SECURITY.md) | [Security model](https://docs.nanoclaw.dev/concepts/security) |
| [REQUIREMENTS.md](REQUIREMENTS.md) | [Introduction](https://docs.nanoclaw.dev/introduction) |
| [skills-as-branches.md](skills-as-branches.md) | [Skills system](https://docs.nanoclaw.dev/integrations/skills-system) |
| [docker-sandboxes.md](docker-sandboxes.md) | [Docker Sandboxes](https://docs.nanoclaw.dev/advanced/docker-sandboxes) |
| [APPLE-CONTAINER-NETWORKING.md](APPLE-CONTAINER-NETWORKING.md) | [Container runtime](https://docs.nanoclaw.dev/advanced/container-runtime) |

@@ -83,6 +83,48 @@ Each NanoClaw group gets its own OneCLI agent identity. This allows different cr
- Any credentials matching blocked patterns
- `.env` is shadowed with `/dev/null` in the project root mount

### 6. Egress Lockdown (Forced Proxy)

The `HTTPS_PROXY` env var only redirects *proxy-aware* clients — a tool that
ignores it (or a raw socket) could reach the internet directly and bypass
credential injection, approvals, and audit. Egress lockdown closes that hole at
the network layer.

**How it works:** agents are placed on a Docker `--internal` network
(`nanoclaw-egress`) that has **no route to the internet**. The OneCLI gateway
container is attached to that network, aliased as `host.docker.internal`, so the
injected proxy URL (`…@host.docker.internal:10255`) resolves to the gateway
*container-to-container*. The gateway is therefore the **only reachable hop** —
anything else has nowhere to go. The agent is non-root with no `NET_ADMIN`, so
it cannot undo this. Identical mechanism on macOS and Linux (no host firewall,
no `host-gateway` route).

- **Self-healing:** the gateway is re-attached to the network at every spawn and
  on each host-sweep tick, so an out-of-band detach (e.g. `docker compose up` on
  the OneCLI stack — its compose lives in `~/.onecli`, not this repo) recovers
  automatically.
- **Fail-fast:** if lockdown is on but the network can't be created or the
  gateway can't be attached (e.g. a non-standard gateway container name, or the
  gateway isn't running), nanoclaw **refuses to spawn the agent** and surfaces a
  clear error — it never silently falls back to open egress. Fix the cause (or
  set `NANOCLAW_EGRESS_LOCKDOWN=false`) and retry. The host-sweep re-heal is the
  exception: a heal failure there is logged but not fatal, since already-running
  agents stay on the internal net (no leak) until the gateway returns.

**Configuration:**

| Env | Default | Meaning |
| --- | --- | --- |
| `NANOCLAW_EGRESS_LOCKDOWN` | `false` | Set `true` to opt in (otherwise the host-gateway path is used). Enabled automatically by `/add-golden-registry`. |
| `NANOCLAW_EGRESS_NETWORK` | `nanoclaw-egress` | Network name. |
| `ONECLI_GATEWAY_CONTAINER` | `onecli` | Gateway container to attach. |

**⚠ Behavior when enabled:** with lockdown on, agents have **no direct
internet** — all traffic must go through OneCLI. Proxy-aware clients (npm, pnpm,
pip, curl, node/bun with the proxy env) are unaffected. Any workflow that relies
on a **non-proxy-aware** tool reaching the internet directly will fail by design.
Lockdown is **off by default**; opt in with `NANOCLAW_EGRESS_LOCKDOWN=true`.
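
As a rough illustration of how the three settings and the fail-fast rule fit together, consider the sketch below. It is not NanoClaw's implementation: the function names and return labels are hypothetical, and treating only the exact string `'true'` as opt-in is an assumption. The defaults are the ones listed in the configuration table.

```javascript
// Hypothetical config resolution; defaults copied from the table above.
function resolveEgressConfig(env = {}) {
  return {
    // Assumption: only the literal string "true" opts in.
    lockdown: env.NANOCLAW_EGRESS_LOCKDOWN === 'true',
    network: env.NANOCLAW_EGRESS_NETWORK || 'nanoclaw-egress',
    gatewayContainer: env.ONECLI_GATEWAY_CONTAINER || 'onecli',
  };
}

// Fail-fast rule: with lockdown on, an agent is spawned only when the
// gateway is attached to the internal network; there is no silent fallback.
function spawnDecision(config, gatewayAttached) {
  if (!config.lockdown) return 'spawn-host-gateway';
  return gatewayAttached ? 'spawn-internal-network' : 'refuse';
}
```

The point of the second function is that the `'refuse'` branch never degrades to `'spawn-host-gateway'`: once lockdown is on, the only alternatives are the internal network or no agent at all.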

## Privilege Comparison

| Capability | Main Group | Non-Main Group |

@@ -0,0 +1,36 @@
# Customizing NanoClaw

NanoClaw is made to be forked and changed. The catch with most projects is that once you edit the code, every upstream update turns into a merge fight, and the more you customized, the worse it gets.

NanoClaw avoids that with one simple idea: **every change you make is a skill.**

## The idea in a minute

- A **skill** is a small, self-contained add-on. It brings its own code and knows how to install itself.
- Your **fork is just a list of skills**, plus one "recipe" that says which skills you have and how they fit together.
- Because your changes live beside the core instead of tangled into it, **pulling in updates stays easy**.

## What makes it work

A good skill mostly **adds** things: new files, a line appended to an existing file, a dependency. It avoids rewriting existing code in place.

And it ships a test for each spot where it touches the rest of the system. When an update moves something your skill depends on, that test fails and points at the fix, instead of you finding out when things break in production.

## How you actually work

You don't have to think in skills while you're building. **Edit the code directly, get it working, then turn your changes into skills afterward.** A coding agent does the conversion for you, following [skill-guidelines.md](skill-guidelines.md).

The only rule worth remembering: **a change isn't really part of your fork until it's a skill**, because that's the form that survives an upgrade.

## Upgrading

Always upgrade by running `/update-nanoclaw`. **Don't just `git pull`.** The command sets a rollback point, pulls the upstream changes, runs your tests, and walks you through anything that needs fixing, usually a small, local fix in one skill.

## The deal

We keep the core small and stable, and every breaking change ships with its migration. You keep your changes as skills, with tests. Do that, and upgrades won't break you. Changes edited directly into the core are the one thing the model can't protect.

## Go deeper

- **[The skills model in full](skills-model.md)**: how skills, recipes, tests, and upgrades work under the hood.
- **[Skill guidelines](skill-guidelines.md)**: the authoritative checklist for writing one.

@@ -53,6 +53,80 @@ Model selection considerations for Apple Silicon:

The agent uses tool calls extensively (read/write files, shell commands). Models that support tool use reliably work best. Gemma 4 and Qwen 3 Coder both handle structured tool calls well.

## Allowing Prompt Caching (filter the cache-busting hash)

Out of the box this path is slow — every reply re-reads the whole multi-thousand-token system prompt from scratch, even for a one-word answer. Ollama has a prompt cache that should skip that repeated work, but on this path it never kicks in.

**Cause.** The Claude Agent SDK adds a per-request hash to the front of every prompt — `x-anthropic-billing-header: ...; cch=<hash>;`. It changes on every request, and Ollama's cache only reuses a prompt whose start is unchanged. So that one shifting value at the front makes Ollama treat every prompt as new and re-read all of it. (Ollama ignores the hash itself, so filtering it has no effect on output.)

**Fix.** Run a tiny proxy between the container and Ollama that filters the hash out (pins `cch=<hash>` to a constant). The start of the prompt is now stable, so the cache kicks in and only the new message gets processed. In our setup — a 31B model on Apple Silicon — follow-up replies dropped from ~80s to ~4s; your numbers will vary with model size and hardware. Output is unchanged, since Ollama ignores the value anyway.

Point the agent group's `ANTHROPIC_BASE_URL` at the proxy instead of Ollama directly (everything else from the sections above is unchanged):

```
ANTHROPIC_BASE_URL=http://host.docker.internal:11999   # the proxy
# proxy forwards to http://127.0.0.1:11434 (Ollama)
```

The proxy is ~40 lines of dependency-free Node:

```js
// ollama-cch-proxy.mjs — normalize the SDK's per-request cch nonce so Ollama's
// prefix cache survives across turns. Listens on :11999, forwards to Ollama.
import http from 'node:http';

const TARGET_HOST = process.env.OLLAMA_HOST || '127.0.0.1';
const TARGET_PORT = Number(process.env.OLLAMA_PORT || 11434);
const LISTEN_PORT = Number(process.env.PROXY_PORT || 11999);

const server = http.createServer((req, res) => {
  const chunks = [];
  req.on('data', (c) => chunks.push(c));
  req.on('end', () => {
    let body = Buffer.concat(chunks);
    if (req.method === 'POST' && body.length) {
      body = Buffer.from(body.toString('utf8').replace(/cch=[0-9a-f]+;/g, 'cch=00000;'), 'utf8');
    }
    const headers = { ...req.headers, host: `${TARGET_HOST}:${TARGET_PORT}`, 'content-length': String(body.length) };
    const proxyReq = http.request(
      { host: TARGET_HOST, port: TARGET_PORT, method: req.method, path: req.url, headers },
      (proxyRes) => {
        res.writeHead(proxyRes.statusCode || 502, proxyRes.headers);
        proxyRes.pipe(res);
      },
    );
    proxyReq.on('error', (e) => { res.writeHead(502); res.end(String(e)); });
    proxyReq.end(body);
  });
});
server.listen(LISTEN_PORT, '0.0.0.0', () => console.log(`cch-proxy :${LISTEN_PORT} -> ${TARGET_HOST}:${TARGET_PORT}`));
```
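
To see why pinning the value restores a stable prefix, the substitution can be tried in isolation. This is a small sketch rather than part of the proxy: `normalizeCch` is a hypothetical wrapper around the same regular expression, and the two sample strings (including the hash values) are made up for illustration.

```javascript
// Same substitution the proxy applies to POST bodies, wrapped in a
// hypothetical helper so it can be exercised without a server.
function normalizeCch(body) {
  return body.replace(/cch=[0-9a-f]+;/g, 'cch=00000;');
}

// Made-up request prefixes that differ only in the per-request hash.
const turn1 = 'x-anthropic-billing-header: demo; cch=3fa9c1; You are a helpful agent.';
const turn2 = 'x-anthropic-billing-header: demo; cch=b07e44; You are a helpful agent.';

console.log(turn1 === turn2);                             // false
console.log(normalizeCch(turn1) === normalizeCch(turn2)); // true
```

Once the two prefixes compare equal, a prefix cache that keys on the start of the prompt can reuse the earlier work instead of re-reading the whole system prompt.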

Run it durably so it survives reboots. On Linux, a systemd user service:

```ini
# ~/.config/systemd/user/ollama-cch-proxy.service
[Unit]
Description=Ollama cch-normalizing proxy for NanoClaw
After=network-online.target

[Service]
ExecStart=/usr/bin/node %h/.config/nanoclaw/ollama-cch-proxy.mjs
Restart=always

[Install]
WantedBy=default.target
```

```bash
systemctl --user enable --now ollama-cch-proxy
loginctl enable-linger "$USER"   # so it runs without an active login session
```

On macOS use a `launchd` user agent (`~/Library/LaunchAgents/`) running the same script.

**Scope.** This only affects the Claude-Code-CLI → Ollama path described here. Codex and OpenCode don't use the Claude Agent SDK, so they never emit the `cch` hash and get prompt caching for free.

## What Changes at the Code Level

Three files need to support this feature. See `/add-ollama-provider` for the exact changes.

@@ -0,0 +1,168 @@

# Skill guidelines

The authoritative checklist for writing a NanoClaw skill: the bar that conformance tooling and registry review will hold every skill to. [customizing.md](customizing.md) is the short introduction; [skills-model.md](skills-model.md) explains why the model works this way. This document evolves with the system; when a rule here proves wrong, fix the rule.

---

## Principles

Every customization is an additive **skill**: not an edit buried in core, but a skill that carries its own code and knows how to install and remove itself. Two principles make a skill *maintainable*; everything else in this document follows from them.

### 1. Minimal integration surface

A skill adds files and makes the **smallest possible reach-ins** into existing code. Adding a file or a dependency never breaks on upgrade; reaching into existing code is the only thing that does, so the integration surface *is* the upgrade risk. Keep reach-ins few, tiny, and ideally a single line that *calls* into the skill's own code.

Follows from this:

- **Mostly add.** See the change shapes below, in safety order.
- **Push logic into skill-owned files** so the core edit is one call, not an inlined block. This shrinks the surface *and* makes the point testable.
- **Colocated, self-contained** edits over edits in two places.
- **Use an existing registry or hook when there is one**: appending to a registry is a smaller surface than reaching into code. When none exists, a true code-level edit is fine and first-class. (Whether to *add* a hook because a spot has become a hotspot is the maintainer's call, not the skill's.)

### 2. A test for every functional integration point

Every reach-in with a **functional consequence** gets a test that goes **red if the wiring is deleted or drifts**. That's what protects the fork from upstream changes. The tests are also the verification: there is no separate "verify" step.

Follows from this:

- **Tests target integration with core, not internal correctness.** Unit tests of a skill's own logic, or its behavior against an external service, are the creator's call: fine, just not required.
- **A direct unit test doesn't count**: calling the skill's own function bypasses the wiring and stays green when the reach-in is deleted. Drive the real entry, or assert the wiring structurally.
- **Build / typecheck is an always-on leg**: drift (moved imports, renamed fields) is the main enemy and slips past runtime tests.
- **The test lives where the point runs**: host code uses vitest under `src/`; container code uses `bun:test` under `container/agent-runner/`.
- **"Functional" is the filter**: weigh a reach-in by what breaks if it's gone. A cosmetic one (raising a log line's level) gets no test.

The two interlock: a minimal surface keeps the integration points few and testable; a test per point keeps the surface safe. *Maintainable = small surface, every functional point guarded.*

---

## Skill anatomy

A skill carries everything it needs:

- **Code**: the files it adds. They live in the skill's own folder, or, for large registry-backed skills like channels and providers, on a registry branch the skill fetches from. Apply copies them in.
- **Apply**: the steps in `SKILL.md`, written as prose an agent can run. Apply must be safe to re-run: upgrades re-run it, and a skill that half-applies twice is a bug.
- **Remove**: a separate `REMOVE.md` that reverses *every* change apply made: barrel lines deleted (not commented out), every copied file removed including tests, dependencies uninstalled, Dockerfile edits reverted, env lines removed. **REMOVE.md is required exactly when apply leaves anything behind.** A pure instruction-only skill that copies nothing needs none, and an empty one is noise.
- **Tests**: files that ship with the skill and are copied into the project's test tree on apply, so they run against the *composed* system.
- **Recipe entry**: how it composes with the fork's other skills (ordering, dependencies).
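
The re-run safety required of Apply can be sketched for the simplest case, appending one line to a barrel or `.env` file. This is a minimal illustration under stated assumptions, not NanoClaw code: `appendLineOnce` and the two import lines are hypothetical, and a real apply step is prose that an agent follows rather than a function call.

```typescript
// Hypothetical helper: append `line` to `content` only if it is not already present.
function appendLineOnce(content: string, line: string): string {
  const lines: string[] = content.split('\n');
  if (lines.some((existing: string) => existing.trim() === line.trim())) {
    return content; // already applied, so re-running is a no-op
  }
  // Make sure the existing content ends with a newline before appending.
  const base: string = content === '' || content.endsWith('\n') ? content : content + '\n';
  return base + line + '\n';
}

// Applying twice yields the same file as applying once.
const barrel: string = "import './slack.js';\n";
const once: string = appendLineOnce(barrel, "import './dashboard.js';");
const twice: string = appendLineOnce(once, "import './dashboard.js';");
console.log(once === twice); // true
```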

---

## Change shapes

In rough order of safety:

- **Add a file**: safest. New code in the skill's own files, or fetched from a registry branch (`git show origin/<branch>:path > path`).
- **Append to a file**: an import in a barrel, a line in `.env`, an entry at the end of a list.
- **Edit a value in JSON**: e.g. a `package.json` field.
- **Add a dependency**, pinned to an exact version.
- **Insert into existing code (an "integration point")**: the one risky move. Keep it to a line or two that *calls* code living in the skill's own files, never an inlined block of logic. A skill full of these is a smell.

Fetching from a registry branch is **additive, never a merge**. `git fetch origin <branch>` then `git show origin/<branch>:path > path` per file. Never `git merge` a registry branch into an install.

---

## Integration points

The integration point is wherever the skill reaches into existing code. Make it **minimal, colocated, and self-contained**:

- All real logic lives in the skill's own file behind a single entry function; the edit to core is just the call.
- **Prefer one colocated block** over edits in two places. For an inserted call, a dynamic import at the call site keeps the import and call together and avoids touching the top-of-file import block (itself a merge hotspot):

  ```typescript
  const { startDashboard } = await import('./dashboard-pusher.js');
  await startDashboard();
  ```

  A static import + call is acceptable too; this is a recommendation, not a mandate.
- Keep any gating (feature flags, env checks) *inside* the skill's function, so the core edit stays a single call.
- When the reach-in lands inside an entangled function, extract a tiny skill-owned helper so the core touch is one line, like `args.push(...mySkillEnvArgs())`, rather than exporting the whole function or inlining the logic.
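
The last two points can be sketched together: a skill-owned helper that carries its own gating, so that the only edit to core is one `args.push(...)` line. This is an illustrative sketch, not code from any shipped skill. The helper takes the environment as a parameter to stay self-contained, the variable names `MY_SKILL_ENABLED` and `MY_SKILL_URL` are placeholders, and the value passed is a non-secret setting (credentials go through the gateway, never through `-e`).

```typescript
// Hypothetical skill-owned helper; in a real skill it would live in the skill's own file.
type Env = Record<string, string | undefined>;

function mySkillEnvArgs(env: Env): string[] {
  // Gating stays inside the skill, so the core call site is unconditional.
  if (env.MY_SKILL_ENABLED !== '1' || !env.MY_SKILL_URL) return [];
  return ['-e', `MY_SKILL_URL=${env.MY_SKILL_URL}`];
}

// Stand-in for the entangled core function: the whole reach-in is the push line.
const args: string[] = ['run', '--rm'];
args.push(...mySkillEnvArgs({ MY_SKILL_ENABLED: '1', MY_SKILL_URL: 'http://localhost:8080' }));
console.log(args);
```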

---

## Testing

**What the standard requires: integration with the NanoClaw system.**

- **Required:** a test for every functional integration point, and, where an added file consumes core (core APIs, data shapes, registries), a test that exercises that consumption against the real core. That's the leg that catches core drift.
- **Optional, the creator's call:** unit tests of the skill's own internal logic, or its behavior against an external service. Often good practice; not what defines a maintainable skill, because they don't protect against upstream changes.

### Choosing the test type

For a code-edit integration point, how you test the wiring depends on whether you can invoke the function the edit lives in. **Prefer behavior; fall back to structure.**

- **If the edit lives in an invocable function, test that function's behavior.** Calling it exercises the edit; remove or break the edit and the test goes red. This is the strongest option, and usually available, because a minimal integration point pushes the logic into the skill's own exported function anyway.
- **If the edit lives in a non-invocable entry point** (e.g. `main()` or boot), **use a structural / AST test.** Use the TypeScript compiler API and assert not just that the symbol exists but its **placement**: awaited, a direct statement of the right function, importing the right module path, correctly ordered. A present-but-misplaced call must go red.

Two more legs apply when relevant:

- **Build / typecheck** always applies: it catches a renamed symbol, a moved module, a bad signature.
- **A behavior test of how added code consumes core**, required when the added file reaches into core APIs or data at runtime. When the consumption is a *typed* call into a core API (a Chat SDK adapter calling `createChatSdkBridge`), the build leg already guards it and no separate behavior test is required. The behavior-test requirement targets runtime consumption: core DB state, data shapes, registries.

Together these cover deletion, misplacement, drift, and core consumption. Only true runtime-reachability (a call stranded behind a dead branch) needs the heavy option of booting the real entry point, a rare "real run" reserved for critical wiring.

### Registration reach-ins: behavior, not structural

A registry queryable at runtime gets a **behavior** test: import the real barrel, assert the registry contains the entry. A structural parse only proves the *source line* exists. It stays green when the barrel can't evaluate or the package isn't installed, which is exactly when the thing is actually broken. The behavior test goes red on a deleted barrel line, a barrel that won't evaluate, *and* an uninstalled package (the unmocked import throws), so it covers the dependency integration point for free.

Two consequences. First, **don't mock the adapter's package in the shipped test**: that would defeat the dependency check, and the test runs in the composed install where the package is present. Second, the only reason to fall back to a structural parse is an adapter with real import-time side effects (spawns a process, opens a socket, needs creds at load), which is an adapter smell to fix, not a reason to weaken the test. Conformant adapters do all side-effectful work in the factory or `setup()`, never at import.
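
Why a registration guard has to go through the barrel can be shown with a toy model. Everything here is an assumption made for illustration: modules are modelled as functions, "importing" a module means calling it, and the names `codexModule`, `evaluateBarrel`, `registrationGuard` and `directTest` are hypothetical. It is not how NanoClaw's registries are implemented.

```typescript
// Toy model, not NanoClaw code: a module is a function, and importing it runs it.
const registry = new Map<string, string>();

// The provider module self-registers when evaluated (hypothetical name).
const codexModule = (): void => {
  registry.set('codex', 'codex-provider');
};

// The barrel is modelled as the list of modules it imports.
function evaluateBarrel(barrel: Array<() => void>): void {
  registry.clear();
  for (const mod of barrel) mod();
}

// Registration guard: evaluate only the barrel, then query the registry.
function registrationGuard(barrel: Array<() => void>): boolean {
  evaluateBarrel(barrel);
  return registry.has('codex');
}

// Direct test: evaluates the provider module itself, bypassing the barrel.
function directTest(): boolean {
  registry.clear();
  codexModule();
  return registry.has('codex');
}

console.log(registrationGuard([codexModule])); // true: barrel line present
console.log(registrationGuard([]));            // false: barrel line deleted, guard goes red
console.log(directTest());                     // true: stays green either way
```

In this model the direct test never notices the deleted barrel line, which is the failure mode a barrel-driven behavior test is meant to catch.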

### Test archetypes

The test matches the kind of integration point:

- **In-process seam with core** (a channel into the router, a pusher into the central DB): drive the real added component against the **real core collaborators** (DB, registry, router), faking only the external edge. The highest-value archetype: it exercises the added file's consumption of core, which is what catches core drift.
- **Wiring / registration** (a barrel import, a `main()` call, an entry in an `mcpServers` map): behavior test via the registry where queryable (see above); structural / AST test where not.
- **Config / container probe** (mounts, Dockerfile, a tool installed in the image): run the change where you can. Spin up a container to confirm a mount or binary. Checking that a line exists in a file is the last resort.
- **Agentic run** (operational, instruction-only skills): run the workflow with a small model; did it complete?
- **Patch behavior** (a patch skill that changes core logic): a behavior test of the changed behavior.
- **Provider (multi-point)**: a non-default agent backend reaches into *two* barrels (host `src/providers/index.ts`; container `container/agent-runner/src/providers/index.ts`), plus Dockerfile edits and a CLI or SDK dependency. Each is a separate way to break, and each needs its own guard. Ship a **barrel-driven registration test per tree** that imports *only* the real barrel and asserts the registry contains the provider. **The trap:** a `*.factory.test.ts` that imports the provider module directly self-registers it and stays green when the barrel line is deleted; that's a unit test, not a registration guard. REMOVE.md must reverse both barrel lines, all copied files in both trees, the dependency, and the Dockerfile edits.
- **Content / instruction-only** (a reference wiki, a pure workflow): makes no functional reach-in, so it owes no integration test. Conformance is anatomy: idempotent apply, plus REMOVE.md iff apply leaves anything behind.

### Dependencies are integration points

A skill that installs a package has made a reach-in: the code now assumes it's there. Guard it so a missing package goes red, in order of preference:

1. **An unmocked import in a behavior test**: the test imports real code that imports the package, so a missing package throws. Covers presence *and* exercises the real dependency.
2. **The build leg**: a typed import of a missing module fails typecheck. The fallback when the package genuinely can't be imported in a test (e.g. it binds a port on import). Only works if the validate step runs the build before or alongside the tests, so verify the order.
3. **A Dockerfile-installed CLI binary** is the case most often left unguarded: it isn't importable, so neither guard above sees it. Use a **structural test** asserting the Dockerfile `ARG <X>_VERSION=` and install line are present, optionally backed by a `<bin> --version` container probe. Pin the version; reject `latest`.

You do *not* need to test the dependency's own API contract; that's optional external-service coverage.
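
The structural check for a Dockerfile-installed CLI can be sketched as a pure function over the Dockerfile text. This is a minimal sketch under stated assumptions: `hasPinnedVersionArg` is a hypothetical name, `EXAMPLE_CLI` and `example-cli` are placeholders, and only the `ARG <X>_VERSION=` half of the check is covered (a real test would also assert the install line and read the actual Dockerfile from disk).

```typescript
// Hypothetical structural check: is `ARG <NAME>_VERSION=<version>` present and pinned?
function hasPinnedVersionArg(dockerfile: string, name: string): boolean {
  const pattern = new RegExp(`^ARG\\s+${name}_VERSION=(\\S+)\\s*$`, 'm');
  const match: RegExpMatchArray | null = dockerfile.match(pattern);
  if (!match) return false;        // ARG line missing
  const version: string = match[1];
  return version !== 'latest';     // reject a floating version
}

// Sample Dockerfile text (placeholder package name and version).
const pinned: string = [
  'FROM node:22',
  'ARG EXAMPLE_CLI_VERSION=1.4.2',
  'RUN npm install -g example-cli@${EXAMPLE_CLI_VERSION}',
].join('\n');
const floating: string = pinned.replace('1.4.2', 'latest');

console.log(hasPinnedVersionArg(pinned, 'EXAMPLE_CLI'));   // true
console.log(hasPinnedVersionArg(floating, 'EXAMPLE_CLI')); // false
```

A text-level check like this is the last resort named in the archetypes; a `<bin> --version` probe in a real container is the stronger guard.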

### When there is genuinely nothing to test in-tree

Some skills' only functional integration is a runtime operator action with no source footprint: registering an MCP server through `ncl`, or a mount through the sanctioned query wrapper (until the `ncl` add-mount verb lands). There's no line in the tree whose deletion a test could catch, so a registration test is structurally inapplicable. **State this explicitly in SKILL.md** rather than inventing a hollow test; conformance is then anatomy plus the dependency guard. This is a conformant outcome, valid only when the reach-in has no in-tree representation. (A raw-SQL write into core's schema to achieve the same thing is a smell, not a workaround.)

### Test rules

- **Hermetic at the external edge.** Mock genuinely external services (a fake HTTP server, stubbed creds), never the package under guard (see "Registration reach-ins").
- **Exercise the real entry, or assert it structurally.** A test that imports the skill's function directly does not test the integration.
- **Tests travel with the skill** and are copied in on apply; an integration test only means anything against the composed project.
- **Robustness check.** Apply the skill with a small, cheap model. If a small model fumbles the instructions, they're too vague. Fix the instructions, don't blame the model. (Small models also keep applying skills cheap.)

---

## Anti-patterns

Each with its fix. These are patterns to remove, not to test around: a drift-prone, untestable reach-in is usually a symptom of a bad pattern, not a missing test. Reviewers reject them; the conformance linter will flag them automatically.

1. **A separate VERIFY.md.** Delete it; tests are the verification. Fold any genuinely useful manual smoke check into SKILL.md's next steps.
2. **REMOVE.md soft-disable** (comments out an import; leaves copied files behind). DELETE the import line and `rm` every file the skill copied.
3. **REMOVE.md incomplete** (misses env vars, the package uninstall, copied tests). Reverse *every* change; read the env vars from the skill's own credentials section, don't guess.
4. **Raw SQL against a core DB** (read or write). Use a core helper or an `ncl` verb; the in-tree query wrapper is the sanctioned last resort. Never the `sqlite3` binary.
5. **Credential threading** (`-e KEY=…` or a stdin secrets payload into the container). OneCLI gateway only; it injects credentials per request.
6. **Branch-merge install** (`git merge` of a registry branch or any code branch). Install by additive fetch: `git fetch origin <branch>`, then `git show origin/<branch>:path > path` per file. For an update/reapply workflow, re-run each installed skill's additive apply, never merge.
7. **Diff-against-past framing** ("earlier versions…", "this is now redundant") and **documenting non-steps** ("no X needed"). Write present-tense DO steps only. A skill reads as a standalone artifact with no memory of its own edits.
8. **Stale reach-in targets** (an edit aimed at code that no longer exists; a reach-in already shipped in trunk). Verify the target exists *before* instructing the edit; reconcile already-in-trunk ones to a no-op. Before appending to an allowlist or list, check how it's consumed; the entry may already be derived from a registry, making the edit dead.
9. **Hand-maintained duplicate copies** (a mirror directory kept in sync by hand or sed). Generate the mirror from a single canonical source.

---

## Worked examples

In-tree exemplars for the code archetypes. (Two carry known smells, kept deliberately pending architectural fixes; they demonstrate the test shapes, not perfection.)

- `add-dashboard`: in-process seam with core (the pusher against the central DB), plus an AST wiring test for its `main()` call.
- `add-slack`: Chat SDK channel registration; the template for the whole channel family.
- `add-deltachat`: native channel registration.
- `add-atomic-chat-tool`: MCP-tool wiring across both runtimes (container registration and host env-helper call).
- `add-opencode` / `add-codex`: the provider multi-point archetype, with two barrels, Dockerfile pins, and per-tree registration tests.

@@ -1,677 +0,0 @@

# Skills as Branches

## Overview

This document covers **feature skills**: skills that add capabilities via git branch merges. This is the most complex skill type and the primary way NanoClaw is extended.

NanoClaw has four types of skills overall. See [CONTRIBUTING.md](../CONTRIBUTING.md) for the full taxonomy:

| Type | Location | How it works |
|------|----------|-------------|
| **Feature** (this doc) | `.claude/skills/` + `skill/*` branch | SKILL.md has instructions; code lives on a branch, applied via `git merge` |
| **Utility** | `.claude/skills/<name>/` with code files | Self-contained tools; code in skill directory, copied into place on install |
| **Operational** | `.claude/skills/` on `main` | Instruction-only workflows (setup, debug, update) |
| **Container** | `container/skills/` | Loaded inside agent containers at runtime |

---

Feature skills are distributed as git branches on the upstream repository. Applying a skill is a `git merge`. Updating core is a `git merge`. Everything is standard git.

This replaces the previous `skills-engine/` system (three-way file merging, `.nanoclaw/` state, manifest files, replay, backup/restore) with plain git operations and Claude for conflict resolution.

## How It Works

### Repository structure

The upstream repo (`nanocoai/nanoclaw`) maintains:

- `main`: core NanoClaw (no skill code)
- `skill/discord`: main + Discord integration
- `skill/telegram`: main + Telegram integration
- `skill/slack`: main + Slack integration
- `skill/gmail`: main + Gmail integration
- etc.

Each skill branch contains all the code changes for that skill: new files, modified source files, updated `package.json` dependencies, `.env.example` additions, everything. No manifest, no structured operations, no separate `add/` and `modify/` directories.

### Skill discovery and installation

Skills are split into two categories:

**Operational skills** (on `main`, always available):

- `/setup`, `/debug`, `/update-nanoclaw`, `/customize`, `/update-skills`
- These are instruction-only SKILL.md files: no code changes, just workflows
- Live in `.claude/skills/` on `main`, immediately available to every user

**Feature skills** (in marketplace, installed on demand):

- `/add-discord`, `/add-telegram`, `/add-slack`, `/add-gmail`, etc.
- Each has a SKILL.md with setup instructions and a corresponding `skill/*` branch with code
- Live in the marketplace repo (`nanocoai/nanoclaw-skills`)

Users never interact with the marketplace directly. The operational skills `/setup` and `/customize` handle plugin installation transparently:

```bash
# Claude runs this behind the scenes; users don't see it
claude plugin install nanoclaw-skills@nanoclaw-skills --scope project
```

Skills are hot-loaded after `claude plugin install`; no restart is needed. This means `/setup` can install the marketplace plugin, then immediately run any feature skill, all in one session.

### Selective skill installation

`/setup` asks users what channels they want, then only offers relevant skills:

1. "Which messaging channels do you want to use?" → Discord, Telegram, Slack, WhatsApp
2. User picks Telegram → Claude installs the plugin and runs `/add-telegram`
3. After Telegram is set up: "Want to add Agent Swarm support for Telegram?" → offers `/add-telegram-swarm`
4. "Want to enable community skills?" → installs community marketplace plugins

Dependent skills (e.g., `telegram-swarm` depends on `telegram`) are only offered after their parent is installed. `/customize` follows the same pattern for post-setup additions.

### Marketplace configuration

NanoClaw's `.claude/settings.json` registers the official marketplace:

```json
{
  "extraKnownMarketplaces": {
    "nanoclaw-skills": {
      "source": {
        "source": "github",
        "repo": "nanocoai/nanoclaw-skills"
      }
    }
  }
}
```

The marketplace repo uses Claude Code's plugin structure:

```
nanocoai/nanoclaw-skills/
  .claude-plugin/
    marketplace.json          # Plugin catalog
  plugins/
    nanoclaw-skills/          # Single plugin bundling all official skills
      .claude-plugin/
        plugin.json           # Plugin manifest
      skills/
        add-discord/
          SKILL.md            # Setup instructions; step 1 is "merge the branch"
        add-telegram/
          SKILL.md
        add-slack/
          SKILL.md
        ...
```

Multiple skills are bundled in one plugin: installing `nanoclaw-skills` makes all feature skills available at once. Individual skills don't need separate installation.

Each SKILL.md tells Claude to merge the corresponding skill branch as step 1, then walks through interactive setup (env vars, bot creation, etc.).

### Applying a skill

User runs `/add-discord` (discovered via marketplace). Claude follows the SKILL.md:

1. `git fetch upstream skill/discord`
2. `git merge upstream/skill/discord`
3. Interactive setup (create bot, get token, configure env vars, etc.)

Or manually:

```bash
git fetch upstream skill/discord
git merge upstream/skill/discord
```

### Applying multiple skills

```bash
git merge upstream/skill/discord
git merge upstream/skill/telegram
```

Git handles the composition. If both skills modify the same lines, it's a real conflict and Claude resolves it.

### Updating core

```bash
git fetch upstream main
git merge upstream/main
```

Since skill branches are kept merged-forward with main (see CI section), the user's merged-in skill changes and upstream changes have proper common ancestors.

### Checking for skill updates

Users who previously merged a skill branch can check for updates. For each `upstream/skill/*` branch, check whether the branch has commits that aren't in the user's HEAD:

```bash
git fetch upstream
for branch in $(git branch -r | grep 'upstream/skill/'); do
  # Check if user has merged this skill at some point
  merge_base=$(git merge-base HEAD "$branch" 2>/dev/null) || continue
  # Check if the skill branch has new commits beyond what the user has
  if ! git merge-base --is-ancestor "$branch" HEAD 2>/dev/null; then
    echo "$branch has updates available"
  fi
done
```

This requires no state: it uses git history to determine which skills were previously merged and whether they have new commits.

This logic is available in two ways:

- Built into `/update-nanoclaw`: after merging main, optionally check for skill updates
- Standalone `/update-skills`: check and merge skill updates independently

### Conflict resolution

At any merge step, conflicts may arise. Claude resolves them by reading the conflicted files, understanding the intent of both sides, and producing the correct result. This is what makes the branch approach viable at scale: conflict resolution that previously required human judgment is now automated.

### Skill dependencies

Some skills depend on other skills. E.g., `skill/telegram-swarm` requires `skill/telegram`. Dependent skill branches are branched from their parent skill branch, not from `main`.

This means `skill/telegram-swarm` includes all of telegram's changes plus its own additions. When a user merges `skill/telegram-swarm`, they get both; there is no need to merge telegram separately.

Dependencies are implicit in git history: `git merge-base --is-ancestor` determines whether one skill branch is an ancestor of another. No separate dependency file is needed.
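
The ancestry test behind both the update check and the dependency relation can be illustrated on a toy commit graph. This is a sketch for intuition only, not git plumbing: the `Graph` type, the `isAncestor` function and the commit ids are all hypothetical, and the function merely mirrors the reachability question that `git merge-base --is-ancestor <ancestor> <commit>` answers.

```typescript
// Toy commit graph: each commit id maps to its parent ids.
type Graph = Record<string, string[]>;

// True if `ancestor` is reachable from `commit` by following parent links
// (a commit counts as its own ancestor, as it does for git).
function isAncestor(graph: Graph, ancestor: string, commit: string): boolean {
  const stack: string[] = [commit];
  const seen = new Set<string>();
  while (stack.length > 0) {
    const current: string = stack.pop() as string;
    if (current === ancestor) return true;
    if (seen.has(current)) continue;
    seen.add(current);
    stack.push(...(graph[current] ?? []));
  }
  return false;
}

const graph: Graph = {
  base: [],                   // a commit on main
  tg1: ['base'],              // tip of skill/telegram when the user merged it
  swarm1: ['tg1'],            // skill/telegram-swarm, branched from telegram
  userMerge: ['base', 'tg1'], // the user's HEAD: a merge of skill/telegram
  tg2: ['tg1'],               // a later commit on skill/telegram
};

console.log(isAncestor(graph, 'tg1', 'swarm1'));    // true: swarm builds on telegram
console.log(isAncestor(graph, 'tg1', 'userMerge')); // true: telegram fully merged
console.log(isAncestor(graph, 'tg2', 'userMerge')); // false: updates available
```

The last line corresponds to the `if ! git merge-base --is-ancestor "$branch" HEAD` branch of an update check: the skill branch tip is not yet contained in the user's history.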

### Uninstalling a skill

```bash
# Find the merge commit
git log --merges --oneline | grep discord

# Revert it
git revert -m 1 <merge-commit>
```

This creates a new commit that undoes the skill's changes. Claude can handle the whole flow.

If the user has modified the skill's code since merging (custom changes on top), the revert might conflict; Claude resolves it.

If the user later wants to re-apply the skill, they need to revert the revert first (git treats reverted changes as "already applied and undone"). Claude handles this too.

## CI: Keeping Skill Branches Current

A GitHub Action runs on every push to `main`:

1. List all `skill/*` branches
2. For each skill branch, merge `main` into it (merge-forward, not rebase)
3. Run build and tests on the merged result
4. If tests pass, push the updated skill branch
5. If a skill fails (conflict, build error, test failure), open a GitHub issue for manual resolution

**Why merge-forward instead of rebase:**

- No force-push: preserves history for users who already merged the skill
- Users can re-merge a skill branch to pick up skill updates (bug fixes, improvements)
- Git has proper common ancestors throughout the merge graph

**Why this scales:** With a few hundred skills and a few commits to main per day, the CI cost is trivial. Haiku is fast and cheap. The approach that wouldn't have been feasible a year or two ago is now practical because Claude can resolve conflicts at scale.

## Installation Flow

### New users (recommended)

1. Fork `nanocoai/nanoclaw` on GitHub (click the Fork button)
2. Clone your fork:
   ```bash
   git clone https://github.com/<you>/nanoclaw.git
   cd nanoclaw
   ```
3. Run Claude Code:
   ```bash
   claude
   ```
4. Run `/setup`: Claude handles dependencies, authentication, container setup, service configuration, and adds the `upstream` remote if not present

Forking is recommended because it gives users a remote to push their customizations to. Clone-only works for trying things out but provides no remote backup.

### Existing users migrating from clone

Users who previously ran `git clone https://github.com/nanocoai/nanoclaw.git` and have local customizations:

1. Fork `nanocoai/nanoclaw` on GitHub
2. Reroute remotes:
   ```bash
   git remote rename origin upstream
   git remote add origin https://github.com/<you>/nanoclaw.git
   git push --force origin main
   ```
   The `--force` is needed because the fresh fork's main is at upstream's latest, but the user wants their (possibly behind) version. The fork was just created so there's nothing to lose.
3. From this point, `origin` = their fork, `upstream` = nanocoai/nanoclaw

### Existing users migrating from the old skills engine

Users who previously applied skills via the `skills-engine/` system have skill code in their tree but no merge commits linking to skill branches. Git doesn't know these changes came from a skill, so merging a skill branch on top would conflict or duplicate.

**For new skills going forward:** just merge skill branches as normal. No issue.

**For existing old-engine skills**, two migration paths:

**Option A: Per-skill reapply (keep your fork)**

1. For each old-engine skill: identify and revert the old changes, then merge the skill branch fresh
2. Claude assists with identifying what to revert and resolving any conflicts
3. Custom modifications (non-skill changes) are preserved

**Option B: Fresh start (cleanest)**

1. Create a new fork from upstream
2. Merge the skill branches you want
3. Manually re-apply your custom (non-skill) changes
4. Claude assists by diffing your old fork against the new one to identify custom changes

In both cases:

- Delete the `.nanoclaw/` directory (no longer needed)
- The `skills-engine/` code will be removed from upstream once all skills are migrated
- `/update-skills` only tracks skills applied via branch merge; old-engine skills won't appear in update checks
|
||||
|
||||
## User Workflows
|
||||
|
||||
### Custom changes
|
||||
|
||||
Users make custom changes directly on their main branch. This is the standard fork workflow — their `main` IS their customized version.
|
||||
|
||||
```bash
|
||||
# Make changes
|
||||
vim src/config.ts
|
||||
git commit -am "change trigger word to @Bob"
|
||||
git push origin main
|
||||
```
|
||||
|
||||
Custom changes, skills, and core updates all coexist on their main branch. Git handles the three-way merging at each merge step because it can trace common ancestors through the merge history.
|
||||
|
||||
### Applying a skill
|
||||
|
||||
Run `/add-discord` in Claude Code (discovered via the marketplace plugin), or manually:
|
||||
|
||||
```bash
|
||||
git fetch upstream skill/discord
|
||||
git merge upstream/skill/discord
|
||||
# Follow setup instructions for configuration
|
||||
git push origin main
|
||||
```
|
||||
|
||||
If the user is behind upstream's main when they merge a skill branch, the merge might bring in some core changes too (since skill branches are merged-forward with main). This is generally fine — they get a compatible version of everything.
|
||||
|
||||
### Updating core
|
||||
|
||||
```bash
|
||||
git fetch upstream main
|
||||
git merge upstream/main
|
||||
git push origin main
|
||||
```
|
||||
|
||||
This is the same as the existing `/update-nanoclaw` skill's merge path.
|
||||
|
||||
### Updating skills
|
||||
|
||||
Run `/update-skills` or let `/update-nanoclaw` check after a core update. For each previously-merged skill branch that has new commits, Claude offers to merge the updates.
|
||||
|
||||
### Contributing back to upstream
|
||||
|
||||
Users who want to submit a PR to upstream:
|
||||
|
||||
```bash
git fetch upstream main
git checkout -b my-fix upstream/main
# Make changes
git push origin my-fix
# Create PR from my-fix to nanocoai/nanoclaw:main
```
|
||||
|
||||
Standard fork contribution workflow. Their custom changes stay on their main and don't leak into the PR.
|
||||
|
||||
## Contributing a Skill
|
||||
|
||||
The flow below is for **feature skills** (branch-based). For utility skills (self-contained tools) and container skills, the contributor opens a PR that adds files directly to `.claude/skills/<name>/` or `container/skills/<name>/` — no branch extraction needed. See [CONTRIBUTING.md](../CONTRIBUTING.md) for all skill types.
|
||||
|
||||
### Contributor flow (feature skills)
|
||||
|
||||
1. Fork `nanocoai/nanoclaw`
|
||||
2. Branch from `main`
|
||||
3. Make the code changes (new channel file, modified integration points, updated package.json, .env.example additions, etc.)
|
||||
4. Open a PR to `main`
|
||||
|
||||
The contributor opens a normal PR — they don't need to know about skill branches or marketplace repos. They just make code changes and submit.
|
||||
|
||||
### Maintainer flow
|
||||
|
||||
When a skill PR is reviewed and approved:
|
||||
|
||||
1. Create a `skill/<name>` branch from the PR's commits:
|
||||
   ```bash
   git fetch origin pull/<PR_NUMBER>/head:skill/<name>
   git push origin skill/<name>
   ```
|
||||
2. Force-push to the contributor's PR branch, replacing it with a single commit that adds the contributor to `CONTRIBUTORS.md` (removing all code changes)
|
||||
3. Merge the slimmed PR into `main` (just the contributor addition)
|
||||
4. Add the skill's SKILL.md to the marketplace repo (`nanocoai/nanoclaw-skills`)
|
||||
|
||||
This way:
|
||||
- The contributor gets merge credit (their PR is merged)
|
||||
- The maintainer adds them to `CONTRIBUTORS.md`, so they don't have to edit it themselves
|
||||
- The skill branch is created from their work
|
||||
- `main` stays clean (no skill code)
|
||||
- The contributor only had to do one thing: open a PR with code changes
|
||||
|
||||
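**Note:** GitHub PRs from user-owned forks have "Allow edits from maintainers" checked by default, which is what lets the maintainer push to the contributor's PR branch. If the contributor unchecked it, or the option is unavailable (as can be the case for organization-owned forks), step 2 needs the contributor's cooperation.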
|
||||
|
||||
### Skill SKILL.md
|
||||
|
||||
The contributor can optionally provide a SKILL.md (either in the PR or separately). This goes into the marketplace repo and contains:
|
||||
|
||||
1. Frontmatter (name, description, triggers)
|
||||
2. Step 1: Merge the skill branch
|
||||
3. Steps 2-N: Interactive setup (create bot, get token, configure env vars, verify)
|
||||
|
||||
If the contributor doesn't provide a SKILL.md, the maintainer writes one based on the PR.
|
||||
|
||||
## Community Marketplaces
|
||||
|
||||
Anyone can maintain their own fork with skill branches and their own marketplace repo. This enables a community-driven skill ecosystem without requiring write access to the upstream repo.
|
||||
|
||||
### How it works
|
||||
|
||||
A community contributor:
|
||||
|
||||
1. Maintains a fork of NanoClaw (e.g., `alice/nanoclaw`)
|
||||
2. Creates `skill/*` branches on their fork with their custom skills
|
||||
3. Creates a marketplace repo (e.g., `alice/nanoclaw-skills`) with a `.claude-plugin/marketplace.json` and plugin structure
|
||||
|
||||
### Adding a community marketplace
|
||||
|
||||
If the community contributor is trusted, they can open a PR to add their marketplace to NanoClaw's `.claude/settings.json`:
|
||||
|
||||
```json
{
  "extraKnownMarketplaces": {
    "nanoclaw-skills": {
      "source": {
        "source": "github",
        "repo": "nanocoai/nanoclaw-skills"
      }
    },
    "alice-nanoclaw-skills": {
      "source": {
        "source": "github",
        "repo": "alice/nanoclaw-skills"
      }
    }
  }
}
```
|
||||
|
||||
Once merged, all NanoClaw users automatically discover the community marketplace alongside the official one.
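
As a rough illustration of what such a PR changes, the sketch below adds one marketplace entry to a settings object without disturbing entries that are already there. The `ClaudeSettings` type and the `addMarketplace` helper are hypothetical names introduced here; only the `extraKnownMarketplaces` shape is taken from the JSON above.

```typescript
// Shape copied from the settings.json example; type names are hypothetical.
interface MarketplaceSource {
  source: "github";
  repo: string;
}

interface ClaudeSettings {
  extraKnownMarketplaces?: Record<string, { source: MarketplaceSource }>;
}

// Return a new settings object with the marketplace registered.
// An existing entry under the same key is left untouched.
function addMarketplace(settings: ClaudeSettings, key: string, repo: string): ClaudeSettings {
  const existing = settings.extraKnownMarketplaces ?? {};
  if (key in existing) return settings;
  const entry: { source: MarketplaceSource } = { source: { source: "github", repo } };
  return { ...settings, extraKnownMarketplaces: { ...existing, [key]: entry } };
}

const baseSettings: ClaudeSettings = {
  extraKnownMarketplaces: {
    "nanoclaw-skills": { source: { source: "github", repo: "nanocoai/nanoclaw-skills" } },
  },
};
const withAlice = addMarketplace(baseSettings, "alice-nanoclaw-skills", "alice/nanoclaw-skills");
console.log(Object.keys(withAlice.extraKnownMarketplaces ?? {}).join(", "));
```

Returning a new object rather than mutating the input keeps the official marketplace entry intact, which matches the intent that community marketplaces sit alongside it.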
|
||||
|
||||
### Installing community skills
|
||||
|
||||
`/setup` and `/customize` ask users whether they want to enable community skills. If yes, Claude installs community marketplace plugins via `claude plugin install`:
|
||||
|
||||
```bash
claude plugin install alice-skills@alice-nanoclaw-skills --scope project
```
|
||||
|
||||
Community skills are hot-loaded and immediately available — no restart needed. Dependent skills are only offered after their prerequisites are met (e.g., community Telegram add-ons only after Telegram is installed).
|
||||
|
||||
Users can also browse and install community plugins manually via `/plugin`.
|
||||
|
||||
### Properties of this system
|
||||
|
||||
- **No gatekeeping required.** Anyone can create skills on their fork without permission. They only need approval to be listed in the auto-discovered marketplaces.
|
||||
- **Multiple marketplaces coexist.** Users see skills from all trusted marketplaces in `/plugin`.
|
||||
- **Community skills use the same merge pattern.** The SKILL.md just points to a different remote:
|
||||
  ```bash
  git remote add alice https://github.com/alice/nanoclaw.git
  git fetch alice skill/my-cool-feature
  git merge alice/skill/my-cool-feature
  ```
|
||||
- **Users can also add marketplaces manually.** Even without being listed in settings.json, users can run `/plugin marketplace add alice/nanoclaw-skills` to discover skills from any source.
|
||||
- **CI is per-fork.** Each community maintainer runs their own CI to keep their skill branches merged-forward. They can use the same GitHub Action as the upstream repo.
|
||||
|
||||
## Flavors
|
||||
|
||||
A flavor is a curated fork of NanoClaw — a combination of skills, custom changes, and configuration tailored for a specific use case (e.g., "NanoClaw for Sales," "NanoClaw Minimal," "NanoClaw for Developers").
|
||||
|
||||
### Creating a flavor
|
||||
|
||||
1. Fork `nanocoai/nanoclaw`
|
||||
2. Merge in the skills you want
|
||||
3. Make custom changes (trigger word, prompts, integrations, etc.)
|
||||
4. Your fork's `main` IS the flavor
|
||||
|
||||
### Installing a flavor
|
||||
|
||||
During `/setup`, users are offered a choice of flavors before any configuration happens. The setup skill reads `flavors.yaml` from the repo (shipped with upstream, always up to date) and presents options:
|
||||
|
||||
AskUserQuestion: "Start with a flavor or default NanoClaw?"
|
||||
- Default NanoClaw
|
||||
- NanoClaw for Sales — Gmail + Slack + CRM (maintained by alice)
|
||||
- NanoClaw Minimal — Telegram-only, lightweight (maintained by bob)
|
||||
|
||||
If a flavor is chosen:
|
||||
|
||||
```bash
git remote add <flavor-name> https://github.com/alice/nanoclaw.git
git fetch <flavor-name> main
git merge <flavor-name>/main
```
|
||||
|
||||
Then setup continues normally (dependencies, auth, container, service).
|
||||
|
||||
**This choice is only offered on a fresh fork** — when the user's main matches or is close to upstream's main with no local commits. If `/setup` detects significant local changes (re-running setup on an existing install), it skips the flavor selection and goes straight to configuration.
|
||||
|
||||
After installation, the user's fork has three remotes:
|
||||
- `origin` — their fork (push customizations here)
|
||||
- `upstream` — `nanocoai/nanoclaw` (core updates)
|
||||
- `<flavor-name>` — the flavor fork (flavor updates)
|
||||
|
||||
### Updating a flavor
|
||||
|
||||
```bash
git fetch <flavor-name> main
git merge <flavor-name>/main
```
|
||||
|
||||
The flavor maintainer keeps their fork updated (merging upstream, updating skills). Users pull flavor updates the same way they pull core updates.
|
||||
|
||||
### Flavors registry
|
||||
|
||||
`flavors.yaml` lives in the upstream repo:
|
||||
|
||||
```yaml
flavors:
  - name: NanoClaw for Sales
    repo: alice/nanoclaw
    description: Gmail + Slack + CRM integration, daily pipeline summaries
    maintainer: alice

  - name: NanoClaw Minimal
    repo: bob/nanoclaw
    description: Telegram-only, no container overhead
    maintainer: bob
```
|
||||
|
||||
Anyone can PR to add their flavor. The file is available locally when `/setup` runs since it's part of the cloned repo.
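
To show how a registry entry could feed the setup prompt and the merge step, here is a minimal TypeScript sketch. The `Flavor` interface mirrors the fields in `flavors.yaml`; the label format and the helper names `flavorOptions` and `flavorRemoteUrl` are assumptions made for illustration, not the setup skill's actual code.

```typescript
// Mirrors the fields shown in flavors.yaml.
interface Flavor {
  name: string;
  repo: string;
  description: string;
  maintainer: string;
}

// Build the list of choices offered during /setup.
// The label format is an assumption modeled on the example prompt.
function flavorOptions(flavors: Flavor[]): string[] {
  const labels = flavors.map(
    (f) => `${f.name}: ${f.description} (maintained by ${f.maintainer})`,
  );
  return ["Default NanoClaw", ...labels];
}

// URL to pass to `git remote add <flavor-name> ...` for a chosen flavor.
function flavorRemoteUrl(flavor: Flavor): string {
  return `https://github.com/${flavor.repo}.git`;
}

const sampleFlavors: Flavor[] = [
  {
    name: "NanoClaw Minimal",
    repo: "bob/nanoclaw",
    description: "Telegram-only, no container overhead",
    maintainer: "bob",
  },
];
console.log(flavorOptions(sampleFlavors).join("\n"));
console.log(flavorRemoteUrl(sampleFlavors[0]));
```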
|
||||
|
||||
### Discoverability
|
||||
|
||||
- **During setup** — flavor selection is offered as part of the initial setup flow
|
||||
- **`/browse-flavors` skill** — reads `flavors.yaml` and presents options at any time
|
||||
- **GitHub topics** — flavor forks can tag themselves with `nanoclaw-flavor` for searchability
|
||||
- **Discord / website** — community-curated lists
|
||||
|
||||
## Migration
|
||||
|
||||
Migration from the old skills engine to branches is complete. All feature skills now live on `skill/*` branches, and the skills engine has been removed.
|
||||
|
||||
### Skill branches
|
||||
|
||||
| Branch | Base | Description |
|--------|------|-------------|
| `skill/whatsapp` | `main` | WhatsApp channel |
| `skill/telegram` | `main` | Telegram channel |
| `skill/slack` | `main` | Slack channel |
| `skill/discord` | `main` | Discord channel |
| `skill/gmail` | `main` | Gmail channel |
| `skill/voice-transcription` | `skill/whatsapp` | OpenAI Whisper voice transcription |
| `skill/image-vision` | `skill/whatsapp` | Image attachment processing |
| `skill/pdf-reader` | `skill/whatsapp` | PDF attachment reading |
| `skill/local-whisper` | `skill/voice-transcription` | Local whisper.cpp transcription |
| `skill/ollama-tool` | `main` | Ollama MCP server for local models |
| `skill/apple-container` | `main` | Apple Container runtime |
| `skill/reactions` | `main` | WhatsApp emoji reactions |
|
||||
|
||||
### What was removed
|
||||
|
||||
- `skills-engine/` directory (entire engine)
|
||||
- `scripts/apply-skill.ts`, `scripts/uninstall-skill.ts`, `scripts/rebase.ts`
|
||||
- `scripts/fix-skill-drift.ts`, `scripts/validate-all-skills.ts`
|
||||
- `.github/workflows/skill-drift.yml`, `.github/workflows/skill-pr.yml`
|
||||
- All `add/`, `modify/`, `tests/`, and `manifest.yaml` from skill directories
|
||||
- `.nanoclaw/` state directory
|
||||
|
||||
Operational skills (`setup`, `debug`, `update-nanoclaw`, `customize`, `update-skills`) remain on main in `.claude/skills/`.
|
||||
|
||||
## What Changes
|
||||
|
||||
### README Quick Start
|
||||
|
||||
Before:
|
||||
```bash
git clone https://github.com/nanocoai/NanoClaw.git
cd NanoClaw
claude
```
|
||||
|
||||
After:
|
||||
```
1. Fork nanocoai/nanoclaw on GitHub
2. git clone https://github.com/<you>/nanoclaw.git
3. cd nanoclaw
4. claude
5. /setup
```
|
||||
|
||||
### Setup skill (`/setup`)
|
||||
|
||||
Updates to the setup flow:
|
||||
|
||||
- Check if `upstream` remote exists; if not, add it: `git remote add upstream https://github.com/nanocoai/nanoclaw.git`
|
||||
- Check if `origin` points to the user's fork (not nanocoai). If it points to nanocoai, guide them through the fork migration.
|
||||
- **Install marketplace plugin:** `claude plugin install nanoclaw-skills@nanoclaw-skills --scope project` — makes all feature skills available (hot-loaded, no restart)
|
||||
- **Ask which channels to add:** present channel options (Discord, Telegram, Slack, WhatsApp, Gmail), run corresponding `/add-*` skills for selected channels
|
||||
- **Offer dependent skills:** after a channel is set up, offer relevant add-ons (e.g., Agent Swarm after Telegram, voice transcription after WhatsApp)
|
||||
- **Optionally enable community marketplaces:** ask if the user wants community skills, install those marketplace plugins too
|
||||
|
||||
### `.claude/settings.json`
|
||||
|
||||
Marketplace configuration so the official marketplace is auto-registered:
|
||||
|
||||
```json
{
  "extraKnownMarketplaces": {
    "nanoclaw-skills": {
      "source": {
        "source": "github",
        "repo": "nanocoai/nanoclaw-skills"
      }
    }
  }
}
```
|
||||
|
||||
### Skills directory on main
|
||||
|
||||
The `.claude/skills/` directory on `main` retains only operational skills (setup, debug, update-nanoclaw, customize, update-skills). Feature skills (add-discord, add-telegram, etc.) live in the marketplace repo, installed via `claude plugin install` during `/setup` or `/customize`.
|
||||
|
||||
### Skills engine removal
|
||||
|
||||
The following can be removed:
|
||||
|
||||
- `skills-engine/` — entire directory (apply, merge, replay, state, backup, etc.)
|
||||
- `scripts/apply-skill.ts`
|
||||
- `scripts/uninstall-skill.ts`
|
||||
- `scripts/fix-skill-drift.ts`
|
||||
- `scripts/validate-all-skills.ts`
|
||||
- `.nanoclaw/` — state directory
|
||||
- `add/` and `modify/` subdirectories from all skill directories
|
||||
- Feature skill SKILL.md files from `.claude/skills/` on main (they now live in the marketplace)
|
||||
|
||||
Operational skills (`setup`, `debug`, `update-nanoclaw`, `customize`, `update-skills`) remain on main in `.claude/skills/`.
|
||||
|
||||
### New infrastructure
|
||||
|
||||
- **Marketplace repo** (`nanocoai/nanoclaw-skills`) — single Claude Code plugin bundling SKILL.md files for all feature skills
|
||||
- **CI GitHub Action** — merge-forward `main` into all `skill/*` branches on every push to `main`, using Claude (Haiku) for conflict resolution
|
||||
- **`/update-skills` skill** — checks for and applies skill branch updates using git history
|
||||
- **`CONTRIBUTORS.md`** — tracks skill contributors
|
||||
|
||||
### Update skill (`/update-nanoclaw`)
|
||||
|
||||
The update skill gets simpler with the branch-based approach. The old skills engine required replaying all applied skills after merging core updates — that entire step disappears. Skill changes are already in the user's git history, so `git merge upstream/main` just works.
|
||||
|
||||
**What stays the same:**
|
||||
- Preflight (clean working tree, upstream remote)
|
||||
- Backup branch + tag
|
||||
- Preview (git log, git diff, file buckets)
|
||||
- Merge/cherry-pick/rebase options
|
||||
- Conflict preview (dry-run merge)
|
||||
- Conflict resolution
|
||||
- Build + test validation
|
||||
- Rollback instructions
|
||||
|
||||
**What's removed:**
|
||||
- Skill replay step (was needed by the old skills engine to re-apply skills after core update)
|
||||
- Re-running structured operations (npm deps, env vars — these are part of git history now)
|
||||
|
||||
**What's added:**
|
||||
- Optional step at the end: "Check for skill updates?" which runs the `/update-skills` logic
|
||||
- This checks whether any previously-merged skill branches have new commits (bug fixes, improvements to the skill itself — not just merge-forwards from main)
|
||||
|
||||
**Why users don't need to re-merge skills after a core update:**
|
||||
When the user merged a skill branch, those changes became part of their git history. When they later merge `upstream/main`, git performs a normal three-way merge — the skill changes in their tree are untouched, and only core changes are brought in. The merge-forward CI ensures skill branches stay compatible with latest main, but that's for new users applying the skill fresh. Existing users who already merged the skill don't need to do anything.
|
||||
|
||||
Users only need to re-merge a skill branch if the skill itself was updated (not just merged-forward with main). The `/update-skills` check detects this.
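
The distinction between "the skill itself was updated" and "merged-forward with main" can be sketched as a small predicate. This is an illustration under stated assumptions, not the `/update-skills` implementation: the `PendingCommit` shape, the three `kind` values, and the placeholder SHAs are hypothetical, and how each commit gets classified (for example, by inspecting merge parents with git) is outside the sketch.

```typescript
// Hypothetical classification of commits that are on the upstream skill
// branch but not yet in the user's history.
type PendingKind = "from-main" | "merge-forward" | "skill-change";

interface PendingCommit {
  sha: string; // placeholder values below, not real commits
  kind: PendingKind;
}

// A skill needs re-merging only if something other than merge-forward
// traffic from main landed on its branch.
function needsRemerge(pending: PendingCommit[]): boolean {
  return pending.some((c) => c.kind === "skill-change");
}

const onlyMergeForward: PendingCommit[] = [
  { sha: "a1b2c3d", kind: "from-main" },
  { sha: "d4e5f6a", kind: "merge-forward" },
];
const withSkillFix: PendingCommit[] = [
  ...onlyMergeForward,
  { sha: "0f9e8d7", kind: "skill-change" },
];

console.log(needsRemerge(onlyMergeForward), needsRemerge(withSkillFix)); // false true
```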
|
||||
|
||||
## Discord Announcement
|
||||
|
||||
### For existing users
|
||||
|
||||
> **Skills are now git branches**
|
||||
>
|
||||
> We've simplified how skills work in NanoClaw. Instead of a custom skills engine, skills are now git branches that you merge in.
|
||||
>
|
||||
> **What this means for you:**
|
||||
> - Applying a skill: `git fetch upstream skill/discord && git merge upstream/skill/discord`
|
||||
> - Updating core: `git fetch upstream main && git merge upstream/main`
|
||||
> - Checking for skill updates: `/update-skills`
|
||||
> - No more `.nanoclaw/` state directory or skills engine
|
||||
>
|
||||
> **We now recommend forking instead of cloning.** This gives you a remote to push your customizations to.
|
||||
>
|
||||
> **If you currently have a clone with local changes**, migrate to a fork:
|
||||
> 1. Fork `nanocoai/nanoclaw` on GitHub
|
||||
> 2. Run:
|
||||
> ```
> git remote rename origin upstream
> git remote add origin https://github.com/<you>/nanoclaw.git
> git push --force origin main
> ```
|
||||
> This works even if you're way behind — just push your current state.
|
||||
>
|
||||
> **If you previously applied skills via the old system**, your code changes are already in your working tree — nothing to redo. You can delete the `.nanoclaw/` directory. Future skills and updates use the branch-based approach.
|
||||
>
|
||||
> **Discovering skills:** Skills are now available through Claude Code's plugin marketplace. Run `/plugin` in Claude Code to browse and install available skills.
|
||||
|
||||
### For skill contributors
|
||||
|
||||
> **Contributing skills**
|
||||
>
|
||||
> To contribute a skill:
|
||||
> 1. Fork `nanocoai/nanoclaw`
|
||||
> 2. Branch from `main` and make your code changes
|
||||
> 3. Open a regular PR
|
||||
>
|
||||
> That's it. We'll create a `skill/<name>` branch from your PR, add you to CONTRIBUTORS.md, and add the SKILL.md to the marketplace. CI automatically keeps skill branches merged-forward with `main` using Claude to resolve any conflicts.
|
||||
>
|
||||
> **Want to run your own skill marketplace?** Maintain skill branches on your fork and create a marketplace repo. Open a PR to add it to NanoClaw's auto-discovered marketplaces — or users can add it manually via `/plugin marketplace add`.
|
||||
@@ -0,0 +1,150 @@
|
||||
# The skills model
|
||||
|
||||
How NanoClaw stays customizable without breaking its forks. This is the full version; [customizing.md](customizing.md) is the short one, and [skill-guidelines.md](skill-guidelines.md) is the authoritative checklist for writing a skill.
|
||||
|
||||
## The problem
|
||||
|
||||
People fork NanoClaw and change the code. When we ship updates, their changes collide with ours and `git merge` turns into a fight. The more someone customized, the worse it gets. We can't grow the core without breaking everyone downstream.
|
||||
|
||||
## The bet
|
||||
|
||||
Every customization is a skill: not an edit buried in the core, but a skill that adds the change on top.
|
||||
|
||||
The core stays small and stable. Everything else composes on top as skills. Adding your 1st skill and your 500th skill is the same amount of work.
|
||||
|
||||
This works for any fork: a personal install with three tweaks, a company build with fifty.
|
||||
|
||||
## A fork is a recipe of skills
|
||||
|
||||
You don't track your changes as a pile of edits. You track them as skills.
|
||||
|
||||
- Each customization = one small skill.
|
||||
- One "recipe" skill lists all your skills and how they fit together: the order, and any dependencies between them.
|
||||
|
||||
So a fork is defined by its recipe. Most upgrades don't need to run it (see "Upgrading"), but it's what lets you rebuild the fork from scratch on clean upstream, and it's how you hand your whole fork to someone else. It replaces every "what did I change" artifact you'd otherwise keep (a migration guide, a manifest, a pile of notes) with one runnable thing.
|
||||
|
||||
The recipe is the one fork-specific thing. It lives in your fork, never upstream. (A recipe is itself a skill: a SKILL.md listing the fork's skills in apply order.)
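
To make "apply order" with dependencies concrete, here is a minimal TypeScript sketch of ordering a recipe's entries so that every dependency is applied first. The `RecipeEntry` shape, the `dependsOn` field, and the skill names are hypothetical; the source only says a recipe is a SKILL.md listing skills in apply order.

```typescript
// Hypothetical recipe entry: the real recipe is a SKILL.md, not this type.
interface RecipeEntry {
  skill: string;
  dependsOn: string[];
}

// Return skill names in an order where each skill comes after its dependencies.
// Skills with no ordering constraint between them keep their listed order.
function applyOrder(entries: RecipeEntry[]): string[] {
  const byName = new Map<string, RecipeEntry>();
  for (const e of entries) byName.set(e.skill, e);

  const ordered: string[] = [];
  const state = new Map<string, "visiting" | "done">();

  const visit = (skill: string, path: string[]): void => {
    if (state.get(skill) === "done") return;
    if (state.get(skill) === "visiting") {
      throw new Error(`dependency cycle: ${[...path, skill].join(" -> ")}`);
    }
    const entry = byName.get(skill);
    if (!entry) throw new Error(`unknown skill in recipe: ${skill}`);
    state.set(skill, "visiting");
    for (const dep of entry.dependsOn) visit(dep, [...path, skill]);
    state.set(skill, "done");
    ordered.push(skill);
  };

  for (const e of entries) visit(e.skill, []);
  return ordered;
}

const exampleRecipe: RecipeEntry[] = [
  { skill: "voice-transcription", dependsOn: ["whatsapp"] },
  { skill: "whatsapp", dependsOn: [] },
  { skill: "trigger-word-patch", dependsOn: [] },
];
console.log(applyOrder(exampleRecipe).join(", "));
// whatsapp, voice-transcription, trigger-word-patch
```

Failing loudly on a cycle or an unknown skill is a deliberate choice in this sketch: a recipe that cannot be ordered cannot rebuild the fork from clean upstream.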
|
||||
|
||||
## What's in a skill
|
||||
|
||||
A skill carries everything it needs:
|
||||
|
||||
- **Its code**: the files it adds (see "Where a skill's files live").
|
||||
- **Apply and remove.** Apply installs it; remove uninstalls it. Uninstall isn't a separate problem; it ships with the skill. (Remove is required exactly when apply leaves anything behind. A pure instruction-only skill that changes nothing needs none.)
|
||||
- **Its tests**: see "A test for every integration point." The tests *are* the verification. If they pass against the composed project, the skill applied correctly and works; there is no separate "verify" step.
|
||||
- **Its recipe entry**: how it composes with the others.
|
||||
|
||||
Apply must be safe to re-run. Upgrades re-run skills, so a skill that half-applies twice is a bug.
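
A small sketch of what "safe to re-run" can mean for the most common edit, adding a line to an existing file. The `ensureLine` helper and the barrel contents are hypothetical, and the sketch works on strings so it stays independent of any file layout.

```typescript
// Append `line` to `content` unless an identical line is already present.
// Returns the content unchanged in that case, so a second apply is a no-op.
function ensureLine(content: string, line: string): string {
  const lines = content.split("\n");
  if (lines.some((l) => l.trim() === line.trim())) return content;
  const needsNewline = content.length > 0 && !content.endsWith("\n");
  return content + (needsNewline ? "\n" : "") + line + "\n";
}

// Hypothetical barrel file and import line, for illustration only.
const barrelBefore = 'import "./telegram.js";\n';
const appliedOnce = ensureLine(barrelBefore, 'import "./discord.js";');
const appliedTwice = ensureLine(appliedOnce, 'import "./discord.js";');

console.log(appliedOnce === appliedTwice); // true
```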
|
||||
|
||||
## Two kinds of skills
|
||||
|
||||
- **Capability skills** add something new: a channel, a provider, a tool, a dashboard.
|
||||
- **Patch skills** make small tweaks or bug fixes to existing behavior, instead of adding a capability.
|
||||
|
||||
Patch skills follow the same rules: a test for every edit, and code pushed into independent files wherever possible instead of inline. To keep the overhead down, bundle several small patches into a single patch skill rather than making one skill per one-line fix.
|
||||
|
||||
One honest exception: a bug fix that genuinely changes an existing line can't always be moved into a new file. That single line is the one place an upgrade can still hard-conflict. If upstream touched the same line, the fix has to be re-derived against the new code. That's fine when it's small and tested; just don't pretend it's free.
|
||||
|
||||
(Packaging is a separate axis: some skills fetch code from a registry branch, some ship files in their own folder, some are pure instructions.)
|
||||
|
||||
## What makes a good skill
|
||||
|
||||
A good skill mostly just *adds* things:
|
||||
|
||||
- Adds new files.
|
||||
- Adds a line to an existing file (an import, an entry, a line in `.env`).
|
||||
- Adds a dependency.
|
||||
- Changes a value in a JSON file like `package.json`.
|
||||
|
||||
These never really break.
|
||||
|
||||
The one risky move is when a skill has to *reach into* existing code and wire something in at a specific spot. That's the only part that breaks when we change the code later. Keep these rare, and keep them to a line or two that just *calls* code living in the skill's own files, not big chunks of logic inline.
|
||||
|
||||
Rule of thumb: aim for skills that are almost all "adds." Not 100%; some reach-ins are fine. But a skill full of reach-ins is a smell, and a sign that spot in the core should become a proper hook.
|
||||
|
||||
## Where a skill's files live
|
||||
|
||||
The files a skill adds live in the skill's own folder, and the skill copies them into the project when it runs. The skill is self-contained.
|
||||
|
||||
The exception is skills that plug into a registry: channels and providers. Their code is larger, multi-file, and has to stay in sync with the core as it changes over time. That code lives on a long-lived **registry branch** (`channels`, `providers`) that we forward-merge against main, and the skill fetches it from there (`git show origin/channels:path > path`). A frozen copy in a skill folder would go stale.
|
||||
|
||||
This fetch is **additive, never a merge**. The skill copies in the files it needs; it does *not* `git merge` the branch. Merging a registry branch into a customized install is exactly the conflict fight this model exists to avoid. A skill's **tests live on the branch alongside its code** and are fetched the same way; a channel's adapter travels with its registration test. A provider is the multi-point case: its code spans the host *and* container trees plus a Dockerfile edit, so it fetches files into both trees and ships a registration test per tree. See the provider archetype in [skill-guidelines.md](skill-guidelines.md).
|
||||
|
||||
Either way the skill brings its own code, from its folder or from its branch.
|
||||
|
||||
## A test for every integration point
|
||||
|
||||
The tests a skill *must* ship are the ones that prove it integrates with the core and keeps working as the core changes. That's the whole point. Tests of a skill's own internal logic, or of its behavior against an external service, are fine but optional: the creator's call, because they don't guard against upstream changes. A pure-add skill that touches nothing existing needs no required integration test at all.
|
||||
|
||||
The places that break on upgrade are the **integration points**: wherever a skill reaches into the existing system. That's not just the obvious code edit. An appended import, a config entry, a Dockerfile change, a mount, an installed dependency, and a direct read of the core's data all count. Each gets a guard that goes **red if it breaks or goes missing**:
|
||||
|
||||
- **A behavior or structural test of the wiring.** Prefer behavior when the seam is queryable at runtime: a channel's registration test imports the real barrel and asserts the registry contains it. Fall back to a structural test only for wiring with no invocable seam.
|
||||
- **The build / typecheck.** Always on. It catches the drift a runtime test can't: a renamed symbol, a moved module, a changed signature.
|
||||
- **Coverage of how an added file consumes the core.** When a skill's own file reaches into core APIs or data, a test must exercise that consumption against the *real* core. That's the leg that catches core drift.
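
A behavior test of the wiring might look roughly like the sketch below. It is self-contained, so it defines a stand-in registry locally; a real registration test would import the project's actual barrel instead. The `ChannelAdapter` shape, `registerChannel`, and `assertChannelRegistered` are hypothetical names.

```typescript
// Stand-in for the core's channel registry (assumed shape).
interface ChannelAdapter {
  id: string;
  send(text: string): string;
}

const channelRegistry = new Map<string, ChannelAdapter>();

function registerChannel(adapter: ChannelAdapter): void {
  channelRegistry.set(adapter.id, adapter);
}

// What a skill's added file would do when the barrel imports it.
registerChannel({ id: "discord", send: (text) => `discord:${text}` });

// Behavior test of the wiring: goes red if the registration is missing.
function assertChannelRegistered(id: string): ChannelAdapter {
  const adapter = channelRegistry.get(id);
  if (!adapter) throw new Error(`channel "${id}" is not registered`);
  return adapter;
}

console.log(assertChannelRegistered("discord").send("ping")); // discord:ping
```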
|
||||
|
||||
Why points and not whole skills: a skill can have several, and each is a separate way to break. The count is honest signal: a skill's integration points are exactly its upgrade risk. Pure-add skills have zero and stay cheap.
|
||||
|
||||
This is what makes upgrades cheap to fix: when we move something in the core, the integration-point tests are exactly what fail, and that failing list *is* the set of skills to update.
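
One way to read "the failing list is the set of skills to update" is as a grouping of failed integration-point tests by skill. The sketch below assumes each test result carries a `skill` tag; that convention, the `IntegrationTestResult` shape, and the sample data are hypothetical.

```typescript
// Hypothetical test-result shape; tagging results by skill is an assumed convention.
interface IntegrationTestResult {
  skill: string;
  point: string; // for example "channel registration" or "Dockerfile edit"
  passed: boolean;
}

// Group failing integration points by skill. The keys are the skills to update.
function skillsToUpdate(results: IntegrationTestResult[]): Map<string, string[]> {
  const broken = new Map<string, string[]>();
  for (const r of results) {
    if (r.passed) continue;
    const points = broken.get(r.skill) ?? [];
    points.push(r.point);
    broken.set(r.skill, points);
  }
  return broken;
}

const brokenBySkill = skillsToUpdate([
  { skill: "discord", point: "channel registration", passed: true },
  { skill: "ollama", point: "Dockerfile edit", passed: false },
  { skill: "ollama", point: "container mount", passed: false },
]);
console.log(Array.from(brokenBySkill.keys()).join(", ")); // ollama
```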
|
||||
|
||||
**Tests travel with the skill.** They're files kept with the skill, in its folder or on its branch, and applying the skill copies them into the project's test tree. An integration-point test has to run against the *composed* system, so it only means anything once the skill is applied.
|
||||
|
||||
**The recipe tests the stack.** A single skill's tests prove that skill works alone. The recipe carries tests that run the skills *together*, in order. That's where you catch two skills that collide.
|
||||
|
||||
The full testing doctrine (how to pick the test type per point, the archetypes, the dependency cases) is in [skill-guidelines.md](skill-guidelines.md).
|
||||
|
||||
## How you actually work
|
||||
|
||||
You don't have to write a skill before you touch anything. Edit the code directly, get it working, then turn those edits into skills afterward; a coding agent does that conversion. Good authoring guidelines and a good recipe make skillifying-after-the-fact close to trivial.
|
||||
|
||||
The point isn't to slow you down at edit time. It's that nothing counts as part of your fork until it's a skill, because that's the only form that survives an upgrade.
|
||||
|
||||
## Upgrading
|
||||
|
||||
**Every update goes through `/update-nanoclaw`, never a raw `git pull`.** You don't know what an update contains until it lands; it might carry a breaking change with a migration. So the command inspects what's coming and runs the proper process: back up, pull the changes in, apply migrations, run tests, fix what broke, and flag when a fresh rebuild is needed instead.
|
||||
|
||||
Two different moves, two different rules. Your **fork pulls trunk**: that's a normal pull, run by the update command, and it's safe precisely because your changes live beside the core as skills rather than inside it. A **skill never merges**: it installs by fetching files and copying them in. If a skill's instructions say `git merge`, it isn't built to this model.
|
||||
|
||||
The update takes one of two paths:
|
||||
|
||||
**Normal upgrade: pull and fix what breaks.** Most of the time it pulls the latest upstream, resolves the occasional small conflict, runs the tests, and fixes whatever they flag. This stays cheap *because* the changes are small self-contained skills with tests: conflicts are rare, and when something does break, the failing test points at the exact skill and the fix is local.
|
||||
|
||||
**Rebuild from the recipe: the rare path.** Take fresh upstream and apply every skill from scratch. The command flags this when you've fallen far behind across many breaking changes (a clean rebuild beats catching up step by step). It's also how you hand your entire fork to someone else.
|
||||
|
||||
Around both:
|
||||
|
||||
- **The update skill updates itself first.** The first thing it does is fetch the latest version of the upgrade process. Otherwise you're upgrading with stale instructions.
|
||||
- **Snapshot first, restore on failure.** The upgrade sets a rollback point before it starts: today a git backup branch and tag; the model calls for a full project snapshot (code, database, data, files) so anything that fails rolls back and retries. Until that snapshot lands, a migration that touches data makes its own data backup. Nothing in the upgrade needs its own undo logic.
|
||||
- **Broken skills don't block you.** If a core change broke a skill, its test tells you, but the skill is usually still usable, and an agent fixes it at apply time. Skills are fixed lazily, when applied, not ahead of time for every core version.
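
The "snapshot first, restore on failure" rule can be sketched with an in-memory stand-in for project state. Everything here is illustrative: `ProjectState`, `Migration`, and `runMigrations` are hypothetical names, and a real snapshot would cover code, database, data, and files rather than a flat record.

```typescript
// In-memory stand-in for project state.
type ProjectState = Record<string, string>;

// A forward-only step: it mutates the working state and has no reverse script.
type Migration = (state: ProjectState) => void;

// Run migrations against a working copy. On any failure, hand back the
// snapshot taken before the first step, which plays the role of "restore".
function runMigrations(
  state: ProjectState,
  migrations: Migration[],
): { state: ProjectState; restored: boolean } {
  const snapshot: ProjectState = { ...state }; // rollback point
  const working: ProjectState = { ...state };
  try {
    for (const migrate of migrations) migrate(working);
    return { state: working, restored: false };
  } catch {
    return { state: snapshot, restored: true };
  }
}

const renameTrigger: Migration = (s) => {
  const value = s["trigger"];
  if (value !== undefined) {
    s["config.trigger"] = value;
    delete s["trigger"];
  }
};
const failingBackfill: Migration = () => {
  throw new Error("backfill failed");
};

const outcome = runMigrations({ trigger: "@Bob" }, [renameTrigger, failingBackfill]);
console.log(outcome.restored, JSON.stringify(outcome.state)); // true {"trigger":"@Bob"}
```

Because the rollback point sits in front of the whole run, neither migration needs its own undo logic, which is the property the list above describes.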
|
||||
|
||||
## Migrations
|
||||
|
||||
Migrations are core, not an afterthought. Every breaking change ships with its migration, packaged together. A "migration" is broad: upgrading dependencies, a database change, a data backfill, moving files to new locations, whatever the change requires.
|
||||
|
||||
Migrations are **forward-only**. They don't need reverse scripts; the rollback point in front of the upgrade is the undo. If one fails, restore and retry.
|
||||
|
||||
A **startup tripwire** keeps installs on the supported path. Every sanctioned update path (install, update, migrate) stamps a marker with the version it reached; at startup the host checks that marker against the running code. If it's missing or doesn't match, because someone pulled by hand, the host stops, loudly, with the exact command to fix it instead of silently breaking.
|
||||
|
||||
The tripwire doesn't reason about *which* changes are breaking; it just enforces that the path was used. (DB schema migrations already run automatically at startup, so they aren't its concern; it guards everything else a raw `git pull` leaves undone.) To override, you stamp the marker yourself: an explicit "I know what I'm doing," not a deletion. If you have your **own** upgrade flow (a deploy script, a CI job), make stamping the last step after it succeeds: `pnpm exec tsx scripts/upgrade-state.ts set`. See [upgrade-recovery.md](upgrade-recovery.md).
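
The startup check described above amounts to comparing a stamped version with the running one. The sketch below is an assumption-laden illustration, not the host's code: the `UpgradeMarker` shape, the version strings, and the `checkTripwire` name are hypothetical, while the stamp command is the one quoted in this section.

```typescript
// Hypothetical marker shape; the real marker file's contents may differ.
interface UpgradeMarker {
  version: string;
}

type TripwireResult =
  | { status: "ok" }
  | { status: "blocked"; reason: "missing" | "mismatch"; stampCommand: string };

// Compare the stamped marker against the version of the running code.
function checkTripwire(marker: UpgradeMarker | null, runningVersion: string): TripwireResult {
  // Stamp command as quoted in the text; used here only to build the message.
  const stampCommand = "pnpm exec tsx scripts/upgrade-state.ts set";
  if (marker === null) return { status: "blocked", reason: "missing", stampCommand };
  if (marker.version !== runningVersion) {
    return { status: "blocked", reason: "mismatch", stampCommand };
  }
  return { status: "ok" };
}

const tripwire = checkTripwire({ version: "1.4.0" }, "1.5.0");
if (tripwire.status === "blocked") {
  console.log(`refusing to start (${tripwire.reason}); to override: ${tripwire.stampCommand}`);
}
```

Note that the check never inspects which changes landed; it only asks whether a sanctioned path stamped the version that is now running.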
|
||||
|
||||
## The maintainer's side of the deal
|
||||
|
||||
This is a two-sided contract. Users keep their changes as skills. In return, the maintainer keeps the core stable and owns the breakage.
|
||||
|
||||
As maintainer:
|
||||
|
||||
1. **Keep the core small and stable.** Resist hardwiring features into the core. Push them to skills too.
|
||||
2. **Before shipping a core change, run the skills against it.** That tells you what you broke before users find out.
|
||||
3. **When you break a skill, you fix it; users shouldn't have to.** If a refactor moves something, update the affected skills or ship a migration. Don't make every user rediscover the same fix.
|
||||
4. **Ship the migration with the breaking change.** Packaged together: code, DB, files. Not a separate "good luck" note.
|
||||
5. **Watch for hotspots.** When lots of skills reach into the same spot in the core, that's the signal to add a proper hook there, so those reach-ins become clean adds.
|
||||
6. **Test against real forks.** Every core change and migration runs against a fleet of real, skill-built forks before shipping. Real proof on real installs.
|
||||
|
||||
## The public registry
|
||||
|
||||
Skills will be shared and composed; that's the whole point. A skill runs real code when it applies (copies files, installs dependencies, edits the Dockerfile). So a public registry of skills is a trust surface.
|
||||
|
||||
The rule: **every skill is reviewed and approved before it goes into the public registry, and every new version is re-reviewed.** Approving once and trusting forever is how supply chains get poisoned. Automated checks (linting against the guidelines, plus a harness that applies the skill on fresh upstream, runs its tests, removes it, and applies it twice) will clear the mechanical part so human review can focus on intent and safety. First-party skills are trusted by where they come from; the gate is for the public registry.
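
The mechanical checks listed above (apply on fresh upstream, run the tests, remove, apply twice) could look roughly like the sketch below. This is only an illustration under stated assumptions: the `Skill` interface, the in-memory `Tree`, and `checkSkill` are hypothetical names invented for the example, not an existing NanoClaw API, and a real harness would work on an actual checkout rather than a map.

```typescript
// In-memory stand-in for a checkout: path -> file contents (assumption).
type Tree = Map<string, string>;

// Hypothetical skill shape, for illustration only.
interface Skill {
  name: string;
  apply(tree: Tree): void;
  remove(tree: Tree): void;
  test(tree: Tree): boolean;
}

// Stable string form of a tree so two trees can be compared.
const snapshot = (tree: Tree): string =>
  JSON.stringify(Array.from(tree.entries()).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0)));

function checkSkill(skill: Skill, upstream: Tree): string[] {
  const problems: string[] = [];

  // 1. Apply on a fresh copy of upstream and run the skill's tests.
  const once = new Map(upstream);
  skill.apply(once);
  if (!skill.test(once)) problems.push('tests fail after apply');

  // 2. Remove it: the tree should be back to fresh upstream.
  const removed = new Map(once);
  skill.remove(removed);
  if (snapshot(removed) !== snapshot(upstream)) problems.push('remove leaves residue');

  // 3. Apply twice: the second apply should change nothing.
  const twice = new Map(once);
  skill.apply(twice);
  if (snapshot(twice) !== snapshot(once)) problems.push('apply is not idempotent');

  return problems;
}

// Two toy skills: one well-behaved, one that appends on every apply.
const upstreamTree: Tree = new Map([['Dockerfile', 'FROM node\n']]);

const goodSkill: Skill = {
  name: 'hello',
  apply: (t) => { t.set('src/hello.ts', 'export const hi = 1;'); },
  remove: (t) => { t.delete('src/hello.ts'); },
  test: (t) => t.has('src/hello.ts'),
};

const badSkill: Skill = {
  name: 'appender',
  apply: (t) => { t.set('Dockerfile', (t.get('Dockerfile') ?? '') + 'RUN extra\n'); },
  remove: () => { /* forgets to undo its edit */ },
  test: () => true,
};

const goodProblems = checkSkill(goodSkill, upstreamTree);
const badProblems = checkSkill(badSkill, upstreamTree);
console.log(goodProblems, badProblems);
```

A skill that passes these checks is only mechanically sound; whether it is safe to run is still a question for human review.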
|
||||
|
||||
## The promise
|
||||
|
||||
Build your changes as skills following these guidelines, and we won't break you. It's a promise we can only make for skills: changes edited directly into the core are beyond what we can protect.
|
||||
@@ -0,0 +1,51 @@
|
||||
# Recovering from the upgrade tripwire
|
||||
|
||||
If NanoClaw refuses to start with a message like *"update did not go through the supported path"*, this page explains what happened and how to clear it.
|
||||
|
||||
## What happened
|
||||
|
||||
NanoClaw records the version it reached each time you upgrade through a supported path — `/setup`, `/update-nanoclaw`, or `/migrate-nanoclaw`. That record lives in `data/upgrade-state.json`.
|
||||
|
||||
At startup the host checks that record against the running code. If it's **missing** or its version **doesn't match** the code, the host stops. This almost always means the code was updated by a raw `git pull` instead of the supported flow — so migrations, dependency installs, or container rebuilds that the flow would have run may not have happened.
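
As a rough illustration of the check described above, here is a minimal sketch. It is not the actual `src/upgrade-state.ts`: the `UpgradeState` shape, `readMarker`, `checkTripwire`, and the result type are assumptions made for the example. Only the marker file name and the "missing or mismatched version" rule come from this page.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical marker shape; the real file may carry more fields.
interface UpgradeState {
  version: string;
  via?: string;
}

type TripwireResult =
  | { ok: true }
  | { ok: false; reason: 'missing' | 'mismatch'; found?: string };

// Read the marker; an absent or malformed file counts as "no marker".
function readMarker(file: string): UpgradeState | null {
  try {
    const parsed: unknown = JSON.parse(fs.readFileSync(file, 'utf-8'));
    if (typeof parsed === 'object' && parsed !== null && typeof (parsed as UpgradeState).version === 'string') {
      return parsed as UpgradeState;
    }
    return null;
  } catch {
    return null;
  }
}

// Compare the stamped version with the version of the running code.
function checkTripwire(markerFile: string, codeVersion: string): TripwireResult {
  const state = readMarker(markerFile);
  if (!state) return { ok: false, reason: 'missing' };
  if (state.version !== codeVersion) return { ok: false, reason: 'mismatch', found: state.version };
  return { ok: true };
}

// Usage: a throwaway directory stands in for the install's data/ folder.
const dataDir = fs.mkdtempSync(path.join(os.tmpdir(), 'tripwire-'));
const markerFile = path.join(dataDir, 'upgrade-state.json');

const noMarker = checkTripwire(markerFile, '2.1.4'); // nothing stamped yet
fs.writeFileSync(markerFile, JSON.stringify({ version: '2.0.76', via: 'update' }));
const staleMarker = checkTripwire(markerFile, '2.1.4'); // stamped for an older version
fs.writeFileSync(markerFile, JSON.stringify({ version: '2.1.4', via: 'manual' }));
const freshMarker = checkTripwire(markerFile, '2.1.4'); // stamped for the running code

console.log(noMarker, staleMarker, freshMarker);
```

Note that the check compares versions only; it cannot tell whether the supported flow actually ran, which is why stamping by hand is treated as an explicit assertion.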
|
||||
|
||||
## If you just ran a supported upgrade
|
||||
|
||||
If you reached this point by running `/update-nanoclaw`, `/migrate-nanoclaw`, or `/setup` and it **completed successfully**, this is expected the first time an existing install meets the tripwire (your previous version predated it). Clear it by stamping the current version:
|
||||
|
||||
```bash
pnpm exec tsx scripts/upgrade-state.ts set
```
|
||||
|
||||
Then restart NanoClaw. It won't trip again unless the code is later updated outside a supported path.
|
||||
|
||||
**Coding agents:** if the upgrade you just ran reported success, run the command above automatically and restart. Only stop to ask a human if the upgrade itself failed or you can't confirm it finished.
|
||||
|
||||
## If you ran `git pull` yourself
|
||||
|
||||
Don't just clear the tripwire — that skips the work the supported flow does. Instead, run the supported upgrade so backups, migrations, and rebuilds happen:
|
||||
|
||||
```
/update-nanoclaw
```
|
||||
|
||||
Once it finishes it stamps the marker for you, and the next start is clean.
|
||||
|
||||
## If you have your own upgrade flow
|
||||
|
||||
If you've built your own way to upgrade — a custom skill, a deploy script, a CI job, a service that pulls and restarts — it won't stamp the marker, so the host will trip on the next start. Add the stamp as the **last step** of that flow, after the upgrade succeeds and before the restart:
|
||||
|
||||
```bash
pnpm exec tsx scripts/upgrade-state.ts set
```
|
||||
|
||||
That's the same thing `/setup`, `/update-nanoclaw`, and `/migrate-nanoclaw` do at the end. Do it only when the upgrade actually completed — the marker is your assertion that this install reached the current version through a path you trust.
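
For a script-driven flow, the "stamp last, only on success" rule might be wired up along these lines. This is a sketch under assumptions: `Step`, `Runner`, and `runUpgrade` are hypothetical names, and the three steps are placeholders for whatever your own flow does (backups, container rebuilds, and so on). The only command taken from this page is the stamp itself.

```typescript
import { execFileSync } from 'node:child_process';

type Step = { name: string; cmd: string; args: string[] };
type Runner = (cmd: string, args: string[]) => void;

// Real runner: execFileSync throws when the command exits non-zero.
const realRunner: Runner = (cmd, args) => {
  execFileSync(cmd, args, { stdio: 'inherit' });
};

// Run every step in order; stamp the marker only if all of them succeeded.
function runUpgrade(steps: Step[], run: Runner): { stamped: boolean; failedAt?: string } {
  for (const step of steps) {
    try {
      run(step.cmd, step.args);
    } catch {
      // Stop here and leave the marker untouched.
      return { stamped: false, failedAt: step.name };
    }
  }
  run('pnpm', ['exec', 'tsx', 'scripts/upgrade-state.ts', 'set']);
  return { stamped: true };
}

// Placeholder steps (assumption); replace with your own flow.
const steps: Step[] = [
  { name: 'pull', cmd: 'git', args: ['pull'] },
  { name: 'install', cmd: 'pnpm', args: ['install'] },
  { name: 'build', cmd: 'pnpm', args: ['run', 'build'] },
];

// Dry run: record the commands instead of executing them.
const executed: string[] = [];
const dryRun: Runner = (cmd, args) => {
  executed.push([cmd, ...args].join(' '));
};
const outcome = runUpgrade(steps, dryRun);
console.log(outcome, executed);

// In a real deploy script you would call: runUpgrade(steps, realRunner)
// and restart the service only when the result has stamped === true.
```

Injecting the runner keeps the ordering logic testable without touching the working tree. If any step fails, the marker stays as it was, so the tripwire still fires on the next start.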
|
||||
|
||||
## The override
|
||||
|
||||
`pnpm exec tsx scripts/upgrade-state.ts set` is the override: it declares "this install is good at the current version." Use it when you know the install is actually in a good state (e.g. you completed the steps manually). It's safe to re-run.
|
||||
|
||||
To inspect the current marker:
|
||||
|
||||
```bash
pnpm exec tsx scripts/upgrade-state.ts get
```
|
||||
@@ -1,6 +1,6 @@
|
||||
{
|
||||
"name": "nanoclaw",
|
||||
"version": "2.0.76",
|
||||
"version": "2.1.4",
|
||||
"description": "Personal Claude assistant. Lightweight, secure, customizable.",
|
||||
"type": "module",
|
||||
"packageManager": "pnpm@10.33.0",
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="90" height="20" role="img" aria-label="181k tokens, 91% of context window">
|
||||
<title>181k tokens, 91% of context window</title>
|
||||
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="90" height="20" role="img" aria-label="185k tokens, 92% of context window">
|
||||
<title>185k tokens, 92% of context window</title>
|
||||
<linearGradient id="s" x2="0" y2="100%">
|
||||
<stop offset="0" stop-color="#bbb" stop-opacity=".1"/>
|
||||
<stop offset="1" stop-opacity=".1"/>
|
||||
@@ -15,8 +15,8 @@
|
||||
<g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" font-size="11">
|
||||
<text aria-hidden="true" x="26" y="15" fill="#010101" fill-opacity=".3">tokens</text>
|
||||
<text x="26" y="14">tokens</text>
|
||||
<text aria-hidden="true" x="71" y="15" fill="#010101" fill-opacity=".3">181k</text>
|
||||
<text x="71" y="14">181k</text>
|
||||
<text aria-hidden="true" x="71" y="15" fill="#010101" fill-opacity=".3">185k</text>
|
||||
<text x="71" y="14">185k</text>
|
||||
</g>
|
||||
</g>
|
||||
</a>
|
||||
|
||||
|
@@ -0,0 +1,26 @@
|
||||
/**
|
||||
* scripts/upgrade-state.ts — read or stamp the upgrade marker.
|
||||
*
|
||||
* Usage:
|
||||
* pnpm exec tsx scripts/upgrade-state.ts get
|
||||
* pnpm exec tsx scripts/upgrade-state.ts set [version] [via]
|
||||
*
|
||||
* `set` with no version stamps the current package.json version. The
|
||||
* sanctioned upgrade paths (setup / update / migrate) call `set` on
|
||||
* success; running it by hand is also the documented way to clear the
|
||||
* startup tripwire — see docs/upgrade-recovery.md.
|
||||
*/
|
||||
import { getCodeVersion, markerPath, readUpgradeState, writeUpgradeState } from '../src/upgrade-state.js';
|
||||
|
||||
const [, , cmd, versionArg, viaArg] = process.argv;
|
||||
|
||||
if (cmd === 'get') {
|
||||
const state = readUpgradeState();
|
||||
console.log(state ? JSON.stringify(state) : 'none');
|
||||
} else if (cmd === 'set') {
|
||||
const state = writeUpgradeState({ version: versionArg || getCodeVersion(), via: viaArg || 'manual' });
|
||||
console.log(`Stamped ${markerPath()}: ${JSON.stringify(state)}`);
|
||||
} else {
|
||||
console.error('Usage: pnpm exec tsx scripts/upgrade-state.ts get | set [version] [via]');
|
||||
process.exit(2);
|
||||
}
|
||||
@@ -11,6 +11,7 @@ import path from 'path';
|
||||
|
||||
import { log } from '../src/log.js';
|
||||
import { getLaunchdLabel, getSystemdUnit } from '../src/install-slug.js';
|
||||
import { writeUpgradeState } from '../src/upgrade-state.js';
|
||||
import { cleanupUnhealthyPeers } from './peer-cleanup.js';
|
||||
import {
|
||||
commandExists,
|
||||
@@ -54,6 +55,11 @@ export async function run(_args: string[]): Promise<void> {
|
||||
|
||||
fs.mkdirSync(path.join(projectRoot, 'logs'), { recursive: true });
|
||||
|
||||
// Stamp the upgrade marker before the host first starts, so the startup
|
||||
// tripwire (enforceUpgradeTripwire) sees this as a sanctioned install.
|
||||
const stamped = writeUpgradeState({ via: 'setup' });
|
||||
log.info('Stamped upgrade marker', { version: stamped.version });
|
||||
|
||||
// Peer preflight — a crash-looping peer install (most often the legacy v1
|
||||
// `com.nanoclaw` plist) will keep trashing this install's containers on
|
||||
// every respawn via its own cleanupOrphans. Detect and unload any peer
|
||||
|
||||
@@ -23,6 +23,7 @@ import { materializeContainerJson } from './container-config.js';
|
||||
import { getContainerConfig } from './db/container-configs.js';
|
||||
import { updateContainerConfigScalars, updateContainerConfigJson } from './db/container-configs.js';
|
||||
import { CONTAINER_RUNTIME_BIN, hostGatewayArgs, readonlyMountArgs, stopContainer } from './container-runtime.js';
|
||||
import { EGRESS_NETWORK, egressNetworkArgs, ensureEgressNetwork } from './egress-lockdown.js';
|
||||
import { composeGroupClaudeMd } from './claude-md-compose.js';
|
||||
import { getAgentGroup } from './db/agent-groups.js';
|
||||
import { getDb, hasTable } from './db/connection.js';
|
||||
@@ -432,8 +433,14 @@ async function buildContainerArgs(
|
||||
}
|
||||
log.info('OneCLI gateway applied', { containerName });
|
||||
|
||||
// Host gateway
|
||||
args.push(...hostGatewayArgs());
|
||||
// Egress lockdown when enabled — throws if it can't be established, aborting
|
||||
// the spawn rather than running with open egress. Otherwise the host gateway.
|
||||
if (ensureEgressNetwork()) {
|
||||
args.push(...egressNetworkArgs());
|
||||
log.info('Egress lockdown active', { containerName, network: EGRESS_NETWORK });
|
||||
} else {
|
||||
args.push(...hostGatewayArgs());
|
||||
}
|
||||
|
||||
// User mapping
|
||||
const hostUid = process.getuid?.();
|
||||
|
||||
@@ -0,0 +1,95 @@
|
||||
/**
|
||||
* Egress lockdown — force ALL agent traffic through the OneCLI gateway.
|
||||
* Agents run on a Docker `--internal` network (no internet route) with the
|
||||
* gateway attached as host.docker.internal, so the injected proxy is the only
|
||||
* reachable hop. Non-root, no NET_ADMIN — the agent can't undo it.
|
||||
*
|
||||
* Fail-fast: when the flag is on but the network/gateway can't be set up, throw
|
||||
* rather than silently spawn an agent with open egress.
|
||||
*/
|
||||
import { execFileSync } from 'child_process';
|
||||
|
||||
import { CONTAINER_RUNTIME_BIN } from './container-runtime.js';
|
||||
import { log } from './log.js';
|
||||
|
||||
/** Locked-down, no-internet network agents are placed on. */
|
||||
export const EGRESS_NETWORK = process.env.NANOCLAW_EGRESS_NETWORK || 'nanoclaw-egress';
|
||||
/** The OneCLI gateway container attached as the only egress hop. */
|
||||
const ONECLI_GATEWAY_CONTAINER = process.env.ONECLI_GATEWAY_CONTAINER || 'onecli';
|
||||
/** Off by default; set NANOCLAW_EGRESS_LOCKDOWN=true to opt in. */
|
||||
const EGRESS_LOCKDOWN = process.env.NANOCLAW_EGRESS_LOCKDOWN === 'true';
|
||||
|
||||
/** Raised when lockdown is requested but can't be established. */
|
||||
export class EgressLockdownError extends Error {
|
||||
constructor(reason: string) {
|
||||
super(
|
||||
`Egress lockdown is on (NANOCLAW_EGRESS_LOCKDOWN=true) but ${reason}. ` +
|
||||
`Refusing to spawn with open egress. Start the OneCLI gateway container ` +
|
||||
`"${ONECLI_GATEWAY_CONTAINER}", or set NANOCLAW_EGRESS_LOCKDOWN=false to opt out.`,
|
||||
);
|
||||
this.name = 'EgressLockdownError';
|
||||
}
|
||||
}
|
||||
|
||||
function dockerOk(args: string[]): boolean {
|
||||
try {
|
||||
execFileSync(CONTAINER_RUNTIME_BIN, args, { stdio: 'pipe', timeout: 15000 });
|
||||
return true;
|
||||
} catch {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
/** Is the OneCLI gateway currently attached to the egress network? */
|
||||
function gatewayAttached(): boolean {
|
||||
try {
|
||||
const out = execFileSync(
|
||||
CONTAINER_RUNTIME_BIN,
|
||||
['network', 'inspect', EGRESS_NETWORK, '--format', '{{range .Containers}}{{.Name}} {{end}}'],
|
||||
{ stdio: ['pipe', 'pipe', 'pipe'], encoding: 'utf-8', timeout: 15000 },
|
||||
);
|
||||
return out.split(/\s+/).includes(ONECLI_GATEWAY_CONTAINER);
|
||||
} catch {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Ensure the egress network exists with the OneCLI gateway attached (aliased
|
||||
* host.docker.internal). Idempotent + self-healing. Returns false when lockdown
|
||||
* is disabled (caller uses the host gateway), true when it's active. Throws
|
||||
* EgressLockdownError when enabled but unestablishable — fail fast rather than
|
||||
* spawn an agent with open egress.
|
||||
*/
|
||||
export function ensureEgressNetwork(): boolean {
|
||||
if (!EGRESS_LOCKDOWN) return false;
|
||||
|
||||
if (
|
||||
!dockerOk(['network', 'inspect', EGRESS_NETWORK]) &&
|
||||
!dockerOk(['network', 'create', '--internal', EGRESS_NETWORK])
|
||||
) {
|
||||
throw new EgressLockdownError(`the "${EGRESS_NETWORK}" internal network could not be created`);
|
||||
}
|
||||
|
||||
if (gatewayAttached()) return true;
|
||||
|
||||
if (
|
||||
dockerOk(['network', 'connect', '--alias', 'host.docker.internal', EGRESS_NETWORK, ONECLI_GATEWAY_CONTAINER]) &&
|
||||
gatewayAttached()
|
||||
) {
|
||||
log.info('Egress lockdown: OneCLI gateway attached', {
|
||||
network: EGRESS_NETWORK,
|
||||
gateway: ONECLI_GATEWAY_CONTAINER,
|
||||
});
|
||||
return true;
|
||||
}
|
||||
|
||||
throw new EgressLockdownError(
|
||||
`the OneCLI gateway "${ONECLI_GATEWAY_CONTAINER}" could not be attached to "${EGRESS_NETWORK}"`,
|
||||
);
|
||||
}
|
||||
|
||||
/** CLI args placing a container on the locked-down egress network. */
|
||||
export function egressNetworkArgs(): string[] {
|
||||
return ['--network', EGRESS_NETWORK];
|
||||
}
|
||||
@@ -29,6 +29,7 @@
|
||||
import type Database from 'better-sqlite3';
|
||||
import fs from 'fs';
|
||||
|
||||
import { ensureEgressNetwork } from './egress-lockdown.js';
|
||||
import { getActiveSessions } from './db/sessions.js';
|
||||
import { getAgentGroup } from './db/agent-groups.js';
|
||||
import {
|
||||
@@ -132,6 +133,16 @@ export function stopHostSweep(): void {
|
||||
async function sweep(): Promise<void> {
|
||||
if (!running) return;
|
||||
|
||||
// Re-heal the egress network so already-running agents keep their gateway hop
|
||||
// if it was detached out-of-band. Best-effort here: a heal failure isn't a
|
||||
// leak (agents stay on the internal net), so log and continue. No-op when
|
||||
// lockdown is disabled.
|
||||
try {
|
||||
ensureEgressNetwork();
|
||||
} catch (err) {
|
||||
log.error('Egress lockdown re-heal failed', { err });
|
||||
}
|
||||
|
||||
try {
|
||||
const sessions = getActiveSessions();
|
||||
for (const session of sessions) {
|
||||
|
||||
@@ -17,6 +17,7 @@ import { startActiveDeliveryPoll, startSweepDeliveryPoll, setDeliveryAdapter, st
|
||||
import { startHostSweep, stopHostSweep } from './host-sweep.js';
|
||||
import { routeInbound } from './router.js';
|
||||
import { log } from './log.js';
|
||||
import { enforceUpgradeTripwire } from './upgrade-state.js';
|
||||
|
||||
// Response + shutdown registries live in response-registry.ts to break the
|
||||
// circular import cycle: src/index.ts imports src/modules/index.js for side
|
||||
@@ -69,6 +70,10 @@ async function main(): Promise<void> {
|
||||
// 0. Circuit breaker — backoff on rapid restarts
|
||||
await enforceStartupBackoff();
|
||||
|
||||
// 0.5 Upgrade tripwire — refuse to start if this install was updated
|
||||
// outside the sanctioned path (raw `git pull` instead of /update-nanoclaw).
|
||||
enforceUpgradeTripwire();
|
||||
|
||||
// 1. Init central DB
|
||||
const dbPath = path.join(DATA_DIR, 'v2.db');
|
||||
const db = initDb(dbPath);
|
||||
|
||||
@@ -443,4 +443,28 @@ describe('routeAgentMessage return-path', () => {
|
||||
expect(fs.existsSync(targetPath)).toBe(true);
|
||||
expect(fs.readFileSync(targetPath, 'utf-8')).toBe('fake-pdf-bytes');
|
||||
});
|
||||
|
||||
it('file forwarding: skips symlinked source files', async () => {
|
||||
const secretPath = path.join(TEST_DIR, 'host-secret.txt');
|
||||
fs.writeFileSync(secretPath, 'host-secret-bytes');
|
||||
|
||||
const outboxDir = path.join(sessionDir(A, S1.id), 'outbox', 'msg-with-symlink');
|
||||
fs.mkdirSync(outboxDir, { recursive: true });
|
||||
fs.symlinkSync(secretPath, path.join(outboxDir, 'safe-name.txt'));
|
||||
|
||||
await routeAgentMessage(
|
||||
{
|
||||
id: 'msg-with-symlink',
|
||||
platform_id: B,
|
||||
content: JSON.stringify({ text: 'see attached', files: ['safe-name.txt'] }),
|
||||
in_reply_to: null,
|
||||
},
|
||||
S1,
|
||||
);
|
||||
|
||||
const bRows = readInbound(B, SB.id);
|
||||
expect(bRows).toHaveLength(1);
|
||||
const parsed = JSON.parse(bRows[0].content);
|
||||
expect(parsed.attachments).toHaveLength(0);
|
||||
});
|
||||
});
|
||||
|
||||
@@ -40,6 +40,11 @@ export interface ForwardedAttachment {
|
||||
localPath: string;
|
||||
}
|
||||
|
||||
function isPathInside(parent: string, child: string): boolean {
|
||||
const relative = path.relative(parent, child);
|
||||
return relative === '' || (!relative.startsWith('..') && !path.isAbsolute(relative));
|
||||
}
|
||||
|
||||
/**
|
||||
* Copy file attachments from the source agent's outbox into the target
|
||||
* agent's inbox. Returns attachments using the formatter's existing
|
||||
@@ -57,6 +62,11 @@ export function forwardAttachedFiles(
|
||||
): ForwardedAttachment[] {
|
||||
if (source.filenames.length === 0) return [];
|
||||
|
||||
if (!isSafeAttachmentName(source.messageId)) {
|
||||
log.warn('agent-route: rejecting unsafe source outbox message id', { sourceMsgId: source.messageId });
|
||||
return [];
|
||||
}
|
||||
|
||||
const sourceDir = path.join(sessionDir(source.agentGroupId, source.sessionId), 'outbox', source.messageId);
|
||||
if (!fs.existsSync(sourceDir)) {
|
||||
log.warn('agent-route: source outbox dir missing, no files forwarded', {
|
||||
@@ -66,6 +76,26 @@ export function forwardAttachedFiles(
|
||||
return [];
|
||||
}
|
||||
|
||||
let realSourceDir: string;
|
||||
try {
|
||||
const sourceDirStat = fs.lstatSync(sourceDir);
|
||||
if (!sourceDirStat.isDirectory() || sourceDirStat.isSymbolicLink()) {
|
||||
log.warn('agent-route: rejecting unsafe source outbox dir', {
|
||||
sourceMsgId: source.messageId,
|
||||
sourceDir,
|
||||
});
|
||||
return [];
|
||||
}
|
||||
realSourceDir = fs.realpathSync(sourceDir);
|
||||
} catch (err) {
|
||||
log.warn('agent-route: failed to inspect source outbox dir', {
|
||||
sourceMsgId: source.messageId,
|
||||
sourceDir,
|
||||
err,
|
||||
});
|
||||
return [];
|
||||
}
|
||||
|
||||
const targetInboxDir = path.join(sessionDir(target.agentGroupId, target.sessionId), 'inbox', target.messageId);
|
||||
fs.mkdirSync(targetInboxDir, { recursive: true });
|
||||
|
||||
@@ -79,15 +109,33 @@ export function forwardAttachedFiles(
|
||||
continue;
|
||||
}
|
||||
const src = path.join(sourceDir, filename);
|
||||
if (!fs.existsSync(src)) {
|
||||
let realSrc: string;
|
||||
try {
|
||||
const srcStat = fs.lstatSync(src);
|
||||
if (!srcStat.isFile() || srcStat.isSymbolicLink()) {
|
||||
log.warn('agent-route: rejecting unsafe source outbox file', {
|
||||
sourceMsgId: source.messageId,
|
||||
filename,
|
||||
});
|
||||
continue;
|
||||
}
|
||||
realSrc = fs.realpathSync(src);
|
||||
} catch {
|
||||
log.warn('agent-route: referenced file missing in source outbox, skipped', {
|
||||
sourceMsgId: source.messageId,
|
||||
filename,
|
||||
});
|
||||
continue;
|
||||
}
|
||||
if (!isPathInside(realSourceDir, realSrc)) {
|
||||
log.warn('agent-route: rejecting source file outside source outbox dir', {
|
||||
sourceMsgId: source.messageId,
|
||||
filename,
|
||||
});
|
||||
continue;
|
||||
}
|
||||
const dst = path.join(targetInboxDir, filename);
|
||||
fs.copyFileSync(src, dst);
|
||||
fs.copyFileSync(realSrc, dst);
|
||||
attachments.push({
|
||||
name: filename,
|
||||
filename,
|
||||
|
||||
@@ -0,0 +1,115 @@
|
||||
/**
|
||||
* Tests for create_agent host-side authorization.
|
||||
*
|
||||
* Regression guard for the audit finding: `create_agent` is a privileged
|
||||
* central-DB write with no host-side authz. The fix authorizes by CLI scope —
|
||||
* trusted owner agent groups ('global') create directly; confined groups
|
||||
* ('group', the default and the prompt-injection victim) must get admin
|
||||
* approval. These tests pin that branch decision.
|
||||
*/
|
||||
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
|
||||
|
||||
import type { Session } from '../../types.js';
|
||||
|
||||
// Mocks for the collaborators the branch decides between / depends on.
|
||||
const mockRequestApproval = vi.fn().mockResolvedValue(undefined);
|
||||
const mockGetContainerConfig = vi.fn();
|
||||
const mockCreateAgentGroup = vi.fn();
|
||||
const mockInitGroupFilesystem = vi.fn();
|
||||
const mockWriteDestinations = vi.fn();
|
||||
const mockNotifyWrite = vi.fn();
|
||||
|
||||
vi.mock('../approvals/index.js', () => ({
|
||||
requestApproval: (...a: unknown[]) => mockRequestApproval(...a),
|
||||
}));
|
||||
vi.mock('../../db/container-configs.js', () => ({
|
||||
getContainerConfig: (...a: unknown[]) => mockGetContainerConfig(...a),
|
||||
}));
|
||||
vi.mock('../../db/agent-groups.js', () => ({
|
||||
getAgentGroup: (id: string) => ({ id, name: id.toUpperCase(), folder: id, agent_provider: null, created_at: '' }),
|
||||
getAgentGroupByFolder: () => undefined,
|
||||
createAgentGroup: (...a: unknown[]) => mockCreateAgentGroup(...a),
|
||||
}));
|
||||
vi.mock('../../group-init.js', () => ({
|
||||
initGroupFilesystem: (...a: unknown[]) => mockInitGroupFilesystem(...a),
|
||||
}));
|
||||
vi.mock('./write-destinations.js', () => ({
|
||||
writeDestinations: (...a: unknown[]) => mockWriteDestinations(...a),
|
||||
}));
|
||||
vi.mock('./db/agent-destinations.js', () => ({
|
||||
getDestinationByName: () => undefined,
|
||||
createDestination: vi.fn(),
|
||||
normalizeName: (s: string) => s.toLowerCase().replace(/[^a-z0-9]+/g, '-'),
|
||||
}));
|
||||
// notifyAgent writes to the session inbound.db + wakes the container; stub both.
|
||||
vi.mock('../../session-manager.js', () => ({
|
||||
writeSessionMessage: (...a: unknown[]) => mockNotifyWrite(...a),
|
||||
}));
|
||||
vi.mock('../../container-runner.js', () => ({
|
||||
wakeContainer: vi.fn().mockResolvedValue(undefined),
|
||||
}));
|
||||
vi.mock('../../db/sessions.js', () => ({
|
||||
getSession: (id: string) => ({ id, agent_group_id: 'ag-1' }),
|
||||
}));
|
||||
|
||||
import { handleCreateAgent } from './create-agent.js';
|
||||
|
||||
const SESSION = { id: 'sess-1', agent_group_id: 'ag-1' } as Session;
|
||||
|
||||
beforeEach(() => {
|
||||
vi.clearAllMocks();
|
||||
});
|
||||
|
||||
afterEach(() => {
|
||||
vi.restoreAllMocks();
|
||||
});
|
||||
|
||||
describe('handleCreateAgent — scope-based authorization', () => {
|
||||
it('global scope: creates directly, no approval requested', async () => {
|
||||
mockGetContainerConfig.mockReturnValue({ cli_scope: 'global' });
|
||||
|
||||
await handleCreateAgent({ name: 'Scout', instructions: 'help' }, SESSION);
|
||||
|
||||
expect(mockRequestApproval).not.toHaveBeenCalled();
|
||||
expect(mockCreateAgentGroup).toHaveBeenCalledTimes(1);
|
||||
expect(mockInitGroupFilesystem).toHaveBeenCalledTimes(1);
|
||||
});
|
||||
|
||||
it('group scope (default): requires approval, does NOT create directly', async () => {
|
||||
mockGetContainerConfig.mockReturnValue({ cli_scope: 'group' });
|
||||
|
||||
await handleCreateAgent({ name: 'Scout', instructions: 'help' }, SESSION);
|
||||
|
||||
expect(mockRequestApproval).toHaveBeenCalledTimes(1);
|
||||
expect(mockRequestApproval.mock.calls[0][0]).toMatchObject({ action: 'create_agent' });
|
||||
expect(mockCreateAgentGroup).not.toHaveBeenCalled();
|
||||
expect(mockInitGroupFilesystem).not.toHaveBeenCalled();
|
||||
});
|
||||
|
||||
it('missing config: fails closed to approval (no direct create)', async () => {
|
||||
mockGetContainerConfig.mockReturnValue(undefined);
|
||||
|
||||
await handleCreateAgent({ name: 'Scout' }, SESSION);
|
||||
|
||||
expect(mockRequestApproval).toHaveBeenCalledTimes(1);
|
||||
expect(mockCreateAgentGroup).not.toHaveBeenCalled();
|
||||
});
|
||||
|
||||
it('disabled/other scope: requires approval', async () => {
|
||||
mockGetContainerConfig.mockReturnValue({ cli_scope: 'disabled' });
|
||||
|
||||
await handleCreateAgent({ name: 'Scout' }, SESSION);
|
||||
|
||||
expect(mockRequestApproval).toHaveBeenCalledTimes(1);
|
||||
expect(mockCreateAgentGroup).not.toHaveBeenCalled();
|
||||
});
|
||||
|
||||
it('empty name: neither creates nor requests approval', async () => {
|
||||
mockGetContainerConfig.mockReturnValue({ cli_scope: 'global' });
|
||||
|
||||
await handleCreateAgent({ name: '' }, SESSION);
|
||||
|
||||
expect(mockRequestApproval).not.toHaveBeenCalled();
|
||||
expect(mockCreateAgentGroup).not.toHaveBeenCalled();
|
||||
});
|
||||
});
|
||||
@@ -1,20 +1,29 @@
|
||||
/**
|
||||
* `create_agent` delivery-action handler.
|
||||
*
|
||||
* Spawns a new agent group on demand from the parent agent, wires bidirectional
|
||||
* agent_destinations rows, projects the new destination into the parent's
|
||||
* running container, and notifies the parent.
|
||||
* SECURITY: `create_agent` writes to the CENTRAL DB (agent_groups,
|
||||
* container_configs, agent_destinations) and scaffolds host filesystem state —
|
||||
* a privileged operation a confined container is otherwise architecturally
|
||||
* barred from. The container's MCP tool gate is inside the (untrusted)
|
||||
* container and is trivially bypassed by writing the outbound system row
|
||||
* directly, so authorization MUST be enforced host-side. Trusted owner agent
|
||||
* groups (CLI scope 'global') create directly; every other (confined) group
|
||||
* requires admin approval via `requestApproval` — matching `ncl groups create`
|
||||
* (access: 'approval') and the self-mod actions. `applyCreateAgent` runs the
|
||||
* creation on approve; `performCreateAgent` is the shared body.
|
||||
*/
|
||||
import path from 'path';
|
||||
|
||||
import { GROUPS_DIR } from '../../config.js';
|
||||
import { createAgentGroup, getAgentGroup, getAgentGroupByFolder } from '../../db/agent-groups.js';
|
||||
import { getContainerConfig } from '../../db/container-configs.js';
|
||||
import { getSession } from '../../db/sessions.js';
|
||||
import { wakeContainer } from '../../container-runner.js';
|
||||
import { initGroupFilesystem } from '../../group-init.js';
|
||||
import { log } from '../../log.js';
|
||||
import { writeSessionMessage } from '../../session-manager.js';
|
||||
import type { AgentGroup, Session } from '../../types.js';
|
||||
import { requestApproval, type ApprovalHandler } from '../approvals/index.js';
|
||||
import { createDestination, getDestinationByName, normalizeName } from './db/agent-destinations.js';
|
||||
import { writeDestinations } from './write-destinations.js';
|
||||
|
||||
@@ -34,23 +43,95 @@ function notifyAgent(session: Session, text: string): void {
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Delivery-action entry.
|
||||
*
|
||||
* Authorization depends on the calling group's CLI scope:
|
||||
* - `global` (set by init-first-agent for trusted owner agent groups):
|
||||
* create immediately. create_agent is the intended primitive for these
|
||||
* privileged agents, and an approval tap on every sub-agent spawn would be
|
||||
* needless friction.
|
||||
* - anything else (the default `group` scope — the realistic
|
||||
* prompt-injection victim): require an admin to approve before any
|
||||
* central-DB write. `applyCreateAgent` runs on approve.
|
||||
* Unknown/missing config fails closed to the approval path.
|
||||
*/
|
||||
export async function handleCreateAgent(content: Record<string, unknown>, session: Session): Promise<void> {
|
||||
const requestId = content.requestId as string;
|
||||
const name = content.name as string;
|
||||
const instructions = content.instructions as string | null;
|
||||
const name = typeof content.name === 'string' ? content.name : '';
|
||||
const instructions = typeof content.instructions === 'string' ? content.instructions : null;
|
||||
|
||||
if (!name) {
|
||||
notifyAgent(session, 'create_agent failed: name is required.');
|
||||
return;
|
||||
}
|
||||
|
||||
const sourceGroup = getAgentGroup(session.agent_group_id);
|
||||
if (!sourceGroup) {
|
||||
notifyAgent(session, `create_agent failed: source agent group not found.`);
|
||||
notifyAgent(session, 'create_agent failed: source agent group not found.');
|
||||
log.warn('create_agent failed: missing source group', { sessionAgentGroup: session.agent_group_id, name });
|
||||
return;
|
||||
}
|
||||
|
||||
const cliScope = getContainerConfig(session.agent_group_id)?.cli_scope ?? 'group';
|
||||
if (cliScope === 'global') {
|
||||
// Trusted owner agent group — create directly, then notify (+wake) it.
|
||||
await performCreateAgent(name, instructions, session, sourceGroup, (text) => notifyAgent(session, text));
|
||||
return;
|
||||
}
|
||||
|
||||
await requestApproval({
|
||||
session,
|
||||
agentName: sourceGroup.name,
|
||||
action: 'create_agent',
|
||||
payload: { name, instructions },
|
||||
title: `Create agent: ${name}`,
|
||||
question: `Agent "${sourceGroup.name}" wants to create a new sub-agent "${name}" (a new agent group with its own workspace and container). Approve?`,
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Approval handler: performs the creation once an admin approves a request from
|
||||
* a confined (non-global) agent group. `session` is the requesting parent.
|
||||
*/
|
||||
export const applyCreateAgent: ApprovalHandler = async ({ session, payload, notify }) => {
|
||||
const name = typeof payload.name === 'string' ? payload.name : '';
|
||||
const instructions = typeof payload.instructions === 'string' ? payload.instructions : null;
|
||||
|
||||
if (!name) {
|
||||
notify('create_agent approved but the request had no name.');
|
||||
return;
|
||||
}
|
||||
|
||||
const sourceGroup = getAgentGroup(session.agent_group_id);
|
||||
if (!sourceGroup) {
|
||||
notify('create_agent approved but the source agent group no longer exists.');
|
||||
log.warn('create_agent apply failed: missing source group', { sessionAgentGroup: session.agent_group_id, name });
|
||||
return;
|
||||
}
|
||||
|
||||
await performCreateAgent(name, instructions, session, sourceGroup, notify);
|
||||
};
|
||||
|
||||
/**
|
||||
* Core creation: writes the new agent group + bidirectional destinations and
|
||||
* scaffolds its filesystem, then reports via `notify`. Authorization is the
|
||||
* CALLER's responsibility (the global-scope shortcut in handleCreateAgent or
|
||||
* admin approval via applyCreateAgent) — never call this from an unauthorized
|
||||
* path, as it performs privileged central-DB writes a confined container is
|
||||
* otherwise barred from.
|
||||
*/
|
||||
async function performCreateAgent(
|
||||
name: string,
|
||||
instructions: string | null,
|
||||
session: Session,
|
||||
sourceGroup: AgentGroup,
|
||||
notify: (text: string) => void,
|
||||
): Promise<void> {
|
||||
const localName = normalizeName(name);
|
||||
|
||||
// Collision in the creator's destination namespace
|
||||
if (getDestinationByName(sourceGroup.id, localName)) {
|
||||
notifyAgent(session, `Cannot create agent "${name}": you already have a destination named "${localName}".`);
|
||||
notify(`Cannot create agent "${name}": you already have a destination named "${localName}".`);
|
||||
return;
|
||||
}
|
||||
|
||||
@@ -66,7 +147,7 @@ export async function handleCreateAgent(content: Record<string, unknown>, sessio
|
||||
const resolvedPath = path.resolve(groupPath);
|
||||
const resolvedGroupsDir = path.resolve(GROUPS_DIR);
|
||||
if (!resolvedPath.startsWith(resolvedGroupsDir + path.sep)) {
|
||||
notifyAgent(session, `Cannot create agent "${name}": invalid folder path.`);
|
||||
notify(`Cannot create agent "${name}": invalid folder path.`);
|
||||
log.error('create_agent path traversal attempt', { folder, resolvedPath });
|
||||
return;
|
||||
}
|
||||
@@ -115,12 +196,6 @@ export async function handleCreateAgent(content: Record<string, unknown>, sessio
|
||||
// tries to send to the newly-created child.
|
||||
writeDestinations(session.agent_group_id, session.id);
|
||||
|
||||
// Fire-and-forget notification back to the creator
|
||||
notifyAgent(
|
||||
session,
|
||||
`Agent "${localName}" created. You can now message it with <message to="${localName}">...</message>.`,
|
||||
);
|
||||
notify(`Agent "${localName}" created. You can now message it with <message to="${localName}">...</message>.`);
|
||||
log.info('Agent group created', { agentGroupId, name, localName, folder, parent: sourceGroup.id });
|
||||
// Note: requestId is unused — this is fire-and-forget, not request/response.
|
||||
void requestId;
|
||||
}
|
||||
|
||||
@@ -1,9 +1,13 @@
|
||||
/**
|
||||
* Agent-to-agent module — inter-agent messaging and on-demand agent creation.
|
||||
*
|
||||
* Registers one delivery action (`create_agent`). The sibling `channel_type === 'agent'`
|
||||
* routing path is NOT a system action — core `delivery.ts` dispatches into
|
||||
* `./agent-route.js` via a dynamic import when it sees `msg.channel_type === 'agent'`.
|
||||
* Registers one delivery action (`create_agent`) plus its matching approval
|
||||
* handler — `create_agent` writes central-DB state, so confined (non-global)
|
||||
* groups require admin approval (the delivery action queues the request;
|
||||
* `applyCreateAgent` runs on approve); trusted global-scope groups create
|
||||
* directly. The sibling `channel_type === 'agent'` routing path is NOT a system
|
||||
* action — core `delivery.ts` dispatches into `./agent-route.js` via a dynamic
|
||||
* import when it sees `msg.channel_type === 'agent'`.
|
||||
*
|
||||
* Host integration points:
|
||||
* - `src/container-runner.ts::spawnContainer` dynamically imports
|
||||
@@ -17,6 +21,8 @@
|
||||
* throw because the module isn't installed.
|
||||
*/
|
||||
import { registerDeliveryAction } from '../../delivery.js';
|
||||
import { handleCreateAgent } from './create-agent.js';
|
||||
import { registerApprovalHandler } from '../approvals/index.js';
|
||||
import { applyCreateAgent, handleCreateAgent } from './create-agent.js';
|
||||
|
||||
registerDeliveryAction('create_agent', handleCreateAgent);
|
||||
registerApprovalHandler('create_agent', applyCreateAgent);
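The module comment above describes two routes into agent creation: trusted global-scope groups create directly, while confined groups queue a request that runs only after admin approval. The following is a minimal sketch of that control flow, not the project's implementation. The names `handleCreate`, `applyCreate`, `performCreate`, the `trusted` flag, the in-memory `pending` map, and the "awaiting admin approval" message are all hypothetical stand-ins for illustration.

```typescript
// Hypothetical, simplified model of the create_agent flow; not the real signatures.
type Notify = (text: string) => void;

interface CreateRequest {
  name: string;
  trusted: boolean; // assumption: stands in for "global-scope group"
}

const created: string[] = [];
const pending = new Map<string, CreateRequest>();

// Core creation. Authorization is the caller's responsibility.
function performCreate(name: string, notify: Notify): void {
  created.push(name);
  notify(`Agent "${name}" created.`);
}

// Delivery action: trusted callers create directly, others are queued.
function handleCreate(requestId: string, req: CreateRequest, notify: Notify): void {
  if (req.trusted) {
    performCreate(req.name, notify);
    return;
  }
  pending.set(requestId, req);
  notify(`Agent "${req.name}" is awaiting admin approval.`);
}

// Approval handler: runs only after an admin approves a queued request.
// Returns false when no such request is pending.
function applyCreate(requestId: string, notify: Notify): boolean {
  const req = pending.get(requestId);
  if (!req) return false;
  pending.delete(requestId);
  performCreate(req.name, notify);
  return true;
}
```

Keeping the privileged write in one function that both routes call means the authorization decision lives in exactly two places, the shortcut and the approval handler, which matches the warning in the `performCreateAgent` doc comment.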
|
||||
|
||||
@@ -0,0 +1,164 @@
|
||||
/**
|
||||
* Regression coverage for approval response authorization.
|
||||
*
|
||||
* Approval cards may be delivered to an admin DM, but the callback payload is
|
||||
* still untrusted input. The response handler must not dispatch sensitive
|
||||
* approval handlers merely because a response carries a valid questionId.
|
||||
*/
|
||||
import * as fs from 'fs';
|
||||
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
|
||||
|
||||
import { initTestDb, closeDb, runMigrations } from '../../db/index.js';
|
||||
import { createAgentGroup } from '../../db/agent-groups.js';
|
||||
import { createSession, createPendingApproval, getPendingApproval } from '../../db/sessions.js';
|
||||
import { upsertUser } from '../permissions/db/users.js';
|
||||
import { grantRole } from '../permissions/db/user-roles.js';
|
||||
|
||||
vi.mock('../../container-runner.js', () => ({
|
||||
wakeContainer: vi.fn().mockResolvedValue(undefined),
|
||||
}));
|
||||
|
||||
vi.mock('../../config.js', async () => {
|
||||
const actual = await vi.importActual('../../config.js');
|
||||
return { ...actual, DATA_DIR: '/tmp/nanoclaw-test-approval-response-authz' };
|
||||
});
|
||||
|
||||
const TEST_DIR = '/tmp/nanoclaw-test-approval-response-authz';
|
||||
|
||||
function now() {
|
||||
return new Date().toISOString();
|
||||
}
|
||||
|
||||
beforeEach(() => {
|
||||
if (fs.existsSync(TEST_DIR)) fs.rmSync(TEST_DIR, { recursive: true, force: true });
|
||||
fs.mkdirSync(TEST_DIR, { recursive: true });
|
||||
const db = initTestDb();
|
||||
runMigrations(db);
|
||||
|
||||
createAgentGroup({ id: 'ag-1', name: 'Agent', folder: 'agent', agent_provider: null, created_at: now() });
|
||||
createSession({
|
||||
id: 'sess-1',
|
||||
agent_group_id: 'ag-1',
|
||||
messaging_group_id: null,
|
||||
thread_id: null,
|
||||
agent_provider: null,
|
||||
status: 'active',
|
||||
container_status: 'stopped',
|
||||
last_active: now(),
|
||||
created_at: now(),
|
||||
});
|
||||
});
|
||||
|
||||
afterEach(() => {
|
||||
closeDb();
|
||||
if (fs.existsSync(TEST_DIR)) fs.rmSync(TEST_DIR, { recursive: true, force: true });
|
||||
});
|
||||
|
||||
describe('approval response authorization', () => {
|
||||
it('ignores a valid approval id clicked by a non-admin user', async () => {
|
||||
const { registerApprovalHandler } = await import('./primitive.js');
|
||||
const { handleApprovalsResponse } = await import('./response-handler.js');
|
||||
const handler = vi.fn().mockResolvedValue(undefined);
|
||||
registerApprovalHandler('install_packages', handler);
|
||||
|
||||
createPendingApproval({
|
||||
approval_id: 'appr-1',
|
||||
session_id: 'sess-1',
|
||||
request_id: 'appr-1',
|
||||
action: 'install_packages',
|
||||
payload: JSON.stringify({ packages: ['left-pad'] }),
|
||||
created_at: now(),
|
||||
title: 'Install packages',
|
||||
options_json: JSON.stringify([]),
|
||||
});
|
||||
|
||||
const claimed = await handleApprovalsResponse({
|
||||
questionId: 'appr-1',
|
||||
value: 'approve',
|
||||
userId: 'stranger',
|
||||
channelType: 'telegram',
|
||||
platformId: 'dm-stranger',
|
||||
threadId: null,
|
||||
});
|
||||
|
||||
expect(claimed).toBe(true);
|
||||
expect(handler).not.toHaveBeenCalled();
|
||||
expect(getPendingApproval('appr-1')).toBeDefined();
|
||||
});
|
||||
|
||||
it('allows an owner/admin click to dispatch the registered approval handler', async () => {
|
||||
upsertUser({ id: 'telegram:owner', kind: 'telegram', display_name: 'Owner', created_at: now() });
|
||||
grantRole({ user_id: 'telegram:owner', role: 'owner', agent_group_id: null, granted_by: null, granted_at: now() });
|
||||
|
||||
const { registerApprovalHandler } = await import('./primitive.js');
|
||||
const { handleApprovalsResponse } = await import('./response-handler.js');
|
||||
const handler = vi.fn().mockResolvedValue(undefined);
|
||||
registerApprovalHandler('install_packages_allowed', handler);
|
||||
|
||||
createPendingApproval({
|
||||
approval_id: 'appr-2',
|
||||
session_id: 'sess-1',
|
||||
request_id: 'appr-2',
|
||||
action: 'install_packages_allowed',
|
||||
payload: JSON.stringify({ packages: ['left-pad'] }),
|
||||
created_at: now(),
|
||||
title: 'Install packages',
|
||||
options_json: JSON.stringify([]),
|
||||
});
|
||||
|
||||
const claimed = await handleApprovalsResponse({
|
||||
questionId: 'appr-2',
|
||||
value: 'approve',
|
||||
userId: 'owner',
|
||||
channelType: 'telegram',
|
||||
platformId: 'dm-owner',
|
||||
threadId: null,
|
||||
});
|
||||
|
||||
expect(claimed).toBe(true);
|
||||
expect(handler).toHaveBeenCalledTimes(1);
|
||||
expect(handler).toHaveBeenCalledWith(expect.objectContaining({ userId: 'telegram:owner' }));
|
||||
expect(getPendingApproval('appr-2')).toBeUndefined();
|
||||
});
|
||||
|
||||
it('allows global admins to resolve approvals without a session-scoped agent group', async () => {
|
||||
upsertUser({ id: 'telegram:global-admin', kind: 'telegram', display_name: 'Global Admin', created_at: now() });
|
||||
grantRole({
|
||||
user_id: 'telegram:global-admin',
|
||||
role: 'admin',
|
||||
agent_group_id: null,
|
||||
granted_by: null,
|
||||
granted_at: now(),
|
||||
});
|
||||
|
||||
const { registerApprovalHandler } = await import('./primitive.js');
|
||||
const { handleApprovalsResponse } = await import('./response-handler.js');
|
||||
const handler = vi.fn().mockResolvedValue(undefined);
|
||||
registerApprovalHandler('global_admin_allowed', handler);
|
||||
|
||||
createPendingApproval({
|
||||
approval_id: 'appr-3',
|
||||
session_id: 'sess-1',
|
||||
agent_group_id: null,
|
||||
request_id: 'appr-3',
|
||||
action: 'global_admin_allowed',
|
||||
payload: JSON.stringify({ packages: ['left-pad'] }),
|
||||
created_at: now(),
|
||||
title: 'Install packages',
|
||||
options_json: JSON.stringify([]),
|
||||
});
|
||||
|
||||
const claimed = await handleApprovalsResponse({
|
||||
questionId: 'appr-3',
|
||||
value: 'approve',
|
||||
userId: 'global-admin',
|
||||
channelType: 'telegram',
|
||||
platformId: 'dm-global-admin',
|
||||
threadId: null,
|
||||
});
|
||||
|
||||
expect(claimed).toBe(true);
|
||||
expect(handler).toHaveBeenCalledTimes(1);
|
||||
expect(getPendingApproval('appr-3')).toBeUndefined();
|
||||
});
|
||||
});
|
||||
@@ -18,27 +18,35 @@ import type { ResponsePayload } from '../../response-registry.js';
|
||||
import { log } from '../../log.js';
|
||||
import { writeSessionMessage } from '../../session-manager.js';
|
||||
import type { PendingApproval } from '../../types.js';
|
||||
import { hasAdminPrivilege, isGlobalAdmin, isOwner } from '../permissions/db/user-roles.js';
|
||||
import { ONECLI_ACTION, resolveOneCLIApproval } from './onecli-approvals.js';
|
||||
import { getApprovalHandler } from './primitive.js';
|
||||
|
||||
export async function handleApprovalsResponse(payload: ResponsePayload): Promise<boolean> {
|
||||
// OneCLI credential approvals — resolved via in-memory Promise first.
|
||||
if (resolveOneCLIApproval(payload.questionId, payload.value)) {
|
||||
return true;
|
||||
}
|
||||
|
||||
// DB-backed pending_approvals.
|
||||
const approval = getPendingApproval(payload.questionId);
|
||||
if (!approval) return false;
|
||||
|
||||
if (!isAuthorizedApprovalClick(approval, payload)) {
|
||||
log.warn('Ignoring unauthorized approval response', {
|
||||
approvalId: approval.approval_id,
|
||||
action: approval.action,
|
||||
userId: payload.userId,
|
||||
channelType: payload.channelType,
|
||||
});
|
||||
return true;
|
||||
}
|
||||
|
||||
if (approval.action === ONECLI_ACTION) {
|
||||
if (resolveOneCLIApproval(payload.questionId, payload.value)) {
|
||||
return true;
|
||||
}
|
||||
// Row exists but the in-memory resolver is gone (timer fired or the process
|
||||
// was in a weird state). Nothing to do — just drop the row.
|
||||
deletePendingApproval(payload.questionId);
|
||||
return true;
|
||||
}
|
||||
|
||||
await handleRegisteredApproval(approval, payload.value, payload.userId ?? '');
|
||||
await handleRegisteredApproval(approval, payload.value, namespacedUserId(payload) ?? '');
|
||||
return true;
|
||||
}
|
||||
|
||||
@@ -104,3 +112,22 @@ async function handleRegisteredApproval(
|
||||
deletePendingApproval(approval.approval_id);
|
||||
await wakeContainer(session);
|
||||
}
|
||||
|
||||
function namespacedUserId(payload: ResponsePayload): string | null {
|
||||
if (!payload.userId) return null;
|
||||
return payload.userId.includes(':') ? payload.userId : `${payload.channelType}:${payload.userId}`;
|
||||
}
|
||||
|
||||
function isAuthorizedApprovalClick(approval: PendingApproval, payload: ResponsePayload): boolean {
|
||||
const userId = namespacedUserId(payload);
|
||||
if (!userId) return false;
|
||||
|
||||
const agentGroupId =
|
||||
approval.agent_group_id ?? (approval.session_id ? getSession(approval.session_id)?.agent_group_id : null);
|
||||
|
||||
if (!agentGroupId) {
|
||||
return isOwner(userId) || isGlobalAdmin(userId);
|
||||
}
|
||||
|
||||
return hasAdminPrivilege(userId, agentGroupId);
|
||||
}
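The two helpers above normalize the clicking user's id and then check it against role grants. A self-contained sketch of the same idea follows, assuming a simplified in-memory grant list in place of the real `isOwner`, `isGlobalAdmin`, and `hasAdminPrivilege` lookups, whose exact semantics are not shown in this diff. `ClickPayload`, `RoleGrant`, `namespaceUserId`, and `isAuthorizedClick` are hypothetical names.

```typescript
// Hypothetical stand-ins for ResponsePayload and the user-roles tables.
interface ClickPayload {
  userId?: string | null;
  channelType: string;
}

interface RoleGrant {
  userId: string;
  role: 'owner' | 'admin';
  agentGroupId: string | null; // null means a global grant
}

// Prefix a bare platform id with its channel type; leave namespaced ids alone.
function namespaceUserId(payload: ClickPayload): string | null {
  if (!payload.userId) return null;
  return payload.userId.includes(':') ? payload.userId : `${payload.channelType}:${payload.userId}`;
}

// Assumption: a global grant authorizes any approval, while a scoped grant
// authorizes only approvals for its own agent group.
function isAuthorizedClick(grants: RoleGrant[], agentGroupId: string | null, payload: ClickPayload): boolean {
  const userId = namespaceUserId(payload);
  if (!userId) return false;
  return grants.some((g) => g.userId === userId && (g.agentGroupId === null || g.agentGroupId === agentGroupId));
}
```

With this shape, a click from `{ userId: 'owner', channelType: 'telegram' }` is looked up as `telegram:owner`, which is the namespaced form the regression test expects the handler to receive.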
|
||||
|
||||
@@ -0,0 +1,90 @@
|
||||
import fs from 'fs';
|
||||
import path from 'path';
|
||||
|
||||
import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest';
|
||||
|
||||
vi.mock('./config.js', async () => {
|
||||
const actual = await vi.importActual<typeof import('./config.js')>('./config.js');
|
||||
return { ...actual, DATA_DIR: '/tmp/nanoclaw-test-upgrade-state' };
|
||||
});
|
||||
|
||||
const TEST_DIR = '/tmp/nanoclaw-test-upgrade-state';
|
||||
|
||||
import {
|
||||
enforceUpgradeTripwire,
|
||||
getCodeVersion,
|
||||
isUpgradeCurrent,
|
||||
markerPath,
|
||||
readUpgradeState,
|
||||
writeUpgradeState,
|
||||
} from './upgrade-state.js';
|
||||
|
||||
beforeEach(() => {
|
||||
fs.rmSync(TEST_DIR, { recursive: true, force: true });
|
||||
});
|
||||
afterEach(() => {
|
||||
fs.rmSync(TEST_DIR, { recursive: true, force: true });
|
||||
});
|
||||
|
||||
describe('upgrade-state', () => {
|
||||
it('getCodeVersion reads the package.json version', () => {
|
||||
const pkg = JSON.parse(fs.readFileSync(path.join(process.cwd(), 'package.json'), 'utf8'));
|
||||
expect(getCodeVersion()).toBe(pkg.version);
|
||||
});
|
||||
|
||||
it('readUpgradeState returns null when the marker is absent', () => {
|
||||
expect(readUpgradeState()).toBeNull();
|
||||
});
|
||||
|
||||
it('write then read round-trips, with version/via/updatedAt', () => {
|
||||
const written = writeUpgradeState({ version: '9.9.9', via: 'test' });
|
||||
expect(written).toMatchObject({ version: '9.9.9', via: 'test' });
|
||||
expect(written.updatedAt).toBeTruthy();
|
||||
expect(readUpgradeState()).toEqual(written);
|
||||
});
|
||||
|
||||
it('write defaults the version to the code version', () => {
|
||||
expect(writeUpgradeState({ via: 'test' }).version).toBe(getCodeVersion());
|
||||
});
|
||||
|
||||
it('isUpgradeCurrent: false when absent, false on mismatch, true on match', () => {
|
||||
expect(isUpgradeCurrent()).toBe(false);
|
||||
writeUpgradeState({ version: '0.0.0-nope', via: 'test' });
|
||||
expect(isUpgradeCurrent()).toBe(false);
|
||||
writeUpgradeState({ version: getCodeVersion(), via: 'test' });
|
||||
expect(isUpgradeCurrent()).toBe(true);
|
||||
});
|
||||
|
||||
it('treats a corrupt marker as absent (fails closed, never throws)', () => {
|
||||
fs.mkdirSync(TEST_DIR, { recursive: true });
|
||||
fs.writeFileSync(path.join(TEST_DIR, 'upgrade-state.json'), '{ this is not json');
|
||||
expect(() => readUpgradeState()).not.toThrow();
|
||||
expect(readUpgradeState()).toBeNull();
|
||||
expect(isUpgradeCurrent()).toBe(false);
|
||||
});
|
||||
|
||||
it('markerPath is upgrade-state.json under the data dir', () => {
|
||||
expect(markerPath()).toBe(path.join(TEST_DIR, 'upgrade-state.json'));
|
||||
});
|
||||
|
||||
it('enforceUpgradeTripwire exits when not current and passes when current', () => {
|
||||
const exitSpy = vi.spyOn(process, 'exit').mockImplementation(((code?: number) => {
|
||||
throw new Error(`exit:${code}`);
|
||||
}) as never);
|
||||
const errSpy = vi.spyOn(console, 'error').mockImplementation(() => {});
|
||||
|
||||
// No marker → trips.
|
||||
expect(() => enforceUpgradeTripwire()).toThrow('exit:1');
|
||||
|
||||
// Stale marker → trips.
|
||||
writeUpgradeState({ version: '0.0.0-nope', via: 'test' });
|
||||
expect(() => enforceUpgradeTripwire()).toThrow('exit:1');
|
||||
|
||||
// Matching marker → passes.
|
||||
writeUpgradeState({ version: getCodeVersion(), via: 'test' });
|
||||
expect(() => enforceUpgradeTripwire()).not.toThrow();
|
||||
|
||||
exitSpy.mockRestore();
|
||||
errSpy.mockRestore();
|
||||
});
|
||||
});
|
||||
@@ -0,0 +1,126 @@
|
||||
/**
|
||||
* Upgrade marker — the record that an install reached its current version
|
||||
* through a sanctioned path (setup / `/update-nanoclaw` / `/migrate-nanoclaw`).
|
||||
*
|
||||
* The startup tripwire (enforceUpgradeTripwire) refuses to run if the marker
|
||||
* is missing or its version doesn't match the running code — i.e. if the
|
||||
* install was updated by a raw `git pull` instead of the supported flow.
|
||||
*
|
||||
* The marker lives in `data/` (gitignored), so a `git pull` can't touch it.
|
||||
* Only the sanctioned paths call writeUpgradeState(); clearing the tripwire
|
||||
* by hand runs the same `set` command; see docs/upgrade-recovery.md.
|
||||
*/
|
||||
import fs from 'fs';
|
||||
import path from 'path';
|
||||
|
||||
import { DATA_DIR } from './config.js';
|
||||
import { log } from './log.js';
|
||||
|
||||
export interface UpgradeState {
|
||||
version: string;
|
||||
updatedAt: string;
|
||||
via: string;
|
||||
}
|
||||
|
||||
const MARKER_PATH = path.join(DATA_DIR, 'upgrade-state.json');
|
||||
const FIX_COMMAND = 'pnpm exec tsx scripts/upgrade-state.ts set';
|
||||
|
||||
/** Version the running code declares, read from package.json. */
|
||||
export function getCodeVersion(): string {
|
||||
const pkgPath = path.join(process.cwd(), 'package.json');
|
||||
const pkg = JSON.parse(fs.readFileSync(pkgPath, 'utf8')) as { version?: string };
|
||||
if (!pkg.version) throw new Error(`No version field in ${pkgPath}`);
|
||||
return pkg.version;
|
||||
}
|
||||
|
||||
/**
|
||||
* Read the upgrade marker, or null if it's absent, unreadable, or corrupt.
|
||||
* Never throws — a boot gate must fail closed (treat anything it can't trust
|
||||
* as "no valid marker" → trip), not crash with a stack trace.
|
||||
*/
|
||||
export function readUpgradeState(): UpgradeState | null {
|
||||
let raw: string;
|
||||
try {
|
||||
raw = fs.readFileSync(MARKER_PATH, 'utf8');
|
||||
} catch (e: unknown) {
|
||||
if ((e as NodeJS.ErrnoException).code === 'ENOENT') return null;
|
||||
log.warn('Could not read upgrade marker; treating as absent', { path: MARKER_PATH, err: String(e) });
|
||||
return null;
|
||||
}
|
||||
try {
|
||||
return JSON.parse(raw) as UpgradeState;
|
||||
} catch {
|
||||
log.warn('Upgrade marker is corrupt; treating as absent', { path: MARKER_PATH });
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Stamp the marker. Only the sanctioned paths (setup / update / migrate)
|
||||
* call this on success; `version` defaults to the current code version.
|
||||
*/
|
||||
export function writeUpgradeState(opts: { version?: string; via: string }): UpgradeState {
|
||||
const state: UpgradeState = {
|
||||
version: opts.version ?? getCodeVersion(),
|
||||
updatedAt: new Date().toISOString(),
|
||||
via: opts.via,
|
||||
};
|
||||
fs.mkdirSync(DATA_DIR, { recursive: true });
|
||||
fs.writeFileSync(MARKER_PATH, JSON.stringify(state, null, 2) + '\n');
|
||||
return state;
|
||||
}
|
||||
|
||||
/** True when the marker exists and matches the running code version. */
|
||||
export function isUpgradeCurrent(): boolean {
|
||||
const state = readUpgradeState();
|
||||
return state !== null && state.version === getCodeVersion();
|
||||
}
|
||||
|
||||
/** Absolute path to the marker file. */
|
||||
export function markerPath(): string {
|
||||
return MARKER_PATH;
|
||||
}
|
||||
|
||||
/**
|
||||
* Startup gate. If the install didn't reach the current version through a
|
||||
* sanctioned path, stop with a message written for the coding agent that
|
||||
* just ran the upgrade to act on automatically.
|
||||
*/
|
||||
export function enforceUpgradeTripwire(): void {
|
||||
if (isUpgradeCurrent()) return;
|
||||
|
||||
const code = getCodeVersion();
|
||||
const recorded = readUpgradeState()?.version ?? 'none';
|
||||
|
||||
console.error(
|
||||
[
|
||||
'',
|
||||
'='.repeat(64),
|
||||
'NanoClaw stopped: update did not go through the supported path',
|
||||
'='.repeat(64),
|
||||
'This install was updated outside the sanctioned upgrade flow',
|
||||
'(no upgrade marker, or the marker version does not match the code).',
|
||||
'',
|
||||
` code version: ${code}`,
|
||||
` recorded version: ${recorded}`,
|
||||
'',
|
||||
"IF YOU'RE A HUMAN: you most likely ran `git pull` directly. Don't",
|
||||
'fix this by hand — open Claude Code in this folder and run',
|
||||
'/update-nanoclaw, which repairs the install and clears this stop.',
|
||||
'',
|
||||
'CODING AGENT: if you just ran /update-nanoclaw, /migrate-nanoclaw, or',
|
||||
'setup and it completed successfully, clear this automatically by running:',
|
||||
'',
|
||||
` ${FIX_COMMAND}`,
|
||||
'',
|
||||
'then restart NanoClaw. If the upgrade did NOT complete, stop and',
|
||||
'investigate instead of clearing.',
|
||||
'',
|
||||
'Details: docs/upgrade-recovery.md',
|
||||
'='.repeat(64),
|
||||
'',
|
||||
].join('\n'),
|
||||
);
|
||||
log.error('Upgrade tripwire: install not on the sanctioned path', { code, recorded });
|
||||
process.exit(1);
|
||||
}
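The fail-closed rule used by the tripwire (absent, unreadable, or corrupt markers all count as "not current") can be shown without touching the filesystem. This is a sketch under stated assumptions: `parseMarker` and `isCurrent` are hypothetical pure-function counterparts of `readUpgradeState` and `isUpgradeCurrent`, taking the raw file text and the code version as arguments. The extra check that `version` is a string is an addition for illustration and is not in the source above.

```typescript
// Hypothetical pure-function variant; the real module reads DATA_DIR/upgrade-state.json.
interface MarkerState {
  version: string;
  updatedAt: string;
  via: string;
}

// Parse raw marker text. Returns null for absent or corrupt input and never throws.
function parseMarker(raw: string | null): MarkerState | null {
  if (raw === null) return null; // marker file absent
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed !== 'object' || parsed === null) return null;
    // Assumption: reject markers without a string version (not in the original).
    if (typeof (parsed as { version?: unknown }).version !== 'string') return null;
    return parsed as MarkerState;
  } catch {
    return null; // corrupt JSON is treated as absent
  }
}

// True only when a valid marker records exactly the running code version.
function isCurrent(raw: string | null, codeVersion: string): boolean {
  const state = parseMarker(raw);
  return state !== null && state.version === codeVersion;
}
```

Because every failure path collapses to `null`, the boot gate has a single "trip" branch to reason about, which is the property the corrupt-marker test in this diff exercises.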
|
||||