mirror of
https://github.com/qwibitai/nanoclaw.git
synced 2026-06-04 10:14:47 +08:00
docs: drop v2 refactor planning docs ahead of merge
Removes transient analysis/proposal/checklist docs whose purpose is served once v2 ships: REFACTOR.md, docs/v1-vs-v2/, docs/checklist.md, docs/shared-source.md, docs/claude-md-composition.md, docs/module-contract.md, docs/DEBUG_CHECKLIST.md. Updates CLAUDE.md and docs/README.md index rows accordingly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -209,7 +209,6 @@ This project uses pnpm with `minimumReleaseAge: 4320` (3 days) in `pnpm-workspac
| [docs/agent-runner-details.md](docs/agent-runner-details.md) | Agent-runner internals + MCP tool interface |
| [docs/isolation-model.md](docs/isolation-model.md) | Three-level channel isolation model |
| [docs/setup-wiring.md](docs/setup-wiring.md) | What's wired, what's open in the setup flow |
| [docs/checklist.md](docs/checklist.md) | Rolling status checklist across all subsystems |
| [docs/architecture-diagram.md](docs/architecture-diagram.md) | Diagram version of the architecture |
| [docs/build-and-runtime.md](docs/build-and-runtime.md) | Runtime split (Node host + Bun container), lockfiles, image build surface, CI, key invariants |
-175
@@ -1,175 +0,0 @@
|
||||
# NanoClaw Refactor — Forward-Looking Reference
|
||||
|
||||
Consolidates what's still relevant from `REFACTOR_PLAN.md` and `REFACTOR_EXECUTION.md`: open decisions, remaining work, operational patterns worth keeping. Historical PR timeline and phase framing have been dropped — the work is in the commit history.
|
||||
|
||||
---
|
||||
|
||||
## Architecture (still authoritative)
|
||||
|
||||
### Module tiers
|
||||
|
||||
Three categories, distinguished by shipping model and dependency direction:
|
||||
|
||||
| Tier | Where it lives | Loaded by default? | Removal cost |
|------|----------------|--------------------|--------------|
| **Core** | `src/**` (outside `src/modules/`, `src/channels/`, `src/providers/`) | always | N/A (can't remove) |
| **Default modules** | `src/modules/<name>/` on main | yes, imported by `src/modules/index.ts` | edit core imports (intentional friction) |
| **Optional modules** | `src/modules/<name>/` on main (for now; see open q #7) | yes, via barrel import | delete files + barrel line + revert `MODULE-HOOK` edits |
| **Channel adapters** | `src/channels/<name>.ts` on `channels` branch | no; cherry-pick via `/add-<name>` | delete files + barrel line |
| **Providers** | on `providers` branch | no; cherry-pick via `/add-<provider>` | delete files + barrel line |
||||
|
||||
Default modules today: `typing`, `mount-security`, `approvals`, `cli`.
|
||||
Optional modules: `interactive`, `scheduling`, `permissions`, `agent-to-agent`, `self-mod`.
|
||||
|
||||
Dependency rule: **core ← default modules ← optional modules**. Optional modules must not depend on each other. Known transitional violation (flagged): `src/db/messaging-groups.ts` auto-wires `agent_destinations` when agent-to-agent is installed.
|
||||
|
||||
### The four registries
|
||||
|
||||
Full contract in [`docs/module-contract.md`](docs/module-contract.md). Summary:
|
||||
|
||||
1. **Delivery action handlers**: `delivery.ts`; modules call `registerDeliveryAction(name, fn)`.
2. **Router inbound gate**: `router.ts`; two single-setter hooks (`setSenderResolver` and `setAccessGate`). Default: allow-all.
3. **Response dispatcher**: `response-registry.ts`; modules call `registerResponseHandler(fn)`. The first handler to return `true` claims the response.
4. **Container MCP tool self-registration**: `container/agent-runner/src/mcp-tools/server.ts`; modules call `registerTools([...])` at import.
|
||||
|
||||
Anything else single-consumer uses either a `sqlite_master`-guarded inline read or a `MODULE-HOOK:<name>:start/end` skill edit.
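As an illustration of registry 3, first-claim dispatch could look roughly like the sketch below. Only `registerResponseHandler` and the "first to return `true` claims" rule come from the summary above; the payload shape, `dispatchResponse`, and the two example handlers are assumptions made for this sketch, not the contents of `response-registry.ts`.

```typescript
// Hypothetical payload shape; the real one is not described in this document.
interface ResponsePayload {
  questionId: string;
  value: string;
}

// A handler returns true to claim the response, false to pass it on.
type ResponseHandler = (payload: ResponsePayload) => boolean;

// Registry state lives in this file only (no imports), so modules can
// register at import time without touching the entry point.
const responseHandlers: ResponseHandler[] = [];

function registerResponseHandler(fn: ResponseHandler): void {
  responseHandlers.push(fn);
}

// Walk handlers in registration order and stop at the first claim.
function dispatchResponse(payload: ResponsePayload): boolean {
  for (let i = 0; i < responseHandlers.length; i++) {
    if (responseHandlers[i](payload)) {
      return true;
    }
  }
  return false; // nobody claimed it
}

// Two illustrative handlers with disjoint ID prefixes (labels are made up).
const claimedBy: string[] = [];
registerResponseHandler((p) => {
  if (p.questionId.indexOf("q-") !== 0) return false;
  claimedBy.push("question-handler");
  return true;
});
registerResponseHandler((p) => {
  if (p.questionId.indexOf("appr-") !== 0) return false;
  claimedBy.push("approval-handler");
  return true;
});
```

Because each handler rejects IDs outside its own prefix, registration order does not change the outcome here; order would only matter if two handlers could claim the same ID.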
|
||||
|
||||
### Module distribution (pending)
|
||||
|
||||
- **`main`** — core + default modules + default channel (`cli`). Ships clean.
|
||||
- **`channels`** — fully loaded runnable branch with all channel adapters; skills cherry-pick from it.
|
||||
- **`providers`** — same pattern for agent providers (OpenCode).
|
||||
- **`modules` branch** — proposed but NOT created yet. See "Remaining work" below.
|
||||
|
||||
---
|
||||
|
||||
## Remaining work
|
||||
|
||||
### Phase 5: merge `v2` → `main`
|
||||
|
||||
Cut over to the refactor. Pre-reqs (already met): green build, green tests, green service boot, clean `channels` / `providers` syncs.
|
||||
|
||||
Open logistics:
|
||||
- Release versioning: bump to `1.3.0` at merge time or cut a `v2-rc` tag first for internal testing? Non-blocking — decide at merge.
|
||||
- Coordinate with anyone still running the old `main` (v1.2.53) — breaking change for them.
|
||||
- Announce the new layout + the one shell command that changed (`pnpm run chat` is new default).
|
||||
|
||||
### `modules` branch — create, skip, or defer?
|
||||
|
||||
The original plan (PR #10) was to fork a `modules` branch and populate it with the 5 optional modules, so future `/add-<module>` skills pull via `git show origin/modules:path`. Three paths:
|
||||
|
||||
- **(a) Create it now.** Matches the `channels`/`providers` pattern for consistency. Extra surface to maintain: every core change must be merged into `modules` at phase boundaries (same cadence as channels/providers). Pays off if we ever want to make a module *truly* optional (not shipped on main).
|
||||
- **(b) Skip it.** Leave all 5 optional modules shipped on main. No `modules` branch, no install skills, no cherry-picking. Simpler but loses the "opt-in" property for users who want a leaner install.
|
||||
- **(c) Defer.** Ship main without the modules branch; create it later if someone actually wants to slim their install. No-cost option for now.
|
||||
|
||||
Recommendation leans toward (c) — we've already paid the architectural cost (tier boundary, dependency rule, registries) without needing the branch today.
|
||||
|
||||
### Per-module follow-ups (tracked as open questions below)
|
||||
|
||||
Each has a specific landing zone when we get to it:
|
||||
- #1–3 (admin mechanism, providers registry, container-runner audit): scope a focused cleanup pass.
- #16 (CLAUDE.md review): single dedicated PR touching every module.
- #4 (A2A / destinations rethink): requires design, not just cleanup.
- #5–6 (self-mod rethink, per-group source): requires design.
- #17 (system vs user CLAUDE.md): requires install-skill tooling.
|
||||
|
||||
---
|
||||
|
||||
## Operational patterns (keep using these)
|
||||
|
||||
### Standing checks for every PR
|
||||
|
||||
Non-negotiable; a unit test suite alone doesn't catch circular-import TDZ bugs:
|
||||
|
||||
1. `pnpm run build` clean.
|
||||
2. `pnpm test` + `bun test` (in `container/agent-runner/`) all green.
|
||||
3. **Service actually starts.** `gtimeout 5 node dist/index.js` (or `launchctl kickstart`) must reach `NanoClaw running`. Unit tests import individual files; only `main()` exercises the module-init order.
|
||||
4. Expected boot log lines present (at least: `Central DB ready`, `Delivery polls started`, `Host sweep started`, `NanoClaw running`, plus any module lifecycle line like `OneCLI approval handler started` or `CLI channel listening`).
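Check 4 lends itself to a small helper that reports which expected lines never showed up in captured boot output. The sketch below is only an assumption about how such a check might be written: `missingBootLines` is a hypothetical name, and the log format is not specified here, so it matches plain substrings.

```typescript
// Boot lines named in check 4. Module lifecycle lines vary by install,
// so callers append those themselves.
const REQUIRED_BOOT_LINES: string[] = [
  "Central DB ready",
  "Delivery polls started",
  "Host sweep started",
  "NanoClaw running",
];

// Returns the expected lines that do not appear anywhere in the output.
function missingBootLines(bootOutput: string, expected: string[]): string[] {
  const missing: string[] = [];
  for (let i = 0; i < expected.length; i++) {
    if (bootOutput.indexOf(expected[i]) === -1) {
      missing.push(expected[i]);
    }
  }
  return missing;
}
```

An empty result means every expected line was seen; anything else names the lines to investigate.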
|
||||
|
||||
### Module architecture rule (TDZ bug, PR #3)
|
||||
|
||||
Any registry state a module writes to at import time must live in a file with **no back-edge to `src/index.ts`** — transitively. `src/index.ts` imports `src/modules/index.js` for side effects; if a module calls `registerX()` at top level and `X` lives in `src/index.ts`, the ES module loader hits a TDZ reference on the const declaration. Fix: registry state lives in its own dependency-free file (e.g. `src/response-registry.ts`). Any new registry follows the same pattern.
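One way to check the "no back-edge, transitively" rule is a reachability walk over the import graph. The sketch below uses a hand-written graph: `reaches`, `importGraph`, `src/modules/approvals/index.ts`, and `src/bad-registry.ts` are hypothetical, and a real check would derive the graph from the sources rather than hard-code it.

```typescript
// File -> files it imports. Illustrative data only.
type ImportGraph = { [file: string]: string[] };

// True if `target` is reachable from `start` by following imports transitively.
function reaches(graph: ImportGraph, start: string, target: string): boolean {
  const seen: string[] = [];
  const stack: string[] = [start];
  while (stack.length > 0) {
    const file = stack.pop() as string;
    if (file === target) return true;
    if (seen.indexOf(file) !== -1) continue; // already explored
    seen.push(file);
    const deps = graph[file] || [];
    for (let i = 0; i < deps.length; i++) {
      stack.push(deps[i]);
    }
  }
  return false;
}

const importGraph: ImportGraph = {
  "src/index.ts": ["src/modules/index.ts", "src/response-registry.ts"],
  "src/modules/index.ts": ["src/modules/approvals/index.ts"],
  "src/modules/approvals/index.ts": ["src/response-registry.ts"],
  // Dependency-free registry file: safe to write to at import time.
  "src/response-registry.ts": [],
  // Hypothetical registry that imports back into the entry point: unsafe.
  "src/bad-registry.ts": ["src/index.ts"],
};
```

A registry file passes when `reaches(importGraph, file, "src/index.ts")` is `false`.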
|
||||
|
||||
### Branch sync procedure
|
||||
|
||||
After every `v2` (or future `main`) sync into `channels` / `providers` / future `modules`:
|
||||
|
||||
1. **File-presence diff.** Enumerate files that existed pre-sync but are missing post-sync:
|
||||
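```
# --name-only prints one path per line, so paths containing spaces survive intact
git ls-tree -r --name-only <pre-sync> | sort > /tmp/pre.txt
git ls-tree -r --name-only <post-sync> | sort > /tmp/post.txt
# Lines unique to the pre-sync list are the files that went missing
comm -23 /tmp/pre.txt /tmp/post.txt
```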
|
||||
Classify each missing file:
|
||||
- **Intentional** (core deleted it) → leave deleted.
|
||||
- **Branch-owned** (channels branch still needs it) → restore from pre-sync HEAD.
|
||||
|
||||
This has caught real losses on both `channels` (17 adapter files plus 3 setup scripts after PR #2's channel move) and `providers` (opencode files after PR #2).
|
||||
|
||||
2. **Cross-file consistency.** When restoring a file, check whether something *else* that also changed references it (e.g. `setup/index.ts`'s `STEPS` map).
|
||||
|
||||
3. **Run the standing checks** against the synced branch (not just v2).
|
||||
|
||||
### Prettier drift pattern
|
||||
|
||||
The `format:fix` pre-commit hook sometimes reformats peer files *after* the commit completes, leaving cosmetic-only diffs in the working tree. Discard with `git checkout -- <files>`. Do not re-commit the drift — it's trivial whitespace and noise.
|
||||
|
||||
---
|
||||
|
||||
## Open questions (curated)
|
||||
|
||||
### Design / architecture
|
||||
|
||||
1. **`NANOCLAW_ADMIN_USER_IDS` as the admin mechanism.** Host queries `user_roles` at container wake, collapses into env var, container compares sender IDs. Conflates identity-at-send with privilege-at-wake and forces the container to care about namespaced user IDs. Revisit during a container-runner audit.
|
||||
|
||||
2. **Host-side `src/providers/` registry.** One real consumer (OpenCode). A registry is probably overkill — the install skill could just edit `container-runner.ts` via `MODULE-HOOK`. Fold into the container-runner audit.
|
||||
|
||||
3. **Container-runner audit.** `src/container-runner.ts` has accreted wake/spawn/kill, mount assembly, OneCLI credential application, admin-ID env var, idle timers, image rebuild. Some pieces should pull apart or move into modules. Not blocking. Related to #1 and #2.
|
||||
|
||||
4. **Revisit destinations + A2A capability holistically.** The destination projection invariant, dual-purpose routing+ACL table, channel vs agent destination shapes, `createMessagingGroupAgent` auto-wire coupling — more machinery than the feature warrants. Phase 3 moved it out of core intact; a redesign is warranted but scoped post-refactor.
|
||||
|
||||
5. **Self-mod approach rethink.** _Partially addressed_ — the redundant `request_rebuild` tool was removed; approval of `install_packages` now bundles rebuild + container restart, and `add_mcp_server` approval restarts without rebuilding (bun runs TS directly). Still to consider: collapsing `install_packages` + `add_mcp_server` into a single "apply this container-config diff" approval primitive to reduce post-rebuild latency further.
|
||||
|
||||
6. **Per-agent-group source / per-group base image.** Self-mod today layers packages/MCP on a shared base. As groups diverge (different base images, provider configs, runtime toolchains), the shared-base assumption won't scale. Scope post-refactor.
|
||||
|
||||
### Distribution / operational
|
||||
|
||||
7. **Providers on a consolidated `modules` branch?** Staying separate for now. Revisit if a second optional provider appears.
|
||||
|
||||
8. **Per-group module enablement.** Modules are currently project-wide. If one agent group wants approvals and another doesn't, we'd need per-group feature flags. Flag if asked.
|
||||
|
||||
9. **Module removal UX.** We do not drop tables on uninstall. Is that the right default? (Alternative: `/remove-<module>` optionally runs a down migration. YAGNI until requested.)
|
||||
|
||||
10. **Cross-module ordering for the response dispatcher.** Registration order determines who claims a given `questionId`. IDs are disjoint in practice (`q-…` vs `appr-…`), so first-match-wins is safe. If a third response-consuming module arrives, we may need keyed dispatch.
|
||||
|
||||
11. **Versioned module migrations.** Reinstalls are idempotent (migrator skips anything already in `schema_version`). If a module ships a *new* migration in a later version, the install skill must append the new file + barrel entry without touching prior ones. Simplest rule: install skills are additive; content changes to an already-applied migration are a hard error.
|
||||
|
||||
12. **Telegram pairing imports from permissions (channels branch).** `src/channels/telegram.ts` reaches into `src/modules/permissions/db/*` for `grantRole`/`hasAnyOwner`/`upsertUser` in the pairing-bootstrap branch. Cross-branch tier violation. Fix: extract those writes into a pairing helper (e.g. `src/channels/telegram-pairing-accept.ts` or `setup/pair-telegram.ts`). Non-blocking.
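Returning to item 11 (versioned module migrations), the "additive, and edits to applied migrations are a hard error" rule can be sketched as follows. This is an assumption-laden illustration, not the project's migrator: `applyMigrations`, the `Migration` shape, and the idea of recording each migration's text next to its ID are all hypothetical, and the plain object stands in for `schema_version`.

```typescript
interface Migration {
  id: string; // hypothetical identifier, e.g. a file name
  sql: string; // migration body
}

// Stand-in for schema_version: applied id -> the text that was applied.
// A real migrator would more likely store a hash than the full text.
type AppliedLog = { [id: string]: string };

// Runs only migrations that have not been applied yet, in order.
// Returns the ids that ran on this call.
function applyMigrations(
  applied: AppliedLog,
  migrations: Migration[],
  run: (sql: string) => void
): string[] {
  const ran: string[] = [];
  for (let i = 0; i < migrations.length; i++) {
    const m = migrations[i];
    if (Object.prototype.hasOwnProperty.call(applied, m.id)) {
      if (applied[m.id] !== m.sql) {
        // Content drift in an already-applied migration is a hard error.
        throw new Error("migration " + m.id + " changed after being applied");
      }
      continue; // already applied: reinstall stays idempotent
    }
    run(m.sql);
    applied[m.id] = m.sql;
    ran.push(m.id);
  }
  return ran;
}
```

A later module version then ships its new migration as an extra entry at the end of the list, and a reinstall runs only that entry.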
|
||||
|
||||
### Core slotting (files not explicitly discussed)
|
||||
|
||||
13. **`state-sqlite.ts`, `webhook-server.ts`, `timezone.ts`.** state-sqlite is likely core (host tracker). Webhook-server likely core (channel infra). Timezone likely core utility. Confirm if any of them prove to be module-shaped during future audits.
|
||||
|
||||
14. **Chat SDK bridge location.** `src/channels/chat-sdk-bridge.ts` is channel infra that bridges adapters on the `channels` branch. Stays in `src/channels/` for now.
|
||||
|
||||
15. **OneCLI credential injection.** Lives in `container-runner.ts`. Every agent call uses it, no clean optional boundary. Stays core. Related: `onecli-approvals.ts` is bundled inside the `approvals` default module on the assumption OneCLI stays in core. If OneCLI later moves to its own module, `onecli-approvals` follows.
|
||||
|
||||
### Documentation
|
||||
|
||||
16. **CLAUDE.md content per module.** Every module ships with project.md + agent.md. Need a dedicated review pass: (a) write the missing agent-to-agent snippets, (b) audit other modules for accuracy/tone, (c) confirm `agent.md` files are actually tailored for the agent vs. copy-pastes of `project.md`.
|
||||
|
||||
17. **Split system CLAUDE.md from user CLAUDE.md.** Project `CLAUDE.md` and `groups/global/CLAUDE.md` mix system-authored content (module contracts, install-skill appends) with user customizations. Updates currently risk clobbering user intent. Look at a system-owned region (or separate file) that skills rewrite freely plus a user-owned one that's never touched. Related to #16.
|
||||
|
||||
---
|
||||
|
||||
## Where the canonical references live
|
||||
|
||||
- **Module contract** — [`docs/module-contract.md`](docs/module-contract.md)
|
||||
- **Architecture overview** — [`docs/architecture.md`](docs/architecture.md)
|
||||
- **DB layout** — [`docs/db.md`](docs/db.md), [`docs/db-central.md`](docs/db-central.md), [`docs/db-session.md`](docs/db-session.md)
|
||||
- **Agent-runner internals** — [`docs/agent-runner-details.md`](docs/agent-runner-details.md)
|
||||
- **Channel isolation model** — [`docs/isolation-model.md`](docs/isolation-model.md)
|
||||
- **Build + runtime split** — [`docs/build-and-runtime.md`](docs/build-and-runtime.md)
|
||||
- **Top-level** — [`CLAUDE.md`](CLAUDE.md)
|
||||
|
||||
This doc (`REFACTOR.md`) is transient — prune when open questions close; retire entirely once the refactor is fully behind us and the operational patterns have been absorbed into `CLAUDE.md` or `docs/`.
|
||||
@@ -1,171 +0,0 @@
|
||||
# NanoClaw Debug Checklist
|
||||
|
||||
## Known Issues (2026-02-08)
|
||||
|
||||
### 1. [FIXED] Resume branches from stale tree position
|
||||
When agent teams spawns subagent CLI processes, they write to the same session JSONL. On subsequent `query()` resumes, the CLI reads the JSONL but may pick a stale branch tip (from before the subagent activity), causing the agent's response to land on a branch the host never receives a `result` for. **Fix**: pass `resumeSessionAt` with the last assistant message UUID to explicitly anchor each resume.
|
||||
|
||||
### 2. IDLE_TIMEOUT == CONTAINER_TIMEOUT (both 30 min)
|
||||
Both timers fire at the same time, so containers always exit via hard SIGKILL (code 137) instead of graceful `_close` sentinel shutdown. The idle timeout should be shorter (e.g., 5 min) so containers wind down between messages, while container timeout stays at 30 min as a safety net for stuck agents.
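The ordering constraint can be expressed as a small startup check. The sketch below assumes both timers are measured from the same instant, as the description above implies; `TimeoutConfig`, `expectedShutdownPath`, and `assertTimeoutsOrdered` are hypothetical names, not the project's configuration API.

```typescript
interface TimeoutConfig {
  idleTimeoutMs: number; // graceful wind-down after inactivity
  containerTimeoutMs: number; // hard safety net for stuck agents
}

type ShutdownPath = "graceful" | "hard-kill";

// If the idle timer does not fire strictly first, the hard kill wins.
function expectedShutdownPath(cfg: TimeoutConfig): ShutdownPath {
  return cfg.idleTimeoutMs < cfg.containerTimeoutMs ? "graceful" : "hard-kill";
}

// Fails fast on a configuration that can never shut down gracefully.
function assertTimeoutsOrdered(cfg: TimeoutConfig): void {
  if (expectedShutdownPath(cfg) !== "graceful") {
    throw new Error(
      "idle timeout (" + cfg.idleTimeoutMs + " ms) must be shorter than container timeout (" +
        cfg.containerTimeoutMs + " ms)"
    );
  }
}

const MINUTE_MS = 60 * 1000;
// The configuration described in this issue: both timers at 30 minutes.
const currentConfig: TimeoutConfig = { idleTimeoutMs: 30 * MINUTE_MS, containerTimeoutMs: 30 * MINUTE_MS };
// The suggested configuration: 5-minute idle, 30-minute safety net.
const suggestedConfig: TimeoutConfig = { idleTimeoutMs: 5 * MINUTE_MS, containerTimeoutMs: 30 * MINUTE_MS };
```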
|
||||
|
||||
### 3. Cursor advanced before agent succeeds
|
||||
`processGroupMessages` advances `lastAgentTimestamp` before the agent runs. If the container times out, retries find no messages (cursor already past them). Messages are permanently lost on timeout.
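A possible remedy is to advance the cursor only after the agent run succeeds. The sketch below is synchronous for brevity and is not the actual `processGroupMessages`; `processPending`, `QueuedMessage`, `GroupCursor`, and the boolean-returning `runAgent` callback are assumptions made for illustration.

```typescript
interface QueuedMessage {
  timestamp: number;
  text: string;
}

interface GroupCursor {
  lastAgentTimestamp: number;
}

// Hands pending messages to the agent and moves the cursor only on success,
// so a timed-out run leaves the same messages visible to the retry.
function processPending(
  cursor: GroupCursor,
  messages: QueuedMessage[],
  runAgent: (batch: QueuedMessage[]) => boolean
): boolean {
  const pending: QueuedMessage[] = [];
  let newest = cursor.lastAgentTimestamp;
  for (let i = 0; i < messages.length; i++) {
    if (messages[i].timestamp > cursor.lastAgentTimestamp) {
      pending.push(messages[i]);
      if (messages[i].timestamp > newest) newest = messages[i].timestamp;
    }
  }
  if (pending.length === 0) return true; // nothing to do
  const ok = runAgent(pending);
  if (ok) {
    cursor.lastAgentTimestamp = newest; // commit only after success
  }
  return ok;
}
```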
|
||||
|
||||
### 4. Kubernetes image garbage collection deletes nanoclaw-agent image
|
||||
|
||||
**Symptoms**: `Container exited with code 125: pull access denied for nanoclaw-agent` — the container image disappears overnight or after a few hours, even though you just built it.
|
||||
|
||||
**Cause**: If your container runtime has Kubernetes enabled (Rancher Desktop enables it by default), the kubelet runs image garbage collection when disk usage exceeds 85%. NanoClaw containers are ephemeral (run and exit), so `nanoclaw-agent:latest` is never protected by a running container. The kubelet sees it as unused and deletes it — often overnight when no messages are being processed. Other images (docker-compose services) survive because they have long-running containers referencing them.
|
||||
|
||||
**Fix**: Disable Kubernetes if you don't need it:
|
||||
```bash
# Rancher Desktop
rdctl set --kubernetes-enabled=false

# Then rebuild the container image
./container/build.sh
```
|
||||
|
||||
**Diagnosis**: Check the k3s log for image GC activity:
|
||||
```bash
grep -i "nanoclaw" ~/Library/Logs/rancher-desktop/k3s.log
# Look for: "Removing image to free bytes" with the nanoclaw-agent image ID
```
|
||||
|
||||
Check NanoClaw logs for image status:
|
||||
```bash
grep -E "image found|image NOT found|image missing" logs/nanoclaw.log
```
|
||||
|
||||
If you need Kubernetes enabled, set `CONTAINER_IMAGE` to an image stored in a registry that the kubelet won't GC, or raise the GC thresholds.
|
||||
|
||||
## Quick Status Check
|
||||
|
||||
```bash
# 1. Is the service running?
launchctl list | grep nanoclaw
# Expected: PID 0 com.nanoclaw (PID = running, "-" = not running, non-zero exit = crashed)

# 2. Any running containers?
docker ps --format '{{.Names}} {{.Status}}' 2>/dev/null | grep nanoclaw

# 3. Any stopped/orphaned containers?
docker ps -a --format '{{.Names}} {{.Status}}' 2>/dev/null | grep nanoclaw

# 4. Recent errors in service log?
grep -E 'ERROR|WARN' logs/nanoclaw.log | tail -20

# 5. Are channels connected? (look for last connection event)
grep -E 'Connected|Connection closed|connection.*close|channel.*ready' logs/nanoclaw.log | tail -5

# 6. Are groups loaded?
grep 'groupCount' logs/nanoclaw.log | tail -3
```
|
||||
|
||||
## Session Transcript Branching
|
||||
|
||||
```bash
# Check for concurrent CLI processes in session debug logs
ls -la data/sessions/<group>/.claude/debug/

# Count unique SDK processes that handled messages
# Each .txt file = one CLI subprocess. Multiple = concurrent queries.

# Check parentUuid branching in transcript
python3 -c "
import json

path = 'data/sessions/<group>/.claude/projects/-workspace-group/<session>.jsonl'
with open(path) as f:
    lines = f.read().strip().split('\n')
for i, line in enumerate(lines):
    try:
        d = json.loads(line)
        if d.get('type') == 'user' and d.get('message'):
            # parentUuid can be null (e.g. the first message); fall back to ROOT
            parent = (d.get('parentUuid') or 'ROOT')[:8]
            content = str(d['message'].get('content', ''))[:60]
            print(f'L{i+1} parent={parent} {content}')
    except (ValueError, AttributeError, TypeError):
        pass  # skip malformed or unexpected records
"
```
|
||||
|
||||
## Container Timeout Investigation
|
||||
|
||||
```bash
# Check for recent timeouts
grep -E 'Container timeout|timed out' logs/nanoclaw.log | tail -10

# Check container log files for the timed-out container
ls -lt groups/*/logs/container-*.log | head -10

# Read the most recent container log (replace path)
cat groups/<group>/logs/container-<timestamp>.log

# Check if retries were scheduled and what happened
grep -E 'Scheduling retry|retry|Max retries' logs/nanoclaw.log | tail -10
```
|
||||
|
||||
## Agent Not Responding
|
||||
|
||||
```bash
# Check if messages are being received from channels
grep 'New messages' logs/nanoclaw.log | tail -10

# Check if messages are being processed (container spawned)
grep -E 'Processing messages|Spawning container' logs/nanoclaw.log | tail -10

# Check if messages are being piped to active container
grep -E 'Piped messages|sendMessage' logs/nanoclaw.log | tail -10

# Check the queue state: any active containers?
grep -E 'Starting container|Container active|concurrency limit' logs/nanoclaw.log | tail -10

# Check lastAgentTimestamp vs latest message timestamp
sqlite3 store/messages.db "SELECT chat_jid, MAX(timestamp) as latest FROM messages GROUP BY chat_jid ORDER BY latest DESC LIMIT 5;"
```
|
||||
|
||||
## Container Mount Issues
|
||||
|
||||
```bash
# Check mount validation logs (shows on container spawn)
grep -E 'Mount validated|Mount.*REJECTED|mount' logs/nanoclaw.log | tail -10

# Verify the mount allowlist is readable
cat ~/.config/nanoclaw/mount-allowlist.json

# Check group's container_config in DB
sqlite3 store/messages.db "SELECT name, container_config FROM registered_groups;"

# Test-run a container to check mounts (dry run)
# Replace <group-folder> with the group's folder name
docker run -i --rm --entrypoint ls nanoclaw-agent:latest /workspace/extra/
```
|
||||
|
||||
## Channel Auth Issues
|
||||
|
||||
```bash
# Check if QR code was requested (means auth expired)
grep 'QR\|authentication required\|qr' logs/nanoclaw.log | tail -5

# Check auth files exist
ls -la store/auth/

# Re-authenticate if needed
pnpm run auth
```
|
||||
|
||||
## Service Management
|
||||
|
||||
```bash
# Restart the service
launchctl kickstart -k gui/$(id -u)/com.nanoclaw

# View live logs
tail -f logs/nanoclaw.log

# Stop the service (careful: running containers are detached, not killed)
launchctl bootout gui/$(id -u)/com.nanoclaw

# Start the service
launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/com.nanoclaw.plist

# Rebuild after code changes
pnpm run build && launchctl kickstart -k gui/$(id -u)/com.nanoclaw
```
|
||||
@@ -10,6 +10,5 @@ The files in this directory are original design documents and developer referenc
|
||||
| [SECURITY.md](SECURITY.md) | [Security model](https://docs.nanoclaw.dev/concepts/security) |
|
||||
| [REQUIREMENTS.md](REQUIREMENTS.md) | [Introduction](https://docs.nanoclaw.dev/introduction) |
|
||||
| [skills-as-branches.md](skills-as-branches.md) | [Skills system](https://docs.nanoclaw.dev/integrations/skills-system) |
|
||||
| [DEBUG_CHECKLIST.md](DEBUG_CHECKLIST.md) | [Troubleshooting](https://docs.nanoclaw.dev/advanced/troubleshooting) |
|
||||
| [docker-sandboxes.md](docker-sandboxes.md) | [Docker Sandboxes](https://docs.nanoclaw.dev/advanced/docker-sandboxes) |
|
||||
| [APPLE-CONTAINER-NETWORKING.md](APPLE-CONTAINER-NETWORKING.md) | [Container runtime](https://docs.nanoclaw.dev/advanced/container-runtime) |
|
||||
|
||||
@@ -1,276 +0,0 @@
|
||||
# NanoClaw Checklist
|
||||
|
||||
Status: [x] done, [~] partial, [ ] not started
|
||||
|
||||
---
|
||||
|
||||
## Core Architecture
|
||||
|
||||
- [x] Session DB replaces IPC (messages_in / messages_out as sole IO)
|
||||
- [x] Central DB (agent groups, messaging groups, sessions, routing)
|
||||
- [x] Host sweep (stale detection via heartbeat file, retry with backoff, recurrence scheduling)
|
||||
- [x] Active delivery polling (1s for running sessions)
|
||||
- [x] Sweep delivery polling (60s across all sessions)
|
||||
- [x] Container runner with session DB mounting
|
||||
- [x] Per-session container lifecycle and idle timeout
|
||||
- [ ] Replace the hard idle and container timeouts with work-aware prompts that ask the user whether to kill a stuck process
|
||||
- [x] Session resume (sessionId + resumeAt across queries)
|
||||
- [x] Graceful shutdown (SIGTERM/SIGINT handlers)
|
||||
- [x] Orphan container cleanup on startup
|
||||
|
||||
## Agent Runner (Container)
|
||||
|
||||
- [x] Poll loop (pending messages, status transitions, idle detection)
|
||||
- [x] Concurrent follow-up polling while agent is thinking
|
||||
- [x] Message formatter (chat, task, webhook, system kinds)
|
||||
- [x] Command categorization (admin, filtered, passthrough)
|
||||
- [x] Transcript archiving (pre-compact hook)
|
||||
- [x] XML message formatting with sender, timestamp
|
||||
- [~] Inbound media handling (native file support for Claude)
|
||||
|
||||
## Agent Providers
|
||||
|
||||
- [x] Claude provider (Agent SDK, tool allowlist, message stream, session resume)
|
||||
- [x] Mock provider (testing)
|
||||
- [x] Provider factory
|
||||
- [ ] Codex provider
|
||||
- [x] OpenCode provider
|
||||
|
||||
## Channel Adapters
|
||||
|
||||
- [x] Channel adapter interface (setup, deliver, teardown, typing)
|
||||
- [x] Chat SDK bridge (generic, works with any Chat SDK adapter)
|
||||
- [x] Chat SDK SQLite state adapter (KV, subscriptions, locks, lists)
|
||||
- [x] Discord via Chat SDK
|
||||
- [~] Slack via Chat SDK (adapter + skill written, not tested)
|
||||
- [x] Telegram via Chat SDK (E2E verified: inbound, routing, typing, delivery)
|
||||
- [~] Microsoft Teams via Chat SDK (adapter + skill written, not tested)
|
||||
- [~] Google Chat via Chat SDK (adapter + skill written, not tested)
|
||||
- [~] Linear via Chat SDK (adapter + skill written, not tested)
|
||||
- [~] GitHub via Chat SDK (adapter + skill written, not tested)
|
||||
- [~] WhatsApp Cloud API via Chat SDK (adapter + skill written, not tested)
|
||||
- [~] Resend (email) via Chat SDK (adapter + skill written, not tested)
|
||||
- [~] Matrix via Chat SDK (adapter + skill written, not tested)
|
||||
- [~] Webex via Chat SDK (adapter + skill written, not tested)
|
||||
- [~] iMessage via Chat SDK (adapter + skill written, not tested)
|
||||
- [x] Backward compatibility with native channels (old adapters still work)
|
||||
- [x] Channel barrel wired (src/index.ts imports barrel, skills uncomment)
|
||||
- [x] Setup flow wired to channels (channel skills + /manage-channels for registration + verify.ts checks all tokens)
|
||||
- [x] Channel Info metadata in each channel skill (type, terminology, how-to-find-id, isolation defaults)
|
||||
- [x] /manage-channels skill (wire channels to agent groups with three isolation levels)
|
||||
- [x] /init-first-agent skill (standalone first-agent bootstrap; walks the operator through channel pick → identity lookup → DM platform_id resolution → wire → welcome DM; fallback to telegram pair-code or "DM the bot first" lookup for channels without cold DM)
|
||||
- [x] Cold-DM infrastructure — `ChannelAdapter.openDM?(handle)` optional method, resolved via Chat SDK `chat.openDM` for resolution-required channels (Discord, Slack, Teams, Webex, gChat) and fall-through to the handle directly for direct-addressable channels (Telegram, WhatsApp, iMessage, Matrix, Resend). `src/user-dm.ts::ensureUserDm` caches every resolution in `user_dms` so subsequent cold DMs are a DB read.
|
||||
- [x] Agent-shared session mode (cross-channel shared sessions, e.g. GitHub + Slack)
|
||||
- [x] Auto-onboarding on channel registration (/welcome skill triggered on first wiring)
|
||||
- [ ] Wire different chat modes (mentions, whitelist, approve, etc.)
|
||||
|
||||
## Chat-First Setup Flow
|
||||
|
||||
**Goal:** get the user out of Claude Code and into their messaging app as quickly as possible, then enable every part of customization, configuration, and setup from inside the chat app. Claude Code is the bootstrap, not the home.
|
||||
|
||||
- [~] Minimum-viable bootstrap in Claude Code: install deps, pick one channel, authenticate it, wire it to a default agent group, hand off — nothing else required before the user can leave Claude Code. `/setup` handles deps/auth, `/init-first-agent` handles the first-agent wiring + welcome DM. Still TODO: single top-level entrypoint that composes both, and a true "nothing else required" handoff (today `/setup` still runs through `/manage-channels` for additional channels).
|
||||
- [~] Post-handoff welcome message in the chat app guides the user through remaining setup (channels, skills, integrations, memory, scheduling, etc.) — `/init-first-agent` stages a `kind:'chat'` / `sender:'system'` welcome prompt that the agent DMs back to the operator via the normal delivery path. Current prompt just introduces the agent; TODO: expand the prompt (or follow-up flow) to walk through remaining setup tasks from within the chat.
|
||||
- [ ] Add more channels from chat (currently requires returning to Claude Code to run `/add-*` skills)
|
||||
- [ ] Self-register agent into a new chat room from chat: user gives the agent a channel/group name + approval, and the agent joins via the underlying adapter (e.g. Baileys for WhatsApp), wires the room to an agent group, and posts a first "hi, I'm here" message — no manual invite, no `/add-*` skill, no terminal
|
||||
- [ ] Authenticate channels from chat (OAuth/token entry via cards, no terminal required)
|
||||
- [ ] Add credentials / secrets to the OneCLI vault from chat via rich card (agent collects API keys, OAuth tokens, and other secrets through a card flow and writes them into the vault — no `.env` editing, no terminal)
|
||||
- [ ] Wire channels to agent groups from chat (today lives in `/manage-channels` Claude Code skill — port to in-chat flow with isolation-level question cards)
|
||||
- [ ] Create new agent groups from chat (`create_agent` exists — expose via user-facing flow, not just agent-called tool)
|
||||
- [ ] Edit agent group CLAUDE.md / instructions from chat
|
||||
- [ ] Install / uninstall / configure skills from chat (see Skills & Marketplace section)
|
||||
- [ ] Install / configure MCP servers from chat (see Skills & Marketplace section)
|
||||
- [ ] Install packages from chat (today agent can request install_packages — expose a direct user-facing "install X" flow)
|
||||
- [ ] Manage scheduled tasks from chat (list, pause, cancel, edit recurrence)
|
||||
- [ ] Manage destinations from chat (list, rename, revoke)
|
||||
- [ ] Manage permissions from chat (admin list, role assignment, approval policies)
|
||||
- [ ] Trigger /setup, /debug, /customize, /migrate-nanoclaw from chat (today all require Claude Code)
|
||||
- [ ] View and edit memory from chat
|
||||
- [ ] Visualize current setup from chat (ties into Container Skills: installation diagram)
|
||||
- [ ] Export / share setup from chat (ties into Container Skills: end-of-setup diagram + share)
|
||||
- [ ] Fallback to Claude Code only when a change requires a code edit the agent can't self-apply (and even then, agent should offer to open Claude Code on the user's behalf)
|
||||
|
||||
## Product Focus
|
||||
|
||||
**North star:** prioritize skills, flows, and custom setups. Platform work (channels, routing, session DBs, approval flows, MCP tools) is plumbing — it should reach a "boring and reliable" state and then stop absorbing attention. The interesting surface area is what users can *build on top* of that plumbing: skills that add capabilities, conversational flows that orchestrate those skills, and custom per-user setups that compose channels/agents/skills/memory into something personal.
|
||||
|
||||
- [ ] Every new feature request should be answered first with "is this a skill?" before being answered with "is this a platform change?"
|
||||
- [ ] Skills should be the primary extension mechanism users and agents reach for — adding, removing, browsing, editing, debugging
|
||||
- [ ] Flows (multi-step interactive sequences: setup, onboarding, migration, customize, debug) should be authorable as skills rather than hardcoded into the platform
|
||||
- [ ] Custom setups (diverging from defaults: multiple agents, cross-channel routing, per-group memory, specialist sub-agents) should be composable from existing primitives without touching core platform code
|
||||
- [ ] Platform-level work gets budgeted against the question: "does this unblock a class of skills/flows/setups that's otherwise impossible?"
|
||||
|
||||
## Routing
|
||||
|
||||
- [x] Inbound routing (platform ID + thread ID -> agent group -> session)
|
||||
- [x] Auto-create messaging group on first message
|
||||
- [x] Session resolution (shared vs per-thread modes)
|
||||
- [x] Message writing to session DB with seq numbering
|
||||
- [x] Container waking on new message
|
||||
- [x] Typing indicator triggered on message route
|
||||
- [~] Trigger rule matching (router picks highest-priority agent, regex/mention matching TODO)
|
||||
|
||||
## Rich Messaging
|
||||
|
||||
- [x] Interactive cards with buttons (ask_user_question)
|
||||
- [x] Native platform rendering (Discord embeds, buttons)
|
||||
- [x] Message editing
|
||||
- [x] Emoji reactions
|
||||
- [x] File sending from agent (outbox -> delivery)
|
||||
- [x] File upload delivery (buffer-based via adapter)
|
||||
- [x] Markdown formatting
|
||||
- [~] Formatted /usage, /context, /cost output (commands pass through, no rich card formatting)
|
||||
- [ ] Context window visibility: show position in context, approaching compaction, when compaction happens, post-compaction state
|
||||
- [ ] Threading and replies support
|
||||
- [ ] Auto-compact on idle before cache expires
|
||||
|
||||
## MCP Tools (Container)
|
||||
|
||||
- [x] send_message (routes via named destinations; `to` field resolved against agent's local map)
|
||||
- [x] send_file (copy to outbox, write messages_out)
|
||||
- [x] edit_message (routed via destinations)
|
||||
- [x] add_reaction (routed via destinations)
|
||||
- [x] send_card
|
||||
- [x] ask_user_question (blocking poll for response)
|
||||
- [x] schedule_task (with process_after and recurrence)
|
||||
- [x] list_tasks
|
||||
- [x] cancel_task / pause_task / resume_task
|
||||
- [x] create_agent (any agent, creates agent group + folder + bidirectional destinations; host re-normalizes the name, deduplicates folder, path-traversal guarded)
|
||||
- [x] install_packages (apt/npm, owner/admin approval required via `pickApprover`, strict name validation; single approval step covers the image rebuild + container restart)
|
||||
- [x] add_mcp_server (owner/admin approval required via `pickApprover`; approval triggers container restart, no image rebuild needed — bun runs TS directly)
|
||||
|
||||
## Scheduling
|
||||
|
||||
- [x] One-shot scheduled messages (process_after / deliver_after)
|
||||
- [x] Recurring tasks via cron expressions
|
||||
- [x] Host sweep picks up due messages and advances recurrence
|
||||
- [x] Scheduled outbound messages (no container wake needed)
|
||||
- [ ] Pre-agent scripts (formatter references scriptOutput but no execution logic)
|
||||
|
||||
## Permissions and Approval Flows
|
||||
|
||||
- [x] User-level privilege model — `users` + `user_roles` (owner / admin, global or scoped to an agent group). Replaces the old `agent_groups.is_admin` / `messaging_groups.admin_user_id` coupling. See `src/modules/permissions/db/users.ts`, `src/modules/permissions/db/user-roles.ts`, `src/modules/permissions/access.ts`.
|
||||
- [x] Admin-only command filtering — gate runs host-side in `src/command-gate.ts`, querying `user_roles` directly. The container receives no admin identity (no env var, no fallback).
|
||||
- [x] Approval routing — `pickApprover` (scoped admin → global admin → owner, dedup) + `pickApprovalDelivery` (first reachable, same-channel-kind tie-break); delivery lands in the approver's DM via `ensureUserDm` / `user_dms` cache. See `src/modules/approvals/primitive.ts`, `src/modules/approvals/onecli-approvals.ts`.
|
||||
- [x] Per-messaging-group unknown-sender gating — `messaging_groups.unknown_sender_policy` (`strict` | `request_approval` | `public`), enforced in `src/router.ts`.
|
||||
- [x] Approval flow (sensitive action -> card to admin -> approve/reject -> execute) — `pending_approvals` table, `requestApproval()` helper, reuses interactive card infra
|
||||
- [x] Agent requests dependency/package install (install_packages, admin approval, rebuild on approval)
|
||||
- [x] Self-modification — direct tools:
|
||||
- [x] install_packages (apt/npm, admin approval, name validation both sides, max 20 per request; on approve → handler rebuilds the image, kills the container, schedules a verify-and-report follow-up prompt)
|
||||
- [x] add_mcp_server (admin approval; on approve → handler updates `container.json`, kills the container — no image rebuild)
|
||||
- [x] Fire-and-forget model (write request, return immediately; chat notification on approval; container killed so next wake picks up new config/image)
|
||||
- [~] OneCLI integration for human-loop approvals on credentialed requests (agent touching a credentialed resource → OneCLI gates → approval card to admin → OneCLI releases credential) — SDK 0.3.1 `configureManualApproval` wired into host, routes to admin via existing `pending_approvals` infra
|
||||
- [ ] Tunneled OneCLI dashboard for credential addition (Telegram Mini Apps aside, iMessage without Apple Business Register, Matrix, email). Signed short-lived URL → browser form served by OneCLI at 10254 → tunnel via cloudflare durable object. Value never touches the chat surface.
|
||||
- [ ] Self-modification via direct source edits — planned draft/activate flow: RO baseline mount at `/app/src`, RW draft at `/workspace/src-draft`, atomic snapshot into `pending`, admin approval, `cp -a` into baseline, restart + deadman rollback. Unifies runner src, host src, migrations, package.json, container config through one edit path. Collapses the abandoned `create_dev_agent`/`request_swap` dev-agent-in-worktree approach.
|
||||
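The approver ordering named in the approval-routing item above (scoped admin → global admin → owner, deduplicated) can be sketched as follows. This is an illustration only, not the code in `src/modules/approvals/primitive.ts`: the `RoleRow` shape, the function name `pickApproverSketch`, and the choice to return an ordered candidate list are assumptions made for this example.

```typescript
// Hypothetical row shape; the real `user_roles` schema may differ.
interface RoleRow {
  userId: string;
  role: 'owner' | 'admin';
  agentGroupId: string | null; // null means the role is global
}

// Returns candidate approvers in preference order:
// admins scoped to the group, then global admins, then owners, without duplicates.
function pickApproverSketch(roles: RoleRow[], agentGroupId: string): string[] {
  const scopedAdmins = roles.filter((r) => r.role === 'admin' && r.agentGroupId === agentGroupId);
  const globalAdmins = roles.filter((r) => r.role === 'admin' && r.agentGroupId === null);
  // Assumption: an owner counts if the role is global or scoped to this group.
  const owners = roles.filter(
    (r) => r.role === 'owner' && (r.agentGroupId === null || r.agentGroupId === agentGroupId),
  );

  const ordered: string[] = [];
  const seen = new Set<string>();
  for (const row of [...scopedAdmins, ...globalAdmins, ...owners]) {
    if (!seen.has(row.userId)) {
      seen.add(row.userId);
      ordered.push(row.userId);
    }
  }
  return ordered;
}
```

A delivery step in the spirit of `pickApprovalDelivery` would then walk this list and use the first approver it can actually reach.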
|
||||
## Named Destinations + ACL
|
||||
|
||||
- [x] `agent_destinations` table (agent_group_id, local_name, target_type, target_id) — migration 004
|
||||
- [x] Per-agent local-name routing map (channels and peer agents referenced by local names)
|
||||
- [x] Destinations stored in inbound.db `destinations` table (moved from JSON file in `b591d7c`) — single source of truth, no separate file
|
||||
- [x] Host writes the destination map into inbound.db before every container wake; container queries it live on every lookup so admin changes take effect mid-session
|
||||
- [x] Container loads map at startup, appends system-prompt addendum listing destinations + `<message to="name">` syntax
|
||||
- [x] Agent main output parsed for `<message to="...">` blocks; `<internal>...</internal>` treated as scratchpad
|
||||
- [x] Host re-validates every outbound route via `hasDestination()` — unauthorized drops logged
|
||||
- [x] Inbound formatter adds `from="name"` via reverse-lookup (consistent namespace both directions)
|
||||
- [x] Single-destination shortcut — agents with one destination don't need `<message>` wrapping
|
||||
- [x] Backfill from existing `messaging_group_agents` on migration
|
||||
- [x] Removed `NANOCLAW_PLATFORM_ID` / `CHANNEL_TYPE` / `THREAD_ID` env-var routing entirely
|
||||
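A minimal sketch of the output handling described in this list: strip `<internal>` scratchpad, extract `<message to="...">` blocks, apply the single-destination shortcut, and drop routes the agent is not allowed to use. The function names and return shapes are hypothetical. In the real system the parsing happens in the container and the re-validation (`hasDestination()`) happens on the host; the two are shown side by side here only for brevity.

```typescript
interface OutboundBlock {
  to: string;
  text: string;
}

// Container-side step (sketch): turn raw agent output into addressed blocks.
function parseAgentOutput(output: string, destinations: string[]): OutboundBlock[] {
  // Scratchpad content is never delivered.
  const visible = output.replace(/<internal>[\s\S]*?<\/internal>/g, '');

  const blocks: OutboundBlock[] = [];
  const pattern = /<message to="([^"]+)">([\s\S]*?)<\/message>/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(visible)) !== null) {
    blocks.push({ to: match[1], text: match[2].trim() });
  }

  // Single-destination shortcut: unwrapped output goes to the only destination.
  if (blocks.length === 0 && destinations.length === 1 && visible.trim() !== '') {
    blocks.push({ to: destinations[0], text: visible.trim() });
  }
  return blocks;
}

// Host-side step (sketch): stand-in for the `hasDestination()` re-validation.
function splitAuthorized(
  blocks: OutboundBlock[],
  destinations: string[],
): { deliver: OutboundBlock[]; dropped: OutboundBlock[] } {
  const allowed = new Set(destinations);
  return {
    deliver: blocks.filter((b) => allowed.has(b.to)),
    dropped: blocks.filter((b) => !allowed.has(b.to)), // the host would log these
  };
}
```

Keeping the authorization check separate from parsing mirrors the point of the list: the container's view of the map is a convenience, while the host remains the authority on which routes are permitted.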
|
||||
## Agent-to-Agent Communication
|
||||
|
||||
- [x] Host delivery to target agent's session DB (`channel_type='agent'` routing in `src/delivery.ts`)
|
||||
- [x] Agent spawning a new sub-agent (`create_agent` MCP tool, available to any agent, path-traversal guarded)
|
||||
- [x] Dynamic agent group creation (folder + optional CLAUDE.md at runtime)
|
||||
- [x] Internal-only agents (agents created without a channel attached)
|
||||
- [x] Permission delegation from parent to child (bidirectional destination rows inserted at creation)
|
||||
- [x] Bidirectional routing via inherited routing context; sender info enriched on the target side
|
||||
- [ ] Specialist sub-agents (browser agent, dev agent — user's agent delegates with request/approval)
|
||||
- [ ] Browser agent with per-destination permissions between main agent and browser agent (main requests navigation/interaction; browser agent executes in isolated container)
|
||||
- [ ] Sanitization of browser agent responses before handing back to main agent (strip scripts, inline images, untrusted HTML; prevent prompt injection from web content)
|
||||
- [ ] Same permission + sanitization model for any sub-agent that accesses sensitive data sources (files, DBs, third-party APIs)
|
||||
|
||||
## In-Chat Agent Management
|
||||
|
||||
- [x] /clear (resets session)
|
||||
- [x] /compact (triggers context compaction)
|
||||
- [~] /context (passes through, no rich formatting)
|
||||
- [~] /usage (passes through, no rich formatting)
|
||||
- [~] /cost (passes through, no rich formatting)
|
||||
- [ ] Smooth session transitions: load context into new sessions, solve cold start problem
|
||||
- [x] MCP/package installation from chat
|
||||
- [ ] Browse MCP marketplace / skills repository from chat
|
||||
|
||||
## Skills & Marketplace
|
||||
|
||||
- [ ] Install skills from chat (agent requests, admin approves, skill dropped into container skills dir)
|
||||
- [ ] Scan skills before install (lint SKILL.md, sandbox-check shell commands, require approval for network/FS-heavy skills)
|
||||
- [ ] Scan marketplace npm packages before install (supply-chain check, typo-squat detection, known-bad list)
|
||||
- [ ] MCP server marketplace — discover, preview, install
|
||||
- [ ] Browse skills / MCP marketplace from chat (cards with search, preview, install)
|
||||
- [ ] Local voice transcription skill — "just works" install flow: when the user sends a voice message and no transcription backend is installed, the agent asks once ("Install local voice transcription?"), and on approval the skill installs a fully-local speech-to-text model (no cloud calls). Subsequent voice messages transcribe automatically.
|
||||
- [ ] Fully local NanoClaw — OpenCode + Gemma 4 as the agent provider instead of Claude Code, so an entire install can run with zero cloud inference. Requires wiring OpenCode as an agent provider (see Agent Providers) and a setup path that picks local models, pulls weights, and verifies everything runs offline.
|
||||
|
||||
## Container Skills
|
||||
|
||||
Container skills live inside agent containers at runtime (`container/skills/`) and are loaded into every agent session. These are distinct from feature/operational skills that ship with the host.
|
||||
|
||||
- [ ] Customize container skill — agent-driven customization flow (add channel, integration, behavior change) usable from inside any agent session, not just the main repo
|
||||
- [ ] Debug container skill — inspect logs, session DB, MCP server state, container env, recent errors from inside the agent
|
||||
- [ ] Build-system container skills:
|
||||
- [ ] Karpathy LLM Wiki builder (agent scaffolds a persistent wiki knowledge base for a group)
|
||||
- [ ] Generic build-system framework for agent-authored sub-systems
|
||||
- [ ] NanoClaw installation diagram skill — agent generates a visual diagram of the user's current setup (agent groups, channels, wirings, destinations, sub-agents, installed packages/MCP servers)
|
||||
- [ ] Video replay skill — generate Remotion (or similar) videos that replay chat flows and sessions, referencing good UI patterns to produce shareable clips
|
||||
- [ ] Excitement trigger skill — detects when the user expresses excitement about the agent's capabilities or their setup, and proactively encourages generating a diagram + sharing it
|
||||
- [ ] End-of-migration diagram skill — at the end of `/migrate-nanoclaw` (or any migration flow), agent generates a visual diagram of the resulting setup and suggests sharing
|
||||
- [ ] End-of-setup diagram skill — at the end of first-time `/setup`, agent generates a visual diagram and suggests sharing (merges the old "Generate visual diagram of customized instance at end of setup" line from Channel Adapters)
|
||||
|
||||
## Webhook Ingestion
|
||||
|
||||
- [ ] Generic webhook endpoint for external events
|
||||
- [ ] GitHub webhook handling
|
||||
- [ ] CI/CD notification handling
|
||||
- [ ] Webhook -> messages_in routing
|
||||
|
||||
## System Actions
|
||||
|
||||
- [ ] register_group from inside agent
|
||||
- [ ] reset_session from inside agent
|
||||
- [ ] Delivery failures should round-trip back to the agent as system messages so it can decide how to recover (retry as plain text, simplify, give up), with a hard retry cap + poison-pill backstop in delivery.ts to keep the queue healthy
|
||||
|
||||
## Integrations
|
||||
|
||||
- [x] Vercel CLI integration in setup process
|
||||
- [x] Skills for deploying and managing Vercel websites from chat
|
||||
- [ ] Office 365 integration (create/edit documents with inline suggestions)
|
||||
|
||||
## Memory
|
||||
|
||||
- [ ] Shared memory with approval flow (write to global memory requires admin approval)
|
||||
- [ ] Agent memory system skills — skills for building and managing memory systems for an agent: archive/index large collections of files and data, then expose a memory interface the agent can query and update (e.g. QMD-style systems)
|
||||
|
||||
## Migration
|
||||
|
||||
- [ ] Custom skill/code porting
|
||||
- [ ] OneCLI migration check — determine if existing installs need OneCLI re-init (credentials re-scoped to new `agent_group.id` identifier, new SDK version, approval handler registered). If needed, add a migration step to `/update-nanoclaw` or a dedicated skill.
|
||||
|
||||
## Testing
|
||||
|
||||
- [x] DB layer tests (agent groups, messaging groups, sessions, pending questions)
|
||||
- [x] Channel registry tests
|
||||
- [x] Poll loop / formatter tests
|
||||
- [x] Integration test (container agent-runner)
|
||||
- [x] Host core tests
|
||||
- [ ] End-to-end flow tests (message in -> agent -> message out -> delivery)
|
||||
- [ ] Delivery polling tests
|
||||
- [ ] Host sweep tests (stale detection, recurrence)
|
||||
- [ ] Multi-channel integration tests
|
||||
|
||||
## Rollout
|
||||
|
||||
- [ ] Internal testing across all channels
|
||||
- [ ] Migration skill built and tested
|
||||
- [ ] PR factory migrated as validation
|
||||
- [ ] Blog post / announcement
|
||||
- [ ] Video demos of key flows
|
||||
- [ ] Vercel coordination
|
||||
@@ -1,146 +0,0 @@
|
||||
# CLAUDE.md Composition
|
||||
|
||||
Compose agent instructions from a shared base, skill/tool fragments, and per-group memory — replacing the current per-group CLAUDE.md with a host-regenerated entry point.
|
||||
|
||||
## Problem
|
||||
|
||||
Today each agent group has a single RW `groups/<folder>/CLAUDE.md`, written once at init and never updated. Consequences:
|
||||
|
||||
- Upstream improvements to shared agent guidance don't propagate to existing groups
|
||||
- No way to ship tool-specific guidance with the tool itself (e.g., an agent-browser usage fragment)
|
||||
- Human-authored identity and agent-accumulated memory live in the same file with no separation
|
||||
- The `.claude-global.md` symlink + `groups/global/CLAUDE.md` pattern handled the shared base but not per-module fragments
|
||||
|
||||
## Design
|
||||
|
||||
**Principle: RW = per-group memory, RO = shared content.** Same rule that governs the shared-source refactor, applied to agent instructions.
|
||||
|
||||
### Three tiers plus the composed entry
|
||||
|
||||
| Tier | File | Location | Mount | Editor | Change rate |
|
||||
|---|---|---|---|---|---|
|
||||
| **Shared base** | `CLAUDE.md` | `container/CLAUDE.md` | RO at `/app/CLAUDE.md` | Owner (via git) | Rare |
|
||||
| **Module fragments** | `instructions.md` | Inside each module | RO via shared skills mount, or inline in `container.json` | Module author | Ships with module |
|
||||
| **Per-group memory** | `CLAUDE.local.md` | `groups/<folder>/` | RW at `/workspace/agent/` | Agent + owner | Continuous |
|
||||
| **Composed entry** | `CLAUDE.md` | `groups/<folder>/` | RW but host-regenerated | **Host, not human** | Every spawn |
|
||||
|
||||
### Composition
|
||||
|
||||
At every spawn, the host regenerates `groups/<folder>/CLAUDE.md` as an import-only file:
|
||||
|
||||
```markdown
|
||||
<!-- Composed at spawn — do not edit. Edit CLAUDE.local.md for per-group content. -->
|
||||
@./.claude-shared.md
|
||||
@./.claude-fragments/agent-browser.md
|
||||
@./.claude-fragments/welcome.md
|
||||
@./.claude-fragments/<enabled-skill-with-fragment>.md
|
||||
@./.claude-fragments/mcp-<server-name>.md
|
||||
```
|
||||
|
||||
Symlinks are created alongside, following the `.claude-global.md` pattern (dangling on host, valid in container via the RO mount):
|
||||
|
||||
- `groups/<folder>/.claude-shared.md` → `/app/CLAUDE.md`
|
||||
- `groups/<folder>/.claude-fragments/<name>.md` → `/app/skills/<name>/instructions.md` (for each enabled skill that ships a fragment)
|
||||
|
||||
Claude Code auto-loads `CLAUDE.local.md` from cwd without an import line — native behavior. Agent memory works natively; composition only wraps around it.
|
||||
|
||||
### Module fragment contract
|
||||
|
||||
**Skills.** A skill optionally ships an `instructions.md` at the top of its directory:
|
||||
|
||||
```
|
||||
container/skills/welcome/
|
||||
SKILL.md — description + when-to-use (existing)
|
||||
instructions.md — always-in-context guidance (optional, new)
|
||||
```
|
||||
|
||||
When the skill is enabled for a group, the host imports `instructions.md` into the composed CLAUDE.md. `SKILL.md` semantics are unchanged — Claude Code still uses it for skill discovery and on-demand invocation. Most skills won't need an `instructions.md` (SKILL.md is sufficient for on-demand skills); it's only for guidance that should be in context at all times.
|
||||
|
||||
**MCP servers.** A `container.json` MCP server entry can contribute a fragment inline:
|
||||
|
||||
```jsonc
|
||||
{
|
||||
"mcpServers": {
|
||||
"my-db": {
|
||||
"command": "...",
|
||||
"instructions": "Read-only access to the production DB. Never run UPDATE/DELETE without admin approval."
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Host writes the inline content to `.claude-fragments/mcp-<server-name>.md` at spawn and imports it.
|
||||
|
||||
**Global CLIs baked into the image** (agent-browser, vercel, claude-code) have always-present guidance; it belongs in `container/CLAUDE.md`, not as a conditional fragment. Don't try to make universally-present tools dynamic.
|
||||
|
||||
### Identity vs memory
|
||||
|
||||
All per-group content — human-authored identity ("you are the research agent, be terse") and agent-accumulated memory (inventories, user preferences, learned patterns) — lives in a single `CLAUDE.local.md`. Both humans and agents can edit it.
|
||||
|
||||
If the distinction becomes operationally important later (agents confused about what they were told vs. what they learned), split into `identity.md` (human-authored, imported into composed CLAUDE.md) + `CLAUDE.local.md` (agent memory only). Starting with one file.
|
||||
|
||||
## Changes
|
||||
|
||||
### `container/CLAUDE.md` (new)
|
||||
|
||||
Write the shared base: general NanoClaw context, how to engage with users, output conventions, anything that should apply to every agent across every group. Seed from current `groups/global/CLAUDE.md`.
|
||||
|
||||
### `container/skills/<name>/instructions.md` (optional, per skill)
|
||||
|
||||
Add for any skill that warrants always-in-context guidance. Optional.
|
||||
|
||||
### `container.json` schema
|
||||
|
||||
Add optional `instructions` field (string) to each MCP server entry.
|
||||
|
||||
### `container-runner.ts` spawn-time sync
|
||||
|
||||
Extend the skill-symlink sync function (added in the shared-source refactor) to also compose CLAUDE.md. On every spawn:
|
||||
|
||||
1. Sync `.claude-shared/skills/<name>` symlinks from `container.json` skill selection.
|
||||
2. Sync `.claude-shared.md` symlink → `/app/CLAUDE.md`.
|
||||
3. For each enabled skill with an `instructions.md`, create `.claude-fragments/<name>.md` symlink → `/app/skills/<name>/instructions.md`.
|
||||
4. For each `container.json` MCP server with an `instructions` field, write the inline content to `.claude-fragments/mcp-<server-name>.md`.
|
||||
5. Write `groups/<folder>/CLAUDE.md` atomically (temp + rename) with import lines in a deterministic order: shared base → skill fragments (alphabetical) → MCP fragments (alphabetical).
|
||||
6. Remove stale symlinks and fragment files for modules no longer enabled.
|
||||
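Step 5 above (deterministic import order) can be sketched as a pure function that builds the file contents. This is a hedged illustration: `composeClaudeMd` and its parameters are hypothetical names, the header is an abbreviated form of the comment shown in the Composition example, and the atomic temp-file write is left out.

```typescript
// Builds the import-only CLAUDE.md body.
// Order: shared base, then skill fragments (alphabetical), then MCP fragments (alphabetical).
function composeClaudeMd(skillsWithFragment: string[], mcpServersWithInstructions: string[]): string {
  const header = '<!-- Composed at spawn. Do not edit; use CLAUDE.local.md for per-group content. -->';
  const byName = (a: string, b: string): number => (a < b ? -1 : a > b ? 1 : 0);

  const skillImports = [...skillsWithFragment]
    .sort(byName)
    .map((name) => `@./.claude-fragments/${name}.md`);
  const mcpImports = [...mcpServersWithInstructions]
    .sort(byName)
    .map((name) => `@./.claude-fragments/mcp-${name}.md`);

  return [header, '@./.claude-shared.md', ...skillImports, ...mcpImports].join('\n') + '\n';
}
```

Sorting copies of the inputs keeps the output stable regardless of the order in which `container.json` lists skills or servers, so the regenerated file does not change between spawns unless the enabled set changes.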
|
||||
### `group-init.ts`
|
||||
|
||||
- Stop writing an initial `groups/<folder>/CLAUDE.md` at group creation — host regenerates at first spawn.
|
||||
- Stop creating the `.claude-global.md` symlink — replaced by `.claude-shared.md` in the composition step.
|
||||
- Optionally create an empty `groups/<folder>/CLAUDE.local.md` at init as a clear affordance for humans and agents.
|
||||
|
||||
### `groups/global/`
|
||||
|
||||
Eliminate. The shared base moves to `container/CLAUDE.md`. Any deployment-specific overrides live in the owner's customized `container/CLAUDE.md` (same pattern as any other codebase customization).
|
||||
|
||||
## Migration
|
||||
|
||||
Breaking change, one-time cutover:
|
||||
|
||||
- For every group, rename `groups/<folder>/CLAUDE.md` → `groups/<folder>/CLAUDE.local.md`. Preserves all existing per-group content as memory.
|
||||
- Move content from `groups/global/CLAUDE.md` (beyond the default stub) into `container/CLAUDE.md`. Delete `groups/global/`.
|
||||
- Delete stale `.claude-global.md` symlinks in each group dir — the spawn pass creates `.claude-shared.md` instead.
|
||||
- First spawn after cutover regenerates `CLAUDE.md` with proper imports.
|
||||
|
||||
## Interaction with shared-source refactor
|
||||
|
||||
This refactor depends on the shared skills mount (`/app/skills/` RO) from the shared-source refactor landing first. It extends the spawn-time sync from "just skill symlinks" to "skill symlinks + CLAUDE.md composition" — both passes share the same helper.
|
||||
|
||||
After this refactor, the "Personality / instructions" row in the shared-source per-group customization table splits:
|
||||
|
||||
| Resource | Location | Mechanism |
|
||||
|----------|----------|-----------|
|
||||
| Agent memory | `groups/<folder>/CLAUDE.local.md` | RW at `/workspace/agent/`, auto-loaded by Claude Code |
|
||||
| Composed entry | `groups/<folder>/CLAUDE.md` | Host-regenerated at every spawn |
|
||||
|
||||
## What triggers what
|
||||
|
||||
| Change | Action | Scope |
|
||||
|--------|--------|-------|
|
||||
| Edit `container/CLAUDE.md` | Kill running containers (next spawn recomposes) | All groups |
|
||||
| Add/edit a skill's `instructions.md` | Kill running containers | All groups with the skill enabled |
|
||||
| Enable/disable a skill in `container.json` | Kill that group's containers | One group |
|
||||
| Add MCP server with `instructions` field | Kill that group's containers | One group |
|
||||
| Edit `CLAUDE.local.md` | Nothing — live via RW mount; Claude Code re-reads at next prompt | One group |
|
||||
| Add a new agent group | Spawn writes `CLAUDE.md` fresh from the composition pass | One group |
|
||||
@@ -1,221 +0,0 @@
|
||||
# Module Contract
|
||||
|
||||
This doc is the authoritative reference for how core and modules connect. Everything downstream — extraction PRs, install skills, module authors — keys off these signatures and defaults. See [REFACTOR_PLAN.md](../REFACTOR_PLAN.md) for the broader plan; this doc is the narrow interface spec.
|
||||
|
||||
## Principles
|
||||
|
||||
- Core runs standalone (modulo default modules — see tiers below). The optional-module portion of the `src/modules/index.ts` barrel can be empty and NanoClaw still routes messages in and delivers responses out.
|
||||
- Optional modules are independent. No optional module imports from another optional module. Cross-module coordination goes through a core registry (delivery action, response handler, etc.).
|
||||
- Registries exist only when multiple modules plug into the same decision point. Single-consumer integrations use skill edits (`MODULE-HOOK` markers) or stay inline with `sqlite_master` guards.
|
||||
- Removing an optional module = delete files + remove barrel imports + revert any `MODULE-HOOK` content. Migration files stay (data is preserved). Removing a default module is more invasive: it requires editing the core files that import from it.
|
||||
|
||||
## Module taxonomy
|
||||
|
||||
Three categories. All three live under `src/modules/` (or equivalent adapter dirs) with the same folder layout; the distinction is about **shipping** and **who can depend on them**.
|
||||
|
||||
### 1. Default modules
|
||||
|
||||
Ship with `main` in `src/modules/`. Imported by the default `src/modules/index.ts` barrel from day one. They are not really core — they live under `src/modules/` specifically to signal "not really core, rippable if needed" — but they're always present on a `main` install. Core imports from them directly. No hook, no registry indirection for the exports themselves.
|
||||
|
||||
Current: `typing`, `mount-security`.
|
||||
|
||||
### 2. Optional modules
|
||||
|
||||
Live on the `modules` branch. Installed via `/add-<name>` skills that cherry-pick files. Register into core via one of the four registries (or `MODULE-HOOK` skill edits). Core and other optional modules must not statically import an optional module's code.
|
||||
|
||||
Current: `interactive`, `approvals`, `scheduling`, `permissions`. Pending: `agent-to-agent`.
|
||||
|
||||
### 3. Channel adapters
|
||||
|
||||
Live on the `channels` branch, installed via `/add-<channel>` skills. Not covered by this contract; they use the pre-existing `ChannelAdapter` interface and `registerChannelAdapter()`.
|
||||
|
||||
## Dependency rule
|
||||
|
||||
```
|
||||
core ← default modules ← optional modules
|
||||
```
|
||||
|
||||
- **Core** may import from core and from default modules.
|
||||
- **Default modules** may import from core and from other default modules. They must not import from optional modules.
|
||||
- **Optional modules** may import from core and from default modules. They must not import from each other.
|
||||
|
||||
Peer-to-peer coupling between optional modules goes through a core registry — see "The four registries" below. This keeps the module dependency graph a DAG and install order irrelevant.
|
||||
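The three import rules above reduce to a small predicate. The sketch below is one way a lint step could encode them; `Tier`, `ModuleRef`, and `isImportAllowed` are names invented for this example, not part of the codebase.

```typescript
type Tier = 'core' | 'default' | 'optional';

interface ModuleRef {
  tier: Tier;
  name: string; // e.g. 'typing', 'scheduling'; ignored for core
}

// Returns true if code in `from` may statically import code in `to`.
function isImportAllowed(from: ModuleRef, to: ModuleRef): boolean {
  // Core and default modules are importable from every tier.
  if (to.tier === 'core' || to.tier === 'default') return true;
  // Optional modules are importable only from within themselves.
  return from.tier === 'optional' && from.name === to.name;
}
```

Under this predicate the transitional `src/access.ts` import (core importing from the optional `permissions` module) is reported as a violation, which matches how this section classifies it.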
|
||||
### Known transitional violations
|
||||
|
||||
- `src/access.ts` (core) imports from `src/modules/permissions/` (optional). Shim left from PR #5; resolved in the planned approvals re-tier (PR #7) which moves approver-picking into a new default `approvals-primitive` module that may then depend on permissions however it likes — at which point `src/access.ts` ceases to exist.
|
||||
|
||||
## The four registries
|
||||
|
||||
Each registry has an explicit default for when no module registers. Core must run when all four are empty.
|
||||
|
||||
### 1. Delivery action handlers
|
||||
|
||||
```typescript
|
||||
// src/delivery.ts
|
||||
type ActionHandler = (
|
||||
content: Record<string, unknown>,
|
||||
session: Session,
|
||||
inDb: Database.Database,
|
||||
) => Promise<void>;
|
||||
|
||||
export function registerDeliveryAction(action: string, handler: ActionHandler): void;
|
||||
```
|
||||
|
||||
**Purpose:** system-kind outbound messages (`msg.kind === 'system'`) carry an `action` string. Core dispatches to the registered handler.
|
||||
|
||||
**Default when action is unknown:** log `"Unknown system action"` at `warn` and return. Message is still marked delivered (it was consumed by the host, not sent to a channel).
|
||||
|
||||
**Current consumers:** scheduling (5 actions — `schedule_task`, `cancel_task`, `pause_task`, `resume_task`, `update_task`), approvals (2 actions — `install_packages`, `add_mcp_server`), agent-to-agent (`create_agent`, and the agent-routing branch keyed as a pseudo-action `agent_route`).
|
||||
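To make the dispatch and its default concrete, here is a minimal sketch of a registry keyed by action string. It is not the code in `src/delivery.ts`: the handler type is simplified to take only `content` (the real `ActionHandler` also receives the session and the inbound DB handle), and `dispatchSystemAction` and its return value are assumptions for illustration.

```typescript
// Simplified stand-in for ActionHandler; session and DB arguments are omitted.
type SketchActionHandler = (content: Record<string, unknown>) => Promise<void>;

const actionHandlers = new Map<string, SketchActionHandler>();

function registerDeliveryActionSketch(action: string, handler: SketchActionHandler): void {
  actionHandlers.set(action, handler);
}

// Dispatches a system-kind message to its handler.
// Unknown actions are logged and reported, never thrown: the caller still
// marks the message as delivered because the host consumed it.
async function dispatchSystemAction(content: Record<string, unknown>): Promise<'handled' | 'unknown'> {
  const raw = content['action'];
  const action = typeof raw === 'string' ? raw : '';
  const handler = actionHandlers.get(action);
  if (handler === undefined) {
    console.warn('Unknown system action', { action });
    return 'unknown';
  }
  await handler(content);
  return 'handled';
}
```

With an empty map every system message takes the `'unknown'` branch, which is the "core must run when all four are empty" requirement in practice.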
|
||||
### 2. Router sender resolver + access gate
|
||||
|
||||
Two separate setters, called at different points in `routeInbound`. Preserves the pre-refactor ordering: sender-upsert side effects fire even when the message is ultimately dropped by wiring or trigger rules.
|
||||
|
||||
```typescript
|
||||
// src/router.ts
|
||||
type SenderResolverFn = (event: InboundEvent) => string | null;
|
||||
|
||||
export function setSenderResolver(fn: SenderResolverFn): void;
|
||||
|
||||
type AccessGateResult =
|
||||
| { allowed: true }
|
||||
| { allowed: false; reason: string };
|
||||
|
||||
type AccessGateFn = (
|
||||
event: InboundEvent,
|
||||
userId: string | null,
|
||||
mg: MessagingGroup,
|
||||
agentGroupId: string,
|
||||
) => AccessGateResult;
|
||||
|
||||
export function setAccessGate(fn: AccessGateFn): void;
|
||||
```
|
||||
|
||||
**Call order in `routeInbound`:**
|
||||
1. Resolve messaging group.
|
||||
2. **Sender resolver** (if set). Permissions upserts the users row here so the record exists even if agent resolution drops the message.
|
||||
3. Resolve wired agents; `no_agent_wired` → record + drop. (Core writes the dropped_messages row.)
|
||||
4. Pick agent by trigger rules; `no_trigger_match` → record + drop.
|
||||
5. **Access gate** (if set). On refusal it writes its own `dropped_messages` row keyed by policy reason.
|
||||
|
||||
**Defaults when unset:** resolver returns null; gate defaults to `{ allowed: true }`. Every message routes through, no users table is needed, downstream tolerates `userId=null`.
|
||||
|
||||
**Current consumer:** permissions module (registers both).
|
||||
|
||||
**Not registries, setters.** There is one sender and one access decision per inbound message and one module that owns both. Calling `setSenderResolver` / `setAccessGate` twice overwrites; core does not iterate.
|
||||
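The call order and the unset defaults can be illustrated with a compact sketch. Everything here is simplified and hypothetical (`SketchEvent`, `routeInboundSketch`, the `pickByTrigger` callback, and the omission of the messaging-group argument); only the ordering and the defaults are taken from the description above.

```typescript
interface SketchEvent {
  platformId: string;
  senderId: string;
  text: string;
}

type SketchSenderResolver = (event: SketchEvent) => string | null;
type SketchGateResult = { allowed: true } | { allowed: false; reason: string };
type SketchAccessGate = (event: SketchEvent, userId: string | null, agentGroupId: string) => SketchGateResult;

// Defaults when no module has called the setters.
let senderResolver: SketchSenderResolver = () => null;
let accessGate: SketchAccessGate = () => ({ allowed: true });

// Setters, not registries: a second call overwrites the first.
function setSenderResolverSketch(fn: SketchSenderResolver): void {
  senderResolver = fn;
}
function setAccessGateSketch(fn: SketchAccessGate): void {
  accessGate = fn;
}

type RouteOutcome =
  | { routed: true; agentGroupId: string; userId: string | null }
  | { routed: false; reason: string };

function routeInboundSketch(
  event: SketchEvent,
  wiredAgents: string[],
  pickByTrigger: (agents: string[], event: SketchEvent) => string | null,
): RouteOutcome {
  // Step 1 (resolve messaging group) is omitted in this sketch.
  // Step 2: the resolver runs first, so its side effects happen even on a later drop.
  const userId = senderResolver(event);
  // Step 3: wiring check.
  if (wiredAgents.length === 0) return { routed: false, reason: 'no_agent_wired' };
  // Step 4: trigger rules.
  const agentGroupId = pickByTrigger(wiredAgents, event);
  if (agentGroupId === null) return { routed: false, reason: 'no_trigger_match' };
  // Step 5: access gate, last.
  const verdict = accessGate(event, userId, agentGroupId);
  if (verdict.allowed === false) return { routed: false, reason: verdict.reason };
  return { routed: true, agentGroupId, userId };
}
```

Running the resolver before the wiring check is what lets a permissions-style module record a sender even when the message is dropped for `no_agent_wired`.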
|
||||
### 3. Response dispatcher
|
||||
|
||||
```typescript
|
||||
// src/index.ts (or src/response-dispatch.ts if it grows)
|
||||
interface ResponsePayload {
|
||||
questionId: string;
|
||||
value: string;
|
||||
userId: string | null;
|
||||
channelType: string;
|
||||
platformId: string;
|
||||
threadId: string | null;
|
||||
}
|
||||
|
||||
type ResponseHandler = (payload: ResponsePayload) => Promise<boolean>;
|
||||
|
||||
export function registerResponseHandler(handler: ResponseHandler): void;
|
||||
```
|
||||
|
||||
**Purpose:** button-click / question responses arrive via the channel adapter's `onAction` callback. Core iterates registered handlers in registration order. The first one that returns `true` claims the response.
|
||||
|
||||
**Default when empty:** log `"Unclaimed response"` at `warn` and drop.
|
||||
|
||||
**Current consumers:** interactive (matches `pending_questions`), approvals (matches `pending_approvals`). The two tables have disjoint `question_id` / `approval_id` namespaces in practice (`q-*` vs `appr-*`), so first-match-wins is safe.
|
||||
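First-match-wins iteration is short enough to show in full. The sketch below trims `ResponsePayload` to the two fields it needs and uses hypothetical names (`registerResponseHandlerSketch`, `dispatchResponse`); it is meant to illustrate the claim semantics, not to reproduce the host code.

```typescript
// Trimmed payload: the real ResponsePayload carries more fields.
interface SketchResponsePayload {
  questionId: string;
  value: string;
}

type SketchResponseHandler = (payload: SketchResponsePayload) => Promise<boolean>;

const responseHandlers: SketchResponseHandler[] = [];

function registerResponseHandlerSketch(handler: SketchResponseHandler): void {
  responseHandlers.push(handler);
}

// Tries handlers in registration order; the first one returning true claims the response.
async function dispatchResponse(payload: SketchResponsePayload): Promise<boolean> {
  for (const handler of responseHandlers) {
    if (await handler(payload)) return true;
  }
  console.warn('Unclaimed response', { questionId: payload.questionId });
  return false;
}
```

Because each handler is awaited before the next is tried, a slow handler delays the ones after it; that is acceptable while the ID namespaces stay disjoint and lookups are cheap.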
|
||||
### 4. Container MCP tool self-registration
|
||||
|
||||
```typescript
|
||||
// container/agent-runner/src/mcp-tools/server.ts
|
||||
export function registerTools(tools: McpToolDefinition[]): void;
|
||||
```
|
||||
|
||||
**Purpose:** each tool module calls `registerTools([...])` at import time. The MCP server uses whatever was registered.
|
||||
|
||||
**Default:** only `mcp-tools/core.ts` (`send_message`) registered.
|
||||
|
||||
**Current consumers:** all container-side modules (scheduling, interactive, agents, self-mod).
|
||||
|
||||
## Skill edits to core
|
||||
|
||||
For one-off integrations with a single consumer, install skills edit core directly between `MODULE-HOOK` markers. No registry.
|
||||
|
||||
Marker format:
|
||||
|
||||
```typescript
|
||||
// MODULE-HOOK:<module>-<site>:start
|
||||
// MODULE-HOOK:<module>-<site>:end
|
||||
```
|
||||
|
||||
On install, the skill inserts its code between the markers; on uninstall, it clears whatever sits between them. Markers live in core from day one (empty until a skill fills them).
|
||||
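A minimal sketch of the install/uninstall edit, assuming the markers sit on their own lines. `setHookBody` and `clearHook` are hypothetical names for illustration; the real install skills may edit files differently.

```typescript
// Hypothetical helper: replace whatever sits between a pair of MODULE-HOOK
// markers with `body`. The marker lines themselves are always preserved.
function setHookBody(source: string, hookId: string, body: string[]): string {
  const start = `// MODULE-HOOK:${hookId}:start`;
  const end = `// MODULE-HOOK:${hookId}:end`;
  const lines = source.split('\n');
  const startIdx = lines.findIndex((line) => line.trim() === start);
  const endIdx = lines.findIndex((line) => line.trim() === end);
  if (startIdx === -1 || endIdx <= startIdx) {
    throw new Error(`MODULE-HOOK markers for "${hookId}" missing or out of order`);
  }
  return [...lines.slice(0, startIdx + 1), ...body, ...lines.slice(endIdx)].join('\n');
}

// Uninstall is the same edit with an empty body.
function clearHook(source: string, hookId: string): string {
  return setHookBody(source, hookId, []);
}
```

Because both operations rewrite only the span between the markers, an install followed by an uninstall returns the file to its original text.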
|
||||
**Current uses:**
|
||||
|
||||
- `src/host-sweep.ts` → `MODULE-HOOK:scheduling-recurrence` — call to scheduling module's `handleRecurrence`.
|
||||
- `container/agent-runner/src/poll-loop.ts` → `MODULE-HOOK:scheduling-pre-task` — call to scheduling module's `applyPreTaskScripts`.
|
||||
|
||||
**Promotion rule:** if a third consumer appears for any marker, promote to a registry.
|
||||
|
||||
## Guarded inline (core)
|
||||
|
||||
Some code stays in core but references module-owned tables. These use `sqlite_master` checks to degrade cleanly when the owning module isn't installed.
|
||||
|
||||
| Site | Owning module | Fallback |
|
||||
|------|---------------|----------|
|
||||
| `container-runner.ts` admin-ID query (`user_roles`, `agent_group_members`) | permissions | returns `[]` |
|
||||
| `container-runner.ts` `writeDestinations` (`agent_destinations`) | agent-to-agent | no-op |
|
||||
| `delivery.ts` channel-permission check (`agent_destinations`) | agent-to-agent | permit (origin-chat always OK) |
|
||||
| `delivery.ts` `createPendingQuestion` (`pending_questions`) | interactive | no-op (log warning) |
|
||||
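A hedged sketch of the `sqlite_master` guard behind these fallbacks. `SqliteLike`, `tableExists`, and `guarded` are illustrative names, not the project's actual helpers; the structural interface stands in for whichever SQLite driver core uses.

```typescript
// Minimal structural type so the sketch does not depend on a specific driver.
interface SqliteLike {
  prepare(sql: string): { get(...params: unknown[]): unknown };
}

// True when the module-owned table has been created by its migration.
function tableExists(db: SqliteLike, table: string): boolean {
  const row = db
    .prepare("SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?")
    .get(table);
  // Drivers differ on "no row": some return undefined, others null.
  return row !== undefined && row !== null;
}

// Run `query` only if the owning module's table exists; otherwise degrade
// to the documented fallback (e.g. `[]` for the admin-ID query).
function guarded<T>(db: SqliteLike, table: string, query: () => T, fallback: T): T {
  return tableExists(db, table) ? query() : fallback;
}
```

Under this sketch, the admin-ID site would read roughly as `guarded(db, 'user_roles', () => queryAdminIds(db), [])`, where `queryAdminIds` is again a placeholder.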
|
||||
Container-side admin gating no longer exists. Admin authorization is now performed host-side in `src/command-gate.ts`, which queries `user_roles` directly — no env var is passed to the container, and no agent-runner fallback exists.
|
||||
|
||||
## Migrations
|
||||
|
||||
All migrations live in `src/db/migrations/` as TypeScript files exporting a `Migration` object:
|
||||
|
||||
```typescript
|
||||
export interface Migration {
|
||||
version: number;
|
||||
name: string;
|
||||
up: (db: Database.Database) => void;
|
||||
}
|
||||
```
|
||||
|
||||
The barrel `src/db/migrations/index.ts` imports each and lists them in an ordered array.
|
||||
|
||||
**Uniqueness key is `name`, not `version`.** The migrator applies any migration whose `name` isn't in `schema_version`. Version stays as an ordering hint; integer collisions across modules are allowed.
|
||||
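The name-keyed rule can be illustrated with a small sketch. `MigrationMeta` is a reduced, hypothetical shape (the `up` function is omitted), and keeping barrel order for pending migrations is an assumption.

```typescript
// Reduced shape for illustration; the real Migration also carries `up`.
interface MigrationMeta {
  version: number;
  name: string;
}

// Select migrations whose `name` is not yet recorded in schema_version.
// `version` is deliberately ignored here: two modules may share an integer.
function pendingMigrations<M extends MigrationMeta>(
  barrel: M[],
  appliedNames: Set<string>,
): M[] {
  return barrel.filter((migration) => !appliedNames.has(migration.name));
}
```

With this rule, a core migration and a module migration that both declare `version: 3` are tracked independently, since only the names are compared.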
|
||||
**Module migration naming:**
|
||||
|
||||
- File: `src/db/migrations/module-<module>-<short>.ts`
|
||||
- `Migration.name`: `'<module>-<short>'` (e.g. `'approvals-pending-approvals'`)
|
||||
|
||||
**Uninstall behavior:** migration files and barrel entries stay. Tables persist across reinstalls. No down migrations.
|
||||
|
||||
## What a registry-based module provides
|
||||
|
||||
Each `src/modules/<name>/` module must supply:
|
||||
|
||||
- `index.ts` — imported by `src/modules/index.ts` for side-effect registration (calls `registerDeliveryAction` / `setInboundGate` / `registerResponseHandler` at module load time).
|
||||
- `project.md` — appended to project `CLAUDE.md` by the install skill. Describes module architecture for anyone reading the codebase.
|
||||
- `agent.md` — appended to `groups/global/CLAUDE.md` by the install skill. Describes the module's tools for the agent.
|
||||
- Migration file in `src/db/migrations/` if the module owns any tables.
|
||||
- Barrel entry in `src/db/migrations/index.ts` for that migration.
|
||||
|
||||
Optionally:
|
||||
|
||||
- Container-side additions to `container/agent-runner/src/mcp-tools/<name>.ts` that call `registerTools([...])`, with a barrel entry in `container/agent-runner/src/mcp-tools/index.ts`.
|
||||
- `MODULE-HOOK` edits to specific core files, applied by the install skill.
|
||||
|
||||
## What a module must not do
|
||||
|
||||
- Import from another module.
|
||||
- Write to core-owned tables (`sessions`, `agent_groups`, `messaging_groups`, `schema_version`, etc.) outside of migrations.
|
||||
- Depend on a specific channel adapter being installed.
|
||||
- Break core behavior when unloaded. If a module's absence leaves a core feature non-functional, that feature belongs in core, not the module.
|
||||
@@ -1,270 +0,0 @@
|
||||
# Shared Source
|
||||
|
||||
Replace per-group agent-runner-src copies with a single shared read-only mount.
|
||||
|
||||
## Problem
|
||||
|
||||
Each agent group gets a full copy of `container/agent-runner/src/` at creation time. This copy is mounted RW at `/app/src` in the container. Consequences:
|
||||
|
||||
- Bug fixes and features don't propagate to existing groups
|
||||
- Owner edits to `container/agent-runner/src/` silently don't apply to existing groups
|
||||
- No tooling to diff or detect drift between groups and upstream
|
||||
- The RW mount lets agents write to their own runtime source without approval
|
||||
- Cross-cutting changes (host + container) break down when container code is per-group
|
||||
- Skills have the same copy-and-drift problem
|
||||
|
||||
## Design
|
||||
|
||||
**Principle: RW is per-group, RO is shared.** Every mount is either read-only and shared across all groups, or read-write and scoped to one group. Source and skills become RO + shared. Personality, config, working files, and Claude state stay RW + per-group. This makes drift impossible by construction — no group can diverge from shared code because no group has write access to it.
|
||||
|
||||
### Shared source mount
|
||||
|
||||
Mount `container/agent-runner/src/` into all containers at `/app/src` as **read-only**.
|
||||
|
||||
```
|
||||
container/agent-runner/src/ → /app/src (RO, shared)
|
||||
```
|
||||
|
||||
Source is never baked into the image. `/app/src/` exists only via this mount — running without it is an intentional startup failure (entrypoint `bun run /app/src/index.ts` → ENOENT). Source-only changes never trigger image rebuilds; edits to `.ts` files take effect on next container spawn.
|
||||
|
||||
Image rebuilds are only needed for:
|
||||
- Agent-runner npm dependency changes (`package.json` / `bun.lock`)
|
||||
- System packages, runtime versions, global CLI version bumps
|
||||
- Dockerfile/entrypoint changes
|
||||
|
||||
### Shared skills mount
|
||||
|
||||
Mount `container/skills/` into all containers at `/app/skills/` as **read-only**.
|
||||
|
||||
Per-group skill selection via `container.json`:
|
||||
|
||||
```jsonc
|
||||
{
|
||||
"skills": ["welcome", "agent-browser", "self-customize"]
|
||||
// or "skills": "all" (default)
|
||||
}
|
||||
```
|
||||
|
||||
At every spawn, the host syncs symlinks in the group's `.claude-shared/skills/` directory to match the selected set. For `"all"`, the set is recomputed from the shared skills dir on each spawn — newly-added upstream skills appear without intervention. Symlinks for skills no longer in the set are removed.
|
||||
|
||||
Each symlink points to a container path:
|
||||
|
||||
```
|
||||
.claude-shared/skills/welcome → /app/skills/welcome
|
||||
.claude-shared/skills/agent-browser → /app/skills/agent-browser
|
||||
```
|
||||
|
||||
Claude Code scans `/home/node/.claude/skills/`, follows the symlinks, and loads the selected skills. This is the same dangling-symlink-on-host pattern as `.claude-global.md`: host tools can't resolve the target, but the container mount makes it valid at read time.
|
||||
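A sketch of the spawn-time sync as a pure planning step, assuming the host already knows which skills exist in the shared directory and which symlinks are present in the group's directory. `planSkillSymlinks` and `SymlinkPlan` are hypothetical names; the actual filesystem calls are left out.

```typescript
type SkillSelection = string[] | 'all';

interface SymlinkPlan {
  create: Array<{ link: string; target: string }>;
  remove: string[];
}

// Compute which symlinks to add and which to delete so the group's
// .claude-shared/skills/ directory matches the selected set.
function planSkillSymlinks(
  selection: SkillSelection,
  sharedSkills: string[], // names found in container/skills/ at this spawn
  existingLinks: string[], // names currently present in .claude-shared/skills/
): SymlinkPlan {
  // "all" is recomputed from the shared dir on every spawn.
  const desired = new Set(selection === 'all' ? sharedSkills : selection);
  const existing = new Set(existingLinks);
  const create = Array.from(desired)
    .filter((name) => !existing.has(name))
    .sort()
    .map((name) => ({
      link: `.claude-shared/skills/${name}`,
      target: `/app/skills/${name}`, // container path; dangling on the host
    }));
  const remove = existingLinks.filter((name) => !desired.has(name)).sort();
  return { create, remove };
}
```

The sketch does not validate that an explicitly selected skill exists in the shared directory; whether the host should reject or ignore unknown names is left open here.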
|
||||
### Per-group customization surface
|
||||
|
||||
What remains per-group (unchanged):
|
||||
|
||||
| Resource | Location | Mechanism |
|
||||
|----------|----------|-----------|
|
||||
| Personality / instructions | `groups/<folder>/CLAUDE.md` | Mount at `/workspace/agent` (RW, live) |
|
||||
| MCP servers | `groups/<folder>/container.json` | Env var at spawn |
|
||||
| apt/npm packages | `groups/<folder>/container.json` | Per-group image layer |
|
||||
| Skill selection | `groups/<folder>/container.json` | Symlinks at spawn |
|
||||
| Additional mounts | `groups/<folder>/container.json` | Validated bind mounts |
|
||||
| Agent provider / model | `groups/<folder>/container.json` | Read by runner at startup |
|
||||
| Claude Code settings | `.claude-shared/settings.json` | Mount at `/home/node/.claude` (RW) |
|
||||
| Working files | `groups/<folder>/` | Mount at `/workspace/agent` (RW) |
|
||||
|
||||
### Self-modification
|
||||
|
||||
Existing config-level self-mod tools (`install_packages`, `add_mcp_server`) mutate `container.json` and per-group images, not source. Unchanged — stays per-group.
|
||||
|
||||
Source-level self-modification (not yet implemented) uses staging: edits happen against a copy of `container/agent-runner/src/`, reviewed and swapped in on approval. Owner can also edit source directly.
|
||||
|
||||
## Environment variables
|
||||
|
||||
Env is for things read by code we don't own: glibc, Node's http agent, CLIs we shell out to. Everything NanoClaw-specific moves out of env.
|
||||
|
||||
**Stays in env (read by non-nanoclaw code):**
|
||||
|
||||
| Var | Reader |
|
||||
|---|---|
|
||||
| `TZ` | glibc, child processes |
|
||||
| `HTTPS_PROXY`, `NO_PROXY` | Node http agent, curl, git, etc. (OneCLI-injected) |
|
||||
| `NODE_EXTRA_CA_CERTS` | Node at startup (OneCLI-injected) |
|
||||
|
||||
**Moves to `container.json` (read by runner at startup):**
|
||||
|
||||
| Var | Reason |
|
||||
|---|---|
|
||||
| `AGENT_PROVIDER` | Per-group config; runner reads before importing provider module |
|
||||
| `NANOCLAW_AGENT_GROUP_NAME` | Per-group identity |
|
||||
| `NANOCLAW_ASSISTANT_NAME` | Per-group identity |
|
||||
| `NANOCLAW_MAX_MESSAGES_PER_PROMPT` | Config constant; per-group override possible |
|
||||
|
||||
**Deleted (admin gating moves to router):**
|
||||
|
||||
`NANOCLAW_ADMIN_USER_IDS` is removed entirely — not moved to a new location. The container no longer makes authorization decisions. See **Router command gate** below.
|
||||
|
||||
**Hardcoded as conventions:**
|
||||
|
||||
| Var | Convention |
|
||||
|---|---|
|
||||
| `SESSION_INBOUND_DB_PATH` | `/workspace/inbound.db` |
|
||||
| `SESSION_OUTBOUND_DB_PATH` | `/workspace/outbound.db` |
|
||||
| `SESSION_HEARTBEAT_PATH` | `/workspace/.heartbeat` |
|
||||
| `NANOCLAW_AGENT_GROUP_ID` | Read from `/workspace/agent/container.json` at startup |
|
||||
|
||||
### Runner startup order
|
||||
|
||||
The runner can no longer assume DB paths or provider identity are handed to it in env. Revised startup:
|
||||
|
||||
1. Set up logging.
|
||||
2. Read `/workspace/agent/container.json` (the mount is RW, but the runner only reads this file).
|
||||
3. Open `/workspace/inbound.db` and `/workspace/outbound.db` (fixed paths).
|
||||
4. Read bootstrap tables from `inbound.db` (destinations).
|
||||
5. Import the provider module selected by `container.json`.
|
||||
6. Enter the poll loop.
|
||||
|
||||
### Router command gate
|
||||
|
||||
The host router gates slash commands before writing to `messages_in`. The container still handles whatever reaches it; it just stops making authorization decisions.
|
||||
|
||||
1. **Filtered commands** (`/help`, `/login`, `/logout`, `/doctor`, `/config`, `/start`, `/remote-control`) → drop silently. Never reach the container.
|
||||
2. **Admin commands** (`/clear`, `/compact`, `/context`, `/cost`, `/files`) → check sender against `user_roles` (owners + global admins + admins scoped to this agent group).
|
||||
- Denied: write "Permission denied: `<cmd>` requires admin access." directly to `messages_out` in the same thread. Do not write to `messages_in`.
|
||||
- Allowed: pass through to container unchanged.
|
||||
3. **Normal messages** → pass through unchanged.
|
||||
|
||||
Admin commands that flow through continue to be handled the same way they are today:
|
||||
- `/clear` — container's existing handler in `poll-loop.ts` resets session continuation and writes "Session cleared."
|
||||
- `/compact`, `/context`, `/cost`, `/files` — container forwards them to Claude Code's native slash-command handler.
|
||||
|
||||
Container receives only authorized messages. The runner has no admin concept, no `adminUserIds` field, no admin-gate branch — but it still recognizes `/clear` to reset session state.
|
||||
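The three-way gate described above can be sketched as follows. The command lists are taken from this document; `firstToken`, `classifyMessage`, `gateMessage`, and the exact-match comparison (no case folding) are assumptions for illustration, and the `user_roles` lookup is reduced to a boolean.

```typescript
const FILTERED_COMMANDS = new Set([
  '/help', '/login', '/logout', '/doctor', '/config', '/start', '/remote-control',
]);
const ADMIN_COMMANDS = new Set(['/clear', '/compact', '/context', '/cost', '/files']);

type CommandClass = 'filtered' | 'admin' | 'normal';

type GateDecision =
  | { action: 'drop' } // never reaches the container
  | { action: 'deny'; reply: string } // reply goes straight to messages_out
  | { action: 'pass' }; // written to messages_in unchanged

function firstToken(text: string): string {
  return text.trim().split(/\s+/)[0] ?? '';
}

function classifyMessage(text: string): CommandClass {
  const token = firstToken(text);
  if (FILTERED_COMMANDS.has(token)) return 'filtered';
  if (ADMIN_COMMANDS.has(token)) return 'admin';
  return 'normal';
}

// `senderIsAdmin` stands in for the user_roles check (owner, global admin,
// or admin scoped to this agent group).
function gateMessage(text: string, senderIsAdmin: boolean): GateDecision {
  const kind = classifyMessage(text);
  if (kind === 'filtered') return { action: 'drop' };
  if (kind === 'admin' && !senderIsAdmin) {
    const reply = `Permission denied: ${firstToken(text)} requires admin access.`;
    return { action: 'deny', reply };
  }
  return { action: 'pass' };
}
```

Keeping the decision as data (`drop` / `deny` / `pass`) separates classification from the writes to `messages_in` and `messages_out`, which makes the gate testable without a database.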
|
||||
### Scope rules
|
||||
|
||||
Each channel answers a single scope question:
|
||||
|
||||
| Channel | Scope | What it holds |
|
||||
|---|---|---|
|
||||
| Env vars | Process | Things read by code we don't own (`TZ`, `HTTPS_PROXY`) |
|
||||
| `container.json` | Per-group | Per-group config (MCP, packages, provider, model, skills, mounts) |
|
||||
| `inbound.db` / `outbound.db` | Per-session | Messages, session state, and host-projected views of cross-group state (destinations) |
|
||||
| Central DB (`data/v2.db`) | Cross-group | Users, roles, wiring, messaging groups, sessions |
|
||||
|
||||
The runner reads from env (for external-convention vars), `container.json` (for its own group's config), and `inbound.db` (for messages + projected views). It never reads central DB directly — that's always host-projected through inbound.db first.
|
||||
|
||||
After this change, the spawn-time `-e` flags shrink from ~10 to ~3-5 (TZ + OneCLI networking). No `NANOCLAW_*` env var survives.
|
||||
|
||||
## Image layer strategy
|
||||
|
||||
Single Dockerfile with aggressive layer ordering: stable layers first, frequently-bumped layers last. BuildKit's layer cache handles "upstream layers unchanged" rebuilds efficiently — a separate base image isn't justified.
|
||||
|
||||
Two image tags exist at runtime:
|
||||
|
||||
```
|
||||
nanoclaw-agent:latest — shared base (rebuild: dep/CLI bumps + Dockerfile changes)
|
||||
└── nanoclaw-agent:<group> — per-group apt/npm packages (rebuild: per-group via install_packages)
|
||||
```
|
||||
|
||||
Layer order within the base:
|
||||
|
||||
```dockerfile
|
||||
FROM node:22-slim
|
||||
|
||||
# System deps (apt) — rarely change
|
||||
RUN apt-get install ...
|
||||
|
||||
# Bun — pinned version, rarely changes
|
||||
RUN ... bun
|
||||
|
||||
# Agent-runner deps — cached independently of CLI versions
|
||||
COPY agent-runner/package.json agent-runner/bun.lock /app/
|
||||
RUN cd /app && bun install --frozen-lockfile
|
||||
|
||||
# Global CLIs — most stable first, most frequently bumped last
|
||||
RUN pnpm install -g "vercel@${VERCEL_VERSION}"
|
||||
RUN pnpm install -g "agent-browser@${AGENT_BROWSER_VERSION}"
|
||||
RUN pnpm install -g "@anthropic-ai/claude-code@${CLAUDE_CODE_VERSION}"
|
||||
```
|
||||
|
||||
Bumping claude-code (the most common change) only rebuilds one layer. Agent-runner deps and other CLIs stay cached.
|
||||
|
||||
Source is never baked into the image — always provided by the shared RO mount at runtime.
|
||||
|
||||
### Agent-triggered version bumps
|
||||
|
||||
Agents can request a claude-code version bump via a new self-mod tool (`bump_claude_code`). Same fire-and-forget pattern as `install_packages`: agent requests → owner approves → host rebuilds base image → kill all running containers. Unlike `install_packages` (per-group image), this rebuilds the shared base image and affects all groups.
|
||||
|
||||
## Changes
|
||||
|
||||
### `group-init.ts`
|
||||
|
||||
- Remove the `agent-runner-src` copy block (lines 109–117)
|
||||
- Remove the `skills/` copy block (lines 100–107)
|
||||
- Skill symlinks are no longer created at init — sync is spawn-owned (see `container-runner.ts`)
|
||||
|
||||
### `container-runner.ts` `buildMounts()`
|
||||
|
||||
- Remove per-group `agent-runner-src` mount (lines 206–209)
|
||||
- Add shared RO mount: `container/agent-runner/src/` → `/app/src`
|
||||
- Add shared RO mount: `container/skills/` → `/app/skills`
|
||||
- Sync skill symlinks in `.claude-shared/skills/` at spawn: write desired set from `container.json` (`"all"` = every skill in the shared dir, recomputed per spawn), remove symlinks not in the set
|
||||
|
||||
### `container-runner.ts` `buildContainerArgs()`
|
||||
|
||||
- Remove `-e SESSION_INBOUND_DB_PATH`, `-e SESSION_OUTBOUND_DB_PATH`, `-e SESSION_HEARTBEAT_PATH` (hardcoded conventions now)
|
||||
- Remove `-e AGENT_PROVIDER` (moves to `container.json`)
|
||||
- Remove `-e NANOCLAW_ASSISTANT_NAME`, `-e NANOCLAW_AGENT_GROUP_ID`, `-e NANOCLAW_AGENT_GROUP_NAME`
|
||||
- Remove `-e NANOCLAW_MAX_MESSAGES_PER_PROMPT`
|
||||
- Remove the `user_roles` join + `-e NANOCLAW_ADMIN_USER_IDS` block (lines 269–287) entirely. Admin gating moves to the router — no admin data passed to the container.
|
||||
- Keep: `-e TZ`, OneCLI-contributed env (`HTTPS_PROXY`, `NODE_EXTRA_CA_CERTS`, `NO_PROXY`)
|
||||
|
||||
### `router.ts` (new command gate)
|
||||
|
||||
- Classify inbound slash commands before writing to `messages_in`: filtered / admin / normal.
|
||||
- Filtered (`/help`, `/login`, `/logout`, `/doctor`, `/config`, `/start`, `/remote-control`) → drop silently.
|
||||
- Admin commands (`/clear`, `/compact`, `/context`, `/cost`, `/files`) from non-admins → write "Permission denied" directly to `messages_out`, skip `messages_in`.
|
||||
- All authorized messages (admin commands from admins, and normal messages) → pass through unchanged to `messages_in`. Container handles them as today.
|
||||
- The `ADMIN_COMMANDS` and `FILTERED_COMMANDS` lists move from `container/agent-runner/src/formatter.ts` to a host-side module.
|
||||
|
||||
### `container/agent-runner/src/` (runner)
|
||||
|
||||
- New `config.ts` module: loads `/workspace/agent/container.json` at startup, exposes a typed config singleton. All previous `process.env.NANOCLAW_*` reads go through this.
|
||||
- `db/connection.ts`: use hardcoded paths `/workspace/inbound.db` and `/workspace/outbound.db`; drop `SESSION_*_DB_PATH` lookups.
|
||||
- `formatter.ts`: remove `ADMIN_COMMANDS`, `FILTERED_COMMANDS`, and the `filtered` / admin-gate categorization. Keep enough to recognize `/clear` so `poll-loop.ts` can route it (e.g., a narrow `isClearCommand(msg)` helper).
|
||||
- `poll-loop.ts`: remove `adminUserIds` field from config type and the admin-gate branch (lines 113–126). Keep the `/clear` handler (lines 128–142) — `/clear` still flows through from the router.
|
||||
- Provider selection (`providers/index.ts` or equivalent): read provider from config singleton, not env.
|
||||
|
||||
### `container-config.ts`
|
||||
|
||||
- Add `skills` field to `ContainerConfig` (`string[] | "all"`, default `"all"`)
|
||||
- Add fields: `provider`, `groupName`, `assistantName`, `maxMessagesPerPrompt` (optional, falls back to code default)
|
||||
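A sketch of how the new fields might be normalized once `container.json` has been parsed. The field names come from the list above; treating every new field as optional, the validation rules, and the function names are assumptions.

```typescript
interface ContainerConfig {
  skills: string[] | 'all';
  provider?: string;
  groupName?: string;
  assistantName?: string;
  maxMessagesPerPrompt?: number; // unset means "use the code default"
}

function parseSkills(value: unknown): string[] | 'all' {
  if (value === undefined || value === 'all') return 'all'; // default is "all"
  if (Array.isArray(value) && value.every((s) => typeof s === 'string')) {
    return value as string[];
  }
  throw new Error('container.json: "skills" must be "all" or an array of names');
}

function optionalString(raw: Record<string, unknown>, key: string): string | undefined {
  const value = raw[key];
  if (value === undefined) return undefined;
  if (typeof value !== 'string') throw new Error(`container.json: "${key}" must be a string`);
  return value;
}

function optionalPositiveInt(raw: Record<string, unknown>, key: string): number | undefined {
  const value = raw[key];
  if (value === undefined) return undefined;
  if (typeof value !== 'number' || !Number.isInteger(value) || value < 1) {
    throw new Error(`container.json: "${key}" must be a positive integer`);
  }
  return value;
}

// Takes the already-parsed object so the sketch stays independent of how
// the file is read (plain JSON vs. a comment-tolerant parser).
function normalizeContainerConfig(raw: Record<string, unknown>): ContainerConfig {
  return {
    skills: parseSkills(raw['skills']),
    provider: optionalString(raw, 'provider'),
    groupName: optionalString(raw, 'groupName'),
    assistantName: optionalString(raw, 'assistantName'),
    maxMessagesPerPrompt: optionalPositiveInt(raw, 'maxMessagesPerPrompt'),
  };
}
```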
|
||||
### `.env` / `.env.example`
|
||||
|
||||
- Remove any `NANOCLAW_*` entries that were documented as tunables. Update `.env.example` to list only TZ and OneCLI-related vars as valid overrides.
|
||||
|
||||
### DB migration
|
||||
|
||||
- Drop `agent_groups.agent_provider` column and `sessions.agent_provider` column. Source of truth becomes `container.json.provider`.
|
||||
- One-time data migration reads existing values and writes them to each group's `container.json`. Sessions lose any per-session provider override — provider is a per-group property now.
|
||||
|
||||
### Migration
|
||||
|
||||
**This is a breaking change.** Host restart kills all running containers. No gradual rollout. Any code referencing dropped columns or removed env vars must be updated before the migration runs.
|
||||
|
||||
- Provider install skills (`/add-opencode`, `/add-ollama-tool`) now write to the shared `container/agent-runner/src/providers/` tree. The per-group `providers/` overlay pattern is removed. Any uncommitted provider overlays must be upstreamed before cutover.
|
||||
- Delete existing `data/v2-sessions/<id>/agent-runner-src/` directories on first run after cutover.
|
||||
- Existing `.claude-shared/skills/` directories get replaced with symlinks on next spawn.
|
||||
- DB migration (see above) reads `agent_provider` columns and projects into `container.json`, then drops the columns.
|
||||
|
||||
## What triggers what
|
||||
|
||||
| Change | Action needed | Scope |
|
||||
|--------|--------------|-------|
|
||||
| Agent-runner `.ts` source | Kill running containers | All groups |
|
||||
| Agent-runner npm deps | Rebuild `nanoclaw-agent` + kill all | All groups |
|
||||
| System deps, Bun, Node | Rebuild `nanoclaw-agent` + kill all | All groups |
|
||||
| Claude-code version bump | Rebuild `nanoclaw-agent` + kill all | All groups (agent-triggerable) |
|
||||
| Skill content | Kill running containers | All groups |
|
||||
| Per-group apt/npm packages | `buildAgentGroupImage()` + kill | One group |
|
||||
| Per-group config (MCP, mounts, provider, model, skills) | Kill that group's containers | One group |
|
||||
| CLAUDE.md, working files | Nothing (live via RW mount) | One group |
|
||||
@@ -1,618 +0,0 @@
|
||||
# v1 → v2 Action Items
|
||||
|
||||
Working doc for each finding from [SUMMARY.md](SUMMARY.md). Decisions were made one-by-one; this rollup summarizes the outcome.
|
||||
|
||||
**Status legend**: `pending` · `discussing` · `decided` · `deferred` · `dropped` · `done`
|
||||
|
||||
---
|
||||
|
||||
## Rollup
|
||||
|
||||
### To implement (~800 LOC total)
|
||||
|
||||
| # | Topic | LOC | Notes |
|
||||
|---|---|---|---|
|
||||
| 1 | Engage modes + sender scope + accumulate/drop + fan-out + tool blocklist | ~315 | DB migration drops `trigger_rules`/`response_scope`, adds `engage_mode`/`engage_pattern`/`sender_scope`/`ignored_message_policy` + `trigger` column on `messages_in`; router `pickAgents` fan-out; adapter-level gating via new hooks |
|
||||
| 5 | `request_approval` flow for unknown senders (default policy flips from `strict` to `request_approval`) | ~175 | New `pending_sender_approvals` table; reuses existing `pickApprover` + card infra |
|
||||
| 9 | Stuck detection (60s claim-age rule), heartbeat-based lifecycle, `max(30m, bash_timeout)` absolute ceiling, SDK tool blocklist (`AskUserQuestion`, `EnterPlanMode`, `ExitPlanMode`, `EnterWorktree`, `ExitWorktree`), remove `IDLE_TIMEOUT` setTimeout + `IDLE_END_MS` machinery | ~115 | Container state row for Bash timeout tracking |
|
||||
| 15 | Delete three dead config constants from `src/config.ts` | 3 | `POLL_INTERVAL`, `SCHEDULER_POLL_INTERVAL`, `IPC_POLL_INTERVAL` |
|
||||
| 18 | Timezone + formatting recreation — port v1 bit-for-bit (`formatLocalTime`, `<context timezone="..."/>` header, `reply_to` + `<quoted_message>` XML, `stripInternalTags`) + scheduling tool TZ normalization + cron TZ parsing | ~195 (75 prod + 120 tests) | Full spec in [timezone-formatting-v1-recreation.md](timezone-formatting-v1-recreation.md) |
|
||||
|
||||
### Deferred (wait for trigger)
|
||||
|
||||
| # | Topic | Trigger |
|
||||
|---|---|---|
|
||||
| 2 | `nonMainReadOnly` mount isolation | If multi-tenant / untrusted-group use ever surfaces. In the meantime, mount-declaration skill must explicitly prompt RO/RW when added |
|
||||
| 3a | End-to-end recovery test | When next touching `host-sweep.ts` / `index.ts` startup |
|
||||
| 14 | Remote control subsystem | When someone needs it. Opt-in skill, provider-specific (Claude SDK only) |
|
||||
| 17 | Dynamic group-add (bridge conversations cache refresh) | When implementing dynamic group registration feature. Code comment added at `chat-sdk-bridge.ts:73` |
|
||||
|
||||
### Dropped (won't implement / not-a-regression)
|
||||
|
||||
| # | Topic | Why |
|
||||
|---|---|---|
|
||||
| 3 | Explicit pending-message recovery | Working as designed via sweep's immediate first tick + `cleanupOrphans` |
|
||||
| 4 | `response_scope` enforcement | Folded into item 1 migration (column deleted, values backfilled) |
|
||||
| 6 | Per-group container timeout | Not a regression — v1's hard-kill was worse than v2's keep-alive-after-idle |
|
||||
| 7 | Container streaming output markers | Replaced by `send_message` MCP tool; latency ~1s is fine for chat UX |
|
||||
| 8 | Per-exit container log files | Underlying info still recoverable (session DBs, heartbeat mtime, exit code) |
|
||||
| 10 | Host-level retry on agent error | Folded into item 9's kill + sweep-reset loop |
|
||||
| 11 | Process ID in logger output | Single host process; container stderr already tagged with `agentGroup.folder` |
|
||||
| 12 | Task dedup via unique series_id index | Recurrence logic is structurally dedup-safe; not a real issue |
|
||||
| 13 | Silent-drop sender mode | Admin can use `unknown_sender_policy='strict'` or remove from members instead |
|
||||
| 16 | Configurable retention thresholds | Personal-assistant scale; source constants are fine |
|
||||
|
||||
### Extras recorded during discussion
|
||||
- **1a**: Implementation-ordering plan for item 1
|
||||
- **6a**: Remove `IDLE_END_MS` from `poll-loop.ts` (folded into item 9)
|
||||
- **3a**: E2E recovery test (deferred)
|
||||
|
||||
### Follow-up PRs (scoped, not in this branch)
|
||||
| # | Topic | Why later |
|
||||
|---|---|---|
|
||||
| 22 | Unknown-channel wiring approval flow (card to owner when bot receives inbound in an unwired messaging group) | Gap surfaced after item 5 landed — item 5's `request_approval` covers unknown senders but presupposes a wired channel. See item 22 for the full design. |
|
||||
|
||||
---
|
||||
|
||||
## HIGH
|
||||
|
||||
### 1. Trigger-rule matching in `pickAgent`
|
||||
**Finding**: `src/router.ts:246` TODO. Confirmed trigger filtering is non-functional end-to-end: `trigger_rules` JSON is parsed into `ConversationConfig` and passed to adapters, but the Chat SDK bridge never reads it, and router's `pickAgent` picks by priority only. `response_scope` on `messaging_group_agents` is stored but never enforced. Chat SDK bridge hard-subscribes on every mention (bridge:173) and every DM (bridge:189).
|
||||
|
||||
**Status**: decided — design locked; implementation pending
|
||||
|
||||
**Decision**: replace `trigger_rules` JSON + `response_scope` with four explicit orthogonal columns on `messaging_group_agents`. Fan out inbound messages to all matching agents (N containers for N agents). Adapter-level gating in the bridge. `sender_scope` enforcement moves to the permissions module.
|
||||
|
||||
**Schema** (`messaging_group_agents`):
|
||||
```
|
||||
engage_mode TEXT NOT NULL DEFAULT 'mention'
|
||||
-- 'pattern' | 'mention' | 'mention-sticky'
|
||||
engage_pattern TEXT -- required when mode='pattern'; '.' = always
|
||||
sender_scope TEXT NOT NULL DEFAULT 'all' -- 'all' | 'known'
|
||||
ignored_message_policy TEXT NOT NULL DEFAULT 'drop' -- 'drop' | 'accumulate'
|
||||
```
|
||||
Drop `trigger_rules` + `response_scope`. **No per-wiring accumulate cap** — storage is unbounded.
|
||||
|
||||
**Global wake cap** (not a column): reuse `MAX_MESSAGES_PER_PROMPT` in `src/config.ts` (already defined, default 10, currently dead code from v1). Pass to container via `NANOCLAW_MAX_MESSAGES_PER_PROMPT`. Container applies `ORDER BY seq DESC LIMIT $N` when pulling pending messages on wake.
|
||||
|
||||
**Session DB** (`messages_in`):
|
||||
```
|
||||
trigger INTEGER NOT NULL DEFAULT 1 -- 0 = context-only, 1 = wake agent
|
||||
```
|
||||
Host's `countDueMessages` / wake logic gates on `trigger=1`. Container reads all messages for context regardless.
|
||||
|
||||
**Decisions locked**:
|
||||
- `always` collapses into `pattern` with `engage_pattern='.'` (three modes total)
|
||||
- `mention` and `mention-sticky` are separate modes (stickiness is user-visible)
|
||||
- `pattern` is a JS regex string — applied as `new RegExp(pattern).test(text)`
|
||||
- Wake cap = only the last N messages reach the prompt, default 10 (accumulated storage itself is uncapped)
|
||||
- Fan-out: each matching agent gets its own session + container
|
||||
- Per-channel defaults live in the setup/register flow, not in the schema:
|
||||
- DM → `pattern` with `.`
|
||||
- Threaded group → `mention-sticky`
|
||||
- Non-threaded group → `mention`
|
||||
|
||||
**Routing flow** (future):
|
||||
1. Inbound → resolve messaging_group → group-level `unknown_sender_policy` gate
|
||||
2. `pickAgents()` returns all wired agents (not just priority 0)
|
||||
3. For each agent:
|
||||
a. `sender_scope` check (permissions module)
|
||||
b. `engage_mode` check (regex / mention / mention-sticky)
|
||||
c. Matched → write with `trigger=1`, wake container
|
||||
d. Not matched + `accumulate` → write with `trigger=0`, don't wake (no cap — stored forever)
|
||||
e. Not matched + `drop` → skip
|
||||
|
||||
On wake, container pulls pending messages with `ORDER BY seq DESC LIMIT MAX_MESSAGES_PER_PROMPT` so only the most recent N reach the prompt regardless of accumulation depth.
|
||||
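The per-agent step of the routing flow can be sketched as a pure decision function. Column names and values follow the schema above; the `InboundMessage` shape, the handling of a failed `sender_scope` check (skip for that agent), and treating follow-ups in an already-subscribed thread as engaged under `mention-sticky` are assumptions, not confirmed behavior.

```typescript
type EngageMode = 'pattern' | 'mention' | 'mention-sticky';

interface Wiring {
  engage_mode: EngageMode;
  engage_pattern: string | null; // required when engage_mode is 'pattern'
  sender_scope: 'all' | 'known';
  ignored_message_policy: 'drop' | 'accumulate';
}

// Hypothetical view of an inbound message, for illustration only.
interface InboundMessage {
  text: string;
  mentionsBot: boolean;
  senderKnown: boolean;
  inSubscribedThread: boolean;
}

type RouteDecision = { write: false } | { write: true; trigger: 0 | 1 };

function engages(wiring: Wiring, msg: InboundMessage): boolean {
  switch (wiring.engage_mode) {
    case 'pattern':
      // engage_pattern '.' is the "always" case for any non-empty text.
      return wiring.engage_pattern !== null && new RegExp(wiring.engage_pattern).test(msg.text);
    case 'mention':
      return msg.mentionsBot;
    case 'mention-sticky':
      return msg.mentionsBot || msg.inSubscribedThread;
  }
}

function routeForAgent(wiring: Wiring, msg: InboundMessage): RouteDecision {
  if (wiring.sender_scope === 'known' && !msg.senderKnown) return { write: false };
  if (engages(wiring, msg)) return { write: true, trigger: 1 }; // wake the container
  return wiring.ignored_message_policy === 'accumulate'
    ? { write: true, trigger: 0 } // context-only, no wake
    : { write: false };
}
```

The router would call `routeForAgent` once per wiring returned by `pickAgents()`, so one inbound message can wake some agents, accumulate silently for others, and be skipped by the rest.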
|
||||
**Adapter bridge**:
|
||||
- Read `conversations.get(channelId)` before `setupConfig.onInbound(...)`
|
||||
- For `pattern` mode: test regex
|
||||
- For `mention` / `mention-sticky`: require bot to be mentioned
|
||||
- Only `thread.subscribe()` when mode is `mention-sticky` (today it subscribes unconditionally)
|
||||
|
||||
**LOC estimate**: ~315 (~255 prod + ~60 test)
|
||||
- schema migration + backfill: 40
|
||||
- session DB `trigger` column: 25
|
||||
- types + adapter contract: 20
|
||||
- DB helpers (CRUD): 20
|
||||
- host→adapter plumbing (including `NANOCLAW_MAX_MESSAGES_PER_PROMPT` env): 10
|
||||
- router fan-out + gating: 70
|
||||
- sender-scope in permissions module: 15
|
||||
- Chat SDK bridge gating + subscribe control: 40
|
||||
- container-side `LIMIT N` on pending-message pull: 5
|
||||
- smart defaults in setup/register flow: 15
|
||||
- tests: 60
|
||||
|
||||
(Note: earlier plan's "accumulate prune-to-N in router" is dropped — host doesn't prune. Cap is container-side only.)
|
||||
|
||||
**Core vs module split**:
|
||||
- Core (`src/`): schema, engage_mode enforcement, pickAgents fan-out, bridge gating, `trigger` column, accumulate/drop
|
||||
- Permissions module: `sender_scope` enforcement (extends existing access gate). Default `sender_scope='all'` → no-op when permissions module absent
|
||||
|
||||
**Next step**: new action item for implementation — see item 1a.
|
||||
|
||||
---
|
||||
|
||||
### 1a. Implementation plan for engage/sender/ignored columns
|
||||
**Status**: pending — ready to implement
|
||||
**Order**: (a) migration + backfill, (b) types + DB helpers, (c) router fan-out + gating, (d) bridge gating, (e) permissions sender_scope, (f) setup-flow defaults, (g) tests
|
||||
**Next step**: draft the migration + write up the PR plan when ready
|
||||
|
||||
### 2. `nonMainReadOnly` mount isolation
|
||||
**Finding**: `mount-security.ts` moved to `src/modules/mount-security/index.ts` during the refactor. `validateMount(mount)` no longer takes an `isMain` param; `MountAllowlist` has no `nonMainReadOnly` field. Regression is real. But v1's "main vs non-main" concept doesn't map cleanly to v2 — `agent_groups` has no `is_main` flag.
|
||||
|
||||
**Status**: deferred
|
||||
|
||||
**Decision**: do not restore the v1 flag. Trust admin-declared `readonly` values in `container.json`. The allowlist's per-root `allowReadWrite` + path gating is sufficient for the current threat model (personal-assistant use, single admin). If multi-tenant / untrusted auxiliary groups become a real use case, prefer framing B (add `agent_groups.mount_access: 'rw' | 'ro'` column) over resurrecting `isMain`.
|
||||
|
||||
**Rationale**: v2 deliberately dropped the "main" concept. Reintroducing `isMain` to restore a defense-in-depth check that was designed for a different entity model is the wrong trade. Admin already has to opt-in twice (allowlist `allowReadWrite: true` + container.json `readonly: false`) to get RW — that's two deliberate keys. The v1 flag was a triple-check for a rare class of admin mistakes in a shared-infra setup.
|
||||
|
||||
**Follow-up (required)**: when building the skill / guide / setup flow that lets admins declare additional mounts (e.g. self-customize, manage-mounts, or a new `/add-mount` skill), the flow **must clearly surface the RO vs RW distinction** to the admin — explicit choice, explicit warning when RW is selected, and default to RO. This replaces v1's automatic enforcement with informed consent.
|
||||
|
||||
**Next step**: when the mount-declaration skill/flow is next touched, add explicit RO/RW prompting. Track it as a sub-item once such a skill exists.
|
||||
|
||||
### 3. Explicit pending-message recovery on startup
|
||||
**Finding**: v1 had a named `recoverPendingMessages()` function at startup. v2 relies on the host sweep. Verified: the recovery path exists and is correct — just renamed/relocated.
|
||||
|
||||
**Status**: decided — working as designed, no code change
|
||||
|
||||
**Current mechanism** (verified against tree):
|
||||
1. `cleanupOrphans()` at startup kills any leftover container from the previous run (`src/index.ts:69`)
|
||||
2. `startHostSweep()` runs its first sweep **immediately** — no 60s delay (`src/host-sweep.ts:38`)
|
||||
3. Sweep per session: `syncProcessingAcks` → `countDueMessages` → `wakeContainer` if work pending and no container → `detectStaleContainers` resets stuck `processing` rows with backoff
|
||||
|
||||
**Scenarios covered**:
|
||||
- Host crashed while container idle with pending messages → orphan cleanup + first sweep re-wakes
|
||||
- Host crashed mid-processing → stale detection resets `processing → pending`, next sweep wakes
|
||||
- Container crashed with host alive → heartbeat mtime catches it inside 10 min `STALE_THRESHOLD_MS`
|
||||
|
||||
**Rationale**: the function got renamed (recovery → sweep) but the behavior is equivalent or better. Sweep is continuous; recovery used to be one-shot.
|
||||
|
||||
**Next step**: see item 3a.
|
||||
|
||||
---
|
||||
|
||||
### 22. Unknown-channel wiring approval flow
|
||||
**Finding** (post-item-5 discussion): item 5's `request_approval` only fires when a messaging group already has agents wired. Three scenarios slip through to the earlier `no_agent_wired` structural-drop branch in `src/router.ts` and get silent-dropped with no signal to the owner:
|
||||
|
||||
1. A new user DMs the agent directly (the DM's messaging group auto-creates but has no wiring)
|
||||
2. The agent is @mentioned in a group the admin hasn't registered
|
||||
3. The agent is added to a new group and someone there addresses it
|
||||
|
||||
In all three, the user sees no response and the owner has no signal anything happened.
|
||||
|
||||
**Status**: decided — companion PR to item 5, scoped separately
|
||||
|
||||
**Decision**: when the router hits `no_agent_wired` for a non-public event, **instead of silent-dropping, pick the owner and DM them a wiring card**. Two flavors depending on who triggered it:
|
||||
|
||||
- **Sender IS an owner/admin** (the common "I just added the bot" case) → auto-wire IF exactly one agent group exists. Silent seamless flow. If multiple agent groups exist, fall through to the card so the owner picks.
|
||||
- **Sender is anyone else** (stranger, or owner in a multi-agent install) → deliver a card:
|
||||
- Title: `🔌 New channel — wire it?`
|
||||
- Body: `<senderName> is trying to reach you in <channelName> on <platform>. Wire to which agent?`
|
||||
- Options: one button per existing `agent_groups` row, plus `➕ Create new` and `Ignore`
|
||||
|
||||
**On approve (existing agent group)**:
|
||||
1. `createMessagingGroupAgent(...)` with channel-kind defaults — DM→`pattern` + `'.'`, threaded group→`mention-sticky`, non-threaded group→`mention` (same defaults as `scripts/init-first-agent.ts`)
|
||||
2. Replay the stored event via `routeInbound` (sender-approval pattern)
|
||||
3. Delete pending row
|
||||
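The channel-kind defaults in step 1 can be written as a small lookup. This is a sketch only: the `ChannelKind` labels and the `defaultEngageFor` name are hypothetical, and the real `createMessagingGroupAgent(...)` signature is not shown in this document. The column names `engage_mode` and `engage_pattern` follow item 1.

```typescript
// Hypothetical names: ChannelKind, EngageDefaults, and defaultEngageFor are illustrative.
type ChannelKind = 'dm' | 'threaded-group' | 'group';

interface EngageDefaults {
  engage_mode: 'pattern' | 'mention' | 'mention-sticky';
  engage_pattern: string | null;
}

function defaultEngageFor(kind: ChannelKind): EngageDefaults {
  switch (kind) {
    case 'dm':
      // A DM engages on any non-empty message: pattern mode with the regex '.'.
      return { engage_mode: 'pattern', engage_pattern: '.' };
    case 'threaded-group':
      return { engage_mode: 'mention-sticky', engage_pattern: null };
    case 'group':
      return { engage_mode: 'mention', engage_pattern: null };
  }
}
```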
|
||||
**On approve "Create new"**: [OPEN SCOPE] — needs name/folder input. Options:
|
||||
- Follow-up ask_question card asking for a name → auto-derive folder from slug → create group + wire
|
||||
- Or: skill-backed flow — the button dispatches to `/init-agent` or similar and the card just links out
|
||||
- Punt until implementation; mention in the PR brief that we'll decide when building
|
||||
|
||||
**On ignore**: delete pending row; future attempts re-prompt fresh (consistent with sender-approval deny; no denial persistence).
|
||||
|
||||
**Failure cases** (drop silently with log, don't leave a pending row):
|
||||
- No owner configured (fresh install) — same behaviour as sender-approval
|
||||
- No reachable DM for any owner/admin
|
||||
- Delivery adapter missing
|
||||
|
||||
**New table**:
|
||||
```sql
CREATE TABLE pending_channel_approvals (
  id                 TEXT PRIMARY KEY,
  messaging_group_id TEXT NOT NULL REFERENCES messaging_groups(id),
  sender_identity    TEXT,           -- NULL when triggered by a non-identifiable event
  sender_name        TEXT,
  original_message   TEXT NOT NULL,  -- JSON InboundEvent for replay
  approver_user_id   TEXT NOT NULL,
  created_at         TEXT NOT NULL,
  UNIQUE(messaging_group_id)         -- one pending wiring per channel
);
```
|
||||
|
||||
Dedup is narrower than sender-approval's `(mg_id, sender_id)` — one pending wiring per channel, period. A second stranger writing into the same unwired channel piggybacks on the existing card instead of spawning a new one. Latest event replaces the stored `original_message` (we only replay one anyway, and latest is most useful).
|
||||
|
||||
**Card action id prefix**: `nca-<approvalId>:<value>` where value is `agent-group-<id>` / `create` / `ignore`. Response handler lives in `src/modules/permissions/` alongside `handleSenderApprovalResponse`.
|
||||
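A parser for the `nca-<approvalId>:<value>` action id might look like the sketch below. The type and function names are hypothetical, and the sketch assumes the approval id itself contains no `:` so that the first colon separates id from value.

```typescript
// Hypothetical helper: parses the card action id format described above.
type ChannelApprovalChoice =
  | { kind: 'agent-group'; agentGroupId: string }
  | { kind: 'create' }
  | { kind: 'ignore' };

interface ChannelApprovalAction {
  approvalId: string;
  choice: ChannelApprovalChoice;
}

const NCA_PREFIX = 'nca-';
const AGENT_GROUP_PREFIX = 'agent-group-';

function parseChannelApprovalAction(actionId: string): ChannelApprovalAction | null {
  if (!actionId.startsWith(NCA_PREFIX)) return null;
  const rest = actionId.slice(NCA_PREFIX.length);
  // Split on the first ':' (assumes the approval id contains none).
  const sep = rest.indexOf(':');
  if (sep <= 0) return null;
  const approvalId = rest.slice(0, sep);
  const value = rest.slice(sep + 1);
  if (value === 'create') return { approvalId, choice: { kind: 'create' } };
  if (value === 'ignore') return { approvalId, choice: { kind: 'ignore' } };
  if (value.startsWith(AGENT_GROUP_PREFIX) && value.length > AGENT_GROUP_PREFIX.length) {
    const agentGroupId = value.slice(AGENT_GROUP_PREFIX.length);
    return { approvalId, choice: { kind: 'agent-group', agentGroupId } };
  }
  // Unknown value: let the caller ignore the action rather than guess.
  return null;
}
```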
|
||||
**Owner-sender auto-wire logic**:
|
||||
```
if sender is owner/admin AND getAllAgentGroups().length === 1:
    auto-wire to that group, replay event, done (no card)
else:
    deliver card
```
|
||||
|
||||
Don't auto-create a new agent group silently — always require a prompt for that.
|
||||
|
||||
**LOC estimate**: ~145
|
||||
- Migration + CRUD: 45
|
||||
- Router hook before `no_agent_wired` drop → try channel approval: 15
|
||||
- Owner-sender auto-wire fast path: 20
|
||||
- Card delivery (scope `pickApprover(null)`; build buttons from `getAllAgentGroups()`): 25
|
||||
- Response handler: 25
|
||||
- Tests: 15
|
||||
|
||||
**Open scopes (flag at PR time)**:
|
||||
- "Create new" sub-flow — pick between follow-up card vs skill link
|
||||
- Do we also react to bot-added-to-group platform events? Simpler to stay lazy (first-message-triggered only). Platform lifecycle events are inconsistent across Discord/Slack/Telegram anyway.
|
||||
- Worth scanning the `channels` branch for any existing channel-lifecycle handlers that might conflict.
|
||||
|
||||
**Next step**: open a follow-up PR off this branch once #1869 lands.
|
||||
|
||||
---
|
||||
|
||||
### 3a. End-to-end recovery test
|
||||
**Finding**: no test confirms the host-crash-restart scenario produces timely re-delivery.
|
||||
|
||||
**Status**: pending — nice-to-have
|
||||
|
||||
**Decision**: add an integration test: (1) write a pending message to inbound.db, (2) kill the host simulating crash, (3) start host, (4) assert container is woken and message processed within a bounded time (≤5s? ≤ one sweep interval).
|
||||
|
||||
**Rationale**: the sweep logic is correct as written, but a regression here would be silent (messages just sit). Worth a safety net.
|
||||
|
||||
**Next step**: draft test when touching `host-sweep.ts` or `index.ts` startup flow next.
|
||||
|
||||
---
|
||||
|
||||
## MEDIUM
|
||||
|
||||
### 4. `response_scope` enforcement
|
||||
**Finding**: `messaging_group_agents.response_scope` stores `'all' | 'triggered' | 'allowlisted'` but nothing reads it.
|
||||
|
||||
**Status**: decided — folded into item 1
|
||||
|
||||
**Decision**: delete the `response_scope` column as part of the item-1 migration. Values backfill into the new explicit columns:
|
||||
|
||||
| Old `response_scope` | New columns |
|---|---|
| `all` | `engage_mode='pattern'`, `engage_pattern='.'`, `sender_scope='all'` |
| `triggered` | `engage_mode='mention'` (or `'pattern'` if legacy row has a pattern), `sender_scope='all'` |
| `allowlisted` | `engage_mode` derived from `trigger_rules`, `sender_scope='known'` |
|
||||
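The backfill mapping in the table can be sketched as a pure function. Several details are assumptions made for illustration: the `LegacyWiringRow` shape, the `trigger_pattern` field (standing in for whatever pattern can be extracted from `trigger_rules`), and the rule that `allowlisted` rows derive `engage_mode` the same way `triggered` rows do. The document does not specify that derivation.

```typescript
type OldResponseScope = 'all' | 'triggered' | 'allowlisted';

// Hypothetical row shape; `trigger_pattern` is an assumed stand-in for
// a pattern extracted from the legacy `trigger_rules` value.
interface LegacyWiringRow {
  response_scope: OldResponseScope;
  trigger_pattern: string | null;
}

interface EngageColumns {
  engage_mode: 'pattern' | 'mention';
  engage_pattern: string | null;
  sender_scope: 'all' | 'known';
}

// 'pattern' when a legacy pattern exists, otherwise 'mention'.
function engageFromPattern(
  pattern: string | null,
): Pick<EngageColumns, 'engage_mode' | 'engage_pattern'> {
  return pattern
    ? { engage_mode: 'pattern', engage_pattern: pattern }
    : { engage_mode: 'mention', engage_pattern: null };
}

function backfillResponseScope(row: LegacyWiringRow): EngageColumns {
  switch (row.response_scope) {
    case 'all':
      return { engage_mode: 'pattern', engage_pattern: '.', sender_scope: 'all' };
    case 'triggered':
      return { ...engageFromPattern(row.trigger_pattern), sender_scope: 'all' };
    case 'allowlisted':
      // Assumption: same engage derivation as 'triggered', with a narrower sender scope.
      return { ...engageFromPattern(row.trigger_pattern), sender_scope: 'known' };
  }
}
```

Keeping the mapping as a pure function makes it easy to unit-test separately from the migration's SQL.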
|
||||
**Rationale**: `response_scope` conflated two orthogonal axes (engage + sender). Splitting them is the whole point of item 1.
|
||||
|
||||
**Next step**: ensure the item-1 migration includes the `response_scope` backfill in its UP step.
|
||||
|
||||
### 5. `request_approval` flow for unknown senders
|
||||
**Finding**: `unknown_sender_policy='request_approval'` is scaffolded in `src/modules/permissions/index.ts:100-108` but falls through to log-and-drop (explicit TODO comment). Current default is `'strict'`, which silently drops — user has no signal that their agent isn't responding.
|
||||
|
||||
**Status**: decided — implement, keep simple
|
||||
|
||||
**Decision**: implement full approval flow **and** flip the schema default from `'strict'` to `'request_approval'`. UX rationale: users wire their DM during setup; silent drops create a mystery when the agent doesn't respond. Public is unsafe. Approval default → admin sees a card and explicitly decides.
|
||||
|
||||
**Flow**:
|
||||
1. Unknown sender writes to wired messaging group with policy `'request_approval'`
|
||||
2. If pending approval for `(messaging_group, sender)` already exists → drop this message silently (in-flight dedup; not persistence)
|
||||
3. Otherwise: insert into `pending_sender_approvals` with original message + timestamp
|
||||
4. `pickApprover(agent_group_id)` + `pickApprovalDelivery(approverUserId)` — existing machinery in `src/access.ts`
|
||||
5. Deliver a card via adapter's `deliver()` with `Card`/`Actions`/`Button` primitives (already in chat-sdk-bridge)
|
||||
6. Card action id prefix `nsa:<approval_id>:<allow|deny>` (parallels existing `ncq:` prefix for `ask_user_question`)
|
||||
7. On `allow`: upsert `users` row, insert into `agent_group_members`, deliver stored message through normal routing (original timestamp preserved), cleanup pending row
|
||||
8. On `deny`: cleanup pending row, drop the message. No denial persistence — next attempt from same sender triggers a fresh card.
|
||||
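The flow above can be modeled with an in-memory stand-in for `pending_sender_approvals`. This is a sketch under assumptions, not the planned implementation: the class and method names are hypothetical, the approver is passed in rather than looked up through `pickApprover`, and card delivery is reduced to returning the two button action ids.

```typescript
// In-memory stand-in for the pending_sender_approvals table (illustrative only).
interface PendingSenderApproval {
  id: string;
  messagingGroupId: string;
  senderIdentity: string; // channel_type:handle
  originalMessage: string; // JSON of the inbound event, kept for replay
  approverUserId: string;
}

interface UnknownSenderResult {
  outcome: 'card_sent' | 'deduped' | 'dropped_no_approver';
  actionIds?: { allow: string; deny: string };
}

class PendingSenderApprovals {
  // Keyed by (messaging group, sender) to model the UNIQUE constraint.
  private byPair = new Map<string, PendingSenderApproval>();
  private pairById = new Map<string, string>();
  private nextId = 1;

  request(
    mgId: string,
    sender: string,
    originalMessage: string,
    approverUserId: string | null,
  ): UnknownSenderResult {
    const pair = JSON.stringify([mgId, sender]);
    // Step 2: an approval is already in flight, so drop this message silently.
    if (this.byPair.has(pair)) return { outcome: 'deduped' };
    // No admin/owner to ask: drop without leaving a pending row behind.
    if (approverUserId === null) return { outcome: 'dropped_no_approver' };
    // Step 3: store the original message for later replay.
    const id = `a${this.nextId++}`;
    this.byPair.set(pair, {
      id,
      messagingGroupId: mgId,
      senderIdentity: sender,
      originalMessage,
      approverUserId,
    });
    this.pairById.set(id, pair);
    // Step 6: action ids carried by the card buttons.
    return { outcome: 'card_sent', actionIds: { allow: `nsa:${id}:allow`, deny: `nsa:${id}:deny` } };
  }

  resolve(actionId: string): { decision: 'allow' | 'deny'; replay: string | null } | null {
    const match = /^nsa:([^:]+):(allow|deny)$/.exec(actionId);
    if (!match) return null;
    const pair = this.pairById.get(match[1]);
    const row = pair === undefined ? undefined : this.byPair.get(pair);
    if (pair === undefined || row === undefined) return null;
    // Steps 7 and 8: both outcomes clean up the pending row.
    this.byPair.delete(pair);
    this.pairById.delete(row.id);
    return match[2] === 'allow'
      ? { decision: 'allow', replay: row.originalMessage }
      : { decision: 'deny', replay: null };
  }
}
```

Because the row is deleted on both `allow` and `deny`, a later message from the same sender produces a fresh card, which matches the "no denial persistence" decision.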
|
||||
**No denial persistence** explicit rationale: personal-assistant scale, admin can switch policy to `'strict'` per messaging group if a hostile sender starts spamming. Avoids a new table column and a TTL config.
|
||||
|
||||
**New table**:
|
||||
```sql
CREATE TABLE pending_sender_approvals (
  id                 TEXT PRIMARY KEY,
  messaging_group_id TEXT NOT NULL,
  agent_group_id     TEXT NOT NULL,
  sender_identity    TEXT NOT NULL,  -- channel_type:handle
  sender_name        TEXT,
  original_message   TEXT NOT NULL,  -- JSON of the InboundEvent
  approver_user_id   TEXT NOT NULL,
  created_at         TEXT NOT NULL,
  UNIQUE(messaging_group_id, sender_identity)  -- enforces in-flight dedup
);
```
|
||||
Dedicated (not reusing `pending_approvals` which is OneCLI-specific).
|
||||
|
||||
**Reuse**:
|
||||
- `pickApprover` / `pickApprovalDelivery` in `src/access.ts`
|
||||
- Card rendering primitives already in `src/channels/chat-sdk-bridge.ts`
|
||||
- `onAction` dispatch — add the `nsa:` prefix handler alongside existing `ncq:`
|
||||
|
||||
**LOC estimate**: ~175
|
||||
- Migration + CRUD for `pending_sender_approvals`: 55
|
||||
- `handleUnknownSender` request_approval branch + in-flight dedup: 25
|
||||
- Host-side card dispatcher (pick approver + deliver card): 25
|
||||
- `onAction` handler for `nsa:` prefix (allow/deny): 30
|
||||
- Schema default flip + router auto-create update: 5
|
||||
- Tests: 35
|
||||
|
||||
**Module location**: all in `src/modules/permissions/`. Module stays optional; default-allow fallback behavior when not loaded is preserved.
|
||||
|
||||
**Open gotchas noted**:
|
||||
- The router's auto-create at `router.ts:123` currently hardcodes `'strict'` — change to omit the field so schema default applies
|
||||
- `pickApprover` may return null if no admin/owner exists (e.g. a fresh install before the first user is registered). In that case: log + drop silently, treating the policy as effectively `'strict'` for safety. Don't block the message forever.
|
||||
|
||||
**Scope boundary** (important): this item covers **unknown sender in a wired channel**. The parallel case — **unknown channel** (new DM / unwired group / bot-added-to-group) — short-circuits at the `no_agent_wired` structural drop before this flow ever runs. Tracked as item 22.
|
||||
|
||||
**Next step**: implement alongside item 1 or as a follow-up. Same migration window is fine (one migration for engage columns + request_approval default change + new table).
|
||||
|
||||
### 6. Per-group container timeout
|
||||
**Finding**: v1's `containerConfig.timeout` override is gone. All groups share `IDLE_TIMEOUT`. The original framing (slow-but-healthy agents getting killed) was wrong: v1's `timeout` was a hard wall-clock kill on the whole spawn, which is entirely different from v2's `IDLE_TIMEOUT` (keep-alive after last activity). v2's behavior is strictly better for long-running agents.
|
||||
|
||||
**Status**: dropped — not a regression
|
||||
|
||||
**Decision**: don't restore per-group timeout override. `IDLE_TIMEOUT=30min` global is the right model. If per-group idle tuning ever becomes useful it's ~15 LOC (new column, env injection at spawn) — small feature add, not a regression to repair.
|
||||
|
||||
**Rationale**: v2 lets long-running agents finish; v1 would have hard-killed them at 30min. Current behavior is an improvement.
|
||||
|
||||
**Next step**: see 6a.
|
||||
|
||||
---
|
||||
|
||||
### 6a. Remove IDLE_END_MS (container-side query idle termination)
|
||||
**Finding**: `container/agent-runner/src/poll-loop.ts:11` defines `IDLE_END_MS = 20_000`. Inside `processQuery`, a concurrent interval ends the active Agent SDK `query()` stream after 20s of SDK silence. Any SDK event (tool use, tool result, streamed text, new pushed message) resets the timer.
|
||||
|
||||
This is a general "SDK silence detector," not specifically post-result. It fires any time:
|
||||
- Mid-work: slow tool call with no intermediate SDK events (`npm install`, `pytest`, long `WebFetch`, etc.)
|
||||
- Post-result: agent finished, stream waiting for potential follow-up
|
||||
- Any other pause in the SDK stream
|
||||
|
||||
**Status**: decided — remove, pending SDK verification
|
||||
|
||||
**Decision**: delete `IDLE_END_MS` and its setInterval check. Let the `query()` stream stay open as long as the container is active. Container's 30-min `IDLE_TIMEOUT` (host-side in `container-runner.ts`) is the single source of truth for "when to let go."
|
||||
|
||||
**Rationale**:
|
||||
- When new messages arrive mid-stream, they push in via `query.push()` with no reconnect — stream-open is cheaper per-message than close-and-reopen
|
||||
- Closing early forces a reconnect + cold prompt cache for the next message
|
||||
- Container stays alive anyway; ending the stream doesn't save resources at the container level
|
||||
- `CLAUDE_CODE_AUTO_COMPACT_WINDOW=165000` already handles context window growth within a long-lived query
|
||||
- Anthropic API's own stream timeout will fire if needed; SDK should handle it transparently
|
||||
- Avoids the false-positive kill during legitimate slow tool calls (common case: agent running `npm install` gets cut off at 20s)
|
||||
|
||||
**Caveat (must verify before removal)**: confirm Claude Agent SDK doesn't require explicit `query.end()` for prompt-cache commit or session-state persistence. Expected to be fine (SDK checkpoints per turn) but double-check docs / run a quick test where container idles with stream open, then processes a follow-up.
|
||||
|
||||
**LOC estimate**: ~−15 (net deletion — remove constant, setInterval idle check, the `done` flag plumbing may also simplify)
|
||||
|
||||
**Next step**: when implementing item 1's changes (or standalone), verify SDK behavior with stream-open-indefinite, then delete IDLE_END_MS block. Watch for any test assertions on it.
|
||||
|
||||
### 7. Container streaming output (marker-based pre-delivery)
|
||||
**Finding**: v1's `---NANOCLAW_OUTPUT_START/END---` markers enabled pre-completion delivery. v2's two paths (final-result `dispatchResultText` + mid-turn `send_message` MCP tool) both write to outbound.db; host polls every `ACTIVE_POLL_MS = 1000ms`.
|
||||
|
||||
**Status**: dropped — not a regression
|
||||
|
||||
**Decision**: v2's `send_message` MCP tool is the correct replacement for v1's marker-based streaming. Latency is ≤1s (poll interval), which is fine for chat UX.
|
||||
|
||||
**Rationale**: v1's marker model required the agent and host to share a fragile state machine over stdout. v2 uses explicit tool calls and a DB surface — cleaner architecture, comparable latency, and control stays with the agent. If perceived latency ever becomes a real complaint, tune `ACTIVE_POLL_MS` down (500ms / 250ms) — low-cost knob.
|
||||
|
||||
**Next step**: none.
|
||||
|
||||
### 8. Per-exit container log files
|
||||
**Finding**: v1 wrote timestamped per-exit logs with full I/O + mounts + stderr. v2: stderr → `log.debug` (invisible at default `LOG_LEVEL=info`), container close → `log.info` with exit code, session DBs preserved on disk. Real gap: stderr on abnormal exit isn't auto-surfaced.
|
||||
|
||||
**Status**: dropped
|
||||
|
||||
**Decision**: skip — no per-exit file restoration, no stderr-on-crash buffer.
|
||||
|
||||
**Rationale**: underlying forensic info is still recoverable (session DBs on disk, heartbeat mtime, exit code in log). `LOG_LEVEL=debug` surfaces stderr when needed. The cost of adding buffered crash-log promotion (~15 LOC) isn't justified by the frequency of post-mortem cases.
|
||||
|
||||
**Next step**: none.
|
||||
|
||||
### 9. Stuck detection + heartbeat-based container lifecycle
|
||||
**Finding**: v2's sweep detects stale heartbeats (10 min) and resets messages with backoff, but doesn't kill the container. The idle timeout is driven by deliveries (30 min since the last `messages_out` row). Together these produce a gap where a stuck container holds resources and blocks new wakes for up to 30 min.
|
||||
|
||||
**Empirical findings from SDK probe** (`container/agent-runner/scripts/sdk-signal-probe.ts`, runs logged in `/tmp/probe-*.jsonl`):
|
||||
- Silent Bash tools (e.g. `sleep 30`) produce 30+ seconds of zero SDK events — heartbeat goes stale during legitimate work
|
||||
- Natural intra-stream silences up to ~12s observed mid-tool-use JSON streaming
|
||||
- `PreToolUse` / `PostToolUse` hook pair is reliable; `PostToolUseFailure` fires on blocked requests
|
||||
- `SubagentStart`/`SubagentStop` and `system/task_started`/`system/task_notification` pairs also reliable
|
||||
- **Pushing a new message mid-active-turn does NOT fire `UserPromptSubmit`** (fires only at start of a new turn, after `result`)
|
||||
- SDK's built-in `AskUserQuestion` doesn't actually block; returns placeholder
|
||||
- Bash tool's declared `timeout` param is visible in `tool_use` input — we can read it container-side
|
||||
- Stuck tools (hook that never resolves) produce indefinite silence — no SDK-side timeout
|
||||
|
||||
**Status**: decided
|
||||
|
||||
**Decision**: replace existing IDLE_TIMEOUT setTimeout + STALE_THRESHOLD=10min combo with message-scoped stuck detection + absolute 30-min ceiling. Reset messages inline when we kill. Blocklist SDK tools that don't fit our async model.
|
||||
|
||||
**Sweep logic** (per active session):
|
||||
|
||||
If container isn't running → reset any `'processing'` rows in processing_ack to `'pending'` + tries++ + backoff. Done.
|
||||
|
||||
If container IS running, apply in order:
|
||||
|
||||
1. **Absolute ceiling**: if `heartbeat_mtime` older than `max(30 min, current_bash_timeout)` → kill + reset any processing to pending + retry++.
|
||||
Rationale: 30 min idle ceiling, extended only if agent is currently inside a Bash tool with longer declared timeout. Agents needing >30 min should use `run_in_background`.
|
||||
|
||||
2. **Message-scoped stuck**: for each `processing_ack` row with status=`'processing'`:
|
||||
- `claim_age = now - status_changed`
|
||||
- `tolerance = max(60s, current_bash_timeout)` if Bash in flight, else `60s`
|
||||
- If `claim_age > tolerance` AND `heartbeat_mtime <= status_changed` → kill + reset this message + retry++
|
||||
|
||||
Semantics: "container claimed a message and went silent for >tolerance since claim."
|
||||
|
||||
No separate idle rule — rule 1 covers it. An idle container hits 30-min stale with no processing rows; kill has nothing to reset.
|
||||
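The sweep rules above can be expressed as a pure decision function. This is a sketch under assumptions: the `SessionSnapshot` shape, the constant names, and `decideSweep` are hypothetical, and all times are epoch milliseconds supplied by the caller. Retry counting and backoff are left out.

```typescript
const ABSOLUTE_CEILING_MS = 30 * 60_000; // 30-min ceiling from rule 1
const CLAIM_TOLERANCE_MS = 60_000; // 60s since claim from rule 2

interface ProcessingRow {
  messageId: string;
  statusChangedMs: number; // when the container claimed the message
}

interface SessionSnapshot {
  nowMs: number;
  containerRunning: boolean;
  heartbeatMtimeMs: number;
  bashTimeoutMs: number | null; // declared timeout of a Bash call in flight, else null
  processing: ProcessingRow[];
}

type SweepDecision =
  | { action: 'none' }
  | { action: 'reset'; messageIds: string[] }
  | { action: 'kill_and_reset'; rule: 'ceiling' | 'stuck'; messageIds: string[] };

function decideSweep(s: SessionSnapshot): SweepDecision {
  const allIds = s.processing.map((r) => r.messageId);
  // Container gone: reset whatever it had claimed.
  if (!s.containerRunning) {
    return allIds.length > 0 ? { action: 'reset', messageIds: allIds } : { action: 'none' };
  }
  const bash = s.bashTimeoutMs ?? 0;
  // Rule 1: absolute ceiling, extended only by a longer in-flight Bash timeout.
  if (s.nowMs - s.heartbeatMtimeMs > Math.max(ABSOLUTE_CEILING_MS, bash)) {
    return { action: 'kill_and_reset', rule: 'ceiling', messageIds: allIds };
  }
  // Rule 2: claimed more than `tolerance` ago and no heartbeat since the claim.
  const tolerance = Math.max(CLAIM_TOLERANCE_MS, bash);
  const stuck = s.processing.filter(
    (r) => s.nowMs - r.statusChangedMs > tolerance && s.heartbeatMtimeMs <= r.statusChangedMs,
  );
  if (stuck.length > 0) {
    return { action: 'kill_and_reset', rule: 'stuck', messageIds: stuck.map((r) => r.messageId) };
  }
  return { action: 'none' };
}
```

In this sketch rule 2 resets only the rows that tripped it; any other `'processing'` rows would be picked up by the container-not-running branch on the next sweep.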
|
||||
**Container state surface** (for Bash timeout tracking):
|
||||
New table in outbound.db (or session_state row):
|
||||
```sql
CREATE TABLE container_state (
  session_id               TEXT PRIMARY KEY,
  current_tool             TEXT,     -- NULL when no tool in flight
  tool_declared_timeout_ms INTEGER,
  tool_started_at          TEXT
);
```
|
||||
Container writes on `PreToolUse` (reads Bash `timeout` from tool input), clears on `PostToolUse` / `PostToolUseFailure`. Host reads in sweep decision.
|
||||
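The container-side writes could be computed by two small functions called from the hook handlers. This is a sketch: the function names are hypothetical, the hook wiring itself is omitted, and it assumes the Bash tool's `timeout` input is a number of milliseconds.

```typescript
// Mirrors the proposed container_state columns.
interface ContainerStateRow {
  session_id: string;
  current_tool: string | null;
  tool_declared_timeout_ms: number | null;
  tool_started_at: string | null;
}

// Hypothetical: call from the PreToolUse handler with the tool name and input.
function stateOnPreToolUse(
  sessionId: string,
  toolName: string,
  toolInput: Record<string, unknown>,
  now: Date,
): ContainerStateRow {
  const declared = toolInput['timeout'];
  return {
    session_id: sessionId,
    current_tool: toolName,
    // Only Bash timeouts are tracked; assumed to be milliseconds.
    tool_declared_timeout_ms:
      toolName === 'Bash' && typeof declared === 'number' ? declared : null,
    tool_started_at: now.toISOString(),
  };
}

// Hypothetical: call from PostToolUse / PostToolUseFailure to clear the row.
function stateOnPostToolUse(sessionId: string): ContainerStateRow {
  return {
    session_id: sessionId,
    current_tool: null,
    tool_declared_timeout_ms: null,
    tool_started_at: null,
  };
}
```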
|
||||
**Tool blocklist** (initial):
|
||||
- `AskUserQuestion` — SDK built-in; we have our own DB-backed MCP version
|
||||
- `EnterPlanMode` / `ExitPlanMode` — Claude Code UI only
|
||||
- `EnterWorktree` / `ExitWorktree` — Claude Code UI only
|
||||
|
||||
Enforcement:
|
||||
- Pass `disallowedTools: [...]` to `query()` options — agent never sees them in its tool list
|
||||
- `PreToolUse` hook guard (defense-in-depth): if a blocklisted tool name somehow fires, immediately reset the current message + kill (treat as stuck)
|
||||
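The defense-in-depth guard can be sketched as a lookup against the blocklist. The `BLOCKED_TOOLS` constant and `guardToolUse` function are hypothetical names; the same array would also be the value passed as `disallowedTools`, as described above.

```typescript
// Initial blocklist from this section.
const BLOCKED_TOOLS: readonly string[] = [
  'AskUserQuestion',
  'EnterPlanMode',
  'ExitPlanMode',
  'EnterWorktree',
  'ExitWorktree',
];

type GuardVerdict = { allow: true } | { allow: false; reason: string };

// Hypothetical guard for the PreToolUse hook: a blocklisted tool that
// somehow fires is reported so the caller can reset the message and kill.
function guardToolUse(toolName: string): GuardVerdict {
  if (BLOCKED_TOOLS.indexOf(toolName) !== -1) {
    return { allow: false, reason: `blocklisted tool fired: ${toolName}` };
  }
  return { allow: true };
}
```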
|
||||
**Kill old machinery**:
|
||||
- Remove `setTimeout` + `resetIdle` plumbing in `container-runner.ts:128-140`
|
||||
- Remove `resetContainerIdleTimer` export + its caller in `delivery.ts:26`
|
||||
- Remove `IDLE_END_MS = 20_000` in `poll-loop.ts:11` (item 6a decision) — stream stays open as long as container alive
|
||||
- Existing `detectStaleContainers` logic merges into the new sweep rules; the heartbeat-stale-10-min path disappears
|
||||
|
||||
**LOC estimate**: ~115
|
||||
- New sweep decision logic replacing existing detectStaleContainers + IDLE_TIMEOUT path: 50
|
||||
- Container state table + PreToolUse/PostToolUse write, host read: 25
|
||||
- Tool blocklist (disallowedTools + hook guard): 15
|
||||
- Deletions (IDLE_TIMEOUT setTimeout, IDLE_END_MS): −25
|
||||
- Tests (kill paths, Bash-timeout grace, blocklist hit): 50
|
||||
|
||||
**Why this converged here** (rationale summary):
|
||||
- Empirical data showed we can't reliably tell stuck from legitimate-silent-work without state. Bash-declared-timeout is the cleanest per-tool signal available.
|
||||
- 60s-since-claim is tight enough for normal work (WebSearch/WebFetch finish in ~8s) but generous enough for reasonable delays. Exception for Bash covers agents running scripts with user-declared timeouts.
|
||||
- 30-min absolute ceiling prevents infinitely-stuck containers; agents needing longer have `run_in_background`.
|
||||
- Pushing messages can't serve as a liveness probe (they're silent mid-turn), so detection is state-driven, not push-driven.
|
||||
- Blocklist prevents a whole class of "SDK tool designed for interactive UI" footguns that would appear stuck in our async model.
|
||||
|
||||
**Next step**: implement as a focused PR. Order: (a) tool blocklist — safe to ship alone, (b) container state table + PreToolUse writes, (c) sweep rewrite + message reset, (d) delete old IDLE_TIMEOUT + IDLE_END_MS machinery, (e) tests.
|
||||
|
||||
### 10. Host-level retry with backoff on agent error
|
||||
**Finding**: v1 had `MAX_RETRIES=5` plus exponential backoff on `processGroupMessages` failure. v2's equivalent is now covered by item 9's sweep logic: whenever the container isn't running and `'processing'` rows are present, they are reset to pending with backoff + retry++.
|
||||
|
||||
**Status**: folded into item 9
|
||||
|
||||
**Decision**: no separate action. Agent-error retry happens via container-exit → sweep reset. Container errors also surface via provider-side session invalidation check (`poll-loop.ts:200-211` — `provider.isSessionInvalid(err)` → clears stored session id → fresh retry). Both paths preserved.
|
||||
|
||||
**Next step**: none.
|
||||
|
||||
---
|
||||
|
||||
### 11. Process ID in logger output
|
||||
**Finding**: v1 emitted `(${process.pid})` after the level tag. v2 dropped it.
|
||||
|
||||
**Status**: dropped
|
||||
|
||||
**Decision**: don't restore. Host is single-process (PID is constant). Container stderr already gets tagged with `{ container: agentGroup.folder }` at `container-runner.ts:121`, which is more informative than a PID.
|
||||
|
||||
**Next step**: none.
|
||||
|
||||
---
|
||||
|
||||
## LOW
|
||||
|
||||
### 12. Task dedup via unique `(kind, series_id)` index
|
||||
**Finding**: verified — `messages_in.series_id` column exists with a non-unique index. Concern was theoretical: two pending rows with same series could coexist.
|
||||
|
||||
**Status**: dropped
|
||||
|
||||
**Decision**: not a real issue. Recurrence logic at `src/modules/scheduling/recurrence.ts` is structurally dedup-safe: only `completed` rows with `recurrence` get cloned, and after cloning `recurrence` is cleared on the original so it can't re-clone. Plus container's atomic `markProcessing` prevents double-execution at claim time.
|
||||
|
||||
**Next step**: none.
|
||||
|
||||
### 13. Silent-drop mode for noisy senders
|
||||
**Finding**: v1's `mode:'drop'` let you ignore specific users without logging. v2 only has binary allow/deny via access gate.
|
||||
|
||||
**Status**: dropped — won't implement
|
||||
|
||||
**Decision**: not worth the table + gate complexity for a personal-assistant scale. If a specific sender becomes a problem, admin can switch the messaging_group's `unknown_sender_policy` to `'strict'` or remove the sender from `agent_group_members`.
|
||||
|
||||
**Next step**: none.
|
||||
|
||||
### 14. Remote control subsystem
|
||||
**Finding**: v1's `/remote-control` command spawned `claude remote-control` CLI detached, polled stdout for session URL, persisted PID/URL state. Entirely gone in v2.
|
||||
|
||||
**Status**: deferred — opt-in skill when needed
|
||||
|
||||
**Decision**: reintroduce as an opt-in install skill (e.g. `/add-remote-control`), not on trunk. Provider-specific: only works with `claude` provider (Claude Agent SDK); not supported by OpenCode or other providers. Skill should check `agent_group.provider` at install time and bail gracefully with an error message if not `'claude'`.
|
||||
|
||||
**Rationale**: niche feature valuable only for direct agent SDK attachment during dev/debugging. Keeping it off trunk matches v2's "infra-only trunk, features-via-skills" philosophy. Also avoids carrying code for a feature that simply doesn't exist in non-Claude providers.
|
||||
|
||||
**Next step**: none until someone needs it. When implementing, likely lives on the `providers` branch (since it's provider-specific) or its own branch, installed via skill that copies files + checks provider.
|
||||
|
||||
### 15. Dead config constants
|
||||
**Finding**: verified — `POLL_INTERVAL` (line 13), `SCHEDULER_POLL_INTERVAL` (line 14), and `IPC_POLL_INTERVAL` (line 32) in `src/config.ts` have zero imports elsewhere in v2. Container's `POLL_INTERVAL_MS` in `poll-loop.ts` is a distinct local constant, unrelated.
|
||||
|
||||
**Status**: decided — delete
|
||||
|
||||
**Decision**: remove the three constants from `src/config.ts`. Trivial 3-line deletion.
|
||||
|
||||
**Next step**: do as part of any sweep-touching PR, or standalone.
|
||||
|
||||
### 16. Configurable retention thresholds
|
||||
**Finding**: `STALE_THRESHOLD_MS` (10 min) and `MAX_TRIES` (5) in `host-sweep.ts` are hardcoded. Item 9's redesign replaces `STALE_THRESHOLD_MS` with new constants (60s claim-age, 30 min ceiling).
|
||||
|
||||
**Status**: dropped — keep as constants
|
||||
|
||||
**Decision**: leave the new item-9 thresholds + `MAX_TRIES` as source constants. Adding config surface for them isn't worth it at personal-assistant scale. If operational tuning ever becomes a real need, revisit — they're small centralized constants, one-line change each.
|
||||
|
||||
**Next step**: none.
|
||||
|
||||
### 17. Dynamic group-add (IPC watcher equivalent)
|
||||
**Finding**: not actually a restart requirement — investigation showed:
|
||||
- Router reads `messaging_groups` + `messaging_group_agents` fresh per inbound (dynamic by design)
|
||||
- Chat SDK bridge has a `conversations: Map<platformId, ConversationConfig>` populated at setup + `updateConversations()` method
|
||||
- **Nothing in the bridge currently reads the map**, and no code calls `updateConversations()` after startup
|
||||
- Today: stale map has no observable effect (dead state)
|
||||
- After item 1 ships (adapter-level gating): stale map would matter; new wirings wouldn't apply in the adapter gate until restart
|
||||
|
||||
**Status**: deferred — comment added now, implement alongside dynamic group registration feature
|
||||
|
||||
**Decision**: don't refactor the adapter interface now. Added a NOTE comment at `src/channels/chat-sdk-bridge.ts:73` flagging the staleness issue so the next person touching the bridge or adding dynamic-registration sees it. When dynamic group registration is implemented (admin adds a new messaging_group_agents row while host is running), handle cache refresh then — most likely by calling `adapter.updateConversations(freshConfigs)` after the mutation, keyed off the adapter's `channelType`.
|
||||
|
||||
**Rationale**: item 1's initial landing can keep the adapter gating responsibilities small or skip adapter-side gating entirely. Refactoring ConversationConfig now would add scope; better to ship item 1 first, see if over-subscription bites, address if it does.
|
||||
|
||||
**Next step**: when building the admin-skill path for adding messaging_group ↔ agent_group wirings, include a `refreshAdapterConversations(channelType)` call after the INSERT. ~10 LOC when needed.
|
||||
|
||||
---
|
||||
|
||||
## Test regressions (v1 `formatting.test.ts` assertions)
|
||||
|
||||
### 18+19+20+21. Timezone + formatting recreation (merged)
|
||||
**Finding**: v1 had a full timezone-aware formatting pipeline. v2 lost most of it, producing real bugs where the agent misinterprets user intent (scheduling for wrong times, suggesting time-inappropriate things).
|
||||
|
||||
**Scope** — recreate v1 behavior faithfully wherever times touch the agent:
|
||||
- Timestamp formatting on inbound messages: `formatLocalTime(utcIso, TIMEZONE)` producing "Jan 1, 2024, 1:30 PM" format via `Intl.DateTimeFormat('en-US', {...})` (v1 `timezone.ts`)
|
||||
- `<context timezone="<IANA_NAME>" />` header prepended to message block (v1 `router.ts:20-22`)
|
||||
- Reply-to with message ID: `<message ... reply_to="<id>">...<quoted_message from="...">...</quoted_message></message>` (v1 `router.ts:10-18`)
|
||||
- `stripInternalTags()`: regex `/<internal>[\s\S]*?<\/internal>/g` applied to outbound text, then `.trim()` (v1 `router.ts:25-27`)
|
||||
- Cron expressions parsed with explicit user TZ: `CronExpressionParser.parse(expr, { tz: TIMEZONE })` (v1 `task-scheduler.ts:20-49`)
|
||||
- User-specified times normalized via the user's TZ: in v1 this was the host-side task scheduler; in v2 it's the new-in-v2 scheduling MCP tool (`mcp-tools/scheduling.ts`). Same principle — accept user-local times, normalize to UTC for storage, interpret cron in user's TZ.
|
||||
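Three of the pieces listed above are small enough to sketch directly. This is not the v1 code: the `Intl.DateTimeFormat` options are an assumption chosen to produce the "Jan 1, 2024, 1:30 PM" style (the original options are elided in this document), and `contextHeader` is a hypothetical helper name. The `stripInternalTags` regex is the one quoted above.

```typescript
// Format a UTC ISO timestamp in the user's IANA timezone.
// The option set is an assumption; v1's exact options are not shown here.
function formatLocalTime(utcIso: string, timeZone: string): string {
  return new Intl.DateTimeFormat('en-US', {
    timeZone,
    year: 'numeric',
    month: 'short',
    day: 'numeric',
    hour: 'numeric',
    minute: '2-digit',
    hour12: true,
  }).format(new Date(utcIso));
}

// Hypothetical helper for the header prepended to the message block.
function contextHeader(timeZone: string): string {
  return `<context timezone="${timeZone}" />\n`;
}

// Remove <internal>...</internal> spans from outbound text, then trim.
function stripInternalTags(text: string): string {
  return text.replace(/<internal>[\s\S]*?<\/internal>/g, '').trim();
}
```

Some runtimes emit a narrow no-break space before `AM`/`PM`, so tests that compare formatted strings may want to normalize whitespace first.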
|
||||
**Status**: decided — recreate with tests
|
||||
|
||||
**Decision**: port v1's formatter + timezone behavior faithfully. Full recreation spec at [`timezone-formatting-v1-recreation.md`](timezone-formatting-v1-recreation.md) — includes exact v1 code, line numbers at commit `27c5220`, complete test inventory from `src/v1/formatting.test.ts` and `src/v1/task-scheduler.test.ts`.
|
||||
|
||||
**Core principle** (per Gavriel): the agent operates in the user's timezone. Every timestamp the agent sees is user-local. Every time the agent outputs is interpreted as user-local. This is load-bearing for correctness, not a nice-to-have.
|
||||
|
||||
**Porting plan** (from recreation spec):
|
||||
1. `container/agent-runner/src/formatter.ts` — replace `formatTime` with `formatLocalTime(ts, TIMEZONE)` call; add reply_to attribute + `<quoted_message>` element exactly as v1
|
||||
2. Prepend `<context timezone="<IANA>" />\n` to the messages block at formatter entry
|
||||
3. Extract `stripInternalTags` as a named function; apply in outbound dispatch path (`poll-loop.ts:389` currently uses inline regex)
|
||||
4. `container/agent-runner/src/mcp-tools/scheduling.ts` — clarify `processAfter` description, normalize to UTC ISO in handler
|
||||
5. `src/modules/scheduling/recurrence.ts` — pass `{ tz: TIMEZONE }` to `CronExpressionParser.parse()` explicitly
|
||||
6. Port all test cases from v1's `formatting.test.ts` and `task-scheduler.test.ts` to v2's test tree
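Step 3 of the plan can be sketched as follows. The regex and the trim-after-strip order come from the v1 description above; the constant name is an assumption.

```typescript
// Regex taken from the v1 description; non-greedy so multiple
// <internal> blocks in one message are removed independently.
const INTERNAL_TAG_RE = /<internal>[\s\S]*?<\/internal>/g;

// Named replacement for the inline regex in the outbound dispatch path.
function stripInternalTags(text: string): string {
  return text.replace(INTERNAL_TAG_RE, '').trim();
}
```

Keeping this as a named export makes it possible to port the v1 tag-stripping tests in isolation.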
|
||||
|
||||
**LOC estimate**: ~75 prod + ~120 tests (reproducing v1's 40+ test cases)
|
||||
|
||||
**Next step**: implement as a focused PR. Order: (a) formatter changes + tests, (b) context header + tests, (c) reply_to + tests, (d) stripInternalTags extraction + tests, (e) scheduling tool + cron TZ + tests.
|
||||
|
||||
### 19, 20, 21 — merged into 18 above
|
||||
See item 18 for the full recreation plan and spec reference.
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
- `src/v1/` was deleted upstream (commit 86becf8) after this analysis was written. v2 tree has since had a major module extraction (approvals, interactive, scheduling, permissions, agent-to-agent, self-mod) and a new CLI channel. **Verify each item against the current tree before deciding** — some may already be addressed.
|
||||
@@ -1,146 +0,0 @@
|
||||
# v1 → v2 Deep Dive: Aggregate Summary
|
||||
|
||||
Per-file deep-dives were produced for every file in `src/v1/` and `container/agent-runner/src/v1/`. This document aggregates findings across all 21 modules.
|
||||
|
||||
## Per-file docs
|
||||
|
||||
| Topic | File | v1 source(s) |
|
||||
|---|---|---|
|
||||
| Configuration | [config.md](config.md) | `src/v1/config.ts` |
|
||||
| Environment helpers | [env.md](env.md) | `src/v1/env.ts` |
|
||||
| Types | [types.md](types.md) | `src/v1/types.ts` |
|
||||
| Logger | [logger.md](logger.md) | `src/v1/logger.ts` |
|
||||
| Timezone | [timezone.md](timezone.md) | `src/v1/timezone.ts` |
|
||||
| Database layer | [db.md](db.md) | `src/v1/db.ts` |
|
||||
| Container runner | [container-runner.md](container-runner.md) | `src/v1/container-runner.ts` |
|
||||
| Container runtime + mounts | [container-runtime.md](container-runtime.md) | `src/v1/container-runtime.ts`, `mount-security.ts` |
|
||||
| Group folder | [group-folder.md](group-folder.md) | `src/v1/group-folder.ts` |
|
||||
| Group queue | [group-queue.md](group-queue.md) | `src/v1/group-queue.ts` |
|
||||
| Host index | [index-host.md](index-host.md) | `src/v1/index.ts` |
|
||||
| IPC (host + container) | [ipc.md](ipc.md) | `src/v1/ipc.ts`, `container/.../v1/ipc-mcp-stdio.ts` |
|
||||
| Remote control | [remote-control.md](remote-control.md) | `src/v1/remote-control.ts` |
|
||||
| Router | [router.md](router.md) | `src/v1/router.ts` + `index.ts` routing |
|
||||
| Sender allowlist | [sender-allowlist.md](sender-allowlist.md) | `src/v1/sender-allowlist.ts` |
|
||||
| Session cleanup | [session-cleanup.md](session-cleanup.md) | `src/v1/session-cleanup.ts` |
|
||||
| Task scheduler | [task-scheduler.md](task-scheduler.md) | `src/v1/task-scheduler.ts` |
|
||||
| Channels | [channels.md](channels.md) | `src/v1/channels/*` |
|
||||
| Agent-runner entry | [container-index.md](container-index.md) | `container/.../v1/index.ts` |
|
||||
| Agent-runner MCP tools | [container-mcp-tools.md](container-mcp-tools.md) | `container/.../v1/mcp-tools.ts` |
|
||||
| Formatting test (orphan) | [formatting-test.md](formatting-test.md) | `src/v1/formatting.test.ts` |
|
||||
|
||||
## The big shift
|
||||
|
||||
v2 rewrote the fundamental transport between host and container. The one-line version:
|
||||
|
||||
> **v1 = IPC files + stdin/stdout + in-memory GroupQueue + polling message loop.
|
||||
> v2 = two SQLite DBs per session + event-driven routing + 60s host sweep.**
|
||||
|
||||
Everything else flows from that. Removing IPC forced a rewrite of the router, the container-runner, the agent-runner entry, and the MCP-tool bridge. The 60s sweep absorbed the task scheduler, session cleanup, and pending-message recovery. The entity model (users/roles/messaging_groups) replaced the flat sender allowlist and chat-level config. Provider abstraction + Chat SDK bridge replaced hardcoded Claude SDK + per-channel adapters.
|
||||
|
||||
Net LOC: v1 (~7.4k host + monolithic container-runner) → v2 (~5.5k host, split modules). Fewer lines, cleaner boundaries, more coverage.
|
||||
|
||||
## What's kept (identical or near-identical)
|
||||
- `timezone.ts` — byte-identical
|
||||
- `group-folder.ts` — byte-identical validation; v2 adds `group-init.ts` for filesystem scaffold
|
||||
- `container-runtime.ts` — nearly identical (only logger import swapped)
|
||||
- `mount-security.ts` — same structure, one field removed (see regressions)
|
||||
- `config.ts` / `env.ts` — same structure, same `.env` surface; several constants now dead code
|
||||
- `logger.ts` — same levels/colors/routing, but API shape changed (message-first instead of data-first)
|
||||
- MCP `send_message` tool — kept + enhanced with named destinations
|
||||
|
||||
## What's new in v2
|
||||
- **Two-DB session model** (`inbound.db` + `outbound.db`) with even/odd seq parity, journal_mode=DELETE for cross-mount visibility
|
||||
- **Entity model** — `users`, `user_roles` (owner/admin/scoped), `agent_group_members`, `messaging_groups`, `messaging_group_agents`, `user_dms` (cold-DM cache)
|
||||
- **Host sweep** (60s) — absorbs scheduler, cleanup, pending-message recovery, recurrence firing, stale detection, orphan cleanup
|
||||
- **Chat SDK bridge** — unifies Discord/Slack/Teams/other adapters through `@anthropic-ai/chat`
|
||||
- **Provider abstraction** — default Claude + opt-in OpenCode etc. via `providers` branch
|
||||
- **OneCLI integration** — credential gateway + approval flow (`src/onecli-approvals.ts`)
|
||||
- **16 new MCP tools** — scheduling (6), interactive (2), self-mod (3), agent mgmt (1), message manipulation (3), plus enhanced `send_message`
|
||||
- **Heartbeat file mtime** — replaces IPC liveness
|
||||
- **Session persistence** — session ID survives container restarts
|
||||
- **Dual-rate polling** — 1000ms idle / 500ms active inside container
|
||||
- **Idle stream termination** — 20s timeout prevents zombie queries
|
||||
- **Processing ACK** — reverse channel (outbound → inbound) for idempotence
|
||||
- **Migration system** — 9 numbered migrations vs v1's ad-hoc ALTERs
|
||||
- **Webhook server** (new for HTTP-based channels)
|
||||
- **Container typing indicator refresh** via delivery
|
||||
|
||||
## What's removed (deliberately)
|
||||
- **IPC transport** (files, stdin/stdout JSON, MCP-over-stdio bridge) — replaced by DB polling
|
||||
- **`GroupQueue`** in-memory state machine — serialization via `messages_in.status`
|
||||
- **Output markers** (`---NANOCLAW_OUTPUT_START/END---`) — results land in `messages_out`
|
||||
- **State persistence** (`router_state`, `lastAgentTimestamp` map) — each message is independent
|
||||
- **Per-exit container log files** — only logger.debug to host log
|
||||
- **Flat sender allowlist** (JSON config) — replaced by role-based access + `unknown_sender_policy`
|
||||
- **Remote control subsystem** (`/remote-control` command → spawned CLI)
|
||||
- **IPC watcher** (dynamic group-add while running)
|
||||
- **`task_runs` audit table** — no task execution log
|
||||
- **Cron/interval task types** as first-class entities — tasks are `messages_in` rows with `kind='task'` + `recurrence`
|
||||
- **Stdin protocol** for agent input — container reads from inbound.db
|
||||
|
||||
## Regressions worth fixing (ranked)
|
||||
|
||||
### HIGH priority
|
||||
1. **Trigger-rule matching in `pickAgent`** (`src/router.ts:198` TODO).
|
||||
Without this, a messaging group wired to multiple agents fires ALL of them on every message. Schema (`messaging_group_agents.trigger_rules`) is ready; the check is ~10 lines. **Likely broken-by-default for multi-agent setups.**
|
||||
|
||||
2. **`nonMainReadOnly` mount isolation removed** (`src/mount-security.ts`).
|
||||
Non-main/shared agent groups can now mount read-write on any path the allowlist permits. v1 enforced read-only-for-non-main regardless of allowlist. **Security regression** for multi-tenant setups. Restore: add field + restore `isMain` param flow.
|
||||
|
||||
3. **Pending-message recovery on startup** (`src/v1/index.ts:465-473`).
|
||||
v1 explicitly scanned for unprocessed messages on restart. v2 relies on the sweep to notice. Likely works in practice, but worth a test: kill container mid-message, restart host, verify redelivery within 5s.
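For item 1 in this list, the missing check might look roughly like the sketch below. The real format of `messaging_group_agents.trigger_rules` is not shown in this document, so `TriggerRules`, `AgentWiring`, and `pickAgents` are hypothetical shapes for illustration only, as is the "no rules means always fire" default.

```typescript
// Hypothetical shapes: the actual trigger_rules column format is unknown here.
interface TriggerRules {
  patterns: string[]; // regex sources, matched case-insensitively
}

interface AgentWiring {
  agentGroupId: string;
  triggerRules: TriggerRules | null;
}

// Returns only the wirings whose rules match the message text.
function pickAgents(wirings: AgentWiring[], text: string): AgentWiring[] {
  return wirings.filter((w) => {
    // Assumption: a wiring without rules receives every message.
    if (!w.triggerRules || w.triggerRules.patterns.length === 0) return true;
    return w.triggerRules.patterns.some((p) => new RegExp(p, 'i').test(text));
  });
}
```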
|
||||
|
||||
### MEDIUM priority
|
||||
4. **`response_scope` enforcement** (`messaging_group_agents.response_scope` stored but unused).
|
||||
Values `'all' | 'triggered' | 'allowlisted'` are saved but nothing reads them.
|
||||
|
||||
5. **`request_approval` flow for unknown senders** (`src/router.ts:295` TODO).
|
||||
`unknown_sender_policy='request_approval'` is scaffolded but doesn't actually produce an approval card.
|
||||
|
||||
6. **Per-group container timeout**.
|
||||
v1's `containerConfig.timeout` override is gone; all groups share `IDLE_TIMEOUT`. Slow-but-healthy agents get killed with fast agents' timeout.
|
||||
|
||||
7. **Container streaming output**.
|
||||
v1's marker-based pre-completion delivery is gone. v2 must wait for outbound.db poll. Latency-sensitive UX regresses.
|
||||
|
||||
8. **Per-exit container logs**.
|
||||
v1 wrote timestamped per-exit log files with full I/O + mounts + stderr. v2 only has logger.debug. Zero-cost on success, high-value on crash. Restore at least for non-zero exit.
|
||||
|
||||
9. **Explicit container kill on stale detection**.
|
||||
v2's sweep marks messages for retry but doesn't stop the stale container. Only `cleanupOrphans()` at startup removes them. Add `stopContainer()` when heartbeat stale AND processing stuck.
|
||||
|
||||
10. **Host-level retry with backoff on agent error**.
|
||||
v1 had MAX_RETRIES=5 + exponential backoff on `processGroupMessages` failure. v2 only retries on stale-heartbeat. Explicit agent-error retry could close the gap.
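Item 10 could be restored with something like the sketch below. `MAX_RETRIES = 5` comes from v1; the base delay, the doubling factor, and treating the constant as the total attempt count are assumptions. `sleep` is injected so the loop can be tested without real timers.

```typescript
const MAX_RETRIES = 5; // from v1
const BASE_DELAY_MS = 5_000; // assumption: v1's actual base delay is not stated here

// Delay before the next attempt: 5s, 10s, 20s, ... (doubling is an assumption).
function backoffDelayMs(attempt: number): number {
  return BASE_DELAY_MS * 2 ** (attempt - 1);
}

// Runs `run` up to MAX_RETRIES times, sleeping between failed attempts.
async function withRetry<T>(
  run: () => Promise<T>,
  sleep: (ms: number) => Promise<void>,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
    try {
      return await run();
    } catch (err) {
      lastError = err;
      // No sleep after the final attempt; fall through and rethrow.
      if (attempt < MAX_RETRIES) await sleep(backoffDelayMs(attempt));
    }
  }
  throw lastError;
}
```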
|
||||
|
||||
### LOW priority
|
||||
11. **Process ID in logger output** — lost multi-process debugging info
|
||||
12. **Task dedup via unique `(kind, series_id)` index** — v2 can have two pending rows with same series; best-effort via atomic status update
|
||||
13. **Silent-drop mode for noisy senders** — v1's `mode:'drop'` had a use case; orthogonal to privilege
|
||||
14. **Remote control** — decide: restore as opt-in skill or document as removed
|
||||
15. **Dead config constants** (`POLL_INTERVAL`, `SCHEDULER_POLL_INTERVAL`, `IPC_POLL_INTERVAL`) — delete from `src/config.ts`
|
||||
16. **Configurable retention thresholds** (`STALE_THRESHOLD_MS`, `MAX_TRIES`) — move from constants to `config.ts`
|
||||
17. **Dynamic group-add** (IPC watcher equivalent) — probably not worth; document that restart is required
|
||||
|
||||
## Things kept as test-only regression risk
|
||||
The orphan `src/v1/formatting.test.ts` asserted behaviors that aren't fully exercised in v2:
|
||||
- **Timezone-aware formatted timestamps** — v1 emitted locale strings ("Jan 1, 2024, 1:30 PM"); v2 emits UTC HH:MM
|
||||
- **`<context timezone="..."/>` header** — gone
|
||||
- **`reply_to="<id>"` attribute** — v2 only stores sender name + truncated preview
|
||||
- **Trigger-pattern unit tests** — no direct equivalent (logic moved to DB but isn't tested at the router level)
|
||||
- **Internal tag stripping** tests — no isolated tests in agent-runner
|
||||
|
||||
These are specs worth porting to v2 tests once trigger matching is implemented.
|
||||
|
||||
## Files entirely gone in v2
|
||||
- `src/v1/ipc.ts` + `src/v1/ipc-auth.test.ts` — IPC is dead
|
||||
- `container/.../v1/ipc-mcp-stdio.ts` — MCP-over-stdio bridge dead
|
||||
- `src/v1/group-queue.ts` — serialization via DB
|
||||
- `src/v1/session-cleanup.ts` — merged into `host-sweep.ts`
|
||||
- `src/v1/task-scheduler.ts` — merged into `host-sweep.ts` + system actions in `delivery.ts`
|
||||
- `src/v1/remote-control.ts` — feature removed
|
||||
- `src/v1/sender-allowlist.ts` — entity model supersedes
|
||||
|
||||
## Net architectural assessment
|
||||
v2 is strictly simpler, more consistent, and more robust in its happy path. The remaining TODOs (trigger matching, response_scope, request_approval) reflect scaffolding that was checked in ahead of the feature — none are deep design issues. The one actual regression is `nonMainReadOnly` mount isolation; it was a defense-in-depth feature and deserves to come back. The removals of per-exit container logs and streaming output markers are judgment calls that traded observability for simplicity — both can be restored cheaply if needed.
|
||||
|
||||
No file in v1 contains a behavior that v2 is architecturally unable to express. The outstanding work is feature-completion, not architecture.
|
||||
@@ -1,305 +0,0 @@
|
||||
# channels: v1 vs v2
|
||||
|
||||
## Scope
|
||||
|
||||
### v1
|
||||
- **Paths**: `src/v1/channels/index.ts`, `src/v1/channels/registry.ts`, `src/v1/channels/registry.test.ts`
|
||||
- **LOC**: 62 total (1 + 23 + 38)
|
||||
- **Purpose**: Registry and interface stubs for external channel adapters (real adapters live on `channels` branch)
|
||||
|
||||
### v2 counterparts
|
||||
- **Paths**: `src/channels/adapter.ts`, `src/channels/channel-registry.ts`, `src/channels/chat-sdk-bridge.ts`, `src/channels/index.ts`, `src/channels/ask-question.ts`, and tests
|
||||
- **LOC**: 1,055 total (excluding tests: ~757)
|
||||
- **Purpose**: Full adapter interface, registry with lifecycle, Chat SDK bridge (new in v2), ask_question normalization, plus integration tests
|
||||
|
||||
---
|
||||
|
||||
## Adapter Interface Diff
|
||||
|
||||
### v1: `Channel` (from src/v1/types.ts:87–98)
|
||||
|
||||
```typescript
|
||||
export interface Channel {
|
||||
name: string;
|
||||
connect(): Promise<void>;
|
||||
sendMessage(jid: string, text: string): Promise<void>;
|
||||
isConnected(): boolean;
|
||||
ownsJid(jid: string): boolean;
|
||||
disconnect(): Promise<void>;
|
||||
setTyping?(jid: string, isTyping: boolean): Promise<void>; // Optional
|
||||
syncGroups?(force: boolean): Promise<void>; // Optional
|
||||
}
|
||||
```
|
||||
|
||||
**Callbacks** (src/v1/types.ts:101–112):
|
||||
- `OnInboundMessage(chatJid: string, message: NewMessage): void`
|
||||
- `OnChatMetadata(chatJid: string, timestamp: string, name?: string, channel?: string, isGroup?: boolean): void`
|
||||
|
||||
**Factory & Registration** (src/v1/channels/registry.ts:3–23):
|
||||
```typescript
|
||||
export interface ChannelOpts {
|
||||
onMessage: OnInboundMessage;
|
||||
onChatMetadata: OnChatMetadata;
|
||||
registeredGroups: () => Record<string, RegisteredGroup>;
|
||||
}
|
||||
export type ChannelFactory = (opts: ChannelOpts) => Channel | null;
|
||||
export declare function registerChannel(name: string, factory: ChannelFactory): void;
export declare function getChannelFactory(name: string): ChannelFactory | undefined;
export declare function getRegisteredChannelNames(): string[];
|
||||
```
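A minimal sketch of how a registry with this surface can be built, assuming a plain `Map` keyed by channel name and overwrite-on-reregister semantics (the v1 test list mentions an overwrite case, but the exact behavior is not shown here). `Channel` and `ChannelFactory` are reduced to hypothetical stand-ins.

```typescript
// Hypothetical stand-ins for the real v1 types.
interface Channel {
  name: string;
}
type ChannelFactory = (opts: unknown) => Channel | null;

const factories = new Map<string, ChannelFactory>();

function registerChannel(name: string, factory: ChannelFactory): void {
  // Assumption: registering an existing name replaces the old factory.
  factories.set(name, factory);
}

function getChannelFactory(name: string): ChannelFactory | undefined {
  return factories.get(name);
}

function getRegisteredChannelNames(): string[] {
  return Array.from(factories.keys());
}
```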
|
||||
|
||||
---
|
||||
|
||||
### v2: `ChannelAdapter` (from src/channels/adapter.ts:61–106)
|
||||
|
||||
```typescript
|
||||
export interface ChannelAdapter {
|
||||
name: string;
|
||||
channelType: string;
|
||||
supportsThreads: boolean; // NEW: declares thread model
|
||||
|
||||
// Lifecycle (was: connect/disconnect)
|
||||
setup(config: ChannelSetup): Promise<void>;
|
||||
teardown(): Promise<void>;
|
||||
isConnected(): boolean;
|
||||
|
||||
// Message delivery (was: sendMessage, now structured)
|
||||
deliver(platformId: string, threadId: string | null, message: OutboundMessage): Promise<string | undefined>;
|
||||
|
||||
// Optional
|
||||
setTyping?(platformId: string, threadId: string | null): Promise<void>;
|
||||
syncConversations?(): Promise<ConversationInfo[]>;
|
||||
updateConversations?(conversations: ConversationConfig[]): void;
|
||||
openDM?(userHandle: string): Promise<string>; // NEW: cold-DM initiation
|
||||
}
|
||||
```
|
||||
|
||||
**Callbacks** (src/channels/adapter.ts:18–30):
|
||||
```typescript
|
||||
export interface ChannelSetup {
|
||||
conversations: ConversationConfig[];
|
||||
onInbound(platformId: string, threadId: string | null, message: InboundMessage): void | Promise<void>;
|
||||
onMetadata(platformId: string, name?: string, isGroup?: boolean): void;
|
||||
onAction(questionId: string, selectedOption: string, userId: string): void; // NEW
|
||||
}
|
||||
```
|
||||
|
||||
**Factory & Registration** (src/channels/channel-registry.ts:25–47):
|
||||
```typescript
|
||||
export type ChannelAdapterFactory = () => ChannelAdapter | Promise<ChannelAdapter> | null;
|
||||
export interface ChannelRegistration {
|
||||
factory: ChannelAdapterFactory;
|
||||
containerConfig?: { mounts?: [...]; env?: Record<string, string>; };
|
||||
}
|
||||
export declare function registerChannelAdapter(name: string, registration: ChannelRegistration): void;
export declare function getChannelAdapter(channelType: string): ChannelAdapter | undefined; // RENAMED
export declare function getActiveAdapters(): ChannelAdapter[]; // NEW
export declare function getRegisteredChannelNames(): string[];
export declare function getChannelContainerConfig(name: string): ChannelRegistration['containerConfig']; // NEW
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Capability Map
|
||||
|
||||
| v1 Behavior | v2 Location | Status | Notes |
|
||||
|---|---|---|---|
|
||||
| **Interface & Lifecycle** | | | |
|
||||
| `connect()` / `disconnect()` | `setup()` / `teardown()` | Renamed + consolidated | v2 groups init work; adds promise-based retry on NetworkError (src/channels/channel-registry.ts:73) |
|
||||
| `Channel.name: string` | `ChannelAdapter.name` + `ChannelAdapter.channelType` | Split | `name` is identity; `channelType` is the key for active lookup |
|
||||
| `ownsJid(jid)` | Implicit in platformId model | Removed | v2 uses structured platformId + threadId; ownership logic pushed to router |
|
||||
| **Message Flow** | | | |
|
||||
| `sendMessage(jid, text)` | `deliver(platformId, threadId, message)` | Refactored | v2 passes structured `OutboundMessage` with `kind` field; returns platform messageId; supports edit/reaction ops (src/channels/chat-sdk-bridge.ts:279–289) |
|
||||
| Callbacks: `onMessage` | `onInbound(platformId, threadId, message)` | Refactored | v2 passes message object with `kind` enum ('chat' \| 'chat-sdk'); can be async |
|
||||
| Callbacks: `onChatMetadata` | `onMetadata(platformId, name?, isGroup?)` | Simplified | Same shape as v1 minus the `timestamp` and `channel` params; timestamp now in inbound message itself |
|
||||
| | `onAction(questionId, option, userId)` | **NEW** | Handles ask_question card button clicks via Chat SDK bridge (src/channels/chat-sdk-bridge.ts:193–218) |
|
||||
| **Typing Indicator** | | | |
|
||||
| `setTyping(jid, bool)` | `setTyping(platformId, threadId)` | Refactored | v2 omits boolean flag (always true, no off-toggle); adds `threadId` parameter |
|
||||
| **Group/Conversation Sync** | | | |
|
||||
| `syncGroups(force?)` | `syncConversations()?: Promise<ConversationInfo[]>` | Renamed | Now returns structured list; decoupled from periodic init (optional hook) |
|
||||
| | `updateConversations(configs)`: void | **NEW** | Push notifications of conversation changes from host to adapter (e.g., new wiring) |
|
||||
| **Thread Model** | | | |
|
||||
| Implicit (adapter-specific) | `supportsThreads: boolean` | **NEW** | v2 explicitly declares it; router uses this to collapse/expand thread context (src/channels/adapter.ts:73–75) |
|
||||
| **DM Initiation** | | | |
|
||||
| Not exposed | `openDM(userHandle)?: Promise<string>` | **NEW** | For cold-DM outreach (approvals, onboarding, alerts) on platforms that distinguish user-id from DM-channel-id. Optional; fallback in user-dm.ts if absent (src/channels/adapter.ts:94–105) |
|
||||
| **Inbound Message Structure** | | | |
|
||||
| v1 `NewMessage` object | v2 `InboundMessage` (generic JSON) | Generalized | v1 had flat fields (sender, content, timestamp, thread_id, reply_to_*); v2 wraps serialized Chat SDK Message or native JSON in `content` field; Chat SDK bridge enriches (adds senderId, senderName) before sending (src/channels/chat-sdk-bridge.ts:124–141) |
|
||||
| **Outbound Message Structure** | | | |
|
||||
| Plain text + typing flag | v2 `OutboundMessage` (typed `kind` + flexible `content`) | Generalized | Supports 'chat', 'chat-sdk', edit ops, reactions, ask_question cards (src/channels/adapter.ts:46–51, src/channels/chat-sdk-bridge.ts:279–317) |
|
||||
| **Factory Pattern** | | | |
|
||||
| `ChannelFactory(opts) → Channel \| null` | `ChannelAdapterFactory() → ChannelAdapter \| Promise<...> \| null` | Async + cred check | v2 supports async factory (for loading credentials); promise-based retry on NetworkError (src/channels/channel-registry.ts:68–87) |
|
||||
| **Container Config** | | | |
|
||||
| Not exposed | `ChannelRegistration.containerConfig` | **NEW** | Adapters can declare mounts + env vars for their container (used by container-runner); see src/channels/channel-registry.ts:45–47 |
|
||||
|
||||
---
|
||||
|
||||
## Message Conversion & Error Handling
|
||||
|
||||
### v1 Flow
|
||||
- Adapter calls `onMessage(chatJid, NewMessage)` synchronously
|
||||
- Router extracts fields, upserts user, creates/finds session, writes to `inbound.db`
|
||||
- No built-in error handling; adapters catch and log themselves
|
||||
|
||||
### v2 Flow (src/channels/chat-sdk-bridge.ts:85–141)
|
||||
1. **Inbound**: Chat SDK `Message` → `InboundMessage` (kind='chat-sdk', content=serialized JSON)
|
||||
2. **Attachment handling**: Downloads attachments, converts to base64 (src/channels/chat-sdk-bridge.ts:90–111)
|
||||
3. **Reply context extraction**: Platform-specific hook (src/channels/chat-sdk-bridge.ts:115–120)
|
||||
4. **User field normalization**: Maps Chat SDK author → senderId, sender, senderName (src/channels/chat-sdk-bridge.ts:124–131)
|
||||
5. **Raw data drop**: Removes `raw` to save DB space (src/channels/chat-sdk-bridge.ts:134)
|
||||
6. **Call onInbound**: Async-capable (can await router writes)
|
||||
|
||||
**Outbound** (src/channels/chat-sdk-bridge.ts:273–344):
|
||||
- Supports multiple operation types via `content.operation`:
|
||||
- `'edit'` + `messageId` → `adapter.editMessage()`
|
||||
- `'reaction'` + `emoji` → `adapter.addReaction()`
|
||||
- `type: 'ask_question'` → render Card with buttons
|
||||
- Normal text/markdown → `adapter.postMessage()` with optional files
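The outbound branching above can be sketched as a pure classifier. The `operation`, `type`, `messageId`, and `emoji` field names come from the list; `OutboundContent`, `OutboundOp`, and `classifyOutbound` are hypothetical names, and the precedence order between branches is an assumption.

```typescript
type OutboundOp = 'edit' | 'reaction' | 'ask_question' | 'post';

// Hypothetical subset of the outbound content payload.
interface OutboundContent {
  operation?: string;
  type?: string;
  messageId?: string;
  emoji?: string;
  text?: string;
}

// Decides which Chat SDK call a payload would map to.
function classifyOutbound(content: OutboundContent): OutboundOp {
  if (content.operation === 'edit' && content.messageId) return 'edit'; // editMessage()
  if (content.operation === 'reaction' && content.emoji) return 'reaction'; // addReaction()
  if (content.type === 'ask_question') return 'ask_question'; // render Card with buttons
  return 'post'; // postMessage(), optionally with files
}
```

Separating classification from the SDK calls keeps the branching testable without a live adapter.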
|
||||
|
||||
**Error Propagation**:
|
||||
- Network errors on setup get retry (src/channels/channel-registry.ts:73; duck-type check for Error.name==='NetworkError')
|
||||
- Delivery errors logged but don't block (src/channels/chat-sdk-bridge.ts:213–214, 484–486)
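The duck-type check mentioned above might look like this sketch. It inspects only the `name` property, so it works for errors thrown across package boundaries where `instanceof` would fail. `isNetworkError` is an illustrative name, not necessarily the one used in `channel-registry.ts`.

```typescript
// True for any object whose `name` is exactly 'NetworkError',
// regardless of which class (or package copy) created it.
function isNetworkError(err: unknown): boolean {
  return (
    typeof err === 'object' &&
    err !== null &&
    (err as { name?: unknown }).name === 'NetworkError'
  );
}
```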
|
||||
|
||||
---
|
||||
|
||||
## New: Chat SDK Bridge
|
||||
|
||||
The v2 `Chat` abstraction (from `@anthropic-ai/chat`) wraps platform-specific adapters (Discord.js, Slack SDK, etc.) into a unified API. The NanoClaw `createChatSdkBridge()` (src/channels/chat-sdk-bridge.ts:68–384) adapts that `Chat` instance to the `ChannelAdapter` interface.
|
||||
|
||||
**Key methods**:
|
||||
- `setup(hostConfig)`: Initialize Chat, set up event handlers (subscribed messages, DMs, mentions, actions), start Gateway listener or register webhook (src/channels/chat-sdk-bridge.ts:149–271)
|
||||
- `deliver()`: Route outbound payloads (text, edit, reaction, ask_question card) to Chat SDK (src/channels/chat-sdk-bridge.ts:273–344)
|
||||
- `setTyping()`: Delegate to `adapter.startTyping()` (src/channels/chat-sdk-bridge.ts:346–349)
|
||||
- `teardown()`: Abort Gateway, shutdown Chat (src/channels/chat-sdk-bridge.ts:351–355)
|
||||
- `updateConversations()`: Rebuild conversation map on changes (src/channels/chat-sdk-bridge.ts:361–363)
|
||||
- `openDM()`: Conditional; only if underlying adapter supports it (src/channels/chat-sdk-bridge.ts:366–381)
|
||||
|
||||
**Event routing** (src/channels/chat-sdk-bridge.ts:163–191):
|
||||
- `chat.onSubscribedMessage()` → `onInbound()` for all known threads
|
||||
- `chat.onNewMention()` → `onInbound()` + auto-subscribe
|
||||
- `chat.onDirectMessage()` → `onInbound()` for DMs
|
||||
- `chat.onAction()` → `onAction()` for ask_question button clicks (src/channels/chat-sdk-bridge.ts:193–218)
|
||||
|
||||
**Gateway listener** (src/channels/chat-sdk-bridge.ts:222–268):
|
||||
- Adapters like Discord that support websocket connection declare `startGatewayListener()`.
|
||||
- NanoClaw runs it, forwards interactions (button clicks) to a local HTTP webhook server (src/channels/chat-sdk-bridge.ts:392–506).
|
||||
- Non-Gateway adapters (Slack, Teams) register on the shared webhook-server instead (src/channels/chat-sdk-bridge.ts:266–268).
|
||||
|
||||
---
|
||||
|
||||
## Test Fixtures
|
||||
|
||||
### v1 (src/v1/channels/registry.test.ts:10–38)
|
||||
- Simple lambda factories: `() => null`
|
||||
- No mock adapters (tests only verify registry API mechanics)
|
||||
- Test count: 4 (unknown-channel, round-trip, listing, overwrite)
|
||||
|
||||
### v2 (src/channels/channel-registry.test.ts + src/channels/chat-sdk-bridge.test.ts)
|
||||
|
||||
**Mock Adapter** (src/channels/channel-registry.test.ts:31–71):
|
||||
```typescript
|
||||
// Signature of the test helper (types abbreviated):
//   createMockAdapter(channelType): ChannelAdapter & { delivered, inbound, setupConfig }
// Properties: name, channelType, supportsThreads, delivered[], inbound[], setupConfig
// Methods: setup(config), teardown(), isConnected(), deliver(), setTyping(), updateConversations()
|
||||
```
|
||||
|
||||
**Registry Tests** (src/channels/channel-registry.test.ts:84–119):
|
||||
- Adapter registration with container config (src/channels/channel-registry.test.ts:88–98)
|
||||
- Credential-missing adapters skipped (src/channels/channel-registry.test.ts:101–119)
|
||||
|
||||
**Integration Tests** (src/channels/channel-registry.test.ts:122–234):
|
||||
- Router receives inbound from adapter, writes to inbound.db (src/channels/channel-registry.test.ts:166–197)
|
||||
- Delivery adapter bridge calls adapter.deliver() (src/channels/channel-registry.test.ts:199–233)
|
||||
|
||||
**Chat SDK Bridge Tests** (src/channels/chat-sdk-bridge.test.ts:11–38):
|
||||
- Conditional openDM exposure (src/channels/chat-sdk-bridge.test.ts:12–18)
|
||||
- openDM delegation to underlying adapter (src/channels/chat-sdk-bridge.test.ts:20–37)
|
||||
|
||||
---
|
||||
|
||||
## Missing from v2
|
||||
|
||||
### 1. `ownsJid(jid: string): boolean`
|
||||
- **v1 use**: Adapters declared ownership of a JID (e.g., "does this Telegram numeric ID belong to me?")
|
||||
- **v2 model**: JIDs → platformId + threadId; ownership is implicit in `platformId` format (e.g., `"telegram:6037840640"` vs `"discord:guildId:channelId"`). Router uses this to route inbound to the right adapter.
|
||||
- **Impact**: Adapters no longer need explicit ownership checks; the structured ID handles it.
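As a sketch of what "ownership is implicit in the `platformId` format" means in practice, the channel type can be read off the prefix. `channelTypeOf` is a hypothetical helper; it assumes the `"<channelType>:<rest>"` shape of the two examples above and returns `null` for IDs without a prefix.

```typescript
// Extracts the channel type from a prefixed platformId,
// e.g. "telegram:6037840640" or "discord:guildId:channelId".
function channelTypeOf(platformId: string): string | null {
  const idx = platformId.indexOf(':');
  return idx > 0 ? platformId.slice(0, idx) : null;
}
```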
|
||||
|
||||
### 2. `syncGroups(force?: boolean): Promise<void>`
|
||||
- **v1 use**: Periodic or on-demand sync of all groups/channels from the platform.
|
||||
- **v2 model**: Optional `syncConversations()` returns metadata instead of mutating internal state; host calls it when needed (not baked into adapter init). Conversations are tracked in central DB `messaging_groups` table.
|
||||
- **Impact**: Host has more control; adapters don't side-effect their own state.
|
||||
|
||||
### 3. `registeredGroups` callback in `ChannelOpts`
|
||||
- **v1 use**: Passed at init time; adapters could query which groups were registered.
|
||||
- **v2 model**: Conversations provided upfront in `ChannelSetup.conversations`; can be updated via `updateConversations()`.
|
||||
- **Impact**: Cleaner dependency injection; avoids callback nesting.
|
||||
|
||||
### 4. `channel` parameter in `OnChatMetadata`
|
||||
- **v1 use**: Metadata callback could optionally return which channel type made the discovery.
|
||||
- **v2 model**: Not needed; `platformId` in `onMetadata(platformId, name, isGroup)` encodes the channel type.
|
||||
|
||||
---
|
||||
|
||||
## Behavioral Discrepancies
|
||||
|
||||
### 1. Thread-ID Handling
|
||||
- **v1**: Some adapters (Telegram, WhatsApp) don't use threads; JIDs are the same as channel IDs. Others (Discord, Slack) embed thread IDs in reply_to logic.
|
||||
- **v2**: Explicit `supportsThreads` flag; adapters that don't support threads pass `threadId: null` to `onInbound()`. Router uses this to decide session granularity (src/channels/adapter.ts:73–75).
|
||||
|
||||
### 2. Outbound Message Structure
|
||||
- **v1**: Plain text + optional typing flag.
|
||||
- **v2**: Structured `{ kind, content, files? }` with operation support (edit, reaction, ask_question cards). Allows multi-op delivery without repeated deliver() calls.
|
||||
|
||||
### 3. Inbound Serialization
|
||||
- **v1**: Adapters directly passed `NewMessage` interface objects.
|
||||
- **v2**: Adapters pass `InboundMessage` with generic `content` field (JSON-serializable JS object). Chat SDK bridge converts Chat SDK Message → JSON, then stringifies for DB (src/channels/chat-sdk-bridge.ts:136–140).
|
||||
|
||||
### 4. Ask-Question Handling
|
||||
- **v1**: No native support; would be custom per-adapter.
|
||||
- **v2**: Unified via `ask_question` payload type. Chat SDK bridge renders as Card + Buttons; handles button clicks via `onAction()` callback and updates card to show selection (src/channels/chat-sdk-bridge.ts:292–317, 459–486).
|
||||
|
||||
### 5. Cold-DM Initiation
|
||||
- **v1**: Not exposed.
|
||||
- **v2**: `openDM(userHandle): Promise<string>` allows host to initiate DMs to users without prior message. Adapters that need it (Discord, Slack, Teams) implement it; others omit it and fall back to using the handle directly as the platformId (fallback in src/user-dm.ts).
|
||||
|
||||
### 6. Async Factory
|
||||
- **v1**: `ChannelFactory` returns `Channel | null` synchronously.
|
||||
- **v2**: `ChannelAdapterFactory` returns `ChannelAdapter | Promise<ChannelAdapter> | null`, supporting async credential loading. Registry retries on `NetworkError` (src/channels/channel-registry.ts:68–87).
|
||||
|
||||
### 7. Lifecycle Promises
|
||||
- **v1**: `connect()` / `disconnect()` are separate.
|
||||
- **v2**: `setup()` / `teardown()` grouped; no intermediate "starting/stopping" state. Gateway listeners and webhook servers are started inside `setup()`, torn down inside `teardown()` (src/channels/chat-sdk-bridge.ts:149–271, 351–355).
|
||||
|
||||
---
|
||||
|
||||
## Worth Preserving?
|
||||
|
||||
**All v1 patterns are preserved in v2, just restructured:**
|
||||
|
||||
1. **Adapter interface model**: v1's optional hooks (`setTyping?`, `syncGroups?`) become v2's optional methods (`setTyping?`, `syncConversations?`, `openDM?`). Structural compatibility for native adapters.
|
||||
|
||||
2. **Registry pattern**: v1's `registerChannel(name, factory)` → v2's `registerChannelAdapter(name, registration)`. Same self-registration barrel; v2 adds container config metadata.
|
||||
|
||||
3. **Callback-driven message flow**: v1's `onMessage` and `onChatMetadata` callbacks live on as `onInbound` and `onMetadata`. v2 adds `onAction` for interactive features (ask_question buttons).
|
||||
|
||||
4. **No built-in state mutation**: v1 adapters own their group state; v2 adapters are stateless (conversations pushed in). Both respect adapter autonomy.
|
||||
|
||||
**What's genuinely new and worth keeping:**
|
||||
|
||||
- **Chat SDK bridge**: Unifies platform SDKs without duplicating channel adapters per SDK. Huge reduction in code duplication (one Discord adapter instead of native + Chat SDK versions).
|
||||
- **Structured message payloads**: v2's `kind` field and flexible `content` JSON allow single delivery path for text, edits, reactions, and rich interactions.
|
||||
- **Ask-question cards**: Native support for interactive approvals and user input, reducing agent-side boilerplate.
|
||||
- **openDM**: Enables host-initiated contact (onboarding, alerts, approvals) without waiting for inbound.
|
||||
- **supportsThreads**: Explicit declaration lets router make informed session granularity decisions, vs. hardcoded per-adapter assumptions.
|
||||
|
||||
**Minimal migration burden:**
|
||||
|
||||
Native adapters written for v1 need only:
|
||||
1. Rename `connect` → `setup` (add `ChannelSetup` param).
|
||||
2. Rename `disconnect` → `teardown`.
|
||||
3. Rename `sendMessage(jid, text)` → `deliver(platformId, threadId, message)` (wrap text in `{ kind: 'chat', content: { text } }`).
|
||||
4. Add `supportsThreads: boolean`, `name`, `channelType` fields.
|
||||
5. Add `isConnected()` stub if not already present.
|
||||
6. Optional: Implement `setTyping?`, `syncConversations?`, `openDM?` for feature parity.
|
||||
|
||||
Nothing is fundamentally broken; it's a straightforward refactor of the adapter contract.
|
||||
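The migration steps above can be sketched as a thin wrapper around a v1-style channel. The shapes below (`V1Channel`, `V2Adapter`, `OutboundMessage`, the placeholder `ChannelSetup`) and the helper `wrapV1Channel` are simplified assumptions for illustration, not the real NanoClaw types; only the method and field names come from the list above.

```typescript
// Hypothetical, simplified shapes; the real interfaces carry more fields.
interface V1Channel {
  connect(): Promise<void>;
  disconnect(): Promise<void>;
  sendMessage(jid: string, text: string): Promise<void>;
}

interface OutboundMessage {
  kind: 'chat';
  content: { text: string };
}

// Placeholder for the v2 setup parameter (callbacks, config, ...).
type ChannelSetup = Record<string, unknown>;

interface V2Adapter {
  name: string;
  channelType: string;
  supportsThreads: boolean;
  setup(config: ChannelSetup): Promise<void>;
  teardown(): Promise<void>;
  isConnected(): boolean;
  deliver(platformId: string, threadId: string | null, message: OutboundMessage): Promise<void>;
}

function wrapV1Channel(name: string, v1: V1Channel): V2Adapter {
  let connected = false;
  return {
    name,
    channelType: name,
    supportsThreads: false, // assumption: the wrapped v1 channel has no threads
    async setup(_config) {
      await v1.connect(); // step 1: connect -> setup
      connected = true;
    },
    async teardown() {
      await v1.disconnect(); // step 2: disconnect -> teardown
      connected = false;
    },
    isConnected: () => connected, // step 5: stub backed by a local flag
    async deliver(platformId, _threadId, message) {
      // step 3: unwrap the structured payload back into plain text
      await v1.sendMessage(platformId, message.content.text);
    },
  };
}
```

A wrapper like this would probably only serve as a stopgap; a native v2 adapter can also implement `setTyping?`, `syncConversations?`, and `openDM?` directly.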
|
||||
@@ -1,99 +0,0 @@
|
||||
# config: v1 vs v2
|
||||
|
||||
## Scope
|
||||
|
||||
- **v1**: `/Users/gavriel/nanoclaw4/src/v1/config.ts` (63 lines) + `/Users/gavriel/nanoclaw4/src/v1/env.ts` (42 lines)
|
||||
- **v2 counterparts**: `/Users/gavriel/nanoclaw4/src/config.ts` (63 lines, **identical**), `/Users/gavriel/nanoclaw4/src/env.ts` (42 lines, **identical**), plus host-level polling in `/Users/gavriel/nanoclaw4/src/host-sweep.ts` and `/Users/gavriel/nanoclaw4/src/delivery.ts`; container agent-runner reads at `/Users/gavriel/nanoclaw4/container/agent-runner/src/index.ts`
|
||||
|
||||
## Capability map
|
||||
|
||||
| v1 behavior | v2 location | Status | Notes |
|
||||
|---|---|---|---|
|
||||
| **ASSISTANT_NAME** env var (default: 'Andy') | `src/config.ts:10`; read from `.env` or `process.env` | Kept, partially used | v2 exports it but doesn't use it in host. Container receives via `NANOCLAW_ASSISTANT_NAME` env var (set by `src/container-runner.ts:302`) for transcript archiving only. v1 used it for CLAUDE.md substitution, trigger pattern, and prompt context. |
|
||||
| **ASSISTANT_HAS_OWN_NUMBER** boolean env var | `src/config.ts:11-12` | Kept, **unused** | Exported, but neither v1 nor v2 uses it. No evidence of any implementation. |
|
||||
| **POLL_INTERVAL = 2000ms** | `src/config.ts:13` | **Removed, unused** | v1 used in `index.ts:457` (IPC watcher polling). v2 replaced IPC with session DBs; no polling needed at this interval. |
|
||||
| **SCHEDULER_POLL_INTERVAL = 60000ms** | `src/config.ts:14` | **Removed, unused** | v1 used in `task-scheduler.ts:231`. v2 uses hard-coded `SWEEP_INTERVAL_MS = 60_000` in `host-sweep.ts:31` instead (same value, different source). |
|
||||
| **IPC_POLL_INTERVAL = 1000ms** | `src/config.ts:32` | **Removed, unused** | v1 used in `ipc.ts:50, ipc.ts:122`. v2 replaced file-based IPC with SQLite session DBs; this interval has no meaning. |
|
||||
| **MOUNT_ALLOWLIST_PATH** = `~/.config/nanoclaw/mount-allowlist.json` | `src/config.ts:21` | Kept, same behavior | Used by `src/mount-security.ts` (host) to whitelist directories containers can read. Same in both versions. |
|
||||
| **SENDER_ALLOWLIST_PATH** = `~/.config/nanoclaw/sender-allowlist.json` | `src/config.ts:22` | Kept, same behavior | Stored outside project root for security. Path derivation identical in v1 and v2. **Unused in v2** (no grep hits outside v1 folder). |
|
||||
| **STORE_DIR** = `store/` | `src/config.ts:23` | **Removed, unused** | v1 used in `db.ts`. v2 uses central DB (`data/v2.db`) and per-session DBs (`data/v2-sessions/<id>/{inbound,outbound}.db`). `store/` directory no longer part of v2 architecture. |
|
||||
| **GROUPS_DIR** = `groups/` | `src/config.ts:24` | Kept, same behavior | Per-agent-group filesystem (CLAUDE.md, skills, config). Used in `src/container-runner.ts`, `src/delivery.ts`, `src/group-init.ts`. Identical role in both versions. |
|
||||
| **DATA_DIR** = `data/` | `src/config.ts:25` | Kept, extended usage | v1: IPC files, task DB. v2: central DB, session DBs, heartbeat files. More central in v2. Used in `src/index.ts`, `src/session-manager.ts`, `src/group-init.ts`, etc. |
|
||||
| **CONTAINER_IMAGE** env var (default: 'nanoclaw-agent:latest') | `src/config.ts:27` | Kept, same behavior | Specifies Docker image name. Used in `src/container-runner.ts`. Identical in both versions. |
|
||||
| **CONTAINER_TIMEOUT** env var (default: 1800000ms = 30min) | `src/config.ts:28` | Kept, same behavior | Maximum wall-clock time for a single container invocation. Used in `src/container-runner.ts`. Identical in both versions. |
|
||||
| **CONTAINER_MAX_OUTPUT_SIZE** env var (default: 10485760 bytes = 10MB) | `src/config.ts:29` | Kept, **unused** | Exported but never referenced in v1 or v2. No evidence of implementation. |
|
||||
| **ONECLI_URL** env var (no default) | `src/config.ts:30` | Kept, same behavior | OneCLI gateway URL for credential management. Read from `.env` or `process.env`. Used in `src/onecli-approvals.ts`. Identical in both versions. |
|
||||
| **MAX_MESSAGES_PER_PROMPT** env var (default: 10) | `src/config.ts:31` | **Removed, unused** | v1 used in message batching for prompt formatting (`v1/index.ts:192-193, 434-435, 467`). v2 removed MAX_MESSAGES limit; agent processes all pending messages in a turn. |
|
||||
| **IDLE_TIMEOUT** env var (default: 1800000ms = 30min) | `src/config.ts:33` | Kept, same behavior | How long to keep container alive after last result before killing due to inactivity. Used in `src/container-runner.ts:134-139`. Identical in both versions. |
|
||||
| **MAX_CONCURRENT_CONTAINERS** env var (default: 5) | `src/config.ts:34` | **Removed, unused** | v1 used in `group-queue.ts` for queue management. v2 removed group queueing (no group-queue.ts equivalent). Sessions start containers independently; no global cap enforced. |
|
||||
| **escapeRegex()** helper | `src/config.ts:36-38` | Kept, same implementation | Escapes regex special characters. Used by `buildTriggerPattern()`. Identical in both versions. |
|
||||
| **buildTriggerPattern()** helper | `src/config.ts:40-42` | Kept, same implementation | Builds case-insensitive word-boundary regex from trigger string. No grep hits in non-v1 code, so it is exported but **unused in v2**. |
|
||||
| **DEFAULT_TRIGGER** = `@${ASSISTANT_NAME}` | `src/config.ts:44` | Kept, **unused** | Default trigger pattern for agent activation. Computed from ASSISTANT_NAME. Exported but not used in v2 (no grep hits outside v1). |
|
||||
| **getTriggerPattern()** helper | `src/config.ts:46-49` | Kept, **unused** | Returns regex for trigger matching. Used in v1 for routing decisions. Exported but **not used in v2** (trigger logic moved to DB `messaging_group_agents.trigger_rules`). |
|
||||
| **TRIGGER_PATTERN** = computed | `src/config.ts:51` | Kept, **unused** | Pre-built DEFAULT_TRIGGER pattern. Exported but **not used in v2**. |
|
||||
| **resolveConfigTimezone()** helper | `src/config.ts:55-61` | Kept, same implementation | Resolves IANA timezone from TZ env var → `.env` TZ → system timezone → 'UTC'. Identical logic in both versions. |
|
||||
| **TIMEZONE** const | `src/config.ts:62` | Kept, same behavior | Current timezone for scheduled tasks, message timestamps. Used in `src/host-sweep.ts`, `container/agent-runner/src/index.ts`. Identical in both versions. |
|
||||
| **readEnvFile()** function | `src/env.ts:11-42` | Kept, identical | Reads `.env` file, returns only requested keys, does not pollute `process.env`. Used by config.ts. Prevents secrets leak to child processes. Identical in both versions. |
|
||||
|
||||
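For reference, the two trigger helpers in the table can be approximated as follows. This is a sketch based only on the descriptions above ("escapes regex special characters", "case-insensitive word-boundary regex"); anchoring the match at the start of the message and trimming the trigger are assumptions, and the real `src/config.ts` may differ.

```typescript
// Escape characters that have special meaning inside a regular expression.
function escapeRegex(input: string): string {
  return input.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Assumption: the trigger must appear at the start of the message and be
// followed by a word boundary, so "@Andy" does not match "@Andyman".
function buildTriggerPattern(trigger: string): RegExp {
  return new RegExp(`^${escapeRegex(trigger.trim())}\\b`, 'i');
}

// Example with the default assistant name from the table.
const defaultTrigger = buildTriggerPattern('@Andy');
```

Because v2 stores trigger rules in `messaging_group_agents.trigger_rules`, a helper like this would only matter if the host-level default were ever reinstated.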
---
|
||||
|
||||
## Missing from v2
|
||||
|
||||
- **POLL_INTERVAL** (2000ms hardcoded constant) — v1 polling loop. v2 has no direct equivalent; delivery uses hard-coded `ACTIVE_POLL_MS = 1000` (`src/delivery.ts:56`). Not configurable.
|
||||
|
||||
- **SCHEDULER_POLL_INTERVAL** (60000ms hardcoded constant) — v1 task scheduler. v2 uses hard-coded `SWEEP_INTERVAL_MS = 60_000` (`src/host-sweep.ts:31`). Same interval, not configurable from config.ts.
|
||||
|
||||
- **IPC_POLL_INTERVAL** (1000ms hardcoded constant) — v1 IPC file watcher. No v2 equivalent; IPC replaced with session DBs.
|
||||
|
||||
- **MAX_MESSAGES_PER_PROMPT** (env var, default 10) — v1 message batching. v2 has no message batching limit; all pending messages in a turn are processed together.
|
||||
|
||||
- **MAX_CONCURRENT_CONTAINERS** (env var, default 5) — v1 group queue. v2 has no group-level concurrency cap; sessions start containers independently.
|
||||
|
||||
- **STORE_DIR** (store/ directory) — v1 task/group storage. v2 uses central DB + session DBs; no store/ directory needed.
|
||||
|
||||
- **SENDER_ALLOWLIST_PATH** — Path is defined but never used in either version.
|
||||
|
||||
---
|
||||
|
||||
## Behavioral discrepancies
|
||||
|
||||
1. **ASSISTANT_NAME usage**
|
||||
- v1: Used for CLAUDE.md template substitution (`v1/index.ts:135-137`), getLastBotMessageTimestamp comparison, and trigger pattern building.
|
||||
- v2: Only passed to container as `NANOCLAW_ASSISTANT_NAME` env var (`src/container-runner.ts:302`); container uses it for transcript archiving only. Host does not use it.
|
||||
- **Impact**: v1 personalized CLAUDE.md by name; v2 relies on statically authored CLAUDE.md in `groups/<folder>/`.
|
||||
|
||||
2. **Trigger pattern handling**
|
||||
- v1: Trigger pattern from `getTriggerPattern()` used at host routing layer (`v1/index.ts:200, 419`).
|
||||
- v2: Trigger rules stored in DB (`messaging_group_agents.trigger_rules` JSON field), evaluated at delivery time by router. `getTriggerPattern()` exported but unused.
|
||||
- **Impact**: v1 required config-level trigger changes; v2 allows per-messaging-group customization via DB.
|
||||
|
||||
3. **Timezone resolution**
|
||||
- v1: `resolveConfigTimezone()` used in `task-scheduler.ts:5`.
|
||||
- v2: Same function; `TIMEZONE` is used in `host-sweep.ts`. The agent-runner reads it at `container/agent-runner/src/index.ts:45` but never references it afterwards.
|
||||
- **Impact**: Identical behavior; minor: container reads env var but doesn't use it.
|
||||
|
||||
4. **Poll intervals**
|
||||
- v1: `POLL_INTERVAL`, `SCHEDULER_POLL_INTERVAL`, `IPC_POLL_INTERVAL` all separately configured.
|
||||
- v2: Hard-coded `ACTIVE_POLL_MS = 1000`, `SWEEP_POLL_MS = 60_000` in `src/delivery.ts`. Container poll loop uses hard-coded `POLL_INTERVAL_MS = 1000`, `ACTIVE_POLL_INTERVAL_MS = 500` in `container/agent-runner/src/poll-loop.ts:10-11`.
|
||||
- **Impact**: v2 intervals are not tunable via env vars; requires code change.
|
||||
|
||||
5. **Message batching**
|
||||
- v1: `MAX_MESSAGES_PER_PROMPT` limits messages per turn (`v1/index.ts:467`).
|
||||
- v2: No limit; all pending messages (minus filtered/denied commands) are formatted and sent to agent in one turn.
|
||||
- **Impact**: v2 may send larger prompts; unbounded context risk if message queue grows.
|
||||
|
||||
6. **Container concurrency**
|
||||
- v1: `MAX_CONCURRENT_CONTAINERS` enforced via group queue (`v1/group-queue.ts`).
|
||||
- v2: No global or per-group limit. Each session independently starts its container on wake.
|
||||
- **Impact**: v2 can spawn many containers simultaneously; no backpressure mechanism.
|
||||
|
||||
7. **IPC → Session DB**
|
||||
- v1: Uses file-based IPC (JSON files, `IPC_POLL_INTERVAL` polling).
|
||||
- v2: Uses SQLite session DBs (`inbound.db` host-owned, `outbound.db` container-owned).
|
||||
- **Impact**: v2 is more reliable (ACID semantics) but less debuggable (binary format).
|
||||
|
||||
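The fallback chain behind `resolveConfigTimezone()` (TZ env var, then `.env` TZ, then system timezone, then 'UTC') can be sketched as a first-valid-candidate search. The function names `isValidTimezone` and `resolveTimezone` are hypothetical, and validating candidates through `Intl.DateTimeFormat` is an assumption; the source only states the order of the chain.

```typescript
// Returns true when the runtime accepts `tz` as an IANA zone name.
function isValidTimezone(tz: string): boolean {
  try {
    // Intl throws a RangeError for unknown zone names.
    new Intl.DateTimeFormat('en-US', { timeZone: tz });
    return true;
  } catch {
    return false;
  }
}

// Walk the candidates in priority order and fall back to UTC.
function resolveTimezone(candidates: Array<string | undefined>): string {
  for (const tz of candidates) {
    if (tz && isValidTimezone(tz)) return tz;
  }
  return 'UTC';
}
```

On the host, the candidate list would presumably be the `TZ` environment variable, the `TZ` key from `readEnvFile()`, and `Intl.DateTimeFormat().resolvedOptions().timeZone`, in that order.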
---
|
||||
|
||||
## Worth preserving?
|
||||
|
||||
**No.** The config.ts file is largely a legacy artifact. Most of its exports are unused in v2, and the few that remain (TIMEZONE, IDLE_TIMEOUT, ONECLI_URL, paths) are minimally invasive. The hardcoded poll intervals and removed features (MAX_MESSAGES, MAX_CONCURRENT_CONTAINERS, IPC_POLL_INTERVAL) reflect architectural changes that are intentional and correct for v2. The trigger pattern and ASSISTANT_NAME handling in config.ts should be removed from the host layer entirely — they're now managed by the DB and container env vars. Consolidate host-level config into a smaller, focused module that only exports what v2 actually uses: TIMEZONE, IDLE_TIMEOUT, CONTAINER_TIMEOUT, ONECLI_URL, path constants, and the env file reader.
|
||||
@@ -1,72 +0,0 @@
|
||||
# container index (agent-runner entry): v1 vs v2
|
||||
|
||||
## Scope
|
||||
- v1: `container/agent-runner/src/v1/index.ts` (736 LOC) — monolithic: arg parsing, IPC polling, SDK integration, output marshaling
|
||||
- v2 (split): `container/agent-runner/src/index.ts` (124 LOC) + `poll-loop.ts` (436 LOC) + `destinations.ts` (118 LOC) + `formatter.ts` (228 LOC) + `db/*.ts` + `providers/*.ts`
|
||||
|
||||
## Startup sequence diff
|
||||
|
||||
| Step | v1 (IPC) | v2 (SQLite poll) |
|
||||
|------|----------|------------------|
|
||||
| Arg parsing | stdin JSON via `readStdin()` (v1:105-115) | env vars: `AGENT_PROVIDER`, `NANOCLAW_*` (v2 index.ts:44-51) |
|
||||
| Env setup | `sdkEnv` + `CLAUDE_CODE_AUTO_COMPACT_WINDOW` (v1:626-629) | same, delegated to provider (index.ts:109) |
|
||||
| DB open | — (IPC files only) | inbound.db (RO) + outbound.db (RW) + `session_state` table |
|
||||
| MCP server config | hardcoded nanoclaw server (v1:477-486) | same + `NANOCLAW_MCP_SERVERS` env for additional (index.ts:94-104) |
|
||||
| Message loop | `waitForIpcMessage()` polling (v1:350-366) | `poll-loop.ts:62+` `getPendingMessages()` every 1000ms idle / 500ms active |
|
||||
| Provider | Claude SDK direct | provider abstraction factory (`providers/factory.ts`, supports claude/mock/custom) |
|
||||
| Message stream | `MessageStream` iterable (v1:71-103) | same pattern in `providers/claude.ts:51-80` |
|
||||
| System prompt | manual CLAUDE.md load + hardcoded destinations (v1:416-420) | `buildSystemPromptAddendum()` from inbound.db destinations (`destinations.ts:76-117`) |
|
||||
| Query execution | `runQuery()` with IPC polling during query (v1:374-545) | `processQuery()` polls messages_in + `provider.query()` (`poll-loop.ts:259-319`) |
|
||||
| Session resumption | sessionId on stdin + `resumeAt` tracking | `getStoredSessionId()` from outbound.db; cleared on `/clear` admin command |
|
||||
| Shutdown | stdout output markers + exit(1) on error | no markers; logs errors; host manages lifecycle |
|
||||
| Heartbeat | — | file touch at `SESSION_HEARTBEAT_PATH` on each result |
|
||||
|
||||
## Capability map
|
||||
|
||||
| v1 behavior | v2 location | Status | Notes |
|
||||
|---|---|---|---|
|
||||
| Parse prompt/session/group/chat/etc. from stdin | env + inbound.db | kept | |
|
||||
| Env injection (ANTHROPIC_BASE_URL, proxy) | passed to provider.query() (index.ts:109) | kept | |
|
||||
| Stdin JSON parsing | — | **removed** | |
|
||||
| IPC file polling | `messages_in` table | modernized | Same semantics, DB-backed |
|
||||
| IPC `_close` sentinel | implicit (process killed by host) | simplified | |
|
||||
| Output wrapping markers | writes to `messages_out` | **removed** | |
|
||||
| Session archiving PreCompact hook | `providers/claude.ts` hook | kept | |
|
||||
| Session resumption by ID | `getStoredSessionId()` (poll-loop.ts:51) | **persisted** | Survives container restart |
|
||||
| Scheduled task script execution | `task-script.ts:applyPreTaskScripts()` (poll-loop.ts:159) | kept | |
|
||||
| Command filtering (`/help`, `/login`) | `categorizeMessage()` + filtered set (formatter.ts:14, poll-loop.ts:95-100) | **enhanced** | Explicit categories |
|
||||
| Admin commands (`/clear`, etc.) | `categorizeMessage` + `NANOCLAW_ADMIN_USER_IDS` gate (poll-loop.ts:102-131) | kept | Explicit admin role from env |
|
||||
| Destination routing `to=` | `destinations` table + `dispatchResultText()` (poll-loop.ts:350-432) | modernized | Named destinations instead of raw JIDs |
|
||||
| Multi-destination message blocks | `MESSAGE_RE` regex (poll-loop.ts:350-414) | kept | |
|
||||
| Tool allowlist | `providers/claude.ts:19-39` | kept | |
|
||||
| MCP server setup | index.ts:81-104 | kept + extensible | |
|
||||
| `@-syntax` additional dirs | `/workspace/extra/*` discovered at startup (index.ts:64-74) | kept | |
|
||||
| Global CLAUDE.md | SDK preset append (index.ts:56-58) | kept | |
|
||||
| Idle stream termination | — | **new** | IDLE_END_MS = 20s prevents zombies |
|
||||
| Admin user ID prefixing (chat-sdk) | explicit `channel_type:` prefix (formatter.ts:58-66) | **new** | |
|
||||
| Processing ACK | | **new** | Prevents re-processing on container restart |
|
||||
| Message kind formatting | `formatMessages()` (formatter.ts) | enhanced | Routes by kind: chat/task/webhook/system |
|
||||
|
||||
## Missing from v2
|
||||
None of v1's core capabilities dropped. Notes on format/protocol shifts:
|
||||
1. **Stdout markers removed** — host now parses `messages_out` table instead of stdout
|
||||
2. **Stdin protocol gone** — follow-up messages via `messages_in` table
|
||||
3. **Script-phase fast exit removed** — v1 could skip container entirely if `wakeAgent=false`; v2 gates message processing but container keeps polling (slightly more idle cost)
|
||||
|
||||
## Behavioral discrepancies
|
||||
1. **Idle timeout**: v1 had no query-level timeout → zombies possible. v2 ends stream after 20s with no SDK events
|
||||
2. **Resume**: v1 re-read sessionId from stdin each run; v2 persists in `session_state` across restarts
|
||||
3. **Admin gating**: v1 passed everything through; v2 categorizes + admin-gates `/clear` etc.
|
||||
4. **Destination naming**: v1 raw JID; v2 human names from destinations table
|
||||
5. **Poll cadence**: v2 dual-rate — 1000ms idle, 500ms active (CPU efficiency + responsiveness)
|
||||
6. **Message kind routing**: v1 uniform; v2 distinguishes chat/chat-sdk/task/webhook/system with per-kind formatting
|
||||
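The dual-rate poll cadence and the idle stream cutoff described above come down to two small decisions. The sketch below uses the constant values quoted in this comparison; the function names `nextPollDelay` and `shouldEndStream` are hypothetical and not taken from `poll-loop.ts`.

```typescript
// Values quoted in the comparison above.
const POLL_INTERVAL_MS = 1000; // cadence while idle
const ACTIVE_POLL_INTERVAL_MS = 500; // cadence while a query is running
const IDLE_END_MS = 20_000; // end the stream after 20s without SDK events

// Poll faster while a query is in flight so follow-up messages arrive sooner.
function nextPollDelay(queryActive: boolean): number {
  return queryActive ? ACTIVE_POLL_INTERVAL_MS : POLL_INTERVAL_MS;
}

// True once the provider stream has been silent for IDLE_END_MS or longer.
function shouldEndStream(lastEventAtMs: number, nowMs: number): boolean {
  return nowMs - lastEventAtMs >= IDLE_END_MS;
}
```

Keeping both decisions as pure functions of timestamps makes them easy to test without timers.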
|
||||
## Worth preserving?
|
||||
v1 should remain historical reference only. v2 strictly supersedes:
|
||||
- DB-backed state survives restarts
|
||||
- Provider abstraction allows non-Claude agents
|
||||
- Dynamic destinations from inbound.db
|
||||
- Session invalidation detection + processing ACK idempotence
|
||||
- Dual poll rate + idle termination prevent pathological query hangs
|
||||
|
||||
No merge-back candidates identified.
|
||||
@@ -1,58 +0,0 @@
|
||||
# container mcp-tools: v1 vs v2
|
||||
|
||||
## Scope
|
||||
- v1: `container/agent-runner/src/v1/mcp-tools.ts` (81 LOC) — single tool (`send_message`)
|
||||
- v2: `container/agent-runner/src/mcp-tools/` — 7 modules (~971 LOC): `index.ts`, `core.ts`, `scheduling.ts`, `interactive.ts`, `agents.ts`, `self-mod.ts`, `types.ts`
|
||||
|
||||
## Tool map
|
||||
|
||||
| v1 tool | v2 file | Status | Schema / behavior diff |
|
||||
|---|---|---|---|
|
||||
| `send_message(text, channel, platformId, threadId)` | `core.ts:50-95` | **kept, enhanced** | v2 uses named destinations (`to`), auto-resolves via session default or lookup, preserves `thread_id` intelligently |
|
||||
| — | `core.ts:133-177` `send_file` | **new** | Copies file to outbox dir, routes via destinations |
|
||||
| — | `core.ts:179-218` `edit_message` | **new** | Edit previously-sent message by seq id |
|
||||
| — | `core.ts:220-259` `add_reaction` | **new** | Emoji reaction by seq id |
|
||||
| — | `scheduling.ts:33-79` `schedule_task` | **new** | One-shot or recurring (cron) |
|
||||
| — | `scheduling.ts:81-137` `list_tasks` | **new** | Pending/paused tasks grouped by series |
|
||||
| — | `scheduling.ts:139-165` `cancel_task` | **new** | |
|
||||
| — | `scheduling.ts:167-192` `pause_task` | **new** | |
|
||||
| — | `scheduling.ts:194-219` `resume_task` | **new** | |
|
||||
| — | `scheduling.ts:221-266` `update_task` | **new** | Modify prompt/recurrence/processAfter/script |
|
||||
| — | `interactive.ts:36-129` `ask_user_question` | **new** | Blocking with timeout — writes to outbound.db then polls inbound.db for response |
|
||||
| — | `interactive.ts:131-166` `send_card` | **new** | Structured Chat SDK cards |
|
||||
| — | `self-mod.ts` `install_packages` | **new** | apt/npm install, regex name validation, admin approval; approval handler auto-rebuilds image and restarts container |
|
||||
| — | `self-mod.ts` `add_mcp_server` | **new** | Wire existing MCP server; approval handler restarts container (no image rebuild) |
|
||||
| — | `agents.ts:30-63` `create_agent` | **new** | Admin-only sub-agent creation; not exposed to non-admin containers |
|
||||
|
||||
## New tools in v2
|
||||
14 new tools split across 5 capability domains:
|
||||
- **Message manipulation**: `send_file`, `edit_message`, `add_reaction`
|
||||
- **Scheduling**: 6 task-management tools
|
||||
- **Interactive**: `ask_user_question`, `send_card`
|
||||
- **Self-modification**: `install_packages`, `add_mcp_server`
|
||||
- **Agent management**: `create_agent`
|
||||
|
||||
## Missing from v2
|
||||
**None.** v2 strictly adds; v1's only tool (`send_message`) was kept and enhanced.
|
||||
|
||||
## Behavioral discrepancies
|
||||
1. **Destination resolution**: v1 used explicit channel/platformId/threadId params; v2 resolves named destinations from `destinations` map with fallback to session routing
|
||||
2. **Two-DB split pattern**: all scheduling/self-mod tools write system actions to **outbound.db**; the host processes them and applies the results to inbound.db. The container never writes directly to inbound.db
|
||||
3. **`ask_user_question` is blocking**: synchronously polls inbound.db until response arrives or timeout — agent perception is blocking, transport is async
|
||||
4. **Admin enforcement**: `create_agent` + self-mod tools check admin approval host-side (`NANOCLAW_ADMIN_USER_IDS` env controls tool visibility)
|
||||
5. **Message editing/reactions**: use internal seq id (not user-visible numeric message ID) — requires outbound.db lookup
|
||||
|
||||
## Transport pattern (v2 common)
|
||||
1. Agent invokes tool → validation (regex, enum, length)
|
||||
2. Tool writes `messages_out` or system-action row
|
||||
3. Tool returns success immediately (fire-and-forget); `ask_user_question` is the exception and blocks until a response arrives or the timeout expires
|
||||
4. Host polls outbound.db, applies approval / routing / side effects
|
||||
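The four transport steps above can be illustrated with an in-memory stand-in for outbound.db. Everything here is hypothetical: `requestInstall`, `hostDrain`, the `SystemAction` row shape, and the package-name regex are assumptions chosen for the sketch, and the real tools write SQLite rows instead of array entries.

```typescript
type SystemAction = { id: number; action: string; payload: unknown };

// In-memory stand-in for outbound.db; the real container writes SQLite rows.
const outbound: SystemAction[] = [];

// Assumed validation rule: conservative apt/npm-style package names.
const PACKAGE_NAME_RE = /^[a-z0-9][a-z0-9._-]{0,63}$/;

function requestInstall(packages: string[]): { ok: boolean; error?: string } {
  // 1. Validate input before anything is written.
  const bad = packages.find((name) => !PACKAGE_NAME_RE.test(name));
  if (packages.length === 0 || bad !== undefined) {
    return { ok: false, error: `invalid package name: ${bad ?? '(none given)'}` };
  }
  // 2. Write a system-action row for the host to pick up.
  outbound.push({ id: outbound.length + 1, action: 'install_packages', payload: { packages } });
  // 3. Return immediately; approval and the install itself happen host-side.
  return { ok: true };
}

// 4. Host side: drain pending rows and apply approval / routing / side effects.
function hostDrain(handle: (row: SystemAction) => void): number {
  const rows = outbound.splice(0, outbound.length);
  rows.forEach(handle);
  return rows.length;
}
```

The point of the split is that the container only ever appends to its own store, while anything privileged runs in the host's drain step.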
|
||||
## Worth preserving?
|
||||
**Yes, fully.** The v2 modular architecture is a large improvement:
|
||||
- Clear separation by capability domain
|
||||
- Two-DB constraint cleanly encoded (container → outbound, host → inbound)
|
||||
- Named destination abstraction (better UX than raw JIDs)
|
||||
- Admin-only tool filtering at the MCP server level
|
||||
|
||||
v1 is retained as historical reference only. No merge-back.
|
||||
@@ -1,51 +0,0 @@
|
||||
# container-runner: v1 vs v2
|
||||
|
||||
## Scope
|
||||
- v1: `src/v1/container-runner.ts` (677 LOC) + `container-runner.test.ts` (204 LOC) — spawn + IPC plumbing + stdin/stdout JSON + process supervision + output-marker parsing
|
||||
- v2: `src/container-runner.ts` (405 LOC) + `src/container-config.ts` (114 LOC) + `src/session-manager.ts` (DB paths). Net ~272 LOC removed by eliminating IPC and output parsing
|
||||
|
||||
## Capability map
|
||||
|
||||
| v1 behavior | v2 location | Status | Notes |
|
||||
|---|---|---|---|
|
||||
| Image selection | `container-runner.ts:348-349` | kept | Reads `imageTag` from container.json or env |
|
||||
| Env injection | `container-runner.ts:266-284` | **changed** | Replaced IPC vars with `SESSION_INBOUND/OUTBOUND_DB_PATH`, `SESSION_HEARTBEAT_PATH`, `AGENT_PROVIDER`, `NANOCLAW_*` admin IDs |
|
||||
| Volume mounts | `container-runner.ts:200-252` | **changed** | Removed per-group IPC dir; added session folder `/workspace` + agent group `/workspace/agent` |
|
||||
| Mount validation | `container-runner.ts:240-244` | kept | Validates `additionalMounts` from container.json |
|
||||
| Provider integration | `container-runner.ts:184-198` | **new** | `resolveProviderContribution()` wires provider host-side configs |
|
||||
| stdin/stdout IPC | — | **removed** | v1 lines 318-387; v2 uses DB polling only; stdio=`['ignore','pipe','pipe']` |
|
||||
| Process spawn | `container-runner.ts:119` | kept | |
|
||||
| OneCLI `ensureAgent` + `applyContainerConfig` | `container-runner.ts:301-313` | enhanced | v2 calls `ensureAgent` first |
|
||||
| Admin ID injection | `container-runner.ts:289-295` | **new** | Queries `getOwners/getGlobalAdmins/getAdminsOfAgentGroup` at wake |
|
||||
| Idle timeout | `container-runner.ts:135-140` | changed | v2 uses `resetIdle()` callback on activeContainers entry, settable by `delivery.ts` |
|
||||
| Timeout logic | — | **removed** | v1 had configurable per-group timeout reset on output markers |
|
||||
| Output parsing | — | **removed** | v1 parsed `---NANOCLAW_OUTPUT_START/END---` from stdout; v2 ignores stdout |
|
||||
| Streaming output callback | — | **removed** | v1 had `onOutput()` for real-time delivery |
|
||||
| Per-exit log file | — | **removed** | v1 wrote `groups/<folder>/logs/container-*.log` with full I/O; v2 only logs stderr to logger.debug |
|
||||
| Graceful SIGTERM→SIGKILL | — | simplified | v2 just calls `stopContainer()` |
|
||||
| Concurrent wake dedup | `container-runner.ts:44-82` | **new** | `wakePromises` Map prevents race on spawn |
|
||||
| Per-group image builds | `container-runner.ts:357-405` | **new** | `buildAgentGroupImage()` writes `imageTag` |
|
||||
| Session folder init | `container-runner.ts:210` | **new** | `initGroupFilesystem()` at spawn |
|
||||
| Heartbeat file `/workspace/.heartbeat` | session-manager.ts | **new** | File-touch replaces IPC liveness |
|
||||
| Task/group JSON snapshots (`current_tasks.json`, `available_groups.json`) | — | **removed** | v2 pushes data via inbound.db writeDestinations/writeSessionRouting |
|
||||
| Container name | `container-runner.ts:103` | changed | `nanoclaw-v2-${folder}-${Date.now()}` |
|
||||
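The "Concurrent wake dedup" row describes a Map of in-flight promises. A minimal sketch of that pattern follows; the name `wakePromises` comes from the table, while `wakeOnce`, its signature, and the cleanup behaviour are assumptions, not the actual `container-runner.ts` code.

```typescript
// One in-flight spawn promise per session.
const wakePromises = new Map<string, Promise<void>>();

function wakeOnce(sessionId: string, spawn: () => Promise<void>): Promise<void> {
  // A second caller during spawn gets the same promise instead of a second container.
  const inFlight = wakePromises.get(sessionId);
  if (inFlight) return inFlight;

  const clear = (): void => {
    wakePromises.delete(sessionId);
  };
  // Remove the entry whether the spawn succeeds or fails, so later wakes can retry.
  const pending = spawn().then(clear, (err: unknown) => {
    clear();
    throw err;
  });
  wakePromises.set(sessionId, pending);
  return pending;
}
```

Storing the promise before any `await` is what closes the race: two near-simultaneous wakes for the same session both observe the same Map entry.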
|
||||
## Missing from v2
|
||||
1. **Streaming output markers** — `---NANOCLAW_OUTPUT_START/END---` enabled pre-completion delivery; v2 must wait for outbound.db poll to deliver results
|
||||
2. **Configurable per-group timeout** — `group.containerConfig.timeout` override is gone; all groups share `IDLE_TIMEOUT`
|
||||
3. **Per-exit detailed logs** — v1 wrote timestamped logs with full I/O + mounts + stderr + stdout; invaluable for post-mortem
|
||||
4. **Graceful-stop sentinel** — v1 sent SIGTERM and waited for `_close` marker before SIGKILL
|
||||
5. **JSON snapshots for tasks/groups** — `current_tasks.json` / `available_groups.json` in the group IPC dir
|
||||
|
||||
## Behavioral discrepancies
|
||||
1. **Async result model**: v1 `runContainerAgent()` returned `Promise<ContainerOutput>` with inline result; v2 `wakeContainer()` is fire-and-forget — results asynchronous via delivery poll
|
||||
2. **No stdin**: v1 wrote full `ContainerInput` JSON to stdin; v2 container reads everything from inbound.db
|
||||
3. **Admin injection at wake**: v2 queries admins fresh on every spawn (`NANOCLAW_ADMIN_USER_IDS`)
|
||||
4. **Destination routing timing**: v2 calls `writeDestinations()` + `writeSessionRouting()` on every wake so changes apply without restart
|
||||
5. **Session lifecycle**: v1 created a session per spawn; v2 resolves session via router before wake
|
||||
|
||||
## Worth preserving?
|
||||
- **Streaming output**: Meaningful latency improvement. Hybrid model (DB polling + optional marker pre-delivery) could reduce perceived latency for long outputs
|
||||
- **Per-group timeout**: Restore — different agent groups have different expected latencies
|
||||
- **Per-exit logs**: At minimum, restore on non-zero exit. Cheap forensics, huge debug value
|
||||
- **Graceful-stop sentinel**: Not critical — bun container is disposable
|
||||
@@ -1,46 +0,0 @@
|
||||
# container-runtime + mount-security: v1 vs v2
|
||||
|
||||
## Scope
|
||||
- v1: `src/v1/container-runtime.ts` (81 LOC), `container-runtime.test.ts` (148 LOC), `mount-security.ts` (406 LOC)
|
||||
- v2: `src/container-runtime.ts` (81 LOC), `container-runtime.test.ts` (149 LOC), `mount-security.ts` (390 LOC)
|
||||
|
||||
## Capability map
|
||||
|
||||
### container-runtime.ts
|
||||
|
||||
| v1 behavior | v2 location | Status | Notes |
|
||||
|---|---|---|---|
|
||||
| `CONTAINER_RUNTIME_BIN = 'docker'` | `container-runtime.ts:1` | kept | Hardcoded; Apple Container runtime is NOT handled here in either version |
|
||||
| `hostGatewayArgs()` | `container-runtime.ts` | kept | Identical |
|
||||
| `readonlyMountArgs()` | `container-runtime.ts` | kept | Identical |
|
||||
| `stopContainer()` | `container-runtime.ts` | kept | Identical |
|
||||
| `ensureContainerRuntimeRunning()` | `container-runtime.ts` | kept | Identical |
|
||||
| `cleanupOrphans()` | `container-runtime.ts:60-80` | kept | Identical logic |
|
||||
| Logging module | | **changed** | v1 imports `logger` (data-first); v2 imports `log` (message-first) |
|
||||
|
||||
### mount-security.ts
|
||||
|
||||
| v1 behavior | v2 location | Status | Notes |
|
||||
|---|---|---|---|
|
||||
| `AdditionalMount` / `AllowedRoot` / `MountAllowlist` types | `mount-security.ts:16-29` | kept | Same shape except `nonMainReadOnly` removed |
|
||||
| Default blocked patterns | `mount-security.ts:39` | kept | Same list |
|
||||
| Allowlist load + file-watch cache | `mount-security.ts:64-102` | kept | |
|
||||
| Path expansion (`~`) | `mount-security.ts` | kept | |
|
||||
| Symlink resolution | `mount-security.ts` | kept | |
|
||||
| Container-path validation | `mount-security.ts` | kept | |
|
||||
| Template generation | `mount-security.ts:362-386` | changed | v2 template omits `nonMainReadOnly: true` |
|
||||
| `validateMount(mount, isMain)` | `mount-security.ts:230-307` | **signature changed** | v2 is `validateMount(mount)` — no `isMain` |
|
||||
| `validateAdditionalMounts(mounts, groupName, isMain)` | same | **signature changed** | v2 drops `isMain` |
|
||||
| Non-main groups forced to read-only | — | **removed** | v1 lines 283-291; v2 only checks `allowedRoot.allowReadWrite` |
|
||||
|
||||
## Missing from v2
|
||||
1. **`nonMainReadOnly` flag on `MountAllowlist`** — v1 could force non-main agent groups to read-only even when their allowlist permitted RW
|
||||
2. **`isMain` param flow** through `validateMount` / `validateAdditionalMounts`
|
||||
3. **Non-main group RW enforcement** at mount-validation time — now delegated entirely to `allowedRoot.allowReadWrite`
|
||||
|
||||
## Behavioral discrepancies
|
||||
1. **Isolation model weakened**: a non-main ("shared" or auxiliary) agent group can now mount RW on any path its root permits. v1's defense-in-depth (allowlist permits RW + group must be main) is reduced to just the allowlist check
|
||||
2. **Logger import**: only surface difference in container-runtime.ts
|
||||
|
||||
## Worth preserving?
|
||||
**`nonMainReadOnly` restoration has security value** for multi-tenant setups where shared/sandbox agent groups should not mutate filesystem even if the allowlist is permissive. Low-cost to reintroduce: restore the field on `MountAllowlist`, restore the `isMain` param, restore the check in `validateMount()`. If v2 has explicitly decided isolation is enforced elsewhere (agent-group config), document that; otherwise this is a regression.
|
||||
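The v1 defense-in-depth rule described above reduces to a small decision function. This is a sketch under stated assumptions: `effectiveReadonly` is a hypothetical helper, and the interfaces are trimmed to the two fields named in this comparison (`allowReadWrite`, `nonMainReadOnly`), not the full `AllowedRoot` / `MountAllowlist` types.

```typescript
// Trimmed to the fields discussed above; the real types carry more.
interface AllowedRoot {
  allowReadWrite: boolean;
}

interface MountAllowlist {
  nonMainReadOnly: boolean;
}

// Returns true when the mount must be read-only.
function effectiveReadonly(
  requestedReadonly: boolean,
  root: AllowedRoot,
  allowlist: MountAllowlist,
  isMain: boolean,
): boolean {
  if (requestedReadonly) return true; // caller asked for read-only
  if (!root.allowReadWrite) return true; // v2 check: root forbids RW
  if (allowlist.nonMainReadOnly && !isMain) return true; // v1-only check
  return false;
}
```

Dropping the third condition gives the v2 behaviour, which is why a non-main group can mount read-write wherever its root permits.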
@@ -1,542 +0,0 @@
|
||||
# db: v1 vs v2
|
||||
|
||||
## Scope
|
||||
|
||||
**v1 (historical, not runtime):**
|
||||
- `/Users/gavriel/nanoclaw4/src/v1/db.ts` (659 lines)
|
||||
- `/Users/gavriel/nanoclaw4/src/v1/db.test.ts` (592 lines)
|
||||
- `/Users/gavriel/nanoclaw4/src/v1/db-migration.test.ts` (60 lines)
|
||||
- **Single database:** `<STORE_DIR>/messages.db` (better-sqlite3)
|
||||
- No session/agent-runner separation; chat metadata + message history only
|
||||
|
||||
**v2 counterparts:**
|
||||
- Central: `/Users/gavriel/nanoclaw4/src/db/*.ts` (index, schema, connection, 9 modules + 7 migrations)
|
||||
- Session: `/Users/gavriel/nanoclaw4/src/db/session-db.ts` (200+ lines)
|
||||
- Chat SDK state: `/Users/gavriel/nanoclaw4/src/state-sqlite.ts` (250+ lines)
|
||||
- Docs: `docs/db.md`, `docs/db-central.md`, `docs/db-session.md`
|
||||
|
||||
---
|
||||
|
||||
## High-Level Shift
|
||||
|
||||
| Aspect | v1 | v2 |
|--------|----|----|
| **Database count** | 1 | 3 (central + per-session inbound + per-session outbound) |
| **Primary purpose** | Message history for a WhatsApp/multi-channel bot | Admin plane (identity, wiring, approvals) + per-session message queues |
| **Writer model** | Single process | Single writer per file (host writes central + inbound; container writes outbound) |
| **Schema evolution** | Ad-hoc ALTER TABLE in `createSchema()` | Versioned migrations in `src/db/migrations/` |
| **Multi-tenant** | No (one bot per instance) | Yes (multiple agent groups, isolation levels, approval flows) |
| **Key invariants** | Bot prefix filter, last-bot-timestamp cursor | Seq parity (even host, odd container), journal_mode=DELETE cross-mount visibility |
|
||||
|
||||
---
|
||||
|
||||
## Capability Map
|
||||
|
||||
| v1 Behavior | v2 Location | Status | Notes |
|-------------|-------------|--------|-------|
| **`chats` table** (jid, name, last_message_time, channel, is_group) | `messaging_groups` (central DB) | Kept, renamed | v1: chat metadata only, no messages stored. v2: per-platform chat, with `unknown_sender_policy`, routing to multiple agents. |
| **`messages` table** (id, chat_jid, sender, content, timestamp, is_from_me, is_bot_message, reply_to_*) | `messages_in` (session inbound) | Moved to session DB | v1: indexed by `timestamp`, filtered by bot prefix + flag. v2: indexed by `series_id` (recurring), seq-numbered, multi-kind (chat\|task\|system), host-written with even seq. Container reads pending/unprocessed. |
| **`scheduled_tasks` table** (id, group_folder, chat_jid, prompt, script, schedule_type, schedule_value, next_run, context_mode, status) | `messages_in` (session inbound, kind='task') | Moved to session messages | v1: separate table with status='active'\|'paused'\|'completed'. v2: unified into `messages_in` with kind='task', status per message. Scheduling engine lives in v2 `host-sweep.ts`. |
| **`task_run_logs` table** (task_id, run_at, duration_ms, status, result, error) | No direct counterpart | Removed | v2 doesn't persist task execution logs in DB; host-sweep handles recurrence in-memory and via `processing_ack` acks. |
| **`router_state` table** (key, value) | Not needed in v2 | Removed | v1: stored `last_timestamp`, `last_agent_timestamp` for polling cursor. v2: central DB and message tables eliminate need for manual state; routing is deterministic via `messaging_group_agents` and session queues. |
| **`sessions` table** (group_folder, session_id) | `sessions` (central DB) | Kept, extended | v1: maps group folder to session ID. v2: central registry: id, agent_group_id, messaging_group_id, thread_id, status, container_status, last_active. Keyed by `(agent_group_id, messaging_group_id, thread_id)` tuples. |
| **`registered_groups` table** (jid, name, folder, trigger_pattern, requires_trigger, is_main, container_config) | `agent_groups` (central DB) | Converted | v1: per-JID trigger; one agent per bot instance. v2: agent_groups independent of channels; multiple messaging_groups wire to each agent via `messaging_group_agents`. Container config moved to disk (`groups/<folder>/container.json`). |
| **Bot message filtering (is_bot_message flag + prefix)** | `messages_in` schema + container read filter | Kept, schema-level | v1: dual check (flag + `content LIKE 'Andy:%'` backstop). v2: container-side filtering in agent-runner. |
| **Reply context (reply_to_message_id, reply_to_content, reply_to_sender_name)** | `messages_in` columns | Kept | v1: nullable columns added via migration. v2: same schema, inherited from v1 shape. |
| **Chat metadata sync (last_message_time, channel, is_group)** | `messaging_groups` + lazy platform discovery | Converted | v1: timestamps in `chats.last_message_time`. v2: platform metadata in `messaging_groups`; `last_active` in `sessions` for activity tracking. |
| **Group discovery** (getAllChats) | Channel adapters + `messaging_groups` query | Removed from DB | v1: `getAllChats()` queries local DB. v2: adapters populate `messaging_groups` on first message; host discovers channels via routing, not polling DB. |
| **Message filtering by timestamp window** | `getNewMessages()`, `getMessagesSince()` with LIMIT subquery | Moved to session inbound | v1: queries with ORDER BY DESC, LIMIT N, then re-sort chronologically. v2: host writes to inbound; container polls. Cursor logic inverted (container drives processing, host feeds). |
| **Limit behavior (cap to N most recent)** | Hardcoded LIMIT 200 with timestamp filter | Kept, per-session | v1: `getNewMessages(limit=200)` by default. v2: `messages_in` has process-after and recurrence; container pulls per poll batch. |
| **Journal mode** | Not explicitly configured | DELETE (session), WAL (central) | v1: SQLite default (rollback journal, `journal_mode=DELETE`), never set explicitly. v2: `journal_mode=DELETE` set explicitly on session DBs for cross-mount visibility; WAL on central DB for consistency. See `db/connection.ts:17` and `db/session-db.ts:15`. |
| **Foreign key constraints** | Soft (checked in code) | Hard (enforced in schema) | v1: no FK constraints. v2: all references are `REFERENCES table(id)` with implicit RESTRICT. Central DB enforces full FK graph. |
| **Pragmas** | Not set | `foreign_keys=ON`, `busy_timeout=5000` | v1: defaults only. v2: explicit, cross-mount-safe timeouts. |
| **Index coverage** | `idx_timestamp` on messages, `idx_next_run` on tasks, `idx_status` on tasks, `idx_task_run_logs` on task run logs | Expanded | v1: 4 indexes. v2: series_id, user_roles scope, sessions lookup, agent_destinations target, pending_approvals action+status. |
|
||||
|
||||
---
|
||||
|
||||
## Schema Diff: Table-by-Table
|
||||
|
||||
### **Chats → Messaging Groups**
|
||||
|
||||
**v1 `chats` (PK: jid):**
|
||||
```sql
jid, name, last_message_time, channel, is_group
```
|
||||
|
||||
**v2 `messaging_groups` (PK: id, UNIQUE: channel_type, platform_id):**
|
||||
```sql
id, channel_type, platform_id, name, is_group, unknown_sender_policy, created_at
```
|
||||
|
||||
**Diff:**
|
||||
- v1: jid is the platform ID directly (`"tg:123"`, `"group@g.us"`)
|
||||
- v2: splits into `channel_type` ("telegram", "whatsapp") + `platform_id` (normalized ID)
|
||||
- v1: no `unknown_sender_policy`; dropped messages silently
|
||||
- v2: adds policy for first-time senders: `strict` (drop), `request_approval` (ask admin), `public` (allow)
|
||||
- v1: `last_message_time` per chat; v2: moved to `sessions.last_active`
|
||||
- **Table lifecycle:** `chats` is ephemeral in v2 (discovered lazily); `messaging_groups` is central registry
|
||||
|
||||
### **Messages → Messages In (Session)**
|
||||
|
||||
**v1 `messages` (PK: id + chat_jid):**
|
||||
```sql
id, chat_jid, sender, sender_name, content, timestamp, is_from_me, is_bot_message,
reply_to_message_id, reply_to_message_content, reply_to_sender_name
```
|
||||
|
||||
**v2 `messages_in` (PK: id, UNIQUE: seq):**
|
||||
```sql
id, seq, kind, timestamp, status, process_after, recurrence, series_id, tries,
platform_id, channel_type, thread_id, content
```
|
||||
|
||||
**Diff:**
|
||||
- v1: single-session messages; chat_jid is the routing key
|
||||
- v2: per-session inbound queue; platform_id + channel_type + thread_id from routing, not payload
|
||||
- v1: sender/sender_name as columns
|
||||
- v2: content is JSON (all fields, including sender, are inside)
|
||||
- v1: `is_bot_message` flag
|
||||
- v2: `kind` field (`'chat'`, `'task'`, `'system'`) replaces ad-hoc bot detection
|
||||
- v1: no seq, no status, no recurrence
|
||||
- v2: **seq invariant** — even numbers only (host-assigned); see `nextEvenSeq()` at `src/db/session-db.ts:75`
|
||||
- v1: `reply_to_*` columns preserved in v2
|
||||
- v1: indexed on timestamp; v2: indexed on series_id (for recurring task grouping)
|
||||
|
||||
### **Scheduled Tasks → Messages In + Processing**
|
||||
|
||||
**v1 `scheduled_tasks` (PK: id):**
|
||||
```sql
id, group_folder, chat_jid, prompt, script, schedule_type, schedule_value,
next_run, last_run, last_result, context_mode, status, created_at
```
|
||||
|
||||
**v2 spread across:**
|
||||
- `messages_in` (host writes kind='task')
|
||||
- `processing_ack` (container reads/writes status)
|
||||
- No persistent `task_run_logs`
|
||||
|
||||
**Diff:**
|
||||
- v1: tasks are a separate schema; v2: tasks are messages (kind='task')
|
||||
- v1: `prompt`, `script`, `context_mode` in task row; v2: in JSON `content`
|
||||
- v1: `schedule_type` (once, recurring), `schedule_value` (cron); v2: same, in `recurrence` field (cron string)
|
||||
- v1: `next_run`, `last_run` tracked in table; v2: `process_after`, `status` in messages_in; recurrence logic in host-sweep
|
||||
- v1: `last_result` stored; v2: no persistence; result is in container logs or delivery flow
|
||||
- v1: status='active'|'paused'|'completed'; v2: status='pending'|'processing'|'completed'|'failed'|'paused' (per message, unified with chat)
|
||||
|
||||
### **Task Run Logs → Removed**
|
||||
|
||||
**v1 `task_run_logs` (PK: id auto-increment, FK: task_id):**
|
||||
```sql
task_id, run_at, duration_ms, status, result, error
```
|
||||
|
||||
**v2:** Not in DB.
|
||||
|
||||
**Rationale:** v2 doesn't persist execution history in-DB; logs are streamed to host and written to operational logs. Task state is tracked via `processing_ack` status on the message itself, not a separate log table.
|
||||
|
||||
### **Router State → Removed**
|
||||
|
||||
**v1 `router_state` (PK: key):**
|
||||
```sql
key (last_timestamp, last_agent_timestamp), value
```
|
||||
|
||||
**v2:** Not needed.
|
||||
|
||||
**Rationale:** v1 used this to track polling cursors across restarts. v2 uses message IDs and seq numbers; polling logic is implicit in the session queue architecture.
|
||||
|
||||
### **Sessions Table**
|
||||
|
||||
**v1 `sessions` (PK: group_folder):**
|
||||
```sql
group_folder, session_id
```
|
||||
|
||||
**v2 `sessions` (PK: id):**
|
||||
```sql
id, agent_group_id, messaging_group_id, thread_id, agent_provider, status, container_status, last_active, created_at
```
|
||||
|
||||
**Diff:**
|
||||
- v1: simple folder → session mapping
|
||||
- v2: full session tuple: agent group + messaging group + thread, with lookup index on (messaging_group_id, thread_id)
|
||||
- v1: no status tracking; v2: `status` (active|paused|archived), `container_status` (stopped|starting|running)
|
||||
- v2: `agent_provider` override per session
|
||||
- v2: `last_active` timestamp for stale detection
|
||||
|
||||
### **Registered Groups → Agent Groups + Messaging Group Agents**
|
||||
|
||||
**v1 `registered_groups` (PK: jid):**
|
||||
```sql
jid, name, folder, trigger_pattern, requires_trigger, is_main, added_at, container_config
```
|
||||
|
||||
**v2 split into:**
|
||||
- `agent_groups` (PK: id): `id, name, folder, agent_provider, created_at` — container config on disk
|
||||
- `messaging_group_agents` (PK: id): bridges messaging groups to agents with wiring rules
|
||||
|
||||
**Diff:**
|
||||
- v1: one-to-one chat ↔ group; v2: many-to-many messaging group ↔ agent group
|
||||
- v1: `trigger_pattern` on chat; v2: `trigger_rules` (JSON) on the `messaging_group_agents` wiring
|
||||
- v1: `container_config` JSON in DB; v2: lives on disk at `groups/<folder>/container.json`
|
||||
- v1: `requires_trigger`, `is_main` flags; v2: `session_mode` (shared|per-thread|agent-shared) on wiring
|
||||
|
||||
### **New v2 Tables (Central)**
|
||||
|
||||
**`users`:**
|
||||
```sql
id, kind, display_name, created_at
```
|
||||
Platform identities: `"tg:123"`, `"discord:abc"`, `"phone:+1555..."`, `"email:a@x.com"`. No v1 counterpart (permissions were implicit).
|
||||
|
||||
**`user_roles`:**
|
||||
```sql
user_id, role (owner|admin), agent_group_id (NULL=global), granted_by, granted_at
```
|
||||
v1 had no explicit permissions; v2 enforces owner/admin privilege with audit trail.
|
||||
|
||||
**`agent_group_members`:**
|
||||
```sql
user_id, agent_group_id, added_by, added_at
```
|
||||
Non-privileged user membership. v1: implied (everyone could message the bot).
|
||||
|
||||
**`user_dms`:**
|
||||
```sql
user_id, channel_type, messaging_group_id, resolved_at
```
|
||||
Cached DM channel discovery (avoids repeated API calls). No v1 equivalent.
|
||||
|
||||
**`pending_questions`:**
|
||||
```sql
question_id, session_id, message_out_id, platform_id, channel_type, thread_id, title, options_json, created_at
```
|
||||
Interactive multiple-choice questions. v1: no interactive prompts.
|
||||
|
||||
**`agent_destinations`:**
|
||||
```sql
agent_group_id, local_name, target_type, target_id, created_at
```
|
||||
Per-agent ACL and name-resolution map for `send_message(to="name")`. Projected into session inbound as `destinations` table (see db-session.md §2.3). v1: no permission model for outbound sends.
|
||||
|
||||
**`pending_approvals`:**
|
||||
```sql
approval_id, session_id, request_id, action, payload, agent_group_id, channel_type, platform_id, platform_message_id, expires_at, status, title, options_json, created_at
```
|
||||
Approval queue for `install_packages`, `add_mcp_server`, OneCLI credential flows. v1: no approval model.
|
||||
|
||||
**`unregistered_senders` (via migration 008):**
|
||||
```sql
user_id, messaging_group_id, first_seen, last_seen
```
|
||||
Audit trail of unknown senders (strict unknown_sender_policy). v1: silently dropped.
|
||||
|
||||
**Chat SDK tables (via migration 002):**
|
||||
- `chat_sdk_kv` (key, value, expires_at)
|
||||
- `chat_sdk_subscriptions` (thread_id, subscribed_at)
|
||||
- `chat_sdk_locks` (thread_id, token, expires_at)
|
||||
- `chat_sdk_lists` (key, idx, value, expires_at)
|
||||
|
||||
Backing store for Chat SDK state adapter. No v1 equivalent (Chat SDK didn't exist).
|
||||
|
||||
### **New v2 Session Tables (Inbound, Host-written)**
|
||||
|
||||
**`delivered`:**
|
||||
```sql
message_out_id, platform_message_id, status, delivered_at
```
|
||||
Host tracks delivery outcomes without writing to container-owned outbound.db.
|
||||
|
||||
**`destinations` (projection from central):**
|
||||
```sql
name, display_name, type, channel_type, platform_id, agent_group_id
```
|
||||
Local ACL cache; rewritten on every container wake. Container queries this live to authorize sends.
|
||||
|
||||
**`session_routing` (single-row table):**
|
||||
```sql
id=1, channel_type, platform_id, thread_id
```
|
||||
Default reply routing for the session. Allows container to default outbound messages without querying central DB.
|
||||
|
||||
### **New v2 Session Tables (Outbound, Container-written)**
|
||||
|
||||
**`messages_out`:**
|
||||
```sql
id, seq (ODD), in_reply_to, timestamp, deliver_after, recurrence, kind, platform_id, channel_type, thread_id, content
```
|
||||
Container-produced: chat replies, edits, reactions, cards, system actions. Seq always odd (container-assigned); see `src/db/session-db.ts:76` for parity logic.
|
||||
|
||||
**`processing_ack`:**
|
||||
```sql
message_id, status (processing|completed|failed), status_changed
```
|
||||
Container writes status for each message_in it touched. Host polls and syncs back into messages_in (avoids container writing inbound.db).
|
||||
|
||||
**`session_state` (KV):**
|
||||
```sql
key, value, updated_at
```
|
||||
Container persistent store (Chat SDK session ID, conversation state). Cleared by `/clear`.
|
||||
|
||||
---
|
||||
|
||||
## Missing from v2
|
||||
|
||||
1. **Per-message sender/sender_name columns** — moved into JSON `content`. Container unpacks on read.
|
||||
2. **`task_run_logs` persistent history** — v2 streams logs to host; no in-DB history.
|
||||
3. **`last_agent_timestamp` cursor state** — implicit in session message seq.
|
||||
4. **Message filtering by bot prefix** — handled by container when writing to outbound; inbound doesn't filter.
|
||||
5. **Direct chat timestamp tracking** — replaced by `sessions.last_active` and message timestamps.
|
||||
6. **Single-writer assumption for one bot** — v2: one writer per file, across multiple agent groups (containers).
|
||||
|
||||
---
|
||||
|
||||
## Behavioral Discrepancies
|
||||
|
||||
### Sequence Numbering (Load-Bearing Invariant)
|
||||
|
||||
**v1:** No seq; messages identified by (id, chat_jid).
|
||||
|
||||
**v2:**
|
||||
- Host assigns **even** seq (2, 4, 6, …) to `messages_in`; see `nextEvenSeq()` at `src/db/session-db.ts:75–78`.
|
||||
- Container assigns **odd** seq (1, 3, 5, …) to `messages_out`; see container logic at `container/agent-runner/src/db/messages-out.ts:54`.
|
||||
- **Invariant:** seq is globally unique within a session across both tables. Parity disambiguates table membership for `edit_message(seq=5)` (odd → messages_out, even → messages_in).
|
||||
- **If violated:** edits target wrong table; messaging breaks.
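The parity rule above can be sketched as pure functions. This is an illustration only: `nextEven`, `nextOdd`, and `tableForSeq` are hypothetical names, not the real `nextEvenSeq(db)` helper, and `maxSeq` is assumed to be `MAX(seq)` across both tables (0 when both are empty).

```typescript
type SessionTable = "messages_in" | "messages_out";

// Host side: next even seq strictly greater than the current maximum.
function nextEven(maxSeq: number): number {
  return maxSeq % 2 === 0 ? maxSeq + 2 : maxSeq + 1;
}

// Container side: next odd seq strictly greater than the current maximum.
function nextOdd(maxSeq: number): number {
  return maxSeq % 2 === 0 ? maxSeq + 1 : maxSeq + 2;
}

// Parity alone tells an edit which table owns a given seq.
function tableForSeq(seq: number): SessionTable {
  return seq % 2 === 0 ? "messages_in" : "messages_out";
}
```

Because each side only ever produces its own parity, the two writers cannot collide on a seq even though neither writes to the other's file.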
|
||||
|
||||
### Message Status Lifecycle
|
||||
|
||||
**v1:** `messages` are immutable once written; `scheduled_tasks` have status (active|paused|completed).
|
||||
|
||||
**v2:** `messages_in` have status (pending|processing|completed|failed|paused). Container writes status into `processing_ack`; host syncs back. Processing is non-blocking (container reads when status='pending').
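A rough sketch of the host-side sync step, assuming in-memory stand-ins for the two tables. The status values come from the text; `AckRow`, `applyAcks`, and the rule that unknown ids are ignored are assumptions.

```typescript
type AckStatus = "processing" | "completed" | "failed";
type InboundStatus = "pending" | "paused" | AckStatus;

interface AckRow {
  messageId: string;
  status: AckStatus;
}

// Copy container-reported statuses back onto inbound messages.
// Acks for ids the host does not know are ignored (an assumption).
function applyAcks(
  inbound: Record<string, InboundStatus>,
  acks: AckRow[],
): Record<string, InboundStatus> {
  const next: Record<string, InboundStatus> = { ...inbound };
  for (const ack of acks) {
    if (Object.prototype.hasOwnProperty.call(next, ack.messageId)) {
      next[ack.messageId] = ack.status;
    }
  }
  return next;
}
```

The container only ever appends to its own side; the host is the single writer that folds those acks into `messages_in`.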
|
||||
|
||||
### Journal Mode (Cross-Mount Visibility)
|
||||
|
||||
**v1:** Not configured, so SQLite's default applies: a rollback journal (`journal_mode = DELETE`). better-sqlite3 does not change that default.
|
||||
|
||||
**v2:** **`journal_mode = DELETE` on session DBs** (see `db/session-db.ts:15`), **WAL on central** (see `db/connection.ts:17`).
|
||||
|
||||
**Rationale:** v1 is single-process. v2 has the host and the container accessing the same session DBs across a Docker or Apple Container mount. WAL depends on a shared-memory index (the `-shm` file) that readers and writers must both map, and that sharing is not reliable across a mount boundary, so one side can miss the other's writes. DELETE mode commits every write to the main database file, so readers see the latest state.
|
||||
|
||||
### Unknown Sender Handling
|
||||
|
||||
**v1:** Silently dropped or stored with no policy tracking.
|
||||
|
||||
**v2:** `unknown_sender_policy` on `messaging_groups`: `strict` (drop), `request_approval` (admin card), `public` (allow). Dropped senders tracked in `unregistered_senders` audit table (migration 008).
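The three policies map to three outcomes. A small sketch, where the policy strings come from the text and the action names are hypothetical labels:

```typescript
type UnknownSenderPolicy = "strict" | "request_approval" | "public";
type SenderAction = "drop_and_record" | "ask_admin" | "deliver";

// Decide what to do with a message from a sender the system has not seen.
function actionForUnknownSender(policy: UnknownSenderPolicy): SenderAction {
  switch (policy) {
    case "strict":
      return "drop_and_record"; // dropped, then logged in unregistered_senders
    case "request_approval":
      return "ask_admin"; // an approval card goes to an admin
    case "public":
      return "deliver"; // allowed through
  }
}
```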
|
||||
|
||||
### Recurring Tasks
|
||||
|
||||
**v1:** `scheduled_tasks.schedule_value` (cron); `schedule_type` (once|recurring); status tracking in row.
|
||||
|
||||
**v2:** `messages_in.recurrence` (cron string), `series_id` (groups occurrences). Host-sweep recalculates next run via cron parser; no persistence. Status per message (pending|paused|completed).
|
||||
|
||||
### Chat Metadata Sync
|
||||
|
||||
**v1:** `getAllChats()` queries local DB; `last_message_time` updated by each message insert.
|
||||
|
||||
**v2:** Metadata lives in `messaging_groups` (central, discovered lazily by adapters). Activity tracked in `sessions.last_active`. No global "last message" timestamp per chat.
|
||||
|
||||
### Destinations and Permissions
|
||||
|
||||
**v1:** No model; all agents can send to all chats.
|
||||
|
||||
**v2:**
|
||||
- Central: `agent_destinations` (source of truth)
|
||||
- Session: `destinations` (projection in inbound.db, rewritten on wake)
|
||||
- Container: queries `destinations` live; sends rejected if name not found
|
||||
- Invariant: if wiring changes mid-session and `writeDestinations()` isn't called, container sees stale data
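The stale-projection hazard can be shown with plain arrays standing in for the two tables. The `Destination` shape and both function names are hypothetical; only the projection-on-wake behavior and the reject-on-unknown-name rule come from the text.

```typescript
interface Destination {
  name: string;
  targetType: string; // assumed values such as "channel" or "agent"
  targetId: string;
}

// Full rewrite of the session-side copy, as done on container wake.
function projectDestinations(central: Destination[]): Destination[] {
  return central.map((d) => ({ ...d }));
}

// Container-side lookup: a send is rejected (null) if the name is unknown.
function resolveDestination(
  projection: Destination[],
  name: string,
): Destination | null {
  for (const d of projection) {
    if (d.name === name) return d;
  }
  return null;
}
```

If the central list gains an entry after the projection was taken, `resolveDestination` keeps returning `null` for it until the projection is rewritten.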
|
||||
|
||||
### Foreign Key Enforcement
|
||||
|
||||
**v1:** No constraints; referential integrity checked in code.
|
||||
|
||||
**v2:** All FKs enforced; central DB will reject orphaned references. Session DBs don't need as many FKs (immutable projections).
|
||||
|
||||
---
|
||||
|
||||
## Pragmas & Configuration
|
||||
|
||||
### v1
|
||||
|
||||
```javascript
// Implicit defaults — not set in code
```
|
||||
|
||||
### v2
|
||||
|
||||
**Central DB (src/db/connection.ts:17–18):**
|
||||
```javascript
_db.pragma('journal_mode = WAL');
_db.pragma('foreign_keys = ON');
```
|
||||
|
||||
**Session Inbound (src/db/session-db.ts:23–24):**
|
||||
```javascript
db.pragma('journal_mode = DELETE');
db.pragma('busy_timeout = 5000');
```
|
||||
|
||||
**Session Outbound (src/db/session-db.ts:31–32):**
|
||||
```javascript
// Opens readonly
db.pragma('busy_timeout = 5000');
```
|
||||
|
||||
---
|
||||
|
||||
## Migrations
|
||||
|
||||
### v1
|
||||
Ad hoc `ALTER TABLE` statements in `createSchema()` (src/v1/db.ts:82–134):
|
||||
- context_mode → scheduled_tasks
|
||||
- script → scheduled_tasks
|
||||
- is_bot_message → messages
|
||||
- is_main → registered_groups
|
||||
- channel, is_group → chats
|
||||
- reply_to_* → messages
|
||||
|
||||
No versioning; every table is created with `IF NOT EXISTS`, and each ALTER is wrapped in a try/catch that silently ignores failures.
|
||||
|
||||
### v2
|
||||
Numbered migrations in `src/db/migrations/` (001–009; 005 and 006 are missing):
|
||||
|
||||
1. **001-initial.ts** — all core tables (agent_groups, messaging_groups, users, user_roles, agent_group_members, user_dms, sessions, pending_questions)
|
||||
2. **002-chat-sdk-state.ts** — chat_sdk_kv, chat_sdk_subscriptions, chat_sdk_locks, chat_sdk_lists
|
||||
3. **003-pending-approvals.ts** — pending_approvals table with action, payload, status
|
||||
4. **004-agent-destinations.ts** — agent_destinations table + backfill from existing messaging_group_agents wirings
|
||||
5. **(missing)**
|
||||
6. **(missing)**
|
||||
7. **007-pending-approvals-title-options.ts** — adds title, options_json columns to pending_approvals
|
||||
8. **008-dropped-messages.ts** — unregistered_senders audit table
|
||||
9. **009-drop-pending-credentials.ts** — cleanup (if any)
|
||||
|
||||
**Runner:** `runMigrations()` (src/db/migrations/index.ts:28–60) tracks version in `schema_version` table; applies pending migrations in transaction.
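A sketch of how a version-tracked runner might select what to apply. It is not the real `runMigrations()`: the `Migration` shape and `pendingMigrations` are hypothetical, and the transaction itself is left out.

```typescript
interface Migration {
  version: number;
  name: string;
}

// Pending = every migration above the recorded version, ascending.
// Gaps in the numbering (no 5 or 6 here) need no special handling.
function pendingMigrations(
  all: Migration[],
  currentVersion: number,
): Migration[] {
  return all
    .filter((m) => m.version > currentVersion)
    .sort((a, b) => a.version - b.version);
}
```

Comparing against the stored version rather than counting applied migrations is what lets the numbering skip values safely.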
|
||||
|
||||
---
|
||||
|
||||
## Index Coverage
|
||||
|
||||
### v1
|
||||
|
||||
- `idx_timestamp` on `messages(timestamp)` — range queries for new messages
|
||||
- `idx_next_run` on `scheduled_tasks(next_run)` — sweep query for due tasks
|
||||
- `idx_status` on `scheduled_tasks(status)` — filter for active tasks
|
||||
- `idx_task_run_logs` on `task_run_logs(task_id, run_at)` — log lookup
|
||||
|
||||
### v2
|
||||
|
||||
- `idx_user_roles_scope` on `user_roles(agent_group_id, role)` — permission queries
|
||||
- `idx_sessions_agent_group` on `sessions(agent_group_id)` — session lookup per agent
|
||||
- `idx_sessions_lookup` on `sessions(messaging_group_id, thread_id)` — resolve session from channel+thread
|
||||
- `idx_messages_in_series` on `messages_in(series_id)` — recurring task grouping
|
||||
- `idx_agent_dest_target` on `agent_destinations(target_type, target_id)` — reverse lookup (find agents that can send to this target)
|
||||
- `idx_pending_approvals_action_status` on `pending_approvals(action, status)` — sweep query for pending/expired approvals
|
||||
|
||||
---
|
||||
|
||||
## Prepared Queries & Helpers
|
||||
|
||||
### v1 Helpers (src/v1/db.ts)
|
||||
|
||||
```
storeChatMetadata(jid, timestamp, name?, channel?, isGroup?)
  // INSERT OR REPLACE into chats (ON CONFLICT upsert)

storeMessage(NewMessage)
storeMessageDirect({id, chat_jid, sender, ...})
  // INSERT OR REPLACE into messages

getNewMessages(jids[], lastTimestamp, botPrefix, limit=200)
  // SELECT from messages, filter by jid list, timestamp > last, bot filter
  // Returns {messages, newTimestamp}

getMessagesSince(chatJid, sinceTimestamp, botPrefix, limit=200)
  // SELECT from messages, filter by chat, timestamp > since, bot filter, ORDER DESC + outer sort

getLastBotMessageTimestamp(chatJid, botPrefix)
  // SELECT MAX(timestamp) from messages WHERE (is_bot_message=1 OR content LIKE prefix)

createTask(ScheduledTask) / updateTask(id, fields) / getTaskById(id) / deleteTask(id)
  // Standard CRUD

getDueTasks()
  // SELECT * WHERE status='active' AND next_run <= now

updateTaskAfterRun(id, nextRun, lastResult)
  // UPDATE task set next_run, last_run, last_result, status

logTaskRun(TaskRunLog)
  // INSERT into task_run_logs

getRouterState(key) / setRouterState(key, value)
  // KV table

getSession(groupFolder) / setSession(groupFolder, sessionId) / deleteSession(groupFolder)
  // Simple mapping

getRegisteredGroup(jid) / setRegisteredGroup(jid, group) / getAllRegisteredGroups()
  // CRUD on registered_groups
```
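The "ORDER DESC + outer sort" pattern in `getMessagesSince` keeps the newest N rows and then returns them oldest-first. An in-memory sketch of that shape, assuming timestamps are ISO-8601 strings that compare correctly as text; `Msg` and `latestSince` are hypothetical names:

```typescript
interface Msg {
  id: string;
  timestamp: string; // assumed ISO-8601, so string comparison is chronological
}

// Keep the `limit` most recent messages after `since`, oldest first.
function latestSince(messages: Msg[], since: string, limit: number): Msg[] {
  const newestFirst = messages
    .filter((m) => m.timestamp > since)
    .sort((a, b) =>
      a.timestamp < b.timestamp ? 1 : a.timestamp > b.timestamp ? -1 : 0,
    );
  return newestFirst.slice(0, limit).reverse();
}
```

Sorting ascending with a plain limit would instead keep the oldest N, which is the wrong end of the window after a long downtime.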
|
||||
|
||||
### v2 Helpers
|
||||
|
||||
**Central DB (src/db/*.ts):**
|
||||
- `createAgentGroup`, `getAgentGroup`, `getAgentGroupByFolder`, `updateAgentGroup`, `deleteAgentGroup`
|
||||
- `createMessagingGroup`, `getMessagingGroup`, `getMessagingGroupByPlatform`, `updateMessagingGroup`, `deleteMessagingGroup`
|
||||
- `createMessagingGroupAgent`, `getMessagingGroupAgents`, `getMessagingGroupAgentByPair`, `updateMessagingGroupAgent`, `deleteMessagingGroupAgent`
|
||||
- `grantRole`, `revokeRole`, `getUserRoles`, `isOwner`, `isGlobalAdmin`, `isAdminOfAgentGroup`, `hasAdminPrivilege`
|
||||
- `createUser`, `upsertUser`, `getUser`, `getAllUsers`, `updateDisplayName`, `deleteUser`
|
||||
- `addMember`, `removeMember`, `getMembers`, `isMember`
|
||||
- `upsertUserDm`, `getUserDm`, `getUserDmsForUser`, `deleteUserDm`
|
||||
- `createSession`, `getSession`, `findSession`, `findSessionByAgentGroup`, `getSessionsByAgentGroup`, `getActiveSessions`, `getRunningSessions`, `updateSession`, `deleteSession`
|
||||
- `createPendingQuestion`, `getPendingQuestion`, `deletePendingQuestion`
|
||||
- `createPendingApproval`, `getPendingApproval`, `updatePendingApprovalStatus`, `deletePendingApproval`, `getPendingApprovalsByAction`
|
||||
|
||||
**Session DB (src/db/session-db.ts):**
|
||||
- `ensureSchema(dbPath, 'inbound'|'outbound')` — idempotent schema setup
|
||||
- `openInboundDb(dbPath)`, `openOutboundDb(dbPath)` — safe open with pragmas
|
||||
- `nextEvenSeq(db)` — helper for host seq assignment
|
||||
- `insertMessage(db, {id, kind, timestamp, platformId, channelType, threadId, content, processAfter, recurrence})`
|
||||
- `insertTask(db, {id, processAfter, recurrence, ...})`
|
||||
- `cancelTask(db, taskId)`, `pauseTask(db, taskId)`, `resumeTask(db, taskId)`
|
||||
- `upsertSessionRouting(db, {channel_type, platform_id, thread_id})`
|
||||
- `replaceDestinations(db, entries: DestinationRow[])`
|
||||
|
||||
---
|
||||
|
||||
## Key Invariants
|
||||
|
||||
### v1
|
||||
- **Bot message filtering:** is_bot_message flag + content prefix as backstop (for pre-migration rows)
|
||||
- **Cursor recovery:** getLastBotMessageTimestamp() to resume after stale downtime
|
||||
- **Single writer:** Process that imports db.ts owns all writes; no IPC
|
||||
- **Chat metadata immutability:** chats table updated only on metadata sync or first message, never deleted
|
||||
|
||||
### v2 (Load-Bearing)
|
||||
|
||||
1. **Single writer per file** — host writes central + inbound; container writes outbound only
|
||||
2. **Seq parity invariant** — even in messages_in, odd in messages_out; parity disambiguates edit target
|
||||
3. **Journal mode = DELETE on session DBs** — `DELETE` mode ensures cross-mount visibility (no WAL rollback issues)
|
||||
4. **Foreign keys enforced** — central DB rejects orphans; schema_version tracks migrations
|
||||
5. **Projection consistency** — `agent_destinations` (central) must be projected to `destinations` (session inbound) on every container wake; if wiring changes mid-session, must call `writeDestinations()` or container sees stale ACL
|
||||
6. **Seq monotonicity**: strictly increasing, never reused. `nextEvenSeq()` and the container logic both scan MAX(seq) across both tables before assigning the next value, so a seq is skipped whenever one side writes several rows in a row
|
||||
7. **Processing_ack as reverse channel** — container never writes to inbound.db; all status goes through outbound.db processing_ack, which host polls
|
||||
8. **Heartbeat out of band** — `.heartbeat` file mtime is liveness signal, not a DB write; avoids serialization with message processing
|
||||
9. **Admin at A implies membership in A** — invariant enforced in code (src/db/user-roles.ts, src/access.ts); no FK prevents deletion
|
||||
|
||||
---
|
||||
|
||||
## Worth Preserving?
|
||||
|
||||
**Mostly. The core v1 features are preserved or evolved:**
|
||||
- Message history: v1 stores per-chat; v2 per-session. Content and metadata shapes mostly compatible.
|
||||
- Scheduled tasks: v1 separate table; v2 unified into messages_in with kind='task'. Recurrence logic identical (cron).
|
||||
- Bot filtering: v1 dual-check (flag + prefix); v2 single flag. Backstop logic removed (assumed migrated by now).
|
||||
- Reply context: All v1 columns kept; v2 schema inherited.
|
||||
|
||||
**What's gone and why:**
|
||||
- `task_run_logs` — v2 doesn't persist execution history; logging is operational only.
|
||||
- `router_state` — v1 polling cursors; v2 implicit in message queuing.
|
||||
- Single-bot assumption — v2 is multi-tenant; this is a feature, not a loss.
|
||||
|
||||
**Migration path:** v1 `src/v1/db-migration.test.ts` shows the pattern: create legacy table, init v2 schema, backfill. Migration 004 does this for agent_destinations (backfill from messaging_group_agents wirings).
|
||||
@@ -1,38 +0,0 @@
|
||||
# env: v1 vs v2
|
||||
|
||||
## Scope
|
||||
- v1: `src/v1/env.ts` (42 LOC), `src/v1/config.ts` (63 LOC)
|
||||
- v2 counterparts: `src/env.ts` (identical), `src/config.ts` (identical structure); plus new consumers `src/webhook-server.ts`, `src/log.ts`, `src/container-runner.ts`, `container/build.sh`, `container/agent-runner/src/index.ts`
|
||||
|
||||
## Capability map
|
||||
|
||||
| v1 behavior | v2 location | Status | Notes |
|---|---|---|---|
| `readEnvFile(keys)` | `src/env.ts:11-42` | kept | Identical; reads `.env` without polluting `process.env` |
| `ASSISTANT_NAME` / `ASSISTANT_HAS_OWN_NUMBER` | `src/config.ts:8-12` | kept | Same read order: process.env → .env → default |
| `ONECLI_URL` | `src/config.ts:30` | kept | Used host-side + container-side |
| `TZ` + `isValidTimezone` guard | `src/config.ts:56-62` | kept | Passes to containers |
| `CONTAINER_IMAGE` / `CONTAINER_TIMEOUT` / `CONTAINER_MAX_OUTPUT_SIZE` | `src/config.ts:27-29` | kept | Same defaults |
| `MAX_MESSAGES_PER_PROMPT` | `src/config.ts:31` | kept | **Unused in v2** |
| `IDLE_TIMEOUT` | `src/config.ts:33` | kept | Used by container heartbeat model |
| `MAX_CONCURRENT_CONTAINERS` | `src/config.ts:34` | kept | Enforced in `container-runner.ts` |
| `POLL_INTERVAL` / `SCHEDULER_POLL_INTERVAL` / `IPC_POLL_INTERVAL` | `src/config.ts:13-32` | **dead code** | Defined but not imported anywhere in v2 runtime |
| `MOUNT_ALLOWLIST_PATH` / `SENDER_ALLOWLIST_PATH` | `src/config.ts:21-22` | kept | SENDER_ALLOWLIST_PATH unused (model replaced by `user_roles`) |
| `STORE_DIR` / `GROUPS_DIR` / `DATA_DIR` | `src/config.ts:23-25` | kept | `DATA_DIR` now hosts `v2.db` + `v2-sessions/<id>/*` |
| `buildTriggerPattern` / `getTriggerPattern` / `TRIGGER_PATTERN` / `DEFAULT_TRIGGER` | `src/config.ts:40-51` | kept | Used sparingly; trigger model largely DB-driven now |
| Container env injection via stdin JSON | `src/container-runner.ts:266-338` | **changed** | Replaced with `docker run -e`. New vars: `SESSION_INBOUND_DB_PATH`, `SESSION_OUTBOUND_DB_PATH`, `SESSION_HEARTBEAT_PATH`, `AGENT_PROVIDER`, `NANOCLAW_AGENT_GROUP_ID`, `NANOCLAW_AGENT_GROUP_NAME`, `NANOCLAW_MCP_SERVERS`, `NANOCLAW_ADMIN_USER_IDS` |
| `INSTALL_CJK_FONTS` | `container/build.sh:18-26`, `container/Dockerfile:13` | **new in v2** | Build-time arg, not runtime env |
| `WEBHOOK_PORT` (default 3000) | `src/webhook-server.ts:82` | **new in v2** | |
| `LOG_LEVEL` | `src/log.ts:16` | **new in v2** | |
|
||||
|
||||
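The read order in the table above (process.env → .env → default) can be sketched as follows. This is a minimal illustration, not the contents of `src/env.ts` or `src/config.ts`: the names `parseEnvText` and `resolveSetting` are hypothetical, and the `.env` text is passed in as a string so the sketch stays free of filesystem access.

```typescript
// Hypothetical helpers: only the read order (process.env, then .env, then default)
// and the "do not pollute process.env" property come from the document.
type EnvMap = Record<string, string>;

// Parse KEY=VALUE lines from .env text, keeping only the requested keys.
// Nothing is written to process.env.
function parseEnvText(text: string, keys: string[]): EnvMap {
  const wanted = new Set(keys);
  const result: EnvMap = {};
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue; // skip blanks and comments
    const eq = line.indexOf("=");
    if (eq === -1) continue; // not a KEY=VALUE line
    const key = line.slice(0, eq).trim();
    if (!wanted.has(key)) continue;
    let value = line.slice(eq + 1).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const first = value.charAt(0);
    if (value.length >= 2 && (first === '"' || first === "'") && value.endsWith(first)) {
      value = value.slice(1, -1);
    }
    result[key] = value;
  }
  return result;
}

// Resolve one setting: explicit environment wins, then the .env file, then the default.
function resolveSetting(
  key: string,
  processEnv: Record<string, string | undefined>,
  fileEnv: EnvMap,
  fallback: string,
): string {
  return processEnv[key] ?? fileEnv[key] ?? fallback;
}
```

Passing the environment in as a parameter keeps the lookup order easy to test; a real caller would presumably hand in `process.env`.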
## Missing from v2
|
||||
Nothing user-facing. Container-only vars (`SESSION_*_DB_PATH`, `AGENT_PROVIDER`, `NANOCLAW_*`) are dynamic per-session and never belong in `.env`.
|
||||
|
||||
## Behavioral discrepancies
|
||||
1. **Dead constants**: `POLL_INTERVAL`, `SCHEDULER_POLL_INTERVAL`, `IPC_POLL_INTERVAL` remain in `src/config.ts` but are not imported by any v2 runtime code — safe to delete
|
||||
2. **Container transport**: v1 piped config via stdin JSON; v2 injects via `-e` at spawn
|
||||
3. **Build-time vs runtime**: `INSTALL_CJK_FONTS` is a Dockerfile build-arg, not a process env var
|
||||
4. **Output markers**: v1's `---NANOCLAW_OUTPUT_START/END---` stdout markers are gone — v2 reads from `messages_out` table
|
||||
|
||||
## Worth preserving?
|
||||
Dead constants (`POLL_INTERVAL`, `SCHEDULER_POLL_INTERVAL`, `IPC_POLL_INTERVAL`) should be **removed** from `src/config.ts` — they're confusing carry-overs. Everything else is either actively used or deliberately dynamic. The `.env`-based config surface is byte-identical and correct to keep.
|
||||
@@ -1,154 +0,0 @@
|
||||
# formatting (test-only) : v1 vs v2
|
||||
|
||||
## Scope
|
||||
|
||||
- **v1**: `/Users/gavriel/nanoclaw4/src/v1/formatting.test.ts` (316 lines)
|
||||
- **v1 production sibling**: `/Users/gavriel/nanoclaw4/src/v1/router.ts` (43 lines) — `escapeXml()`, `formatMessages()`, `stripInternalTags()`, `formatOutbound()`, plus `/Users/gavriel/nanoclaw4/src/v1/config.ts` (63 lines) — `getTriggerPattern()`, `TRIGGER_PATTERN`, `buildTriggerPattern()`, `DEFAULT_TRIGGER`
|
||||
- **v2 counterparts**:
|
||||
- Inbound message formatting: `/Users/gavriel/nanoclaw4/container/agent-runner/src/formatter.ts` (228 lines) — `formatMessages()`, `categorizeMessage()`, `extractRouting()`
|
||||
- Outbound tag stripping: embedded in container delivery logic
|
||||
- Trigger patterns: moved to DB model (`messaging_group_agents.trigger_rules` JSON) — no code-level function
|
||||
- v2 tests: `/Users/gavriel/nanoclaw4/container/agent-runner/src/poll-loop.test.ts:26–84` (formatter section only)
|
||||
|
||||
---
|
||||
|
||||
## Test-case map
|
||||
|
||||
| v1 Test Case | v2 Formatter Handling | Status | Notes |
|
||||
|---|---|---|---|
|
||||
| **escapeXml: ampersands** (src/v1/formatting.test.ts:22–23) | `/container/agent-runner/src/formatter.ts:225` `escapeXml()` with `&` → `&amp;` | ✅ Preserved | Both use identical regex replacement. v2 escaping is used in `formatSingleChat()` for sender, time, text. |
|
||||
| **escapeXml: less-than** (test:26–27) | `formatter.ts:225` `escapeXml()` with `<` → `&lt;` | ✅ Preserved | Used in XML attributes and content. |
|
||||
| **escapeXml: greater-than** (test:30–31) | `formatter.ts:225` with `>` → `&gt;` | ✅ Preserved | Same. |
|
||||
| **escapeXml: double quotes** (test:34–35) | `formatter.ts:225` with `"` → `&quot;` | ✅ Preserved | Same. |
|
||||
| **escapeXml: multiple special characters** (test:38–39) | `formatter.ts:225` (regex composition) | ✅ Preserved | Single pass through all four replacements. |
|
||||
| **escapeXml: passthrough clean text** (test:42–43) | `formatter.ts:225` (no-op if no specials) | ✅ Preserved | Same. |
|
||||
| **escapeXml: empty string** (test:46–47) | `formatter.ts:225` (no-op on empty) | ✅ Preserved | Same. |
|
||||
| **formatMessages: single message with context header & time** (test:56–62) | `/container/agent-runner/src/formatter.ts:124–158` `formatChatMessages()` & `formatSingleChat()` | ⚠️ Changed | v1 formats as `<context timezone="UTC" />\n<messages>...\n</messages>` with full timestamp in US locale. v2 uses `<message id="seq" from="dest-name" sender="..." time="HH:MM">...` with 24-hour time only. No context header. |
|
||||
| **formatMessages: multiple messages** (test:64–84) | `formatter.ts:124–134` (batch wrapping in `<messages>` tag) | ⚠️ Changed | v2 wraps multiple chat messages in `<messages>` tags but structure differs: no timezone attribute, different time format, `from` attribute added. |
|
||||
| **formatMessages: escape sender names** (test:86–88) | `formatter.ts:157` `sender="${escapeXml(sender)}"` | ✅ Preserved | Same escaping strategy. |
|
||||
| **formatMessages: escape content** (test:91–93) | `formatter.ts:157` `${escapeXml(text)}` | ✅ Preserved | Same. |
|
||||
| **formatMessages: empty array** (test:96–99) | `formatter.ts:98` — returns empty string if no messages | ❌ Incompatible | v1 returns `<context>\n<messages>\n\n</messages>` even for empty. v2 returns empty string. Different expected output. |
|
||||
| **formatMessages: reply context (quoted_message)** (test:102–116) | `formatter.ts:143, 183–188` `formatReplyContext()` | ⚠️ Changed | v1 renders `reply_to="42"` attribute + `<quoted_message from="Bob">text</quoted_message>` child. v2 renders as `<reply-to sender="..." >preview</reply-to>` without message ID attribute. |
|
||||
| **formatMessages: omit reply when absent** (test:119–122) | `formatter.ts:183` (conditional) | ✅ Preserved | Both check for presence before rendering. |
|
||||
| **formatMessages: omit quoted_message when content missing** (test:125–136) | `formatter.ts:184` (check `replyTo.text`) | ✅ Preserved | Both guard against missing content. |
|
||||
| **formatMessages: escape reply context** (test:139–151) | `formatter.ts:188` `escapeXml()` on sender and text | ✅ Preserved | Same escaping applied. |
|
||||
| **formatMessages: timezone conversion** (test:154–160) | `formatter.ts:216–223` `formatTime()` — HH:MM UTC only | ❌ Incompatible | v1 uses `formatLocalTime()` (full locale string with date, month, am/pm) from `timezone.ts:26–37`. v2 uses 24-hour `HH:MM` UTC only; no timezone localization. |
|
||||
| **TRIGGER_PATTERN: matches @name at start** (test:170–171) | No v2 code equivalent | ❌ Not in v2 | v2 moved trigger rules to DB; no regex pattern in code. Router evaluates `messaging_group_agents.trigger_rules` JSON. |
|
||||
| **TRIGGER_PATTERN: case-insensitive** (test:174–176) | DB model (applied at runtime by router) | ❌ Not in v2 | Same behavior (case-insensitive in router) but no test coverage for trigger logic in v2. |
|
||||
| **TRIGGER_PATTERN: word boundary checks** (test:179–192) | DB model (router enforces) | ❌ Not in v2 | Router evaluates trigger rules; no unit tests for pattern matching. |
|
||||
| **getTriggerPattern: custom per-group trigger** (test:201–206) | `/src/router.ts` evaluates `messaging_group_agents.trigger_rules` at delivery time | ❌ Not tested in v2 | v2 has no unit test for custom trigger selection. Behavior preserved in router but untested. |
|
||||
| **getTriggerPattern: regex characters literal** (test:215–219) | DB-stored rule (router uses literal match or regex) | ❌ Not tested | v2 stores trigger as string in DB; runtime evaluation depends on router implementation (not inspected here). |
|
||||
| **stripInternalTags: single-line** (test:226–227) | No direct v2 function — embedded in polling | ❌ Not isolated | v1 regex `/<internal>[\s\S]*?<\/internal>/g` with `.trim()`. v2 container poll-loop does not test this; no dedicated outbound function in v2 agent-runner. |
|
||||
| **stripInternalTags: multi-line** (test:230–231) | Not tested in v2 | ❌ Not isolated | v1 regex handles `[\s\S]*?` (newlines included). |
|
||||
| **stripInternalTags: multiple blocks** (test:234–235) | Not tested in v2 | ❌ Not isolated | Regex global flag `/g` handles multiple. Not verified in v2 tests. |
|
||||
| **stripInternalTags: only internal tags** (test:238–239) | Not tested in v2 | ❌ Not isolated | v1 returns empty after trim; behavior not verified in v2. |
|
||||
| **formatOutbound: passthrough clean text** (test:244–245) | Not tested in v2 | ❌ Not isolated | v1 calls `stripInternalTags()` then returns. v2 does not have isolated test. |
|
||||
| **formatOutbound: empty after strip** (test:248–249) | Not tested in v2 | ❌ Not isolated | v1 returns empty if all was internal. |
|
||||
| **formatOutbound: strip tags from text** (test:252–253) | Not tested in v2 | ❌ Not isolated | v1 example: `<internal>thinking</internal>The answer is 42` → `The answer is 42`. |
|
||||
| **trigger gating: main group always processes** (test:277–279) | No unit test in v2; logic in `/src/router.ts` routing decision | ❌ Not tested | v1 shows that main groups bypass trigger check. Behavior likely preserved (main group always forwards to agent) but not verified by test. |
|
||||
| **trigger gating: main group ignores requiresTrigger flag** (test:282–284) | Not tested in v2 | ❌ Not tested | v1 shows `isMainGroup=true` overrides `requiresTrigger` flag. No v2 test. |
|
||||
| **trigger gating: non-main needs trigger (default)** (test:287–289) | Not tested in v2 | ❌ Not tested | v1 default behavior: non-main group requires trigger unless explicitly disabled. |
|
||||
| **trigger gating: custom per-group trigger enforcement** (test:302–309) | Not tested in v2 | ❌ Not tested | v1 shows per-group trigger override. Behavior in v2 DB but no test. |
|
||||
| **trigger gating: requiresTrigger=false disables check** (test:312–314) | Not tested in v2 | ❌ Not tested | v1 allows opting out of trigger requirement per group. |
|
||||
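The four `escapeXml` rows above can be illustrated with a short sketch. It assumes chained global replacements, with the ampersand handled first so that entities produced by later steps are not escaped twice. The actual body at `formatter.ts:225` is not shown in this document and may be structured differently.

```typescript
// Sketch only: escapes the four characters covered by the v1 tests.
// The ampersand must be replaced first; otherwise the "&" in "&lt;" would be re-escaped.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}
```

Clean text and the empty string pass through unchanged, which matches the passthrough and empty-string rows in the table.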
|
||||
---
|
||||
|
||||
## Missing from v2
|
||||
|
||||
1. **Timezone-aware time formatting**
|
||||
- v1: `formatLocalTime(utcIso, timezone)` in `src/v1/timezone.ts:26–37` converts UTC ISO timestamp to user's local timezone with full locale formatting (date, month, am/pm).
|
||||
- v2: `formatTime()` in `container/agent-runner/src/formatter.ts:216–223` only extracts `HH:MM` in UTC, no localization.
|
||||
- **Impact**: v2 loses per-agent timezone context. Timestamps appear in UTC only, potentially confusing users in different timezones.
|
||||
|
||||
2. **Context header with timezone attribute**
|
||||
- v1: Every message batch includes `<context timezone="..."/>` header.
|
||||
- v2: No context header; timestamp is a message attribute only.
|
||||
- **Impact**: Agent sees no explicit timezone declaration; must infer from message times or system prompt.
|
||||
|
||||
3. **Reply context with message ID attribute**
|
||||
- v1: `reply_to="<message_id>"` attribute on message; separate `<quoted_message from="...">content</quoted_message>` child.
|
||||
- v2: Consolidated into `<reply-to sender="...">preview</reply-to>` without message ID; preview truncated to 100 chars.
|
||||
- **Impact**: v2 loses structured reply tracking; agent can't reference specific message IDs in follow-ups.
|
||||
|
||||
4. **Message ID sequence in XML** (a v2 addition rather than a loss)
|
||||
- v1: No `id` attribute on messages (WhatsApp-era design).
|
||||
- v2: Each message has `id="seq"` (database sequence number).
|
||||
- **Impact**: Allows agent to reference messages by ID, but v1 tests do not verify this.
|
||||
|
||||
5. **Trigger pattern unit tests**
|
||||
- v1: Comprehensive tests for `getTriggerPattern()`, `TRIGGER_PATTERN`, case-insensitivity, word boundaries, regex escaping.
|
||||
- v2: No unit tests; trigger logic moved to DB and router. Untested.
|
||||
- **Impact**: Trigger matching behavior not verified by tests; regression risk if router changes.
|
||||
|
||||
6. **Internal tag stripping tests**
|
||||
- v1: `stripInternalTags()` and `formatOutbound()` tested for single-line, multi-line, multiple blocks, edge cases.
|
||||
- v2: No isolated tests for outbound tag stripping.
|
||||
- **Impact**: No verification that internal tags are reliably removed before delivery.
|
||||
|
||||
7. **Trigger gating (requiresTrigger flag) tests**
|
||||
- v1: Detailed tests of main-group bypass, per-group override, default behavior, flag combinations.
|
||||
- v2: No tests; logic moved to DB schema and router evaluation.
|
||||
- **Impact**: Trigger enforcement behavior not verified.
|
||||
|
||||
8. **Empty message batch handling**
|
||||
- v1: Explicitly returns `<context>\n<messages>\n\n</messages>` for empty array.
|
||||
- v2: Returns empty string.
|
||||
- **Impact**: No clear protocol for "no messages to process" signals.
|
||||
|
||||
---
|
||||
|
||||
## Behavioral discrepancies
|
||||
|
||||
### 1. Message XML structure (formatMessages)
|
||||
- **v1**: `<context timezone="..."/>\n<messages>\n<message sender="..." time="...">content</message>\n</messages>`
|
||||
- **v2**: `<message id="seq" from="dest-name" sender="..." time="HH:MM">content</message>` (no wrapper for single message)
|
||||
- **v1 line**: `src/v1/router.ts:9–23`
|
||||
- **v2 line**: `container/agent-runner/src/formatter.ts:124–158`
|
||||
|
||||
### 2. Time formatting
|
||||
- **v1**: Full locale string (e.g., "Jan 1, 2024, 1:30 PM") using `Intl.DateTimeFormat` with timezone localization (`src/v1/timezone.ts:26–37`)
|
||||
- **v2**: 24-hour UTC only (e.g., "13:30") without timezone info (`container/agent-runner/src/formatter.ts:216–223`)
|
||||
- **Impact**: v2 loses timezone awareness; agent cannot distinguish between user's local time and server time.
|
||||
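The difference between the two time formats can be sketched as below. Both functions are illustrative stand-ins, not the code at `formatter.ts:216–223` or `timezone.ts:26–37`: `formatTimeUtc` mimics the v2 `HH:MM` UTC output, and `formatLocalClock` is a hypothetical, clock-only reduction of v1's localized formatting (v1 also rendered date and am/pm, which this sketch omits).

```typescript
// v2-style: 24-hour HH:MM taken directly from the UTC fields.
function formatTimeUtc(iso: string): string {
  const d = new Date(iso);
  const hh = String(d.getUTCHours()).padStart(2, "0");
  const mm = String(d.getUTCMinutes()).padStart(2, "0");
  return `${hh}:${mm}`;
}

// v1-style localization, reduced to HH:MM for comparison.
// Intl.DateTimeFormat converts the instant into the given IANA timezone.
function formatLocalClock(iso: string, timeZone: string): string {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    hour: "2-digit",
    minute: "2-digit",
    hourCycle: "h23", // 00-23, avoids a "24:00" rendering at midnight
  }).formatToParts(new Date(iso));
  const get = (type: string): string =>
    parts.find((p) => p.type === type)?.value ?? "";
  return `${get("hour").padStart(2, "0")}:${get("minute").padStart(2, "0")}`;
}
```

For the same instant the two functions disagree whenever the user's zone is not UTC, which is the gap the impact note describes.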
|
||||
### 3. Reply context structure
|
||||
- **v1**: Two-part — `reply_to="<id>"` attribute + `<quoted_message from="...">text</quoted_message>` child element
|
||||
- **v2**: Single element — `<reply-to sender="...">100-char preview</reply-to>` (no ID, preview truncated)
|
||||
- **v1 line**: `src/v1/router.ts:12–16`
|
||||
- **v2 line**: `container/agent-runner/src/formatter.ts:143, 183–188`
|
||||
- **Impact**: v2 cannot support message-ID-based threading; loses structured reply metadata.
|
||||
|
||||
### 4. Trigger pattern matching
|
||||
- **v1**: Implemented as regex returned by `getTriggerPattern()` with word-boundary enforcement (`config.ts:40–49`)
|
||||
- **v2**: Stored in DB as JSON in `messaging_group_agents.trigger_rules`; evaluated by router at delivery time
|
||||
- **v1 line**: `src/v1/config.ts:40–49`
|
||||
- **v2 line**: `/src/router.ts` (router logic, not inspected in detail here)
|
||||
- **Impact**: v1 enforces word boundaries via regex (`\b`); v2 implementation unknown (DB-driven).
|
||||
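The invariants the v1 trigger tests cover (match at the start of the message, case-insensitivity, a word boundary after the name, regex characters treated literally) can be sketched as follows. This is an assumed reconstruction, not the body of `getTriggerPattern()` in `src/v1/config.ts`; the names `escapeRegex` and `sketchTriggerPattern` are hypothetical.

```typescript
// Escape characters that have special meaning in a regular expression,
// so a trigger such as "@a.b" is matched literally.
function escapeRegex(literal: string): string {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a pattern that matches the trigger only at the start of the text,
// ignoring case, and only when followed by a word boundary.
function sketchTriggerPattern(trigger: string): RegExp {
  return new RegExp(`^${escapeRegex(trigger)}\\b`, "i");
}
```

A v2 test for the router could assert the same cases against whatever evaluates `messaging_group_agents.trigger_rules`, if the router is meant to keep these semantics.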
|
||||
### 5. Empty message handling
|
||||
- **v1**: Returns `<context>\n<messages>\n\n</messages>` — preserves structure
|
||||
- **v2**: Returns empty string
|
||||
- **v1 line**: `src/v1/router.ts:22`
|
||||
- **v2 line**: `container/agent-runner/src/formatter.ts:98`
|
||||
|
||||
### 6. Internal tag stripping
|
||||
- **v1**: Regex-based, `.trim()` called after removal
|
||||
- **v2**: Not isolated; no dedicated function or test in v2 formatter
|
||||
- **v1 line**: `src/v1/router.ts:25–26`
|
||||
- **v2 line**: No equivalent
|
||||
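Since v2 has no isolated function for this, a candidate unit under test could look like the sketch below. The regex and the trailing `.trim()` are the ones quoted for v1 in the test-case map; the function bodies around them are an assumption about how v1's `stripInternalTags()` and `formatOutbound()` fit together.

```typescript
// Remove every <internal>...</internal> block (including multi-line ones),
// then trim the surrounding whitespace.
function stripInternalTags(text: string): string {
  return text.replace(/<internal>[\s\S]*?<\/internal>/g, "").trim();
}

// Outbound formatting: strip internal blocks and return what is left.
// The result is an empty string when the whole message was internal.
function formatOutbound(rawText: string): string {
  return stripInternalTags(rawText);
}
```

The non-greedy `*?` matters here: with a greedy quantifier, two internal blocks in one message would also swallow the visible text between them.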
|
||||
---
|
||||
|
||||
## Worth preserving?
|
||||
|
||||
**Partially.** The v1 formatting test suite is **essential for documenting lost functionality**, not for v2 regression testing. Key behaviors that should be preserved in v2 but are currently missing:
|
||||
|
||||
1. **Timezone-aware message timestamps** — v2 should restore `formatLocalTime()` from `src/v1/timezone.ts` and include timezone context in the XML header. Without this, agents cannot reason about when messages arrived relative to the user's clock.
|
||||
|
||||
2. **Reply context with message IDs** — v2's truncated reply preview is lossy. Consider restoring the `reply_to="<id>"` attribute so agents can reference prior messages by sequence number for structured threading.
|
||||
|
||||
3. **Trigger pattern unit tests** — v2 moved trigger logic to the DB but lost test coverage. The DB schema and router must enforce the same invariants (word boundaries, case-insensitivity, custom per-group overrides) that v1 tested. Recommend adding integration tests to `src/router.ts` or `src/channels/adapter.ts` to verify trigger matching.
|
||||
|
||||
4. **Internal tag stripping tests** — v2 agent-runner should include unit tests for internal-tag stripping (the v1 `stripInternalTags()` behavior, wherever v2 implements it) to prevent regressions when Claude emits `<internal>` thinking tags.
|
||||
|
||||
The v1 test file serves as a **specification document** for channel formatting and trigger gating that v2 partially refactored away. Keeping it in the repo (even if it is no longer run) documents the intended semantics.
|
||||
|
||||
@@ -1,38 +0,0 @@
|
||||
# group-folder: v1 vs v2
|
||||
|
||||
## Scope
|
||||
- v1: `src/v1/group-folder.ts` (45 LOC), `group-folder.test.ts` (35 LOC) — validation + path resolution only
|
||||
- v2 counterparts:
|
||||
- `src/group-folder.ts` (45 LOC) — byte-identical to v1
|
||||
- `src/group-init.ts` (128 LOC) — **new** filesystem bootstrap
|
||||
- `src/container-config.ts` (115 LOC) — **new** per-group container.json management
|
||||
- `src/group-folder.test.ts` (35 LOC) — identical to v1
|
||||
|
||||
## Capability map
|
||||
|
||||
| v1 behavior | v2 location | Status | Notes |
|
||||
|---|---|---|---|
|
||||
| `GROUP_FOLDER_PATTERN` (alphanumeric + `-` + `_`, 1-64) | `group-folder.ts:5-6` | identical | |
|
||||
| Reserved folder `global` | `group-folder.ts:6` | identical | `RESERVED_FOLDERS` set |
|
||||
| `isValidGroupFolder()` (reject empty, whitespace, traversal, absolute) | `group-folder.ts:8-16` | identical | |
|
||||
| `assertValidGroupFolder()` | `group-folder.ts:18-22` | identical | |
|
||||
| `resolveGroupFolderPath()` + `ensureWithinBase()` | `group-folder.ts:31-36` | identical | |
|
||||
| `resolveGroupIpcPath()` (resolves `data/ipc/<folder>`) | `group-folder.ts:38-44` | kept | IPC dir is legacy — no longer used since v2 moved to session DBs |
|
||||
| Filesystem scaffold (CLAUDE.md, skills, overlays) | — | **new in v2** | `group-init.ts:48-127` |
|
||||
| Global memory symlink (`.claude-global.md` → `/workspace/global/CLAUDE.md`) | `group-init.ts:55-70` | **new** | Uses `lstat` to detect dangling symlinks |
|
||||
| Per-group `container.json` init | `group-init.ts:83-85` + `container-config.ts:109-114` | **new** | Graceful fallback on corruption |
|
||||
| `.claude-shared` session dir | `group-init.ts:87-92` | **new** | Under `data/v2-sessions/<id>/` |
|
||||
| `settings.json` with `CLAUDE_CODE_*` flags | `group-init.ts:94-98` | **new** | |
|
||||
| Recursive skill copy from `container/skills/` | `group-init.ts:100-107` | **new** | |
|
||||
| Per-group agent-runner src overlay copy | `group-init.ts:109-117` | **new** | |
|
||||
| Idempotent init (every step gates on `fs.existsSync()`) | `group-init.ts:44-127` | **new** | Safe to re-run |
|
||||
| Step logging via `log.info()` | `group-init.ts:119-126` | **new** | |
|
||||
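The validation rules in the first rows of the table can be sketched as below. The constant and function names come from the table, but the exact regex and the checks inside `isValidGroupFolder()` are assumptions; the real `group-folder.ts` may order or express them differently.

```typescript
// Assumed pattern: letters, digits, "-" and "_", 1 to 64 characters.
const GROUP_FOLDER_PATTERN = /^[A-Za-z0-9_-]{1,64}$/;
const RESERVED_FOLDERS = new Set(["global"]);

// Reject empty names, surrounding whitespace, path separators, traversal
// segments, and reserved names. The character class already excludes
// "/", "\\" and ".", so "../x" and "/abs" fail the pattern check.
function isValidGroupFolder(folder: string): boolean {
  if (folder !== folder.trim()) return false;
  if (!GROUP_FOLDER_PATTERN.test(folder)) return false;
  if (RESERVED_FOLDERS.has(folder.toLowerCase())) return false;
  return true;
}

// Throwing variant for call sites that cannot continue with a bad name.
function assertValidGroupFolder(folder: string): void {
  if (!isValidGroupFolder(folder)) {
    throw new Error(`Invalid group folder: ${JSON.stringify(folder)}`);
  }
}
```

Keeping this layer free of filesystem calls is what lets `group-init.ts` own all file creation, as the section goes on to note.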
|
||||
## Missing from v2
|
||||
None. All v1 validation/resolution behavior is preserved byte-for-byte.
|
||||
|
||||
## Behavioral discrepancies
|
||||
None on the validation layer. v2 adds the filesystem-scaffold layer as a separate module (`group-init.ts`) so validation stays pure.
|
||||
|
||||
## Worth preserving?
|
||||
Clean split — keep as-is. `group-folder.ts` = names + paths; `group-init.ts` = file creation. Both modules are small and single-purpose.
|
||||
@@ -1,48 +0,0 @@
|
||||
# group-queue: v1 vs v2
|
||||
|
||||
## Scope
|
||||
- v1: `src/v1/group-queue.ts` (325 LOC), `group-queue.test.ts` (457 LOC) — in-memory per-group state machine, IPC-file dispatch
|
||||
- v2: **no equivalent class**. Serialization is now DB-based and distributed across `src/session-manager.ts`, `src/host-sweep.ts`, `src/container-runner.ts`, `src/delivery.ts`
|
||||
|
||||
## Capability map
|
||||
|
||||
| v1 behavior | v2 location | Status | Notes |
|
||||
|---|---|---|---|
|
||||
| Per-group message queue | `inbound.db.messages_in` + `status='pending'` | replaced | Atomic status transitions serialize work per-session |
|
||||
| Per-group task queue | `inbound.db.messages_in` with `kind='task'` | replaced | Same table; `kind` discriminates |
|
||||
| `MAX_CONCURRENT_CONTAINERS` global cap | `container-runner.ts:42-52` `activeContainers` Map + `wakeContainer` dedup | kept | Enforced at spawn |
|
||||
| One container per group invariant | One container per **session** | redefined | Session is identity unit now |
|
||||
| Task-before-message priority (`drainGroup`) | `host-sweep.ts` recurrence + `delivery.ts` active poll | **partially lost** | No priority; polled by `process_after` timestamp ordering |
|
||||
| Exponential retry backoff | `host-sweep.ts:145-147` `BACKOFF_BASE_MS * 2^tries` | kept | Max 5 tries, same shape |
|
||||
| Idle preemption (`notifyIdle`/`closeStdin`) | heartbeat file mtime | **removed** | No interrupt signal — container polls continuously |
|
||||
| Message dispatch to active container (`sendMessage`) | Write to `messages_in` table | replaced | Host writes; container polls |
|
||||
| Cascading drain on task arrival | `delivery.ts` (~1s) + `host-sweep.ts` (~60s) polls | **async-ized** | Work discovery on next tick, not synchronous |
|
||||
| Shutdown without kill | containers continue under `--rm` | similar | Host shutdown does not stop containers |
|
||||
| Task dedup (`pendingTasks.some(t => t.id === id)`) | PK on `messages_in.id` | partial | Unique ID prevents DB duplicates; does not prevent two distinct rows with same series_id |
|
||||
| `drainWaiting` (waiting-group fairness) | Implicit: any session can wake if slot free | async | No explicit fairness |
|
||||
|
||||
## Serialization model diff
|
||||
**v1 (push-based):** `GroupState` in memory per group: `active`, `pendingMessages`, `pendingTasks`, `idleWaiting`, `runningTaskId`. `drainGroup()` synchronously dispatches. IPC file write signals container readiness. State lost on restart.
|
||||
|
||||
**v2 (pull-based via DB):** `messages_in.status` is the queue (`pending` → `processing` → `completed`/`failed`). Host writes rows + calls `wakeContainer()`; container polls + atomic UPDATE to take work. One writer per DB file (host→inbound, container→outbound) eliminates cross-mount contention. Heartbeat file mtime replaces IPC for liveness. State persisted; survives crashes.
|
||||
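The pull-based model can be illustrated with an in-memory sketch. It models three things mentioned in this section: the compare-and-set claim on `status`, ordering by `process_after` with ties broken by insert sequence, and the `BACKOFF_BASE_MS * 2^tries` retry delay capped at 5 tries. The row shape, the function names, and the point at which `tries` is incremented are all assumptions; the real code works against SQLite rather than arrays.

```typescript
type Status = "pending" | "processing" | "completed" | "failed";

interface QueueRow {
  id: string;
  seq: number;          // insert order, used as the tie-breaker
  status: Status;
  processAfter: number; // epoch ms before which the row is not due
  tries: number;
}

const MAX_TRIES = 5;

// Pick the next due row: earliest processAfter first, then lowest seq.
function nextDue(rows: QueueRow[], now: number): QueueRow | undefined {
  return rows
    .filter((r) => r.status === "pending" && r.processAfter <= now)
    .sort((a, b) => a.processAfter - b.processAfter || a.seq - b.seq)[0];
}

// Compare-and-set claim: succeeds only if the row is still pending.
// This stands in for an atomic conditional UPDATE in the database.
function claim(row: QueueRow): boolean {
  if (row.status !== "pending") return false;
  row.status = "processing";
  return true;
}

// Exponential backoff: base * 2^tries.
function backoffMs(baseMs: number, tries: number): number {
  return baseMs * 2 ** tries;
}

// Reset a stale row for another attempt, or fail it after MAX_TRIES.
function scheduleRetry(row: QueueRow, now: number, baseMs: number): void {
  row.tries += 1;
  if (row.tries >= MAX_TRIES) {
    row.status = "failed";
    return;
  }
  row.status = "pending";
  row.processAfter = now + backoffMs(baseMs, row.tries);
}
```

Because the claim only succeeds from `pending`, a second poller that reaches the same row gets `false`, which is the single-execution property the capability map relies on.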
|
||||
## Missing from v2
|
||||
1. **Idle-state preemption** — v1 could interrupt an idle container on task arrival via `closeStdin`. v2 has no interrupt; container finishes current work and polls again
|
||||
2. **Synchronous drain cascade** — v1's `drainGroup` immediately ran the next item; v2 discovers it on the next poll tick (~1s active, ~60s sweep)
|
||||
3. **In-memory task dedup** — v1 checked pending-task list before enqueue. v2 can have two task rows with the same series_id coexisting (both pending) — relies on atomic `status` update for single-execution, best-effort
|
||||
4. **Priority ordering** — v1 tasks preempted messages; v2 is timestamp-ordered only
|
||||
|
||||
## Behavioral discrepancies
|
||||
| Aspect | v1 | v2 |
|
||||
|---|----|----|
|
||||
| Wake trigger | on enqueue (sync) | on `wakeContainer()` call, or poll finding due message |
|
||||
| Idle timeout | implicit via IPC | explicit heartbeat mtime (10 min) |
|
||||
| Task ordering | FIFO within group, tasks preempt messages | `process_after` timestamp; ties by insert seq |
|
||||
| Retry | host `scheduleRetry()` | host sweep detects stale, increments `tries`, sets backoff |
|
||||
| Concurrency cap | same | same (enforced in `spawnContainer` dedup) |
|
||||
|
||||
## Worth preserving?
|
||||
1. **Explicit task dedup** — add `(kind, series_id, session_id)` unique index on `messages_in`, or dedup in `host-sweep.ts` before inserting retry rows. Currently best-effort via atomic status update
|
||||
2. **Priority ordering** — add a `priority` column or document the ~1s task-wake latency as the SLA
|
||||
3. **Idle preemption** — not critical; 1s polling is acceptable for most workflows
|
||||
4. **Fairness** — v1's `drainWaiting` ensured no group starved. v2 is fair by timestamp but untested under concurrent load. Monitor in production
|
||||
@@ -1,70 +0,0 @@
|
||||
# host index: v1 vs v2
|
||||
|
||||
## Scope
|
||||
- v1: `src/v1/index.ts` (647 LOC) — monolithic entry: config, DB, state, channels, queues, scheduler, IPC watcher, message loop
|
||||
- v2: `src/index.ts` (345 LOC) — lean entry: DB+migrations, channels, delivery/sweep polls, OneCLI handler
|
||||
|
||||
## Startup sequence diff
|
||||
|
||||
| # | v1 step | v2 step | Status |
|
||||
|---|---------|---------|--------|
|
||||
| 1 | `ensureContainerRuntimeRunning()` + `cleanupOrphans()` | same | kept |
|
||||
| 2 | `initDatabase()` | `initDb()` + `runMigrations()` | enhanced (explicit migrations) |
|
||||
| 3 | `loadState()` — cursor, groups, agent timestamps | — | removed (no global state) |
|
||||
| 4 | OneCLI `ensureAgent` per group | — | removed (now per-wake in `container-runner.ts`) |
|
||||
| 5 | `restoreRemoteControl()` | — | removed |
|
||||
| 6 | SIGTERM/SIGINT handlers | same | kept |
|
||||
| 7 | `handleRemoteControl` bind | — | removed |
|
||||
| 8 | Channel options + callbacks | `initChannelAdapters()` | rewritten (adapter API) |
|
||||
| 9 | Channel discovery + connection | absorbed into adapters | — |
|
||||
| 10 | `startSchedulerLoop()` | — | removed (folded into `startHostSweep`) |
|
||||
| 11 | `startIpcWatcher()` | — | removed (no IPC in v2) |
|
||||
| 12 | `startSessionCleanup()` | — | removed (folded into `startHostSweep`) |
|
||||
| 13 | `queue.setProcessMessagesFn()` | — | removed (GroupQueue gone) |
|
||||
| 14 | `recoverPendingMessages()` | — | **removed** (implicit in sweep) |
|
||||
| 15 | `startMessageLoop()` (polling) | `startActiveDeliveryPoll()` + `startSweepDeliveryPoll()` | **fundamentally changed** (event-driven) |
|
||||
| 16 | — | `startHostSweep()` | **new** |
|
||||
| 17 | — | `startOneCLIApprovalHandler()` | **new** |
|
||||
|
||||
## Capability map
|
||||
|
||||
| v1 behavior | v2 location | Status | Notes |
|
||||
|---|---|---|---|
|
||||
| Arg/env parsing | `src/config.ts` (shared) | kept | |
|
||||
| Central DB init | `src/index.ts:47-50` | kept | + `runMigrations()` |
|
||||
| Container runtime bring-up | `src/index.ts:52-54` | kept | identical |
|
||||
| Global cursor + timestamps state | — | **removed** | v2 session-scoped state in outbound.db |
|
||||
| Periodic message polling loop | — | **removed** | Replaced by event-driven delivery + 60s sweep |
|
||||
| OneCLI group-wide sync at startup | — | **removed** | Per-wake in `container-runner.ts:303` |
|
||||
| Remote control subsystem | — | **removed** | No equivalent — feature deferred |
|
||||
| Group message queue (`GroupQueue`) | — | **removed** | DB-based serialization |
|
||||
| Channel adapter array + callbacks | `src/channels/channel-registry.ts` | refactored | `ChannelAdapter` interface |
|
||||
| Pending message recovery on startup | — | **removed** | Sweep detects stale containers + resets messages |
|
||||
| IPC watcher (dynamic group add) | — | **removed** | Static topology at startup; restart to add groups |
|
||||
| Signal handlers | `src/index.ts:339-340` | kept | Simplified teardown |
|
||||
| Top-level error handling | `src/index.ts:342-345` | kept | Same fatal exit |
|
||||
|
||||
## Missing from v2
|
||||
1. **Polling message loop** (v1:370-459) — replaced by event-driven + sweep (net improvement)
|
||||
2. **GroupQueue state machine** — now DB-based
|
||||
3. **Cross-restart cursor state** — no `lastAgentTimestamp` persisted; recovery implicit via DB scan
|
||||
4. **Remote control** — gone
|
||||
5. **Explicit `recoverPendingMessages()`** — implicit in sweep; worth verifying via post-crash test
|
||||
6. **IPC watcher** (`startIpcWatcher`) — cannot add groups dynamically; restart required
|
||||
7. **Scheduler loop** — merged into sweep's due-message wake
|
||||
|
||||
## Behavioral discrepancies
|
||||
| Aspect | v1 | v2 |
|
||||
|---|----|----|
|
||||
| Startup time | ~500ms (long loop init) | ~200ms |
|
||||
| Message fetch | polling every POLL_INTERVAL | event-driven callbacks + 1s delivery poll |
|
||||
| Container spawn | on-demand via GroupQueue | per-message wake via router/sweep |
|
||||
| Group topology | dynamic (IPC watcher) | static at startup |
|
||||
| Error recovery | per-message cursor rollback | implicit via stale detection |
|
||||
| Shutdown | GroupQueue 10s grace then disconnect | stop handlers/polls/sweep/adapters in order |
|
||||
|
||||
## Worth preserving?
|
||||
1. **Polling loop**: No, event-driven is superior. Verify under load that delivery-poll latency has not regressed relative to the old `POLL_INTERVAL`
|
||||
2. **Pending-message recovery**: Worth verifying explicitly: kill a container mid-message, restart the host, and confirm re-delivery within ≤5s. If the sweep doesn't cover this, add a startup-phase scan
|
||||
3. **Remote control**: Unknown — either restore as opt-in skill or document removal
|
||||
4. **Dynamic group add (IPC watcher)**: Probably not worth restoring. The modern flow is "admin skill adds group to DB, restart", but document that a restart is required
|
||||
@@ -1,240 +0,0 @@
|
||||
# IPC: v1 vs v2
|
||||
|
||||
## Scope
|
||||
|
||||
### v1
|
||||
- **Host side:** `/Users/gavriel/nanoclaw4/src/v1/ipc.ts` (127 lines) — file-system watcher, task authorization, message routing
|
||||
- **Auth/handshake tests:** `/Users/gavriel/nanoclaw4/src/v1/ipc-auth.test.ts` (614 lines) — authorization gates, schedule types, cron validation
|
||||
- **Container side:** `/Users/gavriel/nanoclaw4/container/agent-runner/src/v1/ipc-mcp-stdio.ts` (509 lines) — MCP server over stdio, file-based message writes
|
||||
- **Total v1 codebase:** ~1,250 lines (v1/ subtree)
|
||||
|
||||
### v2 counterparts
|
||||
This is not a file-for-file mapping. The entire IPC abstraction layer has been replaced with SQLite DBs:
|
||||
|
||||
- **Host delivery/routing:** `/Users/gavriel/nanoclaw4/src/delivery.ts` (912 lines) — polls outbound.db, delivers, handles system actions
|
||||
- **Host sweep/recurrence:** `/Users/gavriel/nanoclaw4/src/host-sweep.ts` (174 lines) — 60s maintenance, stale detection via heartbeat, processing_ack sync
|
||||
- **Session setup/DB:** `/Users/gavriel/nanoclaw4/src/session-manager.ts` (361 lines) — DB paths, folder init, destinations + routing writes
|
||||
- **Container poll loop:** `/Users/gavriel/nanoclaw4/container/agent-runner/src/poll-loop.ts` (200+ lines) — fetches messages_in, marks status in processing_ack
|
||||
- **Container destinations:** `/Users/gavriel/nanoclaw4/container/agent-runner/src/destinations.ts` (118 lines) — reads inbound.db's destinations table live
|
||||
- **DB layer (host):** `src/db/session-db.ts` — insertMessage, getDueOutboundMessages, markDelivered, syncProcessingAcks, etc.
|
||||
- **DB layer (container):** `container/agent-runner/src/db/{messages-in,messages-out,session-state,connection}.ts`
|
||||
- **Schema:** `/Users/gavriel/nanoclaw4/docs/db-session.md` (184 lines) — definitive per-session DB layout
|
||||
|
||||
---
|
||||
|
||||
## Paradigm shift
|
||||
|
||||
**v1: IPC as explicit message files + stdio tunnel + MCP server**
|
||||
|
||||
In v1, the host spawned an MCP server on each container's stdio. The container's `ipc-mcp-stdio.ts` exposed tools (`send_message`, `schedule_task`, `register_group`, etc.) that wrote JSON files to the host's `data/ipc/{groupFolder}/{messages|tasks}/` directory. The host's `ipc.ts` file-watcher scanned these directories every `IPC_POLL_INTERVAL` (~1s), parsed the JSON, applied authorization gates (isMain? folder-match?), executed side effects (DB writes, group registration), and deleted the files. Ordering, atomicity, and backpressure were implicit in the filesystem.
|
||||
|
||||
**v2: Everything is a message in two persistent DBs**
|
||||
|
||||
The IPC abstraction has been *entirely removed*. All host↔container communication now flows through two SQLite files per session:

- **inbound.db** (host writes, container reads): `messages_in` for inbound chat/tasks, `destinations` for the routing map, `session_routing` for default reply channel
- **outbound.db** (container writes, host reads): `messages_out` for agent responses, `processing_ack` for status acks, `session_state` for continuation storage

There is no MCP server inside the container that exposes system tools. Instead:

- **Container side** calls `writeMessageOut()` directly, writing a JSON `content` blob with `action="schedule_task"` (or similar) into the `messages_out` table.
- **Host side** polls `getDueOutboundMessages()` from outbound.db, deserializes the `content`, and in `handleSystemAction()` interprets the action, validates it, and applies it directly to inbound.db (no IPC file write).

The single-writer-per-file invariant (host writes inbound.db, container writes outbound.db) replaces the file-system locking and atomic rename semantics.
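
The container-writes, host-interprets flow described above can be sketched with an in-memory array standing in for the `messages_out` table. This is only an illustration under stated assumptions: the row shape, `writeSystemAction`, and `drainSystemActions` are hypothetical names, and the real code uses SQLite through `writeMessageOut()` and `handleSystemAction()`.

```typescript
// In-memory stand-in for the messages_out table (the real system uses SQLite).
type MessageOutRow = {
  seq: number;
  kind: string;
  content: string; // JSON blob, as in the document
  status: "pending" | "delivered" | "failed";
};

// Hypothetical payload shape: an action name plus arbitrary fields.
type SystemAction = { action: string; [key: string]: unknown };

const messagesOut: MessageOutRow[] = [];

// Container side: serialize the action into the `content` column and return.
function writeSystemAction(seq: number, payload: SystemAction): void {
  messagesOut.push({ seq, kind: "system", content: JSON.stringify(payload), status: "pending" });
}

// Host side: read pending system rows, parse the content, dispatch on `action`.
function drainSystemActions(handlers: Record<string, (p: SystemAction) => void>): string[] {
  const handled: string[] = [];
  for (const row of messagesOut) {
    if (row.status !== "pending" || row.kind !== "system") continue;
    const payload = JSON.parse(row.content) as SystemAction;
    const handler = handlers[payload.action];
    if (handler) {
      handler(payload);
      row.status = "delivered";
      handled.push(payload.action);
    } else {
      row.status = "failed"; // unknown action: reject rather than guess
    }
  }
  return handled;
}
```

Because only the container appends rows and only the host interprets them, validation stays on the host side, which matches the ownership shift described in this section.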
|
||||
|
||||
**Key ownership shift:**
|
||||
- v1: Container owned the "request to do something" (file write). Host decided whether to act (authorization on read).
|
||||
- v2: Host owns the "task is pending" state (messages_in row). Container marks its progress (processing_ack). Host syncs status, detects stale containers, and triggers recurrence.
|
||||
|
||||
---
|
||||
|
||||
## Capability map
|
||||
|
||||
| v1 IPC Behavior | v2 Equivalent | Status | Notes |
|---|---|---|---|
| **Handshake / auth** | Database schema + envelope ID | ✓ Functional but different | v1: read `isMain` env var at startup, gate each IPC op. v2: host resolves session once, container reads `destinations` table on every query. No per-message auth envelope. |
| **Message framing** | JSON in files (atomic rename) | ✓ Replaced with DB schema | v1: `writeIpcFile()` temp-then-rename. v2: `better-sqlite3` with `journal_mode=DELETE` + open-close-per-op for cross-mount visibility. |
| **Transport (pipes/sockets)** | SQLite on FUSE mount | ✓ Completely different | v1: filesystem watching (no network). v2: cross-mount DB access (requires `journal_mode=DELETE` pragma, see session-manager.ts:9–11). |
| **Message types** | `kind` field in messages_in/out | ✓ Expanded | v1: message/task files. v2: `kind=chat\|task\|system\|...` in DB rows, content shape in [api-details.md](../api-details.md). |
| **Auth / authorization gates** | Host-side permission checks in delivery.ts | ◐ Simplified but different | v1: checked per file (isMain flag, folder-match). v2: admin perms checked at container startup (adminUserIds set in poll-loop.ts:22–33), destination ACL in agent_destinations table, delivery.ts enforces on send. No per-message envelope. |
| **Handshake semantics** | None (session exists at startup) | ✗ Removed | v1: env vars set identity at container boot. v2: session_id/agent_group_id is stable DB fixture; container learns routing from `session_routing` table. No negotiation. |
| **Backpressure / flow control** | Implicit (filesystem poll interval) | ◐ Different model | v1: host polls files at 1s intervals; if processing is slow, files pile up. v2: messages_in rows sit with `status='pending'` until container marks `processing_ack='processing'`, then host polls and syncs status. Host can enforce delivery retry budget (MAX_DELIVERY_ATTEMPTS=3 in delivery.ts:58). |
| **Keepalives / timeouts** | No explicit mechanism | ✓ Heartbeat file replaces | v1: IPC files served as implicit liveness. v2: container touches `.heartbeat` file (mtime tracked by host). Host uses heartbeat staleness (10min threshold in host-sweep.ts:32) to detect crash and reset stuck messages. |
| **Ordering / seq parity** | Implicit filename order (timestamp+random) | ✓ Enforced | v1: files had timestamps but no formal ordering. v2: `seq` is monotonic per session, even←host / odd←container (see db-session.md §3). Parity disambiguates edit/reaction targeting. |
| **Reconnect semantics** | Container restart picks up where it left off (env vars) | ✓ Improved | v1: continuation not persisted across restarts. v2: provider continuation (Claude JSON transcript, etc.) stored in `session_state.session_id` on every SDK result. Survives crash. |
| **Error handling / retries** | File left in `errors/` dir on parse failure | ✓ Better visibility | v1: failed IPC files moved to `data/ipc/errors/` for manual inspection. v2: `status='failed'` in messages_in; delivery.ts retries with exponential backoff (3 attempts), marks failed on max. Persisted in DB for audit. |
| **Task scheduling (schedule_task)** | IPC file write → host parses → DB insert | ✓ Same end result, different path | v1: container writes task JSON, host reads/validates cron. v2: container writes `system` message with `action="schedule_task"` to messages_out, host reads + inserts into messages_in as new `kind='task'` row. Validation still in host (cron parsing at delivery time). |
| **Admin commands (/clear, /setup)** | Not in v1 IPC | ✓ Implemented | v2 has `/clear` command in poll-loop.ts, checked against adminUserIds set. Clears `session_state.session_id`. Not exposed through an MCP server. |
| **Tool-call plumbing** | MCP server in container exposes send_message, schedule_task, etc. | ✗ Removed entirely | v1 tools are now plain SDK result processors. send_message writes messages_out. schedule_task writes messages_out with action="schedule_task". |
| **Message delivery tracking** | None (fire-and-forget) | ✓ Added | v1: host sends message, doesn't track if it reached the user. v2: `delivered` table in inbound.db (platform_message_id + status). delivery.ts marks as delivered/failed. Enables message edits, reactions, and retry logic. |
| **Stale container detection** | None | ✓ Added | v1: no heartbeat. v2: host-sweep.ts checks `.heartbeat` mtime. If >10min old and processing_ack has 'processing' entries, resets with backoff. |
| **Recurrence / cron re-firing** | Not in v1 | ✓ Added | v1: tasks were one-shot. v2: `recurrence` field in messages_in + `series_id`. host-sweep.ts fires next occurrence when completed message has recurrence. CronExpressionParser used at sync time. |
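
The seq parity rule in the table (even for host writes, odd for container writes) can be illustrated with a small allocator. This is a sketch under assumptions: `nextSeq` and `writerOf` are hypothetical helpers, and the authoritative rule lives in db-session.md §3, which may allocate values differently.

```typescript
type Writer = "host" | "container";

// Assumption taken from the table: host rows carry even seq values,
// container rows carry odd ones, and seq only ever increases.
function nextSeq(currentMax: number, writer: Writer): number {
  const wantOdd = writer === "container";
  let candidate = currentMax + 1;
  const candidateIsOdd = candidate % 2 === 1;
  if (candidateIsOdd !== wantOdd) candidate += 1; // skip one value to fix parity
  return candidate;
}

// Given a seq, recover which side wrote the row.
function writerOf(seq: number): Writer {
  return seq % 2 === 0 ? "host" : "container";
}
```

With this scheme a bare `seq` is enough to tell whether an edit or reaction targets a host-written or container-written row, which is the disambiguation the table refers to.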
|
||||
|
||||
---
|
||||
|
||||
## Missing from v2
|
||||
|
||||
### 1. **Auth handshake envelope**
|
||||
v1 had explicit authorization gates for *every* IPC operation:
|
||||
- Read `isMain` and `groupFolder` from env vars at startup (ipc-mcp-stdio.ts:19–21)
|
||||
- For `schedule_task`: gate the `targetJid` — non-main groups can only schedule for `chatJid` (line 187–188)
|
||||
- For `register_group`: only isMain=true can call (line 471–481)
|
||||
- For `send_message`: isMain || (target group's folder == sender's folder) (ipc.ts:78)
|
||||
|
||||
**v2 equivalent:** Authorization is now **split**:
|
||||
- Container time: adminUserIds set passed at boot (poll-loop.ts:22–33), used to gate `/clear` command only
|
||||
- Delivery time: host checks destination ACL via agent_destinations table, permission to send to a messaging group (delivery.ts:535–561)
|
||||
- No per-message auth envelope; the session fixture itself represents authorization
|
||||
|
||||
**What's lost:** Per-request explicit authorization metadata. The agent can't *prove* it's "main" anymore; instead the host verifies at delivery time using the central DB. This is arguably *better* security (no token in container to leak), but if the agent needs to know *why* a request failed, it no longer gets an explicit auth reject response.
|
||||
|
||||
### 2. **Backpressure / request queuing**
|
||||
v1 file-based IPC was **implicitly backpressured**:
|
||||
- Container calls `send_message()` MCP tool, which calls `writeIpcFile()` and returns immediately (fire-and-forget)
|
||||
- If the host is slow or overloaded, files pile up in `data/ipc/messages/`
|
||||
- Container is blocked only if the filesystem fills
|
||||
|
||||
**v2 equivalent:** No queueing or explicit backpressure:
|
||||
- Container calls `writeMessageOut()`, which executes a synchronous SQLite INSERT into outbound.db
|
||||
- Host polls outbound.db at 1s (active) or 60s (sweep)
|
||||
- If delivery fails, messages sit in outbound.db with `status='pending'` until 3 retries exhausted
|
||||
|
||||
**What's lost:** Queue depth visibility. In v1, you could run `ls data/ipc/messages/ | wc -l` to see the backlog. In v2, you have to query the outbound DB. The container has no way to ask how many of its messages are still pending; it writes and trusts the host to pick them up.
|
||||
|
||||
### 3. **Explicit keepalive / ping**
|
||||
v1 had implicit keepalives via file timestamps:
|
||||
- Each IPC file wrote a `timestamp` field (ipc-mcp-stdio.ts:61, 202)
|
||||
- Host could reason about "last IPC activity"
|
||||
|
||||
**v2 equivalent:** Heartbeat file mtime:
|
||||
- Container touches `.heartbeat` file (connection.ts `touchHeartbeat()`)
|
||||
- Host checks mtime every 60s in host-sweep.ts
|
||||
- Detects stale if >10min old and processing_ack has 'processing' entries
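
The stale check described in these bullets reduces to a two-condition predicate. The following is a minimal sketch, not the code in host-sweep.ts: `isContainerStale` and the `AckStatus` values other than `'processing'` are assumptions, and timestamps are passed in rather than read from the `.heartbeat` file.

```typescript
// 10 minutes, the threshold quoted in the text.
const STALE_THRESHOLD_MS = 10 * 60 * 1000;

// Only 'processing' is named in the document; the other values are illustrative.
type AckStatus = "processing" | "completed" | "failed";

// A container counts as stale only when BOTH conditions hold:
// the heartbeat is older than the threshold AND work is still marked in flight.
function isContainerStale(heartbeatMtimeMs: number, nowMs: number, acks: AckStatus[]): boolean {
  const heartbeatTooOld = nowMs - heartbeatMtimeMs > STALE_THRESHOLD_MS;
  const hasInFlightWork = acks.some((status) => status === "processing");
  return heartbeatTooOld && hasInFlightWork;
}
```

Requiring in-flight work keeps an idle container with an old heartbeat from being treated as crashed.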
|
||||
|
||||
**What's lost:** Sub-heartbeat timeouts. If the container is hung but the heartbeat is fresh (just stuck in a long computation), the host won't detect it. v1 had no explicit timeout either, so this is not a regression, but there's no keepalive *mechanism* (no ping/pong protocol).
|
||||
|
||||
### 4. **Payload size limits / chunking**
|
||||
v1 wrote task files with a single JSON blob:

- ipc-mcp-stdio.ts:31: `fs.writeFileSync(tempPath, JSON.stringify(data, null, 2))`
- The filesystem may impose a maximum file size, but the code set no explicit cap

**v2:** No explicit chunking or size limits in the DB layer:

- messages_in.content and messages_out.content are TEXT
- SQLite's default maximum string length (`SQLITE_MAX_LENGTH`) is 1,000,000,000 bytes, so roughly 1 GB per cell
- No mention in the codebase of max payload size

**What's lost:** Explicit awareness. In v1, if a task prompt was 10MB, it would be a 10MB JSON file. In v2, it's a 10MB DB cell. The system doesn't actively prevent this, and there's no mention of a sanitizer.
|
||||
|
||||
---
|
||||
|
||||
## Behavioral discrepancies
|
||||
|
||||
### 1. **Task scheduling authorization**
|
||||
**v1** (ipc-auth.test.ts:71–127):
|
||||
```typescript
// Main group can schedule for another group
await processTaskIpc({ type: 'schedule_task', targetJid: 'other@g.us' }, 'whatsapp_main', true, deps);
// Non-main group can ONLY schedule for itself
await processTaskIpc({ type: 'schedule_task', targetJid: 'main@g.us' }, 'other-group', false, deps);
// ↑ blocked by authorization gate (ipc.ts:170)
```
|
||||
|
||||
**v2** (delivery.ts:645–712):
|
||||
The container writes a `system` message with `action="schedule_task"` directly into messages_out. The host reads it and calls `insertTask(inDb, {...})` **with no authorization gate**. The `targetJid` is derived from the system message `platformId` and `channelType`, not from an explicitly routed `targetJid` parameter.
|
||||
|
||||
**Discrepancy:** v1 prevented non-main groups from scheduling cross-group tasks at the *request* stage. v2 has no equivalent gate — the container can write any task to any group (in theory) because it's the host that does the actual DB insert. In practice, the container only has one session and only sees messages for that session, so it can't *reach* another group's messages_in. But the authorization model is implicitly structural, not explicit.
|
||||
|
||||
### 2. **Message send authorization**
|
||||
**v1** (ipc-auth.test.ts:339–373):
|
||||
```typescript
// Main can send to any chat
expect(isMessageAuthorized('whatsapp_main', true, 'other@g.us', groups)).toBe(true);
// Non-main can send to its own chat
expect(isMessageAuthorized('other-group', false, 'other@g.us', groups)).toBe(true);
// Non-main cannot send to another group's chat
expect(isMessageAuthorized('other-group', false, 'main@g.us', groups)).toBe(false);
```
|
||||
|
||||
**v2** (delivery.ts:550–561):
|
||||
```typescript
const isOriginChat = session.messaging_group_id === mg.id;
if (!isOriginChat && !hasDestination(session.agent_group_id, 'channel', mg.id)) {
  throw new Error(`unauthorized channel destination: ...`);
}
```
|
||||
|
||||
The container's session has a fixed `messaging_group_id` + `thread_id`. The agent can only reply to that origin or to a destination in the `agent_destinations` table. There is no isMain flag.
|
||||
|
||||
**Discrepancy:** v1 was group-centric (folder-based identity). v2 is session-centric (agent is wired to one or more messaging groups via central DB, projected into inbound.db). If an agent is wired to multiple chats with `session_mode='agent-shared'`, it has one session and can see all of them as destinations. This is more flexible than v1's binary main/non-main gate.
|
||||
|
||||
### 3. **Task update semantics**
|
||||
**v1** (ipc-auth.test.ts:264–309): Container passes `type='update_task'`, host reads the task, re-computes `next_run` if schedule changed, updates DB.
|
||||
|
||||
**v2** (delivery.ts:695–712): Container writes `system` message with `action="update_task"`, host applies the update directly. The host **does not** recompute `next_run` if the schedule changes — it only updates the fields the container specified. Recurrence is re-fired by the *host* in host-sweep.ts (line 160–165), not at update time.
|
||||
|
||||
**Discrepancy:** v1 eagerly recomputed next_run on update. v2 computes it lazily during the 60s sweep, so if an agent updates a task's cron expression, the change does not take effect until the next sweep cycle. That adds up to ~60s of latency.
|
||||
|
||||
### 4. **Error handling**
|
||||
**v1** (ipc.ts:85–91): Files that fail to parse are moved to `data/ipc/errors/` for manual inspection.
|
||||
|
||||
**v2** (delivery.ts:422–459): Messages that fail delivery get up to 3 retries with exponential backoff. If they still fail, they're marked `status='failed'` in the DB. There's no "errors" directory; the audit trail is in the DB + logs.
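
The retry-then-fail behavior described for v2 can be sketched as a bounded loop with doubling delays. This is an illustration, not the code in delivery.ts: `deliverWithRetry` and `BASE_DELAY_MS` are hypothetical, only the limit of 3 attempts comes from the text, and the sketch records the delays instead of sleeping so it stays deterministic.

```typescript
const MAX_DELIVERY_ATTEMPTS = 3; // from the text
const BASE_DELAY_MS = 1000; // hypothetical base delay, not from the source

type DeliveryOutcome = {
  status: "delivered" | "failed";
  attempts: number;
  delaysMs: number[]; // backoff delays that would be waited between attempts
};

// `send` returns true on success. Delays double after each failed attempt.
function deliverWithRetry(send: () => boolean): DeliveryOutcome {
  const delaysMs: number[] = [];
  for (let attempt = 1; attempt <= MAX_DELIVERY_ATTEMPTS; attempt++) {
    if (send()) return { status: "delivered", attempts: attempt, delaysMs };
    if (attempt < MAX_DELIVERY_ATTEMPTS) {
      delaysMs.push(BASE_DELAY_MS * Math.pow(2, attempt - 1));
    }
  }
  // Budget exhausted: the row would be marked status='failed' in the DB.
  return { status: "failed", attempts: MAX_DELIVERY_ATTEMPTS, delaysMs };
}
```

The terminal `failed` state is what replaces v1's `errors/` directory: the record stays queryable instead of being moved aside.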
|
||||
|
||||
**Discrepancy:** v1's error handling was "fire-and-forget" (parse, move on). v2's is "retry + persistent state." This is better observability, but v1's "move to errors/" was a gentler way to pause processing without losing the file.
|
||||
|
||||
### 5. **Reconnect / session resumption**
|
||||
**v1:** No persistence. If the container crashed, the next restart had no knowledge of prior messages or state.
|
||||
|
||||
**v2** (poll-loop.ts:51–55): Reads `session_state.session_id` at startup and passes it to the provider as `continuation`. The provider (Claude) deserializes a `.jsonl` transcript and resumes. Survives container crash.
|
||||
|
||||
**Discrepancy:** v2 has explicit continuation support. v1 did not. This is a strict improvement.
|
||||
|
||||
---
|
||||
|
||||
## Worth preserving?
|
||||
|
||||
### 1. **Per-request authorization envelope**
|
||||
**Recommendation:** No, v2's structural approach is better. In v1, a malicious container could spoof an isMain flag to bypass gates (though env vars are hard to spoof). v2's model — the host resolves identity once and checks permissions against the central DB — is more robust and easier to audit.
|
||||
|
||||
### 2. **Message send ACL at request time**
|
||||
**Recommendation:** Partially — v2 should validate `agent_destinations` rows exist *before* the agent attempts a send, so it fails fast instead of silently dropping at delivery time. Currently, if an agent tries `<message to="nonexistent">...</message>`, it writes to messages_out and the host later rejects it. A pre-send validation in the container (via destinations.ts) would be better UX.
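
A container-side pre-send check of the kind recommended here might look like the following sketch. All names are hypothetical: `checkDestination` and the `Destination` shape are illustrations, and the real lookup would go through destinations.ts reading inbound.db.

```typescript
// Hypothetical projection of a row from the destinations table.
type Destination = { name: string };

type SendCheck = { ok: true } | { ok: false; reason: string };

// Fail fast in the container: reject unknown targets before anything
// is written to messages_out, and say which names would have worked.
function checkDestination(target: string, known: Destination[]): SendCheck {
  const match = known.find((d) => d.name === target);
  if (match) return { ok: true };
  const names = known.map((d) => d.name).join(", ");
  return {
    ok: false,
    reason: `unknown destination "${target}"; known: ${names || "(none)"}`,
  };
}
```

The host-side check at delivery time would still be needed, since the container is not a trusted enforcement point; the pre-check only improves the error the agent sees.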
|
||||
|
||||
### 3. **Backpressure / delivery acknowledgment**
|
||||
**Recommendation:** Maybe. If an agent rapidly fires 100 `send_message()` calls, they all block on SQLite INSERT (fast) and return immediately. The host drains them at 1s per poll. If the channel adapter is slow, messages pile up in messages_out. There's no way for the agent to ask "is there backlog?" or "wait until sent." This is probably fine for most use cases (agents don't spam), but if latency-sensitive, a `send_message()` that returns `{delivered_at}` would help.
|
||||
|
||||
### 4. **Heartbeat / stale detection**
|
||||
**Recommendation:** Yes, and it's been preserved (`.heartbeat` file replaces file-based timestamps). But the 10min threshold is conservative. Consider shorter thresholds for interactive agents: containers should be responsive, so a stale heartbeat signals a crash, not slow work.
|
||||
|
||||
---
|
||||
|
||||
## File references
|
||||
|
||||
### v1 (historical, in `src/v1/` and `container/agent-runner/src/v1/`)
|
||||
- **ipc.ts:30–127** — startIpcWatcher loop, per-group folder scan, message/task file dispatch
|
||||
- **ipc.ts:129–356** — processTaskIpc with authorization gates (lines 169, 228, 241, 254, 271, 313, 326)
|
||||
- **ipc-auth.test.ts:71–127** — schedule_task authorization tests
|
||||
- **ipc-auth.test.ts:339–373** — message send authorization tests
|
||||
- **ipc-mcp-stdio.ts:37–68** — send_message MCP tool, writeIpcFile
|
||||
- **ipc-mcp-stdio.ts:70–216** — schedule_task tool with validation, target_group_jid param
|
||||
- **ipc-mcp-stdio.ts:445–504** — register_group tool, isMain gate
|
||||
|
||||
### v2 (active, in `src/` and `container/agent-runner/src/`)
|
||||
- **db-session.md:1–50** — inbound.db schema (messages_in, delivered, destinations, session_routing)
|
||||
- **db-session.md:120–174** — outbound.db schema (messages_out, processing_ack, session_state)
|
||||
- **db-session.md:104–117** — seq parity invariant
|
||||
- **delivery.ts:383–394** — drainSession loop (active poll 1s, sweep 60s)
|
||||
- **delivery.ts:467–638** — deliverMessage, handles all message kinds, permission checks, delivery retry
|
||||
- **delivery.ts:645–906** — handleSystemAction, interprets action="schedule_task" etc.
|
||||
- **host-sweep.ts:48–109** — sweepSession, syncProcessingAcks, stale detection via heartbeat, recurrence handling
|
||||
- **session-manager.ts:1–12** — cross-mount invariant doc (journal_mode=DELETE, close-per-op)
|
||||
- **session-manager.ts:122–130** — initSessionFolder, schema creation
|
||||
- **session-manager.ts:152–222** — writeSessionRouting, writeDestinations (replaces static env vars with live table)
|
||||
- **session-manager.ts:231–267** — writeSessionMessage (host writes to messages_in)
|
||||
- **poll-loop.ts:22–33** — PollLoopConfig with adminUserIds set
|
||||
- **poll-loop.ts:46–77** — runPollLoop entry, getPendingMessages, markProcessing
|
||||
- **destinations.ts:44–52** — getAllDestinations, findByName (reads from inbound.db live)
|
||||
- **db/messages-in.ts** — getPendingMessages, markProcessing, markCompleted
|
||||
- **db/messages-out.ts** — writeMessageOut (container writes system actions here)
|
||||
- **db/session-state.ts** — getStoredSessionId, setStoredSessionId (continuation persistence)
|
||||
- **db/connection.ts** — touchHeartbeat, journal_mode=DELETE pragma, cross-mount setup
|
||||
|
||||
---
|
||||
|
||||
Generated from deep-dive analysis of v1 IPC → v2 DB paradigm shift.
|
||||
@@ -1,38 +0,0 @@
|
||||
# logger: v1 vs v2
|
||||
|
||||
## Scope
|
||||
- v1: `src/v1/logger.ts` (70 LOC) — export `logger`
|
||||
- v2 counterpart: `src/log.ts` (65 LOC) — export `log`
|
||||
|
||||
## Capability map
|
||||
|
||||
| v1 behavior | v2 location | Status | Notes |
|---|---|---|---|
| Levels (debug=20, info=30, warn=40, error=50, fatal=60) | `src/log.ts:1` | kept | Identical numeric map |
| `debug/info/warn/error/fatal` methods | `src/log.ts:50-54` | renamed | `logger.X(...)` → `log.X(...)` |
| Data-first signature `(data, msg)` | `src/log.ts:42-58` | **changed** | v2 requires message-first `(msg, data?)` — breaking for every callsite |
| Color codes (per-level + KEY_COLOR=magenta, MSG_COLOR=cyan) | `src/log.ts:4-14` | kept | Identical |
| LOG_LEVEL env threshold | `src/log.ts:16` | kept | `'info'` default |
| Timestamp `HH:MM:SS.mmm` | `src/log.ts:33-40` | kept | Refactored, same output |
| Error formatting | `src/log.ts:18-23` | **changed** | v1 pretty multi-line JSON; v2 single-line |
| Data formatting | `src/log.ts:25-31` | **changed** | v1 per-line indented; v2 inline `key=value` |
| Process ID in output | — | **removed** | v1 emitted `(${process.pid})`; v2 drops it |
| info/debug → stdout, warn/error/fatal → stderr | `src/log.ts:45` | kept | Identical routing |
| `uncaughtException` → fatal + exit(1) | `src/log.ts:57-60` | kept | Arg order swapped |
| `unhandledRejection` → error | `src/log.ts:62-64` | kept | Arg order swapped |
|
||||
|
||||
## Missing from v2
|
||||
1. **Process ID in log output** — lost visibility into emitting process in multi-container scenarios
|
||||
2. **Data-first overload** — v1 `logger.warn({err, path}, 'msg')` is a breaking API change in v2
|
||||
3. **Multi-line error formatting** — condensed single-line form is harder to read for stack traces
|
||||
|
||||
## Behavioral discrepancies
|
||||
1. **Argument order**: `logger.error({err}, 'failed')` must become `log.error('failed', {err})` at every callsite
|
||||
2. **Error output**: v1 pretty-prints JSON over 3 lines; v2 collapses to one line
|
||||
3. **Data output**: v1 newline+indent per key; v2 space-separated inline
|
||||
|
||||
## Not in either
|
||||
File rotation, redaction rules, on-disk logging — both stream to stdout/stderr only.
|
||||
|
||||
## Worth preserving?
|
||||
Restoring PID to v2 output is cheap and helps multi-process debugging. Multi-line error format is worth a verbose-mode flag for `error`/`fatal`. Signature swap is stylistic; not worth reverting but every v1 `logger` → `log` migration must swap `(data, msg)` → `(msg, data)`.
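
The `(data, msg)` → `(msg, data)` swap can be illustrated with a tiny adapter. This is a hedged sketch for transitional use only, not something the source proposes: `fromV1Style`, `LogData`, and `V2LogFn` are hypothetical names, and the v2 signature is assumed to be `(msg, data?)` as the capability map states.

```typescript
type LogData = Record<string, unknown>;

// Assumed shape of a v2 log method: message first, optional data second.
type V2LogFn = (msg: string, data?: LogData) => void;

// Wrap a v2-style function so that a v1-style callsite (data first)
// keeps working while callsites are migrated one by one.
function fromV1Style(v2: V2LogFn): (data: LogData, msg: string) => void {
  return (data, msg) => v2(msg, data);
}
```

A v1 call such as `logger.warn({ err, path }, 'msg')` would then route through `fromV1Style(log.warn)({ err, path }, 'msg')`; rewriting the callsite directly, as the text recommends, removes the need for the wrapper.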
|
||||
@@ -1,90 +0,0 @@
|
||||
# remote-control: v1 vs v2
|
||||
|
||||
## Scope
|
||||
|
||||
**v1:**
|
||||
- `/Users/gavriel/nanoclaw4/src/v1/remote-control.ts` (218 lines)
|
||||
- `/Users/gavriel/nanoclaw4/src/v1/remote-control.test.ts` (379 lines)
|
||||
- Integrated into v1 host via `restoreRemoteControl()` call at startup (v1/index.ts:42)
|
||||
|
||||
**v2 Counterparts:**
|
||||
- `/Users/gavriel/nanoclaw4/src/access.ts` (115 lines) — privilege/approval routing
|
||||
- `/Users/gavriel/nanoclaw4/src/onecli-approvals.ts` (269 lines) — OneCLI credential-gated action approval
|
||||
- `/Users/gavriel/nanoclaw4/src/webhook-server.ts` (134 lines) — HTTP webhook ingress for Chat SDK adapters
|
||||
- `/Users/gavriel/nanoclaw4/src/router.ts` (start of file) — inbound message routing with access gates
|
||||
|
||||
## Capability Map
|
||||
|
||||
| v1 Behavior | v2 Location | Status | Notes |
|---|---|---|---|
| Start `claude remote-control` child process, extract URL | **Removed** | ❌ Removed | v2 has no equivalent. The `claude remote-control` CLI was a v1-only mechanism tied to individual Telegram chats. |
| Session state persistence (PID, URL, metadata) | **Removed** | ❌ Removed | v2 is stateless at the host level — all per-session state lives in `inbound.db` / `outbound.db`. |
| Auto-accept "Enable Remote Control?" prompt via stdin | **Removed** | ❌ Removed | v1 quirk tied to Claude CLI's interactive mode; no equivalent in v2. |
| Restore session from disk on startup | **Removed** | ❌ Removed | v2 has no startup recovery loop for stale processes. Sessions are created on-demand. |
| Detect dead process by signal check | **Removed** | ❌ Removed | v2 uses per-session heartbeat file (`/workspace/.heartbeat`) and inactivity detection via 60s sweep. |
| HTTP URL polling + timeout handling | **Webhook server** | ✅ Moved | v2's `webhook-server.ts` (line 16–124) runs a persistent HTTP server (default port 3000) for Chat SDK adapter webhooks. Routes via `/webhook/{adapterName}` (not URL-in-stdout polling). |
| Single active session per host | **Per-agent-group sessions** | ✅ Evolved | v2 supports unlimited concurrent sessions. Each `(agent_group, messaging_group, thread)` tuple is a separate session with its own container. |
| `getActiveSession()` getter | **Removed** | ❌ Removed | No global session concept. v2 queries sessions via `getSession(sessionId)` in `db/sessions.ts`. |
| Credential access approval | **OneCLI approval handler** | ✅ Moved | v2's `onecli-approvals.ts` (line 92–215) handles credential-gated action approval. OneCLI gateway intercepts HTTP, delivers ask_question card to approver, persists `pending_approvals` row (line 173–196). |
| Approver selection (admin → owner chain) | **access.ts** | ✅ Moved | `pickApprover()` (access.ts:55–72) returns ordered list: agent-group admins → global admins → owners. Same preference order as v1 logic. |
| Approval delivery to DM (same channel kind preferred) | **access.ts + user-dm.ts** | ✅ Moved | `pickApprovalDelivery()` (access.ts:83–101) walks approver list, prefers same channel kind via `channelTypeOf()` (line 112–115), falls back to any reachable DM. Uses `ensureUserDm()` for cold-DM resolution (user-dm.ts). |
| Ask_question card delivery | **onecli-approvals.ts** | ✅ Moved | v2 builds ask_question card (onecli-approvals.ts:148–167) with Approve/Reject buttons, routes via `deliveryAdapter.deliver()` with action_id for button callbacks. |
| Button click → approval resolution | **onecli-approvals.ts** | ✅ Moved | `resolveOneCLIApproval()` (line 68–83) matches approval_id, resolves Promise, updates status to approved/rejected, deletes `pending_approvals` row. |
| Approval expiry + cleanup | **onecli-approvals.ts** | ✅ Moved | Expiry timer fires just before gateway's TTL (line 200–211); `expireApproval()` (line 217–226) edits card to "Expired (reason)" and deletes row. Startup sweep cleans stale rows (line 247–255). |
| Rate limiting | **Not implemented** | ❌ Missing | Neither v1 nor v2 has rate limiting on remote-control or approval requests. |
| Audit logging | **Partial** | ⚠️ Partial | v1: `logger.info()` on session start/stop. v2: `log.info()` on approval resolved (onecli-approvals.ts:81), stale sweeps (line 250), expiry (line 225). Payload stored in `pending_approvals.payload` for audit (line 178–186). |
| Error recovery (process death) | **Minimal** | ⚠️ Minimal | v1: restores from disk, kills stale PID. v2: no equivalent — dead container is detected by stale heartbeat, then respawned via `wakeContainer()`. |
| Transport | HTTP via stdout polling | HTTP via standard webhook server | v1 is ephemeral per session; v2 is persistent, multi-tenant. |
| Auth | None (CLI subprocess) | OneCLI gateway (credential-gated via HTTP) | v1 has no auth; v2 gates on agent identity + OneCLI decision. |
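
The approver ordering and delivery preference described in the table rows for `pickApprover()` and `pickApprovalDelivery()` can be sketched as follows. This is one plausible reading, not the code in access.ts: the types, the role names, and the two-pass strategy (same channel kind across all approvers first, then any reachable DM) are assumptions.

```typescript
// Hypothetical shapes; the real code resolves these from the central DB.
type ApproverRole = "group_admin" | "global_admin" | "owner";
type Approver = { userId: string; role: ApproverRole };
type DmRoute = { userId: string; channelType: string };

// Preference order quoted in the table: agent-group admins, global admins, owners.
const ROLE_ORDER: ApproverRole[] = ["group_admin", "global_admin", "owner"];

function orderApprovers(candidates: Approver[]): Approver[] {
  return [...candidates].sort(
    (a, b) => ROLE_ORDER.indexOf(a.role) - ROLE_ORDER.indexOf(b.role),
  );
}

function pickDelivery(
  approvers: Approver[],
  reachable: DmRoute[],
  originChannelType: string,
): DmRoute | undefined {
  const ordered = orderApprovers(approvers);
  // First pass: an approver reachable on the same channel kind as the origin.
  for (const approver of ordered) {
    const same = reachable.find(
      (r) => r.userId === approver.userId && r.channelType === originChannelType,
    );
    if (same) return same;
  }
  // Second pass: fall back to any reachable DM, still in role order.
  for (const approver of ordered) {
    const fallback = reachable.find((r) => r.userId === approver.userId);
    if (fallback) return fallback;
  }
  return undefined; // nobody reachable
}
```

Whether the real implementation prefers channel kind over role rank, or the other way around, is not stated in the source; the sketch picks channel kind first purely for illustration.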
|
||||
|
||||
## Missing from v2
|
||||
|
||||
1. **CLI subprocess spawning** — v2 has no `claude remote-control` equivalent. Agents run in Docker containers, not standalone CLI processes. The OneCLI agent sandbox is managed by the agent-runner container, not the host.
|
||||
|
||||
2. **Process-level lifecycle management** — v1 tracks individual process PIDs and signal-kills them. v2 uses container IDs + heartbeat file, handled by host-sweep (host-sweep.ts) and container-runner.ts.
|
||||
|
||||
3. **Per-message URL polling with regex extraction** — v2's webhook server is push-based (HTTP handler), not pull-based polling of stdout files.
|
||||
|
||||
4. **Direct user-to-bot communication model** — v1's remote-control was tied to a single Telegram JID + chat. v2 decouples messaging groups from agent groups, allowing one agent to serve multiple channels with different isolation levels.
|
||||
|
||||
5. **State file on disk** (`remote-control.json`) — v2 stores all session state in SQLite central DB and per-session `inbound.db` / `outbound.db`.
|
||||
|
||||
## Behavioral Discrepancies
|
||||
|
||||
1. **Approval delivery model**:
|
||||
- v1: Remote control was tied to a single message sender; approvals implicitly went to the initiator's contact or a hardcoded owner.
|
||||
- v2: Approvals route to admins of the originating agent group, with tie-break by channel kind (pickApprovalDelivery line 87–94). Multiple approvers can be reached, decoupling approval from message sender.
|
||||
|
||||
2. **Session multiplicity**:
|
||||
- v1: One active `RemoteControlSession` per host at a time.
|
||||
- v2: Unlimited concurrent sessions, each with independent state (`inbound.db`, `outbound.db`, heartbeat).
|
||||
|
||||
3. **Timeout & cleanup**:
|
||||
- v1: Explicit timeout on URL polling (30s), then kill process. No ongoing monitoring.
|
||||
- v2: Heartbeat-based inactivity detection (60s sweep), graceful cleanup on stale. Approval expiry tied to OneCLI gateway TTL, not a fixed timeout.
|
||||
|
||||
4. **Error transparency**:
|
||||
- v1: Polling errors logged to stdout/stderr files; user doesn't see unless they debug.
|
||||
- v2: All approval errors logged centrally; card is edited to "Expired" on failure, so approver sees state change.
|
||||
|
||||
## Worth Preserving?
|
||||
|
||||
**No — v2 supersedes v1's remote-control model.**
|
||||
|
||||
v1's remote-control was a bridge between Telegram chats and a single Claude CLI session. v2 achieves equivalent (and superior) remote operation via:
|
||||
- **OneCLI credential approvals** (onecli-approvals.ts): Admins approve API/credential requests from agents, just as v1 surfaced sensitive actions.
|
||||
- **Approval routing** (access.ts): Automatically picks the right admin on the right channel, with fallback to any reachable DM.
|
||||
- **Multi-tenant agent groups**: Agents can serve multiple channels with different approval chains, not just one chat JID.
|
||||
|
||||
Users still get on-demand approval for sensitive actions; they just don't manage a CLI subprocess anymore. The host handles container lifecycle, and the container agent is managed by OneCLI.
|
||||
|
||||
---
|
||||
|
||||
### Citation Summary
|
||||
|
||||
- v1 remote-control: `/Users/gavriel/nanoclaw4/src/v1/remote-control.ts:1–218`
|
||||
- v1 tests: `/Users/gavriel/nanoclaw4/src/v1/remote-control.test.ts:1–379`
|
||||
- v2 access control: `/Users/gavriel/nanoclaw4/src/access.ts:29–115` (pickApprover, pickApprovalDelivery, canAccessAgentGroup)
|
||||
- v2 approval handler: `/Users/gavriel/nanoclaw4/src/onecli-approvals.ts:50–270` (handleRequest, resolveOneCLIApproval, sweepStaleApprovals)
|
||||
- v2 webhook server: `/Users/gavriel/nanoclaw4/src/webhook-server.ts:73–124` (registerWebhookAdapter, ensureServer)
|
||||
- v2 router: `/Users/gavriel/nanoclaw4/src/router.ts:19–50` (inbound access gate, unknown_sender_policy)
|
||||
@@ -1,67 +0,0 @@
|
||||
# router: v1 vs v2
|
||||
|
||||
## Scope
|
||||
- v1 (distributed across): `src/v1/index.ts` (startMessageLoop, trigger check), `group-queue.ts` (concurrency, retry), `router.ts` (outbound formatting, 44 LOC), `sender-allowlist.ts` (drop/allow)
|
||||
- v2: `src/router.ts` (317 LOC), `src/session-manager.ts` (346 LOC), `src/container-runner.ts`, `src/access.ts`, `src/db/messaging-groups.ts` (trigger_rules schema)
|
||||
|
||||
## Routing-flow diff
|
||||
|
||||
### v1 (polling, per-group)
|
||||
1. Channel receives message → `onMessage` → store in DB
|
||||
2. Sender allowlist drop-mode filter → discard denied
|
||||
3. `startMessageLoop` polls every POLL_INTERVAL
|
||||
4. For each group: lookup channel (`findChannel` O(n)), check trigger requirement, load allowlist, scan for pattern, skip if no trigger
|
||||
5. Pull messages since `lastAgentTimestamp`, XML-format with tz context
|
||||
6. If active container: write JSON to IPC file; else `enqueueMessageCheck(groupJid)` → GroupQueue
|
||||
7. Retry on failure (up to 5, exp. backoff); rollback cursor on agent error
|
||||
|
||||
### v2 (event-driven, entity model)
|
||||
1. Channel adapter → `routeInbound(platformId, threadId, message)`
|
||||
2. Apply thread policy (`supportsThreads` → collapse to null)
|
||||
3. Resolve `messaging_group` (lookup or auto-create)
|
||||
4. Extract sender → upsert `users` row → `userId` (namespaced `channel_type:handle`)
|
||||
5. Lookup wired agent groups via `messaging_group_agents`; drop if none
|
||||
6. `pickAgent` (highest priority; **trigger_rules matching is TODO**)
|
||||
7. `enforceAccess`: owner/admin/member gate; `unknown_sender_policy: strict | request_approval | public`
|
||||
8. `resolveSession` by `session_mode` (`agent-shared`/`shared`/`per-thread`)
|
||||
9. `insertMessage` to session `inbound.db`, write session_routing + destinations
|
||||
10. `startTypingRefresh`; `wakeContainer(session)` (dedup by `activeContainers` + `wakePromises`)
|
||||
11. Container polls inbound.db, writes outbound.db; host `delivery.ts` polls and sends via adapter; `stopTypingRefresh` on container exit
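
Steps 2 and 8 of the v2 flow can be sketched as a pure function that maps a route to a session key. This is a minimal illustration, not the code in `src/session-manager.ts`: the `InboundRoute` shape, the `sessionKey` helper, the key format, and the exact meaning assigned to each `session_mode` are all assumptions. It also reads step 2 as "platforms without thread support collapse the thread ID to null".

```typescript
// Hypothetical sketch: these names and key formats are not taken from the codebase.
type SessionMode = "agent-shared" | "shared" | "per-thread";

interface InboundRoute {
  agentGroupId: string;
  messagingGroupId: string;
  threadId: string | null;
  supportsThreads: boolean; // assumed to come from the channel adapter
}

// Step 2: when the platform has no threads, collapse the thread ID to null.
function applyThreadPolicy(route: InboundRoute): InboundRoute {
  return route.supportsThreads ? route : { ...route, threadId: null };
}

// Step 8: derive the key that decides which session receives the message.
function sessionKey(mode: SessionMode, route: InboundRoute): string {
  const r = applyThreadPolicy(route);
  switch (mode) {
    case "agent-shared": // assumed: one session per agent group, across channels
      return r.agentGroupId;
    case "shared": // assumed: one session per (agent group, messaging group)
      return `${r.agentGroupId}/${r.messagingGroupId}`;
    case "per-thread": // assumed: one session per thread, "main" when unthreaded
      return `${r.agentGroupId}/${r.messagingGroupId}/${r.threadId ?? "main"}`;
  }
}
```

Keeping the resolution pure makes the "multiple sessions per group" behavior easy to test without a database.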
|
||||
|
||||
## Capability map
|
||||
|
||||
| v1 behavior | v2 location | Status | Notes |
|
||||
|---|---|---|---|
|
||||
| Sender allowlist drop/allow modes | — | **removed** | Replaced by access gate + `unknown_sender_policy` |
|
||||
| Group registration auto-creating folder on first message | `router.ts` auto-creates messaging_group; group folder via `group-init.ts` on wake | moved | Admin skill path for agent groups |
|
||||
| Trigger pattern matching (`requiresTrigger`, `DEFAULT_TRIGGER`) | `messaging_group_agents.trigger_rules` JSON | **deferred** | Schema ready; `pickAgent` has TODO comment |
|
||||
| `lastAgentTimestamp` cursor tracking | — | **removed** | All messages written immediately to inbound.db |
|
||||
| IPC file polling (`inputDir`, `_close` sentinel) | — | **removed** | DB polling replaces |
|
||||
| GroupQueue concurrency + waiting-groups | `container-runner.ts:42-82` `activeContainers` + `wakePromises` | reimplemented | Per-session not per-group |
|
||||
| Task scheduler → enqueue to GroupQueue | host-sweep due-wake + delivery system-actions | preserved | |
|
||||
| Session reuse rules (session mode) | `session-manager.ts` (agent-shared/shared/per-thread) | **enhanced** | Explicit per-wiring |
|
||||
| Remote control command interception | — | **removed** | |
|
||||
| Idle timeout + stdin close | `container-runner.ts:135-140` `resetIdle` | kept | Heartbeat instead of stdin |
|
||||
| Host-level retry on agent error | — | **removed** | Container is authority; host sweep retries stale only |
|
||||
| Typing indicator | `delivery.ts:startTypingRefresh` | kept | Gated on heartbeat |
|
||||
|
||||
## Missing from v2
|
||||
1. **Trigger-rule matching** — `router.ts:198` TODO. Currently every wired agent fires on every message (only priority breaks ties). **Without this, multi-agent wirings don't work as intended.**
|
||||
2. **Sender drop mode** — v1's silent-drop for noisy users is gone. v2 only has binary allow/deny.
|
||||
3. **Cursor / state recovery** — v2 writes immediately to DB. If container crashes mid-output, no host-level dedup guarantees (beyond `messages_in.id` PK)
|
||||
4. **Remote control** — v1 intercepted `/remote-control` commands pre-storage; no v2 equivalent
|
||||
5. **Host-level retry with backoff on agent error** — v1 had MAX_RETRIES=5 + exp. backoff on `processGroupMessages`; v2 only retries on stale heartbeat detection
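
To make item 1 concrete, here is one way trigger-rule matching could sit in front of the priority tie-break. It is a sketch under stated assumptions: the real `trigger_rules` JSON schema is not shown in this document, so the `{ pattern }` shape, the `WiredAgent` interface, and the "null rules always match" convention are hypothetical.

```typescript
// Hypothetical shapes: the actual trigger_rules schema is not documented here.
interface TriggerRules {
  pattern?: string; // regular-expression source; absent means "always match"
}

interface WiredAgent {
  agentGroupId: string;
  priority: number;
  trigger_rules: string | null; // JSON text, as stored on messaging_group_agents
}

function matchesTrigger(agent: WiredAgent, text: string): boolean {
  if (agent.trigger_rules === null) return true; // no rules: agent takes everything
  try {
    const rules = JSON.parse(agent.trigger_rules) as TriggerRules;
    if (!rules.pattern) return true;
    return new RegExp(rules.pattern, "i").test(text);
  } catch {
    return false; // malformed JSON or regex never matches
  }
}

// Filter by trigger first, then keep the highest-priority survivor.
function pickAgent(agents: WiredAgent[], text: string): WiredAgent | null {
  const candidates = agents.filter((a) => matchesTrigger(a, text));
  if (candidates.length === 0) return null;
  return candidates.reduce((best, a) => (a.priority > best.priority ? a : best));
}
```

Returning `null` when nothing matches leaves the drop-versus-accumulate decision to the caller, which is the semantic change flagged under "Behavioral discrepancies".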
|
||||
|
||||
## Behavioral discrepancies
|
||||
1. **Trigger evaluation**: v1 eager (skip group until trigger arrives, accumulate context); v2 TODO — once implemented, likely drops non-trigger messages at ingest (semantic change)
|
||||
2. **Session reuse**: v1 single session per group; v2 multiple (one per thread on threaded platforms)
|
||||
3. **Access control timing**: v1 pre-storage (cheap drop); v2 post-sender-resolution (requires `users` upsert)
|
||||
4. **Unknown channels**: v1 silently ignored; v2 auto-creates `messaging_groups` row — no data loss but orphaned rows possible
|
||||
5. **Formatting**: v1 host formats with tz + cursor-based message subset; v2 pushes raw JSON to inbound.db, container formats from full session history
|
||||
|
||||
## Worth preserving?
|
||||
1. **Trigger rule matching (HIGH priority)** — schema is ready; 10-line implementation in `pickAgent`. Currently broken-by-default for multi-agent wirings
|
||||
2. **Sender drop mode (MEDIUM)** — add `(agent_group_id, sender_pattern)` drop table; orthogonal to privilege
|
||||
3. **State recovery (LOW)** — add unique constraint on `messages_in.id` if not already; v2's model is simpler + more robust
|
||||
4. **Host-level retry on agent error (MEDIUM)** — currently only stale containers retry. Explicit container-exit-error retry could be valuable
|
||||
5. **Remote control** — decide: restore as opt-in skill or document deletion
|
||||
@@ -1,46 +0,0 @@
|
||||
# sender-allowlist: v1 vs v2
|
||||
|
||||
## Scope
|
||||
- v1: `src/v1/sender-allowlist.ts` (97 LOC), `sender-allowlist.test.ts` (217 LOC) — flat JSON config at `~/.config/nanoclaw/sender-allowlist.json`
|
||||
- v2 counterparts: `src/access.ts` (116 LOC), `src/router.ts` (317 LOC), `src/db/schema.ts` (user_roles, agent_group_members, messaging_groups.unknown_sender_policy), `src/container-runner.ts:291-295` (admin injection), `src/types.ts` (MessagingGroupAgent.response_scope)
|
||||
|
||||
## Capability map
|
||||
|
||||
| v1 behavior | v2 location | Status | Notes |
|---|---|---|---|
| Per-chat entry (`chats[chatJid]`) | `messaging_groups.unknown_sender_policy` | replaced | Policy per channel, not allowlist entries |
| Default entry | Default `unknown_sender_policy = 'strict'` | **reversed** | v1 default-allow → v2 default-deny |
| `allow: '*'` wildcard | Not present | removed | |
| `allow: string[]` (exact-match list) | `agent_group_members` rows + `user_roles` | replaced | Role-based / membership-based |
| `mode: 'trigger'` (allow for processing) | Implicit (access granted → routed) | kept | |
| `mode: 'drop'` (silent drop) | `recordDroppedMessage()` (logs only) | **partially lost** | No silent-drop mode; denied = logged |
| Admin override | owner / global_admin / scoped_admin | **new in v2** | Richer privilege hierarchy |
| Static JSON file | Central DB (`users`, `user_roles`, `agent_group_members`) | changed | Runtime-mutable, queryable |
| Exact-string sender | Namespaced `channel_type:handle` user IDs | enhanced | Explicit channel scoping |
| `logDenied` flag | implicit (log at decision point) | kept | |
|
||||
|
||||
## Access-model diff
|
||||
**v1**: flat allowlist per chat → default-allow → binary allowed/denied.
|
||||
**v2**: entity model (`users` + roles + memberships) + per-messaging-group policy (`strict | request_approval | public`) → default-deny for unknowns.
|
||||
|
||||
**Strictly more expressive:** role hierarchy, per-agent-group scope, three-way unknown handling, user metadata (display_name/kind), runtime reconfig.
|
||||
**Lost:** per-message `drop` mode, default-allow posture, simple JSON editing.
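
The v2 decision order described above can be summarized as a small pure function. This is a sketch rather than the contents of `src/access.ts`: `decideAccess`, the `Sender` shape, and the `needs_approval` outcome are hypothetical names, and the `request_approval` branch only models the intended behavior, since that flow is not wired yet.

```typescript
// Hypothetical sketch of the v2 access decision; names are illustrative only.
type Role = "owner" | "global_admin" | "scoped_admin";
type UnknownSenderPolicy = "strict" | "request_approval" | "public";
type Decision = "allow" | "deny" | "needs_approval";

interface Sender {
  userId: string; // namespaced `channel_type:handle`
  roles: Role[]; // roles that apply to the target agent group
  isMember: boolean; // has an agent_group_members row
}

function decideAccess(sender: Sender, policy: UnknownSenderPolicy): Decision {
  if (sender.roles.length > 0) return "allow"; // owners/admins bypass membership
  if (sender.isMember) return "allow";
  // Everyone else is "unknown" and falls through to the per-channel policy.
  switch (policy) {
    case "public":
      return "allow";
    case "request_approval":
      return "needs_approval";
    case "strict":
      return "deny"; // the default posture in v2
  }
}
```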
|
||||
|
||||
## Missing from v2
|
||||
1. **`request_approval` flow** — marked TODO in `router.ts:295`. Approval-on-first-contact for unknown senders is scaffolded but not wired
|
||||
2. **`response_scope` enforcement** — field exists (`'all' | 'triggered' | 'allowlisted'`) but is not checked in `router.ts` or `delivery.ts`
|
||||
3. **Trigger-rule matching on `messaging_group_agents`** — `router.ts:198` TODO ("Future: trigger rule matching"); currently only priority-based agent selection
|
||||
4. **Silent-drop option for known-noisy senders** — v1's `mode: 'drop'` allowed "I see you but I ignore you"; v2 can only log and drop
|
||||
|
||||
## Behavioral discrepancies
|
||||
1. **Default posture flipped**: v1 open-by-default vs v2 closed-by-default — **breaking for migrations that relied on default-allow**
|
||||
2. **Drop semantics**: v1 silent drop; v2 `recordDroppedMessage()` always logs
|
||||
3. **Admin bypass**: v1 had no implicit bypass; v2 grants owners/admins access regardless of membership — more permissive for privileged users
|
||||
4. **Scope resolution**: v1 per-chat; v2 per-agent-group via `user_roles.agent_group_id` — misalignment if one chat routes to multiple agent groups with different admins
|
||||
|
||||
## Worth preserving?
|
||||
The v2 role-based model is architecturally superior. The gaps worth closing:
|
||||
- **Finish `request_approval`** flow — half-implemented scaffolding
|
||||
- **Finish `response_scope` enforcement** — exists in schema but unused
|
||||
- **Finish trigger-rule matching** in `pickAgent` — without it, every wired agent fires on every message
|
||||
- **Consider silent-drop via a dedicated table** (`(agent_group_id, sender_pattern)` with action=drop) — orthogonal to privilege
|
||||
@@ -1,44 +0,0 @@
|
||||
# session-cleanup: v1 vs v2
|
||||
|
||||
## Scope
|
||||
- v1: `src/v1/session-cleanup.ts` (26 LOC) + `scripts/cleanup-sessions.sh` (151 LOC) — cadence 24h
|
||||
- v2: `src/host-sweep.ts` (174 LOC) primary, plus `src/container-runtime.ts:60-80` (orphan cleanup), `src/session-manager.ts` (heartbeat path)
|
||||
|
||||
## Capability map
|
||||
|
||||
| v1 behavior | v2 location | Status | Notes |
|
||||
|---|---|---|---|
|
||||
| Cleanup cadence 24h | `host-sweep.ts:31` 60s sweep | **changed** | Continuous monitoring |
|
||||
| Stale session detection via JSONL mtime | `host-sweep.ts:116-151` heartbeat file mtime | simplified | Heartbeat replaces JSONL |
|
||||
| Heartbeat threshold | `STALE_THRESHOLD_MS = 10 * 60 * 1000` (`host-sweep.ts:32`) | **new** | 10 min |
|
||||
| Stuck-processing detection | `getStuckProcessingIds()` via outbound.db (`host-sweep.ts:134`) | **new** | |
|
||||
| Retry with exponential backoff | `BACKOFF_BASE_MS * 2^tries` (`host-sweep.ts:145`) | **new** | |
|
||||
| Max retries | `MAX_TRIES = 5` (`host-sweep.ts:33`) | **new** | Messages → failed after 5 |
|
||||
| Explicit container kill on stale | — | **not done** | Stale detection resets messages, doesn't stop container |
|
||||
| JSONL + tool-results cleanup | — | **removed** | No artifact cleanup (SQLite persists in DB) |
|
||||
| Artifact cleanup (debug logs, todos, telemetry) | — | **removed** | Per-type retention windows gone |
|
||||
| Orphan container cleanup | `container-runtime.ts:60-80` `cleanupOrphans()` | **new** | At startup only |
|
||||
| Active session detection via `store/messages.db` | `getActiveSessions()` from `v2.db` (`host-sweep.ts:52`) | changed | DB schema different |
|
||||
| Sync `processing_ack` (outbound.db → inbound.db) | `syncProcessingAcks()` (`host-sweep.ts:87`) | **new** | |
|
||||
| Wake container for due messages | `countDueMessages()` + `wakeContainer()` (`host-sweep.ts:91-96`) | **new** | Replaces scheduler's role |
|
||||
| Recurrence firing | `handleRecurrence()` (`host-sweep.ts:154-173`) | **new** | Cron-parsed next-run insertion |
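
The retry rows in the table combine into a single decision: compute `BACKOFF_BASE_MS * 2^tries` until `MAX_TRIES` is reached, then mark the message failed. The sketch below assumes a 5-second base, because the document does not state the value of `BACKOFF_BASE_MS`; `nextRetry` and `RetryDecision` are hypothetical names.

```typescript
const MAX_TRIES = 5; // from the table above
const BACKOFF_BASE_MS = 5000; // assumed value; not stated in this document

type RetryDecision = { action: "retry"; delayMs: number } | { action: "fail" };

// `tries` is the number of attempts already made for this message.
function nextRetry(tries: number): RetryDecision {
  if (tries >= MAX_TRIES) return { action: "fail" };
  return { action: "retry", delayMs: BACKOFF_BASE_MS * 2 ** tries };
}
```

With these assumed constants the delays double from 5 s up to 80 s before the message is marked failed.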
|
||||
|
||||
## Missing from v2
|
||||
1. **Artifact cleanup** — v1 pruned JSONLs (7d), debug logs (3d), todos (3d), telemetry (7d), group logs (7d). v2 has none; if v1 leftovers exist on disk, they'll accumulate
|
||||
2. **Explicit container termination** on stale detection — v2 marks messages as retry-eligible but leaves the stale container running; orphan cleanup only runs at next startup
|
||||
3. **Configurable retention windows** — v1 had per-artifact-type retention; v2 constants are hardcoded
|
||||
|
||||
## Behavioral discrepancies
|
||||
| Aspect | v1 | v2 |
|---|---|---|
| Cadence | daily batch | 60s continuous |
| Stale trigger | 24h-old JSONL | 10-min heartbeat |
| Retry | none (session removed) | 5 tries, exp. backoff |
| Container wake | via message loop | via `countDueMessages()` in sweep |
| Transactions | implicit (offline script) | explicit per-session try/finally |
|
||||
|
||||
## Worth preserving?
|
||||
1. **Stop running containers on stale detection**: currently only the startup `cleanupOrphans()` removes them. If a container dies or hangs while the host is running, the host retries its messages but never stops or removes the container itself. Low-cost fix: call `stopContainer(name)` when the heartbeat is stale AND processing_ack is stuck
|
||||
2. **Artifact cleanup migration** — if v1 data exists on disk post-migration, one-time prune is worth scripting. Not a v2 runtime concern
|
||||
3. **Configurable thresholds** — `STALE_THRESHOLD_MS` / `MAX_TRIES` could live in `config.ts` for operational tuning; minor improvement
|
||||
4. **Continuous sweep + recurrence + orphan cleanup** are all **significant improvements**; keep as-is
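
The condition proposed in item 1 (stale heartbeat AND stuck processing acks) is small enough to express as a pure predicate. In this sketch `SessionHealth`, `isStale`, and `shouldStopContainer` are hypothetical names, and treating a missing heartbeat file as "not stale" is an assumption rather than documented behavior.

```typescript
const STALE_THRESHOLD_MS = 10 * 60 * 1000; // 10 minutes, as in host-sweep.ts

// Hypothetical snapshot of what the sweep knows about one session.
interface SessionHealth {
  containerName: string;
  heartbeatMtimeMs: number | null; // null when no heartbeat file exists yet
  stuckProcessingIds: string[]; // e.g. the result of getStuckProcessingIds()
}

function isStale(health: SessionHealth, nowMs: number): boolean {
  if (health.heartbeatMtimeMs === null) return false; // assumed: nothing to judge yet
  return nowMs - health.heartbeatMtimeMs > STALE_THRESHOLD_MS;
}

// Stop only when both signals agree, so an idle container is not killed.
function shouldStopContainer(health: SessionHealth, nowMs: number): boolean {
  return isStale(health, nowMs) && health.stuckProcessingIds.length > 0;
}
```

Requiring both signals keeps an idle but healthy container from being stopped just because it has nothing to process.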
|
||||
@@ -1,100 +0,0 @@
|
||||
# task-scheduler: v1 vs v2
|
||||
|
||||
## Scope
|
||||
|
||||
**v1 task scheduler:**
|
||||
- Files: `src/v1/task-scheduler.ts` (241 lines), `src/v1/task-scheduler.test.ts` (122 lines)
|
||||
- Self-contained scheduler loop with DB persistence and container execution
|
||||
- Stores tasks in central DB table `scheduled_tasks`
|
||||
- Runs a polling loop at `SCHEDULER_POLL_INTERVAL` (configurable, typically 5–60s)
|
||||
|
||||
**v2 task distribution:**
|
||||
- No central task-scheduler file; tasks spread across host sweep and session DBs
|
||||
- Core files: `src/host-sweep.ts` (174 lines), `src/delivery.ts` (task handlers ~line 654–713), `src/db/session-db.ts` (task mutation logic)
|
||||
- Optional: `container/agent-runner/src/task-script.ts` (pre-task script execution)
|
||||
- Task rows live in per-session `inbound.db` table `messages_in` (polymorphic message kind)
|
||||
- Recurrence computed in `src/host-sweep.ts:159–173`
|
||||
|
||||
---
|
||||
|
||||
## Capability map
|
||||
|
||||
| v1 Behavior | v2 Location | Status | Notes |
|
||||
|---|---|---|---|
|
||||
| **One-shot tasks** (schedule_type='once') | `insertTask()` in `src/db/session-db.ts:103–122`; processAfter field set, recurrence=NULL | ✅ Supported | Task inserted into messages_in with process_after timestamp, processed once, no recurrence |
|
||||
| **Recurring via cron** (schedule_type='cron') | `insertTask()` with recurrence field; `host-sweep.ts:159–173` parses cron | ✅ Supported | Cron expression stored in messages_in.recurrence, next occurrence computed on completion via CronExpressionParser |
|
||||
| **Recurring via fixed interval** (schedule_type='interval') | Not directly supported; v2 uses cron for all recurring | ⚠️ Removed | v2 requires cron syntax for recurrence. No interval-based scheduling (e.g., "every 5 minutes") without converting to cron |
|
||||
| **Timezone handling** | `host-sweep.ts:159–161` uses CronExpressionParser with no explicit TZ param; cron-parser respects system TZ | ⚠️ Degraded | v1's explicit TIMEZONE config (via timezone.ts helpers) is absent in v2. Cron evaluation uses system/Node.js default TZ, not agent/session-level configuration |
|
||||
| **Persistence** | Per-session `inbound.db` `messages_in` table + `series_id` grouping | ✅ Supported | Tasks persisted as DB rows with status (pending/completed/paused). Series_id backfilled for recurring task groups |
|
||||
| **Restart recovery** | `host-sweep.ts:85–96` syncs processing_ack on startup to detect stale containers; tasks marked paused if container crashes | ✅ Supported | Stale container detection via heartbeat file mtime (host-sweep.ts:122–131); stuck messages retried with exponential backoff |
|
||||
| **Due-message wake** | `host-sweep.ts:91–96` queries countDueMessages, wakes container if due tasks exist | ✅ Supported | 60s sweep checks for pending tasks with process_after in the past and wakes container if found |
|
||||
| **Missed-run catch-up** (interval-based) | `computeNextRun()` skips past missed intervals to prevent cumulative drift; tests verify no infinite loop | ⚠️ Degraded | v2 doesn't handle missed intervals — if a recurring cron task gets skipped, next occurrence is computed from completion time only. No "make up" for missed runs |
|
||||
| **Cancellation** | `updateTask(id, {status: 'paused'})` prevents retry churn | ✅ Supported | `cancelTask()` in `src/db/session-db.ts:128–132` sets status='completed' and clears recurrence; matches by id OR series_id |
|
||||
| **Pause/resume** | `updateTask(id, {status: 'paused'})` / resume | ✅ Supported | `pauseTask()` (line 134–138) and `resumeTask()` (line 140–144); both match id or series_id |
|
||||
| **Retry-on-failure** | `updateTaskAfterRun()` on error; no explicit retry logic in scheduler loop | ⚠️ Degraded | v2 uses `retryWithBackoff()` only when container goes stale (host-sweep.ts:147). No automatic retry for task execution errors |
|
||||
| **Concurrent-run prevention** | Task status 'active' gate (task-scheduler.ts:221); no concurrent-run logic | ⚠️ Degraded | v2 allows multiple pending tasks to wake the container in the same sweep; container processes serially but no host-level concurrency control |
|
||||
| **Idempotency** | Task ID is primary key; `insertTask()` will fail if re-run with same ID | ✅ Supported | messages_in.id is PRIMARY KEY; insertTask() fails on duplicate (caller must handle or use ON CONFLICT) |
|
||||
| **Max-age drop** | No explicit max-age field; tasks can remain pending indefinitely | ⚠️ Missing | No max-age or TTL in v2 messages_in schema. A stuck task can remain pending forever unless manually cancelled |
|
||||
| **Task context mode** (group vs isolated session) | v1: context_mode field drives session reuse (task-scheduler.ts:122) | ⚠️ Removed | v2 doesn't track context_mode; all tasks are processed in the container's default session context; no isolation toggle |
|
||||
| **Task result logging** | `logTaskRun()` writes to task_runs table; stores error + result summary | ⚠️ Degraded | v2 has no equivalent task_runs table. Task output is written as system messages back to the agent; no persistent audit trail |
|
||||
| **Task script execution** | v1: prompt + optional script field, passed to container | ✅ Supported | v2: `applyPreTaskScripts()` in `container/agent-runner/src/task-script.ts:79–121` runs scripts pre-prompt, enriches prompt with scriptOutput |
|
||||
|
||||
---
|
||||
|
||||
## Missing from v2
|
||||
|
||||
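1. **Interval-based recurrence**: v1 `schedule_type='interval'` (e.g., "every 5000ms") is gone. v2 only supports cron expressions. Workaround: convert to an equivalent cron expression where one exists (e.g., `*/5 * * * *` for every 5 minutes); sub-minute intervals such as 5000ms have no standard five-field equivalent.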
|
||||
|
||||
2. **Timezone awareness** — v1 passed `TIMEZONE` config to cron parser and had explicit `formatLocalTime()` helpers. v2 has no way to specify a session/agent timezone for cron evaluation; it uses the system/Node.js TZ.
|
||||
|
||||
3. **Task context modes** — v1's `context_mode: 'group' | 'isolated'` is removed. No way to force a task into a dedicated session vs. the agent group's shared session.
|
||||
|
||||
4. **Task result audit trail** — v1 logged every run to `task_runs(task_id, run_at, duration_ms, status, result, error)`. v2 has no persistent task execution history; output is a system message only.
|
||||
|
||||
5. **Max-age / task TTL** — v1 tasks could be implicitly aged out (not directly visible in the code, but conceivable via cleanup logic). v2 has no TTL; a paused/completed task lingers in messages_in forever.
|
||||
|
||||
6. **Task-level concurrency control** — v1 prevented concurrent runs of the same task (single status check per loop iteration). v2 can queue multiple pending tasks in one sweep, though the container processes them serially.
|
||||
|
||||
---
|
||||
|
||||
## Behavioral discrepancies
|
||||
|
||||
1. **Missed-interval catch-up** (v1 `computeNextRun()` lines 32–46 vs. v2 absence):
|
||||
- **v1:** If a task is due at 10:00, 10:05, 10:10 but the scheduler is down during 10:00–10:15, it computes `next_run = 10:20` (skips missed intervals, stays on the grid).
|
||||
- **v2:** If the same recurring cron task is skipped, the next occurrence is computed from the *completion* time (host-sweep.ts:160–161), not from the originally scheduled time. Missed occurrences are not made up, and a task scheduled every 5 minutes (:00, :05, :10, ...) may run later than intended when completions are delayed.
|
||||
|
||||
2. **Stale-container recovery** (v1 none vs. v2 heartbeat-based):
|
||||
- **v1:** Tasks remain due if the container crashes; the scheduler will retry on the next poll.
|
||||
- **v2:** If the heartbeat goes stale (container unresponsive for 10 min), stuck processing messages are retried with exponential backoff. Tasks stuck in 'processing' state are reset.
|
||||
|
||||
3. **Task script pre-processing** (v1 prompt + script → container vs. v2 script → output enrichment):
|
||||
- **v1:** Passes the script alongside the prompt to the container; the container execution model is unclear from `task-scheduler.ts` (it likely runs in group-queue).
|
||||
- **v2:** The script runs *before* the agent sees the prompt, inside the agent-runner (per the `container/agent-runner/src/task-script.ts` path); script output (`scriptOutput`) is merged into the prompt JSON via `applyPreTaskScripts()` (task-script.ts:115–117). If the script fails or returns `wakeAgent=false`, the task is skipped entirely.
|
||||
|
||||
4. **Retry semantics**:
|
||||
- **v1:** On execution error (runTask throws), `updateTaskAfterRun()` is called with `error`. Next retry relies on scheduler polling the same task again (no backoff).
|
||||
- **v2:** Execution errors are not retried; container processes the task once. If the container crashes mid-task, the message is retried with exponential backoff only up to `MAX_TRIES=5` (host-sweep.ts:145–150).
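
The grid-preserving catch-up from item 1 can be reproduced in a few lines. This is a reconstruction of the described behavior, not the deleted `computeNextRun()` source; `nextRunOnGrid` and its parameters are hypothetical names.

```typescript
// Returns the first grid point strictly after `nowMs`, where the grid is
// lastScheduledMs + k * intervalMs. Missed runs are skipped, not replayed.
function nextRunOnGrid(lastScheduledMs: number, intervalMs: number, nowMs: number): number {
  if (intervalMs <= 0) throw new RangeError("intervalMs must be positive");
  const elapsed = nowMs - lastScheduledMs;
  // At least one step forward, even if the clock has not reached the last slot.
  const steps = Math.max(1, Math.floor(elapsed / intervalMs) + 1);
  return lastScheduledMs + steps * intervalMs;
}

// The scenario from item 1: a 5-minute task last scheduled for 10:00,
// with the scheduler coming back at 10:15.
const tenOClock = Date.UTC(2024, 0, 1, 10, 0);
const fiveMinutes = 5 * 60 * 1000;
const resumed = nextRunOnGrid(tenOClock, fiveMinutes, Date.UTC(2024, 0, 1, 10, 15));
```

Anchoring on the last scheduled time rather than the completion time is what keeps the schedule on its original grid.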
|
||||
|
||||
---
|
||||
|
||||
## Worth preserving?
|
||||
|
||||
**Interval-based recurrence** (v1 `schedule_type='interval'`) is a practical feature that v2 trades away. Cron syntax is powerful but less intuitive for simple "every X milliseconds" patterns. If users want "run every 30 seconds," they must learn cron, and standard five-field cron cannot express sub-minute schedules at all: a six-field expression such as `*/30 * * * * *` depends on a non-standard seconds field, so the fallback is job-level looping in the prompt. Consider a thin adapter layer in agent-facing APIs to accept `{interval: 5000}` and convert to cron, or extend the v2 schema to support an optional `interval_ms` alongside `recurrence`.
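
A thin adapter along these lines might look like the sketch below. `intervalMsToCron` is a hypothetical helper, not an existing API. It emits only standard five-field expressions and returns `null` for intervals that do not land on a clean minute or hour grid, which is where an `interval_ms` column would still be needed.

```typescript
// Converts a fixed interval to a five-field cron expression when an exact
// equivalent exists; returns null otherwise (e.g. sub-minute or 7-minute intervals).
function intervalMsToCron(intervalMs: number): string | null {
  const MINUTE_MS = 60 * 1000;
  if (!Number.isInteger(intervalMs) || intervalMs <= 0) return null;
  if (intervalMs % MINUTE_MS !== 0) return null; // no seconds field in standard cron

  const minutes = intervalMs / MINUTE_MS;
  if (minutes < 60 && 60 % minutes === 0) {
    // Only divisors of 60 repeat evenly across hour boundaries.
    return minutes === 1 ? "* * * * *" : `*/${minutes} * * * *`;
  }
  if (minutes % 60 === 0) {
    const hours = minutes / 60;
    if (hours === 1) return "0 * * * *";
    if (hours === 24) return "0 0 * * *";
    if (hours < 24 && 24 % hours === 0) return `0 */${hours} * * *`;
  }
  return null;
}
```

Note that the cron form pins runs to wall-clock boundaries (for example :00, :05, :10), whereas a v1 interval was anchored to the task's own start time, so the two are equivalent in cadence but not necessarily in phase.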
|
||||
|
||||
**Task context modes** (`group` vs. `isolated`) were a way to isolate task execution context. v2's removal simplifies the model but loses the ability to run a task in a fresh container state. If a task needs a clean slate (no session history), that's now impossible; workaround is a manual system-action to clear session state before running the task.
|
||||
|
||||
**Task result audit trail** is a gap for operational visibility. v2's system messages are ephemeral; there's no way to query "how many times did task X run and what were the outcomes?" Adding a lightweight `task_execution_log` table (optional, populated on task completion) would help without burdening the common case.
|
||||
|
||||
---
|
||||
|
||||
## References by line
|
||||
|
||||
- v1 task-scheduler: `src/v1/task-scheduler.ts:20–49` (computeNextRun), `:203–235` (startSchedulerLoop)
|
||||
- v1 test coverage: `src/v1/task-scheduler.test.ts:49–121` (drift, missed-interval, once-task tests)
|
||||
- v1 timezone: `src/v1/timezone.ts:26–37` (formatLocalTime with explicit TZ)
|
||||
- v1 types: `src/v1/types.ts:60–74` (ScheduledTask interface with context_mode)
|
||||
- v2 sweep: `src/host-sweep.ts:154–173` (handleRecurrence, insertRecurrence)
|
||||
- v2 delivery system actions: `src/delivery.ts:645–713` (handleSystemAction switch on schedule_task/cancel_task/pause_task/resume_task/update_task)
|
||||
- v2 session-db: `src/db/session-db.ts:103–198` (insertTask, cancelTask, pauseTask, resumeTask, updateTask, all with series_id matching)
|
||||
- v2 task-script: `container/agent-runner/src/task-script.ts:79–121` (applyPreTaskScripts, wakeAgent logic)
|
||||
- v2 DB schema: `docs/db-session.md:31–56` (messages_in table with process_after, recurrence, series_id)
|
||||
@@ -1,570 +0,0 @@
|
||||
# v1 Timezone + Formatting — Recreation Spec
|
||||
|
||||
## Source commits
|
||||
|
||||
**Parent of deletion**: `86becf8^ = 27c52205f9fdeac0483600b2663f1c4d80aba45d`
|
||||
|
||||
**Deletion commit**: `86becf8` (chore: delete v1 reference code)
|
||||
|
||||
### Relevant v1 files at commit 27c5220 (`86becf8^`):
|
||||
- `src/v1/router.ts` — message formatting logic (escapeXml, formatMessages, stripInternalTags, formatOutbound)
|
||||
- `src/v1/timezone.ts` — timezone utility functions (isValidTimezone, resolveTimezone, formatLocalTime)
|
||||
- `src/v1/config.ts` — configuration and trigger patterns (buildTriggerPattern, getTriggerPattern, TIMEZONE resolution)
|
||||
- `src/v1/task-scheduler.ts` — scheduled task timezone handling (computeNextRun with cron-parser)
|
||||
- `src/v1/types.ts` — data structures (NewMessage interface)
|
||||
- `src/v1/formatting.test.ts` — comprehensive test suite for all formatting behavior
|
||||
- `src/v1/timezone.test.ts` — timezone utility tests
|
||||
- `src/v1/task-scheduler.test.ts` — scheduler tests
|
||||
|
||||
---
|
||||
|
||||
## 1. Timestamp formatting on inbound messages
|
||||
|
||||
### v1 behavior (exact)
|
||||
|
||||
**Function**: `formatLocalTime()` in `src/v1/timezone.ts:26-36`
|
||||
|
||||
```typescript
export function formatLocalTime(utcIso: string, timezone: string): string {
  const date = new Date(utcIso);
  return date.toLocaleString('en-US', {
    timeZone: resolveTimezone(timezone),
    year: 'numeric',
    month: 'short',
    day: 'numeric',
    hour: 'numeric',
    minute: '2-digit',
    hour12: true,
  });
}
```
|
||||
|
||||
**Input**: UTC ISO 8601 timestamp (e.g., `'2024-01-01T00:00:00.000Z'`) + timezone name (e.g., `'America/New_York'`)
|
||||
|
||||
**Output format example**:
|
||||
- Input: `'2024-01-01T18:30:00.000Z'` with timezone `'America/New_York'` (EST, UTC-5)
|
||||
- Output: `'1:30 PM'` (with additional date components: month short name, day, year, hour, 2-digit minute, 12-hour format)
|
||||
- Full example output: `"Jan 1, 2024, 1:30 PM"` (the locale is fixed to `en-US`, but exact spacing and punctuation can still vary with the Node/ICU version)
|
||||
|
||||
**Critical Details**:
|
||||
- Uses `Date.prototype.toLocaleString` (backed by the `Intl.DateTimeFormat` API) with the `en-US` locale
|
||||
- Format options: `{ year: 'numeric', month: 'short', day: 'numeric', hour: 'numeric', minute: '2-digit', hour12: true }`
|
||||
- Handles invalid timezone gracefully by calling `resolveTimezone(timezone)` which falls back to UTC
|
||||
- No external dependencies (no moment.js, date-fns, or day.js)
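
The UTC fallback relies on `isValidTimezone` and `resolveTimezone`, whose bodies are not reproduced in this spec. A plausible dependency-free reconstruction, assuming validation is done by letting `Intl.DateTimeFormat` reject unknown zone names, is sketched below; the actual v1 implementation may differ.

```typescript
// Assumed implementation: Intl.DateTimeFormat throws a RangeError for
// time zone names it does not recognize.
function isValidTimezone(tz: string): boolean {
  try {
    new Intl.DateTimeFormat("en-US", { timeZone: tz });
    return true;
  } catch {
    return false;
  }
}

// Falls back to UTC instead of throwing, matching the behavior described above.
function resolveTimezone(tz: string): string {
  return isValidTimezone(tz) ? tz : "UTC";
}

// Usage: an unrecognized zone name formats as UTC rather than raising.
const formatted = new Date("2024-01-01T18:30:00.000Z").toLocaleString("en-US", {
  timeZone: resolveTimezone("Not/AZone"),
  hour: "numeric",
  minute: "2-digit",
  hour12: true,
});
```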
|
||||
|
||||
**Where it's called**:
|
||||
- `src/v1/router.ts:11` in `formatMessages()` function to convert each message's `m.timestamp` to display time
|
||||
- The display time is then placed in the `time="..."` attribute of the XML message element
|
||||
|
||||
### Test coverage
|
||||
|
||||
From `src/v1/formatting.test.ts:51-84`:
|
||||
|
||||
1. **Basic formatting with context header**
|
||||
- Input: Single message with timestamp `'2024-01-01T00:00:00.000Z'`, timezone `'UTC'`
|
||||
- Asserts: `result.toContain('Jan 1, 2024')` and `'<context timezone="UTC" />'`
|
||||
- File:line: `src/v1/formatting.test.ts:51-56`
|
||||
|
||||
2. **Timezone conversion to local time**
|
||||
- Input: Timestamp `'2024-01-01T18:30:00.000Z'` with timezone `'America/New_York'` (EST)
|
||||
- Asserts: Result contains `'1:30'` and `'PM'` (correct EST conversion, UTC-5)
|
||||
- File:line: `src/v1/formatting.test.ts:74-78`
|
||||
|
||||
From `src/v1/timezone.test.ts:10-33`:
|
||||
|
||||
3. **formatLocalTime with timezone conversion**
|
||||
- Input: `'2026-02-04T18:30:00.000Z'` with `'America/New_York'`
|
||||
- Asserts: Contains `'1:30'`, `'PM'`, `'Feb'`, `'2026'`
|
||||
- File:line: `src/v1/timezone.test.ts:10-16`
|
||||
|
||||
4. **Multiple timezones comparison**
|
||||
- Input: Same UTC time with different timezones (`'America/New_York'`, `'Asia/Tokyo'`)
|
||||
- Asserts: NY shows `'8:00'` (EDT, UTC-4 in summer), Tokyo shows `'9:00'` (UTC+9)
|
||||
- File:line: `src/v1/timezone.test.ts:18-26`
|
||||
|
||||
5. **Invalid timezone fallback**
|
||||
- Input: Invalid timezone `'IST-2'`
|
||||
- Asserts: Does not throw, formats as UTC (falls back)
|
||||
- File:line: `src/v1/timezone.test.ts:28-33`
|
||||
|
||||
---
|
||||
|
||||
## 2. Context timezone header
|
||||
|
||||
### v1 behavior (exact)
|
||||
|
||||
**Location**: Prepended at the START of the formatted message block in `src/v1/router.ts:20-22`
|
||||
|
||||
**Format**:
|
||||
```xml
<context timezone="<TIMEZONE_NAME>" />
```
|
||||
|
||||
**Code**:
|
||||
```typescript
const header = `<context timezone="${escapeXml(timezone)}" />\n`;
return `${header}<messages>\n${lines.join('\n')}\n</messages>`;
```
|
||||
|
||||
**What it includes**:
|
||||
- Only the timezone name (IANA identifier, e.g., `'UTC'`, `'America/New_York'`)
|
||||
- **NOT** the current time (that's in each individual message's `time="..."` attribute)
|
||||
- XML-escaped to prevent injection (via `escapeXml()`)
|
||||
|
||||
**Per-message vs per-turn**:
|
||||
- The header appears **once per call to `formatMessages()`**, which formats a batch of messages
|
||||
- The entire batch (header + all messages) is passed to the agent as a single unit
|
||||
- The `timezone` parameter is passed in by the caller (see the function signature at `src/v1/router.ts:9`)
|
||||
|
||||
**Where it's wired**:
|
||||
- `src/v1/router.ts:9` — `formatMessages(messages: NewMessage[], timezone: string)` accepts timezone as a parameter
|
||||
- This function is called from the channel message processing loop (inbound message handler)
|
||||
- The caller supplies the `TIMEZONE` constant from `src/v1/config.ts:62`
|
||||
|
||||
### Test coverage
|
||||
|
||||
From `src/v1/formatting.test.ts:51-56`:
|
||||
|
||||
1. **Context header is included in output**
|
||||
- Input: Any message list with timezone `'UTC'`
|
||||
- Asserts: `result.toContain('<context timezone="UTC" />')`
|
||||
- File:line: `src/v1/formatting.test.ts:51-56`
|
||||
|
||||
2. **Context header with non-UTC timezone**
|
||||
- Input: Timezone `'America/New_York'`
|
||||
- Asserts: `result.toContain('<context timezone="America/New_York" />')`
|
||||
- File:line: `src/v1/formatting.test.ts:74-78`
|
||||
|
||||
3. **Context header with empty message list**
|
||||
- Input: Empty array with timezone `'UTC'`
|
||||
- Asserts: `result.toContain('<context timezone="UTC" />')` even when no messages
|
||||
- File:line: `src/v1/formatting.test.ts:80-83`
|
||||
|
||||
---
|
||||
|
||||
## 3. Reply-to handling with message IDs
|
||||
|
||||
### v1 behavior (exact)
|
||||
|
||||
**Location**: In the message formatting loop in `src/v1/router.ts:10-18`
|
||||
|
||||
**Code**:
|
||||
```typescript
const replyAttr = m.reply_to_message_id ? ` reply_to="${escapeXml(m.reply_to_message_id)}"` : '';
const replySnippet =
  m.reply_to_message_content && m.reply_to_sender_name
    ? `\n <quoted_message from="${escapeXml(m.reply_to_sender_name)}">${escapeXml(m.reply_to_message_content)}</quoted_message>`
    : '';
return `<message sender="${escapeXml(m.sender_name)}" time="${escapeXml(displayTime)}"${replyAttr}>${replySnippet}${escapeXml(m.content)}</message>`;
```
|
||||
|
||||
**Format of reply-to**:
|
||||
- Attribute: `reply_to="<MESSAGE_ID>"` on the `<message>` tag (if `m.reply_to_message_id` is present)
|
||||
- The ID is XML-escaped via `escapeXml()`
|
||||
- Nested element: `<quoted_message from="<SENDER_NAME>"><MESSAGE_CONTENT></quoted_message>` (if both sender and content are present)
|
||||
- Both sender name and content are XML-escaped
|
||||
|
||||
**What it contains**:
|
||||
- `reply_to="<id>"` attribute with the exact message ID from `m.reply_to_message_id`
|
||||
- Sender name from `m.reply_to_sender_name`
|
||||
- Original message content from `m.reply_to_message_content`
|
||||
- **No timestamp** of the referenced message
|
||||
|
||||
**Conditional rendering**:
|
||||
1. If `m.reply_to_message_id` is present: include `reply_to="<id>"` attribute
|
||||
2. If `m.reply_to_message_id` is present but content/sender missing: include attribute only, no `<quoted_message>` element
|
||||
3. If only content and sender (no ID): only `<quoted_message>` element, no attribute
|
||||
|
||||
**Example output**:
|
||||
```xml
|
||||
<message sender="Alice" time="Jan 1, 2024, 12:00 PM" reply_to="42">
|
||||
<quoted_message from="Bob">Are you coming tonight?</quoted_message>
|
||||
Yes, on my way!</message>
|
||||
```
|
||||
|
||||
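
The three conditional-rendering rules can be exercised in isolation with a minimal sketch. The `ReplyableMessage` interface and the `formatOne` name are hypothetical and exist only for this illustration; `escapeXml` is re-implemented here on the assumption that it escapes exactly the four characters the v1 tests cover (`&`, `<`, `>`, `"`).

```typescript
// Hypothetical minimal message shape; the real v1 `NewMessage` has more fields.
interface ReplyableMessage {
  sender_name: string;
  content: string;
  reply_to_message_id?: string;
  reply_to_sender_name?: string;
  reply_to_message_content?: string;
}

// Assumed escaping rules: & first, so later replacements are not double-escaped.
function escapeXml(s: string): string {
  return s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Mirrors the quoted v1 template for a single message.
function formatOne(m: ReplyableMessage, displayTime: string): string {
  // Rule 1: the attribute depends only on the ID.
  const replyAttr = m.reply_to_message_id
    ? ` reply_to="${escapeXml(m.reply_to_message_id)}"`
    : '';
  // Rules 2 and 3: the nested element needs both sender and content.
  const replySnippet =
    m.reply_to_message_content && m.reply_to_sender_name
      ? `\n <quoted_message from="${escapeXml(m.reply_to_sender_name)}">${escapeXml(m.reply_to_message_content)}</quoted_message>`
      : '';
  return `<message sender="${escapeXml(m.sender_name)}" time="${escapeXml(displayTime)}"${replyAttr}>${replySnippet}${escapeXml(m.content)}</message>`;
}

const idOnly = formatOne(
  { sender_name: 'Alice', content: 'ok', reply_to_message_id: '42', reply_to_sender_name: 'Bob' },
  'Jan 1, 2024, 12:00 PM',
);
console.log(idOnly); // attribute present, no <quoted_message> element
```

Because the attribute and the nested element are computed independently, a v2 port can keep the two checks separate instead of nesting one inside the other.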
### Test coverage

From `src/v1/formatting.test.ts:96-139`:

1. **Reply with both ID and quoted content**
   - Input: Message with `reply_to_message_id: '42'`, `reply_to_sender_name: 'Bob'`, `reply_to_message_content: 'Are you coming tonight?'`, content: `'Yes, on my way!'`
   - Asserts:
     - `result.toContain('reply_to="42"')`
     - `result.toContain('<quoted_message from="Bob">Are you coming tonight?</quoted_message>')`
     - `result.toContain('Yes, on my way!</message>')`
   - File:line: `src/v1/formatting.test.ts:96-112`

2. **No reply context when missing**
   - Input: Message without reply fields
   - Asserts:
     - `result.not.toContain('reply_to')`
     - `result.not.toContain('quoted_message')`
   - File:line: `src/v1/formatting.test.ts:114-119`

3. **ID present but content missing**
   - Input: `reply_to_message_id: '42'`, `reply_to_sender_name: 'Bob'`, but NO `reply_to_message_content`
   - Asserts:
     - `result.toContain('reply_to="42"')`
     - `result.not.toContain('quoted_message')`
   - File:line: `src/v1/formatting.test.ts:121-130`

4. **XML escape in reply context**
   - Input: `reply_to_message_id: '1'`, `reply_to_sender_name: 'A & B'`, `reply_to_message_content: '<script>alert("xss")</script>'`
   - Asserts:
     - `result.toContain('from="A &amp; B"')`
     - `result.toContain('&lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;')`
   - File:line: `src/v1/formatting.test.ts:131-139`

---

## 4. Internal tag stripping

### v1 behavior (exact)

**Function name**: `stripInternalTags()` in `src/v1/router.ts:25-27`

**Implementation**:
```typescript
export function stripInternalTags(text: string): string {
  return text.replace(/<internal>[\s\S]*?<\/internal>/g, '').trim();
}
```

**Regex pattern**: `/<internal>[\s\S]*?<\/internal>/g`
- `<internal>`: literal opening tag
- `[\s\S]*?`: match any character (whitespace or non-whitespace, so newlines are included) non-greedily
- `<\/internal>`: literal closing tag
- `g` flag: global (all matches)

**Post-processing**: `.trim()` removes leading/trailing whitespace after all tags are stripped. Whitespace inside the remaining text is left untouched.

**Where it's called**:
- `src/v1/router.ts:30` in `formatOutbound()` function
- `formatOutbound()` strips the tags first, then returns the remaining text, or `''` if nothing is left

**Used for**: Stripping internal thinking/reasoning from outbound messages before sending to channel

**Input/Output examples**:

1. Single-line internal tag:
   - Input: `'hello <internal>secret</internal> world'`
   - Output: `'hello  world'` (two spaces remain where the tag was removed; `.trim()` only affects the ends of the string)

2. Multi-line internal tags:
   - Input: `'hello <internal>\nsecret\nstuff\n</internal> world'`
   - Output: `'hello  world'` (two spaces)

3. Multiple blocks:
   - Input: `'<internal>a</internal>hello<internal>b</internal>'`
   - Output: `'hello'`

4. Only internal content:
   - Input: `'<internal>only this</internal>'`
   - Output: `''` (empty after trim)

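
The four cases above can be run directly against the quoted function. The sketch below also adds a greedy variant, `stripGreedy`, which is a hypothetical counter-example and not part of v1, to show why the `*?` quantifier matters when a message contains more than one block.

```typescript
// Same body as the v1 function quoted above.
function stripInternalTags(text: string): string {
  return text.replace(/<internal>[\s\S]*?<\/internal>/g, '').trim();
}

// Hypothetical greedy variant, shown only for contrast.
function stripGreedy(text: string): string {
  return text.replace(/<internal>[\s\S]*<\/internal>/g, '').trim();
}

const twoBlocks = '<internal>a</internal>hello<internal>b</internal>';

// Non-greedy: each block is matched separately, so the text between survives.
console.log(JSON.stringify(stripInternalTags(twoBlocks))); // "hello"

// Greedy: one match runs from the first opening tag to the last closing tag.
console.log(JSON.stringify(stripGreedy(twoBlocks))); // ""

// Inner whitespace is preserved; only the ends are trimmed.
console.log(JSON.stringify(stripInternalTags('hello <internal>secret</internal> world'))); // "hello  world"
```

A port that needs a single space between the surviving fragments would have to collapse whitespace explicitly; the v1 function as quoted does not do so.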
### Test coverage

From `src/v1/formatting.test.ts:163-181`:

1. **Single-line tag stripping**
   - Input: `'hello <internal>secret</internal> world'`
   - Asserts: Result is `'hello  world'` (two spaces where the tag was removed)
   - Expected (with trim): still `'hello  world'`, because `.trim()` only removes outer whitespace
   - File:line: `src/v1/formatting.test.ts:163-165`

2. **Multi-line tag stripping**
   - Input: `'hello <internal>\nsecret\nstuff\n</internal> world'`
   - Asserts: Result is `'hello  world'` (two spaces, after trim)
   - File:line: `src/v1/formatting.test.ts:167-169`

3. **Multiple internal blocks**
   - Input: `'<internal>a</internal>hello<internal>b</internal>'`
   - Asserts: Result is `'hello'`
   - File:line: `src/v1/formatting.test.ts:171-173`

4. **Only internal content**
   - Input: `'<internal>only this</internal>'`
   - Asserts: Result is `''` (empty string)
   - File:line: `src/v1/formatting.test.ts:175-177`

From `src/v1/formatting.test.ts:183-194`:

5. **formatOutbound with no internal tags**
   - Input: `'hello world'`
   - Asserts: Result is `'hello world'`
   - File:line: `src/v1/formatting.test.ts:183-185`

6. **formatOutbound with all internal content**
   - Input: `'<internal>hidden</internal>'`
   - Asserts: Result is `''` (returns early after strip)
   - File:line: `src/v1/formatting.test.ts:187-189`

7. **formatOutbound strips and returns remaining**
   - Input: `'<internal>thinking</internal>The answer is 42'`
   - Asserts: Result is `'The answer is 42'`
   - File:line: `src/v1/formatting.test.ts:191-194`

---

## 5. Timezone handling for scheduled tasks

### v1 behavior (exact)

**Location**: `src/v1/task-scheduler.ts:20-49`

**Key function**: `computeNextRun(task: ScheduledTask): string | null`

**Cron timezone handling**:
```typescript
if (task.schedule_type === 'cron') {
  const interval = CronExpressionParser.parse(task.schedule_value, {
    tz: TIMEZONE,
  });
  return interval.next().toISOString();
}
```

**Critical details**:
- Uses `cron-parser` library's `CronExpressionParser.parse()` method
- Passes timezone option as `{ tz: TIMEZONE }` (e.g., `{ tz: 'America/New_York' }`)
- `TIMEZONE` is imported from `src/v1/config.ts:62` and resolved via `resolveConfigTimezone()`
- The cron expression is interpreted in the **user's timezone**, not UTC
- Example: cron `'0 9 * * *'` with `tz: 'America/New_York'` means 9 AM ET every day

**Interval task handling**:
```typescript
if (task.schedule_type === 'interval') {
  const ms = parseInt(task.schedule_value, 10);
  if (!ms || ms <= 0) {
    logger.warn({ taskId: task.id, value: task.schedule_value }, 'Invalid interval value');
    return new Date(now + 60_000).toISOString();
  }
  let next = new Date(task.next_run!).getTime() + ms;
  while (next <= now) {
    next += ms;
  }
  return new Date(next).toISOString();
}
```

**Interval specifics**:
- Intervals are timezone-agnostic (pure millisecond-based)
- Anchored to the task's `next_run` time to prevent cumulative drift
- If intervals have been missed, the loop skips forward to land in the future while maintaining the original schedule grid

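
The anchoring rule is easier to see with concrete numbers. The sketch below is a standalone rewrite of the interval branch under two assumptions: `now` is passed in as a parameter rather than read from the clock, and the `nextIntervalRun` name is hypothetical. It is meant as an illustration of the grid behavior, not as the v1 function itself.

```typescript
// Standalone sketch of the interval branch; `now` is injected for testability.
function nextIntervalRun(previousNextRunIso: string, intervalMs: number, now: number): string {
  if (!Number.isFinite(intervalMs) || intervalMs <= 0) {
    // Same fallback as the quoted code: retry one minute from now.
    return new Date(now + 60_000).toISOString();
  }
  // Anchor to the previously scheduled time, not to `now`, so lateness
  // in actually running the task never shifts the grid.
  let next = new Date(previousNextRunIso).getTime() + intervalMs;
  // Skip any slots that are already in the past.
  while (next <= now) {
    next += intervalMs;
  }
  return new Date(next).toISOString();
}

const scheduled = '2024-01-01T00:00:00.000Z';
const scheduledMs = new Date(scheduled).getTime();

// Ran 2 seconds late: next slot is scheduled + 60s, not now + 60s.
console.log(nextIntervalRun(scheduled, 60_000, scheduledMs + 2_000));
// 2024-01-01T00:01:00.000Z

// Ten intervals missed: jumps to the first grid slot in the future.
console.log(nextIntervalRun(scheduled, 60_000, scheduledMs + 10 * 60_000 + 2_000));
// 2024-01-01T00:11:00.000Z
```

The loop runs once per missed slot, which is fine for the catch-up sizes the tests cover; a port that expects very long outages with very short intervals may prefer to compute the skip arithmetically.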
**Once-only tasks**:
```typescript
if (task.schedule_type === 'once') return null;
```

**MCP tool description**:
- v1 did not expose cron task scheduling directly to the agent (it was a server-side feature)
- The scheduling was configured in group config files, not via agent tool calls

### Test coverage

From `src/v1/task-scheduler.test.ts:33-60`:

1. **computeNextRun returns null for once-tasks**
   - Input: Task with `schedule_type: 'once'`
   - Asserts: `computeNextRun(task)` returns `null`
   - File:line: `src/v1/task-scheduler.test.ts:40-49`

2. **Interval task anchoring to prevent drift**
   - Input: Task scheduled 2s ago with interval `60000` (1 minute)
   - Asserts: Next run = `scheduledTime + 60s`, not `now + 60s`
   - Expected: Exact alignment to the scheduled time grid
   - File:line: `src/v1/task-scheduler.test.ts:33-39`

3. **Interval task catches up without infinite loop**
   - Input: Task with 10 missed intervals (missed by 10 * 60000ms)
   - Asserts: Next run is in the future and aligned to original schedule grid
   - File:line: `src/v1/task-scheduler.test.ts:51-60`

---

## 6. Complete test inventory (formatting.test.ts)

### All test cases from src/v1/formatting.test.ts (lines 1-254):

#### Block 1: escapeXml tests (lines 22-46)

| Test name | Input | Expected output |
|-----------|-------|-----------------|
| escapes ampersands | `'a & b'` | `'a &amp; b'` |
| escapes less-than | `'a < b'` | `'a &lt; b'` |
| escapes greater-than | `'a > b'` | `'a &gt; b'` |
| escapes double quotes | `'"hello"'` | `'&quot;hello&quot;'` |
| handles multiple special characters together | `'a & b < c > d "e"'` | `'a &amp; b &lt; c &gt; d &quot;e&quot;'` |
| passes through strings with no special chars | `'hello world'` | `'hello world'` |
| handles empty string | `''` | `''` |

#### Block 2: formatMessages tests (lines 48-159)

| Test name | Input | Key asserts |
|-----------|-------|------------|
| formats a single message as XML with context header (line 51) | Single message with timestamp `'2024-01-01T00:00:00.000Z'`, TZ `'UTC'` | Contains `'<context timezone="UTC" />'`, `'<message sender="Alice"'`, `'>hello</message>'`, `'Jan 1, 2024'` |
| formats multiple messages (line 59) | 2 messages: Alice at 00:00, Bob at 01:00 | Contains both sender names and contents |
| escapes special characters in sender names (line 72) | Sender `'A & B <Co>'` | Contains `'sender="A &amp; B &lt;Co&gt;"'` |
| escapes special characters in content (line 79) | Content `'<script>alert("xss")</script>'` | Contains escaped script tags `'&lt;script&gt;...'` |
| handles empty array (line 85) | Empty message list, TZ `'UTC'` | Contains header and `'<messages>\n\n</messages>'` |
| renders reply context as quoted_message element (line 96) | Message with `reply_to_message_id: '42'`, `reply_to_sender_name: 'Bob'`, `reply_to_message_content: 'Are you coming tonight?'` | Contains `'reply_to="42"'`, `'<quoted_message from="Bob">Are you coming tonight?</quoted_message>'` |
| omits reply attributes when no reply context (line 114) | Message without reply fields | Does NOT contain `'reply_to'` or `'quoted_message'` |
| omits quoted_message when content is missing but id is present (line 121) | Message with `reply_to_message_id: '42'` but no `reply_to_message_content` | Contains `'reply_to="42"'` but NOT `'<quoted_message'` |
| escapes special characters in reply context (line 131) | Sender `'A & B'`, content `'<script>alert("xss")</script>'` | Contains `'from="A &amp; B"'` and escaped script |
| converts timestamps to local time for given timezone (line 140) | Timestamp `'2024-01-01T18:30:00.000Z'` with TZ `'America/New_York'` (EST, UTC-5) | Contains `'1:30'`, `'PM'`, header has `'America/New_York'` |

#### Block 3: TRIGGER_PATTERN tests (lines 146-169)

| Test name | Input | Expected result |
|-----------|-------|-----------------|
| matches @name at start of message (line 152) | `'@Andy hello'` (assuming ASSISTANT_NAME='Andy') | `true` |
| matches case-insensitively (line 156) | `'@andy hello'` or `'@ANDY hello'` | `true` |
| does not match when not at start of message (line 160) | `'hello @Andy'` | `false` |
| does not match partial name like @NameExtra (word boundary) (line 164) | `'@Andyextra hello'` | `false` |
| matches with word boundary before apostrophe (line 168) | `'@Andy\'s thing'` | `true` |
| matches @name alone (end of string is a word boundary) (line 172) | `'@Andy'` | `true` |
| matches with leading whitespace after trim (line 175) | `' @Andy hey'` (after `.trim()`) | `true` |

#### Block 4: getTriggerPattern tests (lines 177-196)

| Test name | Input | Expected behavior |
|-----------|-------|-------------------|
| uses the configured per-group trigger when provided (line 180) | `getTriggerPattern('@Claw')` | Matches `'@Claw hello'`, does NOT match `'@Andy hello'` |
| falls back to the default trigger when group trigger is missing (line 186) | `getTriggerPattern(undefined)` | Matches default trigger `'@Andy hello'` |
| treats regex characters in custom triggers literally (line 192) | `getTriggerPattern('@C.L.A.U.D.E')` | Matches literal dots, NOT wildcard (does NOT match `'@CXLXAUXDXE'`) |

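
The behaviors in Blocks 3 and 4 are consistent with a pattern of the form "start anchor, literal trigger, word boundary, case-insensitive". The sketch below is one way to satisfy every row of the two tables; it is an assumption about the shape of the implementation, not the v1 source. `DEFAULT_TRIGGER` and `escapeRegex` are hypothetical names, and the default assumes `ASSISTANT_NAME='Andy'` as the tests do.

```typescript
// Assumption: default trigger derived from ASSISTANT_NAME='Andy'.
const DEFAULT_TRIGGER = '@Andy';

// Escape regex metacharacters so a custom trigger is matched literally.
function escapeRegex(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build a start-anchored, case-insensitive pattern ending in a word boundary.
function getTriggerPattern(trigger?: string): RegExp {
  const literal = trigger && trigger.trim() ? trigger.trim() : DEFAULT_TRIGGER;
  return new RegExp(`^${escapeRegex(literal)}\\b`, 'i');
}

const defaultPattern = getTriggerPattern();
console.log(defaultPattern.test('@Andy hello'));      // true
console.log(defaultPattern.test('hello @Andy'));      // false (not at start)
console.log(defaultPattern.test('@Andyextra hello')); // false (no word boundary)
console.log(defaultPattern.test("@Andy's thing"));    // true (apostrophe is a boundary)

const dotted = getTriggerPattern('@C.L.A.U.D.E');
console.log(dotted.test('@C.L.A.U.D.E hello')); // true
console.log(dotted.test('@CXLXAUXDXE hello'));  // false (dots are literal)
```

The pattern is built without the `g` flag, so repeated `.test()` calls do not carry `lastIndex` state between messages.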
#### Block 5: stripInternalTags tests (lines 198-210)

| Test name | Input | Expected output |
|-----------|-------|-----------------|
| strips single-line internal tags (line 199) | `'hello <internal>secret</internal> world'` | `'hello  world'` (two spaces; `.trim()` only removes outer whitespace) |
| strips multi-line internal tags (line 203) | `'hello <internal>\nsecret\nstuff\n</internal> world'` | `'hello  world'` (two spaces) |
| strips multiple internal tag blocks (line 207) | `'<internal>a</internal>hello<internal>b</internal>'` | `'hello'` |
| returns empty string when text is only internal tags (line 211) | `'<internal>only this</internal>'` | `''` |

#### Block 6: formatOutbound tests (lines 213-226)

| Test name | Input | Expected output |
|-----------|-------|-----------------|
| returns text with internal tags stripped (line 214) | `'hello world'` | `'hello world'` |
| returns empty string when all text is internal (line 218) | `'<internal>hidden</internal>'` | `''` |
| strips internal tags from remaining text (line 222) | `'<internal>thinking</internal>The answer is 42'` | `'The answer is 42'` |

#### Block 7: trigger gating (requiresTrigger interaction) tests (lines 228-254)

| Test name | Input | Expected result |
|-----------|-------|-----------------|
| main group always processes (no trigger needed) (line 239) | `isMainGroup: true`, message without trigger | `true` |
| main group processes even with requiresTrigger=true (line 244) | `isMainGroup: true`, `requiresTrigger: true`, no trigger | `true` |
| non-main group with requiresTrigger=undefined requires trigger (line 249) | `isMainGroup: false`, `requiresTrigger: undefined`, no trigger | `false` |
| non-main group with requiresTrigger=true requires trigger (line 254) | `isMainGroup: false`, `requiresTrigger: true`, no trigger | `false` |
| non-main group with requiresTrigger=true processes when trigger present (line 259) | `isMainGroup: false`, trigger in message | `true` |
| non-main group uses per-group trigger instead of default (line 264) | `isMainGroup: false`, `trigger: '@Claw'`, message `'@Claw do something'` | `true` |
| non-main group does not process when only default trigger is present for custom-trigger group (line 269) | `isMainGroup: false`, `trigger: '@Claw'`, message `'@Andy do something'` | `false` |
| non-main group with requiresTrigger=false always processes (line 274) | `isMainGroup: false`, `requiresTrigger: false`, no trigger | `true` |

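
The Block 7 rows reduce to a short decision order: main group first, then an explicit `requiresTrigger: false`, then the trigger match. The sketch below encodes that order. `GroupGate`, `shouldProcess`, and `triggerPattern` are hypothetical names chosen for the illustration, and the default trigger again assumes `ASSISTANT_NAME='Andy'`.

```typescript
// Hypothetical subset of the group fields that affect gating.
interface GroupGate {
  isMainGroup: boolean;
  requiresTrigger?: boolean; // undefined behaves like true for non-main groups
  trigger?: string;          // per-group trigger; default is used when missing
}

const DEFAULT_TRIGGER = '@Andy'; // assumption: ASSISTANT_NAME is 'Andy'

// Start-anchored, case-insensitive, literal match followed by a word boundary.
function triggerPattern(trigger?: string): RegExp {
  const literal = (trigger ?? DEFAULT_TRIGGER).replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp(`^${literal}\\b`, 'i');
}

function shouldProcess(group: GroupGate, messageText: string): boolean {
  if (group.isMainGroup) return true;               // main group never needs a trigger
  if (group.requiresTrigger === false) return true; // explicit opt-out only
  return triggerPattern(group.trigger).test(messageText.trim());
}

console.log(shouldProcess({ isMainGroup: true, requiresTrigger: true }, 'hello'));           // true
console.log(shouldProcess({ isMainGroup: false }, 'hello'));                                 // false
console.log(shouldProcess({ isMainGroup: false, trigger: '@Claw' }, '@Andy do something')); // false
```

Comparing against `false` explicitly, rather than testing truthiness, is what makes `requiresTrigger: undefined` behave like `true` as the third table row requires.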
---

## v2 porting plan

### For each of sections 1–5: the specific change to make in v2

#### 1. Timestamp formatting

**v2 file to modify**: (Unknown; search for where v2 formats inbound messages to the agent)

**Change needed**:
1. Find where v2 currently formats message timestamps for the agent
2. Replace any custom date formatting with the v1 pattern:
   - Call `new Date(timestamp).toLocaleString('en-US', { timeZone, year: 'numeric', month: 'short', day: 'numeric', hour: 'numeric', minute: '2-digit', hour12: true })`
3. Ensure the timezone parameter is sourced from `config.TIMEZONE` (or equivalent in v2)

**Test to port**: `src/v1/formatting.test.ts:51-56` (basic formatting) and `src/v1/formatting.test.ts:74-78` (timezone conversion)

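
As a rough illustration of step 2, the formatting call can be wrapped in a small helper. The `formatDisplayTime` name is hypothetical, and the sketch omits any invalid-timezone fallback. The exact separator before `AM`/`PM` depends on the runtime's ICU data (some versions emit a narrow no-break space), which is presumably why the v1 tests assert substrings rather than the full string.

```typescript
// Hypothetical helper wrapping the v1 formatting options.
function formatDisplayTime(utcIso: string, timeZone: string): string {
  return new Date(utcIso).toLocaleString('en-US', {
    timeZone,
    year: 'numeric',
    month: 'short',
    day: 'numeric',
    hour: 'numeric',
    minute: '2-digit',
    hour12: true,
  });
}

// 18:30 UTC on Jan 1 is 1:30 PM in New York (EST, UTC-5).
const display = formatDisplayTime('2024-01-01T18:30:00.000Z', 'America/New_York');
console.log(display.includes('Jan 1, 2024')); // true
console.log(display.includes('1:30'));        // true
console.log(display.includes('PM'));          // true
```

Asserting on fragments keeps a ported test stable across Node versions with different ICU releases.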
#### 2. Context timezone header

**v2 file to modify**: (Unknown; search for where v2 constructs the XML/prompt for inbound messages)

**Change needed**:
1. Prepend `<context timezone="<TIMEZONE_NAME>" />\n` to the formatted message block
2. The timezone should be the resolved IANA identifier (e.g., `'UTC'`, `'America/New_York'`)
3. Ensure it's placed BEFORE the `<messages>` element

**Test to port**: `src/v1/formatting.test.ts:51-56` and `src/v1/formatting.test.ts:80-83` (empty array still has header)

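
A minimal sketch of the header placement, assuming the message lines have already been formatted and are joined with newlines inside `<messages>`. The `wrapMessages` name is hypothetical, and the join format is inferred from the `'<messages>\n\n</messages>'` assertion for the empty-array case rather than taken from the v1 source.

```typescript
// Assumed escaping of the four characters covered by the v1 escapeXml tests.
function escapeXml(s: string): string {
  return s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical wrapper: header first, then the <messages> element.
function wrapMessages(messageLines: string[], timezone: string): string {
  const header = `<context timezone="${escapeXml(timezone)}" />\n`;
  return `${header}<messages>\n${messageLines.join('\n')}\n</messages>`;
}

// The header is emitted even when there are no messages.
console.log(JSON.stringify(wrapMessages([], 'UTC')));
// "<context timezone=\"UTC\" />\n<messages>\n\n</messages>"
```

Building the header outside the per-message loop is what guarantees it appears exactly once and before `<messages>`, including for an empty list.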
#### 3. Reply-to with message ID

**v2 file to modify**: (Unknown; search for where v2 formats message metadata)

**Change needed**:
1. If `message.reply_to_message_id` is present, add ` reply_to="<ID>"` attribute to the `<message>` element
2. If BOTH `message.reply_to_message_content` AND `message.reply_to_sender_name` are present, include a nested `<quoted_message from="<SENDER>"><CONTENT></quoted_message>` element
3. XML-escape all three values (ID, sender name, content)

**Test to port**:
- `src/v1/formatting.test.ts:96-112` (full reply context)
- `src/v1/formatting.test.ts:121-130` (ID only, no content)
- `src/v1/formatting.test.ts:131-139` (XML escaping in reply)

#### 4. Internal tag stripping

**v2 file to modify**: (Unknown; search for where v2 processes outbound messages before sending)

**Change needed**:
1. Apply the regex `/<internal>[\s\S]*?<\/internal>/g` to strip all internal thinking/reasoning blocks
2. Call `.trim()` on the result after stripping
3. Return empty string if result is empty after stripping

**Test to port**:
- `src/v1/formatting.test.ts:163-177` (stripInternalTags)
- `src/v1/formatting.test.ts:183-194` (formatOutbound)

#### 5. Scheduled task timezone handling

**v2 file to modify**: (Unknown; search for where v2 handles cron task scheduling)

**Change needed**:
1. When parsing cron expressions, pass the timezone option to cron-parser:
   ```typescript
   import { CronExpressionParser } from 'cron-parser';

   const interval = CronExpressionParser.parse(cronExpression, { tz: TIMEZONE });
   ```
2. For interval-based tasks, anchor to the original `next_run` time, not `Date.now()`, to prevent drift
3. Ensure the TIMEZONE constant is resolved at startup via a function like:
   ```typescript
   // `envConfig` and `isValidTimezone()` are assumed to be defined elsewhere (not shown here).
   function resolveConfigTimezone(): string {
     // First valid candidate wins; order is process env, config file, system default.
     const candidates = [process.env.TZ, envConfig.TZ, Intl.DateTimeFormat().resolvedOptions().timeZone];
     for (const tz of candidates) {
       if (tz && isValidTimezone(tz)) return tz;
     }
     return 'UTC';
   }
   ```

**Test to port**:
- `src/v1/task-scheduler.test.ts:33-39` (interval anchoring)
- `src/v1/task-scheduler.test.ts:40-49` (once-task returns null)
- `src/v1/task-scheduler.test.ts:51-60` (interval catch-up)

---

## Git references for verification

All code snippets above can be verified with:

```bash
git show 27c5220:src/v1/router.ts
git show 27c5220:src/v1/timezone.ts
git show 27c5220:src/v1/config.ts
git show 27c5220:src/v1/task-scheduler.ts
git show 27c5220:src/v1/types.ts
git show 27c5220:src/v1/formatting.test.ts
git show 27c5220:src/v1/timezone.test.ts
git show 27c5220:src/v1/task-scheduler.test.ts
```

Or from the deletion parent commit:

```bash
git show 86becf8^:src/v1/<filename>
```
@@ -1,27 +0,0 @@
# timezone: v1 vs v2

## Scope
- v1: `src/v1/timezone.ts` (37 LOC), `src/v1/timezone.test.ts` (64 LOC)
- v2 counterparts: `src/timezone.ts` (37 LOC), `src/timezone.test.ts` (64 LOC)

## Capability map

| v1 behavior | v2 location | Status | Notes |
|---|---|---|---|
| `isValidTimezone(tz)` | `src/timezone.ts:5-12` | kept | Byte-identical |
| `resolveTimezone(tz)` | `src/timezone.ts:17-19` | kept | Byte-identical |
| `formatLocalTime(utcIso, timezone)` | `src/timezone.ts:26-37` | kept | Byte-identical |

## Tests (byte-identical)
- `formatLocalTime`: UTC→local display with offset; DST awareness (EDT vs EST); fall back to UTC on invalid tz without throwing
- `isValidTimezone`: accepts `America/New_York`, `UTC`, `Asia/Tokyo`, `Asia/Jerusalem`; rejects `IST-2`, `XYZ+3`, empty/garbage
- `resolveTimezone`: returns tz if valid; falls back to UTC on invalid or empty

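
The validation and fallback behavior listed above can be sketched with the built-in `Intl` API. This is an assumption about how such helpers are typically written, not a copy of `src/timezone.ts`: constructing an `Intl.DateTimeFormat` with an unknown `timeZone` throws a `RangeError`, so a try/catch around the constructor serves as the validity check.

```typescript
// Sketch: a timezone is valid if Intl accepts it as a `timeZone` option.
function isValidTimezone(tz: string): boolean {
  if (!tz) return false; // treat empty input as invalid up front
  try {
    new Intl.DateTimeFormat('en-US', { timeZone: tz });
    return true;
  } catch {
    return false; // RangeError for identifiers Intl does not recognize
  }
}

// Sketch: fall back to UTC rather than throwing on bad input.
function resolveTimezone(tz: string): string {
  return isValidTimezone(tz) ? tz : 'UTC';
}

console.log(isValidTimezone('America/New_York')); // true
console.log(isValidTimezone('Not/AZone'));        // false
console.log(resolveTimezone(''));                 // UTC
```

Delegating the check to `Intl` keeps the helper free of external dependencies, which matches the "no external deps" observation in this comparison.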
## Missing from v2
None. The v1 and v2 files are byte-for-byte identical.

## Behavioral discrepancies
None.

## Worth preserving?
No action needed: v2 already mirrors v1 exactly. Minimal, correct, no external deps. No cron-time conversions in either version (that logic lived in `task-scheduler.ts`).
@@ -1,58 +0,0 @@
# types: v1 vs v2

## Scope
- v1: `src/v1/types.ts` (112 LOC), 10 exported types/interfaces covering AdditionalMount, MountAllowlist, AllowedRoot, ContainerConfig, RegisteredGroup, NewMessage, ScheduledTask, TaskRunLog, Channel, OnInboundMessage/OnChatMetadata
- v2 counterparts (distributed):
  - `src/types.ts`: central DB entities (`AgentGroup`, `MessagingGroup`, `MessageIn`, `User`, `MessagingGroupAgent` etc.)
  - `src/container-config.ts`: file-based per-group container config
  - `src/mount-security.ts`: mount types
  - `src/channels/adapter.ts`: v2 channel interface
  - `container/agent-runner/src/db/messages-in.ts`, `destinations.ts`: session-level types
  - `src/db/schema.ts`: schema reference

## Capability map

| v1 type / field | v2 location | Status | Notes |
|---|---|---|---|
| `AdditionalMount` | `src/mount-security.ts:16-18` | kept | Same fields |
| `MountAllowlist` / `AllowedRoot` | `src/mount-security.ts:21-29` | kept | `nonMainReadOnly` field removed (see container-runtime doc) |
| `ContainerConfig` | split: `src/container-config.ts:36` (file-based) + `src/mount-security.ts` | refactored | `timeout` dropped; added `mcpServers`, `packages`, `imageTag` |
| `RegisteredGroup` | `agent_groups` + `messaging_group_agents` + `container.json` | refactored | One entity split across two DB tables + filesystem |
| `RegisteredGroup.trigger` | `messaging_group_agents.trigger_rules` JSON | moved | Per-wiring, not per-group |
| `RegisteredGroup.containerConfig` | `groups/<folder>/container.json` | moved | DB → disk |
| `RegisteredGroup.isMain` | convention (`agent_group_id = 'main'`) | removed | No explicit flag |
| `NewMessage` | split: `MessageIn` (`src/types.ts:98-111`) + `InboundMessage` (`src/channels/adapter.ts:33-38`) + `MessageInRow` (`container/.../db/messages-in.ts`) | refactored | Platform fields separated |
| `NewMessage.chat_jid` | `channel_type` + `platform_id` | refactored | Explicit split, no more JID parsing |
| `NewMessage.sender` / `sender_name` | inside JSON `content` blob | moved | Less type safety, more flexibility |
| `NewMessage.is_from_me` / `is_bot_message` | (none) | removed | Inferred from identity or `messages_out` |
| `NewMessage.reply_to_*` | inside `content` blob | moved | |
| `ScheduledTask` (entire type) | `MessageIn` with `kind='task'` + `recurrence` | removed | No separate task entity; no task UI/API |
| `TaskRunLog` | (none) | removed | No audit trail in v2 |
| `Channel` (connect/disconnect/sendMessage/ownsJid/syncGroups/setTyping) | `ChannelAdapter` (`src/channels/adapter.ts:60-105`) | refactored | Stateless request/response, async, no callback loop |
| `Channel.ownsJid` | (none) | removed | Routing keyed on `channel_type + platform_id` |
| `OnInboundMessage(chatJid, message)` | `onInbound(platformId, threadId, message)` | refactored | Routing fields explicit |
| `OnChatMetadata` | `onMetadata(platformId, name?, isGroup?)` | refactored | Drops timestamp/channel params |

## Schema diff (v1 `RegisteredGroup` → v2 split)
- **Identity** (`name`, `folder`, `created_at`) → `agent_groups` table
- **Wiring** (`trigger`, `requiresTrigger`) → `messaging_group_agents` table (`trigger_rules`, `response_scope`, `session_mode`)
- **Container config** (`containerConfig`) → `groups/<folder>/container.json`
- Normalization gain: an agent group can have N wirings with different triggers

## Missing from v2
1. `ScheduledTask` + `TaskRunLog`: no first-class task entity or execution log
2. `ContainerConfig.timeout`: per-group timeout override gone; single hardcoded `IDLE_TIMEOUT`
3. `NewMessage.is_from_me` / `is_bot_message`: flat flags gone
4. `Channel.ownsJid`: JID ownership concept gone
5. `Channel.connect()`/`disconnect()`/`isConnected()` lifecycle: replaced by stateless `setup`/`teardown`

## Behavioral discrepancies
- **JID → channel_type + platform_id**: routing fields are now structured, not bundled strings
- **Pull vs push channels**: v1 channels pushed events via callbacks; v2 adapters are stateless with DB-mediated flow
- **Container config storage**: v1 in DB, v2 on disk (survives container restarts without DB query)

## Worth preserving?
- **ScheduledTask / TaskRunLog**: v2's removal leaves a visibility gap; if scheduled-task introspection matters, reintroduce a log table keyed on `messages_in.id` to capture run metadata
- **Per-group timeout**: meaningful loss; some agent groups are slow, others fast; hardcoded timeout = false positives
- **is_from_me / is_bot_message**: trivial to reconstruct; not worth restoring
- **Channel lifecycle callbacks**: obsolete; v2 model is cleaner