mirror of
https://github.com/qwibitai/nanoclaw.git
synced 2026-06-04 10:14:47 +08:00
feat(v2): builder-agent self-modification WIP + container-config as per-group file
Checkpoints the builder-agent dev-agent/worktree/swap flow (create_dev_agent, request_swap, classifier, deadman, promote) before pivoting to a unified draft-activate approach with OS-level RO enforcement.

Lifts container_config out of the agent_groups row into groups/<folder>/container.json so install_packages, add_mcp_server, and rebuild flows can eventually route through the same draft path as source edits.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -35,4 +35,7 @@ groups/global/*
|
||||
# Skills system (local per-installation state)
|
||||
.nanoclaw/
|
||||
|
||||
# Builder-agent worktrees (ephemeral, per-request)
|
||||
.worktrees/
|
||||
|
||||
agents-sdk-docs
|
||||
|
||||
@@ -1,79 +1,142 @@
|
||||
# NanoClaw
|
||||
|
||||
Personal Claude assistant. See [README.md](README.md) for philosophy and setup. See [docs/REQUIREMENTS.md](docs/REQUIREMENTS.md) for architecture decisions.
|
||||
Personal Claude assistant. See [README.md](README.md) for philosophy and setup. Architecture lives in `docs/v2-*.md`.
|
||||
|
||||
## Quick Context
|
||||
## Quick Context (v2)
|
||||
|
||||
Single Node.js process with skill-based channel system. Channels (WhatsApp, Telegram, Slack, Discord, Gmail) are skills that self-register at startup. Messages route to Claude Agent SDK running in containers (Linux VMs). Each group has isolated filesystem and memory.
|
||||
v2 is the current branch and codebase. v1 still exists under `src/v1/` and `container/agent-runner/src/v1/` for reference but is no longer the runtime. If a file mentions v1 in its comments, it is probably stale.
|
||||
|
||||
The host is a single Node process that orchestrates per-session agent containers. Platform messages land via channel adapters, route through an entity model (users → messaging groups → agent groups → sessions), get written into the session's inbound DB, and wake a container. The agent-runner inside the container polls the DB, calls Claude, and writes back to the outbound DB. The host polls the outbound DB and delivers through the same adapter.
|
||||
|
||||
**Everything is a message.** There is no IPC, no file watcher, no stdin piping between host and container. The two session DBs are the sole IO surface.
|
||||
|
||||
## Entity Model
|
||||
|
||||
```
|
||||
users (id "<channel>:<handle>", kind, display_name)
|
||||
user_roles (user_id, role, agent_group_id) — owner | admin (global or scoped)
|
||||
agent_group_members (user_id, agent_group_id) — unprivileged access gate
|
||||
user_dms (user_id, channel_type, messaging_group_id) — cold-DM cache
|
||||
|
||||
agent_groups (workspace, memory, CLAUDE.md, personality, container config)
|
||||
↕ many-to-many via messaging_group_agents (session_mode, trigger_rules, priority)
|
||||
messaging_groups (one chat/channel on one platform; unknown_sender_policy)
|
||||
|
||||
sessions (agent_group_id + messaging_group_id + thread_id → per-session container)
|
||||
```
|
||||
|
||||
Privilege is user-level (owner/admin), not agent-group-level. See [docs/v2-isolation-model.md](docs/v2-isolation-model.md) for the three isolation levels (`agent-shared`, `shared`, separate agents).
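
To make the `sessions` row above concrete, here is a minimal sketch of how the three identifying columns could be combined into a lookup key. The `SessionRef` type and `sessionKey` function are hypothetical names for illustration and are not taken from the NanoClaw source; allowing `threadId` to be `null` is also an assumption.

```typescript
// Hypothetical sketch: SessionRef and sessionKey are illustrative names only.
interface SessionRef {
  agentGroupId: string;
  messagingGroupId: string;
  threadId: string | null; // assumption: a chat without threads has no thread id
}

// Encode the triple as a JSON array so that ids containing separator
// characters cannot collide with each other.
function sessionKey(ref: SessionRef): string {
  return JSON.stringify([ref.agentGroupId, ref.messagingGroupId, ref.threadId]);
}
```

Two messages resolve to the same session only when all three parts match, which is why a different thread in the same chat would get its own key in this sketch.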
|
||||
|
||||
## Two-DB Session Split
|
||||
|
||||
Each session has **two** SQLite files under `data/v2-sessions/<session_id>/`:
|
||||
|
||||
- `inbound.db` — host writes, container reads. `messages_in`, routing, destinations, pending_questions, processing_ack.
|
||||
- `outbound.db` — container writes, host reads. `messages_out`, session_state.
|
||||
|
||||
Exactly one writer per file — no cross-mount lock contention. Heartbeat is a file touch at `/workspace/.heartbeat`, not a DB update. Host uses even `seq` numbers, container uses odd.
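
As a rough illustration of the even/odd rule, the sketch below shows one way each side could pick its next `seq` without coordinating with the other. The names `Writer`, `nextSeq`, and `writerOf` are hypothetical and not taken from the NanoClaw source; only the parity split comes from the text above, and `lastSeen` is assumed to be a non-negative integer.

```typescript
// Hypothetical sketch: names are illustrative, not NanoClaw's actual API.
type Writer = 'host' | 'container';

// Return the smallest seq greater than `lastSeen` with the writer's parity:
// even for the host, odd for the container. Assumes lastSeen >= 0.
function nextSeq(writer: Writer, lastSeen: number): number {
  const wantOdd = writer === 'container';
  const candidate = lastSeen + 1;
  const isOdd = candidate % 2 === 1;
  return isOdd === wantOdd ? candidate : candidate + 1;
}

// Identify which side wrote a row from its seq alone.
function writerOf(seq: number): Writer {
  return seq % 2 === 0 ? 'host' : 'container';
}
```

Because the two sides draw from disjoint number sets, neither can produce a `seq` the other has used, even though they never share a lock.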
|
||||
|
||||
## Central DB
|
||||
|
||||
`data/v2.db` holds everything that isn't per-session: users, user_roles, agent_groups, messaging_groups, wiring, pending_approvals, pending_credentials, pending_swaps, user_dms, chat_sdk_* (for the Chat SDK bridge), schema_version. Migrations live at `src/db/migrations/`.
|
||||
|
||||
## Key Files
|
||||
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| `src/index.ts` | Orchestrator: state, message loop, agent invocation |
|
||||
| `src/channels/registry.ts` | Channel registry (self-registration at startup) |
|
||||
| `src/ipc.ts` | IPC watcher and task processing |
|
||||
| `src/router.ts` | Message formatting and outbound routing |
|
||||
| `src/config.ts` | Trigger pattern, paths, intervals |
|
||||
| `src/container-runner.ts` | Spawns agent containers with mounts |
|
||||
| `src/task-scheduler.ts` | Runs scheduled tasks |
|
||||
| `src/db.ts` | SQLite operations |
|
||||
| `groups/{name}/CLAUDE.md` | Per-group memory (isolated) |
|
||||
| `container/skills/` | Skills loaded inside agent containers (browser, status, formatting) |
|
||||
| `src/index.ts` | Entry point: init DB, migrations, channel adapters, delivery polls, sweep, shutdown |
|
||||
| `src/router.ts` | Inbound routing: messaging group → agent group → session → `inbound.db` → wake |
|
||||
| `src/delivery.ts` | Polls `outbound.db`, delivers via adapter, handles system actions (schedule, approvals, etc.) |
|
||||
| `src/host-sweep.ts` | 60s sweep: `processing_ack` sync, stale detection, due-message wake, recurrence |
|
||||
| `src/session-manager.ts` | Resolves sessions; opens `inbound.db` / `outbound.db`; manages heartbeat path |
|
||||
| `src/container-runner.ts` | Spawns per-agent-group Docker containers with session DB + outbox mounts, OneCLI `ensureAgent` |
|
||||
| `src/container-runtime.ts` | Runtime selection (Docker vs Apple containers), orphan cleanup |
|
||||
| `src/access.ts` | `pickApprover`, `pickApprovalDelivery`, admin resolution for `NANOCLAW_ADMIN_USER_IDS` |
|
||||
| `src/onecli-approvals.ts` | OneCLI credentialed-action approval bridge |
|
||||
| `src/credentials.ts` | `trigger_credential_collection` host side — modal, OneCLI write-back |
|
||||
| `src/user-dm.ts` | Cold-DM resolution + `user_dms` cache |
|
||||
| `src/group-init.ts` | Per-agent-group filesystem scaffold (CLAUDE.md, skills, agent-runner-src overlay) |
|
||||
| `src/builder-agent/` | Self-modification feature: dev-agent spawn, worktree, classifier, swap, deadman, promote. See `docs/v2-builder-agent-plan.md` |
|
||||
| `src/db/` | DB layer — agent_groups, messaging_groups, sessions, user_roles, user_dms, pending_*, migrations |
|
||||
| `src/channels/` | Channel adapters + Chat SDK bridge |
|
||||
| `container/agent-runner/src/` | Agent-runner: poll loop, formatter, provider abstraction, MCP tools, destinations |
|
||||
| `container/skills/` | Container skills mounted into every agent session |
|
||||
| `groups/<folder>/` | Per-agent-group filesystem (CLAUDE.md, skills, `agent-runner-src/` overlay for builder-agent) |
|
||||
| `scripts/init-first-agent.ts` | Bootstrap the first DM-wired agent (used by `/init-first-agent` skill) |
|
||||
|
||||
## Secrets / Credentials / Proxy (OneCLI)
|
||||
## Self-Modification
|
||||
|
||||
API keys, secret keys, OAuth tokens, and auth credentials are managed by the OneCLI gateway — which handles secret injection into containers at request time, so no keys or tokens are ever passed to containers directly. Run `onecli --help`.
|
||||
Three tiers of agent self-modification, lightest first:
|
||||
|
||||
1. **`install_packages` / `add_mcp_server` / `request_rebuild`** — changes to the per-agent-group container config only (apt/npm deps, wire an existing MCP server). Admin approval, rebuild, container restart. `container/agent-runner/src/mcp-tools/self-mod.ts`.
|
||||
2. **`trigger_credential_collection`** — user provides an API key via a secure modal; value goes straight into OneCLI and never enters agent context. `src/credentials.ts`.
|
||||
3. **`create_dev_agent` + `request_swap`** — heaviest path. Agent spawns a dev-agent clone in a git worktree overlaid with the group's private `agent-runner-src/`, the dev agent edits source, the host classifies the diff, routes for approval, applies a per-path swap, and runs a deadman-restart dance. Every swap commits to `main` for audit. Full design in [docs/v2-builder-agent-plan.md](docs/v2-builder-agent-plan.md).
|
||||
|
||||
## Secrets / Credentials / OneCLI
|
||||
|
||||
API keys, OAuth tokens, and auth credentials are managed by the OneCLI gateway. Secrets are injected into per-agent containers at request time — none are passed in env vars or through chat context. `src/onecli-secrets.ts`, `src/onecli-approvals.ts`, `ensureAgent()` in `container-runner.ts`. Run `onecli --help`.
|
||||
|
||||
## Skills
|
||||
|
||||
Four types of skills exist in NanoClaw. See [CONTRIBUTING.md](CONTRIBUTING.md) for the full taxonomy and guidelines.
|
||||
Four types of skills. See [CONTRIBUTING.md](CONTRIBUTING.md) for the full taxonomy.
|
||||
|
||||
- **Feature skills** — merge a `skill/*` branch to add capabilities (e.g. `/add-telegram`, `/add-slack`)
|
||||
- **Utility skills** — ship code files alongside SKILL.md (e.g. `/claw`)
|
||||
- **Operational skills** — instruction-only workflows, always on `main` (e.g. `/setup`, `/debug`)
|
||||
- **Container skills** — loaded inside agent containers at runtime (`container/skills/`)
|
||||
- **Feature skills** — `skill/*` branches merged via `scripts/apply-skill.ts` (e.g. `/add-discord-v2`, `/add-slack-v2`, `/add-whatsapp-v2`)
|
||||
- **Utility skills** — ship code files alongside `SKILL.md` (e.g. `/claw`)
|
||||
- **Operational skills** — instruction-only workflows (`/setup`, `/debug`, `/customize`, `/init-first-agent`, `/manage-channels`, `/init-onecli`, `/update-nanoclaw`)
|
||||
- **Container skills** — loaded inside agent containers at runtime (`container/skills/`: `welcome`, `self-customize`, `agent-browser`, `slack-formatting`)
|
||||
|
||||
| Skill | When to Use |
|
||||
|-------|-------------|
|
||||
| `/setup` | First-time installation, authentication, service configuration |
|
||||
| `/customize` | Adding channels, integrations, changing behavior |
|
||||
| `/setup` | First-time install, auth, service config |
|
||||
| `/init-first-agent` | Bootstrap the first DM-wired agent (channel pick → identity → wire → welcome DM) |
|
||||
| `/manage-channels` | Wire channels to agent groups with isolation level decisions |
|
||||
| `/customize` | Adding channels, integrations, behavior changes |
|
||||
| `/debug` | Container issues, logs, troubleshooting |
|
||||
| `/update-nanoclaw` | Bring upstream NanoClaw updates into a customized install |
|
||||
| `/init-onecli` | Install OneCLI Agent Vault and migrate `.env` credentials to it |
|
||||
| `/qodo-pr-resolver` | Fetch and fix Qodo PR review issues interactively or in batch |
|
||||
| `/get-qodo-rules` | Load org- and repo-level coding rules from Qodo before code tasks |
|
||||
| `/update-nanoclaw` | Bring upstream updates into a customized install |
|
||||
| `/init-onecli` | Install OneCLI Agent Vault and migrate `.env` credentials |
|
||||
|
||||
## Contributing
|
||||
|
||||
Before creating a PR, adding a skill, or preparing any contribution, you MUST read [CONTRIBUTING.md](CONTRIBUTING.md). It covers accepted change types, the four skill types and their guidelines, SKILL.md format rules, PR requirements, and the pre-submission checklist (searching for existing PRs/issues, testing, description format).
|
||||
Before creating a PR, adding a skill, or preparing any contribution, you MUST read [CONTRIBUTING.md](CONTRIBUTING.md). It covers accepted change types, the four skill types and their guidelines, `SKILL.md` format rules, and the pre-submission checklist.
|
||||
|
||||
## Development
|
||||
|
||||
Run commands directly—don't tell the user to run them.
|
||||
Run commands directly — don't tell the user to run them.
|
||||
|
||||
```bash
|
||||
npm run dev # Run with hot reload
|
||||
npm run build # Compile TypeScript
|
||||
./container/build.sh # Rebuild agent container
|
||||
npm run dev # Host with hot reload
|
||||
npm run build # Compile host TypeScript (src/)
|
||||
./container/build.sh # Rebuild agent container image (nanoclaw-agent:latest)
|
||||
npm test # Host tests
|
||||
```
|
||||
|
||||
Container typecheck is a separate tsconfig — if you edit `container/agent-runner/src/`, run `npx tsc -p container/agent-runner/tsconfig.json --noEmit` to check it.
|
||||
|
||||
Service management:
|
||||
```bash
|
||||
# macOS (launchd)
|
||||
launchctl load ~/Library/LaunchAgents/com.nanoclaw.plist
|
||||
launchctl load ~/Library/LaunchAgents/com.nanoclaw.plist
|
||||
launchctl unload ~/Library/LaunchAgents/com.nanoclaw.plist
|
||||
launchctl kickstart -k gui/$(id -u)/com.nanoclaw # restart
|
||||
|
||||
# Linux (systemd)
|
||||
systemctl --user start nanoclaw
|
||||
systemctl --user stop nanoclaw
|
||||
systemctl --user restart nanoclaw
|
||||
systemctl --user start|stop|restart nanoclaw
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
Host logs: `logs/nanoclaw.log` (normal) and `logs/nanoclaw.error.log` (errors only — some delivery/approval failures only show up here).
|
||||
|
||||
**WhatsApp not connecting after upgrade:** WhatsApp is now a separate skill, not bundled in core. Run `/add-whatsapp` (or `npx tsx scripts/apply-skill.ts .claude/skills/add-whatsapp && npm run build`) to install it. Existing auth credentials and groups are preserved.
|
||||
## v2 Docs Index
|
||||
|
||||
| Doc | Purpose |
|-----|---------|
| [docs/v2-architecture-draft.md](docs/v2-architecture-draft.md) | Full architecture writeup |
| [docs/v2-api-details.md](docs/v2-api-details.md) | Host API + DB schema details |
| [docs/v2-agent-runner-details.md](docs/v2-agent-runner-details.md) | Agent-runner internals + MCP tool interface |
| [docs/v2-isolation-model.md](docs/v2-isolation-model.md) | Three-level channel isolation model |
| [docs/v2-setup-wiring.md](docs/v2-setup-wiring.md) | What's wired, what's open in the setup flow |
| [docs/v2-builder-agent-plan.md](docs/v2-builder-agent-plan.md) | Self-modification via dev-agent delegation |
| [docs/v2-checklist.md](docs/v2-checklist.md) | Rolling status checklist across all subsystems |
| [docs/v2-architecture-diagram.md](docs/v2-architecture-diagram.md) | Diagram version of the architecture |
|
||||
|
||||
## Container Build Cache
|
||||
|
||||
|
||||
@@ -23,11 +23,18 @@ export interface CommandInfo {
|
||||
/**
|
||||
* Categorize a message as a command or not.
|
||||
* Only applies to chat/chat-sdk messages.
|
||||
*
|
||||
* The extracted `senderId` is compared against `NANOCLAW_ADMIN_USER_IDS`
|
||||
* which stores ids in the namespaced form `<channel_type>:<raw>` (see
|
||||
* src/db/users.ts). chat-sdk-bridge serializes `author.userId` as a raw
|
||||
* platform id with no prefix, so we prefix it here. If the id already
|
||||
* contains a `:` we assume it's pre-namespaced (non-chat-sdk adapters
|
||||
* that populate `senderId` directly) and leave it alone.
|
||||
*/
|
||||
export function categorizeMessage(msg: MessageInRow): CommandInfo {
|
||||
const content = parseContent(msg.content);
|
||||
const text = (content.text || '').trim();
|
||||
const senderId = content.senderId || content.author?.userId || null;
|
||||
const senderId = extractSenderId(msg, content);
|
||||
|
||||
if (!text.startsWith('/')) {
|
||||
return { category: 'none', command: '', text, senderId };
|
||||
@@ -47,6 +54,17 @@ export function categorizeMessage(msg: MessageInRow): CommandInfo {
|
||||
return { category: 'passthrough', command, text, senderId };
|
||||
}
|
||||
|
||||
// eslint-disable-next-line @typescript-eslint/no-explicit-any
function extractSenderId(msg: MessageInRow, content: any): string | null {
  const raw: string | null = content?.senderId || content?.author?.userId || null;
  if (!raw) return null;
  // Already namespaced (e.g. "telegram:123"): use as-is.
  if (raw.includes(':')) return raw;
  // Raw platform id from chat-sdk serialization: prefix with channel type.
  if (!msg.channel_type) return raw;
  return `${msg.channel_type}:${raw}`;
}
|
||||
|
||||
/**
|
||||
* Routing context extracted from messages_in rows.
|
||||
* Copied to messages_out by default so responses go back to the sender.
|
||||
|
||||
@@ -31,7 +31,7 @@ export const createAgent: McpToolDefinition = {
|
||||
tool: {
|
||||
name: 'create_agent',
|
||||
description:
|
||||
'Create a new child agent with a given name. The name you choose becomes the destination name you use to message this agent. Admin-only. Fire-and-forget — you will receive a notification when the agent is created.',
|
||||
"Create a long-lived companion sub-agent (research assistant, task manager, specialist) — the name becomes your destination for it. NOT for source-code changes — use `create_dev_agent` for those. Admin-only. Fire-and-forget.",
|
||||
inputSchema: {
|
||||
type: 'object' as const,
|
||||
properties: {
|
||||
|
||||
@@ -0,0 +1,116 @@
|
||||
/**
 * Builder-agent MCP tools: create_dev_agent (for originating agents) and
 * request_swap (for dev agents).
 *
 * Both are fire-and-forget: the tool writes a system action row to
 * messages_out and returns immediately. The host processes the request and
 * notifies the agent via a chat message when complete.
 *
 * See `src/builder-agent/handlers.ts` on the host for the receive side.
 */
|
||||
import { writeMessageOut } from '../db/messages-out.js';
import type { McpToolDefinition } from './types.js';

function log(msg: string): void {
  console.error(`[mcp-tools] ${msg}`);
}

function generateId(): string {
  return `msg-${Date.now()}-${Math.random().toString(36).slice(2, 8)}`;
}

function ok(text: string) {
  return { content: [{ type: 'text' as const, text }] };
}

function err(text: string) {
  return { content: [{ type: 'text' as const, text: `Error: ${text}` }], isError: true };
}
|
||||
|
||||
export const createDevAgent: McpToolDefinition = {
  tool: {
    name: 'create_dev_agent',
    description:
      "Spawn a dev agent to edit NanoClaw's own source code — new built-in MCP tools, runner/host bug fixes, new skill files, Dockerfile/package.json/migration changes, writing a new MCP server from scratch. Heaviest self-mod path: new container, git worktree, admin approval, swap-and-restart.\n\nPrefer lighter tools when they fit: `install_packages` (new apt/npm dep in your container), `add_mcp_server` (wire an EXISTING third-party server you can invoke by command+args), `trigger_credential_collection` (API key/token), `create_agent` (long-lived companion sub-agent), `request_rebuild` (rebuild after approved config change).\n\nTwo-step flow: (1) call with just a name — does NOT start work, (2) after the 'ready' notification, send task details via `<message to=\"<name>\">`. Do not include task details in this call.",
    inputSchema: {
      type: 'object' as const,
      properties: {
        name: {
          type: 'string',
          description:
            'Short descriptive destination name for the dev agent (e.g. "dev-welcome-message", "dev-fix-typo"). Becomes the local destination you address it by. Tearing down a previous dev agent for this group is automatic on create.',
        },
      },
      required: ['name'],
    },
  },
  async handler(args) {
    const name = (args.name as string)?.trim();
    if (!name) return err('name is required');

    const requestId = generateId();
    writeMessageOut({
      id: requestId,
      kind: 'system',
      content: JSON.stringify({
        action: 'create_dev_agent',
        requestId,
        name,
      }),
    });

    log(`create_dev_agent: ${requestId} → "${name}"`);
    return ok(
      `Dev agent creation submitted. You will be notified when it is ready. When you see that notification, send it a message with <message to="${name}">...task details here...</message> to kick off the work. The dev agent does NOT start working until you message it.`,
    );
  },
};
|
||||
|
||||
export const requestSwap: McpToolDefinition = {
  tool: {
    name: 'request_swap',
    description:
      'From a dev agent: submit your committed worktree changes for admin approval. The summaries become the human-readable portion of the approval card. Fire-and-forget.',
    inputSchema: {
      type: 'object' as const,
      properties: {
        overall_summary: {
          type: 'string',
          description:
            'Overall summary of the code change: what it does, why, and any risk. This is what the admin/owner reads first, so be concrete.',
        },
        per_file_summaries: {
          type: 'object',
          description:
            'Map of relative worktree path → one-sentence explanation of what changed in that file. Every changed file should have an entry.',
          additionalProperties: { type: 'string' },
        },
      },
      required: ['overall_summary', 'per_file_summaries'],
    },
  },
  async handler(args) {
    const overall = (args.overall_summary as string)?.trim();
    const perFile = args.per_file_summaries as Record<string, string> | undefined;
    if (!overall) return err('overall_summary is required');
    if (!perFile || Object.keys(perFile).length === 0) {
      return err('per_file_summaries is required and must be non-empty');
    }

    const requestId = generateId();
    writeMessageOut({
      id: requestId,
      kind: 'system',
      content: JSON.stringify({
        action: 'request_swap',
        overallSummary: overall,
        perFileSummaries: perFile,
      }),
    });

    log(`request_swap: ${requestId} → ${Object.keys(perFile).length} file(s)`);
    return ok(
      `Code change submitted. The host will classify the diff and route it for admin/owner approval. You will be notified once classification completes.`,
    );
  },
};
|
||||
|
||||
export const builderAgentTools: McpToolDefinition[] = [createDevAgent, requestSwap];
|
||||
@@ -35,7 +35,7 @@ export const triggerCredentialCollection: McpToolDefinition = {
|
||||
tool: {
|
||||
name: 'trigger_credential_collection',
|
||||
description:
|
||||
'Collect a credential (API key, token, etc.) from the user for a third-party service. Research the service first so you can pass the correct host pattern, header name, and value format. A card is sent to the user with a button that opens a secure input modal — the value is inserted directly into OneCLI and never enters your context. Blocks until the user saves, rejects, or the request fails.',
|
||||
'Collect an API key / OAuth token / secret from the user for a third-party service. Research the service first so you pass the correct host pattern, header name, and value format. The value is injected straight into OneCLI and never enters your context. Blocks until saved/rejected/failed.',
|
||||
inputSchema: {
|
||||
type: 'object' as const,
|
||||
properties: {
|
||||
|
||||
@@ -16,6 +16,7 @@ import { interactiveTools } from './interactive.js';
|
||||
import { agentTools } from './agents.js';
|
||||
import { selfModTools } from './self-mod.js';
|
||||
import { credentialTools } from './credentials.js';
|
||||
import { builderAgentTools } from './builder-agent.js';
|
||||
|
||||
function log(msg: string): void {
|
||||
console.error(`[mcp-tools] ${msg}`);
|
||||
@@ -28,6 +29,7 @@ const allTools: McpToolDefinition[] = [
|
||||
...agentTools,
|
||||
...selfModTools,
|
||||
...credentialTools,
|
||||
...builderAgentTools,
|
||||
];
|
||||
|
||||
const toolMap = new Map<string, McpToolDefinition>();
|
||||
|
||||
@@ -35,7 +35,7 @@ export const installPackages: McpToolDefinition = {
|
||||
tool: {
|
||||
name: 'install_packages',
|
||||
description:
|
||||
'Request installation of apt or npm packages. Requires admin approval. Fire-and-forget: you will receive a notification when the request is approved or rejected. After approval, call request_rebuild to apply the changes.',
|
||||
'Install apt and/or npm packages into YOUR per-agent container image. Prefer this over `create_dev_agent` when the request is just to make a package available. Requires admin approval; fire-and-forget. After approval, call `request_rebuild` to apply.',
|
||||
inputSchema: {
|
||||
type: 'object' as const,
|
||||
properties: {
|
||||
@@ -77,7 +77,7 @@ export const addMcpServer: McpToolDefinition = {
|
||||
tool: {
|
||||
name: 'add_mcp_server',
|
||||
description:
|
||||
"Request adding an MCP server to this agent's configuration. Requires admin approval. Fire-and-forget: you will be notified when approved/rejected. On approval, your container restarts with the new server.",
|
||||
"Wire an EXISTING third-party MCP server into YOUR per-agent runtime config — you must already know the exact `command` + `args` to invoke it (e.g. `npx @modelcontextprotocol/server-github`). NOT for writing a new tool or server from scratch — use `create_dev_agent` for that. Requires admin approval; fire-and-forget.",
|
||||
inputSchema: {
|
||||
type: 'object' as const,
|
||||
properties: {
|
||||
@@ -116,7 +116,7 @@ export const requestRebuild: McpToolDefinition = {
|
||||
tool: {
|
||||
name: 'request_rebuild',
|
||||
description:
|
||||
'Request a container rebuild to apply pending package installations. Requires admin approval. Fire-and-forget: you will be notified when approved/rejected. On approval, your container restarts with the new image on the next message.',
|
||||
'Rebuild YOUR container image to pick up approved `install_packages` / `add_mcp_server` changes. Requires admin approval; fire-and-forget.',
|
||||
inputSchema: {
|
||||
type: 'object' as const,
|
||||
properties: {
|
||||
|
||||
@@ -0,0 +1,277 @@
|
||||
# Builder Agent: Self-Modification via Delegated Dev Agent
|
||||
|
||||
Plan for the self-modification flow tracked under "Self-modification via builder-agent delegation" in `v2-checklist.md`. It lets a user request code changes from chat, has a dev agent produce them in an isolated worktree copy, and lands them through a host-gated approval and deadman-restart sequence. The goal is to replace terminal-based customization entirely in the common case.
|
||||
|
||||
## Goal
|
||||
|
||||
Enable full customization of a NanoClaw install from chat, without returning to the terminal, while guaranteeing:
|
||||
|
||||
1. **No cross-group data leakage** — dev agent cannot see another group's session DB, memory, or credentials.
|
||||
2. **Owner/admin approval on every live change** — nothing runs without an explicit human gate.
|
||||
3. **Automatic rollback** — if the new version doesn't handshake back within its window, it reverts without user action.
|
||||
4. **No self-modification footgun** — dev agent edits a copy, never the code it's currently running on.
|
||||
5. **Per-group isolation preserved** — one group's customizations stay local to that group unless the user explicitly promotes them.
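
Guarantee 3 can be pictured as a small decision function that the host evaluates while the deadman timer runs. This is only a sketch under stated assumptions: the `SwapAttempt` fields, the verdict names, and the idea of a single fixed window are hypothetical and are not taken from the NanoClaw source.

```typescript
// Hypothetical sketch of the deadman decision; field names and the window
// length are assumptions, not values from the NanoClaw source.
interface SwapAttempt {
  startedAtMs: number;          // when the swapped version was started
  handshakeAtMs: number | null; // when it reported back, if it has
  windowMs: number;             // how long it is given to handshake
}

type DeadmanVerdict = 'confirmed' | 'waiting' | 'rollback';

function deadmanVerdict(attempt: SwapAttempt, nowMs: number): DeadmanVerdict {
  const deadline = attempt.startedAtMs + attempt.windowMs;
  // A handshake only counts if it arrived inside the window.
  if (attempt.handshakeAtMs !== null && attempt.handshakeAtMs <= deadline) {
    return 'confirmed';
  }
  // No timely handshake: keep waiting until the deadline, then revert.
  return nowMs < deadline ? 'waiting' : 'rollback';
}
```

The point of the sketch is that rollback is the default outcome: the new version has to act to stay live, and silence reverts it without user action.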
|
||||
|
||||
## Per-Group Copy Architecture (existing)
|
||||
|
||||
This plan builds on a mechanism that already exists in v2. It's important context for the classification model below.
|
||||
|
||||
On first spawn of an agent group, `src/group-init.ts::initGroupFilesystem()` creates **private per-group copies** of:
|
||||
|
||||
| Repo path (template) | Private per-group path | Container mount |
|----------------------|------------------------|-----------------|
| `container/agent-runner/src/` | `data/v2-sessions/<group-id>/agent-runner-src/` | `/app/src` (rw) |
| `container/skills/` | `data/v2-sessions/<group-id>/.claude-shared/skills/` | `/home/node/.claude/skills` (rw) |
| `groups/<folder>/CLAUDE.md` | (same path, owned by group) | `/workspace/agent` (rw) |
|
||||
|
||||
After init, the host **never overwrites** the private copies on upstream updates — the group owns them. Changes to the repo's `container/agent-runner/src/` or `container/skills/` only affect **new** groups created after the change. Existing groups keep running their private copies forever unless explicitly refreshed.
|
||||
|
||||
This means edits to runner or skills code can land safely in one group without touching any other group. The per-group copy mechanism is the foundation for the whole "dev agent edits runner code" story — without it, runner edits would be host-level and globally disruptive.
|
||||
|
||||
**Gitignore adjustment (new):** today `data/` is gitignored wholesale. For this feature, we carve out exceptions so each group's private code is tracked in the main repo:
|
||||
|
||||
```gitignore
data/**
!data/v2-sessions/
!data/v2-sessions/*/
!data/v2-sessions/*/agent-runner-src/
!data/v2-sessions/*/agent-runner-src/**
!data/v2-sessions/*/.claude-shared/
!data/v2-sessions/*/.claude-shared/skills/
!data/v2-sessions/*/.claude-shared/skills/**
```
|
||||
|
||||
(`groups/<folder>/CLAUDE.md` is already in the repo.) Every swap commits the updated per-group files to main alongside the swap metadata, giving us full per-group code history in one repo — `git log data/v2-sessions/<id>/agent-runner-src/` shows everything a group has ever run. **Rollback uses git:** `git checkout <pre-swap-sha> -- <affected-paths>` plus a forward-only revert commit for auditability. One mechanism for file state; no separate blob-storage table needed.
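
The git-based rollback described above could be driven by a pair of argument builders like the following. This is a sketch, not the host's actual code: the function names and the commit message format are assumptions, and the builders only return argument lists rather than running git.

```typescript
// Hypothetical helpers: they build git argument lists and do not run them.
function rollbackCheckoutArgs(preSwapSha: string, affectedPaths: string[]): string[] {
  if (affectedPaths.length === 0) {
    throw new Error('rollback needs at least one affected path');
  }
  // `--` separates the revision from the pathspecs.
  return ['checkout', preSwapSha, '--', ...affectedPaths];
}

// Forward-only revert commit: history keeps both the swap and its undo.
// The message format here is an assumption for illustration.
function rollbackCommitArgs(preSwapSha: string, affectedPaths: string[]): string[] {
  const message = `revert swap: restore ${affectedPaths.length} path(s) to ${preSwapSha}`;
  return ['commit', '-m', message, '--', ...affectedPaths];
}
```

A caller would pass each list to something like Node's `child_process.execFile('git', args)`. Keeping the builders pure makes the rollback path easy to test without touching a real repository.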
|
||||
|
||||
## Mental Model
|
||||
|
||||
The dev agent is a **clone of the originating agent**: same container image, same mounts, same OneCLI agent identity (so same credential scope, same LLM routing, same privilege). It is spawned by the originating agent's group via the standard `create_agent` path and inherits privilege automatically because it runs under the same OneCLI agent.
|
||||
|
||||
Beyond those shared mounts, the dev agent gets one extra mount: a **git worktree copy of the whole repo**, writable, containing everything except `data/`, `store/`, and real `.env`. The worktree is constructed by:
|
||||
|
||||
1. `git worktree add .worktrees/dev-<id> HEAD` to get all repo-tracked files.
2. Overlay: copy `data/v2-sessions/<originating-id>/agent-runner-src/*` → `<worktree>/container/agent-runner/src/` (overwriting the template with the originating group's current customized runner).
3. Overlay: copy `data/v2-sessions/<originating-id>/.claude-shared/skills/*` → `<worktree>/container/skills/` (same reason).
4. Shadow-mount a dummy `.env`; exclude `data/` and `store/` entirely.
|
||||
|
||||
Now the dev agent's worktree reflects exactly what the originating group is currently running — not a pristine template. Edits go to this copy. On swap, the host maps worktree paths back to the right destinations (per-group private dir for runner/skills, repo paths for host code).
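
The mapping from worktree paths back to swap destinations might look like the sketch below. The two prefixes and the per-group target directories come from the per-group copy table earlier in this document; the function name `swapDestination` and its signature are hypothetical, and the real host may handle more cases (for example `groups/<folder>/CLAUDE.md`).

```typescript
// Hypothetical sketch of the worktree-to-destination mapping applied on swap.
const RUNNER_PREFIX = 'container/agent-runner/src/';
const SKILLS_PREFIX = 'container/skills/';

function swapDestination(groupId: string, worktreePath: string): string {
  if (worktreePath.startsWith(RUNNER_PREFIX)) {
    // Runner edits land in the originating group's private runner copy.
    const rest = worktreePath.slice(RUNNER_PREFIX.length);
    return `data/v2-sessions/${groupId}/agent-runner-src/${rest}`;
  }
  if (worktreePath.startsWith(SKILLS_PREFIX)) {
    // Skill edits land in the group's private skills copy.
    const rest = worktreePath.slice(SKILLS_PREFIX.length);
    return `data/v2-sessions/${groupId}/.claude-shared/skills/${rest}`;
  }
  // Anything else is treated as host code and keeps its repo path.
  return worktreePath;
}
```

Under this sketch, a runner or skills edit never touches the shared template, which is what keeps one group's change from affecting another group.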
|
||||
|
||||
Because the dev agent's own runtime is the live code, not the worktree, **self-modification is structurally impossible**: the dev agent cannot change the code it's currently running on.
|
||||
|
||||
## Actors
|
||||
|
||||
| Actor | Role |
|-------|------|
| **Originating agent** (agent-A) | User-facing agent the user is chatting with. Decides a change is needed, spawns the dev agent, brokers the pre-swap handshake with the user. |
| **Dev agent** | Clone of agent-A created by agent-A's group. Inherits agent-A's OneCLI scope and privilege. No web access. Works in a dedicated worktree overlaid with the originating group's current state. |
| **Host** | Creates the worktree (with overlays), mounts it, classifies the diff, routes approval, runs the swap dance, runs the deadman timer, handles rollback and promote-to-template. |
| **Approver** | Group admin (group-level diffs) or owner (host-level diffs, typed confirmation). |
|
||||
|
||||
## Flow

### 1. User requests a change

User → agent-A in chat: "add feature X" / "fix this bug" / "rename my welcome message."

Agent-A determines that the request needs code edits and calls a new MCP tool, `request_dev_changes(summary)`.

### 2. Host spawns dev agent + worktree

- **If a previous dev agent exists for this originating group, tear it down now.** The originating agent may keep talking to a prior dev agent between requests (e.g. a "hey, can you also tweak X" follow-up chat), but the moment a **new** `request_dev_changes` call comes in, the prior dev agent group is wound down and its worktree cleaned up. One live dev agent at a time per originating group.
- Host creates a **fresh** dev agent group per request. The originating agent can supply a name in `request_dev_changes(summary, dev_agent_name?)` so the dev agent has a stable identity for conversation (e.g. "dev-refactor-welcome"). If no name is given, one is auto-generated.
- The dev agent is created through the existing `create_agent` path, under agent-A's OneCLI agent identity, so it inherits credential scope and privilege. **Upstream dependency:** OneCLI parent-child privilege inheritance (`onecli agents create --inherit-from <parent>`) would make this first-class; today we fake it by reusing the same `agentId` or replicating secret assignments in `src/onecli-secrets.ts`.
- Host creates a **fresh** worktree at `.worktrees/dev-<request-id>` on branch `dev/<request-id>`, then applies the runner and skills overlays from the originating group's private dirs. It shadows `.env` and excludes `data/` and `store/` (except the carve-outs declared in the Per-Group Copy Architecture section).
- Host mounts the worktree into the dev agent's container at `/worktree` (additional writable volume). The dev agent's standard runtime mounts are unchanged: it runs the live version and edits the worktree copy.
- Host delivers the user's request summary to the dev agent as a system message via the normal session DB path.

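The "one live dev agent per originating group" rule in this step can be illustrated with a small in-memory registry. This is a sketch under assumptions: `DevAgentRegistry`, its `replace` method, and the `dev-<request-id>` fallback name are hypothetical; the plan only says that a missing name is auto-generated and that the prior agent is torn down when a new request arrives.

```typescript
interface DevAgent {
  requestId: string;
  name: string;
  worktree: string;
}

// Hypothetical registry: tracks at most one live dev agent per originating group.
class DevAgentRegistry {
  private live = new Map<string, DevAgent>();

  // Registers the new dev agent and returns the previous one (if any),
  // which the caller must wind down and whose worktree it must clean up.
  replace(
    originatingGroupId: string,
    requestId: string,
    name?: string,
  ): { previous: DevAgent | undefined; current: DevAgent } {
    const previous = this.live.get(originatingGroupId);
    const current: DevAgent = {
      requestId,
      // Assumption: the auto-generated name is derived from the request id.
      name: name ?? `dev-${requestId}`,
      worktree: `.worktrees/dev-${requestId}`,
    };
    this.live.set(originatingGroupId, current);
    return { previous, current };
  }
}
```

Returning the displaced agent instead of tearing it down inside the registry keeps the teardown (container stop, worktree removal) in the host's hands.
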
### 3. Dev agent works

- Reads context from the whole worktree.
- Edits anywhere in the worktree except excluded paths. There is no filesystem-level scope restriction: the dev agent can touch `src/`, `container/agent-runner/src/`, `container/skills/`, `groups/<originating>/`, `package.json`, `Dockerfile`, migrations. Classification (and the consequent approval path) happens later, not now.
- No web access. Tool allowlist is a reduced variant (file edit, git, local build/test).
- Can message agent-A for clarification via the standard destinations flow.
- Commits to the dev branch in the worktree when ready.

### 4. Pre-swap handshake with originating agent

Before submitting the swap request to the host, the dev agent clears it with agent-A:

1. Dev agent → agent-A: "Ready to propose these changes: {overall summary}. OK to submit for approval?"
2. Agent-A (likely confirming with the user): "yes, submit."
3. Dev agent → host: `request_swap(per_file_summaries, overall_summary, commit_sha)` MCP tool.

Per-file summaries and an overall summary are both required: the host rejects the swap request if either is missing or empty. These summaries become the human-readable portion of the approval card.

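The "reject if missing or empty" rule could look roughly like the check below. The payload shape is an assumption (the plan names the three arguments but not their types): here `perFileSummaries` is taken to be a map from path to summary, and `validateSwapRequest` is a hypothetical helper name.

```typescript
// Assumed payload shape for request_swap; field types are not specified in the plan.
interface SwapRequest {
  perFileSummaries: Record<string, string>;
  overallSummary: string;
  commitSha: string;
}

// Returns an error message if the request must be rejected, or null if it passes.
function validateSwapRequest(req: SwapRequest): string | null {
  if (req.overallSummary.trim() === "") {
    return "overall_summary is required";
  }
  const paths = Object.keys(req.perFileSummaries);
  if (paths.length === 0) {
    return "per_file_summaries is required";
  }
  for (const path of paths) {
    // A whitespace-only summary counts as empty.
    if ((req.perFileSummaries[path] ?? "").trim() === "") {
      return `per_file_summaries is empty for ${path}`;
    }
  }
  return null;
}
```
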
### 5. Host classifies the diff

Host runs `git diff --name-only` in the worktree against `main`. Every changed path is classified:

**Group-level** (group admin approves):
- `groups/<originating>/**`: the originating group's own memory/CLAUDE.md/skills
- `container/agent-runner/src/**`: targets the originating group's **private runner copy**, not the repo template
- `container/skills/**`: targets the originating group's **private skills copy**, not the repo template

**Host-level** (owner only + typed confirmation):
- `src/**`: host code, affects everyone
- `scripts/**`, `setup/**`
- Root `package.json`, `package-lock.json`, `Dockerfile`, `tsconfig.json`
- `container/Dockerfile` (shared container image)
- `src/db/migrations/**` (flagged as non-rollback-safe)
- `docs/**`, `README.md`
- `groups/<other>/**`: touching another group's folder is never allowed through this flow; classify as host-level so it requires owner consent (but in practice the approval message should call it out as "touching another group's data" specifically)

**Combined (host + group runner/skills), still host-level but with a cross-group safety warning:** if the diff contains both host-level paths AND `container/agent-runner/src/**` or `container/skills/**`, the approval card includes an extra prominent warning (see §6).

**Never writable** (filesystem-excluded from worktree): `data/**`, `store/**`, `.env`, `.env.*`.

Classification is purely about **approval routing and swap targets**, not about what the dev agent was allowed to write. The dev agent always has full worktree write access.

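As an illustration of the rules in this section, a path classifier might be sketched as follows. `classifyPath` and `classifyDiff` are hypothetical names, and one choice here is an assumption rather than something the plan states: any path not explicitly listed as group-level falls through to host-level, which is the more conservative default.

```typescript
type PathClass = "group" | "host" | "forbidden";

const RUNNER_PREFIX = "container/agent-runner/src/";
const SKILLS_PREFIX = "container/skills/";

function isRunnerOrSkills(path: string): boolean {
  return path.startsWith(RUNNER_PREFIX) || path.startsWith(SKILLS_PREFIX);
}

// Classifies one repo-relative path from `git diff --name-only`.
function classifyPath(path: string, originating: string): PathClass {
  // Never-writable paths should not appear in a diff at all.
  if (/^(data|store)\//.test(path) || /^\.env(\..*)?$/.test(path)) return "forbidden";
  if (path.startsWith(`groups/${originating}/`)) return "group";
  if (isRunnerOrSkills(path)) return "group";
  // Assumption: everything else, including groups/<other>/**, is host-level.
  return "host";
}

// Rolls per-path classes up into the diff-level classification.
function classifyDiff(paths: string[], originating: string): "group" | "host" | "combined" {
  const classes = paths.map((p) => classifyPath(p, originating));
  if (classes.some((c) => c === "forbidden")) {
    throw new Error("diff touches a never-writable path");
  }
  const hasHost = classes.some((c) => c === "host");
  if (hasHost && paths.some(isRunnerOrSkills)) return "combined";
  return hasHost ? "host" : "group";
}
```

For example, a diff touching only `groups/main/CLAUDE.md` from the `main` group classifies as `group`, while adding `src/index.ts` alongside a skills change makes it `combined`.
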
### 6. Approval card

Host sends an approval card via `pending_approvals` to the appropriate approver.

**Group-level diff:**
- Approver: group admin of the originating group (via `pickApprover` → `pickApprovalDelivery`).
- Card: overall summary + per-file summaries + raw diff + in-flight work notice + deadman reminder + Approve / Reject.

**Host-level diff:**
- Approver: **owner only** (not escalated through scoped or global admin).
- Card: same as group-level, plus:
  - Prominent warning banner: "Host-level changes run with full credential scope and can access data across all groups. Review carefully."
  - **Typed confirmation required**: approver must type a confirmation word (e.g. `CONFIRM HOST`) before the Approve button activates. Guards against fat-finger approvals on mobile.
  - Schema-migration warning if `src/db/migrations/**` is in the diff: rollback may be lossy.
  - **Cross-group safety warning** if the diff also includes runner/skills changes:
    > "This change modifies host code **and** per-group runner/skills code. Host changes will apply to every group on next restart. Runner/skills changes will only apply to the originating group. Other existing groups will continue running their old runner against the new host; this can break them if the host change requires a matching runner update. If that happens, you can run another dev-agent cycle from each affected group to refresh them."

Rejection at any point throws the diff away and tears down the worktree.

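The routing and gating rules above can be sketched as data plus one predicate. Everything named here (`ApprovalCard`, `buildApprovalCard`, `canApprove`) is hypothetical, the warning strings are abbreviated, and `CONFIRM HOST` is only the example confirmation word the plan mentions.

```typescript
type Classification = "group" | "host" | "combined";

interface ApprovalCard {
  approver: "group_admin" | "owner";
  warnings: string[];
  // null means the Approve button is active immediately.
  typedConfirmation: string | null;
}

function buildApprovalCard(classification: Classification, changedPaths: string[]): ApprovalCard {
  if (classification === "group") {
    return { approver: "group_admin", warnings: [], typedConfirmation: null };
  }
  const warnings = [
    "Host-level changes run with full credential scope and can access data across all groups. Review carefully.",
  ];
  if (changedPaths.some((p) => p.startsWith("src/db/migrations/"))) {
    warnings.push("Schema migration in diff: rollback may be lossy.");
  }
  if (classification === "combined") {
    // Abbreviated form of the cross-group safety warning.
    warnings.push("This change modifies host code and per-group runner/skills code.");
  }
  return { approver: "owner", warnings, typedConfirmation: "CONFIRM HOST" };
}

// The Approve action is accepted only once the typed word matches exactly.
function canApprove(card: ApprovalCard, typed: string): boolean {
  return card.typedConfirmation === null || typed.trim() === card.typedConfirmation;
}
```
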
### 7. Swap dance (on approval)

0. **Reject if another swap is in flight for this originating group.** The plan enforces one swap per originating group at a time via a row lock on `pending_swaps`. A second approval arriving while another is mid-dance is impossible by construction; a second `request_swap` from the dev agent while the previous swap is awaiting approval gets rejected with "previous swap still pending; resolve it first."
1. **Capture pre-swap state**: record main's current HEAD SHA in `pending_swaps.pre_swap_sha` (used for rollback) and snapshot the central DB to `data/backups/swap-<request-id>.sqlite` (path stored in `pending_swaps.db_snapshot_path`).
2. **Apply the diff to swap targets**: host walks the diff by path and maps each changed file to its destination:
   - `<worktree>/container/agent-runner/src/FILE` → `data/v2-sessions/<originating-id>/agent-runner-src/FILE`
   - `<worktree>/container/skills/FILE` → `data/v2-sessions/<originating-id>/.claude-shared/skills/FILE`
   - `<worktree>/groups/<originating>/FILE` → `groups/<originating>/FILE` (in the repo, same path)
   - `<worktree>/src/FILE` → `src/FILE` (host code, in the repo)
   - `<worktree>/container/Dockerfile` → `container/Dockerfile` (in the repo)
   - `<worktree>/package.json` → `package.json` (in the repo)
   - etc.
3. **Commit the swap**: host uses `git commit --only <touched-paths>` to commit exactly the swap's files to main with message `swap <request-id>: <overall_summary>`, leaving any unrelated uncommitted state in main alone. If git is in a weird state (mid-merge, mid-rebase, detached HEAD), the swap is refused with a clear error surfaced back to the approver, with no auto-resolution and no stashing. `pre_swap_sha` + git is the rollback mechanism.
4. **Conditional rebuild**: if the diff touches the container Dockerfile, root `package.json`, or similar image-affecting files, rebuild the affected image(s). Detect via classifier output. Group-local Dockerfile edits (if we ever add that) would trigger a per-group image rebuild; root Dockerfile / host package.json edits trigger a host-wide rebuild.
5. **Restart affected processes.**
   - Group-level diff → restart only the originating agent's container. It re-mounts `agent-runner-src/` and `.claude-shared/skills/`, picking up the updated per-group copies.
   - Host-level diff → restart the host process. All channels reconnect; active sessions resume on next message.
6. **Start the deadman timer**: 2 minutes initially, extendable (see §8).
7. **Post-restart handshake begins**: agent-A (now running the new code) sends the user a confirmation message.

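The path mapping in step 2 reduces to a small pure function. This is a sketch, with `swapTarget` as a hypothetical name; the destination layout is taken from the list in step 2, and every path outside the two overlay prefixes maps to itself in the repo.

```typescript
// Maps a worktree-relative path to the path the host writes on swap.
function swapTarget(worktreeRelPath: string, originatingId: string): string {
  const runner = "container/agent-runner/src/";
  const skills = "container/skills/";
  const session = `data/v2-sessions/${originatingId}`;
  if (worktreeRelPath.startsWith(runner)) {
    // Runner edits land in the group's private runner copy, not the template.
    return `${session}/agent-runner-src/${worktreeRelPath.slice(runner.length)}`;
  }
  if (worktreeRelPath.startsWith(skills)) {
    // Skills edits land in the group's private skills copy.
    return `${session}/.claude-shared/skills/${worktreeRelPath.slice(skills.length)}`;
  }
  // groups/<originating>/**, src/**, Dockerfile, package.json, etc. keep their repo path.
  return worktreeRelPath;
}
```

Because `container/Dockerfile` does not start with either overlay prefix, it falls through to the identity mapping, matching the list above.
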
### 8. Deadman dance

Deadman runs for **both** group-level and host-level swaps. It is a two-message handshake verifying that both inbound and outbound paths work under the new code:

1. Agent-A → user: "I'm back with the new version. Reply `confirm` to keep it, or `rollback` to revert."
2. User → agent-A: `confirm`.

On step 2, host cancels the timer and the swap is finalized. Two messages are enough: step 1 proves outbound delivery works under the new code, and step 2 arriving and being processed proves inbound routing + agent handling work.

**Timer state is persisted** in `pending_swaps` (`deadman_started_at`, `deadman_expires_at`, `handshake_state`). The in-memory `setTimeout` is just the trigger; the source of truth is the DB row. This is what makes host-level swaps work across the expected host restart, and what makes group-level swaps survive an unexpected host crash.

**Timer extension on progress:** if step 1 is successfully delivered to the user, update `deadman_expires_at` to +2 minutes from now and reset the in-memory timer. Slow channel reconnects (WhatsApp Baileys: 30–120s) should not trigger false rollback once we know outbound is flowing. Hard cap: 10 minutes absolute maximum.

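The extension rule is simple arithmetic, shown here as a sketch. One point is an assumption: the plan does not say what the 10-minute hard cap is measured from, so this version measures it from `deadman_started_at`. `extendDeadman` is a hypothetical helper name and all times are epoch milliseconds.

```typescript
const EXTENSION_MS = 2 * 60 * 1000; // +2 minutes on confirmed outbound delivery
const HARD_CAP_MS = 10 * 60 * 1000; // absolute maximum, assumed relative to the start

// Returns the new deadman_expires_at after handshake message 1 is delivered.
function extendDeadman(startedAt: number, now: number): number {
  return Math.min(now + EXTENSION_MS, startedAt + HARD_CAP_MS);
}
```

With a timer started at `t = 0`, delivery at the 1-minute mark moves expiry to the 3-minute mark; delivery at the 9-minute mark is clamped to the 10-minute cap.
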
**Explicit rollback:** user can reply `rollback` at step 2 (instead of `confirm`) to trigger immediate rollback without waiting for the timer.

**On timer expiry without step 2:**

1. Host runs `git checkout <pending_swaps.pre_swap_sha> -- <affected-paths>` to restore every file modified by the swap, then records a forward-only revert commit on main: `rollback <request-id>: deadman timeout`. One mechanism (git) handles both restore and audit trail.
2. Restores the central DB from the snapshot at `pending_swaps.db_snapshot_path`.
3. Restarts the originating agent's container (group-level) or the host (host-level).
4. Notifies the user via any working channel: "Rolled back to previous version: confirmation timed out."

**Resume on host startup:** the host startup sequence (see `builder-agent/startup.ts`) scans `pending_swaps` for any row in `awaiting_confirmation` status:

- If `handshake_state = 'pending_restart'` (host-level swap finished the restart; now running the new code): send handshake message 1 to the user, update state to `message1_sent`, start in-memory timer for the remaining time in `deadman_expires_at`.
- If `handshake_state = 'message1_sent'` (host or container crashed while waiting for user reply): don't resend, just reschedule the timer for the remaining time.
- If `deadman_expires_at <= now`: expired, execute rollback immediately.

This one code path covers both the expected host-level restart and any unexpected host crash mid-dance. ~50 LOC total in `startup.ts` including the orphan-worktree cleanup.

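The three resume cases can be expressed as one decision function. This is a sketch rather than the planned `startup.ts`: `resumeAction` and `ResumeAction` are hypothetical names, and checking expiry first is an assumption made so that an already-expired row never sends handshake message 1. Times are epoch milliseconds.

```typescript
type HandshakeState = "pending_restart" | "message1_sent";

type ResumeAction =
  | { kind: "rollback" }
  | { kind: "send_message1"; timerMs: number }
  | { kind: "reschedule"; timerMs: number };

// Decides what to do with one awaiting_confirmation row at host startup.
function resumeAction(state: HandshakeState, expiresAt: number, now: number): ResumeAction {
  // Expired rows roll back regardless of handshake state.
  if (expiresAt <= now) return { kind: "rollback" };
  const timerMs = expiresAt - now;
  return state === "pending_restart"
    ? { kind: "send_message1", timerMs } // caller also sets state to message1_sent
    : { kind: "reschedule", timerMs }; // message 1 already sent; do not resend
}
```
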
### 9. Promote to template (post-finalize)

If the finalized diff touched `container/agent-runner/src/**` or `container/skills/**`, regardless of whether it was classified group-level or host-level-combined, host sends a follow-up card to the same approver:

> "The runner/skills changes are currently applied only to the {originating} group. Would you like to also apply them to the template so new groups created in the future inherit these changes? (Existing groups will not be affected.)"

Options: `Apply to template` / `Keep local to {originating}`. There is no defer option: the prompt is decide-now-or-never to avoid a lifecycle management burden.

On `Apply to template`: host copies the same files from `<worktree>` to `container/agent-runner/src/` and/or `container/skills/` **in the main repo** and commits (`promote <request-id>: <paths> → template`). New groups initialized after this point get the updated template as their starting copy. Existing groups (including the originating one, which already has its private copy updated) are unaffected.

On `Keep local`: nothing further happens. Changes stay in the originating group's private copy.

**Not in v1:** bulk refresh of other existing groups when a combined host + runner diff lands. The cross-group safety warning on the host-level approval card (§6) sets expectations. If a user hits real breakage, they run another dev-agent cycle from each affected group to refresh its private copy. Revisit if this becomes a real pain point.

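Whether to send the follow-up card, and what the promote commit says, can be sketched as two small helpers. Both names are hypothetical, and joining the paths with a comma in the commit message is an assumption; the plan only gives the `promote <request-id>: <paths> → template` pattern.

```typescript
// Returns the finalized paths that are eligible for promotion to the template.
// An empty result means no follow-up card is sent.
function promotablePaths(finalizedPaths: string[]): string[] {
  return finalizedPaths.filter(
    (p) => p.startsWith("container/agent-runner/src/") || p.startsWith("container/skills/"),
  );
}

// Builds the commit message used when the approver picks "Apply to template".
function promoteCommitMessage(requestId: string, paths: string[]): string {
  return `promote ${requestId}: ${paths.join(", ")} → template`;
}
```
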
## Code Affected

### New modules

- `src/builder-agent/worktree.ts`: worktree creation with overlay from per-group private dirs, shadow `.env`, exclude `data/`/`store/`, dev branch lifecycle. Crash cleanup is a simple startup sweep (see below), not runtime bookkeeping.
- `src/builder-agent/classifier.ts`: diff classification by path, following the rules in §5. Exports structured output (list of changes with their classification + swap target).
- `src/builder-agent/swap.ts`: captures `pre_swap_sha` + DB snapshot, applies diff to swap targets, commits via `git commit --only <paths>` to main, refuses if git is in a weird state, conditional rebuild, restart orchestration.
- `src/builder-agent/deadman.ts`: in-memory timer backed by `pending_swaps` row, extension logic, handshake state tracking, rollback via `git checkout <pre_swap_sha>` + revert commit + DB snapshot restore. Runs for both group-level and host-level swaps.
- `src/builder-agent/promote.ts`: post-finalization prompt for promoting runner/skills changes to the template.
- `src/builder-agent/approval.ts`: approval-card rendering for swap requests and the typed-confirmation gate for host-level approvals. Built on `pending_approvals` directly (swap approvals are not credential operations; they have nothing to do with `onecli-approvals.ts`).
- `src/builder-agent/startup.ts`: runs on host startup: (a) resume any `pending_swaps` row in `awaiting_confirmation` status (see §8 "Resume on host startup"; handles both host-level expected restarts and unexpected group-level host crashes with one code path); (b) delete any `.worktrees/dev-*` dir whose corresponding row is in a terminal state or has no row. ~50 LOC total including resume + orphan cleanup.
- `src/db/migrations/NNN_builder_agent.sql`: one new table:
  - `pending_swaps`: `request_id`, `dev_agent_id`, `originating_group_id`, `dev_branch`, `commit_sha`, `classification` (group|host|combined), `status`, `summary_json`, `pre_swap_sha`, `db_snapshot_path`, `deadman_started_at`, `deadman_expires_at`, `handshake_state`. Everything swap-lifecycle fits on one row.

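For orientation, the `pending_swaps` row could be mirrored on the host side by a type like the one below. The column names come from the list above; the TypeScript types, nullability, and the timestamp representation are all assumptions, and `needsResume` is a hypothetical helper. Only the `awaiting_confirmation` status value and the two `handshake_state` values appear in the plan.

```typescript
// Assumed host-side shape of one pending_swaps row.
interface PendingSwapRow {
  request_id: string;
  dev_agent_id: string;
  originating_group_id: string;
  dev_branch: string;
  commit_sha: string;
  classification: "group" | "host" | "combined";
  // Other status values are not enumerated in the plan, so this stays a string.
  status: string;
  summary_json: string;
  pre_swap_sha: string | null; // set when the swap dance captures pre-swap state
  db_snapshot_path: string | null;
  deadman_started_at: string | null; // assumed ISO-8601 timestamps
  deadman_expires_at: string | null;
  handshake_state: "pending_restart" | "message1_sent" | null;
}

// True for rows the startup scan has to resume.
function needsResume(row: PendingSwapRow): boolean {
  return row.status === "awaiting_confirmation";
}
```
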
### New MCP tools (container)

- `request_dev_changes(summary, dev_agent_name?)`: on originating agent; host spawns dev agent + worktree.
- `request_swap(per_file_summaries, overall_summary, commit_sha)`: on dev agent; host classifies + routes for approval.

### Modified

- `src/container-runner.ts`: support an extra writable worktree mount for dev agents; same per-group mounts otherwise. The dev agent runs the **standard** agent-runner image, with no dev-specific variant. Tool restrictions (no web, etc.) are enforced via the agent-runner's existing tool allowlist mechanism, configured per session.
- `src/group-init.ts`: no changes required, but verify that promote-to-template copies land in the right place for future groups to pick up.
- `src/access.ts` / `src/db/users.ts`: `pickApprover` variant that skips escalation and targets owner only for host-level diffs.
- `src/delivery.ts`: no new logic; existing ACL already handles dev-agent destinations.

### Not touched

- `container/agent-runner/**`: **no dev-agent variant.** The standard agent-runner is reused as-is; the dev agent is just a clone with a different tool allowlist and an extra mount.
- `src/onecli-approvals.ts`: **unrelated.** Swap approvals use `pending_approvals` directly via the new `builder-agent/approval.ts`. OneCLI approvals are for credential-gating operations only.

### Tests (v1)

- Unit tests for `classifier`, `swap` target mapping, `deadman` state machine, `startup-sweep`. ~400–600 LOC.
- **No end-to-end integration test** in v1: exercising real container/host restarts in CI is expensive and can be added later. Manual testing during development covers the full flow.

## OneCLI Dependencies (Upstream)

- **Parent-child agent privilege inheritance**: `onecli agents create --inherit-from <parent-agent-id>`. Today we fake it (same `agentId` or replicated secret assignments). Not blocking for v1 of this feature, but it would make the wiring cleaner.
- **Agent-scoped tool allowlists**: nice to have, to ensure the dev agent cannot invoke web tools even if they are present in its image. Not blocking; we enforce at the container-runner tool-allowlist level.

## Decisions

1. **Dev agent lifecycle**: **fresh dev agent group per request**, with an optional name supplied by the originating agent so it has a stable conversational identity. Previous dev agent is kept alive between requests (originating agent can chat with it indefinitely after the prior request finalized) and is torn down the moment a new `request_dev_changes` arrives. One live dev agent at a time per originating group.
2. **Worktree reuse across requests**: **fresh per request.** New worktree, new overlays, new dev branch every time.
3. **Live CLAUDE.md race**: **accept the race.** No locking. Dev cycles are short and the race window is small; swap overwrites whatever the originating agent wrote in the meantime. Revisit if it becomes a real problem in practice.
4. **Schema migrations in host-level diffs**: **allowed with warning.** Classifier flags the approval card with "rollback may be lossy if migration is non-reversible." Owner decides.
5. **Parallel dev-agent flows**: **serialized.** One in-flight swap per originating group at a time, enforced by a `pending_swaps` row lock. Second `request_swap` while the previous is pending approval gets rejected.
6. **Bulk refresh on combined host + runner diffs**: **not in v1.** The cross-group safety warning on the approval card sets expectations. If a user hits real breakage, they run another dev-agent cycle from each affected group to refresh its private copy. Revisit if it becomes a real pain point.
7. **Tracking per-group src code history**: **un-gitignore the per-group carve-outs**, track them in main. Every swap is a commit touching the per-group paths (and host paths if applicable); rollback is a forward-only revert commit. One git history covers host code, template, and every group's private state.

## Deliberate Simplifications

To keep the implementation surface small, v1 explicitly does not handle:

- **Git in a weird state** (mid-merge, mid-rebase, detached HEAD): swap is refused with a clear error surfaced to the approver. No stashing, no auto-resolution.
- **Runtime worktree bookkeeping for crashes:** crash recovery is a single startup sweep that resumes pending deadmans and deletes orphan worktrees. No in-flight crash tracking, no leases.
- **End-to-end integration tests:** unit tests only for v1. Full container/host-restart integration test is a follow-up.
- **No separate dev-agent runner image:** dev agent reuses the standard agent-runner with a different tool allowlist and an extra mount. Zero delta in `container/agent-runner/**`.
- **Bulk refresh of other groups** on combined host + runner diffs (see §9): warning on approval card sets expectations; user runs another dev-agent cycle per affected group if they hit real breakage.

## Remaining Open Questions

None blocking. The plan is ready to implement.

## Replaces

This plan replaces the `Self-modification via builder-agent delegation` sub-block in `docs/v2-checklist.md`. Once agreed, update the checklist to collapse those subtasks into a single line pointing here.

+6
-30
@@ -7,15 +7,13 @@ Status: [x] done, [~] partial, [ ] not started
## Core Architecture

- [x] Session DB replaces IPC (messages_in / messages_out as sole IO)
- [x] Two-DB split: inbound.db (host-owned) + outbound.db (container-owned), zero cross-process write contention
  - **Cross-mount invariants (empirically validated, see `scripts/sanity-live-poll.ts`):** (1) `journal_mode=DELETE` on every session DB: WAL's `-shm` is memory-mapped and VirtioFS does not propagate mmap coherency host→guest, so WAL leaves the container's poll loop frozen on an early snapshot with no error; (2) host opens-writes-closes per operation: the close is what invalidates the container's VirtioFS page cache; (3) one writer per file: DELETE-mode with two writers corrupts because journal-unlink doesn't propagate atomically. Each invariant was individually confirmed by flipping it and observing silent message loss or corruption. Do not "simplify" by unifying the DBs, switching to WAL, or keeping a long-lived host connection.
  - **Seq parity is load-bearing, not cleanup:** host writes even seqs, container writes odd seqs. The seq is the agent-facing message ID returned by `send_message` and consumed by `edit_message` / `add_reaction`, and `getMessageIdBySeq()` looks up by seq across both tables. Removing parity would let a single ID resolve to the wrong row.
- [x] Central DB (agent groups, messaging groups, sessions, routing)
- [x] Host sweep (stale detection via heartbeat file, retry with backoff, recurrence scheduling)
- [x] Active delivery polling (1s for running sessions)
- [x] Sweep delivery polling (60s across all sessions)
- [x] Container runner with session DB mounting
- [x] Per-session container lifecycle and idle timeout
- [ ] Replace hard Idle and Timeout with work aware prompts to user to kill stuck processes
- [x] Session resume (sessionId + resumeAt across queries)
- [x] Graceful shutdown (SIGTERM/SIGINT handlers)
- [x] Orphan container cleanup on startup
@@ -36,7 +34,7 @@ Status: [x] done, [~] partial, [ ] not started
- [x] Mock provider (testing)
- [x] Provider factory
- [ ] Codex provider
- [ ] OpenCode provider
- [~] OpenCode provider

## Channel Adapters

@@ -52,7 +50,7 @@ Status: [x] done, [~] partial, [ ] not started
- [~] Google Chat via Chat SDK (adapter + skill written, not tested)
- [~] Linear via Chat SDK (adapter + skill written, not tested)
- [~] GitHub via Chat SDK (adapter + skill written, not tested)
- [~] WhatsApp Cloud API via Chat SDK (adapter + skill written, not tested)
- [x] WhatsApp Cloud API via Chat SDK (adapter + skill written, not tested)
- [~] Resend (email) via Chat SDK (adapter + skill written, not tested)
- [~] Matrix via Chat SDK (adapter + skill written, not tested)
- [~] Webex via Chat SDK (adapter + skill written, not tested)
@@ -66,7 +64,6 @@ Status: [x] done, [~] partial, [ ] not started
- [x] Cold-DM infrastructure: `ChannelAdapter.openDM?(handle)` optional method, resolved via Chat SDK `chat.openDM` for resolution-required channels (Discord, Slack, Teams, Webex, gChat) and fall-through to the handle directly for direct-addressable channels (Telegram, WhatsApp, iMessage, Matrix, Resend). `src/user-dm.ts::ensureUserDm` caches every resolution in `user_dms` so subsequent cold DMs are a DB read.
- [x] Agent-shared session mode (cross-channel shared sessions, e.g. GitHub + Slack)
- [x] Auto-onboarding on channel registration (/welcome skill triggered on first wiring)
- [ ] Setup vs production channel separation

## Chat-First Setup Flow

@@ -112,7 +109,6 @@ Status: [x] done, [~] partial, [ ] not started
- [x] Container waking on new message
- [x] Typing indicator triggered on message route
- [~] Trigger rule matching (router picks highest-priority agent, regex/mention matching TODO)
- [x] Delivery ACL: `delivery.ts` throws on unauthorized channel / agent-to-agent targets (was `return` previously, which falsely marked the message as delivered because the outer loop treated undefined as success; real incident: silent drops during the welcome-DM test before the fix). Self-origin chat and self-to-self agent messages skip the destination check. `createMessagingGroupAgent` auto-creates the companion `agent_destinations` row with a normalized, collision-broken `local_name` so operators don't have to hand-insert destinations when wiring channels.

## Rich Messaging

@@ -126,6 +122,7 @@ Status: [x] done, [~] partial, [ ] not started
- [~] Formatted /usage, /context, /cost output (commands pass through, no rich card formatting)
- [ ] Context window visibility: show position in context, approaching compaction, when compaction happens, post-compaction state
- [ ] Threading and replies support
- [ ] Auto-compact on idle before cache expires

## MCP Tools (Container)

@@ -142,7 +139,6 @@ Status: [x] done, [~] partial, [ ] not started
- [x] install_packages (apt/npm, owner/admin approval required via `pickApprover`, strict name validation)
- [x] add_mcp_server (owner/admin approval required via `pickApprover`)
- [x] request_rebuild (rebuilds per-agent-group Docker image)
- ~~send_to_agent~~: deleted; agents are just destinations in the unified `send_message`

## Scheduling

@@ -167,30 +163,10 @@ Status: [x] done, [~] partial, [ ] not started
- [x] add_mcp_server (admin approval)
- [x] request_rebuild (builds per-agent-group Docker image with approved packages)
- [x] Fire-and-forget model (write request, return immediately; chat notification on approval; container killed so next wake picks up new config/image)
- [ ] Role definitions beyond admin (custom roles, per-group permissions)
- [ ] Configurable sensitive action list (hardcoded today)
- [~] OneCLI integration for human-loop approvals on credentialed requests (agent touching a credentialed resource → OneCLI gates → approval card to admin → OneCLI releases credential): SDK 0.3.1 `configureManualApproval` wired into host, routes to admin via existing `pending_approvals` infra
- [~] Credential collection from chat: `trigger_credential_collection` MCP tool; agent researches API config, card → modal → `onecli secrets create` via internal facade (`src/onecli-secrets.ts`); credential value never enters agent context
- [ ] Replace `src/onecli-secrets.ts` shell facade with SDK-native secret management when `@onecli-sh/sdk` adds it
- [ ] Per-agent-group secret scoping via OneCLI `agentId` (facade passes it today; CLI ignores it until upstream supports)
- [ ] **Attach newly created secrets to the calling agent**: `trigger_credential_collection` today runs `onecli secrets create` but leaves the secret unassigned, so the agent that requested the credential still gets zero injections. Fix options: (a) follow-up `onecli agents set-secrets` call in `src/onecli-secrets.ts` after create, (b) set the agent to `mode=all`, or (c) upstream ask: `onecli secrets create --assign-to-agent-ids <id,...>` so it's a one-shot and orphaned secrets are impossible. Prefer (c); use (a) as the interim.
- [ ] **Chat SDK input support beyond Slack (upstream ask)**: today only Slack's Modal surface works for secure input. The platforms themselves support it, but Chat SDK doesn't expose it:
  - [ ] **Discord**: native modal (`InteractionResponseType.Modal` with `ActionRow([TextInput])`). Map `event.openModal(Modal(...))` to the Discord REST callback.
  - [ ] **Microsoft Teams**: Adaptive Card with `Input.Text`, delivered as a regular message (inline, no modal-trigger needed).
  - [ ] **Google Chat**: Cards v2 `textInput` widget, inline in the conversation.
  - [ ] **Webex**: Adaptive Card with `Input.Text`, inline.
  - [ ] **WhatsApp Cloud**: WhatsApp Flows with a text field (requires flow registration with Meta; heavier but doable).
  - When these land upstream, `trigger_credential_collection` gets secure input on all major channels for free; no NanoClaw-side code change beyond maybe declaring the capability per adapter.
- [ ] Tunneled OneCLI dashboard fallback for platforms with no native form input (Telegram Mini Apps aside, iMessage without Apple Business Register, Matrix, email). Signed short-lived URL → browser form served by OneCLI at 10254 → tunnel via ngrok/cloudflared/tailscale-funnel. Value never touches the chat surface.
- [ ] OneCLI built-in apps (`onecli apps list`) shadow generic secrets on the same host. `trigger_credential_collection` should check for a matching app first; if one exists, route the user through the app's connect URL instead of creating a secret. Upstream ask: `apps configure --api-key` for api_key apps.
- [ ] Tunneled OneCLI dashboard for credential addition (Telegram Mini Apps aside, iMessage without Apple Business Register, Matrix, email). Signed short-lived URL → browser form served by OneCLI at 10254 → tunnel via cloudflare durable object. Value never touches the chat surface.
- [ ] Sensitive data access flow (agent requests PII / secrets / private files → approval card → scoped, time-limited access)
- [ ] Self-modification via builder-agent delegation:
  - [ ] Agent requests code changes by delegating to a builder agent
  - [ ] Builder agent has write access to the requesting agent's code and Dockerfile
  - [ ] Approval modes: approve per-edit as builder works, or approve full diff at the end
  - [ ] Diff review card sent to admin showing all proposed changes
  - [ ] On approval: apply edits, rebuild container image, restart agent
  - [ ] On rejection: discard changes, notify requesting agent
- [ ] Self-modification via builder-agent delegation: full design in [v2-builder-agent-plan.md](v2-builder-agent-plan.md). Dev-agent clone of originating agent edits a worktree overlaid with the group's private runner/skills; host classifies diff, routes approval (group admin or owner+typed-confirm), applies per-path swap targets, runs deadman-restart dance, commits every swap to main for full per-group history.

## Named Destinations + ACL

@@ -156,7 +156,6 @@ async function main(): Promise<void> {
      name: args.agentName,
      folder,
      agent_provider: null,
      container_config: null,
      created_at: now,
    });
    ag = getAgentGroupByFolder(folder)!;

@@ -29,7 +29,6 @@ if (!getAgentGroup(AGENT_GROUP_ID)) {
    name: 'Main',
    folder: 'main',
    agent_provider: 'claude',
    container_config: null,
    created_at: new Date().toISOString(),
  });
  console.log('Created agent group:', AGENT_GROUP_ID);

@@ -36,7 +36,6 @@ createAgentGroup({
  name: 'Channel E2E Agent',
  folder: 'test-channel-e2e',
  agent_provider: 'claude',
  container_config: null,
  created_at: new Date().toISOString(),
});

@@ -38,7 +38,6 @@ createAgentGroup({
  name: 'E2E Test Agent',
  folder: 'test-agent-e2e',
  agent_provider: 'claude',
  container_config: null,
  created_at: new Date().toISOString(),
});

@@ -136,7 +136,6 @@ export async function run(args: string[]): Promise<void> {
      name: parsed.assistantName,
      folder: parsed.folder,
      agent_provider: null,
      container_config: null,
      created_at: new Date().toISOString(),
    });
    agentGroup = getAgentGroupByFolder(parsed.folder)!;

@@ -81,7 +81,6 @@ function seedAgentGroup(id: string): void {
    name: id.toUpperCase(),
    folder: id,
    agent_provider: null,
    container_config: null,
    created_at: now(),
  });
}

@@ -0,0 +1,327 @@
/**
 * Approval card routing for builder-agent swap requests.
 *
 * Uses `pending_approvals` directly (not `onecli-approvals.ts` — swap
 * approvals are NOT credential operations). Two approval actions live
 * here:
 *
 *   - `swap_request`      — posted after the dev agent calls
 *                           `request_swap`. Routed to group admin for
 *                           group-level diffs, owner-only for host-level
 *                           or combined. Buttons: Approve / Reject.
 *   - `swap_confirmation` — the deadman handshake card. Routed back to
 *                           the originating user's DM. Handled in
 *                           `deadman.ts`.
 *
 * Host-level approvals ideally require typed confirmation to prevent
 * fat-finger approvals on mobile, but the chat-SDK bridge currently only
 * exposes button-option UI. For v1 we use a three-option card
 * (Approve / Reject / Cancel) with a prominent DANGER warning in the body
 * so the approver has to pick the dangerous option among siblings.
 * Upgrading to a true typed-confirmation flow is a follow-up when the
 * chat-SDK bridge gains a free-text question primitive.
 */
import { execFileSync } from 'child_process';

import { pickApprovalDelivery, pickApprover } from '../access.js';
import { getAgentGroup } from '../db/agent-groups.js';
import { getOwners } from '../db/user-roles.js';
import { createPendingApproval } from '../db/sessions.js';
import { getPendingSwap, updatePendingSwapStatus } from '../db/pending-swaps.js';
import { log } from '../log.js';
import type { PendingSwap, Session } from '../types.js';
import { parseSwapSummary } from './swap.js';
import { worktreePathFor } from './worktree.js';

export interface SwapApprovalDelivery {
  deliver(
    channelType: string,
    platformId: string,
    threadId: string | null,
    kind: string,
    content: string,
  ): Promise<string | undefined>;
}

let deliveryRef: SwapApprovalDelivery | null = null;

export function setSwapApprovalDelivery(adapter: SwapApprovalDelivery): void {
  deliveryRef = adapter;
}

/**
 * Post an approval card for a classified swap. Called at the end of
 * `handleRequestSwap` once the classifier has run.
 */
export async function sendSwapApprovalCard(
  swap: PendingSwap,
  originatingSession: Session,
  notifyDevAgent: (text: string) => void,
): Promise<void> {
  if (!deliveryRef) {
    log.error('sendSwapApprovalCard: no delivery adapter set', { requestId: swap.request_id });
    notifyDevAgent('Swap approval card could not be delivered: host delivery adapter missing.');
    return;
  }

  const isHostLevel = swap.classification === 'host' || swap.classification === 'combined';

  // Host-level swaps target the owner only. Group-level uses the normal
  // approver ladder (scoped admin → global admin → owner).
  const approvers = isHostLevel
    ? getOwners().map((r) => r.user_id)
    : pickApprover(swap.originating_group_id);

  if (approvers.length === 0) {
    notifyDevAgent(
      isHostLevel
        ? 'Code change rejected: no owner configured to approve host-level changes.'
        : 'Code change rejected: no approver configured for this agent group.',
    );
    updatePendingSwapStatus(swap.request_id, 'rejected');
    return;
  }

  // Origin channel kind drives tie-break preference (same as existing
  // install_packages / request_rebuild approvals).
  const originChannelType = originatingSession.messaging_group_id
    ? (await import('../db/messaging-groups.js')).getMessagingGroup(originatingSession.messaging_group_id)?.channel_type ?? ''
    : '';

  const target = await pickApprovalDelivery(approvers, originChannelType);
  if (!target) {
    notifyDevAgent('Code change rejected: no DM channel found for any eligible approver.');
    updatePendingSwapStatus(swap.request_id, 'rejected');
    return;
  }

  const approvalId = `swapreq-${swap.request_id}`;
  const originatingGroup = getAgentGroup(swap.originating_group_id);
  const originatingName = originatingGroup?.name ?? swap.originating_group_id;
  const summary = parseSwapSummary(swap);

  // Unified multi-message review flow for BOTH group-level and host-level
  // swaps. Host-level just gets bigger warning emojis + cross-group
  // safety callouts in the intro. Group-level is the same structure
  // without the danger banner.
  await sendSwapReviewMessages(
    swap,
    target.messagingGroup.channel_type,
    target.messagingGroup.platform_id,
    originatingName,
    summary,
    isHostLevel,
  );

  const bodyLines: string[] = [];
  if (isHostLevel) {
    bodyLines.push(
      '⚠️ ⚠️ ⚠️ **HOST-LEVEL CODE CHANGE.** Review the preceding messages carefully. Approving runs the new code with full credential scope across all agents in this install.',
    );
    if (summary.touchesMigrations) {
      bodyLines.push('⚠️ Diff includes schema migrations — rollback may be lossy.');
    }
    if (swap.classification === 'combined') {
      bodyLines.push(
        '⚠️ Diff also modifies per-agent runner/skills code. Those changes will apply only to the originating agent. Other existing agents will run the new host against their old runner and may break — you can request another code change from each affected agent to refresh them if needed.',
      );
    }
  } else {
    bodyLines.push('Review the preceding messages, then approve or reject.');
  }

  const options = isHostLevel
    ? [
        { label: 'Approve (DANGEROUS)', selectedLabel: '✅ Approved', value: 'approve' },
        { label: 'Cancel', selectedLabel: '❎ Cancelled', value: 'cancel' },
        { label: 'Reject', selectedLabel: '❌ Rejected', value: 'reject' },
      ]
    : [
        { label: 'Approve', selectedLabel: '✅ Approved', value: 'approve' },
        { label: 'Reject', selectedLabel: '❌ Rejected', value: 'reject' },
      ];

  createPendingApproval({
    approval_id: approvalId,
    session_id: originatingSession.id,
    request_id: swap.request_id,
    action: 'swap_request',
    payload: JSON.stringify({
      swapRequestId: swap.request_id,
      isHostLevel,
    }),
    created_at: new Date().toISOString(),
    title: isHostLevel ? 'Host-level code change' : 'Agent code change',
    options_json: JSON.stringify(options),
  });

  try {
    await deliveryRef.deliver(
      target.messagingGroup.channel_type,
      target.messagingGroup.platform_id,
      null,
      'chat-sdk',
      JSON.stringify({
        type: 'ask_question',
        questionId: approvalId,
        title: isHostLevel ? 'Host-level code change' : 'Agent code change',
        question: bodyLines.join('\n'),
        options,
      }),
    );
    log.info('Swap approval card delivered', {
      requestId: swap.request_id,
      approvalId,
      approver: target.userId,
      classification: swap.classification,
    });
  } catch (err) {
    log.error('Swap approval card delivery failed', { requestId: swap.request_id, err });
    notifyDevAgent(
      `Code change approval card delivery failed: ${err instanceof Error ? err.message : String(err)}. The pending_swaps row stays in 'pending_approval' — an operator can retry or reject manually.`,
    );
  }
}

/** Hard caps for the multi-message review flow (shared by group-level and host-level approvals). */
const DIFF_CHUNK_CHARS = 1800; // safe across channels (Discord, WhatsApp, Telegram, Slack)
const MAX_DIFF_CHUNKS = 5; // up to ~9 KB of diff across 5 messages

/**
 * Unified review flow for both group-level and host-level swaps: send an
 * intro, a per-file summary, and the raw `git diff` chunked into 1-to-N
 * code-block messages before the approval card. The approver reads the
 * actual diff in their DM and then clicks Approve/Reject (or Cancel for
 * host-level) on the card that follows.
 *
 * Host-level swaps get more aggressive warning emojis and a cross-group
 * safety callout in the intro; structure is otherwise identical.
 *
 * Delivery errors for any individual message are logged but don't abort
 * the approval — the card still goes out so the approver has at least
 * the summary-and-buttons minimum.
 */
async function sendSwapReviewMessages(
  swap: PendingSwap,
  channelType: string,
  platformId: string,
  originatingName: string,
  summary: ReturnType<typeof parseSwapSummary>,
  isHostLevel: boolean,
): Promise<void> {
  if (!deliveryRef) return;

  const send = async (text: string, idx: number): Promise<void> => {
    try {
      await deliveryRef!.deliver(
        channelType,
        platformId,
        null,
        'chat',
        JSON.stringify({ text, sender: 'system', senderId: 'builder-agent' }),
      );
    } catch (err) {
      log.warn('Swap review message delivery failed', {
        requestId: swap.request_id,
        idx,
        err,
      });
    }
  };

  // 1. Intro message
  const headerPrefix = isHostLevel
    ? '⚠️ ⚠️ ⚠️ **HOST-LEVEL CODE CHANGE PROPOSED**'
    : '🔧 **Code change proposed**';
  const intro =
    `${headerPrefix} by agent "${originatingName}".\n\n` +
    `**What it does:** ${summary.overallSummary || '(no summary)'}\n\n` +
    `${summary.classifiedFiles.length} file(s) will be edited. Full diff follows, then the approval card.`;
  await send(intro, 0);

  // 2. Per-file breakdown
  const fileLines: string[] = ['**Files in this code change:**'];
  for (const f of summary.classifiedFiles) {
    const perFile = summary.perFileSummaries[f.path] ?? '';
    fileLines.push(`- \`${f.path}\` (${f.classification})${perFile ? ` — ${perFile}` : ''}`);
  }
  await send(fileLines.join('\n'), 1);

  // 3. Chunked raw diff. Read the full unified diff from the reviewed
  // COMMIT — not the working tree — so no post-submission edits leak
  // into what the approver sees. Split into DIFF_CHUNK_CHARS-sized
  // messages wrapped in code fences. Truncate beyond MAX_DIFF_CHUNKS.
  const diffText = readRawDiff(swap.request_id, swap.commit_sha);
  if (!diffText) {
    await send('_(could not read diff from worktree — review the commit directly)_', 2);
    return;
  }
  const chunks = chunkDiff(diffText, DIFF_CHUNK_CHARS, MAX_DIFF_CHUNKS);
  for (let i = 0; i < chunks.length; i++) {
    const header = chunks.length > 1 ? `**Diff (${i + 1}/${chunks.length})**\n` : '**Diff**\n';
    await send(`${header}\`\`\`diff\n${chunks[i]}\n\`\`\``, 2 + i);
  }
  if (diffText.length > DIFF_CHUNK_CHARS * MAX_DIFF_CHUNKS) {
    await send(
      '_(diff truncated — remainder not shown. Review the dev branch in a terminal before approving.)_',
      2 + chunks.length,
    );
  }
}

/**
 * Read the unified diff of a specific commit against main from a dev
 * worktree. Uses the range syntax `main..<sha>` so only committed content
 * is included — not anything in the working tree that the (possibly still-
 * running) dev agent may have touched between submission and approval.
 */
function readRawDiff(requestId: string, commitSha: string): string | null {
  if (!commitSha) return null;
  try {
    const out = execFileSync('git', ['diff', `main..${commitSha}`], {
      cwd: worktreePathFor(requestId),
      encoding: 'utf8',
      maxBuffer: 20 * 1024 * 1024,
    });
    return out.trim();
  } catch (err) {
    log.warn('readRawDiff failed', { requestId, err });
    return null;
  }
}

/**
 * Chunk a diff into up to maxChunks pieces of ~chunkSize characters each.
 * Splits on newline boundaries when possible so diffs stay readable.
 */
function chunkDiff(diff: string, chunkSize: number, maxChunks: number): string[] {
  if (diff.length <= chunkSize) return [diff];

  const chunks: string[] = [];
  let i = 0;
  while (i < diff.length && chunks.length < maxChunks) {
    let end = Math.min(i + chunkSize, diff.length);
    // Prefer cutting at a newline boundary within the last 15% of the chunk.
    if (end < diff.length) {
      const lastNl = diff.lastIndexOf('\n', end);
      if (lastNl > i + chunkSize * 0.85) end = lastNl;
    }
    chunks.push(diff.slice(i, end));
    i = end;
  }
  return chunks;
}

/**
 * Look up a swap by a `swap_request` approval's payload. Used by
 * index.ts::handleApprovalResponse to dispatch to `executeSwapOnApproval`.
 */
export function getSwapFromApprovalPayload(payloadJson: string): PendingSwap | undefined {
  try {
    const p = JSON.parse(payloadJson) as { swapRequestId?: string };
    if (!p.swapRequestId) return undefined;
    return getPendingSwap(p.swapRequestId);
  } catch {
    return undefined;
  }
}
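A note on the chunking above: `chunkDiff` may cut a chunk short at a newline, so five chunks can hold fewer than `DIFF_CHUNK_CHARS * MAX_DIFF_CHUNKS` characters, while the truncation notice is driven by total length alone. The sketch below is not part of this commit. It is a hypothetical variant (`chunkWithTruncationFlag` and `ChunkResult` are invented names) that uses the same cutting rule but derives the truncation flag from how much input was actually consumed.

```typescript
// Hypothetical helper, not in the commit: same cutting rule as chunkDiff,
// plus a flag that says whether any input was left out.
interface ChunkResult {
  chunks: string[];
  truncated: boolean;
}

function chunkWithTruncationFlag(text: string, chunkSize: number, maxChunks: number): ChunkResult {
  const chunks: string[] = [];
  let i = 0;
  while (i < text.length && chunks.length < maxChunks) {
    let end = Math.min(i + chunkSize, text.length);
    if (end < text.length) {
      // Prefer a newline cut when one falls in the last 15% of the window.
      const lastNl = text.lastIndexOf('\n', end);
      if (lastNl > i + chunkSize * 0.85) end = lastNl;
    }
    chunks.push(text.slice(i, end));
    i = end;
  }
  // The cursor tells us exactly how much was emitted, even when newline
  // cuts made some chunks shorter than chunkSize.
  return { chunks, truncated: i < text.length };
}

// Two 10-character lines, budget of 2 chunks x 10 characters: the length
// check (20 > 20) is false, yet the final newline does not fit.
const demo = chunkWithTruncationFlag('abcdefghi\nabcdefghi\n', 10, 2);
console.log(demo.chunks.length, demo.truncated);
```

Whether the missing tail matters for review is a judgment call; the point is only that the cursor is a more direct signal than a length comparison.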
@@ -0,0 +1,202 @@
import path from 'path';

import { describe, expect, it } from 'vitest';

import {
  classifyDiff,
  classifyPath,
  isMigrationPath,
  isRunnerOrSkillsPath,
  type ClassifyOptions,
} from './classifier.js';

const OPTS: ClassifyOptions = {
  projectRoot: '/repo',
  dataDir: '/repo/data',
  originatingGroupId: 'grp-abc',
  originatingGroupFolder: 'main',
};

describe('classifyPath', () => {
  it('routes runner edits to the per-group private runner dir', () => {
    const r = classifyPath('container/agent-runner/src/index.ts', OPTS);
    expect(r).not.toBeNull();
    expect(r!.classification).toBe('group');
    expect(r!.target).toBe(
      path.join('/repo/data/v2-sessions/grp-abc/agent-runner-src/index.ts'),
    );
  });

  it('routes nested runner edits correctly', () => {
    const r = classifyPath('container/agent-runner/src/mcp-tools/agents.ts', OPTS);
    expect(r!.classification).toBe('group');
    expect(r!.target).toBe(
      path.join(
        '/repo/data/v2-sessions/grp-abc/agent-runner-src/mcp-tools/agents.ts',
      ),
    );
  });

  it('routes skills edits to the per-group private skills dir', () => {
    const r = classifyPath('container/skills/browser/SKILL.md', OPTS);
    expect(r!.classification).toBe('group');
    expect(r!.target).toBe(
      path.join(
        '/repo/data/v2-sessions/grp-abc/.claude-shared/skills/browser/SKILL.md',
      ),
    );
  });

  it('routes originating group folder edits to their repo path', () => {
    const r = classifyPath('groups/main/CLAUDE.md', OPTS);
    expect(r!.classification).toBe('group');
    expect(r!.target).toBe('/repo/groups/main/CLAUDE.md');
  });

  it('treats other groups as host-level', () => {
    const r = classifyPath('groups/other-group/CLAUDE.md', OPTS);
    expect(r!.classification).toBe('host');
    expect(r!.target).toBe('/repo/groups/other-group/CLAUDE.md');
  });

  it('treats src/ as host-level', () => {
    const r = classifyPath('src/delivery.ts', OPTS);
    expect(r!.classification).toBe('host');
    expect(r!.target).toBe('/repo/src/delivery.ts');
  });

  it('treats root package.json as host-level', () => {
    const r = classifyPath('package.json', OPTS);
    expect(r!.classification).toBe('host');
  });

  it('treats root Dockerfile as host-level', () => {
    const r = classifyPath('Dockerfile', OPTS);
    expect(r!.classification).toBe('host');
  });

  it('treats container/Dockerfile as host-level', () => {
    const r = classifyPath('container/Dockerfile', OPTS);
    expect(r!.classification).toBe('host');
  });

  it('treats docs/ as host-level', () => {
    const r = classifyPath('docs/v2-checklist.md', OPTS);
    expect(r!.classification).toBe('host');
  });

  it('rejects .env and its variants', () => {
    expect(classifyPath('.env', OPTS)).toBeNull();
    expect(classifyPath('.env.local', OPTS)).toBeNull();
    expect(classifyPath('.env.production', OPTS)).toBeNull();
  });

  it('rejects data/ and store/ writes', () => {
    expect(classifyPath('data/something', OPTS)).toBeNull();
    expect(classifyPath('data/v2-sessions/foo/bar', OPTS)).toBeNull();
    expect(classifyPath('store/anything', OPTS)).toBeNull();
  });

  it('rejects absolute and traversal paths', () => {
    expect(classifyPath('/etc/passwd', OPTS)).toBeNull();
    expect(classifyPath('../outside', OPTS)).toBeNull();
    expect(classifyPath('', OPTS)).toBeNull();
  });
});

describe('isRunnerOrSkillsPath', () => {
  it('detects runner paths', () => {
    expect(isRunnerOrSkillsPath('container/agent-runner/src/index.ts')).toBe(true);
  });
  it('detects skills paths', () => {
    expect(isRunnerOrSkillsPath('container/skills/browser/SKILL.md')).toBe(true);
  });
  it('does not match unrelated container paths', () => {
    expect(isRunnerOrSkillsPath('container/Dockerfile')).toBe(false);
    expect(isRunnerOrSkillsPath('container/build.sh')).toBe(false);
  });
  it('does not match groups/ paths', () => {
    expect(isRunnerOrSkillsPath('groups/main/skills/foo.md')).toBe(false);
  });
});

describe('isMigrationPath', () => {
  it('detects migrations', () => {
    expect(isMigrationPath('src/db/migrations/007-new.ts')).toBe(true);
  });
  it('rejects other src paths', () => {
    expect(isMigrationPath('src/db/users.ts')).toBe(false);
  });
});

describe('classifyDiff — overall classification', () => {
  it('is "group" when all changes land in originating group targets', () => {
    const d = classifyDiff(
      ['groups/main/CLAUDE.md', 'container/agent-runner/src/index.ts'],
      OPTS,
    );
    expect(d.overall).toBe('group');
    expect(d.hostPaths).toHaveLength(0);
    expect(d.runnerOrSkillsPaths).toHaveLength(1);
  });

  it('is "host" when only host paths change and none are runner/skills', () => {
    const d = classifyDiff(['src/delivery.ts', 'package.json'], OPTS);
    expect(d.overall).toBe('host');
    expect(d.hostPaths).toHaveLength(2);
    expect(d.runnerOrSkillsPaths).toHaveLength(0);
  });

  it('is "combined" when host AND runner/skills are both changed', () => {
    const d = classifyDiff(
      ['src/delivery.ts', 'container/agent-runner/src/poll-loop.ts'],
      OPTS,
    );
    expect(d.overall).toBe('combined');
    expect(d.hostPaths).toHaveLength(1);
    expect(d.runnerOrSkillsPaths).toHaveLength(1);
  });

  it('is "combined" for host + skills change', () => {
    const d = classifyDiff(
      ['Dockerfile', 'container/skills/browser/SKILL.md'],
      OPTS,
    );
    expect(d.overall).toBe('combined');
  });

  it('flags migrations regardless of other paths', () => {
    const d = classifyDiff(
      ['src/db/migrations/007-new.ts', 'src/delivery.ts'],
      OPTS,
    );
    expect(d.touchesMigrations).toBe(true);
    expect(d.overall).toBe('host');
  });

  it('does not flag migrations when none touched', () => {
    const d = classifyDiff(['groups/main/CLAUDE.md'], OPTS);
    expect(d.touchesMigrations).toBe(false);
  });

  it('throws on excluded paths in the diff', () => {
    expect(() => classifyDiff(['.env'], OPTS)).toThrow(
      /unreachable or excluded path/,
    );
  });

  it('throws on data/ paths in the diff', () => {
    expect(() => classifyDiff(['data/something'], OPTS)).toThrow();
  });

  it('preserves original paths in output files', () => {
    const d = classifyDiff(
      ['groups/main/CLAUDE.md', 'src/delivery.ts'],
      OPTS,
    );
    expect(d.files.map((f) => f.path)).toEqual([
      'groups/main/CLAUDE.md',
      'src/delivery.ts',
    ]);
  });
});
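The tests above pin down a three-way decision for the overall classification. As a reading aid, here is a minimal standalone sketch of that rule. It is a simplification under stated assumptions: `sketchLevelOf`, `sketchTouchesRunnerOrSkills`, and `sketchOverall` are hypothetical names, the excluded-path checks and target mapping are omitted, and only the prefix rules visible in the tests are modeled.

```typescript
type Overall = 'group' | 'host' | 'combined';

// Hypothetical stand-in for the runner/skills prefix check.
function sketchTouchesRunnerOrSkills(p: string): boolean {
  return p.startsWith('container/agent-runner/src/') || p.startsWith('container/skills/');
}

// Hypothetical stand-in for per-file classification (no exclusions, no targets).
function sketchLevelOf(p: string, originatingFolder: string): 'group' | 'host' {
  const ownFolder = p.startsWith(`groups/${originatingFolder}/`);
  return sketchTouchesRunnerOrSkills(p) || ownFolder ? 'group' : 'host';
}

function sketchOverall(paths: string[], originatingFolder: string): Overall {
  const hasHost = paths.some((p) => sketchLevelOf(p, originatingFolder) === 'host');
  if (!hasHost) return 'group'; // nothing needs owner approval
  // A host-level change that also touches runner/skills triggers the
  // cross-group warning, hence the separate 'combined' bucket.
  return paths.some(sketchTouchesRunnerOrSkills) ? 'combined' : 'host';
}

console.log(sketchOverall(['groups/main/CLAUDE.md', 'src/delivery.ts'], 'main'));
```

One consequence worth noticing: a diff that mixes the originating group's own folder with a single host file is still classified `host`, because one host-level path is enough to require owner approval.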
@@ -0,0 +1,196 @@
/**
 * Diff classification for builder-agent swaps.
 *
 * Every changed file is classified as:
 *   - group: lands in the originating group's private per-group dir or own folder
 *   - host:  lands in host code / repo template / other groups (requires owner
 *            approval; typed confirmation is the goal but is not enforced yet)
 *
 * The overall swap classification:
 *   - 'group':    all changes are group-level
 *   - 'host':     at least one change is host-level, none touch runner/skills
 *   - 'combined': at least one change is host-level AND the diff touches
 *                 container/agent-runner or container/skills
 *                 (triggers the cross-group safety warning on the approval card)
 *
 * Classification is purely about APPROVAL ROUTING and SWAP TARGETS, not about
 * what the dev agent was allowed to write. The dev agent has full worktree
 * write access; classification happens at `request_swap` time.
 *
 * Swap-target mapping: given a changed path in the worktree, where does it
 * land on disk when the swap is applied? Group-level files go to the
 * originating group's private dir; host-level files go to the repo paths.
 */

import path from 'path';

import type { SwapClassification } from '../types.js';

export type FileClassification = 'group' | 'host';

export interface ClassifiedFile {
  /** Path relative to the worktree root (same form `git diff --name-only` returns). */
  path: string;
  classification: FileClassification;
  /**
   * Absolute on-disk destination where this file lands when the swap is
   * applied. Computed relative to `projectRoot` (the main repo) and, for
   * group-level paths under `container/agent-runner/src/**` or
   * `container/skills/**`, redirected into the originating group's private
   * per-group dirs under `data/v2-sessions/<id>/`.
   */
  targetAbsPath: string;
}

export interface ClassifiedDiff {
  files: ClassifiedFile[];
  overall: SwapClassification;
  /** Subset of `files` with classification === 'host'. */
  hostPaths: ClassifiedFile[];
  /** Subset of `files` that touch runner or skills code (regardless of classification). */
  runnerOrSkillsPaths: ClassifiedFile[];
  /**
   * True iff any file under `src/db/migrations/**` is in the diff — drives
   * the rollback-may-be-lossy warning on the approval card.
   */
  touchesMigrations: boolean;
}

export interface ClassifyOptions {
  /** Absolute path to the main repo root (used for host-target mapping). */
  projectRoot: string;
  /** Absolute path to the data dir (typically `<projectRoot>/data`). */
  dataDir: string;
  /** Agent-group ID whose private dirs are the targets for group-level swaps. */
  originatingGroupId: string;
  /**
   * Folder name (not ID) for the originating group, used to identify the
   * one allowed `groups/<folder>/**` path. Other groups are host-level.
   */
  originatingGroupFolder: string;
}

/**
 * Classify a single path. Used by `classifyDiff`; exported for unit tests.
 * Returns null for paths that must never be written to (excluded mount paths),
 * e.g. `.env`, `data/` (outside the carve-outs), `store/`. Callers should treat
 * null as a reject-with-error signal.
 */
export function classifyPath(
  relPath: string,
  opts: ClassifyOptions,
): { classification: FileClassification; target: string } | null {
  const norm = relPath.replace(/\\/g, '/');

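  // Reject empty and absolute paths, plus any path with a `..` segment
  // anywhere (not only a leading one), so a mapped target cannot escape
  // its root or slip past the exclusions below.
  if (norm === '' || path.isAbsolute(norm) || norm.split('/').includes('..')) return null;
  if (norm === '.env' || norm.startsWith('.env.')) return null;
  if (norm === 'store' || norm.startsWith('store/')) return null;
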
  // data/ is host-unreachable EXCEPT for the per-group carve-outs which are
  // tracked in git by design. The builder-agent flow never writes directly
  // to those paths via the worktree (the worktree reflects the overlaid
  // template path under container/agent-runner/src/ etc.), so any diff entry
  // under data/** is a reject.
  if (norm === 'data' || norm.startsWith('data/')) return null;

  // ── group-level ────────────────────────────────────────────────
  // container/agent-runner/src/** → data/v2-sessions/<id>/agent-runner-src/**
  const runnerPrefix = 'container/agent-runner/src/';
  if (norm.startsWith(runnerPrefix)) {
    const rel = norm.slice(runnerPrefix.length);
    return {
      classification: 'group',
      target: path.join(
        opts.dataDir,
        'v2-sessions',
        opts.originatingGroupId,
        'agent-runner-src',
        rel,
      ),
    };
  }

  // container/skills/** → data/v2-sessions/<id>/.claude-shared/skills/**
  const skillsPrefix = 'container/skills/';
  if (norm.startsWith(skillsPrefix)) {
    const rel = norm.slice(skillsPrefix.length);
    return {
      classification: 'group',
      target: path.join(
        opts.dataDir,
        'v2-sessions',
        opts.originatingGroupId,
        '.claude-shared',
        'skills',
        rel,
      ),
    };
  }

  // groups/<originating-folder>/** → groups/<originating-folder>/** (same path)
  const originatingPrefix = `groups/${opts.originatingGroupFolder}/`;
  if (norm.startsWith(originatingPrefix)) {
    return {
      classification: 'group',
      target: path.join(opts.projectRoot, norm),
    };
  }

  // ── host-level ─────────────────────────────────────────────────
  // Everything else lands at its repo path. groups/<other>/** is host-level
  // because touching another group's data requires owner consent.
  return {
    classification: 'host',
    target: path.join(opts.projectRoot, norm),
  };
}

/** True iff a classified file's worktree path is under runner or skills template. */
export function isRunnerOrSkillsPath(relPath: string): boolean {
  const norm = relPath.replace(/\\/g, '/');
  return (
    norm.startsWith('container/agent-runner/src/') ||
    norm.startsWith('container/skills/')
  );
}

/** True iff a changed path is a schema migration. */
export function isMigrationPath(relPath: string): boolean {
  const norm = relPath.replace(/\\/g, '/');
  return norm.startsWith('src/db/migrations/');
}

/**
 * Classify every changed path. Throws if any path is unreachable
 * (excluded mount paths) — the dev agent should not be able to produce such
 * a diff because the worktree filesystem excludes those paths.
 */
export function classifyDiff(changedPaths: string[], opts: ClassifyOptions): ClassifiedDiff {
  const files: ClassifiedFile[] = [];
  for (const p of changedPaths) {
    const result = classifyPath(p, opts);
    if (!result) {
      throw new Error(
        `builder-agent: diff contains unreachable or excluded path: ${p}`,
      );
    }
    files.push({
      path: p,
      classification: result.classification,
      targetAbsPath: result.target,
    });
  }

  const hostPaths = files.filter((f) => f.classification === 'host');
  const runnerOrSkillsPaths = files.filter((f) => isRunnerOrSkillsPath(f.path));
  const touchesMigrations = files.some((f) => isMigrationPath(f.path));

  let overall: SwapClassification;
  if (hostPaths.length === 0) {
    overall = 'group';
  } else if (runnerOrSkillsPaths.length > 0) {
    overall = 'combined';
  } else {
    overall = 'host';
  }

  return { files, overall, hostPaths, runnerOrSkillsPaths, touchesMigrations };
}
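The deadman module in the next hunk describes a confirmation timer that is extended by two minutes after a successful card delivery but never past ten minutes from the initial start. A minimal sketch of that arithmetic follows, assuming millisecond timestamps; `sketchExtendedExpiry` and the two constant names are hypothetical and only mirror the values stated in the module's header comment.

```typescript
// Values taken from the deadman header comment; the names are hypothetical.
const SKETCH_EXTENSION_MS = 2 * 60 * 1000;
const SKETCH_HARD_CAP_MS = 10 * 60 * 1000;

// Next expiry after the confirmation card is delivered: two minutes from
// now, clamped to the hard cap measured from when the deadman first started.
function sketchExtendedExpiry(startedAtMs: number, nowMs: number): number {
  const hardCap = startedAtMs + SKETCH_HARD_CAP_MS;
  return Math.min(nowMs + SKETCH_EXTENSION_MS, hardCap);
}

// Delivered 1 minute in: expiry moves to the 3-minute mark.
console.log(sketchExtendedExpiry(0, 60_000));
// Delivered 9 minutes in: the 10-minute cap wins over 9 + 2.
console.log(sketchExtendedExpiry(0, 9 * 60_000));
```

Clamping with `Math.min` keeps repeated extensions from pushing the rollback deadline out indefinitely.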
@@ -0,0 +1,307 @@
/**
 * Builder-agent deadman dance.
 *
 * After a swap is applied and the originating container (or host) restarts,
 * we give the user a short window to confirm the new version is working.
 * Mechanism: send a two-button card (Confirm/Rollback) and start an
 * in-memory timer backed by `pending_swaps.deadman_expires_at`. The DB row
 * is source of truth so the timer survives host restart via the startup
 * sweep in `startup.ts`.
 *
 * Two-message handshake:
 *   1. Host → user: card "I'm back with the new version. Reply confirm to keep it."
 *   2. User → agent: click "Confirm" (or "Rollback") → card clicks route through
 *      `handleQuestionResponse` in index.ts, which delegates to
 *      `handleSwapConfirmationResponse` here.
 *
 * Timer extension: when we successfully deliver step 1, we bump
 * `deadman_expires_at` to +2 minutes from now (so slow channel reconnects
 * don't trigger false rollback once we know outbound works). Hard cap:
 * 10 minutes absolute maximum from initial start.
 */
import { createPendingApproval, deletePendingApproval } from '../db/sessions.js';
import { findSessionByAgentGroup } from '../db/sessions.js';
import { getMessagingGroup } from '../db/messaging-groups.js';
import {
  extendSwapDeadman,
  getAwaitingConfirmationSwaps,
  getPendingSwap,
  setSwapHandshakeState,
  startSwapDeadman,
  updatePendingSwapStatus,
} from '../db/pending-swaps.js';
import { log } from '../log.js';
import type { PendingSwap } from '../types.js';
import { maybeSendPromotePrompt } from './promote.js';
import { removeDevWorktree } from './worktree.js';
import {
  isHostLevelSwap,
  parseSwapSummary,
  restoreDbFromSnapshot,
  rollbackSwapFiles,
} from './swap.js';

const DEADMAN_INITIAL_MS = 2 * 60 * 1000;
const DEADMAN_HARD_CAP_MS = 10 * 60 * 1000;

/** In-memory timers keyed by request_id. Rehydrated by the startup sweep. */
const activeTimers = new Map<string, NodeJS.Timeout>();

/** Abstract channel-delivery surface so deadman can run without importing delivery.ts. */
export interface DeadmanDelivery {
  deliver(
    channelType: string,
    platformId: string,
    threadId: string | null,
    kind: string,
    content: string,
  ): Promise<string | undefined>;
}

let deliveryRef: DeadmanDelivery | null = null;

export function setDeadmanDelivery(adapter: DeadmanDelivery): void {
  deliveryRef = adapter;
}

/**
|
||||
* Start the deadman for a freshly-applied swap. Called either directly
|
||||
* after a group-level swap (host stays up) or by the startup sweep for a
|
||||
* host-level swap (host just restarted).
|
||||
*/
|
||||
export async function startDeadman(requestId: string): Promise<void> {
|
||||
const swap = getPendingSwap(requestId);
|
||||
if (!swap) {
|
||||
log.warn('startDeadman: swap not found', { requestId });
|
||||
return;
|
||||
}
|
||||
|
||||
const now = Date.now();
|
||||
const hardCap = swap.deadman_started_at
|
||||
? new Date(swap.deadman_started_at).getTime() + DEADMAN_HARD_CAP_MS
|
||||
: now + DEADMAN_HARD_CAP_MS;
|
||||
const expiresAtMs = Math.min(now + DEADMAN_INITIAL_MS, hardCap);
|
||||
const startedAtIso = swap.deadman_started_at ?? new Date(now).toISOString();
|
||||
const expiresAtIso = new Date(expiresAtMs).toISOString();
|
||||
|
||||
startSwapDeadman(requestId, startedAtIso, expiresAtIso, 'pending_restart');
|
||||
|
||||
const delivered = await sendHandshakeCard(swap);
|
||||
if (delivered) {
|
||||
setSwapHandshakeState(requestId, 'message1_sent');
|
||||
// Extend timer by a fresh +2 min from NOW, capped by hard cap.
|
||||
const extended = Math.min(Date.now() + DEADMAN_INITIAL_MS, hardCap);
|
||||
extendSwapDeadman(requestId, new Date(extended).toISOString());
|
||||
}
|
||||
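  // Keep the in-memory timer in step with the persisted expiry, including the hard cap.
  scheduleTimer(
    requestId,
    Math.max(100, (delivered ? Math.min(Date.now() + DEADMAN_INITIAL_MS, hardCap) : expiresAtMs) - Date.now()),
  );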
|
||||
}
|
||||
|
||||
/** Resume a deadman from persisted state after a host restart. */
|
||||
export async function resumeDeadman(swap: PendingSwap): Promise<void> {
|
||||
if (!swap.deadman_expires_at) {
|
||||
log.warn('resumeDeadman: no deadman_expires_at, rolling back', { requestId: swap.request_id });
|
||||
await executeRollback(swap.request_id, 'startup: corrupt deadman state');
|
||||
return;
|
||||
}
|
||||
const remainingMs = new Date(swap.deadman_expires_at).getTime() - Date.now();
|
||||
if (remainingMs <= 0) {
|
||||
log.info('Deadman already expired at startup; rolling back', { requestId: swap.request_id });
|
||||
await executeRollback(swap.request_id, 'startup: deadman expired');
|
||||
return;
|
||||
}
|
||||
|
||||
log.info('Resuming deadman after host restart', {
|
||||
requestId: swap.request_id,
|
||||
remainingMs,
|
||||
handshakeState: swap.handshake_state,
|
||||
});
|
||||
|
||||
// If we're here after a host-level swap restart, handshake_state is still
|
||||
// 'pending_restart' — we haven't sent message 1 yet because the host was
|
||||
// in the middle of restarting. Send it now.
|
||||
if (swap.handshake_state === 'pending_restart') {
|
||||
const delivered = await sendHandshakeCard(swap);
|
||||
if (delivered) setSwapHandshakeState(swap.request_id, 'message1_sent');
|
||||
}
|
||||
|
||||
scheduleTimer(swap.request_id, remainingMs);
|
||||
}
|
||||
|
||||
function scheduleTimer(requestId: string, ms: number): void {
|
||||
const existing = activeTimers.get(requestId);
|
||||
if (existing) clearTimeout(existing);
|
||||
const handle = setTimeout(() => {
|
||||
activeTimers.delete(requestId);
|
||||
void executeRollback(requestId, 'deadman timeout');
|
||||
}, ms);
|
||||
activeTimers.set(requestId, handle);
|
||||
}
|
||||
|
||||
/**
|
||||
* Send a Confirm/Rollback card to the user on the originating session's
|
||||
* messaging group. Returns true on successful delivery; false means the
|
||||
* channel isn't reachable and the deadman will fall through to timeout
|
||||
* (safe default: rollback if we can't even talk to the user).
|
||||
*/
|
||||
async function sendHandshakeCard(swap: PendingSwap): Promise<boolean> {
|
||||
if (!deliveryRef) {
|
||||
log.warn('sendHandshakeCard: no delivery adapter set', { requestId: swap.request_id });
|
||||
return false;
|
||||
}
|
||||
|
||||
// Find the originating agent's most recent active session so we know
|
||||
// which messaging group to send the card to.
|
||||
const session = findSessionByAgentGroup(swap.originating_group_id);
|
||||
if (!session || !session.messaging_group_id) {
|
||||
log.warn('sendHandshakeCard: no originating session with messaging group', {
|
||||
requestId: swap.request_id,
|
||||
});
|
||||
return false;
|
||||
}
|
||||
const mg = getMessagingGroup(session.messaging_group_id);
|
||||
if (!mg) {
|
||||
log.warn('sendHandshakeCard: messaging group not found', { requestId: swap.request_id });
|
||||
return false;
|
||||
}
|
||||
|
||||
// Create a pending_approval row so the button click routes back to
|
||||
// handleSwapConfirmationResponse via the existing handleApprovalResponse
|
||||
// dispatch in index.ts.
|
||||
const approvalId = `swapconf-${swap.request_id}`;
|
||||
createPendingApproval({
|
||||
approval_id: approvalId,
|
||||
session_id: session.id,
|
||||
request_id: swap.request_id,
|
||||
action: 'swap_confirmation',
|
||||
payload: JSON.stringify({ swapRequestId: swap.request_id }),
|
||||
created_at: new Date().toISOString(),
|
||||
title: 'Confirm code change',
|
||||
options_json: JSON.stringify([
|
||||
{ label: 'Confirm', selectedLabel: '✅ Confirmed', value: 'confirm' },
|
||||
{ label: 'Rollback', selectedLabel: '↩️ Rolled back', value: 'rollback' },
|
||||
]),
|
||||
});
|
||||
|
||||
const summary = parseSwapSummary(swap);
|
||||
const body =
|
||||
`I'm back with the new version of my code.\n\n` +
|
||||
`**What changed:** ${summary.overallSummary || '(no summary)'}\n\n` +
|
||||
`Reply **Confirm** within 2 minutes to keep the new version, or **Rollback** to revert.`;
|
||||
|
||||
try {
|
||||
await deliveryRef.deliver(
|
||||
mg.channel_type,
|
||||
mg.platform_id,
|
||||
session.thread_id,
|
||||
'chat-sdk',
|
||||
JSON.stringify({
|
||||
type: 'ask_question',
|
||||
questionId: approvalId,
|
||||
title: 'Confirm code change',
|
||||
question: body,
|
||||
options: [
|
||||
{ label: 'Confirm', selectedLabel: '✅ Confirmed', value: 'confirm' },
|
||||
{ label: 'Rollback', selectedLabel: '↩️ Rolled back', value: 'rollback' },
|
||||
],
|
||||
}),
|
||||
);
|
||||
log.info('Deadman handshake card delivered', { requestId: swap.request_id, approvalId });
|
||||
return true;
|
||||
} catch (err) {
|
||||
log.error('Deadman handshake card delivery failed', { requestId: swap.request_id, err });
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Called by `handleApprovalResponse` in index.ts when the user clicks a
|
||||
* button on the deadman card. `confirm` finalizes; anything else rolls back.
|
||||
*/
|
||||
export async function handleSwapConfirmationResponse(
|
||||
approvalId: string,
|
||||
swapRequestId: string,
|
||||
selectedOption: string,
|
||||
): Promise<void> {
|
||||
const swap = getPendingSwap(swapRequestId);
|
||||
if (!swap) {
|
||||
log.warn('handleSwapConfirmationResponse: swap not found', { swapRequestId });
|
||||
deletePendingApproval(approvalId);
|
||||
return;
|
||||
}
|
||||
|
||||
const timer = activeTimers.get(swapRequestId);
|
||||
if (timer) {
|
||||
clearTimeout(timer);
|
||||
activeTimers.delete(swapRequestId);
|
||||
}
|
||||
|
||||
if (selectedOption === 'confirm') {
|
||||
await finalizeSwap(swap);
|
||||
} else {
|
||||
await executeRollback(swapRequestId, 'user clicked rollback');
|
||||
}
|
||||
|
||||
deletePendingApproval(approvalId);
|
||||
}
|
||||
|
||||
async function finalizeSwap(swap: PendingSwap): Promise<void> {
|
||||
updatePendingSwapStatus(swap.request_id, 'finalized');
|
||||
try {
|
||||
removeDevWorktree(swap.request_id);
|
||||
} catch (err) {
|
||||
log.warn('Failed to remove worktree during finalize', { requestId: swap.request_id, err });
|
||||
}
|
||||
log.info('Swap finalized', { requestId: swap.request_id });
|
||||
|
||||
// Fire the promote-to-template prompt if the diff touched runner/skills
|
||||
// paths. No-op if it didn't, and failures are swallowed so finalize
|
||||
// always reports success to the user.
|
||||
try {
|
||||
await maybeSendPromotePrompt(swap);
|
||||
} catch (err) {
|
||||
log.error('maybeSendPromotePrompt threw', { requestId: swap.request_id, err });
|
||||
}
|
||||
}
|
||||
|
||||
async function executeRollback(requestId: string, reason: string): Promise<void> {
|
||||
const swap = getPendingSwap(requestId);
|
||||
if (!swap) return;
|
||||
|
||||
log.info('Executing swap rollback', { requestId, reason });
|
||||
|
||||
try {
|
||||
rollbackSwapFiles(swap);
|
||||
} catch (err) {
|
||||
log.error('rollbackSwapFiles threw', { requestId, err });
|
||||
}
|
||||
|
||||
  // For host-level swaps, the central DB may have been mutated by the new
  // code since the swap. Restore it from the snapshot here; the process exit
  // further down lets the supervisor respawn the host on the old code.
  // Group-level swaps need no DB restore: the next message to the
  // originating agent spawns a fresh container on the rolled-back files.
|
||||
if (isHostLevelSwap(swap)) {
|
||||
try {
|
||||
restoreDbFromSnapshot(swap);
|
||||
} catch (err) {
|
||||
log.error('restoreDbFromSnapshot failed during rollback', { requestId, err });
|
||||
}
|
||||
}
|
||||
|
||||
updatePendingSwapStatus(requestId, 'rolled_back');
|
||||
|
||||
try {
|
||||
removeDevWorktree(requestId);
|
||||
} catch {
|
||||
/* best-effort */
|
||||
}
|
||||
|
||||
if (isHostLevelSwap(swap)) {
|
||||
log.warn('Host-level rollback triggering process exit for supervisor respawn', {
|
||||
requestId,
|
||||
});
|
||||
// Give log sinks a moment to flush.
|
||||
setTimeout(() => process.exit(0), 250);
|
||||
}
|
||||
// For group-level, the next message to the originating agent will spawn
|
||||
// a fresh container that picks up the rolled-back files.
|
||||
}
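The deadman timing described above combines a rolling two-minute window with an absolute ten-minute cap measured from the first start. A minimal sketch of that arithmetic as pure functions follows. The helper names `nextDeadmanExpiry` and `timerDelayMs` are hypothetical and do not exist in this commit; only the two constants and the 100 ms floor are taken from the code above.

```typescript
const DEADMAN_INITIAL_MS = 2 * 60 * 1000;
const DEADMAN_HARD_CAP_MS = 10 * 60 * 1000;

/**
 * Hypothetical helper: the next expiry is "now + 2 minutes", but never later
 * than 10 minutes after the deadman first started. A null start time means
 * the deadman is starting right now.
 */
function nextDeadmanExpiry(nowMs: number, startedAtMs: number | null): number {
  const hardCapMs = (startedAtMs ?? nowMs) + DEADMAN_HARD_CAP_MS;
  return Math.min(nowMs + DEADMAN_INITIAL_MS, hardCapMs);
}

/** Hypothetical helper: delay to pass to setTimeout, with the same 100 ms floor as startDeadman. */
function timerDelayMs(nowMs: number, expiresAtMs: number): number {
  return Math.max(100, expiresAtMs - nowMs);
}
```

Keeping the computation free of `Date.now()` makes the cap behaviour easy to check: an extension requested nine minutes after the start is clipped to the ten-minute mark rather than running to eleven.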
|
||||
@@ -0,0 +1,449 @@
|
||||
/**
|
||||
* Host-side handlers for builder-agent system actions dispatched from
|
||||
* `src/delivery.ts::handleSystemAction`. Two actions live here:
|
||||
*
|
||||
* - `create_dev_agent` — originating agent asks to spawn a fresh dev
|
||||
* agent. Two-step model: this handler only CREATES the dev agent
|
||||
* group, its worktree, destinations, and the pending_swaps row. It
|
||||
* does NOT start any work. The originating agent is expected to then
|
||||
* send a message to the dev agent via its destination to describe
|
||||
* the task. This keeps the MCP tool call cheap and makes the work
|
||||
* instructions first-class inbound chat that the user/originating
|
||||
* agent can review or edit.
|
||||
*
|
||||
* - `request_swap` — dev agent has finished editing and wants to submit
|
||||
* for approval. We look up the pending_swaps row by dev_agent_id, run
|
||||
* `git diff` in the worktree, classify by path, persist, and route the
|
||||
* approval card.
|
||||
*
|
||||
* Both handlers are fire-and-forget at the MCP-tool layer: the container
|
||||
* tool writes a message_out and returns immediately; any failure is
|
||||
* surfaced back to the caller via `notifyAgent`.
|
||||
*/
|
||||
import path from 'path';
|
||||
|
||||
import { GROUPS_DIR } from '../config.js';
|
||||
import { killContainer } from '../container-runner.js';
|
||||
import { createAgentGroup, deleteAgentGroup, getAgentGroup, getAgentGroupByFolder } from '../db/agent-groups.js';
|
||||
import { findSessionByAgentGroup } from '../db/sessions.js';
|
||||
import {
|
||||
createDestination,
|
||||
deleteAllDestinationsTouching,
|
||||
getDestinationByName,
|
||||
normalizeName,
|
||||
} from '../db/agent-destinations.js';
|
||||
import { getDb } from '../db/connection.js';
|
||||
import {
|
||||
createPendingSwap,
|
||||
getInFlightSwapForGroup,
|
||||
getSwapForDevAgent,
|
||||
updatePendingSwapStatus,
|
||||
} from '../db/pending-swaps.js';
|
||||
import { initGroupFilesystem } from '../group-init.js';
|
||||
import { log } from '../log.js';
|
||||
import { writeDestinations } from '../session-manager.js';
|
||||
import type { AgentGroup, Session, SwapClassification } from '../types.js';
|
||||
import { sendSwapApprovalCard } from './approval.js';
|
||||
import { classifyDiff } from './classifier.js';
|
||||
import { createDevWorktree, diffChangedPathsAtCommit, removeDevWorktree, worktreeHeadSha } from './worktree.js';
|
||||
|
||||
type NotifyFn = (session: Session, text: string) => void;
|
||||
|
||||
export interface CreateDevAgentContent {
|
||||
requestId: string;
|
||||
name: string;
|
||||
}
|
||||
|
||||
export interface RequestSwapContent {
|
||||
perFileSummaries: Record<string, string>;
|
||||
overallSummary: string;
|
||||
}
|
||||
|
||||
const DEV_AGENT_INSTRUCTIONS = `# Dev Agent
|
||||
|
||||
You are a dev agent spawned by the builder-agent self-modification flow. Your job is to make code changes that the originating agent (your \`parent\`) will describe to you in an inbound message, then propose the diff for admin approval. You work in an isolated git worktree mounted at \`/worktree\`.
|
||||
|
||||
## Bootstrapping: wait for your first task
|
||||
|
||||
When you spawn, there is nothing to do yet. Sit idle until your first inbound message from \`parent\` arrives — that message contains the task description. Do not start exploring the worktree before then.
|
||||
|
||||
## Your environment
|
||||
|
||||
- \`/worktree\` — a full copy of the NanoClaw repo, writable. Edit anything here.
|
||||
- \`data/\`, \`store/\`, \`.env\` inside the worktree are excluded/shadowed — you cannot read real credentials from them.
|
||||
- You run the same code and tools as your parent, but with NO web access.
|
||||
- You have \`git\` available inside \`/worktree\`. Commit your changes on the dev branch when ready.
|
||||
|
||||
## The flow
|
||||
|
||||
1. Wait for the parent's task in your first inbound message.
|
||||
2. Explore the worktree at \`/worktree\` to understand the code.
|
||||
3. Message your \`parent\` destination whenever you need clarification.
|
||||
4. Make the edits and \`git commit\` them in the worktree.
|
||||
5. When ready, message your parent: "Ready to propose these changes: {summary}. OK to submit for approval?"
|
||||
6. After the parent confirms, call the \`request_swap\` MCP tool with a per-file summary and an overall summary. The host takes it from there (classification, approval routing, swap dance, deadman).
|
||||
|
||||
You do not execute the swap yourself — the host does, after an admin approves. Your job ends at \`request_swap\`.
|
||||
|
||||
**Do not edit your own agent-group folder.** Your edits target \`/worktree\`, not your runtime. Trying to modify your own CLAUDE.md is both pointless (you run on the live version, not the copy) and confusing.
|
||||
`;
|
||||
|
||||
/**
|
||||
* Tear down any previous in-flight dev agent for this originating group.
|
||||
 * Called at the start of `handleCreateDevAgent`. Per decision #1 in the
|
||||
* plan: the originating agent may chat with a prior dev agent after its
|
||||
* request finalized, but the moment a NEW request comes in, the old dev
|
||||
* agent is wound down.
|
||||
*/
|
||||
function teardownPreviousDevAgent(originatingGroupId: string, originatingSession: Session): void {
|
||||
const prior = getInFlightSwapForGroup(originatingGroupId);
|
||||
if (!prior) return;
|
||||
|
||||
log.info('Tearing down previous dev agent before new request', {
|
||||
priorRequestId: prior.request_id,
|
||||
priorDevAgentId: prior.dev_agent_id,
|
||||
originatingGroupId,
|
||||
});
|
||||
|
||||
updatePendingSwapStatus(prior.request_id, 'rolled_back');
|
||||
try {
|
||||
removeDevWorktree(prior.request_id);
|
||||
} catch (err) {
|
||||
log.warn('Failed to remove prior worktree', { priorRequestId: prior.request_id, err });
|
||||
}
|
||||
try {
|
||||
deleteAllDestinationsTouching(prior.dev_agent_id);
|
||||
// REQUIRED: refresh the parent's destination projection after dropping
|
||||
// the prior dev-agent's rows, so its `dev-<name>` destination
|
||||
// disappears from the parent's running inbound.db. See the top-of-file
|
||||
// invariant in src/db/agent-destinations.ts.
|
||||
writeDestinations(originatingGroupId, originatingSession.id);
|
||||
} catch (err) {
|
||||
log.warn('Failed to drop prior dev-agent destinations', { priorDevAgentId: prior.dev_agent_id, err });
|
||||
}
|
||||
try {
|
||||
deleteAgentGroup(prior.dev_agent_id);
|
||||
} catch (err) {
|
||||
log.warn('Failed to delete prior dev agent group', { priorDevAgentId: prior.dev_agent_id, err });
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Handle a `create_dev_agent` system action from an originating agent.
|
||||
* Creates the dev agent group, worktree, destinations, and pending_swaps
|
||||
* row. Does NOT start any work — the originating agent is expected to
|
||||
* message the dev agent via its destination with the task details next.
|
||||
*/
|
||||
export async function handleCreateDevAgent(
|
||||
content: CreateDevAgentContent,
|
||||
session: Session,
|
||||
notifyAgent: NotifyFn,
|
||||
): Promise<void> {
|
||||
const requestId = content.requestId;
|
||||
const rawName = (content.name || '').trim();
|
||||
if (!rawName) {
|
||||
notifyAgent(session, 'create_dev_agent failed: name is required.');
|
||||
return;
|
||||
}
|
||||
|
||||
const originatingGroup = getAgentGroup(session.agent_group_id);
|
||||
if (!originatingGroup) {
|
||||
notifyAgent(session, 'create_dev_agent failed: originating agent group not found.');
|
||||
log.warn('create_dev_agent: missing originating group', {
|
||||
sessionAgentGroup: session.agent_group_id,
|
||||
});
|
||||
return;
|
||||
}
|
||||
|
||||
// Tear down any prior in-flight dev agent for this originating group.
|
||||
teardownPreviousDevAgent(originatingGroup.id, session);
|
||||
|
||||
// Sanitize + dedupe the destination name.
|
||||
const localName = normalizeName(rawName);
|
||||
if (getDestinationByName(originatingGroup.id, localName)) {
|
||||
notifyAgent(
|
||||
session,
|
||||
`create_dev_agent failed: you already have a destination named "${localName}". Pick a different name.`,
|
||||
);
|
||||
return;
|
||||
}
|
||||
|
||||
// Derive a safe folder name, deduplicated globally across agent_groups.folder.
|
||||
let folder = localName;
|
||||
let suffix = 2;
|
||||
while (getAgentGroupByFolder(folder)) {
|
||||
folder = `${localName}-${suffix}`;
|
||||
suffix++;
|
||||
}
|
||||
const groupPath = path.join(GROUPS_DIR, folder);
|
||||
const resolvedPath = path.resolve(groupPath);
|
||||
const resolvedGroupsDir = path.resolve(GROUPS_DIR);
|
||||
if (!resolvedPath.startsWith(resolvedGroupsDir + path.sep)) {
|
||||
notifyAgent(session, 'create_dev_agent failed: invalid folder path.');
|
||||
log.error('create_dev_agent path traversal attempt', { folder, resolvedPath });
|
||||
return;
|
||||
}
|
||||
|
||||
const devAgentGroupId = `ag-${Date.now()}-${Math.random().toString(36).slice(2, 8)}`;
|
||||
const now = new Date().toISOString();
|
||||
|
||||
const devGroup: AgentGroup = {
|
||||
id: devAgentGroupId,
|
||||
name: localName,
|
||||
folder,
|
||||
agent_provider: originatingGroup.agent_provider,
|
||||
created_at: now,
|
||||
};
|
||||
|
||||
try {
|
||||
createAgentGroup(devGroup);
|
||||
initGroupFilesystem(devGroup, { instructions: DEV_AGENT_INSTRUCTIONS });
|
||||
|
||||
// Bidirectional destinations: parent calls child by localName, child
|
||||
// calls parent as "parent" (or parent-N on collision).
|
||||
createDestination({
|
||||
agent_group_id: originatingGroup.id,
|
||||
local_name: localName,
|
||||
target_type: 'agent',
|
||||
target_id: devAgentGroupId,
|
||||
created_at: now,
|
||||
});
|
||||
let parentName = 'parent';
|
||||
let parentSuffix = 2;
|
||||
while (getDestinationByName(devAgentGroupId, parentName)) {
|
||||
parentName = `parent-${parentSuffix}`;
|
||||
parentSuffix++;
|
||||
}
|
||||
createDestination({
|
||||
agent_group_id: devAgentGroupId,
|
||||
local_name: parentName,
|
||||
target_type: 'agent',
|
||||
target_id: originatingGroup.id,
|
||||
created_at: now,
|
||||
});
|
||||
|
||||
// Fresh worktree per request (decision #2 in plan).
|
||||
createDevWorktree(requestId, originatingGroup.id);
|
||||
|
||||
// REQUIRED: project the new `dev-<name>` destination into the
|
||||
// originating agent's session inbound.db so the running container
|
||||
// sees it on its next send_message lookup. See the top-of-file
|
||||
// invariant in src/db/agent-destinations.ts.
|
||||
writeDestinations(originatingGroup.id, session.id);
|
||||
|
||||
    // Persist the pending_swaps row. commit_sha starts as an empty string;
    // pre_swap_sha, db_snapshot_path, and the deadman fields start null. All
    // of them are populated at request_swap time and/or approval time.
    // summary_json starts as an empty object; handleRequestSwap fills it
    // when the dev agent submits.
|
||||
createPendingSwap({
|
||||
request_id: requestId,
|
||||
dev_agent_id: devAgentGroupId,
|
||||
originating_group_id: originatingGroup.id,
|
||||
dev_branch: `dev/${requestId}`,
|
||||
commit_sha: '',
|
||||
classification: 'group',
|
||||
status: 'pending_approval',
|
||||
summary_json: JSON.stringify({}),
|
||||
pre_swap_sha: null,
|
||||
db_snapshot_path: null,
|
||||
deadman_started_at: null,
|
||||
deadman_expires_at: null,
|
||||
handshake_state: null,
|
||||
created_at: now,
|
||||
});
|
||||
} catch (err) {
|
||||
log.error('create_dev_agent failed mid-setup', { err, requestId, devAgentGroupId });
|
||||
try {
|
||||
removeDevWorktree(requestId);
|
||||
} catch {
|
||||
/* best effort */
|
||||
}
|
||||
try {
|
||||
deleteAllDestinationsTouching(devAgentGroupId);
|
||||
// REQUIRED: refresh the parent's destination projection after
|
||||
// dropping the partially-created dev-agent's rows. See the
|
||||
// top-of-file invariant in src/db/agent-destinations.ts.
|
||||
writeDestinations(originatingGroup.id, session.id);
|
||||
} catch {
|
||||
/* best effort */
|
||||
}
|
||||
try {
|
||||
deleteAgentGroup(devAgentGroupId);
|
||||
} catch {
|
||||
/* best effort */
|
||||
}
|
||||
notifyAgent(
|
||||
session,
|
||||
`create_dev_agent failed: ${err instanceof Error ? err.message : String(err)}`,
|
||||
);
|
||||
return;
|
||||
}
|
||||
|
||||
notifyAgent(
|
||||
session,
|
||||
`Dev agent "${localName}" created and is waiting for your first message. Send it the task details now with <message to="${localName}">...describe the change you want...</message>. It will NOT start until you message it.`,
|
||||
);
|
||||
log.info('Dev agent + worktree created', {
|
||||
requestId,
|
||||
devAgentGroupId,
|
||||
originatingGroupId: originatingGroup.id,
|
||||
localName,
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Handle a `request_swap` system action from a dev agent.
|
||||
*
|
||||
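 * Looks up the pending_swaps row by dev_agent_id, reads the committed diff
 * in the worktree, classifies it by path, persists the result, freezes the
 * dev agent's container, and routes the approval card to the originating
 * agent's session. The swap itself runs only after an admin approves.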
|
||||
*/
|
||||
export async function handleRequestSwap(
|
||||
content: RequestSwapContent,
|
||||
session: Session,
|
||||
notifyAgent: NotifyFn,
|
||||
): Promise<void> {
|
||||
const devGroup = getAgentGroup(session.agent_group_id);
|
||||
if (!devGroup) {
|
||||
notifyAgent(session, 'Code change submission failed: dev agent group not found.');
|
||||
return;
|
||||
}
|
||||
|
||||
const pending = getSwapForDevAgent(devGroup.id);
|
||||
if (!pending) {
|
||||
notifyAgent(session, 'Code change submission failed: no in-flight code change for this dev agent.');
|
||||
return;
|
||||
}
|
||||
|
||||
const overall = (content.overallSummary || '').trim();
|
||||
const perFile = content.perFileSummaries || {};
|
||||
if (!overall || Object.keys(perFile).length === 0) {
|
||||
notifyAgent(session, 'Code change submission failed: overallSummary and perFileSummaries are both required.');
|
||||
return;
|
||||
}
|
||||
|
||||
// Capture HEAD first, THEN read the commit-based diff. The agent is
|
||||
// still running at this point, so any working-tree noise must be
|
||||
// excluded — we only consider what's in the committed tree at this sha.
|
||||
let headSha: string;
|
||||
let changedPaths: string[];
|
||||
try {
|
||||
headSha = worktreeHeadSha(pending.request_id);
|
||||
changedPaths = diffChangedPathsAtCommit(pending.request_id, headSha);
|
||||
} catch (err) {
|
||||
notifyAgent(
|
||||
session,
|
||||
`Code change submission failed: could not read worktree diff (${err instanceof Error ? err.message : String(err)}).`,
|
||||
);
|
||||
return;
|
||||
}
|
||||
|
||||
if (changedPaths.length === 0) {
|
||||
notifyAgent(
|
||||
session,
|
||||
"Code change submission failed: no committed changes in the worktree. Did you forget to `git commit`? Uncommitted working-tree edits don't count — only the committed tree is reviewed.",
|
||||
);
|
||||
return;
|
||||
}
|
||||
|
||||
let classified;
|
||||
try {
|
||||
classified = classifyDiff(changedPaths, {
|
||||
projectRoot: process.cwd(),
|
||||
dataDir: path.resolve(process.cwd(), 'data'),
|
||||
originatingGroupId: pending.originating_group_id,
|
||||
originatingGroupFolder: getAgentGroup(pending.originating_group_id)?.folder ?? '',
|
||||
});
|
||||
} catch (err) {
|
||||
notifyAgent(
|
||||
session,
|
||||
`Code change submission failed: ${err instanceof Error ? err.message : String(err)}`,
|
||||
);
|
||||
return;
|
||||
}
|
||||
|
||||
updatePendingSwapRow(pending.request_id, {
|
||||
commit_sha: headSha,
|
||||
classification: classified.overall,
|
||||
summary_json: JSON.stringify({
|
||||
overallSummary: overall,
|
||||
perFileSummaries: perFile,
|
||||
classifiedFiles: classified.files.map((f) => ({ path: f.path, classification: f.classification })),
|
||||
touchesMigrations: classified.touchesMigrations,
|
||||
}),
|
||||
});
|
||||
|
||||
notifyAgent(
|
||||
session,
|
||||
`Code change registered for ${classified.files.length} file(s). Classification: ${classified.overall}. Sending for admin approval…`,
|
||||
);
|
||||
log.info('request_swap classified', {
|
||||
requestId: pending.request_id,
|
||||
devAgentId: devGroup.id,
|
||||
classification: classified.overall,
|
||||
fileCount: classified.files.length,
|
||||
touchesMigrations: classified.touchesMigrations,
|
||||
});
|
||||
|
||||
// Freeze: kill the dev agent's container now that commit_sha is set.
|
||||
// The spawn gate in container-runner.ts will refuse to bring it back
|
||||
// while pending_swaps.commit_sha is non-empty and status is non-terminal.
|
||||
// This prevents the dev agent from editing /worktree between submission
|
||||
// and approval/rollback, which would otherwise let un-reviewed content
|
||||
// land on main because applySwapFiles reads from commit_sha (below).
|
||||
killContainer(session.id, 'frozen for code-change approval');
|
||||
|
||||
// Route the approval card to the originating agent's session context so
|
||||
// the approver ladder picks the right person (group admin vs owner).
|
||||
const originatingSession = findSessionByAgentGroup(pending.originating_group_id);
|
||||
if (!originatingSession) {
|
||||
notifyAgent(
|
||||
session,
|
||||
'Code change approval could not be routed: the originating agent has no active session. An operator will need to resolve the pending_swaps row manually.',
|
||||
);
|
||||
return;
|
||||
}
|
||||
|
||||
const updatedSwap = {
|
||||
...pending,
|
||||
commit_sha: headSha,
|
||||
classification: classified.overall,
|
||||
summary_json: JSON.stringify({
|
||||
overallSummary: overall,
|
||||
perFileSummaries: perFile,
|
||||
classifiedFiles: classified.files.map((f) => ({ path: f.path, classification: f.classification })),
|
||||
touchesMigrations: classified.touchesMigrations,
|
||||
}),
|
||||
};
|
||||
await sendSwapApprovalCard(updatedSwap, originatingSession, (text) => notifyAgent(session, text));
|
||||
}
|
||||
|
||||
/**
|
||||
* Targeted UPDATE helper — avoids adding a dedicated DB helper per field
|
||||
* combination. Prepared statement is built once per call from the patch
|
||||
* shape; parameter count always matches.
|
||||
*/
|
||||
function updatePendingSwapRow(
|
||||
requestId: string,
|
||||
patch: { commit_sha?: string; classification?: SwapClassification; summary_json?: string },
|
||||
): void {
|
||||
const sets: string[] = [];
|
||||
const values: unknown[] = [];
|
||||
if (patch.commit_sha !== undefined) {
|
||||
sets.push('commit_sha = ?');
|
||||
values.push(patch.commit_sha);
|
||||
}
|
||||
if (patch.classification !== undefined) {
|
||||
sets.push('classification = ?');
|
||||
values.push(patch.classification);
|
||||
}
|
||||
if (patch.summary_json !== undefined) {
|
||||
sets.push('summary_json = ?');
|
||||
values.push(patch.summary_json);
|
||||
}
|
||||
if (sets.length === 0) return;
|
||||
values.push(requestId);
|
||||
getDb()
|
||||
.prepare(`UPDATE pending_swaps SET ${sets.join(', ')} WHERE request_id = ?`)
|
||||
.run(...values);
|
||||
}
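The targeted UPDATE helper builds its SET clause from whichever patch fields are present, so the placeholder count always matches the bound values. The sketch below isolates that statement-building step as a pure function that returns the SQL and parameters instead of running them. `buildPendingSwapUpdate` and `PATCHABLE_COLUMNS` are hypothetical names introduced for illustration; the real helper calls `getDb().prepare(...).run(...)` directly.

```typescript
type SwapPatch = { commit_sha?: string; classification?: string; summary_json?: string };

interface UpdateStatement {
  sql: string;
  values: unknown[];
}

// Fixed whitelist: column names are never taken from caller-supplied strings.
const PATCHABLE_COLUMNS = ['commit_sha', 'classification', 'summary_json'] as const;

/** Hypothetical helper: returns null when the patch is empty, else SQL plus bound values. */
function buildPendingSwapUpdate(requestId: string, patch: SwapPatch): UpdateStatement | null {
  const sets: string[] = [];
  const values: unknown[] = [];
  for (const column of PATCHABLE_COLUMNS) {
    const value = patch[column];
    if (value !== undefined) {
      sets.push(`${column} = ?`);
      values.push(value);
    }
  }
  if (sets.length === 0) return null;
  values.push(requestId); // binds the WHERE placeholder last
  return { sql: `UPDATE pending_swaps SET ${sets.join(', ')} WHERE request_id = ?`, values };
}
```

Separating the builder from execution is one possible design: it lets the statement shape be unit-tested without a database, at the cost of an extra indirection.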
|
||||
@@ -0,0 +1,112 @@
|
||||
import path from 'path';
|
||||
|
||||
import { describe, expect, it } from 'vitest';
|
||||
|
||||
import { sourceForTemplate, swapTouchedRunnerOrSkills } from './promote.js';
|
||||
import { targetRepoRelPath } from './swap.js';
|
||||
import { DATA_DIR } from '../config.js';
|
||||
import type { PendingSwap } from '../types.js';
|
||||
|
||||
function makeSwap(files: Array<{ path: string; classification: 'group' | 'host' }>): PendingSwap {
|
||||
return {
|
||||
request_id: 'req-1',
|
||||
dev_agent_id: 'ag-dev',
|
||||
originating_group_id: 'ag-abc',
|
||||
dev_branch: 'dev/req-1',
|
||||
commit_sha: 'sha',
|
||||
classification: 'group',
|
||||
status: 'finalized',
|
||||
summary_json: JSON.stringify({
|
||||
overallSummary: 'test',
|
||||
perFileSummaries: {},
|
||||
classifiedFiles: files,
|
||||
touchesMigrations: false,
|
||||
}),
|
||||
pre_swap_sha: null,
|
||||
db_snapshot_path: null,
|
||||
deadman_started_at: null,
|
||||
deadman_expires_at: null,
|
||||
handshake_state: null,
|
||||
created_at: '2026-04-15T00:00:00Z',
|
||||
};
|
||||
}
|
||||
|
||||
describe('swapTouchedRunnerOrSkills', () => {
|
||||
it('is false when only groups/ files are touched', () => {
|
||||
const swap = makeSwap([{ path: 'groups/main/CLAUDE.md', classification: 'group' }]);
|
||||
expect(swapTouchedRunnerOrSkills(swap)).toBe(false);
|
||||
});
|
||||
|
||||
it('is true when container/agent-runner/src is touched', () => {
|
||||
const swap = makeSwap([
|
||||
{ path: 'container/agent-runner/src/poll-loop.ts', classification: 'group' },
|
||||
]);
|
||||
expect(swapTouchedRunnerOrSkills(swap)).toBe(true);
|
||||
});
|
||||
|
||||
it('is true when container/skills is touched', () => {
|
||||
const swap = makeSwap([{ path: 'container/skills/browser/SKILL.md', classification: 'group' }]);
|
||||
expect(swapTouchedRunnerOrSkills(swap)).toBe(true);
|
||||
});
|
||||
|
||||
it('is true when runner/skills AND host paths are mixed (combined diff)', () => {
|
||||
const swap = makeSwap([
|
||||
{ path: 'src/delivery.ts', classification: 'host' },
|
||||
{ path: 'container/agent-runner/src/poll-loop.ts', classification: 'group' },
|
||||
]);
|
||||
expect(swapTouchedRunnerOrSkills(swap)).toBe(true);
|
||||
});
|
||||
|
||||
it('is false when only host paths are touched', () => {
|
||||
const swap = makeSwap([{ path: 'src/delivery.ts', classification: 'host' }]);
|
||||
expect(swapTouchedRunnerOrSkills(swap)).toBe(false);
|
||||
});
|
||||
|
||||
it('is false for an empty diff', () => {
|
||||
const swap = makeSwap([]);
|
||||
expect(swapTouchedRunnerOrSkills(swap)).toBe(false);
|
||||
});
|
||||
});
|
||||
|
||||
describe('sourceForTemplate', () => {
|
||||
it('maps runner template paths to the per-group private dir (absolute)', () => {
|
||||
const src = sourceForTemplate('container/agent-runner/src/index.ts', 'ag-abc');
|
||||
expect(src).toBe(path.join(DATA_DIR, 'v2-sessions', 'ag-abc', 'agent-runner-src', 'index.ts'));
|
||||
});
|
||||
|
||||
it('maps nested runner paths correctly', () => {
|
||||
const src = sourceForTemplate('container/agent-runner/src/mcp-tools/agents.ts', 'ag-abc');
|
||||
expect(src).toBe(
|
||||
path.join(DATA_DIR, 'v2-sessions', 'ag-abc', 'agent-runner-src', 'mcp-tools', 'agents.ts'),
|
||||
);
|
||||
});
|
||||
|
||||
it('maps skills template paths to the per-group skills dir', () => {
|
||||
const src = sourceForTemplate('container/skills/browser/SKILL.md', 'ag-abc');
|
||||
expect(src).toBe(
|
||||
path.join(DATA_DIR, 'v2-sessions', 'ag-abc', '.claude-shared', 'skills', 'browser', 'SKILL.md'),
|
||||
);
|
||||
});
|
||||
});
|
||||
|
||||
describe('promote source mapping matches classifier target mapping', () => {
|
||||
// Invariant: for every runner/skills template path, the classifier's
|
||||
// target (for applying the swap) and promote's source (for reading back
|
||||
// from the committed per-group state) must be the same repo-relative
|
||||
// path. Both transforms should agree or rollback/promote will hit
|
||||
// different files.
|
||||
it('runner path round-trips through both transforms', () => {
|
||||
const templatePath = 'container/agent-runner/src/index.ts';
|
||||
const viaClassifier = targetRepoRelPath(templatePath, 'ag-abc');
|
||||
// sourceForTemplate returns absolute; strip project root to compare.
|
||||
const viaPromote = path.relative(process.cwd(), sourceForTemplate(templatePath, 'ag-abc'));
|
||||
expect(viaPromote).toBe(viaClassifier);
|
||||
});
|
||||
|
||||
it('skills path round-trips through both transforms', () => {
|
||||
const templatePath = 'container/skills/browser/SKILL.md';
|
||||
const viaClassifier = targetRepoRelPath(templatePath, 'ag-abc');
|
||||
const viaPromote = path.relative(process.cwd(), sourceForTemplate(templatePath, 'ag-abc'));
|
||||
expect(viaPromote).toBe(viaClassifier);
|
||||
});
|
||||
});
|
||||
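The invariant these tests guard can be tried in isolation. The following is a minimal sketch, not the project's implementation: `toGroupPath` and `toTemplatePath` are hypothetical names, `DATA_DIR_REL` is an assumed stand-in for `DATA_DIR` expressed relative to the repo root, and the directory layout is copied from the expectations in the tests above.

```typescript
import * as path from 'node:path';

// Assumption: stand-in for the project's DATA_DIR, relative to the repo root.
const DATA_DIR_REL = 'data';

const RUNNER_PREFIX = 'container/agent-runner/src/';
const SKILLS_PREFIX = 'container/skills/';

/** Template path -> repo-relative per-group path; other paths pass through. */
function toGroupPath(templatePath: string, groupId: string): string {
  const norm = templatePath.replace(/\\/g, '/');
  const base = path.posix.join(DATA_DIR_REL, 'v2-sessions', groupId);
  if (norm.startsWith(RUNNER_PREFIX)) {
    return path.posix.join(base, 'agent-runner-src', norm.slice(RUNNER_PREFIX.length));
  }
  if (norm.startsWith(SKILLS_PREFIX)) {
    return path.posix.join(base, '.claude-shared', 'skills', norm.slice(SKILLS_PREFIX.length));
  }
  return norm;
}

/** Inverse mapping; null when the path is not under that group's private dirs. */
function toTemplatePath(groupPath: string, groupId: string): string | null {
  const base = path.posix.join(DATA_DIR_REL, 'v2-sessions', groupId) + '/';
  if (!groupPath.startsWith(base)) return null;
  const rest = groupPath.slice(base.length);
  const runnerDir = 'agent-runner-src/';
  const skillsDir = '.claude-shared/skills/';
  if (rest.startsWith(runnerDir)) return RUNNER_PREFIX + rest.slice(runnerDir.length);
  if (rest.startsWith(skillsDir)) return SKILLS_PREFIX + rest.slice(skillsDir.length);
  return null;
}
```

If both directions are derived from the same prefix constants, a round trip through them returns the original template path, which is the property the two `round-trips` tests assert for the real functions.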
@@ -0,0 +1,250 @@
|
||||
/**
 * Promote-to-template flow.
 *
 * Runs once a swap is finalized (user clicked Confirm in the deadman). If
 * the diff touched `container/agent-runner/src/**` or `container/skills/**`,
 * we offer the approver a follow-up card:
 *
 *   "The runner/skills changes are currently applied only to the
 *   {originating} group. Would you like to also apply them to the
 *   template so new groups created in the future inherit these changes?"
 *
 * Options: `Apply to template` / `Keep local to {originating}`. This is
 * decide-now-or-never: there is no "Ask me later" state and no lifecycle
 * management burden.
 *
 * On apply: copy files from the originating group's committed private dir
 * (`data/v2-sessions/<id>/agent-runner-src/**`, etc.) to the repo template
 * paths (`container/agent-runner/src/**`, `container/skills/**`), then commit.
 * New groups initialized after this point inherit the updated template.
 * Existing groups are untouched.
 */
|
||||
import { execFileSync } from 'child_process';
|
||||
import fs from 'fs';
|
||||
import path from 'path';
|
||||
|
||||
import { pickApprovalDelivery, pickApprover } from '../access.js';
|
||||
import { DATA_DIR } from '../config.js';
|
||||
import { getAgentGroup } from '../db/agent-groups.js';
|
||||
import { getMessagingGroup } from '../db/messaging-groups.js';
|
||||
import { createPendingApproval, deletePendingApproval, findSessionByAgentGroup } from '../db/sessions.js';
|
||||
import { getOwners } from '../db/user-roles.js';
|
||||
import { log } from '../log.js';
|
||||
import type { PendingSwap } from '../types.js';
|
||||
import { parseSwapSummary } from './swap.js';
|
||||
|
||||
const PROJECT_ROOT = process.cwd();
|
||||
|
||||
export interface PromoteDelivery {
|
||||
deliver(
|
||||
channelType: string,
|
||||
platformId: string,
|
||||
threadId: string | null,
|
||||
kind: string,
|
||||
content: string,
|
||||
): Promise<string | undefined>;
|
||||
}
|
||||
|
||||
let deliveryRef: PromoteDelivery | null = null;
|
||||
|
||||
export function setPromoteDelivery(adapter: PromoteDelivery): void {
|
||||
deliveryRef = adapter;
|
||||
}
|
||||
|
||||
/**
|
||||
* True iff any path in the swap's diff maps to runner or skills template.
|
||||
* Used by the finalize path to decide whether to trigger the prompt.
|
||||
*/
|
||||
export function swapTouchedRunnerOrSkills(swap: PendingSwap): boolean {
|
||||
const summary = parseSwapSummary(swap);
|
||||
return summary.classifiedFiles.some(
|
||||
(f) =>
|
||||
f.path.startsWith('container/agent-runner/src/') ||
|
||||
f.path.startsWith('container/skills/'),
|
||||
);
|
||||
}
|
||||
|
||||
/**
|
||||
* Send the promote-to-template prompt to the approver of the original
|
||||
* swap. Routing is the same as the original approval card — group admin
|
||||
* for group-level, owner-only for host-level-combined. No-ops if the
|
||||
* swap didn't touch runner/skills.
|
||||
*/
|
||||
export async function maybeSendPromotePrompt(swap: PendingSwap): Promise<void> {
|
||||
if (!swapTouchedRunnerOrSkills(swap)) return;
|
||||
if (!deliveryRef) {
|
||||
log.warn('maybeSendPromotePrompt: no delivery adapter set', { requestId: swap.request_id });
|
||||
return;
|
||||
}
|
||||
|
||||
const isHostLevel = swap.classification === 'host' || swap.classification === 'combined';
|
||||
const approvers = isHostLevel
|
||||
? getOwners().map((r) => r.user_id)
|
||||
: pickApprover(swap.originating_group_id);
|
||||
|
||||
if (approvers.length === 0) {
|
||||
log.info('Skipping promote prompt: no approvers configured', { requestId: swap.request_id });
|
||||
return;
|
||||
}
|
||||
|
||||
const originatingSession = findSessionByAgentGroup(swap.originating_group_id);
|
||||
const originChannelType = originatingSession?.messaging_group_id
|
||||
? (getMessagingGroup(originatingSession.messaging_group_id)?.channel_type ?? '')
|
||||
: '';
|
||||
|
||||
const target = await pickApprovalDelivery(approvers, originChannelType);
|
||||
if (!target) {
|
||||
log.info('Skipping promote prompt: no reachable approver', { requestId: swap.request_id });
|
||||
return;
|
||||
}
|
||||
|
||||
const originatingGroup = getAgentGroup(swap.originating_group_id);
|
||||
const originatingName = originatingGroup?.name ?? swap.originating_group_id;
|
||||
|
||||
const approvalId = `promote-${swap.request_id}`;
|
||||
const options = [
|
||||
{ label: 'Apply to template', selectedLabel: '✅ Promoted', value: 'apply' },
|
||||
{ label: `Keep local to ${originatingName}`, selectedLabel: '↪️ Kept local', value: 'keep' },
|
||||
];
|
||||
|
||||
createPendingApproval({
|
||||
approval_id: approvalId,
|
||||
session_id: originatingSession?.id ?? null,
|
||||
request_id: swap.request_id,
|
||||
action: 'promote_template',
|
||||
payload: JSON.stringify({ swapRequestId: swap.request_id }),
|
||||
created_at: new Date().toISOString(),
|
||||
title: 'Promote to template?',
|
||||
options_json: JSON.stringify(options),
|
||||
});
|
||||
|
||||
const summary = parseSwapSummary(swap);
|
||||
const runnerOrSkills = summary.classifiedFiles
|
||||
.filter((f) => f.path.startsWith('container/agent-runner/src/') || f.path.startsWith('container/skills/'))
|
||||
.map((f) => `- \`${f.path}\``)
|
||||
.join('\n');
|
||||
|
||||
const body =
|
||||
`Code change confirmed. The runner/skills edits are currently applied only to the **${originatingName}** agent.\n\n` +
|
||||
`**Files that could also become the default for new agents:**\n${runnerOrSkills}\n\n` +
|
||||
`Apply to template so agents created in the future inherit these changes? ` +
|
||||
`(Existing agents are unaffected either way.)`;
|
||||
|
||||
try {
|
||||
await deliveryRef.deliver(
|
||||
target.messagingGroup.channel_type,
|
||||
target.messagingGroup.platform_id,
|
||||
null,
|
||||
'chat-sdk',
|
||||
JSON.stringify({
|
||||
type: 'ask_question',
|
||||
questionId: approvalId,
|
||||
title: 'Promote to template?',
|
||||
question: body,
|
||||
options,
|
||||
}),
|
||||
);
|
||||
log.info('Promote prompt delivered', {
|
||||
requestId: swap.request_id,
|
||||
approvalId,
|
||||
approver: target.userId,
|
||||
});
|
||||
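  } catch (err) {
    log.error('Promote prompt delivery failed', { requestId: swap.request_id, err });
    // The card never reached the approver, so drop the approval row created
    // above rather than leave an unanswerable pending approval behind.
    deletePendingApproval(approvalId);
  }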
|
||||
}
|
||||
|
||||
/**
|
||||
* Called by `handleApprovalResponse` in index.ts when the approver clicks
|
||||
* a button on the promote prompt. `apply` copies the runner/skills files
|
||||
* from the originating group's private dir into the repo template and
|
||||
* commits; anything else is a no-op.
|
||||
*/
|
||||
export async function handlePromoteResponse(
|
||||
approvalId: string,
|
||||
swapRequestId: string,
|
||||
selectedOption: string,
|
||||
): Promise<void> {
|
||||
try {
|
||||
if (selectedOption === 'apply') {
|
||||
await applyToTemplate(swapRequestId);
|
||||
} else {
|
||||
log.info('Promote skipped by approver', { swapRequestId, selectedOption });
|
||||
}
|
||||
} finally {
|
||||
deletePendingApproval(approvalId);
|
||||
}
|
||||
}
|
||||
|
||||
async function applyToTemplate(swapRequestId: string): Promise<void> {
|
||||
// Re-read the row directly (we need fresh state in case anything touched
|
||||
// it since finalize).
|
||||
const { getPendingSwap } = await import('../db/pending-swaps.js');
|
||||
const swap = getPendingSwap(swapRequestId);
|
||||
if (!swap) {
|
||||
log.warn('applyToTemplate: swap not found', { swapRequestId });
|
||||
return;
|
||||
}
|
||||
|
||||
const summary = parseSwapSummary(swap);
|
||||
const runnerOrSkills = summary.classifiedFiles.filter(
|
||||
(f) =>
|
||||
f.path.startsWith('container/agent-runner/src/') ||
|
||||
f.path.startsWith('container/skills/'),
|
||||
);
|
||||
if (runnerOrSkills.length === 0) return;
|
||||
|
||||
const copiedRelPaths: string[] = [];
|
||||
for (const f of runnerOrSkills) {
|
||||
// The source is the originating group's committed private copy, which
|
||||
// lives under data/v2-sessions/<id>/... thanks to the gitignore carve-
|
||||
// out. The destination is the repo template path at `f.path`.
|
||||
const src = sourceForTemplate(f.path, swap.originating_group_id);
|
||||
const dst = path.join(PROJECT_ROOT, f.path);
|
||||
if (!fs.existsSync(src)) {
|
||||
// File was deleted — mirror into the template.
|
||||
if (fs.existsSync(dst)) fs.rmSync(dst);
|
||||
copiedRelPaths.push(f.path);
|
||||
continue;
|
||||
}
|
||||
const dir = path.dirname(dst);
|
||||
if (!fs.existsSync(dir)) fs.mkdirSync(dir, { recursive: true });
|
||||
fs.copyFileSync(src, dst);
|
||||
copiedRelPaths.push(f.path);
|
||||
}
|
||||
|
||||
if (copiedRelPaths.length === 0) return;
|
||||
|
||||
try {
|
||||
execFileSync('git', ['add', '--', ...copiedRelPaths], { cwd: PROJECT_ROOT, stdio: 'ignore' });
|
||||
execFileSync(
|
||||
'git',
|
||||
['commit', '-m', `promote ${swapRequestId}: ${copiedRelPaths.join(', ')} → template`, '--', ...copiedRelPaths],
|
||||
{ cwd: PROJECT_ROOT, stdio: 'ignore' },
|
||||
);
|
||||
log.info('Promote to template committed', { swapRequestId, paths: copiedRelPaths });
|
||||
} catch (err) {
|
||||
log.error('Promote to template git operations failed', { swapRequestId, err });
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Compute the on-disk source path that corresponds to a repo template
|
||||
* path for a given originating group. This is the reverse of the
|
||||
* classifier's group-level target mapping.
|
||||
*
|
||||
* Exported for tests so the mapping stays in sync with the classifier.
|
||||
*/
|
||||
export function sourceForTemplate(templatePath: string, originatingGroupId: string): string {
|
||||
const norm = templatePath.replace(/\\/g, '/');
|
||||
if (norm.startsWith('container/agent-runner/src/')) {
|
||||
const rel = norm.slice('container/agent-runner/src/'.length);
|
||||
return path.join(DATA_DIR, 'v2-sessions', originatingGroupId, 'agent-runner-src', rel);
|
||||
}
|
||||
if (norm.startsWith('container/skills/')) {
|
||||
const rel = norm.slice('container/skills/'.length);
|
||||
return path.join(DATA_DIR, 'v2-sessions', originatingGroupId, '.claude-shared', 'skills', rel);
|
||||
}
|
||||
// Non-runner/skills paths are already the repo path — should never be
|
||||
// passed here since we filter first.
|
||||
return path.join(PROJECT_ROOT, norm);
|
||||
}
|
||||
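The copy-or-delete step inside `applyToTemplate` can be exercised on its own. The sketch below assumes nothing about the repo: `mirrorFile` is a hypothetical helper name, it operates on a throwaway temp directory, and it leaves out the git staging and commit that the real flow performs afterwards.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

type MirrorResult = 'copied' | 'deleted' | 'absent';

/** Make dst match src: copy when src exists, otherwise remove dst if present. */
function mirrorFile(src: string, dst: string): MirrorResult {
  if (!fs.existsSync(src)) {
    if (!fs.existsSync(dst)) return 'absent';
    fs.rmSync(dst);
    return 'deleted';
  }
  fs.mkdirSync(path.dirname(dst), { recursive: true });
  fs.copyFileSync(src, dst);
  return 'copied';
}

// Usage against a temp directory (file names here are illustrative only).
const sketchRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'promote-sketch-'));
const groupCopy = path.join(sketchRoot, 'group', 'SKILL.md');
const templateCopy = path.join(sketchRoot, 'template', 'skills', 'SKILL.md');
fs.mkdirSync(path.dirname(groupCopy), { recursive: true });
fs.writeFileSync(groupCopy, 'hello');

const firstRun = mirrorFile(groupCopy, templateCopy); // source exists
fs.rmSync(groupCopy);
const secondRun = mirrorFile(groupCopy, templateCopy); // source gone, template present
const thirdRun = mirrorFile(groupCopy, templateCopy); // neither side exists
```

Returning a distinct result for the case where neither side exists lets a caller skip paths that have nothing to stage, which is one possible way to keep a later `git add` from seeing unknown pathspecs.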
@@ -0,0 +1,76 @@
|
||||
/**
 * Builder-agent startup sweep.
 *
 * Runs once on host startup (from `src/index.ts::main()` after migrations).
 * Two jobs, one code path:
 *
 * 1. **Resume in-flight deadmans.** Any `pending_swaps` row in
 *    `awaiting_confirmation` either (a) belongs to a host-level swap
 *    whose host just restarted as the expected part of the dance, or
 *    (b) belongs to a group-level swap whose host crashed mid-dance.
 *    In either case we look at `deadman_expires_at`: if the deadline is
 *    in the past, auto-rollback; if in the future, rehydrate the timer
 *    and (for case a) send the handshake card now.
 *
 * 2. **Delete orphan worktrees.** Any `.worktrees/dev-*` directory whose
 *    corresponding `pending_swaps` row is in a terminal state
 *    (`finalized`, `rolled_back`, `rejected`) or missing altogether is
 *    removed.
 */
|
||||
import fs from 'fs';
|
||||
import path from 'path';
|
||||
|
||||
import {
|
||||
getAwaitingConfirmationSwaps,
|
||||
getPendingSwap,
|
||||
} from '../db/pending-swaps.js';
|
||||
import { log } from '../log.js';
|
||||
import { resumeDeadman } from './deadman.js';
|
||||
import { removeDevWorktree } from './worktree.js';
|
||||
|
||||
const WORKTREES_DIR = path.join(process.cwd(), '.worktrees');
|
||||
|
||||
export async function runBuilderAgentStartupSweep(): Promise<void> {
|
||||
await resumeInFlightSwaps();
|
||||
cleanupOrphanWorktrees();
|
||||
}
|
||||
|
||||
async function resumeInFlightSwaps(): Promise<void> {
|
||||
const pending = getAwaitingConfirmationSwaps();
|
||||
if (pending.length === 0) return;
|
||||
|
||||
log.info('Resuming in-flight builder-agent swaps', { count: pending.length });
|
||||
for (const swap of pending) {
|
||||
try {
|
||||
await resumeDeadman(swap);
|
||||
} catch (err) {
|
||||
log.error('resumeDeadman threw', { requestId: swap.request_id, err });
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
function cleanupOrphanWorktrees(): void {
|
||||
if (!fs.existsSync(WORKTREES_DIR)) return;
|
||||
|
||||
const entries = fs.readdirSync(WORKTREES_DIR, { withFileTypes: true });
|
||||
for (const entry of entries) {
|
||||
if (!entry.isDirectory() || !entry.name.startsWith('dev-')) continue;
|
||||
const requestId = entry.name.slice('dev-'.length);
|
||||
const swap = getPendingSwap(requestId);
|
||||
|
||||
// Orphaned if: no row, or row in a terminal state.
|
||||
const terminal =
|
||||
!swap ||
|
||||
swap.status === 'finalized' ||
|
||||
swap.status === 'rolled_back' ||
|
||||
swap.status === 'rejected';
|
||||
|
||||
if (terminal) {
|
||||
log.info('Cleaning up orphan worktree', { requestId, status: swap?.status ?? 'missing' });
|
||||
try {
|
||||
removeDevWorktree(requestId);
|
||||
} catch (err) {
|
||||
log.warn('Failed to remove orphan worktree', { requestId, err });
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
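The deadline check described in the header comment of the sweep can be sketched as a pure function. `resumeDeadman` itself is not shown in this diff, so everything below is an assumption made for illustration: the `decideResume` name, the `ResumeAction` shape, and the choice to treat a missing or unparseable deadline as a rollback.

```typescript
type ResumeAction =
  | { kind: 'rollback' }
  | { kind: 'rehydrate'; remainingMs: number };

/**
 * Decide what to do with an `awaiting_confirmation` swap at startup.
 * Assumption: a null or unparseable deadline is handled conservatively
 * as an expired one.
 */
function decideResume(deadmanExpiresAt: string | null, now: Date): ResumeAction {
  const expires = deadmanExpiresAt ? Date.parse(deadmanExpiresAt) : Number.NaN;
  if (Number.isNaN(expires) || expires <= now.getTime()) {
    return { kind: 'rollback' };
  }
  // Deadline is still in the future: re-arm the timer for the time left.
  return { kind: 'rehydrate', remainingMs: expires - now.getTime() };
}
```

Keeping the decision free of I/O means the past, future, and missing-deadline cases can be unit-tested without a database or timers; the caller would then roll back or schedule a timer for `remainingMs`.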
@@ -0,0 +1,143 @@
|
||||
import path from 'path';
|
||||
|
||||
import { describe, expect, it } from 'vitest';
|
||||
|
||||
import {
|
||||
isHostLevelSwap,
|
||||
parseSwapSummary,
|
||||
requiresFullHostRebuild,
|
||||
targetRepoRelPath,
|
||||
} from './swap.js';
|
||||
import type { PendingSwap } from '../types.js';
|
||||
|
||||
function makeSwap(overrides: Partial<PendingSwap> = {}): PendingSwap {
|
||||
return {
|
||||
request_id: 'req-1',
|
||||
dev_agent_id: 'ag-dev-1',
|
||||
originating_group_id: 'ag-origin-1',
|
||||
dev_branch: 'dev/req-1',
|
||||
commit_sha: 'abc123',
|
||||
classification: 'group',
|
||||
status: 'pending_approval',
|
||||
summary_json: '{}',
|
||||
pre_swap_sha: null,
|
||||
db_snapshot_path: null,
|
||||
deadman_started_at: null,
|
||||
deadman_expires_at: null,
|
||||
handshake_state: null,
|
||||
created_at: '2026-04-15T00:00:00Z',
|
||||
...overrides,
|
||||
};
|
||||
}
|
||||
|
||||
describe('parseSwapSummary', () => {
|
||||
it('parses a well-formed summary_json', () => {
|
||||
const swap = makeSwap({
|
||||
summary_json: JSON.stringify({
|
||||
overallSummary: 'Fix the welcome message typo',
|
||||
perFileSummaries: { 'groups/main/CLAUDE.md': 'Correct typo' },
|
||||
classifiedFiles: [{ path: 'groups/main/CLAUDE.md', classification: 'group' }],
|
||||
touchesMigrations: false,
|
||||
}),
|
||||
});
|
||||
const s = parseSwapSummary(swap);
|
||||
expect(s.overallSummary).toBe('Fix the welcome message typo');
|
||||
expect(s.perFileSummaries['groups/main/CLAUDE.md']).toBe('Correct typo');
|
||||
expect(s.classifiedFiles).toHaveLength(1);
|
||||
expect(s.touchesMigrations).toBe(false);
|
||||
});
|
||||
|
||||
it('fills in defaults for a missing summary_json shape', () => {
|
||||
const swap = makeSwap({ summary_json: '{}' });
|
||||
const s = parseSwapSummary(swap);
|
||||
expect(s.overallSummary).toBe('');
|
||||
expect(s.perFileSummaries).toEqual({});
|
||||
expect(s.classifiedFiles).toEqual([]);
|
||||
expect(s.touchesMigrations).toBe(false);
|
||||
});
|
||||
|
||||
it('fills in defaults for a partially-populated summary_json', () => {
|
||||
const swap = makeSwap({
|
||||
summary_json: JSON.stringify({ overallSummary: 'partial', touchesMigrations: true }),
|
||||
});
|
||||
const s = parseSwapSummary(swap);
|
||||
expect(s.overallSummary).toBe('partial');
|
||||
expect(s.touchesMigrations).toBe(true);
|
||||
expect(s.classifiedFiles).toEqual([]);
|
||||
});
|
||||
});
|
||||
|
||||
describe('isHostLevelSwap', () => {
|
||||
it('is false for group classification', () => {
|
||||
expect(isHostLevelSwap(makeSwap({ classification: 'group' }))).toBe(false);
|
||||
});
|
||||
it('is true for host classification', () => {
|
||||
expect(isHostLevelSwap(makeSwap({ classification: 'host' }))).toBe(true);
|
||||
});
|
||||
it('is true for combined classification', () => {
|
||||
expect(isHostLevelSwap(makeSwap({ classification: 'combined' }))).toBe(true);
|
||||
});
|
||||
});
|
||||
|
||||
describe('requiresFullHostRebuild', () => {
|
||||
const root = process.cwd();
|
||||
const abs = (p: string): string => path.join(root, p);
|
||||
|
||||
it('flags src/ changes', () => {
|
||||
expect(requiresFullHostRebuild([abs('src/delivery.ts')])).toBe(true);
|
||||
});
|
||||
it('flags root package.json', () => {
|
||||
expect(requiresFullHostRebuild([abs('package.json')])).toBe(true);
|
||||
});
|
||||
it('flags root Dockerfile', () => {
|
||||
expect(requiresFullHostRebuild([abs('Dockerfile')])).toBe(true);
|
||||
});
|
||||
it('flags container/Dockerfile', () => {
|
||||
expect(requiresFullHostRebuild([abs('container/Dockerfile')])).toBe(true);
|
||||
});
|
||||
it('does not flag groups/ changes', () => {
|
||||
expect(requiresFullHostRebuild([abs('groups/main/CLAUDE.md')])).toBe(false);
|
||||
});
|
||||
it('does not flag per-group runner dir changes', () => {
|
||||
expect(
|
||||
requiresFullHostRebuild([abs('data/v2-sessions/ag-1/agent-runner-src/poll-loop.ts')]),
|
||||
).toBe(false);
|
||||
});
|
||||
it('returns true if any path requires rebuild even if others do not', () => {
|
||||
expect(
|
||||
requiresFullHostRebuild([abs('groups/main/CLAUDE.md'), abs('src/delivery.ts')]),
|
||||
).toBe(true);
|
||||
});
|
||||
});
|
||||
|
||||
describe('targetRepoRelPath', () => {
|
||||
it('maps runner paths to the per-group private dir', () => {
|
||||
expect(targetRepoRelPath('container/agent-runner/src/index.ts', 'ag-abc')).toBe(
|
||||
'data/v2-sessions/ag-abc/agent-runner-src/index.ts',
|
||||
);
|
||||
});
|
||||
|
||||
it('maps nested runner paths correctly', () => {
|
||||
expect(
|
||||
targetRepoRelPath('container/agent-runner/src/mcp-tools/agents.ts', 'ag-abc'),
|
||||
).toBe('data/v2-sessions/ag-abc/agent-runner-src/mcp-tools/agents.ts');
|
||||
});
|
||||
|
||||
it('maps skills paths to the per-group skills dir', () => {
|
||||
expect(targetRepoRelPath('container/skills/browser/SKILL.md', 'ag-abc')).toBe(
|
||||
'data/v2-sessions/ag-abc/.claude-shared/skills/browser/SKILL.md',
|
||||
);
|
||||
});
|
||||
|
||||
it('leaves host-level paths untouched', () => {
|
||||
expect(targetRepoRelPath('src/delivery.ts', 'ag-abc')).toBe('src/delivery.ts');
|
||||
expect(targetRepoRelPath('package.json', 'ag-abc')).toBe('package.json');
|
||||
expect(targetRepoRelPath('groups/main/CLAUDE.md', 'ag-abc')).toBe('groups/main/CLAUDE.md');
|
||||
});
|
||||
|
||||
it('handles Windows-style separators by normalizing', () => {
|
||||
expect(
|
||||
targetRepoRelPath('container\\agent-runner\\src\\index.ts', 'ag-abc'),
|
||||
).toBe('data/v2-sessions/ag-abc/agent-runner-src/index.ts');
|
||||
});
|
||||
});
|
||||
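The `requiresFullHostRebuild` tests above pass absolute paths, while the path predicate in `swap.ts` works on repo-relative ones. A minimal sketch of how the two could fit together follows. It is an assumption, not the actual implementation: `requiresFullHostRebuildSketch` is a hypothetical name, and it takes the project root as an explicit parameter so it can run outside the repo.

```typescript
import * as path from 'node:path';

/** Same rule set as `isHostRebuildPath` in swap.ts, restated for the sketch. */
function isHostRebuildPath(relPath: string): boolean {
  const norm = relPath.replace(/\\/g, '/');
  return (
    norm === 'package.json' ||
    norm === 'package-lock.json' ||
    norm === 'Dockerfile' ||
    norm.startsWith('container/Dockerfile') ||
    norm.startsWith('src/')
  );
}

/** True if any absolute path, made relative to projectRoot, needs a host rebuild. */
function requiresFullHostRebuildSketch(absPaths: string[], projectRoot: string): boolean {
  return absPaths.some((abs) => isHostRebuildPath(path.relative(projectRoot, abs)));
}
```

Because the predicate compares against root-anchored names such as `package.json`, a nested file like `groups/main/package.json` would not be flagged under this sketch.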
@@ -0,0 +1,393 @@
|
||||
/**
 * Builder-agent swap execution.
 *
 * Called from the approval handler on "approve" for a pending swap. The
 * flow: capture pre-swap state → apply worktree files to swap targets →
 * `git commit --only` those paths to main → conditional image rebuild →
 * restart affected processes (container for group-level, host for
 * host-level) → transition pending_swaps to `awaiting_confirmation`.
 *
 * Rollback is implemented in `deadman.ts` and uses `pre_swap_sha` +
 * `git checkout <sha> -- <paths>` as the one rollback mechanism (no
 * separate per-file blob table).
 */
|
||||
import { execFileSync } from 'child_process';
|
||||
import fs from 'fs';
|
||||
import path from 'path';
|
||||
|
||||
import { DATA_DIR } from '../config.js';
|
||||
import { getAgentGroup } from '../db/agent-groups.js';
|
||||
import { getDb } from '../db/connection.js';
|
||||
import {
|
||||
getPendingSwap,
|
||||
resetSwapForRetry,
|
||||
setSwapPreSwapState,
|
||||
} from '../db/pending-swaps.js';
|
||||
import { log } from '../log.js';
|
||||
import type { PendingSwap } from '../types.js';
|
||||
import { classifyDiff, type ClassifiedFile } from './classifier.js';
|
||||
import { worktreePathFor } from './worktree.js';
|
||||
|
||||
const PROJECT_ROOT = process.cwd();
|
||||
|
||||
/** Run a git command in a given cwd; throw with stderr (or stdout) on failure. */
function git(args: string[], cwd: string): string {
  try {
    return execFileSync('git', args, { cwd, encoding: 'utf8', stdio: ['ignore', 'pipe', 'pipe'] }).trim();
  } catch (err) {
    const e = err as { stderr?: Buffer | string; stdout?: Buffer | string; message?: string };
    const stderr = typeof e.stderr === 'string' ? e.stderr : (e.stderr?.toString() ?? '');
    // `git commit` prints "nothing to commit" on stdout, not stderr. Fall back
    // to stdout so callers that match on that text can detect no-op commits.
    const stdout = typeof e.stdout === 'string' ? e.stdout : (e.stdout?.toString() ?? '');
    const detail = stderr.trim() || stdout.trim() || e.message || 'unknown error';
    throw new Error(`git ${args.join(' ')} failed: ${detail}`);
  }
}
|
||||
|
||||
export interface SwapSummary {
|
||||
overallSummary: string;
|
||||
perFileSummaries: Record<string, string>;
|
||||
classifiedFiles: Array<{ path: string; classification: 'group' | 'host' }>;
|
||||
touchesMigrations: boolean;
|
||||
}
|
||||
|
||||
/**
|
||||
* Decode the summary_json blob written by `handleRequestSwap`. Host-side
|
||||
* consumers (approval card rendering, swap execution) need the structured
|
||||
* form; we centralize the parse + validate here.
|
||||
*/
|
||||
export function parseSwapSummary(swap: PendingSwap): SwapSummary {
|
||||
const parsed = JSON.parse(swap.summary_json) as Partial<SwapSummary>;
|
||||
return {
|
||||
overallSummary: parsed.overallSummary ?? '',
|
||||
perFileSummaries: parsed.perFileSummaries ?? {},
|
||||
classifiedFiles: parsed.classifiedFiles ?? [],
|
||||
touchesMigrations: parsed.touchesMigrations ?? false,
|
||||
};
|
||||
}
|
||||
|
||||
/** True for repo-relative paths whose modification requires a host-wide rebuild/restart. */
|
||||
function isHostRebuildPath(relPath: string): boolean {
|
||||
const norm = relPath.replace(/\\/g, '/');
|
||||
return (
|
||||
norm === 'package.json' ||
|
||||
norm === 'package-lock.json' ||
|
||||
norm === 'Dockerfile' ||
|
||||
norm.startsWith('container/Dockerfile') ||
|
||||
norm.startsWith('src/')
|
||||
);
|
||||
}
|
||||
|
||||
/**
|
||||
* Capture pre-swap state so rollback has something to restore to:
|
||||
* - main HEAD SHA → pending_swaps.pre_swap_sha
|
||||
* - a full copy of the central DB → data/backups/swap-<id>.sqlite
|
||||
* SQLite is backed up via better-sqlite3's `db.backup()` which is
|
||||
* crash-safe and doesn't require stopping the app.
|
||||
*/
|
||||
export async function captureSwapPreState(requestId: string): Promise<void> {
|
||||
const preSwapSha = git(['rev-parse', 'HEAD'], PROJECT_ROOT);
|
||||
|
||||
const backupsDir = path.join(DATA_DIR, 'backups');
|
||||
if (!fs.existsSync(backupsDir)) {
|
||||
fs.mkdirSync(backupsDir, { recursive: true });
|
||||
}
|
||||
const snapshotPath = path.join(backupsDir, `swap-${requestId}.sqlite`);
|
||||
// better-sqlite3 backup returns a Promise; await it before persisting the
|
||||
// path so a rollback that reads db_snapshot_path always finds a valid file.
|
||||
await (getDb() as unknown as { backup: (dst: string) => Promise<unknown> }).backup(snapshotPath);
|
||||
|
||||
setSwapPreSwapState(requestId, preSwapSha, snapshotPath);
|
||||
log.info('Swap pre-state captured', { requestId, preSwapSha, snapshotPath });
|
||||
}
|
||||
|
||||
/**
|
||||
* Apply the approved commit's file contents to their swap targets.
|
||||
*
|
||||
* Critical correctness property: this reads from the committed tree at
|
||||
* `pending_swaps.commit_sha`, NOT from the worktree's working files. The
|
||||
* dev agent is frozen at request_swap time (see handleRequestSwap), but
|
||||
* even if the freeze didn't exist, reading from the commit ensures we
|
||||
* apply EXACTLY what the approver reviewed — no post-submission edits
|
||||
* can sneak in by editing the working tree.
|
||||
*
|
||||
* File changes are discovered via `git diff --name-status main..<sha>`
|
||||
* inside the worktree. For A/M files we `git show <sha>:<path>` to get
|
||||
* the committed content. For D files we delete the target.
|
||||
*/
|
||||
export function applySwapFiles(requestId: string): string[] {
|
||||
const swap = getPendingSwap(requestId);
|
||||
if (!swap) throw new Error(`applySwapFiles: no pending_swaps row for ${requestId}`);
|
||||
if (!swap.commit_sha) {
|
||||
throw new Error(`applySwapFiles: pending_swaps row ${requestId} has no commit_sha`);
|
||||
}
|
||||
|
||||
const worktreePath = worktreePathFor(requestId);
|
||||
if (!fs.existsSync(worktreePath)) {
|
||||
throw new Error(`applySwapFiles: worktree missing at ${worktreePath}`);
|
||||
}
|
||||
|
||||
const originating = getAgentGroup(swap.originating_group_id);
|
||||
if (!originating) {
|
||||
throw new Error(`applySwapFiles: originating group ${swap.originating_group_id} missing`);
|
||||
}
|
||||
|
||||
// Enumerate every path that changed in the reviewed commit relative
|
||||
// to main. Pairs each path with its A/M/D status. --no-renames keeps
|
||||
// the parsing simple (a rename shows up as D+A).
|
||||
const nameStatus = git(
|
||||
['diff', '--name-status', '--no-renames', `main..${swap.commit_sha}`],
|
||||
worktreePath,
|
||||
);
|
||||
|
||||
const changes: Array<{ status: 'A' | 'M' | 'D'; path: string }> = [];
|
||||
for (const line of nameStatus.split('\n')) {
|
||||
if (!line.trim()) continue;
|
||||
const [statusRaw, ...pathParts] = line.split('\t');
|
||||
const s = statusRaw.charAt(0) as 'A' | 'M' | 'D';
|
||||
const p = pathParts.join('\t');
|
||||
if (s === 'A' || s === 'M' || s === 'D') {
|
||||
changes.push({ status: s, path: p });
|
||||
}
|
||||
}
|
||||
|
||||
const classified = classifyDiff(
|
||||
changes.map((c) => c.path),
|
||||
{
|
||||
projectRoot: PROJECT_ROOT,
|
||||
dataDir: DATA_DIR,
|
||||
originatingGroupId: swap.originating_group_id,
|
||||
originatingGroupFolder: originating.folder,
|
||||
},
|
||||
);
|
||||
|
||||
const statusByPath = new Map<string, 'A' | 'M' | 'D'>(
|
||||
changes.map((c) => [c.path, c.status]),
|
||||
);
|
||||
|
||||
const touchedAbs: string[] = [];
|
||||
for (const file of classified.files) {
|
||||
const status = statusByPath.get(file.path) ?? 'M';
|
||||
const dst = file.targetAbsPath;
|
||||
|
||||
if (status === 'D') {
|
||||
// File was deleted in the reviewed commit — mirror by removing target.
|
||||
if (fs.existsSync(dst)) fs.rmSync(dst);
|
||||
} else {
|
||||
// A or M: read the file content at the reviewed commit via `git show`.
|
||||
// Use no encoding so we get a Buffer (safe for binary files too).
|
||||
let content: Buffer;
|
||||
try {
|
||||
content = execFileSync('git', ['show', `${swap.commit_sha}:${file.path}`], {
|
||||
cwd: worktreePath,
|
||||
stdio: ['ignore', 'pipe', 'pipe'],
|
||||
maxBuffer: 20 * 1024 * 1024,
|
||||
});
|
||||
} catch (err) {
|
||||
throw new Error(
|
||||
`git show ${swap.commit_sha}:${file.path} failed: ${
|
||||
err instanceof Error ? err.message : String(err)
|
||||
}`,
|
||||
);
|
||||
}
|
||||
const dir = path.dirname(dst);
|
||||
if (!fs.existsSync(dir)) fs.mkdirSync(dir, { recursive: true });
|
||||
fs.writeFileSync(dst, content);
|
||||
}
|
||||
touchedAbs.push(dst);
|
||||
}
|
||||
|
||||
log.info('Swap files applied from committed tree', {
|
||||
requestId,
|
||||
commitSha: swap.commit_sha,
|
||||
fileCount: classified.files.length,
|
||||
hostCount: classified.hostPaths.length,
|
||||
groupCount: classified.files.length - classified.hostPaths.length,
|
||||
});
|
||||
return touchedAbs;
|
||||
}
|
||||
|
||||
/**
 * Stage and commit exactly the swap's touched paths to main, using
 * `git add <paths>` + `git commit -- <paths>`. Leaves any unrelated
 * uncommitted state in main untouched. Returns the new commit SHA.
 *
 * The absolute paths are converted to repo-relative paths before being
 * passed to git, so the pathspecs are interpreted relative to PROJECT_ROOT,
 * which is also the cwd of every git invocation here.
 */
|
||||
export function commitSwap(requestId: string, touchedAbs: string[], summary: string): string {
|
||||
if (touchedAbs.length === 0) return git(['rev-parse', 'HEAD'], PROJECT_ROOT);
|
||||
|
||||
const relPaths = touchedAbs.map((abs) => path.relative(PROJECT_ROOT, abs));
|
||||
|
||||
// Stage everything we touched. -- disambiguates path args from refs.
|
||||
git(['add', '--', ...relPaths], PROJECT_ROOT);
|
||||
|
||||
// Commit only the staged swap paths. If there are no changes (e.g. the
|
||||
// swap was a no-op because the worktree matched the current state), git
|
||||
// will exit non-zero; treat that as success and return current HEAD.
|
||||
const message = `swap ${requestId}: ${summary}`.slice(0, 500);
|
||||
try {
|
||||
git(['commit', '-m', message, '--', ...relPaths], PROJECT_ROOT);
|
||||
} catch (err) {
|
||||
const msg = err instanceof Error ? err.message : String(err);
|
||||
if (msg.includes('nothing to commit') || msg.includes('no changes added')) {
|
||||
log.info('Swap commit was a no-op', { requestId });
|
||||
} else {
|
||||
throw err;
|
||||
}
|
||||
}
|
||||
|
||||
const sha = git(['rev-parse', 'HEAD'], PROJECT_ROOT);
|
||||
log.info('Swap committed to main', { requestId, sha });
|
||||
return sha;
|
||||
}
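
The two small decisions inside `commitSwap` (capping the commit message and recognizing a no-op commit) can be sketched as pure functions. This is a minimal illustration, not part of the module: `swapCommitMessage` and `isNoOpCommitError` are hypothetical names. The sketch only covers the string check; whether git's "nothing to commit" text actually reaches the error message depends on which stream git writes it to and on how the `git()` wrapper builds its message, which is assumed here rather than verified.

```typescript
// Hypothetical helpers illustrating commitSwap's message cap and no-op check.
const MAX_MESSAGE_LENGTH = 500; // same cap as `.slice(0, 500)` in commitSwap

function swapCommitMessage(requestId: string, summary: string): string {
  // Build the message first, then truncate, so the prefix is always kept.
  return `swap ${requestId}: ${summary}`.slice(0, MAX_MESSAGE_LENGTH);
}

function isNoOpCommitError(err: unknown): boolean {
  // Assumption: the thrown error's message contains git's own wording.
  const msg = err instanceof Error ? err.message : String(err);
  return msg.includes('nothing to commit') || msg.includes('no changes added');
}

const shortMessage = swapCommitMessage('r1', 'fix typo');
const cappedMessage = swapCommitMessage('r1', 'x'.repeat(1000));
const noOp = isNoOpCommitError(new Error('git commit failed: nothing to commit, working tree clean'));
const realFailure = isNoOpCommitError(new Error('git commit failed: fatal: unable to write index'));
```

Keeping the check in one place would also let `rollbackSwapFiles` reuse it, since that function repeats the same two substring tests.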
|
||||
|
||||
/**
 * Restore the files a swap touched back to their pre-swap state, then
 * record a forward-only revert commit. Used on deadman timeout, on
 * explicit rollback, and by bailSwapForRetry after a failed swap.
 */
export function rollbackSwapFiles(swap: PendingSwap): void {
|
||||
if (!swap.pre_swap_sha) {
|
||||
log.warn('rollbackSwapFiles called with no pre_swap_sha', { requestId: swap.request_id });
|
||||
return;
|
||||
}
|
||||
const summary = parseSwapSummary(swap);
|
||||
const relPaths = summary.classifiedFiles.map((f) => {
|
||||
// Re-compute the on-disk target for rollback. The pre_swap_sha is on main,
|
||||
// so `git checkout <sha> -- <relative-path>` always refers to repo paths.
|
||||
// Group-level targets under data/v2-sessions/... ARE repo paths thanks to
|
||||
// the gitignore carve-out, so this works uniformly.
|
||||
return targetRepoRelPath(f.path, swap.originating_group_id);
|
||||
});
|
||||
|
||||
try {
|
||||
git(['checkout', swap.pre_swap_sha, '--', ...relPaths], PROJECT_ROOT);
|
||||
} catch (err) {
|
||||
log.error('git checkout during rollback failed', { requestId: swap.request_id, err });
|
||||
return;
|
||||
}
|
||||
|
||||
// Record a forward-only revert commit so main's history shows what reverted.
|
||||
try {
|
||||
git(['add', '--', ...relPaths], PROJECT_ROOT);
|
||||
git(
|
||||
['commit', '-m', `rollback ${swap.request_id}: deadman timeout`, '--', ...relPaths],
|
||||
PROJECT_ROOT,
|
||||
);
|
||||
} catch (err) {
|
||||
const msg = err instanceof Error ? err.message : String(err);
|
||||
if (!(msg.includes('nothing to commit') || msg.includes('no changes added'))) {
|
||||
log.error('Revert commit failed', { requestId: swap.request_id, err });
|
||||
}
|
||||
}
|
||||
log.info('Swap files rolled back', { requestId: swap.request_id, preSwapSha: swap.pre_swap_sha });
|
||||
}
|
||||
|
||||
/**
|
||||
* Compute the repo-relative path where a worktree path lands on disk. This
|
||||
* mirrors classifier.ts::classifyPath but using swap metadata — needed for
|
||||
* rollback because the classifier options weren't persisted, and by the
|
||||
* promote flow which copies from the committed per-group state into the
|
||||
* repo template.
|
||||
*
|
||||
* Exported so tests can lock the mapping against the classifier's rules.
|
||||
*/
|
||||
export function targetRepoRelPath(
|
||||
worktreeRelPath: string,
|
||||
originatingGroupId: string,
|
||||
): string {
|
||||
const norm = worktreeRelPath.replace(/\\/g, '/');
|
||||
if (norm.startsWith('container/agent-runner/src/')) {
|
||||
const rel = norm.slice('container/agent-runner/src/'.length);
|
||||
return path.posix.join('data', 'v2-sessions', originatingGroupId, 'agent-runner-src', rel);
|
||||
}
|
||||
if (norm.startsWith('container/skills/')) {
|
||||
const rel = norm.slice('container/skills/'.length);
|
||||
return path.posix.join(
|
||||
'data',
|
||||
'v2-sessions',
|
||||
originatingGroupId,
|
||||
'.claude-shared',
|
||||
'skills',
|
||||
rel,
|
||||
);
|
||||
}
|
||||
return norm;
|
||||
}
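
To make the mapping concrete, here is a small self-contained restatement with three sample inputs. It is a sketch under assumptions: `mapWorktreePath` is a hypothetical name, the group id `group-1` and the sample paths are invented, and plain string joins stand in for `path.posix.join` (so, unlike the real function, `..` segments are not normalized).

```typescript
// Illustrative restatement of the worktree-path -> repo-path mapping.
const RUNNER_PREFIX = 'container/agent-runner/src/';
const SKILLS_PREFIX = 'container/skills/';

function mapWorktreePath(worktreeRelPath: string, groupId: string): string {
  // Normalize Windows separators so the prefix checks work on any host.
  const norm = worktreeRelPath.replace(/\\/g, '/');
  if (norm.startsWith(RUNNER_PREFIX)) {
    return `data/v2-sessions/${groupId}/agent-runner-src/${norm.slice(RUNNER_PREFIX.length)}`;
  }
  if (norm.startsWith(SKILLS_PREFIX)) {
    return `data/v2-sessions/${groupId}/.claude-shared/skills/${norm.slice(SKILLS_PREFIX.length)}`;
  }
  // Anything else is a host-level path and lands at the same repo location.
  return norm;
}

const mapped = [
  'container/agent-runner/src/index.ts',
  'container/skills/search/SKILL.md',
  'src\\delivery.ts',
].map((p) => mapWorktreePath(p, 'group-1'));
```

The first two inputs are redirected into the group's private session directory; the third is only separator-normalized.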
|
||||
|
||||
/**
|
||||
* Restore the central DB from a pre-swap snapshot. better-sqlite3 doesn't
|
||||
* support live restore, so we copy the snapshot file over data/v2.db. This
|
||||
* MUST be called during the host-level swap restart window where the DB
|
||||
* connection can be reopened; doing it while the running process has the
|
||||
* DB open would corrupt in-flight transactions.
|
||||
*/
|
||||
export function restoreDbFromSnapshot(swap: PendingSwap): void {
|
||||
if (!swap.db_snapshot_path || !fs.existsSync(swap.db_snapshot_path)) {
|
||||
log.warn('No DB snapshot to restore', { requestId: swap.request_id });
|
||||
return;
|
||||
}
|
||||
const dbPath = path.join(DATA_DIR, 'v2.db');
|
||||
fs.copyFileSync(swap.db_snapshot_path, dbPath);
|
||||
log.info('Central DB restored from snapshot', {
|
||||
requestId: swap.request_id,
|
||||
from: swap.db_snapshot_path,
|
||||
to: dbPath,
|
||||
});
|
||||
}
|
||||
|
||||
/**
 * Whether a swap's diff requires a host-level rebuild+restart vs just a
 * group-level container restart. The classifier's overall label is our
 * guide: `group` → group-level; `host`/`combined` → host-level.
 */
export function isHostLevelSwap(swap: PendingSwap): boolean {
  return swap.classification === 'host' || swap.classification === 'combined';
}
|
||||
/**
|
||||
* Bail out of a swap execution after a failure (apply / commit / build
|
||||
* error), leaving the dev agent and its worktree intact so the dev agent
|
||||
* can fix the issue and retry via another `request_swap` call.
|
||||
*
|
||||
* Behavior:
|
||||
* 1. If we got as far as captureSwapPreState (pre_swap_sha is set),
|
||||
* run rollbackSwapFiles to restore file contents and record a
|
||||
* forward-only revert commit on main.
|
||||
* 2. Reset the pending_swaps row to `pending_approval` with all
|
||||
* in-progress fields cleared — dev agent's next request_swap will
|
||||
* find the row via getSwapForDevAgent and re-populate it.
|
||||
* 3. Caller is responsible for notifying the dev agent with the actual
|
||||
* error message and deleting the stale pending_approval row.
|
||||
*
|
||||
* This is the RETRYABLE failure path. Explicit rejection by the approver
|
||||
* is a different flow (terminal teardown) and is handled in index.ts.
|
||||
*/
|
||||
export function bailSwapForRetry(requestId: string): void {
|
||||
const swap = getPendingSwap(requestId);
|
||||
if (!swap) {
|
||||
log.warn('bailSwapForRetry: swap not found', { requestId });
|
||||
return;
|
||||
}
|
||||
|
||||
// Rollback on-disk file contents if we got far enough to snapshot main.
|
||||
if (swap.pre_swap_sha) {
|
||||
try {
|
||||
rollbackSwapFiles(swap);
|
||||
} catch (err) {
|
||||
log.error('rollbackSwapFiles threw during bail', { requestId, err });
|
||||
}
|
||||
}
|
||||
|
||||
// Reset the row so the dev agent can retry.
|
||||
resetSwapForRetry(requestId);
|
||||
log.info('Swap bailed for retry — dev agent still alive', { requestId });
|
||||
}
|
||||
|
||||
/**
|
||||
* Whether any of the touched repo paths require a full host-wide rebuild
|
||||
* (as opposed to just restarting the originating container). Used by the
|
||||
* caller to decide: `npm run build` in the root, rebuild base image, etc.
|
||||
*/
|
||||
export function requiresFullHostRebuild(touchedAbs: string[]): boolean {
|
||||
return touchedAbs.some((abs) => isHostRebuildPath(path.relative(PROJECT_ROOT, abs)));
|
||||
}
|
||||
@@ -0,0 +1,219 @@
|
||||
/**
|
||||
* Builder-agent worktree management.
|
||||
*
|
||||
* Given an originating agent group, creates a git worktree containing a full
|
||||
* copy of the repo (via `git worktree add`), then overlays the originating
|
||||
* group's private per-group runner and skills copies over the repo template
|
||||
* so the dev agent sees the originating's actual current state, not a
|
||||
* pristine template.
|
||||
*
|
||||
* The worktree is mounted read-write into the dev agent's container at
|
||||
* /worktree, giving it write access to the whole repo *copy* (minus the
|
||||
* shadow-mounted .env and excluded data/store paths). The dev agent's own
|
||||
* runtime mounts are unchanged — it's running the live code, editing the
|
||||
* copy. Self-modification is structurally impossible.
|
||||
*/
|
||||
import { execFileSync } from 'child_process';
|
||||
import fs from 'fs';
|
||||
import path from 'path';
|
||||
|
||||
import { DATA_DIR } from '../config.js';
|
||||
import { log } from '../log.js';
|
||||
|
||||
const PROJECT_ROOT = process.cwd();
|
||||
const WORKTREES_DIR = path.join(PROJECT_ROOT, '.worktrees');
|
||||
|
||||
/**
|
||||
* Absolute path to a dev worktree for a given request id. Centralized so
|
||||
* every consumer (worktree.ts, swap.ts, container-runner.ts) agrees on the
|
||||
* layout.
|
||||
*/
|
||||
export function worktreePathFor(requestId: string): string {
|
||||
return path.join(WORKTREES_DIR, `dev-${requestId}`);
|
||||
}
|
||||
|
||||
/** Branch name convention for dev worktrees. */
|
||||
export function devBranchFor(requestId: string): string {
|
||||
return `dev/${requestId}`;
|
||||
}
|
||||
|
||||
/**
 * Run a git command synchronously in a given cwd. Returns trimmed stdout.
 * Throws on non-zero exit. Uses execFileSync to avoid shell interpolation.
 */
function git(args: string[], cwd: string): string {
  try {
    return execFileSync('git', args, { cwd, encoding: 'utf8', stdio: ['ignore', 'pipe', 'pipe'] }).trim();
  } catch (err) {
    const e = err as { stderr?: Buffer | string; message?: string };
    const stderr = typeof e.stderr === 'string' ? e.stderr : (e.stderr?.toString() ?? '');
    throw new Error(`git ${args.join(' ')} failed: ${stderr || e.message || 'unknown error'}`);
  }
}
|
||||
/**
 * Refuse early if the main repo is in a state git can't safely swap against
 * (mid-merge, mid-rebase, cherry-pick, bisect). We do NOT try to auto-resolve.
 * Uncommitted working-tree changes are fine because the swap commits with
 * explicit pathspecs (`git commit -- <paths>`, which behaves like `--only`),
 * so only the swap's paths are committed.
 */
export function assertGitCleanEnoughForSwap(): void {
|
||||
const gitDir = path.join(PROJECT_ROOT, '.git');
|
||||
const weirdFiles = ['MERGE_HEAD', 'REBASE_HEAD', 'CHERRY_PICK_HEAD', 'BISECT_LOG'];
|
||||
for (const f of weirdFiles) {
|
||||
if (fs.existsSync(path.join(gitDir, f))) {
|
||||
throw new Error(
|
||||
`cannot start swap: git repo is in an unresolved state (${f} exists). ` +
|
||||
`resolve merge/rebase/etc in the terminal before running the builder agent.`,
|
||||
);
|
||||
}
|
||||
}
|
||||
const rebaseDir = path.join(gitDir, 'rebase-merge');
|
||||
const rebaseApply = path.join(gitDir, 'rebase-apply');
|
||||
if (fs.existsSync(rebaseDir) || fs.existsSync(rebaseApply)) {
|
||||
throw new Error(
|
||||
'cannot start swap: git repo is mid-rebase. resolve it in the terminal first.',
|
||||
);
|
||||
}
|
||||
}
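
The guard above boils down to "does any in-progress marker exist under `.git`?". A filesystem-free sketch of that check follows. `findBlockingGitState` is a hypothetical name, and the `exists` callback stands in for `fs.existsSync(path.join(gitDir, name))`. The marker list is copied from the function above; the two sets are invented sample states.

```typescript
// Hypothetical, filesystem-free variant of the in-progress-state check.
const IN_PROGRESS_MARKERS: string[] = [
  'MERGE_HEAD',
  'REBASE_HEAD',
  'CHERRY_PICK_HEAD',
  'BISECT_LOG',
  'rebase-merge',
  'rebase-apply',
];

// Returns the first marker found, or null when the repo is safe to swap against.
function findBlockingGitState(exists: (name: string) => boolean): string | null {
  for (const marker of IN_PROGRESS_MARKERS) {
    if (exists(marker)) return marker;
  }
  return null;
}

// Sample .git directory listings (invented for illustration).
const cleanGitDir = new Set<string>(['HEAD', 'index']);
const midMergeGitDir = new Set<string>(['HEAD', 'index', 'MERGE_HEAD']);

const cleanResult = findBlockingGitState((name) => cleanGitDir.has(name));
const mergeResult = findBlockingGitState((name) => midMergeGitDir.has(name));
```

Passing the existence check in as a callback keeps the rule testable without creating a real repository; the real function throws instead of returning the marker.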
|
||||
|
||||
/**
|
||||
* Create a fresh worktree for a dev-agent request and overlay the originating
|
||||
* group's private runner + skills copies over the repo template. Returns the
|
||||
* absolute worktree path.
|
||||
*
|
||||
* Idempotency: if the worktree path already exists (from a previous request
|
||||
* or crash), it is removed first via `git worktree remove --force` so the
|
||||
* creation is clean.
|
||||
*/
|
||||
export function createDevWorktree(
|
||||
requestId: string,
|
||||
originatingGroupId: string,
|
||||
): string {
|
||||
assertGitCleanEnoughForSwap();
|
||||
|
||||
if (!fs.existsSync(WORKTREES_DIR)) {
|
||||
fs.mkdirSync(WORKTREES_DIR, { recursive: true });
|
||||
}
|
||||
|
||||
const worktreePath = worktreePathFor(requestId);
|
||||
const branch = devBranchFor(requestId);
|
||||
|
||||
// If a prior worktree dir exists at this path, remove it first. `git
|
||||
// worktree remove` cleans up the worktree list; we then rm -rf as a
|
||||
// belt-and-suspenders in case the dir is orphaned but not tracked.
|
||||
if (fs.existsSync(worktreePath)) {
|
||||
try {
|
||||
git(['worktree', 'remove', '--force', worktreePath], PROJECT_ROOT);
|
||||
} catch {
|
||||
/* best-effort; dir might be orphaned */
|
||||
}
|
||||
if (fs.existsSync(worktreePath)) {
|
||||
fs.rmSync(worktreePath, { recursive: true, force: true });
|
||||
}
|
||||
}
|
||||
|
||||
// Clean up any stale branch with the same name (unlikely but possible
|
||||
// after a crash).
|
||||
try {
|
||||
git(['branch', '-D', branch], PROJECT_ROOT);
|
||||
} catch {
|
||||
/* branch didn't exist — fine */
|
||||
}
|
||||
|
||||
git(['worktree', 'add', '-b', branch, worktreePath, 'HEAD'], PROJECT_ROOT);
|
||||
|
||||
// Overlay: copy the originating group's private per-group dirs over the
|
||||
// worktree's repo-template paths. This makes the dev agent's view match
|
||||
// what the originating group is actually running, not the pristine
|
||||
// template.
|
||||
const sessDir = path.join(DATA_DIR, 'v2-sessions', originatingGroupId);
|
||||
overlayDir(
|
||||
path.join(sessDir, 'agent-runner-src'),
|
||||
path.join(worktreePath, 'container', 'agent-runner', 'src'),
|
||||
);
|
||||
overlayDir(
|
||||
path.join(sessDir, '.claude-shared', 'skills'),
|
||||
path.join(worktreePath, 'container', 'skills'),
|
||||
);
|
||||
|
||||
  // Shadow .env with a placeholder file so the dev agent can't read
  // credentials if a real .env somehow ended up in the working tree (for
  // example, one that was committed even though it is gitignored).
  fs.writeFileSync(path.join(worktreePath, '.env'), '# shadowed by builder-agent\n');
|
||||
log.info('Dev worktree created', {
|
||||
requestId,
|
||||
originatingGroupId,
|
||||
worktreePath,
|
||||
branch,
|
||||
});
|
||||
|
||||
return worktreePath;
|
||||
}
|
||||
|
||||
/**
 * Overlay the contents of `src` onto `dst`, overwriting any existing files.
 * Missing `src` is a silent no-op (some groups may not have customized their
 * runner/skills yet).
 */
function overlayDir(src: string, dst: string): void {
  if (!fs.existsSync(src)) return;
  if (!fs.existsSync(dst)) {
    fs.mkdirSync(dst, { recursive: true });
  }
  fs.cpSync(src, dst, { recursive: true, force: true });
}
|
||||
/**
|
||||
* Tear down a worktree: remove it via `git worktree remove --force`, delete
|
||||
* its branch, and rm -rf the directory as a final safety net. Idempotent.
|
||||
*/
|
||||
export function removeDevWorktree(requestId: string): void {
|
||||
const worktreePath = worktreePathFor(requestId);
|
||||
const branch = devBranchFor(requestId);
|
||||
|
||||
try {
|
||||
git(['worktree', 'remove', '--force', worktreePath], PROJECT_ROOT);
|
||||
} catch {
|
||||
/* worktree wasn't registered — fine */
|
||||
}
|
||||
if (fs.existsSync(worktreePath)) {
|
||||
fs.rmSync(worktreePath, { recursive: true, force: true });
|
||||
}
|
||||
try {
|
||||
git(['branch', '-D', branch], PROJECT_ROOT);
|
||||
} catch {
|
||||
/* branch didn't exist — fine */
|
||||
}
|
||||
|
||||
log.info('Dev worktree removed', { requestId, worktreePath });
|
||||
}
|
||||
|
||||
/**
|
||||
* Return the list of paths changed at a specific commit relative to main.
|
||||
* Always uses the range syntax `main..<sha>` so the result reflects what's
|
||||
* in the committed tree — NOT what's in the working-tree. This matters:
|
||||
* the dev agent may still be running when request_swap is processed, and
|
||||
* we must not pick up post-submission working-tree edits into the
|
||||
* approved diff.
|
||||
*/
|
||||
export function diffChangedPathsAtCommit(requestId: string, commitSha: string): string[] {
|
||||
const worktreePath = worktreePathFor(requestId);
|
||||
const out = git(['diff', '--name-only', `main..${commitSha}`], worktreePath);
|
||||
return out
|
||||
.split('\n')
|
||||
.map((s) => s.trim())
|
||||
.filter((s) => s.length > 0);
|
||||
}
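
The post-processing of `git diff --name-only` output is worth a quick illustration, mainly because of the empty-diff case: splitting an empty string yields one empty entry, which the filter removes. The sketch below isolates that step; `parseNameOnlyOutput` is a hypothetical name and the sample output is invented.

```typescript
// Hypothetical helper mirroring the split/trim/filter step above.
function parseNameOnlyOutput(out: string): string[] {
  return out
    .split('\n')
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Two changed paths followed by trailing blank lines.
const sampleOutput = 'src/delivery.ts\ncontainer/skills/search/SKILL.md\n\n';
const changedPaths = parseNameOnlyOutput(sampleOutput);

// An empty diff must produce an empty list, not [''].
const noChanges = parseNameOnlyOutput('');
```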
|
||||
|
||||
/** Current HEAD SHA inside a dev worktree. */
|
||||
export function worktreeHeadSha(requestId: string): string {
|
||||
const worktreePath = worktreePathFor(requestId);
|
||||
return git(['rev-parse', 'HEAD'], worktreePath);
|
||||
}
|
||||
|
||||
/** Current HEAD SHA on main (captured as pre_swap_sha). */
|
||||
export function mainHeadSha(): string {
|
||||
return git(['rev-parse', 'HEAD'], PROJECT_ROOT);
|
||||
}
|
||||
@@ -134,7 +134,6 @@ describe('channel + router integration', () => {
|
||||
name: 'Test Agent',
|
||||
folder: 'test-agent',
|
||||
agent_provider: null,
|
||||
container_config: null,
|
||||
created_at: now(),
|
||||
});
|
||||
createMessagingGroup({
|
||||
|
||||
@@ -0,0 +1,117 @@
|
||||
/**
 * Per-group container config, stored as a plain JSON file at
 * `groups/<folder>/container.json`. Replaces the former
 * `agent_groups.container_config` DB column.
 *
 * Shape:
 *   {
 *     mcpServers: { [name]: { command, args, env } }
 *     packages: { apt: string[], npm: string[] }
 *     imageTag?: string  // set by buildAgentGroupImage on rebuild
 *     additionalMounts?: Array<{hostPath, containerPath, readonly}>
 *   }
 *
 * All fields are optional: a missing file and a partial file both resolve
 * to sensible defaults. Writes are not atomic (a plain writeFileSync, no
 * write-then-rename). That ceremony is not worth it here because there is
 * only one writer in practice: the host, from the delivery thread that
 * processes approved system actions.
 */
import fs from 'fs';
|
||||
import path from 'path';
|
||||
|
||||
import { GROUPS_DIR } from './config.js';
|
||||
|
||||
export interface McpServerConfig {
|
||||
command: string;
|
||||
args?: string[];
|
||||
env?: Record<string, string>;
|
||||
}
|
||||
|
||||
export interface AdditionalMountConfig {
|
||||
hostPath: string;
|
||||
containerPath: string;
|
||||
readonly?: boolean;
|
||||
}
|
||||
|
||||
export interface ContainerConfig {
|
||||
mcpServers: Record<string, McpServerConfig>;
|
||||
packages: { apt: string[]; npm: string[] };
|
||||
imageTag?: string;
|
||||
additionalMounts: AdditionalMountConfig[];
|
||||
}
|
||||
|
||||
function emptyConfig(): ContainerConfig {
|
||||
return {
|
||||
mcpServers: {},
|
||||
packages: { apt: [], npm: [] },
|
||||
additionalMounts: [],
|
||||
};
|
||||
}
|
||||
|
||||
function configPath(folder: string): string {
|
||||
return path.join(GROUPS_DIR, folder, 'container.json');
|
||||
}
|
||||
|
||||
/**
 * Read the container config for a group, returning sensible defaults for
 * any missing fields (or an entirely empty config if the file is absent).
 * Never throws for missing or malformed files: a parse failure is reported
 * via console.error and the result falls back to an empty config.
 */
export function readContainerConfig(folder: string): ContainerConfig {
  const p = configPath(folder);
  if (!fs.existsSync(p)) return emptyConfig();
  try {
    const raw = JSON.parse(fs.readFileSync(p, 'utf8')) as Partial<ContainerConfig>;
    return {
      mcpServers: raw.mcpServers ?? {},
      packages: {
        apt: raw.packages?.apt ?? [],
        npm: raw.packages?.npm ?? [],
      },
      imageTag: raw.imageTag,
      additionalMounts: raw.additionalMounts ?? [],
    };
  } catch (err) {
    console.error(`[container-config] failed to parse ${p}: ${String(err)}`);
    return emptyConfig();
  }
}
|
||||
/**
|
||||
* Write the container config for a group, creating the groups/<folder>/
|
||||
* directory if necessary. Pretty-printed JSON so diffs in the activation
|
||||
* flow are reviewable.
|
||||
*/
|
||||
export function writeContainerConfig(folder: string, config: ContainerConfig): void {
|
||||
const p = configPath(folder);
|
||||
const dir = path.dirname(p);
|
||||
if (!fs.existsSync(dir)) fs.mkdirSync(dir, { recursive: true });
|
||||
fs.writeFileSync(p, JSON.stringify(config, null, 2) + '\n');
|
||||
}
|
||||
|
||||
/**
|
||||
* Apply a mutator function to a group's container config and persist the
|
||||
* result. Convenient for append-style changes like `install_packages` and
|
||||
* `add_mcp_server` handlers.
|
||||
*/
|
||||
export function updateContainerConfig(
|
||||
folder: string,
|
||||
mutate: (config: ContainerConfig) => void,
|
||||
): ContainerConfig {
|
||||
const config = readContainerConfig(folder);
|
||||
mutate(config);
|
||||
writeContainerConfig(folder, config);
|
||||
return config;
|
||||
}
|
||||
|
||||
/**
|
||||
* Initialize an empty container.json for a group if one doesn't already
|
||||
* exist. Idempotent — used from `group-init.ts`.
|
||||
*/
|
||||
export function initContainerConfig(folder: string): boolean {
|
||||
const p = configPath(folder);
|
||||
if (fs.existsSync(p)) return false;
|
||||
writeContainerConfig(folder, emptyConfig());
|
||||
return true;
|
||||
}
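
A short in-memory sketch of the read, mutate, write pattern used by `updateContainerConfig` may help when writing handlers such as `install_packages`. Everything here is an assumption made for illustration: `SketchConfig`, `withDefaults`, and `updateInMemory` are hypothetical names, the config is trimmed to two fields, and nothing touches the disk.

```typescript
interface PackagesConfig {
  apt: string[];
  npm: string[];
}

// Trimmed-down stand-in for ContainerConfig (imageTag and mounts omitted).
interface SketchConfig {
  mcpServers: Record<string, { command: string; args?: string[] }>;
  packages: PackagesConfig;
}

interface RawSketchConfig {
  mcpServers?: SketchConfig['mcpServers'];
  packages?: Partial<PackagesConfig>;
}

// Fill defaults for whatever a partially written container.json omits.
function withDefaults(raw: RawSketchConfig): SketchConfig {
  return {
    mcpServers: raw.mcpServers ?? {},
    packages: { apt: raw.packages?.apt ?? [], npm: raw.packages?.npm ?? [] },
  };
}

// In-memory equivalent of read -> mutate -> write.
function updateInMemory(raw: RawSketchConfig, mutate: (config: SketchConfig) => void): SketchConfig {
  const config = withDefaults(raw);
  mutate(config);
  return config;
}

// Append-style mutator: add an npm package only if it is not already listed.
const updatedConfig = updateInMemory({ packages: { apt: ['curl'] } }, (config) => {
  if (config.packages.npm.indexOf('typescript') === -1) {
    config.packages.npm.push('typescript');
  }
});
```

Because defaults are filled before the mutator runs, the mutator can push onto `packages.npm` without checking whether the array exists.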
|
||||
@@ -9,11 +9,15 @@ import path from 'path';
|
||||
|
||||
import { OneCLI } from '@onecli-sh/sdk';
|
||||
|
||||
import { worktreePathFor } from './builder-agent/worktree.js';
|
||||
import { CONTAINER_IMAGE, DATA_DIR, GROUPS_DIR, IDLE_TIMEOUT, ONECLI_URL, TIMEZONE } from './config.js';
|
||||
import { readContainerConfig, writeContainerConfig } from './container-config.js';
|
||||
import { CONTAINER_RUNTIME_BIN, hostGatewayArgs, readonlyMountArgs, stopContainer } from './container-runtime.js';
|
||||
import { getAgentGroup } from './db/agent-groups.js';
|
||||
import { getSwapForDevAgent } from './db/pending-swaps.js';
|
||||
import { getAdminsOfAgentGroup, getGlobalAdmins, getOwners } from './db/user-roles.js';
|
||||
import { initGroupFilesystem } from './group-init.js';
|
||||
import { stopTypingRefresh } from './delivery.js';
|
||||
import { log } from './log.js';
|
||||
import { validateAdditionalMounts } from './mount-security.js';
|
||||
import {
|
||||
@@ -85,6 +89,24 @@ async function spawnContainer(session: Session): Promise<void> {
|
||||
return;
|
||||
}
|
||||
|
||||
// Freeze gate: if this agent group is the dev_agent of an in-flight
|
||||
// swap that has already been submitted for approval (commit_sha set),
|
||||
// refuse to spawn the container. The dev agent stays offline through
|
||||
// the approval/deadman window so it can't make additional edits that
|
||||
// weren't part of what the approver reviewed. `bailSwapForRetry`
|
||||
// clears commit_sha, which implicitly unfreezes and allows the next
|
||||
// wake to spawn again.
|
||||
const devSwap = getSwapForDevAgent(agentGroup.id);
|
||||
if (devSwap && devSwap.commit_sha) {
|
||||
log.info('Refusing to spawn dev agent — frozen during code-change approval', {
|
||||
sessionId: session.id,
|
||||
agentGroup: agentGroup.name,
|
||||
requestId: devSwap.request_id,
|
||||
status: devSwap.status,
|
||||
});
|
||||
return;
|
||||
}
|
||||
|
||||
// Refresh the destination map and default reply routing so any admin
|
||||
// changes take effect on wake.
|
||||
writeDestinations(agentGroup.id, session.id);
|
||||
@@ -132,6 +154,7 @@ async function spawnContainer(session: Session): Promise<void> {
|
||||
clearTimeout(idleTimer);
|
||||
activeContainers.delete(session.id);
|
||||
markContainerStopped(session.id);
|
||||
stopTypingRefresh(session.id);
|
||||
log.info('Container exited', { sessionId: session.id, code, containerName });
|
||||
});
|
||||
|
||||
@@ -139,6 +162,7 @@ async function spawnContainer(session: Session): Promise<void> {
|
||||
clearTimeout(idleTimer);
|
||||
activeContainers.delete(session.id);
|
||||
markContainerStopped(session.id);
|
||||
stopTypingRefresh(session.id);
|
||||
log.error('Container spawn error', { sessionId: session.id, err });
|
||||
});
|
||||
}
|
||||
@@ -197,9 +221,27 @@ function buildMounts(agentGroup: AgentGroup, session: Session): VolumeMount[] {
|
||||
const groupRunnerDir = path.join(DATA_DIR, 'v2-sessions', agentGroup.id, 'agent-runner-src');
|
||||
mounts.push({ hostPath: groupRunnerDir, containerPath: '/app/src', readonly: false });
|
||||
|
||||
// Additional mounts from container config
|
||||
const containerConfig = agentGroup.container_config ? JSON.parse(agentGroup.container_config) : {};
|
||||
if (containerConfig.additionalMounts) {
|
||||
// Builder-agent worktree at /worktree — only added when this agent group
|
||||
// is the dev_agent of an in-flight swap. The dev agent edits the worktree
|
||||
// (a git copy of the repo) through this mount. Its own runtime code at
|
||||
// /app/src is unchanged — self-modification is structurally impossible.
|
||||
const swap = getSwapForDevAgent(agentGroup.id);
|
||||
if (swap) {
|
||||
const worktreeDir = worktreePathFor(swap.request_id);
|
||||
if (fs.existsSync(worktreeDir)) {
|
||||
mounts.push({ hostPath: worktreeDir, containerPath: '/worktree', readonly: false });
|
||||
} else {
|
||||
log.warn('Dev agent has in-flight swap but worktree dir is missing', {
|
||||
agentGroupId: agentGroup.id,
|
||||
requestId: swap.request_id,
|
||||
worktreeDir,
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
// Additional mounts from container config (groups/<folder>/container.json)
|
||||
const containerConfig = readContainerConfig(agentGroup.folder);
|
||||
if (containerConfig.additionalMounts && containerConfig.additionalMounts.length > 0) {
|
||||
const validated = validateAdditionalMounts(containerConfig.additionalMounts, agentGroup.name);
|
||||
mounts.push(...validated);
|
||||
}
|
||||
@@ -279,8 +321,8 @@ async function buildContainerArgs(
|
||||
}
|
||||
}
|
||||
|
||||
// Pass additional MCP servers from container config
|
||||
const containerConfig = agentGroup.container_config ? JSON.parse(agentGroup.container_config) : {};
|
||||
// Pass additional MCP servers from container config (groups/<folder>/container.json)
|
||||
const containerConfig = readContainerConfig(agentGroup.folder);
|
||||
if (containerConfig.mcpServers && Object.keys(containerConfig.mcpServers).length > 0) {
|
||||
args.push('-e', `NANOCLAW_MCP_SERVERS=${JSON.stringify(containerConfig.mcpServers)}`);
|
||||
}
|
||||
@@ -305,10 +347,9 @@ export async function buildAgentGroupImage(agentGroupId: string): Promise<void>
|
||||
const agentGroup = getAgentGroup(agentGroupId);
|
||||
if (!agentGroup) throw new Error('Agent group not found');
|
||||
|
||||
const containerConfig = agentGroup.container_config ? JSON.parse(agentGroup.container_config) : {};
|
||||
const packages = containerConfig.packages || { apt: [], npm: [] };
|
||||
const aptPackages = (packages.apt || []) as string[];
|
||||
const npmPackages = (packages.npm || []) as string[];
|
||||
const containerConfig = readContainerConfig(agentGroup.folder);
|
||||
const aptPackages = containerConfig.packages.apt;
|
||||
const npmPackages = containerConfig.packages.npm;
|
||||
|
||||
if (aptPackages.length === 0 && npmPackages.length === 0) {
|
||||
throw new Error('No packages to install. Use install_packages first.');
|
||||
@@ -340,10 +381,9 @@ export async function buildAgentGroupImage(agentGroupId: string): Promise<void>
|
||||
fs.unlinkSync(tmpDockerfile);
|
||||
}
|
||||
|
||||
// Store the image tag in container_config
|
||||
// Store the image tag in groups/<folder>/container.json
|
||||
containerConfig.imageTag = imageTag;
|
||||
const { updateAgentGroup } = await import('./db/agent-groups.js');
|
||||
updateAgentGroup(agentGroupId, { container_config: JSON.stringify(containerConfig) });
|
||||
writeContainerConfig(agentGroup.folder, containerConfig);
|
||||
|
||||
log.info('Per-agent-group image built', { agentGroupId, imageTag });
|
||||
}
|
||||
|
||||
@@ -8,9 +8,40 @@
|
||||
* namespace. The host uses this table both for routing (resolve name → ID)
|
||||
* and for permission checks (row exists ⇒ authorized).
|
||||
*/
|
||||
/**
|
||||
* ⚠️ DESTINATION PROJECTION INVARIANT — READ BEFORE ADDING NEW CALL SITES.
|
||||
*
|
||||
* `agent_destinations` in the central DB is the source of truth, but the
|
||||
* agent-runner container reads its destinations from a per-session
|
||||
* projection in `inbound.db`. That projection is written by
|
||||
* `writeDestinations(agentGroupId, sessionId)` in session-manager.ts.
|
||||
*
|
||||
* `spawnContainer` calls `writeDestinations` on every container wake, so a
|
||||
* fresh container always sees the latest destinations. BUT: a container
|
||||
* that is ALREADY running when you mutate the central table will keep
|
||||
* serving the stale projection until its next wake — the central write
|
||||
* does not propagate automatically.
|
||||
*
|
||||
* **Therefore: every time you call `createDestination` / `deleteDestination` /
|
||||
* `deleteAllDestinationsTouching` from code that runs while an agent's
|
||||
* container may be alive, you MUST also call `writeDestinations(agentGroupId,
|
||||
* sessionId)` for each affected session.** Forgetting this manifests as
|
||||
* "dropped: unknown destination" errors at send_message time.
|
||||
*
|
||||
* Affected call sites today (keep this list honest if you add more):
|
||||
* - src/delivery.ts::handleSystemAction case 'create_agent'
|
||||
* - src/builder-agent/handlers.ts::handleCreateDevAgent
|
||||
* - src/db/messaging-groups.ts::createMessagingGroupAgent
|
||||
*/
|
||||
import type { AgentDestination } from '../types.js';
|
||||
import { getDb } from './connection.js';
|
||||
|
||||
/**
|
||||
* ⚠️ Caller responsibility: after this returns, call
|
||||
* `writeDestinations(row.agent_group_id, <sessionId>)` for each active
|
||||
* session of that agent group so the change propagates to the running
|
||||
* container's inbound.db. See the top-of-file invariant.
|
||||
*/
|
||||
export function createDestination(row: AgentDestination): void {
|
||||
getDb()
|
||||
.prepare(
|
||||
@@ -51,12 +82,51 @@ export function hasDestination(agentGroupId: string, targetType: 'channel' | 'ag
|
||||
return !!row;
|
||||
}
|
||||
|
||||
/**
|
||||
* ⚠️ Caller responsibility: after this returns, call
|
||||
* `writeDestinations(agentGroupId, <sessionId>)` for each active session
|
||||
* so the deletion propagates to the running container's inbound.db.
|
||||
*/
|
||||
export function deleteDestination(agentGroupId: string, localName: string): void {
|
||||
getDb()
|
||||
.prepare('DELETE FROM agent_destinations WHERE agent_group_id = ? AND local_name = ?')
|
||||
.run(agentGroupId, localName);
|
||||
}
|
||||
|
||||
/**
|
||||
* Delete every destination row where this agent group is either the owner
|
||||
* or the target. Used when tearing down a dev agent after a swap request
|
||||
* completes/rolls-back — drops the bidirectional destinations in one call.
|
||||
*
|
||||
* ⚠️ Caller responsibility: not only does `agentGroupId`'s own session
|
||||
* projection need a refresh, but ALSO every OTHER agent group that had
|
||||
* `agentGroupId` as a destination target. Use `getDestinationReferencers`
|
||||
* below to find them BEFORE calling this (the rows are gone afterwards).
|
||||
*/
|
||||
export function deleteAllDestinationsTouching(agentGroupId: string): void {
|
||||
getDb()
|
||||
.prepare(
|
||||
'DELETE FROM agent_destinations WHERE agent_group_id = ? OR (target_type = ? AND target_id = ?)',
|
||||
)
|
||||
.run(agentGroupId, 'agent', agentGroupId);
|
||||
}
|
||||
|
||||
/**
|
||||
* Return the list of agent_group_ids that currently have a destination
|
||||
* row pointing at `targetAgentGroupId`. Call this BEFORE
|
||||
* `deleteAllDestinationsTouching` if you need to know whose session
|
||||
* projections to refresh after the delete — the rows are gone once the
|
||||
* delete runs.
|
||||
*/
|
||||
export function getDestinationReferencers(targetAgentGroupId: string): string[] {
|
||||
const rows = getDb()
|
||||
.prepare(
|
||||
"SELECT DISTINCT agent_group_id FROM agent_destinations WHERE target_type = 'agent' AND target_id = ? AND agent_group_id != ?",
|
||||
)
|
||||
.all(targetAgentGroupId, targetAgentGroupId) as Array<{ agent_group_id: string }>;
|
||||
return rows.map((r) => r.agent_group_id);
|
||||
}
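
The ordering requirement described above (collect referencers first, delete second, refresh third) can be sketched against an in-memory stand-in for the `agent_destinations` table. All names and rows below are hypothetical; in real code the refresh step would call `writeDestinations(groupId, sessionId)` for each active session of every collected group.

```typescript
interface DestinationRow {
  agent_group_id: string;
  target_type: 'channel' | 'agent';
  target_id: string;
}

// In-memory stand-in for the agent_destinations table (invented data).
let destinationRows: DestinationRow[] = [
  { agent_group_id: 'origin', target_type: 'agent', target_id: 'dev' },
  { agent_group_id: 'dev', target_type: 'agent', target_id: 'origin' },
  { agent_group_id: 'origin', target_type: 'channel', target_id: 'chan-1' },
];

// Same predicate as the SELECT DISTINCT query: other groups pointing at `target`.
function referencersOf(target: string): string[] {
  const ids = destinationRows
    .filter((r) => r.target_type === 'agent' && r.target_id === target && r.agent_group_id !== target)
    .map((r) => r.agent_group_id);
  return Array.from(new Set(ids));
}

// Same predicate as the DELETE: rows owned by `id` or targeting it as an agent.
function deleteAllTouching(id: string): void {
  destinationRows = destinationRows.filter(
    (r) => !(r.agent_group_id === id || (r.target_type === 'agent' && r.target_id === id)),
  );
}

// Order matters: once the delete runs, the referencer rows are gone.
const groupsToRefresh = referencersOf('dev');
deleteAllTouching('dev');
const referencersAfterDelete = referencersOf('dev');

// Placeholder for the refresh step (would be writeDestinations per session).
const refreshedGroups: string[] = [];
for (const groupId of groupsToRefresh) {
  refreshedGroups.push(groupId);
}
```

Querying after the delete returns an empty list, which is why the lookup has to happen first.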
|
||||
|
||||
/** Normalize a human-readable name into a lowercase, dash-separated identifier. */
|
||||
export function normalizeName(name: string): string {
|
||||
return (
|
||||
|
||||
@@ -4,8 +4,8 @@ import { getDb } from './connection.js';
|
||||
 export function createAgentGroup(group: AgentGroup): void {
   getDb()
     .prepare(
-      `INSERT INTO agent_groups (id, name, folder, agent_provider, container_config, created_at)
-       VALUES (@id, @name, @folder, @agent_provider, @container_config, @created_at)`,
+      `INSERT INTO agent_groups (id, name, folder, agent_provider, created_at)
+       VALUES (@id, @name, @folder, @agent_provider, @created_at)`,
     )
     .run(group);
 }
|
||||
@@ -24,7 +24,7 @@ export function getAllAgentGroups(): AgentGroup[] {
|
||||
|
||||
 export function updateAgentGroup(
   id: string,
-  updates: Partial<Pick<AgentGroup, 'name' | 'agent_provider' | 'container_config'>>,
+  updates: Partial<Pick<AgentGroup, 'name' | 'agent_provider'>>,
 ): void {
   const fields: string[] = [];
   const values: Record<string, unknown> = { id };
|
||||
|
||||
@@ -66,7 +66,6 @@ describe('agent groups', () => {
|
||||
name: 'Test Agent',
|
||||
folder: 'test-agent',
|
||||
agent_provider: null,
|
||||
- container_config: null,
|
||||
created_at: now(),
|
||||
});
|
||||
|
||||
@@ -163,7 +162,6 @@ describe('messaging group agents', () => {
|
||||
name: 'Agent',
|
||||
folder: 'agent',
|
||||
agent_provider: null,
|
||||
- container_config: null,
|
||||
created_at: now(),
|
||||
});
|
||||
createMessagingGroup({
|
||||
@@ -202,7 +200,6 @@ describe('messaging group agents', () => {
|
||||
name: 'Agent2',
|
||||
folder: 'agent2',
|
||||
agent_provider: null,
|
||||
- container_config: null,
|
||||
created_at: now(),
|
||||
});
|
||||
createMessagingGroupAgent({ ...mga(), id: 'mga-2', agent_group_id: 'ag-2', priority: 10 });
|
||||
@@ -285,7 +282,6 @@ describe('sessions', () => {
|
||||
name: 'Agent',
|
||||
folder: 'agent',
|
||||
agent_provider: null,
|
||||
- container_config: null,
|
||||
created_at: now(),
|
||||
});
|
||||
createMessagingGroup({
|
||||
@@ -380,7 +376,6 @@ describe('pending questions', () => {
|
||||
name: 'Agent',
|
||||
folder: 'agent',
|
||||
agent_provider: null,
|
||||
- container_config: null,
|
||||
created_at: now(),
|
||||
});
|
||||
createSession({
|
||||
|
||||
@@ -43,6 +43,7 @@ export {
|
||||
createSession,
|
||||
getSession,
|
||||
findSession,
|
||||
findSessionByAgentGroup,
|
||||
getSessionsByAgentGroup,
|
||||
getActiveSessions,
|
||||
getRunningSessions,
|
||||
@@ -64,3 +65,18 @@ export {
|
||||
updatePendingCredentialMessageId,
|
||||
deletePendingCredential,
|
||||
} from './credentials.js';
|
||||
export {
|
||||
createPendingSwap,
|
||||
getPendingSwap,
|
||||
getInFlightSwapForGroup,
|
||||
getSwapForDevAgent,
|
||||
getAwaitingConfirmationSwaps,
|
||||
getTerminalSwaps,
|
||||
updatePendingSwapStatus,
|
||||
setSwapPreSwapState,
|
||||
startSwapDeadman,
|
||||
extendSwapDeadman,
|
||||
setSwapHandshakeState,
|
||||
resetSwapForRetry,
|
||||
deletePendingSwap,
|
||||
} from './pending-swaps.js';
|
||||
|
||||
@@ -85,6 +85,20 @@ export function createMessagingGroupAgent(mga: MessagingGroupAgent): void {
|
||||
|
||||
// Auto-create an agent_destinations row so delivery's ACL doesn't block
|
||||
// outbound messages that target this chat.
|
||||
//
|
||||
// ⚠️ DESTINATION PROJECTION NOTE: this function only writes the central
|
||||
// `agent_destinations` row. It does NOT project into any running
|
||||
// agent's session inbound.db (see top-of-file invariant in
|
||||
// src/db/agent-destinations.ts). In practice this is fine because the
|
||||
// only real callers are one-shot setup scripts (setup/register.ts,
|
||||
// scripts/init-first-agent.ts, /manage-channels skill) that run in a
|
||||
// separate process from the host. Any already-running container for
|
||||
// `mga.agent_group_id` will keep serving the stale projection until
|
||||
// its next wake (idle timeout or next inbound message) at which
|
||||
// point spawnContainer's writeDestinations call refreshes from central.
|
||||
// If you call this from code that runs INSIDE the host process and
|
||||
// need the refresh to happen immediately, explicitly call
|
||||
// `writeDestinations(mga.agent_group_id, <sessionId>)` afterwards.
|
||||
const existing = getDestinationByTarget(mga.agent_group_id, 'channel', mga.messaging_group_id);
|
||||
if (existing) return;
|
||||
|
||||
|
||||
@@ -12,7 +12,6 @@ export const migration001: Migration = {
|
||||
name TEXT NOT NULL,
|
||||
folder TEXT NOT NULL UNIQUE,
|
||||
agent_provider TEXT,
|
||||
- container_config TEXT,
|
||||
created_at TEXT NOT NULL
|
||||
);
|
||||
|
||||
|
||||
@@ -0,0 +1,44 @@
|
||||
import type { Migration } from './index.js';
|
||||
|
||||
/**
|
||||
* `pending_swaps` — backs the builder-agent self-modification flow. One row
|
||||
* per in-flight swap request from a dev agent. Everything swap-lifecycle fits
|
||||
* on one row: approval state, classification, pre-swap git SHA for rollback,
|
||||
* DB snapshot path, deadman timer, handshake state.
|
||||
*
|
||||
* Status transitions: pending_approval → awaiting_confirmation →
|
||||
* (finalized | rolled_back | rejected).
|
||||
*
|
||||
* Handshake state (only meaningful while status = awaiting_confirmation):
|
||||
* pending_restart → message1_sent → confirmed | rolled_back.
|
||||
*/
|
||||
export const migration006: Migration = {
|
||||
version: 6,
|
||||
name: 'pending-swaps',
|
||||
up(db) {
|
||||
db.exec(`
|
||||
CREATE TABLE pending_swaps (
|
||||
request_id TEXT PRIMARY KEY,
|
||||
dev_agent_id TEXT NOT NULL REFERENCES agent_groups(id),
|
||||
originating_group_id TEXT NOT NULL REFERENCES agent_groups(id),
|
||||
dev_branch TEXT NOT NULL,
|
||||
commit_sha TEXT NOT NULL,
|
||||
classification TEXT NOT NULL,
|
||||
status TEXT NOT NULL DEFAULT 'pending_approval',
|
||||
summary_json TEXT NOT NULL,
|
||||
pre_swap_sha TEXT,
|
||||
db_snapshot_path TEXT,
|
||||
deadman_started_at TEXT,
|
||||
deadman_expires_at TEXT,
|
||||
handshake_state TEXT,
|
||||
created_at TEXT NOT NULL
|
||||
);
|
||||
|
||||
CREATE INDEX idx_pending_swaps_originating_status
|
||||
ON pending_swaps(originating_group_id, status);
|
||||
|
||||
CREATE INDEX idx_pending_swaps_status
|
||||
ON pending_swaps(status);
|
||||
`);
|
||||
},
|
||||
};
|
||||
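The status transitions listed in the migration comment can be written down as a small lookup table. The sketch below takes the comment literally and assumes no other edges exist; `NEXT`, `canTransition`, and `isTerminal` are hypothetical names, and the `SwapStatus` union is reconstructed from the comment rather than copied from `../types.js`.

```typescript
// Reconstructed from the migration comment; the real type lives in ../types.js.
type SwapStatus =
  | 'pending_approval'
  | 'awaiting_confirmation'
  | 'finalized'
  | 'rolled_back'
  | 'rejected';

// Literal reading of "pending_approval → awaiting_confirmation →
// (finalized | rolled_back | rejected)". Assumption: no other edges.
const NEXT: Record<SwapStatus, SwapStatus[]> = {
  pending_approval: ['awaiting_confirmation'],
  awaiting_confirmation: ['finalized', 'rolled_back', 'rejected'],
  finalized: [],
  rolled_back: [],
  rejected: [],
};

// True when the documented lifecycle allows moving from `from` to `to`.
function canTransition(from: SwapStatus, to: SwapStatus): boolean {
  return NEXT[from].indexOf(to) !== -1;
}

// Terminal statuses have no outgoing edges.
function isTerminal(status: SwapStatus): boolean {
  return NEXT[status].length === 0;
}
```

The table deliberately leaves out the retry path: `resetSwapForRetry` in `pending-swaps.ts` moves a row back to `pending_approval`, so a stricter guard in real code would need an extra edge for that case.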
@@ -0,0 +1,43 @@
|
||||
import type { Migration } from './index.js';
|
||||
|
||||
/**
|
||||
* Retroactive schema fix: earlier migration 003 was edited after it had
|
||||
* already been applied in the wild, adding `title` and `options_json`
|
||||
* columns to its CREATE TABLE statement. Installs that ran 003 before the
|
||||
* edit don't have those columns, and `createPendingApproval` (which
|
||||
* inserts into both) fails with "no such column" at runtime.
|
||||
*
|
||||
* This migration adds the missing columns via ALTER TABLE so old installs
|
||||
* catch up. On a fresh install that runs 003 at its current definition,
|
||||
* the ALTER statements will fail harmlessly (column already exists) and
|
||||
* we swallow the error per-column.
|
||||
*/
|
||||
export const migration007: Migration = {
|
||||
version: 7,
|
||||
name: 'pending-approvals-title-options',
|
||||
up(db) {
|
||||
const addIfMissing = (col: string, sql: string): void => {
|
||||
try {
|
||||
db.exec(sql);
|
||||
} catch (err) {
|
||||
const msg = err instanceof Error ? err.message : String(err);
|
||||
if (msg.includes('duplicate column') || msg.includes('already exists')) {
|
||||
// Fresh install — column already added by the current 003
|
||||
// definition. Nothing to do.
|
||||
return;
|
||||
}
|
||||
throw err;
|
||||
}
|
||||
void col;
|
||||
};
|
||||
|
||||
addIfMissing(
|
||||
'title',
|
||||
`ALTER TABLE pending_approvals ADD COLUMN title TEXT NOT NULL DEFAULT ''`,
|
||||
);
|
||||
addIfMissing(
|
||||
'options_json',
|
||||
`ALTER TABLE pending_approvals ADD COLUMN options_json TEXT NOT NULL DEFAULT '[]'`,
|
||||
);
|
||||
},
|
||||
};
|
||||
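The swallow-on-duplicate pattern used by this migration can be exercised without a real database. The following is a minimal sketch under stated assumptions: `ExecLike`, `FakeDb`, and `addColumnIfMissing` are hypothetical, and the fake's error text is modeled on SQLite's "duplicate column name" message rather than produced by SQLite itself.

```typescript
// Hypothetical minimal surface: anything with an exec(sql) method.
interface ExecLike {
  exec(sql: string): void;
}

// Returns true if the column was added, false if it already existed.
function addColumnIfMissing(db: ExecLike, sql: string): boolean {
  try {
    db.exec(sql);
    return true;
  } catch (err) {
    const msg = err instanceof Error ? err.message : String(err);
    if (msg.indexOf('duplicate column') !== -1 || msg.indexOf('already exists') !== -1) {
      return false; // column is already there; nothing to do
    }
    throw err; // anything else is a real failure and must surface
  }
}

// Fake DB that tracks added columns and rejects a repeated ADD COLUMN.
class FakeDb implements ExecLike {
  columns: string[] = [];
  exec(sql: string): void {
    const m = /ADD COLUMN (\w+)/.exec(sql);
    if (!m) throw new Error('syntax error');
    if (this.columns.indexOf(m[1]) !== -1) throw new Error('duplicate column name: ' + m[1]);
    this.columns.push(m[1]);
  }
}

const fake = new FakeDb();
const addTitle = "ALTER TABLE pending_approvals ADD COLUMN title TEXT NOT NULL DEFAULT ''";
const first = addColumnIfMissing(fake, addTitle); // old install: column gets added
const second = addColumnIfMissing(fake, addTitle); // fresh install: duplicate is swallowed
```

Matching on message text is brittle if the driver ever changes its wording; reading the column list from `PRAGMA table_info` before altering is a common alternative, though it is not what this migration does.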
@@ -6,6 +6,8 @@ import { migration002 } from './002-chat-sdk-state.js';
|
||||
 import { migration003 } from './003-pending-approvals.js';
 import { migration004 } from './004-agent-destinations.js';
 import { migration005 } from './005-pending-credentials.js';
+import { migration006 } from './006-pending-swaps.js';
+import { migration007 } from './007-pending-approvals-title-options.js';
|
||||
|
||||
export interface Migration {
|
||||
version: number;
|
||||
@@ -13,7 +15,15 @@ export interface Migration {
|
||||
up: (db: Database.Database) => void;
|
||||
}
|
||||
|
||||
-const migrations: Migration[] = [migration001, migration002, migration003, migration004, migration005];
+const migrations: Migration[] = [
+  migration001,
+  migration002,
+  migration003,
+  migration004,
+  migration005,
+  migration006,
+  migration007,
+];
|
||||
|
||||
export function runMigrations(db: Database.Database): void {
|
||||
db.exec(`
|
||||
|
||||
@@ -0,0 +1,195 @@
|
||||
import { afterEach, beforeEach, describe, expect, it } from 'vitest';
|
||||
|
||||
import { closeDb, initTestDb } from './connection.js';
|
||||
import { createAgentGroup } from './agent-groups.js';
|
||||
import { runMigrations } from './migrations/index.js';
|
||||
import {
|
||||
createPendingSwap,
|
||||
deletePendingSwap,
|
||||
extendSwapDeadman,
|
||||
getAwaitingConfirmationSwaps,
|
||||
getInFlightSwapForGroup,
|
||||
getPendingSwap,
|
||||
getSwapForDevAgent,
|
||||
getTerminalSwaps,
|
||||
setSwapHandshakeState,
|
||||
setSwapPreSwapState,
|
||||
startSwapDeadman,
|
||||
updatePendingSwapStatus,
|
||||
} from './pending-swaps.js';
|
||||
import type { AgentGroup, PendingSwap } from '../types.js';
|
||||
|
||||
function makeAgentGroup(id: string, folder: string): AgentGroup {
|
||||
return {
|
||||
id,
|
||||
name: folder,
|
||||
folder,
|
||||
agent_provider: null,
|
||||
created_at: '2026-04-15T00:00:00Z',
|
||||
};
|
||||
}
|
||||
|
||||
function makeSwap(overrides: Partial<PendingSwap> = {}): PendingSwap {
|
||||
return {
|
||||
request_id: 'req-1',
|
||||
dev_agent_id: 'ag-dev',
|
||||
originating_group_id: 'ag-origin',
|
||||
dev_branch: 'dev/req-1',
|
||||
commit_sha: '',
|
||||
classification: 'group',
|
||||
status: 'pending_approval',
|
||||
summary_json: JSON.stringify({ overallSummary: 'test', classifiedFiles: [] }),
|
||||
pre_swap_sha: null,
|
||||
db_snapshot_path: null,
|
||||
deadman_started_at: null,
|
||||
deadman_expires_at: null,
|
||||
handshake_state: null,
|
||||
created_at: '2026-04-15T00:00:00Z',
|
||||
...overrides,
|
||||
};
|
||||
}
|
||||
|
||||
beforeEach(() => {
|
||||
const db = initTestDb();
|
||||
runMigrations(db);
|
||||
// Both dev_agent_id and originating_group_id are FK to agent_groups.
|
||||
createAgentGroup(makeAgentGroup('ag-origin', 'origin-folder'));
|
||||
createAgentGroup(makeAgentGroup('ag-dev', 'dev-folder'));
|
||||
});
|
||||
|
||||
afterEach(() => {
|
||||
closeDb();
|
||||
});
|
||||
|
||||
describe('pending-swaps CRUD', () => {
|
||||
it('createPendingSwap then getPendingSwap round-trips all fields', () => {
|
||||
const swap = makeSwap({
|
||||
request_id: 'req-roundtrip',
|
||||
commit_sha: 'sha-xyz',
|
||||
summary_json: JSON.stringify({ overallSummary: 'round trip' }),
|
||||
});
|
||||
createPendingSwap(swap);
|
||||
|
||||
const got = getPendingSwap('req-roundtrip');
|
||||
expect(got).toBeDefined();
|
||||
expect(got!.request_id).toBe('req-roundtrip');
|
||||
expect(got!.commit_sha).toBe('sha-xyz');
|
||||
expect(got!.classification).toBe('group');
|
||||
expect(got!.status).toBe('pending_approval');
|
||||
// Status is the value passed in by makeSwap; the JSON summary round-trips intact.
|
||||
expect(JSON.parse(got!.summary_json).overallSummary).toBe('round trip');
|
||||
});
|
||||
|
||||
it('getPendingSwap returns undefined for missing id', () => {
|
||||
expect(getPendingSwap('does-not-exist')).toBeUndefined();
|
||||
});
|
||||
|
||||
it('deletePendingSwap removes the row', () => {
|
||||
createPendingSwap(makeSwap({ request_id: 'req-del' }));
|
||||
deletePendingSwap('req-del');
|
||||
expect(getPendingSwap('req-del')).toBeUndefined();
|
||||
});
|
||||
});
|
||||
|
||||
describe('pending-swaps lookup by group / dev agent', () => {
|
||||
it('getInFlightSwapForGroup returns pending_approval rows', () => {
|
||||
createPendingSwap(makeSwap({ request_id: 'req-a', status: 'pending_approval' }));
|
||||
const got = getInFlightSwapForGroup('ag-origin');
|
||||
expect(got?.request_id).toBe('req-a');
|
||||
});
|
||||
|
||||
it('getInFlightSwapForGroup returns awaiting_confirmation rows', () => {
|
||||
createPendingSwap(makeSwap({ request_id: 'req-b', status: 'awaiting_confirmation' }));
|
||||
const got = getInFlightSwapForGroup('ag-origin');
|
||||
expect(got?.request_id).toBe('req-b');
|
||||
});
|
||||
|
||||
it('getInFlightSwapForGroup does NOT return terminal rows', () => {
|
||||
createPendingSwap(makeSwap({ request_id: 'req-c', status: 'finalized' }));
|
||||
expect(getInFlightSwapForGroup('ag-origin')).toBeUndefined();
|
||||
createPendingSwap(makeSwap({ request_id: 'req-d', status: 'rolled_back' }));
|
||||
expect(getInFlightSwapForGroup('ag-origin')).toBeUndefined();
|
||||
createPendingSwap(makeSwap({ request_id: 'req-e', status: 'rejected' }));
|
||||
expect(getInFlightSwapForGroup('ag-origin')).toBeUndefined();
|
||||
});
|
||||
|
||||
it('getSwapForDevAgent returns the row where dev_agent_id matches', () => {
|
||||
createPendingSwap(makeSwap({ request_id: 'req-f' }));
|
||||
const got = getSwapForDevAgent('ag-dev');
|
||||
expect(got?.request_id).toBe('req-f');
|
||||
});
|
||||
|
||||
it('getSwapForDevAgent returns undefined for unrelated dev agent', () => {
|
||||
createPendingSwap(makeSwap({ request_id: 'req-g' }));
|
||||
expect(getSwapForDevAgent('ag-unrelated')).toBeUndefined();
|
||||
});
|
||||
});
|
||||
|
||||
describe('pending-swaps status transitions', () => {
|
||||
it('updatePendingSwapStatus transitions through the lifecycle', () => {
|
||||
createPendingSwap(makeSwap({ request_id: 'req-life' }));
|
||||
|
||||
updatePendingSwapStatus('req-life', 'awaiting_confirmation');
|
||||
expect(getPendingSwap('req-life')!.status).toBe('awaiting_confirmation');
|
||||
|
||||
updatePendingSwapStatus('req-life', 'finalized');
|
||||
expect(getPendingSwap('req-life')!.status).toBe('finalized');
|
||||
});
|
||||
|
||||
it('setSwapPreSwapState populates pre_swap_sha + db_snapshot_path', () => {
|
||||
createPendingSwap(makeSwap({ request_id: 'req-pre' }));
|
||||
setSwapPreSwapState('req-pre', 'sha-pre', '/tmp/snap.sqlite');
|
||||
const got = getPendingSwap('req-pre')!;
|
||||
expect(got.pre_swap_sha).toBe('sha-pre');
|
||||
expect(got.db_snapshot_path).toBe('/tmp/snap.sqlite');
|
||||
});
|
||||
|
||||
it('startSwapDeadman transitions to awaiting_confirmation and sets deadman fields', () => {
|
||||
createPendingSwap(makeSwap({ request_id: 'req-dead' }));
|
||||
startSwapDeadman('req-dead', '2026-04-15T01:00:00Z', '2026-04-15T01:02:00Z', 'pending_restart');
|
||||
const got = getPendingSwap('req-dead')!;
|
||||
expect(got.status).toBe('awaiting_confirmation');
|
||||
expect(got.deadman_started_at).toBe('2026-04-15T01:00:00Z');
|
||||
expect(got.deadman_expires_at).toBe('2026-04-15T01:02:00Z');
|
||||
expect(got.handshake_state).toBe('pending_restart');
|
||||
});
|
||||
|
||||
it('extendSwapDeadman updates only deadman_expires_at', () => {
|
||||
createPendingSwap(makeSwap({ request_id: 'req-ext' }));
|
||||
startSwapDeadman('req-ext', '2026-04-15T01:00:00Z', '2026-04-15T01:02:00Z', 'pending_restart');
|
||||
extendSwapDeadman('req-ext', '2026-04-15T01:05:00Z');
|
||||
const got = getPendingSwap('req-ext')!;
|
||||
expect(got.deadman_expires_at).toBe('2026-04-15T01:05:00Z');
|
||||
expect(got.deadman_started_at).toBe('2026-04-15T01:00:00Z');
|
||||
expect(got.handshake_state).toBe('pending_restart');
|
||||
});
|
||||
|
||||
it('setSwapHandshakeState updates only the handshake state', () => {
|
||||
createPendingSwap(makeSwap({ request_id: 'req-hs' }));
|
||||
startSwapDeadman('req-hs', '2026-04-15T01:00:00Z', '2026-04-15T01:02:00Z', 'pending_restart');
|
||||
setSwapHandshakeState('req-hs', 'message1_sent');
|
||||
expect(getPendingSwap('req-hs')!.handshake_state).toBe('message1_sent');
|
||||
});
|
||||
});
|
||||
|
||||
describe('pending-swaps bulk lookups', () => {
|
||||
it('getAwaitingConfirmationSwaps returns only that status', () => {
|
||||
createPendingSwap(makeSwap({ request_id: 'req-pending', status: 'pending_approval' }));
|
||||
createPendingSwap(makeSwap({ request_id: 'req-await', status: 'awaiting_confirmation' }));
|
||||
createPendingSwap(makeSwap({ request_id: 'req-final', status: 'finalized' }));
|
||||
|
||||
const got = getAwaitingConfirmationSwaps();
|
||||
expect(got).toHaveLength(1);
|
||||
expect(got[0].request_id).toBe('req-await');
|
||||
});
|
||||
|
||||
it('getTerminalSwaps returns rows in terminal statuses', () => {
|
||||
createPendingSwap(makeSwap({ request_id: 'req-t1', status: 'finalized' }));
|
||||
createPendingSwap(makeSwap({ request_id: 'req-t2', status: 'rolled_back' }));
|
||||
createPendingSwap(makeSwap({ request_id: 'req-t3', status: 'rejected' }));
|
||||
createPendingSwap(makeSwap({ request_id: 'req-active', status: 'awaiting_confirmation' }));
|
||||
|
||||
const terminal = getTerminalSwaps().map((s) => s.request_id).sort();
|
||||
expect(terminal).toEqual(['req-t1', 'req-t2', 'req-t3']);
|
||||
});
|
||||
});
|
||||
@@ -0,0 +1,151 @@
|
||||
import type { PendingSwap, SwapHandshakeState, SwapStatus } from '../types.js';
|
||||
import { getDb } from './connection.js';
|
||||
|
||||
export function createPendingSwap(swap: PendingSwap): void {
|
||||
getDb()
|
||||
.prepare(
|
||||
`INSERT INTO pending_swaps (
|
||||
request_id, dev_agent_id, originating_group_id, dev_branch, commit_sha,
|
||||
classification, status, summary_json, pre_swap_sha, db_snapshot_path,
|
||||
deadman_started_at, deadman_expires_at, handshake_state, created_at
|
||||
) VALUES (
|
||||
@request_id, @dev_agent_id, @originating_group_id, @dev_branch, @commit_sha,
|
||||
@classification, @status, @summary_json, @pre_swap_sha, @db_snapshot_path,
|
||||
@deadman_started_at, @deadman_expires_at, @handshake_state, @created_at
|
||||
)`,
|
||||
)
|
||||
.run(swap);
|
||||
}
|
||||
|
||||
export function getPendingSwap(requestId: string): PendingSwap | undefined {
|
||||
return getDb().prepare('SELECT * FROM pending_swaps WHERE request_id = ?').get(requestId) as
|
||||
| PendingSwap
|
||||
| undefined;
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns the in-flight swap for an originating group, if any. "In-flight"
|
||||
* means not in a terminal status (finalized / rolled_back / rejected).
|
||||
* Used to enforce one-swap-per-originating-group serialization.
|
||||
*/
|
||||
export function getInFlightSwapForGroup(originatingGroupId: string): PendingSwap | undefined {
|
||||
return getDb()
|
||||
.prepare(
|
||||
`SELECT * FROM pending_swaps
|
||||
WHERE originating_group_id = ?
|
||||
AND status IN ('pending_approval', 'awaiting_confirmation')
|
||||
LIMIT 1`,
|
||||
)
|
||||
.get(originatingGroupId) as PendingSwap | undefined;
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns the in-flight swap for a dev-agent group. Used by the container
|
||||
* runner to decide whether to mount the worktree on the dev agent's container.
|
||||
*/
|
||||
export function getSwapForDevAgent(devAgentId: string): PendingSwap | undefined {
|
||||
return getDb()
|
||||
.prepare(
|
||||
`SELECT * FROM pending_swaps
|
||||
WHERE dev_agent_id = ?
|
||||
AND status IN ('pending_approval', 'awaiting_confirmation')
|
||||
LIMIT 1`,
|
||||
)
|
||||
.get(devAgentId) as PendingSwap | undefined;
|
||||
}
|
||||
|
||||
/**
|
||||
* All swaps currently in `awaiting_confirmation` — used by the startup sweep
|
||||
* to resume deadman timers after a host restart (expected for host-level swaps,
|
||||
* unexpected for group-level crashes).
|
||||
*/
|
||||
export function getAwaitingConfirmationSwaps(): PendingSwap[] {
|
||||
return getDb()
|
||||
.prepare(`SELECT * FROM pending_swaps WHERE status = 'awaiting_confirmation'`)
|
||||
.all() as PendingSwap[];
|
||||
}
|
||||
|
||||
/** All terminal-status swaps — used by the startup worktree-orphan sweep. */
|
||||
export function getTerminalSwaps(): PendingSwap[] {
|
||||
return getDb()
|
||||
.prepare(`SELECT * FROM pending_swaps WHERE status IN ('finalized', 'rolled_back', 'rejected')`)
|
||||
.all() as PendingSwap[];
|
||||
}
|
||||
|
||||
export function updatePendingSwapStatus(requestId: string, status: SwapStatus): void {
|
||||
getDb().prepare('UPDATE pending_swaps SET status = ? WHERE request_id = ?').run(status, requestId);
|
||||
}
|
||||
|
||||
export function setSwapPreSwapState(
|
||||
requestId: string,
|
||||
preSwapSha: string,
|
||||
dbSnapshotPath: string,
|
||||
): void {
|
||||
getDb()
|
||||
.prepare(
|
||||
`UPDATE pending_swaps
|
||||
SET pre_swap_sha = ?, db_snapshot_path = ?
|
||||
WHERE request_id = ?`,
|
||||
)
|
||||
.run(preSwapSha, dbSnapshotPath, requestId);
|
||||
}
|
||||
|
||||
export function startSwapDeadman(
|
||||
requestId: string,
|
||||
startedAt: string,
|
||||
expiresAt: string,
|
||||
handshakeState: SwapHandshakeState,
|
||||
): void {
|
||||
getDb()
|
||||
.prepare(
|
||||
`UPDATE pending_swaps
|
||||
SET status = 'awaiting_confirmation',
|
||||
deadman_started_at = ?,
|
||||
deadman_expires_at = ?,
|
||||
handshake_state = ?
|
||||
WHERE request_id = ?`,
|
||||
)
|
||||
.run(startedAt, expiresAt, handshakeState, requestId);
|
||||
}
|
||||
|
||||
export function extendSwapDeadman(requestId: string, expiresAt: string): void {
|
||||
getDb().prepare('UPDATE pending_swaps SET deadman_expires_at = ? WHERE request_id = ?').run(
|
||||
expiresAt,
|
||||
requestId,
|
||||
);
|
||||
}
|
||||
|
||||
export function setSwapHandshakeState(requestId: string, state: SwapHandshakeState): void {
|
||||
getDb().prepare('UPDATE pending_swaps SET handshake_state = ? WHERE request_id = ?').run(
|
||||
state,
|
||||
requestId,
|
||||
);
|
||||
}
|
||||
|
||||
export function deletePendingSwap(requestId: string): void {
|
||||
getDb().prepare('DELETE FROM pending_swaps WHERE request_id = ?').run(requestId);
|
||||
}
|
||||
|
||||
/**
|
||||
* Reset a swap back to `pending_approval` after a post-approval failure
|
||||
* (apply / commit / build error). Clears the in-progress fields so a
|
||||
* subsequent `request_swap` call from the dev agent starts clean. Leaves
|
||||
* the dev_agent_id + originating_group_id + dev_branch intact so the dev
|
||||
* agent can fix the issue in its worktree and retry without having to
|
||||
* spin up a fresh dev agent.
|
||||
*/
|
||||
export function resetSwapForRetry(requestId: string): void {
|
||||
getDb()
|
||||
.prepare(
|
||||
`UPDATE pending_swaps
|
||||
SET status = 'pending_approval',
|
||||
commit_sha = '',
|
||||
pre_swap_sha = NULL,
|
||||
db_snapshot_path = NULL,
|
||||
deadman_started_at = NULL,
|
||||
deadman_expires_at = NULL,
|
||||
handshake_state = NULL
|
||||
WHERE request_id = ?`,
|
||||
)
|
||||
.run(requestId);
|
||||
}
|
||||
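The deadman columns written by `startSwapDeadman` and `extendSwapDeadman` suggest an expiry check along the following lines. This is a hypothetical sketch: `isDeadmanExpired` and the `DeadmanFields` subset are not part of the diff, and how the host's sweep actually consumes these fields is not shown here.

```typescript
// Subset of PendingSwap fields used here; the full type lives in ../types.js.
interface DeadmanFields {
  status: string;
  deadman_expires_at: string | null; // ISO-8601 timestamp, as stored in the row
}

// Hypothetical helper: a deadman only counts while awaiting confirmation.
function isDeadmanExpired(swap: DeadmanFields, now: Date): boolean {
  if (swap.status !== 'awaiting_confirmation') return false;
  if (swap.deadman_expires_at === null) return false;
  return Date.parse(swap.deadman_expires_at) <= now.getTime();
}

const swap: DeadmanFields = {
  status: 'awaiting_confirmation',
  deadman_expires_at: '2026-04-15T01:02:00Z',
};

const beforeExpiry = isDeadmanExpired(swap, new Date('2026-04-15T01:01:00Z'));
const afterExpiry = isDeadmanExpired(swap, new Date('2026-04-15T01:03:00Z'));

// Extending the deadman (as extendSwapDeadman does) pushes the deadline out.
const extended: DeadmanFields = { status: swap.status, deadman_expires_at: '2026-04-15T01:05:00Z' };
const afterExtend = isDeadmanExpired(extended, new Date('2026-04-15T01:03:00Z'));
```

Parsing the timestamps rather than comparing strings keeps the check correct even if rows ever mix timezone suffixes or fractional seconds.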
+3
-2
@@ -5,14 +5,15 @@
|
||||
*/
|
||||
|
||||
export const SCHEMA = `
|
||||
--- Agent workspaces: folder, skills, CLAUDE.md, container config.
+-- Agent workspaces: folder, skills, CLAUDE.md.
 -- All workspaces are equal; privilege lives on users, not groups.
+-- Container config (mcpServers, packages, imageTag, additionalMounts) lives
+-- in groups/<folder>/container.json on disk, not in the DB.
 CREATE TABLE agent_groups (
   id TEXT PRIMARY KEY,
   name TEXT NOT NULL,
   folder TEXT NOT NULL UNIQUE,
   agent_provider TEXT,
-  container_config TEXT,
   created_at TEXT NOT NULL
 );
|
||||
|
||||
|
||||
+175
-3
@@ -37,6 +37,7 @@ import {
|
||||
import { log } from './log.js';
|
||||
import { normalizeOptions, type RawOption } from './channels/ask-question.js';
|
||||
import {
|
||||
+ heartbeatPath,
|
||||
openInboundDb,
|
||||
openOutboundDb,
|
||||
sessionDir,
|
||||
@@ -186,6 +187,139 @@ export async function triggerTyping(channelType: string, platformId: string, thr
|
||||
}
|
||||
}
|
||||
|
||||
// ── Typing refresh ──
|
||||
// Most platforms expire a typing indicator after 5–10s, so a one-shot call
|
||||
// on message arrival goes stale long before the agent finishes thinking.
|
||||
// We keep it alive by re-firing setTyping on a short interval — but only
|
||||
// while the agent is actually WORKING, not just while the container is
|
||||
// alive. The agent-runner touches `heartbeat` on every SDK event, so we
|
||||
// gate each tick on "is the heartbeat file fresh?". If it goes stale (agent
|
||||
// finished its turn and is idle-polling), the refresh stops on its own
|
||||
// without waiting for the container to exit.
|
||||
//
|
||||
// After delivering a user-facing message, the refresh is paused for
|
||||
// POST_DELIVERY_PAUSE_MS — long enough for the client-side typing
|
||||
// indicator to visually clear (Discord ~10s, Telegram ~5s). If the agent
|
||||
// keeps touching heartbeat past the pause window, typing resumes
|
||||
// naturally on the next refresh tick.
|
||||
//
|
||||
// `startTypingRefresh` is idempotent per session. `stopTypingRefresh` is
|
||||
// called from container-runner.ts on container exit as a fast-path cleanup
|
||||
// (the heartbeat-staleness path would catch it within one tick anyway).
|
||||
const TYPING_REFRESH_MS = 4000;
|
||||
// Grace window from startTypingRefresh: fire typing unconditionally for
|
||||
// this long regardless of heartbeat state. Covers container spawn/wake
|
||||
// latency, which can be 5–12s on a cold start before the first heartbeat
|
||||
// touch lands.
|
||||
const TYPING_GRACE_MS = 15000;
|
||||
// After the grace window, the heartbeat file's mtime must be within this many
|
||||
// milliseconds of now to count as "agent is working." Heartbeats are
|
||||
// touched on every SDK event (tool calls, result chunks), so during
|
||||
// active work they land every few hundred ms. 6s is well above that
|
||||
// while still being small enough to stop typing quickly when the agent
|
||||
// goes idle.
|
||||
const HEARTBEAT_FRESH_MS = 6000;
|
||||
// After we deliver a user-facing message, pause typing for this long so
|
||||
// the client-side indicator has time to visually clear. Tuned for the
|
||||
// longest common client expiry (Discord ~10s). The interval stays
|
||||
// running; ticks inside the pause just skip the setTyping call.
|
||||
const POST_DELIVERY_PAUSE_MS = 10000;
|
||||
|
||||
interface TypingTarget {
|
||||
agentGroupId: string;
|
||||
channelType: string;
|
||||
platformId: string;
|
||||
threadId: string | null;
|
||||
interval: NodeJS.Timeout;
|
||||
startedAt: number;
|
||||
pausedUntil: number; // epoch ms; 0 = not paused
|
||||
}
|
||||
|
||||
const typingRefreshers = new Map<string, TypingTarget>();
|
||||
|
||||
function isHeartbeatFresh(agentGroupId: string, sessionId: string): boolean {
|
||||
const hbPath = heartbeatPath(agentGroupId, sessionId);
|
||||
try {
|
||||
const stat = fs.statSync(hbPath);
|
||||
return Date.now() - stat.mtimeMs < HEARTBEAT_FRESH_MS;
|
||||
} catch {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
export function startTypingRefresh(
|
||||
sessionId: string,
|
||||
agentGroupId: string,
|
||||
channelType: string,
|
||||
platformId: string,
|
||||
threadId: string | null,
|
||||
): void {
|
||||
const existing = typingRefreshers.get(sessionId);
|
||||
if (existing) {
|
||||
// Already refreshing. Fire an immediate tick for the new inbound
|
||||
// event and reset the grace window — the new message restarts the
|
||||
// container-wake latency budget. Also clear any lingering
|
||||
// post-delivery pause: a new inbound means the user expects typing
|
||||
// to show immediately.
|
||||
triggerTyping(channelType, platformId, threadId).catch(() => {});
|
||||
existing.startedAt = Date.now();
|
||||
existing.pausedUntil = 0;
|
||||
return;
|
||||
}
|
||||
|
||||
// Immediate tick + periodic refresh.
|
||||
triggerTyping(channelType, platformId, threadId).catch(() => {});
|
||||
const startedAt = Date.now();
|
||||
const interval = setInterval(() => {
|
||||
const entry = typingRefreshers.get(sessionId);
|
||||
if (!entry) return; // stopped externally since this tick was scheduled
|
||||
|
||||
// Inside a post-delivery pause: skip setTyping but keep the interval
|
||||
// running so we resume automatically once the pause expires.
|
||||
if (entry.pausedUntil > Date.now()) return;
|
||||
|
||||
const withinGrace = Date.now() - entry.startedAt < TYPING_GRACE_MS;
|
||||
if (withinGrace || isHeartbeatFresh(entry.agentGroupId, sessionId)) {
|
||||
triggerTyping(entry.channelType, entry.platformId, entry.threadId).catch(() => {});
|
||||
return;
|
||||
}
|
||||
|
||||
// Out of grace AND heartbeat stale — agent is idle, stop refreshing.
|
||||
clearInterval(entry.interval);
|
||||
typingRefreshers.delete(sessionId);
|
||||
}, TYPING_REFRESH_MS);
|
||||
// unref so a stale refresher can't hold the event loop alive.
|
||||
interval.unref();
|
||||
typingRefreshers.set(sessionId, {
|
||||
agentGroupId,
|
||||
channelType,
|
||||
platformId,
|
||||
threadId,
|
||||
interval,
|
||||
startedAt,
|
||||
pausedUntil: 0,
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Pause the typing refresh for POST_DELIVERY_PAUSE_MS. Called after a
|
||||
* user-facing message is delivered so the client-side indicator has a
|
||||
* chance to visually clear before the agent's next SDK event pushes it
|
||||
* back on. No-op if no refresh is active for this session.
|
||||
*/
|
||||
export function pauseTypingRefreshAfterDelivery(sessionId: string): void {
|
||||
const entry = typingRefreshers.get(sessionId);
|
||||
if (!entry) return;
|
||||
entry.pausedUntil = Date.now() + POST_DELIVERY_PAUSE_MS;
|
||||
}
|
||||
|
||||
export function stopTypingRefresh(sessionId: string): void {
|
||||
const entry = typingRefreshers.get(sessionId);
|
||||
if (!entry) return;
|
||||
clearInterval(entry.interval);
|
||||
typingRefreshers.delete(sessionId);
|
||||
}
|
||||
|
||||
/** Start the active container poll loop (~1s). */
|
||||
export function startActiveDeliveryPoll(): void {
|
||||
if (activePolling) return;
|
||||
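The per-tick decision inside the refresh interval (pause, grace window, heartbeat freshness) can be isolated as a pure function, which makes the three outcomes easy to check. This is a sketch for illustration only: `decideTick`, `TickInput`, and `TickAction` are hypothetical names, while the two constants and the order of the checks follow the diff.

```typescript
const TYPING_GRACE_MS = 15000;
const HEARTBEAT_FRESH_MS = 6000;

type TickAction = 'fire' | 'skip' | 'stop';

interface TickInput {
  now: number; // epoch ms
  startedAt: number; // when the refresh was started or last reset
  pausedUntil: number; // epoch ms; 0 = not paused
  heartbeatMtimeMs: number | null; // null = heartbeat file missing
}

// Same order of checks as the interval callback: pause first, then grace or
// fresh heartbeat, otherwise stop.
function decideTick(i: TickInput): TickAction {
  if (i.pausedUntil > i.now) return 'skip'; // post-delivery pause: keep interval, send nothing
  const withinGrace = i.now - i.startedAt < TYPING_GRACE_MS;
  const fresh = i.heartbeatMtimeMs !== null && i.now - i.heartbeatMtimeMs < HEARTBEAT_FRESH_MS;
  return withinGrace || fresh ? 'fire' : 'stop';
}

const t0 = 1_000_000;
// Cold start: no heartbeat yet, but still inside the grace window.
const coldStart = decideTick({ now: t0 + 4000, startedAt: t0, pausedUntil: 0, heartbeatMtimeMs: null });
// Past grace, agent still emitting SDK events.
const working = decideTick({ now: t0 + 20000, startedAt: t0, pausedUntil: 0, heartbeatMtimeMs: t0 + 19000 });
// Just delivered a message: paused even though the heartbeat is fresh.
const paused = decideTick({ now: t0 + 20000, startedAt: t0, pausedUntil: t0 + 25000, heartbeatMtimeMs: t0 + 19000 });
// Past grace, heartbeat stale: the agent is idle.
const idle = decideTick({ now: t0 + 30000, startedAt: t0, pausedUntil: 0, heartbeatMtimeMs: t0 + 20000 });
```

In the real callback, `'stop'` corresponds to clearing the interval and deleting the map entry, and `'skip'` corresponds to returning early while leaving the interval running.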
@@ -262,6 +396,16 @@ async function deliverSessionMessages(session: Session): Promise<void> {
|
||||
markDelivered(inDb, msg.id, platformMsgId ?? null);
|
||||
deliveryAttempts.delete(msg.id);
|
||||
resetContainerIdleTimer(session.id);
|
||||
|
||||
// Pause the typing indicator after a real user-facing message
|
||||
// lands on the user's screen, so the client has time to visually
|
||||
// clear the indicator before the next heartbeat tick brings it
|
||||
// back. Skip the pause for internal traffic (system actions,
|
||||
// agent-to-agent routing) — the user doesn't see those and
|
||||
// shouldn't get a gap in their typing indicator for them.
|
||||
if (msg.kind !== 'system' && msg.channel_type !== 'agent') {
|
||||
pauseTypingRefreshAfterDelivery(session.id);
|
||||
}
|
||||
} catch (err) {
|
||||
const attempts = (deliveryAttempts.get(msg.id) ?? 0) + 1;
|
||||
deliveryAttempts.set(msg.id, attempts);
|
||||
@@ -557,7 +701,6 @@ async function handleSystemAction(
|
||||
name,
|
||||
folder,
|
||||
agent_provider: null,
|
||||
- container_config: null,
|
||||
created_at: now,
|
||||
};
|
||||
createAgentGroup(newGroup);
|
||||
@@ -588,8 +731,11 @@ async function handleSystemAction(
|
||||
created_at: now,
|
||||
});
|
||||
|
||||
-      // Refresh the creator's destination map so the new child appears
-      // immediately on the next query — no restart needed.
+      // REQUIRED: project the new destination into the running
+      // container's inbound.db. See the top-of-file invariant in
+      // src/db/agent-destinations.ts — forgetting this causes
+      // "dropped: unknown destination" when the parent tries to send
+      // to the newly-created child.
|
||||
writeDestinations(session.agent_group_id, session.id);
|
||||
|
||||
// Fire-and-forget notification back to the creator
|
||||
@@ -705,6 +851,32 @@ async function handleSystemAction(
|
||||
break;
|
||||
}
|
||||
|
||||
case 'create_dev_agent': {
|
||||
const { handleCreateDevAgent } = await import('./builder-agent/handlers.js');
|
||||
await handleCreateDevAgent(
|
||||
{
|
||||
requestId: content.requestId as string,
|
||||
name: content.name as string,
|
||||
},
|
||||
session,
|
||||
notifyAgent,
|
||||
);
|
||||
break;
|
||||
}
|
||||
|
||||
case 'request_swap': {
|
||||
const { handleRequestSwap } = await import('./builder-agent/handlers.js');
|
||||
await handleRequestSwap(
|
||||
{
|
||||
perFileSummaries: (content.perFileSummaries as Record<string, string>) || {},
|
||||
overallSummary: (content.overallSummary as string) || '',
|
||||
},
|
||||
session,
|
||||
notifyAgent,
|
||||
);
|
||||
break;
|
||||
}
|
||||
|
||||
default:
|
||||
log.warn('Unknown system action', { action });
|
||||
}
|
||||
|
||||
@@ -2,6 +2,7 @@ import fs from 'fs';
|
||||
import path from 'path';
|
||||
|
||||
import { DATA_DIR, GROUPS_DIR } from './config.js';
|
||||
import { initContainerConfig } from './container-config.js';
|
||||
import { log } from './log.js';
|
||||
import type { AgentGroup } from './types.js';
|
||||
|
||||
@@ -76,6 +77,13 @@ export function initGroupFilesystem(group: AgentGroup, opts?: { instructions?: s
|
||||
initialized.push('CLAUDE.md');
|
||||
}
|
||||
|
||||
// groups/<folder>/container.json — empty container config, replaces the
|
||||
// former agent_groups.container_config DB column. Self-modification flows
|
||||
// read and write this file directly.
|
||||
if (initContainerConfig(group.folder)) {
|
||||
initialized.push('container.json');
|
||||
}
|
||||
|
||||
// 2. data/v2-sessions/<id>/.claude-shared/ — Claude state + per-group skills
|
||||
const claudeDir = path.join(DATA_DIR, 'v2-sessions', group.id, '.claude-shared');
|
||||
if (!fs.existsSync(claudeDir)) {
|
||||
|
||||
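For orientation, a per-group `container.json` helper along the lines the comment describes might look like the sketch below. It is an assumption, not the contents of `container-config.ts`: the `ContainerConfig` shape is inferred from the `packages` and `mcpServers` fields used elsewhere in this diff, and the extra `groupsDir` parameter is added here only so the sketch runs against a temporary directory.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Assumed shape; the real config may carry more fields.
interface ContainerConfig {
  packages: { apt: string[]; npm: string[] };
  mcpServers: Record<string, { command: string; args: string[]; env: Record<string, string> }>;
}

const emptyConfig = (): ContainerConfig => ({ packages: { apt: [], npm: [] }, mcpServers: {} });

function configPath(groupsDir: string, folder: string): string {
  return path.join(groupsDir, folder, 'container.json');
}

// Create container.json if missing; returns true only when a file was written.
function initContainerConfig(groupsDir: string, folder: string): boolean {
  const file = configPath(groupsDir, folder);
  if (fs.existsSync(file)) return false;
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.writeFileSync(file, JSON.stringify(emptyConfig(), null, 2) + '\n');
  return true;
}

// Read the config, filling in defaults so callers can push without null checks.
function readContainerConfig(groupsDir: string, folder: string): ContainerConfig {
  const file = configPath(groupsDir, folder);
  if (!fs.existsSync(file)) return emptyConfig();
  const raw = JSON.parse(fs.readFileSync(file, 'utf8')) as Partial<ContainerConfig>;
  return {
    packages: { apt: raw.packages?.apt ?? [], npm: raw.packages?.npm ?? [] },
    mcpServers: raw.mcpServers ?? {},
  };
}

// Read, mutate in place, then write via a temp file and rename.
function updateContainerConfig(
  groupsDir: string,
  folder: string,
  mutate: (cfg: ContainerConfig) => void,
): ContainerConfig {
  const cfg = readContainerConfig(groupsDir, folder);
  mutate(cfg);
  const file = configPath(groupsDir, folder);
  fs.mkdirSync(path.dirname(file), { recursive: true });
  const tmp = `${file}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(cfg, null, 2) + '\n');
  fs.renameSync(tmp, file);
  return cfg;
}

// Usage against a throwaway directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'groups-'));
const created = initContainerConfig(demoDir, 'demo');
const updated = updateContainerConfig(demoDir, 'demo', (cfg) => {
  cfg.packages.apt.push('jq');
});
```

Writing to a temp file and renaming is one way to avoid leaving a half-written `container.json` behind; whether the real helper does this is not shown in the diff.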
@@ -70,7 +70,6 @@ describe('session manager', () => {
|
||||
name: 'Test Agent',
|
||||
folder: 'test-agent',
|
||||
agent_provider: null,
|
||||
container_config: null,
|
||||
created_at: now(),
|
||||
});
|
||||
createMessagingGroup({
|
||||
@@ -185,7 +184,6 @@ describe('router', () => {
|
||||
name: 'Test Agent',
|
||||
folder: 'test-agent',
|
||||
agent_provider: null,
|
||||
container_config: null,
|
||||
created_at: now(),
|
||||
});
|
||||
// Use 'public' policy so the router tests exercise routing, not the
|
||||
@@ -308,7 +306,6 @@ describe('delivery', () => {
|
||||
name: 'Agent',
|
||||
folder: 'agent',
|
||||
agent_provider: null,
|
||||
container_config: null,
|
||||
created_at: now(),
|
||||
});
|
||||
createMessagingGroup({
|
||||
|
||||
+207
-14
@@ -4,11 +4,27 @@
|
||||
* Thin orchestrator: init DB, run migrations, start channel adapters,
|
||||
* start delivery polls, start sweep, handle shutdown.
|
||||
*/
|
||||
import { execFileSync } from 'child_process';
|
||||
import path from 'path';
|
||||
|
||||
import { setSwapApprovalDelivery } from './builder-agent/approval.js';
|
||||
import { handleSwapConfirmationResponse, setDeadmanDelivery, startDeadman } from './builder-agent/deadman.js';
|
||||
import { handlePromoteResponse, setPromoteDelivery } from './builder-agent/promote.js';
|
||||
import { runBuilderAgentStartupSweep } from './builder-agent/startup.js';
|
||||
import {
|
||||
applySwapFiles,
|
||||
bailSwapForRetry,
|
||||
captureSwapPreState,
|
||||
commitSwap,
|
||||
isHostLevelSwap,
|
||||
parseSwapSummary,
|
||||
requiresFullHostRebuild,
|
||||
} from './builder-agent/swap.js';
|
||||
import { removeDevWorktree } from './builder-agent/worktree.js';
|
||||
import { DATA_DIR } from './config.js';
|
||||
import { initDb } from './db/connection.js';
|
||||
import { runMigrations } from './db/migrations/index.js';
|
||||
import { getPendingSwap, updatePendingSwapStatus } from './db/pending-swaps.js';
|
||||
import { getMessagingGroupsByChannel, getMessagingGroupAgents } from './db/messaging-groups.js';
|
||||
import { ensureContainerRuntimeRunning, cleanupOrphans } from './container-runtime.js';
|
||||
import { startActiveDeliveryPoll, startSweepDeliveryPoll, setDeliveryAdapter, stopDeliveryPolls } from './delivery.js';
|
||||
@@ -34,7 +50,8 @@ import {
   deletePendingApproval,
   getSession,
 } from './db/sessions.js';
-import { getAgentGroup, updateAgentGroup } from './db/agent-groups.js';
+import { getAgentGroup } from './db/agent-groups.js';
+import { updateContainerConfig } from './container-config.js';
 import { writeSessionMessage } from './session-manager.js';
 import { wakeContainer, buildAgentGroupImage, killContainer } from './container-runner.js';
 import { log } from './log.js';
@@ -55,6 +72,12 @@ async function main(): Promise<void> {
|
||||
runMigrations(db);
|
||||
log.info('Central DB ready', { path: dbPath });
|
||||
|
||||
// 1b. Builder-agent startup sweep — resumes any in-flight deadmans (from a
|
||||
// host-level swap restart or an unexpected host crash) and cleans up
|
||||
// orphan worktrees. Must run before channel adapters start so any
|
||||
// rollback path-exit happens cleanly without partial startup state.
|
||||
await runBuilderAgentStartupSweep();
|
||||
|
||||
// 2. Container runtime
|
||||
ensureContainerRuntimeRunning();
|
||||
cleanupOrphans();
|
||||
@@ -135,6 +158,9 @@ async function main(): Promise<void> {
|
||||
};
|
||||
setDeliveryAdapter(deliveryAdapter);
|
||||
setCredentialDeliveryAdapter(deliveryAdapter);
|
||||
setSwapApprovalDelivery(deliveryAdapter);
|
||||
setDeadmanDelivery(deliveryAdapter);
|
||||
setPromoteDelivery(deliveryAdapter);
|
||||
|
||||
// 5. Start delivery polls
|
||||
startActiveDeliveryPoll();
|
||||
@@ -240,6 +266,33 @@ async function handleApprovalResponse(
|
||||
selectedOption: string,
|
||||
userId: string,
|
||||
): Promise<void> {
|
||||
// Builder-agent actions are handled out-of-band from the install_packages
|
||||
// family: their session linkage is different and swap_confirmation doesn't
|
||||
// use `payload.session_id` at all (the session is derived from the swap's
|
||||
// originating_group_id). Dispatch them first.
|
||||
if (approval.action === 'swap_confirmation') {
|
||||
const payload = JSON.parse(approval.payload) as { swapRequestId?: string };
|
||||
if (payload.swapRequestId) {
|
||||
await handleSwapConfirmationResponse(approval.approval_id, payload.swapRequestId, selectedOption);
|
||||
} else {
|
||||
deletePendingApproval(approval.approval_id);
|
||||
}
|
||||
return;
|
||||
}
|
||||
if (approval.action === 'swap_request') {
|
||||
await handleSwapRequestApproval(approval, selectedOption, userId);
|
||||
return;
|
||||
}
|
||||
if (approval.action === 'promote_template') {
|
||||
const payload = JSON.parse(approval.payload) as { swapRequestId?: string };
|
||||
if (payload.swapRequestId) {
|
||||
await handlePromoteResponse(approval.approval_id, payload.swapRequestId, selectedOption);
|
||||
} else {
|
||||
deletePendingApproval(approval.approval_id);
|
||||
}
|
||||
return;
|
||||
}
|
||||
|
||||
if (!approval.session_id) {
|
||||
deletePendingApproval(approval.approval_id);
|
||||
return;
|
||||
@@ -274,11 +327,14 @@ async function handleApprovalResponse(
 
   if (approval.action === 'install_packages') {
     const agentGroup = getAgentGroup(session.agent_group_id);
-    const containerConfig = agentGroup?.container_config ? JSON.parse(agentGroup.container_config) : {};
-    if (!containerConfig.packages) containerConfig.packages = { apt: [], npm: [] };
-    if (payload.apt) containerConfig.packages.apt.push(...payload.apt);
-    if (payload.npm) containerConfig.packages.npm.push(...payload.npm);
-    updateAgentGroup(session.agent_group_id, { container_config: JSON.stringify(containerConfig) });
+    if (!agentGroup) {
+      notify('install_packages approved but agent group missing.');
+      return;
+    }
+    updateContainerConfig(agentGroup.folder, (cfg) => {
+      if (payload.apt) cfg.packages.apt.push(...(payload.apt as string[]));
+      if (payload.npm) cfg.packages.npm.push(...(payload.npm as string[]));
+    });
 
     const pkgs = [...(payload.apt || []), ...(payload.npm || [])].join(', ');
     log.info('Package install approved', { approvalId: approval.approval_id, userId });
@@ -324,14 +380,17 @@ async function handleApprovalResponse(
     }
   } else if (approval.action === 'add_mcp_server') {
     const agentGroup = getAgentGroup(session.agent_group_id);
-    const containerConfig = agentGroup?.container_config ? JSON.parse(agentGroup.container_config) : {};
-    if (!containerConfig.mcpServers) containerConfig.mcpServers = {};
-    containerConfig.mcpServers[payload.name] = {
-      command: payload.command,
-      args: payload.args || [],
-      env: payload.env || {},
-    };
-    updateAgentGroup(session.agent_group_id, { container_config: JSON.stringify(containerConfig) });
+    if (!agentGroup) {
+      notify('add_mcp_server approved but agent group missing.');
+      return;
+    }
+    updateContainerConfig(agentGroup.folder, (cfg) => {
+      cfg.mcpServers[payload.name as string] = {
+        command: payload.command as string,
+        args: (payload.args as string[]) || [],
+        env: (payload.env as Record<string, string>) || {},
+      };
+    });
 
     // Kill the container so next wake loads the new MCP server config
     killContainer(session.id, 'mcp server added');
@@ -343,6 +402,140 @@ async function handleApprovalResponse(
|
||||
await wakeContainer(session);
|
||||
}
|
||||
|
||||
/**
|
||||
* Handle an approver's response to a builder-agent `swap_request` card.
|
||||
* Approve → capture pre-state, apply files, commit, rebuild if needed,
|
||||
* restart, start deadman. Reject → teardown worktree + dev agent, notify.
|
||||
*
|
||||
* Kept separate from the install_packages / request_rebuild flow because:
|
||||
* - Host-level swaps require `process.exit(0)` for supervisor respawn,
|
||||
* which the other flows never do.
|
||||
* - Swap state lives in `pending_swaps`, not `pending_approvals.payload`.
|
||||
*/
|
||||
async function handleSwapRequestApproval(
|
||||
approval: import('./types.js').PendingApproval,
|
||||
selectedOption: string,
|
||||
userId: string,
|
||||
): Promise<void> {
|
||||
const payload = JSON.parse(approval.payload) as { swapRequestId?: string };
|
||||
const swapRequestId = payload.swapRequestId;
|
||||
if (!swapRequestId) {
|
||||
deletePendingApproval(approval.approval_id);
|
||||
return;
|
||||
}
|
||||
const swap = getPendingSwap(swapRequestId);
|
||||
if (!swap) {
|
||||
deletePendingApproval(approval.approval_id);
|
||||
return;
|
||||
}
|
||||
|
||||
// Notify the dev agent's session about the outcome. Uses the existing
|
||||
// session for the dev agent group so the dev agent sees it as an inbound
|
||||
// chat message with sender=system.
|
||||
const { findSessionByAgentGroup } = await import('./db/sessions.js');
|
||||
const devSession = findSessionByAgentGroup(swap.dev_agent_id);
|
||||
const notifyDev = (text: string): void => {
|
||||
if (!devSession) return;
|
||||
writeSessionMessage(devSession.agent_group_id, devSession.id, {
|
||||
id: `appr-note-${Date.now()}-${Math.random().toString(36).slice(2, 8)}`,
|
||||
kind: 'chat',
|
||||
timestamp: new Date().toISOString(),
|
||||
platformId: devSession.agent_group_id,
|
||||
channelType: 'agent',
|
||||
threadId: null,
|
||||
content: JSON.stringify({ text, sender: 'system', senderId: 'system' }),
|
||||
});
|
||||
};
|
||||
|
||||
if (selectedOption !== 'approve') {
|
||||
notifyDev(`Your proposed code change was rejected by ${userId}.`);
|
||||
log.info('Swap request rejected', { requestId: swapRequestId, userId, selectedOption });
|
||||
updatePendingSwapStatus(swapRequestId, 'rejected');
|
||||
try {
|
||||
removeDevWorktree(swapRequestId);
|
||||
} catch (err) {
|
||||
log.warn('Failed to remove worktree after rejection', { swapRequestId, err });
|
||||
}
|
||||
deletePendingApproval(approval.approval_id);
|
||||
return;
|
||||
}
|
||||
|
||||
log.info('Swap request approved — executing swap dance', { requestId: swapRequestId, userId });
|
||||
|
||||
// Swap execution. Any failure inside the try (captureSwapPreState,
|
||||
// applySwapFiles, commitSwap, npm run build, startDeadman, restart
|
||||
// orchestration) triggers a unified retryable-bail: revert any on-disk
|
||||
// changes via git, reset the pending_swaps row back to pending_approval,
|
||||
// leave the dev agent + worktree ALIVE so the dev agent can fix the
|
||||
// issue and call request_swap again. Only explicit rejection tears
|
||||
// down the dev agent.
|
||||
try {
|
||||
// 1. Capture pre-state (pre_swap_sha + DB snapshot).
|
||||
await captureSwapPreState(swapRequestId);
|
||||
|
||||
// 2. Apply files from worktree to swap targets.
|
||||
const touchedAbs = applySwapFiles(swapRequestId);
|
||||
|
||||
// 3. Commit the swap to main.
|
||||
const summary = parseSwapSummary(swap);
|
||||
commitSwap(swapRequestId, touchedAbs, summary.overallSummary || 'no summary');
|
||||
|
||||
// 4. Host-level rebuild. If the diff touched host code that compiles
|
||||
// to dist/ (src/**, package.json, etc.), run `npm run build` now so
|
||||
// the respawned host process runs the new compiled output rather
|
||||
// than stale dist/. Group-level swaps need no rebuild — /app/src is
|
||||
// runtime-compiled inside each container on spawn, skills/CLAUDE.md
|
||||
// are mounted.
|
||||
if (requiresFullHostRebuild(touchedAbs)) {
|
||||
notifyDev('Code change applied and committed. Running `npm run build` before the host restart…');
|
||||
try {
|
||||
execFileSync('npm', ['run', 'build'], { cwd: process.cwd(), stdio: 'inherit' });
|
||||
log.info('npm run build succeeded for host-level swap', { requestId: swapRequestId });
|
||||
} catch (buildErr) {
|
||||
const msg = buildErr instanceof Error ? buildErr.message : String(buildErr);
|
||||
// Wrap with context and re-throw so the outer catch runs the
|
||||
// unified bail path.
|
||||
throw new Error(`npm run build failed: ${msg}`);
|
||||
}
|
||||
}
|
||||
|
||||
// 5. Start the deadman. This sets status=awaiting_confirmation, posts
|
||||
// the handshake card, and schedules the timer. For host-level swaps
|
||||
// we then exit so the supervisor respawns the host on the new code;
|
||||
// the startup sweep will resume this deadman after restart.
|
||||
await startDeadman(swapRequestId);
|
||||
|
||||
if (isHostLevelSwap(swap)) {
|
||||
notifyDev('Code change applied and committed. Triggering host restart so the new code takes effect. Awaiting user confirmation after restart.');
|
||||
log.warn('Host-level swap triggering process exit for supervisor respawn', {
|
||||
requestId: swapRequestId,
|
||||
});
|
||||
// Give log sinks and the deadman card delivery a moment to flush
|
||||
// before exiting.
|
||||
setTimeout(() => process.exit(0), 500);
|
||||
} else {
|
||||
// Group-level: kill the originating agent's active container so its
|
||||
// next wake respawns it with the new per-group runner/skills mounted.
|
||||
const originatingSession = findSessionByAgentGroup(swap.originating_group_id);
|
||||
if (originatingSession) {
|
||||
killContainer(originatingSession.id, 'swap applied');
|
||||
}
|
||||
notifyDev('Code change applied and committed. The originating agent will restart on its next message. Awaiting user confirmation.');
|
||||
}
|
||||
} catch (err) {
|
||||
const errMsg = err instanceof Error ? err.message : String(err);
|
||||
log.error('Swap execution failed — bailing for retry', { requestId: swapRequestId, err });
|
||||
bailSwapForRetry(swapRequestId);
|
||||
notifyDev(
|
||||
`❌ Code change failed: ${errMsg}\n\n` +
|
||||
`Your worktree and dev-agent group are still alive. Review the error above, ` +
|
||||
`fix the issue in /worktree, commit, and call \`request_swap\` again to retry.`,
|
||||
);
|
||||
}
|
||||
|
||||
deletePendingApproval(approval.approval_id);
|
||||
}
|
||||
|
||||
/** Graceful shutdown. */
|
||||
async function shutdown(signal: string): Promise<void> {
|
||||
log.info('Shutdown signal received', { signal });
|
||||
|
||||
+33
-9
@@ -21,7 +21,7 @@ import { getChannelAdapter } from './channels/channel-registry.js';
 import { isMember } from './db/agent-group-members.js';
 import { getMessagingGroupByPlatform, createMessagingGroup, getMessagingGroupAgents } from './db/messaging-groups.js';
 import { upsertUser, getUser } from './db/users.js';
-import { triggerTyping } from './delivery.js';
+import { startTypingRefresh } from './delivery.js';
 import { log } from './log.js';
 import { resolveSession, writeSessionMessage } from './session-manager.js';
 import { wakeContainer } from './container-runner.js';
@@ -148,8 +148,20 @@ export async function routeInbound(event: InboundEvent): Promise<void> {
     created,
   });
 
-  // 7. Show typing indicator while agent processes
-  triggerTyping(event.channelType, event.platformId, event.threadId);
+  // 7. Show typing indicator while the agent processes. Refresh on a short
+  // interval so platforms like Discord (which auto-expire typing after
+  // ~10s) keep showing it for the full thinking window. Gated on the
+  // heartbeat file's mtime after an initial grace period, so typing stops
+  // as soon as the agent goes idle — not when the container eventually
+  // exits. Container-runner also calls stopTypingRefresh on exit as a
+  // fast-path cleanup.
+  startTypingRefresh(
+    session.id,
+    session.agent_group_id,
+    event.channelType,
+    event.platformId,
+    event.threadId,
+  );
 
   // 8. Wake container
   const freshSession = getSession(session.id);
@@ -189,14 +201,26 @@ function extractAndUpsertUser(event: InboundEvent): string | null {
     return null;
   }
 
-  const senderId = typeof content.senderId === 'string' ? content.senderId : undefined;
-  const sender = typeof content.sender === 'string' ? content.sender : undefined;
-  const senderName = typeof content.senderName === 'string' ? content.senderName : undefined;
+  // chat-sdk-bridge serializes author info as a nested `author.userId` and
+  // does NOT populate top-level `senderId`. Older adapters (v1, native) put
+  // `senderId` or `sender` directly at the top level. Check all three.
+  const senderIdField = typeof content.senderId === 'string' ? content.senderId : undefined;
+  const senderField = typeof content.sender === 'string' ? content.sender : undefined;
+  const author = typeof content.author === 'object' && content.author !== null
+    ? (content.author as Record<string, unknown>)
+    : undefined;
+  const authorUserId = typeof author?.userId === 'string' ? (author.userId as string) : undefined;
+  const senderName =
+    (typeof content.senderName === 'string' ? content.senderName : undefined) ??
+    (typeof author?.fullName === 'string' ? (author.fullName as string) : undefined) ??
+    (typeof author?.userName === 'string' ? (author.userName as string) : undefined);
 
-  const handle = senderId ?? sender;
-  if (!handle) return null;
+  const rawHandle = senderIdField ?? senderField ?? authorUserId;
+  if (!rawHandle) return null;
 
-  const userId = `${event.channelType}:${handle}`;
+  // If the raw handle already contains ':' it's pre-namespaced (the older
+  // adapters put it in that form). Otherwise prepend the channel type.
+  const userId = rawHandle.includes(':') ? rawHandle : `${event.channelType}:${rawHandle}`;
   if (!getUser(userId)) {
     upsertUser({
       id: userId,
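The handle-resolution order in this hunk can be exercised in isolation. The sketch below is a standalone restatement under assumptions: `resolveUserId` and `str` are hypothetical helper names, and the function takes the parsed content object directly rather than an `InboundEvent`.

```typescript
type Content = Record<string, unknown>;

// Narrow an unknown value to string, or undefined if it is anything else.
const str = (v: unknown): string | undefined => (typeof v === 'string' ? v : undefined);

// Hypothetical helper: top-level senderId, then sender, then nested author.userId.
function resolveUserId(channelType: string, content: Content): string | null {
  const author =
    typeof content.author === 'object' && content.author !== null
      ? (content.author as Content)
      : undefined;
  const rawHandle = str(content.senderId) ?? str(content.sender) ?? str(author?.userId);
  if (!rawHandle) return null;
  // A handle that already contains ':' is treated as pre-namespaced.
  return rawHandle.includes(':') ? rawHandle : `${channelType}:${rawHandle}`;
}

const fromBridge = resolveUserId('discord', { author: { userId: '123' } });
const fromLegacy = resolveUserId('discord', { senderId: 'telegram:9' });
const fromNothing = resolveUserId('slack', {});
```

One consequence worth noting: any handle containing a colon is passed through unchanged, so the check relies on platform-native IDs never containing `:` themselves.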
+41
-1
@@ -5,7 +5,6 @@ export interface AgentGroup {
   name: string;
   folder: string;
   agent_provider: string | null;
-  container_config: string | null; // JSON: { additionalMounts, timeout }
   created_at: string;
 }
 
@@ -180,6 +179,47 @@ export interface PendingCredential {
|
||||
created_at: string;
|
||||
}
|
||||
|
||||
// ── Pending swaps (central DB, builder-agent feature) ──
|
||||
|
||||
/** Classification of a swap's diff — drives approval routing + warning UX. */
|
||||
export type SwapClassification = 'group' | 'host' | 'combined';
|
||||
|
||||
/**
|
||||
* Swap lifecycle status. Transitions:
|
||||
* pending_approval → awaiting_confirmation → (finalized | rolled_back | rejected)
|
||||
* `rejected` is also reachable directly from pending_approval.
|
||||
*/
|
||||
export type SwapStatus =
|
||||
| 'pending_approval'
|
||||
| 'awaiting_confirmation'
|
||||
| 'finalized'
|
||||
| 'rolled_back'
|
||||
| 'rejected';
|
||||
|
||||
/**
|
||||
* Deadman handshake state — only meaningful while status = awaiting_confirmation.
|
||||
* pending_restart — swap applied, container/host restarting, message 1 not yet sent.
|
||||
* message1_sent — handshake prompt delivered, waiting for user confirm/rollback.
|
||||
*/
|
||||
export type SwapHandshakeState = 'pending_restart' | 'message1_sent';
|
||||
|
||||
export interface PendingSwap {
|
||||
request_id: string;
|
||||
dev_agent_id: string;
|
||||
originating_group_id: string;
|
||||
dev_branch: string;
|
||||
commit_sha: string;
|
||||
classification: SwapClassification;
|
||||
status: SwapStatus;
|
||||
summary_json: string;
|
||||
pre_swap_sha: string | null;
|
||||
db_snapshot_path: string | null;
|
||||
deadman_started_at: string | null;
|
||||
deadman_expires_at: string | null;
|
||||
handshake_state: SwapHandshakeState | null;
|
||||
created_at: string;
|
||||
}
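
The lifecycle in the `SwapStatus` doc comment can be encoded as a transition table. This is a sketch only: `ALLOWED`, `canTransition`, and `assertTransition` are hypothetical names, and the table covers just the transitions the comment lists. It does not model the retry bail in `handleSwapRequestApproval`, which resets a row back to `pending_approval`.

```typescript
type SwapStatus =
  | 'pending_approval'
  | 'awaiting_confirmation'
  | 'finalized'
  | 'rolled_back'
  | 'rejected';

// Transitions taken from the doc comment; terminal states allow nothing.
const ALLOWED: Record<SwapStatus, readonly SwapStatus[]> = {
  pending_approval: ['awaiting_confirmation', 'rejected'],
  awaiting_confirmation: ['finalized', 'rolled_back', 'rejected'],
  finalized: [],
  rolled_back: [],
  rejected: [],
};

function canTransition(from: SwapStatus, to: SwapStatus): boolean {
  return ALLOWED[from].includes(to);
}

// Throwing variant for call sites that should never see an illegal move.
function assertTransition(from: SwapStatus, to: SwapStatus): void {
  if (!canTransition(from, to)) {
    throw new Error(`Illegal swap transition: ${from} -> ${to}`);
  }
}
```

A guard like this would sit in front of a status update so an out-of-order approval response fails loudly instead of silently rewriting the row.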
|
||||
|
||||
// ── Agent destinations (central DB) ──
|
||||
|
||||
export interface AgentDestination {
|
||||
|
||||