Commit Graph

139 Commits

Author SHA1 Message Date
gavrielc 9db39b291d fix(agent-runner): require explicit destination addressing, fix per-destination threading
The poll loop had a bare-text routing fallback in dispatchResultText: when
the agent produced text without <message to="..."> wrapping, it would auto-
route to the session's originating channel (via a frozen RoutingContext) or
to the single configured destination. This caused three problems:

1. Routing drift: RoutingContext was extracted once from the initial batch
   and never refreshed. When the initial batch was a null-routed cron task
   and a real chat arrived mid-query, replies were silently dropped to
   scratchpad because the frozen routing had all-null fields.

2. Cross-channel thread bleed: sendToDestination applied a single
   routing.threadId to every outbound message regardless of destination.
   In agent-shared sessions (multiple channels sharing one session), one
   channel's thread ID was stamped onto messages to a different channel.

3. Inconsistent formatting: task, webhook, and system messages had no
   origin metadata in their formatted output, so the agent couldn't tell
   which destination they came from — even when the underlying messages_in
   rows carried routing fields.

Changes:

- Remove the bare-text routing fallbacks in dispatchResultText (both the
  routing-based and single-destination shortcuts). All agent output must
  be wrapped in <message to="name">...</message>. Bare text is scratchpad.

- Update buildDestinationsSection() to require explicit wrapping for all
  groups, including single-destination. No more "no special wrapping
  needed" shortcut.

- Resolve thread_id per-destination via resolveDestinationThread(), which
  queries messages_in for the most recent message matching the target
  channel+platform. Falls back to null (top-level channel message) when
  no prior inbound exists for that destination.

- Extract originAttr() helper in formatter.ts and apply it to all message
  types. Tasks now render as <task from="dest" time="...">, webhooks as
  <webhook from="dest" source="..." event="...">, system responses as
  <system_response from="dest" ...>. The agent always sees where a
  message originated.

- Add a PreCompact shell hook (compact-instructions.ts) that outputs
  custom compaction instructions, telling the compactor to preserve
  recent message XML structure and routing metadata in the summary.
  Wired via settings.json in the .claude-shared scaffold, with a
  migration path (ensurePreCompactHook) for existing groups.

Relation to open PRs:

- #2277 (mergeRouting) becomes unnecessary — the routing fallback it
  patches no longer exists. Can be closed.
- #2327 (post-compaction destination reminder) is complementary — it
  handles the post-compaction push, this handles pre-compaction
  instructions. Both can merge independently.
- #2328 (default routing instruction) is complementary — it adds "reply
  to the from= destination" guidance to the multi-destination section.
  Compatible with the unified instruction format here.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-07 19:47:46 +03:00
gavrielc 144c65e32d Merge branch 'main' into fix/test-infra-openInboundDb 2026-05-05 16:03:16 +03:00
gavrielc 6d6584d120 fix(test-infra): openInboundDb honors in-memory test DB
openInboundDb() always opened /workspace/inbound.db which doesn't exist
in CI. In test mode, return a thin wrapper over the in-memory singleton
that delegates prepare/exec but no-ops close(), so callers' try/finally
cleanup doesn't destroy the shared DB mid-test.

One flag (_testMode), no monkey-patching, no saved-close bookkeeping.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-05 16:02:10 +03:00
Alex Mashkovtsev f68f6da406 fix(agent-runner): derive MCP allowedTools from registered mcpServers
Claude Code 2.1.116+ treats SDK `allowedTools` as a hard whitelist:
servers whose namespace isnt listed are filtered out before the agent
ever sees them, regardless of `permissionMode: bypassPermissions` or
any `permissions.allow` in settings. The static TOOL_ALLOWLIST only
contained `mcp__nanoclaw__*`, so any MCP wired via add_mcp_server (or
directly in container.json) was silently dropped.

Derive `mcp__<sanitized-name>__*` entries at the SDK call site from
the already-aggregated `this.mcpServers` map, mirroring the SDKs own
sanitization rule (chars outside [A-Za-z0-9_-] become _).

Prior diagnosis by @jsboige in #2028 (withdrawn, not upstreamed).
2026-05-04 16:49:53 +08:00
Mike Nolet ceb0b9cf5f fix(test-infra): openInboundDb honors in-memory test DB
initTestSessionDb() creates an in-memory inbound singleton, but
openInboundDb() always opened the hardcoded /workspace/inbound.db
path. Every test that exercised getPendingMessages — directly, or via
test fixtures that load data through it (e.g. poll-loop.test.ts:29
loads formatter test rows via getPendingMessages) — failed with
SQLITE_CANTOPEN under `bun test` outside a real container.

Baseline on main: 34 pass, 25 fail across 6 files. After this fix:
59 pass, 0 fail.

In test mode, openInboundDb returns the in-memory singleton. The
singleton's .close() is no-op'd in initTestSessionDb so caller
try/finally cleanup doesn't tear down the shared DB; closeSessionDb
invokes the saved original close to do the real teardown.

Production behavior is unchanged — _inboundIsTest only flips inside
initTestSessionDb, which is never called outside the test runner.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 08:45:23 +02:00
Mike Nolet 1ebb2dc8d2 fix(poll-loop): slash commands silently broken on warm containers
The follow-up poller filtered /clear out of every tick without acking
the row, and pushed every other slash command through plain
formatMessages() (XML wrapping). On a warm container the outer
while(true) loop never regains control, so:

  - /clear sat pending in messages_in forever (no response at all)
  - /compact, /cost, /context, /files, /remote-control arrived at the
    SDK as XML-wrapped user text and were never dispatched as commands

Both modes are invisible to host monitoring: rows are either left
pending without a processing_ack claim, or marked completed normally;
heartbeat keeps firing inside the SDK event loop.

When the follow-up poller observes any slash command (admin or
passthrough — categorizeMessage decides), end the active query so the
current turn winds down cleanly and the outer loop wakes, re-fetches
the same pending set, and runs them through the canonical path
(/clear handler + formatMessagesWithCommands raw dispatch). Leave the
rows untouched so the outer-loop fetch sees the same set the poller
saw.

Cost: each slash command on a warm container forces close+reopen of
the SDK stream — a few seconds of subprocess startup. The Anthropic
prompt cache is server-side with a 5-min TTL keyed on prefix hash, so
stream lifecycle does not affect cache lifetime; close+reopen within
5 min still gets cache hits.

Also corrects the warm-stream rationale comment on processQuery, which
implied keeping the stream open preserved cache warmth — it doesn't.

Testing evidence — cache stays warm across stream close+reopen:

  Turn 1 (warm session):
    Usage: in=6 out=245 cache_create=92 cache_read=22996
    Full cache hit (22996 tokens).

  Turn 2 — /clear arrives:
    Pending slash command — ending stream so outer loop can process
    Clearing session (resetting continuation)
    Usage: in=6 out=95 cache_create=9393 cache_read=13600
    System prompt + tool defs (~13600 tokens) still hit cache;
    conversation history is gone (continuation reset) so the new turn
    writes fresh context.

  Turn 3 — /cost arrives:
    Pending slash command — ending stream so outer loop can process
    Usage: in=0 out=0 cache_create=0 cache_read=0 wall=0.0s api=0.0s
    /cost is a CLI built-in: dispatched locally by the SDK, no API
    call. Pre-fix this would have arrived as XML-wrapped user text
    and never dispatched — confirms the broader fix works.

  Turn 4 (next chat after /cost):
    Usage: in=6 out=142 cache_create=328 cache_read=22993
    Full cache hit again (22993 tokens read, 328 written). Despite the
    /cost-induced stream close+reopen, the server-side prompt cache
    survived: the new sdkQuery() resumed the same continuation, the
    request prefix matched the cached entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 08:35:14 +02:00
gavrielc 0d836220d9 Merge branch 'main' into pr/inbound-db-fresh-open 2026-05-01 16:29:46 +03:00
exe.dev user 28c38ae28b fix(container): pin vercel to 52.2.1 to dodge broken 53.0.1 publish
vercel@53.0.1 declares a dep on @vercel/static-build@2.9.22 which is not
published on npm (only 2.9.21 exists), breaking every fresh container
build that resolves vercel@latest.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 23:00:02 +00:00
gavrielc 5ab1a2733c review: catch follow-up poll errors + re-check done before push
Two fixes on top of the follow-up pre-task-script work:

1. The void async IIFE inside the interval handler had no catch, so a
   throw from the dynamic import or applyPreTaskScripts escaped as an
   unhandled rejection — terminating the container. The initial-batch
   path is wrapped by processQuery's outer try/catch; the follow-up
   path needs its own. Now logs the error and lets the next tick retry.

2. Re-check `done` immediately before query.push. The flag can flip
   true while applyPreTaskScripts is awaited (outer stream finishes
   during the script execution); without the re-check we'd push into a
   closed query. Claimed messages get released by the host's
   processing-claim sweep — same recovery posture as the rest of the
   poller.

Co-Authored-By: Michael Zazon <mzazon@gmail.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 00:55:46 +03:00
gavrielc 7d29888e59 Merge branch 'main' into fix/poll-loop-prescripts-on-followups 2026-05-01 00:34:45 +03:00
Claw ccfdf2dd75 fix(agent-runner): open inbound.db fresh per messages_in read
Cached singleton can return stale rows on virtiofs/NFS mounts,
causing follow-up messages to silently never be polled. Add
openInboundDb() with mmap_size=0 and switch the three messages_in
readers to it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 15:14:04 -04:00
Mike Nolet 8dd004ca75 fix(scheduling): include routing in schedule_task content JSON
The schedule_task MCP tool wrote routing fields (platform_id, channel_type,
thread_id) onto the outbound system message's row columns, but
handleSystemAction (src/delivery.ts) parses content JSON and forwards only
that to handlers. handleScheduleTask (src/modules/scheduling/actions.ts)
reads content.platformId/channelType/threadId — which the writer never
populated — so every kind='task' row landed in messages_in with all-null
routing.

When host-sweep wakes a scheduled task, dispatchResultText's fast path
requires routing on the message and bails when it's null, falling through
to the "Routing recovery" retry prompt. End-user delivery still works
because the agent can pick a destination from its destinations table on
retry — so the bug went undetected, silently costing one extra LLM turn
per scheduled-task wake. Sessions whose destinations table has no channel
row (e.g. agent-only destinations) fail outright with a recovery loop.

Fix: add the routing fields to the content JSON so the writer matches the
contract handleScheduleTask already expects. cancel/pause/resume/update_task
operate by id alone and don't need routing.
2026-04-30 08:13:59 +02:00
robbyczgw-cla 9889848932 fix(claude-provider): respect operator-set CLAUDE_CODE_AUTO_COMPACT_WINDOW
Closes #1820.

The container agent-runner sets CLAUDE_CODE_AUTO_COMPACT_WINDOW
unconditionally on the container process env, with no way to override
it per-deployment without editing source. Read process.env first and
fall back to the existing 165000 literal when unset.

Default behavior is unchanged for installs that do not set the env
var. Operators running 1M-context models or emergency-tuning a live
deployment can now raise or lower the threshold from the host env.
2026-04-29 15:07:26 +00:00
robbyczgw-cla ef8e3aa1b8 fix(poll-loop): apply pre-task scripts to follow-up injections too
Tasks arriving during an active query were pushed into the stream as
follow-ups without running their `script` gate — so a wakeAgent=false
pre-script that was supposed to suppress the tick silently leaked
through and woke the agent every time. Evidence: monitoring cron
firing every 10 min with [task-script] log lines never showing.

Run applyPreTaskScripts on the follow-up batch too: wakeAgent=false
tasks get marked completed and dropped; wakeAgent=true tasks have
scriptOutput enriched exactly like the initial-batch path. Added a
pollInFlight guard to serialize async runs and avoid overlapping
script executions when the interval fires while one is still going.

Wrapped in a MODULE-HOOK:scheduling-pre-task-followup marker block
to match the existing initial-batch hook convention.
2026-04-29 14:55:47 +00:00
Adam 81ef193e69 refactor(session-state): key continuations per provider to survive provider switches
Before, every provider stored its opaque continuation id under the
single outbound.db key `sdk_session_id`. Flipping a session's
agent_provider (e.g. Codex → Claude) meant the new provider read the
old provider's id at wake, handed it to its own SDK, and got a
"No conversation found" error that cost the user one sacrificed
message before the stale-session recovery path cleared the id.

This reshapes session_state so continuations are keyed
`continuation:<provider>` instead. Consequences:

- Per-provider continuations coexist. Flipping Claude → Codex → Claude
  resumes the Claude thread exactly where it left off, with the
  intervening Codex thread also still on file.
- No provider ever reads another provider's id. Switching costs no
  sacrificed message and emits no transient error.
- Legacy installs are migrated forward on first startup:
  migrateLegacyContinuation() adopts any pre-existing `sdk_session_id`
  row into the current provider's slot (best guess — it was whichever
  provider ran last), then deletes the legacy row unconditionally so
  it can't poison a future provider's read.

runPollLoop now takes providerName alongside the provider instance,
and threads it through processQuery to setContinuation on init.

Tests: 9 new tests covering set/get isolation across providers,
clear-specificity, legacy-adoption, legacy-always-deleted,
prefer-existing-slot-over-legacy, and idempotency of a second
migration call.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:34:28 +10:00
gavrielc dd5bc85b02 refactor(skill/atomic-chat-tool): ship MCP file in skill folder, revert src edits
The initial /add-atomic-chat-tool merge added src edits directly to main.
That conflicts with the utility-skill pattern used elsewhere (e.g. /claw):
the skill folder should ship the file and SKILL.md should instruct copy +
idempotent edits at install time, not a git merge that carries src diffs.

- Move container/agent-runner/src/atomic-chat-mcp-stdio.ts →
  .claude/skills/add-atomic-chat-tool/atomic-chat-mcp-stdio.ts
- Revert the atomic_chat mcpServers entry in agent-runner index.ts
- Revert mcp__atomic_chat__* from TOOL_ALLOWLIST in providers/claude.ts
- Revert ATOMIC_CHAT_* env forwarding and [ATOMIC] log elevation in
  src/container-runner.ts
- Empty .env.example back out
- Rewrite SKILL.md: copy the shipped file, then apply deterministic Edits
  (index.ts, providers/claude.ts, container-runner.ts, .env.example)
  with exact before/after snippets the installer agent can match.

Main is now back to its pre-PR state for the tool; /add-atomic-chat-tool
re-applies everything at install time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 16:29:10 +03:00
Misha Skvortsov 3a9b98f1a4 feat: add Atomic Chat MCP tool skill
Exposes local Atomic Chat models (OpenAI-compatible API at
127.0.0.1:1337/v1) as tools to the container agent. Adds
atomic_chat_list_models and atomic_chat_generate alongside
the existing Ollama skill.

Rebased on current main:
- MCP server registered in agent-runner index.ts using bun (no tsc
  step in-image), sibling path to index.ts, env: {} with ATOMIC_CHAT_*
  forwarded when set.
- allowedTools entry moved to providers/claude.ts TOOL_ALLOWLIST.
- SKILL.md: drop obsolete per-group copy step (single RO mount
  supersedes it); use pnpm build.

Made-with: Cursor
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 16:18:34 +03:00
gavrielc 7a9401ddf2 feat(setup): per-checkout service name and docker image tag
Two NanoClaw installs on the same host used to fight over the shared `com.nanoclaw` launchd label / `nanoclaw.service` systemd unit and the `nanoclaw-agent:latest` docker tag — the second install silently rewrote the service pointer and rebuilt the image out from under the first. Introduces a deterministic per-checkout slug (sha1(projectRoot)[:8]) and namespaces everything off it:

- Service: `com.nanoclaw-v2-<slug>` / `nanoclaw-v2-<slug>.service`
- Image:   `nanoclaw-agent-v2-<slug>:latest` (base), `nanoclaw-agent-v2-<slug>:<agentGroupId>` (per-group)

New shared helpers: src/install-slug.ts (host) + setup/lib/install-slug.sh (bash). Both compute the same slug so verify/probe/add-*.sh/build.sh/container-runner all agree. Any v1 `com.nanoclaw` service left on the host stays untouched and can coexist.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 10:10:09 +03:00
gavrielc 6cd261a26d chore(container): loosen /home/node to 0777
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 01:05:03 +03:00
gavrielc 39ae04df98 docs(welcome-skill): expand onboarding tour with capability reveals and trust framing
Adds a structured drip-feed of capabilities (memory, agents, scheduling, research, code, UI, files, self-customization) and explicit sections on approvals, access control, and natural interaction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 19:04:00 +03:00
gavrielc d2f53048f2 docs(module-fragments): add instructions for create_agent, interactive, and remaining core tools
Three MCP tool groups were orphaned from the ambient CLAUDE.md context
because they shipped no `*.instructions.md` alongside their source.
Backfill them so the composer picks them up as fragments on next spawn:

- core.instructions.md: add `send_file` (artifact delivery, path relative
  to /workspace/agent/) and `add_reaction` (by `#N` id with emoji
  shortcode name).
- interactive.instructions.md: `ask_user_question` (blocking
  multiple-choice with selectedLabel/value option objects, 300s default
  timeout) and `send_card` (non-blocking structured render with
  fallbackText). Opens with a one-line framing of the contrast between
  the two.
- agents.instructions.md: `create_agent` with how-it-works, when-to-use
  (companions vs collaborators — persistent memory vs independent
  parallel work), when-NOT-to-use (short tasks should use the SDK `Agent`
  tool instead), and guidance for writing the seed instructions string.

No composer changes — scan in `src/claude-md-compose.ts` already picks up
any file matching `*.instructions.md` in the mcp-tools directory.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 18:05:50 +03:00
gavrielc 52ebdce9c9 docs(claude-md): drop host-facing header comment from shared base
The HTML comment at the top was aimed at maintainers opening the file,
but it's loaded verbatim into every agent's system prompt via the
`.claude-shared.md` import. Agents don't need the meta-explanation of
where the file is mounted or how identity gets injected — it's just
context-budget drag. Move the maintainer guidance out of the agent's
view.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 17:47:30 +03:00
gavrielc e2b1df876b docs(claude-md): strengthen memory discipline in shared base
Replace the passing CLAUDE.local.md mention with an explicit Memory
section: anything substantive the user shares must be stored so it's
retrievable later, with per-topic files indexed from CLAUDE.local.md.
Frames this as a core part of the agent's job — the quality of its
memory systems is a main signal of how useful it is.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 17:46:59 +03:00
gavrielc 3b8240a91b refactor(self-mod): drop request_rebuild — approvals now bundle rebuild+restart
install_packages and add_mcp_server already did the right thing on approve
(install auto-rebuilt+killed, add_mcp_server just killed), so request_rebuild
was redundant plumbing agents sometimes called after an install — wasting an
admin approval round-trip. Delete it end-to-end:

- container/agent-runner/src/mcp-tools/self-mod.ts: remove requestRebuild
  tool + registration; update install_packages description.
- src/modules/self-mod/{request,apply,index}.ts: drop handleRequestRebuild
  + applyRequestRebuild + registrations; rewrite the rebuild-failed notify
  to point admins at retrying install_packages instead.
- src/modules/{approvals,self-mod}/{agent,project}.md and skill/self-
  customize/SKILL.md: scrub agent-facing references; clarify that
  add_mcp_server needs no rebuild (bun runs TS directly).
- docs/{module-contract,architecture-diagram,checklist,db-central,shared-
  source,v1-vs-v2/*}.md, CLAUDE.md, pending-approvals migration comment,
  approvals/index.ts docstring, REFACTOR.md: trailing references.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 17:28:36 +03:00
gavrielc e64bdb3016 refactor(claude-md): split shared base into module fragments, inject name at runtime
Move every agent-specific instruction out of the shared container/CLAUDE.md
so the base is genuinely universal. Persona/identity now comes from the
system-prompt addendum (buildSystemPromptAddendum now takes assistantName
and prepends "# You are {name}"). Per-module instructions live alongside
each MCP tool source:

  container/agent-runner/src/mcp-tools/core.instructions.md
  container/agent-runner/src/mcp-tools/scheduling.instructions.md
  container/agent-runner/src/mcp-tools/self-mod.instructions.md

composeGroupClaudeMd() scans that directory and emits `module-<name>.md`
fragments as symlinks to /app/src/mcp-tools/<name>.instructions.md (valid
via the existing RO source mount). Skill fragments renamed to
`skill-<name>.md` for naming consistency with `module-*` and `mcp-*`.

Mount tightening so composer-managed files can't be clobbered by agent
writes: nested RO mounts for /workspace/agent/CLAUDE.md and
/workspace/agent/.claude-fragments/. CLAUDE.local.md (per-group memory)
stays RW as the only writable CLAUDE.md-family file.

.gitignore: ignore CLAUDE.local.md, .claude-shared.md, .claude-fragments/
everywhere, and simplify groups/ rules to ignore the whole tree (per-
installation state, not tracked).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 17:14:51 +03:00
exe.dev user 7da24b166d fix(agent-runner): remove thread_id filter and fix processing ack on empty result
The concurrent poll in processQuery filtered out messages with
mismatched thread_ids, causing a deadlock when the initial batch
(e.g. a host-generated welcome trigger with null thread_id) completed
but follow-ups arrived with a different thread_id (e.g. a Discord DM).
The query stayed open waiting for matching-thread pushes that never
came, blocking the poll loop indefinitely.

Thread routing is the router's concern — per-thread sessions already
isolate threads into separate containers; shared sessions intentionally
merge everything. Removed the filter.

Also fixed processing_ack: a result event (with or without text) means
the turn is done, but markCompleted only ran when event.text was truthy.
When the agent responded via MCP send_message (empty result text), the
initial batch stayed in 'processing' for the query's lifetime, creating
false stuck signals in the host sweep. Now marks completed on any result
event.

Belt-and-suspenders: init-first-agent welcome trigger now sets threadId
to the DM platform_id instead of null.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-22 10:42:56 +00:00
gavrielc c8fc1da719 refactor(claude-md): compose per-group CLAUDE.md from shared base + fragments
Replace the per-group "written once at init, owned by the group" CLAUDE.md
with a host-regenerated entry point that imports:

  - a shared base (`container/CLAUDE.md` mounted RO at `/app/CLAUDE.md`)
  - optional per-skill fragments (skills that ship `instructions.md`)
  - optional per-MCP-server fragments (inline `instructions` field in
    `container.json`)
  - per-group agent memory (`CLAUDE.local.md`, auto-loaded by Claude Code)

Principle: RW = per-group memory, RO = shared content. Source/skills/base
are shared; personality, config, working files, and Claude state stay
per-group.

Key changes:

  - New `src/claude-md-compose.ts` — per-spawn composition +
    `migrateGroupsToClaudeLocal()` one-time cutover.
  - New `container/CLAUDE.md` — shared base, seeded verbatim from the
    former `groups/global/CLAUDE.md`.
  - `src/container-runner.ts` — swap `/workspace/global` mount for RO
    `/app/CLAUDE.md`; call `composeGroupClaudeMd()` after
    `initGroupFilesystem()`.
  - `src/group-init.ts` — drop `.claude-global.md` symlink + initial
    `CLAUDE.md` write; seed `CLAUDE.local.md` from `opts.instructions`.
  - `src/index.ts` — call `migrateGroupsToClaudeLocal()` at startup.
  - `src/container-config.ts` — add optional `instructions` field to
    `McpServerConfig` (inline per-MCP guidance fragment).
  - `container/Dockerfile` — drop dead `/workspace/global` mkdir.
  - Remove obsolete `scripts/migrate-group-claude-md.ts`.

Migration (runs once at host startup, idempotent):

  - Delete `.claude-global.md` symlinks in each group.
  - Rename each `groups/<folder>/CLAUDE.md` → `CLAUDE.local.md`
    (preserves existing per-group content as memory).
  - Delete `groups/global/` directory.

Design docs: `docs/claude-md-composition.md` and `docs/shared-source.md`
(the latter is the sibling design discussion this refactor builds on).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 12:58:43 +03:00
exe.dev user 8a12fa61ac refactor: shared source — replace per-group agent-runner copies with single RO mount
Replace the per-group agent-runner-src copy model with a single shared
read-only mount. Source and skills are now RO + shared; personality,
config, working files, and Claude state stay RW + per-group.

Key changes:
- Mount container/agent-runner/src/ RO at /app/src (all groups share one copy)
- Mount container/skills/ RO at /app/skills; per-group skill selection via
  symlinks in .claude-shared/skills/ based on container.json "skills" field
- Mount container.json as nested RO bind on top of RW group dir
- Move all NANOCLAW_* env vars to container.json (runner reads at startup)
- New runner config.ts module replaces process.env reads
- Move command gate (filtered/admin) from container to host router
- Dockerfile: remove source COPY, split CLI installs (claude-code last),
  move agent-runner deps above CLIs for better layer caching
- Add writeOutboundDirect for router denial responses
- Design doc at docs/shared-src.md

Not included (follow-up): DB migration to drop agent_provider columns,
cleanup of orphaned agent-runner-src directories.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-22 12:58:43 +03:00
gavrielc 1e0f7d631d Merge pull request #1884 from ssimeonov/fix/sdk-binary-resolution
fix(container): v2 -- point SDK to pnpm-installed Claude Code binary
2026-04-22 09:55:24 +03:00
gavrielc e49cbce429 Merge pull request #1883 from ssimeonov/fix/claude-code-version-bump
fix(container): v2 -- bump Claude Code to 2.1.116 and Agent SDK to ^0.2.116
2026-04-21 21:56:26 +03:00
Dave Kim 91c668e0cc fix: persist SDK session_id on init + split long messages before adapter truncation
Two related bugs that surfaced together when a Discord response exceeded
2000 chars:

1. **Session id lost on mid-turn container exit.** `runPollLoop` was
   calling `setStoredSessionId` only after `processQuery` returned. If
   the container died between the SDK's `init` event (where session_id
   arrives) and the stream completing, the id was never persisted. The
   next wake called `getStoredSessionId()` → undefined and started a
   fresh Claude session, dropping all prior context. Fix: persist
   immediately in the `init` branch inside `processQuery`. The existing
   post-query store becomes a harmless no-op.

2. **Silent truncation past adapter limits.** `chat-sdk-bridge.deliver`
   handed full text straight to `adapter.postMessage`. Discord's adapter
   hard-truncates at 2000 chars; Telegram's at 4096. Responses longer
   than that were cut off without any signal to the user or host. Fix:
   add `maxTextLength` to `ChatSdkBridgeConfig` and a `splitForLimit`
   helper that breaks on paragraph → line → hard-char boundaries, then
   posts chunks sequentially. Files ride on the first chunk; the
   returned id is the first chunk's so edits and reactions still target
   the reply head.

Channel adapter files (Discord, Telegram, …) live on the `channels`
branch — a companion PR wires `maxTextLength: 1900` for Discord and
`4000` for Telegram so the splitter actually engages in those installs.
Without wiring, behavior is unchanged.
2026-04-21 13:04:57 +00:00
gavrielc 01ffce6f74 Revert "fix(permissions): welcome new approved channels via /welcome, route to them"
This reverts commit 9776dd4f32.
2026-04-21 15:20:06 +03:00
Koshkoshinsk 9776dd4f32 fix(permissions): welcome new approved channels via /welcome, route to them
When the unknown-channel approval flow completes, seed a /welcome task
into the newly-wired session so the agent greets the new user on first
contact. The replayed /start (Telegram's default first-message) is filtered
by the agent-runner's command-command filter, so without an explicit
onboarding trigger the first interaction produced nothing.

Pin the destination by its local_name from agent_destinations to avoid the
agent picking the wrong named destination (previously it greeted the owner,
whose DM is in CLAUDE.md).

Also guard dispatchResultText against echoing trailing status lines when
the agent has already sent messages explicitly via send_message. Otherwise
a task-triggered flow that calls send_message then emits "welcome message
sent" produces a duplicate chat to the recipient.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 11:40:12 +00:00
Simeon Simeonov f0090ebbb9 fix(container): point SDK to pnpm-installed Claude Code binary
The Agent SDK's default binary resolution picks the musl-linked native
binary (claude-agent-sdk-linux-arm64-musl), which cannot execute on the
Debian-based container image (glibc). Explicitly set
pathToClaudeCodeExecutable to /pnpm/claude — the pnpm global symlink
that resolves to the correct glibc binary regardless of architecture.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 23:28:54 -04:00
Simeon Simeonov 63b8beb0fb fix(container): bump Claude Code to 2.1.116 and Agent SDK to ^0.2.116
The Agent SDK's IPC protocol must match the Claude Code version. Also
allowlist @anthropic-ai/claude-code in only-built-dependencies so its
postinstall script runs during Docker build — without this, the native
binary is never installed and the SDK fails at spawn time.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 23:28:35 -04:00
gavrielc 866b7915b5 fix(container): add /start to filtered commands
Telegram clients send /start when a user first DMs a bot (and when they
tap "Start" on a bot profile). It's a platform handshake, not a
user-intended prompt — forwarding it to the agent wastes a turn and
produces a confused response.

Matches the existing filter pattern for /help, /login, /logout,
/doctor, /config.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 18:25:32 +03:00
gavrielc 31f2da9585 fix(container): gate poll loop on trigger=1 to honor accumulate contract
A warm container picks up every pending messages_in row on each poll tick
and calls markProcessing → agent.query → markCompleted. Before this, that
included trigger=0 rows (ignored_message_policy='accumulate' context),
causing the agent to wake and potentially respond to messages the wiring
had explicitly opted out of engaging on — defeating accumulate's "store
as context, don't engage" contract.

Gate the main poll loop with `messages.some(m => m.trigger === 1)` —
mirrors host-side countDueMessages which is already gated. If the batch
has no wake-eligible row, sleep and leave them pending. They ride along
via the same getPendingMessages query the next time a real trigger=1
lands, which is the intended accumulate behavior.

The concurrent active-turn poll (line ~290) is unchanged on purpose —
once the agent has engaged, pushing in accumulate rows mid-turn as
additional context is desired.

initTestSessionDb was missing the trigger and series_id columns on
messages_in, out of sync with the live migration. Added both so the new
tests (and any future trigger-aware tests) can run.

Four tests cover the data contract: trigger=0 rows are returned by
getPendingMessages (so they ride along), the gate predicate correctly
identifies accumulate-only batches, mixed batches pass the gate, and the
schema default of 1 applies when the column is omitted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 11:23:47 +03:00
gavrielc 16b9499532 feat(routing): engage modes + sender scope + accumulate/drop + per-agent fan-out
Replaces the opaque trigger_rules JSON + response_scope enum on
messaging_group_agents with four explicit orthogonal columns:

    engage_mode            'pattern' | 'mention' | 'mention-sticky'
    engage_pattern         regex source; required when mode='pattern';
                           '.' is the "always" sentinel
    sender_scope           'all' | 'known'
    ignored_message_policy 'drop' | 'accumulate'

Inbound routing becomes a fan-out — every wired agent is evaluated
independently. A match gets its own session + container wake. A miss
with accumulate keeps the message as context-only (trigger=0) in that
agent's session, so when the agent does eventually engage it sees the
prior chatter.

## Schema

- Migration 010 (`engage-modes`): adds the 4 new columns, backfills
  from trigger_rules.pattern + requiresTrigger + response_scope, drops
  the legacy columns.
- messages_in gains `trigger INTEGER NOT NULL DEFAULT 1` (session DB
  schema + `migrateMessagesInTable` forward-compat).
- countDueMessages gates waking on `trigger = 1`.

## Routing

- `pickAgent` (returns one) → loop over all wired agents. Per agent:
  evaluate engage_mode; run access gate + sender-scope gate; on full
  match → resolveSession + writeSessionMessage(trigger=1) + wake. On
  miss with accumulate → writeSessionMessage(trigger=0), no wake. On
  miss with drop → skip.
- New `findSessionForAgent(agentGroupId, mgId, threadId)` scopes
  session lookup by agent so fan-out doesn't cross sessions.
- `messageIdForAgent` namespaces inbound message ids by agent_group_id
  so PRIMARY KEY doesn't collide across per-agent session DBs.

## Adapter layer

- `ConversationConfig` replaces `triggerPattern` + `requiresTrigger`
  with `engageMode` + `engagePattern`.
- Chat SDK bridge stores `Map<platformId, ConversationConfig[]>` (multi-
  agent per conversation) and applies union gating pre-onInbound:
    * onSubscribedMessage: engage if any wiring keeps firing in
      subscribed state (mention-sticky or pattern)
    * onNewMention: engage on mention; only subscribes the thread if
      at least one wiring is `mention-sticky`
    * onDirectMessage: engage per mode; sticky follows same rule
- Bridge no longer unconditionally calls `thread.subscribe()`.

## Sender scope

- Permissions module registers a second hook `setSenderScopeGate` that
  runs per-wiring after the existing access gate. `sender_scope='known'`
  requires canAccessAgentGroup(); `'all'` is a no-op. Not installed →
  no-op everywhere (default allow).

## Container side

- Host passes `NANOCLAW_MAX_MESSAGES_PER_PROMPT` (reuses existing
  MAX_MESSAGES_PER_PROMPT config; was dead code from v1).
- `getPendingMessages` queries `ORDER BY seq DESC LIMIT N`, reverses to
  chronological order for the prompt — accumulated context rides along
  with trigger rows up to the cap.
- `MessageInRow` gains `trigger: number` so the container can tell them
  apart in downstream code (container still processes both; only the
  host uses `trigger=0` for don't-wake).

## Defaults (per ACTION-ITEMS item 1 decision)

- DM (is_group=0): `engage_mode='pattern'`, `engage_pattern='.'` (always)
- Threaded group: `engage_mode='mention-sticky'` (seed-discord)
- Non-threaded group / CLI: pattern '.' in bootstrap scripts

## Tests

- src/host-core.test.ts: 3 new cases — fan-out (2 agents, 2 sessions,
  2 wakes), accumulate (trigger=0 + no wake), drop (no session created).
- Existing 10 host-core tests still pass.
- Migration 010 runs on an empty DB in 0-row path — verified.

Closes: ACTION-ITEMS items 1, 4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 01:30:04 +03:00
gavrielc 6a815190c0 feat(lifecycle): stuck detection + heartbeat lifecycle + SDK tool blocklist
Replaces the two overlapping old mechanisms (30-min setTimeout kill in
container-runner, 10-min heartbeat STALE_THRESHOLD reset in host-sweep)
with message-scoped stuck detection anchored to the processing_ack claim
age + an absolute 30-min ceiling that extends for long-declared Bash
tools.

Old model problems:
- IDLE_TIMEOUT setTimeout fired on plain wall-clock time; slow-but-alive
  agents got killed at 30min regardless of activity
- 10-min STALE_THRESHOLD in the sweep was unreliable — the heartbeat is
  only touched on SDK events, so legitimate silent tool work (sleep 30,
  long WebFetch, npm install) looked identical to a hung container
- Two overlapping sources of truth for "when to let go of a container"

New model:
- Host sweep is the single source of truth.
- Container exposes a new `container_state` single-row table in outbound.db
  (schema added; container writes, host reads). PreToolUse hook writes
  current_tool + tool_declared_timeout_ms (read from Bash's tool_input);
  PostToolUse / PostToolUseFailure clear it.
- Sweep decides with a pure helper `decideStuckAction`:
    * absolute ceiling — kill if heartbeat age > max(30min, bash_timeout)
    * per-claim stuck  — kill if any processing_ack row has claim_age >
      max(60s, bash_timeout) AND heartbeat hasn't been touched since claim
    * otherwise ok
  Kill paths reset leftover processing rows with exponential backoff,
  reusing the existing retry machinery.

Tool blocklist expanded:
- AskUserQuestion (SDK placeholder; we have mcp__nanoclaw__ask_user_question)
- EnterPlanMode, ExitPlanMode, EnterWorktree, ExitWorktree (Claude Code UI
  affordances; would hang in headless containers)
PreToolUse hook is also defense-in-depth: if a disallowed tool name slips
through, it returns `{ decision: 'block' }` so the agent sees a clear
error instead of appearing stuck.

Removed:
- container-runner.ts: IDLE_TIMEOUT setTimeout, resetIdle callback on
  activeContainers entry, resetContainerIdleTimer export.
- delivery.ts: the resetContainerIdleTimer call on successful delivery.
- poll-loop.ts: IDLE_END_MS + its setInterval. Keeping the query open is
  cheaper than close+reopen (no cold prompt cache). Liveness is now a
  host-side concern.
- host-sweep.ts: 10-min STALE_THRESHOLD_MS + getStuckProcessingIds in the
  stale-detection path (still exported for kill reset).

Tests:
- src/host-sweep.test.ts — 9 tests for decideStuckAction covering: fresh
  heartbeat, absolute ceiling, absent heartbeat, Bash-timeout extension
  (both ceiling and per-claim), claim age below tolerance, heartbeat
  touched after claim, unparseable timestamps.

Ref: docs/v1-vs-v2/ACTION-ITEMS.md items 9, 6a, 10.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 01:16:57 +03:00
gavrielc dcfa12ea06 feat(timezone): recreate v1 TZ-aware formatting + scheduling behavior
The agent needs to perceive times in the user's timezone, not UTC.
Dropping this in the v1→v2 port produced a class of bugs where the agent
would schedule tasks for the wrong hour, suggest dinner at midnight, etc.
This restores v1 parity.

Container side:
- New container/agent-runner/src/timezone.ts mirrors src/timezone.ts with
  isValidTimezone / resolveTimezone / formatLocalTime, plus:
  * TIMEZONE constant resolved at load from process.env.TZ (host sets this
    from src/container-runner.ts:254)
  * parseZonedToUtc(input, tz) — treats a naive ISO as wall-clock time in
    `tz`, returns the corresponding UTC Date. Strings with Z or offset
    are passed through.

- formatter.ts:
  * formatMessages() now prepends <context timezone="IANA"/>\n — matches
    v1 src/v1/router.ts:20-22
  * formatSingleChat uses formatLocalTime(ts, TIMEZONE) instead of a
    home-rolled HH:MM 24h formatter → outputs like "Jun 15, 2026, 8:00 AM"
  * reply_to="<id>" attribute + <quoted_message from="X">Y</quoted_message>
    element — matches v1 format exactly; old <reply-to/> shape is gone
  * stripInternalTags() exported for the dispatch path to reuse

- poll-loop.ts uses the exported stripInternalTags() instead of inline regex.

- mcp-tools/scheduling.ts:
  * schedule_task/update_task descriptions now explicitly document that
    processAfter accepts either UTC or naive local time (interpreted in
    the user's TZ from the context header)
  * handlers normalize through parseZonedToUtc() and store a UTC ISO

Host side:
- src/modules/scheduling/recurrence.ts passes { tz: TIMEZONE } to
  CronExpressionParser.parse. Without this, "0 9 * * *" fires at 09:00
  UTC instead of 09:00 user-local — this was the v1 behavior
  (src/v1/task-scheduler.ts:20-49).

Tests:
- container/agent-runner/src/timezone.test.ts — mirror of src/timezone.test.ts
  + new parseZonedToUtc cases
- container/agent-runner/src/formatter.test.ts — context header, reply_to,
  quoted_message, XML escaping, stripInternalTags (ported from v1
  formatting.test.ts)
- src/modules/scheduling/recurrence.test.ts — cron TZ respected, completed
  rows only cloned when recurrence is set

Ref: docs/v1-vs-v2/ACTION-ITEMS.md item 18 + timezone-formatting-v1-recreation.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 01:09:14 +03:00
gavrielc 47950671fa docs: add v1→v2 action-items analysis + SDK signal probe tool
- docs/v1-vs-v2/: full v1→v2 regression analysis (SUMMARY + 21 per-module
  docs + ACTION-ITEMS rollup with decisions + timezone recreation spec).
- container/agent-runner/scripts/sdk-signal-probe.ts: empirical harness
  used to characterise Claude Agent SDK event/hook/stderr timing for the
  stuck-detection design in item 9.
- src/channels/chat-sdk-bridge.ts: document the conversations Map staleness
  in a code comment; fix deferred to when dynamic group registration lands
  (ACTION-ITEMS item 17).

No runtime behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 01:00:04 +03:00
gavrielc 7169c25e70 refactor: relocate outbox I/O to session-manager + dead-code sweep
## Outbox extraction (delivery.ts → session-manager.ts)

File I/O for outbound attachments now lives in session-manager.ts alongside
the symmetric inbound extractAttachmentFiles. delivery.ts no longer touches
the filesystem — it hands buffers to the adapter and calls clearOutbox on
success.

- New `readOutboxFiles(agentGroupId, sessionId, messageId, filenames)` and
  `clearOutbox(agentGroupId, sessionId, messageId)` in session-manager.ts.
- deliverMessage in delivery.ts loses ~35 lines of fs/path code and its
  `fs`/`path` imports.

## Dead-code sweep

TypeScript's --noUnusedLocals surfaced several cruft imports. Fixed:

- src/container-runner.ts: drop unused `markContainerIdle` import; drop
  unused `session` parameter from `buildContainerArgs` signature.
- src/delivery.ts: drop unused `getSession`, `writeSessionMessage`,
  `wakeContainer` imports.
- src/host-sweep.ts: drop unused `updateSession`, `outboundDbPath` imports.
- container/agent-runner/src/poll-loop.ts: drop unused `config`,
  `processingIds` params from `processQuery`.
- Test files: drop unused imports in channel-registry.test, db-v2.test,
  host-core.test.

Skipped: `conversations` state in chat-sdk-bridge.ts (never read but
tangled with public `updateConversations` method; cleaning it risks a
merge conflict with the channels branch at the next sync).

## Validation

- `pnpm run build` clean
- `pnpm test` — 137 host tests pass
- `bun test` in container/agent-runner — 17 tests pass
- Service boots (`NanoClaw running`, `OneCLI approval handler started`)
  and shuts down cleanly on SIGTERM

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 21:34:08 +03:00
gavrielc 473f766585 refactor(modules): extract scheduling as registry-based module
Moves the scheduling surface — 5 delivery actions (schedule_task,
cancel_task, pause_task, resume_task, update_task), handleRecurrence,
applyPreTaskScripts, and task DB helpers — out of core and into
src/modules/scheduling/ (host) and container/agent-runner/src/scheduling/
(container).

First PR to fill the MODULE-HOOK markers introduced in PR #2:
  - src/host-sweep.ts MODULE-HOOK:scheduling-recurrence now dynamically
    imports handleRecurrence from the module each sweep tick.
  - container/agent-runner/src/poll-loop.ts MODULE-HOOK:scheduling-pre-task
    dynamically imports applyPreTaskScripts before the provider call.
    When the marker block is empty (scheduling uninstalled), `keep`
    falls back to `normalMessages` so non-task messages still flow.

The 5 task cases are removed from delivery.ts's handleSystemAction
switch — the registry now routes them. Task DB helpers moved out of
src/db/session-db.ts (which kept `nextEvenSeq` as a named export so
the module can uphold the host-writes-even-seq invariant). Test suite
split to match: scheduling-specific tests live in the module.

No migration — tasks are messages_in rows with kind='task'.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 16:17:47 +03:00
gavrielc 4202041d0b refactor: scaffold module registries and default-module layout
Additive change — existing code paths still run via inline fallbacks.
Prepares core for per-module extractions in PR #3 onward.

Four registries added with empty defaults:
  - delivery action handlers (delivery.ts)
  - router inbound gate (router.ts)
  - response dispatcher (index.ts)
  - MCP tool self-registration (container/agent-runner/src/mcp-tools/server.ts)

Default modules moved to src/modules/ for signaling:
  - src/modules/typing/       (extracted from delivery.ts)
  - src/modules/mount-security/ (moved from src/mount-security.ts)

Both are imported directly by core — no hook, no registry. Removal
requires editing core imports.

Migrator now keys applied rows by name (uniqueness) so module
migrations can pick arbitrary version numbers. Stored version column
is auto-assigned as an applied-order sequence.

sqlite_master guards added around core calls into module-owned tables
(user_roles, agent_destinations, pending_questions). No-ops today;
load-bearing after the owning modules are extracted.

MODULE-HOOK markers placed at scheduling's two skill-edit sites
(host-sweep.ts recurrence call, poll-loop.ts pre-task gate). PR #4
replaces the marked blocks when scheduling moves to its module.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 14:46:19 +03:00
gavrielc 86becf8dea chore: delete v1 reference code
Removes src/v1/ (37 files) and container/agent-runner/src/v1/ (3 files)
along with the v1 reference note in CLAUDE.md and the now-obsolete
tsconfig exclude. v1 was already out of the runtime path; this just
removes the dead weight.

~8,800 LOC removed, zero runtime change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 14:23:47 +03:00
gavrielc e93292de2a fix(agent-runner): spawn built-in MCP server with bun, not node
The Bun migration (c5d0ef8) dropped the in-image tsc build step, so
/app/src/mcp-tools/index.js never exists — only index.ts. The spawn
config in container/agent-runner/src/index.ts still pointed at
index.js and invoked it with `node`, which can't execute TypeScript
anyway. Net effect: every session failed to start the `nanoclaw`
MCP server, so scheduling, send_to_agent, interactive questions,
and self-mod tools were silently absent from the agent's toolset.

Matches entrypoint.sh and src/container-runner.ts, which already
use `exec bun run /app/src/index.ts` for the same reason.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 17:17:45 +03:00
gavrielc e0258e8c1b refactor(v2): move opencode provider off v2 trunk
v2 ships with only claude baked in. opencode now lives on the `providers`
branch and gets copied in via the /add-opencode skill.

Removed:
- src/providers/opencode.ts
- container/agent-runner/src/providers/{opencode,mcp-to-opencode}.ts + test
- @opencode-ai/sdk from agent-runner package.json + bun.lock
- opencode-ai global install + OPENCODE_VERSION ARG from Dockerfile
- opencode self-registration imports from both provider barrels
- opencode test case from factory.test.ts

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 14:10:56 +03:00
Tal Moskovich 22150261c5 feat(v2): OpenCode agent provider
- Add OpenCodeProvider (SSE, session resume, MCP via mcp-to-opencode)
- Register opencode in factory; AGENT_PROVIDER passthrough from DB
- Host: XDG mount, NO_PROXY merge, OPENCODE_* env for opencode sessions
- Dockerfile: opencode-ai CLI; docs checklist + architecture diagram
- Skill add-opencode for v2; AgentProviderName in src/types.ts

Made-with: Cursor
2026-04-17 12:20:22 +03:00
gavrielc 1f3b023a5a refactor(v2/providers): self-registration barrel + host container-config registry
Providers now mirror the channels pattern: each module calls
registerProvider() at top level, and providers/index.ts is a barrel of
side-effect imports. createProvider() becomes a thin registry lookup;
the closed ProviderName union is gone (now a string alias, since the
env var is a runtime string anyway).

Also adds a host-side provider-container-registry so providers can
declare their own mounts and env passthrough in src/providers/<name>.ts
instead of the container-runner having to know about each one. The
resolver runs once per spawn and threads provider + contribution
through buildMounts and buildContainerArgs so side effects (mkdir,
etc.) fire exactly once.

Both barrels are append-only — adding a new provider is a new file
+ one import line per barrel, no edits to existing files. The built-in
providers (claude, mock) don't need host-side config, so src/providers/
ships with an empty barrel; the container-side barrel imports both.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 12:17:09 +03:00
gavrielc c5d0ef8b4f feat(v2): migrate container runtime to Bun, improve image build surface
Container side:
- agent-runner switches to Bun. Drops better-sqlite3 (native compile gone),
  drops tsc build step in-image AND the tsc-on-every-session-wake in the
  entrypoint — bun runs src/index.ts directly. bun:sqlite replaces
  better-sqlite3; cross-mount DB invariants (journal_mode=DELETE, busy_timeout)
  preserved. Named params converted from @name to $name because bun:sqlite
  does not auto-strip the prefix the way better-sqlite3 does.
- Tests ported from vitest to bun:test (only describe/it/expect/before/afterEach
  used, API-compatible). vitest.config.ts excludes container/agent-runner/.
- bun.lock replaces pnpm-lock.yaml + pnpm-workspace.yaml under
  container/agent-runner/. Host pnpm workspace does NOT include this tree.

Dockerfile improvements (independent of Bun but bundled while touching the file):
- tini as PID 1 for correct SIGTERM propagation (prevents half-written
  outbound.db on shutdown).
- Extracted entrypoint.sh — readable and diffable vs the old inline printf.
- BuildKit cache mounts for apt + bun install + pnpm install.
- --no-install-recommends on apt, pinned CLAUDE_CODE_VERSION, AGENT_BROWSER,
  VERCEL, BUN_VERSION.
- CJK fonts (~200MB) behind ARG INSTALL_CJK_FONTS=false; build.sh reads from
  .env; setup/container.ts reads the same .env so /setup and manual rebuild
  stay in sync.
- PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1 in case any postinstall tries to pull a
  redundant Chromium.
- /home/node 755 (was 777).

Host side:
- src/container-runner.ts dynamic spawn command collapses from
  `pnpm exec tsc --outDir /tmp/dist … && node /tmp/dist/index.js` to
  `exec bun run /app/src/index.ts` — cold start ~200-500ms faster per wake.

CI:
- oven-sh/setup-bun@v2 alongside Node/pnpm. Adds explicit container
  typecheck (was documented in CLAUDE.md, not enforced) and `bun test` for
  agent-runner tests.
2026-04-17 11:38:01 +03:00