Commit Graph

779 Commits

Author SHA1 Message Date
gavrielc db3aa0bf1f docs(v2/checklist): reflect post-refactor MCP gating, cold-DM infra, and delivery ACL
- create_agent is not admin-gated (host has no role check on the system
  action; agentTools unconditionally in the container MCP tool list).
- install_packages / add_mcp_server approval is owner/admin via
  pickApprover, not "admin-only".
- Chat-first setup bootstrap + post-handoff welcome are partially done
  via /setup + /init-first-agent (still TODO: single top-level entrypoint,
  welcome prompt expansion).
- Add entries for cold-DM infrastructure (ChannelAdapter.openDM,
  ensureUserDm, user_dms cache) and /init-first-agent skill under
  Channel Adapters.
- Add entry for delivery ACL throw-on-unauthorized + implicit-origin
  allow + auto-create agent_destinations on wire (the silent-drop bug
  fix from the welcome-DM end-to-end test).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 00:09:55 +03:00
gavrielc 4d562524cd style: apply prettier formatting
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 00:04:11 +03:00
gavrielc c60a9bef2d style: apply prettier formatting
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 00:03:51 +03:00
gavrielc 0d3326aae5 feat(v2): user-level privilege model + cold DM infra + init-first-agent skill
Replaces the agent-group-centric "main group" concept with user-level
privileges and adds the cold-DM infrastructure needed for proactive
outbound messaging (pairing, approvals, welcome flows).

Privilege model
- New tables: users, user_roles (owner global-only; admin global or
  scoped to an agent_group), agent_group_members (explicit non-
  privileged access; admin/owner imply membership), user_dms (cold-DM
  resolution cache).
- Removed agent_groups.is_admin, messaging_groups.admin_user_id. Replaced
  with messaging_groups.unknown_sender_policy (strict | request_approval
  | public) for per-chat unknown-sender gating.
- src/access.ts: canAccessAgentGroup, pickApprover, pickApprovalDelivery.
- src/router.ts: access gate on every inbound, honoring
  unknown_sender_policy for unknown senders.
- src/channels/telegram.ts: pairing interceptor upserts the paired user
  and promotes them to owner if hasAnyOwner() is false (first-pair-wins).

Cold DM infrastructure
- ChannelAdapter.openDM?(handle) — optional method. Chat-SDK-bridge wires
  it to chat.openDM() for resolution-required channels (Discord, Slack,
  Teams, Webex, gChat); direct-addressable channels (Telegram, WhatsApp,
  iMessage, Matrix, Resend) fall through to the handle directly.
- src/user-dm.ts: ensureUserDm(userId) — resolves + caches via user_dms.

Approval routing
- onecli-approvals + delivery use pickApprover + pickApprovalDelivery:
  scoped admins → global admins → owners (dedup), first reachable via
  ensureUserDm, same-channel-kind tie-break. Approvals land in the
  approver's DM, not the origin chat.

Delivery fixes
- delivery.ts ACL rejection now throws instead of returning undefined —
  the outer loop previously marked rejected messages as delivered.
- Implicit-origin allow: session.messaging_group_id === target skips the
  destination check.
- createMessagingGroupAgent auto-creates the companion agent_destinations
  row (normalized local_name from the messaging group's name, collision-
  broken within the agent's namespace).

Container
- container-runner.ts: /workspace/global always read-only; drops
  NANOCLAW_IS_ADMIN; adds NANOCLAW_ADMIN_USER_IDS (owners + global admins
  + scoped admins for this agent group). Agent-runner poll-loop gates
  slash commands against that set.

New skill: /init-first-agent
- Walks the operator through standing up the first agent for a channel:
  channel pick → identity lookup (reads each channel SKILL.md's
  ## Channel Info > how-to-find-id) → DM platform_id resolution (direct-
  addressable, cold-DM via "user DMs bot first + sqlite lookup", or
  Telegram pair-code fallback) → run scripts/init-first-agent.ts →
  verify via tail of nanoclaw.log.
- scripts/init-first-agent.ts: parameterized helper that upserts the
  user + grants owner (if none), creates dm-with-<display-name> agent
  group + initGroupFilesystem, reuses/creates the DM messaging_group,
  wires it (auto-creates destination), resolves the session, and writes
  a kind:'chat' / sender:'system' welcome message into inbound.db. Host
  sweep wakes the container and the agent DMs the operator via the
  normal delivery path.

/manage-channels rewrite
- Drops --is-main / --jid / main-vs-non-main isolation references.
- First-channel flow delegates to /init-first-agent.
- Explains createMessagingGroupAgent auto-creates destinations.
- Adds a privileged-users show section.

setup/
- register.ts: drop --is-main, --jid, --local-name, --trigger
  requiresTrigger defaults; call initGroupFilesystem; normalize to
  v2 schema (no is_admin, no admin_user_id, sets unknown_sender_policy
  'strict'); let createMessagingGroupAgent handle the destination row.
- pair-telegram.ts: emit PAIRED_USER_ID (namespaced "telegram:<id>")
  instead of ADMIN_USER_ID; update header comment.
- register.test.ts deleted — was v1-only, tested a registered_groups
  table that no longer exists.

Docs
- v2-architecture-diagram.{md,html}: ER diagram updated to drop
  is_admin/admin_user_id, add unknown_sender_policy, and include
  users/user_roles/agent_group_members/user_dms.
- v2-architecture-draft.md: approval-routing paragraph rewritten for
  pickApprover/pickApprovalDelivery/ensureUserDm; SQL schema block
  updated; admin-verification paragraph references
  NANOCLAW_ADMIN_USER_IDS.
- v2-setup-wiring.md: entity-model sketch rewritten.
- v2-checklist.md: marked privilege refactor / container filtering /
  approval routing / unknown-sender gating done; removed obsolete
  admin_user_id and main-vs-non-main items.

Scripts
- scripts/init-first-agent.ts (new) replaces scripts/welcome-owner-dm.ts
  (removed; welcome-owner was a Discord-specific one-off).
- test-v2-host.ts, test-v2-channel-e2e.ts, seed-discord.ts: drop
  is_admin + admin_user_id, use unknown_sender_policy.

Tests
- src/access.test.ts (new): 14 tests for canAccessAgentGroup, role
  helpers, pickApprover, ensureUserDm, pickApprovalDelivery.
- src/db/db-v2.test.ts: adds 3 tests for the auto-created
  agent_destinations row (normalized name, no duplicates, collision
  break within an agent group).
- host-core.test.ts, channel-registry.test.ts: updated fixtures to
  use unknown_sender_policy: 'public' where the test exercises routing
  rather than the access gate.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 00:03:51 +03:00
Koshkoshinsk 8430e543c1 style: apply prettier formatting
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 15:31:48 +00:00
Koshkoshinsk 63746dfeb3 fix(v2/delivery): allow agent self-messages without a destination row
Approval follow-up prompts (e.g. the post-rebuild "Packages installed,
verify they work" note) are written with channel_type='agent' and
platform_id=<self agent_group_id>, and were dropped by the
agent-to-agent authorization check because no self-destination row
exists. Agents are always authorized to message themselves; skip the
hasDestination check when source == target.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 15:31:44 +00:00
Koshkoshinsk 2df81e0b32 fix(v2/approvals): render correct title + selected label after click
Approval cards bypass the deliverMessage path that populates
pending_questions, so the post-click lookup found nothing and the
card edit fell back to " Question" + the raw option value
("approve"/"reject"). Store title and normalized options on
pending_approvals as well, and look up either table via a shared
getAskQuestionRender helper so the chat-sdk post-click edit and the
Discord interaction callback render the per-card title and the
selectedLabel (e.g. " Approved").

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 15:31:44 +00:00
Koshkoshinsk 42467d796d style: apply prettier formatting
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 15:31:44 +00:00
Koshkoshinsk d92d75e173 feat(v2/approvals): per-card titles and structured options
Approval cards now carry a required title (Add MCP Request, Install
Packages Request, Rebuild Request, Credentials Request) and structured
options with distinct pre-click label, post-click selectedLabel (e.g.
" Approved" / " Rejected"), and value used for click routing. The
title and normalized options are persisted in pending_questions so the
post-click card edit can render the correct per-type title and selected
label on both chat-sdk channels and Discord interactions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 15:31:44 +00:00
Gabi Simons 8d60af71d3 feat(v2): add /add-vercel skill for agent Vercel deployments
Setup skill that installs Vercel CLI in agent containers and configures
OneCLI credential injection for api.vercel.com. Container skill bundled
in .claude/skills/add-vercel/container-skills/ and copied to
container/skills/ during setup. Also adds dashboard & web apps prompt
to /setup flow (step 5b).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 15:24:31 +00:00
Koshkoshinsk 1903fab5e8 feat(v2/approvals): bundle install_packages + rebuild into one approval
Install approval now auto-rebuilds the image and kills the container,
replacing the prior two-card flow where the agent had to call
request_rebuild separately after install_packages was approved.

Queues a processAfter=+5s synthetic prompt so the respawned container
verifies the new packages and reports back to the user.

Adds two v2-checklist gaps found along the way:
- /remote-control and /remote-control-end are v1 host-level commands
  not ported to v2
- messaging_groups.admin_user_id is hardcoded null at registration

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:54:13 +00:00
Gabi Simons 192a5a7569 docs(v2): add /add-whatsapp-v2 setup skill
Separate from the v1 /add-whatsapp skill — v1 remains untouched.
Follows the v2 skill pattern (flat sections, defers to /manage-channels
for wiring). Covers Baileys auth, pairing code, QR code, and
documents the native adapter's features and limitations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:02:27 +00:00
Gabi Simons c36541ba6c feat(v2/whatsapp): add file attachments, reactions, and inbound media
- Outbound files: images, videos, audio as native media messages;
  other types as documents. First file gets text as caption.
- Reactions: send emoji reactions via Baileys react message type
- Inbound media: download images, video, audio, documents from
  incoming messages and pass as attachments to the agent
- Edit operations silently skipped (WhatsApp linked device limitation)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:30:06 +00:00
Gabi Simons c02ac06258 feat(v2): add formatting, approvals, and echo filter to WhatsApp adapter
- Markdown→WhatsApp formatting: **bold**→*bold*, *italic*→_italic_,
  headings→bold, links→plaintext, code blocks preserved
- ask_question support: renders as text with /approve, /reject slash
  commands; matches replies and routes through onAction pipeline
- credential_request: text fallback (WhatsApp has no modal support)
- Bot echo filter: skip fromMe messages to prevent loops
- Formatting applied to all outbound text messages

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:02:42 +00:00
Koshkoshinsk f304c67318 fix(telegram): sanitize outbound markdown for legacy parse mode
The @chat-adapter/telegram adapter hardcodes parse_mode=Markdown (legacy)
but its converter emits CommonMark. Messages containing **bold** or list
bullets that round-trip to `*` produce "can't parse entities" errors and
get dropped after retries.

Add an opt-in transformOutboundText hook on the chat-sdk bridge and wire
a Telegram-specific sanitizer that downgrades **bold** to *bold*, rewrites
dash/plus list bullets to a Unicode bullet so the adapter's re-stringify
doesn't inject stray `*`, and strips unbalanced delimiters or brackets.
Only Telegram opts in; other channels are unaffected.

Workaround until upstream (vercel/chat) ships mode-aware conversion —
PR #367 adds a parseMode knob but not the converter fix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:30:32 +00:00
Gabi Simons c303b6eb14 feat(v2): add native WhatsApp adapter using Baileys v6
Direct ChannelAdapter implementation — no Chat SDK bridge.
Ports v1 infrastructure: getMessage fallback, outgoing queue,
group metadata cache, LID-to-phone mapping, auto-reconnect.
Auth via pairing code (WHATSAPP_PHONE_NUMBER) or QR code.

Text messaging only (MVP). Not yet implemented:
- File/image attachments (send and receive)
- Edit message, delete message
- Reactions
- Bot echo filtering (own messages loop back as inbound)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 15:04:24 +00:00
Koshkoshinsk d16755eabc docs(v2-checklist): note self-approval UX gap 2026-04-13 14:08:51 +00:00
gavrielc 871bfa1809 fix(v2): use in-tree symlink for global CLAUDE.md @import
Claude Code's @-import directive only follows paths inside the project
memory tree (cwd + ancestors). Both `@/workspace/global/CLAUDE.md` and
`@../global/CLAUDE.md` are silently ignored because `/workspace/global`
is outside `/workspace/agent` (the cwd). The import line is parsed but
the content is never loaded — validated with a sentinel passphrase test
against a live container.

Fix: drop a `.claude-global.md` symlink into each group's dir pointing
at `/workspace/global/CLAUDE.md`. The link path is absolute on container
terms (dangling on host, valid via the /workspace/global mount) and the
symlink file itself is inside cwd, so Claude's @-import is happy. The
group's CLAUDE.md imports via `@./.claude-global.md`.

- src/group-init.ts: initGroupFilesystem now drops the symlink (idempotent,
  uses lstat so existsSync doesn't trip on the dangling target on the
  host). Default CLAUDE.md body uses `@./.claude-global.md`.
- scripts/migrate-group-claude-md.ts: creates the symlink for existing
  groups and rewrites any broken `@/workspace/global/CLAUDE.md` or
  `@../global/CLAUDE.md` import line to `@./.claude-global.md`.
- groups/main/CLAUDE.md: migration rewrote the import.

Validated: live container with the symlinked import correctly surfaces
global CLAUDE.md content (passphrase `quinoa-submarine-42` added to
global, retrieved via claude -p, removed).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 16:46:50 +03:00
Koshkoshinsk 9a955b9b01 docs(v2-checklist): plan main/non-main -> owner/admin refactor
Pairing-code registration applies to every Telegram group once the privileged
"main chat" identity goes away.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:49:22 +00:00
Koshkoshinsk ae88d2b7c2 fix(telegram): retry adapter setup on transient network errors
Cold-start DNS/network hiccups can fail the adapter's first deleteWebhook or
getMe call, leaving the channel silently dead while the service stays up.
Wrap bridge.setup in an exponential-backoff retry (5 attempts) — if the
network is truly down we surface it instead of hanging forever.

Lives in telegram.ts so the chat-sdk bridge stays generic; other channels
can opt in by copying the small helper if they hit the same issue.
2026-04-13 12:27:45 +00:00
Koshkoshinsk 65afcdc946 feat(telegram-pairing): surface wrong-code attempts + auto-regen with retry cap
- createPairing now replaces any existing pending pairing for the same intent
  (replace-by-default; no "two pending codes for one intent" state)
- tryConsume records each attempt on pending records (capped at 10); a
  wrong code invalidates the pairing immediately (one attempt per code)
- waitForPairing gains onAttempt callback for misses and rejects with a
  distinct "invalidated by wrong code" message so callers can distinguish
  TTL expiry from user-error
- pair-telegram emits PAIR_TELEGRAM_ATTEMPT on misses and auto-regenerates
  the pairing up to 5 times, emitting PAIR_TELEGRAM_NEW_CODE for each
- Skill docs updated so the host Claude knows to show new codes and
  offer another batch on max-regenerations-exceeded

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:27:09 +00:00
Koshkoshinsk 2454444f2e feat(telegram-pairing): accept bare 4-digit codes
Require the message to be exactly the 4 digits (optionally prefixed by
@botname). Loose matches like "my pin is 0349" are rejected to avoid false
positives from chat traffic that happens to contain a 4-digit number.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:27:06 +00:00
Koshkoshinsk 2017589683 feat(telegram): self-contained pairing for chat ownership verification
BotFather issues bot tokens with no user binding, so anyone who guesses the
bot's username can DM it and get registered as a channel. Pairing closes that
gap: setup issues a one-time 4-digit code, the operator echoes it back from
the chat they want to register, and the inbound interceptor binds
admin_user_id before the message reaches the router.

- src/channels/telegram-pairing.ts: JSON-backed store with createPairing,
  tryConsume, getStatus, waitForPairing (fs.watch + poll fallback)
- src/channels/telegram.ts: wraps bridge.setup with an onInbound interceptor
  that consumes pairing codes and upserts messaging_groups
- setup/pair-telegram.ts: CLI step issues a code and waits up to 5 min for
  the operator to echo it back, emitting PLATFORM_ID/IS_GROUP/ADMIN_USER_ID
- Skill docs: /setup reorders mounts -> service -> wire (pairing needs a
  live polling adapter); /manage-channels and /add-telegram-v2 use pairing
  instead of asking the user to discover chat IDs

All other channels still bind admin via install-time identity (OAuth/QR/token);
pairing is Telegram-only. The bridge, router, and other adapters are untouched.
2026-04-13 12:27:02 +00:00
gavrielc af13c23a5a style: format group-init.ts signature
Prettier reformat applied by the format hook after the previous commit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 14:17:55 +03:00
gavrielc 2e6dc21748 refactor(v2): per-group filesystem init, persistent across spawns
Each group's on-disk state (CLAUDE.md, .claude-shared/, agent-runner-src/)
is now initialized exactly once at group creation and owned by the group
forever after. Spawn does only mounts — no copies, no settings.json
overwrites, no skill clobbers, no source resyncs.

Global memory composition switches from "host reads /workspace/global/CLAUDE.md
at bootstrap and stuffs it into systemPrompt.append" to "group CLAUDE.md
imports it via @/workspace/global/CLAUDE.md at the top." Edits to global
propagate instantly through the existing read-only mount; no copy, no
restart.

- src/group-init.ts: new initGroupFilesystem(group, opts?) — idempotent,
  populates groups/<folder>/, .claude-shared/, agent-runner-src/ only when
  paths don't already exist.
- src/container-runner.ts: buildMounts() calls init defensively at the
  top (catches existing groups on first spawn after this change), drops
  the inline settings.json write, skills cpSync loop, and agent-runner-src
  rm-then-copy. Just mounts now.
- src/delivery.ts: create_agent flow uses initGroupFilesystem with
  optional instructions, replacing the inline mkdirSync + writeFileSync.
- container/agent-runner/src/index.ts: drops GLOBAL_CLAUDE_MD reading.
  systemContext.instructions is now only the runtime-generated
  destinations addendum.
- scripts/migrate-group-claude-md.ts: one-shot migration that prepends
  the @-import to existing groups' CLAUDE.md. Skips if global doesn't
  exist or if the @-import is already present (regex match on the @ form
  to avoid false positives from prose mentions of the path).
- groups/main/CLAUDE.md: prepended by the migration.

Existing groups need a one-time wipe of their agent-runner-src/ dir so
init re-populates from current host source — done locally before this
commit. Future host-side updates to container/skills/ or
container/agent-runner/src/ won't auto-propagate; that's the trade-off
for unconditional persistence and will be covered by host-mediated
refresh tools in a follow-up.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 14:17:50 +03:00
Gabi Simons 8676c07448 feat(v2): support async channel adapter factories
Channel adapter factories can now return a Promise, enabling adapters
that need async initialization like loading auth state from disk
(e.g. WhatsApp reading credentials via useMultiFileAuthState).
Existing sync factories are unaffected — await on a sync return is
a no-op. All current adapters remain synchronous.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:11:06 +00:00
Gabi Simons 3db0dceb1b docs(teams-v2): full setup guide with Azure CLI, manifest, and sideloading
Rewrites the add-teams-v2 skill with step-by-step instructions
covering App Registration, client secret, Azure Bot creation (portal
and CLI), messaging endpoint, Teams channel, manifest template,
sideloading, and RSC permissions for receiving all messages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 07:45:28 +00:00
gavrielc d4aacfe416 fix(v2): clear per-group agent-runner src before copy
fs.cpSync never removes files that disappeared from the source, so
renamed or deleted files linger in data/v2-sessions/<group>/agent-runner-src/.
The container's entrypoint runs tsc over the whole mounted src via
tsconfig's `include: ["src/**/*"]`, so a single stale file fails the
compile and the container exits 2.

Latent since the dir was introduced — surfaced when the provider
interface refactor made a leftover index-v2.ts stop typechecking.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 10:25:40 +03:00
gavrielc b63dd186df refactor(agent-runner): decouple provider interface from Claude specifics
Reshape AgentProvider so provider-specific assumptions stop leaking into
the generic layer. No change to what reaches sdkQuery() — same values,
different plumbing.

- QueryInput: opaque `continuation` replaces `sessionId` + `resumeAt`;
  `systemContext.instructions` replaces ambiguous `systemPrompt`;
  `mcpServers`, `env`, `additionalDirectories` move to `ProviderOptions`
  at construction time.
- AgentProvider gains `isSessionInvalid(err)` and
  `supportsNativeSlashCommands` so the poll-loop stops regex-matching
  Claude error strings and gates passthrough slash commands per provider.
- ClaudeProvider owns `CLAUDE_CODE_AUTO_COMPACT_WINDOW` and the
  stale-session regex internally.
- ProviderEvent.activity kept and documented as the liveness signal
  (fires on every SDK message so the idle timer stays honest during
  long tool runs); init carries `continuation` instead of `sessionId`.
- poll-loop drops mcpServers/env/systemPrompt from its config; admin
  user id now passed explicitly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 10:25:29 +03:00
gavrielc e07158e194 fix(agent-runner): preserve thread_id when sending to current channel
send_file and send_message with an explicit `to` parameter were always
setting thread_id to null, causing files and messages to land in the
Discord channel root instead of the thread the session is bound to.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 18:13:42 +03:00
gavrielc f0e4f07ac2 refactor(v2): extract webhook server into standalone module
Aligns with upstream feat/chat-sdk-integration pattern: regex-based
routing (/webhook/{adapterName}), response streaming, cleanup function.
Updates Slack and Teams skill docs to match /webhook/{name} convention
used by all other v2 channel skills.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:36:16 +03:00
gavrielc 5a606a83d4 refactor(v2): use Chat SDK webhooks proxy and clean up webhook server
Route webhook requests through chat.webhooks[name]() instead of calling
adapter.handleWebhook() directly, getting proper auto-initialization and
signature verification. Extract Node↔Web Request/Response conversion
into reusable helpers, parse URL pathname properly for query string
safety, and support all HTTP methods (not just POST).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:36:16 +03:00
gavrielc 669a8444ef refactor(v2): extract session DB operations into src/db/session-db.ts
Move all raw SQL out of session-manager, delivery, and host-sweep into
a dedicated DB module. Make session schemas idempotent (IF NOT EXISTS)
so initSessionFolder always applies them. Revert the markdown
plain-text retry from 4c477ac.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:36:16 +03:00
Koshkoshinsk 2376c88aaf docs(v2): add delivery-failure-feedback to system actions checklist 2026-04-12 13:31:29 +00:00
Gabi Simons b140b3655b fix(agent-runner): reply to originating channel in single-destination shortcut
When an agent has one configured destination (e.g. Discord) but
receives a message from a different channel (e.g. Slack), the
single-destination shortcut was routing replies to the destination
instead of the originating channel. Now uses the inbound message's
routing context (channel_type, platform_id) when available, falling
back to the destination table only when routing context is absent.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 12:34:21 +00:00
Gabi Simons 7e74bfd330 feat(v2): Teams adapter env-driven app type and updated skill docs
Teams adapter now reads TEAMS_APP_TYPE and TEAMS_APP_TENANT_ID from
env, supporting both MultiTenant (default) and SingleTenant configs.
Updated add-teams-v2 skill docs with full Azure Bot setup flow,
webhook endpoint format, and app package sideloading instructions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 12:34:09 +00:00
Gabi Simons a9f9eda9f8 docs(slack-v2): update skill with DM setup, webhook URL, and reinstall step
Corrects webhook URL to /api/webhooks/slack, adds Enable DMs step
(App Home > Messages Tab), documents reinstall requirement after
adding event subscriptions, and adds webhook server section.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 12:33:56 +00:00
Gabi Simons 9476a80ab0 feat(v2): shared webhook server for webhook-based channel adapters
Adds a shared HTTP server (port 3000, configurable via WEBHOOK_PORT)
that routes incoming webhooks to the correct Chat SDK adapter by path
(e.g. /api/webhooks/slack, /api/webhooks/teams). Required by Slack,
Teams, GitHub, Linear, and other non-gateway adapters.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 12:33:45 +00:00
Koshkoshinsk 53e12a627f chore(v2): drop session-DB schema band-aid in writeSessionRouting
The forward-compat CREATE TABLE IF NOT EXISTS papered over a stale-DB
problem we don't need to support — the canonical INBOUND_SCHEMA in
src/db/schema.ts already creates session_routing for every fresh
session DB. Pre-existing local DBs that predate the schema entry are
treated as garbage and recreated, not migrated.

Schema is the single source of truth; write paths shouldn't carry
defensive table-creation logic.
2026-04-12 10:43:42 +00:00
Koshkoshinsk 7bd8c6ad41 fix(v2): retry channel adapter setup on transient network errors
A NetworkError during adapter.setup() (e.g. Telegram deleteWebhook hitting
a DNS hiccup at boot) would log the failure and immediately give up,
leaving the channel permanently dead until the host process was manually
restarted — even though the host kept running and other channels worked.

Wrap the setup call in a small retry loop with backoff (2s, 5s, 10s) that
fires only on NetworkError. Misconfigs (bad tokens, invalid options) still
fail fast since they don't surface as NetworkError.

Universal across channels — applies to any adapter that throws
NetworkError from setup(), not just Telegram.
2026-04-12 09:32:15 +00:00
Koshkoshinsk 4c477acca3 fix(v2): retry as plain text when adapter rejects markdown
A single message with markdown the adapter couldn't parse (e.g. Telegram
MarkdownV2 entity errors) would fail in deliverSessionMessages and be
retried forever, blocking every subsequent reply on that session.

Catch ValidationError from postMessage and retry once with the markdown
stripped to plain text via markdownToPlainText. Files re-attach in a
follow-up post since the plain-text retry drops the files payload shape.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 09:32:12 +00:00
gavrielc 9dda75bb21 docs(v2): cross-mount invariants + diagrams; inline a2a routing
- session-manager.ts: shrink the cross-mount invariant header from 31
  lines to 12, keeping each invariant's cause and consequence inline.
- agent-runner/db/connection.ts: parallel cross-mount comment for the
  container-side reader (inbound.db must be journal_mode=DELETE).
- agent-runner/db/messages-out.ts: document that even/odd seq parity
  is load-bearing — seq is the agent-facing message ID returned by
  send_message and consumed by edit_message / add_reaction, looked
  up across both tables.
- v2-checklist.md: record the cross-mount invariants and seq parity
  under Core Architecture so future "simplifications" don't regress
  them.
- scripts/sanity-live-poll.ts: empirical validation harness for the
  three cross-mount invariants — flips each one and observes silent
  message loss / corruption.
- delivery.ts: inline routeAgentMessage at its single callsite (-17
  net lines). The wrapper added more boilerplate than it factored.
- docs/v2-architecture-diagram.{md,html}: rendered Mermaid diagrams
  of the v2 system, message flow, named destinations, entity model,
  and the two-DB split.
- channels/adapter.ts, chat-sdk-bridge.ts, credentials.ts,
  db/sessions.ts, db/db-v2.test.ts: prettier format pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 00:21:12 +03:00
gavrielc c9fa5cdbed docs(v2): expand checklist — credential collection, approvals, Chat SDK input
Mark OneCLI manual-approval integration and credential collection from chat
as partial with concrete sub-TODOs. Add upstream asks:

* Chat SDK input support beyond Slack — platforms support it natively
  (Discord modals, Teams/GChat/Webex Adaptive Cards, WhatsApp Flows), Chat
  SDK just doesn't expose the surfaces yet. Concrete per-platform mapping
  captured.
* Built-in OneCLI apps shadow generic secrets on the same host; the
  collection tool should check apps-list first and surface the connect URL
  when an app exists.
* Tunneled OneCLI dashboard fallback for channels with no native form input.
* Per-agent-group secret scoping via OneCLI agentId.
* SDK-native secret management to replace the shell facade in onecli-secrets.

Also:

* Admin model refactor — instance-level default admin + per-group override
  + DM delivery when supported.
* Discord-specific Chat SDK quirks (first-message @mention requirement,
  sub-thread materialization on subscribe).
* OneCLI migration check under Migration — flag whether existing installs
  need OneCLI re-init (new SDK version, credentials re-scoped).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 17:18:51 +03:00
gavrielc 062b0cb6bf fix(agent-runner): add updated_at column to session_state on older DBs
session_state was added after the initial v2 schema with a lazy
`CREATE TABLE IF NOT EXISTS` in getOutboundDb(), so older session
outbound.db files have a session_state table from before updated_at
existed. The lazy create is a no-op when the table already exists,
leaving the column missing and causing:

    Error: table session_state has no column named updated_at

on every `INSERT OR REPLACE INTO session_state` call.

Follow up the CREATE IF NOT EXISTS with a PRAGMA table_info check and
ALTER TABLE ADD COLUMN when updated_at is missing. Cheap on every open,
only runs DDL once per DB.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 17:18:34 +03:00
gavrielc e92b245399 feat(v2): OneCLI 0.3.1 — approvals, credential collection, threaded routing
Three features built on top of @onecli-sh/sdk 0.3.1, landed together because
they share wiring surfaces (session DB schema, delivery dispatcher, Chat SDK
bridge, channel adapter contract).

## OneCLI manual-approval handler

* `src/onecli-approvals.ts` — long-polls OneCLI via the SDK's
  `configureManualApproval`; on each request, delivers an `ask_question` card
  to the admin agent group's first messaging group, persists a
  `pending_approvals` row, and waits on an in-memory Promise resolved by the
  admin's button click or an expiry timer. Expired cards are edited to
  "Expired (...)" and a startup sweep flushes any rows left over from a
  previous process.
* Short 11-byte approval id (`oa-<8 base36>`) instead of the SDK's UUID so the
  Telegram 64-byte `callback_data` limit is respected; the OneCLI UUID stays
  in the persisted payload for audit.
* Migration 003 consolidated: `pending_approvals` now has the OneCLI-aware
  columns from the start (`agent_group_id`, `channel_type`, `platform_id`,
  `platform_message_id`, `expires_at`, `status`), `session_id` relaxed to
  nullable so cross-session approvals fit.
* `handleQuestionResponse` in `src/index.ts` now routes OneCLI approvals
  through `resolveOneCLIApproval` before falling back to the
  session-bound approval path.

## Credential collection from chat

New `trigger_credential_collection` MCP tool — the agent researches a
third-party API, calls the tool with `{name, hostPattern, headerName,
valueFormat, description}`, and blocks until the host reports saved, rejected,
or failed. The credential value never enters the agent's context: the user
submits it into a Chat SDK Modal on the host side, the host writes it to
OneCLI via a thin facade (`src/onecli-secrets.ts` — shells out to
`onecli secrets create`, shape mirrors the SDK we expect upstream), and only
the status string flows back to the container via a system message.

* `src/credentials.ts` — host-side handler: delivers the card to the
  conversation's own channel (not the admin channel — credential collection
  is a user-facing flow, distinct from admin approval), persists a
  `pending_credentials` row, drives the submit → `createSecret` → notify
  pipeline. Falls back gracefully when the channel doesn't support modals.
* `src/db/credentials.ts` + migration 005: `pending_credentials` table.
* `src/channels/chat-sdk-bridge.ts`: renders a `credential_request` card,
  handles the `nccr:` action prefix by opening a Modal with a TextInput,
  registers an `onModalSubmit` handler for the `nccm:` callback prefix.
* `container/agent-runner/src/mcp-tools/credentials.ts`: the blocking MCP
  tool, mirroring the `ask_user_question` polling pattern.
* `container/agent-runner/src/db/messages-in.ts`: `findCredentialResponse`
  helper to pick up the system message the host writes back.

## Threaded adapter routing

The destination layer previously didn't carry thread context, so agent replies
to Discord always landed in the root channel regardless of which thread the
inbound came from.

* `ChannelAdapter.supportsThreads: boolean` — declared by every channel skill
  at `createChatSdkBridge`. Threaded: Discord, Slack, Teams, Google Chat,
  Linear, GitHub, Webex. Non-threaded: Telegram, WhatsApp Cloud, Matrix,
  Resend, iMessage.
* `src/router.ts`: non-threaded adapters strip `threadId` at ingest (threads
  collapse to channel-level sessions). Threaded adapters override the
  wiring's `session_mode` to `'per-thread'` so each thread = a session
  (except `agent-shared`, which is preserved as a cross-channel intent the
  adapter can't know about).
* `session_routing` table in `inbound.db` — single-row default reply routing
  written by the host on every container wake from
  `session.messaging_group_id` + `session.thread_id`. Forward-compat
  `CREATE TABLE IF NOT EXISTS` handles older session DBs lazily.
* `container/agent-runner/src/db/session-routing.ts` — container-side reader.
* `send_message` / `send_file` / `ask_user_question` / `send_card` /
  scheduling tools all default their routing (channel, platform, **and**
  thread) from the session when no explicit `to` is given. Explicit `to`
  uses the destination's channel with `thread_id = null` (cross-destination
  sends start a new conversation elsewhere).
* `poll-loop.ts::sendToDestination` (the final-text single-destination
  shortcut) now inherits `thread_id` from `RoutingContext` too — this was
  the root cause of Discord replies landing in the root channel even after
  `send_message` was wired correctly.

## Related cleanups

* `src/container-runner.ts`: OneCLI agent identifier switched from the lossy
  folder-derived string to `agent_group.id`, making `getAgentGroup(externalId)`
  a trivial reverse lookup for per-agent scoping.
* `wakeContainer` race fix via an in-flight promise map — concurrent wakes
  during the async buildContainerArgs / OneCLI `applyContainerConfig` window
  no longer double-spawn containers against the same session directory.
* `src/db/db-v2.test.ts`: dropped the brittle `expect(row.v).toBe(N)` schema
  version assertion — it had to be bumped on every migration addition.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 17:18:21 +03:00
gavrielc 9dc8bc5d99 docs(v2): expand checklist — chat-first setup, product focus, skills marketplace
Capture the product direction that's been landing in recent work:
everything configurable from chat once bootstrap is done, skills as
the primary extension mechanism, and mark named destinations / agent
self-modification / agent-to-agent comms as complete.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 01:18:09 +03:00
gavrielc 630dd54ea9 chore(container): drop v1 IPC dirs and update entrypoint comment
The /workspace/ipc/* tree is a v1 leftover — v2 routes everything
through inbound.db / outbound.db. Refresh the surrounding comment to
describe what the entrypoint actually does.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 01:18:01 +03:00
gavrielc b59216c299 fix(v2): persist SDK session ID across container restarts
The v2 poll loop held the session ID in a local variable, so every
container restart started a fresh SDK session even though the .jsonl
transcript was still sitting in the shared .claude mount. Store it in
outbound.db (container-owned, already per channel/thread), seed the
loop on startup, clear on /clear, and recover from stale-session
errors the same way v1 did.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 01:17:42 +03:00
gavrielc b591d7ce96 refactor: move destinations from JSON file into inbound.db
The per-session destination map was being written as a sidecar JSON file
(/workspace/.nanoclaw-destinations.json) — inconsistent with the rest of
v2, where all host↔container IO goes through inbound.db / outbound.db.

Move it into a `destinations` table in INBOUND_SCHEMA. The host writes
it before every container wake AND on demand (e.g. after create_agent)
so the creator sees the new child destination mid-session without a
restart. The container queries the table live on every lookup — no
cache, no staleness window.

- src/db/schema.ts: add `destinations` table to INBOUND_SCHEMA.
- src/session-manager.ts: writeDestinationsFile → writeDestinations,
  writes via DELETE + INSERT inside a transaction.
- src/delivery.ts: create_agent handler calls writeDestinations on the
  creator's session after inserting the new destination rows.
- container/agent-runner/src/destinations.ts: queries inbound.db
  directly in every findByName/getAllDestinations/findByRouting call.
  No more cache. No setDestinationsForTest (obsolete). No fs import.
- container/agent-runner/src/index.ts and mcp-tools/index.ts: remove
  loadDestinations() calls — no longer needed.
- Test helper initTestSessionDb creates the destinations table.
  Integration test inserts a row directly instead of mocking the cache.

No backwards compatibility: sessions predating the schema update must
be recreated. This is fine on the v2 branch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 16:45:53 +03:00
gavrielc 09e1861a22 feat: single-destination shortcut — no wrapping needed when there's only one
When an agent has exactly one configured destination, wrapping output in
<message to="..."> blocks is unnecessary. Plain text goes to the sole
destination automatically. This preserves the simple "just reply" flow
for the common case of one user on one channel.

Applies in three places:

- System prompt addendum: single-destination case gets a simplified
  explanation ("your messages are delivered to X, just write directly").
  Multi-destination case keeps the <message to="..."> syntax docs.

- Main output parser: if zero <message> blocks are found and there is
  exactly one destination, the entire cleaned text (with <internal>
  stripped) is sent to that destination.

- send_message / send_file MCP tools: `to` parameter is now optional.
  With one destination, omitted defaults to it. With multiple, omitting
  returns an error listing the options.

Multi-destination behavior is unchanged — explicit <message to="..."> is
still required, and untagged text is still scratchpad.

groups/global/CLAUDE.md updated to describe both cases.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 16:36:09 +03:00