Commit Graph

1034 Commits

Author SHA1 Message Date
gavrielc 7f4583d0fe fix(setup): add npm global prefix bin to PATH after fallback install
When corepack enable fails with EACCES (common when Node is installed to a system-writable prefix like /usr/local that the user doesn't own), we fall back to `npm install -g pnpm`. But npm's global prefix isn't always on the shell's PATH — users often set `npm config set prefix ~/.npm-global` to avoid sudo, and the resulting bin dir isn't picked up by `command -v`. Install succeeded, but pnpm "wasn't there" for the follow-up `pnpm install`.

Now after the npm fallback we query `npm config get prefix` and prepend `<prefix>/bin` to PATH. Mirror the same lookup in nanoclaw.sh right before `exec pnpm run setup:auto` — setup.sh's PATH mutation doesn't propagate back, and the hand-off needs pnpm visible too.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 10:50:21 +03:00
gavrielc 092f16dfaa Merge pull request #1927 from qwibitai/setup-feedback-fixes
Clarify setup flow from user-feedback session
2026-04-23 10:43:27 +03:00
gavrielc 4ff4cc75b9 Merge remote-tracking branch 'origin/main' into setup-feedback-fixes
# Conflicts:
#	setup/auto.ts
#	setup/channels/whatsapp.ts
2026-04-23 10:39:35 +03:00
gavrielc 56ef5b4461 feat(setup): clarify setup flow from user-feedback session
- Container step: duration hint + 3-line rolling output window with
  60s stall detector that offers "keep waiting" vs "ask Claude"
- First chat: reframed as a try-out with sandbox-model explainer
  (wakes on message, sleeps when idle, context persists)
- Timezone: auto-detected non-UTC zones now get an explicit
  confirm from the user instead of silent persist
- Outro: added always-on warning + prominent "check your DM" banner
  when a channel was configured; directive last line
- Discord: always show token-location reminder even when user says
  they have one; new "do you have a server?" branch walks through
  server creation if not
- All select prompts: custom brightSelect renderer keeps inactive
  option labels at full brightness (was dim gray); adds @clack/core
  as a direct dep

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 10:35:12 +03:00
github-actions[bot] 8a19ad019a chore: bump version to 2.0.4 2026-04-23 07:11:04 +00:00
gavrielc 5f1b3e5cad style: apply prettier formatting to install-slug additions 2026-04-23 10:10:48 +03:00
github-actions[bot] 72aba8c7ba docs: update token count to 128k tokens · 64% of context window 2026-04-23 07:10:31 +00:00
github-actions[bot] 3d44001633 chore: bump version to 2.0.3 2026-04-23 07:10:26 +00:00
gavrielc 7a9401ddf2 feat(setup): per-checkout service name and docker image tag
Two NanoClaw installs on the same host used to fight over the shared `com.nanoclaw` launchd label / `nanoclaw.service` systemd unit and the `nanoclaw-agent:latest` docker tag — the second install silently rewrote the service pointer and rebuilt the image out from under the first. Introduces a deterministic per-checkout slug (sha1(projectRoot)[:8]) and namespaces everything off it:

- Service: `com.nanoclaw-v2-<slug>` / `nanoclaw-v2-<slug>.service`
- Image:   `nanoclaw-agent-v2-<slug>:latest` (base), `nanoclaw-agent-v2-<slug>:<agentGroupId>` (per-group)

New shared helpers: src/install-slug.ts (host) + setup/lib/install-slug.sh (bash). Both compute the same slug so verify/probe/add-*.sh/build.sh/container-runner all agree. Any v1 `com.nanoclaw` service left on the host stays untouched and can coexist.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 10:10:09 +03:00
gavrielc 4f6d62a65e docs(readme-zh): align Chinese README with v2 English
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 09:55:34 +03:00
gavrielc 564000dcae docs(readme-ja): align Japanese README with v2 English
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 09:55:28 +03:00
gavrielc 601fc7c396 docs(readme): split Quick Start into separate commands
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 09:33:39 +03:00
gavrielc cdb9442796 docs(readme): clone into nanoclaw-v2 in Quick Start
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 09:31:49 +03:00
gavrielc 8326b4c0be fix(setup): offer to reuse an existing OneCLI instead of clobbering it
Before: setup/onecli.ts ran `curl -fsSL onecli.sh/install | sh` unconditionally. For users with OneCLI already running and bound to a specific listener (host-accessible, shared with other apps), re-running the installer rebound the gateway and broke those consumers.

Now: auto.ts probes for an existing install (`onecli version` + `onecli config get api-host`). If detected, clack asks: use the existing instance (recommended) or install a fresh one. The new --reuse flag in the onecli step skips the installer, reads the configured api-host, writes ONECLI_URL to .env, and moves on without touching the running gateway.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 01:48:24 +03:00
github-actions[bot] 22c2beff3c chore: bump version to 2.0.2 2026-04-22 22:05:25 +00:00
gavrielc 6cd261a26d chore(container): loosen /home/node to 0777
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 01:05:03 +03:00
gavrielc d97a0e1484 fix(setup): resolve channels remote dynamically, don't assume origin
Forks that keep the upstream nanoclaw repo under a non-origin remote name (typically `upstream`, with `origin` pointing at the user's fork) hit "git fetch origin channels failed" when adding a channel, because the fork doesn't carry the channels branch. New setup/lib/channels-remote.sh walks `git remote -v` for a url matching qwibitai/nanoclaw, auto-adds `upstream` if none matches, and honors NANOCLAW_CHANNELS_REMOTE as an override. Wired into the four add-*.sh scripts that setup:auto invokes (discord, telegram, whatsapp, teams).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 00:21:15 +03:00
gavrielc 16421cc022 fix(setup): fall back to npm install when corepack is missing
Some Node installs (older nvm, node@22 keg-only on brew, minimal distro packages) don't ship corepack, so the bootstrap was dying with "corepack: command not found" before pnpm could land on PATH. Now guards the corepack call and falls back to `npm install -g pnpm@<pinned>`, reading the version from package.json's packageManager field.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 23:20:38 +03:00
gavrielc 469dd9af7e refactor(skills): collapse setup skill to one instruction — run bash nanoclaw.sh
Deletes the Claude-orchestrated /setup and /new-setup flows. The scripted installer (bash nanoclaw.sh → setup:auto) now handles bootstrap, container, OneCLI, auth, service, first agent, and optional channel wiring end-to-end with inline Claude-assisted recovery on failure. Keeps /setup as a one-line redirect so the trigger still resolves. Drops the opt-out diagnostics files that belonged to the old flow and updates cross-refs in add-wechat, migrate-nanoclaw, and update-nanoclaw.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 23:20:38 +03:00
github-actions[bot] dbb859bfec docs: update token count to 127k tokens · 64% of context window 2026-04-22 16:50:18 +00:00
github-actions[bot] dbb82440bd chore: bump version to 2.0.1 2026-04-22 16:50:10 +00:00
gavrielc c16052ed4d Merge pull request #1919 from qwibitai/v2
v2: ground-up architectural rewrite
2026-04-22 19:49:51 +03:00
gavrielc e9d056dc6c merge: main into v2, resolving conflicts in favor of v2
Ground-up v2 rewrite supersedes all conflicting files from main. The one main-side fix (ONECLI_API_KEY forwarding, 8b5b581) already landed on v2 as 3db66c0. README preview banner dropped since v2 is now main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 19:49:02 +03:00
gavrielc 25687dca8f chore(deps): drop @chat-adapter/telegram from trunk
Trunk ships no channel adapters — /add-telegram installs the package on demand from the channels branch. This dependency was stale and pulled ~2 transitive packages into every fresh install.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 19:44:54 +03:00
gavrielc 5df8d9d2e5 docs: drop v2 refactor planning docs ahead of merge
Removes transient analysis/proposal/checklist docs whose purpose is served once v2 ships: REFACTOR.md, docs/v1-vs-v2/, docs/checklist.md, docs/shared-source.md, docs/claude-md-composition.md, docs/module-contract.md, docs/DEBUG_CHECKLIST.md. Updates CLAUDE.md and docs/README.md index rows accordingly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 19:43:58 +03:00
gavrielc 8412b899fa feat(diagnostics): funnel events throughout setup with persisted install-id
Shared bash + node emitter in setup/lib/diagnostics.{sh,ts} reads/writes data/install-id so every event from a single install shares one distinct_id — bash-side setup_launched/setup_start, node-side auto_started, per-step started/completed, auth_method_chosen, channel_chosen, first_chat_ready/failed, setup_incomplete, setup_aborted, setup_completed. Opt-out via NANOCLAW_NO_DIAGNOSTICS=1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 19:21:06 +03:00
gavrielc 39ae04df98 docs(welcome-skill): expand onboarding tour with capability reveals and trust framing
Adds a structured drip-feed of capabilities (memory, agents, scheduling, research, code, UI, files, self-customization) and explicit sections on approvals, access control, and natural interaction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 19:04:00 +03:00
gavrielc d2f53048f2 docs(module-fragments): add instructions for create_agent, interactive, and remaining core tools
Three MCP tool groups were orphaned from the ambient CLAUDE.md context
because they shipped no `*.instructions.md` alongside their source.
Backfill them so the composer picks them up as fragments on next spawn:

- core.instructions.md: add `send_file` (artifact delivery, path relative
  to /workspace/agent/) and `add_reaction` (by `#N` id with emoji
  shortcode name).
- interactive.instructions.md: `ask_user_question` (blocking
  multiple-choice with selectedLabel/value option objects, 300s default
  timeout) and `send_card` (non-blocking structured render with
  fallbackText). Opens with a one-line framing of the contrast between
  the two.
- agents.instructions.md: `create_agent` with how-it-works, when-to-use
  (companions vs collaborators — persistent memory vs independent
  parallel work), when-NOT-to-use (short tasks should use the SDK `Agent`
  tool instead), and guidance for writing the seed instructions string.

No composer changes — scan in `src/claude-md-compose.ts` already picks up
any file matching `*.instructions.md` in the mcp-tools directory.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 18:05:50 +03:00
gavrielc 52ebdce9c9 docs(claude-md): drop host-facing header comment from shared base
The HTML comment at the top was aimed at maintainers opening the file,
but it's loaded verbatim into every agent's system prompt via the
`.claude-shared.md` import. Agents don't need the meta-explanation of
where the file is mounted or how identity gets injected — it's just
context-budget drag. Move the maintainer guidance out of the agent's
view.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 17:47:30 +03:00
gavrielc e2b1df876b docs(claude-md): strengthen memory discipline in shared base
Replace the passing CLAUDE.local.md mention with an explicit Memory
section: anything substantive the user shares must be stored so it's
retrievable later, with per-topic files indexed from CLAUDE.local.md.
Frames this as a core part of the agent's job — the quality of its
memory systems is a main signal of how useful it is.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 17:46:59 +03:00
gavrielc 3b8240a91b refactor(self-mod): drop request_rebuild — approvals now bundle rebuild+restart
install_packages and add_mcp_server already did the right thing on approve
(install auto-rebuilt+killed, add_mcp_server just killed), so request_rebuild
was redundant plumbing agents sometimes called after an install — wasting an
admin approval round-trip. Delete it end-to-end:

- container/agent-runner/src/mcp-tools/self-mod.ts: remove requestRebuild
  tool + registration; update install_packages description.
- src/modules/self-mod/{request,apply,index}.ts: drop handleRequestRebuild
  + applyRequestRebuild + registrations; rewrite the rebuild-failed notify
  to point admins at retrying install_packages instead.
- src/modules/{approvals,self-mod}/{agent,project}.md and skill/self-
  customize/SKILL.md: scrub agent-facing references; clarify that
  add_mcp_server needs no rebuild (bun runs TS directly).
- docs/{module-contract,architecture-diagram,checklist,db-central,shared-
  source,v1-vs-v2/*}.md, CLAUDE.md, pending-approvals migration comment,
  approvals/index.ts docstring, REFACTOR.md: trailing references.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 17:28:36 +03:00
gavrielc e64bdb3016 refactor(claude-md): split shared base into module fragments, inject name at runtime
Move every agent-specific instruction out of the shared container/CLAUDE.md
so the base is genuinely universal. Persona/identity now comes from the
system-prompt addendum (buildSystemPromptAddendum now takes assistantName
and prepends "# You are {name}"). Per-module instructions live alongside
each MCP tool source:

  container/agent-runner/src/mcp-tools/core.instructions.md
  container/agent-runner/src/mcp-tools/scheduling.instructions.md
  container/agent-runner/src/mcp-tools/self-mod.instructions.md

composeGroupClaudeMd() scans that directory and emits `module-<name>.md`
fragments as symlinks to /app/src/mcp-tools/<name>.instructions.md (valid
via the existing RO source mount). Skill fragments renamed to
`skill-<name>.md` for naming consistency with `module-*` and `mcp-*`.

Mount tightening so composer-managed files can't be clobbered by agent
writes: nested RO mounts for /workspace/agent/CLAUDE.md and
/workspace/agent/.claude-fragments/. CLAUDE.local.md (per-group memory)
stays RW as the only writable CLAUDE.md-family file.

.gitignore: ignore CLAUDE.local.md, .claude-shared.md, .claude-fragments/
everywhere, and simplify groups/ rules to ignore the whole tree (per-
installation state, not tracked).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 17:14:51 +03:00
gavrielc 95e74d8383 docs(onecli): expand secrets section; correct stale admin-roles refs
Document the selective-mode gotcha for auto-created OneCLI agents
(no secrets injected by default) with the CLI commands to inspect
and fix it. Note that approval policies are not configurable via
the SDK or `onecli@1.3.0` CLI — web UI only.

Replace stale `NANOCLAW_ADMIN_USER_IDS` / `src/access.ts` references
across CLAUDE.md, docs/architecture.md, docs/checklist.md, and
docs/module-contract.md. Admin gating now runs host-side in
src/command-gate.ts against `user_roles`; approver picks live in
src/modules/approvals/primitive.ts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 16:46:17 +03:00
gavrielc 202ee71311 feat(setup): auto-detect timezone after CLI agent step
Adds a timezone step between cli-agent and channel wiring in setup:auto.
Autodetect via --step timezone; if it resolves to UTC or fails, confirm
with the user and accept either an IANA zone or a free-text description
(e.g. "New York"). Free-text falls through to a headless `claude -p`
call that returns a single IANA string, gated on the claude CLI being
on PATH.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 16:44:53 +03:00
gavrielc 0ed00b3358 docs(claude.md): add v1-merge STOP banner directing to migrate-v2.sh
Prepend a Claude-addressed banner so that when an upgrader (or Claude on
their behalf) runs `git pull` / `git merge` from v1 and hits merge
conflicts, Claude aborts the merge and routes the user to
`bash migrate-v2.sh` instead of trying to resolve the rewrite by hand.
Fresh clones are explicitly told to ignore the banner.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 16:21:28 +03:00
gavrielc 3db66c0ced fix: forward ONECLI_API_KEY to OneCLI SDK for authenticated container config
Ports the v1 fix from PR #1777 (originally 8b5b581 by @johnnyfish).
Cherry-pick did not apply cleanly because v2 reformatted the surrounding
code and split OneCLI usage into two sites — manual port was needed.

v2-specific adaptations:
- Also forward apiKey at the second OneCLI call site in
  src/modules/approvals/onecli-approvals.ts (v2 split the approvals
  module out of container-runner).
- Skipped the companion test-mock commit (38163bc) — it patches
  src/container-runner.test.ts, which no longer exists in v2 (tests
  consolidated into host-core.test.ts).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: johnnyfish <jonathanfishner11@gmail.com>
2026-04-22 15:16:59 +03:00
gavrielc 5371c76c14 docs(readme): mark translations as pre-v2 pending update
The Chinese and Japanese READMEs still describe the v1 architecture
(setup flow, channel list, "main channel" concept, etc.). Add a notice
at the top of each pointing readers to the English README for current
content until the translations are refreshed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 15:10:17 +03:00
gavrielc fe942dd3dd chore: bump to 2.0.0 with v2 CHANGELOG entry
Major version for the v2 rewrite. CHANGELOG documents the breaking
changes users will hit on upgrade: new entity model, two-DB session
split, `bash nanoclaw.sh` as default install, channels/providers
relocated to sibling branches, three-level isolation, Apple Container
removed from default setup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 15:10:10 +03:00
gavrielc 8e1c8f8f61 style: apply prettier formatting to touched files
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 14:57:09 +03:00
gavrielc 8662f21e8f docs(readme): update for v2 + scripted setup default
Rewrite install flow around `bash nanoclaw.sh`, update What It Supports to
reflect the three-level isolation model and the real channel roster, fix
the Philosophy section (AI-native hybrid, skills over features via
channels/providers branches, Codex/OpenCode/Ollama as drop-in providers),
and replace the pre-v2 architecture diagram and key-files list with the
two-DB session split (`inbound.db` / `outbound.db`) and current src layout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 14:53:57 +03:00
gavrielc 390e09597a feat(setup): hand off to Claude for Teams finish-wiring
Post-install was a bare instruction block: "DM the bot, run
/manage-channels". Replace it with an explicit Done/Stuck-style
select backed by the handoff mechanism — Claude takes over, tails
logs/nanoclaw.log for the inbound, inspects data/v2.db for the
auto-created messaging_group row, runs scripts/init-first-agent.ts
with the discovered platform_id + AAD user id, and verifies end-to-end.

Operators who want to drive it themselves pick "I'll do it myself"
and get the same terse instructions as before. Default is the
handoff (recommended hint).

Why Teams and not the other channels: Telegram/Discord/WhatsApp
already have synchronous platform IDs we capture during setup, so
init-first-agent runs inline. Teams platform IDs only exist after
the first real inbound, so the wiring is necessarily deferred — and
that deferred work is exactly what the handoff handles best.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 14:31:32 +03:00
gavrielc a70e41856b feat(setup): Microsoft Teams wiring with Claude handoff
Teams is the most complex channel NanoClaw supports — no "paste a
token" shortcut exists. Operators walk through ~6 Azure portal steps
(app registration, client secret, Azure Bot resource, messaging
endpoint, Teams channel, manifest sideload). The driver makes each
step as guided as possible and gives the operator an explicit
escape to interactive Claude whenever they get stuck.

Handoff mechanism (reusable across channels):
- setup/lib/claude-handoff.ts: offerClaudeHandoff(ctx) spawns
  `claude --append-system-prompt <context> --permission-mode acceptEdits`
  with stdio: 'inherit', returns when Claude exits so the driver can
  re-offer the same step. Context captures channel, current step,
  completed steps, collected values (secrets redacted), and file refs.
- validateWithHelpEscape / isHelpEscape: wrap clack text/password
  prompts so typing '?' triggers the handoff mid-paste.
- Parallel to the existing claude-assist.ts (which is failure-triggered
  and runs claude -p for a one-shot command suggestion). This is the
  user-initiated, interactive counterpart.

Teams driver (setup/channels/teams.ts):
- 6-step walkthrough, each a clack note + paste prompts + stepGate
  select ("Done / Stuck — hand me off to Claude / Show me again").
- Collects TEAMS_APP_ID / TEAMS_APP_TENANT_ID / TEAMS_APP_PASSWORD /
  TEAMS_APP_TYPE plus the operator's public HTTPS URL (advisory —
  no tunnel automation yet).
- Emits the full Azure CLI invocation alongside the portal steps for
  operators who prefer scripted creation.
- UUID/password prompts accept '?' as a help escape; select prompts
  have an explicit 'Stuck' option that triggers the handoff.

Manifest generator (setup/lib/teams-manifest.ts):
- Builds data/teams/teams-app-package.zip in-process: manifest.json
  (schema v1.16) with app ID injected, a 32×32 outline icon, a
  192×192 brand-blue color icon, bundled with the system `zip`.
- Minimal hand-rolled PNG encoder (CRC32 table + zlib deflate) so we
  don't need ImageMagick or vendored binary blobs.
- ~2.5KB zip, validates with `unzip -l`; icons verify as valid PNGs.

Installer (setup/add-teams.sh):
- Non-interactive mirror of add-discord.sh. Validates the four env
  vars, copies adapter from origin/channels, installs
  @chat-adapter/teams@4.26.0, upserts creds to .env + data/env/env,
  restarts the service.

auto.ts: Teams option in askChannelChoice with 'complex setup' hint,
dispatch to runTeamsChannel.

Deferred (known limitation, operator instructed to finish manually):
- Wait-for-first-DM pairing to capture the auto-generated Teams
  platform_id. Teams platform IDs are only discoverable after the
  first inbound activity. The driver installs the adapter and stops
  there; the operator DMs the bot, NanoClaw auto-creates the
  messaging group, and they wire an agent via /manage-channels.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 14:27:29 +03:00
gavrielc a541e7542c Merge pull request #1914 from qwibitai/v2-with-refactors
V2 with refactors
2026-04-22 13:49:41 +03:00
gavrielc fb82c1babb Delete docs/shared-src.md 2026-04-22 13:46:28 +03:00
exe.dev user 7da24b166d fix(agent-runner): remove thread_id filter and fix processing ack on empty result
The concurrent poll in processQuery filtered out messages with
mismatched thread_ids, causing a deadlock when the initial batch
(e.g. a host-generated welcome trigger with null thread_id) completed
but follow-ups arrived with a different thread_id (e.g. a Discord DM).
The query stayed open waiting for matching-thread pushes that never
came, blocking the poll loop indefinitely.

Thread routing is the router's concern — per-thread sessions already
isolate threads into separate containers; shared sessions intentionally
merge everything. Removed the filter.

Also fixed processing_ack: a result event (with or without text) means
the turn is done, but markCompleted only ran when event.text was truthy.
When the agent responded via MCP send_message (empty result text), the
initial batch stayed in 'processing' for the query's lifetime, creating
false stuck signals in the host sweep. Now marks completed on any result
event.

Belt-and-suspenders: init-first-agent welcome trigger now sets threadId
to the DM platform_id instead of null.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-22 10:42:56 +00:00
gavrielc c8fc1da719 refactor(claude-md): compose per-group CLAUDE.md from shared base + fragments
Replace the per-group "written once at init, owned by the group" CLAUDE.md
with a host-regenerated entry point that imports:

  - a shared base (`container/CLAUDE.md` mounted RO at `/app/CLAUDE.md`)
  - optional per-skill fragments (skills that ship `instructions.md`)
  - optional per-MCP-server fragments (inline `instructions` field in
    `container.json`)
  - per-group agent memory (`CLAUDE.local.md`, auto-loaded by Claude Code)

Principle: RW = per-group memory, RO = shared content. Source/skills/base
are shared; personality, config, working files, and Claude state stay
per-group.

Key changes:

  - New `src/claude-md-compose.ts` — per-spawn composition +
    `migrateGroupsToClaudeLocal()` one-time cutover.
  - New `container/CLAUDE.md` — shared base, seeded verbatim from the
    former `groups/global/CLAUDE.md`.
  - `src/container-runner.ts` — swap `/workspace/global` mount for RO
    `/app/CLAUDE.md`; call `composeGroupClaudeMd()` after
    `initGroupFilesystem()`.
  - `src/group-init.ts` — drop `.claude-global.md` symlink + initial
    `CLAUDE.md` write; seed `CLAUDE.local.md` from `opts.instructions`.
  - `src/index.ts` — call `migrateGroupsToClaudeLocal()` at startup.
  - `src/container-config.ts` — add optional `instructions` field to
    `McpServerConfig` (inline per-MCP guidance fragment).
  - `container/Dockerfile` — drop dead `/workspace/global` mkdir.
  - Remove obsolete `scripts/migrate-group-claude-md.ts`.

Migration (runs once at host startup, idempotent):

  - Delete `.claude-global.md` symlinks in each group.
  - Rename each `groups/<folder>/CLAUDE.md` → `CLAUDE.local.md`
    (preserves existing per-group content as memory).
  - Delete `groups/global/` directory.

Design docs: `docs/claude-md-composition.md` and `docs/shared-source.md`
(the latter is the sibling design discussion this refactor builds on).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 12:58:43 +03:00
exe.dev user 8a12fa61ac refactor: shared source — replace per-group agent-runner copies with single RO mount
Replace the per-group agent-runner-src copy model with a single shared
read-only mount. Source and skills are now RO + shared; personality,
config, working files, and Claude state stay RW + per-group.

Key changes:
- Mount container/agent-runner/src/ RO at /app/src (all groups share one copy)
- Mount container/skills/ RO at /app/skills; per-group skill selection via
  symlinks in .claude-shared/skills/ based on container.json "skills" field
- Mount container.json as nested RO bind on top of RW group dir
- Move all NANOCLAW_* env vars to container.json (runner reads at startup)
- New runner config.ts module replaces process.env reads
- Move command gate (filtered/admin) from container to host router
- Dockerfile: remove source COPY, split CLI installs (claude-code last),
  move agent-runner deps above CLIs for better layer caching
- Add writeOutboundDirect for router denial responses
- Design doc at docs/shared-src.md

Not included (follow-up): DB migration to drop agent_provider columns,
cleanup of orphaned agent-runner-src directories.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-22 12:58:43 +03:00
gavrielc 596035be09 feat(setup): operator role prompt per channel, owner by default
Previously init-first-agent auto-granted global owner to the first
user wired through it and left every subsequent user as nothing — no
role, no membership. That worked for the bootstrap path but broke the
second channel's welcome DM: the access gate saw no role + no
membership and dropped the message with accessReason='not_member'.

Make the role explicit:

- scripts/init-first-agent.ts accepts --role owner|admin|member
  (default: owner). Role drives the grant:
    owner  -> global owner (agent_group_id=null)
    admin  -> admin scoped to this agent group
    member -> no role row, just membership
  Idempotent via getUserRoles pre-check — safe on re-runs. addMember
  runs unconditionally (INSERT OR IGNORE) so the access gate has a
  row even for users who'd otherwise pass via role alone.

- setup/lib/role-prompt.ts — shared askOperatorRole(channel) prompt
  with owner as the default pick. Self-host single-operator is the
  dominant case, so the user's fingers default to Enter.

- Telegram / Discord / WhatsApp drivers all call askOperatorRole
  before resolving the agent name and pass --role <choice> through.
  Captured in progression log via setupLog.userInput('<channel>_role').

Summary output drops the fragile "promoted on first owner" hint in
favor of a dedicated role: line ("owner (global)" / "admin (scoped to
<ag-id>)" / "member") so re-runs make the current grant legible.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 12:57:57 +03:00
gavrielc 4859d8fb2d feat(setup): Claude-assisted error recovery with resume-at-step retry
When a setup step fails — whether hard via fail() or soft via the
"What's left" / "Skipping the first chat" notes — offer to ask Claude
to diagnose. On consent, spawn `claude -p --output-format stream-json`
with a scrolling 3-line action window ("Reading x", "Running y") so
the 1–4 minute investigations feel active rather than hung. No hard
timeout: debugging can take time, Ctrl-C is the escape hatch.

The prompt is minimal: one-paragraph framing, failed step name + msg +
hint, and a list of file references (not contents). Claude's Read/Grep
tools fetch what they need. A per-step map in claude-assist.ts gives
the most relevant files per step; the rest is README + auto.ts +
logs/setup.log + the per-step raw log.

Claude responds with REASON + COMMAND lines. We show the reason in a
clack note, prefill the command via setup/run-suggested.sh (bash 4+
readline, 3.x fallback to Enter-to-run), and eval on the user's
confirm.

When the user runs a fix, fail() now offers to retry the failing step
rather than aborting. setup/logs.ts tracks successfully-completed step
names in-memory; fail() threads those as NANOCLAW_SKIP on a spawnSync
retry, so the child picks up exactly where the parent left off — no
rebuilding containers or reinstalling OneCLI.

Other polish in this change:
- fitToWidth + dimWrap in lib/theme.ts to prevent long spinner labels
  from soft-wrapping (each terminal row stacks a stale copy otherwise).
- Shorter container step label ("Preparing your assistant's sandbox…")
  so it fits on narrow terminals.
- Wordmark anchored in the clack intro line on every run.
- All 25 existing fail() call sites updated to await fail(...) since
  fail is now async.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 12:42:44 +03:00
gavrielc dfcbab5364 feat(setup): optional WhatsApp wiring + cross-channel UX polish
WhatsApp (community/Baileys) joins the setup:auto channel picker, with
the same clack-native UX discipline as Telegram and Discord:

- setup/channels/whatsapp.ts — driver. Collects auth method (QR terminal
  or pairing code), runs the auth step, renders QR blocks in-place with
  ANSI cursor-rewind on rotation so the terminal doesn't fill up with
  stale codes, reads creds.me.id for the bot phone, restarts the service,
  asks for the operator's personal phone (defaulting to the authed
  number), writes ASSISTANT_HAS_OWN_NUMBER=true when they differ
  (dedicated mode), and hands off to init-first-agent.

- setup/whatsapp-auth.ts — forked standalone auth step. Channels-branch
  version had a browser-QR path with an HTTP server + <canvas> QR
  renderer; stripped entirely (headless/SSH users hit dead ends too
  often, and the extra deps complicate install). The remaining terminal
  QR emits raw QR strings in WHATSAPP_AUTH_QR blocks so the parent
  driver owns the rendering. Pairing-code path retained. Status blocks
  now use the runner's vocabulary (success/skipped/failed) so spawnStep
  sets ok correctly; WhatsApp-specific UI text ("WhatsApp linked", "You
  chat") lives in the driver.

- setup/add-whatsapp.sh — non-interactive installer, mirror of
  add-telegram.sh. Fetches the adapter + groups step from the channels
  branch (whatsapp-auth.ts stays local, pair-telegram.ts pattern),
  installs pinned baileys/qrcode/pino, registers the steps in
  setup/index.ts's STEPS map. No service restart (adapter factory
  returns null until creds exist).

Cross-channel fixes bundled:

- scripts/init-first-agent.ts: always addMember(user, agentGroup) for
  the target user so subsequent wirings (not the first) pass the access
  gate. Telegram wiring first → Discord/WhatsApp second was dropping
  every inbound with accessReason='not_member' because only the first
  user gets owner. namespacedPlatformId also passes through JID-format
  raws (contains '@') so WhatsApp's bare <phone>@s.whatsapp.net matches
  what the adapter stores.

- setup/service.ts: launchctl unload-then-load instead of bare load (bare
  load errors 'already loaded' when a prior plist was cached, keeping
  launchd on the OLD ProgramArguments even after the file on disk
  changed). systemctl start → restart (start is a no-op on an active
  unit, swallowing unit-file edits).

- setup/add-telegram.sh: removed the in-script open "tg://resolve"
  block. The driver (setup/channels/telegram.ts) now owns the deep-link,
  gated on a p.confirm so the browser can't steal focus unexpectedly.

- setup/channels/discord.ts + setup/channels/telegram.ts: every browser
  open goes through confirmThenOpen (new shared helper in
  setup/lib/browser.ts) — operator presses Enter before their browser
  takes focus. Telegram switched from tg://resolve?domain= to
  https://t.me/<bot> which works everywhere.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 12:39:48 +03:00