claude-code CLI 2.1.128 -> 2.1.154 (Dockerfile build-arg). agent-runner SDK 0.2.128 -> 0.3.154: the 0.3 major moved @anthropic-ai/sdk and @modelcontextprotocol/sdk from regular deps to peer deps, so add @anthropic-ai/sdk ^0.100.0 as a direct dep and raise @modelcontextprotocol/sdk to ^1.29.0. Regenerate bun.lock. Typecheck + agent-runner tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The agent-runner runs the Agent SDK with settingSources: ['project', 'user'], which omits 'local'. Per the SDK docs the 'local' source is what loads CLAUDE.local.md (the 'project' source loads CLAUDE.md). So every group's CLAUDE.local.md is silently never read, even though container/CLAUDE.md tells each agent to use it as per-group memory.
Closes#2185.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The follow-up poll catches and logs SQLite errors but never recovers
from them. On Docker Desktop macOS, the kernel page cache for the
inbound.db bind mount can latch a torn snapshot mid-host-write (a known
virtiofs / gRPC-FUSE coherency issue), after which every fresh
openInboundDb() in the same process sees the same broken view and
emits 'database disk image is malformed' at the poll rate (2/sec).
Reopening the DB handle inside the container does not recover — only
a fresh container mount does. The fix: after CORRUPTION_STREAK_EXIT
consecutive corruption errors (~5s), log a clear message and
process.exit(75) so host-sweep respawns the container with a fresh
mount. Transient single torn reads are still tolerated.
- Add isCorruptionError() helper covering the three SQLite read-side
corruption symptoms (disk image malformed, SQLITE_CORRUPT, file is
not a database).
- Add streak counter scoped to processQuery's pollHandle so it resets
on any successful or non-corruption error.
- Add unit tests for the matcher.
Refs the cross-mount invariants documented in db/connection.ts:11-18.
fe2e881b (#2556) removed the <messages> wrapper from formatChatMessages
so the Claude Agent SDK calls the API instead of emitting a synthetic
stub, but poll-loop.test.ts still asserted the wrapper. The test has
failed on every PR built against main since. Assert the current shape:
no envelope, one self-contained <message> block per message.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CLAUDE_TRANSCRIPT_ROTATE_AGE_DAYS=0 (or negative) is documented to
disable age-based rotation, but transcriptRotateAgeMs() routed it
into the same branch as an unset var and returned the 14-day default.
Sessions intentionally configured to stay long-lived were still
rotated at 14 days, causing unexpected resets and context loss.
Distinguish unset/non-numeric (default 14d) from an explicit
non-positive override (Infinity = disabled; size alone governs).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The chat-adapter-imessage docs use photon.codes — our setup flow
and skill had the wrong domain.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- teams app create prints CLIENT_ID/CLIENT_SECRET/TENANT_ID; the existing Configure environment section expects TEAMS_APP_ID/TEAMS_APP_PASSWORD/TEAMS_APP_TENANT_ID, so without the mapping a user pasting verbatim would silently end up with an adapter that can't authenticate
- @microsoft/teams.cli registers bots via the Teams Developer Portal, skipping the Azure subscription requirement that blocks users on locked-down corporate tenants
A long-lived hub session never rotates its continuation, so the on-disk
.jsonl grows without bound — days of history plus base64 image blocks the
agent Read (screenshots from QA lanes, etc.). The SDK reloads the whole
transcript on every --resume, and past a threshold the first turn alone
exceeds the host's 30-min idle ceiling: the container is SIGKILLed before
it can reply, then the next message repeats the cycle forever. Symptom:
a hub that was responsive for days suddenly goes silent on a heavy turn.
Before resuming, the Claude provider now checks the transcript backing the
stored continuation; if it exceeds a size cap (default 12MB) or age cap
(default 14 days, from the first entry's timestamp) it archives a markdown
summary to conversations/ and starts a fresh session. Both caps are
operator-overridable via CLAUDE_TRANSCRIPT_ROTATE_BYTES /
CLAUDE_TRANSCRIPT_ROTATE_AGE_DAYS. The PreCompact archiver is refactored
into a shared archiveTranscriptFile() reused by the rotation path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
signal-cli >= 0.13 emits the account identifier as `number` in JSON
output, not `account`. The skip-if-already-linked path in signal-auth
always returned an empty list, so re-runs of setup unconditionally
tried `signal-cli link`, which fails when the data directory already
exists.
Read `number` first, fall back to `account` for older signal-cli.
Installs rtk (60–90% token savings on dev commands) into agent containers
via host binary mount + Claude Code PreToolUse hook.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Filter channel registration target options to the approver's authorized agent groups and re-check target authorization before applying a pending approval. Add regression coverage for scoped admins attempting to connect channels to out-of-scope groups.
Previously, passing --assistant-name <Name> when registering an agent
did a project-wide find-replace of "Andy" → <Name> across every
groups/*/CLAUDE.md file, and overwrote .env's ASSISTANT_NAME.
Two unintended consequences:
- Registering a second agent (e.g. "Homie") clobbered an unrelated
primary agent's CLAUDE.md. Real-world hit when wiring Homie's
Signal group on an install that already had Diddyclaw set up —
groups/diddyclaw/CLAUDE.md ended up with "Homie" references it
shouldn't have had.
- The install-wide .env ASSISTANT_NAME flipped to the most recently-
registered name, becoming the default trigger pattern for any
subsequent group registered without an explicit --assistant-name.
Both were a per-agent operation accidentally exercising project-wide
state. Now only groups/<folder>/CLAUDE.md of the agent being
registered is touched. .env is left alone — it represents the
install-wide default and shouldn't be flipped by per-agent registers.
If the install's primary-default name needs to change, that's an
explicit one-line .env edit, not a side-effect of registration.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When 2+ pending messages were bundled into <messages>...</messages> at
container/agent-runner/src/formatter.ts:162-167, the Claude Agent SDK
responded with a synthetic stub (model="<synthetic>", stop_reason=
"stop_sequence", content="No response requested.") instead of calling
the real API. The poll loop never yielded a `result` event, so the
inbound message was never marked completed; the container exited; the
next sweep tick respawned it with the same batch; same synthetic; the
transcript file ballooned with each retry until tries=5 → failed.
Single-message turns (which skipped the wrapper) worked normally — the
SDK's heuristic appears to treat the wrapped envelope as a context dump
rather than a real user turn. Each `<message id=... from=...>...</message>`
block is already self-contained, so dropping the outer wrapper lets the
N>1 case work the same way the N=1 case always has.
Fix:
function formatChatMessages(messages: MessageInRow[]): string {
return messages.map(formatSingleChat).join('\n');
}
Updates one existing test that asserted on the envelope, and adds two
regression tests: one negative (no `<messages>` wrapper), one positive
(each inbound row produces a `<message>` block in order).
Confirmed working in a real install: two stuck lanes recovered after
reducing their pending queue to 1 message, and both produced normal
replies from claude after the wipe + this fix were both applied (the
wipe alone wasn't enough — a fresh session given the same batch shape
hit the same synthetic loop).
Refs nanocoai/nanoclaw#2555 for full repro + transcript evidence.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Teaches agents WhatsApp's mention syntax (@<phone-digits>, never display
names) and where to find the sender's phone JID in inbound metadata
(content.sender). Without this, agents default to @<displayName>, which
WhatsApp can't tag — it just renders as plain text with no notification.
Two files:
- SKILL.md — frontmatter + description so the Claude Agent SDK can
discover it via skill metadata for ad-hoc lookups.
- instructions.md — always-on guidance. claude-md-compose.ts inlines
any skill that ships an instructions.md into every group's CLAUDE.md
on container spawn, so the rule is in the agent's context for every
reply (not just when the agent decides to invoke the Skill tool).
Mirrors the existing container/skills/slack-formatting/ layout for the
analogous Slack mrkdwn rules. Pairs with the adapter-side fix on the
`channels` branch that wires `mentions` through to Baileys' contextInfo
— both layers are needed for tags to render end-to-end.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The SKILL.md recommends `--method qr-browser` and references `--method qr-terminal`, but `setup/whatsapp-auth.ts` on `channels` only accepts `qr` and `pairing-code`. Running the recommended path errors out with `Unknown --method: qr-browser (expected 'qr' or 'pairing-code')`.
Add `.claude/skills/add-whatsapp/scripts/wa-qr-browser.ts` — a small wrapper that spawns the existing `--method qr` step, parses its `WHATSAPP_AUTH_QR` status blocks, and serves the rotating QR as a PNG on a local HTTP server with the default browser auto-opened. Restores the 'QR in browser' UX the skill already promises.
Update SKILL.md to invoke the wrapper for the browser method and to call `--method qr` (not `qr-terminal`) for the terminal method. Also expand the 'pairing code keeps failing' troubleshooting with the 'Couldn't link device — An error happened' server-side rejection seen on fresh dedicated numbers.
No source changes (`setup/`, `src/`) — preserves the 'browser method dropped' decision in `setup/whatsapp-auth.ts`. No new npm deps — uses `qrcode` (already pinned by this skill) and Node's built-in `http`.
Documents the fix from #2510 (closes#2465) in user-facing prose
following the RELEASING.md style guide. Single-bullet release —
no rollup opener since this is a clean one-bump cycle.
migration-014 has ON DELETE CASCADE on container_configs.agent_group_id,
so the row was already being removed by the final DELETE FROM agent_groups.
Doing the delete explicitly here mirrors the shape of every other table
in the cascade and lets the handler surface a container_configs count in
the `removed` response, matching the rest of the breakdown.