mirror of https://github.com/qwibitai/nanoclaw.git synced 2026-06-04 10:14:47 +08:00

gavrielc 47950671fa docs: add v1→v2 action-items analysis + SDK signal probe tool

- docs/v1-vs-v2/: full v1→v2 regression analysis (SUMMARY + 21 per-module
  docs + ACTION-ITEMS rollup with decisions + timezone recreation spec).
- container/agent-runner/scripts/sdk-signal-probe.ts: empirical harness
  used to characterise Claude Agent SDK event/hook/stderr timing for the
  stuck-detection design in item 9.
- src/channels/chat-sdk-bridge.ts: document the conversations Map staleness
  in a code comment; fix deferred to when dynamic group registration lands
  (ACTION-ITEMS item 17).

No runtime behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-20 01:00:04 +03:00

router: v1 vs v2

Scope

v1 (distributed across): src/v1/index.ts (startMessageLoop, trigger check), group-queue.ts (concurrency, retry), router.ts (outbound formatting, 44 LOC), sender-allowlist.ts (drop/allow)
v2: src/router.ts (317 LOC), src/session-manager.ts (346 LOC), src/container-runner.ts, src/access.ts, src/db/messaging-groups.ts (trigger_rules schema)

Routing-flow diff

v1 (polling, per-group)

Channel receives message → onMessage → store in DB
Sender allowlist drop-mode filter → discard denied
startMessageLoop polls every POLL_INTERVAL
For each group: lookup channel (findChannel O(n)), check trigger requirement, load allowlist, scan for pattern, skip if no trigger
Pull messages since lastAgentTimestamp, XML-format with tz context
If active container: write JSON to IPC file; else enqueueMessageCheck(groupJid) → GroupQueue
Retry on failure (up to 5, exp. backoff); rollback cursor on agent error
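
The retry step above can be sketched roughly as follows. This is a minimal illustration, not the v1 code: `MAX_RETRIES = 5` comes from this analysis, while `BASE_DELAY_MS`, `backoffDelayMs`, and `retryWithBackoff` are hypothetical names and values chosen for the example. Cursor rollback is left out.

```typescript
// MAX_RETRIES matches the figure quoted in this document.
// BASE_DELAY_MS is an assumed value; the real base delay is not stated here.
const MAX_RETRIES = 5;
const BASE_DELAY_MS = 1000;

/** Delay before retry number `attempt` (1-based): base * 2^(attempt - 1). */
function backoffDelayMs(attempt: number, baseMs: number = BASE_DELAY_MS): number {
  return baseMs * Math.pow(2, attempt - 1);
}

/**
 * Run `task` once, then retry up to `maxRetries` times on failure,
 * sleeping with exponential backoff between attempts.
 * `sleep` is injectable so the sketch can be exercised without real timers.
 */
async function retryWithBackoff<T>(
  task: () => Promise<T>,
  maxRetries: number = MAX_RETRIES,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    if (attempt > 0) {
      await sleep(backoffDelayMs(attempt));
    }
    try {
      return await task();
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError; // all attempts failed
}
```

With the assumed 1-second base, the five retries would wait 1 s, 2 s, 4 s, 8 s, and 16 s.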

v2 (event-driven, entity model)

Channel adapter → routeInbound(platformId, threadId, message)
Apply thread policy (adapter lacks supportsThreads → collapse threadId to null)
Resolve messaging_group (lookup or auto-create)
Extract sender → upsert users row → userId (namespaced channel_type:handle)
Lookup wired agent groups via messaging_group_agents; drop if none
pickAgent (highest priority; trigger_rules matching is TODO)
enforceAccess: owner/admin/member gate; unknown_sender_policy: strict | request_approval | public
resolveSession by session_mode (agent-shared/shared/per-thread)
insertMessage to session inbound.db, write session_routing + destinations
startTypingRefresh; wakeContainer(session) (dedup by activeContainers + wakePromises)
Container polls inbound.db, writes outbound.db; host delivery.ts polls and sends via adapter; stopTypingRefresh on container exit
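
The wake deduplication mentioned in the flow above (`activeContainers` + `wakePromises`) might look something like the sketch below. Only the two map names come from this document; `spawnContainer`, the map value types, and `spawnCount` are hypothetical stand-ins used to make the example self-contained.

```typescript
type SessionId = string;

// Sessions with a running container (value shape is assumed).
const activeContainers = new Map<SessionId, { startedAt: number }>();
// Sessions whose container is currently starting.
const wakePromises = new Map<SessionId, Promise<void>>();

let spawnCount = 0; // demonstration counter only

// Hypothetical placeholder for the real container start-up.
async function spawnContainer(sessionId: SessionId): Promise<void> {
  spawnCount++;
  await Promise.resolve(); // stands in for asynchronous start-up work
  activeContainers.set(sessionId, { startedAt: Date.now() });
}

/** Wake the container for a session, at most one start in flight per session. */
function wakeContainer(sessionId: SessionId): Promise<void> {
  // Already running: nothing to do.
  if (activeContainers.has(sessionId)) return Promise.resolve();

  // A wake is already in flight: share its promise instead of spawning again.
  const pending = wakePromises.get(sessionId);
  if (pending) return pending;

  const wake = spawnContainer(sessionId).finally(() => {
    wakePromises.delete(sessionId); // clear the in-flight marker either way
  });
  wakePromises.set(sessionId, wake);
  return wake;
}
```

Two messages arriving back to back for the same session would then share one start-up, which is the per-session behavior the capability map describes.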

Capability map

| v1 behavior | v2 location | Status | Notes |
| --- | --- | --- | --- |
| Sender allowlist drop/allow modes | — | removed | Replaced by access gate + `unknown_sender_policy` |
| Group registration auto-creating folder on first message | `router.ts` auto-creates messaging_group; group folder via `group-init.ts` on wake | moved | Admin skill path for agent groups |
| Trigger pattern matching (`requiresTrigger`, `DEFAULT_TRIGGER`) | `messaging_group_agents.trigger_rules` JSON | deferred | Schema ready; `pickAgent` has TODO comment |
| `lastAgentTimestamp` cursor tracking | — | removed | All messages written immediately to inbound.db |
| IPC file polling (`inputDir`, `_close` sentinel) | — | removed | DB polling replaces |
| GroupQueue concurrency + waiting-groups | `container-runner.ts:42-82` `activeContainers` + `wakePromises` | reimplemented | Per-session not per-group |
| Task scheduler → enqueue to GroupQueue | host-sweep due-wake + delivery system-actions | preserved | |
| Session reuse rules (session mode) | `session-manager.ts` (agent-shared/shared/per-thread) | enhanced | Explicit per-wiring |
| Remote control command interception | — | removed | |
| Idle timeout + stdin close | `container-runner.ts:135-140` `resetIdle` | kept | Heartbeat instead of stdin |
| Host-level retry on agent error | — | removed | Container is authority; host sweep retries stale only |
| Typing indicator | `delivery.ts:startTypingRefresh` | kept | Gated on heartbeat |
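
To make the three session modes in the table more concrete, here is one way the session lookup key could be derived. The mode names come from this document; their exact semantics are an assumption here (agent-shared = one session per agent group, shared = one per agent group and messaging group, per-thread = one per thread), and `SessionScope` and `sessionKey` are hypothetical names.

```typescript
type SessionMode = "agent-shared" | "shared" | "per-thread";

// Hypothetical container for the identifiers a lookup would need.
interface SessionScope {
  agentGroupId: string;
  messagingGroupId: string;
  threadId: string | null; // null when the thread policy collapsed it
}

/** Build a lookup key whose granularity depends on the session mode (assumed semantics). */
function sessionKey(mode: SessionMode, scope: SessionScope): string {
  switch (mode) {
    case "agent-shared":
      // One session for the agent group, whatever channel the message came from.
      return scope.agentGroupId;
    case "shared":
      // One session per (agent group, messaging group) pair.
      return `${scope.agentGroupId}:${scope.messagingGroupId}`;
    case "per-thread":
      // One session per thread; "root" is an assumed label for unthreaded messages.
      return `${scope.agentGroupId}:${scope.messagingGroupId}:${scope.threadId ?? "root"}`;
  }
}
```

Under these assumptions, two threads in the same messaging group map to the same key in shared mode and to different keys in per-thread mode.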

Missing from v2

Trigger-rule matching — router.ts:198 TODO. Currently every wired agent fires on every message (only priority breaks ties). Without this, multi-agent wirings don't work as intended.
Sender drop mode — v1's silent-drop for noisy users is gone. v2 only has binary allow/deny.
Cursor / state recovery — v2 writes immediately to DB. If container crashes mid-output, no host-level dedup guarantees (beyond messages_in.id PK)
Remote control — v1 intercepted /remote-control commands pre-storage; no v2 equivalent
Host-level retry with backoff on agent error — v1 had MAX_RETRIES=5 + exp. backoff on processGroupMessages; v2 only retries on stale heartbeat detection

Behavioral discrepancies

Trigger evaluation: v1 eager (skip group until trigger arrives, accumulate context); v2 TODO — once implemented, likely drops non-trigger messages at ingest (semantic change)
Session reuse: v1 single session per group; v2 multiple (one per thread on threaded platforms)
Access control timing: v1 pre-storage (cheap drop); v2 post-sender-resolution (requires users upsert)
Unknown channels: v1 silently ignored; v2 auto-creates messaging_groups row — no data loss but orphaned rows possible
Formatting: v1 host formats with tz + cursor-based message subset; v2 pushes raw JSON to inbound.db, container formats from full session history
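
Because the v2 access gate runs after sender resolution, its decision depends only on the resolved role and the configured policy. The sketch below illustrates that shape. The role names and `unknown_sender_policy` values come from this document; the `AccessDecision` type and the mapping of each policy to an outcome are assumptions inferred from the policy names.

```typescript
type Role = "owner" | "admin" | "member";
type UnknownSenderPolicy = "strict" | "request_approval" | "public";
// Hypothetical result type; the real gate may return something different.
type AccessDecision = "allow" | "deny" | "pending_approval";

/**
 * Decide whether a message may proceed.
 * `role` is null when the sender has no role in the target agent group.
 */
function enforceAccess(role: Role | null, policy: UnknownSenderPolicy): AccessDecision {
  // Known senders pass regardless of policy (assumed).
  if (role !== null) return "allow";

  // Unknown senders: the policy decides (assumed mapping).
  switch (policy) {
    case "public":
      return "allow";
    case "request_approval":
      return "pending_approval";
    case "strict":
      return "deny";
  }
}
```

Note that none of these outcomes reproduces v1's silent drop for a specific noisy sender, which is why the drop mode is listed separately as missing.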

Worth preserving?

Trigger rule matching (HIGH priority) — schema is ready; 10-line implementation in pickAgent. Currently broken-by-default for multi-agent wirings
Sender drop mode (MEDIUM) — add (agent_group_id, sender_pattern) drop table; orthogonal to privilege
State recovery (LOW) — add unique constraint on messages_in.id if not already; v2's model is simpler + more robust
Host-level retry on agent error (MEDIUM) — currently only stale containers retry. Explicit container-exit-error retry could be valuable
Remote control — decide: restore as opt-in skill or document deletion
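
As a rough idea of what the trigger-rule item could involve, the sketch below filters wirings by a trigger pattern before applying priority. The `trigger_rules` JSON shape is not given in this document, so `TriggerRules` (a single optional regex `pattern`), `Wiring`, and `matchesTrigger` are hypothetical; only `pickAgent` and the priority tie-break come from the source.

```typescript
// Assumed shape of the trigger_rules JSON column.
interface TriggerRules {
  pattern?: string; // regular expression source, matched case-insensitively
}

// Hypothetical view of one messaging_group_agents row.
interface Wiring {
  agentGroupId: string;
  priority: number;
  triggerRules: TriggerRules | null;
}

/** A wiring without rules (or without a pattern) matches every message. */
function matchesTrigger(rules: TriggerRules | null, text: string): boolean {
  if (!rules || !rules.pattern) return true;
  return new RegExp(rules.pattern, "i").test(text);
}

/** Pick the highest-priority wiring whose trigger rules match, or null if none do. */
function pickAgent(wirings: Wiring[], text: string): Wiring | null {
  const eligible = wirings.filter((w) => matchesTrigger(w.triggerRules, text));
  if (eligible.length === 0) return null;
  return eligible.reduce((best, w) => (w.priority > best.priority ? w : best));
}
```

A real implementation would also need to decide what happens to non-matching messages (dropped at ingest or kept as context), which is the semantic change flagged under behavioral discrepancies.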
