mirror of https://github.com/qwibitai/nanoclaw.git synced 2026-06-04 10:14:47 +08:00

Files

T

gavrielc 47950671fa docs: add v1→v2 action-items analysis + SDK signal probe tool

- docs/v1-vs-v2/: full v1→v2 regression analysis (SUMMARY + 21 per-module
  docs + ACTION-ITEMS rollup with decisions + timezone recreation spec).
- container/agent-runner/scripts/sdk-signal-probe.ts: empirical harness
  used to characterise Claude Agent SDK event/hook/stderr timing for the
  stuck-detection design in item 9.
- src/channels/chat-sdk-bridge.ts: document the conversations Map staleness
  in a code comment; fix deferred to when dynamic group registration lands
  (ACTION-ITEMS item 17).

No runtime behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-20 01:00:04 +03:00

3.4 KiB

Raw Blame History

session-cleanup: v1 vs v2

Scope

v1: src/v1/session-cleanup.ts (26 LOC) + scripts/cleanup-sessions.sh (151 LOC) — cadence 24h
v2: src/host-sweep.ts (174 LOC) primary, plus src/container-runtime.ts:60-80 (orphan cleanup), src/session-manager.ts (heartbeat path)

Capability map

v1 behavior	v2 location	Status	Notes
Cleanup cadence 24h	`host-sweep.ts:31` 60s sweep	changed	Continuous monitoring
Stale session detection via JSONL mtime	`host-sweep.ts:116-151` heartbeat file mtime	simplified	Heartbeat replaces JSONL
Heartbeat threshold	`STALE_THRESHOLD_MS = 10 * 60 * 1000` (`host-sweep.ts:32`)	new	10 min
Stuck-processing detection	`getStuckProcessingIds()` via outbound.db (`host-sweep.ts:134`)	new
Retry with exponential backoff	`BACKOFF_BASE_MS * 2^tries` (`host-sweep.ts:145`)	new
Max retries	`MAX_TRIES = 5` (`host-sweep.ts:33`)	new	Messages → failed after 5
Explicit container kill on stale	—	not done	Stale detection resets messages, doesn't stop container
JSONL + tool-results cleanup	—	removed	No artifact cleanup (SQLite persists in DB)
Artifact cleanup (debug logs, todos, telemetry)	—	removed	Per-type retention windows gone
Orphan container cleanup	`container-runtime.ts:60-80` `cleanupOrphans()`	new	At startup only
Active session detection via `store/messages.db`	`getActiveSessions()` from `v2.db` (`host-sweep.ts:52`)	changed	DB schema different
Sync `processing_ack` (outbound.db → inbound.db)	`syncProcessingAcks()` (`host-sweep.ts:87`)	new
Wake container for due messages	`countDueMessages()` + `wakeContainer()` (`host-sweep.ts:91-96`)	new	Replaces scheduler's role
Recurrence firing	`handleRecurrence()` (`host-sweep.ts:154-173`)	new	Cron-parsed next-run insertion

Missing from v2

Artifact cleanup — v1 pruned JSONLs (7d), debug logs (3d), todos (3d), telemetry (7d), group logs (7d). v2 has none; if v1 leftovers exist on disk, they'll accumulate
Explicit container termination on stale detection — v2 marks messages as retry-eligible but leaves the stale container running; orphan cleanup only runs at next startup
Configurable retention windows — v1 had per-artifact-type retention; v2 constants are hardcoded

Behavioral discrepancies

Aspect	v1	v2
Cadence	daily batch	60s continuous
Stale trigger	24h-old JSONL	10-min heartbeat
Retry	none (session removed)	5 tries, exp. backoff
Container wake	via message loop	via `countDueMessages()` in sweep
Transactions	implicit (offline script)	explicit per-session try/finally

Worth preserving?

Stop running containers on stale detection — currently only startup cleanupOrphans() removes them. If a container truly dies while the host runs, the host will retry messages but won't kill the shell. Low-cost fix: stopContainer(name) when heartbeat is stale AND processing_ack is stuck
Artifact cleanup migration — if v1 data exists on disk post-migration, one-time prune is worth scripting. Not a v2 runtime concern
Configurable thresholds — STALE_THRESHOLD_MS / MAX_TRIES could live in config.ts for operational tuning; minor improvement
Continuous sweep + recurrence + orphan cleanup are all significant improvements; keep as-is

3.4 KiB Raw Blame History

session-cleanup: v1 vs v2

Scope

Capability map

Missing from v2

Behavioral discrepancies

Worth preserving?

3.4 KiB

Raw Blame History