Merge branch 'main' into fix/pending-rows-idempotent

2026-06-12 18:11:51 +08:00 · 2026-04-23 22:37:22 +03:00
parent 97868af5a7 2861009d95
commit ffd38f660a
6 changed files with 281 additions and 76 deletions
@@ -0,0 +1,161 @@
+---
+name: add-codex
+description: Use Codex (CLI + AppServer) as the full agent provider — planning, tool orchestration, native compaction, MCP tools, session resume — in place of the Claude Agent SDK. ChatGPT subscription or OPENAI_API_KEY. Per-group via agent_provider. Distinct from using OpenAI as an MCP tool (where Claude remains the planner).
+---
+
+# Codex agent provider
+
+NanoClaw runs agents in a long-lived **poll loop** inside the container. The backend is selected with **`AGENT_PROVIDER`** (`claude` | `opencode` | `codex` | `mock`).
+
+Trunk ships with only the `claude` provider baked in. This skill copies the Codex provider files in from the `providers` branch, wires them into the host and container barrels, updates the Dockerfile to install the Codex CLI, and rebuilds the image.
+
+The Codex provider runs `codex app-server` as a child process and speaks JSON-RPC over stdio. That gives it native session resume, streaming events, MCP tool access, and `thread/compact/start` compaction — same feature bar as the Claude Agent SDK, without the Anthropic-only lock-in.
+
+## Install
+
+### Pre-flight
+
+If all of the following are already present, skip to **Configuration**:
+
+- `src/providers/codex.ts`
+- `container/agent-runner/src/providers/codex.ts`
+- `container/agent-runner/src/providers/codex-app-server.ts`
+- `container/agent-runner/src/providers/codex.factory.test.ts`
+- `import './codex.js';` line in `src/providers/index.ts`
+- `import './codex.js';` line in `container/agent-runner/src/providers/index.ts`
+- `ARG CODEX_VERSION` and `"@openai/codex@${CODEX_VERSION}"` in the pnpm global-install block in `container/Dockerfile`
+
+If any piece is missing, continue below. All steps are idempotent; re-running is safe.
+
+### 1. Fetch the providers branch
+
+```bash
+git fetch origin providers
+```
+
+### 2. Copy the Codex source files
+
+Wholesale copies (owned entirely by this skill — user edits to these files won't survive a re-run, as designed):
+
+```bash
+git show origin/providers:src/providers/codex.ts                                      > src/providers/codex.ts
+git show origin/providers:container/agent-runner/src/providers/codex.ts               > container/agent-runner/src/providers/codex.ts
+git show origin/providers:container/agent-runner/src/providers/codex-app-server.ts    > container/agent-runner/src/providers/codex-app-server.ts
+git show origin/providers:container/agent-runner/src/providers/codex.factory.test.ts  > container/agent-runner/src/providers/codex.factory.test.ts
+```
+
+### 3. Append the self-registration imports
+
+Each barrel gets one line. Insert it in alphabetical order among the existing provider imports to keep diffs small, and skip it if the line is already present.
+
+`src/providers/index.ts`:
+
+```typescript
+import './codex.js';
+```
+
+`container/agent-runner/src/providers/index.ts`:
+
+```typescript
+import './codex.js';
+```
+
+### 4. Add the Codex CLI to the container Dockerfile
+
+Two edits to `container/Dockerfile`, both idempotent (skip if already present):
+
+**(a)** In the "Pin CLI versions" ARG block (around line 18), add after `ARG CLAUDE_CODE_VERSION=...`:
+
+```dockerfile
+ARG CODEX_VERSION=0.121.0
+```
+
+**(b)** Add a new standalone `RUN` block for the Codex CLI, after the existing per-CLI install blocks (around line 106, right after the `@anthropic-ai/claude-code` block). The Dockerfile splits each global CLI into its own layer for cache granularity — keep that pattern; do not collapse them into a single combined `pnpm install -g` call:
+
+```dockerfile
+RUN --mount=type=cache,target=/root/.cache/pnpm \
+    pnpm install -g "@openai/codex@${CODEX_VERSION}"
+```
+
+Note: **no agent-runner package dependency** — Codex is a CLI binary, not a library. Unlike OpenCode, there's nothing to add to `container/agent-runner/package.json`.
+
+### 5. Build
+
+```bash
+pnpm run build                                         # host
+pnpm exec tsc -p container/agent-runner/tsconfig.json --noEmit   # container typecheck
+./container/build.sh                                   # agent image
+```
+
+## Configuration
+
+Codex supports two primary auth paths and one experimental BYO-endpoint path. Pick the one that matches your setup.
+
+### Option A — ChatGPT subscription (recommended for individuals)
+
+On the host (not inside the container), run Codex's OAuth login:
+
+```bash
+codex login
+```
+
+This writes `~/.codex/auth.json` with a subscription token. The host-side Codex provider ([src/providers/codex.ts](../../../src/providers/codex.ts)) copies `auth.json` into a per-session `~/.codex` directory mounted into the container — your host's own Codex CLI is never touched.
+
+No `.env` variables required for this mode.
+
+### Option B — API key (recommended for CI or API billing)
+
+```env
+OPENAI_API_KEY=sk-...
+CODEX_MODEL=gpt-5.4-mini
+```
+
+The host forwards both variables into the container. If both subscription (`auth.json`) and `OPENAI_API_KEY` are present, Codex prefers the subscription.
+
+### Option C — BYO OpenAI-compatible endpoint (experimental)
+
+Codex's built-in `openai` provider honors the `OPENAI_BASE_URL` env var directly. Point it at any OpenAI-compatible endpoint — Groq, Together, self-hosted vLLM, an OpenAI proxy, etc.
+
+```env
+OPENAI_API_KEY=...
+OPENAI_BASE_URL=https://api.groq.com/openai/v1
+CODEX_MODEL=llama-3.3-70b-versatile
+```
+
+Codex also ships first-class local-runner flags — `codex --oss --local-provider ollama` or `--local-provider lmstudio` — that auto-detect a local server. To use those inside NanoClaw, set `CODEX_MODEL` to a model your local runner serves and add the corresponding base URL; see the Codex CLI docs for the full `model_provider = oss` configuration.
+
+**Experimental caveat:** tool-calling quality depends on the model and endpoint. Not every OpenAI-compatible provider implements the full function-calling spec, and smaller models (under 30B parameters) often struggle with multi-step tool orchestration. Test before committing.
+
+### Per group / per session
+
+Schema: **`agent_groups.agent_provider`** and **`sessions.agent_provider`**. Set to `codex` for groups or sessions that should use Codex. The container receives `AGENT_PROVIDER` from the resolved value (session overrides group).
+
+`CODEX_MODEL` applies process-wide via `.env`; if you need different models for different groups, set them via `container_config.env` on the group.
+
+Extra MCP servers still come from **`NANOCLAW_MCP_SERVERS`** / `container_config.mcpServers` on the host. The runner merges them into the same `mcpServers` object passed to all providers.
+
+## Operational notes
+
+- **Spawn-per-query:** Codex's app-server is spawned fresh per query invocation, matching the OpenCode pattern. No long-lived daemon to keep healthy across sessions.
+- **Per-session `~/.codex` isolation:** each group gets its own copy of the host's `auth.json`. The container can rewrite `config.toml` freely on every wake without touching the host's Codex config.
+- **Native compaction:** kicks in automatically at 40K cumulative input tokens between turns, via `thread/compact/start`. If compaction fails, the provider logs and continues uncompacted — no fatal error.
+- **Approvals:** auto-accepted inside the container (the container is the sandbox; same posture as Claude/OpenCode).
+- **Mid-turn input:** Codex turns don't accept mid-turn messages. Follow-up `push()` calls queue and drain between turns, matching the OpenCode pattern. The poll-loop only pushes between turns anyway, so no messages are dropped.
+- **Stale thread recovery:** `isSessionInvalid` matches on stale-thread-ID errors (`thread not found`, `unknown thread`, etc.) so a cold-started app-server can recover cleanly when it sees a stored continuation it no longer has.
+
+## Verify
+
+```bash
+grep -qF "./codex.js" container/agent-runner/src/providers/index.ts && echo "container barrel: OK"
+grep -qF "./codex.js" src/providers/index.ts && echo "host barrel: OK"
+grep -qF "@openai/codex@" container/Dockerfile && echo "Dockerfile install: OK"
+(cd container/agent-runner && bun test src/providers/codex.factory.test.ts)
+```
+
+After the image rebuild, set `agent_provider = 'codex'` on a test group and send a message. A successful round-trip looks like:
+
+- `init` event with a stable thread ID as continuation
+- One or more `activity` / `progress` events during the turn
+- `result` event with the model's reply
+
+If the agent hangs or errors, check that `~/.codex/auth.json` exists on the host (Option A) or that `OPENAI_API_KEY` is being forwarded correctly (Option B). To confirm the latter, `docker exec` into a running container and run `env | grep -i openai`.
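Note on the stale-thread recovery bullet in the skill above: a minimal sketch of what an `isSessionInvalid` check could look like, assuming the provider sees the app-server failure as an `Error` or a string. Only the two quoted error phrases come from the skill text. The pattern list and the `nextContinuation` helper are hypothetical, and the shipped `codex.ts` may match more cases.

```typescript
// Hypothetical sketch, not the shipped implementation in codex.ts.
// The two phrases are the ones named in the skill; "etc." is not filled in.
const STALE_THREAD_PATTERNS: RegExp[] = [/thread not found/i, /unknown thread/i];

// True when an error message looks like a stale thread ID.
function isSessionInvalid(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return STALE_THREAD_PATTERNS.some((re) => re.test(message));
}

// Hypothetical caller: drop the stored continuation so the next query
// starts a fresh thread; keep it for any unrelated error.
function nextContinuation(stored: string | undefined, err: unknown): string | undefined {
  return isSessionInvalid(err) ? undefined : stored;
}
```

Matching on message text is brittle if the CLI rewords its errors, so a pinned `CODEX_VERSION` and the factory test are the natural guardrails here.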
@@ -1,6 +1,6 @@
 {
  "name": "nanoclaw",
-  "version": "2.0.7",
+  "version": "2.0.8",
  "description": "Personal Claude assistant. Lightweight, secure, customizable.",
  "type": "module",
  "packageManager": "pnpm@10.33.0",
@@ -1,5 +1,5 @@
-<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="90" height="20" role="img" aria-label="128k tokens, 64% of context window">
-  <title>128k tokens, 64% of context window</title>
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="90" height="20" role="img" aria-label="129k tokens, 64% of context window">
+  <title>129k tokens, 64% of context window</title>
  <linearGradient id="s" x2="0" y2="100%">
    <stop offset="0" stop-color="#bbb" stop-opacity=".1"/>
    <stop offset="1" stop-opacity=".1"/>
@@ -15,8 +15,8 @@
      <g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" font-size="11">
        <text aria-hidden="true" x="26" y="15" fill="#010101" fill-opacity=".3">tokens</text>
        <text x="26" y="14">tokens</text>
-        <text aria-hidden="true" x="71" y="15" fill="#010101" fill-opacity=".3">128k</text>
-        <text x="71" y="14">128k</text>
+        <text aria-hidden="true" x="71" y="15" fill="#010101" fill-opacity=".3">129k</text>
+        <text x="71" y="14">129k</text>
      </g>
    </g>
  </a>
@@ -1,5 +1,7 @@
-import { describe, it, expect, beforeEach } from 'vitest';
+import { describe, it, expect, beforeEach, afterEach } from 'vitest';
 import fs from 'fs';
+import os from 'os';
+import path from 'path';

 import Database from 'better-sqlite3';

@@ -17,58 +19,63 @@ describe('environment detection', () => {
  });
 });

-describe('registered groups DB query', () => {
-  let db: Database.Database;
+describe('detectRegisteredGroups', () => {
+  let tempDir: string;

  beforeEach(() => {
-    db = new Database(':memory:');
-    db.exec(`CREATE TABLE IF NOT EXISTS registered_groups (
-      jid TEXT PRIMARY KEY,
-      name TEXT NOT NULL,
-      folder TEXT NOT NULL UNIQUE,
-      trigger_pattern TEXT NOT NULL,
-      added_at TEXT NOT NULL,
-      container_config TEXT,
-      requires_trigger INTEGER DEFAULT 1
-    )`);
+    tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'nanoclaw-env-test-'));
+    fs.mkdirSync(path.join(tempDir, 'data'), { recursive: true });
  });

-  it('returns 0 for empty table', () => {
-    const row = db
-      .prepare('SELECT COUNT(*) as count FROM registered_groups')
-      .get() as { count: number };
-    expect(row.count).toBe(0);
+  afterEach(() => {
+    fs.rmSync(tempDir, { recursive: true, force: true });
  });

-  it('returns correct count after inserts', () => {
-    db.prepare(
-      `INSERT INTO registered_groups (jid, name, folder, trigger_pattern, added_at, requires_trigger)
-       VALUES (?, ?, ?, ?, ?, ?)`,
-    ).run(
-      '123@g.us',
-      'Group 1',
-      'group-1',
-      '@Andy',
-      '2024-01-01T00:00:00.000Z',
-      1,
-    );
+  it('returns false when no registration state exists', async () => {
+    const { detectRegisteredGroups } = await import('./environment.js');
+    expect(detectRegisteredGroups(tempDir)).toBe(false);
+  });

-    db.prepare(
-      `INSERT INTO registered_groups (jid, name, folder, trigger_pattern, added_at, requires_trigger)
-       VALUES (?, ?, ?, ?, ?, ?)`,
-    ).run(
-      '456@g.us',
-      'Group 2',
-      'group-2',
-      '@Andy',
-      '2024-01-01T00:00:00.000Z',
-      1,
-    );
+  it('detects pre-migration registered_groups.json', async () => {
+    const { detectRegisteredGroups } = await import('./environment.js');
+    fs.writeFileSync(path.join(tempDir, 'data', 'registered_groups.json'), '[]');
+    expect(detectRegisteredGroups(tempDir)).toBe(true);
+  });

-    const row = db
-      .prepare('SELECT COUNT(*) as count FROM registered_groups')
-      .get() as { count: number };
-    expect(row.count).toBe(2);
+  it('returns false for an empty v2 central DB', async () => {
+    const { detectRegisteredGroups } = await import('./environment.js');
+    const db = new Database(path.join(tempDir, 'data', 'v2.db'));
+    db.exec(`
+      CREATE TABLE agent_groups (id TEXT PRIMARY KEY);
+      CREATE TABLE messaging_group_agents (
+        id TEXT PRIMARY KEY,
+        messaging_group_id TEXT NOT NULL,
+        agent_group_id TEXT NOT NULL
+      );
+    `);
+    db.close();
+
+    expect(detectRegisteredGroups(tempDir)).toBe(false);
+  });
+
+  it('detects wired agent groups in the v2 central DB', async () => {
+    const { detectRegisteredGroups } = await import('./environment.js');
+    const db = new Database(path.join(tempDir, 'data', 'v2.db'));
+    db.exec(`
+      CREATE TABLE agent_groups (id TEXT PRIMARY KEY);
+      CREATE TABLE messaging_group_agents (
+        id TEXT PRIMARY KEY,
+        messaging_group_id TEXT NOT NULL,
+        agent_group_id TEXT NOT NULL
+      );
+    `);
+    db.prepare('INSERT INTO agent_groups (id) VALUES (?)').run('ag-1');
+    db.prepare(
+      'INSERT INTO messaging_group_agents (id, messaging_group_id, agent_group_id) VALUES (?, ?, ?)',
+    ).run('mga-1', 'mg-1', 'ag-1');
+    db.close();
+
+    expect(detectRegisteredGroups(tempDir)).toBe(true);
  });
 });

@@ -7,11 +7,35 @@ import path from 'path';

 import Database from 'better-sqlite3';

-import { STORE_DIR } from '../src/config.js';
 import { log } from '../src/log.js';
 import { commandExists, getPlatform, isHeadless, isWSL } from './platform.js';
 import { emitStatus } from './status.js';

+export function detectRegisteredGroups(projectRoot: string): boolean {
+  if (fs.existsSync(path.join(projectRoot, 'data', 'registered_groups.json'))) {
+    return true;
+  }
+
+  const dbPath = path.join(projectRoot, 'data', 'v2.db');
+  if (!fs.existsSync(dbPath)) return false;
+
+  let db: Database.Database | null = null;
+  try {
+    db = new Database(dbPath, { readonly: true });
+    const row = db
+      .prepare(
+        `SELECT COUNT(DISTINCT ag.id) as count FROM agent_groups ag
+         JOIN messaging_group_agents mga ON mga.agent_group_id = ag.id`,
+      )
+      .get() as { count: number };
+    return row.count > 0;
+  } catch {
+    return false;
+  } finally {
+    db?.close();
+  }
+}
+
 export async function run(_args: string[]): Promise<void> {
  const projectRoot = process.cwd();

@@ -39,26 +63,7 @@ export async function run(_args: string[]): Promise<void> {
  const authDir = path.join(projectRoot, 'store', 'auth');
  const hasAuth = fs.existsSync(authDir) && fs.readdirSync(authDir).length > 0;

-  let hasRegisteredGroups = false;
-  // Check JSON file first (pre-migration)
-  if (fs.existsSync(path.join(projectRoot, 'data', 'registered_groups.json'))) {
-    hasRegisteredGroups = true;
-  } else {
-    // Check SQLite directly using better-sqlite3 (no sqlite3 CLI needed)
-    const dbPath = path.join(STORE_DIR, 'messages.db');
-    if (fs.existsSync(dbPath)) {
-      try {
-        const db = new Database(dbPath, { readonly: true });
-        const row = db
-          .prepare('SELECT COUNT(*) as count FROM registered_groups')
-          .get() as { count: number };
-        if (row.count > 0) hasRegisteredGroups = true;
-        db.close();
-      } catch {
-        // Table might not exist yet
-      }
-    }
-  }
+  const hasRegisteredGroups = detectRegisteredGroups(projectRoot);

  // Check for existing OpenClaw installation
  const homedir = (await import('os')).homedir();
@@ -81,6 +81,26 @@ export interface ChatSdkBridgeConfig {
 * chunk boundary will render as two independent blocks on the receiving
 * platform, which is the same behavior as manually re-opening a fence.
 */
+/**
+ * Decode the actual option value from a button callback. Buttons are encoded
+ * with an integer index (to keep under Telegram's 64-byte callback_data cap),
+ * and the real value is looked up via `getAskQuestionRender(questionId)`.
+ * Falls back to treating the tail as a literal value so old in-flight cards
+ * (encoded before this shortening landed) still resolve.
+ */
+function resolveSelectedOption(
+  render: { options: NormalizedOption[] } | undefined,
+  eventValue: string | undefined,
+  tail: string | undefined,
+): string {
+  const candidate = eventValue || tail || '';
+  if (render && /^\d+$/.test(candidate)) {
+    const idx = Number(candidate);
+    if (render.options[idx]) return render.options[idx].value;
+  }
+  return candidate;
+}
+
 export function splitForLimit(text: string, limit: number): string[] {
  if (text.length <= limit) return [text];
  const chunks: string[] = [];
@@ -240,11 +260,15 @@ export function createChatSdkBridge(config: ChatSdkBridgeConfig): ChannelAdapter
        const parts = event.actionId.split(':');
        if (parts.length < 3) return;
        const questionId = parts[1];
-        const selectedOption = event.value || '';
+        const tail = parts.slice(2).join(':');
        const userId = event.user?.userId || '';

        // Resolve render metadata BEFORE dispatching onAction (which deletes the row).
        const render = getAskQuestionRender(questionId);
+        // New format: button id/value is an integer index into options (kept
+        // short to fit Telegram's 64-byte callback_data cap). Old format:
+        // the full value is embedded in actionId/value directly.
+        const selectedOption = resolveSelectedOption(render, event.value, tail);
        const title = render?.title ?? '❓ Question';
        const matched = render?.options.find((o) => o.value === selectedOption);
        const selectedLabel = matched?.selectedLabel ?? selectedOption ?? '(clicked)';
@@ -348,8 +372,13 @@ export function createChatSdkBridge(config: ChatSdkBridgeConfig): ChannelAdapter
          children: [
            CardText(question),
            Actions(
-              options.map((opt) =>
-                Button({ id: `ncq:${questionId}:${opt.value}`, label: opt.label, value: opt.value }),
+              // Encode button id/value with the option index rather than the
+              // full value. Telegram caps callback_data at 64 bytes, and
+              // long values (e.g. ISO datetimes, URLs) push the JSON payload
+              // well past that. The onAction handlers resolve the index back
+              // to the real value via getAskQuestionRender(questionId).
+              options.map((opt, idx) =>
+                Button({ id: `ncq:${questionId}:${idx}`, label: opt.label, value: String(idx) }),
              ),
            ),
          ],
@@ -507,12 +536,12 @@ async function handleForwardedEvent(

      // Parse the selected option from custom_id
      let questionId: string | undefined;
-      let selectedOption: string | undefined;
+      let tail: string | undefined;
      if (customId?.startsWith('ncq:')) {
        const colonIdx = customId.indexOf(':', 4); // after "ncq:"
        if (colonIdx !== -1) {
          questionId = customId.slice(4, colonIdx);
-          selectedOption = customId.slice(colonIdx + 1);
+          tail = customId.slice(colonIdx + 1);
        }
      }

@@ -521,6 +550,9 @@ async function handleForwardedEvent(
        ((interaction.message as Record<string, unknown>)?.embeds as Array<Record<string, unknown>>) || [];
      const originalDescription = (originalEmbeds[0]?.description as string) || '';
      const render = questionId ? getAskQuestionRender(questionId) : undefined;
+      // Discord custom_id mirrors the new index-based encoding (see Button
+      // construction). Decode back to the real option value for downstream.
+      const selectedOption = resolveSelectedOption(render, tail, tail);
      const cardTitle = render?.title ?? ((originalEmbeds[0]?.title as string) || '❓ Question');
      const matchedOpt = render?.options.find((o) => o.value === selectedOption);
      const selectedLabel = matchedOpt?.selectedLabel ?? selectedOption ?? customId;
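
Note on the index-based button encoding in this file: a self-contained sketch of the encode/decode round trip, assuming a reduced `NormalizedOption` with only `label` and `value`. The question ID, option values, and `byteLength` helper are hypothetical; `resolveSelectedOption` mirrors the decoding rule added in the hunk above.

```typescript
// Reduced to the fields this sketch needs (assumption).
interface NormalizedOption {
  label: string;
  value: string;
}

// Same decoding rule as the diff: an all-digit candidate is treated as an
// index into render.options; anything else is returned as a literal value.
function resolveSelectedOption(
  render: { options: NormalizedOption[] } | undefined,
  eventValue: string | undefined,
  tail: string | undefined,
): string {
  const candidate = eventValue ?? tail ?? '';
  if (render && /^\d+$/.test(candidate)) {
    const idx = Number(candidate);
    if (render.options[idx]) return render.options[idx].value;
  }
  return candidate;
}

// UTF-8 byte length, which is what a 64-byte cap is measured in.
const byteLength = (s: string): number => new TextEncoder().encode(s).length;

const questionId = 'q-123'; // hypothetical ID
const options: NormalizedOption[] = [
  {
    label: 'Confirm',
    value: 'https://example.com/calendar/events/2026-06-12T18:11:51+08:00/confirm?attendee=andy',
  },
  { label: 'Skip', value: 'skip' },
];

// Old encoding embeds the full value; new encoding embeds only the index.
const oldId = `ncq:${questionId}:${options[0].value}`;
const newId = `ncq:${questionId}:0`;

console.log(byteLength(oldId) > 64, byteLength(newId) <= 64);
console.log(resolveSelectedOption({ options }, '0', '0'));
console.log(resolveSelectedOption({ options }, undefined, 'skip'));
```

One consequence of the fallback rule worth keeping in mind: an old in-flight card whose literal value happens to be all digits and within range would be read as an index rather than as the value itself.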