diff --git a/.claude/skills/add-ollama-provider/SKILL.md b/.claude/skills/add-ollama-provider/SKILL.md
new file mode 100644
index 000000000..83f7e5ae6
--- /dev/null
+++ b/.claude/skills/add-ollama-provider/SKILL.md
@@ -0,0 +1,179 @@
+---
+name: add-ollama-provider
+description: Route a NanoClaw agent group to a local Ollama model instead of the Anthropic API. Ollama speaks the Anthropic API natively (v1/messages), so no provider code changes are needed — just env var overrides and a model setting. Use when the user wants to run their agent locally, cut API costs, or experiment with open-weight models. See docs/ollama.md for background.
+---
+
+# Add Ollama Provider
+
+Routes an agent group to a local Ollama instance instead of the Anthropic API.
+See `docs/ollama.md` for how this works and the tradeoffs involved.
+
+## Prerequisites
+
+1. **Ollama is installed and running** on the host — verify: `curl -s http://localhost:11434/api/tags`
+2. **A model is pulled** — e.g. `ollama pull gemma4` or `ollama pull qwen3-coder`
+3. **The agent group already exists** — run `/init-first-agent` first if needed
+
+## 1. Check source support
+
+The feature requires two fields in `ContainerConfig` (`env` and `blockedHosts`) and their
+corresponding wiring in `container-runner.ts`. Check if already present:
+
+```bash
+grep -c 'blockedHosts' src/container-config.ts src/container-runner.ts
+```
+
+If either count is 0, apply the changes in steps 1a and 1b. Otherwise skip to step 2.
+
+### 1a. Extend ContainerConfig
+
+In `src/container-config.ts`, add to the `ContainerConfig` interface:
+
+```typescript
+env?: Record<string, string>;
+blockedHosts?: string[];
+```
+
+And in `readContainerConfig`, add inside the returned object:
+
+```typescript
+env: raw.env,
+blockedHosts: raw.blockedHosts,
+```
+
+### 1b. 
Wire into container-runner
+
+In `src/container-runner.ts`, after the `NANOCLAW_MCP_SERVERS` block, add:
+
+```typescript
+// Per-agent-group env overrides — applied last to win over OneCLI values.
+if (containerConfig.env) {
+  for (const [key, value] of Object.entries(containerConfig.env)) {
+    args.push('-e', `${key}=${value}`);
+  }
+}
+
+// Blocked hosts: resolve to 0.0.0.0 so they are unreachable inside the container.
+if (containerConfig.blockedHosts) {
+  for (const host of containerConfig.blockedHosts) {
+    args.push('--add-host', `${host}:0.0.0.0`);
+  }
+}
+```
+
+### 1c. Fix home directory permissions (if not already done)
+
+The container may run as your host uid (not uid 1000). Check the Dockerfile:
+
+```bash
+grep 'chmod.*home/node' container/Dockerfile
+```
+
+If it shows `chmod 755`, change it to `chmod 777` so any uid can write there.
+Then rebuild the container image: `./container/build.sh`
+
+## 2. Identify the setup
+
+Ask the user (plain text, not AskUserQuestion):
+
+1. **Which agent group?** List available groups: `sqlite3 data/v2.db "SELECT folder, name FROM agent_groups;"`
+2. **Which Ollama model?** List available: `curl -s http://localhost:11434/api/tags | grep '"name"'`
+3. **Block Anthropic API?** Recommended yes — prevents accidental spend if config drifts.
+
+Record as `FOLDER`, `MODEL`, and `BLOCK_ANTHROPIC`.
+
+## 3. Configure container.json
+
+Read `groups//container.json`. Add (or merge into) an `env` block and optionally `blockedHosts`:
+
+```json
+{
+  "env": {
+    "ANTHROPIC_BASE_URL": "http://host.docker.internal:11434",
+    "ANTHROPIC_API_KEY": "ollama",
+    "NO_PROXY": "host.docker.internal",
+    "no_proxy": "host.docker.internal"
+  },
+  "blockedHosts": ["api.anthropic.com"]
+}
+```
+
+Omit `blockedHosts` if the user declined step 2.
+
+**Why these vars:** `ANTHROPIC_BASE_URL` redirects the Anthropic SDK to Ollama.
+`ANTHROPIC_API_KEY=ollama` satisfies the SDK's key requirement (Ollama ignores it).
+`NO_PROXY` bypasses the OneCLI HTTPS proxy for requests to `host.docker.internal`
+so they reach Ollama directly instead of going through the credential gateway.
+
+## 4. Set the model
+
+Read the agent group's shared Claude settings:
+
+```bash
+# Find the agent group ID
+AG_ID=$(sqlite3 data/v2.db "SELECT id FROM agent_groups WHERE folder='';")
+SETTINGS=data/v2-sessions/$AG_ID/.claude-shared/settings.json
+```
+
+Add `"model": ""` to that settings file. Create the file if it doesn't exist:
+
+```json
+{
+  "model": "gemma4:latest"
+}
+```
+
+If the file already has content, merge the `model` key in — don't overwrite existing keys.
+
+**Why here and not container.json:** Claude Code reads its model from its own settings
+file, not from env vars. This file is bind-mounted into the container as `~/.claude/settings.json`.
+
+## 5. Build and restart
+
+```bash
+export PATH="/opt/homebrew/bin:$PATH"
+pnpm run build
+launchctl unload ~/Library/LaunchAgents/com.nanoclaw.plist
+launchctl load ~/Library/LaunchAgents/com.nanoclaw.plist
+# Linux: systemctl --user restart nanoclaw
+```
+
+## 6. Verify
+
+Send a message to the agent. Then confirm:
+
+```bash
+# Ollama shows the model as active
+curl -s http://localhost:11434/api/ps | grep '"name"'
+
+# Container has the right env vars
+CTR=$(docker ps --filter "name=nanoclaw-v2-" --format "{{.Names}}" | head -1)
+docker inspect "$CTR" --format '{{json .HostConfig.ExtraHosts}}'
+docker exec "$CTR" env | grep ANTHROPIC
+```
+
+Expected: `api.anthropic.com:0.0.0.0` in ExtraHosts, `ANTHROPIC_BASE_URL=http://host.docker.internal:11434`.
+
+## Reverting to Claude
+
+To switch back to the Anthropic API:
+
+1. Remove the `env` and `blockedHosts` keys from `groups//container.json`
+2. Remove `"model"` from the shared settings file
+3. Restart the service
+
+No rebuild needed — both files are read at container spawn time.
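Reviewer sketch for step 4 and for revert step 2: the "merge, don't overwrite" rule and its reversal can be expressed as two small pure functions. This is only an illustration; `withModel`, `withoutModel`, and the sample `permissions` key are hypothetical names chosen here, not code from the NanoClaw repository.

```typescript
// Hypothetical helpers (not part of NanoClaw): operate on the parsed
// contents of the shared settings.json described in step 4.
type Settings = Record<string, unknown>;

/** Return a copy of `settings` with `model` set; every other key is preserved. */
function withModel(settings: Settings, model: string): Settings {
  return { ...settings, model };
}

/** Return a copy of `settings` without the `model` key (the revert path). */
function withoutModel(settings: Settings): Settings {
  const copy: Settings = { ...settings };
  delete copy["model"];
  return copy;
}

// Sample input: the `permissions` key is made-up existing content.
const existing: Settings = { permissions: { allow: ["Bash"] } };
const local = withModel(existing, "gemma4:latest");
const reverted = withoutModel(local);

console.log(JSON.stringify(local, null, 2));
console.log(JSON.stringify(reverted, null, 2));
```

Both helpers return new objects rather than mutating their input, so a caller can read the file once, compute the new contents, and write them back in a single step.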
+
+## Troubleshooting
+
+**Agent hangs, no response:** Ollama may be loading the model cold (large models take 10–30s).
+Watch `curl -s http://localhost:11434/api/ps` — the model appears once loaded.
+
+**"model not found" error in container logs:** The model name in settings.json doesn't match
+what Ollama has. Run `ollama list` on the host and use the exact name shown.
+
+**Responses claim to be Claude:** The model was trained on data that includes Claude conversations.
+Add a line to `groups//CLAUDE.md` telling it what model it runs on.
+
+**Agent responds but Ollama shows no activity:** `NO_PROXY` may not have taken effect for
+`http_proxy` (lowercase). Add both `NO_PROXY` and `no_proxy` to the env block.
diff --git a/docs/ollama.md b/docs/ollama.md
new file mode 100644
index 000000000..0ea025393
--- /dev/null
+++ b/docs/ollama.md
@@ -0,0 +1,88 @@
+# Running Agents on Local Ollama
+
+NanoClaw agents can be routed to a local [Ollama](https://ollama.com) instance instead of the Anthropic API. This cuts API costs to zero and keeps all inference on your hardware.
+
+## How It Works
+
+Ollama exposes an Anthropic-compatible `/v1/messages` endpoint. The Claude Code CLI (which runs inside agent containers) uses the Anthropic SDK, which reads `ANTHROPIC_BASE_URL` to find the API host. Pointing that variable at Ollama is all that's needed — no new provider code, no changes to the agent runtime.
+
+```
+┌─────────────────────────────┐
+│  Agent container            │
+│                             │
+│  Claude Code CLI            │
+│    ↓ ANTHROPIC_BASE_URL     │
+│  http://host.docker.        │      ┌──────────────────┐
+│    internal:11434    ───────┼─────▶│  Ollama :11434   │
+│                             │      │  gemma4:latest   │
+└─────────────────────────────┘      └──────────────────┘
+```
+
+`host.docker.internal` is Docker's special hostname that resolves to the host machine from inside a container, so Ollama running on the host is reachable at that address. Docker Desktop (macOS, Windows) provides it by default; on Linux with plain Docker Engine it usually has to be mapped explicitly, for example with `--add-host=host.docker.internal:host-gateway`.
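Reviewer sketch for "How It Works": to make the redirect concrete, the snippet below builds the request an Anthropic-style client would send once its base URL points at Ollama. It is a minimal illustration, not Claude Code's or the SDK's actual internals; `buildMessagesRequest` and `MessagesRequest` are hypothetical names, `max_tokens: 256` is an arbitrary value, and whether Ollama inspects the `x-api-key` and `anthropic-version` headers is an assumption.

```typescript
// Hypothetical helper: shows the shape of an Anthropic Messages API call
// when the base URL is swapped for a local Ollama instance.
interface MessagesRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

function buildMessagesRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  prompt: string,
): MessagesRequest {
  // Tolerate a trailing slash on the base URL.
  const base = baseUrl.replace(/\/+$/, "");
  return {
    url: `${base}/v1/messages`,
    headers: {
      "content-type": "application/json",
      "x-api-key": apiKey, // placeholder key; assumed not to be validated locally
      "anthropic-version": "2023-06-01",
    },
    body: JSON.stringify({
      model,
      max_tokens: 256, // arbitrary illustrative limit
      messages: [{ role: "user", content: prompt }],
    }),
  };
}

const request = buildMessagesRequest(
  "http://host.docker.internal:11434",
  "ollama",
  "gemma4:latest",
  "Say hello.",
);
console.log(request.url);
```

The result can be passed to any HTTP client, for example `fetch(request.url, { method: "POST", headers: request.headers, body: request.body })`. Only the base URL differs from a call to the hosted API, which is why no provider code has to change.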
+
+## The OneCLI Complication
+
+NanoClaw normally runs API calls through a OneCLI HTTPS proxy that injects real credentials in place of a placeholder key. When redirecting to Ollama you need to bypass that proxy so requests go direct. Two env vars handle this:
+
+- `NO_PROXY=host.docker.internal` — tells the Anthropic SDK's HTTP client to skip the proxy for that hostname
+- `no_proxy=host.docker.internal` — lowercase variant for tools that check the lowercase form
+
+Both are set in the agent group's `container.json` alongside `ANTHROPIC_BASE_URL`.
+
+## Network Isolation
+
+Setting `ANTHROPIC_BASE_URL` redirects requests but doesn't prevent a misconfigured agent from accidentally reaching `api.anthropic.com` directly. The `blockedHosts` field in `container.json` adds a Docker `--add-host` flag that resolves the domain to `0.0.0.0`, making it unreachable from inside the container:
+
+```json
+"blockedHosts": ["api.anthropic.com"]
+```
+
+With this in place, even if the model setting drifts back to a Claude model name, the API call will fail immediately rather than silently billing your account.
+
+## Model Selection
+
+The Claude Code CLI reads its model from `~/.claude/settings.json` inside the container, which NanoClaw bind-mounts from `data/v2-sessions//.claude-shared/settings.json`. Set `"model": "gemma4:latest"` (or whatever Ollama model you've pulled) there. Use the exact name from `ollama list`.
+
+Model selection considerations for Apple Silicon:
+
+| Model | Size | Quality | Speed (M4 Pro) |
+|-------|------|---------|----------------|
+| `gemma4:latest` | 12B | Good general-purpose | Fast |
+| `qwen3-coder:latest` | 32B | Excellent for coding tasks | Moderate |
+| `llama3.2:latest` | 3B | Basic | Very fast |
+
+The agent uses tool calls extensively (read/write files, shell commands). Models that support tool use reliably work best. Gemma 4 and Qwen 3 Coder both handle structured tool calls well.
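Reviewer sketch for the proxy-bypass and network-isolation sections: the settings they describe can be checked mechanically before a container is spawned. This is a sketch under stated assumptions: `checkOllamaConfig`, `hostnameOf`, and `ContainerConfigLike` are hypothetical names, and the `NO_PROXY` check is a plain exact-match on comma-separated entries, which is simpler than the suffix and wildcard matching real HTTP clients may apply.

```typescript
// Hypothetical checker: field names follow the container.json example,
// but the function itself is illustrative and not part of NanoClaw.
interface ContainerConfigLike {
  env?: Record<string, string>;
  blockedHosts?: string[];
}

/** Extract the hostname from an http(s) URL using a simple pattern. */
function hostnameOf(url: string): string | undefined {
  const match = /^https?:\/\/([^/:]+)/.exec(url);
  return match ? match[1] : undefined;
}

/** Return human-readable problems; an empty array means the config looks consistent. */
function checkOllamaConfig(config: ContainerConfigLike): string[] {
  const problems: string[] = [];
  const env = config.env ?? {};
  const baseUrl = env["ANTHROPIC_BASE_URL"];
  const host = baseUrl ? hostnameOf(baseUrl) : undefined;
  if (!host) {
    problems.push("ANTHROPIC_BASE_URL is missing or not an http(s) URL");
    return problems;
  }
  // Both spellings should list the Ollama host so the proxy is bypassed.
  for (const name of ["NO_PROXY", "no_proxy"]) {
    const entries = (env[name] ?? "").split(",").map((s) => s.trim());
    if (entries.indexOf(host) === -1) {
      problems.push(`${name} does not list ${host}`);
    }
  }
  // Blocking is optional in the skill, so this is reported as a recommendation.
  if ((config.blockedHosts ?? []).indexOf("api.anthropic.com") === -1) {
    problems.push("api.anthropic.com is not in blockedHosts (recommended)");
  }
  return problems;
}

const report = checkOllamaConfig({
  env: {
    ANTHROPIC_BASE_URL: "http://host.docker.internal:11434",
    ANTHROPIC_API_KEY: "ollama",
    NO_PROXY: "host.docker.internal",
    no_proxy: "host.docker.internal",
  },
  blockedHosts: ["api.anthropic.com"],
});
console.log(report.length === 0 ? "config looks consistent" : report.join("\n"));
```

Returning a list of problems instead of throwing lets a caller decide whether a missing `blockedHosts` entry is a warning or a hard failure.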
+
+## What Changes at the Code Level
+
+Three files need to support this feature. See `/add-ollama-provider` for the exact changes.
+
+**`src/container-config.ts`** — `ContainerConfig` interface needs `env` and `blockedHosts` fields so the per-group JSON can carry them.
+
+**`src/container-runner.ts`** — At container spawn time, `env` entries become `-e KEY=VAL` Docker flags (applied after OneCLI's injected vars so they win), and `blockedHosts` entries become `--add-host HOST:0.0.0.0` flags.
+
+**`container/Dockerfile`** — The container runs as the host user's uid (e.g. 501 on macOS), not as the `node` user (uid 1000). The home directory must be `chmod 777` so any uid can write `~/.claude.json` and `~/.claude/settings.json`.
+
+## Tradeoffs
+
+| | Ollama (local) | Anthropic API |
+|---|---|---|
+| Cost | Free | Pay-per-token |
+| Privacy | Fully local | Data sent to Anthropic |
+| Model quality | Good (open-weight) | Excellent (Claude) |
+| Cold start | 5–30s (model load) | ~1s |
+| Context window | Varies by model | 200k tokens (Sonnet) |
+| Tool use reliability | Good (large models) | Excellent |
+| Hardware req. | 16GB+ RAM | None |
+
+For personal automation on capable hardware, the tradeoff favors local. For complex multi-step tasks requiring large context or high reliability, Claude is still ahead.
+
+## Reverting to Claude
+
+Remove the `env` and `blockedHosts` keys from `groups//container.json`, remove `"model"` from the shared settings file, and restart the service. No rebuild needed.
+
+## See Also
+
+- `/add-ollama-provider` — step-by-step skill to configure any agent group for Ollama
+- [Ollama Anthropic compatibility docs](https://ollama.com/blog/openai-compatibility) — upstream docs on the API bridge
+- `docs/architecture.md` — how the container spawn and env injection pipeline works