mirror of
https://github.com/qwibitai/nanoclaw.git
synced 2026-06-27 18:34:58 +08:00
f9c86d0af2
The Claude Agent SDK prepends a per-request cch=<hash> to every prompt, and the hash changes each turn. Ollama's prompt cache only reuses a prompt whose beginning is unchanged, so it re-reads the whole prompt on every turn (slow).

A tiny proxy neutralizes the hash by pinning cch to a constant, so caching kicks in. In our setup (31B on Apple Silicon) follow-up replies went from ~80s to ~4s; numbers vary by model/hardware. Ollama ignores the hash, so output is unchanged.

Scope: only the Claude-Code-CLI -> Ollama path; Codex/OpenCode emit no cch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
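The pinning step described in the commit message can be illustrated with a minimal sketch. This is not the proxy from this commit: the regex, the pinned value, and the helper names (pin_cch, shared_prefix_len) are hypothetical, and the marker is assumed to be a plain cch=<alphanumeric token> at the start of the prompt. The real format emitted by the SDK may differ.

```python
import re

# Assumption: the marker looks like "cch=<token>" with an alphanumeric token.
# The actual format emitted by the SDK is not confirmed here.
CCH_PATTERN = re.compile(r"cch=[0-9A-Za-z]+")
PINNED_CCH = "cch=0"  # hypothetical constant; any fixed value would do


def pin_cch(prompt: str) -> str:
    """Replace the first cch=<token> in the prompt with a fixed value."""
    return CCH_PATTERN.sub(PINNED_CCH, prompt, count=1)


def shared_prefix_len(a: str, b: str) -> int:
    """Count how many leading characters two prompts have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Two consecutive turns: same conversation so far, different per-request hash.
turn_1 = "cch=a1b2c3 You are a helpful assistant.\nUser: hi"
turn_2 = (
    "cch=9f8e7d You are a helpful assistant.\nUser: hi"
    "\nAssistant: hello\nUser: and then?"
)

# Without pinning, the prompts diverge right after "cch=".
raw_overlap = shared_prefix_len(turn_1, turn_2)

# With pinning, the whole first turn is a prefix of the second turn.
pinned_overlap = shared_prefix_len(pin_cch(turn_1), pin_cch(turn_2))
```

With the hash left in place, the two prompts share only the leading "cch=" characters, so a prefix-based cache has nothing to reuse. After pinning, the entire first turn is a prefix of the second, which is the condition the commit message says Ollama's prompt cache needs.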
NanoClaw Documentation
The official documentation is at docs.nanoclaw.dev.
The files in this directory are original design documents and developer references. For the most current and accurate information, use the documentation site.