node-llama-cpp defaults `contextSize` to `"auto"`, which on large embedding models such as Qwen3-Embedding-8B (trained context 40,960) inflates gateway VRAM use from ~8.8 GB to ~32 GB and causes OOM on single-GPU hosts that share the gateway with an LLM runtime. Expose `memorySearch.local.contextSize` in `openclaw.json` (`number | "auto"`), defaulting to 4096, which comfortably covers typical memory-search chunks (128–512 tokens) while keeping non-weight VRAM bounded. Closes #69667.
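Only the key path `memorySearch.local.contextSize`, its accepted type (`number | "auto"`), and the 4096 default come from the change description above; everything else in this sketch is an assumption. The `modelPath` sibling key is hypothetical, shown only for context, and the comments assume `openclaw.json` tolerates JSONC-style annotations. A minimal fragment pinning the embedding context to the new default might look like:

```jsonc
{
  "memorySearch": {
    "local": {
      // Hypothetical sibling key, included only to situate contextSize.
      "modelPath": "models/Qwen3-Embedding-8B-Q4_K_M.gguf",
      // number | "auto". "auto" follows the model's trained context
      // (40,960 for Qwen3-Embedding-8B), which balloons KV-cache VRAM;
      // 4096 bounds non-weight VRAM while still covering typical
      // memory-search chunks of 128-512 tokens.
      "contextSize": 4096
    }
  }
}
```

Hosts with VRAM to spare can set `"contextSize": "auto"` to recover node-llama-cpp's prior behavior of sizing the context to the model's trained length.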
| Name |
|---|
| templates |
| AGENTS.default.md |
| api-usage-costs.md |
| credits.md |
| device-models.md |
| memory-config.md |
| prompt-caching.md |
| RELEASING.md |
| rich-output-protocol.md |
| rpc.md |
| secretref-credential-surface.md |
| secretref-user-supplied-credentials-matrix.json |
| session-management-compaction.md |
| test.md |
| token-use.md |
| transcript-hygiene.md |
| wizard.md |