learn-claude-code/s10_system_prompt
gui-yue 1baf1aca5a
Some checks are pending
CI / build (push) Waiting to run
Test / python-smoke (push) Waiting to run
Test / web-build (push) Waiting to run
Follow up PR #265: refine chapters, diagrams, and add S20 (#283)
* feat: s01-s14 docs quality overhaul — tool pipeline, single-agent, knowledge & resilience

Rewrite code.py and README (zh/en/ja) for s01-s14, each chapter building
incrementally on the previous. Key fixes across chapters:

- s01-s04: agent loop, tool dispatch, permission pipeline, hooks
- s05-s08: todo write, subagent, skill loading, context compact
- s09-s11: memory system, system prompt assembly, error recovery
- s12-s14: task graph, background tasks, cron scheduler

All chapters CC source-verified. Code inherits fixes forward (PROMPT_SECTIONS,
json.dumps cache, real-state context, can_start dep protection, etc.).

* feat: s15-s19 docs quality overhaul — multi-agent platform: teams, protocols, autonomy, worktree, MCP tools

Rewrite code.py and README (zh/en/ja) for s15-s19, the multi-agent platform
chapters. Each chapter inherits all previous fixes and adds one mechanism:

- s15: agent teams (TeamCreate, teammate threads, shared task list)
- s16: team protocols (plan approval, shutdown handshake, consume_inbox)
- s17: autonomous agents (idle polling, auto-claim, consume_lead_inbox)
- s18: worktree isolation (git worktree, bind_task, cwd switching, safety)
- s19: MCP tools (MCPClient, normalize_mcp_name, assemble_tool_pool, no cache)

All appendix source code references verified against CC source. Config priority
corrected: claude.ai < plugin < user < project < local.

* fix: 5 regressions across s05-s19 — glob safety, todo validation, memory extraction, protocol types, dep crash

- s05-s09: glob results now filter with is_relative_to(WORKDIR) (inherited from s02)
- s06-s08: todo_write validates content/status required fields (inherited from s05)
- s09: extract_memories uses pre-compression snapshot instead of compacted messages
- s16: submit_plan docstring clarifies protocol-only (not code-level gate)
- s17-s19: match_response restores type mismatch validation (from s16)
- s17-s19: claim_task deps list handles missing dep files without crashing

* fix: s12 Todo V2 logic reversal, s14/s15 cron range validation, s18/s19 worktree name validation

- s12 README (zh/en/ja): fix Todo V2 direction — interactive defaults to Task,
  non-interactive/SDK defaults to TodoWrite. Fix env var name to
  CLAUDE_CODE_ENABLE_TASKS (not TODO_V2).
- s14/s15: add _validate_cron_field with per-field range checks (minute 0-59,
  hour 0-23, dom 1-31, month 1-12, dow 0-6), step > 0, range lo <= hi.
  Replace old try/except validation that only caught exceptions.
- s18/s19: add validate_worktree_name() to remove_worktree and keep_worktree,
  not just create_worktree.

* fix: align s16-s19 teaching tool consistency

* fix pr265 chapter diagrams

* Add comprehensive s20 harness chapter

* Fix chapter smoke test regressions

* Clarify README tutorial track transition

---------

Co-authored-by: Haoran <bill-billion@outlook.com>
2026-05-20 21:45:38 +08:00
..
images Follow up PR #265: refine chapters, diagrams, and add S20 (#283) 2026-05-20 21:45:38 +08:00
code.py Follow up PR #265: refine chapters, diagrams, and add S20 (#283) 2026-05-20 21:45:38 +08:00
README.en.md Follow up PR #265: refine chapters, diagrams, and add S20 (#283) 2026-05-20 21:45:38 +08:00
README.ja.md Follow up PR #265: refine chapters, diagrams, and add S20 (#283) 2026-05-20 21:45:38 +08:00
README.md Follow up PR #265: refine chapters, diagrams, and add S20 (#283) 2026-05-20 21:45:38 +08:00

s10: System Prompt — Assembled at Runtime, Never Hardcoded

中文 · English · 日本語

s01 → ... → s08 → s09 → s10s11 → s12 → ... → s20

"prompt is assembled, not hardcoded" — Sections + on-demand assembly + caching.

Harness Layer: Prompt — assembled at runtime, never hardcoded.


The Problem

From s01 to s09, the system prompt was always one hardcoded line:

SYSTEM = f"You are a coding agent at {WORKDIR}. Use tools to solve tasks."

That worked for s01 — only bash, read, write. But by s09, the agent has memory, compression, skill loading. The prompt needs to describe more and more capabilities:

SYSTEM = (
    f"You are a coding agent at {WORKDIR}. "
    "Use tools to solve tasks. Act, don't explain. "
    "Before starting any multi-step task, use todo_write. "
    "Skills are available via list_skills and load_skill. "
    "Relevant memories are injected below when available. "
    # ... add a capability, add a line
)

Three problems:

  1. Switching projects requires rewriting the entire prompt — no way to know what to change and what to keep
  2. One change can break others — adding a tool description might conflict with earlier instructions
  3. Every request carries everything — even when the current conversation doesn't need certain sections, they waste tokens

The system prompt should be a configuration assembled at runtime based on current state: which tools are enabled, which context is visible, which memories are relevant, and which content must remain stable to hit prompt cache.


The Solution

System Prompt Overview

s10 focuses on prompt assembly. It builds on the s08-s09 capabilities but doesn't re-implement compression or memory. The core change: split the hardcoded SYSTEM into independent sections, assemble them at runtime based on real state, and cache the result.

Four sections, two loading strategies:

Section Strategy Content Condition
identity always who you are, how to work always present
tools always available tool list enabled_tools
workspace always working directory always present
memory on-demand relevant memory content whether .memory/MEMORY.md exists

Key design: whether a section loads depends on real state (tools exist, files exist), not keywords in messages.


How It Works

PROMPT_SECTIONS: Topic-Keyed Fragments

Split the monolithic string into a dictionary, each key is a topic:

PROMPT_SECTIONS = {
    "identity": "You are a coding agent. Act, don't explain.",
    "tools": "Available tools: bash, read_file, write_file.",
    "workspace": f"Working directory: {WORKDIR}",
    "memory": "Relevant memories are injected below when available.",
}

Each section is maintained independently. Changing tools doesn't affect identity; adding memory doesn't touch workspace.

assemble_system_prompt: On-Demand Assembly

Not every section is needed every turn. No memory files? Loading the memory section just wastes tokens. Assembly is based on real state in context:

def assemble_system_prompt(context: dict) -> str:
    sections = []

    # Always loaded
    sections.append(PROMPT_SECTIONS["identity"])
    sections.append(PROMPT_SECTIONS["tools"])
    sections.append(PROMPT_SECTIONS["workspace"])

    # On-demand — based on real state, not keywords
    memories = context.get("memories", "")
    if memories:
        sections.append(f"Relevant memories:\n{memories}")

    return "\n\n".join(sections)

"Always loaded" sections are needed every turn: identity, tools, workspace. "On-demand" sections are only useful under specific conditions.

Why not load everything? Tokens have cost (system prompt is billed every turn), and fewer instructions means more focused output (irrelevant instructions are noise).

get_system_prompt: Cache to Avoid Re-Assembly

When context hasn't changed (multiple LLM calls in the same turn with the same context), re-assembling is wasteful. Use deterministic serialization to detect changes and return cached result:

def get_system_prompt(context: dict) -> str:
    global _last_context_key, _last_prompt
    key = json.dumps(context, sort_keys=True, ensure_ascii=False, default=str)
    if key == _last_context_key and _last_prompt:
        return _last_prompt
    _last_context_key = key
    _last_prompt = assemble_system_prompt(context)
    return _last_prompt

json.dumps instead of hash(): Python's built-in hash() has process randomization (unsuitable for stable cache keys) and throws unhashable type on nested dicts/lists.

Note: this cache only avoids redundant string assembly within a process. It's not the same as CC's API prompt cache, which uses SYSTEM_PROMPT_DYNAMIC_BOUNDARY to separate static and dynamic parts — the static parts hit global cache and don't invalidate when dynamic content changes.

context: Real State, Not Keyword Guessing

Context reflects the actual runtime state:

def update_context(context: dict, messages: list) -> dict:
    memories = ""
    if MEMORY_INDEX.exists():
        content = MEMORY_INDEX.read_text().strip()
        if content:
            memories = content
    return {
        "enabled_tools": list(TOOL_HANDLERS.keys()),
        "workspace": str(WORKDIR),
        "memories": memories,
    }

enabled_tools lists actually registered tools. memories checks whether .memory/MEMORY.md exists. Section loading is based on this real state, not searching for keywords in messages.

Putting It Together

def agent_loop(messages: list, context: dict):
    system = get_system_prompt(context)
    while True:
        response = client.messages.create(
            model=MODEL, system=system, messages=messages,
            tools=TOOLS, max_tokens=8000)
        # ... tool execution ...
        context = update_context(context, messages)
        system = get_system_prompt(context)

At the start of each loop iteration, get the system prompt. If context changed, re-assemble; if not, return cached version.


Changes From s09

Component Before (s09) After (s10)
prompt Hardcoded SYSTEM string PROMPT_SECTIONS + assemble_system_prompt
caching None get_system_prompt (json.dumps detection + cache)
new functions assemble_system_prompt, get_system_prompt, update_context
tools bash, read_file, write_file (3) bash, read_file, write_file (3) — unchanged
loop Uses fixed SYSTEM Uses get_system_prompt(context)

Try It

cd learn-claude-code
python s10_system_prompt/code.py

What to watch for:

  1. Output shows which sections were loaded ([assembled] sections: ... label)
  2. Cache hits show [cache hit] during continued conversation
  3. Creating .memory/MEMORY.md makes the memory section appear on the next turn

Try these prompts:

  1. Read the file README.md (observe the three always-loaded sections)
  2. Create a file called .memory/MEMORY.md with content "- [test](test.md) — test memory" (write a memory index)
  3. Read the file code.py (observe whether the memory section appears)

What's Next

System prompts can now be assembled at runtime. But the agent still crashes on errors. Network hiccups, API rate limits, truncated output, context overflow — these aren't bugs, they're normal.

s11 Error Recovery → four recovery paths. Upgrade tokens, compress context, exponential backoff, switch models.

Deep Dive Into CC Source Code

The following is based on analysis of CC source code constants/prompts.ts (914 lines), constants/systemPromptSections.ts (68 lines), context.ts (189 lines), utils/api.ts (718 lines), utils/systemPrompt.ts (123 lines), and bootstrap/state.ts.

How many sections does CC's system prompt have?

The count varies based on feature flags, output style, KAIROS/Proactive mode, user type, token budget, etc. Roughly two categories:

Static sections (always loaded): identity, system, doing_tasks, actions, using_tools, tone_style, output_efficiency, etc.

Dynamic sections (loaded by state): session_guidance, memory, ant_model_override, env_info_simple, language, output_style, mcp_instructions, scratchpad, frc, summarize_tool_results, numeric_length_anchors, token_budget, brief, etc.

mcp_instructions is the only volatile section (created via DANGEROUS_uncachedSystemPromptSection()), because MCP servers can connect and disconnect between turns.

Assembly Function

getSystemPrompt(tools, model, additionalWorkingDirs?, mcpClients?): Promise<string[]>

Returns string[] (each element is a section), separated by SYSTEM_PROMPT_DYNAMIC_BOUNDARY between static and dynamic parts.

cache scope

When global cache boundary is enabled, static sections are merged into one global cache block, and dynamic sections don't use global cache (cacheScope: null). Only paths without boundary or skipping global cache fall back to org scope.

The teaching version's cache only avoids redundant string assembly. CC's three-layer cache:

  1. lodash memoize: getSystemContext and getUserContext cached per session (context.ts)
  2. Section registry cache: STATE.systemPromptSectionCache caches dynamic section results, cleared on /clear or /compact
  3. API-level cache: splitSysPromptPrefix() (api.ts) splits prompt into blocks with different cache scopes via boundary

getUserContext vs getSystemContext

getSystemContext getUserContext
Content gitStatus, cacheBreaker CLAUDE.md content, currentDate
Injection appended to system prompt array prepended as <system-reminder> user message
When skipped custom system prompt always runs

How modes change the prompt

  • CLAUDE_CODE_SIMPLE: entire prompt is 2 lines
  • Proactive/KAIROS: compact prompt replaces all standard sections
  • Coordinator: coordinator-specific prompt fully replaces default
  • Agent mode: agent-defined prompt replaces or appends to default

Total size

Standard interactive mode system prompt core is ~20-30KB text. CLAUDE_CODE_SIMPLE is ~150 characters. User context (CLAUDE.md) and system context (git status) add on top.