learn-claude-code/s05_todo_write
gui-yue 1baf1aca5a
Follow up PR #265: refine chapters, diagrams, and add S20 (#283)
* feat: s01-s14 docs quality overhaul — tool pipeline, single-agent, knowledge & resilience

Rewrite code.py and README (zh/en/ja) for s01-s14, each chapter building
incrementally on the previous. Key fixes across chapters:

- s01-s04: agent loop, tool dispatch, permission pipeline, hooks
- s05-s08: todo write, subagent, skill loading, context compact
- s09-s11: memory system, system prompt assembly, error recovery
- s12-s14: task graph, background tasks, cron scheduler

All chapters CC source-verified. Code inherits fixes forward (PROMPT_SECTIONS,
json.dumps cache, real-state context, can_start dep protection, etc.).

* feat: s15-s19 docs quality overhaul — multi-agent platform: teams, protocols, autonomy, worktree, MCP tools

Rewrite code.py and README (zh/en/ja) for s15-s19, the multi-agent platform
chapters. Each chapter inherits all previous fixes and adds one mechanism:

- s15: agent teams (TeamCreate, teammate threads, shared task list)
- s16: team protocols (plan approval, shutdown handshake, consume_inbox)
- s17: autonomous agents (idle polling, auto-claim, consume_lead_inbox)
- s18: worktree isolation (git worktree, bind_task, cwd switching, safety)
- s19: MCP tools (MCPClient, normalize_mcp_name, assemble_tool_pool, no cache)

All appendix source code references verified against CC source. Config priority
corrected: claude.ai < plugin < user < project < local.

* fix: 5 regressions across s05-s19 — glob safety, todo validation, memory extraction, protocol types, dep crash

- s05-s09: glob results now filter with is_relative_to(WORKDIR) (inherited from s02)
- s06-s08: todo_write validates content/status required fields (inherited from s05)
- s09: extract_memories uses pre-compression snapshot instead of compacted messages
- s16: submit_plan docstring clarifies protocol-only (not code-level gate)
- s17-s19: match_response restores type mismatch validation (from s16)
- s17-s19: claim_task deps list handles missing dep files without crashing

* fix: s12 Todo V2 logic reversal, s14/s15 cron range validation, s18/s19 worktree name validation

- s12 README (zh/en/ja): fix Todo V2 direction — interactive defaults to Task,
  non-interactive/SDK defaults to TodoWrite. Fix env var name to
  CLAUDE_CODE_ENABLE_TASKS (not TODO_V2).
- s14/s15: add _validate_cron_field with per-field range checks (minute 0-59,
  hour 0-23, dom 1-31, month 1-12, dow 0-6), step > 0, range lo <= hi.
  Replace old try/except validation that only caught exceptions.
- s18/s19: add validate_worktree_name() to remove_worktree and keep_worktree,
  not just create_worktree.

* fix: align s16-s19 teaching tool consistency

* fix pr265 chapter diagrams

* Add comprehensive s20 harness chapter

* Fix chapter smoke test regressions

* Clarify README tutorial track transition

---------

Co-authored-by: Haoran <bill-billion@outlook.com>
2026-05-20 21:45:38 +08:00
example
images
code.py
README.en.md
README.ja.md
README.md

s05: TodoWrite — An Agent Without a Plan Drifts Off Course

中文 · English · 日本語

s01 → s02 → s03 → s04 → s05 → s06 → s07 → ... → s20

"An agent without a plan goes wherever the wind blows" — List the steps first, then execute. Complex tasks are less likely to miss steps.

Harness Layer: Planning — Let the Agent think before it acts.


The Problem

Give the Agent a complex task: "Rename all Python files to snake_case, run tests, and fix failures."

The Agent starts working: it renames 3 files, runs a test, finds 2 failures, and starts fixing them. While fixing, it forgets that the original goal was "rename to snake_case" because the test failures have consumed all its attention.

The longer the conversation, the worse it gets: tool results keep filling the context and dilute the system prompt's influence. In a 10-step refactoring, for example, the Agent may complete steps 1-3 and then start improvising because steps 4-10 have been pushed out of its attention.


The Solution

[Figure: Todo Overview]

This chapter keeps the minimal hook structure from the previous chapter and focuses on the new todo_write tool and its reminder mechanism. todo_write does no actual work: it can't read files or run commands. It simply lets the Agent organize its thoughts before diving in.

The dispatch mechanism is unchanged; the new tool is still routed through TOOL_HANDLERS[block.name]. However, to demonstrate the todo reminder, a counter was added to the loop: after 3 consecutive rounds without calling todo_write, a reminder is injected.


How It Works

The todo_write tool accepts a list of todos with statuses, persists it to .tasks/current_todos.json (the teaching version writes to disk for observability), and displays progress in the terminal:

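def run_todo_write(todos: list) -> str:
    # Persist the full list so the current plan can be inspected on disk.
    tasks_file = TASKS_DIR / "current_todos.json"
    tasks_file.write_text(json.dumps(todos, indent=2, ensure_ascii=False))

    # Render a checklist: blank = pending, ▸ = in progress, ✓ = completed.
    lines = ["\n## Current Tasks"]
    for t in todos:
        icon = {"pending": " ", "in_progress": "▸", "completed": "✓"}[t["status"]]
        lines.append(f"  [{icon}] {t['content']}")
    print("\n".join(lines))
    # The returned string is what the handler hands back to the loop.
    return f"Updated {len(todos)} tasks"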
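The handler above indexes the icon table with t["status"] and reads t['content'] directly, so an item with a missing or unknown field would raise a KeyError. The following is a minimal sketch of a guard that could run before the handler. The names validate_todos and VALID_STATUSES are hypothetical and are not taken from code.py; only the three status values come from the text.

```python
# Hypothetical helper, not from code.py: checks the two fields the handler reads.
VALID_STATUSES = {"pending", "in_progress", "completed"}


def validate_todos(todos):
    """Return a list of error strings; an empty list means the input is usable."""
    if not isinstance(todos, list):
        return ["todos must be a list"]
    errors = []
    for i, item in enumerate(todos):
        if not isinstance(item, dict):
            errors.append(f"todos[{i}] must be an object")
            continue
        content = item.get("content")
        if not isinstance(content, str) or not content.strip():
            errors.append(f"todos[{i}].content must be a non-empty string")
        status = item.get("status")
        if not isinstance(status, str) or status not in VALID_STATUSES:
            errors.append(f"todos[{i}].status must be one of {sorted(VALID_STATUSES)}")
    return errors


print(validate_todos([{"content": "Rename files", "status": "pending"}]))
print(validate_todos([{"content": "Run tests", "status": "done"}]))
```

Returning the errors as text, rather than raising, would let a handler pass them back to the model as the tool result so it can correct the call.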

The tool definition joins the other 5 in the dispatch map:

TOOLS = [
    {"name": "bash",       ...},
    {"name": "read_file",  ...},
    {"name": "write_file", ...},
    {"name": "edit_file",  ...},
    {"name": "glob",       ...},
    # s05: new entry
    {"name": "todo_write", "description": "Create and manage a task list ...",
     "input_schema": {
         "type": "object",
         "properties": {
             "todos": {
                 "type": "array",
                 "items": {
                     "type": "object",
                     "properties": {
                         "content": {"type": "string"},
                         "status": {"type": "string", "enum": ["pending", "in_progress", "completed"]},
                     },
                 },
             },
         },
     },
    },
]

TOOL_HANDLERS["todo_write"] = run_todo_write
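To illustrate how a registered handler is reached through TOOL_HANDLERS[block.name], here is a small sketch. The dispatch function, the SimpleNamespace stand-in for a tool_use block, and the simplified run_todo_write are all assumptions made for the example; the real loop in code.py may be structured differently.

```python
from types import SimpleNamespace


def run_todo_write(todos):
    # Stand-in for the real handler; it only reports the count.
    return f"Updated {len(todos)} tasks"


TOOL_HANDLERS = {"todo_write": run_todo_write}


def dispatch(block):
    """Route one tool_use block to its handler and wrap the output as a tool_result."""
    handler = TOOL_HANDLERS.get(block.name)
    if handler is None:
        output = f"Unknown tool: {block.name}"
    else:
        # The block's input dict is unpacked into the handler's keyword arguments.
        output = handler(**block.input)
    return {"type": "tool_result", "tool_use_id": block.id, "content": output}


# Hypothetical block, shaped like a tool_use content block from the model.
block = SimpleNamespace(
    type="tool_use",
    id="toolu_demo",
    name="todo_write",
    input={"todos": [{"content": "Rename files", "status": "pending"}]},
)
result = dispatch(block)
print(result["content"])
```

Because routing depends only on the tool name, adding todo_write requires no change to the dispatch path itself.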

Nag reminder: when the model hasn't called todo_write for 3 consecutive rounds, a reminder is automatically injected (a teaching mechanism; the CC source has no fixed round-count logic):

if rounds_since_todo >= 3 and messages:
    messages.append({
        "role": "user",
        "content": "<reminder>Update your todos.</reminder>",
    })
    rounds_since_todo = 0
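The snippet above shows only the injection step. A sketch of the surrounding bookkeeping, assuming the counter resets whenever a round includes a todo_write call and increments otherwise, might look like this. The function name update_reminder_state and the threshold parameter are hypothetical; where code.py actually updates the counter is not shown in this chapter.

```python
REMINDER = "<reminder>Update your todos.</reminder>"


def update_reminder_state(messages, rounds_since_todo, tool_names, threshold=3):
    """Update the counter after one round and inject a reminder when it is due.

    tool_names lists the tools the model called in the round that just finished.
    """
    if "todo_write" in tool_names:
        rounds_since_todo = 0
    else:
        rounds_since_todo += 1

    if rounds_since_todo >= threshold and messages:
        messages.append({"role": "user", "content": REMINDER})
        rounds_since_todo = 0
    return rounds_since_todo


# Simulate five rounds: one planning round, then four rounds without todo_write.
messages = [{"role": "user", "content": "Rename all Python files to snake_case"}]
count = 0
for names in (["todo_write"], ["bash"], ["read_file"], ["edit_file"], ["bash"]):
    count = update_reminder_state(messages, count, names)

print(len(messages), count)
```

In this simulation the reminder is appended once, after the third round in a row without todo_write, and the counter starts over from there.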

Typical flow when the Agent receives a task: first call todo_write to list all steps (all pending) → pick one step, set it to in_progress → complete it, set to completed → look at the next pending → continue. After 3 rounds without todo_write, the loop appends a reminder before the next LLM call.

Key insight: todo_write doesn't give the Agent any additional execution capability. What it adds is planning capability.


Changes from s04

| Component | Before (s04) | After (s05) |
|---|---|---|
| Tool count | 5 (bash, read, write, edit, glob) | 6 (+todo_write) |
| Planning | None | Stateful TODO list + nag reminder |
| SYSTEM prompt | Generic prompt | Added "plan before executing" guidance |
| Loop | Unchanged | Dispatch unchanged, added rounds_since_todo counter and reminder injection |

Try It

cd learn-claude-code
python s05_todo_write/code.py

Try these prompts:

  1. Refactor s05_todo_write/example/hello.py: add type hints, docstrings, and a main guard (should list 3 steps first, then execute)
  2. Create a Python package under s05_todo_write/example/demo_pkg with __init__.py, utils.py, and tests/test_utils.py
  3. Review Python files under s05_todo_write/example and fix any style issues

What to watch for: Was the first tool call todo_write? How many TODO steps were listed? Did statuses move from pending to in_progress / completed during execution?
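Since the teaching version persists the list to disk, the status questions above can also be checked from a second terminal. The following is a small sketch, assuming the file lands at .tasks/current_todos.json relative to the directory you run it from; the helper name summarize_todos is hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_todos(path):
    """Count todos per status in a file written by the todo_write handler."""
    # Read with the default encoding, mirroring how the handler writes the file.
    todos = json.loads(Path(path).read_text())
    counts = Counter(t.get("status", "unknown") for t in todos)
    return {s: counts.get(s, 0) for s in ("pending", "in_progress", "completed")}


if __name__ == "__main__":
    todos_file = Path(".tasks/current_todos.json")  # assumed location
    if todos_file.exists():
        print(summarize_todos(todos_file))
    else:
        print(f"{todos_file} not found; run the agent first")
```

Running it a few times during a session should show counts shifting from pending toward completed.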


What's Next

The Agent can plan now. But if a task is too large, say "refactor the entire auth module", a TODO list alone isn't enough. That task is itself a collection of dozens of subtasks that would drown in a single conversation's context.

→ s06 Subagent: Break large tasks into subtasks, each handled by an independent Agent with its own clean context, no cross-contamination.

Dive into CC Source Code

CC has two task systems coexisting (tasks.ts:133-139):

  • TodoWrite (V1): A simple list tool whose data is kept in memory in AppState (TodoWriteTool.ts:65-103). The teaching version writes to .tasks/current_todos.json for observability; the real V1 does not write to disk.
  • Task System (V2 = s12): File-persisted, dependency graph, concurrency locks, ownership.

The switch is controlled by isTodoV2Enabled(). In the current source: V2 is enabled by default in interactive sessions, V1 in non-interactive (SDK) sessions; setting CLAUDE_CODE_ENABLE_TASKS forces V2 regardless. Note the source comment "Force-enable tasks in non-interactive mode" describes the env var path's purpose, not the default branch's return semantics.

The teaching version omits the activeForm field from the real source (utils/todo/types.ts:8-15). CC uses it for the UI spinner to show "what's being done"; the teaching version only has terminal output and doesn't need this field.

The teaching version's nag reminder (3 rounds without update triggers injection) is an educational mechanism. The CC source has no fixed "3 rounds" logic; the closest is TodoWriteTool.ts:72-107 which appends a verification nudge when 3+ todos are all completed without a verification item.

Core increments of the Task System over TodoWrite:

  • File persistence (Claude config directory tasks/{taskListId}/{taskId}.json) instead of in-memory list
  • blockedBy dependency graph instead of flat list
  • proper-lockfile concurrency safety instead of no locking
  • Four separate tools (Create/Get/Update/List) instead of one
  • TaskCreated / TaskCompleted hooks (TaskCreateTool.ts:80-129, TaskUpdateTool.ts:231-260) for external system integration
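To make the difference between a flat list and a blockedBy dependency graph concrete, here is a minimal sketch. Only the blockedBy field name comes from the list above; the dict layout, the subject field, and the ready_tasks helper are hypothetical and do not reflect CC's actual task schema.

```python
def ready_tasks(tasks):
    """Return ids of pending tasks whose blockers are all completed."""
    done = {tid for tid, t in tasks.items() if t["status"] == "completed"}
    return [
        tid
        for tid, t in tasks.items()
        if t["status"] == "pending"
        and all(dep in done for dep in t.get("blockedBy", []))
    ]


# Hypothetical task records keyed by id.
tasks = {
    "1": {"subject": "Rename files", "status": "completed", "blockedBy": []},
    "2": {"subject": "Run tests", "status": "pending", "blockedBy": ["1"]},
    "3": {"subject": "Fix failures", "status": "pending", "blockedBy": ["2"]},
}
print(ready_tasks(tasks))  # ['2']
```

With a flat todo list the ordering lives only in the model's head; with blockedBy, the harness itself can tell that task 3 must wait for task 2.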