s12: Task System — Break Big Goals into Small Tasks
s01 → ... → s10 → s11 → s12 → s13 → s14 → ... → s20
"Break big goals into small tasks, order them, persist" — File-persisted task graph, the foundation for multi-agent collaboration.
Harness Layer: Tasks — Persisted goals, recoverable progress.
The Problem
The agent receives a project: set up a database, write APIs, add tests. It uses s05's TodoWrite to create a checklist, then starts with the API. Halfway through, it realizes there are no database tables and goes back to create them. When it gets to the tests, it discovers the API signatures have changed again...
You can't build the roof before laying the foundation. Tasks have ordering. Task dependencies should form a Directed Acyclic Graph (DAG); the teaching version only demonstrates blockedBy checking, without cycle detection.
s05's TodoWrite is a list. No dependencies, no persistence — when the conversation ends, the list is gone. What you need is a task system: each task is a JSON file, tasks have blockedBy dependencies, and they persist across sessions on disk.
The Solution
The teaching code keeps a basic agent loop and omits s11's full error recovery (RecoveryState, backoff, escalation, reactive compact, fallback model) to stay focused on the task system. Added: 5 new task tools, a .tasks/ directory for persistence, and blockedBy dependency checking. The task system and error recovery are independent layers: in the CC source, utils/tasks.ts only handles CRUD, while query.ts's with_retry/RecoveryState handles error recovery, with no coupling between them.
TodoWrite vs Task System:
| | TodoWrite (s05) | Task System (s12) |
|---|---|---|
| Storage | In-memory list | .tasks/ JSON files |
| Dependencies | None | blockedBy dependency graph |
| Persistence | Lost when conversation ends | Cross-session |
| Multi-agent | None | owner field |
| Status | checked / unchecked | pending → in_progress → completed |
How It Works
Task: Data Structure
Each task is a JSON file, stored in the .tasks/ directory:
```python
from dataclasses import dataclass

@dataclass
class Task:
    id: str
    subject: str
    description: str
    status: str               # pending | in_progress | completed
    owner: str | None         # Agent name (multi-agent scenarios)
    blockedBy: list[str]      # List of dependency task IDs
```
IDs are generated with timestamp + random hex, simple but sufficient. CC uses sequential IDs + a highwatermark file to prevent ID reuse, which is a more rigorous design.
create_task: Create Tasks
```python
import time

# random_hex and save_task are helpers that are not shown in this section
def create_task(subject: str, description: str = "",
                blockedBy: list[str] | None = None) -> Task:
    task = Task(
        id=f"task_{int(time.time())}_{random_hex(4)}",
        subject=subject, description=description,
        status="pending", owner=None,
        blockedBy=blockedBy or [],
    )
    save_task(task)  # persist immediately to .tasks/{id}.json
    return task
```
create_task calls save_task automatically to write .tasks/{id}.json. blockedBy declares dependencies: for example, "write API" has blockedBy: ["task_schema"].
can_start: Dependency Check
A task can only start after all its blockedBy dependencies are completed:
```python
def can_start(task_id: str) -> bool:
    task = load_task(task_id)
    for dep_id in task.blockedBy:
        if not _task_path(dep_id).exists():
            return False  # missing dependency = blocked
        dep = load_task(dep_id)
        if dep.status != "completed":
            return False
    return True
```
can_start is a prerequisite check for claim_task: if any blockedBy dependency is not completed, the task cannot be claimed. Missing dependencies are treated as blocked, avoiding crashes from referencing wrong IDs.
claim_task: Claim a Task
When the agent starts working on a task, it calls claim_task: sets owner, changes status from pending → in_progress. The owner field records who is working on the task, preventing duplicate claims in multi-agent scenarios:
```python
def claim_task(task_id: str, owner: str = "agent") -> str:
    task = load_task(task_id)
    if task.status != "pending":
        return f"Task {task_id} is {task.status}, cannot claim"
    if not can_start(task_id):
        # A missing dependency file counts as unfinished (same rule as can_start),
        # so check existence before loading to avoid crashing on a bad ID
        deps = [d for d in task.blockedBy
                if not _task_path(d).exists()
                or load_task(d).status != "completed"]
        return f"Blocked by: {deps}"
    task.owner = owner
    task.status = "in_progress"
    save_task(task)
    return f"Claimed {task_id} ({task.subject})"
```
If the task is already claimed by someone else (status != "pending"), or dependencies aren't met (can_start returns False), the claim is rejected.
complete_task: Complete and Unblock
When a task is done, set it to completed. Simultaneously scan all other tasks to find downstream tasks that were just unblocked:
```python
def complete_task(task_id: str) -> str:
    task = load_task(task_id)
    task.status = "completed"
    save_task(task)
    # Find downstream tasks that depend on this one and are now ready to start
    unblocked = [t.subject for t in list_tasks()
                 if t.status == "pending" and task_id in t.blockedBy
                 and can_start(t.id)]
    msg = f"Completed {task_id} ({task.subject})"
    if unblocked:
        msg += f"\nUnblocked: {', '.join(unblocked)}"
    return msg
```
After completing "schema", can_start returns True for "endpoints" and "docs"; they can begin.
get_task: View Full Details
list_tasks only shows a one-line summary. get_task returns the full task JSON, including description and dependency details. When recovering across sessions, the agent needs to read the full description to continue work:
```python
import json
from dataclasses import asdict

def get_task(task_id: str) -> str:
    task = load_task(task_id)
    return json.dumps(asdict(task), indent=2)
```
State Machine: Two Actions, Three States
pending ──claim──→ in_progress ──complete──→ completed
Here claim / complete are actions, while pending / in_progress / completed are states:
- claim_task: pending → in_progress. Sets owner, begins work.
- complete_task: in_progress → completed. Marks the task done and unblocks downstream.
The state machine above has no in_progress → pending release path. CC does reset tasks in this direction: if a teammate terminates or shuts down, CC unassigns its unfinished tasks (clears owner) and resets their status to pending, allowing other agents to reclaim them. The teaching version omits this recovery path.
Putting It Together
```python
# Create tasks with dependencies
schema = create_task("setup database schema")
endpoints = create_task("create API endpoints", blockedBy=[schema.id])
tests = create_task("write tests", blockedBy=[endpoints.id])
docs = create_task("write docs", blockedBy=[schema.id])

# Agent claims the first available task
claim_task(schema.id)        # ✓ Claimed (no dependencies)
complete_task(schema.id)     # ✓ Completed → unblocks endpoints, docs
claim_task(endpoints.id)     # ✓ Claimed (schema completed)
complete_task(endpoints.id)  # ✓ Completed → unblocks tests
claim_task(docs.id)          # ✓ Claimed (schema completed)
complete_task(docs.id)       # ✓ Completed
claim_task(tests.id)         # ✓ Claimed (endpoints completed)
complete_task(tests.id)      # ✓ Completed
```
Each create_task call writes a JSON file, and each claim_task / complete_task call updates that file. The .tasks/ directory persists across sessions, so the agent can read the files to recover its progress.
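To make the recovery step concrete, here is a small sketch of how a new session might rebuild its view of the work from the files alone. The resume_summary and load_all helpers are hypothetical and not part of code.py. They assume each file holds at least the id, subject, status and blockedBy fields shown earlier.

```python
import json
from pathlib import Path


def load_all(tasks_dir: Path) -> dict[str, dict]:
    """Read every task file in the directory into a dict keyed by task ID."""
    tasks = {}
    for path in sorted(tasks_dir.glob("*.json")):
        record = json.loads(path.read_text(encoding="utf-8"))
        tasks[record["id"]] = record
    return tasks


def resume_summary(tasks_dir: Path) -> dict[str, list[str]]:
    """Group task subjects by what a resuming agent can do with them."""
    tasks = load_all(tasks_dir)
    summary = {"in_progress": [], "ready": [], "blocked": [], "completed": []}
    for record in tasks.values():
        if record["status"] != "pending":
            summary.setdefault(record["status"], []).append(record["subject"])
            continue
        # Same rule as can_start: a missing or unfinished dependency blocks the task
        deps_done = all(
            dep in tasks and tasks[dep]["status"] == "completed"
            for dep in record.get("blockedBy", [])
        )
        summary["ready" if deps_done else "blocked"].append(record["subject"])
    return summary
```

A resuming agent would typically pick up anything under in_progress first, then claim from ready.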
Changes from s11
| Component | Before (s11) | After (s12) |
|---|---|---|
| Task management | None | Task dataclass + 5 tools |
| New types | — | Task (id, subject, description, status, owner, blockedBy) |
| Storage | No persistence | .tasks/{id}.json cross-session |
| Dependencies | None | blockedBy graph + can_start check |
| Tools | bash, read_file, write_file (3) | + create_task, list_tasks, get_task, claim_task, complete_task (8) |
| Lifecycle | — | pending → in_progress → completed (no release rollback) |
Try It
```sh
cd learn-claude-code
python s12_task_system/code.py
```
Try these prompts:
- Create tasks: setup database schema, create API endpoints (depends on schema), write tests (depends on endpoints), write docs (depends on schema)
- List all tasks and their statuses
- Claim the first unblocked task and complete it
- List tasks again: which ones are now unblocked?
What to observe: Are JSON files generated in the .tasks/ directory? After completing a task, are the blocked tasks unblocked?
What's Next
The task graph is in place. But some tasks take a long time, like running a full test suite or deploying to a server. The agent's LLM calls are billed by token, and it can't afford to block on a slow operation.
s13 Background Tasks → Slow operations go to the background. The agent continues processing other tasks, and gets notified when the background work is done.
Deep Dive into CC Source
The following is a complete analysis based on the CC source code: utils/tasks.ts (862 lines), tools/TaskCreateTool/TaskCreateTool.ts (138 lines), tools/TaskUpdateTool/TaskUpdateTool.ts (406 lines), tools/TaskGetTool/TaskGetTool.ts (128 lines), tools/TaskListTool/TaskListTool.ts (116 lines), hooks/useTaskListWatcher.ts (221 lines).
1. TaskRecord's Full Fields
The tutorial only covers id, subject, description, status, owner, blockedBy. CC actually has 9 fields (utils/tasks.ts:76-89):
| Field | Type | Purpose |
|---|---|---|
| id | string | Incrementing integer ID |
| subject | string | Short title |
| description | string | Free-form description |
| activeForm | string? | Present tense form, shown in spinner when in_progress |
| owner | string? | Assigned agent ID |
| status | pending/in_progress/completed | Lifecycle |
| blocks | string[] | Task IDs blocked by this task (downstream) |
| blockedBy | string[] | Task IDs blocking this task (upstream) |
| metadata | Record? | Arbitrary extension key-value pairs |
Storage location: ~/.claude/tasks/{taskListId}/{id}.json. One file per task.
2. Not a TodoWrite Upgrade — Two Independent Systems
In CC, Task System and TodoWrite coexist, toggled by isTodoV2Enabled() (utils/tasks.ts:133) — interactive sessions default to Task (V2), non-interactive/SDK sessions default to TodoWrite. The CLAUDE_CODE_ENABLE_TASKS env var can force-enable Task. Task has what TodoWrite lacks: file-lock concurrency protection, dependency enforcement, ownership, fs.watch reactive monitoring, lifecycle hooks.
3. Concurrent Claim Locking
claimTask() (utils/tasks.ts:541-612) uses dual locking to prevent races:
Task file lock: proper-lockfile locks {taskId}.json (up to 30 retries, exponential backoff 5-100ms). Inside the lock:
- Re-read task (prevent TOCTOU)
- Check already claimed by another → already_claimed
- Check already completed → already_resolved
- Check upstream not completed → blocked
- Set owner
List-level lock (agent busy check): .lock file, atomic scan of all tasks to check if the agent already has other open tasks.
Note: The teaching version combines claiming and starting work into one step (claim = set owner + in_progress); real CC's claimTask primarily resolves owner competition — it only sets owner without changing status. Status updates are handled by TaskUpdate.
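The task-file lock steps above can be approximated in Python. This is a rough sketch and not a port of CC's code. It swaps proper-lockfile for a plain lock file created with O_EXCL, it leaves out the list-level busy check, and it never cleans up a stale lock left by a crashed process. The "success" return value and the rule that a missing dependency file counts as blocked are assumptions, the latter borrowed from the teaching version's can_start.

```python
import json
import os
import time
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def file_lock(target: Path, retries: int = 30,
              min_wait: float = 0.005, max_wait: float = 0.1):
    """Hold an exclusive lock on `target` using a sibling .lock file."""
    lock_path = target.parent / (target.name + ".lock")
    wait = min_wait
    for _ in range(retries + 1):
        try:
            # O_EXCL makes creation fail if the lock file already exists
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            time.sleep(wait)
            wait = min(wait * 2, max_wait)  # exponential backoff, capped
    else:
        raise TimeoutError(f"could not lock {target}")
    try:
        yield
    finally:
        os.close(fd)
        os.unlink(lock_path)


def claim(tasks_dir: Path, task_id: str, agent: str) -> str:
    """Return 'success', 'already_claimed', 'already_resolved', or 'blocked'."""
    path = tasks_dir / f"{task_id}.json"
    with file_lock(path):
        # Re-read inside the lock so the checks see the latest state
        task = json.loads(path.read_text(encoding="utf-8"))
        if task.get("owner") and task["owner"] != agent:
            return "already_claimed"
        if task["status"] == "completed":
            return "already_resolved"
        for dep_id in task.get("blockedBy", []):
            dep_path = tasks_dir / f"{dep_id}.json"
            if not dep_path.exists():
                return "blocked"
            dep = json.loads(dep_path.read_text(encoding="utf-8"))
            if dep["status"] != "completed":
                return "blocked"
        task["owner"] = agent  # only the owner changes; status is left alone
        path.write_text(json.dumps(task, indent=2), encoding="utf-8")
        return "success"
```

Doing the read, the checks and the write inside one lock is what closes the gap between checking and acting: two agents racing for the same task cannot both pass the owner check.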
4. High-Water Mark to Prevent ID Reuse
The .highwatermark file records the highest task ID ever assigned. Even if a task is deleted, its ID won't be reused.
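A minimal sketch of the idea, assuming the mark file holds a single integer and that task files are named by their numeric ID. The next_task_id function is a hypothetical name. The fallback scan of existing files is an addition for robustness, not something the source describes, and the sketch takes no lock, so concurrent callers would need one.

```python
from pathlib import Path


def next_task_id(tasks_dir: Path) -> str:
    """Allocate the next sequential ID, never reusing one handed out before."""
    tasks_dir.mkdir(parents=True, exist_ok=True)
    mark_file = tasks_dir / ".highwatermark"
    highest = int(mark_file.read_text().strip()) if mark_file.exists() else 0
    # Also look at files on disk in case the mark file is missing or stale
    for path in tasks_dir.glob("*.json"):
        if path.stem.isdigit():
            highest = max(highest, int(path.stem))
    new_id = highest + 1
    mark_file.write_text(str(new_id))  # record the ID before it is used
    return str(new_id)
```

Without the mark file, deleting the highest-numbered task would let its ID be handed out again, and any blockedBy entry still pointing at the old ID would silently refer to a different task.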
5. Four Task Tools
CC's task system has four tools, where the teaching version has five: TaskCreate, TaskGet, TaskUpdate, TaskList. All set isConcurrencySafe: true and shouldDefer: true (tool schemas aren't in the initial prompt; only visible after ToolSearch).
The teaching version's create_task(blockedBy=...) declares dependencies at creation time, which is a reasonable simplification. Real CC's TaskCreate only accepts subject/description/activeForm/metadata — dependencies are maintained via TaskUpdate's addBlocks/addBlockedBy.
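Since a task record carries both blocks and blockedBy, adding a dependency after creation touches two records. The sketch below shows one way that bookkeeping might look. The add_blocked_by helper is hypothetical, works on plain dicts rather than files, and the assumption that both sides are updated together is an inference from the field list, not something confirmed against TaskUpdate's implementation.

```python
def add_blocked_by(tasks: dict[str, dict], task_id: str, upstream_id: str) -> None:
    """Record that `task_id` is blocked by `upstream_id`, updating both sides."""
    if task_id == upstream_id:
        raise ValueError("a task cannot block itself")
    task, upstream = tasks[task_id], tasks[upstream_id]
    # Upstream link: what this task is waiting on
    if upstream_id not in task.setdefault("blockedBy", []):
        task["blockedBy"].append(upstream_id)
    # Downstream link: what the upstream task is holding back
    if task_id not in upstream.setdefault("blocks", []):
        upstream["blocks"].append(task_id)
```

Keeping the downstream list means that completing a task can find its dependents directly instead of scanning every task, which is what the teaching version's complete_task does.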