mirror of https://github.com/shareAI-lab/learn-claude-code.git synced 2026-05-21 02:29:23 +00:00


README.en.md

s15: Agent Teams — One Agent Isn't Enough, Form a Team

中文 · English · 日本語

s01 → ... → s13 → s14 → s15 → s16 → s17 → s18 → s19 → s20

"One agent isn't enough, form a team" — File-based inboxes + teammate threads.

Harness Layer: Teams — Multi-agent collaboration, message bus.

The Problem

"Refactor the entire backend" touches auth, database layer, API routes, and tests. One agent working on API routes no longer has auth module details in context. The context window is limited, a single agent can't cover every module.

s06's sub-agents are temps, called in for one job, then gone. Some tasks need teammates that can communicate and collaborate.

The Solution

The teaching code carries forward s14's capabilities (prompt assembly, task system, background execution, cron scheduling). To stay focused on the team mechanism, it omits the full error recovery, memory, and skill systems. It adds three pieces: MessageBus (file-based inboxes), spawn_teammate_thread (launches teammate threads), and inbox injection (Lead reads teammate messages and injects them into history).

Sub-agent vs Teammate:

	s06 Sub-agent	s15 Teammate
Lifetime	One-shot, destroyed after use	Multi-turn (teaching: 10 rounds; real CC: idle loop)
Communication	Only returns conclusion	Async inbox, communicate anytime
Context	Fully isolated	Shared via messages
Count	One Lead + occasional sub-agent	One Lead + multiple teammates

How It Works

MessageBus: File-Based Inboxes

Each agent (including Lead and teammates) has a .jsonl inbox. Send = append a JSON line to the target's file. Read = read file + delete (consumption):

class MessageBus:
    def send(self, from_agent: str, to_agent: str,
             content: str, msg_type: str = "message"):
        msg = {"from": from_agent, "to": to_agent,
               "content": content, "type": msg_type,
               "ts": time.time()}
        inbox = MAILBOX_DIR / f"{to_agent}.jsonl"
        with open(inbox, "a") as f:
            f.write(json.dumps(msg) + "\n")

    def read_inbox(self, agent: str) -> list[dict]:
        inbox = MAILBOX_DIR / f"{agent}.jsonl"
        if not inbox.exists():
            return []
        msgs = [json.loads(line) for line in inbox.read_text().splitlines()]
        inbox.unlink()  # consume: read + delete
        return msgs

Why files instead of in-memory queues? The teaching code uses files because they're intuitive and observable across threads: you can open a mailbox and see exactly what was sent. Real CC also uses file inboxes (~/.claude/teams/{team}/inboxes/) but adds proper-lockfile for concurrent write safety. The teaching version's read_inbox has a read + unlink race: a message appended between the read and the unlink is deleted without ever being returned. That is acceptable for teaching purposes.
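One way to close the read + unlink race inside a single process is to serialize send and read_inbox with a lock. The sketch below is an illustration under assumptions, not the chapter's code.py and not what CC does with proper-lockfile: LockedMessageBus and its mailbox_dir parameter are hypothetical names, and a threading.Lock only protects threads that share one process.

```python
import json
import threading
import time
from pathlib import Path


class LockedMessageBus:
    """Hypothetical MessageBus variant: one lock guards all inbox files."""

    def __init__(self, mailbox_dir: Path):
        self.mailbox_dir = Path(mailbox_dir)
        self.mailbox_dir.mkdir(parents=True, exist_ok=True)
        self._lock = threading.Lock()

    def send(self, from_agent: str, to_agent: str,
             content: str, msg_type: str = "message") -> None:
        msg = {"from": from_agent, "to": to_agent,
               "content": content, "type": msg_type,
               "ts": time.time()}
        with self._lock:  # no append can land between a read and its unlink
            with open(self.mailbox_dir / f"{to_agent}.jsonl", "a") as f:
                f.write(json.dumps(msg) + "\n")

    def read_inbox(self, agent: str) -> list[dict]:
        inbox = self.mailbox_dir / f"{agent}.jsonl"
        with self._lock:  # read + delete happen as one step
            if not inbox.exists():
                return []
            lines = inbox.read_text().splitlines()
            inbox.unlink()
        return [json.loads(line) for line in lines if line.strip()]
```

Separate processes would still need a file-level lock, which is the gap proper-lockfile fills in real CC.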

spawn_teammate_thread: Launching a Teammate

Lead calls the spawn_teammate tool to start a teammate. The teammate runs in its own daemon thread with its own system prompt, messages, and simplified tool set:

def spawn_teammate_thread(name: str, role: str, prompt: str) -> str:
    system = f"You are '{name}', a {role}. Use tools to complete tasks."

    def run():
        messages = [{"role": "user", "content": prompt}]
        sub_tools = [bash, read_file, write_file, send_message]
        for _ in range(10):           # max 10 rounds
            inbox = BUS.read_inbox(name)
            if inbox:
                messages.append({"role": "user",
                    "content": f"<inbox>{json.dumps(inbox)}</inbox>"})
            response = client.messages.create(
                model=MODEL, system=system, messages=messages[-20:],
                tools=sub_tools, max_tokens=8000)
            # ... execute tools, process results
        # Send final summary to Lead
        BUS.send(name, "lead", summary, "result")

    threading.Thread(target=run, daemon=True).start()

Key design:

Simplified tool set: bash, read_file, write_file, send_message. The teaching code omits tasks and cron to focus on communication. Real CC teammates also have TaskCreate, TaskUpdate, etc.; the task system is shared across the team.
10-round cap (teaching only): prevents infinite loops. Real CC uses an idle loop instead: after each round the teammate sends idle_notification, waits for inbox messages, resumes when one arrives, and exits only on shutdown_request.
Auto-report on completion: BUS.send(name, "lead", summary, "result") sends the final result to Lead's inbox.
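The idle loop that real CC uses in place of the 10-round cap can be sketched roughly as follows. This is a simplified illustration, not CC's implementation: an in-memory queue.Queue stands in for the file inbox, run_idle_loop and do_work are hypothetical names, and the teammate here always approves shutdown, whereas a real teammate may reject it.

```python
import queue
import threading


def run_idle_loop(name, inbox, lead_inbox, do_work):
    """Handle messages until a shutdown_request arrives."""
    while True:
        msg = inbox.get()  # blocks while idle; wakes when a message arrives
        if msg["type"] == "shutdown_request":
            lead_inbox.put({"from": name, "type": "shutdown_approved"})
            return
        result = do_work(msg["content"])  # stand-in for one LLM/tool round
        lead_inbox.put({"from": name, "type": "message", "content": result})
        # Round finished: tell Lead this teammate is idle again.
        lead_inbox.put({"from": name, "type": "idle_notification"})


alice_inbox, lead_inbox = queue.Queue(), queue.Queue()
worker = threading.Thread(
    target=run_idle_loop,
    args=("alice", alice_inbox, lead_inbox, str.upper),
    daemon=True,
)
worker.start()
alice_inbox.put({"type": "message", "content": "create schema"})
alice_inbox.put({"type": "shutdown_request"})
worker.join(timeout=5)
events = [lead_inbox.get_nowait() for _ in range(lead_inbox.qsize())]
```

Unlike the fixed range(10) loop, this teammate never times out on its own; its lifetime is controlled entirely by the messages it receives.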

Lead's Inbox Injection

Lead checks inbox after each main loop iteration. Teammate messages are injected into history so the LLM can see and react to them:

# After main loop iteration
inbox = BUS.read_inbox("lead")
if inbox:
    inbox_text = "\n".join(
        f"From {m['from']}: {m['content'][:200]}" for m in inbox)
    history.append({"role": "user",
                    "content": f"[Inbox]\n{inbox_text}"})

The teaching code injects inside the user input loop, so Lead only sees teammate messages when the loop next runs. Real CC is more refined: Lead's useInboxPoller checks every 1 second and submits messages as new turns without waiting for user input.
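A background poller in the spirit of useInboxPoller could look like the sketch below. It is a Python illustration only, not a port of the CC hook: start_inbox_poller, read_inbox, and submit_turn are hypothetical names, and the fake functions exist just to make the example self-contained.

```python
import threading


def start_inbox_poller(read_inbox, submit_turn, interval=1.0):
    """Poll read_inbox() every `interval` seconds on a daemon thread."""
    stop = threading.Event()

    def loop():
        while not stop.is_set():
            messages = read_inbox()
            if messages:
                submit_turn(messages)  # becomes a new turn for Lead
            stop.wait(interval)  # sleeps, but returns at once when stopped

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return stop, thread


# Stand-ins for the real inbox and the real turn submission.
pending = [[{"from": "alice", "content": "Schema done"}]]
turns = []
delivered = threading.Event()


def fake_read_inbox():
    return pending.pop() if pending else []


def fake_submit_turn(messages):
    text = "\n".join(f"From {m['from']}: {m['content'][:200]}" for m in messages)
    turns.append(f"[Inbox]\n{text}")
    delivered.set()


stop, thread = start_inbox_poller(fake_read_inbox, fake_submit_turn, interval=0.01)
delivered.wait(timeout=5)
stop.set()
thread.join(timeout=5)
```

Using Event.wait instead of time.sleep lets the poller shut down promptly rather than finishing a full interval first.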

Permission Bubbling

Teaching code omits permission bubbling. Real CC's flow (permissionSync.ts, useSwarmPermissionPoller.ts):

Teammate encounters an operation needing approval → sends permission_request to Lead's inbox
Lead's useInboxPoller detects the request → routes to approval queue
User approves → Lead sends permission_response back to teammate
Teammate's useSwarmPermissionPoller (polls every 500ms) receives reply → continue or reject
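The four steps above amount to a request/response exchange matched by an identifier. The sketch below shows one possible shape of that exchange in Python. Everything beyond the two message type names is an assumption: the in-memory inboxes dict replaces the inbox files, and the request_id, tool, detail, and approved fields are hypothetical, not CC's actual schema.

```python
import uuid
from collections import defaultdict

inboxes = defaultdict(list)  # in-memory stand-in for the inbox files


def request_permission(teammate, tool, detail):
    """Teammate side: file a request in Lead's inbox and return its id."""
    request_id = uuid.uuid4().hex
    inboxes["lead"].append({"type": "permission_request", "from": teammate,
                            "request_id": request_id,
                            "tool": tool, "detail": detail})
    return request_id


def lead_handle_requests(ask_user):
    """Lead side: answer every pending request, keep other messages."""
    remaining = []
    for msg in inboxes["lead"]:
        if msg["type"] != "permission_request":
            remaining.append(msg)
            continue
        inboxes[msg["from"]].append({"type": "permission_response",
                                     "request_id": msg["request_id"],
                                     "approved": bool(ask_user(msg))})
    inboxes["lead"] = remaining


def poll_permission(teammate, request_id):
    """Teammate side: True/False once the reply has arrived, else None."""
    for i, msg in enumerate(inboxes[teammate]):
        if (msg["type"] == "permission_response"
                and msg["request_id"] == request_id):
            return inboxes[teammate].pop(i)["approved"]
    return None


rid = request_permission("alice", "bash", "rm -rf build/")
before = poll_permission("alice", rid)  # no reply yet
lead_handle_requests(lambda msg: msg["tool"] != "bash")  # user denies bash
after = poll_permission("alice", rid)
```

In real CC the teammate would call the polling step on a timer (every 500ms, per the flow above) instead of once.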

Putting It Together

1. Lead: "Build the backend: one agent isn't enough, form a team"
2. Lead → spawn_teammate("alice", "backend dev", "Create database schema")
3. Lead → spawn_teammate("bob", "frontend dev", "Write API client")
4. Alice thread starts → her own LLM call → bash "python manage.py migrate"
5. Bob thread starts → his own LLM call → write_file("client.ts", ...)
6. Alice done → BUS.send("alice", "lead", "Schema done: users, orders tables")
7. Bob done → BUS.send("bob", "lead", "Client written with types")
8. Lead next iteration → inbox injected into history → LLM sees both results

Two teammates work in parallel.

Changes from s14

Component	Before (s14)	After (s15)
Agent count	1	1 Lead + N teammate threads
Communication	None	MessageBus + .mailboxes/*.jsonl
New classes	—	MessageBus, active_teammates dict
New functions	—	spawn_teammate_thread, run_send_message, run_check_inbox
Lead tools	11 (s14)	+ spawn_teammate, send_message, check_inbox (14)
Teammate tools	—	bash, read_file, write_file, send_message (4)
Permissions	Local decisions	Teaching code omits (real CC has bubbling)

Try It

cd learn-claude-code
python s15_agent_teams/code.py

Try these prompts:

Spawn alice as a backend developer. Ask her to create a file called schema.sql with a users table.
Check your inbox for alice's result.
Spawn bob as a tester. Ask him to check if schema.sql exists and list its contents.

What to observe: How does Lead spawn teammates? What do the .mailboxes/ JSONL files look like? After teammates finish, is Lead's inbox injected into history?

What's Next

Teammates can work and communicate. But if Lead wants Alice to shut down, killing the thread outright could leave half-written files. A graceful shutdown protocol is needed: Lead sends shutdown_request, teammate wraps up and exits.

s16 Team Protocols → Shutdown handshake and message conventions.

Deep Dive into CC Source

The following analysis is based on these CC source files: spawnMultiAgent.ts, useInboxPoller.ts (969 lines), useSwarmPermissionPoller.ts (330 lines), teammateMailbox.ts, and teamHelpers.ts.

1. No Central Message Bus, It's the Filesystem

The teaching code uses a MessageBus class to send and receive messages. Real CC is more direct: each agent writes straight into the other agents' inbox files.

Inbox path: ~/.claude/teams/{teamName}/inboxes/{agentName}.json

Writes use proper-lockfile for concurrent write safety (up to 10 retries). Each file is a JSON array, so an append reads the array, adds the message, and writes the whole file back.
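The read, append, write-back cycle under a lock can be sketched in Python as below. This is an analogue for illustration, not proper-lockfile's algorithm (that is a Node library): append_to_inbox, the ".lock" suffix, and the delay parameter are assumptions, and the sketch makes no attempt to detect stale locks left by a crashed writer.

```python
import json
import os
import time
from pathlib import Path


def append_to_inbox(inbox: Path, message: dict,
                    retries: int = 10, delay: float = 0.05) -> None:
    """Append one message to a JSON-array inbox while holding a lockfile."""
    inbox.parent.mkdir(parents=True, exist_ok=True)
    lock = inbox.with_name(inbox.name + ".lock")
    for attempt in range(retries + 1):
        try:
            # O_EXCL makes creation fail if another writer holds the lock.
            fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if attempt == retries:
                raise TimeoutError(f"could not lock {inbox}")
            time.sleep(delay)
    try:
        messages = json.loads(inbox.read_text()) if inbox.exists() else []
        messages.append(message)
        inbox.write_text(json.dumps(messages))  # whole array is rewritten
    finally:
        os.close(fd)
        os.unlink(lock)  # release the lock even if the write failed
```

Because the whole array is rewritten on every append, the lock has to cover the read as well as the write; locking only the write would let two writers overwrite each other's additions.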

2. 15 Message Types

CC team communication has 15 structured message types (teammateMailbox.ts). The table below has 14 rows because the sandbox_permission_* request and reply share one row:

Type	Direction	Purpose
`plain text`	Both ways	Normal inter-teammate communication
`idle_notification`	Teammate→Lead	Teammate finished a turn, now idle
`permission_request`	Teammate→Lead	Teammate needs operation approval
`permission_response`	Lead→Teammate	Lead's approval result
`plan_approval_request`	Teammate→Lead	Teammate submits plan for review
`plan_approval_response`	Lead→Teammate	Lead's plan review
`shutdown_request`	Lead→Teammate	Request graceful shutdown
`shutdown_approved`	Teammate→Lead	Confirm shutdown
`shutdown_rejected`	Teammate→Lead	Reject shutdown (with reason)
`task_assignment`	Lead→Teammate	Assign a task
`team_permission_update`	Lead→Teammate	Broadcast permission changes
`mode_set_request`	Lead→Teammate	Change teammate's permission mode
`sandbox_permission_*`	Both ways	Network permission request/reply
`teammate_terminated`	System	Teammate removed notification

Text messages are wrapped in <teammate-message> XML tags for delivery to the model.
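The Direction column in the table can be read as a simple validation rule. The sketch below encodes it in Python purely as an illustration; direction_allowed and the two set names are hypothetical, "plain text" is used as the table's label rather than a confirmed type string, and CC itself may not enforce directions this way.

```python
# Types taken from the table above, grouped by the Direction column.
LEAD_TO_TEAMMATE = {
    "permission_response", "plan_approval_response", "shutdown_request",
    "task_assignment", "team_permission_update", "mode_set_request",
}
TEAMMATE_TO_LEAD = {
    "idle_notification", "permission_request", "plan_approval_request",
    "shutdown_approved", "shutdown_rejected",
}


def direction_allowed(msg_type: str, sender_is_lead: bool) -> bool:
    """Return True if this sender may emit this message type."""
    if msg_type in LEAD_TO_TEAMMATE:
        return sender_is_lead
    if msg_type in TEAMMATE_TO_LEAD:
        return not sender_is_lead
    # "Both ways" rows: plain text and the sandbox_permission_* family.
    if msg_type == "plain text" or msg_type.startswith("sandbox_permission_"):
        return True
    # System-only types (teammate_terminated) and unknown types are rejected.
    return False
```

For example, direction_allowed("shutdown_request", sender_is_lead=False) is False, since only Lead requests a shutdown.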

3. Permission Bubbling: Bidirectional Polling

Teaching code omits permission bubbling. Real CC's flow (permissionSync.ts):

Teammate encounters operation needing approval → sends permission_request to Lead's inbox
Lead's useInboxPoller (polls every 1s) detects request → routes to ToolUseConfirmQueue
Lead's UI shows approval dialog with teammate name and color
User approves → Lead sends permission_response back to teammate's inbox
Teammate's useSwarmPermissionPoller (polls every 500ms) receives reply → continue or reject

4. Teammate Lifecycle

CC teammates are created by spawnTeammate() (spawnMultiAgent.ts):

Spawn: Create tmux pane (or in-process), assign color, write team config
Work: useInboxPoller checks inbox every 1s → submit as new turn when messages arrive
Idle: Stop hook fires → send idle_notification to Lead
Shutdown: Lead sends shutdown_request → teammate replies shutdown_approved → Lead cleans up
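The lifecycle above can be viewed as a small state machine. The sketch below is one hypothetical way to model it in Python; the State names and event strings are invented for illustration, and it assumes shutdown is only approved from the idle state, which the source does not confirm.

```python
from enum import Enum


class State(Enum):
    WORKING = "working"
    IDLE = "idle"
    SHUTDOWN = "shutdown"


# (current state, event) -> next state; event names are hypothetical.
TRANSITIONS = {
    (State.WORKING, "turn_finished"): State.IDLE,    # Stop hook fires
    (State.IDLE, "message_arrived"): State.WORKING,  # inbox poll found mail
    (State.IDLE, "shutdown_approved"): State.SHUTDOWN,
}


def step(state: State, event: str) -> State:
    """Advance the lifecycle, rejecting transitions not listed above."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid in state {state.value}") from None


state = State.WORKING  # right after spawn, the initial prompt is being handled
for event in ["turn_finished", "message_arrived",
              "turn_finished", "shutdown_approved"]:
    state = step(state, event)
```

A shutdown_rejected reply would simply leave the teammate in its current state, so it needs no entry in the table.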

5. Team Config

Team registry at ~/.claude/teams/{teamName}/config.json (teamHelpers.ts):

{
  "name": "my-team",
  "leadAgentId": "lead@my-team",
  "members": [{
    "agentId": "researcher@my-team",
    "name": "researcher",
    "agentType": "general-purpose",
    "color": "blue",
    "isActive": true
  }]
}
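A config in this shape is straightforward to query. The sketch below parses the example above and pulls out the active members; active_members and split_agent_id are hypothetical helpers, and the "name@team" reading of agentId is inferred from the example values rather than from a documented format.

```python
import json

# The example config from above, inlined so the sketch is self-contained.
CONFIG_TEXT = """
{
  "name": "my-team",
  "leadAgentId": "lead@my-team",
  "members": [{
    "agentId": "researcher@my-team",
    "name": "researcher",
    "agentType": "general-purpose",
    "color": "blue",
    "isActive": true
  }]
}
"""


def active_members(config: dict) -> list[str]:
    """Names of members whose isActive flag is set."""
    return [m["name"] for m in config["members"] if m.get("isActive")]


def split_agent_id(agent_id: str) -> tuple[str, str]:
    """Split 'name@team' into (name, team); assumed format, see lead-in."""
    name, _, team = agent_id.partition("@")
    return name, team


config = json.loads(CONFIG_TEXT)
members = active_members(config)
lead_name, team_name = split_agent_id(config["leadAgentId"])
```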

Teammates cannot be nested (AgentTool.tsx:273 explicitly forbids "teammates spawning other teammates").