Clarify README tutorial track transition
117
README-ja.md
|
|
@ -166,15 +166,15 @@ Claude Code = 一つの agent loop
|
|||
>
|
||||
> **s02** *"ツールを足すなら、ハンドラーを1つ足すだけ"* — ループは変わらない。新ツールは dispatch map に登録するだけ
|
||||
>
|
||||
> **s03** *"まず境界を決め、それから自由を与える"* — 権限パイプラインが承認の要否を判断する
|
||||
> **s03** *"まず境界を決め、それから自由を与える"* — 実行してよいか、止めるか、ユーザーに聞くかを判断する
|
||||
>
|
||||
> **s04** *"ループの外にフックし、ループは書き換えない"* — フックがツール実行前後に拡張ロジックを注入
|
||||
> **s04** *"ループの外にフックし、ループは書き換えない"* — メインループを変えずに拡張できる入口を作る
|
||||
>
|
||||
> **s05** *"計画のないエージェントは行き当たりばったり"* — まずステップを書き出し、それから実行
|
||||
>
|
||||
> **s06** *"大きなタスクを分割し、各サブタスクにクリーンなコンテキストを"* — サブエージェントは独立した messages[] を使い、メイン会話を汚さない
|
||||
> **s06** *"大きなタスクを分割し、各サブタスクにクリーンなコンテキストを"* — サブ Agent が作業し、結果だけを持ち帰る
|
||||
>
|
||||
> **s07** *"必要な知識を、必要な時に読み込む"* — system prompt ではなく tool_result で注入
|
||||
> **s07** *"必要な知識を、必要な時に読み込む"* — スキルはまず一覧だけ、必要な時に展開する
|
||||
>
|
||||
> **s08** *"コンテキストはいつか溢れる、空ける手段が要る"* — 4層圧縮、安い方から先に実行
|
||||
>
|
||||
|
|
@ -182,23 +182,23 @@ Claude Code = 一つの agent loop
|
|||
>
|
||||
> **s10** *"プロンプトは実行時に組み立てる、ハードコードではない"* — セクション分割 + オンデマンド連結
|
||||
>
|
||||
> **s11** *"エラーは終わりではない、リトライの始まりだ"* — トークン拡張、コンテキスト圧縮、モデル切替
|
||||
> **s11** *"エラーは終わりではない、リトライの始まりだ"* — 失敗したら再試行し、空きを作り、別の道を試す
|
||||
>
|
||||
> **s12** *"大きな目標を小タスクに分解し、順序付けし、ディスクに記録する"* — ファイルベースのタスクグラフ、マルチエージェント協調の基盤
|
||||
>
|
||||
> **s13** *"遅い操作はバックグラウンドへ、エージェントは次を考え続ける"* — バックグラウンドスレッドがコマンド実行、完了後に通知を注入
|
||||
>
|
||||
> **s14** *"スケジュールで発火、人間の起動は不要"* — cron スケジューリング、永続 or セッション限定
|
||||
> **s14** *"スケジュールで発火、人間の起動は不要"* — 時間になったら自動でタスクを動かす
|
||||
>
|
||||
> **s15** *"一人で終わらないなら、チームメイトに任せる"* — 永続チームメイト + 非同期メールボックス
|
||||
>
|
||||
> **s16** *"チームメイト間には統一の通信ルールが必要"* — 1つの request-response パターンが全交渉を駆動
|
||||
> **s16** *"チームメイト間には統一の通信ルールが必要"* — 固定のリクエスト-返信形式で連携する
|
||||
>
|
||||
> **s17** *"チームメイトが自らボードを見て、仕事を取る"* — リーダーが逐一割り振る必要はない
|
||||
>
|
||||
> **s18** *"各自のディレクトリで作業し、互いに干渉しない"* — タスクは目標を管理、worktree はディレクトリを管理、IDで紐付け
|
||||
>
|
||||
> **s19** *"能力不足? MCP でプラグイン"* — マルチトランスポート、チャネルルーティング、ツールプール統合
|
||||
> **s19** *"能力不足? MCP でプラグイン"* — 外部ツールを同じツールプールに接続する
|
||||
>
|
||||
> **s20** *"仕組みは多く、ループは一つ"* — すべての仕組みを 1 つの Harness に戻す
|
||||
|
||||
|
|
@ -233,6 +233,35 @@ def agent_loop(messages):
|
|||
|
||||
各セッションはこのループの上に 1 つの Harness メカニズムを重ねる -- ループ自体は変わらない。ループは Agent のもの。メカニズムは Harness のもの。
|
||||
|
||||
## バージョン状況
|
||||
|
||||
このリポジトリには現在、2 つのチュートリアルトラックが共存している:
|
||||
|
||||
- **現行トラック:ルート直下の `s01-s20`**
|
||||
ルート直下の `s01_*` から `s20_*` までが新しい正規版であり、現在推奨する読書経路。各セッションには中国語原文、英語/日本語訳、実行可能な `code.py`、必要に応じた図が含まれる。
|
||||
- **旧版移行トラック:`docs/`、`agents/`、現在の `web/`**
|
||||
これらは旧 12 セッション版を保持している。既存読者、旧リンク、Web プラットフォームのために移行期間中は一時的に残している。
|
||||
|
||||
新しく読む場合は、ルート直下の `s01_agent_loop/` から `s20_comprehensive/` までを読む。旧リンクや現在の Web アプリから入った場合は、旧 12 セッション版を読んでいる可能性が高い。旧版と現行版のセッション番号は必ずしも一致しないため、番号を混同しないこと。
|
||||
|
||||
### 旧版から現行版への対応
|
||||
|
||||
| 旧 12 セッション版 | 現行 20 セッション版 | トピック |
|---|---|---|
| 旧 s01 | 現行 s01 | Agent Loop |
| 旧 s02 | 現行 s02 | Tool Use |
| 旧 s03 | 現行 s05 | TodoWrite |
| 旧 s04 | 現行 s06 | Subagent |
| 旧 s05 | 現行 s07 | Skill Loading |
| 旧 s06 | 現行 s08 | Context Compact |
| 旧 s07 | 現行 s12 | Task System |
| 旧 s08 | 現行 s13 | Background Tasks |
| 旧 s09 | 現行 s15 | Agent Teams |
| 旧 s10 | 現行 s16 | Team Protocols |
| 旧 s11 | 現行 s17 | Autonomous Agents |
| 旧 s12 | 現行 s18 | Worktree Isolation |
| 現行版のみ | s03、s04、s09、s10、s11、s14、s19、s20 | Permission、Hooks、Memory、System Prompt、Error Recovery、Cron、MCP、Comprehensive Agent |
|
||||
|
||||
## スコープ (重要)
|
||||
|
||||
このリポジトリは Harness 工学の 0->1 学習プロジェクト -- Agent モデルを囲む環境の構築を学ぶ。
|
||||
|
|
@ -248,6 +277,8 @@ def agent_loop(messages):
|
|||
|
||||
## クイックスタート
|
||||
|
||||
### 現行 20 セッション版
|
||||
|
||||
```sh
|
||||
git clone https://github.com/shareAI-lab/learn-claude-code
|
||||
cd learn-claude-code
|
||||
|
|
@ -259,24 +290,68 @@ python s08_context_compact/code.py # コンテキスト圧縮(複雑章)
|
|||
python s20_comprehensive/code.py # 終点: 全メカニズムを 1 つのループへ
|
||||
```
|
||||
|
||||
### 旧 12 セッション移行版
|
||||
|
||||
```sh
|
||||
python agents/s01_agent_loop.py
|
||||
python agents/s12_worktree_task_isolation.py
|
||||
python agents/s_full.py
|
||||
```
|
||||
|
||||
### Web プラットフォーム
|
||||
|
||||
インタラクティブな可視化、ステップスルーアニメーション、ソースビューア、各セッションのドキュメント。
|
||||
現在の Web プラットフォームはまだ `docs/` の旧 12 セッション版を表示する。現行 20 セッション版はルート直下の `s01-s20` を読む。
|
||||
|
||||
```sh
|
||||
cd web && npm install && npm run dev # http://localhost:3000
|
||||
```
|
||||
|
||||
## 6つの段階
|
||||
## 学習パス
|
||||
|
||||
| 段階 | セッション | 構築するもの |
|---|---|---|
| **ツールパイプライン** | `s01-s04` | loop → dispatch → permission → hooks |
| **シングルエージェント機能** | `s05-s08` | planning → subagent → skill → context compact |
| **知識と回復力** | `s09-s11` | memory → prompt assembly → error recovery |
| **永続的作業** | `s12-s14` | task graph → background → cron |
| **マルチエージェント基盤** | `s15-s19` | teams → protocols → autonomy → worktree → MCP |
| **完全な Harness** | `s20` | すべての仕組みを agent loop に統合 |
|
||||
主線:動ける → 複雑な仕事ができる → 記憶して回復できる → 長く動ける → 協調できる → 拡張して統合する
|
||||
|
||||
```mermaid
|
||||
flowchart TD
|
||||
%% カードスタイル
|
||||
classDef stage1 fill:#E3F2FD,stroke:#1976D2,stroke-width:2px,color:#0D47A1,rx:12,ry:12,text-align:left
|
||||
classDef stage2 fill:#E8F5E9,stroke:#388E3C,stroke-width:2px,color:#1B5E20,rx:12,ry:12,text-align:left
|
||||
classDef stage3 fill:#FFF3E0,stroke:#F57C00,stroke-width:2px,color:#E65100,rx:12,ry:12,text-align:left
|
||||
classDef stage4 fill:#FCE4EC,stroke:#C2185b,stroke-width:2px,color:#880E4F,rx:12,ry:12,text-align:left
|
||||
classDef stage5 fill:#F3E5F5,stroke:#7B1FA2,stroke-width:2px,color:#4A148C,rx:12,ry:12,text-align:left
|
||||
classDef stage6 fill:#E0F7FA,stroke:#0097A7,stroke-width:2px,color:#006064,rx:12,ry:12,text-align:left
|
||||
|
||||
%% 背景スタイル
|
||||
classDef groupBox fill:#F8F9FA,stroke:#CED4DA,stroke-width:2px,stroke-dasharray: 5 5,rx:15,ry:15,color:#495057
|
||||
|
||||
%% 第1層:1-3段階
|
||||
subgraph Phase1 ["🌱 段階 1-3:基礎能力の構築(単純から複雑へ)"]
|
||||
direction LR
|
||||
S1["<b>第1段階:Agent が動ける</b><br/>━━━━━━━━━━━━━<br/><b>s01 Agent Loop</b><br/>└─ 1つのループ + bash<br/><br/><b>s02 Tool Use</b><br/>└─ 1つのツールから複数へ<br/><br/><b>s03 Permission</b><br/>└─ 実行してよいか判断する<br/><br/><b>s04 Hooks</b><br/>└─ ツール前後に拡張入口を作る"]:::stage1
|
||||
|
||||
S2["<b>第2段階:複雑な仕事をこなす</b><br/>━━━━━━━━━━━━━<br/><b>s05 TodoWrite</b><br/>└─ 先に計画し、それから実行<br/><br/><b>s06 Subagent</b><br/>└─ サブ Agent が結果を返す<br/><br/><b>s07 Skill Loading</b><br/>└─ スキルを必要時に展開<br/><br/><b>s08 Context Compact</b><br/>└─ 長いコンテキストに空きを作る"]:::stage2
|
||||
|
||||
S3["<b>第3段階:記憶して回復する</b><br/>━━━━━━━━━━━━━<br/><b>s09 Memory</b><br/>└─ 覚えるべきことを覚える<br/><br/><b>s10 System Prompt</b><br/>└─ 実行時に組み立てる<br/><br/><b>s11 Error Recovery</b><br/>└─ 再試行し、別の道へ"]:::stage3
|
||||
|
||||
S1 ==> S2 ==> S3
|
||||
end
|
||||
|
||||
%% 第2層:4-6段階
|
||||
subgraph Phase2 ["🚀 段階 4-6:高次能力の進化(長期実行、協調、統合)"]
|
||||
direction LR
|
||||
S4["<b>第4段階:長く動くタスク</b><br/>━━━━━━━━━━━━━<br/><b>s12 Task System</b><br/>└─ タスクと依存関係を保存<br/><br/><b>s13 Background Tasks</b><br/>└─ 遅い作業をバックグラウンドへ<br/><br/><b>s14 Cron Scheduler</b><br/>└─ 時間で自動実行"]:::stage4
|
||||
|
||||
S5["<b>第5段階:複数 Agent の協作</b><br/>━━━━━━━━━━━━━<br/><b>s15 Agent Teams</b><br/>└─ チームメイト + メールボックス<br/><br/><b>s16 Team Protocols</b><br/>└─ 固定のリクエスト-返信形式<br/><br/><b>s17 Autonomous Agents</b><br/>└─ ボードを見て仕事を取る<br/><br/><b>s18 Worktree Isolation</b><br/>└─ 別ディレクトリで作業"]:::stage5
|
||||
|
||||
S6["<b>第6段階:外部能力と統合</b><br/>━━━━━━━━━━━━━<br/><b>s19 MCP Plugin</b><br/>└─ 外部ツールを同じプールへ<br/><br/><b>s20 Comprehensive Agent</b><br/>└─ すべてを1つのループへ"]:::stage6
|
||||
|
||||
S4 ==> S5 ==> S6
|
||||
end
|
||||
|
||||
%% 2つの層を接続
|
||||
Phase1 ===> Phase2
|
||||
|
||||
class Phase1,Phase2 groupBox
|
||||
```
|
||||
|
||||
## 全セッション
|
||||
|
||||
|
|
@ -317,10 +392,10 @@ learn-claude-code/
|
|||
...
|
||||
s19_mcp_plugin/
|
||||
s20_comprehensive/ # 終点セッション
|
||||
agents/ # フラットコピー、python agents/sXX.py でクイック実行
|
||||
agents/ # 旧 12 セッションの実行可能コピー + s_full.py
|
||||
skills/ # s07 で使用するスキルファイル
|
||||
docs/ # 旧バージョン(アーカイブ)
|
||||
web/ # Web 学習プラットフォーム
|
||||
docs/ # 旧 12 セッション文書、移行期間中は保持
|
||||
web/ # 現在は docs/ の旧版内容を生成・表示
|
||||
tests/
|
||||
```
|
||||
|
||||
|
|
|
|||
118
README-zh.md
|
|
@ -166,15 +166,15 @@ Claude Code = 一个 agent loop
|
|||
>
|
||||
> **s02** *"加一个工具, 只加一个 handler"* — 循环不用动, 新工具注册进 dispatch map 就行
|
||||
>
|
||||
> **s03** *"先划边界, 再给自由"* — 权限管线决定哪些操作需要审批
|
||||
> **s03** *"先划边界, 再给自由"* — 先判断操作能不能做,要不要问用户
|
||||
>
|
||||
> **s04** *"挂在循环上, 不写进循环里"* — 钩子在工具执行前后注入扩展逻辑
|
||||
> **s04** *"挂在循环上, 不写进循环里"* — 在工具前后留插口,不改主循环也能扩展
|
||||
>
|
||||
> **s05** *"没有计划的 agent 走哪算哪"* — 先列步骤再动手, 完成率翻倍
|
||||
>
|
||||
> **s06** *"大任务拆小, 每个小任务干净的上下文"* — Subagent 用独立 messages[], 不污染主对话
|
||||
> **s06** *"大任务拆小, 每个小任务干净的上下文"* — 子 Agent 自己干活,只把结果带回来
|
||||
>
|
||||
> **s07** *"用到时再加载, 别全塞 prompt 里"* — 通过 tool_result 注入, 不塞 system prompt
|
||||
> **s07** *"用到时再加载, 别全塞 prompt 里"* — 技能先列目录,用到时再展开
|
||||
>
|
||||
> **s08** *"上下文总会满, 要有办法腾地方"* — 四层压缩策略, 便宜的先跑贵的后跑
|
||||
>
|
||||
|
|
@ -182,23 +182,23 @@ Claude Code = 一个 agent loop
|
|||
>
|
||||
> **s10** *"prompt 是组装出来的, 不是写死的"* — 分段 + 按需拼接
|
||||
>
|
||||
> **s11** *"错误不是终点, 是重试的起点"* — 升级 token、压缩上下文、切换模型
|
||||
> **s11** *"错误不是终点, 是重试的起点"* — 出错时会重试、腾空间、换路子
|
||||
>
|
||||
> **s12** *"大目标拆成小任务, 排好序, 持久化"* — 文件持久化的任务图, 多 agent 协作的基础
|
||||
>
|
||||
> **s13** *"慢操作丢后台, agent 继续思考"* — 后台线程跑命令, 完成后注入通知
|
||||
>
|
||||
> **s14** *"定时触发, 不需要人推"* — cron 调度, 持久化或会话级
|
||||
> **s14** *"定时触发, 不需要人推"* — 按时间自动触发任务
|
||||
>
|
||||
> **s15** *"一个搞不定, 组队来"* — 持久化队友 + 异步邮箱
|
||||
>
|
||||
> **s16** *"队友之间要有约定"* — 一个 request-response 模式驱动所有协商
|
||||
> **s16** *"队友之间要有约定"* — 用固定的请求-回复格式沟通
|
||||
>
|
||||
> **s17** *"队友自己看板, 有活就认领"* — 不需要领导逐个分配, 自组织
|
||||
>
|
||||
> **s18** *"各干各的目录, 互不干扰"* — 任务管目标, worktree 管目录, 按 ID 绑定
|
||||
>
|
||||
> **s19** *"能力不够? 插上 MCP"* — 多传输、通道路由、工具池合并
|
||||
> **s19** *"能力不够? 插上 MCP"* — 把外部工具接进同一个工具池
|
||||
>
|
||||
> **s20** *"机制很多,循环一个"* — 前面所有机制回到一个完整 harness
|
||||
|
||||
|
|
@ -233,6 +233,35 @@ def agent_loop(messages):
|
|||
|
||||
每个课程在这个循环之上叠加一个 harness 机制 -- 循环本身始终不变。循环属于 agent。机制属于 harness。
|
||||
|
||||
## 版本说明
|
||||
|
||||
本仓库现在同时保留两条教程线:
|
||||
|
||||
- **新版主线:根目录 `s01-s20`**
|
||||
根目录下的 `s01_*` 到 `s20_*` 是新的主版本,也是当前推荐阅读路径。每章包含完整叙事 README、英文/日文译本、可运行的 `code.py`,以及必要的图示。
|
||||
- **旧版过渡:`docs/`、`agents/`、当前 `web/`**
|
||||
这些仍保留旧 12 章体系,暂时用于已有读者、旧链接和 Web 平台过渡。
|
||||
|
||||
新读者请从根目录 `s01_agent_loop/` 读到 `s20_comprehensive/`。如果你是从旧链接或当前 Web 平台进入,大概率看到的是旧 12 章版本。旧版章节号和新版不完全一致,不要混用章节号。
|
||||
|
||||
### 旧版到新版的对应关系
|
||||
|
||||
| 旧 12 章版本 | 新 20 章版本 | 主题 |
|---|---|---|
| 旧 s01 | 新 s01 | Agent Loop |
| 旧 s02 | 新 s02 | Tool Use |
| 旧 s03 | 新 s05 | TodoWrite |
| 旧 s04 | 新 s06 | Subagent |
| 旧 s05 | 新 s07 | Skill Loading |
| 旧 s06 | 新 s08 | Context Compact |
| 旧 s07 | 新 s12 | Task System |
| 旧 s08 | 新 s13 | Background Tasks |
| 旧 s09 | 新 s15 | Agent Teams |
| 旧 s10 | 新 s16 | Team Protocols |
| 旧 s11 | 新 s17 | Autonomous Agents |
| 旧 s12 | 新 s18 | Worktree Isolation |
| 新版新增 | s03、s04、s09、s10、s11、s14、s19、s20 | Permission、Hooks、Memory、System Prompt、Error Recovery、Cron、MCP、Comprehensive Agent |
|
||||
|
||||
## 范围说明 (重要)
|
||||
|
||||
本仓库是一个 0->1 的 harness 工程学习项目 -- 构建围绕 agent 模型的工作环境。
|
||||
|
|
@ -248,6 +277,8 @@ def agent_loop(messages):
|
|||
|
||||
## 快速开始
|
||||
|
||||
### 新版 20 章主线
|
||||
|
||||
```sh
|
||||
git clone https://github.com/shareAI-lab/learn-claude-code
|
||||
cd learn-claude-code
|
||||
|
|
@ -259,24 +290,69 @@ python s08_context_compact/code.py # 上下文压缩(复杂章)
|
|||
python s20_comprehensive/code.py # 终点章: 全部机制归到一个循环
|
||||
```
|
||||
|
||||
### 旧版 12 章过渡线
|
||||
|
||||
```sh
|
||||
python agents/s01_agent_loop.py
|
||||
python agents/s12_worktree_task_isolation.py
|
||||
python agents/s_full.py
|
||||
```
|
||||
|
||||
### Web 平台
|
||||
|
||||
交互式可视化、分步动画、源码查看器, 以及每个课程的文档。
|
||||
当前 Web 平台仍读取 `docs/` 中的旧 12 章内容。新版 20 章请直接阅读根目录 `s01-s20`。
|
||||
|
||||
```sh
|
||||
cd web && npm install && npm run dev # http://localhost:3000
|
||||
```
|
||||
|
||||
## 六个阶段
|
||||
## 学习路径
|
||||
|
||||
| 阶段 | 章节 | 你在构建什么 |
|---|---|---|
| **工具管线** | `s01-s04` | loop → dispatch → permission → hooks |
| **单 Agent 能力** | `s05-s08` | planning → subagent → skill → context compact |
| **知识与韧性** | `s09-s11` | memory → prompt assembly → error recovery |
| **持久化工作** | `s12-s14` | task graph → background → cron |
| **多 Agent 平台** | `s15-s19` | teams → protocols → autonomy → worktree → MCP |
| **完整 Harness** | `s20` | 全部机制归到一个 agent loop |
|
||||
主线:能动手 → 能做复杂任务 → 能记住和恢复 → 能长期运行 → 能协作 → 能扩展并合体
|
||||
|
||||
```mermaid
|
||||
flowchart TD
|
||||
%% 统一定义卡片样式:加入 text-align:left 保证列表不会居中乱飘
|
||||
classDef stage1 fill:#E3F2FD,stroke:#1976D2,stroke-width:2px,color:#0D47A1,rx:12,ry:12,text-align:left
|
||||
classDef stage2 fill:#E8F5E9,stroke:#388E3C,stroke-width:2px,color:#1B5E20,rx:12,ry:12,text-align:left
|
||||
classDef stage3 fill:#FFF3E0,stroke:#F57C00,stroke-width:2px,color:#E65100,rx:12,ry:12,text-align:left
|
||||
classDef stage4 fill:#FCE4EC,stroke:#C2185b,stroke-width:2px,color:#880E4F,rx:12,ry:12,text-align:left
|
||||
classDef stage5 fill:#F3E5F5,stroke:#7B1FA2,stroke-width:2px,color:#4A148C,rx:12,ry:12,text-align:left
|
||||
classDef stage6 fill:#E0F7FA,stroke:#0097A7,stroke-width:2px,color:#006064,rx:12,ry:12,text-align:left
|
||||
|
||||
%% 背景框样式
|
||||
classDef groupBox fill:#F8F9FA,stroke:#CED4DA,stroke-width:2px,stroke-dasharray: 5 5,rx:15,ry:15,color:#495057
|
||||
|
||||
%% 第一层:1-3阶段
|
||||
subgraph Phase1 ["🌱 阶段 1-3:基础能力构建(从简单到复杂)"]
|
||||
direction LR
|
||||
S1["<b>第一阶段:让 Agent 能动手</b><br/>━━━━━━━━━━━━━<br/><b>s01 Agent Loop</b><br/>└─ 一个循环 + bash<br/><br/><b>s02 Tool Use</b><br/>└─ 单个到多个工具<br/><br/><b>s03 Permission</b><br/>└─ 判断能不能做<br/><br/><b>s04 Hooks</b><br/>└─ 工具前后留扩展插口"]:::stage1
|
||||
|
||||
S2["<b>第二阶段:做复杂任务</b><br/>━━━━━━━━━━━━━<br/><b>s05 TodoWrite</b><br/>└─ 先列计划,再执行<br/><br/><b>s06 Subagent</b><br/>└─ 子 Agent 干活带回结果<br/><br/><b>s07 Skill Loading</b><br/>└─ 技能按需展开<br/><br/><b>s08 Context Compact</b><br/>└─ 长上下文腾空间"]:::stage2
|
||||
|
||||
S3["<b>第三阶段:记住和恢复</b><br/>━━━━━━━━━━━━━<br/><b>s09 Memory</b><br/>└─ 该记记,该忘忘<br/><br/><b>s10 System Prompt</b><br/>└─ 运行时组装<br/><br/><b>s11 Error Recovery</b><br/>└─ 重试换路子"]:::stage3
|
||||
|
||||
S1 ==> S2 ==> S3
|
||||
end
|
||||
|
||||
%% 第二层:4-6阶段
|
||||
subgraph Phase2 ["🚀 阶段 4-6:高阶能力进化(长期、协作与融合)"]
|
||||
direction LR
|
||||
S4["<b>第四阶段:让任务长期运行</b><br/>━━━━━━━━━━━━━<br/><b>s12 Task System</b><br/>└─ 任务落盘记依赖<br/><br/><b>s13 Background Tasks</b><br/>└─ 慢操作丢后台<br/><br/><b>s14 Cron Scheduler</b><br/>└─ 按时自动触发"]:::stage4
|
||||
|
||||
S5["<b>第五阶段:让多个 Agent 协作</b><br/>━━━━━━━━━━━━━<br/><b>s15 Agent Teams</b><br/>└─ 队友 + 邮箱通信<br/><br/><b>s16 Team Protocols</b><br/>└─ 固定收发格式<br/><br/><b>s17 Autonomous Agents</b><br/>└─ 自己看板认领活<br/><br/><b>s18 Worktree Isolation</b><br/>└─ 隔离目录"]:::stage5
|
||||
|
||||
S6["<b>第六阶段:接外部能力合体</b><br/>━━━━━━━━━━━━━<br/><b>s19 MCP Plugin</b><br/>└─ 外部接进工具池<br/><br/><b>s20 Comprehensive Agent</b><br/>└─ 全机制回单循环"]:::stage6
|
||||
|
||||
S4 ==> S5 ==> S6
|
||||
end
|
||||
|
||||
%% 将两个模块连接起来,形成 Z 字形阅读流
|
||||
Phase1 ===> Phase2
|
||||
|
||||
%% 应用背景样式
|
||||
class Phase1,Phase2 groupBox
|
||||
```
|
||||
|
||||
## 全部章节
|
||||
|
||||
|
|
@ -317,10 +393,10 @@ learn-claude-code/
|
|||
...
|
||||
s19_mcp_plugin/
|
||||
s20_comprehensive/ # 终点章
|
||||
agents/ # 扁平副本,方便 python agents/sXX.py 快速运行
|
||||
agents/ # 旧 12 章可运行副本 + s_full.py
|
||||
skills/ # s07 使用的 skill 文件
|
||||
docs/ # 旧版线上文档(已归档)
|
||||
web/ # Web 教学平台
|
||||
docs/ # 旧 12 章文档,过渡期保留
|
||||
web/ # 当前仍基于 docs/ 旧版内容生成
|
||||
tests/
|
||||
```
|
||||
|
||||
|
|
|
|||
139
README.md
|
|
@ -159,6 +159,51 @@ Every lesson layers one harness mechanism on top of this loop -- the loop itself
|
|||
|
||||
---
|
||||
|
||||
## Version Status
|
||||
|
||||
This repository currently contains two tutorial tracks:
|
||||
|
||||
- **Current track: root-level `s01-s20`**
|
||||
The root-level `s01_*` through `s20_*` folders are the new canonical version. Each chapter contains a full narrative README, translations, a runnable `code.py`, and diagrams where needed.
|
||||
- **Legacy transition track: `docs/`, `agents/`, and the current `web/` app**
|
||||
These still preserve the older 12-lesson version. They are kept temporarily for existing readers, old links, and the web platform while the new 20-lesson track settles.
|
||||
|
||||
If you are starting now, read the root-level `s01_agent_loop/` through `s20_comprehensive/` chapters. If you are following an older link or using the current web app, you are likely reading the legacy 12-lesson track. The legacy and current chapter numbers do not always match, so avoid mixing chapter numbers across tracks.
|
||||
|
||||
### Legacy-to-Current Mapping
|
||||
|
||||
| Legacy 12-lesson track | Current 20-lesson track | Topic |
|---|---|---|
| old s01 | new s01 | Agent Loop |
| old s02 | new s02 | Tool Use |
| old s03 | new s05 | TodoWrite |
| old s04 | new s06 | Subagent |
| old s05 | new s07 | Skill Loading |
| old s06 | new s08 | Context Compact |
| old s07 | new s12 | Task System |
| old s08 | new s13 | Background Tasks |
| old s09 | new s15 | Agent Teams |
| old s10 | new s16 | Team Protocols |
| old s11 | new s17 | Autonomous Agents |
| old s12 | new s18 | Worktree Isolation |
| new only | s03, s04, s09, s10, s11, s14, s19, s20 | Permission, Hooks, Memory, System Prompt, Error Recovery, Cron, MCP, Comprehensive Agent |
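For readers updating old links, the mapping table can also be expressed as a small lookup. This is only an illustrative sketch: `LEGACY_TO_CURRENT`, `to_current`, and `new_only` are hypothetical names and are not part of the repository.

```python
# Hypothetical lookup built from the legacy-to-current mapping table.
LEGACY_TO_CURRENT = {
    "s01": "s01", "s02": "s02", "s03": "s05", "s04": "s06",
    "s05": "s07", "s06": "s08", "s07": "s12", "s08": "s13",
    "s09": "s15", "s10": "s16", "s11": "s17", "s12": "s18",
}


def to_current(legacy_id: str) -> str:
    """Translate a legacy chapter id (e.g. "s03") to its current-track id."""
    try:
        return LEGACY_TO_CURRENT[legacy_id]
    except KeyError:
        raise ValueError(f"{legacy_id!r} is not a legacy chapter id") from None


def new_only() -> list[str]:
    """Current-track chapters that have no legacy counterpart."""
    mapped = set(LEGACY_TO_CURRENT.values())
    return [f"s{n:02d}" for n in range(1, 21) if f"s{n:02d}" not in mapped]
```

Calling `new_only()` reproduces the last row of the table, which is a quick way to confirm the mapping is internally consistent.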
|
||||
|
||||
---
|
||||
|
||||
## Scope
|
||||
|
||||
This repository is a 0-to-1 harness engineering learning project: it teaches how to build the working environment around an agent model. To keep the learning path clear, some production mechanisms are intentionally simplified or omitted:
|
||||
|
||||
- Full event / hook bus behavior, such as `PreToolUse`, `SessionStart/End`, and `ConfigChange`.
|
||||
The teaching code uses minimal lifecycle events where needed.
|
||||
- Rule-based permission governance and full trust workflows.
|
||||
- Session lifecycle controls such as resume/fork, plus more complete worktree lifecycle handling.
|
||||
- Full MCP runtime details such as transport, OAuth, resource subscription, and polling.
|
||||
|
||||
The JSONL mailbox protocol in this repository is a teaching implementation, not a claim about how any production system implements it internally.
|
||||
|
||||
---
|
||||
|
||||
## 20 Progressive Lessons
|
||||
|
||||
**Each lesson adds one harness mechanism. Each mechanism has a motto.**
|
||||
|
|
@ -167,15 +212,15 @@ Every lesson layers one harness mechanism on top of this loop -- the loop itself
|
|||
>
|
||||
> **s02** *"Adding a tool means adding one handler"* — the loop stays untouched; new tools register into the dispatch map
|
||||
>
|
||||
> **s03** *"Set boundaries first, then grant freedom"* — the permission pipeline decides which operations need approval
|
||||
> **s03** *"Set boundaries first, then grant freedom"* — check what can run, what must stop, and what needs approval
|
||||
>
|
||||
> **s04** *"Hook around the loop, never rewrite the loop"* — hooks inject extension logic before and after tool execution
|
||||
> **s04** *"Hook around the loop, never rewrite the loop"* — add extension points without changing the main loop
|
||||
>
|
||||
> **s05** *"An agent without a plan drifts"* — list the steps before starting; completion rate doubles
|
||||
>
|
||||
> **s06** *"Big tasks split small, each subtask gets clean context"* — subagents use a fresh messages[], keeping the main conversation clean
|
||||
> **s06** *"Big tasks split small, each subtask gets clean context"* — subagents do the side work and bring back only the result
|
||||
>
|
||||
> **s07** *"Load knowledge on demand, not upfront"* — inject via tool_result, not the system prompt
|
||||
> **s07** *"Load knowledge on demand, not upfront"* — list skills first, expand them only when needed
|
||||
>
|
||||
> **s08** *"Context always fills up -- have a way to make room"* — multi-layer compaction strategies buy you infinite sessions
|
||||
>
|
||||
|
|
@ -183,38 +228,74 @@ Every lesson layers one harness mechanism on top of this loop -- the loop itself
|
|||
>
|
||||
> **s10** *"Prompts are assembled at runtime, not hardcoded"* — section-based concatenation, loaded on demand
|
||||
>
|
||||
> **s11** *"Errors aren't the end, they're the start of a retry"* — escalate tokens, compact context, switch models
|
||||
> **s11** *"Errors aren't the end, they're the start of a retry"* — retry, make room, or take another path when things fail
|
||||
>
|
||||
> **s12** *"Big goals break into small tasks, ordered, persisted to disk"* — a file-backed task graph that lays the groundwork for multi-agent coordination
|
||||
>
|
||||
> **s13** *"Slow ops go background, agent keeps thinking"* — background threads run commands; notifications inject on completion
|
||||
>
|
||||
> **s14** *"Fire on schedule, no human kick needed"* — cron scheduling, durable or session-scoped
|
||||
> **s14** *"Fire on schedule, no human kick needed"* — trigger tasks automatically by time
|
||||
>
|
||||
> **s15** *"Too big for one agent -- delegate to teammates"* — persistent teammates + async mailboxes
|
||||
>
|
||||
> **s16** *"Teammates need shared communication rules"* — one request-response pattern drives all negotiation
|
||||
> **s16** *"Teammates need shared communication rules"* — use a fixed request-reply format for coordination
|
||||
>
|
||||
> **s17** *"Teammates check the board, claim work themselves"* — no leader assigning one by one; self-organizing
|
||||
>
|
||||
> **s18** *"Each works in its own directory, no interference"* — tasks own goals, worktrees own directories, bound by ID
|
||||
>
|
||||
> **s19** *"Not enough capability? Plug in more via MCP"* — multi-transport, channel routing, tool pool merging
|
||||
> **s19** *"Not enough capability? Plug in more via MCP"* — connect external tools into the same tool pool
|
||||
>
|
||||
> **s20** *"Many mechanisms, one loop"* — all previous mechanisms return to one complete harness
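The s02 motto ("adding a tool means adding one handler") can be illustrated with a minimal dispatch map. The sketch below is an assumption about the general shape, not the repository's `code.py`: `TOOL_HANDLERS`, `register_tool`, and `dispatch` are hypothetical names.

```python
# Hypothetical dispatch map: tool name -> handler function.
TOOL_HANDLERS = {}


def register_tool(name):
    """Decorator that registers a handler under a tool name."""
    def decorator(func):
        TOOL_HANDLERS[name] = func
        return func
    return decorator


@register_tool("echo")
def echo(text):
    return text


@register_tool("add")
def add(a, b):
    return a + b


def dispatch(name, **kwargs):
    """The loop's only tool-related step: look up the handler and call it."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return f"unknown tool: {name}"
    return handler(**kwargs)
```

Adding a third tool means writing one more decorated function; `dispatch` and the loop that calls it stay unchanged.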
|
||||
|
||||
---
|
||||
|
||||
## Six Stages
|
||||
## Learning Path
|
||||
|
||||
| Stage | Chapters | What you are building |
|---|---|---|
| **Tool pipeline** | `s01-s04` | loop → tool dispatch → permission pipeline → hook extensions |
| **Single-agent capability** | `s05-s08` | planning → subagent → skill loading → context compaction |
| **Knowledge and resilience** | `s09-s11` | memory → prompt assembly → error recovery |
| **Durable work** | `s12-s14` | task graph → background execution → scheduled triggers |
| **Multi-agent platform** | `s15-s19` | teams → protocols → autonomy → worktree isolation → MCP |
| **Complete harness** | `s20` | full agent loop with all mechanisms assembled |
|
||||
Main line: act → handle complex work → remember and recover → run long tasks → collaborate → extend and assemble.
|
||||
|
||||
```mermaid
|
||||
flowchart TD
|
||||
%% Card styles
|
||||
classDef stage1 fill:#E3F2FD,stroke:#1976D2,stroke-width:2px,color:#0D47A1,rx:12,ry:12,text-align:left
|
||||
classDef stage2 fill:#E8F5E9,stroke:#388E3C,stroke-width:2px,color:#1B5E20,rx:12,ry:12,text-align:left
|
||||
classDef stage3 fill:#FFF3E0,stroke:#F57C00,stroke-width:2px,color:#E65100,rx:12,ry:12,text-align:left
|
||||
classDef stage4 fill:#FCE4EC,stroke:#C2185b,stroke-width:2px,color:#880E4F,rx:12,ry:12,text-align:left
|
||||
classDef stage5 fill:#F3E5F5,stroke:#7B1FA2,stroke-width:2px,color:#4A148C,rx:12,ry:12,text-align:left
|
||||
classDef stage6 fill:#E0F7FA,stroke:#0097A7,stroke-width:2px,color:#006064,rx:12,ry:12,text-align:left
|
||||
|
||||
%% Group style
|
||||
classDef groupBox fill:#F8F9FA,stroke:#CED4DA,stroke-width:2px,stroke-dasharray: 5 5,rx:15,ry:15,color:#495057
|
||||
|
||||
%% Layer 1: stages 1-3
|
||||
subgraph Phase1 ["🌱 Stages 1-3: Core capabilities (simple to complex)"]
|
||||
direction LR
|
||||
S1["<b>1. Let the Agent act</b><br/>━━━━━━━━━━━━━<br/><b>s01 Agent Loop</b><br/>└─ one loop + bash<br/><br/><b>s02 Tool Use</b><br/>└─ one tool to many tools<br/><br/><b>s03 Permission</b><br/>└─ decide what can run<br/><br/><b>s04 Hooks</b><br/>└─ extension points around tools"]:::stage1
|
||||
|
||||
S2["<b>2. Handle complex work</b><br/>━━━━━━━━━━━━━<br/><b>s05 TodoWrite</b><br/>└─ plan first, then execute<br/><br/><b>s06 Subagent</b><br/>└─ side work, result back<br/><br/><b>s07 Skill Loading</b><br/>└─ expand skills on demand<br/><br/><b>s08 Context Compact</b><br/>└─ make room in long context"]:::stage2
|
||||
|
||||
S3["<b>3. Remember and recover</b><br/>━━━━━━━━━━━━━<br/><b>s09 Memory</b><br/>└─ remember what matters<br/><br/><b>s10 System Prompt</b><br/>└─ assemble at runtime<br/><br/><b>s11 Error Recovery</b><br/>└─ retry or change path"]:::stage3
|
||||
|
||||
S1 ==> S2 ==> S3
|
||||
end
|
||||
|
||||
%% Layer 2: stages 4-6
|
||||
subgraph Phase2 ["🚀 Stages 4-6: Advanced capabilities (long-running, collaboration, integration)"]
|
||||
direction LR
|
||||
S4["<b>4. Run long tasks</b><br/>━━━━━━━━━━━━━<br/><b>s12 Task System</b><br/>└─ persist tasks and deps<br/><br/><b>s13 Background Tasks</b><br/>└─ send slow work to background<br/><br/><b>s14 Cron Scheduler</b><br/>└─ trigger by time"]:::stage4
|
||||
|
||||
S5["<b>5. Coordinate many Agents</b><br/>━━━━━━━━━━━━━<br/><b>s15 Agent Teams</b><br/>└─ teammates + mailboxes<br/><br/><b>s16 Team Protocols</b><br/>└─ fixed request-reply format<br/><br/><b>s17 Autonomous Agents</b><br/>└─ claim work from the board<br/><br/><b>s18 Worktree Isolation</b><br/>└─ separate directories"]:::stage5
|
||||
|
||||
S6["<b>6. Extend and assemble</b><br/>━━━━━━━━━━━━━<br/><b>s19 MCP Plugin</b><br/>└─ external tools, one pool<br/><br/><b>s20 Comprehensive Agent</b><br/>└─ all mechanisms, one loop"]:::stage6
|
||||
|
||||
S4 ==> S5 ==> S6
|
||||
end
|
||||
|
||||
%% Connect the two layers
|
||||
Phase1 ===> Phase2
|
||||
|
||||
class Phase1,Phase2 groupBox
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
|
|
@ -266,6 +347,8 @@ Read from s01 through s20 in order. Each chapter assumes you've read the previou
|
|||
|
||||
## Quick Start
|
||||
|
||||
### Current 20-Lesson Track
|
||||
|
||||
```sh
|
||||
git clone https://github.com/shareAI-lab/learn-claude-code
|
||||
cd learn-claude-code
|
||||
|
|
@ -277,6 +360,22 @@ python s08_context_compact/code.py # Context compaction (complex)
|
|||
python s20_comprehensive/code.py # Endpoint: all mechanisms in one loop
|
||||
```
|
||||
|
||||
### Legacy 12-Lesson Track
|
||||
|
||||
```sh
|
||||
python agents/s01_agent_loop.py
|
||||
python agents/s12_worktree_task_isolation.py
|
||||
python agents/s_full.py
|
||||
```
|
||||
|
||||
### Web Platform
|
||||
|
||||
The current web app still renders the legacy `docs/` s01-s12 track. Use the root-level folders for the new s01-s20 track.
|
||||
|
||||
```sh
|
||||
cd web && npm install && npm run dev # http://localhost:3000
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Project Structure
|
||||
|
|
@ -293,10 +392,10 @@ learn-claude-code/
|
|||
...
|
||||
s19_mcp_plugin/
|
||||
s20_comprehensive/ # endpoint chapter
|
||||
agents/ # flat copies for quick python agents/sXX.py
|
||||
agents/ # runnable copies of the legacy 12 lessons + s_full.py
|
||||
skills/ # skill files used by s07
|
||||
docs/ # legacy online docs (archived)
|
||||
web/ # web teaching platform
|
||||
docs/ # legacy 12-lesson docs, kept during transition
|
||||
web/ # currently renders the legacy docs/ track
|
||||
tests/
|
||||
```
|
||||
|
||||
|
|
|
|||
|
|
@ -4,9 +4,9 @@
|
|||
|
||||
s01 → s02 → s03 → `s04` → [s05](../s05_todo_write/) → s06 → ... → s20
|
||||
|
||||
> *"挂在循环上, 不写进循环里"* — 钩子在工具执行前后注入扩展逻辑。
|
||||
> *"挂在循环上, 不写进循环里"* — hook 在工具执行前后注入扩展逻辑。
|
||||
>
|
||||
> **Harness 层**: 钩子 — 扩展点不侵入循环。
|
||||
> **Harness 层**: hook — 扩展点不侵入循环。
|
||||
|
||||
---
|
||||
|
||||
|
|
@ -38,7 +38,7 @@ def agent_loop(messages):
|
|||
|
||||

|
||||
|
||||
s03 的循环和权限逻辑完全保留。唯一的变动是把 `check_permission()` 从循环体内移到了钩子上,循环不再直接调用任何检查函数,改为 `trigger_hooks("PreToolUse", block)`,由注册表决定跑什么。
|
||||
s03 的循环和权限逻辑完全保留。唯一的变动是把 `check_permission()` 从循环体内移到了 hook 上,循环不再直接调用任何检查函数,改为 `trigger_hooks("PreToolUse", block)`,由注册表决定跑什么。
|
||||
|
||||
四个事件,覆盖一个完整的 agent cycle:
|
||||
|
||||
|
|
@ -55,7 +55,7 @@ s03 的循环和权限逻辑完全保留。唯一的变动是把 `check_permissi
|
|||
|
||||
## 工作原理
|
||||
|
||||
**钩子注册表**:一个字典,事件名映射到回调列表。
|
||||
**hook 注册表**:一个字典,事件名映射到回调列表。
|
||||
|
||||
```python
|
||||
HOOKS = {
|
||||
|
|
@ -71,12 +71,12 @@ def register_hook(event: str, callback):
|
|||
def trigger_hooks(event: str, *args):
|
||||
for callback in HOOKS[event]:
|
||||
result = callback(*args)
|
||||
if result is not None: # 返回值 ≠ None → 钩子说"停"
|
||||
if result is not None: # 返回值 ≠ None → hook 说"停"
|
||||
return result
|
||||
return None
|
||||
```
|
||||
|
||||
教学版中,PreToolUse 返回非 None 表示阻止执行,Stop 返回非 None 表示强制续跑。UserPromptSubmit 和 PostToolUse 的返回值未被使用。
|
||||
教学版中,PreToolUse 的非 None 返回值会阻止本次工具执行,Stop 的非 None 返回值会强制续跑。UserPromptSubmit 和 PostToolUse 的返回值未被使用。
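To make the return-value convention concrete, here is a minimal, self-contained sketch of a PreToolUse hook blocking one tool call. It follows the `register_hook` / `trigger_hooks` shape shown above, but `deny_rm`, `log_only`, and `run_tool` are hypothetical helpers added for illustration, and tools are reduced to plain command strings.

```python
from collections import defaultdict

# Registry in the spirit of the tutorial's HOOKS dict: event name -> callbacks.
HOOKS = defaultdict(list)


def register_hook(event, callback):
    HOOKS[event].append(callback)


def trigger_hooks(event, *args):
    """Run callbacks in registration order; the first non-None result wins."""
    for callback in HOOKS[event]:
        result = callback(*args)
        if result is not None:
            return result
    return None


def log_only(command):
    # An observing hook: returning None lets execution continue.
    return None


def deny_rm(command):
    # Illustrative PreToolUse hook: block any command that contains "rm ".
    if "rm " in command:
        return "blocked: rm is not allowed"
    return None


register_hook("PreToolUse", log_only)
register_hook("PreToolUse", deny_rm)


def run_tool(command):
    """Hypothetical stand-in for tool execution guarded by PreToolUse hooks."""
    blocked = trigger_hooks("PreToolUse", command)
    if blocked is not None:
        return blocked  # the hook's message replaces the tool output
    return f"executed: {command}"
```

Note that the sketch compares against `None` explicitly, so a hook returning an empty string would still block; whether to treat falsy values that way is a design choice.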
|
||||
|
||||
**UserPromptSubmit**,用户输入提交后、进入 LLM 前触发。CC 中可以拦截或修改输入,教学版只做日志演示:
|
||||
|
||||
|
|
@ -98,10 +98,10 @@ history.append({"role": "user", "content": query})
|
|||
agent_loop(history)
|
||||
```
|
||||
|
||||
**PreToolUse / PostToolUse**,工具执行前后的钩子。s03 的权限检查逻辑现在包装成 PreToolUse 钩子,再加一个日志钩子和一个大输出提醒:
|
||||
**PreToolUse / PostToolUse**,工具执行前后的 hook。s03 的权限检查逻辑现在包装成 PreToolUse hook,再加一个日志 hook 和一个大输出提醒:
|
||||
|
||||
```python
|
||||
# PreToolUse: 权限检查(s03 的逻辑,从循环移到钩子)
|
||||
# PreToolUse: 权限检查(s03 的逻辑,从循环移到 hook)
|
||||
def permission_hook(block):
|
||||
if block.name == "bash":
|
||||
for pattern in DENY_LIST:
|
||||
|
|
@ -163,7 +163,7 @@ for block in response.content:
|
|||
continue
|
||||
|
||||
# s03: if not check_permission(block): ...
|
||||
# s04: 钩子替代硬编码
|
||||
# s04: hook 替代硬编码
|
||||
blocked = trigger_hooks("PreToolUse", block)
|
||||
if blocked:
|
||||
results.append({"type": "tool_result", "tool_use_id": block.id,
|
||||
|
|
@ -179,7 +179,7 @@ for block in response.content:
|
|||
"content": output})
|
||||
```
|
||||
|
||||
四个钩子覆盖了 agent cycle 的关键节点:输入→执行前→执行后→退出。循环只负责调用 trigger_hooks(),具体逻辑全在钩子回调里。
|
||||
四个 hook 覆盖了 agent cycle 的关键节点:输入→执行前→执行后→退出。循环只负责调用 trigger_hooks(),具体逻辑全在 hook 回调里。
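The four trigger points can be traced end to end with a toy cycle. This is a sketch under stated assumptions: `fake_model` stands in for the LLM call, the `"tool"` message role and the single `echo` tool are simplifications, and `events_seen` exists only to record the order in which events fire.

```python
HOOKS = {"UserPromptSubmit": [], "PreToolUse": [], "PostToolUse": [], "Stop": []}
events_seen = []  # records the order in which events fire


def trigger_hooks(event, *args):
    events_seen.append(event)
    for callback in HOOKS[event]:
        result = callback(*args)
        if result is not None:
            return result
    return None


def fake_model(messages):
    # Stand-in for the LLM: request one tool call, then finish.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "echo", "input": "hello"}
    return {"tool": None, "text": "done"}


def agent_cycle(query):
    trigger_hooks("UserPromptSubmit", query)           # 1. input
    messages = [{"role": "user", "content": query}]
    while True:
        reply = fake_model(messages)
        if reply["tool"] is None:
            if trigger_hooks("Stop", messages) is not None:  # 4. exit
                continue  # a Stop hook asked to keep going; it must relent eventually
            return reply["text"]
        blocked = trigger_hooks("PreToolUse", reply)   # 2. before execution
        if blocked is not None:
            output = blocked
        else:
            output = reply["input"]                    # the "echo" tool returns its input
            trigger_hooks("PostToolUse", reply, output)  # 3. after execution
        messages.append({"role": "tool", "content": output})
```

With no hooks registered, one call to `agent_cycle` fires the events in the order input, pre-execution, post-execution, exit, which matches the sequence described above.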
|
||||
|
||||
---
|
||||
|
||||
|
|
@ -189,7 +189,7 @@ for block in response.content:
|
|||
|------|-----------|-----------|
|
||||
| 扩展方式 | check_permission() 硬编码在循环里 | HOOKS 注册表 + trigger_hooks() |
|
||||
| 新函数 | — | register_hook, trigger_hooks |
|
||||
| 钩子回调 | — | context_inject_hook, permission_hook, log_hook, large_output_hook, summary_hook |
|
||||
| hook 回调 | — | context_inject_hook, permission_hook, log_hook, large_output_hook, summary_hook |
|
||||
| 循环 | 直接调用 check_permission() | 调用 trigger_hooks("PreToolUse", ...) |
|
||||
| 退出控制 | 无 | trigger_hooks("Stop", ...) 可阻止退出 |
|
||||
| 输入拦截 | 无 | trigger_hooks("UserPromptSubmit", ...) 可注入上下文 |
|
||||
|
|
@ -205,11 +205,11 @@ python s04_hooks/code.py
|
|||
|
||||
试试这些 prompt:
|
||||
|
||||
1. `Read the file README.md`(应该直接通过,观察钩子日志)
|
||||
1. `Read the file README.md`(应该直接通过,观察 hook 日志)
|
||||
2. `Create a file called test.txt`(通过后观察 PostToolUse 是否触发)
|
||||
3. `Delete all temporary files in /tmp`(bash + rm 触发权限钩子)
|
||||
3. `Delete all temporary files in /tmp`(bash + rm 触发权限 hook)
|
||||
|
||||
观察重点:每次工具执行前,是否出现了 `[HOOK]` 日志?权限被拒时,是钩子拦截的还是循环里硬编码的?
|
||||
观察重点:每次工具执行前,是否出现了 `[HOOK]` 日志?权限被拒时,是 hook 拦截的还是循环里硬编码的?
|
||||
|
||||
---
|
||||
|
||||
|
|
@ -251,16 +251,16 @@ CC 的 `HookResult`(`types/hooks.ts:260-275`)有 14 个字段,以下是常
|
|||
| `outcome` | success/blocking/non_blocking_error/cancelled | 执行结果 |
|
||||
| `preventContinuation` | boolean | 阻止后续执行 |
|
||||
| `stopReason` | string | 停止原因描述 |
|
||||
| `permissionBehavior` | allow/deny/ask/passthrough | 钩子返回权限决策 |
|
||||
| `permissionBehavior` | allow/deny/ask/passthrough | hook 返回权限决策 |
|
||||
| `updatedInput` | Record | 修改工具输入 |
|
||||
| `additionalContext` | string | 附加上下文 |
|
||||
| `updatedMCPToolOutput` | unknown | MCP 工具输出修改 |
|
||||
|
||||
### 三、关键不变式:Hook 'allow' 不能绕过 deny/ask 规则
|
||||
|
||||
这是 CC 权限系统最重要的安全设计(`toolHooks.ts:325-331`):**钩子返回 allow 时,仍然要检查 settings.json 的 deny/ask 规则**。即使用户的钩子脚本说"允许",如果在 settings.json 中禁用了这个工具,操作仍然会被阻止。
|
||||
这是 CC 权限系统最重要的安全设计(`toolHooks.ts:325-331`):**hook 返回 allow 时,仍然要检查 settings.json 的 deny/ask 规则**。即使用户的 hook 脚本说"允许",如果在 settings.json 中禁用了这个工具,操作仍然会被阻止。
|
||||
|
||||
教学版没有这个层次,钩子返回非 None 就直接中断。这在教学场景中够了,但在生产环境中会形成安全漏洞。
|
||||
教学版没有这个层次,只把 PreToolUse 的非 None 返回值解释为阻止本次工具执行。这在教学场景中够了,但在生产环境中会形成安全漏洞。
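To make the invariant concrete, here is a rough sketch of how a hook decision could be combined with settings-level rules. The function name `resolve_permission`, the set-based rules, and the exact precedence order are assumptions for illustration; they are not taken from `toolHooks.ts`.

```python
def resolve_permission(hook_decision, tool_name, deny_rules, ask_rules):
    """Combine a hook's decision with settings-level rules.

    Assumption: rules are plain sets of tool names; real rule matching is richer.
    """
    if tool_name in deny_rules:
        return "deny"  # a settings deny wins, even over a hook 'allow'
    if hook_decision == "deny":
        return "deny"
    if tool_name in ask_rules:
        return "ask"  # a hook 'allow' cannot skip a required confirmation
    if hook_decision in ("allow", "ask"):
        return hook_decision
    return "passthrough"  # no opinion: fall back to the normal permission flow
```

The point of the ordering is that a hook can tighten a decision but cannot loosen what the settings layer forbids.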
|
||||
|
||||
### 四、stopHookActive 机制
|
||||
|
||||
|
|
@ -268,12 +268,12 @@ CC 的 Stop hooks 有一个防无限循环机制(`query.ts:212,1300`):`sto
|
|||
|
||||
### 五、hook_stopped_continuation
|
||||
|
||||
PostToolUse hooks 返回 `preventContinuation: true` 时,会产生一个 `hook_stopped_continuation` 附件(`toolHooks.ts:117-130`)。query.ts(L1388-1393)检测到后设置 `shouldPreventContinuation = true`,循环退出。这是"钩子优雅地让 Agent 停机"的机制——不是崩溃,是完成。
|
||||
PostToolUse hooks 返回 `preventContinuation: true` 时,会产生一个 `hook_stopped_continuation` 附件(`toolHooks.ts:117-130`)。query.ts(L1388-1393)检测到后设置 `shouldPreventContinuation = true`,循环退出。这是 "hook 优雅地让 Agent 停机" 的机制,不是崩溃,是完成。
|
||||
|
||||
### 教学版的简化是刻意的
|
||||
|
||||
- 27 个事件 → 4 个(UserPromptSubmit/PreToolUse/PostToolUse/Stop):覆盖 agent cycle 关键节点
|
||||
- 14 个字段 → 简单的返回值(None = 继续,非 None = 中断/续跑):心智负担降到最低
|
||||
- 14 个字段 → 简单的返回值(None = 继续,非 None = 阻止/续跑):心智负担降到最低
|
||||
- Hook allow vs deny/ask 不变式 → 省略:教学版没有 settings.json 层
|
||||
- stopHookActive → 省略:教学版 Stop hook 只做简单续跑,不涉及防无限循环机制
|
||||
|
||||
|
|
|
|||
|
|
@ -10,7 +10,7 @@ s04: Hooks — move extension logic out of the loop, onto hooks.
|
|||
└────────┬─────────┘
|
||||
▼
|
||||
┌────────────┐ ┌─────────────────────────────┐
|
||||
│ messages │────▶│ LLM (stop_reason?) │
|
||||
│ messages │────▶│ LLM (stop_reason=tool_use?)│
|
||||
└────────────┘ │ No ──▶ Stop hooks ──▶ exit │
|
||||
│ Yes ──▶ tool_use block ──┐ │
|
||||
└────────────────────────────┘ │
|
||||
|
|
@ -164,7 +164,7 @@ def register_hook(event: str, callback):
|
|||
def trigger_hooks(event: str, *args):
|
||||
for callback in HOOKS[event]:
|
||||
result = callback(*args)
|
||||
if result is not None: # non-None return → abort
|
||||
if result is not None: # teaching shortcut: block this tool call
|
||||
return result
|
||||
return None
|
||||
|
||||
|
|
|
|||
|
|
@ -39,7 +39,7 @@
|
|||
<!-- ② LLM -->
|
||||
<rect x="200" y="108" width="120" height="64" rx="8" fill="#fff" stroke="#2563eb" stroke-width="1.5"/>
|
||||
<text x="260" y="134" fill="#1e3a5f" font-size="14" font-weight="700" text-anchor="middle">LLM</text>
|
||||
<text x="260" y="154" fill="#64748b" font-size="10" text-anchor="middle">stop_reason?</text>
|
||||
<text x="260" y="154" fill="#64748b" font-size="10" text-anchor="middle">stop_reason=tool_use?</text>
|
||||
|
||||
<!-- LLM No → Return -->
|
||||
<line x1="260" y1="172" x2="260" y2="200" stroke="#16a34a" stroke-width="2" marker-end="url(#arrow)"/>
|
||||
|
|
@ -57,12 +57,12 @@
|
|||
<text x="460" y="132" fill="#166534" font-size="9" font-weight="600" text-anchor="middle">PreToolUse</text>
|
||||
<rect x="396" y="140" width="128" height="18" rx="3" fill="#dcfce7" stroke="#16a34a" stroke-width="0.8"/>
|
||||
<text x="460" y="153" fill="#166534" font-size="8" text-anchor="middle">permission_hook · log_hook</text>
|
||||
<text x="460" y="176" fill="#64748b" font-size="8" text-anchor="middle">return non-None → block</text>
|
||||
<text x="460" y="176" fill="#64748b" font-size="8" text-anchor="middle">Teaching: non-None → block</text>
|
||||
|
||||
<!-- PreToolUse Block → branch down -->
|
||||
<line x1="460" y1="184" x2="460" y2="218" stroke="#dc2626" stroke-width="2" marker-end="url(#arrow-red)"/>
|
||||
<rect x="405" y="220" width="110" height="24" rx="12" fill="#fef2f2" stroke="#dc2626" stroke-width="1.5"/>
|
||||
<text x="460" y="236" fill="#991b1b" font-size="10" font-weight="600" text-anchor="middle">Skip Execution</text>
|
||||
<text x="460" y="236" fill="#991b1b" font-size="10" font-weight="600" text-anchor="middle">Write tool_result</text>
|
||||
|
||||
<!-- PreToolUse Pass → TOOL_HANDLERS -->
|
||||
<line x1="540" y1="140" x2="588" y2="140" stroke="#16a34a" stroke-width="2" marker-end="url(#arrow-green)"/>
|
||||
|
|
|
|||
|
|
|
@ -39,7 +39,7 @@
|
|||
<!-- ② LLM -->
|
||||
<rect x="200" y="108" width="120" height="64" rx="8" fill="#fff" stroke="#2563eb" stroke-width="1.5"/>
|
||||
<text x="260" y="134" fill="#1e3a5f" font-size="14" font-weight="700" text-anchor="middle">LLM</text>
|
||||
<text x="260" y="154" fill="#64748b" font-size="10" text-anchor="middle">stop_reason?</text>
|
||||
<text x="260" y="154" fill="#64748b" font-size="10" text-anchor="middle">stop_reason=tool_use?</text>
|
||||
|
||||
<!-- LLM No → 返却 -->
|
||||
<line x1="260" y1="172" x2="260" y2="200" stroke="#16a34a" stroke-width="2" marker-end="url(#arrow)"/>
|
||||
|
|
@ -57,12 +57,12 @@
|
|||
<text x="460" y="132" fill="#166534" font-size="9" font-weight="600" text-anchor="middle">PreToolUse</text>
|
||||
<rect x="396" y="140" width="128" height="18" rx="3" fill="#dcfce7" stroke="#16a34a" stroke-width="0.8"/>
|
||||
<text x="460" y="153" fill="#166534" font-size="8" text-anchor="middle">permission_hook · log_hook</text>
|
||||
<text x="460" y="176" fill="#64748b" font-size="8" text-anchor="middle">非 None を返す → 中断</text>
|
||||
<text x="460" y="176" fill="#64748b" font-size="8" text-anchor="middle">教育版: 非 None → ブロック</text>
|
||||
|
||||
<!-- PreToolUse 中断 → 下に分岐 -->
|
||||
<line x1="460" y1="184" x2="460" y2="218" stroke="#dc2626" stroke-width="2" marker-end="url(#arrow-red)"/>
|
||||
<rect x="405" y="220" width="110" height="24" rx="12" fill="#fef2f2" stroke="#dc2626" stroke-width="1.5"/>
|
||||
<text x="460" y="236" fill="#991b1b" font-size="10" font-weight="600" text-anchor="middle">実行をスキップ</text>
|
||||
<text x="460" y="236" fill="#991b1b" font-size="10" font-weight="600" text-anchor="middle">tool_result に返す</text>
|
||||
|
||||
<!-- PreToolUse 通過 → TOOL_HANDLERS -->
|
||||
<line x1="540" y1="140" x2="588" y2="140" stroke="#16a34a" stroke-width="2" marker-end="url(#arrow-green)"/>
|
||||
|
|
|
|||
|
|
|
@ -39,7 +39,7 @@
|
|||
<!-- ② LLM -->
|
||||
<rect x="200" y="108" width="120" height="64" rx="8" fill="#fff" stroke="#2563eb" stroke-width="1.5"/>
|
||||
<text x="260" y="134" fill="#1e3a5f" font-size="14" font-weight="700" text-anchor="middle">LLM</text>
|
||||
<text x="260" y="154" fill="#64748b" font-size="10" text-anchor="middle">stop_reason?</text>
|
||||
<text x="260" y="154" fill="#64748b" font-size="10" text-anchor="middle">stop_reason=tool_use?</text>
|
||||
|
||||
<!-- LLM 否 → 返回 -->
|
||||
<line x1="260" y1="172" x2="260" y2="200" stroke="#16a34a" stroke-width="2" marker-end="url(#arrow)"/>
|
||||
|
|
@ -51,18 +51,18 @@
|
|||
<line x1="320" y1="140" x2="378" y2="140" stroke="#555" stroke-width="2" marker-end="url(#arrow)"/>
|
||||
<text x="345" y="132" fill="#d97706" font-size="10" font-weight="600">是</text>
|
||||
|
||||
<!-- ③ PreToolUse 钩子(s04 新增) -->
|
||||
<!-- ③ PreToolUse hook(s04 新增) -->
|
||||
<rect x="380" y="96" width="160" height="88" rx="10" fill="#f0fdf4" stroke="#16a34a" stroke-width="2" stroke-dasharray="6,3"/>
|
||||
<text x="460" y="116" fill="#166534" font-size="11" font-weight="700" text-anchor="middle">trigger_hooks()</text>
|
||||
<text x="460" y="132" fill="#166534" font-size="9" font-weight="600" text-anchor="middle">PreToolUse</text>
|
||||
<rect x="396" y="140" width="128" height="18" rx="3" fill="#dcfce7" stroke="#16a34a" stroke-width="0.8"/>
|
||||
<text x="460" y="153" fill="#166534" font-size="8" text-anchor="middle">permission_hook · log_hook</text>
|
||||
<text x="460" y="176" fill="#64748b" font-size="8" text-anchor="middle">返回非 None → 中断</text>
|
||||
<text x="460" y="176" fill="#64748b" font-size="8" text-anchor="middle">教学版:非 None → 阻止</text>
|
||||
|
||||
<!-- PreToolUse 中断 → 向下引出 -->
|
||||
<!-- PreToolUse 阻止 → 向下引出 -->
|
||||
<line x1="460" y1="184" x2="460" y2="218" stroke="#dc2626" stroke-width="2" marker-end="url(#arrow-red)"/>
|
||||
<rect x="405" y="220" width="110" height="24" rx="12" fill="#fef2f2" stroke="#dc2626" stroke-width="1.5"/>
|
||||
<text x="460" y="236" fill="#991b1b" font-size="10" font-weight="600" text-anchor="middle">跳过执行</text>
|
||||
<text x="460" y="236" fill="#991b1b" font-size="10" font-weight="600" text-anchor="middle">写入 tool_result</text>
|
||||
|
||||
<!-- PreToolUse 通过 → TOOL_HANDLERS -->
|
||||
<line x1="540" y1="140" x2="588" y2="140" stroke="#16a34a" stroke-width="2" marker-end="url(#arrow-green)"/>
|
||||
|
|
@ -78,7 +78,7 @@
|
|||
<line x1="640" y1="172" x2="640" y2="268" stroke="#16a34a" stroke-width="2"/>
|
||||
<text x="648" y="224" fill="#16a34a" font-size="9" font-weight="600">执行后</text>
|
||||
|
||||
<!-- ⑤ PostToolUse 钩子(s04 新增) -->
|
||||
<!-- ⑤ PostToolUse hook(s04 新增) -->
|
||||
<rect x="560" y="270" width="160" height="56" rx="10" fill="#f0fdf4" stroke="#16a34a" stroke-width="2" stroke-dasharray="6,3"/>
|
||||
<text x="640" y="290" fill="#166534" font-size="11" font-weight="700" text-anchor="middle">trigger_hooks()</text>
|
||||
<text x="640" y="306" fill="#166534" font-size="9" font-weight="600" text-anchor="middle">PostToolUse</text>
|
||||
|
|
|
|||
|
|
|
@ -83,15 +83,14 @@ TOOL_HANDLERS["todo_write"] = run_todo_write
|
|||
|
||||
```python
|
||||
if rounds_since_todo >= 3 and messages:
|
||||
last = messages[-1]
|
||||
if last["role"] == "user" and isinstance(last.get("content"), list):
|
||||
last["content"].insert(0, {
|
||||
"type": "text",
|
||||
"text": "<reminder>Update your todos.</reminder>",
|
||||
})
|
||||
messages.append({
|
||||
"role": "user",
|
||||
"content": "<reminder>Update your todos.</reminder>",
|
||||
})
|
||||
rounds_since_todo = 0
|
||||
```
|
||||
|
||||
Typical flow when the Agent receives a task: first call `todo_write` to list all steps (all `pending`) → pick one step, set it to `in_progress` → complete it, set to `completed` → look at the next `pending` → continue. After 3 rounds without updates, the reminder prompts the Agent to update TODO status.
|
||||
Typical flow when the Agent receives a task: first call `todo_write` to list all steps (all `pending`) → pick one step, set it to `in_progress` → complete it, set to `completed` → look at the next `pending` → continue. After 3 rounds without `todo_write`, the loop appends a reminder before the next LLM call.
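The reminder rule can be sketched as two small helpers. This assumes the counter is tracked per loop round; the helper names `maybe_remind` and `update_counter` and the `threshold` parameter are hypothetical, introduced only to make the snippet testable on its own.

```python
REMINDER = "<reminder>Update your todos.</reminder>"


def maybe_remind(messages, rounds_since_todo, threshold=3):
    """Append a reminder when todo_write has not been called for `threshold` rounds.

    Returns the updated counter (reset to 0 once a reminder is appended).
    """
    if rounds_since_todo >= threshold and messages:
        messages.append({"role": "user", "content": REMINDER})
        return 0
    return rounds_since_todo


def update_counter(rounds_since_todo, tool_names):
    """Reset when this round used todo_write, otherwise count one more round."""
    return 0 if "todo_write" in tool_names else rounds_since_todo + 1
```

Calling `maybe_remind` just before each LLM call keeps the nag logic out of the tool handlers.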
|
||||
|
||||
**Key insight**: todo_write doesn't give the Agent any additional **execution capability**. What it adds is **planning capability**.
|
||||
|
||||
|
|
@ -117,11 +116,11 @@ python s05_todo_write/code.py
|
|||
|
||||
Try these prompts:
|
||||
|
||||
1. `Refactor the file hello.py: add type hints, docstrings, and a main guard` (should list 3 steps first, then execute)
|
||||
2. `Create a Python package with __init__.py, utils.py, and tests/test_utils.py`
|
||||
3. `Review all Python files and fix any style issues`
|
||||
1. `Refactor s05_todo_write/example/hello.py: add type hints, docstrings, and a main guard` (should list 3 steps first, then execute)
|
||||
2. `Create a Python package under s05_todo_write/example/demo_pkg with __init__.py, utils.py, and tests/test_utils.py`
|
||||
3. `Review Python files under s05_todo_write/example and fix any style issues`
|
||||
|
||||
What to watch for: Did the Agent call `todo_write` first? How many steps did it list? Did it go back to update TODO status during execution? Did the nag reminder appear after 3 rounds without updates?
|
||||
What to watch for: Was the first tool call `todo_write`? How many TODO steps were listed? Did statuses move from `pending` to `in_progress` / `completed` during execution?
|
||||
|
||||
---
|
||||
|
||||
|
|
|
|||
|
|
@ -83,15 +83,14 @@ TOOL_HANDLERS["todo_write"] = run_todo_write
|
|||
|
||||
```python
|
||||
if rounds_since_todo >= 3 and messages:
|
||||
last = messages[-1]
|
||||
if last["role"] == "user" and isinstance(last.get("content"), list):
|
||||
last["content"].insert(0, {
|
||||
"type": "text",
|
||||
"text": "<reminder>Update your todos.</reminder>",
|
||||
})
|
||||
messages.append({
|
||||
"role": "user",
|
||||
"content": "<reminder>Update your todos.</reminder>",
|
||||
})
|
||||
rounds_since_todo = 0
|
||||
```
|
||||
|
||||
Agent がタスクを受け取った後の典型的な流れ:まず `todo_write` を呼び出して全手順を列挙(全て `pending`)→ 一つの手順に取り掛かり、`in_progress` に変更 → 完了したら `completed` に変更 → 次の `pending` を見る → 続行。3 ラウンド更新なしの場合、リマインダーが TODO の更新を促す。
|
||||
Agent がタスクを受け取った後の典型的な流れ:まず `todo_write` を呼び出して全手順を列挙(全て `pending`)→ 一つの手順に取り掛かり、`in_progress` に変更 → 完了したら `completed` に変更 → 次の `pending` を見る → 続行。3 ラウンド `todo_write` がない場合、次の LLM 呼び出し前にリマインダーが追加される。
|
||||
|
||||
**重要な洞察**:todo_write は Agent に**実行能力**を何も追加しない。追加するのは**計画能力**だ。
|
||||
|
||||
|
|
@ -117,11 +116,11 @@ python s05_todo_write/code.py
|
|||
|
||||
以下のプロンプトを試してみよう:
|
||||
|
||||
1. `Refactor the file hello.py: add type hints, docstrings, and a main guard`(まず 3 手順を列挙してから実行するはず)
|
||||
2. `Create a Python package with __init__.py, utils.py, and tests/test_utils.py`
|
||||
3. `Review all Python files and fix any style issues`
|
||||
1. `Refactor s05_todo_write/example/hello.py: add type hints, docstrings, and a main guard`(まず 3 手順を列挙してから実行するはず)
|
||||
2. `Create a Python package under s05_todo_write/example/demo_pkg with __init__.py, utils.py, and tests/test_utils.py`
|
||||
3. `Review Python files under s05_todo_write/example and fix any style issues`
|
||||
|
||||
観察のポイント:Agent はまず `todo_write` を呼び出したか? 何手順列挙したか? 実行中に TODO のステータスを更新し戻ったか? 3 ラウンド更新なしで Nag リマインダーが表示されたか?
|
||||
観察のポイント:最初のツール呼び出しは `todo_write` か? TODO は何手順列挙されたか? 実行中にステータスが `pending` から `in_progress` / `completed` に変わったか?
|
||||
|
||||
---
|
||||
|
||||
|
|
|
|||
|
|
@ -83,15 +83,14 @@ TOOL_HANDLERS["todo_write"] = run_todo_write
|
|||
|
||||
```python
|
||||
if rounds_since_todo >= 3 and messages:
|
||||
last = messages[-1]
|
||||
if last["role"] == "user" and isinstance(last.get("content"), list):
|
||||
last["content"].insert(0, {
|
||||
"type": "text",
|
||||
"text": "<reminder>Update your todos.</reminder>",
|
||||
})
|
||||
messages.append({
|
||||
"role": "user",
|
||||
"content": "<reminder>Update your todos.</reminder>",
|
||||
})
|
||||
rounds_since_todo = 0
|
||||
```
|
||||
|
||||
Agent 收到任务后的典型流程:先调 `todo_write` 列出所有步骤(全 `pending`)→ 做一个步骤,改成 `in_progress` → 做完改成 `completed` → 看下一个 `pending` → 继续。3 轮不更新时,reminder 会提醒它回头更新 TODO 状态。
|
||||
Agent 收到任务后的典型流程:先调 `todo_write` 列出所有步骤(全 `pending`)→ 做一个步骤,改成 `in_progress` → 做完改成 `completed` → 看下一个 `pending` → 继续。连续 3 轮没有调用 `todo_write` 时,循环会在下一次 LLM 调用前追加一条 reminder。
|
||||
|
||||
**关键洞察**:todo_write 不给 Agent 增加任何**执行能力**。它增加的是**规划能力**。
|
||||
|
||||
|
|
@ -117,11 +116,11 @@ python s05_todo_write/code.py
|
|||
|
||||
试试这些 prompt:
|
||||
|
||||
1. `Refactor the file hello.py: add type hints, docstrings, and a main guard`(先列 3 步再执行)
|
||||
2. `Create a Python package with __init__.py, utils.py, and tests/test_utils.py`
|
||||
3. `Review all Python files and fix any style issues`
|
||||
1. `Refactor s05_todo_write/example/hello.py: add type hints, docstrings, and a main guard`(先列 3 步再执行)
|
||||
2. `Create a Python package under s05_todo_write/example/demo_pkg with __init__.py, utils.py, and tests/test_utils.py`
|
||||
3. `Review Python files under s05_todo_write/example and fix any style issues`
|
||||
|
||||
观察重点:Agent 先调了 `todo_write` 吗?它列了几个步骤?执行过程中有没有回头更新 TODO 状态?连续 3 轮不更新时是否出现了 nag reminder?
|
||||
观察重点:第一次工具调用是不是 `todo_write`?TODO 列了几步?执行过程中状态有没有从 `pending` 变成 `in_progress` / `completed`?
|
||||
|
||||
---
|
||||
|
||||
|
|
|
|||
6
s05_todo_write/example/hello.py
Normal file
|
|
@ -0,0 +1,6 @@
|
|||
def greet(name):
|
||||
message = "Hello, " + name
|
||||
print(message)
|
||||
|
||||
|
||||
greet("Claude")
|
||||
|
|
@ -36,7 +36,7 @@
|
|||
<!-- LLM -->
|
||||
<rect x="190" y="86" width="110" height="52" rx="8" fill="#fff" stroke="#2563eb" stroke-width="1.5"/>
|
||||
<text x="245" y="108" fill="#1e3a5f" font-size="13" font-weight="700" text-anchor="middle">LLM</text>
|
||||
<text x="245" y="126" fill="#64748b" font-size="9" text-anchor="middle">stop_reason?</text>
|
||||
<text x="245" y="126" fill="#64748b" font-size="9" text-anchor="middle">stop_reason=tool_use?</text>
|
||||
|
||||
<!-- No → Return -->
|
||||
<line x1="245" y1="138" x2="245" y2="162" stroke="#16a34a" stroke-width="2" marker-end="url(#arrow)"/>
|
||||
|
|
|
|||
|
|
|
@ -36,7 +36,7 @@
|
|||
<!-- LLM -->
|
||||
<rect x="190" y="86" width="110" height="52" rx="8" fill="#fff" stroke="#2563eb" stroke-width="1.5"/>
|
||||
<text x="245" y="108" fill="#1e3a5f" font-size="13" font-weight="700" text-anchor="middle">LLM</text>
|
||||
<text x="245" y="126" fill="#64748b" font-size="9" text-anchor="middle">stop_reason?</text>
|
||||
<text x="245" y="126" fill="#64748b" font-size="9" text-anchor="middle">stop_reason=tool_use?</text>
|
||||
|
||||
<!-- No → 返却 -->
|
||||
<line x1="245" y1="138" x2="245" y2="162" stroke="#16a34a" stroke-width="2" marker-end="url(#arrow)"/>
|
||||
|
|
|
|||
|
|
|
@ -36,7 +36,7 @@
|
|||
<!-- LLM -->
|
||||
<rect x="190" y="86" width="110" height="52" rx="8" fill="#fff" stroke="#2563eb" stroke-width="1.5"/>
|
||||
<text x="245" y="108" fill="#1e3a5f" font-size="13" font-weight="700" text-anchor="middle">LLM</text>
|
||||
<text x="245" y="126" fill="#64748b" font-size="9" text-anchor="middle">stop_reason?</text>
|
||||
<text x="245" y="126" fill="#64748b" font-size="9" text-anchor="middle">stop_reason=tool_use?</text>
|
||||
|
||||
<!-- 否 → 返回 -->
|
||||
<line x1="245" y1="138" x2="245" y2="162" stroke="#16a34a" stroke-width="2" marker-end="url(#arrow)"/>
|
||||
|
|
|
|||
|
|
|
@ -119,9 +119,9 @@ Try these prompts:
|
|||
|
||||
1. `Use a subtask to find what testing framework this project uses` (sub-Agent reads files, main Agent receives only the conclusion)
|
||||
2. `Delegate: read all .py files in agents/ and summarize what each one does`
|
||||
3. `Use a task to create a new module, then verify it from here`
|
||||
3. `Use a task to create s06_subagent/example/string_tools.py with a slugify(text: str) function, then verify it from the parent agent`
|
||||
|
||||
What to watch for: Does the Agent spawn a sub-Agent to read files? Do the sub-Agent's intermediate steps appear in the main conversation? Does the final conclusion include the file contents that the sub-Agent read?
|
||||
What to watch for: Do `[Subagent spawned]` / `[Subagent done]` appear? Do sub-Agent tool calls print as `[sub] ...`? Does the parent Agent continue with only the summary returned by the sub-Agent?
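The isolation being observed here can be sketched as follows. `run_turn` is a hypothetical callable standing in for one LLM call plus tool execution, and `max_turns` is an assumed safety limit; the real `spawn_subagent()` in this chapter may differ in both.

```python
def spawn_subagent(task, run_turn, max_turns=5):
    """Run a sub-agent on its own message list and return only its final text.

    `run_turn(messages)` is assumed to return (reply_text, done).
    """
    print("[Subagent spawned]")
    sub_messages = [{"role": "user", "content": task}]  # fresh context, not shared
    summary = ""
    for _ in range(max_turns):
        reply, done = run_turn(sub_messages)
        sub_messages.append({"role": "assistant", "content": reply})
        summary = reply
        if done:
            break
    print("[Subagent done]")
    return summary  # intermediate messages are dropped along with sub_messages
```

The parent only ever sees the returned string, so however many files the sub-agent reads, the parent's `messages[]` grows by a single entry.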
|
||||
|
||||
---
|
||||
|
||||
|
|
|
|||
|
|
@ -119,9 +119,9 @@ python s06_subagent/code.py
|
|||
|
||||
1. `Use a subtask to find what testing framework this project uses`(サブエージェントがファイルを読み、メイン Agent は結論のみ受け取る)
|
||||
2. `Delegate: read all .py files in agents/ and summarize what each one does`
|
||||
3. `Use a task to create a new module, then verify it from here`
|
||||
3. `Use a task to create s06_subagent/example/string_tools.py with a slugify(text: str) function, then verify it from the parent agent`
|
||||
|
||||
観察のポイント:Agent はサブエージェントを spawn してファイルを読みに行くか? サブエージェントの中間過程はメイン会話に現れるか? 最後に返された結論に、サブエージェントが読んだファイルの内容は含まれているか?
|
||||
観察のポイント:`[Subagent spawned]` / `[Subagent done]` が表示されるか? サブエージェントのツール呼び出しが `[sub] ...` として出力されるか? 親 Agent はサブエージェントが返した要約だけを受け取って続行するか?
|
||||
|
||||
---
|
||||
|
||||
|
|
|
|||
|
|
@ -123,9 +123,9 @@ python s06_subagent/code.py
|
|||
|
||||
1. `Use a subtask to find what testing framework this project uses`(子 Agent 去读文件,主 Agent 只收结论)
|
||||
2. `Delegate: read all .py files in agents/ and summarize what each one does`
|
||||
3. `Use a task to create a new module, then verify it from here`
|
||||
3. `Use a task to create s06_subagent/example/string_tools.py with a slugify(text: str) function, then verify it from the parent agent`
|
||||
|
||||
观察重点:Agent 会 spawn 子 Agent 去读文件吗?子 Agent 的中间过程是否出现在主对话中?最后返回的结论包含子 Agent 读的那些文件内容吗?
|
||||
观察重点:是否出现 `[Subagent spawned]` / `[Subagent done]`?子 Agent 的工具调用是否以 `[sub] ...` 输出?主 Agent 最后是否只继续处理子 Agent 返回的摘要?
|
||||
|
||||
---
|
||||
|
||||
|
|
|
|||
|
|
@ -111,7 +111,7 @@
|
|||
<rect x="60" y="370" width="680" height="56" rx="8" fill="#f1f5f9"/>
|
||||
|
||||
<rect x="80" y="384" width="16" height="12" rx="3" fill="#f0f4ff" stroke="#2563eb" stroke-width="1"/>
|
||||
<text x="104" y="394" fill="#334155" font-size="10">s05 保留:循环、钩子、todo_write、6 个基础工具</text>
|
||||
<text x="104" y="394" fill="#334155" font-size="10">s05 保留:循环、hook、todo_write、6 个基础工具</text>
|
||||
|
||||
<rect x="80" y="404" width="16" height="12" rx="3" fill="#ede9fe" stroke="#7c3aed" stroke-width="1"/>
|
||||
<text x="104" y="414" fill="#334155" font-size="10">s06 新增:task 工具 + spawn_subagent() — 独立 messages[],只回传摘要</text>
|
||||
|
|
|
|||
|
|
|
@ -125,11 +125,11 @@ python s07_skill_loading/code.py
|
|||
|
||||
Try these prompts:
|
||||
|
||||
1. `What skills are available?` (should answer directly from the SYSTEM prompt catalog, no tool call)
|
||||
2. `Load the code-review skill and follow its instructions` (should call load_skill)
|
||||
1. `What skills are available?`
|
||||
2. `Load the code-review skill and follow its instructions`
|
||||
3. `I need to do a code review -- load the relevant skill first`
|
||||
|
||||
What to watch for: Does the Agent know which skills are available directly from the SYSTEM catalog? Does it proactively call `load_skill` when it needs specific specs? Does the full skill content appear in the system prompt?
|
||||
What to watch for: Does the Agent know available skills from the SYSTEM catalog? Does `[HOOK] load_skill` appear when full instructions are needed? Does the answer use the loaded skill's instructions?
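A small sketch of the catalog-first idea behind these checks. The `SKILLS` dictionary, its contents, and `build_catalog` are hypothetical; only the `load_skill` tool name and the SYSTEM catalog come from this chapter.

```python
SKILLS = {  # hypothetical in-memory skills; real ones would be read from files
    "code-review": {
        "description": "Checklist for reviewing code changes",
        "body": "1. Read the diff.\n2. Check naming and tests.\n3. Summarize findings.",
    },
}


def build_catalog(skills):
    """One line per skill: cheap enough to keep in the system prompt."""
    return "\n".join(
        f"- {name}: {skill['description']}" for name, skill in sorted(skills.items())
    )


def load_skill(name, skills=SKILLS):
    """Tool handler: return the full skill text so it can be sent as a tool_result."""
    skill = skills.get(name)
    if skill is None:
        return f"Unknown skill: {name}"
    return skill["body"]


SYSTEM = "Available skills:\n" + build_catalog(SKILLS)
```

With this shape, the first prompt can be answered from `SYSTEM` alone, while the second and third need a `load_skill` call before the full instructions are in context.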
|
||||
|
||||
---
|
||||
|
||||
|
|
|
|||
|
|
@ -125,11 +125,11 @@ python s07_skill_loading/code.py
|
|||
|
||||
以下のプロンプトを試してみよう:
|
||||
|
||||
1. `What skills are available?`(SYSTEM prompt のカタログから直接回答するはず、ツール呼び出しなし)
|
||||
2. `Load the code-review skill and follow its instructions`(load_skill を呼び出すはず)
|
||||
1. `What skills are available?`
|
||||
2. `Load the code-review skill and follow its instructions`
|
||||
3. `I need to do a code review -- load the relevant skill first`
|
||||
|
||||
観察のポイント:Agent は SYSTEM 内のカタログから利用可能なスキルを知っているか? 具体的な仕様が必要なときに `load_skill` を積極的に呼び出すか? system prompt にスキルの完全な内容が含まれているか?
|
||||
観察のポイント:Agent は SYSTEM 内のカタログから利用可能なスキルを知っているか? 完全な手順が必要なときに `[HOOK] load_skill` が表示されるか? 読み込んだスキルの説明を使って回答しているか?
|
||||
|
||||
---
|
||||
|
||||
|
|
|
|||
|
|
@ -125,11 +125,11 @@ python s07_skill_loading/code.py
|
|||
|
||||
试试这些 prompt:
|
||||
|
||||
1. `What skills are available?`(应该直接从 SYSTEM prompt 里的目录回答,不调工具)
|
||||
2. `Load the code-review skill and follow its instructions`(应该调 load_skill)
|
||||
1. `What skills are available?`
|
||||
2. `Load the code-review skill and follow its instructions`
|
||||
3. `I need to do a code review -- load the relevant skill first`
|
||||
|
||||
观察重点:Agent 是否直接从 SYSTEM 里的目录知道有哪些技能?它在需要具体规范时主动调了 `load_skill` 吗?system prompt 里有没有出现 skill 的完整内容?
|
||||
观察重点:Agent 是否直接从 SYSTEM 里的目录知道有哪些技能?需要完整规范时是否出现 `[HOOK] load_skill`?加载后回答是否使用了对应 skill 的说明?
|
||||
|
||||
---
|
||||
|
||||
|
|
|
|||
|
|
@ -39,7 +39,7 @@
|
|||
<!-- LLM -->
|
||||
<rect x="200" y="106" width="110" height="48" rx="8" fill="#fff" stroke="#2563eb" stroke-width="1.5"/>
|
||||
<text x="255" y="128" fill="#1e3a5f" font-size="13" font-weight="700" text-anchor="middle">LLM</text>
|
||||
<text x="255" y="146" fill="#64748b" font-size="9" text-anchor="middle">stop_reason?</text>
|
||||
<text x="255" y="146" fill="#64748b" font-size="9" text-anchor="middle">stop_reason=tool_use?</text>
|
||||
|
||||
<!-- No → return -->
|
||||
<line x1="255" y1="154" x2="255" y2="178" stroke="#2563eb" stroke-width="2" marker-end="url(#arrow-blue)"/>
|
||||
|
|
|
|||
|
|
|
@ -39,7 +39,7 @@
|
|||
<!-- LLM -->
|
||||
<rect x="200" y="106" width="110" height="48" rx="8" fill="#fff" stroke="#2563eb" stroke-width="1.5"/>
|
||||
<text x="255" y="128" fill="#1e3a5f" font-size="13" font-weight="700" text-anchor="middle">LLM</text>
|
||||
<text x="255" y="146" fill="#64748b" font-size="9" text-anchor="middle">stop_reason?</text>
|
||||
<text x="255" y="146" fill="#64748b" font-size="9" text-anchor="middle">stop_reason=tool_use?</text>
|
||||
|
||||
<!-- No → 戻る -->
|
||||
<line x1="255" y1="154" x2="255" y2="178" stroke="#2563eb" stroke-width="2" marker-end="url(#arrow-blue)"/>
|
||||
|
|
|
|||
|
|
|
@ -39,7 +39,7 @@
|
|||
<!-- LLM -->
|
||||
<rect x="200" y="106" width="110" height="48" rx="8" fill="#fff" stroke="#2563eb" stroke-width="1.5"/>
|
||||
<text x="255" y="128" fill="#1e3a5f" font-size="13" font-weight="700" text-anchor="middle">LLM</text>
|
||||
<text x="255" y="146" fill="#64748b" font-size="9" text-anchor="middle">stop_reason?</text>
|
||||
<text x="255" y="146" fill="#64748b" font-size="9" text-anchor="middle">stop_reason=tool_use?</text>
|
||||
|
||||
<!-- 否 → 返回 -->
|
||||
<line x1="255" y1="154" x2="255" y2="178" stroke="#2563eb" stroke-width="2" marker-end="url(#arrow-blue)"/>
|
||||
|
|
|
|||
|
|
|
@ -85,7 +85,7 @@
|
|||
<!-- ===== ③ LLM ===== -->
|
||||
<rect x="440" y="132" width="100" height="52" rx="8" fill="#f0f4ff" stroke="#2563eb" stroke-width="1.5"/>
|
||||
<text x="490" y="155" fill="#1e3a5f" font-size="14" font-weight="700" text-anchor="middle">LLM</text>
|
||||
<text x="490" y="172" fill="#64748b" font-size="9" text-anchor="middle">stop_reason?</text>
|
||||
<text x="490" y="172" fill="#64748b" font-size="9" text-anchor="middle">stop_reason=tool_use?</text>
|
||||
|
||||
<!-- LLM No → Return -->
|
||||
<line x1="490" y1="184" x2="490" y2="278" stroke="#16a34a" stroke-width="2" marker-end="url(#arrow-green)"/>
|
||||
|
|
|
|||
|
|
|
@ -85,7 +85,7 @@
|
|||
<!-- ===== ③ LLM ===== -->
|
||||
<rect x="440" y="132" width="100" height="52" rx="8" fill="#f0f4ff" stroke="#2563eb" stroke-width="1.5"/>
|
||||
<text x="490" y="155" fill="#1e3a5f" font-size="14" font-weight="700" text-anchor="middle">LLM</text>
|
||||
<text x="490" y="172" fill="#64748b" font-size="9" text-anchor="middle">stop_reason?</text>
|
||||
<text x="490" y="172" fill="#64748b" font-size="9" text-anchor="middle">stop_reason=tool_use?</text>
|
||||
|
||||
<!-- LLM No → 返却 -->
|
||||
<line x1="490" y1="184" x2="490" y2="278" stroke="#16a34a" stroke-width="2" marker-end="url(#arrow-green)"/>
|
||||
|
|
|
|||
|
|
|
@ -85,7 +85,7 @@
|
|||
<!-- ===== ③ LLM ===== -->
|
||||
<rect x="440" y="132" width="100" height="52" rx="8" fill="#f0f4ff" stroke="#2563eb" stroke-width="1.5"/>
|
||||
<text x="490" y="155" fill="#1e3a5f" font-size="14" font-weight="700" text-anchor="middle">LLM</text>
|
||||
<text x="490" y="172" fill="#64748b" font-size="9" text-anchor="middle">stop_reason?</text>
|
||||
<text x="490" y="172" fill="#64748b" font-size="9" text-anchor="middle">stop_reason=tool_use?</text>
|
||||
|
||||
<!-- LLM 否 → 返回 -->
|
||||
<line x1="490" y1="184" x2="490" y2="278" stroke="#16a34a" stroke-width="2" marker-end="url(#arrow-green)"/>
|
||||
|
|
@ -123,7 +123,7 @@
|
|||
<rect x="50" y="390" width="720" height="116" rx="6" fill="#f8fafc" stroke="#e2e8f0" stroke-width="1"/>
|
||||
|
||||
<rect x="70" y="404" width="16" height="12" rx="3" fill="#f0f4ff" stroke="#2563eb" stroke-width="1"/>
|
||||
<text x="94" y="414" fill="#334155" font-size="10">s07 保留:循环、钩子、技能加载、子 Agent</text>
|
||||
<text x="94" y="414" fill="#334155" font-size="10">s07 保留:循环、hook、技能加载、子 Agent</text>
|
||||
|
||||
<rect x="70" y="426" width="16" height="12" rx="3" fill="#fde68a" stroke="#d97706" stroke-width="1"/>
|
||||
<text x="94" y="436" fill="#334155" font-size="10">① 每轮自动:L3→L1→L2 在每次 LLM 调用前无条件执行,0 API</text>
|
||||
|
|
|
|||
|
|
|
@ -89,11 +89,10 @@
|
|||
<text x="55" y="477" fill="#64748b" font-size="11" font-weight="600">Emergency Fallback (triggered when API still returns prompt_too_long)</text>
|
||||
|
||||
<!-- Emergency: reactiveCompact -->
|
||||
<rect x="80" y="492" width="600" height="46" rx="7" fill="url(#emergency)" stroke="#c2410c" stroke-width="1.5"/>
|
||||
<rect x="80" y="492" width="600" height="62" rx="7" fill="url(#emergency)" stroke="#c2410c" stroke-width="1.5"/>
|
||||
<text x="100" y="512" fill="#9a3412" font-size="12" font-weight="600">Emrg</text>
|
||||
<text x="135" y="512" fill="#9a3412" font-size="13" font-weight="700">reactiveCompact</text>
|
||||
<text x="278" y="512" fill="#9a3412" font-size="11">API 413 → byte-level trim, keep last 5 + summary</text>
|
||||
<text x="650" y="512" fill="#9a3412" font-size="10" text-anchor="end">more aggressive</text>
|
||||
<text x="135" y="530" fill="#c2410c" font-size="9">Trigger: API returns prompt_too_long or 413 error</text>
|
||||
<text x="135" y="528" fill="#9a3412" font-size="10">API returns 413 / prompt_too_long → byte-level trim</text>
|
||||
<text x="135" y="544" fill="#c2410c" font-size="9">Keep last 5 + summary; more aggressive than autoCompact</text>
|
||||
|
||||
</svg>
|
||||
|
|
|
|||
|
|
|
@ -89,11 +89,10 @@
|
|||
<text x="55" y="477" fill="#64748b" font-size="11" font-weight="600">緊急フォールバック(API が引き続き prompt_too_long を返す場合にトリガー)</text>
|
||||
|
||||
<!-- 緊急: reactiveCompact -->
|
||||
<rect x="80" y="492" width="600" height="46" rx="7" fill="url(#emergency)" stroke="#c2410c" stroke-width="1.5"/>
|
||||
<rect x="80" y="492" width="600" height="62" rx="7" fill="url(#emergency)" stroke="#c2410c" stroke-width="1.5"/>
|
||||
<text x="100" y="512" fill="#9a3412" font-size="12" font-weight="600">緊急</text>
|
||||
<text x="135" y="512" fill="#9a3412" font-size="13" font-weight="700">reactiveCompact</text>
|
||||
<text x="278" y="512" fill="#9a3412" font-size="11">API が 413 を返す → バイトレベルでトリム、最後の 5 件 + 要約を保持</text>
|
||||
<text x="590" y="512" fill="#9a3412" font-size="10" text-anchor="end">autoCompact より積極的</text>
|
||||
<text x="135" y="530" fill="#c2410c" font-size="9">トリガー:API が prompt_too_long または 413 エラーを返す</text>
|
||||
<text x="135" y="528" fill="#9a3412" font-size="10">API が 413 / prompt_too_long を返す → バイト単位でトリム</text>
|
||||
<text x="135" y="544" fill="#c2410c" font-size="9">最後の 5 件 + 要約を保持、autoCompact より積極的</text>
|
||||
|
||||
</svg>
|
||||
|
|
|
|||
|
|
|
@ -89,11 +89,10 @@
|
|||
<text x="55" y="477" fill="#64748b" font-size="11" font-weight="600">应急兜底(API 仍然返回 prompt_too_long 时触发)</text>
|
||||
|
||||
<!-- 应急: reactiveCompact -->
|
||||
<rect x="80" y="492" width="600" height="46" rx="7" fill="url(#emergency)" stroke="#c2410c" stroke-width="1.5"/>
|
||||
<rect x="80" y="492" width="600" height="62" rx="7" fill="url(#emergency)" stroke="#c2410c" stroke-width="1.5"/>
|
||||
<text x="100" y="512" fill="#9a3412" font-size="12" font-weight="600">应急</text>
|
||||
<text x="135" y="512" fill="#9a3412" font-size="13" font-weight="700">reactiveCompact</text>
|
||||
<text x="278" y="512" fill="#9a3412" font-size="11">API 返回 413 → 字节级裁剪,保留最后 5 条 + 摘要</text>
|
||||
<text x="590" y="512" fill="#9a3412" font-size="10" text-anchor="end">比 autoCompact 更激进</text>
|
||||
<text x="135" y="530" fill="#c2410c" font-size="9">触发:API 返回 prompt_too_long 或 413 错误</text>
|
||||
<text x="135" y="528" fill="#9a3412" font-size="10">API 返回 413 / prompt_too_long → 字节级裁剪</text>
|
||||
<text x="135" y="544" fill="#c2410c" font-size="9">保留最后 5 条 + 摘要,比 autoCompact 更激进</text>
|
||||
|
||||
</svg>
|
||||
|
|
|
|||
|