---
name: structured-debugging
description: Hypothesis-driven debugging methodology for hard bugs. Use this
  skill whenever you're investigating non-trivial bugs, unexpected behavior,
  flaky tests, or tracing issues through complex systems. Activate proactively
  when debugging requires more than a quick glance — especially when the first
  attempt at a fix didn't work, when behavior seems "impossible", or when
  you're tempted to blame an external system (model, API, library) without
  evidence.
---

# Structured Debugging

When debugging hard issues, the natural instinct is to form a theory and
immediately apply a fix. This fails more often than it works. The fix addresses
the wrong cause, adds complexity, creates false confidence, and obscures the
real issue. Worse, after several failed attempts you lose track of what's been
tried and start guessing randomly.

This methodology replaces guessing with a disciplined cycle that converges on
the root cause. Each iteration narrows the search space. It's slower per
attempt but dramatically faster overall because you stop wasting runs on wrong
theories.

## The Cycle

### 1. Hypothesize

Before touching code, write down what you think is happening and why. Be
specific about the expected state at each step in the execution path.

Bad: "Something is wrong with the wait loop."

Good: "The leader hangs because `hasActiveTeammates()` returns true after all
agents have reported completed, likely because terminal status isn't being set
on the agent object after the backend process exits."

For bugs you expect to take more than one round, create a side note file for
the investigation in whichever location the project uses for such notes. Write
your hypothesis there. This file persists across conversation turns and even
across sessions — it's your investigation journal.
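
One possible shape for a journal entry (the fields are illustrative, not
prescribed):

```
## Round 1
Hypothesis: [specific mechanism you believe is failing]
Prediction: [what you expect the instrumentation to show if you're right,
and what you expect if you're wrong]
```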

### 2. Design Instrumentation

Add targeted debug logs or assertions at the exact decision points that would
confirm or reject your hypothesis. Think about what data you need to see.

Don't scatter `console.log` everywhere. Identify the 2-3 places where your
hypothesis makes a testable prediction, and instrument those.

Prefer logging _values_ (return codes, payload contents, stream types, message
bodies, env state) over _presence checks_ ("was this function called?", "was
this branch taken?"). Code-path traces tell you what ran; data traces tell you
what it ran on. Most non-trivial bugs are correct code processing wrong data.

Ask yourself: "If my hypothesis is correct, what will I see at point X? If it's
wrong, what will I see instead?"
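
A minimal sketch of the difference, reusing the hypothetical agent example from
step 1 (all names and types here are illustrative):

```ts
interface Agent {
  id: string;
  status: 'running' | 'completed' | 'failed';
}

function hasActiveTeammates(agents: Agent[]): boolean {
  return agents.some((a) => a.status === 'running');
}

function waitLoopTick(agents: Agent[]): void {
  // Presence check (weak): only proves this line executed.
  console.error('[dbg] wait loop tick');

  // Value trace (strong): shows the exact data the decision was made on. If
  // the hypothesis is right, an agent whose process has exited will still
  // show status "running" here.
  console.error(
    '[dbg] agents=%j active=%s',
    agents.map((a) => ({ id: a.id, status: a.status })),
    hasActiveTeammates(agents),
  );
}
```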

### 3. Verify Data Collection

Before running, confirm that your instrumentation output will actually be
captured and accessible.

Common traps:

- stderr discarded by `2>/dev/null` in the test command
- Process killed before flush (logs lost)
- Logging to a file in a directory that doesn't exist
- Output piped through something that truncates it
- Looking at log files from a _previous_ run, not the current one

A test run that produces no data is wasted.
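
One way to sidestep the first three traps at once, sketched under the
assumption that synchronous file writes are acceptable for debug output (the
log path is an arbitrary choice, not a project convention):

```ts
import { appendFileSync, mkdirSync } from 'node:fs';

const DEBUG_DIR = '/tmp/debug-run';
const DEBUG_LOG = `${DEBUG_DIR}/current.log`;

// Create the directory up front so writes can't silently fail.
mkdirSync(DEBUG_DIR, { recursive: true });

export function dbg(line: string): void {
  // Synchronous append survives an abrupt process kill and bypasses any
  // stdout/stderr redirection in the test command.
  appendFileSync(DEBUG_LOG, `${new Date().toISOString()} ${line}\n`);
}
```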

### 4. Run and Observe

Execute the test. Read the actual output — every line of it. Don't assume what
it says.

When the data contradicts your hypothesis, believe the data. Don't rationalize
it away. The whole point of this step is to let reality override your theory.

### 5. Document Findings

Update the side note with:

- What the data showed (quote specific log lines)
- What was confirmed vs. disproved
- Updated hypothesis for the next iteration

This is critical for not losing context across attempts. Hard bugs typically
take 3-5 rounds. Without notes, you'll forget what you ruled out and waste runs
re-checking things.
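
A findings entry in the same journal might look like this (bracketed fields
are placeholders, as in the exit template at the end of this document):

```
## Round 1 findings
Observed: [quoted log lines from this run]
Confirmed: [which predictions held]
Disproved: [which parts of the hypothesis failed]
Next hypothesis: [updated theory for the next round]
```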

### 6. Iterate

Update the hypothesis based on the new evidence. Go back to step 2. Each round
should narrow the search space.

If you're not making progress after 3 rounds, step back and question your
assumptions. The bug might be in a layer you haven't considered.

## Failure Modes to Avoid

These are the specific traps this methodology is designed to prevent. When you
notice yourself drifting toward any of them, stop and return to the cycle.

### Jumping to fixes without evidence

The most common failure. You have a plausible theory, so you "fix" it and run
again. If the theory was wrong, you've added complexity, wasted a test run, and
possibly introduced a new bug. The side note should always show "hypothesis
verified by [specific data]" before any fix is applied.

### Blaming external systems

"The model is hallucinating." "The API is flaky." "The library has a bug."
These conclusions feel satisfying because they put the problem outside your
control. They're also usually wrong.

Before blaming an external system, inspect what it actually received. A model
that appears to hallucinate may be responding rationally to stale data you
didn't know was there. An API that appears flaky may be receiving malformed
requests. Look at the inputs, not just the outputs.
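
A sketch of what "inspect what it actually received" can mean in practice:
record the outbound payload verbatim at the call site before blaming the
service (the endpoint, payload shape, and log path are hypothetical):

```ts
import { appendFileSync } from 'node:fs';

async function callSearchApi(query: string): Promise<unknown> {
  const body = JSON.stringify({ query });

  // If the API "misbehaves", this line tells you whether it was handed
  // garbage in the first place.
  appendFileSync(
    '/tmp/debug-outbound.log',
    `${new Date().toISOString()} ${body}\n`,
  );

  const res = await fetch('https://api.example.com/search', {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body,
  });
  return res.json();
}
```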

### Inspecting code paths but not data

You instrument the code and prove it executes correctly — the right functions
are called, in the right order, with no errors. But the bug persists. Why?
Because the code can work perfectly while processing garbage input. A function
that correctly reads an inbox, correctly delivers messages, and correctly
formats output is still broken if the inbox contains stale messages from a
previous run.

Always inspect the _content_ flowing through the code, not just whether the
code runs. Check payloads, message contents, file data, and database state.
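
For the inbox example, a content dump with provenance makes staleness jump out
(the message shape is hypothetical):

```ts
interface InboxMessage {
  id: string;
  createdAt: string; // ISO timestamp
  body: string;
}

function dumpInbox(runStartedAt: Date, inbox: InboxMessage[]): void {
  for (const msg of inbox) {
    // A message created before this run started is leftover state.
    const stale = new Date(msg.createdAt).getTime() < runStartedAt.getTime();
    console.error(
      '[dbg] msg id=%s createdAt=%s stale=%s body=%s',
      msg.id,
      msg.createdAt,
      stale,
      msg.body.slice(0, 120),
    );
  }
}
```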

### Reframing the user's report instead of investigating it

When the user reports a symptom your own run doesn't reproduce, the
contradiction _is_ the evidence — the two environments differ in some way you
haven't identified yet. The wrong move is to reframe their report ("they must
be on a stale SHA", "they must be confused about what they saw", "must be a
flake") so that your run becomes the ground truth. Once you do that, every
later piece of evidence gets bent to defend the reframing, and the actual bug
stays hidden.

The right move: catalogue what differs between their environment and yours
(TTY vs pipe, terminal emulator, shell, locale, env vars, prior state, build
artifacts) before forming any hypothesis. For ambiguous symptoms ("no output",
"it's slow", "it's wrong") ask one disambiguating question first — e.g., "does
it hang or exit cleanly?" That prunes the hypothesis space before any test
run.

### Losing context across attempts

After several debugging rounds, you start forgetting what you already tried
and what you ruled out. You re-check things, go in circles, or abandon a
promising line of investigation because you lost track of where it was
heading.

This is why the side note file exists. Update it after every run. When you
start a new round, re-read it first.

## Persistent State: A Special Category

Features that persist data across runs (caches, session recordings, message
queues, temp files, database rows) often cause "impossible" bugs. The current
run's behavior is contaminated by leftover state from previous runs.

When behavior seems irrational, always check:

- Is there persistent state that carries across runs?
- Was it cleared before this run?
- Is the system responding to stale data rather than current data?

This is easy to miss because the code is correct — it's the data that's wrong.
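
A pre-run guard makes this class of bug visible instead of mysterious. A
sketch, assuming file-based state under a single directory (the path handling
and removal policy are hypothetical):

```ts
import { existsSync, readdirSync, rmSync, statSync } from 'node:fs';
import { join } from 'node:path';

function reportAndClearStaleState(stateDir: string, runStartedAt: Date): void {
  if (!existsSync(stateDir)) return;
  for (const name of readdirSync(stateDir)) {
    const path = join(stateDir, name);
    const mtime = statSync(path).mtime;
    if (mtime.getTime() < runStartedAt.getTime()) {
      // Log what gets removed; anything listed here was a prime suspect.
      console.error('[dbg] stale state: %s (mtime=%s)', path, mtime.toISOString());
      rmSync(path, { recursive: true, force: true });
    }
  }
}
```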

## When to Exit the Cycle

Apply the fix only when you can point to specific data from your
instrumentation that confirms the root cause. Write in the side note:

```
Root cause: [specific mechanism]
Evidence: [specific log lines / data that confirm it]
Fix: [what you're changing and why it addresses the root cause]
```

Then apply the fix, remove instrumentation, and verify with a clean run.

## Worked examples

- `examples/headless-bg-agent-empty-stdout.md` — pipe-captured runs all passed;
  the user's TTY printed nothing. The contradiction _was_ the bug. Illustrates
  _reproduction contradiction is data_ and _instrument data, not code paths_.