qwen-code

mirror of https://github.com/QwenLM/qwen-code.git synced 2026-05-05 23:42:03 +00:00

Shaojin Wen cae09279fa fix(cli): bound SubAgent display by visual height to prevent flicker (#3721)

* fix(cli): bound SubAgent display by visual height to prevent flicker

The SubAgent runtime display used hard-coded MAX_TASK_PROMPT_LINES=5 and MAX_TOOL_CALLS=5 plus character-length truncation (`length > 80`). On narrow terminals the soft-wrapped content overflowed the available height as the tool-call list grew, forcing Ink to clear and redraw on every update.

Pull AgentExecutionDisplay onto the same visual-height/visual-width slicing pattern that ToolMessage and ConversationMessages already use:

- Add `sliceTextByVisualHeight` to textUtils: counts soft wraps as visual rows, supports top/bottom overflow direction.
- AgentExecutionDisplay now derives maxTaskPromptLines / maxToolCalls from the assigned `availableHeight` and uses `truncateToVisualWidth` (CJK + emoji safe) instead of substring(0, 80). Compact mode is unchanged.
- Drop the 300 ms debounced `refreshStatic` that AppContainer fired on every terminalWidth change. It was a flicker source on resize, and the static area no longer needs the refresh.

Tests:

- textUtils.test.ts covers undefined maxHeight, top/bottom overflow, and soft-wrap counting.
- AgentExecutionDisplay.test.tsx asserts the height-bounded render keeps the prompt + tool list inside the assigned rows.
- AppContainer.test.tsx asserts width-only changes no longer clear the terminal.

* test(tui): add SubAgent flicker regression script and ANSI counter

Two reusable tools for measuring TUI flicker:

- `scripts/measure-flicker.mjs`: standalone Node script that counts the ANSI escape sequences which betray flicker (clearTerminalPair, clearScreen, eraseLine, cursorUp) inside any recorded raw stream (`script` log, `tmux pipe-pane` output, custom PTY capture). Supports baseline diff mode.
- `integration-tests/terminal-capture/subagent-flicker-regression.ts`: end-to-end ratchet that boots a mock OpenAI server, drives a real qwen process through an `agent` tool dispatch + 5 `read_file` SubAgent rounds, then reads PTY bytes and asserts ANSI-redraw counts stay below configured ceilings. Mirrors PR #43f128b20's resize-clear-regression pattern.

Reference numbers (60-col / 18-row terminal, fixed build):

clearTerminalPair=5, clearScreen=10, eraseLine=440, cursorUp=132

The ratchet defaults to 10/20 ceilings, roughly 2× steady state, so regressions like reverting sliceTextByVisualHeight or restoring the width-driven refreshStatic trip the build.

Implementation notes captured in the script's docstring:

- Strips HTTP_PROXY family env vars (NO_PROXY isn't honored by undici, so corp proxy would otherwise hijack the loopback request).
- Drops `--bare` (bare mode hard-codes the registered tool set and rejects the `agent` tool); HOME is sandboxed to a temp dir instead.
- Mock server speaks SSE because the CLI requests stream:true.

* fix(cli): address inline review on SubAgent flicker fix

Three issues from inline review on PR #3721:

1. availableHeight as total budget (Critical). The previous formula only constrained prompt + tool-call height, not the surrounding header / section labels / gaps / footer. Default and verbose mode could still overrun the parent-provided budget. Subtract a fixed-row overhead (10 rows running, 18 completed) before computing `maxTaskPromptLines` / `maxToolCalls`. Add unit tests that assert the rendered frame line-count stays within `availableHeight` for both running and completed states.

2. Ratchet that actually distinguishes fix from no-fix. The previous `clearTerminalPair` / `clearScreen` ceilings passed for both fixed and unfixed builds. Add an `eraseLine` upper bound (default 460). That is the metric whose drop reflects the in-place-update efficiency the visual-height fix delivers (no-fix observed 469, with-fix 434). Refresh docstring with the current numbers and a coverage map that honestly states what this ratchet does and does not exercise.

3. Keypress scope. `useKeypress` was active on every mounted `AgentExecutionDisplay`, including completed/historical instances in chat history, so Ctrl+E / Ctrl+F would toggle them all in lock-step and cause large scrollback reflows. Gate `isActive` on `data.status === 'running'`. Test mock now also honors `{ isActive }` so the new "completed displays ignore Ctrl+E" regression is enforceable.

* fix(cli): address round-2 inline review on SubAgent flicker

Three follow-up issues from inline review on PR #3721:

1. sliceTextByVisualHeight reservedRows early-return (Critical). The early return compared `visualLineCount <= targetMaxHeight` and ignored `reservedRows`, so a caller asking us to keep one row free for a footer could still receive the full input back with `hiddenLinesCount: 0` even though only `targetMaxHeight - reservedRows` content rows were actually available. Compare against `visibleContentHeight` instead and add a regression test for the `'a\nb\nc' / 3 / reservedRows: 1` case the reviewer flagged.

2. Footer hint and rendered prompt now share one slicing result (Suggestion). Previously `hasMoreLines` looked at `data.taskPrompt.split('\n').length` (hard newlines only), but the prompt body was already truncated by `sliceTextByVisualHeight` (which counts soft wraps). A long single-line prompt could be visually truncated without the footer ever surfacing the "ctrl+f to show more" hint. Lift the slice into the parent component and feed both the rendered `TaskPromptSection` and the footer's `hasMoreLines` from the same `hiddenLinesCount`.

3. Running → completed transition test (Critical). The previous "completed displays ignore Ctrl+E" test rendered already-completed data, so `useKeypress` was inactive from the start and Ctrl+E was a no-op trivially. It missed the real path: a running subagent gets expanded, then completes while preserving the expanded `displayMode`, which is exactly when the completed-state budget has to hold the layout. Replace the test with a `rerender`-based one that runs the full transition, asserts the completed expanded frame stays within `availableHeight`, and asserts the post-transition Ctrl+E is a no-op. Bumped `COMPLETED_FIXED_OVERHEAD` from 18 to 22 to accommodate the ExecutionSummary + ToolUsage block accounting that the new transition test exposed.

* fix(cli): gate SubAgent useKeypress on isFocused for parallel runs

Per @yiliang114's review on PR #3721: `data.status === 'running'` alone fixes the historical/scrollback case, but two SubAgents running in parallel both stay `running`, so a single Ctrl+E / Ctrl+F still toggles them in lock-step and the dual reflow brings back the flicker the gating was meant to prevent. The component already receives `isFocused` from ToolMessage (via SubagentExecutionRenderer) for the inline confirmation prompt, so reuse it on the keypress hook:

isActive: data.status === 'running' && isFocused

Adds a regression test that renders a running SubAgent with `isFocused={false}` and asserts Ctrl+E is a no-op (frame unchanged).

---------

Co-authored-by: wenshao <wenshao@U-K7F6PQY3-2157.local>		2026-04-29 22:34:55 +08:00
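The visual-height slicing and the `reservedRows` early-return fix described in the commit message can be illustrated with a minimal TypeScript sketch. This is not the actual `sliceTextByVisualHeight` from textUtils: the function name `sliceByVisualHeightSketch`, its parameter order, the `SliceResult` shape, and the reading of "top" overflow as "hide rows from the top" are all assumptions made for illustration. It also treats every character as one column wide, so it is not CJK or emoji safe.

```typescript
// Hypothetical sketch; names and signature are assumptions, not the real textUtils API.
type OverflowDirection = 'top' | 'bottom';

interface SliceResult {
  visibleLines: string[];
  hiddenLinesCount: number;
}

// Split one hard line into visual rows of at most `width` columns.
// Assumption: one UTF-16 code unit == one terminal column.
function wrapLine(line: string, width: number): string[] {
  if (line.length === 0) return [''];
  const rows: string[] = [];
  for (let i = 0; i < line.length; i += width) {
    rows.push(line.slice(i, i + width));
  }
  return rows;
}

function sliceByVisualHeightSketch(
  text: string,
  maxHeight: number | undefined,
  width: number,
  opts: { overflowDirection?: OverflowDirection; reservedRows?: number } = {},
): SliceResult {
  // Count soft wraps as visual rows, not just hard newlines.
  const rows: string[] = [];
  for (const line of text.split('\n')) {
    rows.push(...wrapLine(line, Math.max(1, width)));
  }
  if (maxHeight === undefined) {
    return { visibleLines: rows, hiddenLinesCount: 0 };
  }
  // Rows reserved for a footer reduce the space left for content.
  const visibleContentHeight = Math.max(0, maxHeight - (opts.reservedRows ?? 0));
  // Early return compares against visibleContentHeight, not maxHeight.
  if (rows.length <= visibleContentHeight) {
    return { visibleLines: rows, hiddenLinesCount: 0 };
  }
  const hidden = rows.length - visibleContentHeight;
  const visibleLines =
    (opts.overflowDirection ?? 'bottom') === 'top'
      ? rows.slice(hidden) // hide the oldest rows, keep the tail
      : rows.slice(0, visibleContentHeight); // keep the head, hide the rest
  return { visibleLines, hiddenLinesCount: hidden };
}

// The case flagged in review: three lines, height 3, one row reserved.
const flagged = sliceByVisualHeightSketch('a\nb\nc', 3, 80, { reservedRows: 1 });
// flagged.visibleLines is ['a', 'b'] and flagged.hiddenLinesCount is 1
```

Under these assumptions, a footer hint such as `hasMoreLines` can be derived from `hiddenLinesCount > 0` of the same result that feeds the rendered prompt, which keeps the two from disagreeing on long single-line prompts.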
scenarios	feat: add bugfix workflow, test-engineer agent, and debugging skills	2026-04-04 18:30:09 +08:00
motivation.md	feat(terminal-capture): add streaming capture with GIF generation	2026-03-05 17:46:09 +08:00
package.json	fix: upgrade @lydell/node-pty to 1.2.0-beta.10 to fix PTY FD leak	2026-04-01 07:55:56 +08:00
run.ts	feat: add terminal-capture for CLI screenshot automation	2026-02-14 21:34:42 +08:00
scenario-runner.ts	fix(cli): improve /btw overlay UX — layout, dismiss hints, and history cleanup	2026-03-21 01:07:02 +08:00
subagent-flicker-regression.ts	fix(cli): bound SubAgent display by visual height to prevent flicker (#3721)	2026-04-29 22:34:55 +08:00
terminal-capture.ts	feat(cron): add interactive E2E tests and fix cron trigger reactivity	2026-03-29 04:22:28 +00:00
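
The ANSI-redraw counting that `subagent-flicker-regression.ts` and `scripts/measure-flicker.mjs` perform, per the commit message above, might look roughly like the following sketch. It is not the code from either file. The exact byte sequences behind each metric are assumptions here: `clearScreen` is taken to be `ESC[2J`, `clearTerminalPair` to be `ESC[2J` immediately followed by `ESC[3J`, `eraseLine` to be `ESC[2K`, and `cursorUp` to be `ESC[<n>A`. The function names `countFlickerSequences` and `metricsOverCeiling` are hypothetical.

```typescript
// Hypothetical sketch; the sequence definitions below are assumptions.
interface FlickerCounts {
  clearTerminalPair: number;
  clearScreen: number;
  eraseLine: number;
  cursorUp: number;
}

const ESC = '\u001b';

// Count non-overlapping matches of a global regex in a raw PTY stream.
function countMatches(stream: string, pattern: RegExp): number {
  return (stream.match(pattern) ?? []).length;
}

function countFlickerSequences(stream: string): FlickerCounts {
  return {
    clearTerminalPair: countMatches(stream, new RegExp(`${ESC}\\[2J${ESC}\\[3J`, 'g')),
    clearScreen: countMatches(stream, new RegExp(`${ESC}\\[2J`, 'g')),
    eraseLine: countMatches(stream, new RegExp(`${ESC}\\[2K`, 'g')),
    cursorUp: countMatches(stream, new RegExp(`${ESC}\\[\\d*A`, 'g')),
  };
}

// Return the names of metrics whose count exceeds its configured ceiling.
function metricsOverCeiling(
  counts: FlickerCounts,
  ceilings: Partial<FlickerCounts>,
): string[] {
  const over: string[] = [];
  for (const key of Object.keys(ceilings) as (keyof FlickerCounts)[]) {
    const limit = ceilings[key];
    if (limit !== undefined && counts[key] > limit) over.push(key);
  }
  return over;
}

// Ceilings quoted in the commit message: 10/20 for the clear metrics, 460 for eraseLine.
const ceilings: Partial<FlickerCounts> = {
  clearTerminalPair: 10,
  clearScreen: 20,
  eraseLine: 460,
};
const sample = `${ESC}[2J${ESC}[3J${ESC}[Hhello${ESC}[2K${ESC}[1A${ESC}[2K${ESC}[A`;
const overLimit = metricsOverCeiling(countFlickerSequences(sample), ceilings);
// overLimit is [] for this tiny sample
```

A ratchet built this way fails only when a metric rises above its ceiling, which is why the commit message adds an `eraseLine` bound between the observed with-fix and no-fix counts.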