qwen-code/integration-tests
Shaojin Wen d40f3e975e
fix(test): restore abort-and-lifecycle stdin-close test to pre-#3723 version (#3777)
* fix(test): restore abort-and-lifecycle stdin-close test to pre-#3723 version

#3723 rewrote `should handle control responses when stdin closes
before replies` in a way that flipped its semantics:

- Old: canUseTool sleeps 1s before allowing; asyncGenerator awaits
  `inputStreamDonePromise` so stdin closes WHILE the control reply
  is still in flight; expects `original content` (the in-flight
  tool must NOT execute). Tests CLI robustness when stdin closes
  before replies — matching the test name.

- New: canUseTool returns `allow` immediately; stdin stays open
  until the second result arrives; expects `updated`. Requires
  the LLM to actually call write_file → receive tool result →
  reply 'done'. The test name still says "stdin closes before
  replies", but it no longer tests that.

The new version times out (testTimeout 5min x 3 attempts with retry x2 = 900s) on
both macOS and Linux on every push since #3723, because it depends
on LLM tool-calling behavior that isn't deterministic on the CI
endpoint. CI history shows the pre-#3723 version was stable across
30+ runs.

This restores only the test file. The shared permissionFlow,
coreToolScheduler/Session wiring, and e2e workflow `npm run bundle`
step from #3723 are kept intact.

* test(integration): add timeout and unify loop into race chain

Address review feedback on the restored test:

- firstResultPromise / secondResultPromise now have a 30s setTimeout
  reject path, matching the pattern used by canUseToolCalledPromise
  and inputStreamDonePromise (15s). Without these, a hang in the
  result stream falls back to the global Vitest testTimeout (5min)
  with no useful diagnostic.

- loop() is now retained as `loopPromise` and joined into the await
  chain via `Promise.race`. If the iterator throws or the consumer
  exits unexpectedly, the failure surfaces directly to the test
  instead of becoming an unhandled rejection while the test waits
  on side-channel promises.
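
A minimal sketch of the timeout reject path described above, assuming a generic wrapper. The name `withTimeout` and its parameters are hypothetical and not taken from the test file; this version also clears its timer once the race settles, which is an illustrative choice.

```typescript
// Hypothetical helper: race a promise against a labeled timeout rejection.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timeout after ${ms}ms`)),
      ms,
    );
  });
  // Whichever settles first wins; the timer is cleared either way.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

With a wrapper like this, a stalled result stream fails with a message naming the stalled phase instead of the generic Vitest testTimeout.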

* test(integration): close pseudo-pass paths in stdin-close lifecycle test

Address review feedback. Each change maps to a specific finding:

- Guard canUseTool by toolName === 'write_file' AND file_path against
  the target absolute path. The model may issue read_file or call
  write_file with an unexpected path; those must not satisfy the
  permission-control timing harness, otherwise the test could pass
  without exercising the intended path.

- Capture the second SDK result and assert it's defined, so the
  Promise.race below can no longer short-circuit silently.

- Replace `Promise.race([..., loopPromise])` with a rejection-only
  loopError partner. Loop completion alone (e.g. iterator ends before
  canUseTool is invoked) must not short-circuit the awaited
  milestones; only loop errors should fail the test.

- Restore absolute path via `helper.getPath('test.txt')` and embed it
  in the prompt, so the file the test asserts on is unambiguously
  the same one the model is asked to write.

- Wrap timing promises in a `boundedPromise` helper that clears its
  timeout on resolve, eliminating dangling timers on successful runs.

- Drop the unconditional `console.log(JSON.stringify(...))` in the
  consumer loop to reduce CI retry noise.

Out of scope (acknowledged but deferred): the test still requires
the model to actually emit a write_file tool call; with the new
15s/30s bounded timeouts, an LLM that fails to call write_file now
fails fast with a labeled error ("canUseTool callback not called
timeout after 15000ms") instead of hanging to the global 5-min
testTimeout. Making the test fully model-independent would require
a control-only path that doesn't go through tool dispatch, which is
beyond this regression fix.
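
The tool guard and the rejection-only race partner from the list above might look roughly like the following. This is a sketch under assumptions: `isTargetWrite`, `rejectionOnly`, and `awaitMilestone` are hypothetical names, and the real test wires these checks into the SDK's canUseTool callback and consumer loop rather than into standalone functions.

```typescript
// Hypothetical shape of a tool-call input; the real SDK type may differ.
type ToolInput = Record<string, unknown>;

// Only a write_file call aimed at the exact target path should count
// as the permission request the timing harness is waiting for.
function isTargetWrite(
  toolName: string,
  input: ToolInput,
  targetPath: string,
): boolean {
  return toolName === "write_file" && input["file_path"] === targetPath;
}

// Mirrors only the rejection of `source`. If `source` resolves, the
// returned promise stays pending, so it can never win a race by
// merely completing.
function rejectionOnly(source: Promise<unknown>): Promise<never> {
  return new Promise<never>((_resolve, reject) => {
    source.then(undefined, reject);
  });
}

// Await a milestone, failing early only if the consumer loop throws.
function awaitMilestone<T>(
  milestone: Promise<T>,
  loop: Promise<void>,
): Promise<T> {
  return Promise.race([milestone, rejectionOnly(loop)]);
}
```

Racing against the loop promise directly would let a loop that ends early resolve the race with `undefined`; the rejection-only partner removes that path while still surfacing loop errors.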

* test(integration): defer phase timers in stdin-close lifecycle test

Address review suggestion: the 15s budgets on canUseToolCalled and
inputStreamDone started counting at promise creation, but those phases
only begin after firstResult (30s budget) resolves. On a slow CI run
where the first LLM round-trip exceeds 15s, those timers would reject
before their phase even starts, surfacing a misleading
"canUseTool callback not called" error when the actual cause was
first-result latency.

Add an explicit `startTimer()` to boundedPromise and arm each timer
only when its phase actually begins:

- firstResult: armed immediately (begins with the query).
- canUseToolCalled / inputStreamDone / secondResult: armed inside
  createPrompt right after firstResult resolves, so first-turn latency
  cannot eat into their budgets.

This also makes timeout errors point at the correct phase if any of
them does fire.
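
One way such a deferred-timer helper could be shaped is sketched below. The name `boundedPromise` and the `startTimer()` method come from the message above, but the signature, the returned object, and the internals are assumptions rather than the code in the test file.

```typescript
// Assumed shape; the helper in the test file may expose a different API.
interface BoundedPromise<T> {
  promise: Promise<T>;
  resolve: (value: T) => void;
  startTimer: () => void;
}

function boundedPromise<T>(label: string, timeoutMs: number): BoundedPromise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  let settled = false;
  let resolveFn!: (value: T) => void;
  let rejectFn!: (reason: Error) => void;

  const promise = new Promise<T>((res, rej) => {
    resolveFn = res;
    rejectFn = rej;
  });

  return {
    promise,
    resolve(value: T): void {
      if (settled) return;
      settled = true;
      // Clear the timer so a successful run leaves nothing dangling.
      if (timer !== undefined) clearTimeout(timer);
      resolveFn(value);
    },
    // The budget starts counting here, not at creation time.
    startTimer(): void {
      if (settled || timer !== undefined) return;
      timer = setTimeout(() => {
        settled = true;
        rejectFn(new Error(`${label} timeout after ${timeoutMs}ms`));
      }, timeoutMs);
    },
  };
}
```

In this sketch, a first-result promise would call `startTimer()` right after creation, while the later phases would call it only once the first result has resolved, so a slow first turn cannot consume their budgets.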
2026-05-02 21:39:43 +08:00
cli feat(core): event monitor tool with throttled stdout streaming (Phase C) (#3684) 2026-05-02 20:57:26 +08:00
concurrent-runner feat(web-search): remove built-in web_search tool, replace with MCP-based approach (#3502) 2026-04-24 11:29:02 +08:00
fixtures/settings-migration refactor: remove summarizeToolOutput feature 2026-03-15 13:51:32 +08:00
hook-integration feat(hooks): Add HTTP Hook, Function Hook and Async Hook support (#2827) 2026-04-16 10:10:33 +08:00
interactive test(integration): match new cron notification format in interactive tests (#3402) 2026-04-17 22:56:34 +08:00
sdk-typescript fix(test): restore abort-and-lifecycle stdin-close test to pre-#3723 version (#3777) 2026-05-02 21:39:43 +08:00
terminal-bench Terminal Bench Integration Test (#521) 2025-09-05 17:02:03 +08:00
terminal-capture feat(core): event monitor tool with throttled stdout streaming (Phase C) (#3684) 2026-05-02 20:57:26 +08:00
channel-plugin.test.ts docs(channels): add plugin developer guide and rename mock to plugin-example 2026-03-27 03:19:34 +00:00
globalSetup.ts feat(memory): managed auto-memory and auto-dream system (#3087) 2026-04-16 20:05:45 +08:00
test-helper.ts fix(integration-tests): honor stdinDoesNotEnd option (#2966) 2026-04-18 09:17:27 +08:00
test-mcp-server.ts # 🚀 Sync Gemini CLI v0.2.1 - Major Feature Update (#483) 2025-09-01 14:48:55 +08:00
tsconfig.json chore: rename @qwen-code/sdk-typescript to @qwen-code/sdk 2025-12-05 21:47:26 +08:00
vitest.config.ts add doc for hooks and skip integration test 2026-03-12 07:44:26 -07:00
vitest.terminal-bench.config.ts Fix E2E caused by Terminal Bench test (#529) 2025-09-08 10:51:14 +08:00