qwen-code

mirror of https://github.com/QwenLM/qwen-code.git synced 2026-05-19 16:28:28 +00:00

Shaojin Wen 84ecb5b8a3 feat(cli): add --json-schema for structured output in headless mode (#3598)

* feat(cli): add --json-schema for structured output in headless mode

Registers a synthetic `structured_output` tool whose parameter schema IS the user-supplied JSON Schema. In headless mode (`qwen -p`), the first successful call terminates the session and exposes the validated payload via the result message's `structured_result` field. Invalid schemas are rejected at CLI parse time via a new strict Ajv compile helper so they can't silently no-op at runtime.

* fix(cli): honour "first structured_output call ends session" + reject non-object root schemas

Two review fixes for the `--json-schema` feature:

1. `runNonInteractive` now breaks out of the tool-call loop as soon as the first successful `structured_output` invocation is captured, rather than continuing to execute any trailing tool calls the model emitted in the same turn. This restores the documented single-shot contract and prevents side-effecting tools from running after the final answer has already been accepted.

2. `resolveJsonSchemaArg` rejects schemas whose root `type` is anything other than "object" (or a type array including "object"). Function-calling APIs require tool arguments to be JSON objects, so a schema like `{"type": "array"}` would have registered an unusable synthetic tool the model could never satisfy. Absent `type` and `type: "object"` remain accepted.

Adds tests for both paths and updates the existing Ajv-compile test to exercise that path without tripping the new root-type guard first.

* fix(cli): also reject root anyOf/oneOf schemas whose branches can't accept objects

Addresses a review follow-up: the previous root-object check only inspected the top-level `type` keyword, so a schema like `{"anyOf":[{"type":"array"},{"type":"string"}]}` slipped through even though none of its branches can ever validate the object-shaped arguments that function-calling APIs send.
Replace the single `type` check with `schemaRootAcceptsObject`, which recursively walks root-level anyOf/oneOf branches and requires at least one to accept objects. Absent `type`, `type: "object"`, `type: ["object", ...]`, and mixed anyOf branches where one accepts object all still pass. `allOf` is left to Ajv's runtime behaviour — guessing intent across contradictory allOf branches at parse time is fragile. * fix(cli): propagate exitCode from --json-schema failure path + tests Address two PR-3598 review findings: 1. gemini.tsx unconditionally called process.exit(0) after runNonInteractive/runNonInteractiveStreamJson, clobbering the process.exitCode = 1 set by nonInteractiveCli.ts when the model emits plain text instead of the structured_output tool. Switch both call sites to process.exit(process.exitCode ?? 0) so CI can detect the failure via the exit code. 2. nonInteractiveCli.test.ts: strengthen the structured-output success path to assert registry.abortAll() is called and that the stdout result envelope carries the JSON-stringified args under `result` plus the raw object under `structured_result`. Add a retry-path test that mocks executeToolCall to return an error on the first structured_output call, then verifies sendMessageStream is called a second time so the model can retry rather than the session terminating early. * fix(cli): suppress non-structured tool calls when structured_output is in the same turn When --json-schema is active and the model emits a batch like [write_file(...), structured_output(...)], the previous implementation ran the leading side-effecting tool before accepting the structured result, violating the "structured_output is the terminal contract" guarantee. The trailing-only break also let an invalid first structured_output fall through to subsequent tools before the retry turn. Pre-scan the batch: if a structured_output request is present, execute ONLY the first one and skip everything else (leading and trailing). 
This is consistent with the existing terminal-path semantics — the suppressed tool_use blocks lack a matching tool_result, the same way max-turns / cancellation leave the stream. Adds a test covering the reverse-order [side_effect, structured_output] case alongside the existing trailing-suppression and retry tests. * fix(cli): tighten --json-schema root validation per review feedback Three small holes flagged in the latest pass: 1. `schemaRootAcceptsObject` returned early when a root `type` keyword was present, ignoring sibling `anyOf`/`oneOf`. JSON Schema applies keywords at the same level conjunctively, so e.g. `{type:"object", anyOf:[{type:"string"}]}` is unsatisfiable for any value but used to pass. Now both `type` AND any sibling `anyOf`/`oneOf` must independently admit object. 2. The FatalConfigError text said "Every branch of a root anyOf/oneOf must be satisfiable by an object", but the actual logic only requires at least one branch (and tests still accept `anyOf:[object, string]`). Reworded to "at least one branch" so the message matches the behaviour. 3. `compileStrict` used `typeof schema !== 'object'` to gate input, which lets arrays through (`typeof [] === 'object'`). The contract says "schema must be a JSON object", so add an `Array.isArray` check so array input gets the intended error rather than a less helpful Ajv compile message. Tests cover the new rejection paths and the array case. * fix(cli): handle root $ref and allOf in --json-schema accept-object check `schemaRootAcceptsObject` previously only inspected `type`, `anyOf`, and `oneOf` at the root, so a couple of unsatisfiable shapes still slipped through: 1. `{"$ref":"#/$defs/Foo","$defs":{"Foo":{"type":"array"}}}` would be accepted because we don't follow $refs, but registers a synthetic tool whose params resolve to "array" — the model can never produce a valid object. Now reject any root $ref unless the user adds a sibling `type:"object"` as an explicit anchor. 2. 
`allOf` was deferred to Ajv runtime, but allOf is conjunctive at the same level as `type` / `anyOf` / `oneOf`, so an entry like `{"allOf":[{"type":"object"},{"type":"string"}]}` is unsatisfiable for any value. Walk it like the others, requiring every branch to admit object. Tests cover the new $ref-rejected / $ref+anchor-accepted paths and the allOf reject/accept paths. * fix(cli): explicit exit code from runNonInteractive + pair suppressed tool calls Three review threads on the structured-output flow: 1. The break that ends the for-loop on a successful structured_output call sat before the responseParts.push and modelOverride capture. SyntheticOutputTool currently returns neither, so it was safe today — but anyone wiring extra signals into the synthetic tool later would see them silently dropped. Move the break after both captures so the contract is explicit, not implicit. 2. The failure path used to set process.exitCode = 1 and return void, relying on global mutable state across an async boundary. Any cleanup task between runNonInteractive and process.exit could silently turn the structured-output failure into exit 0. Switch runNonInteractive to Promise<number>, return 0 / 1 directly from each function-level exit, and have gemini.tsx use the captured return value. 3. The pre-scan from the prior commit suppresses sibling tool calls when structured_output is in the same turn. On the retry path — when structured_output fails validation — the next-turn payload has tool_result for structured_output but no entry for the suppressed siblings, leaving the prior assistant turn's tool_use blocks unpaired. Anthropic and OpenAI both reject that batch shape, so the retry would surface as an opaque provider error. Synthesize a "skipped" functionResponse for every suppressed call so every tool_use in the prior assistant message has a matching tool_result. 
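The pairing rule in item 3 can be illustrated with a minimal sketch. The types, the `synthesizeSkippedResponses` name, the tool names in the demo batch, and the placeholder wording are all hypothetical; the real request and response shapes in `nonInteractiveCli.ts` are not shown in this message.

```typescript
// Hypothetical shapes for illustration only; not the actual qwen-code types.
interface ToolCallRequest {
  callId: string;
  name: string;
}

interface FunctionResponsePart {
  functionResponse: {
    id: string;
    name: string;
    response: { output: string };
  };
}

// Placeholder wording; the real "skipped" text is defined in the CLI.
const SKIPPED_OUTPUT = "Skipped (placeholder wording for this sketch).";

/**
 * Build a placeholder response for every tool call the model emitted
 * that was never executed, so that each tool_use in the assistant turn
 * has a matching tool_result in the next payload.
 */
function synthesizeSkippedResponses(
  requests: ToolCallRequest[],
  executedCallIds: Set<string>,
): FunctionResponsePart[] {
  return requests
    .filter((req) => !executedCallIds.has(req.callId))
    .map((req) => ({
      functionResponse: {
        id: req.callId,
        name: req.name,
        response: { output: SKIPPED_OUTPUT },
      },
    }));
}

// Demo: only the structured_output call (c2) was executed.
const batch: ToolCallRequest[] = [
  { callId: "c1", name: "write_file" },
  { callId: "c2", name: "structured_output" },
  { callId: "c3", name: "some_other_tool" },
];
const skipped = synthesizeSkippedResponses(batch, new Set(["c2"]));
console.log(skipped.map((part) => part.functionResponse.id)); // c1 and c3
```

Keying the synthesis on the set of executed call ids, rather than on tool names, means the same helper covers any sibling that was suppressed for whatever reason.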
Tests cover the new retry-pairing contract and update the existing plain-text-failure test to assert on the return value rather than process.exitCode. * fix: address Copilot follow-up review on --json-schema scaffolding Five small but real findings flagged on the latest pass: 1. core/src/index.ts re-exported `SyntheticOutputTool` via `export type`, but it's a runtime class — that erased it from the emitted JS and would break value imports. Split into a value `export { ... }` and a `export type { StructuredOutputParams }`. 2. The structured-output success path returned without flushing `localQueue` notifications or finalising one-shot monitors. If a background task had already emitted `task_started`, exiting here could drop its paired `task_notification` and leave SDK consumers with unpaired lifecycle events. Mirror the regular terminal path's `flushQueuedNotificationsToSdk` + `finalizeOneShotMonitors` calls before `emitResult`. 3. `schemaRootAcceptsObject` ignored the `not` keyword, so `{not:{type:"object"}}` (which forbids every object value) slipped through. Add a best-effort `not` check that rejects when `not.type` directly excludes object. Deeper negated patterns still fall through to Ajv at runtime. 4. `compileStrict`'s JSDoc claimed it errored on "Ajv versions we can't support", but the function doesn't actually check Ajv versions. Reword to "malformed or uses unsupported draft/features for our Ajv configuration" so the contract matches the implementation. 5. The pre-scan suppressed sibling tool calls but only synthesised tool_result events for them on the retry path — the success path left those tool_use blocks unpaired in the emitted JSONL/stream-json event log. Move the synthesis after the for loop so it runs for both the success break and the validation-failure fall-through; the event log is now consistent regardless of which path the run takes. 
Tests cover the new `not`-rejection paths and the success-path tool_result synthesis; the existing retry-pairing test still passes against the restructured emit ordering.

* fix(cli): tighten --json-schema parse-time gate per Copilot review

Two more shapes that used to slip through:

1. `schemaRootAcceptsObject` defaulted to true when no narrowing keyword was present, so root-value constraints like `{const: 1}` or `{enum: [1, 2]}` registered an unsatisfiable structured_output contract: the model could never produce a value matching the tool's parameter schema, and the run would loop on validation failures until max-turns. Reject `const` whose literal isn't an object, and `enum` whose members include no object.

2. The yargs check rejected `--json-schema` with `-i` and with no prompt, but not with `--input-format stream-json`. Stream-json keeps the process open waiting for protocol messages, so "terminate on the first valid structured_output call" silently drops everything queued after that point. Refuse the combination at parse time so the contradiction surfaces immediately.

Tests cover the new const/enum reject and accept paths.

* fix(cli): handle empty/boolean subschemas + allow stdin-only prompt

Three more shapes flagged on the latest review pass:

1. `schemaRootAcceptsObject` treated an empty root `anyOf`/`oneOf` as "no constraint" (skipped when length === 0), but per JSON Schema an empty union is unsatisfiable: no value can match a member of the empty set. Reject those at parse time so users get a clear parse error instead of an opaque runtime never-validates loop.

2. JSON Schema (draft-06+) allows boolean subschemas anywhere a schema is accepted: `true` matches every value, `false` matches nothing. The `anyOf`/`oneOf`/`allOf` walks were rejecting booleans via the typeof-object guard, which incorrectly rejected `{anyOf:[true]}` and `{allOf:[true,{type:"object"}]}` while letting `{anyOf:[false]}` slip through.
Replace the per-branch object guard with a `variantAcceptsObject` helper that treats `true` as accepting and `false` as rejecting, then recurses on object subschemas.

3. The yargs `.check` rejected `--json-schema` when no `-p` / positional prompt was given, but the headless CLI also reads the prompt from stdin (`cat prompt.txt | qwen --json-schema '...'`), a legit usage pattern that was being blocked. Drop the parse-time no-prompt rejection; the existing runtime "No input provided via stdin..." error in gemini.tsx still catches genuinely empty input.

Tests cover the empty-union, all-`false`, mixed-boolean accept, and `false`-in-allOf reject paths. Live-verified against the bundled CLI: `echo "..." | qwen --json-schema '...'` now reaches the model call, and the four schema edge cases all surface the expected error text or proceed past parse time.

* docs(core): note SyntheticOutputTool as the value-export exception

The block comment above the lazy-load type re-exports said tool classes "are now lazy-loaded and are not exported as values from the package root", but `SyntheticOutputTool` was just promoted to a runtime export in `62038527c` so the CLI's `--json-schema` flow can construct it from the package root. Document that exception inline so downstream consumers reading the comment don't get told the wrong story.

* fix(cli): try every structured_output in a same-turn batch in order

The pre-scan used to pick only the FIRST structured_output call from a turn and suppress everything else, even other structured_output calls. That created two avoidable failure modes:

1. `[structured_output(bad), structured_output(good)]` would attempt only the bad one, fail validation, and force a full retry turn. The model already produced a valid structured payload, so we should try it before asking again.

2. The trailing structured_output's tool_result was synthesised with the "Skipped: structured_output was also requested in this turn..."
message, which is misleading because that call WAS the structured output we should have tried. Filter `requestsToExecute` to ALL structured_output calls (in original order) when --json-schema is active, and let the existing loop break on the first success. Track an `executedCallIds` set, then synthesise tool_result + retry parts after the loop for every tool_use the model emitted that we never actually executed — covering both non-structured siblings (always suppressed) and any structured_output left over after the success break (only one terminal contract per turn). Reworded the synthesised "skipped" output to "this turn's structured_output contract took precedence" so it reads correctly regardless of whether the suppressed call was structured or not. Tests cover the multi-structured retry-free success path; the existing single-structured retry and trailing/leading suppression tests still pass against the updated emit ordering. * fix: address gpt-5.5 review on --json-schema (privacy + $ref + core-tools) Three findings, three changes: 1. Reject every root `$ref` in --json-schema, even with a sibling `type: "object"` anchor. Ajv applies `$ref` conjunctively with sibling keywords, so the previous "accept when type:object is present" carve-out was unsound: `{type:"object",$ref:"#/$defs/Foo", $defs:{Foo:{type:"array"}}}` parsed fine but no object value can satisfy both at runtime, leaving the model to loop until maxTurns. Updated docstring + test cases (replaced the accept-with-anchor case with a reject case for both anchored and well-formed $ref shapes — users wanting composition should inline at the root). 2. Redact `function_args` for structured_output in ToolCallEvent. 
The args ARE the user's structured payload (already emitted via stdout `result` / `structured_result`); recording them again as ordinary tool-call function_args duplicates that data into OTLP exports, QwenLogger, ui-telemetry, and the chat-recording UI event mirror — surfaces that can leak off-device. Replace with a stable `__redacted` placeholder so consumers still see the call happened (duration, success, decision metrics preserved) but the payload itself doesn't ride along. Two new uiTelemetry tests cover the redacted vs non-redacted paths. 3. Document and test that structured_output bypasses the --core-tools allowlist (same as agent / skill / exit_plan_mode / ask_user_question etc.). The synthetic tool only exists when --json-schema is set, so adding it to CORE_TOOLS would let `--core-tools read_file --json-schema X` silently drop the terminal contract and loop the model until maxTurns — bypass is intentional. Expanded the CORE_TOOLS docstring to enumerate the synthetic-tool exclusions and added a permission-manager test mirroring the pattern used for agent / skill / exit_plan_mode. * fix(cli): apply structured_output terminal handling to drain turns The synthetic structured_output tool is registered for the entire headless session, so it can be invoked from EITHER the main assistant-turn loop OR from a drain turn (queued cron-job / notification reply). The drain path (drainOneItem) was treating it like any other tool: execute, append the response back into itemMessages, keep going. The submitted args were never captured and no structured_result envelope was emitted, so a run that legitimately satisfied --json-schema mid-drain ended up failing the contract with "Model produced plain text..." anyway. Apply the same terminal handling to drain turns: - Hoist `structuredSubmission` to session scope so both paths write to one variable. 
- In `drainOneItem`, run the same pre-scan: when --json-schema is active and structured_output is in the batch, execute every structured_output in original order until one succeeds; suppress every non-structured sibling. Synthesise tool_results for any unexecuted tool_use the model emitted, mirroring the main path. - On capture, return early from drainOneItem so the drained item's inner while loop stops. - `drainLocalQueue` short-circuits when a captured submission is in flight, so subsequent queued items don't run. - The cron `checkCronDone` watches the same flag and stops the scheduler immediately on capture, releasing the surrounding `await new Promise(...)`. - The final holdback loop bails out on capture so monitor lifecycle doesn't extend past the structured submission. - After the holdback, before the existing failure / regular-success emit, emit the structured success envelope and return 0. Adds a focused unit test that drives the drain path end-to-end via a synchronously-fired monitor notification: main turn produces plain text, the drain reply calls structured_output, and the test asserts exit 0 + structured_result populated + no "Model produced plain text..." error. * fix(cli): address gpt-5.5 review follow-ups on --json-schema scaffolding Six review findings, six small fixes: 1. Nested $ref incorrectly rejected. `schemaRootAcceptsObject` recurses into anyOf/oneOf/allOf branches and used to apply the root-only $ref rejection at every level, blocking common composition shapes like `{anyOf:[{$ref:"#/$defs/Foo"},{type:"string"}]}`. Add an `isRoot=true` parameter; non-root recursion treats `$ref` as opaque and defers to Ajv at runtime. Tests cover nested refs in anyOf / oneOf / allOf. 2. Inaccurate package-root export comment. 
`core/src/index.ts` claimed `SyntheticOutputTool` was exported as a runtime value for the CLI's --json-schema flow, but the only construction is inside `Config.registerLazy` via a relative dynamic import — no value consumer reaches into `@qwen-code/qwen-code-core`. Revert to a type-only re-export so `SyntheticOutputTool` lines up with every other lazy-loaded tool class. 3. Unused constructor parameter. `SyntheticOutputTool` took `(_config: Config, schema)` but never read `_config`. Drop the parameter (and the corresponding pass-through at the registration call site) so readers don't wonder why a Config is being threaded through. 4. Tool description claimed "exactly once". The retry path explicitly tolerates multiple calls until one validates, so "Call this tool exactly once" is misleading to a model that tried twice. Reword to "Call this tool to deliver the final result; the first call with valid arguments ends the session" so the description matches the actual contract. 5. Asymmetric shutdown on the structured-output success path. The regular terminal path waits in a holdback loop until `hasUnfinalizedTasks()` is false; the structured-output path used to call `abortAll()` and flush immediately, dropping the matching `task_notification` for any agent whose natural handler hadn't yet enqueued it. Add a bounded holdback (capped at 500ms via STRUCTURED_SHUTDOWN_HOLDBACK_MS) — long enough for typical abort callbacks to enqueue, short enough that a hung agent can't block exit. 6. gemini.tsx exit-code asymmetry. `runNonInteractive` returns an explicit exit code, but `runNonInteractiveStreamJson` still reads `process.exitCode` after `runExitCleanup`. Currently safe because the yargs `.check` rejects --json-schema with stream-json input, but a future stream-json equivalent of structured output would need to plumb the exit code through the return value too. Document this in a comment so the constraint is visible at the call site. 
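The root-only `$ref` rule from finding 1, together with the boolean-subschema and `type` / `anyOf` / `oneOf` / `allOf` rules from the earlier commits, can be sketched roughly as follows. This is a simplified illustration, not the actual `schemaRootAcceptsObject`: the name `rootAcceptsObjectSketch` is hypothetical, and the `not`, `const`, `enum`, and `if`/`then`/`else` checks are left out.

```typescript
// Simplified sketch under stated assumptions; the real helper in
// cli/config/config.ts covers more keywords than this.
function typeAdmitsObject(type: unknown): boolean {
  if (type === undefined) return true; // absent `type` constrains nothing
  if (Array.isArray(type)) return type.includes("object");
  return type === "object";
}

function rootAcceptsObjectSketch(schema: unknown, isRoot = true): boolean {
  // Boolean subschemas: `true` matches every value, `false` matches nothing.
  if (typeof schema === "boolean") return schema;
  if (typeof schema !== "object" || schema === null || Array.isArray(schema)) {
    return false; // not a schema at all (simplification for this sketch)
  }
  const s = schema as Record<string, unknown>;

  // `$ref` is rejected only at the root; nested refs are treated as opaque
  // and left to the validator at runtime.
  if (isRoot && "$ref" in s) return false;

  // Keywords at the same level apply conjunctively.
  if (!typeAdmitsObject(s["type"])) return false;

  // anyOf / oneOf: at least one branch must admit objects.
  // An empty array fails here because `some` on [] is false.
  for (const key of ["anyOf", "oneOf"]) {
    const branches = s[key];
    if (
      Array.isArray(branches) &&
      !branches.some((b) => rootAcceptsObjectSketch(b, false))
    ) {
      return false;
    }
  }

  // allOf: every branch must admit objects.
  const all = s["allOf"];
  if (Array.isArray(all) && !all.every((b) => rootAcceptsObjectSketch(b, false))) {
    return false;
  }
  return true;
}

const nestedRef = { anyOf: [{ $ref: "#/$defs/Foo" }, { type: "string" }] };
const rootRef = { type: "object", $ref: "#/$defs/Foo" };
console.log(rootAcceptsObjectSketch(nestedRef)); // true
console.log(rootAcceptsObjectSketch(rootRef)); // false
```

Passing `isRoot` down as `false` on every recursive call is what keeps the strict `$ref` rejection from leaking into branch checks.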
Plus: strengthen `synthesises tool_result for suppressed sibling calls when structured_output fails validation` to assert the failed structured_output's `functionResponse.response` carries the actual validation error string ("args invalid"), not the synthesised "Skipped:" prose — a regression that overwrote it would otherwise slip past the existing pairing assertion. * fix(cli): close --json-schema gaps surfaced in self-audit + review Five fixes layered onto the same robustness pass over the `--json-schema` flow: 1. bare-mode registration (`packages/core/src/config/config.ts`): `qwen --bare --json-schema X -p "..."` previously skipped the synthetic `structured_output` registration entirely (the registration block lives below the bare-mode early-return), so the model had no way to terminate and the run looped to `maxSessionTurns`. Register the synthetic tool inside the bare branch too. 2. TTY interactive rejection (`packages/cli/src/gemini.tsx`): `qwen --json-schema X` on a TTY with no `-p` and no piped stdin routes to `isInteractive=true` (priority-3 fallback) and would launch the TUI, where `structured_output` is just an inert tool that prints "accepted" and lets the chat continue. Parse-time gating can't catch this (stdin isn't probed yet at parse time), so reject at runtime before the UI launches; runs `runExitCleanup` first so MCP subprocesses get torn down. 3. drain-turn structured-success flush (`packages/cli/src/nonInteractiveCli.ts`): when a drain turn captures `structured_output`, `drainLocalQueue` returns early, leaving any items the drain didn't process in `localQueue`. The prior emit path then ran `registry.abortAll()` + `emitResult` without flushing — stream-json consumers saw `task_started` events without paired `task_notification`. Add the same 500ms holdback + `flushQueuedNotificationsToSdk` the main-turn structured-success path uses, so the two paths agree. 4. 
ACP mutual-exclusion (`packages/cli/src/config/config.ts`): `--acp` runs an independent `runAcpAgent` turn loop that doesn't honour the synthetic-tool terminal contract, so `--acp --json-schema X` would register the tool but never terminate. Add a yargs `.check` rejection covering both `--acp` and the deprecated `--experimental-acp` alias.

5. max-turns + Skipped wording (review comments #3198579251/#3198579389/#3198579567 from yiliang114):

- `handleMaxTurnsExceededError` now appends a `--json-schema`-specific hint pointing at the common stuck-run causes (structured_output denied by `permissions.deny` / `--exclude-tools`, unsatisfiable schema, prompt didn't instruct the model). Without this, three different failures all surfaced as the same generic "increase maxSessionTurns" line.

- The synthesised "Skipped:" tool_result for suppressed sibling calls drops the trailing "Re-issue this call in a separate turn if needed." sentence on the success path, where the session terminates immediately and no consumer (model or SDK) can act on the advice. Retry path keeps the sentence, because the model is about to receive these parts and may legitimately re-issue.

Tests cover each fix: bare-mode registration order, ACP / experimental-acp rejection (×2), `--json-schema` hint in both text and JSON max-turns output, and explicit Skipped-text assertions on the success and retry paths.

* fix: address 9 self-qreview comments on --json-schema PR

Folds the 9 Suggestion-level comments from the previous /qreview pass into code/test fixes. Each one is a real issue, but mostly defensive; none changes the user-visible happy path.

Refactors (F4/F5/F6: code-quality)

- F4 `nonInteractiveCli.ts`: extract `SUPPRESSED_OUTPUT_SUCCESS` / `SUPPRESSED_OUTPUT_RETRY` module-level constants and a `suppressedOutputBody(structuredCaptured)` helper. Both the main-turn and drain-turn synthesis sites previously had a 4-way duplicated ternary; future wording changes can no longer drift between them.
- F5 `nonInteractiveCli.ts`: extract `emitStructuredSuccess()` closure inside `runNonInteractive`. The "abortAll → bounded holdback → flush → finalize one-shot monitors → emitResult → return 0" terminal block is now defined once and called from both the main-turn and drain-turn success paths. `finalizeOneShotMonitors` is idempotent (`oneShotMonitorsFinalized` guard) so the unconditional invocation is safe even when the drain-turn already finalized monitors before reaching the helper. - F6 `core/config/config.ts`: extract `registerStructuredOutputIfRequested()` helper. The synthetic-tool registration block is no longer duplicated between the bare-mode early-return branch and the regular registration branch. Tests (F7/F8/F9 — pin existing behaviour) - F7 `nonInteractiveCli.test.ts`: new test "holds back for in-flight background tasks before emitting structured success" — flips `hasUnfinalizedTasks: true → false` mid-poll so the holdback `while` body actually executes; spies on `abortAll` and asserts ordering of `task_notification` (must precede the result envelope) and the bounded elapsed-time cap. None of the existing structured-output success tests entered this branch (they all pinned `hasUnfinalizedTasks: () => false`). - F8 `gemini.test.tsx`: new test "rejects --json-schema when running in interactive (TUI) mode" — pins the TUI guard at gemini.tsx:694, asserting the headless-only stderr message AND the exact ordering `writeStderrLine → runExitCleanup → process.exit(1)` so a future refactor can't swap any of those steps. - F9 `cli/config.test.ts`: pin the two previously-untested `--json-schema` mutual-exclusion branches: `-i`/`--prompt-interactive` and `--input-format stream-json`. The stream-json check is load-bearing — `gemini.tsx:768` explicitly relies on this rejection holding (the parse-time `process.exitCode ?? 0` plumbing in the stream-json branch is only safe because `--json-schema` can't reach it). 
Behaviour fixes (F1/F2/F15 — privacy / security / correctness) - F1 `core/core/geminiChat.ts`: redact `functionCall.args` for `structured_output` tool calls before passing them to `chatRecordingService.recordAssistantTurn`. Without this, the user's structured payload (already emitted on stdout via `result` / `structured_result`) was persisted verbatim to `<projectDir>/chats/<sessionId>.jsonl` and re-fed into model context on `--continue` / `--resume`, contradicting the privacy contract documented next to the existing `ToolCallEvent` redaction. Each validation-failure retry was also recorded. Now mirrors the same `__redacted` placeholder. Helper extracted as `redactStructuredOutputArgsForRecording` so it's unit-testable. - F2 `cli/config/config.ts`: `resolveJsonSchemaArg`'s `@path` reader now (a) `fs.statSync`s first to refuse non-regular files (FIFOs, character devices like `/dev/zero`, directories), (b) caps the schema file at 1 MiB so an attacker who can influence the path through a wrapping process can't OOM the run, and (c) on JSON parse failure for `@path` source emits a generic "content of <path> is not valid JSON" instead of echoing the SyntaxError — Node ≥18's SyntaxError embeds a ~10-char file-content prefix in its message, which would otherwise ride out on stderr through any wrapper that surfaces the error. Inline (non-`@path`) JSON keeps the SyntaxError detail because the user is the source. - F15 `core/tools/tool-registry.ts`: `registerTool` now also checks the lazy `factories` map for name collisions, not just the eager `tools` map. An MCP server registering a tool whose name shadows a built-in lazy factory (e.g. `structured_output`) now gets auto-qualified to `mcp__<server>__<name>`, instead of silently winning the resolution. The synthetic structured-output tool no longer needs renaming for the corner case to be safe. Targeted suite (13 changed-area test files): 883/886 pass — 3 pre-existing skips. Typecheck clean on both packages. 
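The three guards described in F2 could look roughly like the sketch below. `readSchemaFile` and `MAX_SCHEMA_BYTES` are hypothetical names and the error wording is approximate; only `fs.statSync`, the regular-file check, the 1 MiB cap, and the generic parse-failure message come from the description above.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// 1 MiB cap, as described in F2. The constant name is an assumption.
const MAX_SCHEMA_BYTES = 1024 * 1024;

// Hypothetical helper; the real logic lives inside resolveJsonSchemaArg.
function readSchemaFile(filePath: string): unknown {
  // (a) Stat first and refuse anything that is not a regular file
  // (directories, FIFOs, character devices).
  const stat = fs.statSync(filePath);
  if (!stat.isFile()) {
    throw new Error(`${filePath} is not a regular file`);
  }
  // (b) Refuse oversized files before reading them into memory.
  if (stat.size > MAX_SCHEMA_BYTES) {
    throw new Error(`${filePath} exceeds the 1 MiB schema size limit`);
  }
  const text = fs.readFileSync(filePath, "utf8");
  try {
    return JSON.parse(text);
  } catch {
    // (c) Deliberately generic: the SyntaxError is not echoed because its
    // message may embed a prefix of the file content.
    throw new Error(`content of ${filePath} is not valid JSON`);
  }
}

// Demo against a throwaway temp directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "schema-demo-"));
const goodPath = path.join(demoDir, "schema.json");
const badPath = path.join(demoDir, "broken.json");
fs.writeFileSync(goodPath, JSON.stringify({ type: "object" }));
fs.writeFileSync(badPath, "{ not json");
console.log(readSchemaFile(goodPath));
```

Stat-then-read leaves a small window in which the file could change between the two calls; the sketch accepts that in exchange for never opening a non-regular file.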
* fix: address 3 deepseek-v4-pro qreview comments on --json-schema PR Three Suggestion-level comments from the latest /qreview pass. N1 — `schemaRootAcceptsObject` skips `if/then/else` (cli/config/config.ts): A schema like `{"if": true, "then": {"type": "string"}}` passed parse-time gating but is unsatisfiable for object-typed tool args at runtime — the model would loop until maxSessionTurns. Add a best-effort check for the two decidable shapes: - `if: true` → object MUST match `then`; if `then` excludes objects (boolean `false`, non-object `type`, etc.), reject at parse time. - `if: false` → object MUST match `else` (`true` if absent); same check. Object-schema `if` cases stay runtime-decidable and fall through to Ajv, matching the existing best-effort scope on `not`. 4 new test cases pin both reject and accept paths. N2 — subagent registries register `structured_output` too (core/config/config.ts, core/tools/agent/agent.ts, core/agents/backends/InProcessBackend.ts): `createApprovalModeOverride` and `buildSubagentContextOverride` rebuild the tool registry on a `Object.create(base)` config. `this.jsonSchema` propagates through the prototype chain, so `registerStructuredOutputIfRequested` was firing for every subagent registry rebuild — but only `runNonInteractive`'s main / drain loops detect a successful `structured_output` call as terminal. A subagent that called the tool would receive "Session will end now" and then keep running because its own loop has no terminator: wasted tokens, no structured payload on stdout. Add a `forSubAgent: true` option to `createToolRegistry` (alongside the existing `skipDiscovery`), and propagate it from both subagent rebuild sites. The structured-output registration helper short-circuits when the flag is set. Bare-mode init does NOT set the flag, preserving the F6 fix where `qwen --bare --json-schema X -p "..."` still gets the synthetic tool. 
New test asserts the registry rebuilt with `forSubAgent: true` registers READ_FILE / EDIT / SHELL but NOT STRUCTURED_OUTPUT. N3 — TEXT-mode `structuredResult` not integration-tested (nonInteractiveCli.test.ts): All 8 existing `--json-schema` tests pin `OutputFormat.JSON` or `STREAM_JSON`. TEXT (the default for `qwen -p ...`) has no integration coverage, so a regression in `BaseJsonOutputAdapter.buildResultMessage`'s `hasStructured ? JSON.stringify(structuredResult) : resultText` contract or in `JsonOutputAdapter.emitResult`'s text-mode `process.stdout.write(`${result}\n`)` path would only surface to plain `qwen -p` users. New test pins TEXT-mode behaviour: stdout is exactly `${JSON.stringify(structuredArgs)}\n` — no JSON envelope, no event log. Targeted suite (13 spec files): 945/948 pass — 3 pre-existing skips. Typecheck clean on both packages. * fix(cli): narrow `not` rejection in schemaRootAcceptsObject Address Critical review comment #3216123734. `schemaRootAcceptsObject`'s `not` handler previously rejected any schema whose `not.type` included `"object"`, regardless of what other constraints `not` had. That's a false positive for schemas where the extra constraints NARROW what `not` excludes: { "not": { "type": "object", "required": ["error"] } } excludes only objects with an `error` key — the value `{}` satisfies this schema fine, but the old check rejected it at parse time with "--json-schema root must accept object-typed values". Fix: only reject when `not` is exactly `{type: ...}` with no narrowing siblings (the unambiguous "every object is excluded" case). When other keywords are present (`required`, `properties`, `minProperties`, `enum`, etc.), defer to Ajv at runtime — same best-effort scope as the sibling `anyOf`/`oneOf`/`allOf` deep-content checks. 3 new test cases pin the fixed accept paths (`{not:{type:"object",required:[...]}}`, `{not:{type:"object",properties:...,required:[...]}}`, `{not:{type:"object",minProperties:1}}`). 
The existing reject test for bare `{not:{type:"object"}}` still passes.

* refactor: dedupe structured_output handling per qreview C1/C2/C3

Three Suggestion-level review comments from the latest /qreview pass.

C1: main-turn / drain-turn `structured_output` dispatch was duplicated ~120 lines (`nonInteractiveCli.ts`)

The two batch-handling sites had near-identical bodies (filter `structured_output` from the batch when `--json-schema` is active → iterate with `executeToolCall` → write to `structuredSubmission` on first valid call → synthesise tool_result events for suppressed siblings). The only meaningful difference was which `modelOverride` binding the loop wrote to (session-scoped `modelOverride` for the main turn vs per-drain-item `itemModelOverride`).

Extracted `processToolCallBatch(batchRequests, setModelOverride)` defined inside `runNonInteractive`:
- Closes over session-scoped state (`adapter`, `config`, `abortController`, `options`, `structuredSubmission`, `executeToolCall`, `handleToolError`, `suppressedOutputBody`, the progress-handler helpers).
- Takes the `modelOverride` setter as the one call-site-specific parameter so the main turn binds to the session var and the drain binds to the per-item var.

Main-turn body went from ~120 lines to a single call; drain-turn body likewise. Net file shrink ~80 lines, no behaviour change. All 42 existing structured-output tests still pass (including `stops executing remaining tool calls...`, `tries multiple structured_output calls in the same turn...`, `synthesises tool_result for suppressed sibling calls...`, `captures structured_output emitted from a drain-turn (queued notification)`).
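
The C1 extraction above can be illustrated with a minimal, synchronous sketch. It is not the code in `nonInteractiveCli.ts`: the `ToolCallRequest` / `ToolCallResult` shapes, `createSessionSketch`, and the `suppressed` list (standing in for the synthesised tool_result events) are assumptions made for illustration. Only the idea of one helper that closes over session state and receives the `modelOverride` setter as its single call-site-specific parameter comes from the message.

```typescript
// Hypothetical request/result shapes; the real types in qwen-code differ.
interface ToolCallRequest {
  name: string;
  args: Record<string, unknown>;
}
interface ToolCallResult {
  ok: boolean;
  modelOverride?: string;
}

export function createSessionSketch(
  executeToolCall: (req: ToolCallRequest) => ToolCallResult,
) {
  // Session-scoped state that the shared helper closes over.
  let modelOverride: string | undefined = undefined;
  let structuredSubmission: Record<string, unknown> | undefined = undefined;
  const suppressed: string[] = [];

  // Shared batch handler: the setter is the only call-site-specific input.
  function processToolCallBatch(
    batchRequests: ToolCallRequest[],
    setModelOverride: (model: string) => void,
  ): void {
    for (const request of batchRequests) {
      if (structuredSubmission !== undefined) {
        // A structured result was already accepted: skip trailing siblings.
        suppressed.push(request.name);
        continue;
      }
      const result = executeToolCall(request);
      if (result.modelOverride !== undefined) {
        setModelOverride(result.modelOverride);
      }
      if (request.name === "structured_output" && result.ok) {
        structuredSubmission = request.args;
      }
    }
  }

  return {
    // Main turn: the setter writes the session-scoped variable.
    runMainTurn(batch: ToolCallRequest[]): void {
      processToolCallBatch(batch, (model) => {
        modelOverride = model;
      });
    },
    // Drain turn: the setter writes a per-item binding instead.
    runDrainItem(batch: ToolCallRequest[]): string | undefined {
      const item: { modelOverride?: string } = {};
      processToolCallBatch(batch, (model) => {
        item.modelOverride = model;
      });
      return item.modelOverride;
    },
    snapshot: () => ({
      modelOverride,
      structuredSubmission,
      suppressed: [...suppressed],
    }),
  };
}
```

Passing the setter rather than the variable keeps both call sites down to one line each while leaving the decision of which binding to update with the caller.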
C2 + C3: `{__redacted: '…'}` placeholder duplicated in two files (`telemetry/types.ts` + `core/geminiChat.ts`)

The `ToolCallEvent` constructor (for telemetry surfaces: OTLP / QwenLogger / ui-telemetry / chat-recording UI event mirror) and `redactStructuredOutputArgsForRecording` (for the on-disk chat-recording JSONL) each had a verbatim copy of:

    { __redacted: 'structured_output payload (see stdout result)' }

If the redaction wording (or the `__redacted` key, or the placeholder text) ever drifted between the two surfaces, the privacy contract would be subtly broken on one and not the other. Hoisted to `STRUCTURED_OUTPUT_REDACTED_ARGS` exported from `packages/core/src/tools/syntheticOutput.ts`, imported in both sites. The constant carries its rationale in a JSDoc block so future readers see both call sites at once.

Targeted suite (13 spec files): 961/964 pass, 3 pre-existing skips. Typecheck clean on both packages.

---------

Co-authored-by: wenshao <wenshao@U-K7F6PQY3-2157.local>		2026-05-11 14:21:55 +08:00
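
For readers of the parse-time gating described in N1 and in the `not` fix above, the rules can be approximated by the following sketch. It is an illustration under assumptions, not the actual `schemaRootAcceptsObject`: the function name `rootMayAcceptObject`, the `JsonSchema` alias, and the helper `typeAllowsObject` are hypothetical, and the `anyOf`/`oneOf`/`allOf` checks mentioned in the message are deliberately left out.

```typescript
// Hypothetical alias: a JSON Schema is either a boolean or a keyword map.
type JsonSchema = boolean | { [keyword: string]: unknown };

/** True when a `type` keyword value (string or array) admits "object". */
function typeAllowsObject(type: unknown): boolean {
  if (type === undefined) return true; // an absent `type` accepts any value
  if (Array.isArray(type)) return type.includes("object");
  return type === "object";
}

/**
 * Best-effort check: returns false only when the root schema clearly cannot
 * validate an object. Everything undecidable returns true, so the runtime
 * validator gets the final say.
 */
export function rootMayAcceptObject(schema: JsonSchema): boolean {
  if (typeof schema === "boolean") return schema; // `false` matches nothing
  if (!typeAllowsObject(schema.type)) return false;

  // Decidable `if` shapes: a literal `true` forces `then`,
  // a literal `false` forces `else` (which defaults to `true` when absent).
  if (schema.if === true && schema.then !== undefined) {
    if (!rootMayAcceptObject(schema.then as JsonSchema)) return false;
  }
  if (schema.if === false && schema.else !== undefined) {
    if (!rootMayAcceptObject(schema.else as JsonSchema)) return false;
  }

  // `not`: reject only the bare `{type: ...}` form that excludes every object.
  // Sibling keywords (required, properties, ...) narrow the exclusion, so
  // those cases are deferred to runtime validation.
  const not = schema.not;
  if (typeof not === "object" && not !== null && !Array.isArray(not)) {
    const keys = Object.keys(not);
    const bareType = keys.length === 1 && keys[0] === "type";
    if (bareType && typeAllowsObject((not as { type: unknown }).type)) {
      return false;
    }
  }
  return true;
}
```

With this sketch, `{"if": true, "then": {"type": "string"}}` and bare `{"not": {"type": "object"}}` are rejected, while `{"not": {"type": "object", "required": ["error"]}}` is accepted and left for the validator to judge at runtime.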
..
scripts	Fix: Improve ripgrep binary detection and cross-platform compatibility (#1060)	2025-11-18 19:38:30 +08:00
src	feat(cli): add --json-schema for structured output in headless mode (#3598)	2026-05-11 14:21:55 +08:00
vendor	feat test tool permissions	2026-03-10 16:30:22 +08:00
index.ts	fix: Remove remaining ClearcutLogger export from packages/core/index.ts	2026-02-01 14:52:14 +08:00
package.json	chore(release): v0.15.10 [skip ci]	2026-05-10 16:38:09 +08:00
test-setup.ts	feat(memory): managed auto-memory and auto-dream system (#3087)	2026-04-16 20:05:45 +08:00
tsconfig.json	fix: upgrade @lydell/node-pty to 1.2.0-beta.10 to fix PTY FD leak	2026-04-01 07:55:56 +08:00
vitest.config.ts	Sync upstream Gemini-CLI v0.8.2 (#838)	2025-10-23 09:27:04 +08:00