Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- Fix generationConfigSources to preserve existing source info when auto-detecting max_tokens
- Add unit tests for max_tokens fallback logic
When modelProviders config does not specify samplingParams.max_tokens,
requests to non-Qwen models (Claude, GPT, Gemini, etc.) omit max_tokens
entirely. Many APIs default to a small value (e.g., Anthropic via
VertexAI defaults to 4096), causing long responses to be truncated
mid-generation — often breaking tool call parameters.
Fix: apply tokenLimit(model, 'output') as a fallback in
applyResolvedModelDefaults(), following the same pattern already used
for contextWindowSize and modalities auto-detection.
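As a minimal sketch (the config shape here is an assumption; tokenLimit is the helper from tokenLimits.ts):

```ts
// Sketch only: the real applyResolvedModelDefaults operates on the CLI's
// resolved model config; this illustrates just the fallback.
interface SamplingParams {
  max_tokens?: number;
}

function applyMaxTokensFallback(
  model: string,
  samplingParams: SamplingParams,
  tokenLimit: (model: string, kind: 'output') => number,
): void {
  // Leave an explicit user/provider value untouched; only fill the gap,
  // mirroring the contextWindowSize and modalities auto-detection.
  samplingParams.max_tokens ??= tokenLimit(model, 'output');
}
```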
Output limits from tokenLimits.ts:
- Claude Opus 4.6: 128K
- Claude Sonnet 4.6 / fallback: 64K
- GPT-5.x: 128K
- Gemini 3.x: 64K
- Qwen 3.5: 64K
Made-with: Cursor
When a file is skipped because the model doesn't support a modality,
it should not be treated as an error. The error field was incorrectly
being set alongside the informational message.
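A sketch of the intended result shape (field and function names are illustrative, not the tool's actual types):

```ts
// Illustrative only: a modality skip yields an informational message with
// no error attached, so callers do not treat the skip as a failure.
interface FileReadResult {
  message?: string;
  error?: string; // reserved for genuine failures
}

function skipUnsupportedModality(path: string, modality: string): FileReadResult {
  // Before the fix, `error` was also populated here.
  return { message: `Skipped ${path}: model does not support ${modality}` };
}
```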
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Adds the required getContentGeneratorConfig mock to read-file.test.ts
and pathReader.test.ts to fix failing tests that depend on content
generator configuration.
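A minimal Vitest-style sketch of the added mock (the returned shape is an assumption):

```ts
import { vi } from 'vitest';

// Stub just enough config for the code under test; the fields returned
// here are placeholders, not the real content generator config.
const mockConfig = {
  getContentGeneratorConfig: vi.fn().mockReturnValue({ model: 'test-model' }),
};
```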
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
This fixes session corruption issues where the modality check was based on
the model name rather than the actual resolved config, causing inconsistent
behavior when the config's modalities differed from the defaults.
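Conceptually the check moves from name matching to the resolved config; a sketch (names assumed):

```ts
// Sketch: trust the resolved config's modalities, which may differ from
// the defaults inferred from the model name.
function supportsModality(
  resolved: { modalities?: string[] },
  modality: string,
): boolean {
  return resolved.modalities?.includes(modality) ?? false;
}
```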
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- Increase description warning threshold from 500 to 1,000 characters
- Change system prompt 10,000 char limit from error to warning
- Remove intermediate 5,000 char warning threshold for system prompts
- Update documentation to reflect soft warning behavior
This provides more flexibility for users while still guiding them
toward better practices.
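A sketch of the resulting behavior (constants from the bullets above; the function shape is assumed):

```ts
const DESCRIPTION_WARN_CHARS = 1_000; // raised from 500
const SYSTEM_PROMPT_WARN_CHARS = 10_000; // demoted from hard error

// Sketch: long system prompts now produce a soft warning, never an error.
function validateSystemPrompt(prompt: string, warn: (msg: string) => void): void {
  if (prompt.length > SYSTEM_PROMPT_WARN_CHARS) {
    warn(
      `System prompt is ${prompt.length} characters; consider trimming it below ${SYSTEM_PROMPT_WARN_CHARS}.`,
    );
  }
}
```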
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Restructures the convertOpenAIResponseToGemini method to set
response.candidates within the choice conditional block, making
the empty choices handling more explicit and the control flow clearer.
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- Add 'array' type support to SettingItemDefinition
- Change hooks field from object to array type
- Add additionalProperties constraint for env fields
- Fix additionalProperties generation to only apply for object types
This ensures the hooks configuration schema correctly represents hooks as an array
and properly validates environment variable objects.
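A sketch of the resulting definitions (shapes assumed from the bullets above):

```ts
// Sketch of an extended SettingItemDefinition; the real type lives in the
// settings schema module and may differ in detail.
interface SettingItemDefinition {
  type: 'string' | 'number' | 'boolean' | 'object' | 'array';
  items?: SettingItemDefinition; // element schema when type === 'array'
  properties?: Record<string, SettingItemDefinition>;
  additionalProperties?: boolean | SettingItemDefinition; // objects only
}

// hooks as an array of objects whose env values must be strings.
const hooksSetting: SettingItemDefinition = {
  type: 'array',
  items: {
    type: 'object',
    properties: {
      env: { type: 'object', additionalProperties: { type: 'string' } },
    },
  },
};
```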
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Replay raw PTY output through a terminal buffer to properly collapse
carriage-return progress updates (e.g., git clone progress) instead of
showing all intermediate states.
This ensures the final output reflects what a user would actually see
in their terminal, with progress bars and spinners collapsed to their
final state.
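A minimal sketch of the replay, assuming @xterm/headless (the wiring, dimensions, and helper name are illustrative):

```ts
import { Terminal } from '@xterm/headless';

// Feed raw PTY bytes through a headless terminal so carriage returns
// overwrite the same row instead of piling up as separate lines.
async function collapseProgressOutput(rawChunks: string[]): Promise<string> {
  const term = new Terminal({ cols: 80, rows: 24, allowProposedApi: true });
  for (const chunk of rawChunks) {
    await new Promise<void>((resolve) => term.write(chunk, resolve));
  }
  const buffer = term.buffer.active;
  const lines: string[] = [];
  for (let i = 0; i < buffer.length; i++) {
    lines.push(buffer.getLine(i)?.translateToString(true) ?? '');
  }
  return lines.join('\n').trimEnd();
}
```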
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
The Terminal.prototype.write mock's data parameter needed to accept
string | Uint8Array to match the actual method signature.
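Roughly, as a sketch (import path assumed):

```ts
import { vi } from 'vitest';
import { Terminal } from '@xterm/headless';

// The mock must accept the same union as the real write().
vi.spyOn(Terminal.prototype, 'write').mockImplementation(function (
  this: Terminal,
  data: string | Uint8Array,
  callback?: () => void,
) {
  callback?.();
});
```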
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
StartSessionEvent requires getTruncateToolOutputThreshold,
getTruncateToolOutputLines, getIdeMode, getShouldUseNodePtyShell,
and getHookSystem from the config object.
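So a test config stub now looks roughly like this (return values are placeholders):

```ts
import { vi } from 'vitest';

// Stub config exposing all five getters StartSessionEvent reads.
const config = {
  getTruncateToolOutputThreshold: vi.fn().mockReturnValue(40_000),
  getTruncateToolOutputLines: vi.fn().mockReturnValue(1_000),
  getIdeMode: vi.fn().mockReturnValue(false),
  getShouldUseNodePtyShell: vi.fn().mockReturnValue(false),
  getHookSystem: vi.fn().mockReturnValue(undefined),
};
```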
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
When the PTY emits data rapidly, headless terminal rendering falls behind
the incoming buffers, and the render queue can still be draining when
finalize() runs at exit. Additionally, node-pty can fire
onExit before all onData events have been delivered.
This fix:
- Captures raw output immediately before processing
- Uses setImmediate drain mechanism to flush late onData events
- Builds output from raw chunks instead of xterm buffer to avoid
scrollback limit data loss
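A sketch of the drain idea (the finalize sink and the command are hypothetical):

```ts
import { spawn } from 'node-pty';

// Hypothetical sink for the completed output.
declare function finalize(output: string, exitCode: number): void;

const rawChunks: string[] = [];
const pty = spawn('git', ['clone', 'https://example.com/repo.git'], {});

// Capture raw output immediately, before any async terminal rendering.
pty.onData((data) => rawChunks.push(data));

pty.onExit(async ({ exitCode }) => {
  // node-pty may fire onExit before the last onData has been delivered;
  // yield a couple of macrotask turns so late events can drain.
  await new Promise<void>((resolve) => setImmediate(resolve));
  await new Promise<void>((resolve) => setImmediate(resolve));
  // Build output from the raw chunks, not the xterm buffer, so data
  // beyond the scrollback limit is not lost.
  finalize(rawChunks.join(''), exitCode);
});
```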
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Replace average-line-length estimation with explicit head/tail budgets.
Remove line wrapping; save original content to file. Handle very long
lines by truncating with ellipsis. Add tests for edge cases with
variable line lengths.
This ensures truncated output stays predictably near the character
threshold, avoiding cases where long lines in the tail would blow past
the budget.
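A sketch of the head/tail approach (constants and names are illustrative):

```ts
// Sketch: keep an explicit slice of leading and trailing lines, truncating
// any single line that would blow the per-line budget.
function truncateHeadTail(
  text: string,
  headLines = 20,
  tailLines = 20,
  maxLineChars = 2_000,
): string {
  const clip = (line: string) =>
    line.length > maxLineChars ? line.slice(0, maxLineChars) + '…' : line;

  const lines = text.split('\n');
  if (lines.length <= headLines + tailLines) {
    return lines.map(clip).join('\n');
  }
  const head = lines.slice(0, headLines).map(clip);
  const tail = lines.slice(-tailLines).map(clip);
  const omitted = lines.length - headLines - tailLines;
  return [...head, `... [${omitted} lines omitted] ...`, ...tail].join('\n');
}
```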
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Remove the summarizeToolOutput setting and related functionality.
This feature allowed LLM-based summarization of shell tool output but is no longer needed.
This simplifies the codebase by removing unused summarization logic and configuration options.
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- Extract truncateAndSaveToFile to utils/truncation.ts with tests
- Move truncation handling from CoreToolScheduler to ShellTool
- Remove outputFile field from ToolCallResponseInfo and display types
- Add line limit constraint alongside character threshold for truncation
This improves separation of concerns by handling output truncation at the tool level where the output is generated, rather than centrally in the scheduler.
- Remove deprecated fields: embedding_model, api_key_enabled, vertex_ai_enabled, log_prompts_enabled
- Add new fields: truncate_tool_output_threshold, truncate_tool_output_lines, hooks, ide_enabled, interactive_shell_enabled
This aligns telemetry data with the current CLI configuration options.
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Remove the enableToolOutputTruncation boolean setting. Users can now
disable truncation by setting truncateToolOutputThreshold to 0 or a
negative value instead of using a separate toggle.
This simplifies the configuration and makes it harder to accidentally
disable truncation, which could cause memory issues with large tool
outputs.
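Sketched, the check reduces to:

```ts
// Sketch: a non-positive threshold now means "truncation disabled",
// replacing the removed boolean toggle.
function isTruncationEnabled(truncateToolOutputThreshold: number): boolean {
  return truncateToolOutputThreshold > 0;
}
```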
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- Install AbortController before awaiting previous prompt so session/cancel
during the wait targets the correct prompt
- Check if cancelled after waiting for previous prompt to complete
- Drop untagged streamEnd events when a tagged stream is active
This prevents race conditions where a new prompt could be incorrectly
cancelled or have its state cleared by stale events from a previous prompt.
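A sketch of the ordering (names are illustrative):

```ts
let activeController: AbortController | undefined;

// Sketch: install the controller before awaiting the previous prompt so a
// cancel arriving during the wait aborts this prompt, not the old one.
async function runPrompt(
  previousPrompt: Promise<void> | undefined,
  start: (signal: AbortSignal) => Promise<void>,
): Promise<void> {
  const controller = new AbortController();
  activeController = controller; // a cancel request aborts this controller

  await previousPrompt; // cancel may fire during this wait

  // Re-check after the wait: the user may have cancelled meanwhile.
  if (controller.signal.aborted) return;

  await start(controller.signal);
}
```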
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Add `forRequestId` parameter to `sendStreamEnd()` to detect and ignore
stale calls when a newer request has taken over shared state.
This prevents stale handleSendMessage invocations from emitting
streamEnd events tagged with the wrong request ID.
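Sketched (the emit side is simplified):

```ts
let currentRequestId = 0;

// Sketch: a stale handleSendMessage passes the id it was started with; if a
// newer request owns the stream, the streamEnd is dropped.
function sendStreamEnd(forRequestId: number, emit: (id: number) => void): void {
  if (forRequestId !== currentRequestId) {
    return; // superseded; do not emit with the wrong request ID
  }
  emit(forRequestId);
}
```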
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- Add pendingPromptCompletion tracking to ensure new prompts wait for
previous prompt's tool results before reading chat history
- Add requestId correlation between streamStart/streamEnd to detect and
discard stale events from cancelled requests
- Guard against duplicate streamEnd messages
- Preserve isWaitingForResponse during cancel to prevent auto-submit
This fixes malformed history issues in VS Code extension where rapid
prompt cancellation and resubmission caused tool_call → user_query →
tool_result ordering instead of the correct sequence.
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Previously, `generateQualitativeInsights` used `Promise.all` with a
`generate` helper that re-threw errors. A single LLM call failure
(timeout, rate limit, JSON parse error) caused the entire `qualitative`
object to become `undefined`, hiding all detailed report sections.
Now individual `generate` calls catch errors and return `undefined`
instead of throwing. The `QualitativeInsights` interface fields are
made optional so partial results render correctly — each React section
component already guards against missing data with `if (!field) return
null`.
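In sketch form (the llm* helpers are hypothetical):

```ts
// Each insight is generated independently; one failure yields undefined for
// that field instead of rejecting the whole Promise.all.
async function generate<T>(task: () => Promise<T>): Promise<T | undefined> {
  try {
    return await task();
  } catch {
    return undefined; // timeout, rate limit, JSON parse error, ...
  }
}

declare function llmSummary(input: string): Promise<string>;
declare function llmRisks(input: string): Promise<string[]>;

async function qualitative(input: string) {
  const [summary, risks] = await Promise.all([
    generate(() => llmSummary(input)),
    generate(() => llmRisks(input)),
  ]);
  return { summary, risks }; // fields may be undefined; sections guard on it
}
```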
Made-with: Cursor
Replace isContinuation boolean with SendMessageType enum for clearer
message type semantics. Add stripOrphanedUserEntriesFromHistory() to
clean up orphaned user entries in chat history before retry operations,
preventing 'messages with role tool must be a response to preceding
message with tool_calls' API errors.
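One plausible shape of the cleanup, sketched with a simplified message type:

```ts
type Role = 'user' | 'assistant' | 'tool';
interface Message {
  role: Role;
  toolCalls?: unknown[];
}

// Sketch: drop user entries that landed between an assistant tool_call and
// its tool result, which would otherwise break the required ordering.
function stripOrphanedUserEntriesFromHistory(history: Message[]): Message[] {
  const result: Message[] = [];
  for (let i = 0; i < history.length; i++) {
    const msg = history[i];
    const prev = result[result.length - 1];
    const isOrphan =
      msg.role === 'user' &&
      prev?.role === 'assistant' &&
      (prev.toolCalls?.length ?? 0) > 0 &&
      history[i + 1]?.role === 'tool';
    if (!isOrphan) result.push(msg);
  }
  return result;
}
```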
Fixes #2360
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
The streaming path (convertOpenAIChunkToGemini) already uses optional
chaining on choices and guards with `if (choice)`, but the
non-streaming path accesses choices[0] directly. Providers that return
an empty choices array cause a TypeError crash.
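The fix mirrors that guard; a sketch with simplified shapes:

```ts
interface OpenAIChoice {
  message?: { content?: string };
}
interface OpenAIResponse {
  choices?: OpenAIChoice[];
}

// Guard before indexing, as the streaming path already does.
function extractText(response: OpenAIResponse): string | undefined {
  const choice = response.choices?.[0];
  if (!choice) return undefined; // empty choices: no candidates, no crash
  return choice.message?.content;
}
```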
Made-with: Cursor
The model id DeepSeek-R1 normalizes to `deepseek-r1`, which does not
match the existing `^deepseek-reasoner` output pattern, so it falls
through to the 8K default. DeepSeek R1 supports 64K output tokens,
same as deepseek-reasoner. Without this fix, responses from
deepseek-r1 are silently truncated at 8K tokens.
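A sketch of the widened pattern (the surrounding lookup style in tokenLimits.ts is assumed):

```ts
// Match both the API alias and the normalized open-weights name.
const DEEPSEEK_REASONING = /^deepseek-(reasoner|r1)/;

function outputTokenLimit(normalizedModel: string): number {
  if (DEEPSEEK_REASONING.test(normalizedModel)) {
    return 65_536; // 64K: deepseek-reasoner and deepseek-r1 alike
  }
  return 8_192; // the default that silently truncated deepseek-r1
}
```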
Made-with: Cursor
The temporary debug log session setup at the start of loadSettings() was
removed along with unused imports (setDebugLogSession, sanitizeCwd). The
resolvedWorkspaceDir variable is now defined where it's actually used.
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>