This aligns the test with the updated error handling that uses `status` instead of `code` for HTTP status codes.
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- Add cachedContentTokenCount tracking in uiTelemetry service
- Collect cached_content_token_count from streaming usage metadata
- Use cached tokens instead of estimated overhead when available
- Fix messages token calculation to avoid 'messages = 0' issue
This improves context window display accuracy when using providers
that support prefix caching (e.g., DashScope).
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Add comprehensive documentation for the Agent Arena feature, covering
usage, configuration, best practices, troubleshooting, and limitations.
Update navigation metadata to include the new page.
This enables users to discover and learn about the multi-model comparison
capability for competitive task execution.
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- Move getErrorStatus from retry.ts to errors.ts for better organization
- Add getErrorType utility to extract error class/category names
- Enhance getErrorMessage to include cause chain for better debugging
- Refactor ApiErrorEvent to use options object pattern (more readable)
- Rename 'error' to 'error_message' in ApiErrorEvent for clarity
- Make isQwenQuotaExceededError more precise: requires status=429,
code='insufficient_quota', and 'free allocated quota exceeded' message
- Update all tests to match new error detection behavior
This improves error telemetry and makes quota detection more reliable.
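
For illustration, a minimal sketch of the stricter quota check and the cause-chain message. The function names come from the bullets above. The bodies, the `readProp` helper, the `; caused by: ` separator, the depth limit of 5, and the case-insensitive message match are all assumptions, not the actual implementation:

```typescript
// Illustrative sketch only: names follow the commit message, bodies are assumed.

/** Hypothetical helper: read a property from an unknown value safely. */
function readProp(value: unknown, key: string): unknown {
  if (typeof value !== 'object' || value === null) return undefined;
  return (value as Record<string, unknown>)[key];
}

/** Return the numeric HTTP status carried by an error, if any. */
function getErrorStatus(error: unknown): number | undefined {
  const status = readProp(error, 'status');
  return typeof status === 'number' ? status : undefined;
}

/** Return the error class name, or the typeof tag for non-Error values. */
function getErrorType(error: unknown): string {
  return error instanceof Error ? error.constructor.name : typeof error;
}

/** Build a message that includes the cause chain (depth limit is assumed). */
function getErrorMessage(error: unknown): string {
  const messages: string[] = [];
  let current: unknown = error;
  for (let depth = 0; current !== undefined && current !== null && depth < 5; depth++) {
    messages.push(current instanceof Error ? current.message : String(current));
    current = readProp(current, 'cause');
  }
  return messages.length > 0 ? messages.join('; caused by: ') : String(error);
}

/** All three conditions must hold; matching on any single one is not enough. */
function isQwenQuotaExceededError(error: unknown): boolean {
  const message = readProp(error, 'message');
  return (
    getErrorStatus(error) === 429 &&
    readProp(error, 'code') === 'insufficient_quota' &&
    typeof message === 'string' &&
    message.toLowerCase().includes('free allocated quota exceeded')
  );
}
```

With this shape, a plain 429 rate-limit response that lacks the quota code or message is not classified as a quota error.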
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- Add truncateToolOutput helper in truncation.ts to centralize threshold reading, file saving, and telemetry logging
- Refactor shell.ts to use the new helper, removing duplicate code
- Add truncation support for MCP tool output while preserving non-text content (images, audio, resources)
- Refactor getDisplayFromParts to work on transformed Part[] instead of raw MCP response
This reduces code duplication and ensures consistent truncation behavior across shell and MCP tools.
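
For illustration, a rough sketch of truncating only the text parts while passing non-text parts through untouched. The `Part` shape, the `truncateTextParts` name, and the character-based budget are hypothetical. The real truncateToolOutput helper also reads the threshold from config, saves the full output to a file, and logs telemetry, none of which is shown here:

```typescript
// Illustrative sketch only: types and names are assumptions for this example.
type Part =
  | { text: string }
  | { inlineData: { mimeType: string; data: string } };

interface TruncationResult {
  parts: Part[];
  truncated: boolean;
}

/**
 * Apply a shared character budget across text parts, in order.
 * Non-text parts (images, audio, resources) are always kept as-is.
 */
function truncateTextParts(parts: Part[], maxChars: number): TruncationResult {
  let remaining = Math.max(0, maxChars);
  let truncated = false;
  const out: Part[] = [];

  for (const part of parts) {
    if (!('text' in part)) {
      out.push(part); // non-text content does not consume the budget
      continue;
    }
    if (part.text.length <= remaining) {
      out.push(part);
      remaining -= part.text.length;
    } else {
      truncated = true;
      if (remaining > 0) {
        out.push({ text: part.text.slice(0, remaining) });
      }
      remaining = 0; // later text parts are dropped entirely
    }
  }
  return { parts: out, truncated };
}
```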
- Added candidatesTokens prop to LoadingIndicator for displaying token counts.
- Updated formatting to show elapsed time and token counts inline.
- Refactored tests to validate new token display functionality and formatting changes.
- Introduced formatTokenCount utility for consistent token count representation.
This improves user feedback during loading states by providing clearer information on token usage.
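
For illustration, one way such a formatter could look. The thresholds, the `k`/`M` suffixes, and the `formatLoadingStatus` helper are assumptions for this sketch. The actual formatTokenCount output and the inline layout may differ:

```typescript
// Illustrative sketch only: exact output format is an assumption.

/** Format a token count compactly, e.g. 1234 becomes "1.2k". */
function formatTokenCount(count: number): string {
  if (count < 1_000) return String(count);
  if (count < 1_000_000) return `${(count / 1_000).toFixed(1)}k`;
  return `${(count / 1_000_000).toFixed(1)}M`;
}

/** Hypothetical inline status: elapsed time, then tokens when known. */
function formatLoadingStatus(elapsedSeconds: number, candidatesTokens?: number): string {
  const elapsed = `${elapsedSeconds}s`;
  if (candidatesTokens === undefined || candidatesTokens <= 0) return `(${elapsed})`;
  return `(${elapsed}, ${formatTokenCount(candidatesTokens)} tokens)`;
}
```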
- Add hasExplicitOutputLimit() to detect models with defined output limits
- For known models: cap the user's max_tokens at the model limit to avoid API errors
- For unknown models (deployment aliases, self-hosted): respect user config
- Update tests to cover new behavior
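
For illustration, a minimal sketch of the capping rule. The limit table, the `example-model` entry, and the `resolveMaxTokens` helper are hypothetical. Only hasExplicitOutputLimit is named in this change, and its real signature may differ:

```typescript
// Illustrative sketch only: the table and resolveMaxTokens are assumptions.
const KNOWN_OUTPUT_LIMITS = new Map<string, number>([
  ['example-model', 8192], // hypothetical entry, not a real model limit
]);

/** True when the model has a defined output limit. */
function hasExplicitOutputLimit(model: string): boolean {
  return KNOWN_OUTPUT_LIMITS.has(model);
}

/**
 * Known models: cap the user's max_tokens at the model limit.
 * Unknown models (deployment aliases, self-hosted): keep the user's value.
 */
function resolveMaxTokens(model: string, userMaxTokens?: number): number | undefined {
  if (userMaxTokens === undefined) return undefined;
  const limit = KNOWN_OUTPUT_LIMITS.get(model);
  if (limit === undefined) return userMaxTokens;
  return Math.min(userMaxTokens, limit);
}
```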
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>