Adds a summary view that runs after Arena agents finish, so users can
compare model outputs without opening each agent's conversation first.
Summary surface:
- Agent status overview
- Files changed in common vs. unique to one agent
- Per-agent approach summary generated through that agent's own provider
- Token / runtime / line-change / file-count metrics
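A minimal sketch of how the common-vs-unique file split could be computed. The function name and the input shape are hypothetical, and "in common" is assumed here to mean "changed by every agent"; the actual summary view may define it differently.

```typescript
// Hypothetical helper: partitions changed files into those every agent
// touched and those exactly one agent touched.
function partitionChangedFiles(filesByAgent: Map<string, string[]>): {
  common: string[];
  unique: Map<string, string[]>;
} {
  // Map each file path to the set of agents that changed it.
  const owners = new Map<string, Set<string>>();
  for (const [agent, files] of filesByAgent) {
    for (const file of files) {
      const agents = owners.get(file) ?? new Set<string>();
      agents.add(agent);
      owners.set(file, agents);
    }
  }

  const common: string[] = [];
  const unique = new Map<string, string[]>();
  for (const [file, agents] of owners) {
    if (filesByAgent.size > 1 && agents.size === filesByAgent.size) {
      common.push(file); // assumption: "common" = changed by all agents
    } else if (agents.size === 1) {
      for (const agent of agents) {
        unique.set(agent, [...(unique.get(agent) ?? []), file]);
      }
    }
  }
  return { common: common.sort(), unique };
}
```

Files changed by some but not all agents land in neither bucket in this sketch.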
Selection dialog now supports:
- p — toggle preview for the highlighted agent
- d — toggle detailed diff
- Enter — select winner
- x — discard all results
- Esc — cancel
Approach summary generation:
- Each agent's summary is generated through that agent's own content
  generator, keeping mixed-provider Arena sessions within their
  respective auth boundaries
- 20s timeout + AbortController per agent, bounded prompt inputs
  (finalText 2K, transcript 6K, diff 6K)
- Falls back to a deterministic "Changed N files ..." summary when no
  per-agent generator is available or on error
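The timeout, input bounding, and fallback described above can be sketched roughly as follows. All type and function names are hypothetical, the 2K/6K limits are assumed to be character counts, and the fallback text is cut at "Changed N files" because the rest of the message is elided in this description.

```typescript
// Hypothetical shapes; the real Arena types are not shown in this message.
interface AgentResult {
  finalText: string;
  transcript: string;
  diff: string;
  filesChanged: number;
}
type SummaryGenerator = (prompt: string, signal: AbortSignal) => Promise<string>;

// Assumption: limits are measured in characters.
const LIMITS = { finalText: 2_000, transcript: 6_000, diff: 6_000 };
const TIMEOUT_MS = 20_000;

function fallbackSummary(result: AgentResult): string {
  return `Changed ${result.filesChanged} files`;
}

function buildPrompt(result: AgentResult): string {
  return [
    result.finalText.slice(0, LIMITS.finalText),
    result.transcript.slice(0, LIMITS.transcript),
    result.diff.slice(0, LIMITS.diff),
  ].join("\n---\n");
}

async function summarizeApproach(
  result: AgentResult,
  generate?: SummaryGenerator, // the agent's own generator, if any
  timeoutMs: number = TIMEOUT_MS,
): Promise<string> {
  if (!generate) return fallbackSummary(result);
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // The generator must honor the signal for the timeout to take effect.
    return await generate(buildPrompt(result), controller.signal);
  } catch {
    return fallbackSummary(result); // timeout or provider error
  } finally {
    clearTimeout(timer);
  }
}
```

Passing each agent's own generator keeps the request on that agent's provider, which is how the auth boundary is preserved in this sketch.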
Diff summary now handles binary, rename-only, and mode-only diffs;
the previous heuristic required textual +/- hunks and would have
dropped those.
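As an illustration of why hunk-only detection misses these cases, here is a sketch that classifies each file section of a unified `git diff` by its header lines rather than by +/- hunks. The function and type names are hypothetical and this is not the actual implementation.

```typescript
type ChangeKind = "text" | "binary" | "rename-only" | "mode-only" | "other";

interface FileChange {
  header: string; // the "diff --git a/... b/..." line
  kind: ChangeKind;
}

// Hypothetical classifier over `git diff` output.
function classifyDiff(diff: string): FileChange[] {
  return diff
    .split(/^(?=diff --git )/m) // one section per file
    .filter((section) => section.startsWith("diff --git "))
    .map((section) => {
      const header = section.split("\n", 1)[0];
      const hasHunk = /^@@ /m.test(section);
      const isBinary =
        /^Binary files .* differ/m.test(section) ||
        /^GIT binary patch/m.test(section);
      let kind: ChangeKind = "other";
      if (isBinary) kind = "binary";
      else if (hasHunk) kind = "text";
      else if (/^rename from /m.test(section)) kind = "rename-only";
      else if (/^old mode /m.test(section)) kind = "mode-only";
      return { header, kind };
    });
}
```

Only the first kind contains `@@` hunks, so a summary that counts +/- lines alone would report nothing for the other three.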
Resolves #2559
* feat(core): add dynamic swarm worker tool
Add a swarm tool for ad-hoc parallel worker execution with bounded
concurrency, wait-all and first-success modes, per-worker failure
isolation, and aggregated results.
Register the tool in core, prevent nested worker recursion, and
document the new workflow.
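A rough sketch of bounded-concurrency execution with the two modes and per-worker failure isolation. `runSwarm` and its option names are hypothetical, not the tool's actual interface, and in this sketch first-success stops scheduling new workers without cancelling those already in flight.

```typescript
type WorkerTask<T> = () => Promise<T>;
type WorkerOutcome<T> =
  | { index: number; ok: true; value: T }
  | { index: number; ok: false; error: string };

// Hypothetical runner: at most `concurrency` tasks are in flight at once.
async function runSwarm<T>(
  tasks: WorkerTask<T>[],
  opts: { concurrency: number; mode: "wait-all" | "first-success" },
): Promise<WorkerOutcome<T>[]> {
  const outcomes: WorkerOutcome<T>[] = [];
  let next = 0;
  let stop = false;

  // Each lane pulls the next unstarted task until none remain.
  const lane = async (): Promise<void> => {
    while (!stop && next < tasks.length) {
      const index = next++;
      try {
        const value = await tasks[index]();
        outcomes.push({ index, ok: true, value });
        if (opts.mode === "first-success") stop = true;
      } catch (err) {
        // Failure isolation: one worker's error does not reject the swarm.
        const error = err instanceof Error ? err.message : String(err);
        outcomes.push({ index, ok: false, error });
      }
    }
  };

  const laneCount = Math.max(1, Math.min(opts.concurrency, tasks.length));
  await Promise.all(Array.from({ length: laneCount }, lane));
  return outcomes.sort((a, b) => a.index - b.index); // aggregated results
}
```

Because failures are recorded as outcomes rather than thrown, the caller always receives one aggregated array and can decide how to report partial success.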
* fix(core): harden swarm worker execution
Prevent swarm calls from bypassing the outer scheduler concurrency
budget.
Disallow interactive question prompts in swarm workers by default, and
avoid incomplete Markdown table escaping by using an HTML entity for
pipe characters. Add focused tests for the scheduler behavior, worker
tool restrictions, and result formatting.
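To illustrate the pipe-escaping change, here is a sketch that replaces `|` with a numeric character reference before building a Markdown table row. The helper names and row shape are hypothetical, the choice of `&#124;` is an assumption about which entity is used, and flattening newlines is an extra step added here for illustration.

```typescript
// Backslash-escaping ("\|") can be undone by a backslash already present in
// the cell text, so the pipe is replaced with an entity instead.
function escapeTableCell(text: string): string {
  return text
    .replace(/\|/g, "&#124;") // assumption: numeric entity for "|"
    .replace(/\r?\n/g, " "); // illustrative extra: keep each row on one line
}

interface WorkerRow {
  worker: string;
  status: string;
  output: string;
}

// Hypothetical formatter for aggregated worker results.
function formatResultsTable(rows: WorkerRow[]): string {
  const lines = ["| Worker | Status | Output |", "| --- | --- | --- |"];
  for (const row of rows) {
    const cells = [row.worker, row.status, row.output].map(escapeTableCell);
    lines.push(`| ${cells.join(" | ")} |`);
  }
  return lines.join("\n");
}
```

With this approach every row keeps the same number of literal pipes as the header, whatever the worker output contains.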
Add comprehensive documentation for the Agent Arena feature, covering
usage, configuration, best practices, troubleshooting, and limitations.
Update navigation metadata to include the new page.
This enables users to discover and learn about the multi-model comparison
capability for competitive task execution.
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>