Split the legacy core speech stack into two built-in, independently toggleable plugins: `_kokoro_tts` for TTS and `_whisper_stt` for STT.
This refactor keeps dependency installation and bootstrap concerns in Docker/bootstrap/preload, while moving speech-specific tooling, APIs, prompts, UI, and runtime behavior into the plugins. Core now exposes engine-agnostic `tts-service` and `stt-service` brokers, with browser-native TTS preserved as the fallback when Kokoro is disabled.
Included in this change:
- add built-in `_kokoro_tts` plugin with plugin-owned synth API, config, status UI, and provider registration
- add built-in `_whisper_stt` plugin with plugin-owned transcribe API, mic runtime, device UI, prompt injection, and provider registration
- remove legacy core speech APIs/helpers/settings/UI and delete unused `webui/js/speech_browser.js`
- replace the old hardcoded speech settings section with a generic voice surface backed by plugin extensions
- update preload/docs/tests to match the new plugin-owned speech architecture
Behavioral intent:
- both plugins are built-in but not `always_enabled`
- users can now hot-switch TTS and STT independently
- browser TTS remains available when `_kokoro_tts` is off
- Whisper mic UI only appears when `_whisper_stt` is enabled
Replace the plugin list activation dropdown and advanced shortcut with a one-click ON/OFF switch. Keep project/profile-specific activation inside the plugin config flow, remove the old advanced-only modal, update plugin docs, and add regression coverage for the binary list toggle contract.
Deep-merge model preset slots with the active configuration so custom context windows, rate limits, and nested kwargs survive preset switches.
Treat legacy utility preset defaults as implicit values, allow omitted utility and embedding slots to inherit configured models, and document the partial-preset behavior.
Rename high-impact skills to task-oriented names and move plugin-owned skills into their owning plugin folders.\n\nAlign renamed skill frontmatter with the official SKILL.md standard by keeping trigger language in name/description metadata, replacing the old create-skill wizard with build-skill, and updating browser, A0 connector, computer-use, CLI setup, and scheduler skill references.\n\nTighten the recurring cross-provider guidance gaps surfaced by the evidence sweeps: memory requests now avoid promptinclude-file routing, scheduler prompts distinguish cron schedules from planned ISO dates, document questions prefer document_query, skills_tool search/read_file usage is clearer, normal notifications set info/priority 10, and local/host text editors preserve patch intent.\n\nUpdate regression tests for the renamed skills, plugin ownership, prompt budget reality, and standard frontmatter shape.
Summarize the provider evidence sweeps into a scored Agent Zero tool-efficiency chart.\n\nDocument the scoring rubric, ranked provider/model results, recurring failure clusters, applied prompt/skill improvements, and follow-up candidates while keeping the raw EVIDENCE files untracked.
Add a screenshot-led guide for the guided onboarding wizard using Agent Zero API as the example setup. Link the guide from the docs hub, quickstart, and installation setup flow, and include optimized screenshots for choosing Cloud, selecting a provider, choosing models, and reaching the ready state.
Refresh README, quickstart, and the docs index around Browser, Desktop, A0 CLI, projects, memory, skills, profiles, and model presets.
Add optimized scoped screenshots for Web UI, Browser, Desktop, and CLI workflows.
Trim architecture-heavy developer pages toward DeepWiki handoffs so local docs stay focused on practical user guidance.
Store and surface host-browser preparation and CDP endpoint metadata from A0 CLI.
Let Browser runtime prepare candidate CLIs before the first action, and keep host-required errors more actionable.
Simplify Host Browser settings language and document the Chrome remote-debugging consent flow.
Integrate host-browser routing into the existing Browser tool. Store connector host-browser metadata, add pending browser op resolution, select connector runtimes from Browser settings, enforce host-content privacy policy, support automatic host preparation, and document the A0 CLI host-browser flow.
Run a best-effort uv cache clean when the durable self-update manager consumes an update request. Cover uv-present, uv-missing, and cleanup-failure paths with focused tests, and document the new self-update step.
Use /a0/tmp/playwright as the Browser plugin Chromium cache and Docker install target while preserving full Chromium installs.
Add startup migration cleanup for retired usr Playwright caches, update Browser status/runtime references and docs, and cover migration behavior with focused regressions.
Make the Office canvas mount passive so Xpra starts only when the Desktop surface is opened or an official Office document is created/opened.
Track Desktop host visibility to unload hidden frames, stop monitors, dedupe viewport resize work, and set Xpra offscreen mode according to HTTPS support. Add a near-future note for the tunnel memory footprint.
Show Office desktop startup progress
Display a loading message while the Agent Zero Desktop environment is starting or restarting, so the right-canvas Desktop button gives immediate feedback before Xpra finishes waking up.
Configure the Docker image initializer to raise the container soft nofile limit to a safer default when Docker's hard limit allows it.
Document the new-image behavior and optional A0_NOFILE_LIMIT override without requiring users to add Docker ulimit flags.
Create a generic OAuth Connections plugin with Codex/ChatGPT Account as the first provider, using OpenAI's device-code flow to persist Codex-compatible account tokens.
Expose a loopback OpenAI-compatible wrapper for models, responses, and chat completions, and point LiteLLM at the container-local Agent Zero origin.
Add a dummy API-key extension and focused tests so the account-backed provider appears configured without requiring a user-entered key.
docs: add Codex plan OAuth callout
Highlight that Agent Zero can use an existing OpenAI Codex plan through the new OAuth flow.
Add the account-backed LLM plans image and surface the section from the README navigation, while pointing toward future Gemini CLI and Claude Code integrations.
Handle Codex account SSE chat chunks
Teach the Codex/ChatGPT account bridge to extract text from OpenAI-style SSE chat completion deltas and fall back to a normal output_text response when upstream only streams chunks.
Strip user-supplied stream kwargs before LiteLLM calls so Agent Zero owns streaming mode and custom parameters cannot pass stream twice.
Add targeted tests for streamed delta extraction and reconstructed responses.
update README.md with LLM plans mention
Persist the active agent profile with each chat context and add a context-scoped endpoint for switching profiles without mutating global settings. Update the WebUI selector and docs to treat settings as the default for new chats, and expose the switch through the A0 connector plugin.
Introduce the new built-in Browser plugin for Agent Zero, replacing the legacy
browser-use-based browser agent with a direct Playwright-powered browser tool,
live WebUI viewer, browser session controls, status APIs, configuration, and
extension-management support.
Add browser-specific modal behavior so the browser can run as a floating,
resizable, no-backdrop window, including modal focus, toggle, and idempotent
open helpers for richer WebUI surfaces.
Remove the old `_browser_agent` core plugin and the `browser-use` dependency,
then clean up stale browser-model wiring and references across agent code,
model configuration docs, setup guides, troubleshooting docs, skills, and
Agent Zero knowledge.
Update regression and WebUI extension-surface coverage for the new browser
architecture and modal behavior.
The legacy browser-use implementation has been extracted from core so it can
continue separately as a community plugin published through the A0 Plugin Index for any user or professional that were relying on it for workflow.
The new guide explains:
- where profiles live
- what belongs in agent.yaml
- how prompt overrides work
- which root /prompts files are useful levers
- how profile-specific Main/Utility models are actually configured via _model_config/config.json
- why that config must be complete, not partial
Add LLM preset selection to project create/edit flows, backed by _model_config scoped project config. Support global, project, and combined preset APIs with explicit metadata while preserving plain YAML preset files. Copy selected preset chat/utility settings into project-scoped config, keep embedding settings from the effective config, and document/test the new project model config paths.
Add a builtin `a0-setup-cli` skill for guiding host-side A0 connector setup,
and restore the lightweight trigger-word based skill matching flow, which many users asked for.
- add builtin `skills/a0-setup-cli/` with installer-first host setup guidance,
container guardrails, fallback install paths, and example responses
- fix `helpers.skills_cli` so builtin skills under `/skills` are discoverable,
searchable, and validatable
- restore trigger-pattern scoring in runtime `search_skills()`
- re-enable `skills_tool:search` in the current tool flow
- add lightweight lexical relevant-skill recall for the current user message
without reintroducing memory/vector-db skill recall
- update skill prompts to steer the agent toward search/load when requests
match skill trigger phrases
- Fix two error returns in ws_dev_test.py using non-standard {"_error": True, ...} format, which _collect_results misidentifies as a success response and wraps as ok=True, sending a false-success to the client
- Switch to WsResult.error(code=..., message=...) standard API, consistent with all other handlers
- Fix 5 occurrences of process_event → process in documentation examples
- Remove non-existent HANDLER_ID and HANDLED_EVENTS class attribute examples from docs
- Fix validate_event_types (plural) → validate_event_type (singular) in docs
- Remove RELEASE_NOTES_DIR env var and docs/release_notes/ directory with v1.0 and v1.1 markdown files
- Add OPENROUTER_API_KEY and OPENROUTER_MODEL to workflow environment variables
- Add OPENROUTER_CHAT_COMPLETIONS_URL constant and OPENROUTER_SYSTEM_PROMPT_PATH pointing to scripts/openrouter_release_notes_system_prompt.md
- Add require_env, load_text, github_repository_parts, github_api_get helpers
- Add trigger_self_update.sh to executable permissions in Dockerfile
- Add trigger-update command mode to self_update_manager.py with argparse CLI
- Add queue_update_request helper to write trigger file with normalized parameters
- Add parse_selector_version, is_valid_selector_tag, is_supported_selector_tag helpers
- Add get_latest_same_major_tag to resolve "latest" within current major version line
- Add ensure
- Add LATEST_SELECTOR_TAG constant and is_latest_selector_tag helper to identify "latest" selection
- Add split_describe_version helper to parse git describe output into tag and commit count
- Replace fetch_release_refs with resolve_requested_target that handles both specific tags and "latest" resolution
- For main branch, resolve "latest" to newest reachable release tag
- For testing/development branches
- Update default backup path from /a0/tmp/self-update-backups to /root/update-backups in self_update_manager.py, helpers/self_update.py, and documentation
- Move aiogram from global requirements.txt to plugin-local requirements for _telegram_integration
- Add ensure_dependencies() helper that installs aiogram on-demand via uv pip install
- Add has_aiogram() check to avoid
- Add ready branch to Docker publish workflow alongside testing and main
- Replace tag search/autocomplete UI with standard select element preloaded with current major version tags
- Add get_selector_tag_options helper that filters tags to current major line and returns list of higher major versions available
- Show attention banner with Docker update guide link when newer major versions exist on selected
- Change version tag format from `v{epoch}.{major}.{minor}.{rest}` to `v{major}.{minor}`
- Raise minimum selector version from v0.9.9 to v1.0
- Update tag parsing to require exactly two version segments
- Add numeric sorting for version tags (e.g., v1.10 > v1.9)
- Update major version compatibility check to compare first number only
- Improve validation messages: "vX.Y.Z" → "vX.Y", add "v1.0 or newer" requirement
- Clear default