Teach the Browser content helper to ignore global/delegated framework event bindings so snapshots surface the actual actionable controls instead of broad wrapper elements. Add an accessible name to the Browser address bar for more reliable capture output.
Allow agents to use selector-based reference actions, coordinate click fallbacks, focused-field typing, and string key chords such as CTRL+A across the browser tool, container runtime, and host connector runtime. Cover the behavior with browser regression and host connector tests.
Move URL normalization into Agent Zero-owned Browser helper code and expose the content helper's required API contract from the shared asset. Normalize host-browser open/navigate payloads before they cross into the connector, including nested multi actions, and add regression coverage for helper payload delivery and URL edge cases.
Integrate host-browser routing into the existing Browser tool. Store connector host-browser metadata, add pending browser op resolution, select connector runtimes from Browser settings, enforce host-content privacy policy, support automatic host preparation, and document the A0 CLI host-browser flow.
Use /a0/tmp/playwright as the Browser plugin Chromium cache and Docker install target while preserving full Chromium installs.
Add startup migration cleanup for retired usr Playwright caches, update Browser status/runtime references and docs, and cover migration behavior with focused regressions.
Adds the explicit browser:screenshot action that writes JPEG/PNG files for vision_load, extends agent-callable Browser input actions, and documents the explicit vision workflow.
Adds the browser-forms on-demand skill and regression coverage for dispatch, runtime screenshot files, ref point resolution, upload path normalization, prompt discoverability, and label-wrapped form controls surfaced by the chat-driven E2E.
Detect cached Playwright contexts that have already closed before reusing the browser runtime.
Clear stale browser pages, popup waiters, screencasts, and interaction state; stop the old Playwright instance; and restart cleanly on next use. Add regression coverage for stale context recovery and unexpected context close events.
Bridge copy, cut, paste, and common edit shortcuts from the Browser modal and canvas screenshot surface into the Playwright runtime while preserving native clipboard behavior for Agent Zero UI fields.
Add websocket and runtime clipboard handling with regression coverage for frontend shortcut routing, paste fallback, and viewer input dispatch.
- Auto-register tabs opened by site (window.open, target=_blank,
ctrl-click) via context.on("page",...) with registry lock and
closing-state guard.
- Modifier-key click via Playwright trusted input: keyboard.down/up
around mouse.click for coord-based path; locator.click(modifiers=...)
selector fallback for off-screen / hidden elements. Chrome focus
rule: ctrl/meta-click keeps focus on origin tab; override via
focus_popup arg.
- key_chord action: presses keys in order, releases in reverse;
guarantees release on exception. Supports Ctrl+A/C/V style chords.
- mouse modifiers click-only (raises ValueError for non-click events).
- list(include_content=true) bulk read across all tabs in parallel
via asyncio.gather (was sequential).
- multi action: batched sub-calls. Different browser_id groups run
concurrently; same browser_id sequentially. Returns array of
{ok, result|error} matching input order. Lets the agent fan out
reads or coordinated mutations across tabs in one tool call.
- Cross-tab work no longer steals viewer focus.
last_interacted_browser_id promotes only on open / set_active /
same-tab work / Chrome popup rule. WebUI auto-open allowlist
tightened to open|navigate|set_active so background actions don't
drag the viewer.
- New set_active action for explicit focus switch.
- JS helper bumps VERSION to force re-injection on cached pages;
exports boundingBoxFor returning {x,y,w,h,selector} for the
trusted-input modifier-click paths.
Backwards-compatible: every new arg is optional with safe defaults.
No removed actions; existing call shapes preserved.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Keep browser sessions context-qualified so tabs from different chats can coexist without closing on context switches.
Create a real chat context when Browser launches from dashboard/no selected context, preserving agent handoff for that session.
Move chat context detail out of visible tab labels and into hover tooltips using only real chat names, with regression coverage for the updated lifecycle.
Restart the canvas screencast after page-changing commands and remount viewport metrics when starting or resizing streams so canvas scrolling stays smooth across first mount, new tabs, and navigation.
Move Browser JS off Alpine global store lookups and onto direct store imports, tighten modal/canvas handoff state, and keep annotations aligned with accepted viewport frames.
Improve Browser tab close ergonomics, allow Chromium native error pages to render without blocking the UI, include right-canvas tab polish, and expand regression coverage for these paths.
Decode browser frames before display and only render frames that match the active viewer viewport, avoiding stretched stale screencast images during startup and resize.
Keep rejecting mismatched CDP screencast frames on the backend, extend canvas viewport settling, and cover the behavior with browser regression tests.
Include small browser panel CSS polish.
Add Codex-inspired annotation UI to the built-in Browser surfaces, including the Annotate toggle, Cmd/Ctrl+. shortcut, selection overlay, inline comments, and batch Draft to chat / Send now actions.
Wire browser_viewer_annotation through the WebSocket and runtime layers, and expose safe DOM metadata extraction for clicked elements and selected areas without leaking password/value data.
Expand regression coverage for the Browser UI, annotation dispatch, runtime helper exposure, prompt formatting, and WebUI extension surface harness behavior.
Add Browser settings for the default starting page and tool-result autofocus, and wire them through config, APIs, runtime opens, and the settings UI.
Resolve Chrome extension __MSG_* manifest labels from locale metadata so installed extensions show readable names. Stabilize Browser viewport negotiation across canvas and modal surfaces by clearing stale frames, waiting for stable surface dimensions, and forcing sync after dock transitions. Move Browser loading/error state into a thin bottom status bar so it no longer overlays the page viewport.
Extend Browser into a reusable panel that can run in either the Universal Canvas or the floating modal. Add canvas registration, dock/undock behavior, and keep the existing modal path working as a fallback.
Stabilize tab switching with viewer tokens and stale-frame rejection, prevent command snapshots from crossing active tabs, and keep tab changes responsive.
Improve canvas navigation and scrolling by making screencast polling non-blocking and removing page-settle waits from wheel input, so the visible frame updates promptly without stretch/catch-up artifacts.
Polish Browser busy feedback with a spinner-only status affordance to avoid misleading “updating browser” copy.
Replace the Browser viewer’s screenshot polling with CDP screencast streaming for much smoother navigation. The runtime now starts/stops CDP screencasts cleanly, acknowledges frames, drops stale frames, and keeps the WebSocket payload compatible with the existing viewer.
Also fixes modal viewport sizing by sending the initial stage dimensions on subscribe, applying CDP emulation sizing before the first frame, avoiding image stretching, and increasing screencast JPEG quality to 92. Regression coverage was added for the screencast path, frame ack/drop behavior, viewport sizing, and UI rendering assumptions.
-- Still needs thorough performance audit and optimization --
- Download Chrome Web Store extensions using the detected Chrome prodversion instead of a stale hardcoded version
- Update extension settings copy to reflect Chrome Web Store URL support
- Serialize Browser persistent-context startup and clean stale Chromium profile singleton locks
- Increase Browser viewer subscribe timeout for extension-enabled cold starts
- Add regressions for Web Store download URL handling, slow viewer startup, and stale profile lock cleanup
Introduce the new built-in Browser plugin for Agent Zero, replacing the legacy
browser-use-based browser agent with a direct Playwright-powered browser tool,
live WebUI viewer, browser session controls, status APIs, configuration, and
extension-management support.
Add browser-specific modal behavior so the browser can run as a floating,
resizable, no-backdrop window, including modal focus, toggle, and idempotent
open helpers for richer WebUI surfaces.
Remove the old `_browser_agent` core plugin and the `browser-use` dependency,
then clean up stale browser-model wiring and references across agent code,
model configuration docs, setup guides, troubleshooting docs, skills, and
Agent Zero knowledge.
Update regression and WebUI extension-surface coverage for the new browser
architecture and modal behavior.
The legacy browser-use implementation has been extracted from core so it can
continue separately as a community plugin published through the A0 Plugin Index for any user or professional that were relying on it for workflow.