Commit graph

3 commits

Author SHA1 Message Date
Alessandro
12b96ae41e Harden browser multi-tab focus handling 2026-05-02 15:49:05 +02:00
TerminallyLazy
5012dd3128 feat(browser): multi-tab awareness + modifier-key click
- Auto-register tabs opened by site (window.open, target=_blank,
  ctrl-click) via context.on("page",...) with registry lock and
  closing-state guard.
- Modifier-key click via Playwright trusted input: keyboard.down/up
  around mouse.click for coord-based path; locator.click(modifiers=...)
  selector fallback for off-screen / hidden elements. Chrome focus
  rule: ctrl/meta-click keeps focus on origin tab; override via
  focus_popup arg.
- key_chord action: presses keys in order, releases in reverse;
  guarantees release on exception. Supports Ctrl+A/C/V style chords.
- mouse modifiers click-only (raises ValueError for non-click events).
- list(include_content=true) bulk read across all tabs in parallel
  via asyncio.gather (was sequential).
- multi action: batched sub-calls. Different browser_id groups run
  concurrently; same browser_id sequentially. Returns array of
  {ok, result|error} matching input order. Lets the agent fan out
  reads or coordinated mutations across tabs in one tool call.
- Cross-tab work no longer steals viewer focus.
  last_interacted_browser_id promotes only on open / set_active /
  same-tab work / Chrome popup rule. WebUI auto-open allowlist
  tightened to open|navigate|set_active so background actions don't
  drag the viewer.
- New set_active action for explicit focus switch.
- JS helper bumps VERSION to force re-injection on cached pages;
  exports boundingBoxFor returning {x,y,w,h,selector} for the
  trusted-input modifier-click paths.

Backwards-compatible: every new arg is optional with safe defaults.
No removed actions; existing call shapes preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 06:37:21 -04:00
Alessandro
983d431a5e browser: replace browser-use agent with native browser
Introduce the new built-in Browser plugin for Agent Zero, replacing the legacy
browser-use-based browser agent with a direct Playwright-powered browser tool,
live WebUI viewer, browser session controls, status APIs, configuration, and
extension-management support.

Add browser-specific modal behavior so the browser can run as a floating,
resizable, no-backdrop window, including modal focus, toggle, and idempotent
open helpers for richer WebUI surfaces.

Remove the old `_browser_agent` core plugin and the `browser-use` dependency,
then clean up stale browser-model wiring and references across agent code,
model configuration docs, setup guides, troubleshooting docs, skills, and
Agent Zero knowledge.

Update regression and WebUI extension-surface coverage for the new browser
architecture and modal behavior.

The legacy browser-use implementation has been extracted from core so it can
continue separately as a community plugin published through the A0 Plugin Index for any user or professional that were relying on it for workflow.
2026-04-24 15:43:52 +02:00