Commit graph

26 commits

Author SHA1 Message Date
Alessandro
c553e91c03 Add Browser extension UI open action
Detect openable Chrome extension UI pages from manifests and expose resolved chrome-extension URLs to the Browser UI.

Render an Open button in the compact Browser extension dropdown and cover manifest UI metadata with regression tests.
2026-05-02 20:02:28 +02:00
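A minimal sketch of the manifest detection described in c553e91c03, assuming the usual Chrome manifest keys for UI pages. The function names and the priority order of the keys are hypothetical and may not match the commit's actual helper.

```python
from typing import Optional


def _section(manifest: dict, key: str) -> dict:
    """Return a nested manifest section, or an empty dict if it is missing or malformed."""
    value = manifest.get(key)
    return value if isinstance(value, dict) else {}


def find_ui_page(manifest: dict) -> Optional[str]:
    """Pick the first openable UI page declared in the manifest.

    The priority order below is an assumption made for illustration.
    """
    candidates = [
        manifest.get("options_page"),
        _section(manifest, "options_ui").get("page"),
        _section(manifest, "action").get("default_popup"),
        _section(manifest, "browser_action").get("default_popup"),
        _section(manifest, "side_panel").get("default_path"),
    ]
    for page in candidates:
        if isinstance(page, str) and page.strip():
            # Manifest paths are relative to the extension root.
            return page.strip().lstrip("/")
    return None


def resolve_ui_url(extension_id: str, manifest: dict) -> Optional[str]:
    """Build a chrome-extension:// URL for the UI page, or None if there is none."""
    page = find_ui_page(manifest)
    if page is None:
        return None
    return f"chrome-extension://{extension_id}/{page}"
```

An extension with no UI keys resolves to `None`, which is the case where an Open button would presumably be hidden.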
Alessandro
ad7925b543 Enable clipboard shortcuts in Browser visual mode
Bridge copy, cut, paste, and common edit shortcuts from the Browser modal and canvas screenshot surface into the Playwright runtime while preserving native clipboard behavior for Agent Zero UI fields.

Add websocket and runtime clipboard handling with regression coverage for frontend shortcut routing, paste fallback, and viewer input dispatch.
2026-05-02 17:21:44 +02:00
Alessandro
9fc3ff20a4 Add browser extension uninstall controls
Expose extension deletion from the Browser internal settings page and keep the compact Browser dropdown focused on quick enable/install actions.

Add a guarded uninstall API that only deletes Browser-managed extension folders, updates enabled extension paths, refreshes the settings UI, and covers managed versus external paths with regression tests.
2026-05-02 17:05:33 +02:00
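The guard described in 9fc3ff20a4 could look roughly like the sketch below. It assumes a single managed extensions root; `uninstall_extension` and its parameters are hypothetical names, not the plugin's real API.

```python
import shutil
from pathlib import Path
from typing import List


def is_managed_extension(path: Path, managed_root: Path) -> bool:
    """True only for paths strictly inside the managed root (never the root itself)."""
    resolved = path.resolve()
    root = managed_root.resolve()
    return resolved != root and root in resolved.parents


def uninstall_extension(path, managed_root, enabled_paths: List[str]) -> List[str]:
    """Delete a Browser-managed extension folder and return the updated enabled list.

    External (unmanaged) paths are refused and left untouched.
    """
    target = Path(path)
    if not is_managed_extension(target, Path(managed_root)):
        raise PermissionError(f"refusing to delete unmanaged path: {target}")
    resolved = target.resolve()  # resolve before deleting so comparisons stay stable
    if target.is_dir():
        shutil.rmtree(target)
    return [p for p in enabled_paths if Path(p).resolve() != resolved]
```

Resolving both paths before comparing them means `..` segments and symlinks cannot be used to escape the managed root.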
Alessandro
39a96012f9 Make browser annotation tray draggable
Fix annotation panel stacking so draft popovers render above the annotations recap.

Allow the annotations recap tray to float within the browser stage by dragging its header, with bounded positioning and cleanup when annotations are cleared or the browser surface unmounts.
2026-05-02 16:55:50 +02:00
Alessandro
c2fb2c3c94 Add browser screenshot previews to tool messages
Render Browser tool Screenshot KVPs as clickable live thumbnails that open the Browser canvas while preserving the existing lower-row Browser action.

Add a lightweight websocket snapshot endpoint for existing browser runtimes and keep preview frame memory bounded with revocable object URLs.
2026-05-02 16:48:38 +02:00
Alessandro
90ae70eb6e Merge ready into browser multi-tab PR
2026-05-02 15:56:16 +02:00
Alessandro
12b96ae41e Harden browser multi-tab focus handling
2026-05-02 15:49:05 +02:00
Alessandro
ce7ec3cb4c fix(canvas): keep browser and office surfaces opt-in
Make Markdown the first-class document workflow in the office skills and state the Desktop/LibreOffice path as opt-in for GUI or binary Office work.

Remove passive Browser canvas auto-opening from tool results; Browser result handling now only syncs an already-open Browser canvas, while explicit user buttons can still open the canvas or modal. Add regression coverage for the no-auto-open policy and Markdown-first skill guidance.
2026-05-02 14:08:35 +02:00
Alessandro
eb5220b058 fix(browser): read content inside shadow DOM
Teach the browser page-content helper to traverse open shadow roots and assigned slot nodes when collecting text, rendering list/inline children, and resolving selectors. This lets Agent Zero inspect modern component-heavy pages more accurately without depending only on light-DOM textContent.

Bump the injected helper version so existing browser contexts can refresh to the new DOM traversal behavior.
2026-05-02 13:07:10 +02:00
TerminallyLazy
5012dd3128 feat(browser): multi-tab awareness + modifier-key click
- Auto-register tabs opened by site (window.open, target=_blank,
  ctrl-click) via context.on("page",...) with registry lock and
  closing-state guard.
- Modifier-key click via Playwright trusted input: keyboard.down/up
  around mouse.click for coord-based path; locator.click(modifiers=...)
  selector fallback for off-screen / hidden elements. Chrome focus
  rule: ctrl/meta-click keeps focus on origin tab; override via
  focus_popup arg.
- key_chord action: presses keys in order, releases in reverse;
  guarantees release on exception. Supports Ctrl+A/C/V style chords.
- mouse modifiers click-only (raises ValueError for non-click events).
- list(include_content=true) bulk read across all tabs in parallel
  via asyncio.gather (was sequential).
- multi action: batched sub-calls. Different browser_id groups run
  concurrently; same browser_id sequentially. Returns array of
  {ok, result|error} matching input order. Lets the agent fan out
  reads or coordinated mutations across tabs in one tool call.
- Cross-tab work no longer steals viewer focus.
  last_interacted_browser_id promotes only on open / set_active /
  same-tab work / Chrome popup rule. WebUI auto-open allowlist
  tightened to open|navigate|set_active so background actions don't
  drag the viewer.
- New set_active action for explicit focus switch.
- JS helper bumps VERSION to force re-injection on cached pages;
  exports boundingBoxFor returning {x,y,w,h,selector} for the
  trusted-input modifier-click paths.

Backwards-compatible: every new arg is optional with safe defaults.
No removed actions; existing call shapes preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 06:37:21 -04:00
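The scheduling rule for the `multi` action in 5012dd3128 (different `browser_id` groups concurrent, same `browser_id` sequential, results in input order) can be sketched with plain asyncio. This is an illustration under assumptions: `run_multi` and the `execute` callback are hypothetical stand-ins for the real tool dispatch.

```python
import asyncio
from collections import defaultdict
from typing import Any, Awaitable, Callable, Dict, List


async def run_multi(
    calls: List[Dict[str, Any]],
    execute: Callable[[Dict[str, Any]], Awaitable[Any]],
) -> List[Dict[str, Any]]:
    """Run batched sub-calls, returning {ok, result|error} entries in input order."""
    results: List[Any] = [None] * len(calls)

    # Group call indices by browser_id, preserving the original order per group.
    groups: Dict[Any, List[int]] = defaultdict(list)
    for index, call in enumerate(calls):
        groups[call.get("browser_id")].append(index)

    async def run_group(indices: List[int]) -> None:
        # Calls that target the same browser run one after another.
        for index in indices:
            try:
                value = await execute(calls[index])
                results[index] = {"ok": True, "result": value}
            except Exception as exc:  # one failure must not abort the batch
                results[index] = {"ok": False, "error": str(exc)}

    # Different browser_id groups run concurrently.
    await asyncio.gather(*(run_group(ix) for ix in groups.values()))
    return results
```

Writing each result into a preallocated slot keeps the output aligned with the input even though groups finish in arbitrary order.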
Alessandro
ff828e294e Add plugin thumbnails
2026-04-28 15:04:19 +02:00
Alessandro
b7ba8eff7f Improve browser session context lifecycle
Keep browser sessions context-qualified so tabs from different chats can coexist without closing on context switches.

Create a real chat context when Browser launches from dashboard/no selected context, preserving agent handoff for that session.

Move chat context detail out of visible tab labels and into hover tooltips using only real chat names, with regression coverage for the updated lifecycle.
2026-04-28 14:40:48 +02:00
Alessandro
9ec070793d Stabilize browser canvas screencast lifecycle
Restart the canvas screencast after page-changing commands and remount viewport metrics when starting or resizing streams so canvas scrolling stays smooth across first mount, new tabs, and navigation.

Move Browser JS off Alpine global store lookups and onto direct store imports, tighten modal/canvas handoff state, and keep annotations aligned with accepted viewport frames.

Improve Browser tab close ergonomics, allow Chromium native error pages to render without blocking the UI, include right-canvas tab polish, and expand regression coverage for these paths.
2026-04-28 07:02:40 +02:00
Alessandro
decb05a682 Stabilize browser viewer viewport rendering
Decode browser frames before display and only render frames that match the active viewer viewport, avoiding stretched stale screencast images during startup and resize.

Keep rejecting mismatched CDP screencast frames on the backend, extend canvas viewport settling, and cover the behavior with browser regression tests.

Include small browser panel CSS polish.
2026-04-28 04:29:33 +02:00
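The frame filtering in decb05a682 amounts to comparing each decoded frame's size with the active viewport before drawing it. A minimal sketch, assuming integer pixel sizes; the `Size` type and the one-pixel tolerance are hypothetical choices, not values taken from the commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Size:
    width: int
    height: int


def should_render(frame: Size, viewport: Size, tolerance_px: int = 1) -> bool:
    """Accept a frame only if it matches the active viewport within a small tolerance.

    Mismatched frames are typically stale output from before a resize and
    would be stretched if drawn into the current surface.
    """
    return (
        abs(frame.width - viewport.width) <= tolerance_px
        and abs(frame.height - viewport.height) <= tolerance_px
    )
```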
Alessandro
ad76578c47 Create chat context for browser launches
Allow the Browser surface to create and select a chat context when opened without an active context.

Reuse an in-flight context creation promise so repeated startup paths do not race, and update commands/viewer connection to ensure a context before calling browser websocket APIs.

Add a browser regression guard for the no-context startup path.
2026-04-27 17:44:57 +02:00
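Reusing an in-flight creation, as ad76578c47 describes for the frontend promise, is a single-flight pattern. The sketch below shows the same idea in Python with an asyncio task; the real code is WebUI JavaScript, and `ContextEnsurer` and `create` are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class ContextEnsurer:
    """Ensure a chat context exists, creating it at most once at a time."""

    def __init__(self, create: Callable[[], Awaitable[str]]) -> None:
        self._create = create
        self._context_id: Optional[str] = None
        self._inflight: Optional[asyncio.Task] = None

    async def ensure(self) -> str:
        if self._context_id is not None:
            return self._context_id
        if self._inflight is None:
            # First caller starts the creation; later callers share the same task.
            self._inflight = asyncio.create_task(self._run())
        return await self._inflight

    async def _run(self) -> str:
        try:
            self._context_id = await self._create()
            return self._context_id
        finally:
            # Clear the slot so a failed creation can be retried later.
            self._inflight = None
```

Concurrent startup paths that all call `ensure()` end up awaiting one creation instead of racing to create several contexts.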
Alessandro
e412f5faf7 Fix browser canvas startup viewport settle
Wait for the right-canvas browser surface to finish its opening transition before using its dimensions as the Playwright viewport.

Measure raw stage dimensions for stability, then apply the existing clamped viewport values so initial screencasts do not render into a stretched canvas.

Add a browser regression guard for the raw viewport settle path.
2026-04-27 17:35:50 +02:00
Alessandro
10e8f5d01a canvas + browser CSS polish
2026-04-27 00:16:58 +02:00
Alessandro
4ff3244ce6 Add browser annotate mode
Add Codex-inspired annotation UI to the built-in Browser surfaces, including the Annotate toggle, Cmd/Ctrl+. shortcut, selection overlay, inline comments, and batch Draft to chat / Send now actions.

Wire browser_viewer_annotation through the WebSocket and runtime layers, and expose safe DOM metadata extraction for clicked elements and selected areas without leaking password/value data.

Expand regression coverage for the Browser UI, annotation dispatch, runtime helper exposure, prompt formatting, and WebUI extension surface harness behavior.
2026-04-26 23:57:48 +02:00
Alessandro
c32e328287 Polish Browser settings and viewport handling
Add Browser settings for the default starting page and tool-result autofocus, and wire them through config, APIs, runtime opens, and the settings UI.

Resolve Chrome extension __MSG_* manifest labels from locale metadata so installed extensions show readable names. Stabilize Browser viewport negotiation across canvas and modal surfaces by clearing stale frames, waiting for stable surface dimensions, and forcing sync after dock transitions. Move Browser loading/error state into a thin bottom status bar so it no longer overlays the page viewport.
2026-04-26 21:47:50 +02:00
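Resolving `__MSG_*` labels, as c32e328287 does for extension names, can be sketched as a lookup into the parsed `_locales/<default_locale>/messages.json` data. The helper name and the fallback behavior are assumptions; only the placeholder syntax and the `message` field come from the Chrome extension i18n format.

```python
import re
from typing import Mapping

_MSG_PATTERN = re.compile(r"^__MSG_([A-Za-z0-9_@]+)__$")


def resolve_label(value: str, messages: Mapping[str, Mapping[str, str]]) -> str:
    """Replace a __MSG_name__ placeholder with its localized message.

    `messages` is the parsed content of a messages.json file. Values that are
    not placeholders, or whose key is missing, are returned unchanged.
    """
    match = _MSG_PATTERN.match(value)
    if not match:
        return value
    wanted = match.group(1).lower()  # message names are matched case-insensitively
    for name, entry in messages.items():
        if name.lower() == wanted:
            message = entry.get("message")
            if message:
                return message
    return value
```

Falling back to the raw value keeps the UI usable when an extension ships an incomplete locale file.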
Alessandro
f1b014feb3 Automatic canvas handoffs
- Auto-open Office and Browser canvas surfaces from fresh tool results, including history/result messages.
- Preserve Browser target IDs when focusing a canvas session from tool output.
- Convert substantial response-style artifacts into Office documents at runtime, without relying only on prompt compliance.
- Attach Office artifact metadata to the completed response log so the canvas opens without leaving a dangling Processing group.
- Polish Office UX by removing the inactive version-history action, showing only the healthy dot, and improving Collabora blank-load recovery with browser state cleanup.
- Deduplicate auto-open events and ignore stale results.
2026-04-26 19:32:50 +02:00
Alessandro
370ac9b878 Make Browser dockable and stabilize canvas interaction
Extend Browser into a reusable panel that can run in either the Universal Canvas or the floating modal. Add canvas registration, dock/undock behavior, and keep the existing modal path working as a fallback.

Stabilize tab switching with viewer tokens and stale-frame rejection, prevent command snapshots from crossing active tabs, and keep tab changes responsive.

Improve canvas navigation and scrolling by making screencast polling non-blocking and removing page-settle waits from wheel input, so the visible frame updates promptly without stretch/catch-up artifacts.

Polish Browser busy feedback with a spinner-only status affordance to avoid misleading “updating browser” copy.
2026-04-26 17:09:21 +02:00
Alessandro
dccf017d2c Redesign Browser viewer screencast transport and viewport fit
Replace the Browser viewer’s screenshot polling with CDP screencast streaming for much smoother navigation. The runtime now starts/stops CDP screencasts cleanly, acknowledges frames, drops stale frames, and keeps the WebSocket payload compatible with the existing viewer.

Also fixes modal viewport sizing by sending the initial stage dimensions on subscribe, applying CDP emulation sizing before the first frame, avoiding image stretching, and increasing screencast JPEG quality to 92. Regression coverage was added for the screencast path, frame ack/drop behavior, viewport sizing, and UI rendering assumptions.

-- Still needs thorough performance audit and optimization --
2026-04-26 02:28:59 +02:00
Alessandro
cf67047ad3 Polish Browser chrome and extension management UX
Refine the Browser modal UI with more native-feeling tabs, consistent chrome controls, right-side tab close buttons, and a cleaner extension dropdown. Move the Browser LLM preset into the dropdown with the active Main Model summary, simplify extension settings, remove the global extension enable switch and legacy extension root behavior, and add per-extension enable toggles.

Also updates the Chrome extension install/review flow with contextual warning copy, “Scan with A0”, cleaner labels, hidden empty extension state, and regression coverage for the new Browser UX.
2026-04-26 00:09:16 +02:00
Alessandro
fa7eef1919 Use persistent full Chromium runtime for Browser
- Always launch Browser with full Playwright Chromium instead of switching between headless shell and extension mode
- Cache Chromium under /a0/usr/plugins/_browser/playwright with legacy lookup for existing installs
- Store installed Browser extensions under /a0/usr/plugins/_browser/extensions with legacy extension-root compatibility
- Show clearer first-run Chromium install messaging and extend the initial Browser timeout
- Fix Browser spinner animation for startup and extension install states
- Update Docker Playwright install script and regression coverage
2026-04-24 19:08:01 +02:00
Alessandro
fb98c2f89a Fix Chrome extension install and Browser startup with extensions
- Download Chrome Web Store extensions using the detected Chrome prodversion instead of a stale hardcoded version
- Update extension settings copy to reflect Chrome Web Store URL support
- Serialize Browser persistent-context startup and clean stale Chromium profile singleton locks
- Increase Browser viewer subscribe timeout for extension-enabled cold starts
- Add regressions for Web Store download URL handling, slow viewer startup, and stale profile lock cleanup
2026-04-24 18:12:18 +02:00
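The stale lock cleanup in fb98c2f89a can be illustrated as follows, assuming the usual Chromium singleton file names in the profile directory. How the runtime decides that a lock is stale is not stated in the commit, so the `browser_running` flag here is a hypothetical stand-in for that check.

```python
import os
from pathlib import Path
from typing import List

# File names Chromium uses to mark a profile as in use (assumed set).
SINGLETON_FILES = ("SingletonLock", "SingletonCookie", "SingletonSocket")


def clean_stale_singleton_locks(profile_dir, browser_running: bool) -> List[str]:
    """Remove leftover singleton files when no browser owns the profile.

    Returns the names that were removed. Nothing is touched while a browser
    is running, because the locks are legitimate in that case.
    """
    if browser_running:
        return []
    removed = []
    for name in SINGLETON_FILES:
        path = Path(profile_dir) / name
        # lexists also catches dangling symlinks, which exists() would miss.
        if os.path.lexists(path):
            path.unlink()
            removed.append(name)
    return removed
```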
Alessandro
983d431a5e browser: replace browser-use agent with native browser
Introduce the new built-in Browser plugin for Agent Zero, replacing the legacy
browser-use-based browser agent with a direct Playwright-powered browser tool,
live WebUI viewer, browser session controls, status APIs, configuration, and
extension-management support.

Add browser-specific modal behavior so the browser can run as a floating,
resizable, no-backdrop window, including modal focus, toggle, and idempotent
open helpers for richer WebUI surfaces.

Remove the old `_browser_agent` core plugin and the `browser-use` dependency,
then clean up stale browser-model wiring and references across agent code,
model configuration docs, setup guides, troubleshooting docs, skills, and
Agent Zero knowledge.

Update regression and WebUI extension-surface coverage for the new browser
architecture and modal behavior.

The legacy browser-use implementation has been extracted from core so it can
continue separately as a community plugin published through the A0 Plugin Index
for any users or professionals who were relying on it in their workflows.
2026-04-24 15:43:52 +02:00