agent-zero

mirror of https://github.com/agent0ai/agent-zero.git synced 2026-05-17 04:01:13 +00:00

Author	SHA1	Message	Date
Alessandro	70adbe91a0	Polish Editor and Browser surface cleanup Remove obsolete Office markdown editor UI and handoff code now that Markdown lives in the dedicated Editor surface. Harden the Editor modal so it opens directly into a Markdown draft and rebinds Ace to the visible root when switching surfaces. Make Browser address Enter navigation explicit and update the canvas setup expectations for the slimmer Office shell.	2026-05-15 12:38:29 +02:00
Alessandro	4bab8da3f5	Keep host browser requests on Browser runtime Route host/local browser requests through the Browser tool instead of desktop or shell fallbacks. Add remote-debugging setup guidance to Browser runtime errors and document the exact Chrome inspect setting in prompts, skills, and Web UI copy.	2026-05-12 15:45:29 +02:00
Alessandro	7b1c84aeca	Improve browser tool ergonomics for agent UI control Teach the Browser content helper to ignore global/delegated framework event bindings so snapshots surface the actual actionable controls instead of broad wrapper elements. Add an accessible name to the Browser address bar for more reliable capture output. Allow agents to use selector-based reference actions, coordinate click fallbacks, focused-field typing, and string key chords such as CTRL+A across the browser tool, container runtime, and host connector runtime. Cover the behavior with browser regression and host connector tests.	2026-05-12 09:41:13 +02:00
Alessandro	6d29268cbd	refactor: align skills and tool guidance Rename high-impact skills to task-oriented names and move plugin-owned skills into their owning plugin folders. Align renamed skill frontmatter with the official SKILL.md standard by keeping trigger language in name/description metadata, replacing the old create-skill wizard with build-skill, and updating browser, A0 connector, computer-use, CLI setup, and scheduler skill references. Tighten the recurring cross-provider guidance gaps surfaced by the evidence sweeps: memory requests now avoid promptinclude-file routing, scheduler prompts distinguish cron schedules from planned ISO dates, document questions prefer document_query, skills_tool search/read_file usage is clearer, normal notifications set info/priority 10, and local/host text editors preserve patch intent. Update regression tests for the renamed skills, plugin ownership, prompt budget reality, and standard frontmatter shape.	2026-05-10 07:13:14 +02:00
Alessandro	daf95ec3ab	Normalize tool contracts and slim prompt surface Standardize multi-action tools around tool_args.action while keeping parser compatibility for older tool/args, tool_name:action, and method-shaped requests. This keeps new prompts clean without breaking agents that learned the previous dialect. Move A0 connector remote execution/file tools into stable standard prompts, make remote targeting independent of the active chat context, and skill-gate beta computer-use remote so it no longer weighs down the always-on tool list. Align text editor, scheduler, skills, office artifact, memory, notify, and browser prompts/tools around the canonical action contract. Add scheduler update/timezone handling, skills_tool read_file, text editor patch coverage, and fixes for memory_forget, behaviour_adjustment, and code execution progress warnings. Reduce default prompt pressure by compacting browser and scheduler prompts into skill-backed manifests, shortening skill catalog descriptions, and pruning noisy framework knowledge. Remove obsolete connector prompt stubs and root tool-call knowledge examples. Tests: conda run -n a0 pytest tests/test_a0_connector_prompt_gating.py tests/test_tool_action_contracts.py tests/test_task_scheduler_timezone.py tests/test_text_editor_context_patch.py tests/test_tool_request_normalization.py tests/test_office_document_store.py::test_odf_is_advertised_and_docx_remains_explicit_compatibility tests/test_office_document_store.py::test_document_artifact_accepts_method_alias_for_ods_create tests/test_skills_runtime.py tests/test_default_prompt_budget.py::test_a0_small_profile_removed_and_prompt_text_generic -q	2026-05-09 21:54:43 +02:00
Alessandro	09d9ed2e80	Bound browser tab usage during research	2026-05-09 17:36:15 +02:00
Alessandro	0a8aaee9ac	Add host browser profile mode setting Default Bring Your Own Browser mode to the existing browser profile while exposing a clean Agent profile option in Browser settings with a clear warning for existing-profile access. Forward the selected profile mode through the connector browser runtime, tolerate legacy config modules and old saved configs, and update regression coverage for the new payload shape.	2026-05-09 16:25:27 +02:00
Alessandro	44d5e1ccf7	Persist browser history screenshots Save a static JPEG for each Browser tool call in the chat history folder and render that immutable image in transcript screenshot previews. Keep live Browser surface attachment available through stored browser/context metadata, and ignore generated Playwright CLI artifacts.	2026-05-08 19:24:44 +02:00
Alessandro	bb2432693e	Fix canvas attachment for browser and documents Attach the Browser canvas to active Docker sessions by returning an initial snapshot on subscribe and preserving valid frames through state-only updates. Route Markdown document opens through the right-canvas Desktop editor instead of the legacy office modal. Skip automatic office document response affordances for subordinate agents so delegated reviews keep their actual content.	2026-05-08 19:08:53 +02:00
Alessandro	229de5166b	Expose Browser runtime selection to CLI Add a protected connector endpoint for reading and updating the Browser plugin runtime backend so the A0 CLI can switch between Docker browser and Bring Your Own Browser mode. Keep legacy host_when_available values normalized to host_required, move the host/container setting to the top of Browser settings, and cover the config normalization path.	2026-05-08 18:37:46 +02:00
Alessandro	001c7e2ccb	Simplify Host Browser config Remove the ambiguous Use host when ready option from the Browser plugin settings and present the host-required path as Bring Your Own Browser. Add concise Chrome/Chromium remote-debugging guidance, normalize legacy host_when_available values to the BYOB setting, and make missing host-browser connector setup a repairable error with regression coverage.	2026-05-08 18:18:03 +02:00
Alessandro	aa7944b95a	Centralize Browser helper contracts Move URL normalization into Agent Zero-owned Browser helper code and expose the content helper's required API contract from the shared asset. Normalize host-browser open/navigate payloads before they cross into the connector, including nested multi actions, and add regression coverage for helper payload delivery and URL edge cases.	2026-05-08 16:39:04 +02:00
Alessandro	c020f1af28	Send browser helper source to host connector Make the Browser plugin the source of truth for browser-page-content.js by attaching its source and sha256 to host-browser operations when the CLI has no matching helper hash. Store the helper hash in connector metadata and cover the routing/ensure path in tests.	2026-05-08 15:23:44 +02:00
Alessandro	d47207dfd7	Refine host browser routing and settings copy Store and surface host-browser preparation and CDP endpoint metadata from A0 CLI. Let Browser runtime prepare candidate CLIs before the first action, and keep host-required errors more actionable. Simplify Host Browser settings language and document the Chrome remote-debugging consent flow.	2026-05-08 06:37:32 +02:00
Alessandro	4b3e2eb327	Route Browser through A0 host connector Integrate host-browser routing into the existing Browser tool. Store connector host-browser metadata, add pending browser op resolution, select connector runtimes from Browser settings, enforce host-content privacy policy, support automatic host preparation, and document the A0 CLI host-browser flow.	2026-05-08 04:22:18 +02:00
Alessandro	06a83030c0	Remove browser chat action button Delete the Browser plugin's injected bottom-action button so it no longer appears under the chat input while preserving the rest of the Browser surface entry points. Update the browser regression coverage to assert the chat action stays absent.	2026-05-08 00:36:36 +02:00
Alessandro	8b921a8ded	Move Browser Playwright cache to tmp Use /a0/tmp/playwright as the Browser plugin Chromium cache and Docker install target while preserving full Chromium installs. Add startup migration cleanup for retired usr Playwright caches, update Browser status/runtime references and docs, and cover migration behavior with focused regressions.	2026-05-07 18:43:24 +02:00
Alessandro	022b6f031f	Split live surfaces out of modals Introduce the shared surfaces frontend service and stylesheet so Browser and Desktop can register docked or floating live UI without special cases in modals.js. Update Browser and right-canvas integration to preserve active viewers across canvas/modal switches and avoid creating blank tabs unless explicitly requested.	2026-05-07 00:14:31 +02:00
Alessandro	d3c249cbdd	Add Browser v1 explicit screenshot and form actions Adds the explicit browser:screenshot action that writes JPEG/PNG files for vision_load, extends agent-callable Browser input actions, and documents the explicit vision workflow. Adds the browser-forms on-demand skill and regression coverage for dispatch, runtime screenshot files, ref point resolution, upload path normalization, prompt discoverability, and label-wrapped form controls surfaced by the chat-driven E2E.	2026-05-05 15:54:13 +02:00
Alessandro	f9175ed00b	Stabilize Browser modal switching Keep Browser modal activation passive when switching from Desktop by reusing existing Browser sessions instead of creating a blank tab on viewer subscribe. Add a Focus mode control to the Browser modal header matching Desktop's fullscreen/restore behavior. Cover the passive subscribe path and Browser modal focus button in regression tests.	2026-05-05 14:34:42 +02:00
Alessandro	d37500967f	Restart browser runtime after stale context Detect cached Playwright contexts that have already closed before reusing the browser runtime. Clear stale browser pages, popup waiters, screencasts, and interaction state; stop the old Playwright instance; and restart cleanly on next use. Add regression coverage for stale context recovery and unexpected context close events.	2026-05-03 01:55:24 +02:00
Alessandro	677a0c1e64	Bridge desktop URLs into Browser Register the Agent Zero Browser as the Desktop URL handler, queue URL intents from the Xfce environment, and route them into Browser on the opposite canvas/modal surface. Also make floating Browser and Desktop modals pass outside clicks through while preserving interaction inside the modal window.	2026-05-03 00:57:56 +02:00
Alessandro	c553e91c03	Add Browser extension UI open action Detect openable Chrome extension UI pages from manifests and expose resolved chrome-extension URLs to the Browser UI. Render an Open button in the compact Browser extension dropdown and cover manifest UI metadata with regression tests.	2026-05-02 20:02:28 +02:00
Alessandro	ad7925b543	Enable clipboard shortcuts in Browser visual mode Bridge copy, cut, paste, and common edit shortcuts from the Browser modal and canvas screenshot surface into the Playwright runtime while preserving native clipboard behavior for Agent Zero UI fields. Add websocket and runtime clipboard handling with regression coverage for frontend shortcut routing, paste fallback, and viewer input dispatch.	2026-05-02 17:21:44 +02:00
Alessandro	9fc3ff20a4	Add browser extension uninstall controls Expose extension deletion from the Browser internal settings page and keep the compact Browser dropdown focused on quick enable/install actions. Add a guarded uninstall API that only deletes Browser-managed extension folders, updates enabled extension paths, refreshes the settings UI, and covers managed versus external paths with regression tests.	2026-05-02 17:05:33 +02:00
Alessandro	39a96012f9	Make browser annotation tray draggable Fix annotation panel stacking so draft popovers render above the annotations recap. Allow the annotations recap tray to float within the browser stage by dragging its header, with bounded positioning and cleanup when annotations are cleared or the browser surface unmounts.	2026-05-02 16:55:50 +02:00
Alessandro	c2fb2c3c94	Add browser screenshot previews to tool messages Render Browser tool Screenshot KVPs as clickable live thumbnails that open the Browser canvas while preserving the existing lower-row Browser action. Add a lightweight websocket snapshot endpoint for existing browser runtimes and keep preview frame memory bounded with revocable object URLs.	2026-05-02 16:48:38 +02:00
Alessandro	90ae70eb6e	Merge ready into browser multi-tab PR	2026-05-02 15:56:16 +02:00
Alessandro	12b96ae41e	Harden browser multi-tab focus handling	2026-05-02 15:49:05 +02:00
Alessandro	ce7ec3cb4c	fix(canvas): keep browser and office surfaces opt-in Make Markdown the first-class document workflow in the office skills and state the Desktop/LibreOffice path as opt-in for GUI or binary Office work. Remove passive Browser canvas auto-opening from tool results; Browser result handling now only syncs an already-open Browser canvas, while explicit user buttons can still open the canvas or modal. Add regression coverage for the no-auto-open policy and Markdown-first skill guidance.	2026-05-02 14:08:35 +02:00
Alessandro	eb5220b058	fix(browser): read content inside shadow DOM Teach the browser page-content helper to traverse open shadow roots and assigned slot nodes when collecting text, rendering list/inline children, and resolving selectors. This lets Agent Zero inspect modern component-heavy pages more accurately without depending only on light-DOM textContent. Bump the injected helper version so existing browser contexts can refresh to the new DOM traversal behavior.	2026-05-02 13:07:10 +02:00
TerminallyLazy	5012dd3128	feat(browser): multi-tab awareness + modifier-key click - Auto-register tabs opened by site (window.open, target=_blank, ctrl-click) via context.on("page",...) with registry lock and closing-state guard. - Modifier-key click via Playwright trusted input: keyboard.down/up around mouse.click for coord-based path; locator.click(modifiers=...) selector fallback for off-screen / hidden elements. Chrome focus rule: ctrl/meta-click keeps focus on origin tab; override via focus_popup arg. - key_chord action: presses keys in order, releases in reverse; guarantees release on exception. Supports Ctrl+A/C/V style chords. - mouse modifiers click-only (raises ValueError for non-click events). - list(include_content=true) bulk read across all tabs in parallel via asyncio.gather (was sequential). - multi action: batched sub-calls. Different browser_id groups run concurrently; same browser_id sequentially. Returns array of {ok, result|error} matching input order. Lets the agent fan out reads or coordinated mutations across tabs in one tool call. - Cross-tab work no longer steals viewer focus. last_interacted_browser_id promotes only on open / set_active / same-tab work / Chrome popup rule. WebUI auto-open allowlist tightened to open|navigate|set_active so background actions don't drag the viewer. - New set_active action for explicit focus switch. - JS helper bumps VERSION to force re-injection on cached pages; exports boundingBoxFor returning {x,y,w,h,selector} for the trusted-input modifier-click paths. Backwards-compatible: every new arg is optional with safe defaults. No removed actions; existing call shapes preserved. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 06:37:21 -04:00
Alessandro	ff828e294e	Add plugin thumbnails	2026-04-28 15:04:19 +02:00
Alessandro	b7ba8eff7f	Improve browser session context lifecycle Keep browser sessions context-qualified so tabs from different chats can coexist without closing on context switches. Create a real chat context when Browser launches from dashboard/no selected context, preserving agent handoff for that session. Move chat context detail out of visible tab labels and into hover tooltips using only real chat names, with regression coverage for the updated lifecycle.	2026-04-28 14:40:48 +02:00
Alessandro	9ec070793d	Stabilize browser canvas screencast lifecycle Restart the canvas screencast after page-changing commands and remount viewport metrics when starting or resizing streams so canvas scrolling stays smooth across first mount, new tabs, and navigation. Move Browser JS off Alpine global store lookups and onto direct store imports, tighten modal/canvas handoff state, and keep annotations aligned with accepted viewport frames. Improve Browser tab close ergonomics, allow Chromium native error pages to render without blocking the UI, include right-canvas tab polish, and expand regression coverage for these paths.	2026-04-28 07:02:40 +02:00
Alessandro	decb05a682	Stabilize browser viewer viewport rendering Decode browser frames before display and only render frames that match the active viewer viewport, avoiding stretched stale screencast images during startup and resize. Keep rejecting mismatched CDP screencast frames on the backend, extend canvas viewport settling, and cover the behavior with browser regression tests. Include small browser panel CSS polish.	2026-04-28 04:29:33 +02:00
Alessandro	ad76578c47	Create chat context for browser launches Allow the Browser surface to create and select a chat context when opened without an active context. Reuse an in-flight context creation promise so repeated startup paths do not race, and update commands/viewer connection to ensure a context before calling browser websocket APIs. Add a browser regression guard for the no-context startup path.	2026-04-27 17:44:57 +02:00
Alessandro	e412f5faf7	Fix browser canvas startup viewport settle Wait for the right-canvas browser surface to finish its opening transition before using its dimensions as the Playwright viewport. Measure raw stage dimensions for stability, then apply the existing clamped viewport values so initial screencasts do not render into a stretched canvas. Add a browser regression guard for the raw viewport settle path.	2026-04-27 17:35:50 +02:00
Alessandro	10e8f5d01a	canvas + browser CSS polish	2026-04-27 00:16:58 +02:00
Alessandro	4ff3244ce6	Add browser annotate mode Add Codex-inspired annotation UI to the built-in Browser surfaces, including the Annotate toggle, Cmd/Ctrl+. shortcut, selection overlay, inline comments, and batch Draft to chat / Send now actions. Wire browser_viewer_annotation through the WebSocket and runtime layers, and expose safe DOM metadata extraction for clicked elements and selected areas without leaking password/value data. Expand regression coverage for the Browser UI, annotation dispatch, runtime helper exposure, prompt formatting, and WebUI extension surface harness behavior.	2026-04-26 23:57:48 +02:00
Alessandro	c32e328287	Polish Browser settings and viewport handling Add Browser settings for the default starting page and tool-result autofocus, and wire them through config, APIs, runtime opens, and the settings UI. Resolve Chrome extension __MSG_* manifest labels from locale metadata so installed extensions show readable names. Stabilize Browser viewport negotiation across canvas and modal surfaces by clearing stale frames, waiting for stable surface dimensions, and forcing sync after dock transitions. Move Browser loading/error state into a thin bottom status bar so it no longer overlays the page viewport.	2026-04-26 21:47:50 +02:00
Alessandro	f1b014feb3	Automatic canvas handoffs - Auto-open Office and Browser canvas surfaces from fresh tool results, including history/result messages. - Preserve Browser target IDs when focusing a canvas session from tool output. - Convert substantial response-style artifacts into Office documents at runtime, without relying only on prompt compliance. - Attach Office artifact metadata to the completed response log so the canvas opens without leaving a dangling Processing group. - Polish Office UX by removing the inactive version-history action, showing only the healthy dot, and improving Collabora blank-load recovery with browser state cleanup. - Deduplicate auto-open events and ignore stale results.	2026-04-26 19:32:50 +02:00
Alessandro	370ac9b878	Make Browser dockable and stabilize canvas interaction Extend Browser into a reusable panel that can run in either the Universal Canvas or the floating modal. Add canvas registration, dock/undock behavior, and keep the existing modal path working as a fallback. Stabilize tab switching with viewer tokens and stale-frame rejection, prevent command snapshots from crossing active tabs, and keep tab changes responsive. Improve canvas navigation and scrolling by making screencast polling non-blocking and removing page-settle waits from wheel input, so the visible frame updates promptly without stretch/catch-up artifacts. Polish Browser busy feedback with a spinner-only status affordance to avoid misleading “updating browser” copy.	2026-04-26 17:09:21 +02:00
Alessandro	dccf017d2c	Redesign Browser viewer screencast transport and viewport fit Replace the Browser viewer’s screenshot polling with CDP screencast streaming for much smoother navigation. The runtime now starts/stops CDP screencasts cleanly, acknowledges frames, drops stale frames, and keeps the WebSocket payload compatible with the existing viewer. Also fixes modal viewport sizing by sending the initial stage dimensions on subscribe, applying CDP emulation sizing before the first frame, avoiding image stretching, and increasing screencast JPEG quality to 92. Regression coverage was added for the screencast path, frame ack/drop behavior, viewport sizing, and UI rendering assumptions. -- Still needs thorough performance audit and optimization --	2026-04-26 02:28:59 +02:00
Alessandro	cf67047ad3	Polish Browser chrome and extension management UX Refine the Browser modal UI with more native-feeling tabs, consistent chrome controls, right-side tab close buttons, and a cleaner extension dropdown. Move the Browser LLM preset into the dropdown with the active Main Model summary, simplify extension settings, remove the global extension enable switch and legacy extension root behavior, and add per-extension enable toggles. Also updates the Chrome extension install/review flow with contextual warning copy, “Scan with A0”, cleaner labels, hidden empty extension state, and regression coverage for the new Browser UX.	2026-04-26 00:09:16 +02:00
Alessandro	fa7eef1919	Use persistent full Chromium runtime for Browser - Always launch Browser with full Playwright Chromium instead of switching between headless shell and extension mode - Cache Chromium under /a0/usr/plugins/_browser/playwright with legacy lookup for existing installs - Store installed Browser extensions under /a0/usr/plugins/_browser/extensions with legacy extension-root compatibility - Show clearer first-run Chromium install messaging and extend the initial Browser timeout - Fix Browser spinner animation for startup and extension install states - Update Docker Playwright install script and regression coverage	2026-04-24 19:08:01 +02:00
Alessandro	fb98c2f89a	Fix Chrome extension install and Browser startup with extensions - Download Chrome Web Store extensions using the detected Chrome prodversion instead of a stale hardcoded version - Update extension settings copy to reflect Chrome Web Store URL support - Serialize Browser persistent-context startup and clean stale Chromium profile singleton locks - Increase Browser viewer subscribe timeout for extension-enabled cold starts - Add regressions for Web Store download URL handling, slow viewer startup, and stale profile lock cleanup	2026-04-24 18:12:18 +02:00
Alessandro	983d431a5e	browser: replace browser-use agent with native browser Introduce the new built-in Browser plugin for Agent Zero, replacing the legacy browser-use-based browser agent with a direct Playwright-powered browser tool, live WebUI viewer, browser session controls, status APIs, configuration, and extension-management support. Add browser-specific modal behavior so the browser can run as a floating, resizable, no-backdrop window, including modal focus, toggle, and idempotent open helpers for richer WebUI surfaces. Remove the old `_browser_agent` core plugin and the `browser-use` dependency, then clean up stale browser-model wiring and references across agent code, model configuration docs, setup guides, troubleshooting docs, skills, and Agent Zero knowledge. Update regression and WebUI extension-surface coverage for the new browser architecture and modal behavior. The legacy browser-use implementation has been extracted from core so it can continue separately as a community plugin published through the A0 Plugin Index for any user or professional that were relying on it for workflow.	2026-04-24 15:43:52 +02:00

48 commits
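
Commit 5012dd3128 describes a key_chord action that presses keys in order, releases them in reverse, and guarantees release on exception. The following is only a minimal sketch of that pattern, not the repository's implementation: the function signature, the Keyboard protocol, and the RecordingKeyboard test double are all hypothetical names introduced here for illustration. The sketch is synchronous and assumes only that the keyboard object exposes down(key) and up(key).

```python
from typing import List, Optional, Protocol, Sequence, Tuple


class Keyboard(Protocol):
    """Hypothetical minimal keyboard interface: press and release one key."""

    def down(self, key: str) -> None: ...

    def up(self, key: str) -> None: ...


def key_chord(keyboard: Keyboard, keys: Sequence[str]) -> None:
    """Press keys in order, then release them in reverse order.

    Keys that were successfully pressed are released even if a later
    press raises, so no modifier is left stuck down.
    """
    pressed: List[str] = []
    try:
        for key in keys:
            keyboard.down(key)
            pressed.append(key)  # record only after a successful press
    finally:
        for key in reversed(pressed):
            keyboard.up(key)


class RecordingKeyboard:
    """Hypothetical test double that records events instead of sending input."""

    def __init__(self, fail_on: Optional[str] = None) -> None:
        self.events: List[Tuple[str, str]] = []
        self.fail_on = fail_on

    def down(self, key: str) -> None:
        if key == self.fail_on:
            raise RuntimeError(f"cannot press {key}")
        self.events.append(("down", key))

    def up(self, key: str) -> None:
        self.events.append(("up", key))


if __name__ == "__main__":
    kb = RecordingKeyboard()
    key_chord(kb, ["Control", "a"])
    print(kb.events)
```

Tracking the pressed keys in a list and releasing them inside finally is what keeps a failed press from leaving earlier modifiers held down. A real runtime would likely need an async variant and handling for failures during release, which this sketch leaves out.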