Commit graph

401 commits

Author SHA1 Message Date
Alessandro
d3c249cbdd Add Browser v1 explicit screenshot and form actions
Adds the explicit browser:screenshot action that writes JPEG/PNG files for vision_load, extends agent-callable Browser input actions, and documents the explicit vision workflow.

Adds the browser-forms on-demand skill and regression coverage for dispatch, runtime screenshot files, ref point resolution, upload path normalization, prompt discoverability, and label-wrapped form controls surfaced by the chat-driven E2E.
2026-05-05 15:54:13 +02:00
Alessandro
f9175ed00b Stabilize Browser modal switching
Keep Browser modal activation passive when switching from Desktop by reusing existing Browser sessions instead of creating a blank tab on viewer subscribe.

Add a Focus mode control to the Browser modal header matching Desktop's fullscreen/restore behavior.

Cover the passive subscribe path and Browser modal focus button in regression tests.
2026-05-05 14:34:42 +02:00
Alessandro
9390e42bcc Persist Agent Zero Desktop lifecycle
Keep one Xpra Desktop iframe alive across canvas, modal, and keepalive hosts instead of unloading it during normal UI handoffs. Add intentional shutdown/restart state so explicit shutdown is treated as closed, not crashed.

Add the desktop_shutdown Office API path, backend system-desktop shutdown cleanup, and an XFCE panel Shutdown Desktop launcher that requires a second click before writing the shutdown request marker. Hide unsafe logout, lock, and switch-user affordances and cover the lifecycle with focused tests.
2026-05-05 12:20:49 +02:00
Alessandro
78570e5689 Improve Linux Desktop state controls
Add a desktop_state helper, expanded desktopctl observe-act-verify commands, backend desktop_state support, Extra prompt state, and Xpra bridge diagnostics for the built-in Linux Desktop.

Update the Linux Desktop skill so agents prefer structured/app-native/keyboard workflows, treat coordinate clicks as last resort, and verify terminal or CLI-agent work with fresh final screenshots. Cover the behavior with focused Office desktop state, canvas setup, and office_session tests.
2026-05-05 11:20:50 +02:00
Alessandro
2398bd1601 Make Office artifacts ODF-first
Promote LibreOffice-native ODT, ODS, and ODP as first-class defaults for Writer, Spreadsheet, and Presentation while keeping OOXML as explicit compatibility formats.

Add ODF package generation, validation, read/edit support, and focused tests for Markdown, ODT, ODS, ODP, DOCX, XLSX, and PPTX artifact behavior.

Reduce automatic document response triggering so meta-discussions about generated files do not create artifacts, while explicit file and canvas requests still work through the intended Markdown editor or Desktop affordance.

Preserve the native A0 browser launcher, sync the live container, and validate the flow with real chats and Playwright.
2026-05-05 10:01:09 +02:00
Alessandro
d326513983 Defer office runtime preparation during startup
2026-05-04 23:04:06 +02:00
Alessandro
2d389af727 Defer Office desktop startup
Make the Office canvas mount passive so Xpra starts only when the Desktop surface is opened or an official Office document is created/opened.

Track Desktop host visibility to unload hidden frames, stop monitors, dedupe viewport resize work, and set Xpra offscreen mode according to HTTPS support. Add a near-future note for the tunnel memory footprint.

Show Office desktop startup progress

Display a loading message while the Agent Zero Desktop environment is starting or restarting, so the right-canvas Desktop button gives immediate feedback before Xpra finishes waking up.
2026-05-03 03:26:04 +02:00
Alessandro
48977bffc5 Merge pull request #1596 from ruizanthony/fix/code-execution-pty-fd-leak
fix(code_execution): close PTY file descriptors
2026-05-03 02:16:01 +02:00
Alessandro
d37500967f Restart browser runtime after stale context
Detect cached Playwright contexts that have already closed before reusing the browser runtime.

Clear stale browser pages, popup waiters, screencasts, and interaction state; stop the old Playwright instance; and restart cleanly on next use. Add regression coverage for stale context recovery and unexpected context close events.
2026-05-03 01:55:24 +02:00
Alessandro
3df27ccec3 Handle optional Xpra codec gaps on ARM64
Treat local Xpra GUI client packages as best-effort during Office runtime preparation so ARM64 codec dependency gaps do not surface as startup warnings when the browser-hosted Desktop is already usable.

Keep required Desktop Xpra packages strict, trim the ARM Docker fallback to the server/X11/html5 set, and add regression coverage for optional versus required xpra-codecs/libvpx9 failures.
2026-05-03 01:53:02 +02:00
Alessandro
677a0c1e64 Bridge desktop URLs into Browser
Register the Agent Zero Browser as the Desktop URL handler, queue URL intents from the Xfce environment, and route them into Browser on the opposite canvas/modal surface. Also make floating Browser and Desktop modals pass outside clicks through while preserving interaction inside the modal window.
2026-05-03 00:57:56 +02:00
Alessandro
6d9dedb821 Fix Office desktop runtime on arm64
2026-05-03 00:27:23 +02:00
Agent Zero Local
eecbb5ba34 fix(code_execution): handle closed PTYs while reading output
2026-05-02 21:33:08 +00:00
Agent Zero Local
d4eaa7c030 fix(code_execution): avoid double-close of PTY master fd
Use a shared mutable holder for the POSIX PTY master fd and invalidate it before close. This keeps EOF cleanup and TTYSession.close()/kill() idempotent and prevents closing an unrelated resource if the OS reuses the old fd number.
2026-05-02 21:32:12 +00:00
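
The double-close guard described in d4eaa7c030 can be illustrated with a minimal sketch. The `FdHolder` class is a hypothetical name, not the project's actual code, and a pipe stands in for the PTY master so the example runs anywhere. The point is that the stored fd is invalidated before `os.close` runs, so a second cleanup path can never close a descriptor number the OS has since reused.

```python
import os
import threading


class FdHolder:
    """Shared mutable holder for one file descriptor (hypothetical helper)."""

    def __init__(self, fd: int) -> None:
        self._fd = fd
        self._lock = threading.Lock()

    @property
    def closed(self) -> bool:
        return self._fd < 0

    def close(self) -> bool:
        """Close the descriptor at most once; return True if this call closed it."""
        with self._lock:
            # Invalidate first: later callers see -1 and never touch the old number.
            fd, self._fd = self._fd, -1
        if fd < 0:
            return False
        os.close(fd)
        return True


# A pipe read end stands in for the PTY master fd (assumption for portability).
read_fd, write_fd = os.pipe()
holder = FdHolder(read_fd)

first = holder.close()   # EOF cleanup path
second = holder.close()  # later close()/kill() path is a no-op
os.close(write_fd)
```

Both the EOF handler and an explicit close or kill can share one holder, which is what keeps the two paths idempotent.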
Agent Zero Local
a0f0c2e8d2 fix(code_execution): recover from closed PTY sessions
Detect closed or exited local TTY sessions before writing, convert invalid PTY write errors into retryable session failures, and reset/retry the terminal session once after send/read failures.
2026-05-02 21:32:12 +00:00
Agent Zero Local
2ce1947b0f fix(code_execution): close PTY file descriptors
Store and close POSIX PTY master descriptors when terminal sessions are closed or killed, and make local terminal session shutdown await the full TTY cleanup path. This prevents leaked /dev/ptmx descriptors from exhausting the process file descriptor limit.
2026-05-02 21:32:12 +00:00
Alessandro
dd696732c8 Fix canvas markdown rename before save
Route Office canvas renames through the document store so dirty or missing-on-disk Markdown sessions can be materialized at the new path without hitting the generic workdir filesystem rename endpoint. Add regression coverage for missing draft materialization, dirty markdown rename, and the custom rename hook contract.
2026-05-02 20:51:58 +02:00
Alessandro
5f2ef4f1da Hide Xfce browser and mail menu entries
Add local XDG overrides for the xfce4-mail-reader.desktop and xfce4-web-browser.desktop application IDs shipped by the current desktop runtime, while keeping the older exo-* IDs covered for compatibility. Update the desktop profile tests so future changes assert both generations of launcher IDs.
2026-05-02 20:48:14 +02:00
Alessandro
d8c0d6b9fe Fix Time Travel snapshot resilience
Force-add curated snapshot paths so workspace .gitignore rules cannot break Time Travel snapshots, while preserving Time Travel's own exclusions for secrets and generated files.

Repair invalid shadow Git repositories by restoring HEAD when possible or quarantining and reinitializing unusable repos, and canonicalize workspace paths to avoid duplicate shadow histories for aliases.

Add regression coverage for ignored paths, corrupt shadow HEAD recovery, and canonical workspace identity.
2026-05-02 20:27:28 +02:00
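
The canonical workspace identity mentioned in d8c0d6b9fe might look roughly like the sketch below. `workspace_key` and the hashing scheme are assumptions made for illustration; the source only states that paths are canonicalized so that aliases do not produce duplicate shadow histories.

```python
import hashlib
import os
import tempfile


def workspace_key(path: str) -> str:
    """Return a stable identifier for a workspace path (hypothetical helper)."""
    # realpath resolves symlinks and "." / ".." segments, so aliases collapse.
    canonical = os.path.realpath(path)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


with tempfile.TemporaryDirectory() as root:
    real = os.path.join(root, "project")
    alias = os.path.join(root, "alias")
    os.mkdir(real)
    os.symlink(real, alias)  # an alias pointing at the same workspace

    same_key = workspace_key(real) == workspace_key(alias)
    dotted = os.path.join(real, ".", "..", "project")
    dotted_same = workspace_key(dotted) == workspace_key(real)
```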
Alessandro
d6d97d037c Fix skills selector unloading
Remove dynamically loaded skills when they are deactivated from the Skills selector. Treat skill names and paths as aliases so scoped defaults, chat overrides, and loaded-skill state resolve consistently.
2026-05-02 20:14:49 +02:00
Alessandro
0da8f3dc2b Add OAuth disconnect and remaining quota visibility
Allow users to disconnect their OpenAI account by clearing stored ChatGPT OAuth tokens while preserving unrelated auth data.

Fetch and normalize Codex usage windows, then show remaining percentage and reset timing in the OAuth settings UI.

Add focused tests for usage parsing and disconnect cleanup.
2026-05-02 20:14:04 +02:00
Alessandro
e63173d812 Polish Agent Zero Desktop defaults
Show hidden files by default in Thunar-backed Desktop sessions while preserving existing file manager profile settings.

Hide the default Xfce Mail Reader and Web Browser helper entries from the Applications menu through local XDG overrides, and cover the generated Desktop profile artifacts with targeted tests.
2026-05-02 20:04:37 +02:00
Alessandro
c553e91c03 Add Browser extension UI open action
Detect openable Chrome extension UI pages from manifests and expose resolved chrome-extension URLs to the Browser UI.

Render an Open button in the compact Browser extension dropdown and cover manifest UI metadata with regression tests.
2026-05-02 20:02:28 +02:00
Alessandro
92ae20da2c Add Desktop habitat README
Install a curated README into generated Agent Zero Desktop sessions so the Xfce workspace explains the habitat concept, credits the open-source foundations and Jan Tomášek, and gives users Terminal commands for popular agent CLIs.

Keep the README as an _office plugin asset and copy it into the Desktop profile during launcher preparation.
2026-05-02 19:51:07 +02:00
Alessandro
27b3624a97 Add Office document rename action
Add a pencil action beside Save that reuses the existing file browser rename modal for open Office documents. Preserve document metadata after filesystem renames, retarget active LibreOffice desktop sessions to the new path, and cover the rename flow in Office regression tests.
2026-05-02 19:31:53 +02:00
Alessandro
e64b9b2538 Remove legacy Office canvas affordances
Route DOCX, spreadsheets, and presentations exclusively through the Xpra desktop LibreOffice session. Keep the custom canvas path focused on Markdown source editing, remove the old dashboard/preview/native LibreOfficeKit code, and update tests and runtime package declarations to match the new Office surface.
2026-05-02 19:24:49 +02:00
Alessandro
739c0a18a3 Improve Office desktop integration
Route binary Office documents through the persistent Desktop surface while keeping Markdown in the custom tabbed editor.

Harden Xpra clipboard bridging and explicit clipboard flags so host paste can reach the desktop session.

Align XFCE and LibreOffice profile paths with Agent Zero locations: downloads for wallpapers, configured workdir for default saves and the Workdir shortcut, and trusted metadata for generated launchers.
2026-05-02 18:39:32 +02:00
Alessandro
2f5f98521d Fix Desktop cursor and canvas resize handoff
Make the embedded Xpra Desktop use the browser cursor as the only visible cursor by suppressing the shadow pointer overlay and pointer-position renderer without blocking pointer input.

Prefer the active Office host iframe when choosing the Desktop frame, then force resize recovery during modal-to-canvas docking so the Xpra desktop, window, and canvas refill the canvas after handoff.
2026-05-02 18:05:39 +02:00
Alessandro
74dcb32814 Fix Desktop Xpra keyboard focus capture
Make the Desktop iframe explicitly focusable and re-arm Xpra keyboard capture on load and click so typed input reaches the remote session reliably.

Add regression assertions for the Xpra keyboard bridge contract.
2026-05-02 17:51:27 +02:00
Alessandro
ad7925b543 Enable clipboard shortcuts in Browser visual mode
Bridge copy, cut, paste, and common edit shortcuts from the Browser modal and canvas screenshot surface into the Playwright runtime while preserving native clipboard behavior for Agent Zero UI fields.

Add websocket and runtime clipboard handling with regression coverage for frontend shortcut routing, paste fallback, and viewer input dispatch.
2026-05-02 17:21:44 +02:00
Alessandro
ae94f158df Make Time Travel modal-only
Add a Time Travel entry directly under Files in the sidebar dropdown and route it through the existing modal. Stop Time Travel from registering or mounting a right-canvas surface, and keep modal refresh tied to the modal state.
2026-05-02 17:20:05 +02:00
Alessandro
9fc3ff20a4 Add browser extension uninstall controls
Expose extension deletion from the Browser internal settings page and keep the compact Browser dropdown focused on quick enable/install actions.

Add a guarded uninstall API that only deletes Browser-managed extension folders, updates enabled extension paths, refreshes the settings UI, and covers managed versus external paths with regression tests.
2026-05-02 17:05:33 +02:00
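
A guarded uninstall of the kind 9fc3ff20a4 describes could be sketched as follows. The function names, the managed-root layout, and the use of `PermissionError` are assumptions for illustration only; the source states just that Browser-managed folders may be deleted, external ones may not, and the enabled paths are updated.

```python
from __future__ import annotations

import shutil
import tempfile
from pathlib import Path


def is_managed(extension_dir: Path, managed_root: Path) -> bool:
    """True only if extension_dir resolves to a location strictly inside managed_root."""
    root = managed_root.resolve()
    target = extension_dir.resolve()
    return root in target.parents


def uninstall_extension(extension_dir: Path, managed_root: Path, enabled: list[str]) -> list[str]:
    """Delete a managed extension folder and return the updated enabled paths."""
    if not is_managed(extension_dir, managed_root):
        raise PermissionError(f"refusing to delete external path: {extension_dir}")
    target = extension_dir.resolve()
    shutil.rmtree(target)
    return [p for p in enabled if Path(p).resolve() != target]


with tempfile.TemporaryDirectory() as tmp:
    managed = Path(tmp) / "managed"
    ext_a = managed / "ext-a"
    external = Path(tmp) / "external" / "ext-b"
    ext_a.mkdir(parents=True)
    external.mkdir(parents=True)

    enabled = [str(ext_a), str(external)]
    remaining = uninstall_extension(ext_a, managed, enabled)
    managed_removed = not ext_a.exists()

    try:
        uninstall_extension(external, managed, remaining)
        external_refused = False
    except PermissionError:
        external_refused = True
    external_kept = external.exists()
```

Resolving both paths before comparing means a symlink or `..` segment cannot be used to point the delete at a folder outside the managed root.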
Alessandro
39a96012f9 Make browser annotation tray draggable
Fix annotation panel stacking so draft popovers render above the annotations recap.

Allow the annotations recap tray to float within the browser stage by dragging its header, with bounded positioning and cleanup when annotations are cleared or the browser surface unmounts.
2026-05-02 16:55:50 +02:00
Alessandro
c2fb2c3c94 Add browser screenshot previews to tool messages
Render Browser tool Screenshot KVPs as clickable live thumbnails that open the Browser canvas while preserving the existing lower-row Browser action.

Add a lightweight websocket snapshot endpoint for existing browser runtimes and keep preview frame memory bounded with revocable object URLs.
2026-05-02 16:48:38 +02:00
Alessandro
90ae70eb6e Merge ready into browser multi-tab PR
2026-05-02 15:56:16 +02:00
Alessandro
12b96ae41e Harden browser multi-tab focus handling
2026-05-02 15:49:05 +02:00
Alessandro
3466160e4c Harden Office canvas sync and PPTX output
Sync document_artifact results into an already-open Office canvas without auto-opening a closed canvas.

Generate PPTX artifacts through the Office plugin writer so PowerPoint decks open in Impress with visible multi-slide content.

Add focused regression coverage for canvas sync behavior and PPTX slide creation.
2026-05-02 15:28:43 +02:00
Alessandro
04926a3a65 Harden Office artifact workflows
Keep Office document artifacts from auto-opening the canvas while adding plugin-owned Download and Open in canvas message actions. Add format-specific skills for Markdown, Word, Excel, and presentation workflows, and clarify the startup-warmed Desktop runtime remains visually opt-in.

Cover the Excel method=create path, Markdown-first/no-auto-open policies, response affordance copy, document action buttons, and Desktop bootstrap with focused regressions.
2026-05-02 14:46:56 +02:00
Alessandro
aea1718f9f Revert "Show active models for Default LLM"
This reverts commit 7c59ac9e57.
2026-05-02 14:12:25 +02:00
Alessandro
baac20f7a4 Fix Office artifact creation and canvas closing
Add an explicit close button to the right canvas toolbar, next to the undock control, and cover its label, handler, and ordering in the canvas regression test.

Treat document_artifact tool_args.method as an action alias so calls like method=create with format=xlsx create workbooks instead of falling back to LibreOffice status. Add regression coverage for the exact XLSX creation shape.
2026-05-02 14:08:35 +02:00
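
The `method` alias handling from baac20f7a4 can be pictured with a small sketch. `resolve_action` is a hypothetical helper, and the choice to let an explicit `action` win over `method` is an assumption; the source only says that `method=create` with `format=xlsx` should create a workbook rather than fall back to the status response.

```python
def resolve_action(tool_args: dict) -> str:
    """Pick the requested action, accepting `method` as an alias for `action`."""
    # Assumed precedence: explicit action, then the method alias, then status.
    action = tool_args.get("action") or tool_args.get("method") or "status"
    return str(action).strip().lower()


xlsx_call = {"method": "create", "format": "xlsx"}
resolved = resolve_action(xlsx_call)        # alias is honoured
fallback = resolve_action({"format": "xlsx"})  # nothing requested
explicit = resolve_action({"action": "open", "method": "create"})
```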
Alessandro
ce7ec3cb4c fix(canvas): keep browser and office surfaces opt-in
Make Markdown the first-class document workflow in the office skills and state the Desktop/LibreOffice path as opt-in for GUI or binary Office work.

Remove passive Browser canvas auto-opening from tool results; Browser result handling now only syncs an already-open Browser canvas, while explicit user buttons can still open the canvas or modal. Add regression coverage for the no-auto-open policy and Markdown-first skill guidance.
2026-05-02 14:08:35 +02:00
Alessandro
1a32d3a295 fix(desktop): stabilize Xpra viewport resizing
Make Desktop canvas and modal handoffs resize the live Xpra viewport deterministically by syncing the visible frame, making backend resize requests authoritative, unloading hidden iframe clients, and guarding Xpra HTML menu callbacks when the menu is disabled. Also forwards wheel events from the embedded Xpra canvas so mouse and trackpad scrolling reach the Linux desktop session.
2026-05-02 14:08:35 +02:00
Alessandro
eb5220b058 fix(browser): read content inside shadow DOM
Teach the browser page-content helper to traverse open shadow roots and assigned slot nodes when collecting text, rendering list/inline children, and resolving selectors. This lets Agent Zero inspect modern component-heavy pages more accurately without depending only on light-DOM textContent.

Bump the injected helper version so existing browser contexts can refresh to the new DOM traversal behavior.
2026-05-02 13:07:10 +02:00
Alessandro
62ac20e7b2 feat(desktop): add Linux Desktop skill controls
Add a linux-desktop skill that teaches Agent Zero how to operate the persistent XFCE/Xpra desktop through desktopctl.sh, including app launch, focus, click, typing, and stable folder entry points for Workdir, Projects, Skills, Agents, and Downloads.

Add a Calc cell-edit helper that opens a workbook through the visible LibreOffice Calc desktop session, updates a requested sheet cell, saves, and verifies the XLSX on disk. Expand the Office canvas setup tests to cover Desktop branding, Xpra package requirements, resize behavior, mobile canvas gating, and the new skill helpers.
2026-05-02 13:07:10 +02:00
Alessandro
24dd548ebf feat(office-ui): introduce the Desktop document canvas
Rework the Office canvas into the Desktop surface, with Markdown editing for text documents and official LibreOffice/Xpra sessions for DOCX, XLSX, and PPTX. The panel now presents Desktop-oriented actions, named header buttons, persistent session tabs, adaptive modal/canvas sizing, and fast client-side Xpra frame fitting during resize.

Stop auto-opening the canvas from document tool results, hide the canvas on mobile-width layouts, and emit resize lifecycle events so embedded desktop surfaces can pause expensive work while the user drags.
2026-05-02 13:07:10 +02:00
Alessandro
10a6cd28c6 feat(office): replace Collabora with LibreOffice document runtime
Remove the Collabora/WOPI runtime and route stack, including the old status APIs, proxy helpers, bootstrap extensions, and WOPI store tests. Add the Markdown-first document store, LibreOffice status/conversion helpers, LibreOfficeKit session bridge, and reusable Xpra virtual desktop gateway used by the new document runtime.

Update image and self-update bootstrap paths so existing containers can acquire the LibreOffice, XFCE, Xpra, and desktop-control dependencies through the normal install hooks instead of an ad hoc manual install.
2026-05-02 13:07:10 +02:00
TerminallyLazy
5012dd3128 feat(browser): multi-tab awareness + modifier-key click
- Auto-register tabs opened by site (window.open, target=_blank,
  ctrl-click) via context.on("page",...) with registry lock and
  closing-state guard.
- Modifier-key click via Playwright trusted input: keyboard.down/up
  around mouse.click for coord-based path; locator.click(modifiers=...)
  selector fallback for off-screen / hidden elements. Chrome focus
  rule: ctrl/meta-click keeps focus on origin tab; override via
  focus_popup arg.
- key_chord action: presses keys in order, releases in reverse;
  guarantees release on exception. Supports Ctrl+A/C/V style chords.
- mouse modifiers click-only (raises ValueError for non-click events).
- list(include_content=true) bulk read across all tabs in parallel
  via asyncio.gather (was sequential).
- multi action: batched sub-calls. Different browser_id groups run
  concurrently; same browser_id sequentially. Returns array of
  {ok, result|error} matching input order. Lets the agent fan out
  reads or coordinated mutations across tabs in one tool call.
- Cross-tab work no longer steals viewer focus.
  last_interacted_browser_id promotes only on open / set_active /
  same-tab work / Chrome popup rule. WebUI auto-open allowlist
  tightened to open|navigate|set_active so background actions don't
  drag the viewer.
- New set_active action for explicit focus switch.
- JS helper bumps VERSION to force re-injection on cached pages;
  exports boundingBoxFor returning {x,y,w,h,selector} for the
  trusted-input modifier-click paths.
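
The key_chord behaviour listed above (press in order, release in reverse, always release on error) can be sketched as follows. `FakeKeyboard` is a stand-in written for this example; it mimics the async `down`/`up` shape of a Playwright keyboard but is not the commit's code.

```python
import asyncio


class FakeKeyboard:
    """Recording stand-in for a keyboard with async down/up (hypothetical)."""

    def __init__(self, fail_on=None):
        self.events = []
        self.fail_on = fail_on

    async def down(self, key):
        if key == self.fail_on:
            raise RuntimeError(f"cannot press {key}")
        self.events.append(("down", key))

    async def up(self, key):
        self.events.append(("up", key))


async def key_chord(keyboard, keys):
    """Press keys in order, then release them in reverse, even on failure."""
    pressed = []
    try:
        for key in keys:
            await keyboard.down(key)
            pressed.append(key)
    finally:
        # Only keys that actually went down are released.
        for key in reversed(pressed):
            await keyboard.up(key)


ok_keyboard = FakeKeyboard()
asyncio.run(key_chord(ok_keyboard, ["Control", "Shift", "A"]))

bad_keyboard = FakeKeyboard(fail_on="A")
try:
    asyncio.run(key_chord(bad_keyboard, ["Control", "Shift", "A"]))
    raised = False
except RuntimeError:
    raised = True
```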

Backwards-compatible: every new arg is optional with safe defaults.
No removed actions; existing call shapes preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 06:37:21 -04:00
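
The scheduling rule of the multi action in 5012dd3128 (different browser_id groups concurrently, same browser_id sequentially, results in input order) could be approximated with the sketch below. `run_multi` and `demo_handler` are illustrative names and the call shape is assumed; only the grouping rule and the `{ok, result|error}` result shape come from the commit message.

```python
import asyncio
from collections import defaultdict


async def run_multi(calls, handler):
    """Run sub-calls grouped by browser_id; return results in input order."""
    results = [None] * len(calls)
    groups = defaultdict(list)
    for index, call in enumerate(calls):
        groups[call.get("browser_id")].append((index, call))

    async def run_group(items):
        # Calls that share a browser_id run one after another.
        for index, call in items:
            try:
                results[index] = {"ok": True, "result": await handler(call)}
            except Exception as exc:
                results[index] = {"ok": False, "error": str(exc)}

    # Distinct browser_id groups run concurrently.
    await asyncio.gather(*(run_group(items) for items in groups.values()))
    return results


async def demo_handler(call):
    await asyncio.sleep(0)
    if call["action"] == "fail":
        raise ValueError("boom")
    return f'{call["browser_id"]}:{call["action"]}'


calls = [
    {"browser_id": 1, "action": "read"},
    {"browser_id": 2, "action": "fail"},
    {"browser_id": 1, "action": "click"},
]
results = asyncio.run(run_multi(calls, demo_handler))
```

Catching the exception per sub-call is what lets one failing tab report an error without cancelling the other groups.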
Alessandro
7c71185f16 feat(a0-connector): lazy-load remote tool guidance
Move A0 CLI remote execution and file-editing guidance into skills, and gate compact remote tool stubs on subscribed CLI capabilities instead of always advertising unavailable tools. Retire verbose per-turn remote guidance extras while preserving connector protocol and tool schemas.
2026-04-28 16:14:53 +02:00
Alessandro
24812aabbb Fix Office session cleanup race
Keep newly-created Office sessions out of orphan cleanup so in-flight iframe loads do not lose their WOPI tokens during mount refreshes.

Add regression coverage for the fresh-session grace window while preserving cleanup for older orphaned sessions.
2026-04-28 16:14:53 +02:00
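
The fresh-session grace window from 24812aabbb can be sketched in a few lines. The `Session` fields, the 30-second value, and `orphans_to_clean` are assumptions chosen for illustration; the commit message does not state the actual window length or data model.

```python
import time
from dataclasses import dataclass

GRACE_SECONDS = 30.0  # assumed value; the real window is not given in the source


@dataclass
class Session:
    session_id: str
    created_at: float
    mounted: bool


def orphans_to_clean(sessions, now=None, grace=GRACE_SECONDS):
    """Return ids of unmounted sessions that are older than the grace window."""
    now = time.time() if now is None else now
    return [
        s.session_id
        for s in sessions
        if not s.mounted and (now - s.created_at) >= grace
    ]


sessions = [
    Session("fresh", created_at=990.0, mounted=False),   # iframe still loading
    Session("stale", created_at=900.0, mounted=False),   # genuine orphan
    Session("active", created_at=900.0, mounted=True),   # in use
]
to_clean = orphans_to_clean(sessions, now=1000.0)
```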
Alessandro
d387b1827f Fix Codex account SSE response recovery
Decode byte chunks from the live Codex/ChatGPT account SSE stream before parsing events.

Preserve accumulated output_text deltas when the final response.completed object is present but has no extractable output content.

Update the OAuth tests to cover byte-delivered SSE chunks and empty completed responses.
2026-04-28 16:14:53 +02:00
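
The two fixes in d387b1827f (decode byte chunks before parsing, keep accumulated deltas when the completed event carries no text) can be illustrated with the sketch below. The event payload shape is a simplified assumption, and `collect_output_text` and `sse` are hypothetical helpers, not the project's code.

```python
import codecs
import json


def collect_output_text(chunks):
    """Parse SSE byte chunks; fall back to accumulated deltas if the final event is empty."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    buffer = ""
    deltas = []
    final_text = None
    for chunk in chunks:
        # Incremental decoding tolerates multi-byte characters split across chunks.
        buffer += decoder.decode(chunk)
        while "\n\n" in buffer:
            raw_event, buffer = buffer.split("\n\n", 1)
            data = [ln[5:].lstrip() for ln in raw_event.split("\n") if ln.startswith("data:")]
            if not data:
                continue
            payload = json.loads("\n".join(data))
            kind = payload.get("type")
            if kind == "response.output_text.delta":
                deltas.append(payload.get("delta", ""))
            elif kind == "response.completed":
                # Assumed field name for the final text; may be absent or empty.
                final_text = (payload.get("response") or {}).get("output_text") or None
    return final_text if final_text else "".join(deltas)


def sse(payload):
    return ("data: " + json.dumps(payload, ensure_ascii=False) + "\n\n").encode("utf-8")


stream = (
    sse({"type": "response.output_text.delta", "delta": "caf\u00e9 "})
    + sse({"type": "response.output_text.delta", "delta": "ok"})
    + sse({"type": "response.completed", "response": {}})
)
split_at = stream.index(b"\xc3") + 1  # cut in the middle of a two-byte character
text = collect_output_text([stream[:split_at], stream[split_at:]])
```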