Commit graph

48 commits

Author SHA1 Message Date
Alessandro
d4dc83ba78 Expose installed plugin toggles
Advertise the installed_plugins connector capability and add a protected API endpoint that lists already-installed Agent Zero plugins and toggles supported plugins only.

The endpoint normalizes plugin metadata, preserves the installed-only safety boundary, and refuses changes to protected plugins such as _a0_connector so the CLI cannot disconnect itself.
2026-05-26 20:05:47 +02:00
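The installed-only boundary and the protected-plugin refusal described in this commit can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual endpoint: `toggle_plugin`, `PluginToggleError`, and the state dictionary shape are hypothetical, and only the `_a0_connector` name comes from the commit message.

```python
# Hypothetical sketch; only "_a0_connector" is taken from the commit message.
PROTECTED_PLUGINS = frozenset({"_a0_connector"})


class PluginToggleError(Exception):
    """Raised when a toggle request crosses the safety boundary."""


def toggle_plugin(installed: dict[str, bool], name: str, enabled: bool) -> dict[str, bool]:
    """Return a copy of `installed` with `name` switched on or off."""
    # Installed-only boundary: never create entries for unknown plugins.
    if name not in installed:
        raise PluginToggleError(f"{name!r} is not installed")
    # Protected plugins stay as they are so the client cannot disconnect itself.
    if name in PROTECTED_PLUGINS:
        raise PluginToggleError(f"{name!r} is protected and cannot be toggled")
    updated = dict(installed)
    updated[name] = enabled
    return updated
```

Returning a copy instead of mutating the input keeps a rejected request from leaving partial state behind.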
Alessandro
4f06aa0a8e Fix MCP multimodal content handling
Preserve MCP image, audio, and resource tool results instead of collapsing non-text responses into an empty textual result. Images and image resources now flow into raw history as data URL attachments, while audio and non-image binary resources are saved as artifacts with normalized paths.

Extract shared media artifact helpers for base64 validation, image data URLs, decoded-size checks, artifact saving, MIME normalization, and safe filenames. Reuse the shared helpers from MCP, browser connector, and computer-use artifact paths, and add focused regression coverage.
2026-05-26 15:31:33 +02:00
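A rough sketch of what shared media helpers of this kind might look like, using only the Python standard library. The function names, the size limit, and the filename character set are assumptions made for illustration; the commit does not show the real helper signatures.

```python
import base64
import binascii
import re

MAX_DECODED_BYTES = 10 * 1024 * 1024  # assumed limit, not taken from the source


def decode_base64_checked(payload: str, max_bytes: int = MAX_DECODED_BYTES) -> bytes:
    """Strictly decode base64 and reject oversized results."""
    try:
        raw = base64.b64decode(payload, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("payload is not valid base64") from exc
    if len(raw) > max_bytes:
        raise ValueError("decoded payload exceeds the size limit")
    return raw


def image_data_url(mime_type: str, payload: str) -> str:
    """Build an image data URL after normalizing the MIME type."""
    mime = mime_type.split(";", 1)[0].strip().lower()
    if not mime.startswith("image/"):
        raise ValueError(f"not an image MIME type: {mime!r}")
    decode_base64_checked(payload)  # validate before embedding
    return f"data:{mime};base64,{payload}"


def safe_filename(name: str, default: str = "artifact") -> str:
    """Drop directory parts and replace characters outside a conservative set."""
    base = name.replace("\\", "/").rsplit("/", 1)[-1]
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", base).strip("._")
    return cleaned or default
```

Keeping validation in one place means every caller (MCP, browser connector, computer-use) rejects the same malformed inputs in the same way.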
Alessandro
97953db46b Guide computer-use remote through Linux AT-SPI
Add a Linux-specific host computer-use skill, route Wayland/AT-SPI backends to it instead of macOS AX guidance, and include compact structural tree outlines in AX/UIA snapshot responses so agents can pick paths and semantic targets from the tool result.
2026-05-23 19:25:51 +02:00
Alessandro
b670559322 Guide Windows computer use through UIA
Add the Windows host computer-use skill and teach computer_use_remote to surface UIA window-management guidance, selector passthrough, and click-last workflow hints. Keep backend-specific actions out of generic guidance while exposing Windows structural operations when the backend advertises them.

Tests: uv run --python 3.12 --with-requirements requirements.txt --with-requirements requirements2.txt --with-requirements requirements.dev.txt --with litellm pytest tests\test_tool_action_contracts.py tests\test_a0_connector_prompt_gating.py tests\test_skills_runtime.py -q
2026-05-23 18:25:04 +02:00
Alessandro Frau
a931759868 Keep backend computer-use actions out of generic guidance
Move explicit AX action names and argument details out of the always-loaded computer_use_remote prompt and generic host-computer-use skill. The generic guidance now only explains backend discovery and skill loading, while host-computer-use-macos remains the detailed home for macOS structural targeting. Also soften the old Super+H hide-window guidance so window actions are chosen from the reported backend and verified visually.
2026-05-23 15:27:14 +02:00
Alessandro Frau
e7cb3aa3fa Split macOS computer-use backend guidance
Add a macOS-specific computer-use skill for AX structural targeting and keep the generic host skill backend-neutral. Surface backend ids, families, and advertised features from computer_use_remote start/status results, add backend-gated ax_snapshot and ax_action handling, and prompt the model to load the macOS skill only when the CLI reports matching support.
2026-05-23 15:08:06 +02:00
Alessandro Frau
a80cf3842e Treat computer-use approval denial as rearm required
Map COMPUTER_USE_APPROVAL_REQUIRED tool responses into the existing COMPUTER_USE_REARM_REQUIRED stop guidance. This keeps agents from retrying desktop actions or using screenshot fallbacks when macOS permissions still require a user-approved re-arm.
2026-05-23 13:32:47 +02:00
Alessandro
cee9abfde4 Expose computer-use captures as vision messages
Store computer-use screenshots as standalone RawMessage entries after the textual tool result, matching the existing vision_load path so the model receives a real multimodal message.

Prefer shared screenshot file paths over base64 artifacts when available, and tighten host computer-use guidance so agents stop instead of proceeding from unverified state when a screenshot is not visible.
2026-05-23 11:51:24 +02:00
Alessandro
2f9037a195 Prefer Super+H for host window hiding
Update computer_use_remote guidance for Ubuntu/GNOME/Wayland so hide-window tasks use Super+H instead of Alt+F9. Reinforce that type results only prove keystrokes were sent and that the agent must verify the fresh screenshot before typing follow-up text or claiming success.
2026-05-23 11:33:06 +02:00
Alessandro
ae5e462cd7 Separate host computer use from Xpra desktop
Clarify that computer_use_remote is the only host desktop-control path and that linux-desktop only targets the internal Docker/Xpra Desktop. Add host-computer-use retrieval triggers and regression coverage so host-screen queries rank ahead of the Xpra desktop skill while explicit Agent Zero Desktop requests still route to linux-desktop.
2026-05-23 11:19:23 +02:00
Alessandro
30d364bb97 Attach computer-use captures to tool results
Return computer-use captures as multimodal tool-result content so the model can visually inspect fresh screenshots after each remote action. Keep the textual preview for logs and prune older capture payloads to avoid runaway context growth.
2026-05-23 11:10:50 +02:00
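Pruning older capture payloads, as this commit describes, could look something like the sketch below. The message shape (`text` and `image_url` keys) and the `keep_last` parameter are hypothetical; the real history format is not shown in the commit.

```python
from typing import Any


def prune_old_captures(messages: list[dict[str, Any]], keep_last: int = 1) -> list[dict[str, Any]]:
    """Drop image payloads from all but the newest `keep_last` captures."""
    capture_idx = [i for i, m in enumerate(messages) if m.get("image_url")]
    to_prune = set(capture_idx[:-keep_last]) if keep_last > 0 else set(capture_idx)
    pruned = []
    for i, message in enumerate(messages):
        if i in to_prune:
            # Keep the textual preview, drop the heavy payload.
            message = {
                **message,
                "image_url": None,
                "text": message.get("text", "") + " [capture pruned]",
            }
        pruned.append(message)
    return pruned
```

The textual preview survives so logs and later turns can still see that a capture happened.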
Alessandro
1f34b87c00 Require visual verification for computer-use captures
Sanitize embedded image data URLs from prompt token estimates so screenshot attachments do not explode context accounting.

Strengthen computer_use_remote prompt, skill, and capture-result text so state-changing desktop actions are treated as attempts until a fresh screen visibly confirms the requested outcome.
2026-05-23 10:32:37 +02:00
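One way to keep embedded image data URLs out of token estimates is to replace them with a short placeholder before counting. The sketch below assumes a crude four-characters-per-token heuristic and a `[image]` placeholder; both are illustrative choices, not the project's actual estimator.

```python
import re

# Matches base64 image data URLs; the placeholder text is an assumption.
_DATA_URL_RE = re.compile(r"data:image/[A-Za-z0-9.+-]+;base64,[A-Za-z0-9+/=]+")


def strip_image_data_urls(text: str, placeholder: str = "[image]") -> str:
    """Replace every embedded image data URL with a short placeholder."""
    return _DATA_URL_RE.sub(placeholder, text)


def estimate_tokens(text: str) -> int:
    """Rough estimate at about four characters per token (assumed heuristic)."""
    return max(1, len(strip_image_data_urls(text)) // 4)
```

With this in place, a message carrying a multi-kilobyte screenshot is counted by its surrounding text only.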
Alessandro
60c36d16d8 Expose computer_use_remote as a runtime-checked tool
Add the standard tool prompt contract so the model can call computer_use_remote in live sessions.

Keep availability, CLI enablement, trust mode, and re-arm enforcement as runtime checks instead of prompt-loader gating.

Update connector prompt and prompt-budget tests to cover the new exposure path.
2026-05-23 09:40:48 +02:00
Alessandro
770b53e292 Expose connector skill activation
Add a protected skills_activate endpoint and context-aware skills_list support so connector clients can activate skills in live chats. Advertise the capability through the connector API.
2026-05-22 17:03:04 +02:00
Alessandro
5464ead7ce Update computer use allow guidance
Point rearm-required Computer Use guidance at /computer-use on and remove the old confirm/free-run wording so the Agent Zero plugin matches the new CLI access model.
2026-05-22 16:20:58 +02:00
Alessandro
e36cf19bfc Avoid persisting computer-use capture artifacts
Attach base64 computer-use capture artifacts directly as data-image URLs in RawMessage content instead of materializing them under the connector temp capture directory. Keep legacy path-based captures as a fallback while preserving base64 compatibility for model adapters and avoiding durable screenshot files for artifact payloads.
2026-05-22 05:09:25 +02:00
Alessandro
38bbff3d9a Add connector message queue protocol
Advertise message queue support from the Agent Zero connector backend and add WebSocket handlers for queue add, remove, and send operations.

Include queue snapshots in context subscriptions and emit queue updates as the backend state changes so the CLI can stay in sync.
2026-05-15 18:13:32 +02:00
Alessandro
b48e31bead Forward remote exec reset flag
Forward reset=true from code_execution_remote replacement commands to the connected CLI and document when to use it versus runtime=reset. This lets the CLI tear down stuck host sessions before running the next command.

Tests from /home/eclypso/a0/a0-connector: PYTHONPATH=src conda run -n a0 pytest tests/test_plugin_backend.py::test_code_execution_remote_forwards_reset_true_with_replacement_command -v; ./.venv/bin/python -m pytest tests/test_plugin_backend.py -k 'code_execution_remote or select_remote_exec or ws_connector_exec_result' -v. Mirrored to live container 07e0288dc04f and health check returned HTTP 200.
2026-05-15 00:28:10 +02:00
Alessandro
7b61ceb241 Reflect connector model overrides in Web UI
Render custom per-chat model overrides in the model switcher instead of hiding them behind a generic Custom label.

Mark model override updates dirty so an already-open Web UI refreshes after CLI or Web UI changes, without exposing API key values in labels.

Add focused regression coverage for switcher rendering hooks and state-sync notifications.
2026-05-12 16:04:02 +02:00
Alessandro
4bab8da3f5 Keep host browser requests on Browser runtime
Route host/local browser requests through the Browser tool instead of desktop or shell fallbacks. Add remote-debugging setup guidance to Browser runtime errors and document the exact Chrome inspect setting in prompts, skills, and Web UI copy.
2026-05-12 15:45:29 +02:00
Alessandro
f17198e126 fix: tighten tool guidance and editor workflows
2026-05-11 11:51:58 +02:00
Alessandro
6d29268cbd refactor: align skills and tool guidance
Rename high-impact skills to task-oriented names and move plugin-owned skills into their owning plugin folders.

Align renamed skill frontmatter with the official SKILL.md standard by keeping trigger language in name/description metadata, replacing the old create-skill wizard with build-skill, and updating browser, A0 connector, computer-use, CLI setup, and scheduler skill references.

Tighten the recurring cross-provider guidance gaps surfaced by the evidence sweeps: memory requests now avoid promptinclude-file routing, scheduler prompts distinguish cron schedules from planned ISO dates, document questions prefer document_query, skills_tool search/read_file usage is clearer, normal notifications set info/priority 10, and local/host text editors preserve patch intent.

Update regression tests for the renamed skills, plugin ownership, prompt budget reality, and standard frontmatter shape.
2026-05-10 07:13:14 +02:00
Alessandro
c8e239d5a2 Harden remote computer use readiness
Track computer-use CLI status, last error, and restore-token presence in connector metadata so stale Free Run settings are no longer treated as ready.

Materialize CLI-provided screenshot artifacts through Agent Zero's file helpers, stop dispatching computer_use_remote actions when metadata already reports rearm required, and teach the skill to give backend-agnostic rearm guidance without screenshot or vision fallbacks.
2026-05-10 00:05:08 +02:00
Alessandro
daf95ec3ab Normalize tool contracts and slim prompt surface
Standardize multi-action tools around tool_args.action while keeping parser compatibility for older tool/args, tool_name:action, and method-shaped requests. This keeps new prompts clean without breaking agents that learned the previous dialect.

Move A0 connector remote execution/file tools into stable standard prompts, make remote targeting independent of the active chat context, and skill-gate beta computer-use remote so it no longer weighs down the always-on tool list.

Align text editor, scheduler, skills, office artifact, memory, notify, and browser prompts/tools around the canonical action contract. Add scheduler update/timezone handling, skills_tool read_file, text editor patch coverage, and fixes for memory_forget, behaviour_adjustment, and code execution progress warnings.

Reduce default prompt pressure by compacting browser and scheduler prompts into skill-backed manifests, shortening skill catalog descriptions, and pruning noisy framework knowledge. Remove obsolete connector prompt stubs and root tool-call knowledge examples.

Tests: conda run -n a0 pytest tests/test_a0_connector_prompt_gating.py tests/test_tool_action_contracts.py tests/test_task_scheduler_timezone.py tests/test_text_editor_context_patch.py tests/test_tool_request_normalization.py tests/test_office_document_store.py::test_odf_is_advertised_and_docx_remains_explicit_compatibility tests/test_office_document_store.py::test_document_artifact_accepts_method_alias_for_ods_create tests/test_skills_runtime.py tests/test_default_prompt_budget.py::test_a0_small_profile_removed_and_prompt_text_generic -q
2026-05-09 21:54:43 +02:00
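The compatibility layer this commit describes, where older request dialects are mapped onto `tool_args.action`, might be sketched as below. The exact legacy field names (`tool`, `args`, `method`) are assumptions inferred from the wording of the commit message, and `normalize_tool_request` is a hypothetical name.

```python
from typing import Any


def normalize_tool_request(request: dict[str, Any]) -> dict[str, Any]:
    """Map older dialects onto {'tool_name': ..., 'tool_args': {'action': ...}}."""
    # Older tool/args dialect: accept the short key names as fallbacks.
    name = request.get("tool_name") or request.get("tool") or ""
    args = dict(request.get("tool_args") or request.get("args") or {})

    # "tool_name:action" dialect: split the action off the name.
    if ":" in name:
        name, _, suffix = name.partition(":")
        args.setdefault("action", suffix)

    # Method-shaped dialect: treat "method" as an alias for "action".
    if "method" in args:
        method = args.pop("method")
        args.setdefault("action", method)

    return {"tool_name": name, "tool_args": args}
```

Using `setdefault` means an explicit `action` always wins over an inferred one, so canonical requests pass through unchanged.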
Alessandro
0a8aaee9ac Add host browser profile mode setting
Default Bring Your Own Browser mode to the existing browser profile while exposing a clean Agent profile option in Browser settings with a clear warning for existing-profile access.

Forward the selected profile mode through the connector browser runtime, tolerate legacy config modules and old saved configs, and update regression coverage for the new payload shape.
2026-05-09 16:25:27 +02:00
Alessandro
a3d41e2ca1 Split A0 remote workflow skills by affordance
Replace the combined A0 CLI remote workflow skill with separate text-editor and code-execution remote skills, update tool stubs to load the matching per-tool guide, and keep computer-use remote scoped to desktop control. Add prompt-gating coverage for the per-affordance skill split.
2026-05-08 18:53:37 +02:00
Alessandro
229de5166b Expose Browser runtime selection to CLI
Add a protected connector endpoint for reading and updating the Browser plugin runtime backend so the A0 CLI can switch between Docker browser and Bring Your Own Browser mode. Keep legacy host_when_available values normalized to host_required, move the host/container setting to the top of Browser settings, and cover the config normalization path.
2026-05-08 18:37:46 +02:00
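Normalizing the legacy `host_when_available` value to `host_required` can be as small as an alias table applied when the config is loaded. In this sketch the function name is hypothetical and no other runtime values are assumed; anything that is not a known legacy alias passes through untouched.

```python
# Only this one alias is named in the commit message.
LEGACY_RUNTIME_ALIASES = {"host_when_available": "host_required"}


def normalize_runtime_backend(value: str) -> str:
    """Translate legacy runtime values; leave every other value as-is."""
    key = value.strip()
    return LEGACY_RUNTIME_ALIASES.get(key, key)
```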
Alessandro
c020f1af28 Send browser helper source to host connector
Make the Browser plugin the source of truth for browser-page-content.js by attaching its source and sha256 to host-browser operations when the CLI has no matching helper hash. Store the helper hash in connector metadata and cover the routing/ensure path in tests.
2026-05-08 15:23:44 +02:00
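The hash-gated transfer described here, where the helper source is attached only when the CLI lacks a matching hash, can be illustrated as follows. The payload keys and the function name are hypothetical; only the use of sha256 comes from the commit message.

```python
import hashlib
from typing import Optional


def build_helper_payload(helper_source: str, cli_helper_sha256: Optional[str]) -> dict[str, str]:
    """Always send the hash; send the source only when the CLI copy differs."""
    digest = hashlib.sha256(helper_source.encode("utf-8")).hexdigest()
    payload = {"helper_sha256": digest}
    if cli_helper_sha256 != digest:
        # The CLI has no helper yet, or an outdated one.
        payload["helper_source"] = helper_source
    return payload
```

After the first operation the CLI can report the hash back, so later operations skip the source and stay small.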
Alessandro
d47207dfd7 Refine host browser routing and settings copy
Store and surface host-browser preparation and CDP endpoint metadata from A0 CLI.

Let Browser runtime prepare candidate CLIs before the first action, and keep host-required errors more actionable.

Simplify Host Browser settings language and document the Chrome remote-debugging consent flow.
2026-05-08 06:37:32 +02:00
Alessandro
4b3e2eb327 Route Browser through A0 host connector
Integrate host-browser routing into the existing Browser tool. Store connector host-browser metadata, add pending browser op resolution, select connector runtimes from Browser settings, enforce host-content privacy policy, support automatic host preparation, and document the A0 CLI host-browser flow.
2026-05-08 04:22:18 +02:00
Alessandro
bb86e5a7a9 Associate connector hello metadata with chat context
Store remote tool metadata through a dedicated hello path, bind a declared context id to the websocket SID, and return acknowledged remote tool state so clients can verify gated tools such as code_execution_remote are visible to the active chat.
2026-05-07 03:47:03 +02:00
Alessandro
3368fccb3c Rename document skills and harden skill ownership
Rename Office skills to product-neutral Writer, Calc, Impress, and document-artifact names while removing the visible legacy directories. Tighten connector list/delete behavior to the enabled catalog and prevent deletion of built-in plugin skills; also surface invalid skill YAML instead of silently accepting it.
2026-05-07 00:15:34 +02:00
Alessandro
7c71185f16 feat(a0-connector): lazy-load remote tool guidance
Move A0 CLI remote execution and file-editing guidance into skills, and gate compact remote tool stubs on subscribed CLI capabilities instead of always advertising unavailable tools. Retire verbose per-turn remote guidance extras while preserving connector protocol and tool schemas.
2026-04-28 16:14:53 +02:00
Alessandro
ff828e294e Add plugin thumbnails
2026-04-28 15:04:19 +02:00
Alessandro
56a42b97d7 Make agent profiles context scoped
Persist the active agent profile with each chat context and add a context-scoped endpoint for switching profiles without mutating global settings. Update the WebUI selector and docs to treat settings as the default for new chats, and expose the switch through the A0 connector plugin.
2026-04-26 22:27:35 +02:00
Alessandro
603fc2064b improve computer-use screenshot refresh guidance
Add post-action settle/fresh-capture handling for computer_use_remote, include capture ids and coordinate-space summaries in screenshot attachments, and tighten prompt guidance so agents use the latest capture without assuming semantic/window targeting.
2026-04-24 14:27:11 +02:00
Alessandro
15c4303f69 Guide computer-use agents away from pointer clicks
Update computer_use_remote prompts to prioritize accessibility, semantic UI paths,
hotkeys, focus traversal, typing, and keyboard scrolling before pointer actions.

Clarify that scroll is the preferred non-click fallback for viewport movement when
keyboard scrolling cannot target the active pane, while move/click remain explicit
last-resort actions. Add a regression test covering remote scroll delta forwarding
and automatic screenshot refresh behavior.
2026-04-22 14:25:18 +02:00
Alessandro
1993f6f864 Store vision and computer-use images as path refs
Keep image payloads out of persistent agent history by storing vision and
computer-use captures as file path references instead of inline base64 data.

- update vision_load to attach image paths without compression or JPEG conversion
- update computer_use_remote to attach shared capture artifact paths directly
- serialize local image refs into provider-valid data URLs only at request prep
- reject base64/data URL attachments on the connector WebSocket path
- advertise path_or_url as the connector attachment mode
2026-04-21 18:18:59 +02:00
Alessandro
fe2310aa90 Add project-scoped LLM presets
Add LLM preset selection to project create/edit flows, backed by _model_config scoped project config. Support global, project, and combined preset APIs with explicit metadata while preserving plain YAML preset files. Copy selected preset chat/utility settings into project-scoped config, keep embedding settings from the effective config, and document/test the new project model config paths.
2026-04-21 18:18:59 +02:00
Alessandro
4c2bc3d783 Add context-based patch_text support to text_editor
Introduces patch_text editing for the Docker-local text_editor, sharing request validation and freshness-state logic with text_editor_remote while preserving legacy line-number edits. Adds anchored context patching, safer state handling after context edits, updated model guidance, live remote wrapper reuse, and focused regression coverage for chained patches and Python replacement cases.
2026-04-21 18:18:59 +02:00
Alessandro
20107ff921 Compress computer-use captures before embedding in history
Reduce the size of computer-use capture attachments stored by the
_a0_connector plugin so Windows screenshots remain usable.

- optimize capture images before embedding them in history
- convert large captures to JPEG data URLs instead of keeping full PNG payloads
- keep the existing capture-path fallback when inline payloads are missing
- preserve the current user-facing computer_use_remote flow while shrinking the
  history payload
2026-04-20 03:56:49 +02:00
Alessandro
d28c21e1a0 connector: block shell write actions in read-only mode
When code execution remote was enabled but CLI was in read-only mode, the shell could still write files to disk.
2026-04-20 03:07:42 +02:00
Alessandro
a5d733c85f connector: gate remote tool guidance on active permissions
Move the heavy remote-tool operating guidance out of the always-on tool prompts
and inject it only when the current context can actually use those tools.

- add extras prompts for computer_use_remote, code_execution_remote, and text_editor_remote
- trim the base tool prompts down to the stable contract and minimal notes
- inject detailed guidance from message-loop extensions instead of always paying the token cost
- store remote_files and remote_exec hello metadata alongside computer_use metadata
- make code_execution_remote follow the real F4 exec-enabled state
- make text_editor_remote follow the real F3 read-only vs read-write state
- surface read-only mode in the injected text-editor guidance and suppress write guidance there
- keep legacy fallback behavior for older CLIs that do not yet advertise the new hello metadata
2026-04-19 22:06:13 +02:00
Alessandro
bdf9cad447 add backend-aware computer-use and inline capture support
- extend `_a0_connector` computer-use metadata handling to retain
  `backend_id`, `backend_family`, `features`, and `support_reason` from the
  CLI hello payload
- update `computer_use_remote` to prefer inline `png_base64` screenshots for
  capture and auto-refresh flows, while keeping filesystem-path fallback for
  migration/debug cases
- include backend information in status formatting so remote computer-use
  sessions are easier to inspect across Wayland and Windows backends
- align the builtin Agent Zero plugin with the new multi-backend computer-use
  transport used by `a0` 1.5
- replace the heavy computer-use instructions with a SKILL.md
2026-04-19 18:50:14 +02:00
Alessandro
f86d1c555c add connector stale-read protection to remote patching
Add _text_editor-style freshness checks to the _a0_connector remote text editor flow.

- add local freshness helpers for remote file metadata and patch-state tracking
- require a prior read or write before allowing remote patch operations
- run remote patches through stat -> stale check -> patch using private websocket plumbing
- store freshness state in agent.data keyed by CLI-reported realpath
- reuse fw.text_editor patch_need_read and patch_stale_read prompt behavior
- refresh stored state after line-preserving patches and mark it stale after insert/delete or line-count changes
- return a clear compatibility error when the connected CLI does not support internal stat

This keeps the existing edits schema and human-facing success messages unchanged, and does not change remote tree publishing behavior.

Bump plugin version to match CLI Connector.
2026-04-16 15:21:01 +02:00
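The stat, stale check, patch sequence listed in this commit can be sketched with a small tracker keyed by realpath. The class, the `FileStat` fields, and the returned status strings are hypothetical stand-ins; the commit only names the `patch_need_read` and `patch_stale_read` prompt behaviors, not this API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileStat:
    mtime_ns: int
    size: int


class FreshnessTracker:
    """Remembers the stat seen at the last read or write, keyed by realpath."""

    def __init__(self) -> None:
        self._seen: dict[str, Optional[FileStat]] = {}

    def record(self, realpath: str, stat: FileStat) -> None:
        """Call after a read, a write, or a line-preserving patch."""
        self._seen[realpath] = stat

    def mark_stale(self, realpath: str) -> None:
        """Call after an insert/delete or any line-count change."""
        self._seen[realpath] = None

    def check_patch(self, realpath: str, current: FileStat) -> str:
        """Return 'ok', 'need_read', or 'stale_read' for a pending patch."""
        if realpath not in self._seen:
            return "need_read"
        if self._seen[realpath] != current:
            return "stale_read"
        return "ok"
```

A caller would fetch the current stat from the CLI, call `check_patch`, and send the patch only on `ok`.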
Alessandro
9db0edd89a Send connector exec config in ws hello
## Summary
- include `exec_config` in `_a0_connector` `connector_hello`
- source execution timeouts and prompt/dialog patterns from `_code_execution` config
- make the connector advertise execution policy explicitly to the CLI

## Why
The CLI should not depend on a local Agent Zero Core checkout just to run `code_execution_remote`. On Windows this broke remote execution even when the connector was active, because the CLI could not see the container's internal Core tree. The backend already owns the execution policy, so it should send that contract directly.

## What changed
- add `_a0_connector.helpers.exec_config.build_exec_config()`
- read `_code_execution` settings/defaults through plugin config resolution
- return `exec_config` from `_a0_connector.api.ws_connector` during `connector_hello`

## Impact
- removes an implicit host-side Core dependency from the connector flow
- lets the CLI keep only platform-specific shell / TTY behavior locally
- aligns Linux and Windows behavior behind the same handshake contract
2026-04-16 15:21:01 +02:00
Alessandro
1d8bc2b2c5 fix compaction in a0_connector plugin
2026-04-16 15:21:01 +02:00
Alessandro
8c5cf1f69f add built-in A0 CLI Connector plugin
Introduce the builtin `_a0_connector` plugin that lets the host-side
A0 CLI connect to Agent Zero over authenticated HTTP and `/ws`.

This adds connector capability discovery, chat/context lifecycle
endpoints, log streaming, and the remote text editing, code execution,
and file tree bridge used by the CLI workflow.
2026-04-11 18:56:32 +02:00