Commit graph

119 commits

Author SHA1 Message Date
Alessandro
1f34b87c00 Require visual verification for computer-use captures
Sanitize embedded image data URLs from prompt token estimates so screenshot attachments do not explode context accounting.

Strengthen computer_use_remote prompt, skill, and capture-result text so state-changing desktop actions are treated as attempts until a fresh screen visibly confirms the requested outcome.
2026-05-23 10:32:37 +02:00
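As a rough illustration of the token-estimate sanitization described in 1f34b87c00, the sketch below replaces embedded image data URLs with a short placeholder before counting. The regex, the placeholder, the function names, and the four-characters-per-token heuristic are all assumptions made for this example, not the project's code.

```python
import re

# Hypothetical pattern for an inline base64 image data URL.
DATA_URL_RE = re.compile(r"data:image/[a-zA-Z0-9.+-]+;base64,[A-Za-z0-9+/=]+")
PLACEHOLDER = "[image]"  # assumed stand-in text


def strip_image_data_urls(text: str) -> str:
    """Replace every embedded image data URL with a short placeholder."""
    return DATA_URL_RE.sub(PLACEHOLDER, text)


def estimate_tokens(text: str) -> int:
    """Crude estimate (about 4 characters per token) on the sanitized text."""
    return len(strip_image_data_urls(text)) // 4
```

Under this heuristic a screenshot of several thousand base64 characters contributes only the length of the placeholder to the estimate.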
Alessandro
5e2c2a86ef Add skill visibility controls
Let users hide skills from the model-facing available catalog through the chat Skills selector while keeping pinned skill injection as a separate mode.

Hidden skills are filtered from skill listing, search, loading, relevant recall, and loaded-skill prompt injection, with chat-level show/hide overrides and persistent default hidden-skill config support.
2026-05-22 17:44:22 +02:00
Alessandro
430c48d1a5 Make browser screenshots ephemeral and context scoped
Route no-path browser screenshots through an in-process ephemeral image registry that vision_load consumes into the existing data-url model boundary. Stop materializing host-browser artifacts into tmp/browser/host-screenshots, keep explicit path screenshots durable, and make browser log metadata point at the active chat/task context while preserving browser-context detail.
2026-05-22 09:50:47 +02:00
Alessandro
d1827e6c66 Refactor: use user locale for time displays
Add user-configurable timezone and 12/24-hour preferences, then wire them through settings, runtime snapshots, scheduler payloads, wait handling, notifications, backups, memory, plugin metadata, and frontend formatters.

Keep UTC as the boundary for absolute instants while serializing user-facing dates in the configured or browser-resolved timezone. Preserve scheduler wall-clock inputs in the selected timezone, propagate TZ into desktop/runtime process environments, and restart active desktop sessions when the runtime timezone changes.

Cover the risky paths with timezone regression tests for settings normalization, auto and fixed timezone resolution, scheduler round-trips, memory timestamp conversion, and desktop timezone sync.
2026-05-21 15:26:00 +02:00
Alessandro
675afa8dee Refactor speech stack into built-in Kokoro TTS and Whisper STT plugins
Split the legacy core speech stack into two built-in, independently toggleable plugins: `_kokoro_tts` for TTS and `_whisper_stt` for STT.

This refactor keeps dependency installation and bootstrap concerns in Docker/bootstrap/preload, while moving speech-specific tooling, APIs, prompts, UI, and runtime behavior into the plugins. Core now exposes engine-agnostic `tts-service` and `stt-service` brokers, with browser-native TTS preserved as the fallback when Kokoro is disabled.

Included in this change:
- add built-in `_kokoro_tts` plugin with plugin-owned synth API, config, status UI, and provider registration
- add built-in `_whisper_stt` plugin with plugin-owned transcribe API, mic runtime, device UI, prompt injection, and provider registration
- remove legacy core speech APIs/helpers/settings/UI and delete unused `webui/js/speech_browser.js`
- replace the old hardcoded speech settings section with a generic voice surface backed by plugin extensions
- update preload/docs/tests to match the new plugin-owned speech architecture

Behavioral intent:
- both plugins are built-in but not `always_enabled`
- users can now hot-switch TTS and STT independently
- browser TTS remains available when `_kokoro_tts` is off
- Whisper mic UI only appears when `_whisper_stt` is enabled
2026-05-21 05:41:59 +02:00
Alessandro
d4a9cd82d5 Simplify plugin activation toggle UI
Replace the plugin list activation dropdown and advanced shortcut with a one-click ON/OFF switch. Keep project/profile-specific activation inside the plugin config flow, remove the old advanced-only modal, update plugin docs, and add regression coverage for the binary list toggle contract.
2026-05-21 04:31:19 +02:00
Alessandro
68c3b8b022 Move office and desktop state under plugin storage
Migrate retired /usr/_office and /usr/_desktop trees from plugin startup into /usr/plugins/<plugin>. Update office document storage, desktop session/runtime paths, and context-scoped screenshots to use the plugin-owned state layout. Add focused tests for retired-state migration and the new path behavior.
2026-05-12 16:21:43 +02:00
Alessandro
1f2d512226 fix(api): resolve image_get containment bypass (#1609)
Fixes agent0ai/agent-zero#1609.

Issue: "Unauthenticated Path-Containment Bypass in Agent Zero `/api/image_get`"
https://github.com/agent0ai/agent-zero/issues/1609

Resolve the path-containment bypass in /api/image_get by resolving requested images against the Agent Zero base directory before serving them, including symlink-aware validation and the development RFC fallback path.

Harden SVG and SVGZ responses with nosniff and a sandboxed CSP so uploaded SVGs cannot execute scripts in the Agent Zero origin. Add focused regressions for outside paths, symlink escapes, SVG headers, and development-mode remote validation.
2026-05-12 04:15:10 +02:00
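The containment check described in 1f2d512226 can be approximated as follows. This is a minimal sketch, assuming Python 3.9+ and a hypothetical helper name `resolve_inside`; the actual handler and its development fallback path are not shown in the commit message.

```python
from pathlib import Path


def resolve_inside(base_dir: str, requested: str) -> Path:
    """Resolve `requested` against `base_dir` and refuse anything outside it.

    Path.resolve() follows symlinks, so a link inside the base directory that
    points elsewhere is rejected as well.
    """
    base = Path(base_dir).resolve()
    # Joining with an absolute `requested` discards `base`, so the check below
    # also rejects absolute paths outside the base directory.
    candidate = (base / requested).resolve()
    if not candidate.is_relative_to(base):
        raise PermissionError(f"path escapes base directory: {requested}")
    return candidate
```

Resolving both sides before comparing is the key design choice: a plain string-prefix check on the unresolved path would miss `..` segments and symlink escapes.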
Alessandro
f17198e126 fix: tighten tool guidance and editor workflows
2026-05-11 11:51:58 +02:00
Alessandro
daf95ec3ab Normalize tool contracts and slim prompt surface
Standardize multi-action tools around tool_args.action while keeping parser compatibility for older tool/args, tool_name:action, and method-shaped requests. This keeps new prompts clean without breaking agents that learned the previous dialect.

Move A0 connector remote execution/file tools into stable standard prompts, make remote targeting independent of the active chat context, and skill-gate beta computer-use remote so it no longer weighs down the always-on tool list.

Align text editor, scheduler, skills, office artifact, memory, notify, and browser prompts/tools around the canonical action contract. Add scheduler update/timezone handling, skills_tool read_file, text editor patch coverage, and fixes for memory_forget, behaviour_adjustment, and code execution progress warnings.

Reduce default prompt pressure by compacting browser and scheduler prompts into skill-backed manifests, shortening skill catalog descriptions, and pruning noisy framework knowledge. Remove obsolete connector prompt stubs and root tool-call knowledge examples.

Tests: conda run -n a0 pytest tests/test_a0_connector_prompt_gating.py tests/test_tool_action_contracts.py tests/test_task_scheduler_timezone.py tests/test_text_editor_context_patch.py tests/test_tool_request_normalization.py tests/test_office_document_store.py::test_odf_is_advertised_and_docx_remains_explicit_compatibility tests/test_office_document_store.py::test_document_artifact_accepts_method_alias_for_ods_create tests/test_skills_runtime.py tests/test_default_prompt_budget.py::test_a0_small_profile_removed_and_prompt_text_generic -q
2026-05-09 21:54:43 +02:00
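To make the compatibility idea in daf95ec3ab concrete, here is a hedged sketch of request normalization toward `tool_args.action`. The exact shapes of the older dialects are assumptions inferred from the commit message (`tool`/`args` keys, a `tool_name:action` suffix, and a `method` argument), and `normalize_tool_request` is a hypothetical name.

```python
def normalize_tool_request(req: dict) -> dict:
    """Map assumed legacy request shapes onto tool_name + tool_args.action."""
    # Older dialect (assumed): {"tool": ..., "args": {...}}
    name = req.get("tool_name", req.get("tool", ""))
    args = dict(req.get("tool_args", req.get("args", {})) or {})

    # Older dialect (assumed): "tool_name:action" packed into the name.
    if ":" in name:
        name, _, action = name.partition(":")
        args.setdefault("action", action)

    # Method-shaped requests (assumed): {"method": "create"} instead of action.
    if "method" in args:
        args.setdefault("action", args.pop("method"))

    return {"tool_name": name, "tool_args": args}
```

An explicit `action` already present in the arguments wins over any value derived from the legacy fields.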
Alessandro
30a97cb3f1 Ignore late deferred future completion
2026-05-09 17:36:19 +02:00
Alessandro
ef011600ef Serialize runtime package preparation
Run Office and Desktop apt operations through a shared in-process retry guard so startup/self-update hooks wait out transient apt locks instead of failing early. Include LibreOffice runtime packages in Desktop preparation because Desktop status and Writer/Calc/Impress launch paths require soffice, and cover both behaviors with regression tests.
2026-05-07 03:14:12 +02:00
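The shared retry guard described in ef011600ef might look roughly like the sketch below. The lock, the exception type, the function name, and the retry parameters are all hypothetical; the commit message only states that apt operations are serialized in-process and wait out transient locks.

```python
import threading
import time

# One process-wide lock so Office and Desktop preparation never overlap.
_package_lock = threading.Lock()


class TransientLockError(RuntimeError):
    """Assumed signal that the package manager lock is currently held."""


def run_serialized(operation, attempts: int = 5, delay: float = 1.0):
    """Run `operation` under the shared lock, retrying on transient failures."""
    with _package_lock:
        for attempt in range(1, attempts + 1):
            try:
                return operation()
            except TransientLockError:
                if attempt == attempts:
                    raise  # give up after the final attempt
                time.sleep(delay)
```

Holding the lock across retries keeps a second caller from starting its own package operation while the first is still waiting.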
Alessandro
3368fccb3c Rename document skills and harden skill ownership
Rename Office skills to product-neutral Writer, Calc, Impress, and document-artifact names while removing the visible legacy directories. Tighten connector list/delete behavior to the enabled catalog and prevent deletion of built-in plugin skills; also surface invalid skill YAML instead of silently accepting it.
2026-05-07 00:15:34 +02:00
Alessandro
76a282ba8f Move Linux Desktop runtime into _desktop
Add the built-in _desktop plugin as the owner of Xpra/Xfce lifecycle, /desktop route installation, Desktop state, session APIs, surface registration, and the linux-desktop skill. Leave _office with explicit compatibility facades and self-update delegates so users coming from 1.10 through 1.13 keep their runtime cleanup path.
2026-05-07 00:14:54 +02:00
Alessandro
2d389af727 Defer Office desktop startup
Make the Office canvas mount passive so Xpra starts only when the Desktop surface is opened or an official Office document is created/opened.

Track Desktop host visibility to unload hidden frames, stop monitors, dedupe viewport resize work, and set Xpra offscreen mode according to HTTPS support. Add a near-future note for the tunnel memory footprint.

Show Office desktop startup progress

Display a loading message while the Agent Zero Desktop environment is starting or restarting, so the right-canvas Desktop button gives immediate feedback before Xpra finishes waking up.
2026-05-03 03:26:04 +02:00
Alessandro
6d9dedb821 Fix Office desktop runtime on arm64
2026-05-03 00:27:23 +02:00
Alessandro
e5caed435b Merge pull request #1560 from octo-patch/fix/issue-1548-per-project-plugin-config-ignored
fix: always check project-level plugin config as fallback (fixes #1548)
2026-05-02 20:39:43 +02:00
Alessandro
d6d97d037c Fix skills selector unloading
Remove dynamically loaded skills when they are deactivated from the Skills selector. Treat skill names and paths as aliases so scoped defaults, chat overrides, and loaded-skill state resolve consistently.
2026-05-02 20:14:49 +02:00
Alessandro
739c0a18a3 Improve Office desktop integration
Route binary Office documents through the persistent Desktop surface while keeping Markdown in the custom tabbed editor.

Harden Xpra clipboard bridging and explicit clipboard flags so host paste can reach the desktop session.

Align XFCE and LibreOffice profile paths with Agent Zero locations: downloads for wallpapers, configured workdir for default saves and the Workdir shortcut, and trusted metadata for generated launchers.
2026-05-02 18:39:32 +02:00
Alessandro
10a6cd28c6 feat(office): replace Collabora with LibreOffice document runtime
Remove the Collabora/WOPI runtime and route stack, including the old status APIs, proxy helpers, bootstrap extensions, and WOPI store tests. Add the Markdown-first document store, LibreOffice status/conversion helpers, LibreOfficeKit session bridge, and reusable Xpra virtual desktop gateway used by the new document runtime.

Update image and self-update bootstrap paths so existing containers can acquire the LibreOffice, XFCE, Xpra, and desktop-control dependencies through the normal install hooks instead of an ad hoc manual install.
2026-05-02 13:07:10 +02:00
Alessandro
5f063a6feb Merge pull request #1550 from keyboardstaff/ready
fix: tool request validation crash on non-canonical field names
2026-04-27 19:26:06 +02:00
Alessandro
b782d40c3a fix: normalize tool request fallback fields
2026-04-27 19:24:56 +02:00
Alessandro
56a42b97d7 Make agent profiles context scoped
Persist the active agent profile with each chat context and add a context-scoped endpoint for switching profiles without mutating global settings. Update the WebUI selector and docs to treat settings as the default for new chats, and expose the switch through the A0 connector plugin.
2026-04-26 22:27:35 +02:00
octo-patch
2d0b164081 fix: always check project-level plugin config as fallback (fixes #1548)
When agent_profile is set to a concrete value (e.g. "agent0"), the
project/.a0proj/plugins/<plugin>/config.json path was skipped due to
an overly restrictive guard condition. Only wildcard/empty profiles
could reach the project-level config, making per-project plugin
overrides (including _model_config) silently ineffective for all
normal agent runs.

Remove the guard so the project-level config is always checked as a
fallback after the profile-scoped project path. Precedence is preserved:
profile-scoped project config > project-level config > global config.

Co-Authored-By: Octopus <liyuan851277048@icloud.com>
2026-04-25 09:14:13 +08:00
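The precedence stated in 2d0b164081 (profile-scoped project config, then project-level config, then global config) amounts to a first-match lookup. The sketch below assumes each level has already been loaded into a dict or is `None` when absent; `pick_plugin_config` is a hypothetical name.

```python
from typing import Optional


def pick_plugin_config(
    profile_scoped: Optional[dict],
    project_level: Optional[dict],
    global_config: Optional[dict],
) -> Optional[dict]:
    """Return the first config that exists, in precedence order."""
    for config in (profile_scoped, project_level, global_config):
        if config is not None:
            return config
    return None
```

With the guard removed, a concrete profile such as "agent0" that has no profile-scoped file falls through to the project-level entry instead of skipping straight to the global one.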
Alessandro
76130ae5ac accept Socket.IO disconnect reason
Update the WebSocket disconnect handler signature to accept the disconnect
reason now passed by python-socketio.

Agent Zero does not currently use the reason value, but keeping the parameter
matches the documented Socket.IO callback shape and avoids relying on the
library's legacy one-argument handler fallback.

python-socketio>=5.14.2 now documents server disconnect handlers as receiving sid, reason:
https://python-socketio.readthedocs.io/en/stable/server.html#connect-and-disconnect-events.

The 5.14.2 source also passes that reason into the disconnect event. It still has a legacy fallback that retries old one-arg handlers, so omitting the reason parameter would probably still work today, but only by leaning on compatibility behavior.
2026-04-24 14:14:08 +02:00
Alessandro
539d809789 feat: add agent profile switcher to chat composer
Surface the active Agent Profile beside the model preset switcher and let users switch profiles through the existing settings flow.

- add agent profile metadata to state snapshots
- list available profiles in the chat composer profile dropdown
- persist profile changes via settings_get/settings_set
- add a Create new Agent Profile action that opens a guided a0-create-agent chat
- rename the agent-profile creation skill/docs from a0-new-agent to a0-create-agent
- clean up fetchApi imports for related WebUI modules
2026-04-22 14:25:18 +02:00
Alessandro
1993f6f864 Store vision and computer-use images as path refs
Keep image payloads out of persistent agent history by storing vision and
computer-use captures as file path references instead of inline base64 data.

- update vision_load to attach image paths without compression or JPEG conversion
- update computer_use_remote to attach shared capture artifact paths directly
- serialize local image refs into provider-valid data URLs only at request prep
- reject base64/data URL attachments on the connector WebSocket path
- advertise path_or_url as the connector attachment mode
2026-04-21 18:18:59 +02:00
Alessandro
fe2310aa90 Add project-scoped LLM presets
Add LLM preset selection to project create/edit flows, backed by _model_config scoped project config. Support global, project, and combined preset APIs with explicit metadata while preserving plain YAML preset files. Copy selected preset chat/utility settings into project-scoped config, keep embedding settings from the effective config, and document/test the new project model config paths.
2026-04-21 18:18:59 +02:00
Alessandro
79f948b076 Improve active skills management and simplify Skills UI
Unify skill handling layer and raise the active skills cap to 20.

The Skills UI now presents a simpler checklist-style flow for selecting active
skills, with live chat activation and saved defaults using the same visible list.
Skill contents can be opened in a read-only Ace viewer via the existing markdown
modal.
2026-04-21 05:47:22 +02:00
Alessandro
91f43e28b4 fix: preserve safe remote fetch compatibility for public sites
Restore remote document fetch compatibility for public sites after the
CVE-2026-4308 SSRF hardening.

The initial security fix correctly blocked non-public destinations, but
it also changed the outbound request fingerprint for `document_query`
remote fetches. Some public sites, including https://nvd.nist.gov/vuln/detail/CVE-2026-4308, used for testing, responded with HTTP
403 to the default `requests` user agent even though they remained safe
and publicly routable.

This change keeps the centralized SSRF protections in place while
restoring the previous request compatibility behavior by sending the
configured `USER_AGENT` header, falling back to the prior
`@mixedbread-ai/unstructured` value.

What is fixed:
- public URLs such as
  `https://nvd.nist.gov/vuln/detail/CVE-2026-4308`
  no longer fail with site-specific HTTP 403 due to request fingerprint
  changes introduced by the SSRF mitigation
2026-04-12 02:08:13 +02:00
Alessandro
6397acc092 Fix SSRF in document_query remote fetching (CVE-2026-4308)
Address CVE-2026-4308 in the document_query tool remote-fetch path.

The issue was originally reported by @YLChen-007.

This change replaces ad hoc remote document fetching with a centralized
safe fetch flow that validates remote URLs before any network request is
used for parsing. It blocks localhost and non-public IPv4/IPv6 targets,
validates every redirect hop, disables implicit trust of proxy env
settings for this path, and enforces a strict remote document size cap.

It also removes direct third-party loader access to attacker-controlled
URLs by prefetching remote content first and then parsing only trusted
local bytes or temp files for HTML, text, PDF, image, and unstructured
document handling.

Refs:
- CVE-2026-4308
- Report by @YLChen-007
2026-04-12 02:00:01 +02:00
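The destination validation described in 6397acc092 can be sketched with the standard library. This is an illustration under stated assumptions, not the project's fetch flow: the function names are hypothetical, the resolver is injectable so the check can be exercised offline, and redirect handling, size caps, and proxy settings are left out.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_address(addr: str) -> bool:
    """True only for globally routable IPv4/IPv6 addresses."""
    return ipaddress.ip_address(addr).is_global


def resolve_host(host: str) -> list[str]:
    """Resolve a hostname to all of its addresses via the system resolver."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def check_remote_url(url: str, resolver=resolve_host) -> None:
    """Raise ValueError unless the URL is http(s) and resolves only to public IPs."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"unsupported URL: {url}")
    addresses = resolver(parts.hostname)
    if not addresses or not all(is_public_address(a) for a in addresses):
        raise ValueError(f"non-public destination: {parts.hostname}")
```

To match the commit's description, a caller would run the same check on every redirect hop before following it.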
Alessandro
48bbe778fe Merge pull request #1496 from 3clyp50/cli
add a0-setup-cli Skill and restore lexical trigger matching
2026-04-11 18:50:57 +02:00
Alessandro
395ef8dd33 integrations: add native chat controls and email config presets
Add shared transport-level control commands so Telegram, WhatsApp, and
email threads can manage the active chat directly.

- add a shared integration command helper for /project, /config, /send,
  and /queue send
- wire native command handling into Telegram and WhatsApp sessions
- expose Telegram control commands through bot command routing and update
  transport docs
- add email thread command handling for existing A0 email conversations
- add an optional per-handler email conversation preset backed by model
  presets in the email settings UI and default config
- document the new transport control flow across Telegram, WhatsApp, and
  email
2026-04-11 18:49:13 +02:00
Alessandro
954eca3563 add a0-setup-cli Skill and restore lexical trigger matching
Add a builtin `a0-setup-cli` skill for guiding host-side A0 connector setup,
and restore the lightweight trigger-word based skill matching flow, which many users asked for.

- add builtin `skills/a0-setup-cli/` with installer-first host setup guidance,
  container guardrails, fallback install paths, and example responses
- fix `helpers.skills_cli` so builtin skills under `/skills` are discoverable,
  searchable, and validatable
- restore trigger-pattern scoring in runtime `search_skills()`
- re-enable `skills_tool:search` in the current tool flow
- add lightweight lexical relevant-skill recall for the current user message
  without reintroducing memory/vector-db skill recall
- update skill prompts to steer the agent toward search/load when requests
  match skill trigger phrases
2026-04-11 18:03:05 +02:00
Alessandro
5a2223596a stop tool dispatch at first completed json object
Tool execution no longer waits for the full streamed assistant text. We now detect the first explicitly closed top-level JSON object, freeze that snapshot as the canonical tool request, and stop the model stream there for dispatch.

To make that safe, DirtyJson completion semantics are tightened so completed=true only means the root object was explicitly closed, not that parsing hit end of file. I also restricted the new extraction path to object roots only, since tool calls are always brace-delimited objects, and added tests for parser completion and early stream stop.
2026-04-03 16:56:21 +02:00
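The early-stop idea in 5a2223596a depends on knowing when the first top-level object has been explicitly closed. The sketch below is a standalone brace scanner written for illustration; it is not the DirtyJson parser, and `first_closed_object` is a hypothetical name.

```python
from typing import Optional


def first_closed_object(text: str) -> Optional[str]:
    """Return the first explicitly closed top-level {...} span, else None."""
    depth = 0
    start = None
    in_string = False
    escaped = False
    for i, ch in enumerate(text):
        if in_string:
            # Braces inside string values must not affect nesting depth.
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"' and depth > 0:
            in_string = True
        elif ch == "{":
            if depth == 0:
                start = i
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                return text[start:i + 1]
    return None  # end of input reached without an explicit close
```

Returning `None` at end of input mirrors the tightened semantics in the commit: running out of text is not the same as the root object being closed.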
Alessandro
ec80702b80 add completion detection to DirtyJson parser
Track parsing depth via _pop_stack() helper. Exposes a 'completed' flag that signals when the root JSON structure is fully closed, allowing stream consumers to break early instead of waiting for irrelevant tokens.
2026-04-03 15:45:58 +02:00
Alessandro
1cccb68d0d fix: guard against missing plugin directory in config loads
`find_plugin_dir` can return `None` if a plugin cannot be found. Passing
this null value to `files.get_abs_path` caused crashes during config
retrieval. `get_plugin_config` and `get_default_plugin_config` now check
for a valid directory and return early if it is missing.
2026-04-01 20:42:24 +02:00
frdel
b94d4b79ae refactor: comprehensive UI server restructuring and self-update enhancements
- Extract UI server setup into UiServerRuntime class with modular initialization
- Move environment configuration, route registration, and transport handlers to helpers/ui_server.py
- Add released_at timestamp tracking for git tags and branch heads across update system
- Implement get_current_major_main_latest_info to find latest same-major version on main branch
- Add major_upgrade_versions and main_branch_latest fields to update info payload
- Remove
2026-03-31 15:20:57 +02:00
frdel
fb02b5f4cc enable api caching
enable websocket and api caching params
2026-03-30 17:15:50 +02:00
Jan Tomášek
9390ba9624 Merge pull request #1344 from keyboardstaff/ws-rework
refactor: Comprehensive WebSocket System Rework
2026-03-30 16:45:49 +02:00
frdel
44e008745d Sanitize print logs; refactor popular plugin logic
Ensure printed output and HTML logs are safe by importing and applying sanitize_string, opening log files with utf-8 and errors='replace', and sanitizing text before writing. Add tests to verify lone surrogate characters are replaced and that logging won't crash on invalid Unicode. In the plugin installer UI, introduce POPULAR_PLUGIN_MIN_STARS and centralize popularity checking in _isPopularPlugin, using it for filtering and counts.
2026-03-30 11:50:59 +02:00
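The surrogate handling mentioned in 44e008745d can be illustrated with a small round-trip through UTF-8. The helper name below is hypothetical and is not the project's `sanitize_string`; it only shows one standard-library way to make text safe to write.

```python
def replace_lone_surrogates(text: str) -> str:
    """Replace characters that cannot be encoded as UTF-8 (lone surrogates)."""
    # errors="replace" substitutes "?" for each unencodable character.
    return text.encode("utf-8", errors="replace").decode("utf-8")


def append_log_line(path: str, line: str) -> None:
    """Append a sanitized line; the file itself is opened defensively too."""
    with open(path, "a", encoding="utf-8", errors="replace") as handle:
        handle.write(replace_lone_surrogates(line) + "\n")
```

Valid non-ASCII text passes through unchanged; only code points that have no UTF-8 encoding are substituted.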
keyboardstaff
4e222e2cf7 refactor: extract constants, deduplicate ack pattern
2026-03-28 02:51:57 -07:00
keyboardstaff
04d930ab02 refactor: extract shared utilities, fix send_data signature & plugins.py bug
- Consolidate ConnectionIdentity, _ws_debug_enabled(), ws_debug() into ws.py as single-source exports, removing duplicate definitions in ws_manager.py and state_monitor.py
- Make send_data() optional args keyword-only to prevent positional argument confusion with the instance method signature
- Fix clear_plugin_cache in plugins.py: wrong parameter name (event_name → event_type) and stale namespace (/webui → /ws)
2026-03-28 00:38:46 -07:00
keyboardstaff
4065385630 fix: dispatch_to_all_sids security issue
- move _ws_contexts snapshot inside _contexts_lock critical section, add ctx is None skip logic to prevent security check bypass during concurrent disconnects
2026-03-27 23:04:30 -07:00
keyboardstaff
6e8c9d8224 refactor: extract _collect_results and unify internal helpers
1. Extract _collect_results method — Deduplicated ~30 lines of identical result processing from route_event and process_client_event (Exception→error / WsResult→as_result / dict→wrap / None→strategy branch) into a private method with a skip_none parameter.
    * route_event calls _collect_results(skip_none=False) — None becomes ok=True (server-initiated, callers expect a result for every handler)
    * process_client_event calls _collect_results(skip_none=True) — None is skipped (client-initiated, matching legacy _dispatch fire-and-forget semantics)
2. Document None semantics difference — Added # NOTE: comment at the route_event call site explaining why skip_none=False differs from process_client_event.
3. Unify _timestamp() usage — Replaced inline timestamp formatting in _wrap_envelope and handle_connect with calls to self._timestamp().
2026-03-27 23:00:09 -07:00
keyboardstaff
b351de456e fix: resolve option whitelist, memory leak, task tracking, and dispatch unification
- Fix Memory Leaks: Resolved SID retention in _known_sids after disconnection and cleaned up unreferenced broadcast tasks in _schedule_lifecycle_broadcast.
- Unify Dispatching Paths: Unified client and server event dispatching through the process_client_event() method to ensure diagnostic consistency.
- Optimization & Cleanup: Expanded the _OPTION_KEYS whitelist, removed dead code (iter_event_types), and deleted unused websocket exports.
- Robustness: Added handling for None responses in process_client_event to prevent cluttering responses with empty results.
- Testing: Added test cases to verify SID TTL expiration and stale SID cleanup on disconnect.
2026-03-27 01:21:45 -07:00
frdel
2a47410e17 Add display_version field to repo version info for non-release commits
Add display_version to get_repo_version_info output that shows tag+commits (e.g. "v1.11+9") for development builds. Update self-update UI to prefer display_version over short_tag for current version display. Add describe field to modal when it differs from short_tag. Add test coverage for display_version generation on non-main branches.
2026-03-26 12:47:07 +01:00
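The "tag+commits" form from 2a47410e17 (for example "v1.11+9") can be derived from the long output of `git describe`, which has the shape `<tag>-<count>-g<hash>`. The sketch below is an assumption about how such a string could be built; the function name is hypothetical, and returning the bare tag when there are no extra commits is a choice made for this example.

```python
import re

# git describe emits "<tag>-<commits>-g<abbrev hash>" when HEAD is past a tag.
DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g[0-9a-f]+$")


def display_version_from_describe(describe: str) -> str:
    """Turn 'v1.11-9-g2a47410' into 'v1.11+9'; leave plain tags untouched."""
    match = DESCRIBE_RE.match(describe)
    if not match:
        return describe  # already an exact tag
    count = int(match.group("count"))
    tag = match.group("tag")
    return tag if count == 0 else f"{tag}+{count}"
```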
frdel
f69147aee9 Move subagents import to function scope to avoid circular import
Defer `from helpers import subagents` to function scope in get_webui_extensions and _get_extension_classes to prevent circular import issues. Remove module-level subagents import.
2026-03-26 12:39:08 +01:00
frdel
261c4d6138 Add CLI trigger script for self-update with major version validation and backup configuration
- Add trigger_self_update.sh to executable permissions in Dockerfile
- Add trigger-update command mode to self_update_manager.py with argparse CLI
- Add queue_update_request helper to write trigger file with normalized parameters
- Add parse_selector_version, is_valid_selector_tag, is_supported_selector_tag helpers
- Add get_latest_same_major_tag to resolve "latest" within current major version line
- Add ensure
2026-03-26 12:32:37 +01:00
frdel
e0dae52b7f Replace file sync with capability detection for durable self-update manager
Replace sync_self_update_runtime_files with durable_self_update_supports_latest that checks whether the durable updater supports the "latest" selector by inspecting manager source code for LATEST_SELECTOR_TAG and resolve_requested_target. Check durable manager first, fall back to repo manager if missing. Block "latest" selection in schedule_update and hide it from get_selector_tag_options when durable updater lacks support.
2026-03-26 12:06:28 +01:00