vrr/agent-zero

mirror of https://github.com/agent0ai/agent-zero.git synced 2026-05-22 19:47:15 +00:00

Alessandro 675afa8dee	2026-05-21 05:41:59 +02:00

Refactor speech stack into built-in Kokoro TTS and Whisper STT plugins

Split the legacy core speech stack into two built-in, independently toggleable plugins: `_kokoro_tts` for TTS and `_whisper_stt` for STT.

This refactor keeps dependency installation and bootstrap concerns in Docker/bootstrap/preload, while moving speech-specific tooling, APIs, prompts, UI, and runtime behavior into the plugins. Core now exposes engine-agnostic `tts-service` and `stt-service` brokers, with browser-native TTS preserved as the fallback when Kokoro is disabled.

Included in this change:
- add built-in `_kokoro_tts` plugin with plugin-owned synth API, config, status UI, and provider registration
- add built-in `_whisper_stt` plugin with plugin-owned transcribe API, mic runtime, device UI, prompt injection, and provider registration
- remove legacy core speech APIs/helpers/settings/UI and delete unused `webui/js/speech_browser.js`
- replace the old hardcoded speech settings section with a generic voice surface backed by plugin extensions
- update preload/docs/tests to match the new plugin-owned speech architecture

Behavioral intent:
- both plugins are built-in but not `always_enabled`
- users can now hot-switch TTS and STT independently
- browser TTS remains available when `_kokoro_tts` is off
- Whisper mic UI only appears when `_whisper_stt` is enabled
api	Refactor speech stack into built-in Kokoro TTS and Whisper STT plugins	2026-05-21 05:41:59 +02:00
extensions/webui	Refactor speech stack into built-in Kokoro TTS and Whisper STT plugins	2026-05-21 05:41:59 +02:00
helpers	Refactor speech stack into built-in Kokoro TTS and Whisper STT plugins	2026-05-21 05:41:59 +02:00
webui	Refactor speech stack into built-in Kokoro TTS and Whisper STT plugins	2026-05-21 05:41:59 +02:00
default_config.yaml	Refactor speech stack into built-in Kokoro TTS and Whisper STT plugins	2026-05-21 05:41:59 +02:00
hooks.py	Refactor speech stack into built-in Kokoro TTS and Whisper STT plugins	2026-05-21 05:41:59 +02:00
plugin.yaml	Refactor speech stack into built-in Kokoro TTS and Whisper STT plugins	2026-05-21 05:41:59 +02:00
README.md	Refactor speech stack into built-in Kokoro TTS and Whisper STT plugins	2026-05-21 05:41:59 +02:00

README.md

Kokoro TTS

Built-in speech synthesis plugin backed by Kokoro.

Behavior

Registers Kokoro as the active TTS provider when the plugin is enabled.
Keeps browser-native speechSynthesis as the fallback path when the plugin is disabled.
Keeps Python dependencies on the core Docker/bootstrap path. This plugin does not install packages or binaries on demand.
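
The register-or-fall-back behavior described above can be sketched as follows. This is a minimal illustration, not the plugin's actual code: `TtsBroker`, `apply_plugin_state`, and the provider names `"kokoro"` and `"browser"` are hypothetical stand-ins for the core `tts-service` broker and its providers.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical label for the browser-native speechSynthesis path.
BROWSER_FALLBACK = "browser"


@dataclass
class TtsBroker:
    """Hypothetical engine-agnostic broker; not the project's real class."""

    registered: Optional[str] = None

    def register(self, name: str) -> None:
        # A plugin announces itself as the active TTS provider.
        self.registered = name

    def unregister(self, name: str) -> None:
        # Only clear the slot if this provider currently holds it.
        if self.registered == name:
            self.registered = None

    def active_provider(self) -> str:
        # With no registered provider, the browser path is used.
        return self.registered or BROWSER_FALLBACK


def apply_plugin_state(broker: TtsBroker, enabled: bool) -> None:
    """Mirror the plugin toggle into the broker (assumed behavior)."""
    if enabled:
        broker.register("kokoro")
    else:
        broker.unregister("kokoro")
```

Toggling the plugin off in this sketch changes only which provider the broker reports. Nothing is installed or removed, consistent with dependencies staying on the Docker/bootstrap path.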

Config

voice: Kokoro voice identifier
speed: Kokoro playback speed multiplier
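
As a rough sketch of how the two config keys might be consumed, the snippet below merges defaults with user overrides and validates `speed`. The `KokoroConfig` class, the `load_config` helper, and the placeholder values are assumptions for illustration; the real defaults live in `default_config.yaml` and are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KokoroConfig:
    """Hypothetical typed view of the plugin config."""

    voice: str    # Kokoro voice identifier
    speed: float  # playback speed multiplier


def load_config(defaults: dict, overrides: dict) -> KokoroConfig:
    """Merge overrides onto defaults and validate the result."""
    merged = {**defaults, **overrides}
    speed = float(merged["speed"])
    if speed <= 0:
        raise ValueError("speed must be a positive multiplier")
    return KokoroConfig(voice=str(merged["voice"]), speed=speed)


# Placeholder values only; not the plugin's shipped defaults.
EXAMPLE_DEFAULTS = {"voice": "example_voice", "speed": 1.0}
config = load_config(EXAMPLE_DEFAULTS, {"speed": 1.25})
```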

Routes

POST /api/plugins/_kokoro_tts/synthesize
POST /api/plugins/_kokoro_tts/status
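
A client could address these routes roughly as shown below, using only the Python standard library. The base URL and the `{"text": ...}` payload are assumptions; this README does not document the request or response schema, so check the plugin's `api` directory before relying on any field name.

```python
import json
import urllib.request

# Route prefix taken from the list above.
PLUGIN_PREFIX = "/api/plugins/_kokoro_tts"


def build_request(base_url: str, route: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a plugin route."""
    url = base_url.rstrip("/") + PLUGIN_PREFIX + "/" + route
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host and hypothetical payload shape.
request = build_request("http://localhost:8080", "synthesize", {"text": "Hello"})
# To send it: urllib.request.urlopen(request)
```

Both routes are declared as POST, so the same helper covers `status` by passing `"status"` as the route; whether that route expects a body is not stated here.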