mirror of
https://github.com/agent0ai/agent-zero.git
synced 2026-05-23 12:44:31 +00:00
Split the legacy core speech stack into two built-in, independently toggleable plugins: `_kokoro_tts` for TTS and `_whisper_stt` for STT.

This refactor keeps dependency installation and bootstrap concerns in Docker/bootstrap/preload, while moving speech-specific tooling, APIs, prompts, UI, and runtime behavior into the plugins. Core now exposes engine-agnostic `tts-service` and `stt-service` brokers, with browser-native TTS preserved as the fallback when Kokoro is disabled.

Included in this change:
- add built-in `_kokoro_tts` plugin with plugin-owned synth API, config, status UI, and provider registration
- add built-in `_whisper_stt` plugin with plugin-owned transcribe API, mic runtime, device UI, prompt injection, and provider registration
- remove legacy core speech APIs/helpers/settings/UI and delete unused `webui/js/speech_browser.js`
- replace the old hardcoded speech settings section with a generic voice surface backed by plugin extensions
- update preload/docs/tests to match the new plugin-owned speech architecture

Behavioral intent:
- both plugins are built-in but not `always_enabled`
- users can now hot-switch TTS and STT independently
- browser TTS remains available when `_kokoro_tts` is off
- Whisper mic UI only appears when `_whisper_stt` is enabled
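The broker arrangement described in this commit message can be illustrated with a small sketch. This is not the actual `tts-service` code: the class name `TtsBroker`, its methods `registerProvider`, `unregisterProvider`, and `currentProvider`, and the provider shape (`speak`, `stop`) are all hypothetical. Only the `statechange` event carrying `detail.isSpeaking`, plus the `isSpeaking()` and `stop()` calls, mirror what the component in this file expects from `window.ttsService`.

```javascript
// Hypothetical sketch of an engine-agnostic TTS broker with a fallback provider.
// None of these names are taken from the Agent Zero source.
class TtsBroker extends EventTarget {
  constructor(fallbackProvider) {
    super();
    this.fallback = fallbackProvider; // e.g. a browser-native speech wrapper
    this.providers = new Map();       // plugin name -> provider object
    this.active = null;               // provider used by the latest speak() call
    this.speaking = false;
  }

  // A plugin would call this when it is enabled...
  registerProvider(name, provider) {
    this.providers.set(name, provider);
  }

  // ...and this when it is disabled, so the fallback takes over again.
  unregisterProvider(name) {
    this.providers.delete(name);
  }

  // Prefer the most recently registered plugin provider, else the fallback.
  currentProvider() {
    const registered = [...this.providers.values()];
    return registered.length ? registered[registered.length - 1] : this.fallback;
  }

  isSpeaking() {
    return this.speaking;
  }

  async speak(text) {
    this.active = this.currentProvider();
    this._setSpeaking(true);
    try {
      await this.active.speak(text);
    } finally {
      this._setSpeaking(false);
    }
  }

  stop() {
    this.active?.stop?.();
    this._setSpeaking(false);
  }

  // Notify listeners only when the state actually changes.
  _setSpeaking(value) {
    if (this.speaking === value) return;
    this.speaking = value;
    const event = new Event("statechange");
    event.detail = { isSpeaking: value };
    this.dispatchEvent(event);
  }
}
```

Under these assumptions, a UI element only needs to subscribe to `statechange` and call `stop()`. It never has to know whether the Kokoro plugin or the browser fallback is doing the synthesis.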
57 lines
2.5 KiB
HTML
<html>
<head>
    <script type="module">
        import { store as chatNavStore } from "/components/chat/navigation/chat-navigation-store.js";
    </script>
</head>
<body>
    <div id="progress-bar-box" x-data>
        <x-extension id="chat-input-progress-start"></x-extension>

        <!-- Hidden legacy progress-bar element (keeps JS updater happy, not displayed) -->
        <span id="progress-bar" style="display:none;"></span>

        <div id="progress-bar-right">
            <h4
                id="progress-bar-stop-speech"
                x-data="{
                    speaking: window.ttsService?.isSpeaking?.() || false,
                    _listener: null,
                    init() {
                        // Alpine calls init() on its own and x-init calls it again;
                        // the guard keeps a second listener from being registered.
                        if (this._listener) return;
                        this._listener = (event) => {
                            this.speaking = !!event?.detail?.isSpeaking;
                        };
                        window.ttsService?.addEventListener?.('statechange', this._listener);
                    },
                    cleanup() {
                        if (this._listener) {
                            window.ttsService?.removeEventListener?.('statechange', this._listener);
                            this._listener = null;
                        }
                    },
                }"
                x-init="init()"
                x-destroy="cleanup()"
                x-cloak
                x-show="speaking"
            >
                <span id="stop-speech" @click="window.ttsService?.stop?.()" style="cursor: pointer" title="Stop Speech" aria-label="Stop Speech">
                    <span class="icon material-symbols-outlined">volume_off</span>
                </span>
            </h4>
            <x-extension id="chat-nav-after"></x-extension>
        </div>
        <x-extension id="chat-input-progress-end"></x-extension>
    </div>


    <style>
        #progress-bar-box { background-color: var(--color-panel); padding: var(--spacing-xs) var(--spacing-md) 0 var(--spacing-md); display: flex; flex-wrap: nowrap; justify-content: flex-start; align-items: center; z-index: 1001; gap: var(--spacing-sm); min-height: 0; }
        #progress-bar-box > x-extension { display: contents; }
        #progress-bar-box h4 { margin: 0; }
        #progress-bar-right { display: flex; align-items: center; gap: var(--spacing-sm); flex-shrink: 0; margin-left: auto; }
        #progress-bar-right > x-extension { display: contents; }
        .shiny-text { background: linear-gradient(to right, var(--color-primary-dark) 20%, var(--color-text) 40%, var(--color-text) 60%, var(--color-primary-dark) 60%); background-size: 200% auto; color: transparent; -webkit-background-clip: text; background-clip: text; animation: shine 1s linear infinite; }
        @keyframes shine { to { background-position: -200% center; } }
    </style>
</body>
</html>