airi/apps/server/docs/ai-context/verifications at server-dev - vrr/airi

vrr/airi

mirror of https://github.com/moeru-ai/airi.git synced 2026-05-17 12:49:33 +00:00

RainbowBird 9a182e0b68	2026-05-17 01:35:40 +08:00

feat(server,stage-ui): bidirectional streaming TTS + audio path refactor

Why:
- Add a real bidirectional streaming TTS path: raw LLM tokens are forwarded to the upstream model (Volcengine v3 via the unspeech ws bridge) without client-side segmentation, so the model owns sentence splitting and audio chunks play as they arrive.
- Move audio endpoints out of /api/v1/openai/. `/audio/voices`, `/audio/models`, `/audio/voices/streaming` are not real OpenAI public APIs, and the streaming TTS surface has nothing to do with OpenAI; keeping them under /openai/ mislabelled the contract.
- Introduce `capabilities.speech.transport` on ProviderDefinition so future streaming providers (ElevenLabs / Cartesia / OpenAI Realtime) opt in without touching Stage.vue or the session factory.
- Unify Stage.vue's TTS path through a single StageTtsSession so the chat-orchestrator hooks no longer branch on provider id.

What:
- apps/server: new ws proxy /api/v1/audio/speech/ws bridges client ↔ unspeech with auth, pre-flight flux check, billing from upstream session.finished.usage, OTel spans.
- apps/server: audio routes moved from /api/v1/openai/audio/* to /api/v1/audio/* (hard cutover; 404 sentinel tests added).
- apps/server: new /api/v1/audio/voices/streaming proxy reads voices from unspeech /api/voices?provider=volcengine.
- apps/server: new STREAMING_TTS_UPSTREAM configKV entry + scripts/seed-streaming-tts.ts.
- stage-ui: new libs/speech/streaming-pipeline.ts opens one ws per LLM intent (appendText / finish / cancel + onSentence / onError / onDone).
- stage-ui: new libs/speech/tts-session.ts: StageTtsSession interface with segmenter and streaming adapters; factory dispatches by capabilities.speech.transport instead of hard-coded provider id.
- stage-ui: providerOfficialSpeechStreaming with capabilities.speech = { transport: 'bidirectional-ws' }; settings page with model/voice picker + ws-based preview.
- stage-ui: Stage.vue chat hooks collapsed to a single currentSession; hot-swap watcher cancels mid-session on provider/voice/model change; unmount cancels and drains playback.

Tests:
- 9 streaming-pipeline tests (happy path / buffered / error / cancel / truncation)
- 11 tts-session tests (factory branch coverage + adapter contracts)
- 4 audio-speech-ws route tests (forwarding / billing / pre-flight / config-missing)
- 3 legacy-path 404 sentinels in v1 route tests
- Verification doc updated to reflect automated coverage.
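
The factory dispatch described in the commit message (choose an adapter from `capabilities.speech.transport` rather than from the provider id) might look roughly like the sketch below. Only the `capabilities.speech.transport` field, the `'bidirectional-ws'` value, and the `appendText` / `finish` names come from the message. The `TtsSession` shape, the `kind` field, the three helper functions, and the sentence-splitting regex are hypothetical stand-ins, not the actual contents of libs/speech/tts-session.ts.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface ProviderDefinition {
  id: string
  capabilities?: { speech?: { transport?: 'bidirectional-ws' } }
}

interface TtsSession {
  kind: 'segmenter' | 'streaming'
  appendText(text: string): void
  // Returns the text units that would be handed to the TTS backend.
  finish(): string[]
}

// Segmenter adapter: buffers tokens and splits sentences on the client.
function createSegmenterSession(): TtsSession {
  let buffer = ''
  return {
    kind: 'segmenter',
    appendText(text) { buffer += text },
    finish() {
      const parts = buffer.match(/[^.!?]+[.!?]*/g) ?? []
      return parts.map(s => s.trim()).filter(s => s.length > 0)
    },
  }
}

// Streaming adapter: forwards raw tokens untouched, leaving sentence
// splitting to the upstream model (the ws transport itself is omitted).
function createStreamingSession(): TtsSession {
  const tokens: string[] = []
  return {
    kind: 'streaming',
    appendText(text) { tokens.push(text) },
    finish() { return tokens },
  }
}

// Dispatch on the declared capability, never on provider.id.
function createTtsSession(provider: ProviderDefinition): TtsSession {
  switch (provider.capabilities?.speech?.transport) {
    case 'bidirectional-ws':
      return createStreamingSession()
    default:
      return createSegmenterSession()
  }
}
```

Under this sketch, a new streaming provider opts in by declaring the transport in its definition; the factory and its callers need no provider-specific branch.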
account-deletion.md	feat(auth): delete account (#1756)	2026-04-28 20:53:00 +08:00
admin-flux-grants.md	refactor(server): drop redis stream + worker role (#1792)	2026-05-08 21:14:01 +08:00
email-auth.md	fix(server/docker): squash auth ui to server image	2026-04-28 00:34:14 +08:00
llm-router.md	refactor(server): llm router config sync subscriber	2026-05-16 03:59:19 +08:00
streaming-tts.md	feat(server,stage-ui): bidirectional streaming TTS + audio path refactor	2026-05-17 01:35:40 +08:00