airi/packages
RainbowBird 9a182e0b68
feat(server,stage-ui): bidirectional streaming TTS + audio path refactor
Why:
- Add a real bidirectional streaming TTS path: raw LLM tokens are
  forwarded to the upstream model (Volcengine v3 via the unspeech ws
  bridge) without client-side segmentation, so the model owns sentence
  splitting and audio chunks play as they arrive.
- Move audio endpoints out of /api/v1/openai/. `/audio/voices`,
  `/audio/models`, `/audio/voices/streaming` are not real OpenAI public
  APIs, and the streaming TTS surface has nothing to do with OpenAI —
  keeping them under /openai/ mislabelled the contract.
- Introduce `capabilities.speech.transport` on ProviderDefinition so
  future streaming providers (ElevenLabs / Cartesia / OpenAI Realtime)
  opt in without touching Stage.vue or the session factory.
- Unify Stage.vue's TTS path through a single StageTtsSession so the
  chat-orchestrator hooks no longer branch on provider id.
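
A minimal sketch of dispatching on `capabilities.speech.transport` rather than on provider id, as the last two points describe. Only the field path and the value `'bidirectional-ws'` come from this commit message; `ProviderDefinitionSketch`, `AdapterKind`, `pickTtsAdapter`, and the example provider ids are hypothetical names for illustration and are not the actual stage-ui types.

```typescript
// Hypothetical shapes: only `capabilities.speech.transport` and the value
// 'bidirectional-ws' are taken from the commit message.
type SpeechTransport = 'bidirectional-ws'

interface ProviderDefinitionSketch {
  id: string
  capabilities?: { speech?: { transport?: SpeechTransport } }
}

type AdapterKind = 'streaming' | 'segmenter'

// Choose an adapter from the declared capability, never from `provider.id`.
// Providers that declare no transport fall back to the segmenter path.
function pickTtsAdapter(provider: ProviderDefinitionSketch): AdapterKind {
  const transport = provider.capabilities?.speech?.transport
  return transport === 'bidirectional-ws' ? 'streaming' : 'segmenter'
}

// Example provider ids below are made up for this sketch.
const streamingProvider: ProviderDefinitionSketch = {
  id: 'example-streaming-provider',
  capabilities: { speech: { transport: 'bidirectional-ws' } },
}
const plainProvider: ProviderDefinitionSketch = { id: 'example-plain-provider' }

console.log(pickTtsAdapter(streamingProvider), pickTtsAdapter(plainProvider))
```

Under this assumption, a new streaming provider would only need to declare the capability; the selection logic itself would not change.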

What:
- apps/server: new ws proxy /api/v1/audio/speech/ws bridges client ↔
  unspeech with auth, pre-flight flux check, billing from upstream
  session.finished.usage, OTel spans.
- apps/server: audio routes moved from /api/v1/openai/audio/* to
  /api/v1/audio/* (hard cutover; 404 sentinel tests added).
- apps/server: new /api/v1/audio/voices/streaming proxy reads voices
  from unspeech /api/voices?provider=volcengine.
- apps/server: new STREAMING_TTS_UPSTREAM configKV entry +
  scripts/seed-streaming-tts.ts.
- stage-ui: new libs/speech/streaming-pipeline.ts opens one ws per LLM
  intent (appendText / finish / cancel + onSentence / onError / onDone).
- stage-ui: new libs/speech/tts-session.ts — StageTtsSession interface
  with segmenter and streaming adapters; factory dispatches by
  capabilities.speech.transport instead of hard-coded provider id.
- stage-ui: providerOfficialSpeechStreaming with capabilities.speech =
  { transport: 'bidirectional-ws' }; settings page with model/voice
  picker + ws-based preview.
- stage-ui: Stage.vue chat hooks collapsed to a single currentSession;
  hot-swap watcher cancels mid-session on provider/voice/model change;
  unmount cancels and drains playback.
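
The streaming-pipeline surface listed above (appendText / finish / cancel + onSentence / onError / onDone) could look roughly like the sketch below. It is not the contents of libs/speech/streaming-pipeline.ts: the `SocketLike` interface, the JSON message shapes, the `UpstreamEvent` type, and the `handleUpstream` entry point are all assumptions made so the example runs without a network connection.

```typescript
// Hypothetical socket abstraction so the sketch runs without a real WebSocket.
interface SocketLike {
  send: (data: string) => void
  close: () => void
}

interface PipelineHandlers {
  onSentence?: (audioChunk: string) => void
  onError?: (err: Error) => void
  onDone?: () => void
}

// Assumed upstream event shapes; the real wire protocol is not shown in the commit.
type UpstreamEvent =
  | { type: 'audio', chunk: string }
  | { type: 'error', message: string }
  | { type: 'done' }

function createStreamingPipelineSketch(socket: SocketLike, handlers: PipelineHandlers) {
  let closed = false
  const end = () => {
    if (!closed) {
      closed = true
      socket.close()
    }
  }

  return {
    // Forward raw LLM tokens as-is; no client-side sentence segmentation.
    appendText(text: string) {
      if (!closed)
        socket.send(JSON.stringify({ type: 'text', text }))
    },
    // Tell upstream that no more text will arrive for this intent.
    finish() {
      if (!closed)
        socket.send(JSON.stringify({ type: 'finish' }))
    },
    // Abort mid-session, e.g. on provider/voice/model change or unmount.
    cancel() {
      end()
    },
    // Feed one decoded upstream event into the handlers.
    handleUpstream(event: UpstreamEvent) {
      if (closed)
        return
      if (event.type === 'audio') {
        handlers.onSentence?.(event.chunk)
      }
      else if (event.type === 'error') {
        end()
        handlers.onError?.(new Error(event.message))
      }
      else {
        end()
        handlers.onDone?.()
      }
    },
  }
}
```

In this sketch one pipeline instance owns one socket, mirroring the "one ws per LLM intent" wording; after `cancel`, later `appendText` calls and upstream events are ignored.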

Tests:
- 9 streaming-pipeline tests (happy path / buffered / error / cancel /
  truncation)
- 11 tts-session tests (factory branch coverage + adapter contracts)
- 4 audio-speech-ws route tests (forwarding / billing / pre-flight /
  config-missing)
- 3 legacy-path 404 sentinels in v1 route tests
- Verification doc updated to reflect automated coverage.
2026-05-17 01:35:40 +08:00
airi-screenshot release: v0.10.2 2026-05-07 19:36:33 +08:00
audio chore(deps): bump dependencies 2026-03-23 02:15:40 +08:00
audio-pipelines-transcribe feat(stage-tamagotchi,stage-ui,ui): input should not flink, added retry for errored item, adjusted button 2026-04-23 01:09:46 +08:00
cap-vite release: v0.10.2 2026-05-07 19:36:33 +08:00
ccc style: many typescript v6 pending errors 2026-04-05 02:07:09 +08:00
core-agent feat(core-agent): harden registry buckets and bridge ingest isolation (#1819) 2026-05-13 12:02:35 +08:00
core-character release: v0.10.2 2026-05-07 19:36:33 +08:00
drizzle-duckdb-wasm chore(*): migrate drizzle-duckdb-wasm and duckdb-wasm to proj-airi/duckdb-wasm (#116) 2025-04-05 18:14:15 +08:00
duckdb-wasm chore(*): migrate drizzle-duckdb-wasm and duckdb-wasm to proj-airi/duckdb-wasm (#116) 2025-04-05 18:14:15 +08:00
electron-eventa release: v0.10.2 2026-05-07 19:36:33 +08:00
electron-screen-capture feat(server): llm & tts gateway (#1837) 2026-05-15 19:00:38 +08:00
electron-vueuse release: v0.10.2 2026-05-07 19:36:33 +08:00
font-chillroundm release: v0.10.2 2026-05-07 19:36:33 +08:00
font-cjkfonts-allseto release: v0.10.2 2026-05-07 19:36:33 +08:00
font-departure-mono release: v0.10.2 2026-05-07 19:36:33 +08:00
font-xiaolai release: v0.10.2 2026-05-07 19:36:33 +08:00
i18n feat(server): stream tts provider 2026-05-16 17:29:40 +08:00
memory-pgvector release: v0.10.2 2026-05-07 19:36:33 +08:00
model-driver-lipsync release: v0.10.2 2026-05-07 19:36:33 +08:00
model-driver-mediapipe chore(deps): bump dependencies 2026-04-18 19:18:17 +08:00
pipelines-audio release: v0.10.2 2026-05-07 19:36:33 +08:00
plugin-protocol release: v0.10.2 2026-05-07 19:36:33 +08:00
plugin-sdk release: v0.10.2 2026-05-07 19:36:33 +08:00
plugin-sdk-tamagotchi release: v0.10.2 2026-05-07 19:36:33 +08:00
scenarios-stage-tamagotchi-browser release: v0.10.2 2026-05-07 19:36:33 +08:00
scenarios-stage-tamagotchi-electron release: v0.10.2 2026-05-07 19:36:33 +08:00
server-runtime fix(server-runtime): ignore duplicate websocket listener (#1829) 2026-05-14 01:53:41 +08:00
server-schema release: v0.10.2 2026-05-07 19:36:33 +08:00
server-sdk release: v0.10.2 2026-05-07 19:36:33 +08:00
server-sdk-shared release: v0.10.2 2026-05-07 19:36:33 +08:00
server-shared release: v0.10.2 2026-05-07 19:36:33 +08:00
stage-layouts refactor(stage-*): improve transcription UX and DX (#1685) 2026-05-09 19:09:39 +08:00
stage-pages feat(server,stage-ui): bidirectional streaming TTS + audio path refactor 2026-05-17 01:35:40 +08:00
stage-shared feat(stage-tamagotchi,stage-shared): wire up global shortcut service and devtools (#1811) 2026-05-10 20:38:47 +09:00
stage-ui feat(server,stage-ui): bidirectional streaming TTS + audio path refactor 2026-05-17 01:35:40 +08:00
stage-ui-live2d fix(stage-ui-live2d): prevent motions from overriding lip sync mouth values (#1783) 2026-05-13 12:10:03 +08:00
stage-ui-three feat(stage-*): allow setting character offset in landscape (#1696) 2026-05-06 17:04:18 +08:00
stream-kit release: v0.10.2 2026-05-07 19:36:33 +08:00
ui fix(ui): add openOnClick prop to Combobox for reliable dropdown opening (#1795) 2026-05-09 22:44:27 +08:00
ui-loading-screens chore(deps): bump dependencies 2026-04-18 19:18:17 +08:00
ui-transitions chore(deps): remove deprecated unplugin-vue-router (#1664) 2026-04-15 15:02:10 +08:00
unocss-preset-fonts release: v0.10.2 2026-05-07 19:36:33 +08:00
vishot-runner-browser release: v0.10.2 2026-05-07 19:36:33 +08:00
vishot-runner-electron release: v0.10.2 2026-05-07 19:36:33 +08:00
vishot-runtime release: v0.10.2 2026-05-07 19:36:33 +08:00