mirror of
https://github.com/moeru-ai/airi.git
synced 2026-04-26 13:40:42 +00:00
## Problem

The WebGPU inference pipeline had several structural issues:

1. **No unified protocol**: Kokoro TTS, Whisper ASR, and Background Removal workers each used their own ad-hoc message formats. Adding a new model meant reinventing worker communication from scratch.
2. **Infrastructure existed but was disconnected**: `GPUResourceCoordinator`, `LoadQueue`, `InferenceWorkerManager`, and `protocol.ts` were all implemented but had zero consumers. The adapters duplicated the same lifecycle/timeout/mutex patterns independently.
3. **Performance gaps**: Kokoro only offered fp32 on WebGPU (no fp16), Whisper warm-up compiled shaders for 187.5s of dummy audio, audio transfer went through unnecessary WAV blob encode/decode, and `listVoices` reloaded the model every time.
4. **Silent failures**: Whisper worker's `generate()` had no try-catch; errors were swallowed and the main thread waited until timeout.
5. **No graceful degradation**: Whisper and Background Removal workers hardcoded `device: 'webgpu'` with no WASM fallback.
6. **No observability**: Only Kokoro had performance tracing. No adapter reported status to `useInferenceStatus`. No cache management UI existed.
7. **Dead code accumulation**: Old `KokoroWorkerManager` (232 lines), legacy Whisper message types, and scattered duplicate constants.

## Changes

### Phase 0: Critical Performance & Bugs

- Add `fp16-webgpu` dtype for Kokoro TTS (~2x inference speed on supported GPUs)
- Fix Whisper warm-up tensor from `[1, 128, 3000]` → `[1, 128, 1]` (minimal shader compilation)
- Fix Whisper worker silent error bug (add try-catch to `generate()` and `load()`)

### Phase 1: Data Transfer & Caching

- Switch Kokoro audio to Float32Array transferable (skip WAV blob encode in worker, lightweight WAV encode on main thread)
- Cache `listVoices` results (skip redundant model reload when adapter state is `ready`)
- Normalize progress reporting to 0-100 across all adapters, differentiate `warmup` phase

### Phase 2: Protocol Unification & Infrastructure

- Migrate all 3 workers + 3 adapters to unified `protocol.ts` message types (`load-model`, `run-inference`, `model-ready`, `inference-result`, `progress`, `error`)
- Wire `GPUResourceCoordinator` into all adapters (VRAM allocation tracking, LRU ordering, memory pressure events)
- Wire `LoadQueue` into all adapters (priority-based sequential model loading: TTS=10 > ASR=5 > BG_REMOVAL=1)
- Add `coordinator.ts` global singleton for GPU coordinator + load queue
- Add WebGPU detection + WASM fallback in Whisper and Background Removal workers

### Phase 3: Error Recovery & Observability

- Add restart logic with exponential backoff to Whisper adapter (matching Kokoro's existing pattern)
- Integrate `classifyError()` (OOM / DEVICE_LOST / TIMEOUT classification) in Whisper adapter
- Extend `defaultPerfTracer` to Whisper `transcribe()` and Background Removal `processImage()`
- Wire `useInferenceStatus` into all 3 adapters (downloading → ready → terminated lifecycle)

### Phase 4: Tests

- Add unit tests for `AsyncMutex` (4 tests), `LoadQueue` (4 tests), `GPUResourceCoordinator` (7 tests), all 15 passing

### Phase 5: Cleanup & Features

- Delete old `KokoroWorkerManager` (232 lines, zero consumers)
- Delete orphaned `libs/workers/types.ts` (old Whisper message types)
- Clean up `workers/kokoro/types.ts` (remove legacy message types, keep domain types)
- Create centralized `constants.ts` (MODEL_IDS, MODEL_NAMES, TIMEOUTS, MAX_RESTARTS)
- Remove hardcoded WebGPU check from background-removal devtools pages (worker auto-detects)
- Add `useModelPreload` composable for generic idle-time preloading
- Add `useInferencePreload` composable that reads provider config and preloads configured local models
- Wire preloading into both `stage-web` and `stage-tamagotchi` App.vue (Kokoro TTS preloads 3s after init)
- Add `ModelCacheManager.vue` settings component (cache size display, per-model status, clear cache)
- Document GPU Device isolation architecture in protocol.ts

## After This PR

- All inference workers speak the same protocol → adding a new model adapter is straightforward
- GPU memory is tracked across all models with automatic pressure warnings at 80%/95% of VRAM budget
- Models load sequentially via priority queue → no bandwidth/VRAM contention
- Workers auto-detect WebGPU and fall back to WASM → works on browsers without WebGPU
- Kokoro TTS preloads during idle time → "instant" first use for configured users
- All adapters auto-restart on worker crashes (max 3 attempts, exponential backoff)
- 15 unit tests cover core infrastructure (mutex, queue, coordinator)
- Zero dead code remains in the inference pipeline

## Test Plan

- [x] `pnpm exec vitest run packages/stage-ui/src/libs/inference/`: 15 tests pass
- [x] `pnpm -F @proj-airi/stage-ui exec tsc --noEmit`: no TypeScript errors
- [x] `pnpm lint:fix`: no lint errors in changed files
- [ ] Manual: verify Kokoro TTS works with fp16-webgpu on a supported browser
- [ ] Manual: verify Whisper ASR loads and transcribes correctly
- [ ] Manual: verify Background Removal works in devtools page
- [ ] Manual: verify preloading triggers in console (`[Preload] Loading kokoro-...`)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
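
The priority-based sequential loading mentioned under Phase 2 can be pictured with a small sketch. This is not the repository's `LoadQueue`; the class name `SketchLoadQueue`, the `PRIORITY` map, and the `enqueue` signature are hypothetical and only reuse the priority values quoted above (TTS=10 > ASR=5 > BG_REMOVAL=1). It assumes that loads requested in the same tick should start in priority order and that only one load runs at a time.

```typescript
// Hypothetical illustration only: names and shapes are assumptions,
// not the actual `LoadQueue` from the inference pipeline.
const PRIORITY = { TTS: 10, ASR: 5, BG_REMOVAL: 1 } as const

interface QueueEntry {
  priority: number
  seq: number // insertion order, used as a FIFO tie-breaker
  run: () => Promise<void>
}

class SketchLoadQueue {
  private entries: QueueEntry[] = []
  private running = false
  private seq = 0

  // Queue a load task; the returned promise settles with the task's outcome.
  enqueue<T>(priority: number, task: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      this.entries.push({
        priority,
        seq: this.seq++,
        run: async () => {
          try {
            resolve(await task())
          }
          catch (error) {
            // A failed load rejects its own promise but does not stop the queue.
            reject(error)
          }
        },
      })
      // Highest priority first; equal priorities keep insertion order.
      this.entries.sort((a, b) => b.priority - a.priority || a.seq - b.seq)
      // Defer draining by one microtask so tasks enqueued in the same tick
      // are ordered by priority before the first one starts.
      void Promise.resolve().then(() => this.drain())
    })
  }

  // Run queued tasks strictly one at a time.
  private async drain(): Promise<void> {
    if (this.running)
      return
    this.running = true
    try {
      while (this.entries.length > 0) {
        const next = this.entries.shift()!
        await next.run()
      }
    }
    finally {
      this.running = false
    }
  }
}
```

With this sketch, calling `queue.enqueue(PRIORITY.BG_REMOVAL, loadBg)`, `queue.enqueue(PRIORITY.ASR, loadAsr)`, and `queue.enqueue(PRIORITY.TTS, loadTts)` back to back (the three loader functions being placeholders) would start the TTS load first, then ASR, then background removal. A load that has already started is not preempted by a later, higher-priority request.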
266 lines
10 KiB
TypeScript
import type { WebFontMeta } from '@unocss/preset-web-fonts'
import type { Preset, PresetOrFactoryAwaitable } from 'unocss'

import { setDefaultAutoSelectFamilyAttemptTimeout } from 'node:net'

import { createExternalPackageIconLoader } from '@iconify/utils/lib/loader/external-pkg'
import { presetChromatic } from '@proj-airi/unocss-preset-chromatic'
import { colorToString } from '@unocss/preset-mini/utils'
import { defineConfig, mergeConfigs, presetAttributify, presetIcons, presetTypography, presetWind3, transformerDirectives, transformerVariantGroup } from 'unocss'
import { presetScrollbar } from 'unocss-preset-scrollbar'
import { parseColor } from 'unocss/preset-mini'

// On Netlify, the build results in the following error when the @unocss/preset-web-fonts plugin fetches
// metadata and fonts:
//
//   [cause]: AggregateError [ETIMEDOUT]:
//       at internalConnectMultiple (node:net:1134:18)
//     code: 'ETIMEDOUT',
//     [errors]: [
//       Error: connect ETIMEDOUT 146.75.77.229:443 ...
//       Error: connect ENETUNREACH 2a04:4e42:83::485:443 - Local (:::0) ...
//     ]
//
// This happens with either Google Fonts or Fontsource as the provider, while GitHub Actions and local
// development work fine. My assumption is that the default timeout for auto-selecting the address family
// (250ms)[^1] is too short for the Node.js implementation of the Happy Eyeballs algorithm, which the `net`
// module uses to connect to the server. The call chain looks like this:
//
//   lookupAndConnect > autoSelectFamilyAttemptTimeout > lookupAndConnectMultiple > internalConnectMultiple > defaultTriggerAsyncIdScope
//
// This mechanism is used when the `net` module attempts to connect to a server using both IPv4 and IPv6
// addresses, which is the case for the Netlify builder.
//
// To fix this, we increase the timeout to 1000ms (1 second) so that the algorithm has more time to attempt
// to connect to the server before timing out.
//
// [^1]: https://github.com/nodejs/node/pull/44731/files#diff-d76469e9e7f555294a7a5488c5c8fc4ef8ce5aea448cc26a1322d1ab693e09caR921
setDefaultAutoSelectFamilyAttemptTimeout(1000)

export function presetStoryMockHover(): PresetOrFactoryAwaitable {
  return {
    name: 'story-mock-hover',
    variants: [
      (matcher) => {
        if (!matcher.includes('hover')) {
          return matcher
        }

        return {
          matcher,
          selector: (s) => {
            return `${s}, ${s.replace(/:hover$/, '')}._hover`
          },
        }
      },
    ],
  }
}

export function safelistAllPrimaryBackgrounds(): string[] {
  return [undefined, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 950].map((shade) => {
    const prefix = shade ? `bg-primary-${shade}` : `bg-primary`
    return [
      prefix,
      ...[5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100].map(opacity => `${prefix}/${opacity}`),
    ]
  }).flat()
}

export function presetWebFontsFonts(provider: 'fontsource' | 'none'): Record<string, string | WebFontMeta | (string | WebFontMeta)[]> {
  return {
    'sans': {
      name: provider === 'fontsource' ? 'DM Sans' : 'DM Sans Variable',
      provider,
    },
    'serif': {
      name: 'DM Serif Display',
      provider,
    },
    'mono': {
      name: 'DM Mono',
      provider,
    },
    'cutejp': {
      name: 'Kiwi Maru',
      provider,
      subsets: ['latin', 'japanese'],
    },
    'cuteen': {
      name: provider === 'fontsource' ? 'Nunito' : 'Nunito Variable',
      provider,
    },
    'jura': {
      name: provider === 'fontsource' ? 'Jura' : 'Jura Variable',
      provider,
    },
    'gugi': {
      name: 'Gugi',
      provider,
    },
    'quicksand': {
      name: provider === 'fontsource' ? 'Quicksand' : 'Quicksand Variable',
      provider,
    },
    'urbanist': {
      name: provider === 'fontsource' ? 'Urbanist' : 'Urbanist Variable',
      provider,
    },
    'comfortaa': {
      name: provider === 'fontsource' ? 'Comfortaa' : 'Comfortaa Variable',
      provider,
    },
    'm-plus-rounded': {
      name: 'M PLUS Rounded 1c',
      provider,
    },
    'quanlai': {
      name: 'cjkfonts AllSeto',
      provider: 'none',
    },
    'xiaolai': {
      name: 'Xiaolai SC',
      provider: 'none',
    },
  }
}

export function sharedUnoConfig() {
  return defineConfig({
    presets: [
      presetWind3(),
      presetAttributify(),
      presetTypography(),
      presetIcons({
        scale: 1.2,
        collections: {
          ...createExternalPackageIconLoader('@proj-airi/lobe-icons'),
          ...createExternalPackageIconLoader('@proj-airi/iconify-meteocons'),
        },
      }),
      presetScrollbar(),
      presetChromatic({
        baseHue: 220.44,
        colors: {
          primary: 0,
          complementary: 180,
        },
      }) as Preset,
    ],
    transformers: [
      transformerDirectives({
        applyVariable: ['--at-apply'],
      }),
      transformerVariantGroup(),
    ],
    safelist: [
      ...'prose prose-sm m-auto text-left'.split(' '),
      ...safelistAllPrimaryBackgrounds(),
    ],
    // hyoban/unocss-preset-shadcn: Use shadcn ui with UnoCSS
    // https://github.com/hyoban/unocss-preset-shadcn
    //
    // Thanks to
    // https://github.com/unovue/shadcn-vue/issues/34#issuecomment-2467318118
    // https://github.com/hyoban-template/shadcn-vue-unocss-starter
    //
    // By default, `.ts` and `.js` files are NOT extracted.
    // If you want to extract them, use the following configuration.
    // It's necessary to add the following configuration if you use shadcn-vue or shadcn-svelte.
    content: {
      pipeline: {
        include: [
          // the default
          /\.(vue|svelte|[jt]sx|mdx?|astro|elm|php|phtml|html)($|\?)/,
          // include js/ts files
          '(components|src)/**/*.{js,ts,vue}', // THIS CAN INCLUDE node_modules
          '**/stage-ui/**/*.{vue,js,ts}', // THIS TOO
          '**/ui/**/*.{vue,js,ts}', // THIS TOO
        ],
        exclude: [
          /\/node_modules\//, // DO NOT SCAN THE BLACK HOLE
        ],
      },
    },
    rules: [
      [/^mask-\[(.*)\]$/, ([, suffix]) => ({ '-webkit-mask-image': suffix.replace(/_/g, ' ') })],

      [/^bg-dotted-\[(.*)\]$/, ([, color], { theme }) => {
        const parsedColor = parseColor(color, theme)
        // Util usage: https://github.com/unocss/unocss/blob/f57ef6ae50006a92f444738e50f3601c0d1121f2/packages-presets/preset-mini/src/_utils/utilities.ts#L186
        return {
          'background-image': `radial-gradient(circle at 1px 1px, ${colorToString(parsedColor?.cssColor ?? parsedColor?.color ?? color, 'var(--un-background-opacity)')} 1px, transparent 0)`,
          '--un-background-opacity': parsedColor?.cssColor?.alpha ?? parsedColor?.alpha ?? 1,
        }
      }],

      [/drag-region/, () => ({ 'app-region': 'drag' })],
    ],
    theme: {
      fontFamily: {
        'sans': `"DM Sans Variant", "DM Sans", ui-sans-serif, system-ui, sans-serif, "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Noto Color Emoji";`,
        'sans-rounded': `"Comfortaa Variable", "Comfortaa", "DM Sans", ui-sans-serif, system-ui, sans-serif, "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Noto Color Emoji";`,
        'cute': `"Nunito Variable", "Nunito", "ChillRoundM", "Kiwi Maru", "Comfortaa Variable", "Comfortaa", "DM Sans Variant", "DM Sans", ui-sans-serif, system-ui, sans-serif, "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Noto Color Emoji";`,
        'cuteen': `"Nunito Variable", "Nunito", "ChillRoundM", "Kiwi Maru", "Comfortaa Variable", "Comfortaa", "DM Sans Variant", "DM Sans", ui-sans-serif, system-ui, sans-serif, "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Noto Color Emoji";`,
        'cutejp': `"Nunito Variable", "Nunito", "ChillRoundM", "Kiwi Maru", "Comfortaa Variable", "Comfortaa", "DM Sans Variant", "DM Sans", ui-sans-serif, system-ui, sans-serif, "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Noto Color Emoji";`,
      },
      /**
       * https://github.com/unocss/unocss/blob/1031312057a3bea1082b7d938eb2ad640f57613a/packages-presets/preset-wind4/src/theme/animate.ts
       * https://unocss.dev/presets/wind4#transformdirectives
       */
      animation: {
        keyframes: {
          overlayShow: '{from{opacity:0;}to{opacity:1;}}',
          overlayHide: '{from{opacity:1;}to{opacity:0;}}',
          // `from` / `to` are keyframe selectors and take no colon before the block.
          contentShow: '{from{opacity:0;transform:translate(-50%,-48%) scale(0.96);}to{opacity:1;transform:translate(-50%,-50%) scale(1);}}',
          contentHide: '{from{opacity:1;transform:translate(-50%,-50%) scale(1);}to{opacity:0;transform:translate(-50%,-48%) scale(0.96);}}',
          slideUpAndFade: '{from{opacity:0;transform:translateY(2px)}to{opacity:1;transform:translateY(0)}}',
          slideRightAndFade: '{from{opacity:0;transform:translateX(-2px)}to{opacity:1;transform:translateX(0)}}',
          slideDownAndFade: '{from{opacity:0;transform:translateY(-2px)}to{opacity:1;transform:translateY(0)}}',
          slideLeftAndFade: '{from{opacity:0;transform:translateX(2px)}to{opacity:1;transform:translateX(0)}}',
          fadeIn: '{from{opacity:0;}to{opacity:1;}}',
          fadeOut: '{from{opacity:1;}to{opacity:0;}}',
        },
        durations: {
          overlayShow: '300ms',
          overlayHide: '300ms',
          contentShow: '150ms',
          contentHide: '150ms',
          slideUpAndFade: '400ms',
          slideRightAndFade: '400ms',
          slideDownAndFade: '400ms',
          slideLeftAndFade: '400ms',
          fadeIn: '200ms',
          fadeOut: '200ms',
        },
        timingFns: {
          overlayShow: 'cubic-bezier(0.16, 1, 0.3, 1)',
          overlayHide: 'cubic-bezier(0.16, 1, 0.3, 1)',
          contentShow: 'cubic-bezier(0.16, 1, 0.3, 1)',
          contentHide: 'cubic-bezier(0.16, 1, 0.3, 1)',
          slideUpAndFade: 'cubic-bezier(0.16, 1, 0.3, 1)',
          slideRightAndFade: 'cubic-bezier(0.16, 1, 0.3, 1)',
          slideDownAndFade: 'cubic-bezier(0.16, 1, 0.3, 1)',
          slideLeftAndFade: 'cubic-bezier(0.16, 1, 0.3, 1)',
          fadeIn: 'ease-in-out',
          fadeOut: 'ease-in-out',
        },
      },
    },
  })
}

export function histoireUnoConfig() {
  return defineConfig({
    presets: [
      presetStoryMockHover(),
    ],
  })
}

export default mergeConfigs([
  sharedUnoConfig(),
  histoireUnoConfig(),
])