Commit graph

4 commits

Author SHA1 Message Date
RainbowBird
c627bce9c9
refactor(server): split services into domain/adapter layers, drop dead code
Why
- src/services/ was an unordered mix of single-file services and module
  directories with no shared classification axis, plus several long-dead
  admin batch helpers that survived the move to the simpler synchronous
  admin-flux-grants flow.

What
- services/ now has two top-level layers:
    domain/   — DB state + business rules (billing, characters, chats,
                flux, flux-transaction, llm-router, providers, request-log,
                stripe, user-deletion, admin/{flux-grants,router-config})
    adapters/ — thin wrappers over external SDKs / infra (config-kv, email,
                posthog, tts/)
- admin/* moved under domain/admin/ and named consistently
  (flux-grants, router-config).
- tts-adapters/ collapsed to adapters/tts/ (no redundant -adapters suffix
  once nested under adapters/).
- 63 src files + scripts/e2e-llm-router.ts + tests/verifications/_harness.ts
  had relative imports rewritten; git mv preserves blame.
- apps/server/CLAUDE.md and docs/ai-context/*.md updated to match new paths.

Dead code removed
- services/admin-flux-grant-batches/ (service + worker + tests, 1090 LOC) —
  superseded by admin-flux-grants and never wired into app.ts.
- routes/admin/flux-grant-batches/ — same.
- utils/redis-compressed.ts + test — zero production call sites.
- llm-router/index.ts re-exports trimmed from 26 to 6; only symbols with
  external consumers are kept.

Intentionally kept
- schemas/flux-grant-batch.ts and its schemas/index.ts export remain so the
  drizzle-kit generate diff stays empty. Removing them is a separate PR
  that owns the drop-table migration for flux_grant_batch /
  flux_grant_batch_recipient.

Verification
- pnpm -F @proj-airi/server typecheck: passes.
- pnpm exec eslint apps/server: 49 errors, identical to main baseline
  (all are pre-existing node/prefer-global/buffer in envelope-crypto and
  scripts/e2e-llm-router; untouched by this change).
- Vitest passes per-file; the 6 mockDB hook timeouts under full-parallel
  run are the known pushSchema-per-worker infra cost, not a regression.
2026-05-18 23:36:45 +08:00
RainbowBird
4da4a72703
feat(server): finalize in-process LLM/TTS router cutover
End-state of the multi-step KTD-5 / KTD-6 / U8 work. The knoway sidecar
is no longer reachable from server code; the router is required at boot
and now owns chat completions, TTS synthesis, and voice catalog listing.

Highlights:
- LLM_ROUTER_MASTER_KEY becomes required; app.ts drops the graceful-
  skip branch and the chat fallback fetch path is gone.
- /audio/speech and /audio/voices route through new routeTts /
  listTtsVoices entries that reuse the chat key-rotator + per-attempt
  timeout + abort propagation.
- DEFAULT_CHAT_MODEL / DEFAULT_TTS_MODEL move from env to configKV so
  default-model swaps are hot-reloadable via Pub/Sub.
- GATEWAY_BASE_URL removed from env schema, .env, .env.local, smoke,
  verification harness. Redis upstream-voices cache deleted — catalogs
  come from in-process adapter JSON.
- routeTts splits adapter error contract by ApiError statusCode:
  4xx propagates without fallback; 5xx folds into the network-failure
  fallback path. handleTTS wraps billing + span attribute in try/finally
  to plug a span leak when ttsMeter.accumulate() throws.
- seed-router-config.ts rewritten with --merge (default) / --reset /
  --dry-run modes and env-var key handoff (OPENROUTER_KEY / AZURE_KEY /
  DASHSCOPE_KEY) so prod seed flows never put plaintext on the CLI.
  Adds DashScope CosyVoice seeding.
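
The adapter error contract described for routeTts above can be illustrated with a minimal sketch. The `ApiError` class, `shouldFallback`, and `routeTtsSketch` below are hypothetical stand-ins written for this example, not the server's actual code. The sketch assumes that any non-ApiError failure counts as a network failure and that candidate upstreams are tried in order.

```typescript
// Hypothetical stand-in for the server's ApiError; only statusCode matters here.
class ApiError extends Error {
  readonly statusCode: number
  constructor(statusCode: number, message: string) {
    super(message)
    this.name = 'ApiError'
    this.statusCode = statusCode
  }
}

// One attempt against one upstream (assumed shape, for illustration only).
type TtsAttempt = () => Promise<Uint8Array>

// 5xx ApiErrors and non-ApiError failures (network errors, timeouts) may fall
// back; any other ApiError (the 4xx case) is surfaced to the caller unchanged.
function shouldFallback(err: unknown): boolean {
  if (err instanceof ApiError)
    return err.statusCode >= 500
  return true
}

async function routeTtsSketch(attempts: TtsAttempt[]): Promise<Uint8Array> {
  let lastError: unknown = new Error('no TTS upstream configured')
  for (const attempt of attempts) {
    try {
      return await attempt()
    }
    catch (err) {
      if (!shouldFallback(err))
        throw err // 4xx: propagate immediately, skip remaining upstreams
      lastError = err // 5xx or network failure: try the next upstream
    }
  }
  throw lastError
}
```

Keeping the decision in one small predicate makes the two regression cases (4xx, 5xx) easy to pin separately from the retry loop.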

Docs (CLAUDE.md, architecture-overview.md, transport-and-routes.md)
reflect the new boundary. verifications/llm-router.md replaces the
overstated "U1-U9 shipped" line with an evidence-vs-pending table.

Tests: full 40-file / 343-case server suite green. New regressions pin
ApiError 4xx → no-fallback, ApiError 5xx → fallback, TTS billing
failure → span closed and error propagated.
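
The "span closed and error propagated" regression corresponds to a try/finally shape along these lines. This is a synchronous sketch under stated assumptions: `SpanLike`, `billTtsWithinSpan`, and the attribute name `tts.billed_units` are hypothetical, and the real handleTTS is presumably asynchronous, where the same structure applies with `await`.

```typescript
// Hypothetical minimal span shape; real tracing SDKs expose similar methods.
interface SpanLike {
  setAttribute(key: string, value: number): void
  end(): void
}

// Bills the usage and records it on the span. The finally block closes the
// span whether or not accumulate() throws; the error itself is not swallowed.
function billTtsWithinSpan(
  span: SpanLike,
  accumulate: (units: number) => void,
  units: number,
): void {
  try {
    accumulate(units) // may throw, as ttsMeter.accumulate() can
    span.setAttribute('tts.billed_units', units) // assumed attribute name
  }
  finally {
    span.end() // runs on success and on failure, so the span never leaks
  }
}
```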
2026-05-18 23:32:33 +08:00
RainbowBird
f8d1fa7a64
refactor(server): drop redis stream + worker role (#1792)
The Redis Stream `billing-events` + `worker` Railway role +
advisory-lock poller layered together didn't actually buy us reliability
— `debitFlux` swallowed XADD failures, leaving the door open to "balance
updated, ledger row never written". Collapse the whole thing back to:
`creditFlux` and `debitFlux` write `flux_transaction` ledger rows inline
within the same DB transaction that mutates `user_flux`, and `(user_id,
request_id)` remains the partial unique index that keeps retries safe.
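
The inline-ledger idea can be illustrated with an in-memory sketch. Everything here is a hypothetical stand-in: `InMemoryDb` imitates a transactional store, and this `debitFlux` is not the real `BillingService` method. It also assumes the partial unique index applies only to rows that carry a request id, and that a retry returns the already-written row.

```typescript
interface LedgerRow {
  userId: string
  requestId: string | null
  delta: number
  balanceAfter: number
}

// Toy transactional store: fn runs on a working copy that is committed only
// if fn returns normally, so balance and ledger change together or not at all.
class InMemoryDb {
  balances = new Map<string, number>()
  ledger: LedgerRow[] = []

  transaction<T>(fn: (tx: InMemoryDb) => T): T {
    const tx = new InMemoryDb()
    tx.balances = new Map(this.balances)
    tx.ledger = [...this.ledger]
    const result = fn(tx) // a throw here discards the working copy
    this.balances = tx.balances
    this.ledger = tx.ledger
    return result
  }
}

function debitFlux(
  db: InMemoryDb,
  userId: string,
  amount: number,
  requestId: string | null = null,
): LedgerRow {
  return db.transaction((tx) => {
    // Stand-in for the (user_id, request_id) partial unique index:
    // a retried request finds its earlier row and debits nothing more.
    if (requestId !== null) {
      const existing = tx.ledger.find(
        row => row.userId === userId && row.requestId === requestId,
      )
      if (existing)
        return existing
    }
    const balance = tx.balances.get(userId) ?? 0
    if (balance < amount)
      throw new Error('insufficient flux')
    const row: LedgerRow = { userId, requestId, delta: -amount, balanceAfter: balance - amount }
    tx.balances.set(userId, row.balanceAfter) // mutate user_flux
    tx.ledger.push(row) // write the flux_transaction row in the same transaction
    return row
  })
}
```

Because the ledger write shares the transaction with the balance update, there is no window in which one exists without the other, which is the failure mode the stream-based design left open.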

Concrete changes:
- Inline ledger inserts in `BillingService.{debitFlux, creditFlux,
creditFluxFromStripeCheckout, creditFluxFromInvoice}`; drop `billingMq`
and `publishEvent` plumbing entirely.
- `routes/openai/v1` writes `llm_request_log` synchronously via the
existing `requestLogService`; the duplicate `llm-request-log.ts` service
module is removed.
- `bin/run-worker.ts`, `libs/mq/*`,
`services/billing/billing-events.ts`,
`services/billing/billing-consumer-handler.ts`, and matching tests are
deleted. CLI now exposes only `api`.
- `BILLING_EVENTS_*` env vars and the `DEFAULT_BILLING_EVENTS_STREAM`
helper are dropped; `docker-compose.yml` no longer ships a worker
service.
- `docs/ai-context/{workers-and-runtime, billing-architecture,
redis-boundaries-and-pubsub, data-model-and-state,
architecture-overview, README}.md`, `CLAUDE.md`, and the existing
verification docs are updated to describe the single-process synchronous
pipeline.

Tests: 29 files / 247 cases pass. Production deployments need to drop
the worker Railway service after this lands.
2026-05-08 21:14:01 +08:00
RainbowBird
93888e6935
feat(server): use stripe product as flux pricing (#1640)
2026-04-12 04:32:42 +08:00