Why
- src/services/ was an unordered mix of single-file services and module
directories with no shared classification axis, plus several long-dead
admin batch helpers that survived the move to the simpler synchronous
admin-flux-grants flow.
What
- services/ now has two top-level layers:
domain/ — DB state + business rules (billing, characters, chats,
flux, flux-transaction, llm-router, providers, request-log,
stripe, user-deletion, admin/{flux-grants,router-config})
adapters/ — thin wrappers over external SDKs / infra (config-kv, email,
posthog, tts/)
- admin/* moved under domain/admin/ with consistent plural names
(flux-grants, router-config).
- tts-adapters/ collapsed to adapters/tts/ (no redundant -adapters suffix
once nested under adapters/).
- 63 src files + scripts/e2e-llm-router.ts + tests/verifications/_harness.ts
had relative imports rewritten; files were moved with git mv so blame
follows the renames.
- apps/server/CLAUDE.md and docs/ai-context/*.md updated to match new paths.
Dead code removed
- services/admin-flux-grant-batches/ (service + worker + tests, 1090 LOC) —
superseded by admin-flux-grants and never wired into app.ts.
- routes/admin/flux-grant-batches/ — same.
- utils/redis-compressed.ts + test — zero production call sites.
- llm-router/index.ts re-exports trimmed from 26 to 6; only symbols with
external consumers are kept.
Intentionally kept
- schemas/flux-grant-batch.ts and its schemas/index.ts export remain so the
drizzle-kit generate diff stays empty. Removing them is a separate PR
that owns the drop-table migration for flux_grant_batch /
flux_grant_batch_recipient.
Verification
- pnpm -F @proj-airi/server typecheck: passes.
- pnpm exec eslint apps/server: 49 errors, identical to main baseline
(all are pre-existing node/prefer-global/buffer in envelope-crypto and
scripts/e2e-llm-router; untouched by this change).
- Vitest passes per-file; the 6 mockDB hook timeouts under full-parallel
run are the known pushSchema-per-worker infra cost, not a regression.
End-state of the multi-step KTD-5 / KTD-6 / U8 work. The knoway sidecar
is no longer reachable from server code; the router is required at boot
and now owns chat completions, TTS synthesis, and voice catalog listing.
Highlights:
- LLM_ROUTER_MASTER_KEY becomes required; app.ts drops the
graceful-skip branch, and the chat fallback fetch path is gone.
- /audio/speech and /audio/voices route through new routeTts /
listTtsVoices entries that reuse the chat key-rotator + per-attempt
timeout + abort propagation.
- DEFAULT_CHAT_MODEL / DEFAULT_TTS_MODEL move from env to configKV so
default-model swaps are hot-reloadable via Pub/Sub.
- GATEWAY_BASE_URL removed from env schema, .env, .env.local, smoke,
verification harness. Redis upstream-voices cache deleted — catalogs
come from in-process adapter JSON.
- routeTts splits the adapter error contract by ApiError statusCode:
4xx propagates without fallback; 5xx folds into the network-failure
fallback path. handleTTS wraps the billing call and the span attribute
write in try/finally to plug a span leak when ttsMeter.accumulate()
throws.
- seed-router-config.ts rewritten with --merge (default) / --reset /
--dry-run modes and env-var key handoff (OPENROUTER_KEY / AZURE_KEY /
DASHSCOPE_KEY) so prod seed flows never put plaintext on the CLI.
Adds DashScope CosyVoice seeding.
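The 4xx / 5xx split in routeTts can be illustrated with a minimal sketch.
It is not the actual router code: the ApiError class below is a local
stand-in that only mirrors the statusCode field, and shouldFallback,
routeWithFallback and TtsAttempt are hypothetical names chosen for the
example.

```typescript
// Local stand-in for the server's ApiError (assumption: it carries statusCode).
class ApiError extends Error {
  readonly statusCode: number

  constructor(statusCode: number, message: string) {
    super(message)
    this.statusCode = statusCode
    // Keeps instanceof working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, ApiError.prototype)
  }
}

// Hypothetical adapter shape: one upstream attempt resolving to audio bytes.
type TtsAttempt = () => Promise<Uint8Array>

// 4xx means the request itself is bad, so another upstream cannot help.
// 5xx and non-ApiError failures (network, timeout) are treated as transient.
function shouldFallback(err: unknown): boolean {
  if (err instanceof ApiError)
    return err.statusCode >= 500
  return true
}

// Tries each upstream in order; stops early on a non-retryable error.
async function routeWithFallback(attempts: TtsAttempt[]): Promise<Uint8Array> {
  let lastError: unknown = new Error('no upstream configured')
  for (const attempt of attempts) {
    try {
      return await attempt()
    }
    catch (err) {
      if (!shouldFallback(err))
        throw err
      lastError = err
    }
  }
  throw lastError
}
```

With this shape, an ApiError(422) from the first upstream surfaces
immediately, while an ApiError(502) or a thrown TypeError moves on to the
next upstream.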
Docs (CLAUDE.md, architecture-overview.md, transport-and-routes.md)
reflect the new boundary. verifications/llm-router.md replaces the
overstated "U1-U9 shipped" line with an evidence-vs-pending table.
Tests: full 40-file / 343-case server suite green. New regressions pin
ApiError 4xx → no-fallback, ApiError 5xx → fallback, TTS billing
failure → span closed and error propagated.
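The span-leak fix pinned by the last regression follows the usual
try/finally pattern. A rough sketch, assuming a span object with
setAttribute() and end() and a meter with an async accumulate(); the
SpanLike and TtsMeterLike interfaces, the billTts name and the
'tts.chars' attribute key are hypothetical, not the real handleTTS
signature:

```typescript
// Hypothetical minimal span interface, loosely modelled on a tracing span.
interface SpanLike {
  setAttribute(key: string, value: number): void
  end(): void
}

// Hypothetical meter shape; only the accumulate() name comes from the commit text.
interface TtsMeterLike {
  accumulate(userId: string, chars: number): Promise<void>
}

async function billTts(
  span: SpanLike,
  meter: TtsMeterLike,
  userId: string,
  chars: number,
): Promise<void> {
  try {
    // If this throws, the attribute is skipped and the error propagates.
    await meter.accumulate(userId, chars)
    span.setAttribute('tts.chars', chars)
  }
  finally {
    // Runs on both success and failure, so the span is never left open.
    span.end()
  }
}
```

The error is deliberately not caught: the caller still sees the billing
failure, and the finally block only guarantees that the span is closed.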
The Redis Stream `billing-events` + `worker` Railway role +
advisory-lock poller layered together didn't actually buy us reliability
— `debitFlux` swallowed XADD failures, leaving the door open to "balance
updated, ledger row never written". Collapse the whole thing back to:
`creditFlux` and `debitFlux` write `flux_transaction` ledger rows inline
within the same DB transaction that mutates `user_flux`, and
`(user_id, request_id)` remains the partial unique index that keeps
retries safe.
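Why the inline write closes the "balance updated, ledger row never
written" gap can be shown with a small in-memory model. This is only a
sketch: `InMemoryDb` is a hypothetical stand-in for the real database,
the uniqueness check imitates the `(user_id, request_id)` index, and
treating a duplicate as a no-op is an assumption about how retries are
handled, not a description of the actual `BillingService`.

```typescript
interface LedgerRow {
  userId: string
  requestId: string
  delta: number
}

// Hypothetical in-memory stand-in for the database.
class InMemoryDb {
  balances = new Map<string, number>()
  ledger: LedgerRow[] = []

  // Runs fn against a copy and commits only if fn returns normally,
  // so a throw leaves both balances and ledger untouched.
  transaction<T>(fn: (tx: InMemoryDb) => T): T {
    const tx = new InMemoryDb()
    tx.balances = new Map(this.balances)
    tx.ledger = [...this.ledger]
    const result = fn(tx)
    this.balances = tx.balances
    this.ledger = tx.ledger
    return result
  }
}

function debitFlux(
  db: InMemoryDb,
  userId: string,
  requestId: string,
  amount: number,
): { applied: boolean, balance: number } {
  return db.transaction((tx) => {
    const balance = tx.balances.get(userId) ?? 0
    // Stand-in for the unique index: a retried request changes nothing.
    if (tx.ledger.some(r => r.userId === userId && r.requestId === requestId))
      return { applied: false, balance }
    if (balance < amount)
      throw new Error('insufficient flux')
    // Balance update and ledger row commit together or not at all.
    tx.balances.set(userId, balance - amount)
    tx.ledger.push({ userId, requestId, delta: -amount })
    return { applied: true, balance: balance - amount }
  })
}
```

Because both writes sit in one transaction, there is no separate publish
step whose failure could be swallowed after the balance has already
changed.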
Concrete changes:
- Inline ledger inserts in `BillingService.{debitFlux, creditFlux,
creditFluxFromStripeCheckout, creditFluxFromInvoice}`; drop `billingMq`
and `publishEvent` plumbing entirely.
- `routes/openai/v1` writes `llm_request_log` synchronously via the
existing `requestLogService`; the duplicate `llm-request-log.ts` service
module is removed.
- `bin/run-worker.ts`, `libs/mq/*`,
`services/billing/billing-events.ts`,
`services/billing/billing-consumer-handler.ts`, and matching tests are
deleted. CLI now exposes only `api`.
- `BILLING_EVENTS_*` env vars and the `DEFAULT_BILLING_EVENTS_STREAM`
helper are dropped; `docker-compose.yml` no longer ships a worker
service.
- `docs/ai-context/{workers-and-runtime, billing-architecture,
redis-boundaries-and-pubsub, data-model-and-state,
architecture-overview, README}.md`, `CLAUDE.md`, and the existing
verification docs are updated to describe the single-process synchronous
pipeline.
Tests: 29 files / 247 cases pass. Production deployments need to drop
the worker Railway service after this lands.