Why
- src/services/ was an unordered mix of single-file services and module
directories with no shared classification axis, plus several long-dead
admin batch helpers that survived the move to the simpler synchronous
admin-flux-grants flow.
What
- services/ now has two top-level layers:
domain/ — DB state + business rules (billing, characters, chats,
flux, flux-transaction, llm-router, providers, request-log,
stripe, user-deletion, admin/{flux-grants,router-config})
adapters/ — thin wrappers over external SDKs / infra (config-kv, email,
posthog, tts/)
- admin/* moved under domain/admin/ with consistent plural names
(flux-grants, router-config).
- tts-adapters/ collapsed to adapters/tts/ (no redundant -adapters suffix
once nested under adapters/).
- 63 src files + scripts/e2e-llm-router.ts + tests/verifications/_harness.ts
had relative imports rewritten; git mv preserves blame.
- apps/server/CLAUDE.md and docs/ai-context/*.md updated to match new paths.
Dead code removed
- services/admin-flux-grant-batches/ (service + worker + tests, 1090 LOC) —
superseded by admin-flux-grants and never wired into app.ts.
- routes/admin/flux-grant-batches/ — same.
- utils/redis-compressed.ts + test — zero production call sites.
- llm-router/index.ts re-exports trimmed from 26 to 6; only symbols with
external consumers are kept.
Intentionally kept
- schemas/flux-grant-batch.ts and its schemas/index.ts export remain so the
drizzle-kit generate diff stays empty. Removing them is a separate PR
that owns the drop-table migration for flux_grant_batch /
flux_grant_batch_recipient.
Verification
- pnpm -F @proj-airi/server typecheck: passes.
- pnpm exec eslint apps/server: 49 errors, identical to main baseline
(all are pre-existing node/prefer-global/buffer in envelope-crypto and
scripts/e2e-llm-router; untouched by this change).
- Vitest passes per-file; the 6 mockDB hook timeouts under full-parallel
run are the known pushSchema-per-worker infra cost, not a regression.
Server CLAUDE.md
Agent-facing guide for apps/server. Detailed topic docs live in docs/ai-context/ — read the relevant file before modifying that area.
Overview
Hono-based Node.js backend. Owns auth, billing, chat sync, LLM gateway forwarding, and observability. Multi-instance deployed on Railway — design all features assuming N>1 instances sharing the same Postgres and Redis.
Deployment Model
- Hosted on Railway, multiple instances behind a load balancer.
- Single CLI role: api (see src/bin/run.ts). No background polling loops, no fire-and-forget tasks; every write happens inside the request thread.
- Stateless per-instance: no local state that matters across requests.
- Cross-instance coordination via Redis Pub/Sub (WebSocket broadcast). DB-level idempotency, via the (userId, requestId) partial unique index on flux_transaction, covers retries.
- Rate limiting is currently in-memory (not distributed); keep this in mind when adding rate-sensitive features.
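The retry guarantee in the list above can be illustrated with a minimal in-memory sketch. The Map below only stands in for the (userId, requestId) unique index on flux_transaction; InMemoryLedger, debit, and balanceOf are hypothetical names, and the "partial" aspect of the real index is not modelled. In the real server this check is enforced by Postgres, not by application code.

```typescript
// Hypothetical stand-in for the flux_transaction ledger plus user balances.
interface LedgerRow {
  userId: string
  requestId: string
  amount: number
}

class InMemoryLedger {
  // Keyed by (userId, requestId), mimicking a unique index.
  private rows = new Map<string, LedgerRow>()
  private balances = new Map<string, number>()

  constructor(initial: Record<string, number>) {
    for (const [userId, balance] of Object.entries(initial))
      this.balances.set(userId, balance)
  }

  // Returns true when the debit was applied, false when it was a retry
  // of a request that had already been recorded.
  debit(userId: string, requestId: string, amount: number): boolean {
    const key = JSON.stringify([userId, requestId])
    if (this.rows.has(key))
      return false // "unique violation": treat the retry as a no-op

    const balance = this.balances.get(userId) ?? 0
    if (balance < amount)
      throw new Error('insufficient balance')

    // Balance and ledger row change together, like a single transaction.
    this.rows.set(key, { userId, requestId, amount })
    this.balances.set(userId, balance - amount)
    return true
  }

  balanceOf(userId: string): number {
    return this.balances.get(userId) ?? 0
  }
}
```

Because the duplicate is rejected at the storage layer, any instance can safely receive the retry; no cross-instance lock is needed.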
Tech Stack
Hono, Better Auth (OIDC provider, RS256 JWT), Drizzle ORM, PostgreSQL, Redis, Stripe, OpenTelemetry, Valibot, injeca (DI), tsx.
Commands
pnpm -F @proj-airi/server dev # dev with dotenvx (.env.local)
pnpm -F @proj-airi/server typecheck
pnpm -F @proj-airi/server exec vitest run # all server tests
pnpm exec vitest run apps/server/src/... # single test file
pnpm -F @proj-airi/server db:generate # drizzle-kit generate
pnpm -F @proj-airi/server db:push # drizzle-kit push
pnpm -F @proj-airi/server auth:generate # better-auth → src/schemas/accounts.ts
Local observability: docker compose -f apps/server/docker-compose.otel.yml up -d
Architecture Summary
Entry & DI: src/app.ts (createApp()) → logger, env, OTel, Postgres/Redis, DB migrations, services via injeca, routes/middleware. CLI entry src/bin/run.ts.
Layering:
- Routes (src/routes/): thin. They handle param validation (Valibot), auth guards, and error mapping. No business logic here.
- Services (src/services/): core business logic and DB transactions.
- Schemas (src/schemas/): Drizzle table definitions. Migrations in @proj-airi/server-schema.
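The route/service split above might look roughly like the following dependency-free sketch. It does not use Hono or Valibot: parseRenameBody is a hand-written stand-in for a schema parse, and CharacterService, renameCharacterRoute, and the INVALID_BODY code are hypothetical names chosen for illustration.

```typescript
// Hypothetical service-layer contract: business rules and DB access live behind it.
interface CharacterService {
  rename: (id: string, name: string) => { id: string, name: string }
}

interface RouteResult {
  status: number
  body: unknown
}

// Stand-in for a Valibot schema parse: returns a typed value or null.
function parseRenameBody(input: unknown): { name: string } | null {
  if (typeof input !== 'object' || input === null)
    return null
  const name = (input as Record<string, unknown>).name
  if (typeof name !== 'string' || name.trim().length === 0)
    return null
  return { name: name.trim() }
}

// The route only validates input and maps the outcome to a response;
// every decision about the character itself is delegated to the service.
function renameCharacterRoute(
  service: CharacterService,
  id: string,
  rawBody: unknown,
): RouteResult {
  const body = parseRenameBody(rawBody)
  if (body === null)
    return { status: 400, body: { error: 'INVALID_BODY' } }
  return { status: 200, body: service.rename(id, body.name) }
}
```

Keeping the handler this small means the service can be unit-tested without any HTTP machinery.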
Middleware chain (/api/*): CORS → hono/logger → optional otel → sessionMiddleware → bodyLimit(1MB) → per-route guards. WebSocket /ws/chat registered before bodyLimit.
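The ordering matters because a session lookup that never rejects must run before any guard that does. Here is a simplified, synchronous sketch of that chain (real Hono middleware is async, and the optional otel step is omitted). compose, named, Ctx, and the 'valid-token' check are hypothetical; only the step order and the sessionMiddleware/authGuard behaviour come from this guide.

```typescript
interface Ctx {
  userId?: string
  status?: number
  trace: string[] // records which steps ran, for illustration only
}
type Next = () => void
type Middleware = (ctx: Ctx, next: Next) => void

// Runs middleware in order; a step that does not call next() ends the chain.
function compose(chain: Middleware[]): (ctx: Ctx) => void {
  return (ctx) => {
    const dispatch = (i: number): void => {
      const mw = chain[i]
      if (mw)
        mw(ctx, () => dispatch(i + 1))
    }
    dispatch(0)
  }
}

// Pass-through placeholder for steps whose logic is not modelled here.
const named = (name: string): Middleware => (ctx, next) => {
  ctx.trace.push(name)
  next()
}

// Fills the context when a session exists, but never rejects the request.
const sessionMiddleware = (token: string | undefined): Middleware => (ctx, next) => {
  ctx.trace.push('session')
  if (token === 'valid-token')
    ctx.userId = 'user-1'
  next()
}

// Per-route guard: this is the step that actually blocks anonymous callers.
const authGuard: Middleware = (ctx, next) => {
  ctx.trace.push('authGuard')
  if (ctx.userId === undefined) {
    ctx.status = 401
    return
  }
  next()
}

const handler: Middleware = (ctx) => {
  ctx.trace.push('handler')
  ctx.status = 200
}

function handleRequest(token: string | undefined): Ctx {
  const ctx: Ctx = { trace: [] }
  compose([
    named('cors'),
    named('logger'),
    sessionMiddleware(token),
    named('bodyLimit'),
    authGuard,
    handler,
  ])(ctx)
  return ctx
}
```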
Error model: ApiError(statusCode, errorCode, message, details) in src/utils/error.ts.
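As a rough illustration of that shape, the sketch below defines an error class with the same four constructor arguments and a helper that maps it to a response. The actual class in src/utils/error.ts may differ; the field types, toErrorResponse, and the INTERNAL_ERROR / INSUFFICIENT_FLUX codes are assumptions made for this example.

```typescript
// Sketch only: argument order follows ApiError(statusCode, errorCode, message, details).
class ApiError extends Error {
  readonly statusCode: number
  readonly errorCode: string
  readonly details?: unknown

  constructor(statusCode: number, errorCode: string, message: string, details?: unknown) {
    super(message)
    this.name = 'ApiError'
    this.statusCode = statusCode
    this.errorCode = errorCode
    this.details = details
  }
}

interface ErrorResponse {
  status: number
  body: { error: string, message: string, details?: unknown }
}

// Hypothetical mapping step a route layer could apply to anything thrown below it.
function toErrorResponse(err: unknown): ErrorResponse {
  if (err instanceof ApiError) {
    return {
      status: err.statusCode,
      body: { error: err.errorCode, message: err.message, details: err.details },
    }
  }
  // Unknown failures are not leaked to the client.
  return { status: 500, body: { error: 'INTERNAL_ERROR', message: 'Internal server error' } }
}
```

A service could then throw, for example, new ApiError(402, 'INSUFFICIENT_FLUX', 'Not enough flux'), and the route layer would turn it into a structured response without knowing why it failed.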
Key Design Decisions
- Flux read/write separation: FluxService reads (Redis cache-aside), BillingService writes (single Postgres tx that mutates user_flux and writes the matching flux_transaction ledger row). Never put write-balance logic in flux.ts.
- No async billing pipeline: debits and credits update balance + ledger in one transaction. The (user_id, request_id) partial unique index gives DB-level idempotency for retries; LLM request log rows are written best-effort right after the response is delivered.
- In-process LLM/TTS router: /api/v1/openai is dispatched by services/domain/llm-router reading LLM_ROUTER_CONFIG (per-model upstream chain + envelope-encrypted keys). chat/completions walks LLM upstreams with key fallback; audio/speech delegates to a TTS adapter (azure / dashscope-cosyvoice / volcengine); audio/voices returns the adapter's compiled-in catalog. Server handles auth/billing/logging, not model execution.
- Redis is cache + pub/sub, not truth: balance cache, app_settings read cache, WebSocket cross-instance pub/sub. Truth is always Postgres.
- Auth: Better Auth + OIDC. sessionMiddleware fills context but doesn't block; authGuard returns 401.
- Multi-instance safe: all writes go through Postgres transactions; cross-instance messaging uses Redis Pub/Sub. No async work, no in-process singletons; admin flux grants happen synchronously inside the POST that triggered them.
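The read/write separation and the "Redis is not truth" rule can be sketched with two in-memory maps. Here db plays Postgres and cache plays Redis; Store, readBalance, and debitBalance are hypothetical names. Invalidating the cache entry after a write is an assumption of this sketch: the real BillingService might refresh the cached value instead.

```typescript
interface Store {
  db: Map<string, number> // Postgres stand-in: the source of truth
  cache: Map<string, number> // Redis stand-in: disposable read cache
  dbReads: number // counts cache misses, for illustration only
}

// Read side (FluxService role): cache-aside lookup, never mutates the balance.
function readBalance(store: Store, userId: string): number {
  const cached = store.cache.get(userId)
  if (cached !== undefined)
    return cached

  store.dbReads += 1
  const balance = store.db.get(userId) ?? 0
  store.cache.set(userId, balance)
  return balance
}

// Write side (BillingService role): decides from the truth store, not the cache,
// then drops the cached value so the next read repopulates it.
function debitBalance(store: Store, userId: string, amount: number): number {
  const current = store.db.get(userId) ?? 0
  if (current < amount)
    throw new Error('insufficient flux')

  const next = current - amount
  store.db.set(userId, next)
  store.cache.delete(userId)
  return next
}
```

Losing the cache at any point only costs one extra read from the truth store, which is why it is safe to treat Redis as disposable.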
Detailed Context Docs
See docs/ai-context/README.md for the full index. Key files:
- architecture-overview.md: entry, DI, assembly, boundaries
- transport-and-routes.md: API surface, route→service mapping
- data-model-and-state.md: tables, state ownership, caching
- billing-architecture.md: Flux/Stripe ledger
- redis-boundaries-and-pubsub.md: Redis key/channel boundaries
- auth-and-oidc.md: auth flows, OIDC, trusted clients
- config-and-naming-conventions.md: configKV, naming rules
- workers-and-runtime.md: single api role, no background loops, no fire-and-forget; everything is synchronous in-request
- admin-flux-grants.md: synchronous one-shot flux grant endpoint (no batch tables, no state machine)
- observability-conventions.md: OTel naming, custom attributes