Architecture Plan
This document is the baseline architecture guide referenced by AGENTS.md. It
records the intended dependency direction and the migration target for keeping
the project modular as providers, clients, and smoke tests grow.
Current Product Shape
free-claude-code is an Anthropic-compatible proxy with optional messaging
workers:
- api/ owns the HTTP routes, request orchestration, model routing, auth, and server lifecycle.
- providers/ owns upstream model adapters, request conversion, stream conversion, provider rate limiting, and provider error mapping.
- messaging/ owns Discord and Telegram adapters, command handling, tree threading, session persistence, transcript rendering, and voice intake.
- cli/ owns package entrypoints and managed Claude CLI subprocess sessions.
- config/ owns environment-backed settings and logging setup.
- smoke/ owns opt-in product smoke scenarios and the public coverage inventory used by contract tests.
Intended Dependency Direction
The repo should preserve this dependency order:
flowchart TD
config[config] --> api[api]
config --> providers[providers]
config --> messaging[messaging]
core[core.anthropic] --> api
core --> providers
core --> messaging
providers --> api
api --> cli[cli]
api --> messaging
cli --> messaging
Runtime note: api.runtime imports cli and messaging to wire the optional
messaging stack; messaging does not import cli (session/CLI access is passed
in from api.runtime).
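As a rough illustration, the edges in the flowchart can be written down as data and checked for cycles. This is a minimal sketch, not code from the repository: `EDGES`, `layering_order`, and `is_acyclic` are hypothetical names, and the sketch treats each arrow only as an ordering constraint without deciding whether it denotes an import or runtime composition.

```python
from graphlib import CycleError, TopologicalSorter

# Edges copied from the flowchart as (source, target) pairs.
EDGES = [
    ("config", "api"), ("config", "providers"), ("config", "messaging"),
    ("core", "api"), ("core", "providers"), ("core", "messaging"),
    ("providers", "api"), ("api", "cli"), ("api", "messaging"),
    ("cli", "messaging"),
]


def layering_order(edges):
    """Return one ordering in which every source precedes its targets."""
    predecessors = {}
    for source, target in edges:
        predecessors.setdefault(source, set())
        predecessors.setdefault(target, set()).add(source)
    # static_order() raises CycleError when the edges contain a cycle.
    return list(TopologicalSorter(predecessors).static_order())


def is_acyclic(edges):
    """Report whether the edge list can be ordered at all."""
    try:
        layering_order(edges)
    except CycleError:
        return False
    return True


ORDER = layering_order(EDGES)
```

With the edges above, `ORDER` places config and core ahead of providers and api, and messaging last. Adding a back edge such as `("messaging", "config")` makes `is_acyclic` return `False`.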
The practical rule is simpler than the graph: shared protocol helpers belong in neutral core modules, not under a provider package. Provider adapters may depend on the neutral protocol layer, but API and messaging code should not import provider internals.
The diagram above mixes Python import direction (e.g. config → providers)
with runtime composition (e.g. api.runtime constructs cli and messaging).
PLAN.md remains the product map; encoded rules (including root imports like
import api, relative imports, and api → providers facade allowlists) live in
tests/contracts/test_import_boundaries.py.
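A facade allowlist of this kind can be checked statically by parsing source files and inspecting their import statements. The following is a minimal sketch under stated assumptions, not the contents of tests/contracts/test_import_boundaries.py: the helper names are hypothetical, relative imports are ignored, and the allowlist holds the three facade modules this document names for api/.

```python
import ast

# Facade modules this document says api/ may import from the providers package.
ALLOWED_PROVIDER_IMPORTS = (
    "providers.base",
    "providers.exceptions",
    "providers.registry",
)


def provider_imports(source: str) -> set[str]:
    """Collect absolute imports of the providers package found in source code."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] == "providers":
                    found.add(alias.name)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            if node.module == "providers":
                # "from providers import base" names the submodule in the alias.
                found.update(f"providers.{alias.name}" for alias in node.names)
            elif node.module.split(".")[0] == "providers":
                found.add(node.module)
    return found


def is_allowed(module: str) -> bool:
    """Accept an allowlisted module or anything nested under one."""
    return any(
        module == ok or module.startswith(ok + ".")
        for ok in ALLOWED_PROVIDER_IMPORTS
    )


def violations(source: str) -> list[str]:
    """Return the providers imports in source that fall outside the allowlist."""
    return sorted(m for m in provider_imports(source) if not is_allowed(m))


SAMPLE = (
    "from providers.registry import ProviderRegistry\n"
    "from providers import base\n"
    "import providers.nvidia_nim.voice\n"
)
```

Here `violations(SAMPLE)` reports only `providers.nvidia_nim.voice`, since the other two imports resolve to allowlisted facade modules. A real contract test would walk every file under api/ and apply a different allowlist per package.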
Contract highlights: api/ may import only providers.base, providers.exceptions,
and providers.registry from the providers package (not per-adapter modules).
core/ stays free of api, messaging, cli, providers, config, and smoke.
messaging/ does not import api, cli, or smoke, and may import providers
only via providers.nvidia_nim.voice (NVIDIA/Riva offline ASR). Stream contract
helpers live in core/anthropic/stream_contracts.py; live smoke imports that
module directly (no dedicated smoke SSE shim). NVIDIA NIM chat tuning uses the
canonical config.nim.NimSettings model on Settings; providers.registry
passes settings.nim into NvidiaNimProvider without a duplicate schema.
Default upstream base URLs use a single constant per endpoint in
providers/defaults.py (e.g. NVIDIA_NIM_DEFAULT_BASE). Process-cached provider
helpers (api.dependencies.get_provider / get_provider_for_type) exist for
scripts and unit tests; production HTTP handlers must use resolve_provider with
request.app so the app-scoped ProviderRegistry is used. The api package's
__all__ exposes only the HTTP models and create_app (not app, and not those helpers).
api.app:create_app is the ASGI factory (e.g. uvicorn api.app:create_app --factory);
server.py still exposes server:app as a module-level instance for convenience.
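The difference between a process-cached helper and an app-scoped lookup can be illustrated without any web framework. This is a sketch only: the classes and signatures below are hypothetical stand-ins that borrow the names used above, and the real `resolve_provider`, `get_provider_for_type`, and `ProviderRegistry` may look quite different.

```python
from functools import lru_cache
from types import SimpleNamespace


class ProviderRegistry:
    """Hypothetical stand-in: builds each provider once per registry."""

    def __init__(self):
        self._providers = {}

    def get(self, provider_type):
        if provider_type not in self._providers:
            # Placeholder object standing in for a real provider adapter.
            self._providers[provider_type] = object()
        return self._providers[provider_type]


@lru_cache(maxsize=None)
def get_provider_for_type(provider_type):
    """Process-cached: one instance per process, shared by every app."""
    return object()


def resolve_provider(provider_type, app):
    """App-scoped: reads the registry attached to this app instance."""
    return app.state.provider_registry.get(provider_type)


def make_app():
    """Hypothetical app object exposing only a state namespace."""
    return SimpleNamespace(
        state=SimpleNamespace(provider_registry=ProviderRegistry())
    )


app_a, app_b = make_app(), make_app()
same_app = resolve_provider("nvidia_nim", app_a) is resolve_provider("nvidia_nim", app_a)
across_apps = resolve_provider("nvidia_nim", app_a) is resolve_provider("nvidia_nim", app_b)
```

In this sketch `same_app` is `True` and `across_apps` is `False`: two app instances built by the factory never share provider state, which a process-wide cache cannot guarantee.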
Target Boundaries
- core/anthropic/: Anthropic protocol helpers, stream primitives, content extraction, token estimation, user-facing error strings, request conversion, thinking, tool helpers, and stream contract assertions (stream_contracts.py) shared across API, providers, messaging, and tests.
- api/runtime.py: application composition, optional messaging startup, session store restoration, and cleanup ownership.
- providers/: provider descriptors, credential resolution, transport factories, scoped rate limiters, upstream request builders, and stream transformers.
- messaging/: platform-neutral orchestration split from command dispatch, rendering, voice handling, and persistence.
- cli/: typed Claude CLI runner config, subprocess management, and packaged user-facing entrypoints.
Smoke Coverage Policy
Default CI stays deterministic and runs uv run pytest against tests/.
Product smoke lives under smoke/ and is enabled with FCC_LIVE_SMOKE=1.
Smoke runs should use -n 0 unless a scenario is explicitly known to be safe
under xdist.
Live smoke has two valid skip classes:
- missing_env: credentials, local services, binaries, or explicit opt-in flags are absent.
- upstream_unavailable: real providers, bot APIs, or local model servers are unreachable.
product_failure and harness_bug are regressions. When a provider is
explicitly selected by FCC_SMOKE_PROVIDER_MATRIX, missing configuration should
fail instead of being silently skipped.
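The skip policy above can be summarized as a small decision function. This is a minimal sketch rather than the smoke harness itself: `smoke_verdict` and `explicitly_selected` are hypothetical names, and the comma-separated format assumed for FCC_SMOKE_PROVIDER_MATRIX is a guess, since this document does not specify it.

```python
import os

SKIP_CLASSES = {"missing_env", "upstream_unavailable"}
FAILURE_CLASSES = {"product_failure", "harness_bug"}


def explicitly_selected(provider, environ=None):
    """Check whether a provider is named in FCC_SMOKE_PROVIDER_MATRIX.

    Assumption: the variable holds a comma-separated list of provider names.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("FCC_SMOKE_PROVIDER_MATRIX", "")
    return provider in {name.strip() for name in raw.split(",") if name.strip()}


def smoke_verdict(outcome_class, provider, environ=None):
    """Map an outcome class to "skip" or "fail" under the policy above."""
    if outcome_class in FAILURE_CLASSES:
        return "fail"
    if outcome_class == "missing_env" and explicitly_selected(provider, environ):
        # An explicitly requested provider must be configured.
        return "fail"
    if outcome_class in SKIP_CLASSES:
        return "skip"
    raise ValueError(f"unknown outcome class: {outcome_class!r}")
```

For example, `smoke_verdict("missing_env", "nvidia_nim", {})` yields `"skip"`, while the same call with `{"FCC_SMOKE_PROVIDER_MATRIX": "nvidia_nim"}` yields `"fail"`.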
Refactor Rules
- Keep public request/response shapes stable while moving internals.
- Complete module migrations in one change: update imports to the new owner and remove old compatibility shims unless preserving a published interface is explicitly required.
- Lock behavior with focused tests before moving shared protocol or runtime code.
- Run checks in this order: uv run ruff format, uv run ruff check, uv run ty check, uv run pytest.
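The check order can also be scripted so that the sequence stops at the first failure. This is a sketch with a hypothetical helper, `run_checks`; the repository may already provide its own task runner, and stopping at the first failure is a design choice made here rather than something this document requires.

```python
import subprocess

# The four checks, in the order listed above.
CHECKS = [
    ["uv", "run", "ruff", "format"],
    ["uv", "run", "ruff", "check"],
    ["uv", "run", "ty", "check"],
    ["uv", "run", "pytest"],
]


def run_checks(run=None):
    """Run each check in order; return the first failing command, or None.

    `run` takes a command list and returns its exit code. It defaults to
    executing the command with subprocess, and can be replaced in tests.
    """
    if run is None:
        run = lambda cmd: subprocess.run(cmd).returncode
    for cmd in CHECKS:
        if run(cmd) != 0:
            return cmd
    return None
```

Calling `run_checks()` from the repository root executes the real commands. Passing a stub such as `run_checks(lambda cmd: 0)` exercises the ordering without invoking uv.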