free-claude-code/PLAN.md
Alishahryar1 f3a7528d49
Major refactor: API, providers, messaging, and Anthropic protocol
Consolidates the incremental refactor work into a single change set: modular web tools (api/web_tools), native Anthropic request building and SSE block policy, OpenAI conversion and error handling, provider transports and rate limiting, messaging handler and tree queue, safe logging, smoke tests, and broad test coverage.
2026-04-26 03:01:14 -07:00

# Architecture Plan
This document is the baseline architecture guide referenced by `AGENTS.md`. It
records the intended dependency direction and the migration target for keeping
the project modular as providers, clients, and smoke tests grow.
## Current Product Shape
`free-claude-code` is an Anthropic-compatible proxy with optional messaging
workers:
- `api/` owns the HTTP routes, request orchestration, model routing, auth, and
server lifecycle.
- `providers/` owns upstream model adapters, request conversion, stream
conversion, provider rate limiting, and provider error mapping.
- `messaging/` owns Discord and Telegram adapters, command handling, tree
threading, session persistence, transcript rendering, and voice intake.
- `cli/` owns package entrypoints and managed Claude CLI subprocess sessions.
- `config/` owns environment-backed settings and logging setup.
- `smoke/` owns opt-in product smoke scenarios and the public coverage
inventory used by contract tests.
## Intended Dependency Direction
The repo should preserve this dependency order:
```mermaid
flowchart TD
config[config] --> api[api]
config --> providers[providers]
config --> messaging[messaging]
core[core.anthropic] --> api
core --> providers
core --> messaging
providers --> api
api --> cli[cli]
api --> messaging
cli --> messaging
```
Runtime note: `api.runtime` imports `cli` and `messaging` to wire the optional
messaging stack; `messaging` does not import `cli` (session/CLI access is passed
in from `api.runtime`).
The practical rule is simpler than the graph: shared protocol helpers belong in
neutral core modules, not under a provider package. Provider adapters may depend
on the neutral protocol layer, but API and messaging code should not import
provider internals.
The diagram above mixes **Python import direction** (e.g. `config` → `providers`)
with **runtime composition** (e.g. `api.runtime` constructs `cli` and `messaging`).
`PLAN.md` remains the product map; **encoded** rules (including root imports like
`import api`, relative imports, and `api` → `providers` facade allowlists) live in
`tests/contracts/test_import_boundaries.py`.
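
A facade allowlist of this kind can be checked statically. The following is a
minimal sketch, not the contents of `tests/contracts/test_import_boundaries.py`:
it assumes the three-module allowlist this document gives for `api/`, and the
helper names `imported_names` and `forbidden_provider_imports` are hypothetical.
It only inspects absolute imports; relative imports would need separate handling.

```python
import ast

# Allowlist for `api/`, as stated in this document's contract highlights.
ALLOWED_PROVIDER_MODULES = {
    "providers.base",
    "providers.exceptions",
    "providers.registry",
}


def imported_names(source: str) -> set[str]:
    """Return dotted names for every absolute import in `source`.

    `from providers import base` is recorded as `providers.base`, so that
    importing a submodule through the package root is still visible.
    """
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names.update(f"{node.module}.{alias.name}" for alias in node.names)
    return names


def forbidden_provider_imports(source: str) -> set[str]:
    """Return provider imports that fall outside the allowlist."""
    forbidden: set[str] = set()
    for name in imported_names(source):
        if name != "providers" and not name.startswith("providers."):
            continue  # not a providers import at all
        allowed = any(
            name == module or name.startswith(module + ".")
            for module in ALLOWED_PROVIDER_MODULES
        )
        if not allowed:
            forbidden.add(name)
    return forbidden
```

A contract test could read each file under `api/`, pass its text to
`forbidden_provider_imports`, and assert that the result is empty. A bare
`import providers` is treated as a violation here, which is a conservative
choice made for this sketch.
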
**Contract highlights:**
- `api/` may import only `providers.base`, `providers.exceptions`, and
`providers.registry` from the providers package (not per-adapter modules).
- `core/` stays free of `api`, `messaging`, `cli`, `providers`, `config`, and
`smoke`.
- `messaging/` does not import `api`, `cli`, or `smoke`, and may import
`providers` only via `providers.nvidia_nim.voice` (NVIDIA/Riva offline ASR).
- Stream contract helpers live in `core/anthropic/stream_contracts.py`; live
smoke imports that module directly (no dedicated smoke SSE shim).
- NVIDIA NIM chat tuning uses the canonical `config.nim.NimSettings` model on
`Settings`; `providers.registry` passes `settings.nim` into `NvidiaNimProvider`
without a duplicate schema.
- Default upstream base URLs use a single constant per endpoint in
`providers/defaults.py` (e.g. `NVIDIA_NIM_DEFAULT_BASE`).
- Process-cached provider helpers (`api.dependencies.get_provider` /
`get_provider_for_type`) exist for scripts and unit tests; production HTTP
handlers must use `resolve_provider` with `request.app` so the app-scoped
`ProviderRegistry` is used.
- The `api` package `__all__` exposes HTTP models and `create_app` only (not
`app`, and not the process-cached helpers).
- `api.app:create_app` is the ASGI factory (e.g.
`uvicorn api.app:create_app --factory`); `server.py` still exposes `server:app`
as a module-level instance for convenience.
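
The difference between an app-scoped registry and a process-wide cache can be
illustrated without any web framework. This is a hedged sketch only: the classes
and signatures below are hypothetical stand-ins that borrow the names
`ProviderRegistry`, `create_app`, and `resolve_provider` from this document, and
they do not reproduce the project's real interfaces.

```python
from types import SimpleNamespace


class ProviderRegistry:
    """Hypothetical stand-in: caches one provider object per provider type."""

    def __init__(self, factory):
        self._factory = factory
        self._providers = {}

    def get(self, provider_type: str):
        # Build the provider on first use, then reuse it for this registry.
        if provider_type not in self._providers:
            self._providers[provider_type] = self._factory(provider_type)
        return self._providers[provider_type]


def create_app(factory=lambda provider_type: object()):
    """Build a fresh app object that owns its own registry.

    No module-level state is involved, so two apps never share providers.
    """
    state = SimpleNamespace(provider_registry=ProviderRegistry(factory))
    return SimpleNamespace(state=state)


def resolve_provider(app, provider_type: str):
    """Look the provider up through the registry attached to `app`."""
    return app.state.provider_registry.get(provider_type)
```

Under these assumptions, two calls on the same app return the same provider
object, while two apps created by the factory get independent providers. That
isolation is what a process-cached helper cannot offer, which is one plausible
reason to keep such helpers out of production handlers.
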
## Target Boundaries
- `core/anthropic/`: Anthropic protocol helpers, stream primitives, content
extraction, token estimation, user-facing error strings, request conversion,
thinking, tool helpers, and stream contract assertions
(`stream_contracts.py`) shared across API, providers, messaging, and tests.
- `api/runtime.py`: application composition, optional messaging startup,
session store restoration, and cleanup ownership.
- `providers/`: provider descriptors, credential resolution, transport
factories, scoped rate limiters, upstream request builders, and stream
transformers.
- `messaging/`: platform-neutral orchestration split from command dispatch,
rendering, voice handling, and persistence.
- `cli/`: typed Claude CLI runner config, subprocess management, and packaged
user-facing entrypoints.
## Smoke Coverage Policy
Default CI stays deterministic and runs `uv run pytest` against `tests/`.
Product smoke lives under `smoke/` and is enabled with `FCC_LIVE_SMOKE=1`.
Smoke runs should use `-n 0` unless a scenario is explicitly known to be safe
under xdist.
Live smoke has two valid skip classes:
- `missing_env`: credentials, local services, binaries, or explicit opt-in flags
are absent.
- `upstream_unavailable`: real providers, bot APIs, or local model servers are
unreachable.
`product_failure` and `harness_bug` are regressions. When a provider is
explicitly selected by `FCC_SMOKE_PROVIDER_MATRIX`, missing configuration should
fail instead of being silently skipped.
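
The skip policy above can be sketched as a few small helpers. This is an
illustration under stated assumptions, not the harness in `smoke/`: the function
names are hypothetical, and the sketch assumes `FCC_SMOKE_PROVIDER_MATRIX` holds
a comma-separated list of provider names, which this document does not specify.

```python
VALID_SKIP_CLASSES = {"missing_env", "upstream_unavailable"}
REGRESSION_CLASSES = {"product_failure", "harness_bug"}


def live_smoke_enabled(env: dict[str, str]) -> bool:
    """Live smoke is opt-in: it runs only when FCC_LIVE_SMOKE is exactly "1"."""
    return env.get("FCC_LIVE_SMOKE") == "1"


def selected_providers(env: dict[str, str]) -> set[str]:
    """Parse the provider matrix (assumed here to be comma-separated)."""
    raw = env.get("FCC_SMOKE_PROVIDER_MATRIX", "")
    return {item.strip() for item in raw.split(",") if item.strip()}


def missing_config_outcome(provider: str, env: dict[str, str]) -> str:
    """Decide what missing configuration means for one provider.

    An explicitly selected provider must fail; any other provider is
    skipped with the `missing_env` class.
    """
    if provider in selected_providers(env):
        return "fail"
    return "skip:missing_env"


def is_regression(outcome_class: str) -> bool:
    """`product_failure` and `harness_bug` count as regressions."""
    return outcome_class in REGRESSION_CLASSES
```

In a pytest-based harness, the `"skip:missing_env"` result would typically map
to `pytest.skip(...)` and `"fail"` to `pytest.fail(...)`; passing `os.environ`
as `env` keeps the helpers easy to test with plain dictionaries.
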
## Refactor Rules
- Keep public request/response shapes stable while moving internals.
- Complete module migrations in one change: update imports to the new owner and
remove old compatibility shims unless preserving a published interface is
explicitly required.
- Lock behavior with focused tests before moving shared protocol or runtime
code.
- Run checks in this order: `uv run ruff format`, `uv run ruff check`,
`uv run ty check`, `uv run pytest`.
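
The check order in the last rule can be scripted so that a failure stops the
sequence early. This is a minimal sketch: the `run_checks` helper is
hypothetical and not part of the repository, and the commands are copied from
the rule above without extra arguments.

```python
import subprocess

# Commands in the order given by the refactor rules.
CHECKS = [
    ["uv", "run", "ruff", "format"],
    ["uv", "run", "ruff", "check"],
    ["uv", "run", "ty", "check"],
    ["uv", "run", "pytest"],
]


def run_checks(run=subprocess.call) -> int:
    """Run each check in order and stop at the first non-zero exit code.

    `run` is injectable so the ordering logic can be tested without
    invoking the real tools.
    """
    for command in CHECKS:
        exit_code = run(command)
        if exit_code != 0:
            return exit_code
    return 0
```

Calling `run_checks()` from the repository root runs the four commands and
returns the first failing exit code, or `0` when every check passes.
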