# omniroute โ€” Codebase Documentation (Nederlands) ๐ŸŒ **Languages:** ๐Ÿ‡บ๐Ÿ‡ธ [English](../../../../docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ช๐Ÿ‡ธ [es](../../es/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ซ๐Ÿ‡ท [fr](../../fr/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ฉ๐Ÿ‡ช [de](../../de/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ฎ๐Ÿ‡น [it](../../it/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ท๐Ÿ‡บ [ru](../../ru/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡จ๐Ÿ‡ณ [zh-CN](../../zh-CN/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ฏ๐Ÿ‡ต [ja](../../ja/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ฐ๐Ÿ‡ท [ko](../../ko/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ธ๐Ÿ‡ฆ [ar](../../ar/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ฎ๐Ÿ‡ณ [hi](../../hi/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ฎ๐Ÿ‡ณ [in](../../in/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡น๐Ÿ‡ญ [th](../../th/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ป๐Ÿ‡ณ [vi](../../vi/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ฎ๐Ÿ‡ฉ [id](../../id/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ฒ๐Ÿ‡พ [ms](../../ms/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ณ๐Ÿ‡ฑ [nl](../../nl/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ต๐Ÿ‡ฑ [pl](../../pl/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ธ๐Ÿ‡ช [sv](../../sv/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ณ๐Ÿ‡ด [no](../../no/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ฉ๐Ÿ‡ฐ [da](../../da/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ซ๐Ÿ‡ฎ [fi](../../fi/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ต๐Ÿ‡น [pt](../../pt/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ท๐Ÿ‡ด [ro](../../ro/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ญ๐Ÿ‡บ [hu](../../hu/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ง๐Ÿ‡ฌ [bg](../../bg/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ธ๐Ÿ‡ฐ [sk](../../sk/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡บ๐Ÿ‡ฆ [uk-UA](../../uk-UA/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ฎ๐Ÿ‡ฑ [he](../../he/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ต๐Ÿ‡ญ [phi](../../phi/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡ง๐Ÿ‡ท [pt-BR](../../pt-BR/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡จ๐Ÿ‡ฟ [cs](../../cs/docs/CODEBASE_DOCUMENTATION.md) ยท ๐Ÿ‡น๐Ÿ‡ท [tr](../../tr/docs/CODEBASE_DOCUMENTATION.md) --- > A comprehensive, beginner-friendly guide to the **omniroute** multi-provider AI proxy router. --- ## 1. What Is omniroute? omniroute is a **proxy router** that sits between AI clients (Claude CLI, Codex, Cursor IDE, etc.) and AI providers (Anthropic, Google, OpenAI, AWS, GitHub, etc.). It solves one big problem: > **Different AI clients speak different "languages" (API formats), and different AI providers expect different "languages" too.** omniroute translates between them automatically. Think of it like a universal translator at the United Nations โ€” any delegate can speak any language, and the translator converts it for any other delegate. --- ## 2. Architecture Overview ```mermaid graph LR subgraph Clients A[Claude CLI] B[Codex] C[Cursor IDE] D[OpenAI-compatible] end subgraph omniroute E[Handler Layer] F[Translator Layer] G[Executor Layer] H[Services Layer] end subgraph Providers I[Anthropic Claude] J[Google Gemini] K[OpenAI / Codex] L[GitHub Copilot] M[AWS Kiro] N[Antigravity] O[Cursor API] end A --> E B --> E C --> E D --> E E --> F F --> G G --> I G --> J G --> K G --> L G --> M G --> N G --> O H -.-> E H -.-> G ``` ### Core Principle: Hub-and-Spoke Translation All format translation passes through **OpenAI format as the hub**: ``` Client Format โ†’ [OpenAI Hub] โ†’ Provider Format (request) Provider Format โ†’ [OpenAI Hub] โ†’ Client Format (response) ``` This means you only need **N translators** (one per format) instead of **Nยฒ** (every pair). --- ## 3. Project Structure ``` omniroute/ โ”œโ”€โ”€ open-sse/ โ† Core proxy library (portable, framework-agnostic) โ”‚ โ”œโ”€โ”€ index.js โ† Main entry point, exports everything โ”‚ โ”œโ”€โ”€ config/ โ† Configuration & constants โ”‚ โ”œโ”€โ”€ executors/ โ† Provider-specific request execution โ”‚ โ”œโ”€โ”€ handlers/ โ† Request handling orchestration โ”‚ โ”œโ”€โ”€ services/ โ† Business logic (auth, models, fallback, usage) โ”‚ โ”œโ”€โ”€ translator/ โ† Format translation engine โ”‚ โ”‚ โ”œโ”€โ”€ request/ โ† Request translators (8 files) โ”‚ โ”‚ โ”œโ”€โ”€ response/ โ† Response translators (7 files) โ”‚ โ”‚ โ””โ”€โ”€ helpers/ โ† Shared translation utilities (6 files) โ”‚ โ””โ”€โ”€ utils/ โ† Utility functions โ”œโ”€โ”€ src/ โ† Application layer (Express/Worker runtime) โ”‚ โ”œโ”€โ”€ app/ โ† Web UI, API routes, middleware โ”‚ โ”œโ”€โ”€ lib/ โ† Database, auth, and shared library code โ”‚ โ”œโ”€โ”€ mitm/ โ† Man-in-the-middle proxy utilities โ”‚ โ”œโ”€โ”€ models/ โ† Database models โ”‚ โ”œโ”€โ”€ shared/ โ† Shared utilities (wrappers around open-sse) โ”‚ โ”œโ”€โ”€ sse/ โ† SSE endpoint handlers โ”‚ โ””โ”€โ”€ store/ โ† State management โ”œโ”€โ”€ data/ โ† Runtime data (credentials, logs) โ”‚ โ””โ”€โ”€ provider-credentials.json (external credentials override, gitignored) โ””โ”€โ”€ tester/ โ† Test utilities ``` --- ## 4. Module-by-Module Breakdown ### 4.1 Config (`open-sse/config/`) The **single source of truth** for all provider configuration. | File | Purpose | | ----------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `constants.ts` | `PROVIDERS` object with base URLs, OAuth credentials (defaults), headers, and default system prompts for every provider. Also defines `HTTP_STATUS`, `ERROR_TYPES`, `COOLDOWN_MS`, `BACKOFF_CONFIG`, and `SKIP_PATTERNS`. | | `credentialLoader.ts` | Loads external credentials from `data/provider-credentials.json` and merges them over the hardcoded defaults in `PROVIDERS`. Keeps secrets out of source control while maintaining backwards compatibility. | | `providerModels.ts` | Central model registry: maps provider aliases โ†’ model IDs. Functions like `getModels()`, `getProviderByAlias()`. | | `codexInstructions.ts` | System instructions injected into Codex requests (editing constraints, sandbox rules, approval policies). | | `defaultThinkingSignature.ts` | Default "thinking" signatures for Claude and Gemini models. | | `ollamaModels.ts` | Schema definition for local Ollama models (name, size, family, quantization). | #### Credential Loading Flow ```mermaid flowchart TD A["App starts"] --> B["constants.ts defines PROVIDERS\nwith hardcoded defaults"] B --> C{"data/provider-credentials.json\nexists?"} C -->|Yes| D["credentialLoader reads JSON"] C -->|No| E["Use hardcoded defaults"] D --> F{"For each provider in JSON"} F --> G{"Provider exists\nin PROVIDERS?"} G -->|No| H["Log warning, skip"] G -->|Yes| I{"Value is object?"} I -->|No| J["Log warning, skip"] I -->|Yes| K["Merge clientId, clientSecret,\ntokenUrl, authUrl, refreshUrl"] K --> F H --> F J --> F F -->|Done| L["PROVIDERS ready with\nmerged credentials"] E --> L ``` --- ### 4.2 Executors (`open-sse/executors/`) Executors encapsulate **provider-specific logic** using the **Strategy Pattern**. Each executor overrides base methods as needed. ```mermaid classDiagram class BaseExecutor { +buildUrl(model, stream, options) +buildHeaders(credentials, stream, body) +transformRequest(body, model, stream, credentials) +execute(url, options) +shouldRetry(status, error) +refreshCredentials(credentials, log) } class DefaultExecutor { +refreshCredentials() } class AntigravityExecutor { +buildUrl() +buildHeaders() +transformRequest() +shouldRetry() +refreshCredentials() } class CursorExecutor { +buildUrl() +buildHeaders() +transformRequest() +parseResponse() +generateChecksum() } class KiroExecutor { +buildUrl() +buildHeaders() +transformRequest() +parseEventStream() +refreshCredentials() } BaseExecutor <|-- DefaultExecutor BaseExecutor <|-- AntigravityExecutor BaseExecutor <|-- CursorExecutor BaseExecutor <|-- KiroExecutor BaseExecutor <|-- CodexExecutor BaseExecutor <|-- GeminiCLIExecutor BaseExecutor <|-- GithubExecutor ``` | Executor | Provider | Key Specializations | | ---------------- | ------------------------------------------ | ------------------------------------------------------------------------------------------------------------------- | | `base.ts` | โ€” | Abstract base: URL building, headers, retry logic, credential refresh | | `default.ts` | Claude, Gemini, OpenAI, GLM, Kimi, MiniMax | Generic OAuth token refresh for standard providers | | `antigravity.ts` | Google Cloud Code | Project/session ID generation, multi-URL fallback, custom retry parsing from error messages ("reset after 2h7m23s") | | `cursor.ts` | Cursor IDE | **Most complex**: SHA-256 checksum auth, Protobuf request encoding, binary EventStream โ†’ SSE response parsing | | `codex.ts` | OpenAI Codex | Injects system instructions, manages thinking levels, removes unsupported parameters | | `gemini-cli.ts` | Google Gemini CLI | Custom URL building (`streamGenerateContent`), Google OAuth token refresh | | `github.ts` | GitHub Copilot | Dual token system (GitHub OAuth + Copilot token), VSCode header mimicking | | `kiro.ts` | AWS CodeWhisperer | AWS EventStream binary parsing, AMZN event frames, token estimation | | `index.ts` | โ€” | Factory: maps provider name โ†’ executor class, with default fallback | --- ### 4.3 Handlers (`open-sse/handlers/`) The **orchestration layer** โ€” coordinates translation, execution, streaming, and error handling. | File | Purpose | | --------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `chatCore.ts` | **Central orchestrator** (~600 lines). Handles the complete request lifecycle: format detection โ†’ translation โ†’ executor dispatch โ†’ streaming/non-streaming response โ†’ token refresh โ†’ error handling โ†’ usage logging. | | `responsesHandler.ts` | Adapter for OpenAI's Responses API: converts Responses format โ†’ Chat Completions โ†’ sends to `chatCore` โ†’ converts SSE back to Responses format. | | `embeddings.ts` | Embedding generation handler: resolves embedding model โ†’ provider, dispatches to provider API, returns OpenAI-compatible embedding response. Supports 6+ providers. | | `imageGeneration.ts` | Image generation handler: resolves image model โ†’ provider, supports OpenAI-compatible, Gemini-image (Antigravity), and fallback (Nebius) modes. Returns base64 or URL images. | #### Request Lifecycle (chatCore.ts) ```mermaid sequenceDiagram participant Client participant chatCore participant Translator participant Executor participant Provider Client->>chatCore: Request (any format) chatCore->>chatCore: Detect source format chatCore->>chatCore: Check bypass patterns chatCore->>chatCore: Resolve model & provider chatCore->>Translator: Translate request (source โ†’ OpenAI โ†’ target) chatCore->>Executor: Get executor for provider Executor->>Executor: Build URL, headers, transform request Executor->>Executor: Refresh credentials if needed Executor->>Provider: HTTP fetch (streaming or non-streaming) alt Streaming Provider-->>chatCore: SSE stream chatCore->>chatCore: Pipe through SSE transform stream Note over chatCore: Transform stream translates
each chunk: target โ†’ OpenAI โ†’ source chatCore-->>Client: Translated SSE stream else Non-streaming Provider-->>chatCore: JSON response chatCore->>Translator: Translate response chatCore-->>Client: Translated JSON end alt Error (401, 429, 500...) chatCore->>Executor: Retry with credential refresh chatCore->>chatCore: Account fallback logic end ``` --- ### 4.4 Services (`open-sse/services/`) Business logic that supports the handlers and executors. | File | Purpose | | -------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `provider.ts` | **Format detection** (`detectFormat`): analyzes request body structure to identify Claude/OpenAI/Gemini/Antigravity/Responses formats (includes `max_tokens` heuristic for Claude). Also: URL building, header building, thinking config normalization. Supports `openai-compatible-*` and `anthropic-compatible-*` dynamic providers. | | `model.ts` | Model string parsing (`claude/model-name` โ†’ `{provider: "claude", model: "model-name"}`), alias resolution with collision detection, input sanitization (rejects path traversal/control chars), and model info resolution with async alias getter support. | | `accountFallback.ts` | Rate-limit handling: exponential backoff (1s โ†’ 2s โ†’ 4s โ†’ max 2min), account cooldown management, error classification (which errors trigger fallback vs. not). | | `tokenRefresh.ts` | OAuth token refresh for **every provider**: Google (Gemini, Antigravity), Claude, Codex, Qwen, Qoder, GitHub (OAuth + Copilot dual-token), Kiro (AWS SSO OIDC + Social Auth). Includes in-flight promise deduplication cache and retry with exponential backoff. | | `combo.ts` | **Combo models**: chains of fallback models. If model A fails with a fallback-eligible error, try model B, then C, etc. Returns actual upstream status codes. | | `usage.ts` | Fetches quota/usage data from provider APIs (GitHub Copilot quotas, Antigravity model quotas, Codex rate limits, Kiro usage breakdowns, Claude settings). | | `accountSelector.ts` | Smart account selection with scoring algorithm: considers priority, health status, round-robin position, and cooldown state to pick the optimal account for each request. | | `contextManager.ts` | Request context lifecycle management: creates and tracks per-request context objects with metadata (request ID, timestamps, provider info) for debugging and logging. | | `ipFilter.ts` | IP-based access control: supports allowlist and blocklist modes. Validates client IP against configured rules before processing API requests. | | `sessionManager.ts` | Session tracking with client fingerprinting: tracks active sessions using hashed client identifiers, monitors request counts, and provides session metrics. | | `signatureCache.ts` | Request signature-based deduplication cache: prevents duplicate requests by caching recent request signatures and returning cached responses for identical requests within a time window. | | `systemPrompt.ts` | Global system prompt injection: prepends or appends a configurable system prompt to all requests, with per-provider compatibility handling. | | `thinkingBudget.ts` | Reasoning token budget management: supports passthrough, auto (strip thinking config), custom (fixed budget), and adaptive (complexity-scaled) modes for controlling thinking/reasoning tokens. | | `wildcardRouter.ts` | Wildcard model pattern routing: resolves wildcard patterns (e.g., `*/claude-*`) to concrete provider/model pairs based on availability and priority. | #### Token Refresh Deduplication ```mermaid sequenceDiagram participant R1 as Request 1 participant R2 as Request 2 participant Cache as refreshPromiseCache participant OAuth as OAuth Provider R1->>Cache: getAccessToken("gemini", token) Cache->>Cache: No in-flight promise Cache->>OAuth: Start refresh R2->>Cache: getAccessToken("gemini", token) Cache->>Cache: Found in-flight promise Cache-->>R2: Return existing promise OAuth-->>Cache: New access token Cache-->>R1: New access token Cache-->>R2: Same access token (shared) Cache->>Cache: Delete cache entry ``` #### Account Fallback State Machine ```mermaid stateDiagram-v2 [*] --> Active Active --> Error: Request fails (401/429/500) Error --> Cooldown: Apply backoff Cooldown --> Active: Cooldown expires Active --> Active: Request succeeds (reset backoff) state Error { [*] --> ClassifyError ClassifyError --> ShouldFallback: Rate limit / Auth / Transient ClassifyError --> NoFallback: 400 Bad Request } state Cooldown { [*] --> ExponentialBackoff ExponentialBackoff: Level 0 = 1s ExponentialBackoff: Level 1 = 2s ExponentialBackoff: Level 2 = 4s ExponentialBackoff: Max = 2min } ``` #### Combo Model Chain ```mermaid flowchart LR A["Request with\ncombo model"] --> B["Model A"] B -->|"2xx Success"| C["Return response"] B -->|"429/401/500"| D{"Fallback\neligible?"} D -->|Yes| E["Model B"] D -->|No| F["Return error"] E -->|"2xx Success"| C E -->|"429/401/500"| G{"Fallback\neligible?"} G -->|Yes| H["Model C"] G -->|No| F H -->|"2xx Success"| C H -->|"Fail"| I["All failed โ†’\nReturn last status"] ``` --- ### 4.5 Translator (`open-sse/translator/`) The **format translation engine** using a self-registering plugin system. #### Architectuur ```mermaid graph TD subgraph "Request Translation" A["Claude โ†’ OpenAI"] B["Gemini โ†’ OpenAI"] C["Antigravity โ†’ OpenAI"] D["OpenAI Responses โ†’ OpenAI"] E["OpenAI โ†’ Claude"] F["OpenAI โ†’ Gemini"] G["OpenAI โ†’ Kiro"] H["OpenAI โ†’ Cursor"] end subgraph "Response Translation" I["Claude โ†’ OpenAI"] J["Gemini โ†’ OpenAI"] K["Kiro โ†’ OpenAI"] L["Cursor โ†’ OpenAI"] M["OpenAI โ†’ Claude"] N["OpenAI โ†’ Antigravity"] O["OpenAI โ†’ Responses"] end ``` | Directory | Files | Description | | ------------ | ------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `request/` | 8 translators | Convert request bodies between formats. Each file self-registers via `register(from, to, fn)` on import. | | `response/` | 7 translators | Convert streaming response chunks between formats. Handles SSE event types, thinking blocks, tool calls. | | `helpers/` | 6 helpers | Shared utilities: `claudeHelper` (system prompt extraction, thinking config), `geminiHelper` (parts/contents mapping), `openaiHelper` (format filtering), `toolCallHelper` (ID generation, missing response injection), `maxTokensHelper`, `responsesApiHelper`. | | `index.ts` | โ€” | Translation engine: `translateRequest()`, `translateResponse()`, state management, registry. | | `formats.ts` | โ€” | Format constants: `OPENAI`, `CLAUDE`, `GEMINI`, `ANTIGRAVITY`, `KIRO`, `CURSOR`, `OPENAI_RESPONSES`. | #### Key Design: Self-Registering Plugins ```javascript // Each translator file calls register() on import: import { register } from "../index.js"; register("claude", "openai", translateClaudeToOpenAI); // The index.js imports all translator files, triggering registration: import "./request/claude-to-openai.js"; // โ† self-registers ``` --- ### 4.6 Utils (`open-sse/utils/`) | File | Purpose | | ------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | `error.ts` | Error response building (OpenAI-compatible format), upstream error parsing, Antigravity retry-time extraction from error messages, SSE error streaming. | | `stream.ts` | **SSE Transform Stream** โ€” the core streaming pipeline. Two modes: `TRANSLATE` (full format translation) and `PASSTHROUGH` (normalize + extract usage). Handles chunk buffering, usage estimation, content length tracking. Per-stream encoder/decoder instances avoid shared state. | | `streamHelpers.ts` | Low-level SSE utilities: `parseSSELine` (whitespace-tolerant), `hasValuableContent` (filters empty chunks for OpenAI/Claude/Gemini), `fixInvalidId`, `formatSSE` (format-aware SSE serialization with `perf_metrics` cleanup). | | `usageTracking.ts` | Token usage extraction from any format (Claude/OpenAI/Gemini/Responses), estimation with separate tool/message char-per-token ratios, buffer addition (2000 tokens safety margin), format-specific field filtering, console logging with ANSI colors. | | `requestLogger.ts` | Legacy file-based request logging helper kept for compatibility. Current deployments should prefer `APP_LOG_TO_FILE` for application logs and the call log pipeline for persisted request artifacts. | | `bypassHandler.ts` | Intercepts specific patterns from Claude CLI (title extraction, warmup, count) and returns fake responses without calling any provider. Supports both streaming and non-streaming. Intentionally limited to Claude CLI scope. | | `networkProxy.ts` | Resolves outbound proxy URL for a given provider with precedence: provider-specific config โ†’ global config โ†’ environment variables (`HTTPS_PROXY`/`HTTP_PROXY`/`ALL_PROXY`). Supports `NO_PROXY` exclusions. Caches config for 30s. | #### SSE Streaming Pipeline ```mermaid flowchart TD A["Provider SSE stream"] --> B["TextDecoder\n(per-stream instance)"] B --> C["Buffer lines\n(split on newline)"] C --> D["parseSSELine()\n(trim whitespace, parse JSON)"] D --> E{"Mode?"} E -->|TRANSLATE| F["translateResponse()\ntarget โ†’ OpenAI โ†’ source"] E -->|PASSTHROUGH| G["fixInvalidId()\nnormalize chunk"] F --> H["hasValuableContent()\nfilter empty chunks"] G --> H H -->|"Has content"| I["extractUsage()\ntrack token counts"] H -->|"Empty"| J["Skip chunk"] I --> K["formatSSE()\nserialize + clean perf_metrics"] K --> L["TextEncoder\n(per-stream instance)"] L --> M["Enqueue to\nclient stream"] style A fill:#f9f,stroke:#333 style M fill:#9f9,stroke:#333 ``` #### Request Logger Session Structure ``` logs/ โ””โ”€โ”€ claude_gemini_claude-sonnet_20260208_143045/ โ”œโ”€โ”€ 1_req_client.json โ† Raw client request โ”œโ”€โ”€ 2_req_source.json โ† After initial conversion โ”œโ”€โ”€ 3_req_openai.json โ† OpenAI intermediate format โ”œโ”€โ”€ 4_req_target.json โ† Final target format โ”œโ”€โ”€ 5_res_provider.txt โ† Provider SSE chunks (streaming) โ”œโ”€โ”€ 5_res_provider.json โ† Provider response (non-streaming) โ”œโ”€โ”€ 6_res_openai.txt โ† OpenAI intermediate chunks โ”œโ”€โ”€ 7_res_client.txt โ† Client-facing SSE chunks โ””โ”€โ”€ 6_error.json โ† Error details (if any) ``` --- ### 4.7 Application Layer (`src/`) | Directory | Purpose | | ------------- | ---------------------------------------------------------------------- | | `src/app/` | Web UI, API routes, Express middleware, OAuth callback handlers | | `src/lib/` | Database access (`localDb.ts`, `usageDb.ts`), authentication, shared | | `src/mitm/` | Man-in-the-middle proxy utilities for intercepting provider traffic | | `src/models/` | Database model definitions | | `src/shared/` | Wrappers around open-sse functions (provider, stream, error, etc.) | | `src/sse/` | SSE endpoint handlers that wire the open-sse library to Express routes | | `src/store/` | Application state management | #### Notable API Routes | Route | Methods | Purpose | | --------------------------------------------- | --------------- | ------------------------------------------------------------------------------------- | | `/api/provider-models` | GET/POST/DELETE | CRUD for custom models per provider | | `/api/models/catalog` | GET | Aggregated catalog of all models (chat, embedding, image, custom) grouped by provider | | `/api/settings/proxy` | GET/PUT/DELETE | Hierarchical outbound proxy configuration (`global/providers/combos/keys`) | | `/api/settings/proxy/test` | POST | Validates proxy connectivity and returns public IP/latency | | `/v1/providers/[provider]/chat/completions` | POST | Dedicated per-provider chat completions with model validation | | `/v1/providers/[provider]/embeddings` | POST | Dedicated per-provider embeddings with model validation | | `/v1/providers/[provider]/images/generations` | POST | Dedicated per-provider image generation with model validation | | `/api/settings/ip-filter` | GET/PUT | IP allowlist/blocklist management | | `/api/settings/thinking-budget` | GET/PUT | Reasoning token budget configuration (passthrough/auto/custom/adaptive) | | `/api/settings/system-prompt` | GET/PUT | Global system prompt injection for all requests | | `/api/sessions` | GET | Active session tracking and metrics | | `/api/rate-limits` | GET | Per-account rate limit status | --- ## 5. Key Design Patterns ### 5.1 Hub-and-Spoke Translation All formats translate through **OpenAI format as the hub**. Adding a new provider only requires writing **one pair** of translators (to/from OpenAI), not N pairs. ### 5.2 Executor Strategy Pattern Each provider has a dedicated executor class inheriting from `BaseExecutor`. The factory in `executors/index.ts` selects the right one at runtime. ### 5.3 Self-Registering Plugin System Translator modules register themselves on import via `register()`. Adding a new translator is just creating a file and importing it. ### 5.4 Account Fallback with Exponential Backoff When a provider returns 429/401/500, the system can switch to the next account, applying exponential cooldowns (1s โ†’ 2s โ†’ 4s โ†’ max 2min). ### 5.5 Combo Model Chains A "combo" groups multiple `provider/model` strings. If the first fails, fallback to the next automatically. ### 5.6 Stateful Streaming Translation Response translation maintains state across SSE chunks (thinking block tracking, tool call accumulation, content block indexing) via the `initState()` mechanism. ### 5.7 Usage Safety Buffer A 2000-token buffer is added to reported usage to prevent clients from hitting context window limits due to overhead from system prompts and format translation. --- ## 6. Supported Formats | Format | Direction | Identifier | | ----------------------- | --------------- | ------------------ | | OpenAI Chat Completions | source + target | `openai` | | OpenAI Responses API | source + target | `openai-responses` | | Anthropic Claude | source + target | `claude` | | Google Gemini | source + target | `gemini` | | Google Gemini CLI | target only | `gemini-cli` | | Antigravity | source + target | `antigravity` | | AWS Kiro | target only | `kiro` | | Cursor | target only | `cursor` | --- ## 7. Supported Providers | Provider | Auth Method | Executor | Key Notes | | ------------------------ | ---------------------- | ----------- | --------------------------------------------- | | Anthropic Claude | API key or OAuth | Default | Uses `x-api-key` header | | Google Gemini | API key or OAuth | Default | Uses `x-goog-api-key` header | | Google Gemini CLI | OAuth | GeminiCLI | Uses `streamGenerateContent` endpoint | | Antigravity | OAuth | Antigravity | Multi-URL fallback, custom retry parsing | | OpenAI | API key | Default | Standard Bearer auth | | Codex | OAuth | Codex | Injects system instructions, manages thinking | | GitHub Copilot | OAuth + Copilot token | Github | Dual token, VSCode header mimicking | | Kiro (AWS) | AWS SSO OIDC or Social | Kiro | Binary EventStream parsing | | Cursor IDE | Checksum auth | Cursor | Protobuf encoding, SHA-256 checksums | | Qwen | OAuth | Default | Standard auth | | Qoder | OAuth (Basic + Bearer) | Default | Dual auth header | | OpenRouter | API key | Default | Standard Bearer auth | | GLM, Kimi, MiniMax | API key | Default | Claude-compatible, use `x-api-key` | | `openai-compatible-*` | API key | Default | Dynamic: any OpenAI-compatible endpoint | | `anthropic-compatible-*` | API key | Default | Dynamic: any Claude-compatible endpoint | --- ## 8. Data Flow Summary ### Streaming Request ```mermaid flowchart LR A["Client"] --> B["detectFormat()"] B --> C["translateRequest()\nsource โ†’ OpenAI โ†’ target"] C --> D["Executor\nbuildUrl + buildHeaders"] D --> E["fetch(providerURL)"] E --> F["createSSEStream()\nTRANSLATE mode"] F --> G["parseSSELine()"] G --> H["translateResponse()\ntarget โ†’ OpenAI โ†’ source"] H --> I["extractUsage()\n+ addBuffer"] I --> J["formatSSE()"] J --> K["Client receives\ntranslated SSE"] K --> L["logUsage()\nsaveRequestUsage()"] ``` ### Non-Streaming Request ```mermaid flowchart LR A["Client"] --> B["detectFormat()"] B --> C["translateRequest()\nsource โ†’ OpenAI โ†’ target"] C --> D["Executor.execute()"] D --> E["translateResponse()\ntarget โ†’ OpenAI โ†’ source"] E --> F["Return JSON\nresponse"] ``` ### Bypass Flow (Claude CLI) ```mermaid flowchart LR A["Claude CLI request"] --> B{"Match bypass\npattern?"} B -->|"Title/Warmup/Count"| C["Generate fake\nOpenAI response"] B -->|"No match"| D["Normal flow"] C --> E["Translate to\nsource format"] E --> F["Return without\ncalling provider"] ```