mirror of
https://github.com/Alishahryar1/free-claude-code.git
synced 2026-04-28 11:30:03 +00:00
Add tests for fcc-init entrypoint (cli/entrypoints.py) (#77)
Some checks are pending
CI / checks (push) Waiting to run
parent fc58b43c5e · commit 884ddd77af
2 changed files with 228 additions and 228 deletions
README.md (381 changed lines)
<div align="center">

# 🤖 Free Claude Code

### Use Claude Code CLI & VSCode for free. No Anthropic API key required.
A lightweight proxy that routes Claude Code's Anthropic API calls to **NVIDIA NIM** (40 req/min free), **OpenRouter** (hundreds of models), or **LM Studio** (fully local).

[Quick Start](#quick-start) · [Providers](#providers) · [Discord Bot](#discord-bot) · [Configuration](#configuration) · [Development](#development) · [Contributing](#contributing)

---
| Feature | Description |
| ----------------------------- | -------------------------------------------------------------------------------------------------- |
| **Zero Cost** | 40 req/min free on NVIDIA NIM. Free models on OpenRouter. Fully local with LM Studio |
| **Drop-in Replacement** | Set 2 env vars. No modifications to Claude Code CLI or VSCode extension needed |
| **3 Providers** | NVIDIA NIM, OpenRouter (hundreds of models), LM Studio (local & offline) |
| **Per-Model Mapping** | Route Opus / Sonnet / Haiku to different models and providers. Mix providers freely |
| **Thinking Token Support** | Parses `<think>` tags and `reasoning_content` into native Claude thinking blocks |
| **Heuristic Tool Parser** | Models outputting tool calls as text are auto-parsed into structured tool use |
| **Request Optimization** | 5 categories of trivial API calls intercepted locally, saving quota and latency |
| **Smart Rate Limiting** | Proactive rolling-window throttle + reactive 429 exponential backoff + optional concurrency cap |
| **Discord / Telegram Bot** | Remote autonomous coding with tree-based threading, session persistence, and live progress |
| **Subagent Control** | Task tool interception forces `run_in_background=False`. No runaway subagents |
| **Extensible** | Clean `BaseProvider` and `MessagingPlatform` ABCs. Add new providers or platforms easily |
## Quick Start

1. Get a free API key:

- **OpenRouter**: [openrouter.ai/keys](https://openrouter.ai/keys)
- **LM Studio**: No API key needed. Run locally with [LM Studio](https://lmstudio.ai)

2. Install [Claude Code](https://github.com/anthropics/claude-code)
3. Install [uv](https://github.com/astral-sh/uv) (or `uv self update` if already installed)

### Clone & Configure
</details>

<details>
<summary><b>Mix providers</b></summary>

Each `MODEL_*` variable can use a different provider. `MODEL` is the fallback for unrecognized Claude models.
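A hypothetical mix (model names taken from elsewhere in this README; adjust to taste):

```dotenv
MODEL_OPUS="nvidia_nim/z-ai/glm4.7"                             # Opus → NVIDIA NIM
MODEL_SONNET="open_router/arcee-ai/trinity-large-preview:free"  # Sonnet → OpenRouter
MODEL_HAIKU="lmstudio/unsloth/GLM-4.7-Flash-GGUF"               # Haiku → local LM Studio
MODEL="nvidia_nim/stepfun-ai/step-3.5-flash"                    # fallback
```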
```

4. Reload extensions.
5. **If you see the login screen**: Click **Anthropic Console**, then authorize. The extension will start working. You may be redirected to buy credits in the browser; ignore it — the extension already works.

To switch back to Anthropic models, comment out the added block and reload extensions.
https://github.com/user-attachments/assets/9a33c316-90f8-4418-9650-97e7d33ad645

**1. Install [fzf](https://github.com/junegunn/fzf)**:

```bash
brew install fzf # macOS/Linux
```
**2. Add the alias to `~/.zshrc` or `~/.bashrc`:**

```bash
# Use the absolute path to your cloned repo
alias claude-pick="/absolute/path/to/free-claude-code/claude-pick"
```

Then reload your shell (`source ~/.zshrc` or `source ~/.bashrc`) and run `claude-pick`.

**Or use a fixed model alias** (no picker needed):

```bash
alias claude-kimi='ANTHROPIC_BASE_URL="http://localhost:8082" ANTHROPIC_AUTH_TOKEN="freecc:moonshotai/kimi-k2.5" claude'
```
### Install as a Package (no clone needed)

If you just want to use the proxy without cloning the repo, install it as a `uv` tool:

```bash
uv tool install git+https://github.com/Alishahryar1/free-claude-code.git
fcc-init         # creates ~/.config/free-claude-code/.env from the built-in template
free-claude-code # starts the server
```

## How It Works
```
┌─────────────────┐        ┌──────────────────────┐        ┌──────────────────┐
│  Claude Code    │───────>│  Free Claude Code    │───────>│  LLM Provider    │
│  CLI / VSCode   │<───────│  Proxy (:8082)       │<───────│  NIM / OR / LMS  │
└─────────────────┘        └──────────────────────┘        └──────────────────┘
   Anthropic API                                            OpenAI-compatible
    format (SSE)                                              format (SSE)
```

- **Transparent proxy**: Claude Code sends standard Anthropic API requests; the proxy forwards them to your configured provider
- **Per-model routing**: Opus / Sonnet / Haiku requests resolve to their model-specific backend, with `MODEL` as fallback (sketched below)
- **Request optimization**: 5 categories of trivial requests (quota probes, title generation, prefix detection, suggestions, filepath extraction) are intercepted and responded to locally without using API quota
- **Format translation**: Requests are translated from Anthropic format to the provider's OpenAI-compatible format and streamed back
- **Thinking tokens**: `<think>` tags and `reasoning_content` fields are converted into native Claude thinking blocks
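The per-model routing step, as a minimal sketch (names are illustrative; the proxy's actual resolver may differ):

```python
# Hypothetical sketch of per-model routing, not the proxy's actual code:
# map an incoming Claude model ID to its configured backend, falling back
# to MODEL for anything unrecognized.
import os

def resolve_backend(claude_model: str) -> str:
    overrides = {
        "opus": os.getenv("MODEL_OPUS"),
        "sonnet": os.getenv("MODEL_SONNET"),
        "haiku": os.getenv("MODEL_HAIKU"),
    }
    for family, backend in overrides.items():
        if family in claude_model.lower() and backend:
            return backend  # e.g. "nvidia_nim/z-ai/glm4.7"
    return os.environ["MODEL"]  # fallback for unrecognized Claude models
```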
---
## Providers

| Provider       | Cost         | Rate Limit | Best For                             |
| -------------- | ------------ | ---------- | ------------------------------------ |
| **NVIDIA NIM** | Free         | 40 req/min | Daily driver, generous free tier     |
| **OpenRouter** | Free / Paid  | Varies     | Model variety, fallback options      |
| **LM Studio**  | Free (local) | Unlimited  | Privacy, offline use, no rate limits |

Models use a prefix format: `provider_prefix/model/name`. An invalid prefix causes an error.

| Provider   | `MODEL` prefix    | API Key Variable     | Default Base URL              |
| ---------- | ----------------- | -------------------- | ----------------------------- |
| NVIDIA NIM | `nvidia_nim/...`  | `NVIDIA_NIM_API_KEY` | `integrate.api.nvidia.com/v1` |
| OpenRouter | `open_router/...` | `OPENROUTER_API_KEY` | `openrouter.ai/api/v1`        |
| LM Studio  | `lmstudio/...`    | (none)               | `localhost:1234/v1`           |
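As an illustration of that prefix rule (a sketch only, not the project's actual parser):

```python
# Illustrative only: split "provider_prefix/model/name" into the provider
# prefix and the provider-side model ID; an unknown prefix is an error.
KNOWN_PREFIXES = {"nvidia_nim", "open_router", "lmstudio"}

def split_model(model: str) -> tuple[str, str]:
    prefix, _, remainder = model.partition("/")
    if prefix not in KNOWN_PREFIXES or not remainder:
        raise ValueError(f"invalid provider prefix in {model!r}")
    return prefix, remainder  # e.g. ("nvidia_nim", "z-ai/glm5")
```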
LM Studio runs locally. Start the server in the Developer tab or via `lms server start`, load a model, and set `MODEL` to the model identifier.

<details>
<summary><b>NVIDIA NIM models</b></summary>

Popular models (full list in [`nvidia_nim_models.json`](nvidia_nim_models.json)):

- `nvidia_nim/minimaxai/minimax-m2.5`
- `nvidia_nim/qwen/qwen3.5-397b-a17b`
- `nvidia_nim/z-ai/glm5`
- `nvidia_nim/moonshotai/kimi-k2.5`
- `nvidia_nim/stepfun-ai/step-3.5-flash`

Browse: [build.nvidia.com](https://build.nvidia.com/explore/discover) · Update list: `curl "https://integrate.api.nvidia.com/v1/models" > nvidia_nim_models.json`

</details>

<details>
<summary><b>OpenRouter models</b></summary>

Popular free models:

- `open_router/arcee-ai/trinity-large-preview:free`
- `open_router/stepfun/step-3.5-flash:free`
- `open_router/deepseek/deepseek-r1-0528:free`
- `open_router/openai/gpt-oss-120b:free`

Browse: [openrouter.ai/models](https://openrouter.ai/models) · [Free models](https://openrouter.ai/collections/free-models)

</details>

<details>
<summary><b>LM Studio models</b></summary>

Run models locally with [LM Studio](https://lmstudio.ai). Load a model in the Chat or Developer tab, then set `MODEL` to its identifier.

Examples with native tool-use support:

- `LiquidAI/LFM2-24B-A2B-GGUF`
- `unsloth/MiniMax-M2.5-GGUF`
- `unsloth/GLM-4.7-Flash-GGUF`
- `unsloth/Qwen3.5-35B-A3B-GGUF`

Browse: [model.lmstudio.ai](https://model.lmstudio.ai)

</details>
---
## Discord Bot

Control Claude Code remotely from Discord (or Telegram). Send tasks, watch live progress, and manage multiple concurrent sessions.

**Capabilities:**

- Tree-based message threading: reply to a message to fork the conversation
- Session persistence across server restarts
- Live streaming of thinking tokens, tool calls, and results
- Unlimited concurrent Claude CLI sessions (concurrency controlled by `PROVIDER_MAX_CONCURRENCY`)
- Voice notes: send voice messages; they are transcribed and processed as regular prompts
- Commands: `/stop` (cancel a task; reply to a message to stop only that task), `/clear` (reset all sessions, or reply to clear a branch), `/stats`

### Setup
```dotenv
DISCORD_BOT_TOKEN="your_discord_bot_token"
ALLOWED_DISCORD_CHANNELS="123456789,987654321"
```

> Enable Developer Mode in Discord (Settings → Advanced), then right-click a channel and "Copy ID". Comma-separate multiple channels. If empty, no channels are allowed.

3. **Configure the workspace** (where Claude will operate):
|
@ -293,11 +323,11 @@ ALLOWED_DIR="C:/Users/yourname/projects"
|
|||
uv run uvicorn server:app --host 0.0.0.0 --port 8082
|
||||
```
|
||||
|
||||
5. **Invite the bot** (OAuth2 URL Generator, scopes: `bot`, permissions: Read Messages, Send Messages, Manage Messages, Read Message History). Send a task to an allowed channel and Claude responds with live thinking tokens and tool calls. Use commands above to cancel or clear.
|
||||
5. **Invite the bot** via OAuth2 URL Generator (scopes: `bot`, permissions: Read Messages, Send Messages, Manage Messages, Read Message History).
|
||||
|
||||
### Telegram (Alternative)
|
||||
### Telegram
|
||||
|
||||
To use Telegram instead, set `MESSAGING_PLATFORM=telegram` and configure:
|
||||
Set `MESSAGING_PLATFORM=telegram` and configure:
|
||||
|
||||
```dotenv
|
||||
TELEGRAM_BOT_TOKEN="123456789:ABCdefGHIjklMNOpqrSTUvwxYZ"
|
||||
|
|
Get a token from [@BotFather](https://t.me/BotFather).

### Voice Notes

Send voice messages on Discord or Telegram; they are transcribed and processed as regular prompts.

| Backend | Description | API Key |
| --------------------------- | -------------------------------------------------------------------------------------------------------------- | -------------------- |
| **Local Whisper** (default) | [Hugging Face Whisper](https://huggingface.co/openai/whisper-large-v3-turbo) — free, offline, CUDA compatible | not required |
| **NVIDIA NIM** | Whisper/Parakeet models via gRPC | `NVIDIA_NIM_API_KEY` |

**Install the voice extras:**

```bash
# If you cloned the repo:
uv sync --extra voice_local                # Local Whisper
uv sync --extra voice                      # NVIDIA NIM
uv sync --extra voice --extra voice_local  # Both

# If you installed as a package (no clone):
uv tool install "free-claude-code[voice_local] @ git+https://github.com/Alishahryar1/free-claude-code.git"
uv tool install "free-claude-code[voice] @ git+https://github.com/Alishahryar1/free-claude-code.git"
uv tool install "free-claude-code[voice,voice_local] @ git+https://github.com/Alishahryar1/free-claude-code.git"
```

Configure via `WHISPER_DEVICE` (`cpu` | `cuda` | `nvidia_nim`) and `WHISPER_MODEL`. See the [Configuration](#configuration) table for all voice variables and supported model values.
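For example, a hypothetical `.env` for NIM-backed transcription (variable names and values from the Configuration table; the key is a placeholder):

```dotenv
VOICE_NOTE_ENABLED="true"
WHISPER_DEVICE="nvidia_nim"
WHISPER_MODEL="openai/whisper-large-v3"
NVIDIA_NIM_API_KEY="your_nvidia_api_key"
```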
---
## Configuration

### Core

| Variable | Description | Default |
| -------------------- | ------------------------------------------------------------------------ | ------------------------------------------------- |
| `MODEL` | Fallback model (`provider/model/name` format; invalid prefix → error) | `nvidia_nim/stepfun-ai/step-3.5-flash` |
| `MODEL_OPUS` | Model for Claude Opus requests (falls back to `MODEL`) | `nvidia_nim/z-ai/glm4.7` |
| `MODEL_SONNET` | Model for Claude Sonnet requests (falls back to `MODEL`) | `open_router/arcee-ai/trinity-large-preview:free` |
| `MODEL_HAIKU` | Model for Claude Haiku requests (falls back to `MODEL`) | `open_router/stepfun/step-3.5-flash:free` |
| `NVIDIA_NIM_API_KEY` | NVIDIA API key | required for NIM |
| `OPENROUTER_API_KEY` | OpenRouter API key | required for OpenRouter |
| `LM_STUDIO_BASE_URL` | LM Studio server URL | `http://localhost:1234/v1` |

### Rate Limiting & Timeouts

| Variable | Description | Default |
| -------------------------- | ------------------------------------------ | ------- |
| `PROVIDER_RATE_LIMIT` | LLM API requests per window | `40` |
| `PROVIDER_RATE_WINDOW` | Rate limit window (seconds) | `60` |
| `PROVIDER_MAX_CONCURRENCY` | Max simultaneous open provider streams | `5` |
| `HTTP_READ_TIMEOUT` | Read timeout for provider requests (s) | `120` |
| `HTTP_WRITE_TIMEOUT` | Write timeout for provider requests (s) | `10` |
| `HTTP_CONNECT_TIMEOUT` | Connect timeout for provider requests (s) | `2` |
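The proactive throttle described by `PROVIDER_RATE_LIMIT` and `PROVIDER_RATE_WINDOW` can be pictured with a small sketch (illustrative only, not the proxy's implementation):

```python
# Hypothetical rolling-window throttle: allow at most `limit` requests per
# `window` seconds, sleeping until the oldest request leaves the window.
import asyncio
import time
from collections import deque

class RollingWindowLimiter:
    def __init__(self, limit: int = 40, window: float = 60.0):
        self.limit, self.window = limit, window
        self.timestamps: deque[float] = deque()

    async def acquire(self) -> None:
        while True:
            now = time.monotonic()
            while self.timestamps and now - self.timestamps[0] >= self.window:
                self.timestamps.popleft()  # drop entries outside the window
            if len(self.timestamps) < self.limit:
                self.timestamps.append(now)
                return
            await asyncio.sleep(self.window - (now - self.timestamps[0]))
```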
### Messaging & Voice

| Variable | Description | Default |
| -------------------------- | ------------------------------------------------------------------ | ------------------- |
| `MESSAGING_PLATFORM` | `discord` or `telegram` | `discord` |
| `DISCORD_BOT_TOKEN` | Discord bot token | `""` |
| `ALLOWED_DISCORD_CHANNELS` | Comma-separated channel IDs (empty = none allowed) | `""` |
| `TELEGRAM_BOT_TOKEN` | Telegram bot token | `""` |
| `ALLOWED_TELEGRAM_USER_ID` | Allowed Telegram user ID | `""` |
| `CLAUDE_WORKSPACE` | Directory where the agent operates | `./agent_workspace` |
| `ALLOWED_DIR` | Allowed directories for the agent | `""` |
| `MESSAGING_RATE_LIMIT` | Messaging messages per window | `1` |
| `MESSAGING_RATE_WINDOW` | Messaging window (seconds) | `1` |
| `VOICE_NOTE_ENABLED` | Enable voice note handling | `true` |
| `WHISPER_DEVICE` | `cpu` \| `cuda` \| `nvidia_nim` | `cpu` |
| `WHISPER_MODEL` | Whisper model (local: `tiny`/`base`/`small`/`medium`/`large-v2`/`large-v3`/`large-v3-turbo`; NIM: `openai/whisper-large-v3`, `nvidia/parakeet-ctc-1.1b-asr`, etc.) | `base` |
| `HF_TOKEN` | Hugging Face token for faster downloads (local Whisper, optional) | — |

<details>
<summary><b>Advanced: Request optimization flags</b></summary>

These are enabled by default and intercept trivial Claude Code requests locally to save API quota.

| Variable | Description | Default |
| --------------------------------- | ------------------------------ | ------- |
| `FAST_PREFIX_DETECTION` | Enable fast prefix detection | `true` |
| `ENABLE_NETWORK_PROBE_MOCK` | Mock network probe requests | `true` |
| `ENABLE_TITLE_GENERATION_SKIP` | Skip title generation requests | `true` |
| `ENABLE_SUGGESTION_MODE_SKIP` | Skip suggestion mode requests | `true` |
| `ENABLE_FILEPATH_EXTRACTION_MOCK` | Mock filepath extraction | `true` |

</details>

See [`.env.example`](.env.example) for all supported parameters.
## Development

```bash
uv run ruff format # Format code
uv run ruff check  # Lint
uv run ty check    # Type checking
uv run pytest      # Run tests
```
---
## Extending

### Adding a Provider

**Adding an OpenAI-compatible provider** (Groq, Together AI, etc.) — extend `OpenAICompatibleProvider`:
```python
from providers.base import ProviderConfig
from providers.openai_compat import OpenAICompatibleProvider

class MyProvider(OpenAICompatibleProvider):
    def __init__(self, config: ProviderConfig):
        super().__init__(config, provider_name="MYPROVIDER",
                         base_url="https://api.example.com/v1", api_key=config.api_key)

    def _build_request_body(self, request):
        return build_request_body(request)  # Your request builder
```
**Adding a fully custom provider** — extend `BaseProvider` directly and implement `stream_response()`.
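A minimal skeleton (the body must yield Anthropic SSE-format events):

```python
from providers.base import BaseProvider, ProviderConfig

class MyProvider(BaseProvider):
    async def stream_response(self, request, input_tokens=0, *, request_id=None):
        # Yield Anthropic SSE format events
        ...
```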
**Adding a messaging platform** — extend `MessagingPlatform` in `messaging/` and implement `start()`, `stop()`, `send_message()`, `edit_message()`, and `on_message()`.
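A minimal skeleton of the required methods:

```python
from messaging.base import MessagingPlatform

class MyPlatform(MessagingPlatform):
    async def start(self):
        ...  # Initialize connection

    async def stop(self):
        ...  # Cleanup

    async def send_message(self, chat_id, text, reply_to=None, parse_mode=None, message_thread_id=None):
        ...  # Send a message

    async def edit_message(self, chat_id, message_id, text, parse_mode=None):
        ...  # Edit an existing message

    def on_message(self, handler):
        ...  # Register callback for incoming messages
```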
---
## Contributing

- Add new LLM providers (Groq, Together AI, etc.)
- Add new messaging platforms (Slack, etc.)
- Improve test coverage
- New and interesting features

> Docker integration is not being accepted for now.

```bash
# Fork the repo, then:
git checkout -b my-feature
# Make your changes
uv run ruff format && uv run ruff check && uv run ty check && uv run pytest
# Open a pull request
```
## License

MIT License. See [LICENSE](LICENSE) for details.

Built with [FastAPI](https://fastapi.tiangolo.com/), [OpenAI Python SDK](https://github.com/openai/openai-python), [discord.py](https://github.com/Rapptz/discord.py), and [python-telegram-bot](https://python-telegram-bot.org/).