15 KiB
Free Claude Code
Use Claude Code CLI & VSCode — for free. No Anthropic API key required.
A lightweight proxy server that translates Claude Code's Anthropic API calls into NVIDIA NIM, OpenRouter, or LM Studio format. Get 40 free requests/min on NVIDIA NIM, access hundreds of models on OpenRouter, or run fully local with LM Studio.
Features · Quick Start · How It Works · Telegram Bot · Configuration
Claude Code running via NVIDIA NIM — completely free
Features
| Feature | Description |
|---|---|
| Zero Cost | 40 req/min free on NVIDIA NIM. Free models on OpenRouter. Fully local with LM Studio |
| Drop-in Replacement | Set 2 env vars — no modifications to Claude Code CLI or VSCode extension needed |
| 3 Providers | NVIDIA NIM, OpenRouter (hundreds of models), LM Studio (local & offline) |
| Thinking Token Support | Parses <think> tags and reasoning_content into native Claude thinking blocks |
| Heuristic Tool Parser | Models outputting tool calls as text are auto-parsed into structured tool use |
| Request Optimization | 5 categories of trivial API calls intercepted locally — saves quota and latency |
| Telegram Bot | Remote autonomous coding with tree-based threading, session persistence, and live progress |
| Smart Rate Limiting | Proactive rolling-window throttle + reactive 429 exponential backoff across all providers |
| Subagent Control | Task tool interception forces run_in_background=False — no runaway subagents |
| Extensible | Clean BaseProvider and MessagingPlatform ABCs — add new providers or platforms easily |
Quick Start
Prerequisites
- Get an API key (or use LM Studio locally):
- NVIDIA NIM: build.nvidia.com/settings/api-keys
- OpenRouter: openrouter.ai/keys
- LM Studio: No API key needed — run locally with LM Studio
- Install Claude Code
- Install uv
Clone & Configure
git clone https://github.com/Alishahryar1/free-claude-code.git
cd free-claude-code
cp .env.example .env
Choose your provider and edit .env:
NVIDIA NIM (recommended — 40 req/min free)
PROVIDER_TYPE=nvidia_nim
NVIDIA_NIM_API_KEY=nvapi-your-key-here
MODEL=moonshotai/kimi-k2-thinking
OpenRouter (hundreds of models)
PROVIDER_TYPE=open_router
OPENROUTER_API_KEY=sk-or-your-key-here
MODEL=stepfun/step-3.5-flash:free
LM Studio (fully local, no API key)
PROVIDER_TYPE=lmstudio
MODEL=lmstudio-community/qwen2.5-7b-instruct
Run It
Terminal 1 — Start the proxy server:
uv run uvicorn server:app --host 0.0.0.0 --port 8082
Terminal 2 — Run Claude Code:
ANTHROPIC_AUTH_TOKEN=freecc ANTHROPIC_BASE_URL=http://localhost:8082 claude
That's it! Claude Code now uses your configured provider for free.
VSCode Extension Setup
- Start the proxy server (same as above).
- Open Settings (
Ctrl + ,) and search forclaude-code.environmentVariables. - Click Edit in settings.json and add:
"claude-code.environmentVariables": [
{ "name": "ANTHROPIC_BASE_URL", "value": "http://localhost:8082" },
{ "name": "ANTHROPIC_AUTH_TOKEN", "value": "freecc" }
]
- Reload extensions.
- If you see the login screen ("How do you want to log in?"): Click Anthropic Console, then authorize. The extension will start working. You may be redirected to buy credits in the browser — ignore that; the extension already works.
To switch back to Anthropic models, comment out the added block and reload extensions.
How It Works
┌─────────────────┐ ┌──────────────────────┐ ┌──────────────────┐
│ Claude Code │───────>│ Free Claude Code │───────>│ LLM Provider │
│ CLI / VSCode │<───────│ Proxy (:8082) │<───────│ NIM / OR / LMS │
└─────────────────┘ └──────────────────────┘ └──────────────────┘
Anthropic API │ OpenAI-compatible
format (SSE) ┌───────┴────────┐ format (SSE)
│ Optimizations │
├────────────────┤
│ Quota probes │
│ Title gen skip │
│ Prefix detect │
│ Suggestion skip│
│ Filepath mock │
└────────────────┘
- Transparent proxy — Claude Code sends standard Anthropic API requests to the proxy server
- Request optimization — 5 categories of trivial requests (quota probes, title generation, prefix detection, suggestions, filepath extraction) are intercepted and responded to instantly without using API quota
- Format translation — Real requests are translated from Anthropic format to the provider's OpenAI-compatible format and streamed back
- Thinking tokens —
<think>tags andreasoning_contentfields are converted into native Claude thinking blocks so Claude Code renders them correctly
Providers
| Provider | Cost | Rate Limit | Models | Best For |
|---|---|---|---|---|
| NVIDIA NIM | Free | 40 req/min | Kimi K2, GLM5, Devstral, MiniMax | Daily driver — generous free tier |
| OpenRouter | Free / Pay | Varies | 200+ (GPT-4o, Claude, Step, etc.) | Model variety, fallback options |
| LM Studio | Free (local) | Unlimited | Any GGUF model | Privacy, offline use, no rate limits |
Switch providers by changing PROVIDER_TYPE in .env:
| Provider | PROVIDER_TYPE |
API Key Variable | Base URL |
|---|---|---|---|
| NVIDIA NIM | nvidia_nim |
NVIDIA_NIM_API_KEY |
integrate.api.nvidia.com/v1 |
| OpenRouter | open_router |
OPENROUTER_API_KEY |
openrouter.ai/api/v1 |
| LM Studio | lmstudio |
(none) | localhost:1234/v1 |
OpenRouter gives access to hundreds of models (StepFun, OpenAI, Anthropic, etc.) through a single API. Set MODEL to any OpenRouter model ID.
LM Studio runs locally — start the server in LM Studio's Developer tab or via lms server start, load a model, and set MODEL to the model identifier.
Telegram Bot
Control Claude Code remotely from your phone. Send tasks, watch live progress, and manage multiple concurrent sessions.
Capabilities:
- Tree-based message threading — reply to messages to fork conversations
- Session persistence across server restarts
- Live streaming of thinking tokens, tool calls, and results
- Up to 10 concurrent Claude CLI sessions
- Commands:
/stop(cancel tasks),/clear(reset all sessions),/stats
Setup
-
Get a Bot Token — Message @BotFather on Telegram, send
/newbot, and copy the HTTP API Token. -
Edit
.env:
TELEGRAM_BOT_TOKEN=123456789:ABCdefGHIjklMNOpqrSTUvwxYZ
ALLOWED_TELEGRAM_USER_ID=your_telegram_user_id
To find your Telegram user ID, message @userinfobot.
- Configure the workspace (where Claude will operate):
CLAUDE_WORKSPACE=./agent_workspace
ALLOWED_DIR=C:/Users/yourname/projects
- Start the server:
uv run uvicorn server:app --host 0.0.0.0 --port 8082
- Send a message to the bot with a task. Claude responds with thinking tokens, tool calls as they execute, and the final result. Reply
/stopto a running task to cancel it.
Models
NVIDIA NIM
Full list in nvidia_nim_models.json.
Popular models:
moonshotai/kimi-k2-thinkingz-ai/glm5stepfun-ai/step-3.5-flashmoonshotai/kimi-k2.5minimaxai/minimax-m2.1mistralai/devstral-2-123b-instruct-2512
Browse: build.nvidia.com
Update model list:
curl "https://integrate.api.nvidia.com/v1/models" > nvidia_nim_models.json
OpenRouter
Hundreds of models from StepFun, OpenAI, Anthropic, Google, and more.
Examples:
stepfun/step-3.5-flash:freeopenai/gpt-4o-minianthropic/claude-3.5-sonnet
Browse: openrouter.ai/models
LM Studio
Run models locally with LM Studio. Load a model in the Chat or Developer tab, then set MODEL to its identifier.
Examples (native tool-use support):
lmstudio-community/qwen2.5-7b-instructlmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUFbartowski/Ministral-8B-Instruct-2410-GGUF
Browse: model.lmstudio.ai
Configuration
| Variable | Description | Default |
|---|---|---|
PROVIDER_TYPE |
Provider: nvidia_nim, open_router, or lmstudio |
nvidia_nim |
MODEL |
Model to use for all requests | stepfun-ai/step-3.5-flash |
NVIDIA_NIM_API_KEY |
NVIDIA API key (NIM provider) | required |
OPENROUTER_API_KEY |
OpenRouter API key (OpenRouter provider) | required |
LM_STUDIO_BASE_URL |
LM Studio server URL | http://localhost:1234/v1 |
PROVIDER_RATE_LIMIT |
LLM API requests per window | 40 |
PROVIDER_RATE_WINDOW |
Rate limit window (seconds) | 60 |
FAST_PREFIX_DETECTION |
Enable fast prefix detection | true |
ENABLE_NETWORK_PROBE_MOCK |
Enable network probe mock | true |
ENABLE_TITLE_GENERATION_SKIP |
Skip title generation | true |
ENABLE_SUGGESTION_MODE_SKIP |
Skip suggestion mode | true |
ENABLE_FILEPATH_EXTRACTION_MOCK |
Enable filepath extraction mock | true |
TELEGRAM_BOT_TOKEN |
Telegram Bot Token | "" |
ALLOWED_TELEGRAM_USER_ID |
Allowed Telegram User ID | "" |
MESSAGING_RATE_LIMIT |
Telegram messages per window | 1 |
MESSAGING_RATE_WINDOW |
Messaging window (seconds) | 1 |
CLAUDE_WORKSPACE |
Directory for agent workspace | ./agent_workspace |
ALLOWED_DIR |
Allowed directories for agent | "" |
MAX_CLI_SESSIONS |
Max concurrent CLI sessions | 10 |
See .env.example for all supported parameters.
Development
Project Structure
free-claude-code/
├── server.py # Entry point
├── api/ # FastAPI routes, request detection, optimization handlers
├── providers/ # BaseProvider ABC + NVIDIA NIM, OpenRouter, LM Studio
├── messaging/ # MessagingPlatform ABC + Telegram bot, session management
├── config/ # Settings, NIM config, logging
├── cli/ # CLI session and process management
├── utils/ # Text utilities
└── tests/ # Pytest test suite
Commands
uv run pytest # Run tests
uv run ty check # Type checking
uv run ruff check # Code style checking
uv run ruff format # Code formatting
Extending
Adding a Provider
Extend BaseProvider in providers/ to add support for other APIs:
from providers.base import BaseProvider, ProviderConfig
class MyProvider(BaseProvider):
async def stream_response(self, request, input_tokens=0, *, request_id=None):
# Yield Anthropic SSE format events
...
Adding a Messaging Platform
Extend MessagingPlatform in messaging/ to add Discord, Slack, or other platforms:
from messaging.base import MessagingPlatform
class MyPlatform(MessagingPlatform):
async def start(self):
# Initialize connection
...
async def stop(self):
# Cleanup
...
async def send_message(self, chat_id, text, reply_to=None, parse_mode=None):
# Send a message
...
async def edit_message(self, chat_id, message_id, text, parse_mode=None):
# Edit an existing message
...
def on_message(self, handler):
# Register callback for incoming messages
...
Contributing
Contributions are welcome! Here are some ways to help:
- Report bugs or suggest features via Issues
- Add new LLM providers (Groq, Together AI, etc.)
- Add new messaging platforms (Discord, Slack, etc.)
- Improve test coverage
# Fork the repo, then:
git checkout -b my-feature
# Make your changes
uv run pytest && uv run ty check
# Open a pull request
License
This project is licensed under the MIT License — see the LICENSE file for details.
Built with FastAPI, OpenAI Python SDK, and python-telegram-bot.