
Free Claude Code

Use Claude Code CLI & VSCode — for free. No Anthropic API key required.

License: MIT · Python 3.14 · uv · Tested with Pytest · Type checking: Ty · Code style: Ruff · Logging: Loguru

A lightweight proxy server that translates Claude Code's Anthropic API calls into NVIDIA NIM, OpenRouter, or LM Studio format. Get 40 free requests/min on NVIDIA NIM, access hundreds of models on OpenRouter, or run fully local with LM Studio.

Features · Quick Start · How It Works · Telegram Bot · Configuration


Free Claude Code in action

Claude Code running via NVIDIA NIM — completely free

Features

| Feature | Description |
| --- | --- |
| Zero Cost | 40 req/min free on NVIDIA NIM. Free models on OpenRouter. Fully local with LM Studio |
| Drop-in Replacement | Set 2 env vars — no modifications to the Claude Code CLI or VSCode extension needed |
| 3 Providers | NVIDIA NIM, OpenRouter (hundreds of models), LM Studio (local & offline) |
| Thinking Token Support | Parses `<think>` tags and `reasoning_content` into native Claude thinking blocks |
| Heuristic Tool Parser | Tool calls that models emit as plain text are auto-parsed into structured tool use |
| Request Optimization | 5 categories of trivial API calls intercepted locally — saves quota and latency |
| Telegram Bot | Remote autonomous coding with tree-based threading, session persistence, and live progress |
| Smart Rate Limiting | Proactive rolling-window throttle + reactive 429 exponential backoff across all providers |
| Subagent Control | Task tool interception forces `run_in_background=False` — no runaway subagents |
| Extensible | Clean `BaseProvider` and `MessagingPlatform` ABCs — add new providers or platforms easily |
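To illustrate the "Heuristic Tool Parser" idea: some models emit tool calls as plain JSON in their text output instead of using structured tool-call fields. A minimal sketch of recovering those calls might scan for balanced `{...}` spans and keep the ones that parse as a tool call. The function name and the assumed `{"name": ..., "arguments": ...}` shape are illustrative, not the project's actual implementation:

```python
import json


def extract_tool_calls(text: str) -> list[dict]:
    """Sketch: find balanced {...} spans in model text and keep those that
    parse as a JSON object with "name" and "arguments" keys (a common
    shape for text-emitted tool calls). The real parser handles more cases."""
    calls, i = [], 0
    while (start := text.find("{", i)) != -1:
        depth = 0
        for j in range(start, len(text)):
            if text[j] == "{":
                depth += 1
            elif text[j] == "}":
                depth -= 1
                if depth == 0:
                    try:
                        obj = json.loads(text[start : j + 1])
                    except json.JSONDecodeError:
                        obj = None
                    if isinstance(obj, dict) and "name" in obj and "arguments" in obj:
                        # Re-emit in Anthropic tool_use block shape.
                        calls.append({"type": "tool_use",
                                      "name": obj["name"],
                                      "input": obj["arguments"]})
                        i = j + 1  # continue scanning after this call
                    else:
                        i = start + 1  # not a tool call; rescan inner braces
                    break
        else:
            i = start + 1  # unbalanced brace; skip it
    return calls
```

Anything the heuristic misses simply stays as plain text, so the worst case is unstructured output rather than a broken response.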

Quick Start

Prerequisites

  1. Get an API key for your provider — NVIDIA NIM or OpenRouter (not needed for LM Studio, which runs locally)
  2. Install Claude Code
  3. Install uv

Clone & Configure

```sh
git clone https://github.com/Alishahryar1/free-claude-code.git
cd free-claude-code
cp .env.example .env
```

Choose your provider and edit .env:

**NVIDIA NIM** (recommended — 40 req/min free)

```
PROVIDER_TYPE=nvidia_nim
NVIDIA_NIM_API_KEY=nvapi-your-key-here
MODEL=moonshotai/kimi-k2-thinking
```

**OpenRouter** (hundreds of models)

```
PROVIDER_TYPE=open_router
OPENROUTER_API_KEY=sk-or-your-key-here
MODEL=stepfun/step-3.5-flash:free
```

**LM Studio** (fully local, no API key)

```
PROVIDER_TYPE=lmstudio
MODEL=lmstudio-community/qwen2.5-7b-instruct
```

Run It

Terminal 1 — Start the proxy server:

```sh
uv run uvicorn server:app --host 0.0.0.0 --port 8082
```

Terminal 2 — Run Claude Code:

```sh
ANTHROPIC_AUTH_TOKEN=freecc ANTHROPIC_BASE_URL=http://localhost:8082 claude
```

That's it! Claude Code now uses your configured provider for free.

VSCode Extension Setup
  1. Start the proxy server (same as above).
  2. Open Settings (Ctrl + ,) and search for claude-code.environmentVariables.
  3. Click Edit in settings.json and add:

```json
"claude-code.environmentVariables": [
  { "name": "ANTHROPIC_BASE_URL", "value": "http://localhost:8082" },
  { "name": "ANTHROPIC_AUTH_TOKEN", "value": "freecc" }
]
```

  4. Reload extensions.
  5. If you see the login screen ("How do you want to log in?"): click Anthropic Console, then authorize. The extension will start working. You may be redirected to buy credits in the browser — ignore that; the extension already works.

To switch back to Anthropic models, comment out the added block and reload extensions.


How It Works

```
┌─────────────────┐        ┌──────────────────────┐        ┌──────────────────┐
│  Claude Code    │───────>│  Free Claude Code    │───────>│  LLM Provider    │
│  CLI / VSCode   │<───────│  Proxy (:8082)       │<───────│  NIM / OR / LMS  │
└─────────────────┘        └──────────────────────┘        └──────────────────┘
   Anthropic API                     │                       OpenAI-compatible
   format (SSE)              ┌───────┴────────┐                format (SSE)
                             │ Optimizations  │
                             ├────────────────┤
                             │ Quota probes   │
                             │ Title gen skip │
                             │ Prefix detect  │
                             │ Suggestion skip│
                             │ Filepath mock  │
                             └────────────────┘
```
  • Transparent proxy — Claude Code sends standard Anthropic API requests to the proxy server
  • Request optimization — 5 categories of trivial requests (quota probes, title generation, prefix detection, suggestions, filepath extraction) are intercepted and responded to instantly without using API quota
  • Format translation — Real requests are translated from Anthropic format to the provider's OpenAI-compatible format and streamed back
  • Thinking tokens — `<think>` tags and `reasoning_content` fields are converted into native Claude thinking blocks so Claude Code renders them correctly
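The thinking-token conversion above can be sketched as a small splitter that turns `<think>...</think>` spans into Anthropic-style `thinking` content blocks and leaves everything else as `text` blocks. The function name is hypothetical and this ignores the streaming case, where tags can arrive split across chunks:

```python
import re


def split_thinking(text: str) -> list[dict]:
    """Sketch: convert <think>...</think> spans in model output into
    Anthropic-style thinking blocks; remaining text becomes text blocks."""
    blocks = []
    pos = 0
    for m in re.finditer(r"<think>(.*?)</think>", text, re.DOTALL):
        before = text[pos:m.start()].strip()
        if before:
            blocks.append({"type": "text", "text": before})
        blocks.append({"type": "thinking", "thinking": m.group(1).strip()})
        pos = m.end()
    tail = text[pos:].strip()
    if tail:
        blocks.append({"type": "text", "text": tail})
    return blocks
```

Providers that return reasoning in a separate `reasoning_content` field skip the tag parsing and map that field straight to a thinking block.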

Providers

| Provider | Cost | Rate Limit | Models | Best For |
| --- | --- | --- | --- | --- |
| NVIDIA NIM | Free | 40 req/min | Kimi K2, GLM5, Devstral, MiniMax | Daily driver — generous free tier |
| OpenRouter | Free / Pay | Varies | 200+ (GPT-4o, Claude, Step, etc.) | Model variety, fallback options |
| LM Studio | Free (local) | Unlimited | Any GGUF model | Privacy, offline use, no rate limits |

Switch providers by changing PROVIDER_TYPE in .env:

| Provider | PROVIDER_TYPE | API Key Variable | Base URL |
| --- | --- | --- | --- |
| NVIDIA NIM | `nvidia_nim` | `NVIDIA_NIM_API_KEY` | integrate.api.nvidia.com/v1 |
| OpenRouter | `open_router` | `OPENROUTER_API_KEY` | openrouter.ai/api/v1 |
| LM Studio | `lmstudio` | (none) | localhost:1234/v1 |

OpenRouter gives access to hundreds of models (StepFun, OpenAI, Anthropic, etc.) through a single API. Set MODEL to any OpenRouter model ID.

LM Studio runs locally — start the server in LM Studio's Developer tab or via lms server start, load a model, and set MODEL to the model identifier.


Telegram Bot

Control Claude Code remotely from your phone. Send tasks, watch live progress, and manage multiple concurrent sessions.

Capabilities:

  • Tree-based message threading — reply to messages to fork conversations
  • Session persistence across server restarts
  • Live streaming of thinking tokens, tool calls, and results
  • Up to 10 concurrent Claude CLI sessions
  • Commands: /stop (cancel tasks), /clear (reset all sessions), /stats

Setup

  1. Get a Bot Token — Message @BotFather on Telegram, send /newbot, and copy the HTTP API Token.

  2. Edit .env:

```
TELEGRAM_BOT_TOKEN=123456789:ABCdefGHIjklMNOpqrSTUvwxYZ
ALLOWED_TELEGRAM_USER_ID=your_telegram_user_id
```

To find your Telegram user ID, message @userinfobot.

  3. Configure the workspace (where Claude will operate):

```
CLAUDE_WORKSPACE=./agent_workspace
ALLOWED_DIR=C:/Users/yourname/projects
```

  4. Start the server:

```sh
uv run uvicorn server:app --host 0.0.0.0 --port 8082
```

  5. Send a message to the bot with a task. Claude responds with thinking tokens, tool calls as they execute, and the final result. Reply /stop to a running task to cancel it.

Models

NVIDIA NIM

Full list in nvidia_nim_models.json.

Popular models:

  • moonshotai/kimi-k2-thinking
  • z-ai/glm5
  • stepfun-ai/step-3.5-flash
  • moonshotai/kimi-k2.5
  • minimaxai/minimax-m2.1
  • mistralai/devstral-2-123b-instruct-2512

Browse: build.nvidia.com

Update model list:

```sh
curl "https://integrate.api.nvidia.com/v1/models" > nvidia_nim_models.json
```
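The downloaded file follows the OpenAI-compatible `/v1/models` shape, i.e. `{"data": [{"id": ...}, ...]}`. A small sketch for skimming the available model ids (the helper name is ours, not part of the project):

```python
import json


def model_ids(listing: dict) -> list[str]:
    """Extract sorted model ids from an OpenAI-style /v1/models payload."""
    return sorted(m["id"] for m in listing.get("data", []))


# Usage, assuming the curl command above has been run:
# with open("nvidia_nim_models.json", encoding="utf-8") as f:
#     print("\n".join(model_ids(json.load(f))))
```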
OpenRouter

Hundreds of models from StepFun, OpenAI, Anthropic, Google, and more.

Examples:

  • stepfun/step-3.5-flash:free
  • openai/gpt-4o-mini
  • anthropic/claude-3.5-sonnet

Browse: openrouter.ai/models

LM Studio

Run models locally with LM Studio. Load a model in the Chat or Developer tab, then set MODEL to its identifier.

Examples (native tool-use support):

  • lmstudio-community/qwen2.5-7b-instruct
  • lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF
  • bartowski/Ministral-8B-Instruct-2410-GGUF

Browse: model.lmstudio.ai


Configuration

| Variable | Description | Default |
| --- | --- | --- |
| `PROVIDER_TYPE` | Provider: `nvidia_nim`, `open_router`, or `lmstudio` | `nvidia_nim` |
| `MODEL` | Model to use for all requests | `stepfun-ai/step-3.5-flash` |
| `NVIDIA_NIM_API_KEY` | NVIDIA API key (NIM provider) | required |
| `OPENROUTER_API_KEY` | OpenRouter API key (OpenRouter provider) | required |
| `LM_STUDIO_BASE_URL` | LM Studio server URL | `http://localhost:1234/v1` |
| `PROVIDER_RATE_LIMIT` | LLM API requests per window | `40` |
| `PROVIDER_RATE_WINDOW` | Rate limit window (seconds) | `60` |
| `FAST_PREFIX_DETECTION` | Enable fast prefix detection | `true` |
| `ENABLE_NETWORK_PROBE_MOCK` | Enable network probe mock | `true` |
| `ENABLE_TITLE_GENERATION_SKIP` | Skip title generation | `true` |
| `ENABLE_SUGGESTION_MODE_SKIP` | Skip suggestion mode | `true` |
| `ENABLE_FILEPATH_EXTRACTION_MOCK` | Enable filepath extraction mock | `true` |
| `TELEGRAM_BOT_TOKEN` | Telegram bot token | `""` |
| `ALLOWED_TELEGRAM_USER_ID` | Allowed Telegram user ID | `""` |
| `MESSAGING_RATE_LIMIT` | Telegram messages per window | `1` |
| `MESSAGING_RATE_WINDOW` | Messaging window (seconds) | `1` |
| `CLAUDE_WORKSPACE` | Directory for agent workspace | `./agent_workspace` |
| `ALLOWED_DIR` | Allowed directories for agent | `""` |
| `MAX_CLI_SESSIONS` | Max concurrent CLI sessions | `10` |

See .env.example for all supported parameters.
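The proactive rolling-window throttle behind `PROVIDER_RATE_LIMIT` and `PROVIDER_RATE_WINDOW` can be sketched as a deque of send timestamps. This is a minimal illustration with an injectable clock, not the project's actual limiter, which also layers reactive 429 backoff on top:

```python
import time
from collections import deque


class RollingWindowLimiter:
    """Sketch: allow at most `limit` requests per `window` seconds,
    mirroring PROVIDER_RATE_LIMIT / PROVIDER_RATE_WINDOW."""

    def __init__(self, limit: int = 40, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.sent: deque[float] = deque()  # timestamps of recent requests

    def wait_time(self) -> float:
        """Seconds to wait before the next request may be sent (0 if now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        # Full window: wait until the oldest request expires.
        return self.window - (now - self.sent[0])

    def record(self) -> None:
        """Call after each request actually sent to the provider."""
        self.sent.append(self.clock())
```

Sleeping for `wait_time()` before each upstream call keeps the proxy under the provider's quota instead of discovering the limit via 429 responses.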


Development

Project Structure

```
free-claude-code/
├── server.py              # Entry point
├── api/                   # FastAPI routes, request detection, optimization handlers
├── providers/             # BaseProvider ABC + NVIDIA NIM, OpenRouter, LM Studio
├── messaging/             # MessagingPlatform ABC + Telegram bot, session management
├── config/                # Settings, NIM config, logging
├── cli/                   # CLI session and process management
├── utils/                 # Text utilities
└── tests/                 # Pytest test suite
```

Commands

```sh
uv run pytest          # Run tests
uv run ty check        # Type checking
uv run ruff check      # Code style checking
uv run ruff format     # Code formatting
```

Extending

Adding a Provider

Extend BaseProvider in providers/ to add support for other APIs:

```python
from providers.base import BaseProvider, ProviderConfig

class MyProvider(BaseProvider):
    async def stream_response(self, request, input_tokens=0, *, request_id=None):
        # Yield Anthropic SSE format events
        ...
```

Adding a Messaging Platform

Extend MessagingPlatform in messaging/ to add Discord, Slack, or other platforms:

```python
from messaging.base import MessagingPlatform

class MyPlatform(MessagingPlatform):
    async def start(self):
        # Initialize connection
        ...

    async def stop(self):
        # Cleanup
        ...

    async def send_message(self, chat_id, text, reply_to=None, parse_mode=None):
        # Send a message
        ...

    async def edit_message(self, chat_id, message_id, text, parse_mode=None):
        # Edit an existing message
        ...

    def on_message(self, handler):
        # Register callback for incoming messages
        ...
```

Contributing

Contributions are welcome! Here are some ways to help:

  • Report bugs or suggest features via Issues
  • Add new LLM providers (Groq, Together AI, etc.)
  • Add new messaging platforms (Discord, Slack, etc.)
  • Improve test coverage
```sh
# Fork the repo, then:
git checkout -b my-feature
# Make your changes
uv run pytest && uv run ty check
# Open a pull request
```

License

This project is licensed under the MIT License — see the LICENSE file for details.

Built with FastAPI, OpenAI Python SDK, and python-telegram-bot.