mirror of https://github.com/Alishahryar1/free-claude-code.git synced 2026-05-20 00:56:31 +00:00

mirror of https://github.com/Alishahryar1/free-claude-code link from https://www.reddit.com/r/ZaiGLM/comments/1rhslai/glm5_is_officially_fixed_on_nvidia_nim_and_you/

Find a file

Alishahryar1 54a9dc4e34 Some checks are pending CI / Ban type ignore suppressions (push) Waiting to run Details CI / pytest (push) Waiting to run Details CI / ruff-check (push) Waiting to run Details CI / ruff-format (push) Waiting to run Details CI / ty (push) Waiting to run Details Update README voice section		2026-05-18 13:51:56 -07:00
.github	Remove agg ci job	2026-05-17 15:19:01 -07:00
api	Remove check local button from admin page	2026-05-17 13:04:07 -07:00
assets	Update README	2026-05-17 16:51:00 -07:00
cli	Reapply "set auto compaction window for fcc-claude to 190K"	2026-05-17 19:58:44 -07:00
config	Add fireworks AI support (#476 )	2026-05-18 05:32:25 -07:00
core	Revert "Improve provider error diagnostics"	2026-05-16 14:49:36 -07:00
messaging	feat(logging): structured TRACE events and end-to-end request correlation	2026-05-10 18:24:48 -07:00
providers	Add fireworks AI support (#476 )	2026-05-18 05:32:25 -07:00
smoke	Remove Anthropic API key from proxy child env	2026-05-16 15:13:05 -07:00
tests	Add fireworks AI support (#476 )	2026-05-18 05:32:25 -07:00
.env.example	open admin page on server startup	2026-05-17 12:59:48 -07:00
.gitignore	Add fireworks AI support (#476 )	2026-05-18 05:32:25 -07:00
.python-version	Set python version to 3.14.0	2026-03-02 05:13:04 -08:00
AGENTS.md	Remove agg ci job	2026-05-17 15:19:01 -07:00
CLAUDE.md	Updated Claude.md to point to AGENTS.md	2026-03-08 12:19:18 -07:00
LICENSE	added license	2026-01-28 13:36:34 -08:00
pyproject.toml	build(deps): bump the minor-and-patch group with 6 updates (#457 )	2026-05-15 12:55:49 -07:00
README.md	Update README voice section	2026-05-18 13:51:56 -07:00
server.py	Report startup validation failures without tracebacks	2026-04-30 00:43:43 -07:00
uv.lock	build(deps): bump the minor-and-patch group with 6 updates (#457 )	2026-05-15 12:55:49 -07:00

README.md

🤖 Free Claude Code

Use Claude Code CLI, VS Code, JetBrains ACP, or chat bots through your own Anthropic-compatible proxy.

Free Claude Code routes Anthropic Messages API traffic from Claude Code to NVIDIA NIM, Kimi, Wafer, OpenRouter, DeepSeek, LM Studio, llama.cpp, or Ollama. It keeps Claude Code's client-side protocol stable while letting you choose free, paid, or local models.

Quick Start · Providers · Clients · Integrations · Development

Star History

What You Get

Drop-in proxy for Claude Code's Anthropic API calls.
Ten provider backends: NVIDIA NIM, Kimi, Wafer, OpenRouter, DeepSeek, LM Studio, llama.cpp, Ollama, OpenCode Zen, and Z.ai.
Per-model routing: send Opus, Sonnet, Haiku, and fallback traffic to different providers.
Native Claude Code /model picker support through the proxy's /v1/models endpoint (Claude Code must opt in to Gateway model discovery; see Model Picker).
Streaming, tool use, reasoning/thinking block handling, and local request optimizations.
Optional Discord or Telegram bot wrapper for remote coding sessions.
Optional Usage through the VSCode extension.
Optional voice-note transcription through local Whisper or NVIDIA NIM.
Local Admin UI at /admin to edit supported proxy settings, validate changes, and check providers (loopback access only).

Quick Start

1. Install the latest version of Claude Code

npm install -g @anthropic-ai/claude-code

2. Install Runtime Requirements

Install the latest version of uv and Python 3.14.

macOS/Linux:

curl -LsSf https://astral.sh/uv/install.sh | sh
uv self update
uv python install 3.14

Windows PowerShell:

powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
uv self update
uv python install 3.14

3. Get An NVIDIA NIM API Key

Create a free NVIDIA NIM API key, then keep it ready for the Admin UI setup step.

See NVIDIA NIM provider setup.

4. Install The Proxy

uv tool install --force git+https://github.com/Alishahryar1/free-claude-code.git

Use the same command to update to the latest version.

5. Start The Proxy

fcc-server

After startup, Uvicorn prints the proxy bind address and the app logs the admin URL:

INFO:     Admin UI: http://127.0.0.1:8082/admin (local-only)

Many terminals make these clickable. Use your configured PORT if it is not 8082.

6. Open The Admin UI And Configure NVIDIA NIM

Open the Admin UI URL from the terminal output.

Paste your NVIDIA NIM API key into NVIDIA_NIM_API_KEY, then click Validate and Apply.

The default model is already set to nvidia_nim/z-ai/glm4.7. You can change it later from the same Admin UI.

7. Run Claude Code

fcc-claude

fcc-claude reads the current configured port and auth token each time it starts, sets the Claude Code environment variables (including a 190k-token CLAUDE_CODE_AUTO_COMPACT_WINDOW for auto-compaction), and then launches the real claude command.

Choose A Provider

Pick one provider, enter its key or local URL in the Admin UI, and set MODEL to a provider-prefixed model slug. MODEL is the fallback. MODEL_OPUS, MODEL_SONNET, and MODEL_HAIKU can override routing for Claude Code's model tiers.

1. NVIDIA NIM

Get a key at build.nvidia.com/settings/api-keys.

In the Admin UI, paste it into NVIDIA_NIM_API_KEY. The default MODEL is nvidia_nim/z-ai/glm4.7.

Popular examples:

nvidia_nim/z-ai/glm4.7
nvidia_nim/z-ai/glm5
nvidia_nim/moonshotai/kimi-k2.5
nvidia_nim/minimaxai/minimax-m2.5

Browse models at build.nvidia.com.

2. Kimi

Get a key at platform.moonshot.ai/console/api-keys.

In the Admin UI, paste it into KIMI_API_KEY, then set MODEL to a Kimi slug such as kimi/kimi-k2.5.

Browse models at platform.moonshot.ai.

3. Wafer

Get a key from wafer.ai. In the Admin UI, paste it into WAFER_API_KEY, then set MODEL to a Wafer Pass model such as wafer/DeepSeek-V4-Pro.

Popular examples:

wafer/DeepSeek-V4-Pro
wafer/MiniMax-M2.7
wafer/Qwen3.5-397B-A17B
wafer/GLM-5.1

This provider uses Wafer's Anthropic-compatible endpoint at https://pass.wafer.ai/v1/messages.

4. OpenRouter

Get a key at openrouter.ai/keys.

In the Admin UI, paste it into OPENROUTER_API_KEY, then set MODEL to an OpenRouter slug such as open_router/stepfun/step-3.5-flash:free.

Browse all models or free models.

5. DeepSeek

Get a key at platform.deepseek.com/api_keys.

In the Admin UI, paste it into DEEPSEEK_API_KEY, then set MODEL to a DeepSeek slug such as deepseek/deepseek-chat.

This provider uses DeepSeek's Anthropic-compatible endpoint, not the OpenAI chat-completions endpoint.

6. LM Studio

Start LM Studio's local server and load a model. In the Admin UI, keep or update LM_STUDIO_BASE_URL, then set MODEL to the model identifier shown by LM Studio, prefixed with lmstudio/.

Prefer models with tool-use support for Claude Code workflows.

7. llama.cpp

Start llama-server with an Anthropic-compatible /v1/messages endpoint and enough context for Claude Code requests.

In the Admin UI, keep or update LLAMACPP_BASE_URL, then set MODEL to the local model slug, prefixed with llamacpp/.

For local coding models, context size matters. If llama.cpp returns HTTP 400 for normal Claude Code requests, increase --ctx-size and verify the model/server build supports the requested features.

8. Ollama

Run Ollama and pull a model:

ollama pull llama3.1
ollama serve

In the Admin UI, keep or update OLLAMA_BASE_URL, then set MODEL to the same tag shown by ollama list, prefixed with ollama/.

OLLAMA_BASE_URL is the Ollama server root; do not append /v1. Example model slugs include ollama/llama3.1 and ollama/llama3.1:8b.

9. OpenCode Zen

Get an API key at opencode.ai/auth.

In the Admin UI, paste it into OPENCODE_API_KEY, then set MODEL to an OpenCode Zen model slug such as opencode/gpt-5.3-codex.

OpenCode Zen is a curated model gateway that provides access to models from Anthropic, OpenAI, Google, DeepSeek, and more through a single API key and OpenAI-compatible endpoint at https://opencode.ai/zen/v1.

Popular examples:

opencode/gpt-5.3-codex
opencode/claude-sonnet-4
opencode/deepseek-v4-flash-free (free)
opencode/gemini-3-flash
opencode/big-pickle (free)
opencode/glm-5.1

Browse available models at opencode.ai.

10. Z.ai

Get an API key at Z.ai/manage-apikey/apikey-list.

In the Admin UI, paste it into ZAI_API_KEY, then set MODEL to a Z.ai model slug such as zai/glm-5.1.

Z.ai provides GLM models through the OpenAI-compatible Coding Plan endpoint at https://api.z.ai/api/coding/paas/v4.

Popular examples:

zai/glm-5.1
zai/glm-5-turbo

Browse models at Z.ai.

11. Mix Providers By Model Tier

Each model tier can use a different provider by setting MODEL_OPUS, MODEL_SONNET, and MODEL_HAIKU in the Admin UI. Leave a tier blank to inherit MODEL.

For example, you can route Opus to nvidia_nim/moonshotai/kimi-k2.5, Sonnet to open_router/deepseek/deepseek-r1-0528:free, Haiku to lmstudio/unsloth/GLM-4.7-Flash-GGUF, and keep the fallback MODEL on zai/glm-5.1.

Connect Claude Code

1. Claude Code CLI

For terminal use, prefer the installed launcher:

fcc-claude

Keep fcc-server running while you work. The Admin UI manages proxy config, restarts the server when runtime settings change, and fcc-claude reads the current Admin UI-managed port and auth token every time it starts. It also sets CLAUDE_CODE_AUTO_COMPACT_WINDOW to 190000 for auto-compaction.

2. VS Code Extension

Open Settings, search for claude-code.environmentVariables, choose Edit in settings.json, and add:

"claudeCode.environmentVariables": [
  { "name": "ANTHROPIC_BASE_URL", "value": "http://localhost:8082" },
  { "name": "ANTHROPIC_AUTH_TOKEN", "value": "freecc" },
  { "name": "CLAUDE_CODE_ENABLE_GATEWAY_MODEL_DISCOVERY", "value": "1" },
  { "name": "CLAUDE_CODE_AUTO_COMPACT_WINDOW", "value": "190000" }
]

Reload the extension. If the extension shows a login screen, choose the Anthropic Console path once; the local proxy still handles model traffic after the environment variables are active.

3. JetBrains ACP

Edit the installed Claude ACP config:

Windows: C:\Users\%USERNAME%\AppData\Roaming\JetBrains\acp-agents\installed.json
Linux/macOS: ~/.jetbrains/acp.json

Set the environment for acp.registry.claude-acp:

"env": {
  "ANTHROPIC_BASE_URL": "http://localhost:8082",
  "ANTHROPIC_AUTH_TOKEN": "freecc",
  "CLAUDE_CODE_ENABLE_GATEWAY_MODEL_DISCOVERY": "1",
  "CLAUDE_CODE_AUTO_COMPACT_WINDOW": "190000"
}

Restart the IDE after changing the file.

4. Model Picker

Claude Code model picker showing gateway models

Optional Integrations

For every integration below, change managed proxy settings only in the Admin UI at /admin: edit fields, click Validate, then Apply. The footer shows where the managed config is stored; this README does not walk through editing that file by hand.

1. Discord And Telegram Bots

The bot wrapper runs Claude Code sessions remotely, streams progress, supports reply-based conversation branches, and can stop or clear tasks.

Discord

Create the bot in the Discord Developer Portal.
Enable Message Content Intent.
Invite the bot with read, send, and message history permissions.
Copy the bot token and the numeric channel ID (or IDs) where the bot should respond.

Telegram

Create a bot with @BotFather and copy the bot token.
Get your numeric user ID from @userinfobot so only you can use the bot.

Configure in the Admin UI

With fcc-server running, open the Admin UI URL from the terminal output.
In the sidebar, choose Messaging.
Set Messaging Platform to discord or telegram.
For Discord, paste Discord Bot Token and Allowed Discord Channels. For Telegram, paste Telegram Bot Token and Allowed Telegram User ID.
Set Allowed Directory to an absolute path on the machine running the proxy—the workspace root the bot may use.
Click Validate, then Apply. Restart the server if the UI says one is required.

Admin UI Messaging view with bot and voice settings

Admin UI → Messaging (platform, bots, and Voice)

Useful commands

/stop cancels a task; reply to a task message to stop only that branch.
/clear resets sessions; reply to clear one branch.
/stats shows session state.

2. Voice Notes

Voice notes work on Discord and Telegram after you extend your proxy install with the matching optional extras. Re-run uv tool install --force with the extras you need (same Git URL as Quick Start):

# NVIDIA NIM transcription (Riva gRPC)
uv tool install --force "free-claude-code[voice] @ git+https://github.com/Alishahryar1/free-claude-code.git"

# Local Whisper (CPU or CUDA)
uv tool install --force "free-claude-code[voice_local] @ git+https://github.com/Alishahryar1/free-claude-code.git"

# Both backends
uv tool install --force "free-claude-code[voice,voice_local] @ git+https://github.com/Alishahryar1/free-claude-code.git"

For cuda local Whisper, add --torch-backend cu130 to the voice_local install command. Restart fcc-server after reinstalling.

In the Admin UI, open Messaging and scroll to Voice. Turn on Voice Notes, choose Whisper Device (cpu, cuda, or nvidia_nim), set Whisper Model, and enter Hugging Face Token when your setup needs it. For nvidia_nim transcription, install the voice extra and set NVIDIA NIM API Key on the Providers view. The screenshot above shows the Voice block in the same view.

How It Works

Free Claude Code request flow architecture

Diagram source: assets/how-it-works.mmd.

Important pieces:

FastAPI exposes Anthropic-compatible routes such as /v1/messages, /v1/messages/count_tokens, and /v1/models.
Model routing resolves the Claude model name to MODEL_OPUS, MODEL_SONNET, MODEL_HAIKU, or MODEL.
NIM, OpenCode Zen, Z.ai use OpenAI chat streaming translated into Anthropic SSE.
Wafer, OpenRouter, DeepSeek, LM Studio, llama.cpp, and Ollama use Anthropic Messages style transports.
The proxy normalizes thinking blocks, tool calls, token usage metadata, and provider errors into the shape Claude Code expects.
Request optimizations answer trivial Claude Code probes locally to save latency and quota.

Development

1. Project Structure

free-claude-code/
├── server.py              # ASGI entry point
├── api/                   # FastAPI routes, service layer, routing, optimizations
├── core/                  # Shared Anthropic protocol helpers and SSE utilities
├── providers/             # Provider transports, registry, rate limiting
├── messaging/             # Discord/Telegram adapters, sessions, voice
├── cli/                   # Package entry points and Claude process management
├── config/                # Settings, provider catalog, logging
└── tests/                 # Unit and contract tests

2. Run From Source

Use this path if you are developing or want to run directly from a checkout:

git clone https://github.com/Alishahryar1/free-claude-code.git
cd free-claude-code
uv run uvicorn server:app --host 0.0.0.0 --port 8082

3. Commands

uv run ruff format
uv run ruff check
uv run ty check
uv run pytest

Run them in that order before pushing. CI enforces the same checks.

4. Package Scripts

pyproject.toml installs:

fcc-server: starts the proxy with configured host and port.
fcc-init: optional advanced scaffold for ~/.fcc/.env; prefer the Admin UI for normal configuration.
fcc-claude: launches Claude Code with the configured local proxy URL, auth token, model discovery flag, and a 190k CLAUDE_CODE_AUTO_COMPACT_WINDOW for auto-compaction.
free-claude-code: compatibility alias for fcc-server.

5. Extending

Add OpenAI-compatible providers by extending OpenAIChatTransport.
Add Anthropic Messages providers by extending AnthropicMessagesTransport.
Register provider metadata in config.provider_catalog and factory wiring in providers.registry.
Add messaging platforms by implementing the MessagingPlatform interface in messaging/.

Contributing

.env.example lists env key names as a read-only reference for contributors; use the Admin UI to change managed proxy settings.
Report bugs and feature requests in Issues.
Keep changes small and covered by focused tests.
Do not open Docker integration PRs.
Do not open README change PRs just open an issue for it.
Run the full check sequence before opening a pull request.
The syntax except X, Y is brought back in python 3.14 final version (not in 3.14 alpha). Keep in mind before opening PRs.

License

MIT License. See LICENSE for details.