vrr/free-claude-code

mirror of https://github.com/Alishahryar1/free-claude-code.git synced 2026-04-28 03:20:01 +00:00

Author	SHA1	Message	Date
arssing	2fe15bd2cd	feat: add proxy support for httpx clients (#125) Add proxy support for providers based on [doc](https://www.python-httpx.org/advanced/proxies/): - Add per-provider proxy support (HTTP and SOCKS5) for all 4 providers: nvidia_nim, open_router, lmstudio, llamacpp - Each provider gets its own env var (NVIDIA_NIM_PROXY, OPENROUTER_PROXY, LMSTUDIO_PROXY, LLAMACPP_PROXY) for independent proxy configuration --------- Co-authored-by: Alishahryar1 <alishahryar2@gmail.com>	2026-04-22 17:06:16 -07:00
Pavel Yurchenko	e719e4aed2	feat: deepseek api support (#118) ## Summary * add native DeepSeek provider support via the shared OpenAI-compatible provider base * allow `deepseek/...` model prefixes in config validation * add `DEEPSEEK_API_KEY` and `DEEPSEEK_BASE_URL` settings * add DeepSeek entries to `.env.example` and `config/env.example` * implement `DeepSeekProvider` and register it in provider dependencies * add a DeepSeek request builder with DeepSeek-specific thinking payload handling * preserve Anthropic thinking blocks as `reasoning_content` for DeepSeek-compatible continuation flows * update `claude-pick` to discover DeepSeek models from the DeepSeek API * document DeepSeek usage in `README.md` * add tests for config validation, provider dependency wiring, request building, and streaming behavior ## Motivation DeepSeek exposes an OpenAI-compatible API and can be used directly without routing through OpenRouter. This lets users spend their existing DeepSeek balance through the proxy while keeping the same Claude Code workflow and per-model provider mapping. ## Example ```dotenv DEEPSEEK_API_KEY="sk-..." DEEPSEEK_BASE_URL="https://api.deepseek.com" MODEL_OPUS="deepseek/deepseek-reasoner" MODEL_SONNET="deepseek/deepseek-chat" MODEL_HAIKU="deepseek/deepseek-chat" MODEL="deepseek/deepseek-chat" ``` --------- Co-authored-by: Alishahryar1 <alishahryar2@gmail.com>	2026-04-22 17:06:01 -07:00
Alishahryar1	c0d0ac6d42	lint	2026-04-18 16:33:49 -07:00
Alishahryar1	835d0454e8	Fixes for issue 113 and 116	2026-04-18 16:32:31 -07:00
Alishahryar1	b75f47b62d	Gate NIM thinking params behind NIM_ENABLE_THINKING env var Mistral models reject chat_template_kwargs, causing 400 errors. Make thinking params (chat_template_kwargs, reasoning_budget) opt-in via NIM_ENABLE_THINKING env var (default false) so only models that need it (kimi, nemotron) receive them.	2026-03-27 21:44:36 -07:00
th-ch	f703a0e403	Implement optional authentication (Anthropic style) (#80)	2026-03-27 11:11:47 -07:00
Alishahryar1	5a36a32836	feat: add llama.cpp provider for local anthropic messages API	2026-03-08 10:38:25 -07:00
Ali Khokhar	c5341ecbbe	Add option for an installable package (#75)	2026-03-06 22:06:33 -08:00
Alishahryar1	a7d88d5cbd	Updated README with per-model mapping, fixed test .env isolation	2026-03-01 21:52:35 -08:00
Ali Khokhar	0b324e0421	Per claude model mapping (#66)	2026-03-01 21:32:23 -08:00
Mauro Druwel	de70700dde	feat: Use NVIDIA NIM ASR for audio transcription (#53) ## Summary Added NVIDIA NIM as a second transcription option (alongside local Whisper). This lets you transcribe voice notes using NVIDIA's cloud API instead of running Whisper locally. ## What changed - Transcription: Now supports two backends - Local Whisper: Free, runs on your GPU/CPU (existing) - NVIDIA NIM: Cloud API via Riva gRPC (new) - Supported models: 8 NVIDIA NIM models added (Parakeet variants for different languages, Whisper Large V3) --------- Co-authored-by: Alishahryar1 <alishahryar2@gmail.com>	2026-02-28 08:48:59 -08:00
Alishahryar1	a74ec74271	Major refactor done with minimax m2.5	2026-02-28 04:36:29 -08:00
Alishahryar1	d6a0e1a401	Provider inferred from model name using prefix	2026-02-19 20:53:02 -08:00
Claude	45b7e4cafd	Make PROVIDER_MAX_CONCURRENCY required with default of 5 - `max_concurrency` is now always an `int` (default 5) — `None`/unlimited is no longer a valid state; omitting the env var uses the default - `GlobalRateLimiter`: semaphore is always created; `concurrency_slot()` no longer has None guards; log message always includes concurrency - `ProviderConfig.max_concurrency`: `int = 5` (was `int | None = None`) - `Settings.provider_max_concurrency`: `int = Field(default=5, ...)` — setting env var to an invalid value (e.g. empty string) raises - `.env.example`: uncommented `PROVIDER_MAX_CONCURRENCY=5` - README: updated config table default from `—` to `5` - Tests: removed `test_concurrency_slot_noop_when_not_configured`; updated mock settings to use `5` instead of `None` https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg	2026-02-19 14:39:42 +00:00
Claude	99f99fce90	Remove max_cli_sessions — CLI session pool is now unbounded The max_sessions cap in CLISessionManager was the only thing enforcing a limit on concurrent CLI processes. Now that provider concurrency is controlled at the streaming layer (PROVIDER_MAX_CONCURRENCY semaphore), the CLI session pool cap is redundant and removed entirely. Changes: - cli/manager.py: remove max_sessions param, cap check, _cleanup_idle_sessions_unlocked, max_sessions from get_stats() - config/settings.py: remove max_cli_sessions field - api/app.py: remove max_sessions=settings.max_cli_sessions from CLISessionManager constructor - messaging/handler.py: remove "Waiting for slot" status check; stats display no longer shows Max CLI - .env.example: remove MAX_CLI_SESSIONS line - tests/cli/test_cli.py: remove max_sessions args and assertion from manager tests - tests/cli/test_cli_manager_edge_cases.py: remove two tests for cap/cleanup behavior - tests/api/test_app_lifespan_and_errors.py: remove max_cli_sessions from all SimpleNamespace settings - tests/config/test_config.py: remove max_cli_sessions isinstance assertion - tests/conftest.py: remove max_sessions from mock stats - tests/messaging/test_handler.py: merge slot/capacity tests into single new-conversation test; remove Max CLI assertion from stats test - tests/messaging/test_handler_markdown_and_status_edges.py: remove "Waiting for slot" assertion; drop max_sessions from all stats mocks https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg	2026-02-19 14:31:47 +00:00
Claude	afaf50a972	Add queue-level concurrency limit to provider streaming Adds max_concurrency cap to GlobalRateLimiter using asyncio.Semaphore. A request now waits for a concurrency slot before the sliding window rate limit check, so at most N streams are open to the provider simultaneously, even when the rate window would allow more. Changes: - providers/rate_limit.py: max_concurrency param, _concurrency_sem, concurrency_slot() asynccontextmanager - providers/openai_compat.py: pass max_concurrency to limiter; wrap execute_with_retry + stream iteration in concurrency_slot() - providers/base.py: max_concurrency field on ProviderConfig - config/settings.py: provider_max_concurrency setting (PROVIDER_MAX_CONCURRENCY env var, default None = unlimited) - api/dependencies.py: pass provider_max_concurrency into all three provider ProviderConfig instantiations - .env.example: document PROVIDER_MAX_CONCURRENCY (commented out) - tests/providers/test_provider_rate_limit.py: 5 new tests covering concurrency limit enforcement, slot release on exception, noop when unconfigured - tests/api/test_dependencies.py: add provider_max_concurrency=None to mock settings helper https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg	2026-02-19 14:23:21 +00:00
Alishahryar1	c35ecba9d8	Update Whisper model configuration to use 'base' as the default model ID	2026-02-18 19:36:58 -08:00
Alishahryar1	75e066f17f	Refactor voice note transcription to use Hugging Face transformers Whisper pipeline - Updated transcription logic to utilize Hugging Face's Whisper models instead of faster-whisper. - Introduced new model mapping and pipeline loading functions. - Adjusted tests to reflect changes in the transcription process. - Updated documentation in README, .env.example, and settings to align with the new implementation. - Ensured compatibility with CUDA 13 and removed unnecessary dependencies.	2026-02-18 06:18:28 -08:00
Cursor Agent	db646ef2db	Remove auto support for whisper_device; only cpu and cuda allowed - Validate whisper_device in Settings and _get_local_model - Reject 'auto' with clear ValueError/ValidationError - Update docs in config, .env.example, README - Add tests for invalid device and valid cpu/cuda Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>	2026-02-18 13:38:59 +00:00
Cursor Agent	2135e6da05	Add large-v3 and large-v3-turbo whisper model options Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>	2026-02-18 13:37:58 +00:00
Cursor Agent	eabe8db2e8	Remove CPU fallbacks for voice note transcribe; auto/cuda/cpu fail fast - Remove _cuda_failed_models and inference-time CPU fallback - auto: try CUDA only, fail fast on RuntimeError (no CPU fallback) - cpu/cuda: use device directly, fail fast on errors - Update docs in config, .env.example, README Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>	2026-02-18 13:37:23 +00:00
Alishahryar1	b05d0d2703	new linter rules and fixes	2026-02-18 04:13:41 -08:00
Alishahryar1	d668f6e476	Add voice note transcription feature - Introduced voice note handling for Discord and Telegram platforms. - Added configuration options for voice note functionality in settings.py and .env.example. - Updated README to include voice note instructions and configuration details. - Implemented audio attachment processing and transcription using faster-whisper. - Enabled voice note support through message handlers in both platforms.	2026-02-16 20:14:59 -08:00
Alishahryar1	01852e1638	Add configurable HTTP timeouts for provider API requests Updated the README to include new timeout settings. Implemented these timeouts in the provider classes and added corresponding tests to ensure they are correctly passed to the client. Also included environment variable support for the new settings.	2026-02-16 01:40:15 -08:00
Alishahryar1	6511542bfe	Implement Discord bot support and update README for messaging platform changes	2026-02-16 00:08:09 -08:00
Alishahryar1	b83be84313	Add LM Studio provider support - Introduced `LMStudioProvider` to the provider system. - Added a new fixture `lmstudio_provider` in `conftest.py` for testing. - Updated `get_provider` function to handle `lmstudio` as a valid provider type. - Enhanced README and `.env.example` to include LM Studio configuration details. - Updated settings to accommodate LM Studio's base URL and provider type. - Added tests to verify the functionality of the LM Studio provider.	2026-02-15 19:41:03 -08:00
Alishahryar1	054f9869b7	Refactor rate limiting configuration to use unified provider settings - Replaced NVIDIA NIM and OpenRouter specific rate limit settings with a generic provider rate limit in settings, tests, and environment files. - Updated README.md to reflect the new provider rate limit configuration. - Adjusted tests to validate the new provider rate limit attributes.	2026-02-15 11:03:59 -08:00
Alishahryar1	e5a096049d	feat: add OpenRouter support and configuration options - Introduced OpenRouter as a new provider option in settings and environment configuration. - Updated README.md to include instructions for using OpenRouter. - Enhanced the message converter to support reasoning content for OpenRouter. - Added tests for OpenRouter provider functionality and message conversion. - Updated dependencies to include OpenRouterProvider.	2026-02-15 10:50:53 -08:00
Alishahryar1	6102583026	Major Refactor Part 2 with kimi-k2.5 in claude code	2026-02-05 16:09:16 -08:00
Alishahryar1	fcbe204f44	Major refactor done with kimi-k2.5 in claude code	2026-02-05 10:51:33 -08:00
Alishahryar1	81d41fb6d5	Added mocking for suggestion mode and file path extraction	2026-02-03 19:24:18 -08:00
Alishahryar1	ec86f5bda6	Update code to always log everything to server.log removing server_debug.jsonl	2026-02-03 19:13:04 -08:00
Alishahryar1	0e06307366	added 2 new optimization for mocking title and quota	2026-01-30 03:20:34 -08:00
Alishahryar1	d958544c3d	Migrated from telethon to telegram bot api	2026-01-30 00:38:39 -08:00
Alishahryar1	6eca7377ce	Revert "Added rate limiter queue for telegram" This reverts commit `6a4409d625`.	2026-01-29 23:16:01 -08:00
Alishahryar1	82ab6d64e1	Revert "added messaging app rate limiter params" This reverts commit `8ee1da7140`.	2026-01-29 23:14:59 -08:00
Alishahryar1	8ee1da7140	added messaging app rate limiter params	2026-01-29 22:38:26 -08:00
Alishahryar1	6a4409d625	Added rate limiter queue for telegram	2026-01-29 22:28:18 -08:00
Alishahryar1	245faff9fd	removed legacy code	2026-01-29 22:06:47 -08:00
Alishahryar1	f1efa22a82	fixed issues after refactor	2026-01-29 14:50:05 -08:00
Alishahryar1	8678a62915	Major refactor done by itself	2026-01-29 14:40:08 -08:00

41 commits