Commit graph

158 commits

Author SHA1 Message Date
Alishahryar1
55131019e1 Sync config defaults and proxy docs
Some checks are pending
CI / checks (push) Waiting to run
2026-04-22 17:34:00 -07:00
Pavel Yurchenko
e719e4aed2
feat: deepseek api support (#118)
## Summary

* add native DeepSeek provider support via the shared OpenAI-compatible
provider base
* allow `deepseek/...` model prefixes in config validation
* add `DEEPSEEK_API_KEY` and `DEEPSEEK_BASE_URL` settings
* add DeepSeek entries to `.env.example` and `config/env.example`
* implement `DeepSeekProvider` and register it in provider dependencies
* add a DeepSeek request builder with DeepSeek-specific thinking payload
handling
* preserve Anthropic thinking blocks as `reasoning_content` for
DeepSeek-compatible continuation flows
* update `claude-pick` to discover DeepSeek models from the DeepSeek API
* document DeepSeek usage in `README.md`
* add tests for config validation, provider dependency wiring, request
building, and streaming behavior

## Motivation

DeepSeek exposes an OpenAI-compatible API and can be used directly
without routing through OpenRouter. This lets users spend their existing
DeepSeek balance through the proxy while keeping the same Claude Code
workflow and per-model provider mapping.

## Example

```dotenv
DEEPSEEK_API_KEY="sk-..."
DEEPSEEK_BASE_URL="https://api.deepseek.com"

MODEL_OPUS="deepseek/deepseek-reasoner"
MODEL_SONNET="deepseek/deepseek-chat"
MODEL_HAIKU="deepseek/deepseek-chat"
MODEL="deepseek/deepseek-chat"
```

---------

Co-authored-by: Alishahryar1 <alishahryar2@gmail.com>
2026-04-22 17:06:01 -07:00
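The `deepseek/...` prefix validation mentioned in the PR summary can be sketched roughly as below. This is an illustrative guess, not the proxy's actual code: the function name and the set of known prefixes are assumptions.

```python
# Hypothetical sketch of provider-prefix validation for model specs
# like "deepseek/deepseek-chat". Names are illustrative only.
KNOWN_PREFIXES = {"deepseek", "openrouter", "nim", "lmstudio"}

def validate_model(model: str) -> str:
    """Accept `provider/model` strings whose prefix names a known provider."""
    prefix, sep, name = model.partition("/")
    if not sep or prefix not in KNOWN_PREFIXES or not name:
        raise ValueError(f"unsupported model spec: {model!r}")
    return model
```

Under this scheme, `MODEL_OPUS="deepseek/deepseek-reasoner"` from the example above would pass validation, while a bare `deepseek-chat` without a provider prefix would be rejected.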
Alishahryar1
835d0454e8 Fixes for issues 113 and 116 2026-04-18 16:32:31 -07:00
Muhammad Hamid Raza
7468f53ab7
Fix README installation section for uv (#107)
2026-03-30 11:08:07 -07:00
Alishahryar1
b75f47b62d Gate NIM thinking params behind NIM_ENABLE_THINKING env var
Mistral models reject chat_template_kwargs, causing 400 errors. Make
thinking params (chat_template_kwargs, reasoning_budget) opt-in via
NIM_ENABLE_THINKING env var (default false) so only models that need it
(kimi, nemotron) receive them.
2026-03-27 21:44:36 -07:00
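The opt-in gating described in this commit can be sketched as follows. The payload keys `chat_template_kwargs` and `reasoning_budget` come from the commit message; the function shape and defaults are assumptions.

```python
import os

# Illustrative sketch of gating NIM thinking params behind an env var.
# Mistral models reject chat_template_kwargs with a 400, so the params
# are only attached when NIM_ENABLE_THINKING is explicitly enabled.
def build_nim_payload(model: str, messages: list, budget: int = 1024) -> dict:
    payload = {"model": model, "messages": messages}
    if os.environ.get("NIM_ENABLE_THINKING", "false").lower() == "true":
        payload["chat_template_kwargs"] = {"thinking": True}
        payload["reasoning_budget"] = budget
    return payload
```

With the default of `false`, models like Mistral never see the problematic keys; users of kimi or nemotron opt in explicitly.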
th-ch
f703a0e403
Implement optional authentication (Anthropic style) (#80)
2026-03-27 11:11:47 -07:00
Avishek Behera
587931d279
(doc): Update README with PowerShell and proxy server instructions (#101) 2026-03-27 11:08:43 -07:00
Ali Khokhar
747262a7ce
Update README.md
2026-03-15 12:54:30 -07:00
Xi Gou
4ead059760
update vscode config item name (#81)
Identifier: anthropic.claude-code
Version: 2.1.72
Last Updated: 2 hours ago
Size: 281.35MB
2026-03-11 06:32:20 -07:00
Ali Khokhar
2324be4989
Update README.md
2026-03-08 14:35:37 -07:00
Alishahryar1
5a36a32836 feat: add llama.cpp provider for local anthropic messages API 2026-03-08 10:38:25 -07:00
Ali Khokhar
884ddd77af
Add tests for fcc-init entrypoint (cli/entrypoints.py) (#77)
2026-03-07 08:27:11 -08:00
Alishahryar1
fc58b43c5e Update README
2026-03-06 22:19:54 -08:00
Ali Khokhar
c5341ecbbe
Add option for an installable package (#75) 2026-03-06 22:06:33 -08:00
Ali Khokhar
a599319dd6
Update README.md
2026-03-05 00:19:20 -08:00
Ali Khokhar
160370268a
Update README with note on new features
Added a note about new features in the README.
2026-03-01 22:30:04 -08:00
Ali Khokhar
63d7f2afe8
Update README 2026-03-01 22:25:06 -08:00
Alishahryar1
ff14baa2d5 Updated README 2026-03-01 22:08:51 -08:00
Alishahryar1
aaa62a2bd7 Relaxed python version requirements 2026-03-01 22:00:34 -08:00
Alishahryar1
c1d1368940 Updated README 2026-03-01 21:54:59 -08:00
Alishahryar1
a7d88d5cbd Updated README with per-model mapping, fixed test .env isolation 2026-03-01 21:52:35 -08:00
Alishahryar1
598e21387e Updated README 2026-03-01 21:37:34 -08:00
Ali Khokhar
0b324e0421
Per claude model mapping (#66) 2026-03-01 21:32:23 -08:00
Alishahryar1
763c8b62b7 Updated README 2026-03-01 12:47:20 -08:00
Alishahryar1
efb8605258 Updated README 2026-03-01 12:44:40 -08:00
Ali Khokhar
25b329a3fc
Update README
Removed duplicate VSCode Extension Setup instructions from README.md.
2026-03-01 05:30:30 -08:00
Mauro Druwel
de70700dde
feat: Use NVIDIA NIM ASR for audio transcription (#53)
## Summary
Added NVIDIA NIM as a second transcription option (alongside local
Whisper). This lets you transcribe voice notes using NVIDIA's cloud API
instead of running Whisper locally.

## What changed

- **Transcription**: Now supports two backends:

  - Local Whisper: Free, runs on your GPU/CPU (existing)
  - NVIDIA NIM: Cloud API via Riva gRPC (new)

- **Supported models**: 8 NVIDIA NIM models added (Parakeet variants for
different languages, Whisper Large V3)

---------

Co-authored-by: Alishahryar1 <alishahryar2@gmail.com>
2026-02-28 08:48:59 -08:00
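Dispatching between the two transcription backends described in this PR might look like the sketch below. The function and backend names are hypothetical; the real selection logic presumably lives in the transcription module.

```python
# Hedged sketch of choosing between the two transcription backends.
# Names are illustrative, not the project's actual API.
def pick_transcriber(backend: str):
    def local_whisper(audio: bytes) -> str:
        ...  # run Whisper locally on GPU/CPU (free, existing path)

    def nim_asr(audio: bytes) -> str:
        ...  # call NVIDIA NIM cloud ASR via Riva gRPC (new path)

    backends = {"whisper": local_whisper, "nim": nim_asr}
    try:
        return backends[backend]
    except KeyError:
        raise ValueError(f"unknown transcription backend: {backend!r}") from None
```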
Alishahryar1
cfe43bf5be Updated README 2026-02-28 04:21:05 -08:00
Ali Khokhar
7d99b38b70
Update environment variable syntax in README 2026-02-28 04:04:56 -08:00
Ali Khokhar
f9e8226120
Clarify Docker integration acceptance in README
Updated README to clarify Docker integration status.
2026-02-27 20:00:57 -08:00
Ali Khokhar
c4d8681000
Backup/before cleanup 20260222 230402 (#58) 2026-02-27 19:50:21 -08:00
Cursor Agent
5d5055f96f docs: update README for removed PROVIDER_TYPE, model prefix format
Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-20 09:37:25 +00:00
Ali Khokhar
4c0c1f125b
Update README.md 2026-02-20 01:33:57 -08:00
Rishi Khare
8ffe587a8f docs: rename model picker summary to Multi-Model Support (Model Picker)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 10:40:09 -05:00
Rishi Khare
a5496346ca docs: clarify claude-pick avoids needing to edit MODEL in .env
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 10:16:22 -05:00
Rishi Khare
39ad80f6e6 docs: mention source ~/.bashrc as alternative to ~/.zshrc in model picker
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 10:00:43 -05:00
Rishi Khare
5c6d8e150e docs: move model picker to summary within getting started and add demo video
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 09:58:55 -05:00
Claude
45b7e4cafd
Make PROVIDER_MAX_CONCURRENCY required with default of 5
- `max_concurrency` is now always an `int` (default 5) — `None`/unlimited
  is no longer a valid state; omitting the env var uses the default
- `GlobalRateLimiter`: semaphore is always created; `concurrency_slot()`
  no longer has None guards; log message always includes concurrency
- `ProviderConfig.max_concurrency`: `int = 5` (was `int | None = None`)
- `Settings.provider_max_concurrency`: `int = Field(default=5, ...)` —
  setting env var to an invalid value (e.g. empty string) raises
- `.env.example`: uncommented `PROVIDER_MAX_CONCURRENCY=5`
- README: updated config table default from `—` to `5`
- Tests: removed `test_concurrency_slot_noop_when_not_configured`;
  updated mock settings to use `5` instead of `None`

https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
2026-02-19 14:39:42 +00:00
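The always-created semaphore described in this commit can be sketched with `asyncio`. Only the default of 5 and the `concurrency_slot()` name come from the commit message; the rest of the class shape is an assumption.

```python
import asyncio
from contextlib import asynccontextmanager

# Minimal sketch of a rate limiter whose semaphore is always created:
# max_concurrency is always an int (default 5), never None/unlimited.
class GlobalRateLimiter:
    def __init__(self, max_concurrency: int = 5):
        self._sem = asyncio.Semaphore(max_concurrency)

    @asynccontextmanager
    async def concurrency_slot(self):
        # No None guards needed: the semaphore exists unconditionally.
        async with self._sem:
            yield
```

Because the semaphore exists unconditionally, callers never branch on whether concurrency limiting is configured.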
Claude
41fd316c76
Update README for provider concurrency and removal of MAX_CLI_SESSIONS
- Config table: add PROVIDER_MAX_CONCURRENCY, remove MAX_CLI_SESSIONS
- Discord Bot capabilities: replace "Up to 10 concurrent" with "Unlimited concurrent... (controlled by PROVIDER_MAX_CONCURRENCY)"
- Features table: note optional concurrency cap in Smart Rate Limiting row

https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
2026-02-19 14:34:15 +00:00
Alishahryar1
c35ecba9d8 Update Whisper model configuration to use 'base' as the default model ID 2026-02-18 19:36:58 -08:00
Ali Khokhar
889556c2f9
Merge pull request #42 from rishiskhare/model-picker 2026-02-18 18:38:41 -08:00
Alishahryar1
06fff52deb Updated readme 2026-02-18 17:26:46 -08:00
Rishi Khare
406de89ae3 docs: clarify absolute path required for claude-pick alias
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-18 18:09:26 -05:00
Rishi Khare
142dd587c8 refactor: remove MODEL_PICKER flag — claude-pick always picks
The flag was unnecessary: running claude-pick implies wanting the picker.
Remove MODEL_PICKER from claude-pick and README, restore .env.example
to upstream.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-18 18:06:15 -05:00
Rishi Khare
c66ed28b45 feat: add claude-pick interactive model picker
- Add `claude-pick` bash script: reads PROVIDER_TYPE from .env, fetches
  available models (NVIDIA NIM, OpenRouter, LM Studio), and launches Claude
  with the selected model via fzf. Falls back to direct launch when
  MODEL_PICKER=false.
- Add MODEL_PICKER=false flag to .env.example.
- Document setup in README (fzf install, alias, fixed-model alias pattern).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-18 18:06:15 -05:00
Alishahryar1
75e066f17f Refactor voice note transcription to use Hugging Face transformers Whisper pipeline
- Updated transcription logic to utilize Hugging Face's Whisper models instead of faster-whisper.
- Introduced new model mapping and pipeline loading functions.
- Adjusted tests to reflect changes in the transcription process.
- Updated documentation in README, .env.example, and settings to align with the new implementation.
- Ensured compatibility with CUDA 13 and removed unnecessary dependencies.
2026-02-18 06:18:28 -08:00
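The "model mapping" this commit mentions could be as simple as a table from short model IDs to Hugging Face checkpoints. The mapping below is a hypothetical sketch; the checkpoint names are real Hugging Face model IDs, but the function and dict are assumptions.

```python
# Hypothetical mapping from short Whisper model IDs to Hugging Face
# checkpoints, in the spirit of the model mapping the commit describes.
WHISPER_MODEL_MAP = {
    "base": "openai/whisper-base",
    "small": "openai/whisper-small",
    "large-v3": "openai/whisper-large-v3",
    "large-v3-turbo": "openai/whisper-large-v3-turbo",
}

def resolve_whisper_model(model_id: str) -> str:
    try:
        return WHISPER_MODEL_MAP[model_id]
    except KeyError:
        raise ValueError(f"unknown Whisper model: {model_id!r}") from None
```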
Cursor Agent
db646ef2db Remove auto support for whisper_device; only cpu and cuda allowed
- Validate whisper_device in Settings and _get_local_model
- Reject 'auto' with clear ValueError/ValidationError
- Update docs in config, .env.example, README
- Add tests for invalid device and valid cpu/cuda

Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-18 13:38:59 +00:00
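The device validation described here can be sketched as below: only `cpu` and `cuda` pass, and `auto` is rejected with a clear error. The function name is an assumption; in the real code this check reportedly lives in both `Settings` and `_get_local_model`.

```python
# Sketch of the whisper_device check: accept only "cpu" and "cuda",
# reject "auto" (and anything else) with a clear ValueError.
VALID_WHISPER_DEVICES = {"cpu", "cuda"}

def validate_whisper_device(device: str) -> str:
    if device not in VALID_WHISPER_DEVICES:
        raise ValueError(
            f"whisper_device must be 'cpu' or 'cuda', got {device!r}"
        )
    return device
```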
Cursor Agent
2135e6da05 Add large-v3 and large-v3-turbo whisper model options
Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-18 13:37:58 +00:00
Cursor Agent
eabe8db2e8 Remove CPU fallbacks for voice note transcribe; auto/cuda/cpu fail fast
- Remove _cuda_failed_models and inference-time CPU fallback
- auto: try CUDA only, fail fast on RuntimeError (no CPU fallback)
- cpu/cuda: use device directly, fail fast on errors
- Update docs in config, .env.example, README

Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-18 13:37:23 +00:00
Ali Khokhar
ed5162da17
Update README 2026-02-16 20:31:53 -08:00