Commit graph

12 commits

Alishahryar1
6f3d762a4f Revert "Add per-model thinking toggles"
This reverts commit 1f12a33dd7.
2026-04-24 00:26:15 -07:00
Alishahryar1
1f12a33dd7 Add per-model thinking toggles 2026-04-24 00:14:49 -07:00
arssing
2fe15bd2cd
feat: add proxy support for httpx clients (#125)
Add proxy support for providers based on
[doc](https://www.python-httpx.org/advanced/proxies/):

- Add per-provider proxy support (HTTP and SOCKS5) for all 4 providers:
nvidia_nim, open_router, lmstudio, llamacpp
- Each provider gets its own env var (NVIDIA_NIM_PROXY,
OPENROUTER_PROXY, LMSTUDIO_PROXY, LLAMACPP_PROXY) for independent proxy
configuration
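
A minimal sketch of how such a per-provider proxy variable might be wired into an httpx client. The env var names come from the PR; the helper name is hypothetical and assumes httpx ≥ 0.26, where `AsyncClient` accepts a `proxy=` argument (older versions use `proxies=`, and SOCKS5 needs the `httpx[socks]` extra):

```python
import os

import httpx

# Env var names from the PR; the helper itself is only a sketch.
_PROXY_ENV = {
    "nvidia_nim": "NVIDIA_NIM_PROXY",
    "open_router": "OPENROUTER_PROXY",
    "lmstudio": "LMSTUDIO_PROXY",
    "llamacpp": "LLAMACPP_PROXY",
}


def build_async_client(provider: str) -> httpx.AsyncClient:
    # e.g. "http://proxy:3128" or "socks5://127.0.0.1:1080" (SOCKS needs httpx[socks]).
    proxy = os.getenv(_PROXY_ENV[provider])
    # Fall back to a direct connection when the variable is unset or empty.
    return httpx.AsyncClient(proxy=proxy or None)
```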

---------

Co-authored-by: Alishahryar1 <alishahryar2@gmail.com>
2026-04-22 17:06:16 -07:00
Pavel Yurchenko
e719e4aed2
feat: deepseek api support (#118)
## Summary

* add native DeepSeek provider support via the shared OpenAI-compatible
provider base
* allow `deepseek/...` model prefixes in config validation
* add `DEEPSEEK_API_KEY` and `DEEPSEEK_BASE_URL` settings
* add DeepSeek entries to `.env.example` and `config/env.example`
* implement `DeepSeekProvider` and register it in provider dependencies
* add a DeepSeek request builder with DeepSeek-specific thinking payload
handling
* preserve Anthropic thinking blocks as `reasoning_content` for
DeepSeek-compatible continuation flows (sketched below)
* update `claude-pick` to discover DeepSeek models from the DeepSeek API
* document DeepSeek usage in `README.md`
* add tests for config validation, provider dependency wiring, request
building, and streaming behavior
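
A rough illustration of the thinking-block bullet above; the helper name is hypothetical rather than the PR's actual request builder, and it only shows the shape of mapping an Anthropic `thinking` content block into DeepSeek's `reasoning_content` field:

```python
def to_deepseek_message(anthropic_msg: dict) -> dict:
    """Hypothetical helper: fold Anthropic 'thinking' blocks into 'reasoning_content'."""
    thinking_parts: list[str] = []
    text_parts: list[str] = []
    for block in anthropic_msg.get("content", []):
        if block.get("type") == "thinking":
            thinking_parts.append(block.get("thinking", ""))
        elif block.get("type") == "text":
            text_parts.append(block.get("text", ""))
    message = {"role": anthropic_msg["role"], "content": "".join(text_parts)}
    if thinking_parts:
        # Carry the thinking text in `reasoning_content`, the field DeepSeek's
        # OpenAI-compatible API uses for reasoning output, as the PR describes.
        message["reasoning_content"] = "".join(thinking_parts)
    return message
```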

## Motivation

DeepSeek exposes an OpenAI-compatible API and can be used directly
without routing through OpenRouter. This lets users spend their existing
DeepSeek balance through the proxy while keeping the same Claude Code
workflow and per-model provider mapping.

## Example

```dotenv
DEEPSEEK_API_KEY="sk-..."
DEEPSEEK_BASE_URL="https://api.deepseek.com"

MODEL_OPUS="deepseek/deepseek-reasoner"
MODEL_SONNET="deepseek/deepseek-chat"
MODEL_HAIKU="deepseek/deepseek-chat"
MODEL="deepseek/deepseek-chat"
```

---------

Co-authored-by: Alishahryar1 <alishahryar2@gmail.com>
2026-04-22 17:06:01 -07:00
Alishahryar1
87d8ce1196 feat(lmstudio): route natively to Anthropic /v1/messages endpoint
- Rewrites LMStudioProvider to inherit from BaseProvider
- Passes requests natively to /v1/messages using httpx instead of AsyncOpenAI
- Auto-translates the internal ThinkingConfig to the Anthropic schema (see the sketch below)
- Updates .env.example with model routing instructions
- Adjusts test suite for new native integration
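
A minimal sketch of such a native call, assuming an Anthropic-style Messages payload; the function name and parameters are illustrative, not the provider's actual code:

```python
import httpx


async def post_messages(base_url: str, payload: dict, budget_tokens: int | None = None) -> dict:
    """Hypothetical sketch of a native call to an Anthropic-style /v1/messages endpoint."""
    if budget_tokens is not None:
        # Translate an internal thinking setting into the Anthropic Messages schema.
        payload["thinking"] = {"type": "enabled", "budget_tokens": budget_tokens}
    async with httpx.AsyncClient(base_url=base_url) as client:
        response = await client.post("/v1/messages", json=payload)
        response.raise_for_status()
        return response.json()
```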
2026-03-08 08:17:05 -07:00
Ali Khokhar
0b324e0421
Per claude model mapping (#66) 2026-03-01 21:32:23 -08:00
Alishahryar1
d6a0e1a401 Provider inferred from model name using prefix 2026-02-19 20:53:02 -08:00
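The prefix convention matches model strings seen elsewhere in this history (e.g. `deepseek/deepseek-chat`); the function below is only an illustrative sketch of that inference, not the commit's actual code:

```python
def infer_provider(model: str) -> str | None:
    # "deepseek/deepseek-chat" -> "deepseek"; a name without a prefix yields None.
    prefix, sep, _ = model.partition("/")
    return prefix if sep else None
```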
Claude
45b7e4cafd
Make PROVIDER_MAX_CONCURRENCY required with default of 5
- `max_concurrency` is now always an `int` (default 5) — `None`/unlimited
  is no longer a valid state; omitting the env var uses the default
- `GlobalRateLimiter`: semaphore is always created; `concurrency_slot()`
  no longer has None guards; log message always includes concurrency
- `ProviderConfig.max_concurrency`: `int = 5` (was `int | None = None`)
- `Settings.provider_max_concurrency`: `int = Field(default=5, ...)` —
  setting env var to an invalid value (e.g. empty string) raises
- `.env.example`: uncommented `PROVIDER_MAX_CONCURRENCY=5`
- README: updated config table default from `—` to `5`
- Tests: removed `test_concurrency_slot_noop_when_not_configured`;
  updated mock settings to use `5` instead of `None`
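
Roughly what the settings change above implies, assuming pydantic-settings; everything beyond the field name and env var is illustrative:

```python
from pydantic import Field
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    # Populated from the PROVIDER_MAX_CONCURRENCY env var (pydantic-settings matches
    # field names case-insensitively). Omitting the variable uses the default of 5;
    # an invalid value such as an empty string raises a validation error at load time.
    provider_max_concurrency: int = Field(default=5)
```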

https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
2026-02-19 14:39:42 +00:00
Claude
afaf50a972
Add queue-level concurrency limit to provider streaming
Adds max_concurrency cap to GlobalRateLimiter using asyncio.Semaphore.
A request now waits for a concurrency slot before the sliding window rate
limit check, so at most N streams are open to the provider simultaneously,
even when the rate window would allow more.
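
In outline (names follow the commit's description of `providers/rate_limit.py`, but this is a sketch rather than the file's actual contents):

```python
import asyncio
from contextlib import asynccontextmanager


class GlobalRateLimiter:
    def __init__(self, max_concurrency: int | None = None):
        # One shared semaphore per provider; None keeps the old unlimited behaviour.
        self._concurrency_sem = (
            asyncio.Semaphore(max_concurrency) if max_concurrency else None
        )

    @asynccontextmanager
    async def concurrency_slot(self):
        # Acquired before the sliding-window check and held for the whole stream,
        # so at most N streams to the provider are open at the same time.
        if self._concurrency_sem is None:
            yield
            return
        async with self._concurrency_sem:
            yield
```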

Changes:
- providers/rate_limit.py: max_concurrency param, _concurrency_sem, concurrency_slot() asynccontextmanager
- providers/openai_compat.py: pass max_concurrency to limiter; wrap execute_with_retry + stream iteration in concurrency_slot()
- providers/base.py: max_concurrency field on ProviderConfig
- config/settings.py: provider_max_concurrency setting (PROVIDER_MAX_CONCURRENCY env var, default None = unlimited)
- api/dependencies.py: pass provider_max_concurrency into all three provider ProviderConfig instantiations
- .env.example: document PROVIDER_MAX_CONCURRENCY (commented out)
- tests/providers/test_provider_rate_limit.py: 5 new tests covering concurrency limit enforcement, slot release on exception, noop when unconfigured
- tests/api/test_dependencies.py: add provider_max_concurrency=None to mock settings helper

https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
2026-02-19 14:23:21 +00:00
Alishahryar1
b05d0d2703 new linter rules and fixes 2026-02-18 04:13:41 -08:00
Cursor Agent
e9beb28897 fix: validate API keys at provider init to prevent 403 'authorization missing'
When NVIDIA_NIM_API_KEY or OPENROUTER_API_KEY is empty or not set,
the proxy forwarded requests without a valid Authorization header,
causing providers to return 403 with 'Header of type authorization
was missing'.

Now fail fast with HTTP 503 and a clear message telling users to add
the key to .env, with links to obtain keys.
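
A minimal sketch of such a fail-fast guard, assuming a FastAPI-style `HTTPException`; the function name and message text are illustrative:

```python
from fastapi import HTTPException


def require_api_key(api_key: str | None, env_var: str, docs_url: str) -> str:
    # Hypothetical guard run at provider init: fail fast with 503 instead of
    # forwarding a request the provider will reject with a 403.
    if not api_key:
        raise HTTPException(
            status_code=503,
            detail=f"{env_var} is not set. Add it to your .env file (get a key at {docs_url}).",
        )
    return api_key
```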

Fixes #29

Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-17 07:33:56 +00:00
Cursor Agent
4b4f87515d Phase 7: Directory restructuring (messaging/ and tests/)
- Create messaging/platforms/ (base, discord, telegram, factory)
- Create messaging/rendering/ (discord_markdown, telegram_markdown)
- Create messaging/trees/ (data, repository, processor, queue_manager)
- Organize tests/ into api/, providers/, messaging/, cli/, config/
- Add backward-compatible re-exports at old locations (sketched below)
- Update handler.py and test_messaging_factory.py imports
- Fix Telegram type hints for TELEGRAM_AVAILABLE=False case
- Fix Python 3 except syntax in discord_markdown
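
One common shape for the backward-compatible re-exports mentioned above is a stub module left at the old import path; the old path shown here is hypothetical, and only the new `messaging/rendering/` location comes from the commit:

```python
# messaging/discord_markdown.py — hypothetical shim left at the old import path;
# it re-exports the relocated module's public names so existing imports keep working.
from messaging.rendering.discord_markdown import *  # noqa: F401,F403
```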

Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-17 02:25:42 +00:00
Renamed from tests/test_dependencies.py