Commit graph

16 commits

Author SHA1 Message Date
Alishahryar1
48b085950a Warn on inherited auth token
Some checks are pending
CI / checks (push) Waiting to run
2026-04-24 00:42:33 -07:00
Alishahryar1
6f3d762a4f Revert "Add per-model thinking toggles"
This reverts commit 1f12a33dd7.
2026-04-24 00:26:15 -07:00
Alishahryar1
9c28af7cf1 Fix auth token dotenv precedence 2026-04-24 00:25:31 -07:00
Alishahryar1
1f12a33dd7 Add per-model thinking toggles 2026-04-24 00:14:49 -07:00
Alishahryar1
55131019e1 Sync config defaults and proxy docs
Some checks are pending
CI / checks (push) Waiting to run
2026-04-22 17:34:00 -07:00
Pavel Yurchenko
e719e4aed2
feat: deepseek api support (#118)
## Summary

* add native DeepSeek provider support via the shared OpenAI-compatible
provider base
* allow `deepseek/...` model prefixes in config validation
* add `DEEPSEEK_API_KEY` and `DEEPSEEK_BASE_URL` settings
* add DeepSeek entries to `.env.example` and `config/env.example`
* implement `DeepSeekProvider` and register it in provider dependencies
* add a DeepSeek request builder with DeepSeek-specific thinking payload
handling
* preserve Anthropic thinking blocks as `reasoning_content` for
DeepSeek-compatible continuation flows
* update `claude-pick` to discover DeepSeek models from the DeepSeek API
* document DeepSeek usage in `README.md`
* add tests for config validation, provider dependency wiring, request
building, and streaming behavior

## Motivation

DeepSeek exposes an OpenAI-compatible API and can be used directly
without routing through OpenRouter. This lets users spend their existing
DeepSeek balance through the proxy while keeping the same Claude Code
workflow and per-model provider mapping.

## Example

```dotenv
DEEPSEEK_API_KEY="sk-..."
DEEPSEEK_BASE_URL="https://api.deepseek.com"

MODEL_OPUS="deepseek/deepseek-reasoner"
MODEL_SONNET="deepseek/deepseek-chat"
MODEL_HAIKU="deepseek/deepseek-chat"
MODEL="deepseek/deepseek-chat"

---------

Co-authored-by: Alishahryar1 <alishahryar2@gmail.com>
2026-04-22 17:06:01 -07:00
Alishahryar1
835d0454e8 Fixes for issue 113 and 116 2026-04-18 16:32:31 -07:00
Alishahryar1
f9e7f65f4c Fix NVIDIA NIM reasoning params for updated API
Replace dropped params (thinking, reasoning_split, include_reasoning,
return_tokens_as_token_ids, reasoning_effort) with the new API format:
chat_template_kwargs.enable_thinking=True and reasoning_budget=max_tokens.
2026-03-26 12:25:04 -07:00
Alishahryar1
5a36a32836 feat: add llama.cpp provider for local anthropic messages API 2026-03-08 10:38:25 -07:00
Alishahryar1
a7d88d5cbd Updated README with per-model mapping, fixed test .env isolation 2026-03-01 21:52:35 -08:00
Ali Khokhar
0b324e0421
Per claude model mapping (#66) 2026-03-01 21:32:23 -08:00
Alishahryar1
a74ec74271 Major refactor done with minimax m2.5 2026-02-28 04:36:29 -08:00
Alishahryar1
2c1158f62f removed a test 2026-02-19 20:06:15 -08:00
Claude
99f99fce90
Remove max_cli_sessions — CLI session pool is now unbounded
The max_sessions cap in CLISessionManager was the only thing enforcing
a limit on concurrent CLI processes. Now that provider concurrency is
controlled at the streaming layer (PROVIDER_MAX_CONCURRENCY semaphore),
the CLI session pool cap is redundant and removed entirely.

Changes:
- cli/manager.py: remove max_sessions param, cap check, _cleanup_idle_sessions_unlocked, max_sessions from get_stats()
- config/settings.py: remove max_cli_sessions field
- api/app.py: remove max_sessions=settings.max_cli_sessions from CLISessionManager constructor
- messaging/handler.py: remove "Waiting for slot" status check; stats display no longer shows Max CLI
- .env.example: remove MAX_CLI_SESSIONS line
- tests/cli/test_cli.py: remove max_sessions args and assertion from manager tests
- tests/cli/test_cli_manager_edge_cases.py: remove two tests for cap/cleanup behavior
- tests/api/test_app_lifespan_and_errors.py: remove max_cli_sessions from all SimpleNamespace settings
- tests/config/test_config.py: remove max_cli_sessions isinstance assertion
- tests/conftest.py: remove max_sessions from mock stats
- tests/messaging/test_handler.py: merge slot/capacity tests into single new-conversation test; remove Max CLI assertion from stats test
- tests/messaging/test_handler_markdown_and_status_edges.py: remove "Waiting for slot" assertion; drop max_sessions from all stats mocks

https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
2026-02-19 14:31:47 +00:00
Cursor Agent
db646ef2db Remove auto support for whisper_device; only cpu and cuda allowed
- Validate whisper_device in Settings and _get_local_model
- Reject 'auto' with clear ValueError/ValidationError
- Update docs in config, .env.example, README
- Add tests for invalid device and valid cpu/cuda

Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-18 13:38:59 +00:00
Cursor Agent
4b4f87515d Phase 7: Directory restructuring (messaging/ and tests/)
- Create messaging/platforms/ (base, discord, telegram, factory)
- Create messaging/rendering/ (discord_markdown, telegram_markdown)
- Create messaging/trees/ (data, repository, processor, queue_manager)
- Organize tests/ into api/, providers/, messaging/, cli/, config/
- Add backward-compatible re-exports at old locations
- Update handler.py and test_messaging_factory.py imports
- Fix Telegram type hints for TELEGRAM_AVAILABLE=False case
- Fix Python 3 except syntax in discord_markdown

Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-17 02:25:42 +00:00
Renamed from tests/test_config.py (Browse further)