Commit graph

424 commits

Author SHA1 Message Date
Alishahryar1
0c8d59e33e Removed deprecated modules and updated imports 2026-02-19 20:38:11 -08:00
Alishahryar1
2b0495dd08 moved text.py to common utils for providers 2026-02-19 20:32:45 -08:00
Alishahryar1
2ad64cc97a quoted string vars in env example 2026-02-19 20:27:28 -08:00
Alishahryar1
d21ed84171 updated uv version 2026-02-19 20:23:37 -08:00
Alishahryar1
2c1158f62f removed a test 2026-02-19 20:06:15 -08:00
Alishahryar1
aec4510a0a Upgraded python version 2026-02-19 20:02:44 -08:00
Alishahryar1
39cc39c341 standardized python version 2026-02-19 20:01:01 -08:00
Ali Khokhar
81a73f3349
Merge pull request #46 from rishiskhare/main 2026-02-19 10:17:26 -08:00
Rishi Khare
8ffe587a8f docs: rename model picker summary to Multi-Model Support (Model Picker)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 10:40:09 -05:00
Rishi Khare
a5496346ca docs: clarify claude-pick avoids needing to edit MODEL in .env
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 10:16:22 -05:00
Rishi Khare
39ad80f6e6 docs: mention source ~/.bashrc as alternative to ~/.zshrc in model picker
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 10:00:43 -05:00
Rishi Khare
5c6d8e150e docs: move model picker to summary within getting started and add demo video
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 09:58:55 -05:00
Ali Khokhar
00e6419881
Merge pull request #45 from Alishahryar1/claude/queue-concurrency-limits-EFnHT 2026-02-19 06:41:11 -08:00
Claude
45b7e4cafd
Make PROVIDER_MAX_CONCURRENCY required with default of 5
- `max_concurrency` is now always an `int` (default 5) — `None`/unlimited
  is no longer a valid state; omitting the env var uses the default
- `GlobalRateLimiter`: semaphore is always created; `concurrency_slot()`
  no longer has None guards; log message always includes concurrency
- `ProviderConfig.max_concurrency`: `int = 5` (was `int | None = None`)
- `Settings.provider_max_concurrency`: `int = Field(default=5, ...)` —
  setting env var to an invalid value (e.g. empty string) raises
- `.env.example`: uncommented `PROVIDER_MAX_CONCURRENCY=5`
- README: updated config table default from `—` to `5`
- Tests: removed `test_concurrency_slot_noop_when_not_configured`;
  updated mock settings to use `5` instead of `None`

https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
2026-02-19 14:39:42 +00:00
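The behaviour this commit describes — omitting the env var falls back to 5, while an invalid value such as an empty string raises rather than meaning "unlimited" — can be sketched with plain stdlib parsing (the function name and env-dict signature are illustrative; the real code uses a pydantic `Field`):

```python
def provider_max_concurrency(env: dict[str, str]) -> int:
    """Resolve PROVIDER_MAX_CONCURRENCY per the commit above:
    an omitted variable uses the default of 5; an invalid value
    (e.g. an empty string) raises instead of meaning unlimited."""
    raw = env.get("PROVIDER_MAX_CONCURRENCY")
    if raw is None:
        return 5  # default when the env var is omitted
    return int(raw)  # raises ValueError on "" or non-numeric input
```

This makes the removed `None`/unlimited state unrepresentable, which is why the `concurrency_slot()` None guards and the `noop_when_not_configured` test could be deleted.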
Claude
41fd316c76
Update README for provider concurrency and removal of MAX_CLI_SESSIONS
- Config table: add PROVIDER_MAX_CONCURRENCY, remove MAX_CLI_SESSIONS
- Discord Bot capabilities: replace "Up to 10 concurrent" with "Unlimited concurrent... (controlled by PROVIDER_MAX_CONCURRENCY)"
- Features table: note optional concurrency cap in Smart Rate Limiting row

https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
2026-02-19 14:34:15 +00:00
Claude
99f99fce90
Remove max_cli_sessions — CLI session pool is now unbounded
The max_sessions cap in CLISessionManager was the only thing enforcing
a limit on concurrent CLI processes. Now that provider concurrency is
controlled at the streaming layer (PROVIDER_MAX_CONCURRENCY semaphore),
the CLI session pool cap is redundant and removed entirely.

Changes:
- cli/manager.py: remove max_sessions param, cap check, _cleanup_idle_sessions_unlocked, max_sessions from get_stats()
- config/settings.py: remove max_cli_sessions field
- api/app.py: remove max_sessions=settings.max_cli_sessions from CLISessionManager constructor
- messaging/handler.py: remove "Waiting for slot" status check; stats display no longer shows Max CLI
- .env.example: remove MAX_CLI_SESSIONS line
- tests/cli/test_cli.py: remove max_sessions args and assertion from manager tests
- tests/cli/test_cli_manager_edge_cases.py: remove two tests for cap/cleanup behavior
- tests/api/test_app_lifespan_and_errors.py: remove max_cli_sessions from all SimpleNamespace settings
- tests/config/test_config.py: remove max_cli_sessions isinstance assertion
- tests/conftest.py: remove max_sessions from mock stats
- tests/messaging/test_handler.py: merge slot/capacity tests into single new-conversation test; remove Max CLI assertion from stats test
- tests/messaging/test_handler_markdown_and_status_edges.py: remove "Waiting for slot" assertion; drop max_sessions from all stats mocks

https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
2026-02-19 14:31:47 +00:00
Claude
afaf50a972
Add queue-level concurrency limit to provider streaming
Adds max_concurrency cap to GlobalRateLimiter using asyncio.Semaphore.
A request now waits for a concurrency slot before the sliding window rate
limit check, so at most N streams are open to the provider simultaneously,
even when the rate window would allow more.

Changes:
- providers/rate_limit.py: max_concurrency param, _concurrency_sem, concurrency_slot() asynccontextmanager
- providers/openai_compat.py: pass max_concurrency to limiter; wrap execute_with_retry + stream iteration in concurrency_slot()
- providers/base.py: max_concurrency field on ProviderConfig
- config/settings.py: provider_max_concurrency setting (PROVIDER_MAX_CONCURRENCY env var, default None = unlimited)
- api/dependencies.py: pass provider_max_concurrency into all three provider ProviderConfig instantiations
- .env.example: document PROVIDER_MAX_CONCURRENCY (commented out)
- tests/providers/test_provider_rate_limit.py: 5 new tests covering concurrency limit enforcement, slot release on exception, noop when unconfigured
- tests/api/test_dependencies.py: add provider_max_concurrency=None to mock settings helper

https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
2026-02-19 14:23:21 +00:00
Alishahryar1
4aff0b910f provider type quoted 2026-02-18 19:54:30 -08:00
Alishahryar1
cf1284b784 default voice note enabled set to false 2026-02-18 19:54:13 -08:00
Alishahryar1
416f259c8b Reordered env example 2026-02-18 19:53:30 -08:00
Alishahryar1
c35ecba9d8 Update Whisper model configuration to use 'base' as the default model ID 2026-02-18 19:36:58 -08:00
Alishahryar1
9a18e4f1d8 Removed plan.md 2026-02-18 18:51:40 -08:00
Ali Khokhar
4984fffffa
Merge pull request #43 from suryawanshishantanu6/feature/fix-input-token 2026-02-18 18:43:14 -08:00
Shantanu Suryawanshi
3d81f44803
Merge branch 'main' into feature/fix-input-token 2026-02-18 21:41:26 -05:00
Ali Khokhar
889556c2f9
Merge pull request #42 from rishiskhare/model-picker 2026-02-18 18:38:41 -08:00
Shantanu Suryawanshi
24a5e4d968 Fix SSE stream 2026-02-18 21:31:28 -05:00
Alishahryar1
e7ac85264f Further optimized to reduce LLM calls and increase throughput 2026-02-18 17:54:41 -08:00
Alishahryar1
593fb55954 Fixed large replies being truncated entirely, leaving no response text 2026-02-18 17:39:38 -08:00
Alishahryar1
06fff52deb Updated readme 2026-02-18 17:26:46 -08:00
Alishahryar1
16fa9d90cd Add message_thread_id support across messaging components
- Introduced message_thread_id to the IncomingMessage model for handling forum topic IDs in Telegram.
- Updated messaging platforms (Discord and Telegram) to accept and process message_thread_id in send_message methods.
- Modified message handlers to utilize message_thread_id when sending messages.
- Enhanced test cases to validate the integration of message_thread_id in message handling.

This change improves support for forum supergroups in Telegram and enhances message management across platforms.
2026-02-18 16:10:57 -08:00
Rishi Khare
406de89ae3 docs: clarify absolute path required for claude-pick alias
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-18 18:09:26 -05:00
Rishi Khare
142dd587c8 refactor: remove MODEL_PICKER flag — claude-pick always picks
The flag was unnecessary: running claude-pick implies wanting the picker.
Remove MODEL_PICKER from claude-pick and README, restore .env.example
to upstream.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-18 18:06:15 -05:00
Rishi Khare
c66ed28b45 feat: add claude-pick interactive model picker
- Add `claude-pick` bash script: reads PROVIDER_TYPE from .env, fetches
  available models (NVIDIA NIM, OpenRouter, LM Studio), and launches Claude
  with the selected model via fzf. Falls back to direct launch when
  MODEL_PICKER=false.
- Add MODEL_PICKER=false flag to .env.example.
- Document setup in README (fzf install, alias, fixed-model alias pattern).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-18 18:06:15 -05:00
Alishahryar1
8807f58267 decreased default message rate limit 2026-02-18 13:35:09 -08:00
Alishahryar1
2220880671 Add voice note cancellation feature during transcription
- Implemented functionality to cancel pending voice transcriptions when a user replies with the /clear command.
- Updated the Telegram and Discord platform classes to manage pending voice messages, including registration and cancellation logic.
- Enhanced the message handler to delete associated messages and notify users when a voice note is cancelled.
- Added tests to ensure the cancellation feature works as expected during transcription.
2026-02-18 06:36:42 -08:00
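The registration/cancellation bookkeeping described above can be sketched as below. The class name and method signatures are illustrative; only the behaviour (register a pending transcription, cancel it when the user sends /clear) comes from the commit:

```python
import asyncio


class PendingVoiceNotes:
    """Sketch: track one in-flight transcription task per chat so a
    /clear reply can cancel it mid-transcription."""

    def __init__(self) -> None:
        self._pending: dict[int, asyncio.Task] = {}  # chat_id -> task

    def register(self, chat_id: int, task: asyncio.Task) -> None:
        self._pending[chat_id] = task

    def cancel(self, chat_id: int) -> bool:
        """Called on /clear: cancel the pending transcription, if any.
        Returns True when something was actually cancelled."""
        task = self._pending.pop(chat_id, None)
        if task is not None and not task.done():
            task.cancel()
            return True
        return False
```

The handler would then delete the associated messages and notify the user only when `cancel()` returns True.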
Alishahryar1
f66834e402 Add optional voice extra imports for analysis in pyproject.toml
- Updated the pyproject.toml to include allowed unresolved imports for optional voice dependencies: torch, transformers, and librosa.
- This change facilitates smoother integration in CI environments where these packages may not be installed.
2026-02-18 06:22:49 -08:00
Alishahryar1
75e066f17f Refactor voice note transcription to use Hugging Face transformers Whisper pipeline
- Updated transcription logic to utilize Hugging Face's Whisper models instead of faster-whisper.
- Introduced new model mapping and pipeline loading functions.
- Adjusted tests to reflect changes in the transcription process.
- Updated documentation in README, .env.example, and settings to align with the new implementation.
- Ensured compatibility with CUDA 13 and removed unnecessary dependencies.
2026-02-18 06:18:28 -08:00
Ali Khokhar
34fb8e2ca7
Merge pull request #40 from Alishahryar1/cursor/voice-note-transcribe-fallbacks-adb6
Voice note transcribe fallbacks
2026-02-18 05:39:39 -08:00
Cursor Agent
db646ef2db Remove auto support for whisper_device; only cpu and cuda allowed
- Validate whisper_device in Settings and _get_local_model
- Reject 'auto' with clear ValueError/ValidationError
- Update docs in config, .env.example, README
- Add tests for invalid device and valid cpu/cuda

Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-18 13:38:59 +00:00
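The validation described above — only `cpu` and `cuda` allowed, `auto` rejected with a clear error — reduces to a small check (function name is illustrative; the real code validates both in `Settings` and `_get_local_model`):

```python
def validate_whisper_device(device: str) -> str:
    """Sketch: reject 'auto' (and anything else that is not an
    explicit device) with a clear ValueError."""
    if device not in ("cpu", "cuda"):
        raise ValueError(
            f"Invalid whisper_device {device!r}: expected 'cpu' or 'cuda'"
        )
    return device
```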
Cursor Agent
2135e6da05 Add large-v3 and large-v3-turbo whisper model options
Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-18 13:37:58 +00:00
Cursor Agent
eabe8db2e8 Remove CPU fallbacks for voice note transcribe; auto/cuda/cpu fail fast
- Remove _cuda_failed_models and inference-time CPU fallback
- auto: try CUDA only, fail fast on RuntimeError (no CPU fallback)
- cpu/cuda: use device directly, fail fast on errors
- Update docs in config, .env.example, README

Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-18 13:37:23 +00:00
Alishahryar1
f07d38655a updated agent instructions 2026-02-18 05:32:43 -08:00
Alishahryar1
a9e5de0765 lint 2026-02-18 04:19:37 -08:00
Alishahryar1
b05d0d2703 new linter rules and fixes 2026-02-18 04:13:41 -08:00
Ali Khokhar
c8b887c0d3
Merge pull request #38 from Alishahryar1/claude/add-linting-rules-tjRGC
Configure Ruff linter and formatter with project standards
2026-02-18 04:04:03 -08:00
Alishahryar1
c4a9b06f1a Merge branch 'main' of https://github.com/Alishahryar1/cc-nim 2026-02-18 04:03:30 -08:00
Alishahryar1
7b6fcf02b8 Updated agent md files for cloud agents 2026-02-18 04:03:29 -08:00
Claude
b57defe61d
Add ruff, ruff format, and ty best-practice rules to pyproject.toml
- [tool.ruff]: target-version=py314 (matches requires-python>=3.14.2), line-length=88
- [tool.ruff.lint]: select E/W/F/I/UP/B/C4/SIM/PERF/RUF; ignore E501 (formatter handles it)
- [tool.ruff.lint.isort]: known-first-party covers all local packages
- [tool.ruff.format]: double quotes, space indent, auto line-ending
- [tool.ty.environment]: python-version=3.14 for type-checking accuracy

https://claude.ai/code/session_01RVyLT88hTnfJZAVLUbWawS
2026-02-18 12:03:04 +00:00
Ali Khokhar
08a31aa3ed
Merge pull request #36 from Alishahryar1/cursor/claude-api-403-error-944f 2026-02-16 23:35:26 -08:00
Cursor Agent
e9beb28897 fix: validate API keys at provider init to prevent 403 'authorization missing'
When NVIDIA_NIM_API_KEY or OPENROUTER_API_KEY is empty or not set,
the proxy forwarded requests without a valid Authorization header,
causing providers to return 403 with 'Header of type authorization
was missing'.

Now fail fast with HTTP 503 and a clear message telling users to add
the key to .env, with links to obtain keys.

Fixes #29

Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-17 07:33:56 +00:00