Commit graph

447 commits

Author SHA1 Message Date
Ali Khokhar
fae8a2a044
Remove over-engineering: drop tree_queue setter, _set_connected(), fix cancel_all() TOCTOU (#63)

- Remove tree_queue property setter (backward-compat hack; all callers
already migrated to replace_tree_queue()); keep property getter only
- Update 2 remaining tests that still used direct assignment to use
replace_tree_queue()
- Remove _set_connected() 1-line wrapper on DiscordPlatform; assign
_connected directly
- Fix cancel_all() TOCTOU: hold self._lock for the full loop so newly
created trees cannot slip through between the snapshot and cancellation

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-03-01 12:34:00 -08:00
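The cancel_all() TOCTOU fix in the commit above can be sketched as follows. `TreeRegistry` and its `_trees` dict are hypothetical stand-ins for the real classes; only the locking pattern mirrors the commit message:

```python
import asyncio


class TreeRegistry:
    """Illustrative sketch (hypothetical names) of the cancel_all() fix:
    hold the lock for the whole loop, so a tree created concurrently
    cannot slip in between taking a snapshot and cancelling."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self._trees: dict[str, asyncio.Task] = {}

    async def cancel_all(self) -> None:
        # Before the fix: snapshot under the lock, then cancel outside it,
        # leaving a window (TOCTOU) where new trees could be registered.
        # After the fix: self._lock is held for the full loop.
        async with self._lock:
            for task in self._trees.values():
                task.cancel()
            self._trees.clear()
```

Registration would also take `self._lock`, so creation and cancellation serialize against each other.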
Ali Khokhar
25b329a3fc
Update README
Removed duplicate VSCode Extension Setup instructions from README.md.
2026-03-01 05:30:30 -08:00
Alishahryar1
35a2760f6e Fixed encapsulation violations 2026-03-01 04:28:22 -08:00
Alishahryar1
302ee28585 Removed dead code 2026-03-01 04:21:06 -08:00
Alishahryar1
34757511a0 Improve deterministic error surfacing across stream and API 2026-03-01 01:32:52 -08:00
Alishahryar1
7f2612d2df Added optimization logging 2026-03-01 01:02:59 -08:00
Ali Khokhar
aee9f0ad93
Add code review fix plan covering 11 issues across modularity, encapsulation, performance, and dead code (#62) 2026-03-01 00:45:33 -08:00
Alishahryar1
c54c57a742 style 2026-02-28 09:13:25 -08:00
Alishahryar1
744eec2772 Major cleanup with GLM-5 2026-02-28 09:10:21 -08:00
Mauro Druwel
de70700dde
feat: Use NVIDIA NIM ASR for audio transcription (#53)
## Summary
Added NVIDIA NIM as a second transcription option (alongside local
Whisper). This lets you transcribe voice notes using NVIDIA's cloud API
instead of running Whisper locally.

## What changed

- **Transcription**: Now supports two backends

  - Local Whisper: Free, runs on your GPU/CPU (existing)
  - NVIDIA NIM: Cloud API via Riva gRPC (new)

- **Supported models**: 8 NVIDIA NIM models added (Parakeet variants for
different languages, Whisper Large V3)

---------

Co-authored-by: Alishahryar1 <alishahryar2@gmail.com>
2026-02-28 08:48:59 -08:00
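The two-backend split described in PR #53 amounts to a dispatch like the sketch below. All names are hypothetical and the actual Whisper inference and Riva gRPC calls are stubbed out; only the shape of the choice is illustrated:

```python
from enum import Enum


class TranscriptionBackend(Enum):
    LOCAL_WHISPER = "whisper"  # free, runs on local GPU/CPU (existing)
    NVIDIA_NIM = "nim"         # cloud API via Riva gRPC (new)


def transcribe(audio: bytes, backend: TranscriptionBackend) -> str:
    # Dispatch between the two backends the PR describes; the real
    # model calls are replaced by stubs in this sketch.
    if backend is TranscriptionBackend.LOCAL_WHISPER:
        return _whisper_transcribe(audio)
    return _nim_transcribe(audio)


def _whisper_transcribe(audio: bytes) -> str:
    return f"[whisper transcript of {len(audio)} bytes]"


def _nim_transcribe(audio: bytes) -> str:
    return f"[nim transcript of {len(audio)} bytes]"
```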
Alishahryar1
f1f6080224 Updated agent instructions and renamed lint check to format check 2026-02-28 07:20:00 -08:00
Alishahryar1
a74ec74271 Major refactor done with minimax m2.5 2026-02-28 04:36:29 -08:00
Alishahryar1
cfe43bf5be Updated README 2026-02-28 04:21:05 -08:00
Ali Khokhar
7d99b38b70
Update environment variable syntax in README 2026-02-28 04:04:56 -08:00
Alishahryar1
79a1ae0c54 minor refactor using minimax m2.5 2026-02-27 20:44:39 -08:00
Ali Khokhar
f9e8226120
Clarify Docker integration acceptance in README
Updated README to clarify Docker integration status.
2026-02-27 20:00:57 -08:00
Ali Khokhar
c4d8681000
Backup/before cleanup 20260222 230402 (#58) 2026-02-27 19:50:21 -08:00
Ali Khokhar
e2840095ce
Merge pull request #47 from Alishahryar1/cursor/readme-env-example-consistency-0722 2026-02-20 01:38:00 -08:00
Cursor Agent
5d5055f96f docs: update README for removed PROVIDER_TYPE, model prefix format
Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-20 09:37:25 +00:00
Ali Khokhar
4c0c1f125b
Update README.md 2026-02-20 01:33:57 -08:00
Alishahryar1
d6a0e1a401 Provider inferred from model name using prefix 2026-02-19 20:53:02 -08:00
Alishahryar1
21959b6189 lint 2026-02-19 20:40:05 -08:00
Alishahryar1
b6e602d058 removed deadcode 2026-02-19 20:39:38 -08:00
Alishahryar1
0c8d59e33e Removed deprecated modules and updated imports 2026-02-19 20:38:11 -08:00
Alishahryar1
2b0495dd08 moved text.py to common utils for providers 2026-02-19 20:32:45 -08:00
Alishahryar1
2ad64cc97a quoted string vars in env example 2026-02-19 20:27:28 -08:00
Alishahryar1
d21ed84171 updated uv version 2026-02-19 20:23:37 -08:00
Alishahryar1
2c1158f62f removed a test 2026-02-19 20:06:15 -08:00
Alishahryar1
aec4510a0a Upgraded python version 2026-02-19 20:02:44 -08:00
Alishahryar1
39cc39c341 standardized python version 2026-02-19 20:01:01 -08:00
Ali Khokhar
81a73f3349
Merge pull request #46 from rishiskhare/main 2026-02-19 10:17:26 -08:00
Rishi Khare
8ffe587a8f docs: rename model picker summary to Multi-Model Support (Model Picker)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 10:40:09 -05:00
Rishi Khare
a5496346ca docs: clarify claude-pick avoids needing to edit MODEL in .env
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 10:16:22 -05:00
Rishi Khare
39ad80f6e6 docs: mention source ~/.bashrc as alternative to ~/.zshrc in model picker
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 10:00:43 -05:00
Rishi Khare
5c6d8e150e docs: move model picker to summary within getting started and add demo video
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 09:58:55 -05:00
Ali Khokhar
00e6419881
Merge pull request #45 from Alishahryar1/claude/queue-concurrency-limits-EFnHT 2026-02-19 06:41:11 -08:00
Claude
45b7e4cafd
Make PROVIDER_MAX_CONCURRENCY required with default of 5
- `max_concurrency` is now always an `int` (default 5) — `None`/unlimited
  is no longer a valid state; omitting the env var uses the default
- `GlobalRateLimiter`: semaphore is always created; `concurrency_slot()`
  no longer has None guards; log message always includes concurrency
- `ProviderConfig.max_concurrency`: `int = 5` (was `int | None = None`)
- `Settings.provider_max_concurrency`: `int = Field(default=5, ...)` —
  setting env var to an invalid value (e.g. empty string) raises
- `.env.example`: uncommented `PROVIDER_MAX_CONCURRENCY=5`
- README: updated config table default from `—` to `5`
- Tests: removed `test_concurrency_slot_noop_when_not_configured`;
  updated mock settings to use `5` instead of `None`

https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
2026-02-19 14:39:42 +00:00
Claude
41fd316c76
Update README for provider concurrency and removal of MAX_CLI_SESSIONS
- Config table: add PROVIDER_MAX_CONCURRENCY, remove MAX_CLI_SESSIONS
- Discord Bot capabilities: replace "Up to 10 concurrent" with "Unlimited concurrent... (controlled by PROVIDER_MAX_CONCURRENCY)"
- Features table: note optional concurrency cap in Smart Rate Limiting row

https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
2026-02-19 14:34:15 +00:00
Claude
99f99fce90
Remove max_cli_sessions — CLI session pool is now unbounded
The max_sessions cap in CLISessionManager was the only thing enforcing
a limit on concurrent CLI processes. Now that provider concurrency is
controlled at the streaming layer (PROVIDER_MAX_CONCURRENCY semaphore),
the CLI session pool cap is redundant and removed entirely.

Changes:
- cli/manager.py: remove max_sessions param, cap check, _cleanup_idle_sessions_unlocked, max_sessions from get_stats()
- config/settings.py: remove max_cli_sessions field
- api/app.py: remove max_sessions=settings.max_cli_sessions from CLISessionManager constructor
- messaging/handler.py: remove "Waiting for slot" status check; stats display no longer shows Max CLI
- .env.example: remove MAX_CLI_SESSIONS line
- tests/cli/test_cli.py: remove max_sessions args and assertion from manager tests
- tests/cli/test_cli_manager_edge_cases.py: remove two tests for cap/cleanup behavior
- tests/api/test_app_lifespan_and_errors.py: remove max_cli_sessions from all SimpleNamespace settings
- tests/config/test_config.py: remove max_cli_sessions isinstance assertion
- tests/conftest.py: remove max_sessions from mock stats
- tests/messaging/test_handler.py: merge slot/capacity tests into single new-conversation test; remove Max CLI assertion from stats test
- tests/messaging/test_handler_markdown_and_status_edges.py: remove "Waiting for slot" assertion; drop max_sessions from all stats mocks

https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
2026-02-19 14:31:47 +00:00
Claude
afaf50a972
Add queue-level concurrency limit to provider streaming
Adds max_concurrency cap to GlobalRateLimiter using asyncio.Semaphore.
A request now waits for a concurrency slot before the sliding window rate
limit check, so at most N streams are open to the provider simultaneously,
even when the rate window would allow more.

Changes:
- providers/rate_limit.py: max_concurrency param, _concurrency_sem, concurrency_slot() asynccontextmanager
- providers/openai_compat.py: pass max_concurrency to limiter; wrap execute_with_retry + stream iteration in concurrency_slot()
- providers/base.py: max_concurrency field on ProviderConfig
- config/settings.py: provider_max_concurrency setting (PROVIDER_MAX_CONCURRENCY env var, default None = unlimited)
- api/dependencies.py: pass provider_max_concurrency into all three provider ProviderConfig instantiations
- .env.example: document PROVIDER_MAX_CONCURRENCY (commented out)
- tests/providers/test_provider_rate_limit.py: 5 new tests covering concurrency limit enforcement, slot release on exception, noop when unconfigured
- tests/api/test_dependencies.py: add provider_max_concurrency=None to mock settings helper

https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
2026-02-19 14:23:21 +00:00
Alishahryar1
4aff0b910f provider type quoted 2026-02-18 19:54:30 -08:00
Alishahryar1
cf1284b784 default voice note enabled set to false 2026-02-18 19:54:13 -08:00
Alishahryar1
416f259c8b Reordered env example 2026-02-18 19:53:30 -08:00
Alishahryar1
c35ecba9d8 Update Whisper model configuration to use 'base' as the default model ID 2026-02-18 19:36:58 -08:00
Alishahryar1
9a18e4f1d8 Removed plan.md 2026-02-18 18:51:40 -08:00
Ali Khokhar
4984fffffa
Merge pull request #43 from suryawanshishantanu6/feature/fix-input-token 2026-02-18 18:43:14 -08:00
Shantanu Suryawanshi
3d81f44803
Merge branch 'main' into feature/fix-input-token 2026-02-18 21:41:26 -05:00
Ali Khokhar
889556c2f9
Merge pull request #42 from rishiskhare/model-picker 2026-02-18 18:38:41 -08:00
Shantanu Suryawanshi
24a5e4d968 Fixing sse stream 2026-02-18 21:31:28 -05:00
Alishahryar1
e7ac85264f Improved optimizations to decrease llm calls further and increase throughput 2026-02-18 17:54:41 -08:00