Commit graph

8 commits

Author SHA1 Message Date
Alishahryar1
34757511a0 Improve deterministic error surfacing across stream and API 2026-03-01 01:32:52 -08:00
Ali Khokhar
aee9f0ad93
Add code review fix plan covering 11 issues across modularity, encapsulation, performance, and dead code (#62) 2026-03-01 00:45:33 -08:00
Alishahryar1
a74ec74271 Major refactor done with minimax m2.5 2026-02-28 04:36:29 -08:00
Alishahryar1
79a1ae0c54 minor refactor using minimax m2.5 2026-02-27 20:44:39 -08:00
Claude
afaf50a972
Add queue-level concurrency limit to provider streaming
Adds max_concurrency cap to GlobalRateLimiter using asyncio.Semaphore.
A request now waits for a concurrency slot before the sliding window rate
limit check, so at most N streams are open to the provider simultaneously,
even when the rate window would allow more.

Changes:
- providers/rate_limit.py: max_concurrency param, _concurrency_sem, concurrency_slot() asynccontextmanager
- providers/openai_compat.py: pass max_concurrency to limiter; wrap execute_with_retry + stream iteration in concurrency_slot()
- providers/base.py: max_concurrency field on ProviderConfig
- config/settings.py: provider_max_concurrency setting (PROVIDER_MAX_CONCURRENCY env var, default None = unlimited)
- api/dependencies.py: pass provider_max_concurrency into all three provider ProviderConfig instantiations
- .env.example: document PROVIDER_MAX_CONCURRENCY (commented out)
- tests/providers/test_provider_rate_limit.py: 5 new tests covering concurrency limit enforcement, slot release on exception, noop when unconfigured
- tests/api/test_dependencies.py: add provider_max_concurrency=None to mock settings helper

https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
2026-02-19 14:23:21 +00:00
Shantanu Suryawanshi
24a5e4d968 Fixing sse stream 2026-02-18 21:31:28 -05:00
Alishahryar1
b05d0d2703 new linter rules and fixes 2026-02-18 04:13:41 -08:00
Cursor Agent
bab86a2687 Phase 2: Extract OpenAICompatibleProvider base class
- Create providers/openai_compat.py with shared streaming logic
- Refactor NvidiaNimProvider, OpenRouterProvider, LMStudioProvider to extend it
- OpenRouter overrides _handle_extra_reasoning for reasoning_details
- Update test patches to providers.openai_compat

Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-17 01:57:34 +00:00