Commit graph

124 commits

Author SHA1 Message Date
Ali Khokhar
fae8a2a044
Remove over-engineering: drop tree_queue setter, _set_connected(), fix cancel_all() TOCTOU (#63)

- Remove tree_queue property setter (backward-compat hack; all callers
already migrated to replace_tree_queue()); keep property getter only
- Update 2 remaining tests that still used direct assignment to use
replace_tree_queue()
- Remove _set_connected() 1-line wrapper on DiscordPlatform; assign
_connected directly
- Fix cancel_all() TOCTOU: hold self._lock for the full loop so newly
created trees cannot slip through between the snapshot and cancellation

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-03-01 12:34:00 -08:00
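
The cancel_all() TOCTOU fix in commit fae8a2a044 can be illustrated with a minimal sketch. The class below is hypothetical (`TreeRegistry`, `add_tree`, and `is_cancelled` are not from the repository), and it assumes a plain `threading.Lock`; the real code may well use an asyncio lock instead. The point is only that the lock is held for the entire loop rather than just for taking a snapshot.

```python
import threading


class TreeRegistry:
    """Hypothetical registry; names are illustrative, not the project's API."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._trees: dict[str, bool] = {}  # tree_id -> cancelled flag

    def add_tree(self, tree_id: str) -> None:
        # Creation takes the same lock, so it cannot interleave with cancel_all().
        with self._lock:
            self._trees[tree_id] = False

    def is_cancelled(self, tree_id: str) -> bool:
        with self._lock:
            return self._trees[tree_id]

    def cancel_all(self) -> list[str]:
        # Hold the lock for the whole loop. Snapshotting under the lock and
        # cancelling after releasing it would let a tree created in between
        # escape cancellation (the time-of-check/time-of-use gap).
        with self._lock:
            cancelled = []
            for tree_id in self._trees:
                self._trees[tree_id] = True
                cancelled.append(tree_id)
            return cancelled
```

The trade-off is that tree creation blocks while cancellation runs, which is usually acceptable when the loop body is cheap.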
Alishahryar1
35a2760f6e Fixed encapsulation violations 2026-03-01 04:28:22 -08:00
Alishahryar1
302ee28585 Removed dead code 2026-03-01 04:21:06 -08:00
Alishahryar1
34757511a0 Improve deterministic error surfacing across stream and API 2026-03-01 01:32:52 -08:00
Alishahryar1
7f2612d2df Added optimization logging 2026-03-01 01:02:59 -08:00
Ali Khokhar
aee9f0ad93
Add code review fix plan covering 11 issues across modularity, encapsulation, performance, and dead code (#62) 2026-03-01 00:45:33 -08:00
Alishahryar1
744eec2772 Major cleanup with GLM-5 2026-02-28 09:10:21 -08:00
Mauro Druwel
de70700dde
feat: Use NVIDIA NIM ASR for audio transcription (#53)
## Summary
Added NVIDIA NIM as a second transcription option (alongside local
Whisper). This lets you transcribe voice notes using NVIDIA's cloud API
instead of running Whisper locally.

## What changed

- **Transcription**: Now supports two backends

  - Local Whisper: Free, runs on your GPU/CPU (existing)
  - NVIDIA NIM: Cloud API via Riva gRPC (new)

- **Supported models**: 8 NVIDIA NIM models added (Parakeet variants for
different languages, Whisper Large V3)

---------

Co-authored-by: Alishahryar1 <alishahryar2@gmail.com>
2026-02-28 08:48:59 -08:00
Alishahryar1
a74ec74271 Major refactor done with minimax m2.5 2026-02-28 04:36:29 -08:00
Alishahryar1
79a1ae0c54 minor refactor using minimax m2.5 2026-02-27 20:44:39 -08:00
Ali Khokhar
c4d8681000
Backup/before cleanup 20260222 230402 (#58) 2026-02-27 19:50:21 -08:00
Alishahryar1
d6a0e1a401 Provider inferred from model name using prefix 2026-02-19 20:53:02 -08:00
Alishahryar1
21959b6189 lint 2026-02-19 20:40:05 -08:00
Alishahryar1
0c8d59e33e Removed deprecated modules and updated imports 2026-02-19 20:38:11 -08:00
Alishahryar1
2b0495dd08 moved text.py to common utils for providers 2026-02-19 20:32:45 -08:00
Alishahryar1
2c1158f62f removed a test 2026-02-19 20:06:15 -08:00
Claude
45b7e4cafd
Make PROVIDER_MAX_CONCURRENCY required with default of 5
- `max_concurrency` is now always an `int` (default 5) — `None`/unlimited
  is no longer a valid state; omitting the env var uses the default
- `GlobalRateLimiter`: semaphore is always created; `concurrency_slot()`
  no longer has None guards; log message always includes concurrency
- `ProviderConfig.max_concurrency`: `int = 5` (was `int | None = None`)
- `Settings.provider_max_concurrency`: `int = Field(default=5, ...)` —
  setting env var to an invalid value (e.g. empty string) raises
- `.env.example`: uncommented `PROVIDER_MAX_CONCURRENCY=5`
- README: updated config table default from `—` to `5`
- Tests: removed `test_concurrency_slot_noop_when_not_configured`;
  updated mock settings to use `5` instead of `None`

https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
2026-02-19 14:39:42 +00:00
Claude
99f99fce90
Remove max_cli_sessions — CLI session pool is now unbounded
The max_sessions cap in CLISessionManager was the only thing enforcing
a limit on concurrent CLI processes. Now that provider concurrency is
controlled at the streaming layer (PROVIDER_MAX_CONCURRENCY semaphore),
the CLI session pool cap is redundant and removed entirely.

Changes:
- cli/manager.py: remove max_sessions param, cap check, _cleanup_idle_sessions_unlocked, max_sessions from get_stats()
- config/settings.py: remove max_cli_sessions field
- api/app.py: remove max_sessions=settings.max_cli_sessions from CLISessionManager constructor
- messaging/handler.py: remove "Waiting for slot" status check; stats display no longer shows Max CLI
- .env.example: remove MAX_CLI_SESSIONS line
- tests/cli/test_cli.py: remove max_sessions args and assertion from manager tests
- tests/cli/test_cli_manager_edge_cases.py: remove two tests for cap/cleanup behavior
- tests/api/test_app_lifespan_and_errors.py: remove max_cli_sessions from all SimpleNamespace settings
- tests/config/test_config.py: remove max_cli_sessions isinstance assertion
- tests/conftest.py: remove max_sessions from mock stats
- tests/messaging/test_handler.py: merge slot/capacity tests into single new-conversation test; remove Max CLI assertion from stats test
- tests/messaging/test_handler_markdown_and_status_edges.py: remove "Waiting for slot" assertion; drop max_sessions from all stats mocks

https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
2026-02-19 14:31:47 +00:00
Claude
afaf50a972
Add queue-level concurrency limit to provider streaming
Adds max_concurrency cap to GlobalRateLimiter using asyncio.Semaphore.
A request now waits for a concurrency slot before the sliding window rate
limit check, so at most N streams are open to the provider simultaneously,
even when the rate window would allow more.

Changes:
- providers/rate_limit.py: max_concurrency param, _concurrency_sem, concurrency_slot() asynccontextmanager
- providers/openai_compat.py: pass max_concurrency to limiter; wrap execute_with_retry + stream iteration in concurrency_slot()
- providers/base.py: max_concurrency field on ProviderConfig
- config/settings.py: provider_max_concurrency setting (PROVIDER_MAX_CONCURRENCY env var, default None = unlimited)
- api/dependencies.py: pass provider_max_concurrency into all three provider ProviderConfig instantiations
- .env.example: document PROVIDER_MAX_CONCURRENCY (commented out)
- tests/providers/test_provider_rate_limit.py: 5 new tests covering concurrency limit enforcement, slot release on exception, noop when unconfigured
- tests/api/test_dependencies.py: add provider_max_concurrency=None to mock settings helper

https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
2026-02-19 14:23:21 +00:00
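
The concurrency cap described in commit afaf50a972 can be sketched roughly as follows. This is a simplified stand-in, not the repository's `GlobalRateLimiter`: the class name `ConcurrencyLimiter`, the `active`/`peak` counters, and the `fake_stream` and `run_demo` helpers are assumptions added for illustration, and the sliding-window rate check is omitted. Only `asyncio.Semaphore`, the `concurrency_slot()` name, and the `asynccontextmanager` shape come from the commit message.

```python
import asyncio
from contextlib import asynccontextmanager


class ConcurrencyLimiter:
    """Illustrative limiter: at most max_concurrency slots held at once."""

    def __init__(self, max_concurrency: int) -> None:
        self._sem = asyncio.Semaphore(max_concurrency)
        self.active = 0  # slots currently held (demo bookkeeping only)
        self.peak = 0    # highest number of slots held at the same time

    @asynccontextmanager
    async def concurrency_slot(self):
        await self._sem.acquire()
        self.active += 1
        self.peak = max(self.peak, self.active)
        try:
            yield
        finally:
            # Runs on normal exit and on exceptions, so a failed stream
            # always gives its slot back.
            self.active -= 1
            self._sem.release()


async def fake_stream(limiter: ConcurrencyLimiter) -> None:
    # Stand-in for "execute_with_retry + stream iteration".
    async with limiter.concurrency_slot():
        await asyncio.sleep(0.01)


async def run_demo(n_requests: int, max_concurrency: int) -> int:
    limiter = ConcurrencyLimiter(max_concurrency)
    await asyncio.gather(*(fake_stream(limiter) for _ in range(n_requests)))
    return limiter.peak
```

Calling `asyncio.run(run_demo(6, 2))` starts six fake streams but never lets more than two hold a slot at once.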
Shantanu Suryawanshi
24a5e4d968 Fixing sse stream 2026-02-18 21:31:28 -05:00
Alishahryar1
e7ac85264f Improved optimizations to decrease llm calls further and increase throughput 2026-02-18 17:54:41 -08:00
Alishahryar1
593fb55954 Added fix for large replies being truncated entirely leaving no response text 2026-02-18 17:39:38 -08:00
Alishahryar1
16fa9d90cd Add message_thread_id support across messaging components
- Introduced message_thread_id to the IncomingMessage model for handling forum topic IDs in Telegram.
- Updated messaging platforms (Discord and Telegram) to accept and process message_thread_id in send_message methods.
- Modified message handlers to utilize message_thread_id when sending messages.
- Enhanced test cases to validate the integration of message_thread_id in message handling.

This change improves support for forum supergroups in Telegram and enhances message management across platforms.
2026-02-18 16:10:57 -08:00
Alishahryar1
2220880671 Add voice note cancellation feature during transcription
- Implemented functionality to cancel pending voice transcriptions when a user replies with the /clear command.
- Updated the Telegram and Discord platform classes to manage pending voice messages, including registration and cancellation logic.
- Enhanced the message handler to delete associated messages and notify users when a voice note is cancelled.
- Added tests to ensure the cancellation feature works as expected during transcription.
2026-02-18 06:36:42 -08:00
Alishahryar1
75e066f17f Refactor voice note transcription to use Hugging Face transformers Whisper pipeline
- Updated transcription logic to utilize Hugging Face's Whisper models instead of faster-whisper.
- Introduced new model mapping and pipeline loading functions.
- Adjusted tests to reflect changes in the transcription process.
- Updated documentation in README, .env.example, and settings to align with the new implementation.
- Ensured compatibility with CUDA 13 and removed unnecessary dependencies.
2026-02-18 06:18:28 -08:00
Cursor Agent
db646ef2db Remove auto support for whisper_device; only cpu and cuda allowed
- Validate whisper_device in Settings and _get_local_model
- Reject 'auto' with clear ValueError/ValidationError
- Update docs in config, .env.example, README
- Add tests for invalid device and valid cpu/cuda

Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-18 13:38:59 +00:00
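
The device validation from commit db646ef2db might look something like the sketch below. The function name `validate_whisper_device` and the error wording are assumptions; the commit only states that `cpu` and `cuda` are accepted and that `auto` is rejected with a clear error.

```python
ALLOWED_WHISPER_DEVICES = ("cpu", "cuda")


def validate_whisper_device(device: str) -> str:
    """Return the device unchanged if allowed, else raise ValueError."""
    if device not in ALLOWED_WHISPER_DEVICES:
        # 'auto' lands here like any other unsupported value.
        raise ValueError(
            f"whisper_device must be one of {ALLOWED_WHISPER_DEVICES}, got {device!r}"
        )
    return device
```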
Alishahryar1
b05d0d2703 new linter rules and fixes 2026-02-18 04:13:41 -08:00
Cursor Agent
e9beb28897 fix: validate API keys at provider init to prevent 403 'authorization missing'
When NVIDIA_NIM_API_KEY or OPENROUTER_API_KEY is empty or not set,
the proxy forwarded requests without a valid Authorization header,
causing providers to return 403 with 'Header of type authorization
was missing'.

Now fail fast with HTTP 503 and a clear message telling users to add
the key to .env, with links to obtain keys.

Fixes #29

Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-17 07:33:56 +00:00
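
The fail-fast behaviour described in commit e9beb28897 can be sketched as a small check at provider initialisation. Everything named here is hypothetical (`MissingAPIKeyError`, `require_api_key`, the message text); the commit only says that an empty or unset key now produces HTTP 503 with a message pointing users to `.env`.

```python
from typing import Optional


class MissingAPIKeyError(Exception):
    """Hypothetical error type carrying the HTTP status to return."""

    status_code = 503


def require_api_key(env_var: str, value: Optional[str]) -> str:
    # Treat unset, empty, and whitespace-only values the same way, so the
    # proxy never forwards a request without a usable Authorization header.
    if value is None or not value.strip():
        raise MissingAPIKeyError(
            f"{env_var} is empty or not set. Add it to your .env file."
        )
    return value.strip()
```

Checking once at initialisation turns an opaque upstream 403 into a local error that names the missing variable.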
Alishahryar1
7300156925 Add status message handling for voice note processing
- Introduced a new optional field `status_message_id` in the IncomingMessage model to track the status of voice note processing.
- Updated the Telegram and Discord platforms to utilize the `status_message_id` for editing status messages instead of sending new ones.
- Modified tests to assert the correct status message ID is used during voice note handling.
- Changed status message text from "Processing voice note..." to "Transcribing voice note..." for clarity.
2026-02-16 20:48:47 -08:00
Alishahryar1
2b1ae3deea Add voice note processing feedback for Telegram and Discord platforms
- Implemented a queue message to indicate "Processing voice note..." for both Telegram and Discord platforms.
- Updated the Telegram platform to send a status message when handling voice notes.
- Enhanced the test for Telegram voice handling to verify the queue message is sent correctly.
2026-02-16 20:37:10 -08:00
Alishahryar1
d668f6e476 Add voice note transcription feature
- Introduced voice note handling for Discord and Telegram platforms.
- Added configuration options for voice note functionality in settings.py and .env.example.
- Updated README to include voice note instructions and configuration details.
- Implemented audio attachment processing and transcription using faster-whisper.
- Enabled voice note support through message handlers in both platforms.
2026-02-16 20:14:59 -08:00
Cursor Agent
034d185770 Fix type check errors and CI
- Add re-exports: _is_gfm_table_header_line, _normalize_gfm_tables (discord_markdown)
- Add re-exports: _parse_allowed_channels, _get_discord (discord)
- Add re-exports: NetworkError, RetryAfter, TelegramError (telegram)
- Fix Python 3 except syntax (discord, discord_markdown, telegram_markdown, request_utils)
- Update test patches to messaging.platforms.* for moved modules

Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-17 02:31:40 +00:00
Cursor Agent
4b4f87515d Phase 7: Directory restructuring (messaging/ and tests/)
- Create messaging/platforms/ (base, discord, telegram, factory)
- Create messaging/rendering/ (discord_markdown, telegram_markdown)
- Create messaging/trees/ (data, repository, processor, queue_manager)
- Organize tests/ into api/, providers/, messaging/, cli/, config/
- Add backward-compatible re-exports at old locations
- Update handler.py and test_messaging_factory.py imports
- Fix Telegram type hints for TELEGRAM_AVAILABLE=False case
- Fix Python 3 except syntax in discord_markdown

Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-17 02:25:42 +00:00
Cursor Agent
bfc781e0ed Phase 4-6: Dead code removal, performance, minor fixes
Phase 4:
- Remove legacy SessionRecord, _sessions, _msg_to_session from SessionStore
- Fix hardcoded provider in root endpoint (use settings.provider_type)
- Update session store tests

Phase 5:
- Use list-based string accumulation in ThinkingSegment, TextSegment, ToolCallSegment
- Cache MAX_MESSAGE_LOG_ENTRIES_PER_CHAT at SessionStore init
- Use iterative DFS in MessageTree.get_descendants

Phase 6:
- Add comment for abstract async generator workaround in BaseProvider
- Rename TELEGRAM_EDIT log tags to PLATFORM_EDIT in handler

Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-17 02:01:01 +00:00
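
The "iterative DFS" item in Phase 5 of commit bfc781e0ed can be illustrated with a sketch like the one below. It is not the repository's `MessageTree.get_descendants`; the standalone function signature and the `children` adjacency mapping are assumptions, and the input is assumed to be a tree (no cycles).

```python
def get_descendants(children: dict[str, list[str]], root: str) -> list[str]:
    """Return all descendants of root in pre-order, without recursion."""
    result: list[str] = []
    # Push children reversed so the first child is popped first.
    stack = list(reversed(children.get(root, [])))
    while stack:
        node = stack.pop()
        result.append(node)
        stack.extend(reversed(children.get(node, [])))
    return result
```

An explicit stack avoids Python's recursion limit on deep reply chains, which is a plausible motivation for the change.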
Cursor Agent
72b7e34999 Phase 3: Fix encapsulation violations
- Add MessageTree.set_current_task() method
- Update tree_processor to use set_current_task instead of _current_task
- Move nim_settings out of ProviderConfig, pass only to NvidiaNimProvider
- Update api/dependencies and all tests

Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-17 01:58:51 +00:00
Cursor Agent
bab86a2687 Phase 2: Extract OpenAICompatibleProvider base class
- Create providers/openai_compat.py with shared streaming logic
- Refactor NvidiaNimProvider, OpenRouterProvider, LMStudioProvider to extend it
- OpenRouter overrides _handle_extra_reasoning for reasoning_details
- Update test patches to providers.openai_compat

Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-17 01:57:34 +00:00
Cursor Agent
d3f5f8877f Phase 1: Extract shared provider utils into providers/common/
- Create providers/common/ with sse_builder, message_converter, think_parser,
  heuristic_tool_parser, error_mapping
- Update nvidia_nim/utils and errors to re-export from common for backward compat
- Update all provider clients and tests to import from providers.common
- Remove duplicated files from nvidia_nim/utils/

Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-02-17 01:55:38 +00:00
Alishahryar1
6abcdb4017 Add clear command functionality to message handler
- Implemented handling of the `/clear` command to clear specific branches or entire trees based on message replies.
- Added tests for various scenarios of the clear command, including clearing branches, handling unknown replies, and clearing entire trees.
- Enhanced `TreeQueueManager` with methods to cancel branches and remove subtrees, ensuring proper state management in the session store.
- Updated `SessionStore` and `TreeRepository` to support removal of node mappings and trees, improving data integrity during clear operations.
2026-02-16 16:23:26 -08:00
Alishahryar1
b8d1bfecbb Improved test coverage 2026-02-16 15:58:16 -08:00
Alishahryar1
71cc814efb Add plans directory support to CLI session management
- Introduced `plans_directory` parameter in `CLISessionManager`, `CLISession`, and updated related methods to handle plan files.
- Updated `api/app.py` to pass the plans directory to the CLI session manager.
- Added a new test in `test_cli.py` to verify that the `--settings plansDirectory` argument is included when starting a task with a specified plans directory.
2026-02-16 15:34:26 -08:00
Alishahryar1
6939b52bca removed support for nested subagent 2026-02-16 14:58:51 -08:00
Alishahryar1
01852e1638 Add configurable HTTP timeouts for provider API requests
Updated the README to include new timeout settings. Implemented these timeouts in the provider classes and added corresponding tests to ensure they are correctly passed to the client. Also included environment variable support for the new settings.
2026-02-16 01:40:15 -08:00
Alishahryar1
1cde6d425f lint 2026-02-16 00:20:28 -08:00
Alishahryar1
6511542bfe Implement Discord bot support and update README for messaging platform changes 2026-02-16 00:08:09 -08:00
Alishahryar1
1d52e5d3bb Updated thinking keys 2026-02-15 23:49:54 -08:00
Alishahryar1
ad44594197 Added ruff check to workflow and fixed ruff check errors 2026-02-15 22:02:15 -08:00
Alishahryar1
539854fe7b Refactor done using GLM-5 2026-02-15 21:58:03 -08:00
Alishahryar1
4fb585ed5b added more tests 2026-02-15 19:50:06 -08:00
Alishahryar1
b83be84313 Add LM Studio provider support
- Introduced `LMStudioProvider` to the provider system.
- Added a new fixture `lmstudio_provider` in `conftest.py` for testing.
- Updated `get_provider` function to handle `lmstudio` as a valid provider type.
- Enhanced README and `.env.example` to include LM Studio configuration details.
- Updated settings to accommodate LM Studio's base URL and provider type.
- Added tests to verify the functionality of the LM Studio provider.
2026-02-15 19:41:03 -08:00
Alishahryar1
054f9869b7 Refactor rate limiting configuration to use unified provider settings
- Replaced NVIDIA NIM and OpenRouter specific rate limit settings with a generic provider rate limit in settings, tests, and environment files.
- Updated README.md to reflect the new provider rate limit configuration.
- Adjusted tests to validate the new provider rate limit attributes.
2026-02-15 11:03:59 -08:00