…x cancel_all() TOCTOU
- Remove tree_queue property setter (backward-compat hack; all callers
already migrated to replace_tree_queue()); keep property getter only
- Update 2 remaining tests that still used direct assignment to use
replace_tree_queue()
- Remove _set_connected() 1-line wrapper on DiscordPlatform; assign
_connected directly
- Fix cancel_all() TOCTOU: hold self._lock for the full loop so newly
created trees cannot slip through between the snapshot and cancellation
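The locking pattern of the fix can be sketched as follows; the `TreeQueueManager` internals here (`_lock`, `_trees`, `register`) are illustrative assumptions, not the project's real attributes.

```python
import asyncio


class TreeQueueManager:
    """Minimal stand-in; only the locking pattern matters here."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self._trees: dict[str, asyncio.Task] = {}

    async def register(self, key: str, task: asyncio.Task) -> None:
        async with self._lock:
            self._trees[key] = task

    async def cancel_all(self) -> None:
        # Before the fix: snapshot the trees under the lock, release it,
        # then cancel -- a tree registered in that window escaped
        # cancellation. Holding the lock for the whole loop means
        # register() blocks until every existing tree is cancelled.
        async with self._lock:
            for task in self._trees.values():
                task.cancel()
            self._trees.clear()
```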
---------
Co-authored-by: Claude <noreply@anthropic.com>
## Summary
Added NVIDIA NIM as a second transcription option (alongside local
Whisper). This lets you transcribe voice notes using NVIDIA's cloud API
instead of running Whisper locally.
## What changed
- **Transcription**: Now supports two backends
- Local Whisper: Free, runs on your GPU/CPU (existing)
- NVIDIA NIM: Cloud API via Riva gRPC (new)
- **Supported models**: 8 NVIDIA NIM models added (Parakeet variants for
different languages, Whisper Large V3)
---------
Co-authored-by: Alishahryar1 <alishahryar2@gmail.com>
The max_sessions cap in CLISessionManager was the only thing enforcing
a limit on concurrent CLI processes. Now that provider concurrency is
controlled at the streaming layer (PROVIDER_MAX_CONCURRENCY semaphore),
the CLI session pool cap is redundant and removed entirely.
Changes:
- cli/manager.py: remove max_sessions param, cap check, _cleanup_idle_sessions_unlocked, max_sessions from get_stats()
- config/settings.py: remove max_cli_sessions field
- api/app.py: remove max_sessions=settings.max_cli_sessions from CLISessionManager constructor
- messaging/handler.py: remove "Waiting for slot" status check; stats display no longer shows Max CLI
- .env.example: remove MAX_CLI_SESSIONS line
- tests/cli/test_cli.py: remove max_sessions args and assertion from manager tests
- tests/cli/test_cli_manager_edge_cases.py: remove two tests for cap/cleanup behavior
- tests/api/test_app_lifespan_and_errors.py: remove max_cli_sessions from all SimpleNamespace settings
- tests/config/test_config.py: remove max_cli_sessions isinstance assertion
- tests/conftest.py: remove max_sessions from mock stats
- tests/messaging/test_handler.py: merge slot/capacity tests into single new-conversation test; remove Max CLI assertion from stats test
- tests/messaging/test_handler_markdown_and_status_edges.py: remove "Waiting for slot" assertion; drop max_sessions from all stats mocks
https://claude.ai/code/session_014mrF1WMNgmNjtPBuoQHsbg
- Introduced message_thread_id to the IncomingMessage model for handling forum topic IDs in Telegram.
- Updated messaging platforms (Discord and Telegram) to accept and process message_thread_id in send_message methods.
- Modified message handlers to utilize message_thread_id when sending messages.
- Enhanced test cases to validate the integration of message_thread_id in message handling.
This change improves support for forum supergroups in Telegram, where replies must be routed back to the originating topic, and keeps the send_message interface consistent across platforms.
- Implemented functionality to cancel pending voice transcriptions when a user replies with the /clear command.
- Updated the Telegram and Discord platform classes to manage pending voice messages, including registration and cancellation logic.
- Enhanced the message handler to delete associated messages and notify users when a voice note is cancelled.
- Added tests to ensure the cancellation feature works as expected during transcription.
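The registration/cancellation logic can be sketched like this; the registry name and function signatures are assumptions, not the platform classes' real API.

```python
import asyncio

# Pending transcription tasks keyed by chat id (hypothetical registry).
_pending_voice: dict[int, asyncio.Task] = {}


def register_voice_task(chat_id: int, task: asyncio.Task) -> None:
    _pending_voice[chat_id] = task
    # Self-clean when the task finishes on its own.
    task.add_done_callback(lambda _t: _pending_voice.pop(chat_id, None))


def cancel_pending_voice(chat_id: int) -> bool:
    # Called from the /clear handler; returns True if a transcription
    # was actually cancelled, so the handler knows to notify the user.
    task = _pending_voice.pop(chat_id, None)
    if task is not None and not task.done():
        task.cancel()
        return True
    return False
```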
- Updated transcription logic to utilize Hugging Face's Whisper models instead of faster-whisper.
- Introduced new model mapping and pipeline loading functions.
- Adjusted tests to reflect changes in the transcription process.
- Updated documentation in README, .env.example, and settings to align with the new implementation.
- Ensured compatibility with CUDA 13 and removed unnecessary dependencies.
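The model mapping mentioned above might look like the following; the table entries are plausible Hugging Face checkpoint ids, not necessarily the project's actual mapping.

```python
# Hypothetical mapping from configured model names to Hugging Face
# checkpoint ids used by the transformers pipeline.
_HF_WHISPER_MODELS = {
    "tiny": "openai/whisper-tiny",
    "base": "openai/whisper-base",
    "small": "openai/whisper-small",
    "medium": "openai/whisper-medium",
    "large-v3": "openai/whisper-large-v3",
}


def resolve_whisper_model(name: str) -> str:
    # Fail loudly on unknown names instead of silently falling back.
    try:
        return _HF_WHISPER_MODELS[name]
    except KeyError:
        raise ValueError(f"unknown whisper model {name!r}") from None
```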
- Validate whisper_device in Settings and _get_local_model
- Reject 'auto' with clear ValueError/ValidationError
- Update docs in config, .env.example, README
- Add tests for invalid device and valid cpu/cuda
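The validation can be sketched as below; the real project validates in a pydantic Settings class, so a plain dataclass stands in here and the error message is an assumption.

```python
from dataclasses import dataclass

_VALID_DEVICES = ("cpu", "cuda")


@dataclass
class Settings:
    whisper_device: str = "cpu"

    def __post_init__(self) -> None:
        # 'auto' is rejected outright so a misconfigured device fails at
        # startup with a clear message, not later at inference time.
        if self.whisper_device not in _VALID_DEVICES:
            raise ValueError(
                f"whisper_device must be one of {_VALID_DEVICES}, "
                f"got {self.whisper_device!r}"
            )
```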
Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
- Remove _cuda_failed_models and inference-time CPU fallback
- auto: try CUDA only, fail fast on RuntimeError (no CPU fallback)
- cpu/cuda: use device directly, fail fast on errors
- Update docs in config, .env.example, README
Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
- Introduced a new optional field `status_message_id` in the IncomingMessage model to carry the id of the status message sent during voice note processing.
- Updated the Telegram and Discord platforms to utilize the `status_message_id` for editing status messages instead of sending new ones.
- Modified tests to assert the correct status message ID is used during voice note handling.
- Changed status message text from "Processing voice note..." to "Transcribing voice note..." for clarity.
- Implemented a queue message to indicate "Processing voice note..." for both Telegram and Discord platforms.
- Updated the Telegram platform to send a status message when handling voice notes.
- Enhanced the test for Telegram voice handling to verify the queue message is sent correctly.
- Introduced a mechanism to track models that fail CUDA inference, allowing for automatic fallback to CPU on subsequent requests.
- Updated the model loading logic to respect the CUDA failure state, improving robustness in audio transcription.
- Removed debug logging code from Discord platform to streamline error handling.
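A minimal sketch of the failure-tracking mechanism; the set name echoes the codebase, but the function names and shapes are assumptions.

```python
# Models that have already failed CUDA inference (hypothetical registry).
_cuda_failed_models: set[str] = set()


def resolve_device(model_id: str, configured: str) -> str:
    # Once a model has failed on CUDA, subsequent loads go straight to
    # CPU instead of retrying the known-broken path.
    if configured == "cuda" and model_id in _cuda_failed_models:
        return "cpu"
    return configured


def record_cuda_failure(model_id: str) -> None:
    # Called from the inference error handler when CUDA raises.
    _cuda_failed_models.add(model_id)
```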
- Introduced voice note handling for Discord and Telegram platforms.
- Added configuration options for voice note functionality in settings.py and .env.example.
- Updated README to include voice note instructions and configuration details.
- Implemented audio attachment processing and transcription using faster-whisper.
- Enabled voice note support through message handlers in both platforms.
- Add MessageTree.set_current_task() method
- Update tree_processor to use set_current_task instead of _current_task
- Move nim_settings out of ProviderConfig, pass only to NvidiaNimProvider
- Update api/dependencies and all tests
Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
- Implemented handling of the `/clear` command to clear specific branches or entire trees based on message replies.
- Added tests for various scenarios of the clear command, including clearing branches, handling unknown replies, and clearing entire trees.
- Enhanced `TreeQueueManager` with methods to cancel branches and remove subtrees, ensuring proper state management in the session store.
- Updated `SessionStore` and `TreeRepository` to support removal of node mappings and trees, improving data integrity during clear operations.
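The subtree removal can be sketched as an iterative depth-first walk; the dict-of-children representation is an assumption, and detaching the root from its parent's child list is left to the caller in this sketch.

```python
def remove_subtree(children: dict[str, list[str]], root: str) -> set[str]:
    # Walk the branch rooted at `root`, collecting every reachable node
    # and dropping it from the mapping as we go.
    removed: set[str] = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in removed:
            continue
        removed.add(node)
        stack.extend(children.pop(node, []))
    return removed
```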
- Added request ID context to logging in FastAPI routes and NVIDIA NIM provider.
- Improved logging format to include context variables for better traceability.
- Updated message handling in Telegram and Claude handlers to log message previews.
- Enhanced error logging in NVIDIA NIM provider with request ID for easier debugging.
- Added logging for tree repository actions to track tree and node registrations.
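Request-id context propagation of this kind is usually built on `contextvars` plus a logging filter; a minimal sketch, with the variable and filter names assumed:

```python
import contextvars
import logging

# Set once per request (e.g. in FastAPI middleware); read by every log
# record emitted while handling that request, across awaits.
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        # Attach the current request id so the formatter can print it.
        record.request_id = request_id_var.get()
        return True


logger = logging.getLogger("app")
_handler = logging.StreamHandler()
_handler.setFormatter(logging.Formatter("%(request_id)s %(message)s"))
_handler.addFilter(RequestIdFilter())
logger.addHandler(_handler)
logger.setLevel(logging.INFO)
```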
- Added a step to fail the CI if any '# type: ignore' comments are found in Python files.
- Refactored tests to use mocking for better isolation and reliability.
- Updated type hints and casting in several files to improve type safety.
The drain/put approach used get_nowait() and put_nowait(). Each
put_nowait() increments _unfinished_tasks without a corresponding
task_done(), permanently inflating the counter on every invocation.
Revert to reading the internal deque directly (list(self._queue._queue))
which does not mutate any queue state. This matches the approach used
in remove_from_queue.
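The two snapshot strategies can be demonstrated directly on `asyncio.Queue` (the function names are illustrative): `put_nowait()` increments the unfinished-task counter every time, while peeking at the deque leaves all counters untouched.

```python
import asyncio


def snapshot_drain_put(q: asyncio.Queue) -> list:
    # The buggy approach: each put_nowait() bumps _unfinished_tasks with
    # no matching task_done(), so after every snapshot q.join() needs
    # more task_done() calls than items were ever consumed.
    items = []
    while not q.empty():
        items.append(q.get_nowait())
    for item in items:
        q.put_nowait(item)
    return items


def snapshot_peek(q: asyncio.Queue) -> list:
    # The fix: read the internal deque without mutating any queue state.
    # Private API, but it matches the existing remove_from_queue usage.
    return list(q._queue)
```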
Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
- messaging/telegram.py: Remove unused type: ignore, fix retry_after typing
with isinstance(timedelta), use local app variable for None narrowing
- messaging/tree_data.py: Replace _queue._queue access with drain-and-restore
approach for get_queue_snapshot (avoids private API)
- tests/test_api.py: Use APIError instead of RuntimeError for status_code test
- tests/test_config.py: Use cast(Any, ...) for invalid validation tests
- tests/test_dependencies.py: Add isinstance check for NvidiaNimProvider
- tests/test_handler_markdown_and_status_edges.py: Use patch.object for
tree_queue method mocks
- tests/test_response_models.py: Add isinstance narrowing for content blocks,
use Literal list for stop_reason parametrization
- tests/test_restart_reply_restore.py: Use patch.object for enqueue mock
- tests/test_server_module.py: Use patch.object for uvicorn.run and
get_settings
- tests/test_telegram_edge_cases.py: Use patch.object for method mocks
- tests/test_tree_concurrency.py: Add None assertions for get_node/get_tree
Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>