Fix: Robust Slack rate limiting, error handling & GitHub org repos
This update delivers comprehensive improvements to Slack connector stability and enhances the GitHub connector.
**Slack Connector (`slack_history.py`, `connectors_indexing_tasks.py`):**
- I've implemented proactive delays (1.2s for `conversations.history`, 3s for `conversations.list` pagination) and `Retry-After` header handling for 429 rate limit errors across `conversations.list`, `conversations.history`, and `users.info` API calls.
- I've made conversation history fetching handle `not_in_channel` errors gracefully: it logs a warning and skips the channel.
- I've refactored channel info fetching: `get_all_channels` now returns richer channel data (including `is_member`, `is_private`).
- I've removed direct calls to `conversations.info` from `connectors_indexing_tasks.py`; the indexing task now uses the richer data from `get_all_channels` instead, which avoids the rate limiting those extra calls triggered.
- I've corrected a `SyntaxError` caused by a non-printable character in `slack_history.py`.
- I've enhanced logging for rate limit actions, delays, and errors.
- I've updated unit tests in `test_slack_history.py` to cover all new logic.
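
To illustrate the channel handling described above, here is a minimal sketch, not the actual connector code. `FakeSlackError`, `index_channels`, and the `fetch_history` callable are hypothetical names for this example, and skipping channels where `is_member` is false is an assumption about how the richer channel data might be used.

```python
import logging
from typing import Any, Callable, Dict, Iterable, List

logger = logging.getLogger(__name__)


class FakeSlackError(Exception):
    """Hypothetical stand-in for an API error that carries Slack's error code."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def index_channels(
    channels: Iterable[Dict[str, Any]],
    fetch_history: Callable[[str], List[Any]],
) -> Dict[str, List[Any]]:
    """Fetch history per channel, skipping channels that cannot be read."""
    indexed: Dict[str, List[Any]] = {}
    for channel in channels:
        channel_id = channel["id"]
        # Assumption: membership comes from the channel listing, so no
        # separate conversations.info call is needed here.
        if not channel.get("is_member", False):
            logger.info("Skipping %s: not a member", channel_id)
            continue
        try:
            indexed[channel_id] = fetch_history(channel_id)
        except FakeSlackError as exc:
            if exc.code == "not_in_channel":
                # Log a warning and move on instead of failing the whole run.
                logger.warning("Skipping %s: not_in_channel", channel_id)
                continue
            raise  # any other error is not handled here
    return indexed
```

Passing `fetch_history` in as a callable keeps the sketch independent of any particular Slack client and makes the skip behaviour easy to unit test.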
**GitHub Connector (`github_connector.py`):**
- I've modified `get_user_repositories` to fetch all repositories accessible to the authenticated user (owned, collaborator, and organization repositories) by changing the API call parameter from `type='owner'` to `type='all'`.
- I've included unit tests in `test_github_connector.py` for this change.
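
The parameter change can be sketched as follows. This is an illustration under assumptions, not the code in `github_connector.py`: `RepoClient` and `FakeClient` are hypothetical stand-ins for whatever GitHub client the connector uses, and the returned fields are chosen only for the example.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable, List, Protocol


class RepoClient(Protocol):
    """Hypothetical interface: any client exposing repositories(type=...)."""

    def repositories(self, type: str) -> Iterable[Any]: ...


def get_user_repositories(client: RepoClient) -> List[Dict[str, Any]]:
    """List every repository the authenticated user can access."""
    repos = []
    # type="all" covers owned, collaborator, and organization repositories;
    # type="owner" would return only repositories the user owns.
    for repo in client.repositories(type="all"):
        repos.append({"full_name": repo.full_name, "private": repo.private})
    return repos


class FakeClient:
    """Test double that records the requested type and returns canned repos."""

    def __init__(self) -> None:
        self.requested_type = None

    def repositories(self, type: str) -> Iterable[Any]:
        self.requested_type = type
        return [
            SimpleNamespace(full_name="me/personal", private=True),
            SimpleNamespace(full_name="my-org/service", private=False),
        ]
```

A test double like `FakeClient` lets a unit test assert that `type="all"` is what actually gets requested, without touching the network.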
The `get_all_channels` method in `slack_history.py` was making paginated
requests to `conversations.list` without any delay, leading to HTTP 429
errors when fetching channels from large Slack workspaces.
This commit introduces the following changes:
- Adds a 3-second delay between paginated calls to `conversations.list`
to comply with Slack's Tier 2 rate limits (approx. 20 requests/minute).
- Implements handling for the `Retry-After` header when a 429 error is
received. The system will wait for the specified duration before
retrying. If the header is missing or invalid, a default of 60 seconds
is used.
- Adds comprehensive unit tests to verify the new delay and retry logic,
covering scenarios with and without the `Retry-After` header, as well
as other API errors.
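
The delay and retry behaviour listed above could look roughly like this. It is a sketch under assumptions rather than the committed implementation: `RateLimitedError`, `parse_retry_after`, `fetch_all_channels`, and the injected `fetch_page` and `sleep` callables are hypothetical names, and the page shape (`channels`, `response_metadata.next_cursor`) follows Slack's cursor pagination format.

```python
import time
from typing import Any, Callable, Dict, List, Optional

PAGE_DELAY = 3            # seconds between pages (about 20 requests/minute)
DEFAULT_RETRY_AFTER = 60  # seconds, used when the header is missing or invalid


class RateLimitedError(Exception):
    """Hypothetical stand-in for an HTTP 429 response with its headers."""

    def __init__(self, headers: Optional[Dict[str, str]] = None):
        super().__init__("rate limited")
        self.headers = headers or {}


def parse_retry_after(headers: Dict[str, str],
                      default: int = DEFAULT_RETRY_AFTER) -> int:
    """Return the Retry-After value in seconds, or the default if unusable."""
    try:
        value = int(headers.get("Retry-After"))
    except (TypeError, ValueError):
        return default
    return value if value > 0 else default


def fetch_all_channels(
    fetch_page: Callable[[Optional[str]], Dict[str, Any]],
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict[str, Any]]:
    """Collect all channels, pacing page requests and honouring 429s."""
    channels: List[Dict[str, Any]] = []
    cursor: Optional[str] = None
    pace = False  # True once a page succeeded and another page follows
    while True:
        if pace:
            sleep(PAGE_DELAY)
            pace = False
        try:
            page = fetch_page(cursor)
        except RateLimitedError as exc:
            # Wait as instructed, then retry the same cursor.
            # Note: this sketch retries without an upper bound.
            sleep(parse_retry_after(exc.headers))
            continue
        channels.extend(page.get("channels", []))
        cursor = page.get("response_metadata", {}).get("next_cursor") or None
        if not cursor:
            return channels
        pace = True
```

Injecting `sleep` means tests can record the requested waits instead of actually pausing, which is one way to cover the cases with and without a `Retry-After` header.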