Updates the GitHub Actions workflow (`.github/workflows/docker-publish.yml`)
to build and push Docker images for both `linux/amd64` and `linux/arm64`
architectures.
The `platforms` attribute has been added to the `docker/build-push-action`
step for both the backend and frontend jobs. This ensures that users on
different CPU architectures can use the published images from ghcr.io.
Adds a GitHub Actions workflow to automatically build and publish Docker images for the backend and frontend services.
The workflow (`.github/workflows/docker-publish.yml`) is triggered on pushes to the `main` branch. It includes two jobs:
1. `build_and_push_backend`: Builds the Docker image from `surfsense_backend/Dockerfile` and pushes it to `ghcr.io/<owner>/surfsense_backend:<commit_sha>`.
2. `build_and_push_frontend`: Builds the Docker image from `surfsense_web/Dockerfile` and pushes it to `ghcr.io/<owner>/surfsense_web:<commit_sha>`.
Both jobs include steps for:
- Checking out the repository.
- Setting up QEMU and Docker Buildx.
- Logging into the GitHub Container Registry (ghcr.io) using `secrets.GITHUB_TOKEN`.
- Building and pushing the respective Docker images, tagged with the commit SHA.
- Adding OCI labels for image source, creation date, and revision.
This CI pipeline automates the process of creating and distributing Docker images for the application, ensuring that new versions are available in the GitHub Container Registry upon changes to the main branch.
This commit addresses recurring `SyntaxError: invalid non-printable character U+001B`
errors in `surfsense_backend/app/connectors/slack_history.py`.
The file was cleaned to remove all occurrences of the
U+001B (ESCAPE) character. This ensures that previously introduced
problematic control characters are fully removed, allowing the application
to parse and load the module correctly.
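A cleanup of this kind can be sketched in a few lines of Python. This is only an illustration of removing U+001B from a source file, not the exact procedure used in this commit; the helper names `strip_escape_chars` and `clean_file` are hypothetical.

```python
from pathlib import Path

ESC = "\x1b"  # U+001B, the ESCAPE control character


def strip_escape_chars(text: str) -> str:
    """Return text with every U+001B character removed."""
    return text.replace(ESC, "")


def clean_file(path: Path) -> int:
    """Rewrite the file without U+001B characters.

    Returns the number of characters removed (0 means the file was
    already clean and was left untouched).
    """
    original = path.read_text(encoding="utf-8")
    cleaned = strip_escape_chars(original)
    if cleaned != original:
        path.write_text(cleaned, encoding="utf-8")
    return len(original) - len(cleaned)
```

Running `clean_file(Path("surfsense_backend/app/connectors/slack_history.py"))` a second time should return 0, which is a quick way to confirm that no ESCAPE characters remain.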
Fix: Robust Slack rate limiting, error handling & GitHub org repos
This update delivers comprehensive improvements to Slack connector stability and enhances the GitHub connector.
**Slack Connector (`slack_history.py`, `connectors_indexing_tasks.py`):**
- Implements proactive delays (1.2s for `conversations.history`, 3s for `conversations.list` pagination) and `Retry-After` header handling for HTTP 429 rate limit errors across the `conversations.list`, `conversations.history`, and `users.info` API calls.
- Handles `not_in_channel` errors when fetching conversation history by logging a warning and skipping the channel.
- Refactors channel info fetching: `get_all_channels` now returns richer channel data (including `is_member` and `is_private`).
- Removes direct calls to `conversations.info` from `connectors_indexing_tasks.py` in favor of the richer data from `get_all_channels`, avoiding the rate limits associated with those calls.
- Fixes a `SyntaxError` (non-printable character) in `slack_history.py`.
- Improves logging for rate limit actions, delays, and errors.
- Updates unit tests in `test_slack_history.py` to cover all of the new logic.
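The skip-on-`not_in_channel` behavior can be pictured roughly as follows. This is a minimal sketch, not the code in `slack_history.py`: `SlackError`, `collect_history`, and the `fetch_history` callable are hypothetical stand-ins, and pre-filtering private channels by `is_member` is an assumed policy rather than something this commit states.

```python
import logging
from typing import Callable, Dict, List

logger = logging.getLogger(__name__)


class SlackError(Exception):
    """Hypothetical stand-in for a Slack API error carrying an error code."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def collect_history(
    channels: List[dict],
    fetch_history: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Return {channel_id: messages}, skipping channels that cannot be read."""
    results: Dict[str, List[str]] = {}
    for channel in channels:
        channel_id = channel["id"]
        name = channel.get("name", channel_id)
        # Assumed policy: use the richer channel data to skip private
        # channels the bot has not joined, without making an API call.
        if channel.get("is_private") and not channel.get("is_member"):
            logger.warning("Skipping private channel %s: not a member", name)
            continue
        try:
            results[channel_id] = fetch_history(channel_id)
        except SlackError as exc:
            if exc.code == "not_in_channel":
                # Log and move on instead of failing the whole indexing run.
                logger.warning("Skipping channel %s: not_in_channel", name)
                continue
            raise  # any other error is still surfaced to the caller
    return results
```

Because the membership flags come from the channel list itself, a loop like this needs no per-channel `conversations.info` call.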
**GitHub Connector (`github_connector.py`):**
- Modifies `get_user_repositories` to fetch all repositories accessible to the authenticated user (owned, collaborator, and organization) by changing the API call parameter from `type='owner'` to `type='all'`.
- Adds unit tests in `test_github_connector.py` for this change.
This commit includes two main improvements:
1. Slack Connector (`slack_history.py`):
- Addresses API rate limiting for `conversations.list` by introducing a 3-second delay between paginated calls.
- Implements handling for the `Retry-After` header when HTTP 429 errors occur.
- Fixes a `SyntaxError` caused by a non-printable character accidentally introduced in a previous modification.
- Adds comprehensive unit tests for the rate limiting and retry logic in `test_slack_history.py`.
2. GitHub Connector (`github_connector.py`):
- Modifies `get_user_repositories` to fetch all repositories accessible to the authenticated user (including organization repositories) by changing the API call parameter from `type='owner'` to `type='all'`.
- Adds unit tests in `test_github_connector.py` to verify this change and other connector functionalities.
The `get_all_channels` method in `slack_history.py` was making paginated
requests to `conversations.list` without any delay, leading to HTTP 429
errors when fetching channels from large Slack workspaces.
This commit introduces the following changes:
- Adds a 3-second delay between paginated calls to `conversations.list`
to comply with Slack's Tier 2 rate limits (approx. 20 requests/minute).
- Implements handling for the `Retry-After` header when a 429 error is
received. The system will wait for the specified duration before
retrying. If the header is missing or invalid, a default of 60 seconds
is used.
- Adds comprehensive unit tests to verify the new delay and retry logic,
covering scenarios with and without the `Retry-After` header, as well
as other API errors.
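The delay and retry rules described above can be sketched as follows. This is an illustration under stated assumptions, not the actual `get_all_channels` implementation: `RateLimited`, `parse_retry_after`, `fetch_all_pages`, and the `fetch_page` callable are hypothetical names, and the `max_retries` cap is an added safeguard that the commit does not mention.

```python
import time
from typing import Callable, List, Mapping, Optional, Tuple

PAGE_DELAY_SECONDS = 3            # delay between paginated calls
DEFAULT_RETRY_AFTER_SECONDS = 60  # fallback when Retry-After is missing or invalid


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 response."""

    def __init__(self, headers: Mapping[str, str]):
        super().__init__("rate limited")
        self.headers = headers


def parse_retry_after(headers: Mapping[str, str]) -> int:
    """Return the Retry-After value in seconds, or the 60-second default."""
    try:
        seconds = int(headers.get("Retry-After"))
    except (TypeError, ValueError):
        return DEFAULT_RETRY_AFTER_SECONDS
    # Assumption: a negative value is treated as invalid.
    return seconds if seconds >= 0 else DEFAULT_RETRY_AFTER_SECONDS


def fetch_all_pages(
    fetch_page: Callable[[Optional[str]], Tuple[List[str], Optional[str]]],
    sleep: Callable[[float], None] = time.sleep,
    max_retries: int = 5,  # assumed cap, not part of the commit
) -> List[str]:
    """Collect every page; fetch_page(cursor) returns (items, next_cursor)."""
    items: List[str] = []
    cursor: Optional[str] = None
    retries = 0
    while True:
        try:
            page, next_cursor = fetch_page(cursor)
        except RateLimited as exc:
            retries += 1
            if retries > max_retries:
                raise
            sleep(parse_retry_after(exc.headers))
            continue  # retry the same cursor after waiting
        retries = 0
        items.extend(page)
        if not next_cursor:
            return items
        cursor = next_cursor
        sleep(PAGE_DELAY_SECONDS)  # proactive delay before the next page
```

Passing `sleep` as a parameter keeps the function testable: a unit test can supply a recording function instead of actually waiting, which is one way to verify the delay and retry durations.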