mirror of
https://github.com/MODSetter/SurfSense.git
synced 2025-09-01 18:19:08 +00:00
327 lines
14 KiB
Text
327 lines
14 KiB
Text
---
|
|
title: Manual Installation
|
|
description: Setting up SurfSense manually for customized deployments (Preferred)
|
|
full: true
|
|
---
|
|
|
|
# Manual Installation (Preferred)
|
|
|
|
This guide provides step-by-step instructions for setting up SurfSense without Docker. This approach gives you more control over the installation process and allows for customization of the environment.
|
|
|
|
## Prerequisites
|
|
|
|
Before beginning the manual installation, ensure you have completed all the [prerequisite setup steps](/docs), including:
|
|
|
|
- PGVector setup
|
|
- **File Processing ETL Service** (choose one):
|
|
- Unstructured.io API key (Supports 34+ formats)
|
|
- LlamaIndex API key (enhanced parsing, supports 50+ formats)
|
|
- Docling (local processing, no API key required, supports PDF, Office docs, images, HTML, CSV)
|
|
- Other required API keys
|
|
|
|
## Backend Setup
|
|
|
|
The backend is the core of SurfSense. Follow these steps to set it up:
|
|
|
|
### 1. Environment Configuration
|
|
|
|
First, create and configure your environment variables by copying the example file:
|
|
|
|
**Linux/macOS:**
|
|
|
|
```bash
|
|
cd surfsense_backend
|
|
cp .env.example .env
|
|
```
|
|
|
|
**Windows (Command Prompt):**
|
|
|
|
```cmd
|
|
cd surfsense_backend
|
|
copy .env.example .env
|
|
```
|
|
|
|
**Windows (PowerShell):**
|
|
|
|
```powershell
|
|
cd surfsense_backend
|
|
Copy-Item -Path .env.example -Destination .env
|
|
```
|
|
|
|
Edit the `.env` file and set the following variables:
|
|
|
|
| ENV VARIABLE | DESCRIPTION |
|
|
| -------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
| DATABASE_URL | PostgreSQL connection string (e.g., `postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense`) |
|
|
| SECRET_KEY | JWT Secret key for authentication (should be a secure random string) |
|
|
| NEXT_FRONTEND_URL | URL where your frontend application is hosted (e.g., `http://localhost:3000`) |
|
|
| AUTH_TYPE | Authentication method: `GOOGLE` for OAuth with Google, `LOCAL` for email/password authentication |
|
|
| GOOGLE_OAUTH_CLIENT_ID | (Optional) Client ID from Google Cloud Console (required if AUTH_TYPE=GOOGLE) |
|
|
| GOOGLE_OAUTH_CLIENT_SECRET | (Optional) Client secret from Google Cloud Console (required if AUTH_TYPE=GOOGLE) |
|
|
| EMBEDDING_MODEL | Name of the embedding model (e.g., `mixedbread-ai/mxbai-embed-large-v1`) |
|
|
| RERANKERS_MODEL_NAME | Name of the reranker model (e.g., `ms-marco-MiniLM-L-12-v2`) |
|
|
| RERANKERS_MODEL_TYPE | Type of reranker model (e.g., `flashrank`) |
|
|
| TTS_SERVICE | Text-to-Speech API provider for Podcasts (e.g., `local/kokoro`, `openai/tts-1`). See [supported providers](https://docs.litellm.ai/docs/text_to_speech#supported-providers) |
|
|
| TTS_SERVICE_API_KEY | API key for the Text-to-Speech service |
|
|
| TTS_SERVICE_API_BASE | (Optional) Custom API base URL for the Text-to-Speech service |
|
|
| STT_SERVICE | Speech-to-Text API provider for Podcasts (e.g., `openai/whisper-1`). See [supported providers](https://docs.litellm.ai/docs/audio_transcription#supported-providers) |
|
|
| STT_SERVICE_API_KEY | API key for the Speech-to-Text service |
|
|
| STT_SERVICE_API_BASE | (Optional) Custom API base URL for the Speech-to-Text service |
|
|
| FIRECRAWL_API_KEY | API key for Firecrawl service for web crawling |
|
|
| ETL_SERVICE | Document parsing service: `UNSTRUCTURED` (supports 34+ formats), `LLAMACLOUD` (supports 50+ formats including legacy document types), or `DOCLING` (local processing, supports PDF, Office docs, images, HTML, CSV) |
|
|
| UNSTRUCTURED_API_KEY | API key for Unstructured.io service for document parsing (required if ETL_SERVICE=UNSTRUCTURED) |
|
|
| LLAMA_CLOUD_API_KEY | API key for LlamaCloud service for document parsing (required if ETL_SERVICE=LLAMACLOUD) |
|
|
|
|
|
|
**Optional Backend LangSmith Observability:**
|
|
| ENV VARIABLE | DESCRIPTION |
|
|
|--------------|-------------|
|
|
| LANGSMITH_TRACING | Enable LangSmith tracing (e.g., `true`) |
|
|
| LANGSMITH_ENDPOINT | LangSmith API endpoint (e.g., `https://api.smith.langchain.com`) |
|
|
| LANGSMITH_API_KEY | Your LangSmith API key |
|
|
| LANGSMITH_PROJECT | LangSmith project name (e.g., `surfsense`) |
|
|
|
|
**Uvicorn Server Configuration**
|
|
| ENV VARIABLE | DESCRIPTION | DEFAULT VALUE |
|
|
|------------------------------|---------------------------------------------|---------------|
|
|
| UVICORN_HOST | Host address to bind the server | 0.0.0.0 |
|
|
| UVICORN_PORT | Port to run the backend API | 8000 |
|
|
| UVICORN_LOG_LEVEL | Logging level (e.g., info, debug, warning) | info |
|
|
| UVICORN_PROXY_HEADERS | Enable/disable proxy headers | false |
|
|
| UVICORN_FORWARDED_ALLOW_IPS | Comma-separated list of allowed IPs | 127.0.0.1 |
|
|
| UVICORN_WORKERS | Number of worker processes | 1 |
|
|
| UVICORN_ACCESS_LOG | Enable/disable access log (true/false) | true |
|
|
| UVICORN_LOOP | Event loop implementation | auto |
|
|
| UVICORN_HTTP | HTTP protocol implementation | auto |
|
|
| UVICORN_WS | WebSocket protocol implementation | auto |
|
|
| UVICORN_LIFESPAN | Lifespan implementation | auto |
|
|
| UVICORN_LOG_CONFIG | Path to logging config file or empty string | |
|
|
| UVICORN_SERVER_HEADER | Enable/disable Server header | true |
|
|
| UVICORN_DATE_HEADER | Enable/disable Date header | true |
|
|
| UVICORN_LIMIT_CONCURRENCY | Max concurrent connections | |
|
|
| UVICORN_LIMIT_MAX_REQUESTS | Max requests before worker restart | |
|
|
| UVICORN_TIMEOUT_KEEP_ALIVE | Keep-alive timeout (seconds) | 5 |
|
|
| UVICORN_TIMEOUT_NOTIFY | Worker shutdown notification timeout (sec) | 30 |
|
|
| UVICORN_SSL_KEYFILE | Path to SSL key file | |
|
|
| UVICORN_SSL_CERTFILE | Path to SSL certificate file | |
|
|
| UVICORN_SSL_KEYFILE_PASSWORD | Password for SSL key file | |
|
|
| UVICORN_SSL_VERSION | SSL version | |
|
|
| UVICORN_SSL_CERT_REQS | SSL certificate requirements | |
|
|
| UVICORN_SSL_CA_CERTS | Path to CA certificates file | |
|
|
| UVICORN_SSL_CIPHERS | SSL ciphers | |
|
|
| UVICORN_HEADERS | Comma-separated list of headers | |
|
|
| UVICORN_USE_COLORS | Enable/disable colored logs | true |
|
|
| UVICORN_UDS | Unix domain socket path | |
|
|
| UVICORN_FD | File descriptor to bind to | |
|
|
| UVICORN_ROOT_PATH | Root path for the application | |
|
|
|
|
Refer to the `.env.example` file for all available Uvicorn options and their usage. Uncomment and set in your `.env` file as needed.
|
|
|
|
For more details, see the [Uvicorn documentation](https://www.uvicorn.org/#command-line-options).
|
|
|
|
### 2. Install Dependencies
|
|
|
|
Install the backend dependencies using `uv`:
|
|
|
|
**Linux/macOS:**
|
|
|
|
```bash
|
|
# Install uv if you don't have it
|
|
curl -fsSL https://astral.sh/uv/install.sh | bash
|
|
|
|
# Install dependencies
|
|
uv sync
|
|
```
|
|
|
|
**Windows (PowerShell):**
|
|
|
|
```powershell
|
|
# Install uv if you don't have it
|
|
iwr -useb https://astral.sh/uv/install.ps1 | iex
|
|
|
|
# Install dependencies
|
|
uv sync
|
|
```
|
|
|
|
**Windows (Command Prompt):**
|
|
|
|
```cmd
|
|
# Install dependencies with uv (after installing uv)
|
|
uv sync
|
|
```
|
|
|
|
### 3. Run the Backend
|
|
|
|
Start the backend server:
|
|
|
|
**Linux/macOS/Windows:**
|
|
|
|
```bash
|
|
# Run without hot reloading
|
|
uv run main.py
|
|
|
|
# Or with hot reloading for development
|
|
uv run main.py --reload
|
|
```
|
|
|
|
If everything is set up correctly, you should see output indicating the server is running on `http://localhost:8000`.
|
|
|
|
## Frontend Setup
|
|
|
|
### 1. Environment Configuration
|
|
|
|
Set up the frontend environment:
|
|
|
|
**Linux/macOS:**
|
|
|
|
```bash
|
|
cd surfsense_web
|
|
cp .env.example .env
|
|
```
|
|
|
|
**Windows (Command Prompt):**
|
|
|
|
```cmd
|
|
cd surfsense_web
|
|
copy .env.example .env
|
|
```
|
|
|
|
**Windows (PowerShell):**
|
|
|
|
```powershell
|
|
cd surfsense_web
|
|
Copy-Item -Path .env.example -Destination .env
|
|
```
|
|
|
|
Edit the `.env` file and set:
|
|
|
|
| ENV VARIABLE | DESCRIPTION |
|
|
| ------------------------------- | ------------------------------------------- |
|
|
| NEXT_PUBLIC_FASTAPI_BACKEND_URL | Backend URL (e.g., `http://localhost:8000`) |
|
|
| NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE | Same value as set in backend AUTH_TYPE i.e `GOOGLE` for OAuth with Google, `LOCAL` for email/password authentication |
|
|
| NEXT_PUBLIC_ETL_SERVICE | Document parsing service (should match backend ETL_SERVICE): `UNSTRUCTURED`, `LLAMACLOUD`, or `DOCLING` - affects supported file formats in upload interface |
|
|
|
|
### 2. Install Dependencies
|
|
|
|
Install the frontend dependencies:
|
|
|
|
**Linux/macOS:**
|
|
|
|
```bash
|
|
# Install pnpm if you don't have it
|
|
npm install -g pnpm
|
|
|
|
# Install dependencies
|
|
pnpm install
|
|
```
|
|
|
|
**Windows:**
|
|
|
|
```powershell
|
|
# Install pnpm if you don't have it
|
|
npm install -g pnpm
|
|
|
|
# Install dependencies
|
|
pnpm install
|
|
```
|
|
|
|
### 3. Run the Frontend
|
|
|
|
Start the Next.js development server:
|
|
|
|
**Linux/macOS/Windows:**
|
|
|
|
```bash
|
|
pnpm run dev
|
|
```
|
|
|
|
The frontend should now be running at `http://localhost:3000`.
|
|
|
|
## Browser Extension Setup (Optional)
|
|
|
|
The SurfSense browser extension allows you to save any webpage, including those protected behind authentication.
|
|
|
|
### 1. Environment Configuration
|
|
|
|
**Linux/macOS:**
|
|
|
|
```bash
|
|
cd surfsense_browser_extension
|
|
cp .env.example .env
|
|
```
|
|
|
|
**Windows (Command Prompt):**
|
|
|
|
```cmd
|
|
cd surfsense_browser_extension
|
|
copy .env.example .env
|
|
```
|
|
|
|
**Windows (PowerShell):**
|
|
|
|
```powershell
|
|
cd surfsense_browser_extension
|
|
Copy-Item -Path .env.example -Destination .env
|
|
```
|
|
|
|
Edit the `.env` file:
|
|
|
|
| ENV VARIABLE | DESCRIPTION |
|
|
| ------------------------- | ----------------------------------------------------- |
|
|
| PLASMO_PUBLIC_BACKEND_URL | SurfSense Backend URL (e.g., `http://127.0.0.1:8000`) |
|
|
|
|
### 2. Build the Extension
|
|
|
|
Build the extension for your browser using the [Plasmo framework](https://docs.plasmo.com/framework/workflows/build#with-a-specific-target).
|
|
|
|
**Linux/macOS/Windows:**
|
|
|
|
```bash
|
|
# Install dependencies
|
|
pnpm install
|
|
|
|
# Build for Chrome (default)
|
|
pnpm build
|
|
|
|
# Or for other browsers
|
|
pnpm build --target=firefox
|
|
pnpm build --target=edge
|
|
```
|
|
|
|
### 3. Load the Extension
|
|
|
|
Load the extension in your browser's developer mode and configure it with your SurfSense API key.
|
|
|
|
## Verification
|
|
|
|
To verify your installation:
|
|
|
|
1. Open your browser and navigate to `http://localhost:3000`
|
|
2. Sign in with your Google account
|
|
3. Create a search space and try uploading a document
|
|
4. Test the chat functionality with your uploaded content
|
|
|
|
## Troubleshooting
|
|
|
|
- **Database Connection Issues**: Verify your PostgreSQL server is running and pgvector is properly installed
|
|
- **Authentication Problems**: Check your Google OAuth configuration and ensure redirect URIs are set correctly
|
|
- **LLM Errors**: Confirm your LLM API keys are valid and the selected models are accessible
|
|
- **File Upload Failures**: Validate your Unstructured.io API key
|
|
- **Windows-specific**: If you encounter path issues, ensure you're using the correct path separator (`\` instead of `/`)
|
|
- **macOS-specific**: If you encounter permission issues, you may need to use `sudo` for some installation commands
|
|
|
|
## Next Steps
|
|
|
|
Now that you have SurfSense running locally, you can explore its features:
|
|
|
|
- Create search spaces for organizing your content
|
|
- Upload documents or use the browser extension to save webpages
|
|
- Ask questions about your saved content
|
|
- Explore the advanced RAG capabilities
|
|
|
|
For production deployments, consider setting up:
|
|
|
|
- A reverse proxy like Nginx
|
|
- SSL certificates for secure connections
|
|
- Proper database backups
|
|
- User access controls
|