mirror of
https://github.com/lfnovo/open-notebook.git
synced 2026-04-30 20:39:55 +00:00
567 lines
9.5 KiB
Markdown
567 lines
9.5 KiB
Markdown
# Advanced Configuration
|
|
|
|
Performance tuning, debugging, and advanced features.
|
|
|
|
---
|
|
|
|
## Performance Tuning
|
|
|
|
### Concurrency Control
|
|
|
|
```env
|
|
# Max concurrent database operations (default: 5)
|
|
# Increase: Faster processing, more conflicts
|
|
# Decrease: Slower, fewer conflicts
|
|
SURREAL_COMMANDS_MAX_TASKS=5
|
|
```
|
|
|
|
**Guidelines:**
|
|
- CPU: 2 cores → 2-3 tasks
|
|
- CPU: 4 cores → 5 tasks (default)
|
|
- CPU: 8+ cores → 10-20 tasks
|
|
|
|
Higher concurrency = more throughput but more database conflicts (retries handle this).
|
|
|
|
### Retry Strategy
|
|
|
|
```env
|
|
# How to wait between retries
|
|
SURREAL_COMMANDS_RETRY_WAIT_STRATEGY=exponential_jitter
|
|
|
|
# Options:
|
|
# - exponential_jitter (recommended)
|
|
# - exponential
|
|
# - fixed
|
|
# - random
|
|
```
|
|
|
|
For high-concurrency deployments, use `exponential_jitter` to prevent thundering herd.
|
|
|
|
### Timeout Tuning
|
|
|
|
```env
|
|
# Client timeout (default: 300 seconds)
|
|
API_CLIENT_TIMEOUT=300
|
|
|
|
# LLM timeout (default: 60 seconds)
|
|
ESPERANTO_LLM_TIMEOUT=60
|
|
```
|
|
|
|
**Guideline:** Set `API_CLIENT_TIMEOUT` > `ESPERANTO_LLM_TIMEOUT` + buffer
|
|
|
|
```
|
|
Example:
|
|
ESPERANTO_LLM_TIMEOUT=120
|
|
API_CLIENT_TIMEOUT=180 # 120 + 60 second buffer
|
|
```
|
|
|
|
---
|
|
|
|
## Batching
|
|
|
|
### TTS Batch Size
|
|
|
|
For podcast generation, control concurrent TTS requests:
|
|
|
|
```env
|
|
# Default: 5
|
|
TTS_BATCH_SIZE=2
|
|
```
|
|
|
|
**Providers and recommendations:**
|
|
- OpenAI: 5 (can handle many concurrent)
|
|
- Google: 4 (good concurrency)
|
|
- ElevenLabs: 2 (limited concurrent requests)
|
|
- Local TTS: 1 (single-threaded)
|
|
|
|
Lower = slower but more stable. Higher = faster but more load on provider.
|
|
|
|
---
|
|
|
|
## Logging & Debugging
|
|
|
|
### Enable Detailed Logging
|
|
|
|
```bash
|
|
# Start with debug logging
|
|
RUST_LOG=debug # For Rust components
|
|
LOGLEVEL=DEBUG # For Python components
|
|
```
|
|
|
|
### Debug Specific Components
|
|
|
|
```bash
|
|
# Only surreal operations
|
|
RUST_LOG=surrealdb=debug
|
|
|
|
# Only langchain
|
|
LOGLEVEL=langchain:debug
|
|
|
|
# Only specific module
|
|
RUST_LOG=open_notebook::database=debug
|
|
```
|
|
|
|
### LangSmith Tracing
|
|
|
|
For debugging LLM workflows:
|
|
|
|
```env
|
|
LANGCHAIN_TRACING_V2=true
|
|
LANGCHAIN_ENDPOINT="https://api.smith.langchain.com"
|
|
LANGCHAIN_API_KEY=your-key
|
|
LANGCHAIN_PROJECT="Open Notebook"
|
|
```
|
|
|
|
Then visit https://smith.langchain.com to see traces.
|
|
|
|
---
|
|
|
|
## Port Configuration
|
|
|
|
### Default Ports
|
|
|
|
```
|
|
Frontend: 8502 (Docker deployment)
|
|
Frontend: 3000 (Development from source)
|
|
API: 5055
|
|
SurrealDB: 8000
|
|
```
|
|
|
|
### Changing Frontend Port
|
|
|
|
Edit `docker-compose.yml`:
|
|
|
|
```yaml
|
|
services:
|
|
open-notebook:
|
|
ports:
|
|
- "8001:8502" # Change from 8502 to 8001
|
|
```
|
|
|
|
Access at: `http://localhost:8001`
|
|
|
|
API auto-detects to: `http://localhost:5055` ✓
|
|
|
|
### Changing API Port
|
|
|
|
```yaml
|
|
services:
|
|
open-notebook:
|
|
ports:
|
|
- "127.0.0.1:8502:8502" # Frontend
|
|
- "5056:5055" # Change API from 5055 to 5056
|
|
environment:
|
|
- API_URL=http://localhost:5056 # Update API_URL
|
|
```
|
|
|
|
Access API directly: `http://localhost:5056/docs`
|
|
|
|
**Note:** When changing API port, you must set `API_URL` explicitly since auto-detection assumes port 5055.
|
|
|
|
### Changing SurrealDB Port
|
|
|
|
```yaml
|
|
services:
|
|
surrealdb:
|
|
ports:
|
|
- "8001:8000" # Change from 8000 to 8001
|
|
environment:
|
|
- SURREAL_URL=ws://surrealdb:8001/rpc # Update connection URL
|
|
```
|
|
|
|
**Important:** Internal Docker network uses container name (`surrealdb`), not `localhost`.
|
|
|
|
---
|
|
|
|
## SSL/TLS Configuration
|
|
|
|
### Custom CA Certificate
|
|
|
|
For self-signed certs on local providers:
|
|
|
|
```env
|
|
ESPERANTO_SSL_CA_BUNDLE=/path/to/ca-bundle.pem
|
|
```
|
|
|
|
### Disable Verification (Development Only)
|
|
|
|
```env
|
|
# WARNING: Only for testing/development
|
|
# Vulnerable to MITM attacks
|
|
ESPERANTO_SSL_VERIFY=false
|
|
```
|
|
|
|
---
|
|
|
|
## Multi-Provider Setup
|
|
|
|
### Use Different Providers for Different Tasks
|
|
|
|
```env
|
|
# Language model (main)
|
|
OPENAI_API_KEY=sk-proj-...
|
|
|
|
# Embeddings (alternative)
|
|
# (Future: Configure different embedding provider)
|
|
|
|
# TTS (different provider)
|
|
ELEVENLABS_API_KEY=...
|
|
```
|
|
|
|
### OpenAI-Compatible with Fallback
|
|
|
|
```env
|
|
# Primary
|
|
OPENAI_COMPATIBLE_BASE_URL=http://localhost:1234/v1
|
|
OPENAI_COMPATIBLE_API_KEY=key1
|
|
|
|
# Can also set specific modality endpoints
|
|
OPENAI_COMPATIBLE_BASE_URL_LLM=http://localhost:1234/v1
|
|
OPENAI_COMPATIBLE_BASE_URL_EMBEDDING=http://localhost:8001/v1
|
|
```
|
|
|
|
---
|
|
|
|
## Security Hardening
|
|
|
|
### Change Default Credentials
|
|
|
|
```env
|
|
# Don't use defaults in production
|
|
SURREAL_USER=your_secure_username
|
|
SURREAL_PASSWORD=$(openssl rand -base64 32) # Generate secure password
|
|
```
|
|
|
|
### Add Password Protection
|
|
|
|
```env
|
|
# Protect your Open Notebook instance
|
|
OPEN_NOTEBOOK_PASSWORD=your_secure_password
|
|
```
|
|
|
|
### Use HTTPS
|
|
|
|
```env
|
|
# Always use HTTPS in production
|
|
API_URL=https://mynotebook.example.com
|
|
```
|
|
|
|
### Firewall Rules
|
|
|
|
Restrict access to your Open Notebook:
|
|
- Port 8502 (frontend): Only from your IP
|
|
- Port 5055 (API): Only from frontend
|
|
- Port 8000 (SurrealDB): Never expose to internet
|
|
|
|
---
|
|
|
|
## Web Scraping & Content Extraction
|
|
|
|
Open Notebook uses multiple services for content extraction:
|
|
|
|
### Firecrawl
|
|
|
|
For advanced web scraping:
|
|
|
|
```env
|
|
FIRECRAWL_API_KEY=your-key
|
|
```
|
|
|
|
Get key from: https://firecrawl.dev/
|
|
|
|
### Jina AI
|
|
|
|
Alternative web extraction:
|
|
|
|
```env
|
|
JINA_API_KEY=your-key
|
|
```
|
|
|
|
Get key from: https://jina.ai/
|
|
|
|
---
|
|
|
|
## Environment Variable Groups
|
|
|
|
### API Keys (Choose at least one)
|
|
```env
|
|
OPENAI_API_KEY
|
|
ANTHROPIC_API_KEY
|
|
GOOGLE_API_KEY
|
|
GROQ_API_KEY
|
|
MISTRAL_API_KEY
|
|
DEEPSEEK_API_KEY
|
|
OPENROUTER_API_KEY
|
|
XAI_API_KEY
|
|
```
|
|
|
|
### AI Provider Endpoints
|
|
```env
|
|
OLLAMA_API_BASE
|
|
OPENAI_COMPATIBLE_BASE_URL
|
|
AZURE_OPENAI_ENDPOINT
|
|
GEMINI_API_BASE_URL
|
|
```
|
|
|
|
### Database
|
|
```env
|
|
SURREAL_URL
|
|
SURREAL_USER
|
|
SURREAL_PASSWORD
|
|
SURREAL_NAMESPACE
|
|
SURREAL_DATABASE
|
|
```
|
|
|
|
### Performance
|
|
```env
|
|
SURREAL_COMMANDS_MAX_TASKS
|
|
SURREAL_COMMANDS_RETRY_ENABLED
|
|
SURREAL_COMMANDS_RETRY_MAX_ATTEMPTS
|
|
SURREAL_COMMANDS_RETRY_WAIT_STRATEGY
|
|
SURREAL_COMMANDS_RETRY_WAIT_MIN
|
|
SURREAL_COMMANDS_RETRY_WAIT_MAX
|
|
```
|
|
|
|
### API Settings
|
|
```env
|
|
API_URL
|
|
INTERNAL_API_URL
|
|
API_CLIENT_TIMEOUT
|
|
ESPERANTO_LLM_TIMEOUT
|
|
```
|
|
|
|
### Audio/TTS
|
|
```env
|
|
ELEVENLABS_API_KEY
|
|
TTS_BATCH_SIZE
|
|
```
|
|
|
|
### Debugging
|
|
```env
|
|
LANGCHAIN_TRACING_V2
|
|
LANGCHAIN_ENDPOINT
|
|
LANGCHAIN_API_KEY
|
|
LANGCHAIN_PROJECT
|
|
```
|
|
|
|
---
|
|
|
|
## Testing Configuration
|
|
|
|
### Quick Test
|
|
|
|
```bash
|
|
# Add test config
|
|
export OPENAI_API_KEY=sk-test-key
|
|
export API_URL=http://localhost:5055
|
|
|
|
# Test connection
|
|
curl http://localhost:5055/health
|
|
|
|
# Test with sample
|
|
curl -X POST http://localhost:5055/api/chat \
|
|
-H "Content-Type: application/json" \
|
|
-d '{"message":"Hello"}'
|
|
```
|
|
|
|
### Validate Config
|
|
|
|
```bash
|
|
# Check environment variables are set
|
|
env | grep OPENAI_API_KEY
|
|
|
|
# Verify database connection
|
|
python -c "import os; print(os.getenv('SURREAL_URL'))"
|
|
```
|
|
|
|
---
|
|
|
|
## Troubleshooting Performance
|
|
|
|
### High Memory Usage
|
|
|
|
```env
|
|
# Reduce concurrency
|
|
SURREAL_COMMANDS_MAX_TASKS=2
|
|
|
|
# Reduce TTS batch size
|
|
TTS_BATCH_SIZE=1
|
|
```
|
|
|
|
### High CPU Usage
|
|
|
|
```env
|
|
# Check worker count
|
|
SURREAL_COMMANDS_MAX_TASKS
|
|
|
|
# Reduce if maxed out:
|
|
SURREAL_COMMANDS_MAX_TASKS=5
|
|
```
|
|
|
|
### Slow Responses
|
|
|
|
```env
|
|
# Check timeout settings
|
|
API_CLIENT_TIMEOUT=300
|
|
|
|
# Check retry config
|
|
SURREAL_COMMANDS_RETRY_MAX_ATTEMPTS=3
|
|
```
|
|
|
|
### Database Conflicts
|
|
|
|
```env
|
|
# Reduce concurrency
|
|
SURREAL_COMMANDS_MAX_TASKS=3
|
|
|
|
# Use jitter strategy
|
|
SURREAL_COMMANDS_RETRY_WAIT_STRATEGY=exponential_jitter
|
|
```
|
|
|
|
---
|
|
|
|
## Backup & Restore
|
|
|
|
### Data Locations
|
|
|
|
| Path | Contents |
|
|
|------|----------|
|
|
| `./data` or `/app/data` | Uploads, podcasts, checkpoints |
|
|
| `./surreal_data` or `/mydata` | SurrealDB database files |
|
|
|
|
### Quick Backup
|
|
|
|
```bash
|
|
# Stop services (recommended for consistency)
|
|
docker compose down
|
|
|
|
# Create timestamped backup
|
|
tar -czf backup-$(date +%Y%m%d-%H%M%S).tar.gz \
|
|
notebook_data/ surreal_data/
|
|
|
|
# Restart services
|
|
docker compose up -d
|
|
```
|
|
|
|
### Automated Backup Script
|
|
|
|
```bash
|
|
#!/bin/bash
|
|
# backup.sh - Run daily via cron
|
|
|
|
BACKUP_DIR="/path/to/backups"
|
|
DATE=$(date +%Y%m%d-%H%M%S)
|
|
|
|
# Create backup
|
|
tar -czf "$BACKUP_DIR/open-notebook-$DATE.tar.gz" \
|
|
/path/to/notebook_data \
|
|
/path/to/surreal_data
|
|
|
|
# Keep only last 7 days
|
|
find "$BACKUP_DIR" -name "open-notebook-*.tar.gz" -mtime +7 -delete
|
|
|
|
echo "Backup complete: open-notebook-$DATE.tar.gz"
|
|
```
|
|
|
|
Add to cron:
|
|
```bash
|
|
# Daily backup at 2 AM
|
|
0 2 * * * /path/to/backup.sh >> /var/log/open-notebook-backup.log 2>&1
|
|
```
|
|
|
|
### Restore
|
|
|
|
```bash
|
|
# Stop services
|
|
docker compose down
|
|
|
|
# Remove old data (careful!)
|
|
rm -rf notebook_data/ surreal_data/
|
|
|
|
# Extract backup
|
|
tar -xzf backup-20240115-120000.tar.gz
|
|
|
|
# Restart services
|
|
docker compose up -d
|
|
```
|
|
|
|
### Migration Between Servers
|
|
|
|
```bash
|
|
# On source server
|
|
docker compose down
|
|
tar -czf open-notebook-migration.tar.gz notebook_data/ surreal_data/
|
|
|
|
# Transfer to new server
|
|
scp open-notebook-migration.tar.gz user@newserver:/path/
|
|
|
|
# On new server
|
|
tar -xzf open-notebook-migration.tar.gz
|
|
docker compose up -d
|
|
```
|
|
|
|
---
|
|
|
|
## Container Management
|
|
|
|
### Common Commands
|
|
|
|
```bash
|
|
# Start services
|
|
docker compose up -d
|
|
|
|
# Stop services
|
|
docker compose down
|
|
|
|
# View logs (all services)
|
|
docker compose logs -f
|
|
|
|
# View logs (specific service)
|
|
docker compose logs -f api
|
|
|
|
# Restart specific service
|
|
docker compose restart api
|
|
|
|
# Update to latest version
|
|
docker compose down
|
|
docker compose pull
|
|
docker compose up -d
|
|
|
|
# Check resource usage
|
|
docker stats
|
|
|
|
# Check service health
|
|
docker compose ps
|
|
```
|
|
|
|
### Clean Up
|
|
|
|
```bash
|
|
# Remove stopped containers
|
|
docker compose rm
|
|
|
|
# Remove unused images
|
|
docker image prune
|
|
|
|
# Full cleanup (careful!)
|
|
docker system prune -a
|
|
```
|
|
|
|
---
|
|
|
|
## Summary
|
|
|
|
**Most deployments need:**
|
|
- One AI provider API key
|
|
- Default database settings
|
|
- Default timeouts
|
|
|
|
**Tune performance only if:**
|
|
- You have specific bottlenecks
|
|
- High-concurrency workload
|
|
- Custom hardware (very fast or very slow)
|
|
|
|
**Advanced features:**
|
|
- Firecrawl for better web scraping
|
|
- LangSmith for debugging workflows
|
|
- Custom CA bundles for self-signed certs
|