open-notebook/docs/5-CONFIGURATION/advanced.md
2026-01-04 11:42:13 -03:00

Advanced Configuration

Performance tuning, debugging, and advanced features.


Performance Tuning

Concurrency Control

# Max concurrent database operations (default: 5)
# Increase: Faster processing, more conflicts
# Decrease: Slower, fewer conflicts
SURREAL_COMMANDS_MAX_TASKS=5

Guidelines:

  • CPU: 2 cores → 2-3 tasks
  • CPU: 4 cores → 5 tasks (default)
  • CPU: 8+ cores → 10-20 tasks

Higher concurrency increases throughput but also causes more database conflicts. The retry mechanism handles these conflicts automatically.
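
The guidelines above can be turned into a starting value with a small helper. This is only a sketch: `suggest_max_tasks` is a hypothetical name, the values for 3 and 5-7 cores are interpolated assumptions, and the low end of each documented range is chosen to stay conservative.

```python
import os


def suggest_max_tasks(cores: int) -> int:
    """Map a CPU core count to a starting SURREAL_COMMANDS_MAX_TASKS value."""
    if cores <= 2:
        return 2   # low end of the documented 2-3 range for 2 cores
    if cores == 3:
        return 3   # assumption: not covered by the guidelines
    if cores < 8:
        return 5   # documented default for 4 cores (5-7 cores: assumption)
    return 10      # low end of the documented 10-20 range for 8+ cores


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    print(f"SURREAL_COMMANDS_MAX_TASKS={suggest_max_tasks(cores)}")
```

Treat the result as a first guess and adjust it after watching for conflicts and retries under real load.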

Retry Strategy

# How to wait between retries
SURREAL_COMMANDS_RETRY_WAIT_STRATEGY=exponential_jitter

# Options:
# - exponential_jitter (recommended)
# - exponential
# - fixed
# - random

For high-concurrency deployments, use exponential_jitter so that tasks that fail together do not all retry at the same moment (the "thundering herd" problem).
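
To illustrate how the four strategy names differ, here is a minimal sketch of what each one typically computes. It is not the library's implementation: `retry_wait`, its parameters, and the 1 and 30 second bounds are assumptions chosen for illustration (the bounds stand in for SURREAL_COMMANDS_RETRY_WAIT_MIN and SURREAL_COMMANDS_RETRY_WAIT_MAX).

```python
import random
from typing import Optional


def retry_wait(strategy: str, attempt: int, wait_min: float = 1.0,
               wait_max: float = 30.0,
               rng: Optional[random.Random] = None) -> float:
    """Return the seconds to wait before retry number `attempt` (1-based)."""
    rng = rng or random.Random()
    if strategy == "fixed":
        return wait_min
    if strategy == "random":
        return rng.uniform(wait_min, wait_max)
    # Exponential growth: wait_min, 2x, 4x, ... capped at wait_max
    backoff = min(wait_max, wait_min * 2 ** (attempt - 1))
    if strategy == "exponential":
        return backoff
    if strategy == "exponential_jitter":
        # Pick a random point below the backoff ceiling so workers that
        # failed together spread their retries out
        return rng.uniform(wait_min, backoff)
    raise ValueError(f"unknown strategy: {strategy}")
```

With plain exponential backoff, every worker that failed at the same time waits the same interval and collides again; the jittered variant randomizes the wait within the same ceiling.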

Timeout Tuning

# Client timeout (default: 300 seconds)
API_CLIENT_TIMEOUT=300

# LLM timeout (default: 60 seconds)
ESPERANTO_LLM_TIMEOUT=60

Guideline: Set API_CLIENT_TIMEOUT > ESPERANTO_LLM_TIMEOUT + buffer

Example:
  ESPERANTO_LLM_TIMEOUT=120
  API_CLIENT_TIMEOUT=180  # 120 + 60 second buffer
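
The guideline can be checked mechanically before starting the services. This is a minimal sketch: `timeouts_consistent` is a hypothetical helper, the 60 second buffer mirrors the example above, and the fallback values are the documented defaults.

```python
import os


def timeouts_consistent(env, buffer: float = 60.0) -> bool:
    """True if the client timeout leaves `buffer` seconds beyond the LLM timeout."""
    api_timeout = float(env.get("API_CLIENT_TIMEOUT", 300))    # documented default
    llm_timeout = float(env.get("ESPERANTO_LLM_TIMEOUT", 60))  # documented default
    return api_timeout >= llm_timeout + buffer


if __name__ == "__main__":
    if not timeouts_consistent(os.environ):
        print("Warning: API_CLIENT_TIMEOUT is too close to ESPERANTO_LLM_TIMEOUT")
```

If the client timeout is shorter than the LLM timeout, the client gives up while the model is still allowed to respond, so slow requests fail even though the backend would have completed them.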

Batching

TTS Batch Size

For podcast generation, control concurrent TTS requests:

# Default: 5. Lower it for providers that limit concurrent requests
TTS_BATCH_SIZE=2

Providers and recommendations:

  • OpenAI: 5 (can handle many concurrent)
  • Google: 4 (good concurrency)
  • ElevenLabs: 2 (limited concurrent requests)
  • Local TTS: 1 (single-threaded)

Lower = slower but more stable. Higher = faster but more load on provider.
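
The trade-off can be seen in a small sketch of batched concurrency. How Open Notebook schedules TTS requests internally is not described here, so this is only an illustration of the general technique: `run_in_batches` and `PeakTracker` are hypothetical names, and the fake worker stands in for a real TTS call.

```python
import asyncio


async def run_in_batches(items, worker, batch_size):
    """Process items in order, with at most `batch_size` calls in flight."""
    results = []
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        # Each batch runs concurrently; the next batch waits for it to finish
        results.extend(await asyncio.gather(*(worker(item) for item in batch)))
    return results


class PeakTracker:
    """Fake TTS worker that records the highest number of concurrent calls."""

    def __init__(self):
        self.active = 0
        self.peak = 0

    async def __call__(self, text):
        self.active += 1
        self.peak = max(self.peak, self.active)
        await asyncio.sleep(0)  # yield so the rest of the batch can start
        self.active -= 1
        return text.upper()


if __name__ == "__main__":
    tracker = PeakTracker()
    asyncio.run(run_in_batches(["a", "b", "c", "d", "e"], tracker, batch_size=2))
    print("peak concurrent requests:", tracker.peak)
```

A batch size of 2 means the provider never sees more than two requests at once, which is why lower values are gentler on rate-limited providers at the cost of total run time.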


Logging & Debugging

Enable Detailed Logging

# Start with debug logging
RUST_LOG=debug  # For Rust components
LOGLEVEL=DEBUG  # For Python components

Debug Specific Components

# Only surreal operations
RUST_LOG=surrealdb=debug

# Only langchain
LOGLEVEL=langchain:debug

# Only specific module
RUST_LOG=open_notebook::database=debug

LangSmith Tracing

For debugging LLM workflows:

LANGCHAIN_TRACING_V2=true
LANGCHAIN_ENDPOINT="https://api.smith.langchain.com"
LANGCHAIN_API_KEY=your-key
LANGCHAIN_PROJECT="Open Notebook"

Then visit https://smith.langchain.com to see traces.


Port Configuration

Default Ports

Frontend: 8502 (Docker deployment)
Frontend: 3000 (Development from source)
API: 5055
SurrealDB: 8000

Changing Frontend Port

Edit docker-compose.yml:

services:
  open-notebook:
    ports:
      - "8001:8502"  # Change from 8502 to 8001

Access at: http://localhost:8001

The API URL is still auto-detected as http://localhost:5055.

Changing API Port

services:
  open-notebook:
    ports:
      - "127.0.0.1:8502:8502"  # Frontend
      - "5056:5055"            # Change API from 5055 to 5056
    environment:
      - API_URL=http://localhost:5056  # Update API_URL

Access API directly: http://localhost:5056/docs

Note: When changing the API port, you must set API_URL explicitly, because auto-detection assumes port 5055.
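
The port entries above use Docker's short syntax, where the host side comes first and the container side last. The following sketch splits the two forms used in this section; `parse_port_mapping` is a hypothetical helper, and it deliberately ignores port ranges, protocol suffixes, and IPv6 addresses.

```python
def parse_port_mapping(mapping: str):
    """Split 'HOST_PORT:CONTAINER_PORT' or 'HOST_IP:HOST_PORT:CONTAINER_PORT'."""
    parts = mapping.split(":")
    if len(parts) == 2:
        host_ip = "0.0.0.0"  # no IP given: published on all host interfaces
        host_port, container_port = parts
    elif len(parts) == 3:
        host_ip, host_port, container_port = parts
    else:
        raise ValueError(f"unsupported port mapping: {mapping}")
    return host_ip, int(host_port), int(container_port)


if __name__ == "__main__":
    _, host_port, _ = parse_port_mapping("5056:5055")
    # API_URL must use the host-side port, since that is what the browser reaches
    print(f"API_URL=http://localhost:{host_port}")
```

In "5056:5055" the API keeps listening on 5055 inside the container, and only the host-side port changes to 5056, which is the port API_URL has to name.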

Changing SurrealDB Port

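services:
  surrealdb:
    ports:
      - "8001:8000"  # Host port 8001 -> container port 8000
  open-notebook:
    environment:
      - SURREAL_URL=ws://surrealdb:8000/rpc  # Unchanged inside the Docker network

Important: The port mapping only changes the host-side port. Containers on the internal Docker network reach SurrealDB by container name (surrealdb) and container port (8000), not localhost. Use port 8001 only when connecting from the host, for example ws://localhost:8001/rpc.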


SSL/TLS Configuration

Custom CA Certificate

For self-signed certs on local providers:

ESPERANTO_SSL_CA_BUNDLE=/path/to/ca-bundle.pem

Disable Verification (Development Only)

# WARNING: Only for testing/development
# Vulnerable to MITM attacks
ESPERANTO_SSL_VERIFY=false
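
As an illustration of how the two settings might combine, the sketch below resolves them into a single value in the shape that requests-style HTTP clients accept for `verify` (a boolean or a CA bundle path). The precedence and the accepted spellings of "false" are assumptions, not documented Esperanto behavior, and `resolve_verify` is a hypothetical name.

```python
import os

FALSY = {"false", "0", "no"}  # assumption: spellings treated as "disabled"


def resolve_verify(env):
    """Return False, a CA bundle path, or True (system trust store)."""
    if env.get("ESPERANTO_SSL_VERIFY", "true").strip().lower() in FALSY:
        return False  # verification disabled: development only
    # A CA bundle path, if set, replaces the default trust store
    return env.get("ESPERANTO_SSL_CA_BUNDLE") or True


if __name__ == "__main__":
    print("verify =", resolve_verify(os.environ))
```

Prefer the CA bundle over disabling verification: the bundle keeps certificate checks in place while still trusting a self-signed certificate.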

Multi-Provider Setup

Use Different Providers for Different Tasks

# Language model (main)
OPENAI_API_KEY=sk-proj-...

# Embeddings (alternative)
# (Future: Configure different embedding provider)

# TTS (different provider)
ELEVENLABS_API_KEY=...

OpenAI-Compatible with Fallback

# Primary
OPENAI_COMPATIBLE_BASE_URL=http://localhost:1234/v1
OPENAI_COMPATIBLE_API_KEY=key1

# Optionally set modality-specific endpoints
OPENAI_COMPATIBLE_BASE_URL_LLM=http://localhost:1234/v1
OPENAI_COMPATIBLE_BASE_URL_EMBEDDING=http://localhost:8001/v1
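
One plausible reading of "fallback" here is that a modality-specific URL wins when it is set and the general base URL is used otherwise. The sketch below encodes that reading; the lookup order is an assumption rather than confirmed behavior, and `resolve_base_url` is a hypothetical helper.

```python
def resolve_base_url(env, modality: str):
    """Pick the endpoint for a modality such as 'LLM' or 'EMBEDDING'."""
    specific = env.get(f"OPENAI_COMPATIBLE_BASE_URL_{modality.upper()}")
    # Assumed order: modality-specific variable first, then the general one
    return specific or env.get("OPENAI_COMPATIBLE_BASE_URL")


if __name__ == "__main__":
    env = {
        "OPENAI_COMPATIBLE_BASE_URL": "http://localhost:1234/v1",
        "OPENAI_COMPATIBLE_BASE_URL_EMBEDDING": "http://localhost:8001/v1",
    }
    print(resolve_base_url(env, "embedding"))
    print(resolve_base_url(env, "llm"))
```

Under this reading, only the modalities served from a different host need their own variable.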

Security Hardening

Change Default Credentials

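# Don't use defaults in production
SURREAL_USER=your_secure_username
# Generate a value with `openssl rand -base64 32` and paste the result here.
# Env files do not evaluate $(...) command substitution.
SURREAL_PASSWORD=paste_generated_password_here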

Add Password Protection

# Protect your Open Notebook instance
OPEN_NOTEBOOK_PASSWORD=your_secure_password

Use HTTPS

# Always use HTTPS in production
API_URL=https://mynotebook.example.com

Firewall Rules

Restrict access to your Open Notebook:

  • Port 8502 (frontend): Only from your IP
  • Port 5055 (API): Only from frontend
  • Port 8000 (SurrealDB): Never expose to internet

Web Scraping & Content Extraction

Open Notebook uses multiple services for content extraction:

Firecrawl

For advanced web scraping:

FIRECRAWL_API_KEY=your-key

Get key from: https://firecrawl.dev/

Jina AI

Alternative web extraction:

JINA_API_KEY=your-key

Get key from: https://jina.ai/


Environment Variable Groups

API Keys (Choose at least one)

OPENAI_API_KEY
ANTHROPIC_API_KEY
GOOGLE_API_KEY
GROQ_API_KEY
MISTRAL_API_KEY
DEEPSEEK_API_KEY
OPENROUTER_API_KEY
XAI_API_KEY

AI Provider Endpoints

OLLAMA_API_BASE
OPENAI_COMPATIBLE_BASE_URL
AZURE_OPENAI_ENDPOINT
GEMINI_API_BASE_URL

Database

SURREAL_URL
SURREAL_USER
SURREAL_PASSWORD
SURREAL_NAMESPACE
SURREAL_DATABASE

Performance

SURREAL_COMMANDS_MAX_TASKS
SURREAL_COMMANDS_RETRY_ENABLED
SURREAL_COMMANDS_RETRY_MAX_ATTEMPTS
SURREAL_COMMANDS_RETRY_WAIT_STRATEGY
SURREAL_COMMANDS_RETRY_WAIT_MIN
SURREAL_COMMANDS_RETRY_WAIT_MAX

API Settings

API_URL
INTERNAL_API_URL
API_CLIENT_TIMEOUT
ESPERANTO_LLM_TIMEOUT

Audio/TTS

ELEVENLABS_API_KEY
TTS_BATCH_SIZE

Debugging

LANGCHAIN_TRACING_V2
LANGCHAIN_ENDPOINT
LANGCHAIN_API_KEY
LANGCHAIN_PROJECT

Testing Configuration

Quick Test

# Add test config
export OPENAI_API_KEY=sk-test-key
export API_URL=http://localhost:5055

# Test connection
curl http://localhost:5055/health

# Test with sample
curl -X POST http://localhost:5055/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message":"Hello"}'

Validate Config

# Check environment variables are set
env | grep OPENAI_API_KEY

# Print the configured database URL (this does not test the connection)
python -c "import os; print(os.getenv('SURREAL_URL'))"
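
A slightly broader check can catch the most common mistakes in one pass. This is a minimal sketch: `config_problems` is a hypothetical helper, the key list is taken from the API Keys group above, and a setup that relies only on a local endpoint such as Ollama would legitimately trigger the first message.

```python
import os
from urllib.parse import urlparse

PROVIDER_KEYS = [
    "OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY", "GROQ_API_KEY",
    "MISTRAL_API_KEY", "DEEPSEEK_API_KEY", "OPENROUTER_API_KEY", "XAI_API_KEY",
]


def config_problems(env):
    """Return a list of human-readable problems found in the environment."""
    problems = []
    if not any(env.get(key) for key in PROVIDER_KEYS):
        problems.append("no AI provider API key is set")
    parsed = urlparse(env.get("SURREAL_URL", ""))
    if parsed.scheme not in ("ws", "wss", "http", "https") or not parsed.hostname:
        problems.append("SURREAL_URL is missing or not a valid URL")
    return problems


if __name__ == "__main__":
    for problem in config_problems(os.environ):
        print("Config problem:", problem)
```

Like the commands above, this only inspects the values; it does not open a connection to the database or to any provider.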

Troubleshooting Performance

High Memory Usage

# Reduce concurrency
SURREAL_COMMANDS_MAX_TASKS=2

# Reduce TTS batch size
TTS_BATCH_SIZE=1

High CPU Usage

# Check the current worker count
echo "$SURREAL_COMMANDS_MAX_TASKS"

# If CPU is maxed out, lower it (for example, back to the default of 5, or 2-3 on small machines)
SURREAL_COMMANDS_MAX_TASKS=5

Slow Responses

# Check timeout settings
API_CLIENT_TIMEOUT=300

# Check retry config
SURREAL_COMMANDS_RETRY_MAX_ATTEMPTS=3

Database Conflicts

# Reduce concurrency
SURREAL_COMMANDS_MAX_TASKS=3

# Use jitter strategy
SURREAL_COMMANDS_RETRY_WAIT_STRATEGY=exponential_jitter

Backup & Restore

Data Locations

| Path | Contents |
|------|----------|
| ./data or /app/data | Uploads, podcasts, checkpoints |
| ./surreal_data or /mydata | SurrealDB database files |

Quick Backup

# Stop services (recommended for consistency)
docker compose down

# Create timestamped backup
# (adjust the directory names to match the host paths in your volume mounts)
tar -czf backup-$(date +%Y%m%d-%H%M%S).tar.gz \
  notebook_data/ surreal_data/

# Restart services
docker compose up -d

Automated Backup Script

#!/bin/bash
# backup.sh - Run daily via cron
# Stop on the first error so a failed backup is never reported as complete
set -euo pipefail

BACKUP_DIR="/path/to/backups"
DATE=$(date +%Y%m%d-%H%M%S)

# Make sure the backup directory exists
mkdir -p "$BACKUP_DIR"

# Create backup
tar -czf "$BACKUP_DIR/open-notebook-$DATE.tar.gz" \
  /path/to/notebook_data \
  /path/to/surreal_data

# Delete backups older than 7 days
find "$BACKUP_DIR" -name "open-notebook-*.tar.gz" -mtime +7 -delete

echo "Backup complete: open-notebook-$DATE.tar.gz"

Add to cron:

# Daily backup at 2 AM
0 2 * * * /path/to/backup.sh >> /var/log/open-notebook-backup.log 2>&1
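
The retention rule can also be expressed against the timestamp embedded in each file name, which is useful when backups are copied around and lose their modification times. This sketch is an alternative to the `find -mtime` approach, not a description of it; `expired_backups` is a hypothetical helper.

```python
import re
from datetime import datetime, timedelta

# Matches the names produced by backup.sh, e.g. open-notebook-20240115-020000.tar.gz
BACKUP_NAME = re.compile(r"^open-notebook-(\d{8}-\d{6})\.tar\.gz$")


def expired_backups(names, now, keep_days=7):
    """Return the backup file names whose embedded timestamp is older than keep_days."""
    cutoff = now - timedelta(days=keep_days)
    expired = []
    for name in names:
        match = BACKUP_NAME.match(name)
        if match and datetime.strptime(match.group(1), "%Y%m%d-%H%M%S") < cutoff:
            expired.append(name)
    return expired
```

Files that do not match the naming pattern are never selected, so unrelated files in the backup directory are left alone.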

Restore

# Stop services
docker compose down

# Remove old data (careful!)
rm -rf notebook_data/ surreal_data/

# Extract backup
tar -xzf backup-20240115-120000.tar.gz

# Restart services
docker compose up -d
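
Because the restore deletes the existing data before extracting, it may be worth confirming first that the archive is readable and contains both directories. The following is a minimal sketch under that assumption; `archive_top_level` and `safe_to_restore` are hypothetical helpers, and the directory names are the ones used in the commands above.

```python
import tarfile


def archive_top_level(path):
    """Return the set of top-level entries in a .tar.gz archive."""
    top = set()
    with tarfile.open(path, "r:gz") as tar:
        for name in tar.getnames():
            if name.startswith("./"):
                name = name[2:]  # tolerate archives created with a leading ./
            if name:
                top.add(name.split("/", 1)[0])
    return top


def safe_to_restore(path, required=("notebook_data", "surreal_data")):
    """True if the archive opens cleanly and holds every required directory."""
    return set(required) <= archive_top_level(path)
```

Running `safe_to_restore("backup-20240115-120000.tar.gz")` before the `rm -rf` step guards against wiping live data and then discovering that the backup is truncated or incomplete.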

Migration Between Servers

# On source server
docker compose down
tar -czf open-notebook-migration.tar.gz notebook_data/ surreal_data/

# Transfer to new server
scp open-notebook-migration.tar.gz user@newserver:/path/

# On new server (run from the directory that contains docker-compose.yml)
cd /path/
tar -xzf open-notebook-migration.tar.gz
docker compose up -d

Container Management

Common Commands

# Start services
docker compose up -d

# Stop services
docker compose down

# View logs (all services)
docker compose logs -f

# View logs (specific service; use a service name from your docker-compose.yml)
docker compose logs -f api

# Restart specific service
docker compose restart api

# Update to latest version
docker compose down
docker compose pull
docker compose up -d

# Check resource usage
docker stats

# Check service health
docker compose ps

Clean Up

# Remove stopped containers
docker compose rm

# Remove unused images
docker image prune

# Full cleanup (careful!)
docker system prune -a

Summary

Most deployments need:

  • One AI provider API key
  • Default database settings
  • Default timeouts

Tune performance only if:

  • You have specific bottlenecks
  • High-concurrency workload
  • Custom hardware (very fast or very slow)

Advanced features:

  • Firecrawl for better web scraping
  • LangSmith for debugging workflows
  • Custom CA bundles for self-signed certs