
Common Issues and Solutions

This document covers the most frequently encountered issues when installing, configuring, and using Open Notebook, along with their solutions.

🆘 Quick Fixes for Setup Issues

Most problems are caused by incorrect API_URL configuration. Choose your scenario below for instant fixes.

"Unable to connect to server" or "Connection Error"

This is the #1 issue for new users. The frontend can't reach the API.

Diagnostic Checklist:

  1. Are both ports exposed?

    docker ps
    # Should show: 0.0.0.0:8502->8502 AND 0.0.0.0:5055->5055
    

    Fix: Add -p 5055:5055 to your docker run command, or add it to docker-compose.yml:

    ports:
      - "8502:8502"
      - "5055:5055"  # Add this!
    
  2. Are you accessing from a different machine than where Docker is running?

    Determine your server's IP:

    # On the Docker host machine:
    hostname -I          # Linux
    ipconfig             # Windows
    ifconfig | grep inet # Mac
    

    Fix: Set environment variable (replace with your actual server IP):

    Docker Compose:

    environment:
      - API_URL=http://192.168.1.100:5055
    

    Docker Run:

    -e API_URL=http://192.168.1.100:5055
    
  3. Using localhost in API_URL but accessing remotely?

    Wrong:

    Access from browser: http://192.168.1.100:8502
    API_URL setting: http://localhost:5055  # This won't work!
    

    Correct:

    Access from browser: http://192.168.1.100:8502
    API_URL setting: http://192.168.1.100:5055
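
Putting checks 1–3 together, a minimal docker-compose.yml for remote access might look like the sketch below. The image tag and IP address are placeholders — substitute whatever you normally run:

```yaml
services:
  open_notebook:
    image: lfnovo/open_notebook:latest  # placeholder; use your usual image/tag
    ports:
      - "8502:8502"   # frontend
      - "5055:5055"   # API - must be exposed for remote access
    environment:
      - API_URL=http://192.168.1.100:5055  # replace with your server's IP
```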
    

Common Scenarios:

| Your Setup | Access URL | API_URL Value |
| --- | --- | --- |
| Docker on your laptop, accessed locally | http://localhost:8502 | Not needed (or http://localhost:5055) |
| Docker on Proxmox VM at 192.168.1.50 | http://192.168.1.50:8502 | http://192.168.1.50:5055 |
| Docker on Raspberry Pi at 10.0.0.10 | http://10.0.0.10:8502 | http://10.0.0.10:5055 |
| Docker on NAS at nas.local | http://nas.local:8502 | http://nas.local:5055 |
| Behind reverse proxy at notebook.mydomain.com | https://notebook.mydomain.com | https://notebook.mydomain.com/api |

After changing API_URL:

Always restart the container:

# Docker Compose
docker compose down
docker compose up -d

# Docker Run
docker stop open-notebook
docker rm open-notebook
# Then run your docker run command again

Frontend trying to connect on port 8502 instead of 5055

Symptom: Frontend tries to access http://10.10.10.107:8502/api/config instead of using port 5055.

Cause: API_URL is not set correctly or you're using an old version.

Fix:

  1. Ensure you're using version 1.0.6+ which supports runtime API_URL
  2. Set API_URL environment variable (not NEXT_PUBLIC_API_URL)
  3. Restart container after setting the variable
    docker compose down && docker compose up -d
    

"API config endpoint returned status 404"

Cause: You added /api to the end of API_URL.

Wrong: API_URL=http://192.168.1.100:5055/api

Correct: API_URL=http://192.168.1.100:5055

The /api path is added automatically by the application.
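
Both of the common API_URL mistakes (a trailing /api and a missing scheme) can be caught before restarting anything. This is a quick sketch — `check_api_url` is a hypothetical helper, not part of Open Notebook:

```shell
# Hypothetical helper: sanity-check an API_URL value before restarting the container.
check_api_url() {
  case "$1" in
    */api|*/api/) echo "ERROR: drop the trailing /api (it is added automatically)"; return 1 ;;
    http://*|https://*) echo "OK: $1" ;;
    *) echo "ERROR: API_URL must start with http:// or https://"; return 1 ;;
  esac
}

check_api_url "http://192.168.1.100:5055"              # OK
check_api_url "http://192.168.1.100:5055/api" || true  # ERROR: trailing /api
```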


"Missing authorization header"

Cause: You have password authentication enabled but it's not configured correctly.

Fix: Set the password in your environment:

environment:
  - OPEN_NOTEBOOK_PASSWORD=your_secure_password

Or provide it when logging into the web interface.


Installation Problems

Port Already in Use

Problem: Error message "Port 8502 is already in use" or similar port conflicts.

Symptoms:

  • Cannot start React frontend
  • Error messages about address already in use
  • Services failing to bind to ports

Solutions:

  1. Find and stop conflicting process:

    # Check what's using port 8502
    lsof -i :8502
    
    # Kill the process (replace PID with actual process ID)
    kill -9 <PID>
    
  2. Use different ports:

    # For React frontend (Next.js)
    cd frontend && npm run dev -- -p 8503
    
    # For Docker deployment, modify docker-compose.yml
    ports:
      - "8503:8502"  # host:container
    
  3. Common port conflicts:

    • Port 8502 (Next.js): Often used by other Next.js apps
    • Port 5055 (API): May conflict with other web services
    • Port 8000 (SurrealDB): May conflict with other databases
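
If lsof isn't available, bash can probe these ports directly via its /dev/tcp feature. This is a bash-only sketch (`check_ports` is a hypothetical helper); "in use" means something is already listening there:

```shell
# Probe Open Notebook's default ports without lsof (bash-only: uses /dev/tcp).
check_ports() {
  local port
  for port in "$@"; do
    if (exec 3<>"/dev/tcp/127.0.0.1/$port") 2>/dev/null; then
      echo "port $port: in use"
    else
      echo "port $port: free"
    fi
  done
}

check_ports 8502 5055 8000
```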

Permission Denied (Docker)

Problem: Docker commands fail with permission denied errors.

Symptoms:

  • "permission denied while trying to connect to the Docker daemon socket"
  • Docker commands require sudo

Solutions:

  1. Add user to docker group (Linux):

    sudo usermod -aG docker $USER
    
    # Log out and log back in, or run:
    newgrp docker
    
  2. Start Docker service (Linux):

    sudo systemctl start docker
    sudo systemctl enable docker
    
  3. Restart Docker Desktop (Windows/Mac):

    • Close Docker Desktop completely
    • Restart Docker Desktop
    • Wait for it to fully start

Python/uv Installation Issues

Problem: uv command not found or Python version conflicts.

Symptoms:

  • "uv: command not found"
  • Python version mismatch errors
  • Virtual environment issues

Solutions:

  1. Install uv package manager:

    # macOS
    brew install uv
    
    # Linux/WSL
    curl -LsSf https://astral.sh/uv/install.sh | sh
    source ~/.bashrc
    
    # Windows
    powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
    
  2. Fix Python version issues:

    # Install specific Python version
    uv python install 3.11
    
    # Pin Python version for project
    uv python pin 3.11
    
    # Recreate virtual environment
    uv sync --reinstall
    
  3. Clear uv cache:

    uv cache clean
    

SurrealDB Connection Issues

Problem: Cannot connect to SurrealDB database.

Symptoms:

  • "Connection refused" errors
  • Database queries failing
  • Timeout errors

Solutions:

  1. Check SurrealDB is running:

    # For Docker
    docker compose ps surrealdb
    
    # Check logs
    docker compose logs surrealdb
    
  2. Verify connection settings:

    # Check environment variables
    echo $SURREAL_URL
    echo $SURREAL_USER
    
    # Test connection
    curl http://localhost:8000/health
    
  3. Restart SurrealDB:

    docker compose restart surrealdb
    # Wait 10 seconds for startup
    sleep 10
    
  4. Check file permissions:

    # Ensure data directory is writable
    ls -la surreal_data/
    
    # Fix permissions if needed
    sudo chown -R $USER:$USER surreal_data/
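
The `sleep 10` in step 3 is only a guess at startup time. A more reliable approach is to poll until the database port actually answers — `wait_for_port` below is a hypothetical helper using bash's /dev/tcp, not a project script:

```shell
# Poll until a TCP port accepts connections (e.g. SurrealDB on localhost:8000).
wait_for_port() {
  local host="$1" port="$2" tries="${3:-30}"
  local i
  for i in $(seq "$tries"); do
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      echo "port $port on $host is up (after $i checks)"
      return 0
    fi
    sleep 1
  done
  echo "gave up waiting for $host:$port after $tries tries"
  return 1
}

# Usage after a restart:
# docker compose restart surrealdb && wait_for_port 127.0.0.1 8000
```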
    

Runtime Errors

AI Provider API Errors

Problem: Errors when using AI models (OpenAI, Anthropic, etc.).

Symptoms:

  • "Invalid API key" errors
  • "Rate limit exceeded" messages
  • Model not found errors

Solutions:

  1. Verify API keys:

    # Check key format (don't expose full key)
    echo $OPENAI_API_KEY | cut -c1-10
    
    # Test OpenAI key
    curl -H "Authorization: Bearer $OPENAI_API_KEY" \
         https://api.openai.com/v1/models
    
  2. Check billing and usage:

    • Confirm your provider account has billing set up and available credit
    • Review the provider's usage dashboard for quota or spending limits

  3. Verify model availability:

    # Check model names in settings
    # Use gpt-5-mini instead of gpt-4-mini
    # Use claude-3-haiku-20240307 instead of claude-3-haiku
    
  4. Handle rate limits:

    • Wait before retrying
    • Use lower-tier models for testing
    • Check provider rate limits
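
When sharing diagnostics, you can go a step further than the `cut` command above and print a redacted fingerprint of a key. A bash-only sketch (`mask_key` is hypothetical):

```shell
# Print a redacted fingerprint of an API key (first 7 and last 4 characters).
mask_key() {
  local k="$1"
  echo "${k:0:7}...${k: -4}"
}

mask_key "sk-proj-EXAMPLEKEY1234"   # sk-proj...1234
```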

API Timeout Errors During Transformations

Problem: Timeout errors when running transformations or generating insights, even though the operation completes successfully.

Symptoms:

  • "timeout of 30000ms exceeded" in React frontend
  • "Failed to connect to API: timed out" in Streamlit UI
  • Transformation completes after a few minutes, but error appears after 30-60 seconds
  • Common with local models (Ollama), remote LM Studio, or slow hardware

Solutions:

  1. Increase API client timeout (recommended):

    # Add to your .env file
    API_CLIENT_TIMEOUT=600  # 10 minutes (600 seconds)
    

    This controls how long the frontend/UI waits for API responses. Default is 300 seconds (5 minutes).

  2. Adjust timeout based on your setup:

    # Fast cloud APIs (OpenAI, Anthropic, Groq)
    API_CLIENT_TIMEOUT=300  # 5 minutes (default)
    
    # Local Ollama on GPU
    API_CLIENT_TIMEOUT=600  # 10 minutes
    
    # Local Ollama on CPU or slow hardware
    API_CLIENT_TIMEOUT=1200  # 20 minutes
    
    # Remote LM Studio over slow network
    API_CLIENT_TIMEOUT=900  # 15 minutes
    
  3. Increase LLM provider timeout if needed:

    # Add to your .env file if the model itself is timing out
    ESPERANTO_LLM_TIMEOUT=180  # 3 minutes (default is 60s)
    

    Only increase this if you see errors during actual model inference, not just HTTP timeouts.

  4. Use faster models for testing:

    • Test with cloud APIs first to verify setup
    • Try smaller local models (e.g., gemma2:2b instead of llama3:70b)
    • Preload models before running transformations: ollama run model-name
  5. Restart services after configuration changes:

    # For Docker
    docker compose down
    docker compose up -d
    
    # For source installation
    make stop-all
    make start-all
    

Important Notes:

  • API_CLIENT_TIMEOUT should be HIGHER than ESPERANTO_LLM_TIMEOUT for proper error handling
  • If transformations complete successfully after refresh, you only need to increase API_CLIENT_TIMEOUT
  • First time running a model may be slower due to model loading
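
The first note can be checked mechanically. A hedged sketch — `check_timeouts` is not a project command; the 300s and 60s defaults come from the text above:

```shell
# Verify API_CLIENT_TIMEOUT exceeds ESPERANTO_LLM_TIMEOUT (defaults: 300s and 60s).
check_timeouts() {
  local client="${API_CLIENT_TIMEOUT:-300}" llm="${ESPERANTO_LLM_TIMEOUT:-60}"
  if [ "$client" -gt "$llm" ]; then
    echo "OK: client timeout ${client}s > llm timeout ${llm}s"
  else
    echo "WARN: API_CLIENT_TIMEOUT (${client}s) should be higher than ESPERANTO_LLM_TIMEOUT (${llm}s)"
    return 1
  fi
}

check_timeouts
```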

Related GitHub Issue: #131

Memory and Performance Issues

Problem: Application running slowly or crashing due to memory issues.

Symptoms:

  • Slow response times
  • Out of memory errors
  • Application crashes
  • High CPU usage

Solutions:

  1. Increase Docker memory:

    # Docker Desktop → Settings → Resources → Memory
    # Increase to 4GB or more
    
  2. Monitor resource usage:

    # Check Docker stats
    docker stats
    
    # Check system resources
    htop
    top
    
  3. Optimize model usage:

    • Use smaller models (gpt-5-mini vs gpt-5)
    • Reduce context window size
    • Process fewer documents at once
  4. Clear application cache:

    # Clear Python cache
    find . -name "__pycache__" -type d -exec rm -rf {} +
    
    # Clear Next.js build cache
    rm -rf frontend/.next
    

Background Job Failures

Problem: Background tasks (podcast generation, transformations) failing.

Symptoms:

  • Jobs stuck in "processing" state
  • No podcast audio generated
  • Transformations not completing

Solutions:

  1. Check worker status:

    # Check if worker is running
    pgrep -f "surreal-commands-worker"
    
    # Restart worker
    make worker-restart
    
  2. Check job logs:

    # View worker logs
    docker compose logs worker
    
    # Check command status in database
    # (Access through UI or API)
    
  3. Verify AI provider configuration:

    • Ensure TTS/STT models are configured
    • Check API keys for required providers
    • Test models individually
  4. Clear stuck jobs:

    # Restart all services
    make stop-all
    make start-all
    

File Upload Issues

Problem: Cannot upload files or file processing fails.

Symptoms:

  • Upload button not working
  • File processing errors
  • Unsupported file type messages

Solutions:

  1. Check file size limits:

    # Uploads have a size limit (historically around 200MB)
    # Very large files may also time out during processing
    
  2. Verify file types:

    • PDF: Standard PDF files (not password protected)
    • Images: PNG, JPG, GIF, WebP
    • Audio: MP3, WAV, M4A
    • Video: MP4, AVI, MOV (for transcript extraction)
    • Documents: TXT, DOC, DOCX
  3. Check file permissions:

    # Ensure files are readable
    ls -la /path/to/file
    
    # Fix permissions
    chmod 644 /path/to/file
    
  4. Test with smaller files:

    • Try with a simple text file first
    • Gradually increase complexity
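
The supported extensions above can be turned into a quick pre-check before uploading. This sketch is based solely on the list in this section; actual support is determined by the app (requires bash 4+ for the lowercase expansion):

```shell
# Classify a file by extension using the supported-types list above.
file_category() {
  local ext="${1##*.}"
  case "${ext,,}" in
    pdf|txt|doc|docx)     echo "document" ;;
    png|jpg|gif|webp)     echo "image" ;;
    mp3|wav|m4a)          echo "audio" ;;
    mp4|avi|mov)          echo "video" ;;
    *) echo "unsupported: .$ext"; return 1 ;;
  esac
}

file_category report.PDF          # document
file_category notes.xyz || true   # unsupported: .xyz
```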

Performance Issues

Slow Search and Chat

Problem: Search and chat responses are very slow.

Symptoms:

  • Long wait times for responses
  • Timeout errors
  • Poor user experience

Solutions:

  1. Optimize embedding model:

    • Use faster embedding models
    • Reduce embedding dimensions
    • Process fewer documents at once
  2. Database optimization:

    # Check database performance
    docker compose logs surrealdb
    
    # Consider using RocksDB for better performance
    # (Already configured in docker-compose.yml)
    
  3. Reduce context size:

    • Limit search results
    • Use shorter prompts
    • Reduce notebook size
  4. Use faster models:

    • gpt-5-mini instead of gpt-5
    • claude-3-haiku instead of claude-3-opus
    • Local models for simple tasks

High Resource Usage

Problem: Application consuming too much CPU or memory.

Symptoms:

  • High CPU usage in task manager
  • System becoming unresponsive
  • Docker containers using excessive resources

Solutions:

  1. Set resource limits:

    # In docker-compose.yml
    services:
      open_notebook:
        deploy:
          resources:
            limits:
              memory: 2G
              cpus: "1.0"
    
  2. Monitor and identify bottlenecks:

    # Check which service is consuming resources
    docker stats
    
    # Check system processes
    htop
    
  3. Optimize processing:

    • Process documents in batches
    • Use background jobs for heavy tasks
    • Limit concurrent operations

Configuration Problems

Environment Variables Not Loading

Problem: Environment variables are not being read correctly.

Symptoms:

  • Default values being used instead of configured values
  • API keys not recognized
  • Connection errors to external services

Solutions:

  1. Check file names:

    # For source installation
    ls -la .env
    
    # For Docker
    ls -la docker.env
    
  2. Verify file format:

    # Check for invisible characters
    cat -A .env
    
    # Ensure no spaces around equals
    OPENAI_API_KEY=value  # Correct
    OPENAI_API_KEY = value  # Incorrect
    
  3. Check environment loading:

    # Test environment variable
    echo $OPENAI_API_KEY
    
    # For Docker
    docker compose config
    
  4. Restart services after changes:

    # For Docker
    docker compose down
    docker compose up -d
    
    # For source installation
    make stop-all
    make start-all
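
The "no spaces around equals" rule from step 2 can be checked with a grep sketch. `lint_env` is a hypothetical helper; it only catches lines that don't start with a plain NAME=, not every possible formatting problem:

```shell
# Print .env lines that are not blank/comments and don't start with NAME= exactly.
lint_env() {
  grep -vE '^[[:space:]]*(#|$)' "$1" | grep -vE '^[A-Za-z_][A-Za-z0-9_]*=' || true
}

printf 'OPENAI_API_KEY=value\nBAD_KEY = value\n# a comment\n' > /tmp/example.env
lint_env /tmp/example.env    # BAD_KEY = value
```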
    

Model Configuration Issues

Problem: AI models not working or configured incorrectly.

Symptoms:

  • Model not found errors
  • Incorrect responses
  • Configuration not saving

Solutions:

  1. Check model names:

    # Use exact model names from provider documentation
    # OpenAI: gpt-5-mini, gpt-5, text-embedding-3-small
    # Anthropic: claude-3-haiku-20240307, claude-3-sonnet-20240229
    
  2. Verify provider configuration:

    • Check API keys are valid
    • Ensure models are available for your account
    • Test with simple requests first
  3. Reset model configuration:

    • Go to Models
    • Clear all configurations
    • Reconfigure with known working models
  4. Check provider status:

    • Visit provider status pages
    • Check for service outages
    • Try alternative providers

Database Schema Issues

Problem: Database schema conflicts or migration issues.

Symptoms:

  • Field validation errors
  • Query failures
  • Data not saving correctly

Solutions:

  1. Check database logs:

    docker compose logs surrealdb
    
  2. Reset database (WARNING: This deletes all data):

    # Stop services
    make stop-all
    
    # Remove database files
    rm -rf surreal_data/
    
    # Restart services (will recreate database)
    make start-all
    
  3. Manual schema update:

    # Run migrations
    uv run python -m open_notebook.database.async_migrate
    
  4. Check SurrealDB version:

    # Ensure using compatible version
    docker compose pull surrealdb
    docker compose up -d
    

Getting Help

If you've tried the solutions above and are still experiencing issues:

  1. Collect diagnostic information:

    # System information
    uname -a
    docker version
    docker compose version
    
    # Service status
    make status
    
    # Recent logs
    docker compose logs --tail=100 > logs.txt
    
  2. Create a minimal reproduction:

    • Start with a fresh installation
    • Use minimal configuration
    • Document exact steps to reproduce
  3. Ask for help:

    • Open an issue on the Open Notebook GitHub repository
    • Include your diagnostics, logs, and exact reproduction steps

Remember to remove API keys and sensitive information before sharing logs or configuration files.
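
As a starting point for that redaction, here is a sed sketch matching common key shapes (`sk-…` for OpenAI, `sk-ant-…` for Anthropic). It is not exhaustive — review the output manually before sharing:

```shell
# Redact common API-key patterns before sharing logs (not exhaustive - review output).
scrub_logs() {
  sed -E 's/sk-[A-Za-z0-9_-]{16,}/sk-REDACTED/g' "$1"
}

printf 'calling api with key sk-proj-AbCdEfGhIjKlMnOpQrSt\n' > /tmp/example.log
scrub_logs /tmp/example.log    # calling api with key sk-REDACTED
```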