Frequently Asked Questions
This document addresses common questions about Open Notebook usage, configuration, and best practices.
General Usage
What is Open Notebook?
Open Notebook is an open-source, privacy-focused alternative to Google's NotebookLM. It allows you to:
- Create and manage research notebooks
- Chat with your documents using AI
- Generate podcasts from your content
- Search across all your sources with semantic search
- Transform and analyze your content
How is Open Notebook different from Google NotebookLM?
- Privacy: Your data stays local by default. Only your chosen AI providers receive queries.
- Flexibility: Support for 15+ AI providers (OpenAI, Anthropic, Google, local models, etc.)
- Customization: Open source, so you can modify and extend functionality
- Control: You control your data, models, and processing
Do I need technical skills to use Open Notebook?
Basic usage: No technical skills required. The Docker installation is designed for non-technical users.
Advanced features: Some technical knowledge is helpful for:
- Custom model configurations
- API integrations
- Source code modifications
Can I use Open Notebook offline?
Partially: The application runs locally, but requires internet for:
- AI model API calls (unless using local models like Ollama)
- Web content scraping
- Some file processing features
Fully offline: Possible with local models (Ollama) for basic functionality.
What file types does Open Notebook support?
Documents:
- PDF (text extraction)
- Microsoft Word (DOC, DOCX)
- Plain text (TXT)
- Markdown (MD)
Web Content:
- URLs (automatic web scraping)
- YouTube videos (transcript extraction)
- Web articles and blog posts
Media:
- Images (PNG, JPG, GIF, WebP) with OCR
- Audio files (MP3, WAV, M4A) with transcription
- Video files (MP4, AVI, MOV) for transcript extraction
Other:
- Direct text input
- CSV data
- Code files (with syntax highlighting)
How much does it cost to run Open Notebook?
Software: Free (open source)
AI API costs: Pay-per-use to providers:
- OpenAI: ~$0.50-5 per 1M tokens depending on model
- Anthropic: ~$3-75 per 1M tokens depending on model
- Google: Often free tier available
- Local models: Free after initial setup
Typical monthly costs: $5-50 for moderate usage, depending on chosen models.
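As a rough illustration of how a per-token price turns into a monthly bill, the sketch below multiplies a token volume by a price per 1M tokens. The function name estimate_monthly_cost and the example volume are hypothetical, and the price is only a sample value from the ranges listed above; real prices vary by model and change over time.

```shell
# Hypothetical helper: estimate a monthly bill from token volume.
# Usage: estimate_monthly_cost <tokens_per_month> <usd_per_1M_tokens>
estimate_monthly_cost() {
    awk -v tokens="$1" -v price="$2" \
        'BEGIN { printf "%.2f\n", tokens / 1000000 * price }'
}

# Example: 4 million tokens in a month at $3 per 1M tokens.
estimate_monthly_cost 4000000 3    # prints 12.00
```

Input and output tokens are often priced differently, so a closer estimate would call the helper once for each and add the results.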
AI Models and Providers
Which AI provider should I choose?
- For beginners: OpenAI (reliable, well-documented, good balance of cost/quality)
- For advanced users: Mix of providers based on specific needs
- For privacy: Local models (Ollama) or European providers (Mistral)
- For cost optimization: DeepSeek, Google (free tier), or OpenRouter
Can I use multiple AI providers simultaneously?
Yes: Open Notebook supports multiple providers. You can configure different providers for different tasks:
- OpenAI for chat
- Google for embeddings
- ElevenLabs for text-to-speech
- Anthropic for complex reasoning
What are the best model combinations?
Budget-friendly:
- Language: gpt-4o-mini (OpenAI) or deepseek-chat (DeepSeek)
- Embedding: text-embedding-3-small (OpenAI)
- TTS: tts-1 (OpenAI)
High-quality:
- Language: claude-3-5-sonnet (Anthropic) or gpt-4o (OpenAI)
- Embedding: text-embedding-3-large (OpenAI)
- TTS: eleven_turbo_v2_5 (ElevenLabs)
Privacy-focused:
- Language: Local Ollama models
- Embedding: Local embedding models
- TTS: Local TTS solutions
How do I set up local models with Ollama?
- Install Ollama: Download from https://ollama.ai
- Start Ollama: ollama serve
- Download models: ollama pull llama2
- Configure Open Notebook: OLLAMA_API_BASE=http://localhost:11434
- Select models: In Settings → Models, choose Ollama models
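Before selecting models in the UI, it can help to confirm that Ollama actually answers at the address given in OLLAMA_API_BASE. The sketch below is one way to check, assuming curl is installed. The function name check_ollama is hypothetical; /api/tags is the Ollama endpoint that lists locally pulled models.

```shell
# Hypothetical helper: succeed only if an Ollama server answers at the given base URL.
# Falls back to OLLAMA_API_BASE, then to the default address used in the steps above.
check_ollama() {
    base="${1:-${OLLAMA_API_BASE:-http://localhost:11434}}"
    # -f: fail on HTTP errors, -s: quiet, --max-time: do not hang on a dead host
    curl -fs --max-time 5 "$base/api/tags" >/dev/null
}

# Usage:
#   check_ollama && echo "Ollama is reachable" || echo "Ollama is not reachable"
```

If Open Notebook runs inside Docker, localhost inside the container may not point at the host machine, so a check run on the host can pass while the application still cannot connect.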
Why are my AI requests failing?
Common causes:
- Invalid API keys
- Insufficient credits/billing
- Model not available for your account
- Rate limiting
- Network connectivity issues
Solutions:
- Verify API keys in provider dashboard
- Check billing and usage limits
- Try different models
- Wait and retry for rate limits
- Check internet connection
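For rate limiting in particular, "wait and retry" is commonly done with exponential backoff: each failed attempt waits twice as long as the previous one. The following is a minimal sketch of that idea for scripts that call a provider API, not something Open Notebook ships; retry_with_backoff and its arguments are hypothetical names.

```shell
# Hypothetical helper: run a command, retrying with exponential backoff on failure.
# Usage: retry_with_backoff <max_attempts> <initial_delay_seconds> <command> [args...]
retry_with_backoff() {
    max_attempts="$1"
    delay="$2"
    shift 2
    attempt=1
    while true; do
        "$@" && return 0                      # success: stop retrying
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s" >&2
        sleep "$delay"
        delay=$((delay * 2))                  # double the wait each time
        attempt=$((attempt + 1))
    done
}

# Usage (the URL is a placeholder):
#   retry_with_backoff 5 1 curl -fsS https://api.example.com/v1/models
```

Retrying only helps with transient failures such as rate limits or network drops; an invalid API key or exhausted billing will fail on every attempt.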
How do I optimize AI costs?
Model selection:
- Use smaller models for simple tasks
- Use larger models only for complex reasoning
- Leverage free tiers when available
Usage optimization:
- Process documents in batches
- Use shorter prompts
- Cache results when possible
- Use local models for frequent tasks
Provider diversity:
- Use OpenRouter for expensive models
- Use free tier providers for testing
- Mix providers based on strength
Data Management
Where is my data stored?
Local storage: By default, all data is stored locally:
- Database: SurrealDB files in surreal_data/
- Uploads: Files in notebook_data/
- No external data transmission (except to chosen AI providers)
Cloud storage: There is no built-in cloud storage, but you can set it up yourself with external storage solutions.
How do I back up my data?
Manual backup:
# Create backup
tar -czf backup-$(date +%Y%m%d).tar.gz notebook_data/ surreal_data/
# Restore backup
tar -xzf backup-20240101.tar.gz
Automated backup: Set up cron jobs or use your preferred backup solution to back up the data directories.
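One possible shape for an automated backup is a small script run from cron. This sketch assumes the default surreal_data/ and notebook_data/ directories; the function name, the backup location, and the default of keeping 7 archives are illustrative choices, not project defaults.

```shell
# Hypothetical helper: archive both data directories and keep the newest N archives.
# Usage: backup_open_notebook <install_dir> <backup_dir> [keep]
backup_open_notebook() {
    install_dir="$1"
    backup_dir="$2"
    keep="${3:-7}"
    mkdir -p "$backup_dir" || return 1
    archive="$backup_dir/backup-$(date +%Y%m%d-%H%M%S).tar.gz"
    # -C makes the paths inside the archive relative to the install directory
    tar -czf "$archive" -C "$install_dir" notebook_data surreal_data || return 1
    # Timestamped names sort chronologically, so remove all but the newest $keep
    ls -1 "$backup_dir"/backup-*.tar.gz | sort -r | tail -n +"$((keep + 1))" |
        while IFS= read -r old; do rm -f "$old"; done
    echo "$archive"
}

# Example: put a call such as
#   backup_open_notebook /path/to/open-notebook /path/to/backups
# in a script and schedule that script with a crontab line like
#   0 2 * * * /path/to/backup.sh
```

Copying database files while the database is being written to can produce an inconsistent archive, so it is safer to schedule the job for a quiet period or to stop the services first.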
Can I sync data between devices?
Currently: No built-in sync functionality.
Workarounds:
- Use shared network storage for data directories
- Manual backup/restore between devices
- Database replication (advanced)
How do I migrate data between installations?
- Stop services: make stop-all
- Copy data directories:
  cp -r surreal_data/ new_installation/
  cp -r notebook_data/ new_installation/
- Start new installation
- Verify data integrity
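The "verify data integrity" step can be as simple as comparing the copied directories against the originals. A sketch, assuming both installations are reachable from the same machine and the services are already stopped; migrate_data is a hypothetical name.

```shell
# Hypothetical helper: copy both data directories and verify the copies match.
# Usage: migrate_data <old_install_dir> <new_install_dir>
migrate_data() {
    old="$1"
    new="$2"
    for dir in surreal_data notebook_data; do
        cp -R "$old/$dir" "$new/" || return 1
        # diff -r exits non-zero if any file differs or is missing on either side
        diff -r "$old/$dir" "$new/$dir" >/dev/null || {
            echo "verification failed for $dir" >&2
            return 1
        }
    done
    echo "migration verified"
}
```

This checks only that the files were copied byte for byte. Opening the new installation and confirming that notebooks and sources appear is still worthwhile.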
What happens to my data if I delete a notebook?
- Soft deletion: Notebooks are marked as archived, not permanently deleted.
- Hard deletion: Currently not implemented in UI, but possible via API.
- Recovery: Archived notebooks can be restored from the database.
How do I clean up old data?
Manual cleanup:
- Delete unused notebooks through UI
- Remove old files from notebook_data/
- Clear browser cache
Database cleanup: Advanced users can query the database directly to remove old records.
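To see which uploaded files are candidates for removal, find can list files by age. The sketch below only lists files and deletes nothing; the function name and the 90-day default are illustrative.

```shell
# Hypothetical helper: list files under a directory last modified more than N days ago.
# Usage: list_old_files <dir> [days]
list_old_files() {
    find "$1" -type f -mtime +"${2:-90}" -print
}

# Example: review uploads untouched for roughly half a year
#   list_old_files notebook_data/ 180
```

A file may still be referenced by a notebook even if it has not been modified recently, so review the list before deleting anything and take a backup first.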
Best Practices
How should I organize my notebooks?
- By topic: Create separate notebooks for different research areas
- By project: One notebook per project or course
- By source type: Separate notebooks for different content types
- By time period: Monthly or quarterly notebooks
What's the optimal notebook size?
- Recommended: 20-100 sources per notebook
- Performance: Larger notebooks may have slower search
- Organization: Better to have focused notebooks than everything in one
How do I get the best search results?
- Use descriptive queries: Instead of "data", use "data analysis methods"
- Combine keywords: Use multiple related terms
- Use natural language: Ask questions as you would to a human
- Refine iteratively: Start broad, then get more specific
How can I improve chat responses?
- Provide context: Reference specific sources or topics
- Be specific: Ask detailed questions rather than general ones
- Use follow-up questions: Build on previous responses
- Include examples: Show what kind of response you want
What's the best way to process large documents?
- Break into sections: Split large documents into smaller parts
- Use transformations: Apply summarization before adding to notebook
- Batch processing: Process multiple documents at once
- Use background jobs: For heavy processing tasks
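For plain-text or Markdown sources, the splitting can be done before upload with the standard split utility. A minimal sketch; the function name, the 500-line default, and the part_ prefix are arbitrary choices, and splitting by line count ignores section boundaries.

```shell
# Hypothetical helper: split a text file into parts of at most N lines each.
# Usage: split_document <file> <output_dir> [lines_per_part]
split_document() {
    mkdir -p "$2" || return 1
    # Produces part_aa, part_ab, ... inside the output directory
    split -l "${3:-500}" "$1" "$2/part_"
}

# Example:
#   split_document thesis.md thesis_parts 300
```

Each resulting part can then be added as its own source. For PDFs or Word files this approach does not apply directly; those would need to be split with a format-aware tool.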
How do I handle multiple languages?
- Model selection: Choose models that support your languages
- Language-specific providers: Some providers are better for certain languages
- Separate notebooks: Consider separate notebooks for different languages
- Encoding: Ensure proper text encoding for non-English content
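If a text source was saved in a legacy encoding, converting it to UTF-8 before upload is one way to avoid garbled characters. A small sketch using the standard iconv tool, assuming it is installed; to_utf8 is a hypothetical name, and the source encoding has to be known in advance.

```shell
# Hypothetical helper: convert a text file from a known encoding to UTF-8 on stdout.
# Usage: to_utf8 <file> <source_encoding>
to_utf8() {
    iconv -f "$2" -t UTF-8 "$1"
}

# Example:
#   to_utf8 notes-latin1.txt ISO-8859-1 > notes-utf8.txt
```

iconv exits with an error if the input contains bytes that are invalid in the stated source encoding, which is a useful hint that the encoding guess was wrong.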
What are the security best practices?
- API keys: Never share API keys publicly
- Password protection: Use strong passwords for public deployments
- Network security: Use HTTPS for production deployments
- Regular updates: Keep Docker images updated
- Backup encryption: Encrypt backups if they contain sensitive data
How do I optimize performance?
Hardware:
- Use SSD storage for database
- Allocate sufficient RAM (4GB+ recommended)
- Use fast internet connection
Configuration:
- Choose appropriate models for your needs
- Optimize embedding dimensions
- Use efficient file formats
Usage patterns:
- Process documents in batches
- Use background jobs for heavy tasks
- Clear cache periodically
Technical Questions
Can I use Open Notebook programmatically?
Yes: Open Notebook provides a comprehensive REST API:
- Full API documentation at the /docs endpoint
- Support for all UI functionality
- Authentication via API keys
- Webhook support for notifications
How do I extend Open Notebook?
- Plugin system: Add custom transformations and processors
- API integration: Build custom applications using the API
- Source code: Modify the open-source codebase
- Custom models: Add support for new AI providers
Can I run Open Notebook in production?
Yes: Designed for production use with:
- Docker deployment
- Horizontal scaling capability
- Security features
- Monitoring and logging
Considerations:
- Use production-grade database settings
- Implement proper backup strategy
- Configure monitoring and alerting
- Use HTTPS and security best practices
How do I contribute to Open Notebook?
Ways to contribute:
- Report bugs and issues
- Suggest new features
- Contribute code improvements
- Improve documentation
- Help other users in the community
Getting started:
- Join Discord community
- Check GitHub issues
- Read contribution guidelines
- Start with small improvements
What's the development roadmap?
Current focus:
- Stability and performance improvements
- Additional AI provider support
- Enhanced podcast generation
- Better mobile experience
Future plans:
- Multi-user support
- Advanced analytics
- Integration with external tools
- Cloud deployment options
Troubleshooting
My question isn't answered here. What should I do?
- Check the troubleshooting guide: Common Issues
- Search existing issues: GitHub repository issues
- Ask the community: Discord server
- Create a GitHub issue: For bugs or feature requests
- Check the documentation: Other documentation sections
How do I report a bug?
Include:
- Steps to reproduce
- Expected vs actual behavior
- Error messages and logs
- System information
- Configuration details (without API keys)
Submit to: GitHub Issues with bug report template
How do I request a new feature?
Process:
- Check if feature already exists or is planned
- Discuss in Discord to gauge interest
- Create detailed GitHub issue
- Consider contributing implementation
Where can I get help with installation?
Resources:
- Installation Guide
- Docker Deployment Guide
- ChatGPT Installation Assistant
- Discord community support
How do I stay updated with new releases?
Methods:
- Watch GitHub repository
- Join Discord for announcements
- Follow release notes
- Enable automatic Docker updates
This FAQ is updated regularly based on community questions and feedback. If you have a question that's not covered here, please ask in our Discord community or create a GitHub issue.