vrr/open-notebook

mirror of https://github.com/lfnovo/open-notebook.git synced 2026-04-30 20:39:55 +00:00

LUIS NOVO 472d0e47c3 docs: Lots of documentation improvements

2026-01-04 11:42:13 -03:00

AI Providers - Configuration Reference

Complete setup instructions for each AI provider. Pick the one you're using.

Cloud Providers (Recommended for Most)

OpenAI

Cost: ~$0.03-0.15 per 1K tokens (varies by model)

Setup:

1. Go to https://platform.openai.com/api-keys
2. Create account (if needed)
3. Create new API key (starts with "sk-proj-")
4. Add $5+ credits to account
5. Add to .env:
   OPENAI_API_KEY=sk-proj-...
6. Restart services

Environment Variable:

OPENAI_API_KEY=sk-proj-xxxxx

Available Models (in Open Notebook):

gpt-4o — Best quality, fast (latest version)
gpt-4o-mini — Fast, cheap, good for testing
o1 — Advanced reasoning model (slower, more expensive)
o1-mini — Faster reasoning model

Recommended:

For general use: gpt-4o (best balance)
For testing/cheap: gpt-4o-mini (90% cheaper)
For complex reasoning: o1 (best for hard problems)

Cost Estimate:

Light use: $1-5/month
Medium use: $10-30/month
Heavy use: $50-100+/month

Troubleshooting:

"Invalid API key" → Check key starts with "sk-proj-"
"Rate limit exceeded" → Wait or upgrade account
"Model not available" → Try gpt-4o-mini instead

Anthropic (Claude)

Cost: ~$0.80-3.00 per 1M tokens (cheaper than OpenAI for long context)

Setup:

1. Go to https://console.anthropic.com/
2. Create account or login
3. Go to API keys section
4. Create new API key (starts with "sk-ant-")
5. Add to .env:
   ANTHROPIC_API_KEY=sk-ant-...
6. Restart services

Environment Variable:

ANTHROPIC_API_KEY=sk-ant-xxxxx

Available Models:

claude-sonnet-4-5-20250929 — Latest, best quality (recommended)
claude-3-5-sonnet-20241022 — Previous generation, still excellent
claude-3-5-haiku-20241022 — Fast, cheap
claude-opus-4-5-20251101 — Most powerful, expensive

Recommended:

For general use: claude-sonnet-4-5 (best overall, latest)
For cheap: claude-3-5-haiku (80% cheaper)
For complex: claude-opus-4-5 (most capable)

Cost Estimate:

Sonnet: $3-20/month (typical use)
Haiku: $0.50-3/month
Opus: $10-50+/month
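
The monthly figures above follow from simple arithmetic: tokens used, divided by one million, times the per-million price. A minimal sketch of that calculation is below. The function name, the workload, and the prices in the example call are illustrative assumptions, not quoted rates; check the provider's pricing page for current numbers.

```python
def estimate_monthly_cost(input_tokens: int, output_tokens: int,
                          input_price_per_m: float,
                          output_price_per_m: float) -> float:
    """Estimate monthly USD cost from token counts and per-1M-token prices."""
    cost = (input_tokens / 1_000_000) * input_price_per_m
    cost += (output_tokens / 1_000_000) * output_price_per_m
    return round(cost, 2)


# Hypothetical workload: 2M input tokens and 0.5M output tokens per month,
# at placeholder prices of $3.00 (input) and $15.00 (output) per 1M tokens.
print(estimate_monthly_cost(2_000_000, 500_000, 3.00, 15.00))  # prints 13.5
```

Input and output tokens are priced separately by most providers, which is why the sketch takes two prices rather than one.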

Advantages:

Great long-context support (200K tokens)
Excellent reasoning
Fast processing

Troubleshooting:

"Invalid API key" → Check it starts with "sk-ant-"
"Overloaded" → Anthropic is busy, retry later
"Model unavailable" → Check model name is correct

Google Gemini

Cost: ~$0.075-0.30 per 1M tokens (competitive with OpenAI)

Setup:

1. Go to https://aistudio.google.com/app/apikey
2. Create account or login
3. Create new API key
4. Add to .env:
   GOOGLE_API_KEY=AIzaSy...
5. Restart services

Environment Variable:

GOOGLE_API_KEY=AIzaSy...
# Optional: override default endpoint
GEMINI_API_BASE_URL=https://generativelanguage.googleapis.com/v1beta/models

Available Models:

gemini-2.0-flash-exp — Latest experimental, fastest (recommended)
gemini-2.0-flash — Stable version, fast, cheap
gemini-1.5-pro-latest — More capable, longer context
gemini-1.5-flash — Previous generation, very cheap

Recommended:

For general use: gemini-2.0-flash-exp (best value, latest)
For cheap: gemini-1.5-flash (very cheap)
For complex/long context: gemini-1.5-pro-latest (2M token context)

Advantages:

Very long context (1M tokens; 2M on gemini-1.5-pro-latest)
Multimodal (images, audio, video)
Good for podcasts

Troubleshooting:

"API key invalid" → Get fresh key from aistudio.google.com
"Quota exceeded" → Free tier limited, upgrade account
"Model not found" → Check model name spelling

Groq

Cost: ~$0.05 per 1M tokens (cheapest, but limited models)

Setup:

1. Go to https://console.groq.com/keys
2. Create account or login
3. Create new API key
4. Add to .env:
   GROQ_API_KEY=gsk_...
5. Restart services

Environment Variable:

GROQ_API_KEY=gsk_xxxxx

Available Models:

llama-3.3-70b-versatile — Best on Groq (recommended)
llama-3.1-70b-versatile — Fast, capable
mixtral-8x7b-32768 — Good alternative
gemma2-9b-it — Small, very fast

Recommended:

For quality: llama-3.3-70b-versatile (best overall)
For speed: gemma2-9b-it (ultra-fast)
For balance: llama-3.1-70b-versatile

Advantages:

Ultra-fast inference
Very cheap
Great for transformations/batch work

Disadvantages:

Limited model selection
Smaller models than OpenAI/Anthropic

Troubleshooting:

"Rate limited" → Free tier has limits, upgrade
"Model not available" → Check supported models list

OpenRouter

Cost: Varies by model ($0.05-15 per 1M tokens)

Setup:

1. Go to https://openrouter.ai/keys
2. Create account or login
3. Add credits to your account
4. Create new API key
5. Add to .env:
   OPENROUTER_API_KEY=sk-or-...
6. Restart services

Environment Variable:

OPENROUTER_API_KEY=sk-or-xxxxx
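
Several "Invalid API key" errors in the cloud sections come down to a key pasted into the wrong variable. The sketch below checks each configured key against the prefix shown in this document (sk-proj-, sk-ant-, AIzaSy, gsk_, sk-or-). The function and table names are hypothetical, and providers can change key formats, so treat a mismatch as a hint rather than proof that the key is bad.

```python
# Prefixes as shown in the provider sections of this document.
KEY_PREFIXES = {
    "OPENAI_API_KEY": "sk-proj-",
    "ANTHROPIC_API_KEY": "sk-ant-",
    "GOOGLE_API_KEY": "AIzaSy",
    "GROQ_API_KEY": "gsk_",
    "OPENROUTER_API_KEY": "sk-or-",
}


def check_key_prefixes(env: dict[str, str]) -> dict[str, str]:
    """Return {variable: problem} for keys that are set but look malformed."""
    problems = {}
    for name, prefix in KEY_PREFIXES.items():
        value = env.get(name, "").strip()
        if not value:
            continue  # not configured, so there is nothing to check
        if not value.startswith(prefix):
            problems[name] = f"expected prefix {prefix!r}"
    return problems
```

To check the current process environment, call check_key_prefixes(dict(os.environ)) after importing os; an empty result means no prefix mismatches were found.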

Available Models (100+ options):

OpenAI: openai/gpt-4o, openai/o1
Anthropic: anthropic/claude-sonnet-4.5, anthropic/claude-3.5-haiku
Google: google/gemini-2.0-flash-exp, google/gemini-1.5-pro
Meta: meta-llama/llama-3.3-70b-instruct, meta-llama/llama-3.1-405b-instruct
Mistral: mistralai/mistral-large-2411
DeepSeek: deepseek/deepseek-chat
And many more...

Recommended:

For quality: anthropic/claude-sonnet-4.5 (best overall)
For speed/cost: google/gemini-2.0-flash-exp (very fast, cheap)
For open-source: meta-llama/llama-3.3-70b-instruct
For reasoning: openai/o1

Advantages:

One API key for 100+ models
Unified billing
Easy model comparison
Access to models that may have waitlists elsewhere

Cost Estimate:

Light use: $1-5/month
Medium use: $10-30/month
Heavy use: Depends on models chosen

Troubleshooting:

"Invalid API key" → Check it starts with "sk-or-"
"Insufficient credits" → Add credits at openrouter.ai
"Model not available" → Check model ID spelling (use full path)

Self-Hosted / Local

Ollama (Recommended for Local)

Cost: Free (electricity only)

Setup:

1. Install Ollama: https://ollama.ai
2. Run Ollama in background:
   ollama serve

3. Download a model:
   ollama pull mistral
   # or llama2, neural-chat, phi, etc.

4. Add to .env:
   OLLAMA_API_BASE=http://localhost:11434
   # If on different machine:
   # OLLAMA_API_BASE=http://10.0.0.5:11434

5. Restart services

Environment Variable:

OLLAMA_API_BASE=http://localhost:11434
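
To confirm that the address in OLLAMA_API_BASE is reachable, one option is to ask Ollama which models it has downloaded through its /api/tags endpoint. The sketch below uses only the standard library; the helper names are hypothetical, and the 5-second timeout is an arbitrary choice.

```python
import json
import urllib.request


def tags_url(api_base: str) -> str:
    """Build the URL of Ollama's model-listing endpoint."""
    return api_base.rstrip("/") + "/api/tags"


def parse_model_names(payload: str) -> list[str]:
    """Extract model names from an /api/tags JSON response body."""
    data = json.loads(payload)
    return [model["name"] for model in data.get("models", [])]


def list_ollama_models(api_base: str, timeout: float = 5.0) -> list[str]:
    """Return locally available model names, or raise if Ollama is unreachable."""
    try:
        with urllib.request.urlopen(tags_url(api_base), timeout=timeout) as resp:
            return parse_model_names(resp.read().decode("utf-8"))
    except OSError as exc:  # urllib.error.URLError is a subclass of OSError
        raise RuntimeError(f"Cannot reach Ollama at {api_base}: {exc}") from exc


# Usage (requires a running Ollama instance):
#   list_ollama_models("http://localhost:11434")
```

A RuntimeError here usually means Ollama is not running or the address or port is wrong; an empty list means Ollama is up but no model has been pulled yet.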

Available Models:

llama3.3:70b — Best quality (requires 40GB+ RAM)
llama3.1:8b — Recommended, balanced (8GB RAM)
qwen2.5:7b — Excellent for code and reasoning
mistral:7b — Good general purpose
phi3:3.8b — Small, fast (4GB RAM)
gemma2:9b — Google's model, balanced
Many more: browse https://ollama.com/library (ollama list shows the models you have already downloaded)

Recommended:

For quality (with GPU): llama3.3:70b (best)
For general use: llama3.1:8b (best balance)
For speed/low memory: phi3:3.8b (very fast)
For coding: qwen2.5:7b (excellent at code)

Hardware Requirements:

GPU (NVIDIA/AMD):
  8GB VRAM: Runs most models fine
  6GB VRAM: Works, slower
  4GB VRAM: Small models only

CPU-only:
  16GB+ RAM: Slow but works
  8GB RAM: Very slow
  4GB RAM: Not recommended
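
A rough way to relate model size to the memory figures above is parameters times bits per weight, plus some headroom for the context cache and runtime. The sketch below assumes 4-bit quantized weights and a 20% overhead factor; both values are rule-of-thumb assumptions, not measurements, and real usage varies with quantization and context length.

```python
def estimate_model_memory_gb(params_billion: float,
                             bits_per_weight: int = 4,
                             overhead: float = 1.2) -> float:
    """Rough memory estimate in GB for running a quantized model.

    bits_per_weight=4 and overhead=1.2 are assumed defaults (rule of thumb).
    """
    weights_gb = params_billion * bits_per_weight / 8  # 1B params at 8 bits ~ 1 GB
    return round(weights_gb * overhead, 1)


print(estimate_model_memory_gb(70))  # prints 42.0
print(estimate_model_memory_gb(8))   # prints 4.8
```

Under these assumptions a 70B model lands near the "40GB+" figure listed for llama3.3:70b, and an 8B model fits comfortably in 8GB.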

Advantages:

Completely private (runs locally)
Free (electricity only)
No API key needed
Works offline

Disadvantages:

Slower than cloud (unless on GPU)
Smaller models than cloud
Requires local hardware

Troubleshooting:

"Connection refused" → Ollama not running or wrong port
"Model not found" → Download it: ollama pull modelname
"Out of memory" → Use smaller model or add more RAM

LM Studio (Local Alternative)

Cost: Free

Setup:

1. Download LM Studio: https://lmstudio.ai
2. Open app
3. Download a model from library
4. Go to "Local Server" tab
5. Start server (default port: 1234)
6. Add to .env:
   OPENAI_COMPATIBLE_BASE_URL=http://localhost:1234/v1
   OPENAI_COMPATIBLE_API_KEY=lm-studio
7. Restart services

Environment Variables:

OPENAI_COMPATIBLE_BASE_URL=http://localhost:1234/v1
# Just a placeholder
OPENAI_COMPATIBLE_API_KEY=lm-studio

Advantages:

GUI interface (easier than Ollama CLI)
Good model selection
Privacy-focused
Works offline

Disadvantages:

Desktop only (Mac/Windows/Linux)
Slower than cloud
Requires local GPU

Custom OpenAI-Compatible

For Text Generation UI, vLLM, or other OpenAI-compatible endpoints:

Add to .env:
OPENAI_COMPATIBLE_BASE_URL=http://your-endpoint/v1
OPENAI_COMPATIBLE_API_KEY=your-api-key

If you need different endpoints for different modalities:

# Language model
OPENAI_COMPATIBLE_BASE_URL_LLM=http://localhost:8000/v1
OPENAI_COMPATIBLE_API_KEY_LLM=sk-...

# Embeddings
OPENAI_COMPATIBLE_BASE_URL_EMBEDDING=http://localhost:8001/v1
OPENAI_COMPATIBLE_API_KEY_EMBEDDING=sk-...

# TTS (text-to-speech)
OPENAI_COMPATIBLE_BASE_URL_TTS=http://localhost:8002/v1
OPENAI_COMPATIBLE_API_KEY_TTS=sk-...
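
The per-modality variables can be read as overrides of the generic pair. The sketch below shows one plausible lookup order: try the modality-specific variable first, then fall back to the generic one. That fallback order is an assumption made for illustration; this document does not state how Open Notebook itself resolves these variables, and the function name is hypothetical.

```python
from typing import Mapping, Optional, Tuple


def resolve_endpoint(env: Mapping[str, str],
                     modality: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (base_url, api_key) for a modality such as 'llm', 'embedding', 'tts'.

    Assumed order: the modality-specific variable wins, otherwise the generic one.
    """
    suffix = modality.upper()
    base_url = (env.get(f"OPENAI_COMPATIBLE_BASE_URL_{suffix}")
                or env.get("OPENAI_COMPATIBLE_BASE_URL"))
    api_key = (env.get(f"OPENAI_COMPATIBLE_API_KEY_{suffix}")
               or env.get("OPENAI_COMPATIBLE_API_KEY"))
    return base_url, api_key


example_env = {
    "OPENAI_COMPATIBLE_BASE_URL": "http://localhost:8000/v1",
    "OPENAI_COMPATIBLE_API_KEY": "generic-key",
    "OPENAI_COMPATIBLE_BASE_URL_TTS": "http://localhost:8002/v1",
}
print(resolve_endpoint(example_env, "tts"))
# prints ('http://localhost:8002/v1', 'generic-key')
```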

Enterprise

Azure OpenAI

Cost: Same as OpenAI (usage-based)

Setup:

1. Create Azure OpenAI service in Azure portal
2. Deploy GPT-4/3.5-turbo model
3. Get your endpoint and key
4. Add to .env:
   AZURE_OPENAI_API_KEY=your-key
   AZURE_OPENAI_ENDPOINT=https://your-name.openai.azure.com/
   AZURE_OPENAI_API_VERSION=2024-12-01-preview
5. Restart services

Environment Variables:

AZURE_OPENAI_API_KEY=xxxxx
AZURE_OPENAI_ENDPOINT=https://your-instance.openai.azure.com/
AZURE_OPENAI_API_VERSION=2024-12-01-preview

# Optional: Different deployments for different modalities
AZURE_OPENAI_API_KEY_LLM=xxxxx
AZURE_OPENAI_ENDPOINT_LLM=https://your-instance.openai.azure.com/
AZURE_OPENAI_API_VERSION_LLM=2024-12-01-preview
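
The three Azure values combine into the request URL used by the Azure OpenAI REST API, with the key sent separately in an api-key header. The sketch below builds the chat completions URL; the deployment name "my-gpt4o" is a hypothetical placeholder for whatever you named your deployment in step 2.

```python
from urllib.parse import urlencode


def azure_chat_url(endpoint: str, deployment: str, api_version: str) -> str:
    """Build the chat completions URL for an Azure OpenAI deployment."""
    base = endpoint.rstrip("/")  # tolerate the trailing slash used in .env
    query = urlencode({"api-version": api_version})
    return f"{base}/openai/deployments/{deployment}/chat/completions?{query}"


print(azure_chat_url("https://your-instance.openai.azure.com/",
                     "my-gpt4o",  # hypothetical deployment name
                     "2024-12-01-preview"))
```

If requests fail with "resource not found", comparing the URL built this way against the endpoint shown in the Azure portal is a quick way to spot a wrong deployment name or API version.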

Advantages:

Enterprise support
VPC integration
Compliance (HIPAA, SOC2, etc.)

Disadvantages:

More complex setup
Higher overhead
Requires Azure account

Embeddings (For Search/Semantic Features)

By default, Open Notebook uses the LLM provider's embeddings. To use a different provider:

OpenAI Embeddings (Default)

# Uses OpenAI's embedding model automatically
# Requires OPENAI_API_KEY
# No separate configuration needed

Custom Embeddings

# For other embedding providers (future feature)
EMBEDDING_PROVIDER=openai  # or custom

Choosing Your Provider

For simplicity (cloud only, one provider for everything): OpenAI

Cloud-based
Good quality
Reasonable cost
Simplest setup; supports all modes (text, embeddings, TTS, STT, etc.)

For budget-conscious: Groq, OpenRouter, or Ollama

Groq: Very cheap cloud inference
Ollama: Free, but runs on your own hardware
OpenRouter: Easy access to many open-source models

For privacy-first: Ollama or LM Studio, plus Speaches for local speech (TTS/STT)

Everything stays local
Works offline
No API keys sent anywhere

For enterprise: Azure OpenAI

Compliance
VPC integration
Support

Next Steps

Choose your provider from above
Get API key (if cloud) or install locally (if Ollama or LM Studio)
Add to .env
Restart services
Go to Settings → Models in Open Notebook
Verify it works with a test chat
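
Before restarting, it can help to see which providers your .env actually configures. The sketch below parses .env text and reports every provider from this document whose main variable is set to a non-empty value. The parser is deliberately simple (it skips comment lines and does not handle inline comments or multi-line values), and all function names are hypothetical.

```python
# Main variable for each provider, as named in this document.
PROVIDER_VARS = {
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "Google Gemini": "GOOGLE_API_KEY",
    "Groq": "GROQ_API_KEY",
    "OpenRouter": "OPENROUTER_API_KEY",
    "Ollama": "OLLAMA_API_BASE",
    "OpenAI-compatible": "OPENAI_COMPATIBLE_BASE_URL",
    "Azure OpenAI": "AZURE_OPENAI_API_KEY",
}


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        env[name.strip()] = value.strip().strip('"').strip("'")
    return env


def configured_providers(env: dict[str, str]) -> list[str]:
    """List providers whose main variable has a non-empty value."""
    return [name for name, var in PROVIDER_VARS.items() if env.get(var)]


sample = "# local setup\nOPENAI_API_KEY=sk-proj-abc\nOLLAMA_API_BASE=http://localhost:11434\nGROQ_API_KEY=\n"
print(configured_providers(parse_env(sample)))  # prints ['OpenAI', 'Ollama']
```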

Done!

Related

Environment Reference - Complete list of all environment variables
Advanced Configuration - Timeouts, SSL, performance tuning
Ollama Setup - Detailed Ollama configuration guide
OpenAI-Compatible - LM Studio and other compatible providers
Troubleshooting - Common issues and fixes
