mirror of
https://github.com/lfnovo/open-notebook.git
synced 2026-04-28 11:30:00 +00:00
# Open Notebook Architecture

## High-Level Overview

Open Notebook follows a three-tier architecture with clear separation of concerns:

```
┌─────────────────────────────────────────────────────────┐
│                     Your Browser                        │
│           Access: http://your-server-ip:8502            │
└────────────────┬────────────────────────────────────────┘
                 │
                 ▼
         ┌───────────────┐
         │   Port 8502   │ ← Next.js Frontend (what you see)
         │   Frontend    │   Also proxies API requests internally!
         └───────┬───────┘
                 │ proxies /api/* requests ↓
                 ▼
         ┌───────────────┐
         │   Port 5055   │ ← FastAPI Backend (handles requests)
         │      API      │
         └───────┬───────┘
                 │
                 ▼
         ┌───────────────┐
         │   SurrealDB   │ ← Database (internal, auto-configured)
         │  (Port 8000)  │
         └───────────────┘
```

**Key Points:**

- **v1.1+**: Next.js automatically proxies `/api/*` requests to the backend, simplifying reverse proxy setup
- Your browser loads the frontend from port 8502
- The frontend needs to know where to find the API; when accessing remotely, set `API_URL=http://your-server-ip:5055`
- **Behind a reverse proxy?** You only need to proxy to port 8502 now! See [Reverse Proxy Configuration](../5-CONFIGURATION/reverse-proxy.md)

---

## Detailed Architecture

Open Notebook is built on a **three-tier, async-first architecture** designed for scalability, modularity, and multi-provider AI flexibility. The system separates concerns across frontend, API, and database layers, with LangGraph powering intelligent workflows and Esperanto enabling seamless integration with 8+ AI providers.

**Core Philosophy**:

- Privacy-first: Users control their data and AI provider choice
- Async/await throughout: Non-blocking operations for responsive UX
- Domain-Driven Design: Clear separation between domain models, repositories, and orchestrators
- Multi-provider flexibility: Swap AI providers without changing application code
- Self-hosted capable: All components deployable in isolated environments

---

## Three-Tier Architecture

### Layer 1: Frontend (React/Next.js @ port 3000)

Runs on port 3000 in development; the Docker deployment serves it on port 8502 (as shown in the overview above).

**Purpose**: Responsive, interactive user interface for research, notes, chat, and podcast management.

**Technology Stack**:

- **Framework**: Next.js 15 with React 19
- **Language**: TypeScript with strict type checking
- **State Management**: Zustand (lightweight store) + TanStack Query (server state)
- **Styling**: Tailwind CSS + Shadcn/ui component library
- **Build Tool**: Webpack (bundled via Next.js)

**Key Responsibilities**:

- Render notebooks, sources, notes, chat sessions, and podcasts
- Handle user interactions (create, read, update, delete operations)
- Manage complex UI state (modals, file uploads, real-time search)
- Stream responses from the API (chat, podcast generation)
- Display embeddings, vector search results, and insights

**Communication Pattern**:

- All data fetched via REST API (async requests to port 5055)
- Configured base URL: `http://localhost:5055` (dev) or environment-specific (prod)
- TanStack Query handles caching, refetching, and data synchronization
- Zustand stores global state (user, notebooks, selected context)
- CORS enabled on the API side for cross-origin requests

**Component Architecture**:

- `/src/app/`: Next.js App Router (pages, layouts)
- `/src/components/`: Reusable React components (buttons, forms, cards)
- `/src/hooks/`: Custom hooks (useNotebook, useChat, useSearch)
- `/src/lib/`: Utility functions, API clients, validators
- `/src/styles/`: Global CSS, Tailwind config

---

### Layer 2: API (FastAPI @ port 5055)

**Purpose**: RESTful backend exposing operations on notebooks, sources, notes, chat sessions, and AI models.

**Technology Stack**:

- **Framework**: FastAPI 0.104+ (async Python web framework)
- **Language**: Python 3.11+
- **Validation**: Pydantic v2 (request/response schemas)
- **Logging**: Loguru (structured JSON logging)
- **Testing**: Pytest (unit and integration tests)

**Architecture**:

```
FastAPI App (main.py)
├── Routers (HTTP endpoints)
│   ├── routers/notebooks.py (CRUD operations)
│   ├── routers/sources.py (content ingestion, upload)
│   ├── routers/notes.py (note management)
│   ├── routers/chat.py (conversation sessions)
│   ├── routers/search.py (full-text + vector search)
│   ├── routers/transformations.py (custom transformations)
│   ├── routers/models.py (AI model configuration)
│   └── routers/*.py (11 additional routers)
│
├── Services (business logic)
│   ├── *_service.py (orchestration, graph invocation)
│   ├── command_service.py (async job submission)
│   └── middleware (auth, logging)
│
├── Models (Pydantic schemas)
│   └── models.py (validation, serialization)
│
└── Lifespan (startup/shutdown)
    └── AsyncMigrationManager (database schema migrations)
```

**Key Responsibilities**:

1. **HTTP Interface**: Accept REST requests, validate, return JSON responses
2. **Business Logic**: Orchestrate domain models, repository operations, and workflows
3. **Async Job Queue**: Submit long-running tasks (podcast generation, source processing)
4. **Database Migrations**: Run schema updates on startup
5. **Error Handling**: Catch exceptions, return appropriate HTTP status codes
6. **Logging**: Track operations for debugging and monitoring

**Startup Flow**:

1. Load `.env` environment variables
2. Initialize FastAPI app with CORS + auth middleware
3. Run AsyncMigrationManager (creates/updates database schema)
4. Register all routers (20+ endpoints)
5. Server ready on port 5055

**Request-Response Cycle**:

```
HTTP Request → Router → Service → Domain/Repository → SurrealDB
                                        ↓
                                LangGraph (optional)
                                        ↓
Response ← Pydantic serialization ← Service ← Result
```

---

### Layer 3: Database (SurrealDB @ port 8000)

**Purpose**: Graph database with built-in vector embeddings, semantic search, and relationship management.

**Technology Stack**:

- **Database**: SurrealDB (multi-model, ACID transactions)
- **Query Language**: SurrealQL (SQL-like syntax with graph operations)
- **Async Driver**: Rust-backed async SurrealDB client for Python
- **Migrations**: Manual `.surql` files in `/migrations/` (auto-run on API startup)

**Core Tables**:

| Table | Purpose | Key Fields |
|-------|---------|-----------|
| `notebook` | Research project container | id, name, description, archived, created, updated |
| `source` | Content item (PDF, URL, text) | id, title, full_text, topics, asset, created, updated |
| `source_embedding` | Vector embeddings for semantic search | id, source, embedding, chunk_text, chunk_index |
| `note` | User-created research notes | id, title, content, note_type (human/ai), created, updated |
| `chat_session` | Conversation session | id, notebook_id, title, messages (JSON), created, updated |
| `transformation` | Custom transformation rules | id, name, description, prompt, created, updated |
| `source_insight` | Transformation output | id, source_id, insight_type, content, created, updated |
| `reference` | Relationship: source → notebook | out (source), in (notebook) |
| `artifact` | Relationship: note → notebook | out (note), in (notebook) |

**Relationship Graph**:

```
Notebook
   ↓ (referenced_by)
Source
   ├→ SourceEmbedding (1:many for chunked text)
   ├→ SourceInsight (1:many for transformation outputs)
   └→ Note (via artifact relationship)
        ├→ Embedding (semantic search)
        └→ Topics (tags)

ChatSession
   ├→ Notebook
   └→ Messages (stored as JSON array)
```

**Vector Search Capability**:

- Embeddings stored natively in SurrealDB
- Full-text search on `source.full_text` and `note.content`
- Cosine similarity search on embedding vectors
- Semantic search integrates with the search endpoint

**Connection Management**:

- Async connection pooling (configurable size)
- Transaction support for multi-record operations
- Schema auto-validation via migrations
- Query timeout protection (prevents runaway queries)

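
SurrealDB computes similarity natively, but the ranking the search relies on is ordinary cosine similarity over stored vectors. A minimal pure-Python sketch (toy three-dimensional vectors; real embeddings have hundreds of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Rank stored chunks against a query embedding
chunks = {
    "chunk-1": [0.9, 0.1, 0.0],
    "chunk-2": [0.1, 0.9, 0.1],
    "chunk-3": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]
ranked = sorted(chunks, key=lambda c: cosine_similarity(query, chunks[c]), reverse=True)
print(ranked[0])  # → chunk-1
```

Keeping this computation inside the database avoids shipping every embedding over the wire on each search.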
---

## Tech Stack Rationale

### Why Python + FastAPI?

**Python**:

- Rich AI/ML ecosystem (LangChain, LangGraph, transformers, scikit-learn)
- Rapid prototyping and deployment
- Extensive async support (asyncio, async/await)
- Strong type hints (Pydantic, mypy)

**FastAPI**:

- Modern, async-first framework
- Automatic OpenAPI documentation (Swagger UI @ /docs)
- Built-in request validation (Pydantic)
- Excellent performance for a Python framework (benchmarks on par with Node.js and Go frameworks)
- Easy middleware/dependency injection

### Why Next.js + React + TypeScript?

**Next.js**:

- Full-stack React framework with SSR/SSG
- File-based routing (intuitive project structure)
- Built-in API routes (optional backend co-location)
- Optimized image handling and code splitting
- Easy deployment (Vercel, Docker, self-hosted)

**React 19**:

- Component-based UI (reusable, testable)
- Excellent tooling and community
- Client-side state management (Zustand)
- Server-side state sync (TanStack Query)

**TypeScript**:

- Type safety catches errors at compile time
- Better IDE autocomplete and refactoring
- Documentation via types (self-documenting code)
- Easier onboarding for new contributors

### Why SurrealDB?

**SurrealDB**:

- Native graph database (relationships are first-class)
- Built-in vector embeddings (no separate vector DB)
- ACID transactions (data consistency)
- Multi-model (relational + document + graph)
- Full-text search + semantic search in one query
- Self-hosted (unlike managed Pinecone/Weaviate)
- Flexible SurrealQL (SQL-like syntax)

**Alternative Considered**: PostgreSQL + pgvector (more mature, but requires separate extensions)

### Why Esperanto for AI Providers?

**Esperanto Library**:

- Unified interface to 8+ LLM providers (OpenAI, Anthropic, Google, Groq, Ollama, Mistral, DeepSeek, xAI)
- Multi-provider embeddings (OpenAI, Google, Ollama, Mistral, Voyage)
- TTS/STT integration (OpenAI, Groq, ElevenLabs, Google)
- Smart provider selection (fallback logic, cost optimization)
- Per-request model override support
- Local Ollama support (completely self-hosted option)

**Alternative Considered**: LangChain's provider abstraction (more verbose, less flexible)

---

## LangGraph Workflows

LangGraph is a state machine library that orchestrates multi-step AI workflows. Open Notebook uses five core workflows:

### 1. **Source Processing Workflow** (`open_notebook/graphs/source.py`)

**Purpose**: Ingest content (PDF, URL, text) and prepare it for search and insights.

**Flow**:

```
Input (file/URL/text)
        ↓
Extract Content (content-core library)
        ↓
Clean & tokenize text
        ↓
Generate Embeddings (Esperanto)
        ↓
Create SourceEmbedding records (chunked + indexed)
        ↓
Extract Topics (LLM summarization)
        ↓
Save to SurrealDB
        ↓
Output (Source record with embeddings)
```

**State Dict**:

```python
{
    "content_state": dict,  # exactly one of "file_path", "url", or "content" (str)
    "source_id": str,
    "full_text": str,
    "embeddings": List[Dict],
    "topics": List[str],
    "notebook_ids": List[str],
}
```

**Invoked By**: Sources API (`POST /sources`)

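
The "chunked + indexed" step can be illustrated with a generic overlapping-window chunker. The chunk size and overlap actually used by the project are not specified here, so the numbers below are assumptions:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[dict]:
    """Split text into overlapping chunks, shaped like source_embedding records."""
    step = chunk_size - overlap
    chunks = []
    for index, start in enumerate(range(0, len(text), step)):
        piece = text[start:start + chunk_size]
        if not piece:
            break
        chunks.append({"chunk_index": index, "chunk_text": piece})
        if start + chunk_size >= len(text):
            break  # last window already covers the end of the text
    return chunks

doc = "x" * 450
result = chunk_text(doc)
print(len(result))  # → 3
```

Overlap keeps sentences that straddle a chunk boundary retrievable from either neighboring chunk, which improves search relevance.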
---

### 2. **Chat Workflow** (`open_notebook/graphs/chat.py`)

**Purpose**: Conduct multi-turn conversations with an AI model, referencing notebook context.

**Flow**:

```
User Message
        ↓
Build Context (selected sources/notes)
        ↓
Add Message to Session
        ↓
Create Chat Prompt (system + history + context)
        ↓
Call LLM (via Esperanto)
        ↓
Stream Response
        ↓
Save AI Message to ChatSession
        ↓
Output (complete message)
```

**State Dict**:

```python
{
    "session_id": str,
    "messages": List[BaseMessage],
    "context": Dict[str, Any],  # sources, notes, snippets
    "response": str,
    "model_override": Optional[str],
}
```

**Key Features**:

- Session records persisted in SurrealDB; LangGraph state checkpointed via SqliteSaver (`/data/sqlite-db/`)
- Context building via `build_context_for_chat()` utility
- Token counting to prevent overflow
- Per-message model override support

**Invoked By**: Chat API (`POST /chat/execute`)

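
The token-counting step can be sketched as trimming history to a budget, newest messages first. The 4-characters-per-token estimator is a deliberate simplification; real code would use a proper tokenizer:

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic: ~4 characters per token (a real tokenizer is more accurate)
    return max(1, len(text) // 4)

def trim_history(messages: list[dict], budget: int) -> list[dict]:
    """Keep the most recent messages that fit within the token budget."""
    kept: list[dict] = []
    used = 0
    for message in reversed(messages):  # walk newest → oldest
        cost = estimate_tokens(message["content"])
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    return list(reversed(kept))  # restore chronological order

history = [
    {"role": "user", "content": "a" * 400},       # ~100 tokens
    {"role": "assistant", "content": "b" * 400},  # ~100 tokens
    {"role": "user", "content": "c" * 40},        # ~10 tokens
]
trimmed = trim_history(history, budget=120)
print(len(trimmed))  # → 2
```

Dropping the oldest turns first preserves the most relevant recent context while staying under the model's window.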
---

### 3. **Ask Workflow** (`open_notebook/graphs/ask.py`)

**Purpose**: Answer user questions by searching sources and synthesizing responses.

**Flow**:

```
User Question
        ↓
Plan Search Strategy (LLM generates searches)
        ↓
Execute Searches (vector + text search)
        ↓
Score & Rank Results
        ↓
Provide Answers (LLM synthesizes from results)
        ↓
Stream Responses
        ↓
Output (final answer)
```

**State Dict**:

```python
{
    "question": str,
    "strategy": SearchStrategy,
    "answers": List[str],
    "final_answer": str,
    "sources_used": List[Source],
}
```

**Streaming**: Uses `astream()` to emit updates in real time (strategy → answers → final answer)

**Invoked By**: Search API (`POST /ask` with streaming)

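
The "Score & Rank Results" step merges hits from vector and full-text searches. The project's exact scoring isn't shown here; reciprocal rank fusion is one common way such merging is done:

```python
def reciprocal_rank_fusion(result_lists: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked result lists into one ranking.

    Each document scores 1/(k + rank) per list it appears in, so documents
    ranked well by multiple searches rise to the top.
    """
    scores: dict[str, float] = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["source:a", "source:b", "source:c"]
text_hits = ["source:b", "source:d", "source:a"]
merged = reciprocal_rank_fusion([vector_hits, text_hits])
print(merged[0])  # → source:b
```

`source:b` wins because it ranks near the top of both lists, even though neither search put it first.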
---

### 4. **Transformation Workflow** (`open_notebook/graphs/transformation.py`)

**Purpose**: Apply custom transformations to sources (extract summaries, key points, etc.).

**Flow**:

```
Source + Transformation Rule
        ↓
Generate Prompt (Jinja2 template)
        ↓
Call LLM
        ↓
Parse Output
        ↓
Create SourceInsight record
        ↓
Output (insight with type + content)
```

**Example Transformations**:

- Summary (5-sentence overview)
- Key Points (bulleted list)
- Quotes (notable excerpts)
- Q&A (generated questions and answers)

**Invoked By**: Sources API (`POST /sources/{id}/insights`)

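
The "Generate Prompt" step fills a stored transformation prompt with source data. The project uses Jinja2; this sketch substitutes the stdlib `string.Template` so it stays dependency-free, and the placeholder names are illustrative:

```python
from string import Template

# Stand-in for a stored transformation prompt: placeholders are filled
# from the transformation rule and the source's extracted text.
TRANSFORMATION_PROMPT = Template(
    "You are a research assistant.\n"
    "Produce a $insight_type for the following source:\n\n"
    "$full_text"
)

prompt = TRANSFORMATION_PROMPT.substitute(
    insight_type="5-sentence summary",
    full_text="SurrealDB is a multi-model database...",
)
print(prompt.splitlines()[1])  # → Produce a 5-sentence summary for the following source:
```

Because the prompt lives in the `transformation` table rather than in code, users can define new insight types without redeploying.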
---

### 5. **Prompt Workflow** (`open_notebook/graphs/prompt.py`)

**Purpose**: Generic LLM task execution (e.g., auto-generate note titles, analyze content).

**Flow**:

```
Input Text + Prompt
        ↓
Call LLM (simple request-response)
        ↓
Output (completion)
```

**Used For**: Note title generation, content analysis, etc.

---

## AI Provider Integration Pattern

### ModelManager: Centralized Factory

Located in `open_notebook/ai/models.py`, ModelManager handles:

1. **Provider Detection**: Check environment variables for available providers
2. **Model Selection**: Choose the best model based on context size and task
3. **Fallback Logic**: If the primary provider is unavailable, try a backup
4. **Cost Optimization**: Prefer cheaper models for simple tasks
5. **Token Calculation**: Estimate cost before the LLM call

**Usage**:

```python
from open_notebook.ai.provision import provision_langchain_model

# Get the best LLM for the context size
model = await provision_langchain_model(
    task="chat",  # or "search", "extraction"
    model_override="anthropic/claude-opus-4",  # optional
    context_size=8000,  # estimated tokens
)

# Invoke the model
response = await model.ainvoke({"input": prompt})
```

### Multi-Provider Support

**LLM Providers**:

- OpenAI (gpt-4, gpt-4-turbo, gpt-3.5-turbo)
- Anthropic (claude-opus, claude-sonnet, claude-haiku)
- Google (gemini-pro, gemini-1.5)
- Groq (mixtral, llama-2)
- Ollama (local models)
- Mistral (mistral-large, mistral-medium)
- DeepSeek (deepseek-chat)
- xAI (grok)

**Embedding Providers**:

- OpenAI (text-embedding-3-large, text-embedding-3-small)
- Google (embedding-001)
- Ollama (local embeddings)
- Mistral (mistral-embed)
- Voyage (voyage-large-2)

**TTS Providers**:

- OpenAI (tts-1, tts-1-hd)
- Groq (no TTS; falls back to OpenAI)
- ElevenLabs (multilingual voices)
- Google TTS (text-to-speech)

### Per-Request Override

Every LangGraph invocation accepts a `config` parameter to override models:

```python
result = await graph.ainvoke(
    input={...},
    config={
        "configurable": {
            "model_override": "anthropic/claude-opus-4"  # Use Claude instead
        }
    }
)
```

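
Provider detection from environment variables (step 1 above) can be sketched as follows; the variable names and fallback order are assumptions for illustration, not ModelManager's actual logic:

```python
import os

# Hypothetical mapping: which environment variable signals a configured provider
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "ollama": "OLLAMA_API_BASE",
}

def pick_provider(preferred: list[str], env=os.environ) -> str:
    """Return the first preferred provider whose credentials are present."""
    for provider in preferred:
        if env.get(PROVIDER_ENV_VARS[provider]):
            return provider
    raise RuntimeError("no configured AI provider found")

fake_env = {"ANTHROPIC_API_KEY": "sk-ant-..."}
print(pick_provider(["openai", "anthropic", "ollama"], env=fake_env))  # → anthropic
```

Keeping detection data-driven means adding a provider is a configuration change, not a code change.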
---

## Design Patterns

### 1. **Domain-Driven Design (DDD)**

**Domain Objects** (`open_notebook/domain/`):

- `Notebook`: Research container with relationships to sources/notes
- `Source`: Content item (PDF, URL, text) with embeddings
- `Note`: User-created or AI-generated research note
- `ChatSession`: Conversation history for a notebook
- `Transformation`: Custom rule for extracting insights

**Repository Pattern**:

- Database access layer (`open_notebook/database/repository.py`)
- `repo_query()`: Execute SurrealQL queries
- `repo_create()`: Insert records
- `repo_upsert()`: Merge records
- `repo_delete()`: Remove records

**Entity Methods**:

```python
# Domain methods (business logic)
notebook = await Notebook.get(id)
await notebook.save()
notes = await notebook.get_notes()
sources = await notebook.get_sources()
```

### 2. **Async-First Architecture**
|
|
|
|
**All I/O is async**:
|
|
- Database queries: `await repo_query(...)`
|
|
- LLM calls: `await model.ainvoke(...)`
|
|
- File I/O: `await upload_file.read()`
|
|
- Graph invocations: `await graph.ainvoke(...)`
|
|
|
|
**Benefits**:
|
|
- Non-blocking request handling (FastAPI serves multiple concurrent requests)
|
|
- Better resource utilization (I/O waiting doesn't block CPU)
|
|
- Natural fit for Python async/await syntax
|
|
|
|
**Example**:
|
|
```python
|
|
@router.post("/sources")
|
|
async def create_source(source_data: SourceCreate):
|
|
# All operations are non-blocking
|
|
source = Source(title=source_data.title)
|
|
await source.save() # async database operation
|
|
await graph.ainvoke({...}) # async LangGraph invocation
|
|
return SourceResponse(...)
|
|
```
|
|
|
|
### 3. **Service Pattern**
|
|
|
|
Services orchestrate domain objects, repositories, and workflows:
|
|
|
|
```python
|
|
# api/notebook_service.py
|
|
class NotebookService:
|
|
async def get_notebook_with_stats(notebook_id: str):
|
|
notebook = await Notebook.get(notebook_id)
|
|
sources = await notebook.get_sources()
|
|
notes = await notebook.get_notes()
|
|
return {
|
|
"notebook": notebook,
|
|
"source_count": len(sources),
|
|
"note_count": len(notes),
|
|
}
|
|
```
|
|
|
|
**Responsibilities**:
|
|
- Validate inputs (Pydantic)
|
|
- Orchestrate database operations
|
|
- Invoke workflows (LangGraph graphs)
|
|
- Handle errors and return appropriate status codes
|
|
- Log operations
|
|
|
|
### 4. **Streaming Pattern**
|
|
|
|
For long-running operations (ask workflow, podcast generation), stream results as Server-Sent Events:
|
|
|
|
```python
|
|
@router.post("/ask", response_class=StreamingResponse)
|
|
async def ask(request: AskRequest):
|
|
async def stream_response():
|
|
async for chunk in ask_graph.astream(input={...}):
|
|
yield f"data: {json.dumps(chunk)}\n\n"
|
|
return StreamingResponse(stream_response(), media_type="text/event-stream")
|
|
```
|
|
|
|
### 5. **Job Queue Pattern**
|
|
|
|
For async background tasks (source processing), use Surreal-Commands job queue:
|
|
|
|
```python
|
|
# Submit job
|
|
command_id = await CommandService.submit_command_job(
|
|
app="open_notebook",
|
|
command="process_source",
|
|
input={...}
|
|
)
|
|
|
|
# Poll status
|
|
status = await source.get_status()
|
|
```
|
|
|
|
---

## Service Communication Patterns

### Frontend → API

1. **REST requests** (HTTP GET/POST/PUT/DELETE)
2. **JSON request/response bodies**
3. **Standard HTTP status codes** (200, 400, 404, 500)
4. **Optional streaming** (Server-Sent Events for long operations)

**Example**:

```typescript
// Frontend
const response = await fetch("http://localhost:5055/sources", {
  method: "POST",
  body: formData, // multipart/form-data for file upload
});
const source = await response.json();
```

### API → SurrealDB

1. **SurrealQL queries** (similar to SQL)
2. **Async driver** with connection pooling
3. **Type-safe record IDs** (record_id syntax)
4. **Transaction support** for multi-step operations

**Example**:

```python
# API
result = await repo_query(
    "SELECT * FROM source WHERE notebook = $notebook_id",
    {"notebook_id": ensure_record_id(notebook_id)}
)
```

### API → AI Providers (via Esperanto)

1. **Esperanto unified interface**
2. **Per-request provider override**
3. **Automatic fallback on failure**
4. **Token counting and cost estimation**

**Example**:

```python
# API
model = await provision_langchain_model(task="chat")
response = await model.ainvoke({"input": prompt})
```

### API → Job Queue (Surreal-Commands)

1. **Async job submission**
2. **Fire-and-forget pattern**
3. **Status polling via `/commands/{id}` endpoint**
4. **Job completion callbacks (optional)**

**Example**:

```python
# Submit async source processing
command_id = await CommandService.submit_command_job(...)

# Client polls status (client is an async HTTP client, e.g. httpx.AsyncClient)
response = await client.get(f"http://localhost:5055/commands/{command_id}")
status = response.json()  # {"status": "queued" | "running" | "completed" | "failed"}
```

---

## Database Schema Overview

### Core Schema Structure

**Tables** (20+):

- Notebooks (with soft-delete via `archived` flag)
- Sources (content + metadata)
- SourceEmbeddings (vector chunks)
- Notes (user-created + AI-generated)
- ChatSessions (conversation history)
- Transformations (custom rules)
- SourceInsights (transformation outputs)
- Relationships (notebook→source, notebook→note)

**Migrations**:

- Automatic on API startup
- Located in the `/migrations/` directory
- Numbered sequentially (001_*.surql, 002_*.surql, etc.)
- Tracked in the `_sbl_migrations` table
- Rollback via `_down.surql` files (manual)

### Relationship Model

**Graph Relationships**:

```
Notebook
   ← reference ← Source (many:many)
   ← artifact ← Note (many:many)

Source
   → source_embedding (one:many)
   → source_insight (one:many)
   → embedding (via source_embedding)

ChatSession
   → messages (JSON array in database)
   → notebook_id (reference to Notebook)

Transformation
   → source_insight (one:many)
```

**Query Example** (get all sources in a notebook with counts):

```sql
SELECT id, title,
    count(<-reference.in) as note_count,
    count(<-embedding.in) as embedded_chunks
FROM source
WHERE notebook = $notebook_id
ORDER BY updated DESC
```

---

## Key Architectural Decisions

### 1. **Async Throughout**

All I/O operations are non-blocking to maximize concurrency and responsiveness.

**Trade-off**: Slightly more complex code (async/await syntax) vs. high throughput.

### 2. **Multi-Provider from Day 1**

Built-in support for 8+ AI providers prevents vendor lock-in.

**Trade-off**: Added complexity in ModelManager vs. flexibility and cost optimization.

### 3. **Graph-First Workflows**

LangGraph state machines for complex multi-step operations (ask, chat, transformations).

**Trade-off**: Steeper learning curve vs. maintainable, debuggable workflows.

### 4. **Self-Hosted Database**

SurrealDB for graph + vector search in one system (no external dependencies).

**Trade-off**: Operational responsibility vs. simplified architecture and cost savings.

### 5. **Job Queue for Long-Running Tasks**

Async job submission (source processing, podcast generation) prevents request timeouts.

**Trade-off**: Eventual consistency vs. a responsive user experience.

---

## Important Quirks & Gotchas

### API Startup

- **Migrations run automatically** on every startup; check logs for errors
- **SurrealDB must be running** before starting the API (connection test in lifespan)
- **Auth middleware is basic** (password-only); upgrade to OAuth/JWT for production

### Database Operations

- **Record IDs use SurrealDB syntax** (table:id format, e.g., "notebook:abc123")
- **ensure_record_id()** helper prevents malformed IDs
- **Soft deletes** via `archived` field (data not removed, just marked inactive)
- **Timestamps in ISO 8601 format** (created, updated fields)

### LangGraph Workflows

- **State persistence** via SqliteSaver in `/data/sqlite-db/`
- **No built-in timeout**; long workflows may block requests (use streaming for UX)
- **Model fallback** automatic if primary provider unavailable
- **Checkpoint IDs** must be unique per session (avoid collisions)

### AI Provider Integration

- **Esperanto library** handles all provider APIs (no direct API calls)
- **Per-request override** via RunnableConfig (temporary, not persistent)
- **Cost estimation** via token counting (not 100% accurate, use for guidance)
- **Fallback logic** tries cheaper models if primary fails

### File Uploads

- **Stored in `/data/uploads/`** directory (not database)
- **Unique filename generation** prevents overwrites (counter suffix)
- **Content-core library** extracts text from 50+ file types
- **Large files** may block API briefly (sync content extraction)

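
The counter-suffix scheme for uploads can be sketched like this; the project's actual naming code may differ in detail:

```python
import tempfile
from pathlib import Path

def unique_filename(directory: Path, name: str) -> Path:
    """Append a counter suffix until the filename no longer collides."""
    candidate = directory / name
    counter = 1
    while candidate.exists():
        stem, suffix = Path(name).stem, Path(name).suffix
        candidate = directory / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate

with tempfile.TemporaryDirectory() as tmp:
    uploads = Path(tmp)
    (uploads / "paper.pdf").touch()    # existing upload
    (uploads / "paper_1.pdf").touch()  # first collision already taken
    result = unique_filename(uploads, "paper.pdf")
    print(result.name)  # → paper_2.pdf
```
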
---

## Performance Considerations

### Optimization Strategies

1. **Connection Pooling**: SurrealDB async driver with configurable pool size
2. **Query Caching**: TanStack Query on the frontend (client-side caching)
3. **Embedding Reuse**: Vector search uses pre-computed embeddings
4. **Chunking**: Sources split into chunks for better search relevance
5. **Async Operations**: Non-blocking I/O for high concurrency
6. **Lazy Loading**: Frontend requests only needed data (pagination)

### Bottlenecks

1. **LLM Calls**: Latency depends on provider (typically 1-30 seconds)
2. **Embedding Generation**: Time proportional to content size and provider
3. **Vector Search**: Similarity computation over all embeddings
4. **Content Extraction**: Sync operation in source processing

### Monitoring

- **API Logs**: Check Loguru output for errors and slow operations
- **Database Queries**: SurrealDB metrics available via admin UI
- **Token Usage**: Estimated via `estimate_tokens()` utility
- **Job Status**: Poll `/commands/{id}` for async operations

---

## Extension Points

### Adding a New Workflow

1. Create `open_notebook/graphs/workflow_name.py`
2. Define the StateDict and node functions
3. Build the graph with `.add_node()` / `.add_edge()`
4. Create a service in `api/workflow_service.py`
5. Register the router in `api/main.py`
6. Add tests in `tests/test_workflow.py`

### Adding a New Data Model

1. Create the model in `open_notebook/domain/model_name.py`
2. Inherit from BaseModel (domain object)
3. Implement `save()`, `get()`, `delete()` methods (CRUD)
4. Add repository functions if complex queries are needed
5. Create a database migration in `migrations/`
6. Add API routes and models in `api/`

### Adding a New AI Provider

1. Configure Esperanto for the new provider (see .env.example)
2. ModelManager automatically detects it via environment variables
3. Override via per-request config (no code changes needed)
4. Test the fallback logic if the provider is unavailable

---

## Deployment Considerations

### Development

- All services on localhost (3000, 5055, 8000)
- Auto-reload on file changes (Next.js, FastAPI)
- Hot-reload database migrations
- OpenAPI docs at http://localhost:5055/docs

### Production

- **Frontend**: Deploy to Vercel, Netlify, or Docker
- **API**: Docker container (see Dockerfile)
- **Database**: SurrealDB container or managed service
- **Environment**: Secure `.env` file with API keys
- **SSL/TLS**: Reverse proxy (Nginx, Cloudflare)
- **Rate Limiting**: Add at the proxy layer
- **Auth**: Replace PasswordAuthMiddleware with OAuth/JWT
- **Monitoring**: Log aggregation (CloudWatch, DataDog, etc.)

---

## Summary

Open Notebook's architecture provides a solid foundation for privacy-focused, AI-powered research. The separation of concerns (frontend/API/database), async-first design, and multi-provider flexibility enable rapid development and easy deployment. LangGraph workflows orchestrate complex AI tasks, while Esperanto abstracts provider details. The result is a scalable, maintainable system that puts users in control of their data and AI provider choice.