mirror of https://github.com/lfnovo/open-notebook.git
synced 2026-05-01 21:00:43 +00:00

docs: restructure documentation with new organized layout

- Replace old docs structure with new comprehensive documentation
- Organize into 8 major sections (0-START-HERE through 7-DEVELOPMENT)
- Convert CONFIGURATION.md, CONTRIBUTING.md, MAINTAINER_GUIDE.md to redirects
- Remove outdated MIGRATION.md and DESIGN_PRINCIPLES.md
- Fix all internal documentation links and cross-references
- Add progressive disclosure paths for different user types
- Include 44 focused guides covering all features
- Update README.md to remove v1.0 breaking changes notice

parent 71b8d13b24 · commit e13e4a2d8b
108 changed files with 16392 additions and 18153 deletions
docs/7-DEVELOPMENT/api-reference.md (new file, 211 lines)
# API Reference

Complete REST API for Open Notebook. All endpoints are served from the API backend (default: `http://localhost:5055`).

**Base URL**: `http://localhost:5055` (development) or environment-specific production URL

**Interactive Docs**: Use FastAPI's built-in Swagger UI at `http://localhost:5055/docs` for live testing and exploration. This is the primary reference for all endpoints, request/response schemas, and real-time testing.

---

## Quick Start

### 1. Authentication

Simple password-based (development only):

```bash
curl http://localhost:5055/notebooks \
  -H "X-Password: your_password"
```

**⚠️ Production**: Replace with OAuth/JWT. See CONFIGURATION.md for details.

### 2. Base API Flow

Most operations follow this pattern:
1. Create a **Notebook** (container for research)
2. Add **Sources** (PDFs, URLs, text)
3. Query via **Chat** or **Search**
4. View results and **Notes**

### 3. Testing Endpoints

Instead of memorizing endpoints, use the interactive API docs:
- Navigate to `http://localhost:5055/docs`
- Try requests directly in the browser
- See request/response schemas in real time
- Test with your own data

---

## API Endpoints Overview

### Main Resource Types

**Notebooks** - Research projects containing sources and notes
- `GET/POST /notebooks` - List and create
- `GET/PUT/DELETE /notebooks/{id}` - Read, update, delete
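
As a sketch of what these calls look like from code, here is a minimal Python helper that builds the create-notebook request; the helper name and the `description` default are illustrative, while the endpoint and `X-Password` header come from this document:

```python
import json
import urllib.request

def build_create_notebook_request(base_url: str, password: str,
                                  name: str, description: str = "") -> urllib.request.Request:
    """Build (but do not send) a POST /notebooks request with the password header."""
    payload = json.dumps({"name": name, "description": description}).encode()
    return urllib.request.Request(
        f"{base_url}/notebooks",
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json", "X-Password": password},
    )

req = build_create_notebook_request("http://localhost:5055", "secret", "My Research")
# Send with urllib.request.urlopen(req) once the API is running.
```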

**Sources** - Content items (PDFs, URLs, text)
- `GET/POST /sources` - List and add content
- `GET /sources/{id}` - Fetch source details
- `POST /sources/{id}/retry` - Retry failed processing
- `GET /sources/{id}/download` - Download original file

**Notes** - User-created or AI-generated research notes
- `GET/POST /notes` - List and create
- `GET/PUT/DELETE /notes/{id}` - Read, update, delete

**Chat** - Conversational AI interface
- `GET/POST /chat/sessions` - Manage chat sessions
- `POST /chat/execute` - Send message and get response
- `POST /chat/context/build` - Prepare context for chat

**Search** - Find content by text or semantic similarity
- `POST /search` - Full-text or vector search
- `POST /ask` - Ask a question (search + synthesize)

**Transformations** - Custom prompts for extracting insights
- `GET/POST /transformations` - Create custom extraction rules
- `POST /sources/{id}/insights` - Apply transformation to source

**Models** - Configure AI providers
- `GET /models` - Available models
- `GET /models/defaults` - Current defaults
- `POST /models/config` - Set defaults

**Health & Status**
- `GET /health` - Health check
- `GET /commands/{id}` - Track async operations

---

## Authentication

### Current (Development)

All requests require the password header:

```bash
curl -H "X-Password: your_password" http://localhost:5055/notebooks
```

The password is configured via the `ADMIN_PASSWORD` environment variable.

### Production

**⚠️ Not secure.** Replace with:
- OAuth 2.0 (recommended)
- JWT tokens
- API keys

See CONFIGURATION.md for production setup.

---

## Common Patterns

### Pagination

```bash
# List sources with limit/offset
curl 'http://localhost:5055/sources?limit=20&offset=10'
```
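
Client code typically walks all pages by advancing the offset until a short page comes back. A small Python sketch (the `fetch_page` callable stands in for the HTTP call, e.g. `GET /sources?limit=20&offset=40`):

```python
def iter_pages(fetch_page, limit=20):
    """Yield items across all pages using the limit/offset convention.

    fetch_page(limit, offset) must return the list of items for one page.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:  # a short page means we reached the end
            break
        offset += limit

# Example with an in-memory stand-in for the API:
data = list(range(45))
fake_fetch = lambda limit, offset: data[offset:offset + limit]
all_items = list(iter_pages(fake_fetch, limit=20))
```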

### Filtering & Sorting

```bash
# Filter by notebook, sort by date
curl 'http://localhost:5055/sources?notebook_id=notebook:abc&sort_by=created&sort_order=asc'
```

### Async Operations

Some operations (source processing, podcast generation) return immediately with a command ID:

```bash
# Submit async operation
curl -X POST http://localhost:5055/sources -F async_processing=true
# Response: {"id": "source:src001", "command_id": "command:cmd123"}

# Poll status
curl http://localhost:5055/commands/command:cmd123
```
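
The polling loop can be sketched in Python; the `fetch_status` callable stands in for `GET /commands/{id}`, and the status strings assumed here ("queued", "running", "completed", "failed") follow the command-status values mentioned later in these docs:

```python
import time

def wait_for_command(fetch_status, command_id, timeout=60.0, interval=2.0):
    """Poll an async command until it reaches a terminal status.

    fetch_status(command_id) should return the command's status string.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(command_id)
        if status in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"command {command_id} still {status} after {timeout}s")
        time.sleep(interval)

# Example with a stand-in that completes on the third poll:
responses = iter(["queued", "running", "completed"])
result = wait_for_command(lambda _id: next(responses), "command:cmd123", interval=0.01)
```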

### Streaming Responses

The `/ask` endpoint streams responses as Server-Sent Events:

```bash
curl -N 'http://localhost:5055/ask' \
  -H "Content-Type: application/json" \
  -H "X-Password: your_password" \
  -d '{"question": "What is AI?"}'

# Outputs: data: {"type":"strategy",...}
#          data: {"type":"answer",...}
#          data: {"type":"final_answer",...}
```
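
Each event arrives as a `data: ` line carrying a JSON payload, so a client only needs to split the stream and decode those lines. A minimal Python sketch (the sample payloads mirror the event types shown above):

```python
import json

def parse_sse_events(raw: str):
    """Split a raw SSE body into its JSON payloads (lines starting with 'data: ')."""
    events = []
    for line in raw.splitlines():
        if line.startswith("data: "):
            events.append(json.loads(line[len("data: "):]))
    return events

sample = 'data: {"type": "strategy"}\n\ndata: {"type": "final_answer", "answer": "..."}\n'
events = parse_sse_events(sample)
final = next(e for e in events if e["type"] == "final_answer")
```

In practice the stream would be read incrementally from the HTTP response rather than from a complete string.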

### Multipart File Upload

```bash
curl -X POST http://localhost:5055/sources \
  -F "type=upload" \
  -F "notebook_id=notebook:abc" \
  -F "file=@document.pdf"
```

---

## Error Handling

All errors return a JSON body with a `detail` message and an HTTP status code:

```json
{"detail": "Notebook not found"}
```

### Common Status Codes

| Code | Meaning | Example |
|------|---------|---------|
| 200 | Success | Operation completed |
| 400 | Bad Request | Invalid input |
| 404 | Not Found | Resource doesn't exist |
| 409 | Conflict | Resource already exists |
| 500 | Server Error | Database/processing error |
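
A client can turn this convention into exceptions with a small helper; the `APIError` class and `check_response` name are illustrative, while the `detail` field matches the error shape above:

```python
class APIError(Exception):
    """Raised when the API returns a non-2xx response."""
    def __init__(self, status: int, detail: str):
        super().__init__(f"HTTP {status}: {detail}")
        self.status = status
        self.detail = detail

def check_response(status: int, body: dict) -> dict:
    """Return the body on success; raise APIError with the `detail` field otherwise."""
    if 200 <= status < 300:
        return body
    raise APIError(status, body.get("detail", "unknown error"))

# A 404 body like the one above becomes an exception:
err = None
try:
    check_response(404, {"detail": "Notebook not found"})
except APIError as e:
    err = e
```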

---

## Tips for Developers

1. **Start with the interactive docs** (`http://localhost:5055/docs`) - this is the definitive reference
2. **Enable logging** for debugging (check API logs with `docker logs`)
3. **Streaming endpoints** require special handling (Server-Sent Events, not standard JSON)
4. **Async operations** return immediately; always poll status before assuming completion
5. **Vector search** requires an embedding model to be configured (check `/models`)
6. **Model overrides** are per-request; set them in the request body, not in config
7. **CORS is enabled** in development; configure it for production

---

## Learning Path

1. **Authentication**: Add the `X-Password` header to all requests
2. **Create a notebook**: `POST /notebooks` with name and description
3. **Add a source**: `POST /sources` with a file, URL, or text
4. **Query your content**: `POST /chat/execute` to ask questions
5. **Explore advanced features**: Search, transformations, streaming

---

## Production Considerations

- Replace password auth with OAuth/JWT
- Add rate limiting via reverse proxy (Nginx, CloudFlare, Kong)
- Enable CORS restrictions (currently all origins are allowed)
- Use HTTPS (reverse proxy + SSL cert)
- Set up an API versioning strategy (currently implicit)

See CONFIGURATION.md for complete production setup.
docs/7-DEVELOPMENT/architecture.md (new file, 854 lines)
# Open Notebook Architecture

## Overview

Open Notebook is built on a **three-tier, async-first architecture** designed for scalability, modularity, and multi-provider AI flexibility. The system separates concerns across frontend, API, and database layers, with LangGraph powering intelligent workflows and Esperanto enabling seamless integration with 8+ AI providers.

**Core Philosophy**:
- Privacy-first: Users control their data and AI provider choice
- Async/await throughout: Non-blocking operations for responsive UX
- Domain-Driven Design: Clear separation between domain models, repositories, and orchestrators
- Multi-provider flexibility: Swap AI providers without changing application code
- Self-hosted capable: All components deployable in isolated environments

---

## Three-Tier Architecture

### Layer 1: Frontend (React/Next.js @ port 3000)

**Purpose**: Responsive, interactive user interface for research, notes, chat, and podcast management.

**Technology Stack**:
- **Framework**: Next.js 15 with React 19
- **Language**: TypeScript with strict type checking
- **State Management**: Zustand (lightweight store) + TanStack Query (server state)
- **Styling**: Tailwind CSS + Shadcn/ui component library
- **Build Tool**: Webpack (bundled via Next.js)

**Key Responsibilities**:
- Render notebooks, sources, notes, chat sessions, and podcasts
- Handle user interactions (create, read, update, delete operations)
- Manage complex UI state (modals, file uploads, real-time search)
- Stream responses from the API (chat, podcast generation)
- Display embeddings, vector search results, and insights

**Communication Pattern**:
- All data fetched via REST API (async requests to port 5055)
- Configured base URL: `http://localhost:5055` (dev) or environment-specific (prod)
- TanStack Query handles caching, refetching, and data synchronization
- Zustand stores global state (user, notebooks, selected context)
- CORS enabled on the API side for cross-origin requests

**Component Architecture**:
- `/src/app/`: Next.js App Router (pages, layouts)
- `/src/components/`: Reusable React components (buttons, forms, cards)
- `/src/hooks/`: Custom hooks (useNotebook, useChat, useSearch)
- `/src/lib/`: Utility functions, API clients, validators
- `/src/styles/`: Global CSS, Tailwind config

---

### Layer 2: API (FastAPI @ port 5055)

**Purpose**: RESTful backend exposing operations on notebooks, sources, notes, chat sessions, and AI models.

**Technology Stack**:
- **Framework**: FastAPI 0.104+ (async Python web framework)
- **Language**: Python 3.11+
- **Validation**: Pydantic v2 (request/response schemas)
- **Logging**: Loguru (structured JSON logging)
- **Testing**: Pytest (unit and integration tests)

**Architecture**:
```
FastAPI App (main.py)
├── Routers (HTTP endpoints)
│   ├── routers/notebooks.py (CRUD operations)
│   ├── routers/sources.py (content ingestion, upload)
│   ├── routers/notes.py (note management)
│   ├── routers/chat.py (conversation sessions)
│   ├── routers/search.py (full-text + vector search)
│   ├── routers/transformations.py (custom transformations)
│   ├── routers/models.py (AI model configuration)
│   └── routers/*.py (11 additional routers)
│
├── Services (business logic)
│   ├── *_service.py (orchestration, graph invocation)
│   ├── command_service.py (async job submission)
│   └── middleware (auth, logging)
│
├── Models (Pydantic schemas)
│   └── models.py (validation, serialization)
│
└── Lifespan (startup/shutdown)
    └── AsyncMigrationManager (database schema migrations)
```

**Key Responsibilities**:
1. **HTTP Interface**: Accept REST requests, validate, return JSON responses
2. **Business Logic**: Orchestrate domain models, repository operations, and workflows
3. **Async Job Queue**: Submit long-running tasks (podcast generation, source processing)
4. **Database Migrations**: Run schema updates on startup
5. **Error Handling**: Catch exceptions, return appropriate HTTP status codes
6. **Logging**: Track operations for debugging and monitoring

**Startup Flow**:
1. Load `.env` environment variables
2. Initialize FastAPI app with CORS + auth middleware
3. Run AsyncMigrationManager (creates/updates database schema)
4. Register all routers (20+ endpoints)
5. Server ready on port 5055

**Request-Response Cycle**:
```
HTTP Request → Router → Service → Domain/Repository → SurrealDB
                           ↓
                  LangGraph (optional)
                           ↓
Response ← Pydantic serialization ← Service ← Result
```

---

### Layer 3: Database (SurrealDB @ port 8000)

**Purpose**: Graph database with built-in vector embeddings, semantic search, and relationship management.

**Technology Stack**:
- **Database**: SurrealDB (multi-model, ACID transactions)
- **Query Language**: SurrealQL (SQL-like syntax with graph operations)
- **Async Driver**: Async Python client (Rust-backed)
- **Migrations**: Manual `.surql` files in `/migrations/` (auto-run on API startup)

**Core Tables**:

| Table | Purpose | Key Fields |
|-------|---------|-----------|
| `notebook` | Research project container | id, name, description, archived, created, updated |
| `source` | Content item (PDF, URL, text) | id, title, full_text, topics, asset, created, updated |
| `source_embedding` | Vector embeddings for semantic search | id, source, embedding, chunk_text, chunk_index |
| `note` | User-created research notes | id, title, content, note_type (human/ai), created, updated |
| `chat_session` | Conversation session | id, notebook_id, title, messages (JSON), created, updated |
| `transformation` | Custom transformation rules | id, name, description, prompt, created, updated |
| `source_insight` | Transformation output | id, source_id, insight_type, content, created, updated |
| `reference` | Relationship: source → notebook | out (source), in (notebook) |
| `artifact` | Relationship: note → notebook | out (note), in (notebook) |

**Relationship Graph**:
```
Notebook
  ↓ (referenced_by)
Source
  ├→ SourceEmbedding (1:many for chunked text)
  ├→ SourceInsight (1:many for transformation outputs)
  └→ Note (via artifact relationship)
       ├→ Embedding (semantic search)
       └→ Topics (tags)

ChatSession
  ├→ Notebook
  └→ Messages (stored as JSON array)
```

**Vector Search Capability**:
- Embeddings stored natively in SurrealDB
- Full-text search on `source.full_text` and `note.content`
- Cosine similarity search on embedding vectors
- Semantic search integrates with the search endpoint
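
A cosine-similarity lookup over `source_embedding` can be sketched as a parameterized SurrealQL query built in Python. This is an illustration, not the project's actual query: the field names follow the tables above, and `vector::similarity::cosine` is SurrealDB's built-in similarity function (parameter support for `LIMIT` may vary by SurrealDB version):

```python
def build_similarity_query(embedding, limit=10):
    """Sketch of a SurrealQL cosine-similarity search over source_embedding."""
    query = (
        "SELECT chunk_text, source, "
        "vector::similarity::cosine(embedding, $embedding) AS score "
        "FROM source_embedding "
        "ORDER BY score DESC "
        "LIMIT $limit"
    )
    return query, {"embedding": embedding, "limit": limit}

query, params = build_similarity_query([0.1, 0.2, 0.3], limit=5)
# Execute with something like: await repo_query(query, params)
```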

**Connection Management**:
- Async connection pooling (configurable size)
- Transaction support for multi-record operations
- Schema auto-validation via migrations
- Query timeout protection (prevents runaway queries)

---

## Tech Stack Rationale

### Why Python + FastAPI?

**Python**:
- Rich AI/ML ecosystem (LangChain, LangGraph, transformers, scikit-learn)
- Rapid prototyping and deployment
- Extensive async support (asyncio, async/await)
- Strong type hints (Pydantic, mypy)

**FastAPI**:
- Modern, async-first framework
- Automatic OpenAPI documentation (Swagger UI @ /docs)
- Built-in request validation (Pydantic)
- Excellent performance (on par with Node.js and Go in benchmarks)
- Easy middleware/dependency injection

### Why Next.js + React + TypeScript?

**Next.js**:
- Full-stack React framework with SSR/SSG
- File-based routing (intuitive project structure)
- Built-in API routes (optional backend co-location)
- Optimized image handling and code splitting
- Easy deployment (Vercel, Docker, self-hosted)

**React 19**:
- Component-based UI (reusable, testable)
- Excellent tooling and community
- Client-side state management (Zustand)
- Server-side state sync (TanStack Query)

**TypeScript**:
- Type safety catches errors at compile time
- Better IDE autocomplete and refactoring
- Documentation via types (self-documenting code)
- Easier onboarding for new contributors

### Why SurrealDB?

**SurrealDB**:
- Native graph database (relationships are first-class)
- Built-in vector embeddings (no separate vector DB)
- ACID transactions (data consistency)
- Multi-model (relational + document + graph)
- Full-text search + semantic search in one query
- Self-hosted (unlike managed Pinecone/Weaviate)
- Flexible SurrealQL (SQL-like syntax)

**Alternative Considered**: PostgreSQL + pgvector (more mature, but vector search requires a separate extension)

### Why Esperanto for AI Providers?

**Esperanto Library**:
- Unified interface to 8+ LLM providers (OpenAI, Anthropic, Google, Groq, Ollama, Mistral, DeepSeek, xAI)
- Multi-provider embeddings (OpenAI, Google, Ollama, Mistral, Voyage)
- TTS/STT integration (OpenAI, Groq, ElevenLabs, Google)
- Smart provider selection (fallback logic, cost optimization)
- Per-request model override support
- Local Ollama support (completely self-hosted option)

**Alternative Considered**: LangChain's provider abstraction (more verbose, less flexible)

---

## LangGraph Workflows

LangGraph is a state machine library that orchestrates multi-step AI workflows. Open Notebook uses five core workflows:

### 1. **Source Processing Workflow** (`open_notebook/graphs/source.py`)

**Purpose**: Ingest content (PDF, URL, text) and prepare it for search and insights.

**Flow**:
```
Input (file/URL/text)
  ↓
Extract Content (content-core library)
  ↓
Clean & tokenize text
  ↓
Generate Embeddings (Esperanto)
  ↓
Create SourceEmbedding records (chunked + indexed)
  ↓
Extract Topics (LLM summarization)
  ↓
Save to SurrealDB
  ↓
Output (Source record with embeddings)
```

**State Dict**:
```python
{
    "content_state": {"file_path" | "url" | "content": str},
    "source_id": str,
    "full_text": str,
    "embeddings": List[Dict],
    "topics": List[str],
    "notebook_ids": List[str],
}
```
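
The node-by-node flow above can be illustrated with a plain-Python stand-in for the graph. This is not the LangGraph API and the node functions are hypothetical simplifications; it only shows the core idea that each node reads the state dict and returns a partial update:

```python
def run_workflow(state, nodes):
    """Tiny stand-in for a linear LangGraph workflow: each node takes the
    state dict and returns a partial update that is merged back in."""
    for node in nodes:
        state = {**state, **node(state)}
    return state

# Simplified, synchronous stand-ins for two source-processing nodes
def extract(state):
    return {"full_text": state["content_state"]["content"].strip()}

def chunk_and_embed(state):
    chunks = [state["full_text"][i:i + 100]
              for i in range(0, len(state["full_text"]), 100)]
    return {"embeddings": [{"chunk_index": i, "chunk_text": c}
                           for i, c in enumerate(chunks)]}

result = run_workflow(
    {"content_state": {"content": "  Some raw text to ingest.  "}},
    [extract, chunk_and_embed],
)
```

The real workflow is async and adds topic extraction and persistence, but the state-merging shape is the same.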

**Invoked By**: Sources API (`POST /sources`)

---

### 2. **Chat Workflow** (`open_notebook/graphs/chat.py`)

**Purpose**: Conduct multi-turn conversations with an AI model, referencing notebook context.

**Flow**:
```
User Message
  ↓
Build Context (selected sources/notes)
  ↓
Add Message to Session
  ↓
Create Chat Prompt (system + history + context)
  ↓
Call LLM (via Esperanto)
  ↓
Stream Response
  ↓
Save AI Message to ChatSession
  ↓
Output (complete message)
```

**State Dict**:
```python
{
    "session_id": str,
    "messages": List[BaseMessage],
    "context": Dict[str, Any],  # sources, notes, snippets
    "response": str,
    "model_override": Optional[str],
}
```

**Key Features**:
- Message history persisted in SurrealDB (SqliteSaver checkpoint)
- Context building via `build_context_for_chat()` utility
- Token counting to prevent context overflow
- Per-message model override support
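
The token-counting idea can be sketched as a budget-based history trim: keep the most recent messages whose combined token cost fits the model's context window. This is an illustration, not the project's implementation, and the whitespace split stands in for a real tokenizer:

```python
def trim_history(messages, budget, count_tokens=lambda m: len(m.split())):
    """Keep the most recent messages whose combined token cost fits the budget."""
    kept, total = [], 0
    for msg in reversed(messages):
        cost = count_tokens(msg)
        if total + cost > budget:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))

history = ["first question here", "a long older answer with many words", "latest message"]
trimmed = trim_history(history, budget=8)
```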

**Invoked By**: Chat API (`POST /chat/execute`)

---

### 3. **Ask Workflow** (`open_notebook/graphs/ask.py`)

**Purpose**: Answer user questions by searching sources and synthesizing responses.

**Flow**:
```
User Question
  ↓
Plan Search Strategy (LLM generates searches)
  ↓
Execute Searches (vector + text search)
  ↓
Score & Rank Results
  ↓
Provide Answers (LLM synthesizes from results)
  ↓
Stream Responses
  ↓
Output (final answer)
```

**State Dict**:
```python
{
    "question": str,
    "strategy": SearchStrategy,
    "answers": List[str],
    "final_answer": str,
    "sources_used": List[Source],
}
```

**Streaming**: Uses `astream()` to emit updates in real time (strategy → answers → final answer)

**Invoked By**: Search API (`POST /ask` with streaming)

---

### 4. **Transformation Workflow** (`open_notebook/graphs/transformation.py`)

**Purpose**: Apply custom transformations to sources (extract summaries, key points, etc.).

**Flow**:
```
Source + Transformation Rule
  ↓
Generate Prompt (Jinja2 template)
  ↓
Call LLM
  ↓
Parse Output
  ↓
Create SourceInsight record
  ↓
Output (insight with type + content)
```

**Example Transformations**:
- Summary (5-sentence overview)
- Key Points (bulleted list)
- Quotes (notable excerpts)
- Q&A (generated questions and answers)

**Invoked By**: Sources API (`POST /sources/{id}/insights`)

---

### 5. **Prompt Workflow** (`open_notebook/graphs/prompt.py`)

**Purpose**: Generic LLM task execution (e.g., auto-generate note titles, analyze content).

**Flow**:
```
Input Text + Prompt
  ↓
Call LLM (simple request-response)
  ↓
Output (completion)
```

**Used For**: Note title generation, content analysis, etc.

---

## AI Provider Integration Pattern

### ModelManager: Centralized Factory

Located in `open_notebook/ai/models.py`, ModelManager handles:

1. **Provider Detection**: Check environment variables for available providers
2. **Model Selection**: Choose the best model based on context size and task
3. **Fallback Logic**: If the primary provider is unavailable, try a backup
4. **Cost Optimization**: Prefer cheaper models for simple tasks
5. **Token Calculation**: Estimate cost before the LLM call

**Usage**:
```python
from open_notebook.ai.provision import provision_langchain_model

# Get the best LLM for the context size
model = await provision_langchain_model(
    task="chat",  # or "search", "extraction"
    model_override="anthropic/claude-opus-4",  # optional
    context_size=8000,  # estimated tokens
)

# Invoke model
response = await model.ainvoke({"input": prompt})
```

### Multi-Provider Support

**LLM Providers**:
- OpenAI (gpt-4, gpt-4-turbo, gpt-3.5-turbo)
- Anthropic (claude-opus, claude-sonnet, claude-haiku)
- Google (gemini-pro, gemini-1.5)
- Groq (mixtral, llama-2)
- Ollama (local models)
- Mistral (mistral-large, mistral-medium)
- DeepSeek (deepseek-chat)
- xAI (grok)

**Embedding Providers**:
- OpenAI (text-embedding-3-large, text-embedding-3-small)
- Google (embedding-001)
- Ollama (local embeddings)
- Mistral (mistral-embed)
- Voyage (voyage-large-2)

**TTS Providers**:
- OpenAI (tts-1, tts-1-hd)
- Groq (no TTS, fallback to OpenAI)
- ElevenLabs (multilingual voices)
- Google TTS (text-to-speech)

### Per-Request Override

Every LangGraph invocation accepts a `config` parameter to override models:

```python
result = await graph.ainvoke(
    input={...},
    config={
        "configurable": {
            "model_override": "anthropic/claude-opus-4"  # Use Claude instead
        }
    }
)
```

---

## Design Patterns

### 1. **Domain-Driven Design (DDD)**

**Domain Objects** (`open_notebook/domain/`):
- `Notebook`: Research container with relationships to sources/notes
- `Source`: Content item (PDF, URL, text) with embeddings
- `Note`: User-created or AI-generated research note
- `ChatSession`: Conversation history for a notebook
- `Transformation`: Custom rule for extracting insights

**Repository Pattern**:
- Database access layer (`open_notebook/database/repository.py`)
- `repo_query()`: Execute SurrealQL queries
- `repo_create()`: Insert records
- `repo_upsert()`: Merge records
- `repo_delete()`: Remove records

**Entity Methods**:
```python
# Domain methods (business logic)
notebook = await Notebook.get(id)
await notebook.save()
notes = await notebook.get_notes()
sources = await notebook.get_sources()
```

### 2. **Async-First Architecture**

**All I/O is async**:
- Database queries: `await repo_query(...)`
- LLM calls: `await model.ainvoke(...)`
- File I/O: `await upload_file.read()`
- Graph invocations: `await graph.ainvoke(...)`

**Benefits**:
- Non-blocking request handling (FastAPI serves multiple concurrent requests)
- Better resource utilization (I/O waiting doesn't block the CPU)
- Natural fit for Python async/await syntax

**Example**:
```python
@router.post("/sources")
async def create_source(source_data: SourceCreate):
    # All operations are non-blocking
    source = Source(title=source_data.title)
    await source.save()  # async database operation
    await graph.ainvoke({...})  # async LangGraph invocation
    return SourceResponse(...)
```

### 3. **Service Pattern**

Services orchestrate domain objects, repositories, and workflows:

```python
# api/notebook_service.py
class NotebookService:
    async def get_notebook_with_stats(self, notebook_id: str):
        notebook = await Notebook.get(notebook_id)
        sources = await notebook.get_sources()
        notes = await notebook.get_notes()
        return {
            "notebook": notebook,
            "source_count": len(sources),
            "note_count": len(notes),
        }
```

**Responsibilities**:
- Validate inputs (Pydantic)
- Orchestrate database operations
- Invoke workflows (LangGraph graphs)
- Handle errors and return appropriate status codes
- Log operations

### 4. **Streaming Pattern**

For long-running operations (ask workflow, podcast generation), stream results as Server-Sent Events:

```python
@router.post("/ask", response_class=StreamingResponse)
async def ask(request: AskRequest):
    async def stream_response():
        async for chunk in ask_graph.astream(input={...}):
            yield f"data: {json.dumps(chunk)}\n\n"
    return StreamingResponse(stream_response(), media_type="text/event-stream")
```

### 5. **Job Queue Pattern**

For async background tasks (source processing), use the Surreal-Commands job queue:

```python
# Submit job
command_id = await CommandService.submit_command_job(
    app="open_notebook",
    command="process_source",
    input={...}
)

# Poll status
status = await source.get_status()
```

---

## Service Communication Patterns

### Frontend → API

1. **REST requests** (HTTP GET/POST/PUT/DELETE)
2. **JSON request/response bodies**
3. **Standard HTTP status codes** (200, 400, 404, 500)
4. **Optional streaming** (Server-Sent Events for long operations)

**Example**:
```typescript
// Frontend
const response = await fetch("http://localhost:5055/sources", {
  method: "POST",
  body: formData, // multipart/form-data for file upload
});
const source = await response.json();
```

### API → SurrealDB

1. **SurrealQL queries** (similar to SQL)
2. **Async driver** with connection pooling
3. **Type-safe record IDs** (record_id syntax)
4. **Transaction support** for multi-step operations

**Example**:
```python
# API
result = await repo_query(
    "SELECT * FROM source WHERE notebook = $notebook_id",
    {"notebook_id": ensure_record_id(notebook_id)}
)
```

### API → AI Providers (via Esperanto)

1. **Esperanto unified interface**
2. **Per-request provider override**
3. **Automatic fallback on failure**
4. **Token counting and cost estimation**

**Example**:
```python
# API
model = await provision_langchain_model(task="chat")
response = await model.ainvoke({"input": prompt})
```

### API → Job Queue (Surreal-Commands)

1. **Async job submission**
2. **Fire-and-forget pattern**
3. **Status polling via the `/commands/{id}` endpoint**
4. **Job completion callbacks (optional)**

**Example**:
```python
# Submit async source processing
command_id = await CommandService.submit_command_job(...)

# Clients then poll GET /commands/{command_id} until the returned
# status is "completed" or "failed" (values: queued|running|completed|failed)
```

---

## Database Schema Overview

### Core Schema Structure

**Tables** (20+):
- Notebooks (with soft-delete via `archived` flag)
- Sources (content + metadata)
- SourceEmbeddings (vector chunks)
- Notes (user-created + AI-generated)
- ChatSessions (conversation history)
- Transformations (custom rules)
- SourceInsights (transformation outputs)
- Relationships (notebook→source, notebook→note)

**Migrations**:
- Automatic on API startup
- Located in the `/migrations/` directory
- Numbered sequentially (001_*.surql, 002_*.surql, etc.)
- Tracked in the `_sbl_migrations` table
- Rollback via `_down.surql` files (manual)
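
The migration-tracking convention above boils down to a simple diff between the files on disk and the names recorded in the tracking table. A small Python sketch (the function name is illustrative; the numbering convention matches the list above):

```python
import re

def pending_migrations(available, applied):
    """Return migration files not yet recorded, ordered by their numeric prefix.

    Mirrors the sequential-numbering convention (001_*.surql, 002_*.surql, ...).
    """
    def seq(name):
        match = re.match(r"(\d+)_", name)
        return int(match.group(1)) if match else float("inf")
    return sorted(set(available) - set(applied), key=seq)

todo = pending_migrations(
    ["001_init.surql", "002_sources.surql", "003_notes.surql"],
    ["001_init.surql"],
)
```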
|
||||
|
||||
### Relationship Model
|
||||
|
||||
**Graph Relationships**:
|
||||
```
|
||||
Notebook
|
||||
← reference ← Source (many:many)
|
||||
← artifact ← Note (many:many)
|
||||
|
||||
Source
|
||||
→ source_embedding (one:many)
|
||||
→ source_insight (one:many)
|
||||
→ embedding (via source_embedding)
|
||||
|
||||
ChatSession
|
||||
→ messages (JSON array in database)
|
||||
→ notebook_id (reference to Notebook)
|
||||
|
||||
Transformation
|
||||
→ source_insight (one:many)
|
||||
```
|
||||
|
||||
**Query Example** (get all sources in a notebook with counts):
|
||||
```sql
|
||||
SELECT id, title,
|
||||
count(<-reference.in) as note_count,
|
||||
count(<-embedding.in) as embedded_chunks
|
||||
FROM source
|
||||
WHERE notebook = $notebook_id
|
||||
ORDER BY updated DESC
|
||||
```

---

## Key Architectural Decisions

### 1. **Async Throughout**

All I/O operations are non-blocking to maximize concurrency and responsiveness.

**Trade-off**: Slightly more complex code (async/await syntax) vs. high throughput.
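
The throughput gain comes from overlapping waits rather than serializing them. A minimal, self-contained illustration (the sleeps stand in for database or provider calls; this is not project code):

```python
import asyncio
import time

async def fake_io(delay: float) -> float:
    await asyncio.sleep(delay)  # stands in for a DB query or LLM call
    return delay

async def main() -> float:
    start = time.monotonic()
    # Three 0.1s "I/O calls" run concurrently instead of back-to-back.
    await asyncio.gather(fake_io(0.1), fake_io(0.1), fake_io(0.1))
    return time.monotonic() - start

elapsed = asyncio.run(main())
print(f"elapsed: {elapsed:.2f}s")  # roughly 0.1s rather than ~0.3s
```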
### 2. **Multi-Provider from Day 1**

Built-in support for 8+ AI providers prevents vendor lock-in.

**Trade-off**: Added complexity in ModelManager vs. flexibility and cost optimization.

### 3. **Graph-First Workflows**

LangGraph state machines for complex multi-step operations (ask, chat, transformations).

**Trade-off**: Steeper learning curve vs. maintainable, debuggable workflows.

### 4. **Self-Hosted Database**

SurrealDB for graph + vector search in one system (no external dependencies).

**Trade-off**: Operational responsibility vs. simplified architecture and cost savings.

### 5. **Job Queue for Long-Running Tasks**

Async job submission (source processing, podcast generation) prevents request timeouts.

**Trade-off**: Eventual consistency vs. responsive user experience.

---

## Important Quirks & Gotchas

### API Startup

- **Migrations run automatically** on every startup; check logs for errors
- **SurrealDB must be running** before starting API (connection test in lifespan)
- **Auth middleware is basic** (password-only); upgrade to OAuth/JWT for production

### Database Operations

- **Record IDs use SurrealDB syntax** (table:id format, e.g., "notebook:abc123")
- **ensure_record_id()** helper prevents malformed IDs
- **Soft deletes** via `archived` field (data not removed, just marked inactive)
- **Timestamps in ISO 8601 format** (created, updated fields)
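
For illustration, the kind of guard `ensure_record_id()` provides might look as follows, assuming the `table:id` convention above (the real helper lives in the codebase and its signature and behavior may differ):

```python
def ensure_record_id(record_id: str, table: str) -> str:
    """Return a well-formed SurrealDB record ID ("table:id") for the given table.

    Illustrative sketch only: accepts a bare id or an already-prefixed id,
    and rejects ids that belong to a different table.
    """
    if ":" in record_id:
        prefix, _, ident = record_id.partition(":")
        if prefix != table or not ident:
            raise ValueError(f"malformed record id: {record_id!r}")
        return record_id
    return f"{table}:{record_id}"
```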

### LangGraph Workflows

- **State persistence** via SqliteSaver in `/data/sqlite-db/`
- **No built-in timeout**; long workflows may block requests (use streaming for UX)
- **Model fallback** automatic if primary provider unavailable
- **Checkpoint IDs** must be unique per session (avoid collisions)

### AI Provider Integration

- **Esperanto library** handles all provider APIs (no direct API calls)
- **Per-request override** via RunnableConfig (temporary, not persistent)
- **Cost estimation** via token counting (not 100% accurate, use for guidance)
- **Fallback logic** tries cheaper models if primary fails

### File Uploads

- **Stored in `/data/uploads/`** directory (not database)
- **Unique filename generation** prevents overwrites (counter suffix)
- **Content-core library** extracts text from 50+ file types
- **Large files** may block API briefly (sync content extraction)
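
The counter-suffix scheme can be sketched like this (an illustrative stand-in, not the project's actual implementation):

```python
from pathlib import Path

def unique_filename(directory: Path, name: str) -> Path:
    """Append a numeric counter suffix until the name no longer collides."""
    stem, suffix = Path(name).stem, Path(name).suffix
    candidate = directory / name
    counter = 1
    while candidate.exists():
        candidate = directory / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate
```

So a second upload of `report.pdf` would be stored as `report_1.pdf`, a third as `report_2.pdf`, and so on.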

---

## Performance Considerations

### Optimization Strategies

1. **Connection Pooling**: SurrealDB async driver with configurable pool size
2. **Query Caching**: TanStack Query on frontend (client-side caching)
3. **Embedding Reuse**: Vector search uses pre-computed embeddings
4. **Chunking**: Sources split into chunks for better search relevance
5. **Async Operations**: Non-blocking I/O for high concurrency
6. **Lazy Loading**: Frontend requests only needed data (pagination)
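
Chunking with overlap keeps adjacent sentences together in at least one chunk, which improves retrieval. A sketch of the idea (the chunk size, overlap, and splitting strategy Open Notebook actually uses may differ):

```python
def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character chunks with overlapping edges."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    step = size - overlap
    # Each chunk starts `step` characters after the previous one,
    # so consecutive chunks share `overlap` characters.
    return [text[start:start + size] for start in range(0, len(text), step)] or [""]
```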

### Bottlenecks

1. **LLM Calls**: Latency depends on provider (typically 1-30 seconds)
2. **Embedding Generation**: Time proportional to content size and provider
3. **Vector Search**: Similarity computation over all embeddings
4. **Content Extraction**: Sync operation in source processing
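
Vector search cost grows with the number of stored embeddings because each chunk is scored against the query embedding, typically with cosine similarity. The core computation looks like this (a plain-Python sketch; production code relies on the database's vector support rather than scoring in Python):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```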

### Monitoring

- **API Logs**: Check loguru output for errors and slow operations
- **Database Queries**: SurrealDB metrics available via admin UI
- **Token Usage**: Estimated via `estimate_tokens()` utility
- **Job Status**: Poll `/commands/{id}` for async operations
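
Token counts are estimates, not exact figures. A common rough rule is about four characters per token for English text; a sketch of such an estimator (the project's `estimate_tokens()` utility may use a different method, such as a real tokenizer):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters-per-token heuristic."""
    return max(1, round(len(text) / 4))
```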
---

## Extension Points

### Adding a New Workflow

1. Create `open_notebook/graphs/workflow_name.py`
2. Define StateDict and node functions
3. Build graph with `.add_node()` / `.add_edge()`
4. Create service in `api/workflow_service.py`
5. Register router in `api/main.py`
6. Add tests in `tests/test_workflow.py`

### Adding a New Data Model

1. Create model in `open_notebook/domain/model_name.py`
2. Inherit from BaseModel (domain object)
3. Implement `save()`, `get()`, `delete()` methods (CRUD)
4. Add repository functions if complex queries needed
5. Create database migration in `migrations/`
6. Add API routes and models in `api/`

### Adding a New AI Provider

1. Configure Esperanto for new provider (see .env.example)
2. ModelManager automatically detects via environment variables
3. Override via per-request config (no code changes needed)
4. Test fallback logic if provider unavailable

---

## Deployment Considerations

### Development

- All services on localhost (3000, 5055, 8000)
- Auto-reload on file changes (Next.js, FastAPI)
- Hot-reload database migrations
- Open API docs at http://localhost:5055/docs

### Production

- **Frontend**: Deploy to Vercel, Netlify, or Docker
- **API**: Docker container (see Dockerfile)
- **Database**: SurrealDB container or managed service
- **Environment**: Secure .env file with API keys
- **SSL/TLS**: Reverse proxy (Nginx, CloudFlare)
- **Rate Limiting**: Add at proxy layer
- **Auth**: Replace PasswordAuthMiddleware with OAuth/JWT
- **Monitoring**: Log aggregation (CloudWatch, DataDog, etc)

---

## Summary

Open Notebook's architecture provides a solid foundation for privacy-focused, AI-powered research. The separation of concerns (frontend/API/database), async-first design, and multi-provider flexibility enable rapid development and easy deployment. LangGraph workflows orchestrate complex AI tasks, while Esperanto abstracts provider details. The result is a scalable, maintainable system that puts users in control of their data and AI provider choice.

---

**File: `docs/7-DEVELOPMENT/code-standards.md`** (new file, 375 lines)
# Code Standards

This document outlines coding standards and best practices for Open Notebook contributions. All code should follow these guidelines to ensure consistency, readability, and maintainability.

## Python Standards

### Code Formatting

We follow **PEP 8** with some specific guidelines:

- Use **Ruff** for linting and formatting
- Maximum line length: **88 characters**
- Use **double quotes** for strings
- Use **trailing commas** in multi-line structures

### Type Hints

Always use type hints for function parameters and return values:

```python
from typing import List, Optional, Dict, Any
from pydantic import BaseModel

async def process_content(
    content: str,
    options: Optional[Dict[str, Any]] = None
) -> ProcessedContent:
    """Process content with optional configuration."""
    # Implementation
```

### Async/Await Patterns

Use async/await consistently throughout the codebase:

```python
# Good
async def fetch_data(url: str) -> Dict[str, Any]:
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return await response.json()

# Bad - mixing sync and async
def fetch_data(url: str) -> Dict[str, Any]:
    loop = asyncio.get_event_loop()
    return loop.run_until_complete(async_fetch(url))
```

### Error Handling

Use structured error handling with custom exceptions:

```python
from open_notebook.exceptions import DatabaseOperationError, InvalidInputError

async def create_notebook(name: str, description: str) -> Notebook:
    """Create a new notebook with validation."""
    if not name.strip():
        raise InvalidInputError("Notebook name cannot be empty")

    try:
        notebook = Notebook(name=name, description=description)
        await notebook.save()
        return notebook
    except Exception as e:
        raise DatabaseOperationError(f"Failed to create notebook: {str(e)}") from e
```

### Documentation (Google-style Docstrings)

Use Google-style docstrings for all functions, classes, and modules:

```python
async def vector_search(
    query: str,
    limit: int = 10,
    minimum_score: float = 0.2
) -> List[SearchResult]:
    """Perform vector search across embedded content.

    Args:
        query: Search query string
        limit: Maximum number of results to return
        minimum_score: Minimum similarity score for results

    Returns:
        List of search results sorted by relevance score

    Raises:
        InvalidInputError: If query is empty or limit is invalid
        DatabaseOperationError: If search operation fails
    """
    # Implementation
```

#### Module Docstrings

```python
"""
Notebook domain model and operations.

This module contains the core Notebook class and related operations for
managing research notebooks within the Open Notebook system.
"""
```

#### Class Docstrings

```python
class Notebook(BaseModel):
    """A research notebook containing sources, notes, and chat sessions.

    Notebooks are the primary organizational unit in Open Notebook, allowing
    users to group related research materials and maintain separate contexts
    for different projects.

    Attributes:
        name: The notebook's display name
        description: Optional description of the notebook's purpose
        archived: Whether the notebook is archived (default: False)
        created: Timestamp of creation
        updated: Timestamp of last update
    """
```

#### Function Docstrings

````python
async def create_notebook(
    name: str,
    description: str = "",
    user_id: Optional[str] = None
) -> Notebook:
    """Create a new notebook with validation.

    Args:
        name: The notebook name (required, non-empty)
        description: Optional notebook description
        user_id: Optional user ID for multi-user deployments

    Returns:
        The created notebook instance

    Raises:
        InvalidInputError: If name is empty or invalid
        DatabaseOperationError: If creation fails

    Example:
        ```python
        notebook = await create_notebook(
            name="AI Research",
            description="Research on AI applications"
        )
        ```
    """
````

## FastAPI Standards

### Router Organization

Organize endpoints by domain:

```python
# api/routers/notebooks.py
from fastapi import APIRouter, HTTPException, Query
from typing import List, Optional

router = APIRouter()

@router.get("/notebooks", response_model=List[NotebookResponse])
async def get_notebooks(
    archived: Optional[bool] = Query(None, description="Filter by archived status"),
    order_by: str = Query("updated desc", description="Order by field and direction"),
):
    """Get all notebooks with optional filtering and ordering."""
    # Implementation
```

### Request/Response Models

Use Pydantic models for validation:

```python
from pydantic import BaseModel, Field
from typing import Optional

class NotebookCreate(BaseModel):
    name: str = Field(..., description="Name of the notebook", min_length=1)
    description: str = Field(default="", description="Description of the notebook")

class NotebookResponse(BaseModel):
    id: str
    name: str
    description: str
    archived: bool
    created: str
    updated: str
```

### Error Handling

Use consistent error responses:

```python
from fastapi import HTTPException
from loguru import logger

try:
    result = await some_operation()
    return result
except InvalidInputError as e:
    raise HTTPException(status_code=400, detail=str(e))
except DatabaseOperationError as e:
    logger.error(f"Database error: {str(e)}")
    raise HTTPException(status_code=500, detail="Internal server error")
```

### API Documentation

Use FastAPI's automatic documentation features:

```python
@router.post(
    "/notebooks",
    response_model=NotebookResponse,
    summary="Create a new notebook",
    description="Create a new notebook with the specified name and description.",
    responses={
        201: {"description": "Notebook created successfully"},
        400: {"description": "Invalid input data"},
        500: {"description": "Internal server error"}
    }
)
async def create_notebook(notebook: NotebookCreate):
    """Create a new notebook."""
    # Implementation
```

## Database Standards

### SurrealDB Patterns

Use the repository pattern consistently:

```python
from open_notebook.database.repository import repo_create, repo_query, repo_update

# Create records
async def create_notebook(data: Dict[str, Any]) -> Dict[str, Any]:
    """Create a new notebook record."""
    return await repo_create("notebook", data)

# Query with parameters
async def find_notebooks_by_user(user_id: str) -> List[Dict[str, Any]]:
    """Find notebooks for a specific user."""
    return await repo_query(
        "SELECT * FROM notebook WHERE user_id = $user_id",
        {"user_id": user_id}
    )

# Update records
async def update_notebook(notebook_id: str, data: Dict[str, Any]) -> Dict[str, Any]:
    """Update a notebook record."""
    return await repo_update("notebook", notebook_id, data)
```

### Schema Management

Use migrations for schema changes:

```surrealql
-- migrations/8.surrealql
DEFINE TABLE IF NOT EXISTS new_feature SCHEMAFULL;
DEFINE FIELD IF NOT EXISTS name ON TABLE new_feature TYPE string;
DEFINE FIELD IF NOT EXISTS description ON TABLE new_feature TYPE option<string>;
DEFINE FIELD IF NOT EXISTS created ON TABLE new_feature TYPE datetime DEFAULT time::now();
DEFINE FIELD IF NOT EXISTS updated ON TABLE new_feature TYPE datetime DEFAULT time::now();
```

## TypeScript Standards

### Basic Guidelines

Follow TypeScript best practices:

- Use strict mode enabled in `tsconfig.json`
- Use proper type annotations for all variables and functions
- Avoid using `any` type unless absolutely necessary
- Use `interface` for object shapes, `type` for unions and other advanced types

### Component Structure

- Use functional components with hooks
- Keep components focused and single-responsibility
- Extract reusable logic into custom hooks
- Use proper TypeScript types for props

### Error Handling

- Handle errors explicitly
- Provide meaningful error messages
- Log errors appropriately
- Don't suppress errors silently
## Code Quality Tools

We use these tools to maintain code quality:

- **Ruff**: Linting and code formatting
  - Run with: `uv run ruff check . --fix`
  - Format with: `uv run ruff format .`

- **MyPy**: Static type checking
  - Run with: `uv run python -m mypy .`

- **Pytest**: Testing framework
  - Run with: `uv run pytest`

## Common Patterns

### Async Database Operations

```python
async def get_notebook_with_sources(notebook_id: str) -> Notebook:
    """Retrieve notebook with all related sources."""
    notebook_data = await repo_query(
        "SELECT * FROM notebook WHERE id = $id",
        {"id": notebook_id}
    )
    if not notebook_data:
        raise InvalidInputError(f"Notebook {notebook_id} not found")

    sources_data = await repo_query(
        "SELECT * FROM source WHERE notebook_id = $notebook_id",
        {"notebook_id": notebook_id}
    )

    return Notebook(
        **notebook_data[0],
        sources=[Source(**s) for s in sources_data]
    )
```

### Model Validation

```python
from pydantic import BaseModel, validator

class NotebookInput(BaseModel):
    name: str
    description: str = ""

    @validator('name')
    def name_not_empty(cls, v):
        if not v.strip():
            raise ValueError('Name cannot be empty')
        return v.strip()
```

## Code Review Checklist

Before submitting code for review, ensure:

- [ ] Code follows PEP 8 / TypeScript best practices
- [ ] Type hints are present for all functions
- [ ] Docstrings are complete and accurate
- [ ] Error handling is appropriate
- [ ] Tests are included and passing
- [ ] No debug code (console.logs, print statements) left behind
- [ ] Commit messages are clear and follow conventions
- [ ] Documentation is updated if needed

---

**See also:**
- [Testing Guide](testing.md) - How to write tests
- [Contributing Guide](contributing.md) - Overall contribution workflow

---

**File: `docs/7-DEVELOPMENT/contributing.md`** (new file, 201 lines)
# Contributing to Open Notebook

Thank you for your interest in contributing to Open Notebook! We welcome contributions from developers of all skill levels. This guide will help you understand our contribution workflow and what makes a good contribution.

## 🚨 Issue-First Workflow

**To maintain project coherence and avoid wasted effort, please follow this process:**

1. **Create an issue first** - Before writing any code, create an issue describing the bug or feature
2. **Propose your solution** - Explain how you plan to implement the fix or feature
3. **Wait for assignment** - A maintainer will review and assign the issue to you if approved
4. **Only then start coding** - This ensures your work aligns with the project's vision and architecture

**Why this process?**
- Prevents duplicate work
- Ensures solutions align with our architecture and design principles
- Saves your time by getting feedback before coding
- Helps maintainers manage the project direction

> ⚠️ **Pull requests without an assigned issue may be closed**, even if the code is good. We want to respect your time by making sure work is aligned before it starts.

## Code of Conduct

By participating in this project, you are expected to uphold our Code of Conduct. Be respectful, constructive, and collaborative.

## How Can I Contribute?

### Reporting Bugs

1. **Search existing issues** - Check if the bug was already reported
2. **Create a bug report** - Use the [Bug Report template](https://github.com/lfnovo/open-notebook/issues/new?template=bug_report.yml)
3. **Provide details** - Include:
   - Steps to reproduce
   - Expected vs actual behavior
   - Logs, screenshots, or error messages
   - Your environment (OS, Docker version, Open Notebook version)
4. **Indicate if you want to fix it** - Check the "I would like to work on this" box if you're interested

### Suggesting Features

1. **Search existing issues** - Check if the feature was already suggested
2. **Create a feature request** - Use the [Feature Request template](https://github.com/lfnovo/open-notebook/issues/new?template=feature_request.yml)
3. **Explain the value** - Describe why this feature would be helpful
4. **Propose implementation** - If you have ideas on how to implement it, share them
5. **Indicate if you want to build it** - Check the "I would like to work on this" box if you're interested

### Contributing Code (Pull Requests)

**IMPORTANT: Follow the issue-first workflow above before starting any PR**

Once your issue is assigned:

1. **Fork the repo** and create your branch from `main`
2. **Understand our vision and principles** - Read [design-principles.md](design-principles.md) to understand what guides our decisions
3. **Follow our architecture** - Refer to the architecture documentation to understand project structure
4. **Write quality code** - Follow the standards outlined in [code-standards.md](code-standards.md)
5. **Test your changes** - See [testing.md](testing.md) for test guidelines
6. **Update documentation** - If you changed functionality, update the relevant docs
7. **Create your PR**:
   - Reference the issue number (e.g., "Fixes #123")
   - Describe what changed and why
   - Include screenshots for UI changes
   - Keep PRs focused - one issue per PR

### What Makes a Good Contribution?

✅ **We love PRs that:**
- Solve a real problem described in an issue
- Follow our architecture and coding standards
- Include tests and documentation
- Are well-scoped (focused on one thing)
- Have clear commit messages

❌ **We may close PRs that:**
- Don't have an associated approved issue
- Introduce breaking changes without discussion
- Conflict with our architectural vision
- Lack tests or documentation
- Try to solve multiple unrelated problems
## Git Commit Messages

- Use the present tense ("Add feature" not "Added feature")
- Use the imperative mood ("Move cursor to..." not "Moves cursor to...")
- Limit the first line to 72 characters or less
- Reference issues and pull requests liberally after the first line

## Development Workflow

### Branch Strategy

We use a **feature branch workflow**:

1. **Main Branch**: `main` - production-ready code
2. **Feature Branches**: `feature/description` - new features
3. **Bug Fixes**: `fix/description` - bug fixes
4. **Documentation**: `docs/description` - documentation updates

### Making Changes

1. **Create a feature branch**:
   ```bash
   git checkout -b feature/amazing-new-feature
   ```

2. **Make your changes** following our coding standards

3. **Test your changes**:
   ```bash
   # Run tests
   uv run pytest

   # Run linting
   uv run ruff check .

   # Run formatting
   uv run ruff format .
   ```

4. **Commit your changes**:
   ```bash
   git add .
   git commit -m "feat: add amazing new feature"
   ```

5. **Push and create PR**:
   ```bash
   git push origin feature/amazing-new-feature
   # Then create a Pull Request on GitHub
   ```

### Keeping Your Fork Updated

```bash
# Fetch upstream changes
git fetch upstream

# Switch to main and merge
git checkout main
git merge upstream/main

# Push to your fork
git push origin main
```

## Pull Request Process

When you create a pull request:

1. **Link your issue** - Reference the issue number in PR description
2. **Describe your changes** - Explain what changed and why
3. **Provide test evidence** - Screenshots, test results, or logs
4. **Check PR template** - Ensure you've completed all required sections
5. **Wait for review** - A maintainer will review your PR within a week

### PR Review Expectations

- Code review feedback is about the code, not the person
- Be open to suggestions and alternative approaches
- Address review comments with clarity and respect
- Ask questions if feedback is unclear

## Current Priority Areas

We're actively looking for contributions in these areas:

1. **Frontend Enhancement** - Help improve the Next.js/React UI with real-time updates and better UX
2. **Testing** - Expand test coverage across all components
3. **Performance** - Async processing improvements and caching
4. **Documentation** - API examples and user guides
5. **Integrations** - New content sources and AI providers

## Getting Help

### Community Support

- **Discord**: [Join our Discord server](https://discord.gg/37XJPXfz2w) for real-time help
- **GitHub Discussions**: For longer-form questions and ideas
- **GitHub Issues**: For bug reports and feature requests

### Documentation References

- [Design Principles](design-principles.md) - Understanding our project vision
- [Code Standards](code-standards.md) - Coding guidelines by language
- [Testing Guide](testing.md) - How to write tests
- [Development Setup](development-setup.md) - Getting started locally

## Recognition

We recognize contributions through:

- **GitHub credits** on releases
- **Community recognition** in Discord
- **Contribution statistics** in project analytics
- **Maintainer consideration** for active contributors

---

Thank you for contributing to Open Notebook! Your contributions help make research more accessible and private for everyone.

For questions about this guide or contributing in general, please reach out on [Discord](https://discord.gg/37XJPXfz2w) or open a GitHub Discussion.

---

**File: `docs/7-DEVELOPMENT/design-principles.md`** (new file, 351 lines)
# Design Principles & Project Vision

This document outlines the core principles, vision, and design philosophy that guide Open Notebook's development. All contributors should read and understand these principles before proposing changes or new features.

## 🎯 Project Vision

Open Notebook aims to be a **privacy-focused, self-hosted alternative to Google's Notebook LM** that empowers users to:

1. **Own their research data** - Full control over where data lives and who can access it
2. **Choose their AI providers** - Freedom to use any AI provider or run models locally
3. **Customize their workflows** - Flexibility to adapt the tool to different research needs
4. **Access their work anywhere** - Through web UI, API, or integrations

### What Open Notebook IS

- A **research assistant** for managing and understanding content
- A **platform** that connects various AI providers
- A **privacy-first** tool that keeps your data under your control
- An **extensible system** with APIs and customization options

### What Open Notebook IS NOT

- A document editor (use Google Docs, Notion, etc. for that)
- A file storage system (use Dropbox, S3, etc. for that)
- A general-purpose chatbot (use ChatGPT, Claude, etc. for that)
- A replacement for your entire workflow (it's one tool in your toolkit)
## 🏗️ Core Design Principles

### 1. Privacy First

**Principle**: User data and research should stay under user control by default.

**In Practice**:
- Self-hosted deployment is the primary use case
- No telemetry or analytics without explicit opt-in
- No hard dependency on specific cloud services
- Clear documentation on what data goes where

**Example Decisions**:
- ✅ Support for local Ollama models
- ✅ Configurable AI provider selection
- ❌ Hard-coded cloud service integrations
- ❌ Required external service dependencies

### 2. Simplicity Over Features

**Principle**: The tool should be easy to understand and use, even if it means fewer features.

**In Practice**:
- Clear, focused UI with well-defined sections
- Sensible defaults that work for most users
- Advanced features hidden behind optional configuration
- Documentation written for non-technical users

**Example Decisions**:
- ✅ Three-column layout (Sources, Notes, Chat)
- ✅ Default models that work out of the box
- ❌ Overwhelming users with too many options upfront
- ❌ Complex multi-step workflows for basic tasks

### 3. API-First Architecture

**Principle**: All functionality should be accessible via API, not just the UI.

**In Practice**:
- UI calls the same API that external clients use
- Comprehensive REST API with OpenAPI documentation
- No "UI-only" features that can't be automated
- Clear separation between frontend and backend

**Example Decisions**:
- ✅ FastAPI backend with full API documentation
- ✅ Consistent API patterns across all endpoints
- ❌ Business logic in UI components
- ❌ Features that require direct database access

### 4. Multi-Provider Flexibility

**Principle**: Users should never be locked into a single AI provider.

**In Practice**:
- Support for multiple AI providers through Esperanto library
- Easy switching between providers and models
- Clear documentation on provider limitations
- Graceful degradation when providers are unavailable

**Example Decisions**:
- ✅ Support for 16+ AI providers
- ✅ Per-feature model selection (chat, embeddings, TTS)
- ❌ Features that only work with OpenAI
- ❌ Hard-coded API endpoints for specific providers
### 5. Extensibility Through Standards
|
||||
|
||||
**Principle**: The system should be extensible through well-defined interfaces, not by forking.
|
||||
|
||||
**In Practice**:
|
||||
- Plugin systems for transformations and commands
|
||||
- Standard data formats (JSON, Markdown)
|
||||
- Clear extension points in the architecture
|
||||
- Documentation for common customization scenarios
|
||||
|
||||
**Example Decisions**:
|
||||
- ✅ Custom transformation templates
|
||||
- ✅ Background command system
|
||||
- ✅ Jinja2 prompt templates
|
||||
- ❌ Hard-coded business logic without extension points
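Because prompt templates are plain Jinja2, extending them needs no fork. A minimal sketch of what a custom transformation prompt could look like; the variable names (`source_title`, `source_content`) are illustrative, not Open Notebook's actual template contract:

```python
from jinja2 import Template

# Hypothetical custom transformation template. The placeholders are
# assumptions for this sketch, not the real schema.
SUMMARY_TEMPLATE = Template(
    "Summarize the following source in three bullet points.\n"
    "Title: {{ source_title }}\n\n"
    "{{ source_content }}"
)

prompt = SUMMARY_TEMPLATE.render(
    source_title="Attention Is All You Need",
    source_content="The dominant sequence transduction models are based on...",
)
print(prompt.splitlines()[0])
```

Swapping the template string changes the transformation's behavior without touching application code.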

### 6. Async-First for Performance

**Principle**: Long-running operations should not block the user interface or API.

**In Practice**:

- Async/await patterns throughout the backend
- Background job processing for heavy workloads
- Status updates and progress tracking
- Graceful handling of slow AI provider responses

**Example Decisions**:

- ✅ AsyncIO for database operations
- ✅ Background commands for podcast generation
- ✅ Streaming responses for chat
- ❌ Synchronous blocking operations in API endpoints
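The pattern behind these decisions can be sketched with nothing but the standard library: submit a long job, return a job id immediately, and let the caller poll for status. The names (`JOBS`, `submit`) are illustrative, not Open Notebook's actual command system:

```python
import asyncio
import uuid

# In-memory status store standing in for the database.
JOBS: dict = {}

async def long_running_job(job_id: str) -> None:
    JOBS[job_id] = "running"
    await asyncio.sleep(0.01)  # stands in for podcast generation, etc.
    JOBS[job_id] = "done"

async def submit() -> str:
    job_id = uuid.uuid4().hex
    # Schedule the work without awaiting it, so the caller isn't blocked.
    asyncio.create_task(long_running_job(job_id))
    return job_id

async def main():
    job_id = await submit()
    # The task usually hasn't even started yet at this point.
    status_at_submit = JOBS.get(job_id, "pending")
    await asyncio.sleep(0.05)  # later, the caller polls
    return status_at_submit, JOBS[job_id]

at_submit, final_status = asyncio.run(main())
print(final_status)
```

The real system persists job status in SurrealDB instead of a dict, but the shape is the same: submit returns fast, status is polled.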

## 🎨 UI/UX Principles

### Focus on Content, Not Chrome

- Minimize UI clutter and distractions
- Content should occupy most of the screen space
- Controls appear when needed, not always visible
- Consistent layout across different views

### Progressive Disclosure

- Show simple options first, advanced options on demand
- Don't overwhelm new users with every possible setting
- Provide sensible defaults that work for 80% of use cases
- Make power features discoverable but not intrusive

### Responsive and Fast

- UI should feel instant for common operations
- Show loading states for operations that take time
- Cache and optimize where possible
- Degrade gracefully on slow connections

## 🔧 Technical Principles

### Clean Separation of Concerns

**Layers should not leak**:
- Frontend should not know about database structure
- API should not contain business logic (delegate to domain layer)
- Domain models should not know about HTTP requests
- Database layer should not know about AI providers

### Type Safety and Validation

**Catch errors early**:
- Use Pydantic models for all API boundaries
- Type hints throughout Python codebase
- TypeScript for frontend code
- Validate data at system boundaries
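For example, boundary validation with Pydantic might look like this (the model and field names are hypothetical, not Open Notebook's actual API schema):

```python
from pydantic import BaseModel, Field, ValidationError

# Hypothetical request model for creating a notebook; constraints are
# enforced at the API boundary, before any business logic runs.
class NotebookCreate(BaseModel):
    name: str = Field(min_length=1, max_length=200)
    description: str = ""

ok = NotebookCreate(name="Research: RAG papers")
print(ok.name)

try:
    NotebookCreate(name="")  # rejected at the boundary, not deep in the stack
except ValidationError:
    print("validation failed")
```

FastAPI turns such a `ValidationError` into a 422 response automatically, so handlers only ever see well-formed data.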

### Test What Matters

**Focus on valuable tests**:
- Test business logic and domain models
- Test API contracts and error handling
- Don't test framework code (FastAPI, React, etc.)
- Integration tests for critical workflows

### Database as Source of Truth

**SurrealDB is our single source of truth**:
- All state persisted in database
- No business logic in database layer
- Use SurrealDB features (record links, queries) appropriately
- Schema migrations for all schema changes
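Migrations are plain SurrealQL files. A hypothetical sketch of what one might contain; the table and field names are illustrative, not Open Notebook's actual schema:

```sql
-- Hypothetical migration: define a schemafull table with typed fields.
DEFINE TABLE notebook SCHEMAFULL;
DEFINE FIELD name ON notebook TYPE string;
DEFINE FIELD created ON notebook TYPE datetime DEFAULT time::now();
DEFINE INDEX notebook_name ON notebook COLUMNS name;
```

Because the schema is declared in migration files rather than in application code, every installation converges on the same structure at startup.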

## 🚫 Anti-Patterns to Avoid

### Feature Creep

**What it looks like**:
- Adding features because they're "cool" or "easy"
- Building features for edge cases before common cases work well
- Trying to be everything to everyone

**Why we avoid it**:
- Increases complexity and maintenance burden
- Makes the tool harder to learn and use
- Dilutes the core value proposition

**Instead**:
- Focus on core use cases
- Say no to features that don't align with vision
- Build extensibility points for edge cases

### Premature Optimization

**What it looks like**:
- Optimizing code before knowing if it's slow
- Complex caching strategies without measuring impact
- Trading code clarity for marginal performance gains

**Why we avoid it**:
- Makes code harder to understand and maintain
- Optimizes the wrong things
- Wastes development time

**Instead**:
- Measure first, optimize second
- Focus on algorithmic improvements
- Profile before making performance changes

### Over-Engineering

**What it looks like**:
- Building abstraction layers "in case we need them later"
- Implementing design patterns for 3-line functions
- Creating frameworks instead of solving problems

**Why we avoid it**:
- Increases cognitive load for contributors
- Makes simple changes require touching many files
- Hides the actual business logic

**Instead**:
- Start simple, refactor when patterns emerge
- Optimize for readability and clarity
- Use abstractions when they simplify, not complicate

### Breaking Changes Without Migration Path

**What it looks like**:
- Changing database schema without migration scripts
- Modifying API contracts without versioning
- Removing features without deprecation warnings

**Why we avoid it**:
- Breaks existing installations
- Frustrates users and contributors
- Creates maintenance nightmares

**Instead**:
- Always provide migration scripts for schema changes
- Deprecate before removing
- Document breaking changes clearly

## 🤝 Decision-Making Framework

When evaluating new features or changes, ask:

### 1. Does it align with our vision?
- Does it help users own their research data?
- Does it support privacy and self-hosting?
- Does it fit our core use cases?

### 2. Does it follow our principles?
- Is it simple to use and understand?
- Does it work via API?
- Does it support multiple providers?
- Can it be extended by users?

### 3. Is the implementation sound?
- Does it maintain separation of concerns?
- Is it properly typed and validated?
- Does it include tests?
- Is it documented?

### 4. What is the cost?
- How much complexity does it add?
- How much maintenance burden?
- Does it introduce new dependencies?
- Will it be used enough to justify the cost?

### 5. Are there alternatives?
- Can existing features solve this problem?
- Can this be built as a plugin or extension?
- Should this be a separate tool instead?

## 📚 Examples of Principle-Driven Decisions

### Why we migrated from Streamlit to Next.js

**Principle**: API-First Architecture

**Reasoning**:
- Streamlit coupled UI and backend logic
- Difficult to build external integrations
- Limited control over API behavior
- Next.js + FastAPI provides clear separation

### Why we use Esperanto for AI providers

**Principle**: Multi-Provider Flexibility

**Reasoning**:
- Abstracts provider-specific details
- Easy to add new providers
- Consistent interface across providers
- No vendor lock-in

### Why we have a Background Command System

**Principle**: Async-First for Performance

**Reasoning**:
- Podcast generation takes minutes
- Users shouldn't wait for long operations
- Need status tracking and error handling
- Supports future batch operations

### Why we support Local Ollama

**Principle**: Privacy First

**Reasoning**:
- Enables fully offline operation
- No data sent to external services
- Free for users after hardware cost
- Aligns with self-hosted philosophy

## 🔄 Evolution of Principles

These principles are not set in stone. As the project grows and we learn from users, some principles may evolve. However, changes to core principles should be:

1. **Well-justified** - Clear reasoning for why the change is needed
2. **Discussed openly** - Community input on major changes
3. **Documented** - Updated in this document with explanation
4. **Gradual** - Not implemented as breaking changes when possible

---

## For Contributors

When proposing a feature or change:

1. **Reference these principles** - Explain how your proposal aligns
2. **Identify trade-offs** - Be honest about what you're trading away
3. **Suggest alternatives** - Show you've considered other approaches
4. **Be open to feedback** - Maintainers may see concerns you don't

**Remember**: A "no" to a feature isn't a judgment on you or your idea. It means we're staying focused on our core vision. We appreciate all contributions and ideas!

---

**Questions about these principles?** Open a discussion on GitHub or join our [Discord](https://discord.gg/37XJPXfz2w).

---

**File:** `docs/7-DEVELOPMENT/development-setup.md` (new file, 409 lines)

# Local Development Setup

This guide walks you through setting up Open Notebook for local development. Follow these steps to get the full stack running on your machine.

## Prerequisites

Before you start, ensure you have the following installed:

- **Python 3.11+** - Check with: `python --version`
- **uv** (recommended) or **pip** - Install from: https://github.com/astral-sh/uv
- **SurrealDB** - Via Docker or binary (see below)
- **Docker** (optional) - For containerized database
- **Node.js 18+** (optional) - For frontend development
- **Git** - For version control

## Step 1: Clone and Initial Setup

```bash
# Clone your fork (or the main repository if you only need a local copy)
git clone https://github.com/<your-username>/open-notebook.git
cd open-notebook

# Add the main repository as "upstream" to keep your fork updated
git remote add upstream https://github.com/lfnovo/open-notebook.git
```

## Step 2: Install Python Dependencies

```bash
# Using uv (recommended)
uv sync

# Or using pip
pip install -e .
```

## Step 3: Environment Variables

Create a `.env` file in the project root with your configuration:

```bash
# Copy from example
cp .env.example .env
```

Edit `.env` with your settings:

```bash
# Database
SURREAL_URL=ws://localhost:8000/rpc
SURREAL_USER=root
SURREAL_PASSWORD=password
SURREAL_NAMESPACE=open_notebook
SURREAL_DATABASE=development

# AI Providers (add your API keys)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=AI...
GROQ_API_KEY=gsk-...

# Application
APP_PASSWORD=  # Optional password protection
DEBUG=true
LOG_LEVEL=DEBUG
```

### AI Provider Keys

You'll need at least one AI provider. Popular options:

- **OpenAI** - https://platform.openai.com/api-keys
- **Anthropic (Claude)** - https://console.anthropic.com/
- **Google** - https://ai.google.dev/
- **Groq** - https://console.groq.com/

For local development, you can also use:

- **Ollama** - Run locally without API keys (see "Local Ollama" below)

## Step 4: Start SurrealDB

### Option A: Using Docker (Recommended)

```bash
# Start SurrealDB in memory
docker run -d --name surrealdb -p 8000:8000 \
  surrealdb/surrealdb:v2 start \
  --user root --pass password \
  --bind 0.0.0.0:8000 memory

# Or with persistent storage
docker run -d --name surrealdb -p 8000:8000 \
  -v surrealdb_data:/data \
  surrealdb/surrealdb:v2 start \
  --user root --pass password \
  --bind 0.0.0.0:8000 file:/data/surreal.db
```

### Option B: Using Make

```bash
make database
```

### Option C: Using Docker Compose

```bash
docker compose up -d surrealdb
```

### Verify SurrealDB is Running

```bash
# Should show server information
curl http://localhost:8000/
```

## Step 5: Run Database Migrations

Database migrations run automatically when you start the API. The first startup will apply any pending migrations.

To verify migrations manually:

```bash
# API will run migrations on startup
uv run python -m api.main
```

Check the logs - you should see messages like:

```
Running migration 001_initial_schema
Running migration 002_add_vectors
...
Migrations completed successfully
```

## Step 6: Start the API Server

In a new terminal window:

```bash
# Terminal 2: Start API (port 5055)
uv run --env-file .env uvicorn api.main:app --host 0.0.0.0 --port 5055

# Or using the shortcut
make api
```

You should see:

```
INFO: Application startup complete
INFO: Uvicorn running on http://0.0.0.0:5055
```

### Verify API is Running

```bash
# Check health endpoint
curl http://localhost:5055/health

# View API documentation
open http://localhost:5055/docs
```

## Step 7: Start the Frontend (Optional)

If you want to work on the frontend, start Next.js in another terminal:

```bash
# Terminal 3: Start Next.js frontend (port 3000)
cd frontend
npm install  # First time only
npm run dev
```

You should see:

```
> next dev
  ▲ Next.js 15.x
  - Local: http://localhost:3000
```

### Access the Frontend

Open your browser to: http://localhost:3000

## Verification Checklist

After setup, verify everything is working:

- [ ] **SurrealDB**: `curl http://localhost:8000/` returns content
- [ ] **API**: `curl http://localhost:5055/health` returns `{"status": "ok"}`
- [ ] **API Docs**: `open http://localhost:5055/docs` works
- [ ] **Database**: API logs show migrations completing
- [ ] **Frontend** (optional): `http://localhost:3000` loads

## Starting Services Together

### Quick Start All Services

```bash
make start-all
```

This starts SurrealDB, API, and frontend in one command.

### Individual Terminals (Recommended for Development)

**Terminal 1 - Database:**
```bash
make database
```

**Terminal 2 - API:**
```bash
make api
```

**Terminal 3 - Frontend:**
```bash
cd frontend && npm run dev
```

## Development Tools Setup

### Pre-commit Hooks (Optional but Recommended)

Install git hooks to automatically check code quality:

```bash
uv run pre-commit install
```

Now your commits will be checked before they're made.

### Code Quality Commands

```bash
# Lint Python code (auto-fix)
make ruff
# or: ruff check . --fix

# Type check Python code
make lint
# or: uv run python -m mypy .

# Run tests
uv run pytest

# Run tests with coverage
uv run pytest --cov=open_notebook
```

## Common Development Tasks

### Running Tests

```bash
# Run all tests
uv run pytest

# Run specific test file
uv run pytest tests/test_notebooks.py

# Run with coverage report
uv run pytest --cov=open_notebook --cov-report=html
```

### Creating a Feature Branch

```bash
# Create and switch to new branch
git checkout -b feature/my-feature

# Make changes, then commit
git add .
git commit -m "feat: add my feature"

# Push to your fork
git push origin feature/my-feature
```

### Updating from Upstream

```bash
# Fetch latest changes
git fetch upstream

# Rebase your branch
git rebase upstream/main

# Push updated branch
git push origin feature/my-feature -f
```

## Troubleshooting

### "Connection refused" on SurrealDB

**Problem**: API can't connect to SurrealDB

**Solutions**:
1. Check if SurrealDB is running: `docker ps | grep surrealdb`
2. Verify URL in `.env`: Should be `ws://localhost:8000/rpc`
3. Recreate the container: `docker stop surrealdb && docker rm surrealdb`, then restart with: `docker run -d --name surrealdb -p 8000:8000 surrealdb/surrealdb:v2 start --user root --pass password --bind 0.0.0.0:8000 memory`

### "Address already in use"

**Problem**: Port 5055 or 3000 is already in use

**Solutions**:
```bash
# Find process using port
lsof -i :5055  # Check port 5055

# Kill process (macOS/Linux)
kill -9 <PID>

# Or use different port
uvicorn api.main:app --port 5056
```

### Module not found errors

**Problem**: Import errors when running API

**Solutions**:
```bash
# Reinstall dependencies
uv sync

# Or with pip
pip install -e .
```

### Database migration failures

**Problem**: API fails to start with migration errors

**Solutions**:
1. Check SurrealDB is running: `curl http://localhost:8000/`
2. Check credentials in `.env` match your SurrealDB setup
3. Check logs for specific migration error: `make api 2>&1 | grep -i migration`
4. Verify database exists: Check SurrealDB console at http://localhost:8000/

### Migrations not applying

**Problem**: Database schema seems outdated

**Solutions**:
1. Restart API - migrations run on startup: `make api`
2. Check logs show "Migrations completed successfully"
3. Verify `/migrations/` folder exists and has files
4. Check SurrealDB is writable and not in read-only mode

## Optional: Local Ollama Setup

For testing with local AI models:

```bash
# Install Ollama from https://ollama.ai

# Pull a model (e.g., Mistral 7B)
ollama pull mistral

# Add to .env
OLLAMA_BASE_URL=http://localhost:11434
```

Then in your code, you can use Ollama through the Esperanto library.
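Before wiring Ollama into the app, you can smoke-test the server directly over its REST API (`/api/generate`). This sketch is independent of Esperanto and uses only the standard library; the `RUN_OLLAMA_SMOKE_TEST` guard is just a convention for this snippet:

```python
import json
import os
import urllib.request

OLLAMA_BASE_URL = os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434")

def build_generate_payload(model: str, prompt: str) -> dict:
    # Ollama's /api/generate accepts a JSON body like this;
    # stream=False returns one JSON object instead of chunked output.
    return {"model": model, "prompt": prompt, "stream": False}

def smoke_test(model: str = "mistral") -> str:
    payload = build_generate_payload(model, "Say 'ok' and nothing else.")
    req = urllib.request.Request(
        f"{OLLAMA_BASE_URL}/api/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())["response"]

# Guarded so this file can be imported/run without a live Ollama server.
if __name__ == "__main__" and os.environ.get("RUN_OLLAMA_SMOKE_TEST"):
    print(smoke_test())
```

If the call succeeds, Esperanto-based code pointing at the same `OLLAMA_BASE_URL` should work too.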

## Optional: Docker Development Environment

Run the entire stack in Docker:

```bash
# Start all services
docker compose --profile multi up

# Logs
docker compose logs -f

# Stop services
docker compose down
```

## Next Steps

After setup is complete:

1. **Read the Contributing Guide** - [contributing.md](contributing.md)
2. **Explore the Architecture** - [architecture.md](architecture.md)
3. **Find an Issue** - Look for "good first issue" on GitHub
4. **Set Up Pre-commit** - Install git hooks for code quality
5. **Join Discord** - https://discord.gg/37XJPXfz2w

## Getting Help

If you get stuck:

- **Discord**: [Join our server](https://discord.gg/37XJPXfz2w) for real-time help
- **GitHub Issues**: Check existing issues for similar problems
- **GitHub Discussions**: Ask questions in discussions
- **Documentation**: See [code-standards.md](code-standards.md) and [testing.md](testing.md)

---

**Ready to contribute?** Go to [contributing.md](contributing.md) for the contribution workflow.

---

**File:** `docs/7-DEVELOPMENT/index.md` (new file, 96 lines)

# Development

Welcome to the Open Notebook development documentation! Whether you're contributing code, understanding our architecture, or maintaining the project, you'll find guidance here.

## 🎯 Pick Your Path

### 👨‍💻 I Want to Contribute Code

Start with **[Contributing Guide](contributing.md)** for the workflow, then check:

- **[Quick Start](quick-start.md)** - Clone, install, verify in 5 minutes
- **[Development Setup](development-setup.md)** - Complete local environment guide
- **[Code Standards](code-standards.md)** - How to write code that fits our style
- **[Testing](testing.md)** - How to write and run tests

**First time?** Check out our [Contributing Guide](contributing.md) for the issue-first workflow.

---

### 🏗️ I Want to Understand the Architecture

**[Architecture Overview](architecture.md)** covers:

- 3-tier system design
- Tech stack and rationale
- Key components and workflows
- Design patterns we use

For deeper dives, check `/open_notebook/` CLAUDE.md for component-specific guidance.

---

### 👨‍🔧 I'm a Maintainer

**[Maintainer Guide](maintainer-guide.md)** covers:

- Issue triage and management
- Pull request review process
- Communication templates
- Best practices

---

## 📚 Quick Links

| Document | For | Purpose |
|---|---|---|
| [Quick Start](quick-start.md) | New developers | Clone, install, and verify setup (5 min) |
| [Development Setup](development-setup.md) | Local development | Complete environment setup guide |
| [Contributing](contributing.md) | Code contributors | Workflow: issue → code → PR |
| [Code Standards](code-standards.md) | Writing code | Style guides for Python, FastAPI, DB |
| [Testing](testing.md) | Testing code | How to write and run tests |
| [Architecture](architecture.md) | Understanding system | System design, tech stack, workflows |
| [Design Principles](design-principles.md) | All developers | What guides our decisions |
| [API Reference](api-reference.md) | Building integrations | Complete REST API documentation |
| [Maintainer Guide](maintainer-guide.md) | Maintainers | Managing issues, PRs, releases |

---

## 🚀 Current Development Priorities

We're actively looking for help with:

1. **Frontend Enhancement** - Improve Next.js/React UI with real-time updates
2. **Performance** - Async processing and caching optimizations
3. **Testing** - Expand test coverage across components
4. **Documentation** - API examples and developer guides
5. **Integrations** - New content sources and AI providers

See GitHub Issues labeled `good first issue` or `help wanted`.

---

## 💬 Getting Help

- **Discord**: [Join our server](https://discord.gg/37XJPXfz2w) for real-time discussions
- **GitHub Discussions**: For architecture questions
- **GitHub Issues**: For bugs and features

Don't be shy! We're here to help new contributors succeed.

---

## 📖 Additional Resources

### External Documentation
- [FastAPI Docs](https://fastapi.tiangolo.com/)
- [SurrealDB Docs](https://surrealdb.com/docs)
- [LangChain Docs](https://python.langchain.com/)
- [Next.js Docs](https://nextjs.org/docs)

### Our Libraries
- [Esperanto](https://github.com/lfnovo/esperanto) - Multi-provider AI abstraction
- [Content Core](https://github.com/lfnovo/content-core) - Content processing
- [Podcast Creator](https://github.com/lfnovo/podcast-creator) - Podcast generation

---

Ready to get started? Head over to **[Quick Start](quick-start.md)**! 🎉

---

**File:** `docs/7-DEVELOPMENT/maintainer-guide.md` (new file, 408 lines)

# Maintainer Guide
|
||||
|
||||
This guide is for project maintainers to help manage contributions effectively while maintaining project quality and vision.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [Issue Management](#issue-management)
|
||||
- [Pull Request Review](#pull-request-review)
|
||||
- [Common Scenarios](#common-scenarios)
|
||||
- [Communication Templates](#communication-templates)
|
||||
|
||||
## Issue Management
|
||||
|
||||
### When a New Issue is Created
|
||||
|
||||
**1. Initial Triage** (within 24-48 hours)
|
||||
|
||||
- Add appropriate labels:
|
||||
- `bug`, `enhancement`, `documentation`, etc.
|
||||
- `good first issue` for beginner-friendly tasks
|
||||
- `needs-triage` until reviewed
|
||||
- `help wanted` if you'd welcome community contributions
|
||||
|
||||
- Quick assessment:
|
||||
- Is it clear and well-described?
|
||||
- Is it aligned with project vision? (See [design-principles.md](design-principles.md))
|
||||
- Does it duplicate an existing issue?
|
||||
|
||||
**2. Initial Response**
|
||||
|
||||
```markdown
|
||||
Thanks for opening this issue! We'll review it and get back to you soon.
|
||||
|
||||
[If it's a bug] In the meantime, have you checked our troubleshooting guide?
|
||||
|
||||
[If it's a feature] You might find our [design principles](design-principles.md) helpful for understanding what we're building toward.
|
||||
```
|
||||
|
||||
**3. Decision Making**
|
||||
|
||||
Ask yourself:
|
||||
- Does this align with our [design principles](design-principles.md)?
|
||||
- Is this something we want in the core project, or better as a plugin/extension?
|
||||
- Do we have the capacity to support this feature long-term?
|
||||
- Will this benefit most users, or just a specific use case?
|
||||
|
||||
**4. Issue Assignment**
|
||||
|
||||
If the contributor checked "I am a developer and would like to work on this":
|
||||
|
||||
**For Accepted Issues:**
|
||||
```markdown
|
||||
Great idea! This aligns well with our goals, particularly [specific design principle].
|
||||
|
||||
I see you'd like to work on this. Before you start:
|
||||
|
||||
1. Please share your proposed approach/solution
|
||||
2. Review our [Contributing Guide](contributing.md) and [Design Principles](design-principles.md)
|
||||
3. Once we agree on the approach, I'll assign this to you
|
||||
|
||||
Looking forward to your thoughts!
|
||||
```
|
||||
|
||||
**For Issues Needing Clarification:**
|
||||
```markdown
|
||||
Thanks for offering to work on this! Before we proceed, we need to clarify a few things:
|
||||
|
||||
1. [Question 1]
|
||||
2. [Question 2]
|
||||
|
||||
Once we have these details, we can discuss the best approach.
|
||||
```
|
||||
|
||||
**For Issues Not Aligned with Vision:**
|
||||
```markdown
|
||||
Thank you for the suggestion and for offering to work on this!
|
||||
|
||||
After reviewing against our [design principles](design-principles.md), we've decided not to pursue this in the core project because [specific reason].
|
||||
|
||||
However, you might be able to achieve this through [alternative approach, if applicable].
|
||||
|
||||
We appreciate your interest in contributing! Feel free to check out our [open issues](link) for other ways to contribute.
|
||||
```
|
||||
|
||||
### Labels to Use
|
||||
|
||||
**Priority:**
|
||||
- `priority: critical` - Security issues, data loss bugs
|
||||
- `priority: high` - Major functionality broken
|
||||
- `priority: medium` - Annoying bugs, useful features
|
||||
- `priority: low` - Nice to have, edge cases
|
||||
|
||||
**Status:**
|
||||
- `needs-triage` - Not yet reviewed by maintainer
|
||||
- `needs-info` - Waiting for more information from reporter
|
||||
- `needs-discussion` - Requires community/team discussion
|
||||
- `ready` - Approved and ready to be worked on
|
||||
- `in-progress` - Someone is actively working on this
|
||||
- `blocked` - Cannot proceed due to external dependency
|
||||
|
||||
**Type:**
|
||||
- `bug` - Something is broken
|
||||
- `enhancement` - New feature or improvement
|
||||
- `documentation` - Documentation improvements
|
||||
- `question` - General questions
|
||||
- `refactor` - Code cleanup/restructuring
|
||||
|
||||
**Difficulty:**
|
||||
- `good first issue` - Good for newcomers
|
||||
- `help wanted` - Community contributions welcome
|
||||
- `advanced` - Requires deep codebase knowledge
|
||||
|
||||
## Pull Request Review
|
||||
|
||||
### Initial PR Review Checklist
|
||||
|
||||
**Before diving into code:**
|
||||
|
||||
- [ ] Is there an associated approved issue?
|
- [ ] Does the PR reference the issue number?
- [ ] Is the PR description clear about what changed and why?
- [ ] Did the contributor check the relevant boxes in the PR template?
- [ ] Are there tests? Screenshots (for UI changes)?

**Red Flags** (may require closing the PR):

- No associated issue
- Issue was not assigned to contributor
- PR tries to solve multiple unrelated problems
- Breaking changes without discussion
- Conflicts with project vision

### Code Review Process

**1. High-Level Review**

- Does the approach align with our architecture?
- Is the solution appropriately scoped?
- Are there simpler alternatives?
- Does it follow our design principles?

**2. Code Quality Review**

Python:
- [ ] Follows PEP 8
- [ ] Has type hints
- [ ] Has docstrings
- [ ] Proper error handling
- [ ] No security vulnerabilities

TypeScript/Frontend:
- [ ] Follows TypeScript best practices
- [ ] Proper component structure
- [ ] No console.logs left in production code
- [ ] Accessible UI components

**3. Testing Review**

- [ ] Has appropriate test coverage
- [ ] Tests are meaningful (not just for coverage percentage)
- [ ] Tests pass locally and in CI
- [ ] Edge cases are tested

**4. Documentation Review**

- [ ] Code is well-commented
- [ ] Complex logic is explained
- [ ] User-facing documentation updated (if applicable)
- [ ] API documentation updated (if API changed)
- [ ] Migration guide provided (if breaking change)

### Providing Feedback

**Positive Feedback** (important!):
```markdown
Thanks for this PR! I really like [specific thing they did well].

[Feedback on what needs to change]
```

**Requesting Changes:**
```markdown
This is a great start! A few things to address:

1. **[High-level concern]**: [Explanation and suggested approach]
2. **[Code quality issue]**: [Specific example and fix]
3. **[Testing gap]**: [What scenarios need coverage]

Let me know if you have questions about any of this!
```

**Suggesting an Alternative Approach:**
```markdown
I appreciate the effort you put into this! However, I'm concerned about [specific issue].

Have you considered [alternative approach]? It might be better because [reasons].

What do you think?
```
## Common Scenarios

### Scenario 1: Good Code, Wrong Approach

**Situation**: Contributor wrote quality code, but solved the problem in a way that doesn't fit our architecture.

**Response:**
```markdown
Thank you for this PR! The code quality is great, and I can see you put thought into this.

However, I'm concerned that this approach [specific architectural concern]. In our architecture, we [explain the pattern we follow].

Would you be open to refactoring this to [suggested approach]? I'm happy to provide guidance on the specifics.

Alternatively, if you don't have time for a refactor, I can take over and finish this up (with credit to you, of course).

Let me know what you prefer!
```

### Scenario 2: PR Without an Assigned Issue

**Situation**: Contributor submitted a PR without going through the issue approval process.

**Response:**
```markdown
Thanks for the PR! I appreciate you taking the time to contribute.

However, to maintain project coherence, we require all PRs to be linked to an approved issue that was assigned to the contributor. This is explained in our [Contributing Guide](contributing.md).

This helps us:
- Ensure work aligns with project vision
- Prevent duplicate efforts
- Discuss approach before implementation

Could you please:
1. Create an issue describing this change
2. Wait for it to be reviewed and assigned to you
3. We can then reopen this PR or you can create a new one

Sorry for the inconvenience - this process helps us manage the project effectively.
```

### Scenario 3: Feature Request Not Aligned with Vision

**Situation**: Well-intentioned feature that doesn't fit project goals.

**Response:**
```markdown
Thank you for this suggestion! I can see how this would be useful for [specific use case].

After reviewing against our [design principles](design-principles.md), we've decided not to include this in the core project because [specific reason - e.g., "it conflicts with our 'Simplicity Over Features' principle" or "it would require dependencies that conflict with our privacy-first approach"].

Some alternatives:
- [If applicable] This could be built as a plugin/extension
- [If applicable] This functionality might be achievable through [existing feature]
- [If applicable] You might be interested in [other tool] which is designed for this use case

We appreciate your contribution and hope you understand. Feel free to check our roadmap or open issues for other ways to contribute!
```

### Scenario 4: Contributor Ghosts After Feedback

**Situation**: You requested changes, but the contributor hasn't responded in 2+ weeks.

**After 2 weeks:**
```markdown
Hey there! Just checking in on this PR. Do you have time to address the feedback, or would you like someone else to take over?

No pressure either way - just want to make sure this doesn't fall through the cracks.
```

**After 1 month with no response:**
```markdown
Thanks again for starting this work! Since we haven't heard back, I'm going to close this PR for now.

If you want to pick this up again in the future, feel free to reopen it or create a new PR. Alternatively, I'll mark the issue as available for someone else to work on.

We appreciate your contribution!
```

Then:
- Close the PR
- Unassign the issue
- Add the `help wanted` label to the issue
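If you use the GitHub CLI, this cleanup can be done from the terminal. A possible sequence (the PR/issue numbers and username here are placeholders, not real references):

```bash
# Close the stale PR with a friendly note
gh pr close 123 --comment "Closing due to inactivity - feel free to reopen anytime!"

# Free up the issue for someone else
gh issue edit 100 --remove-assignee contributor-username --add-label "help wanted"
```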
### Scenario 5: Breaking Changes Without Discussion

**Situation**: PR introduces breaking changes that weren't discussed.

**Response:**
```markdown
Thanks for this PR! However, I notice this introduces breaking changes that weren't discussed in the original issue.

Breaking changes require:
1. Prior discussion and approval
2. Migration guide for users
3. Deprecation period (when possible)
4. Clear documentation of the change

Could we discuss the breaking changes first? Specifically:
- [What breaks and why]
- [Who will be affected]
- [Migration path]

We may need to adjust the approach to minimize impact on existing users.
```

## Communication Templates

### Closing a PR (Misaligned with Vision)

```markdown
Thank you for taking the time to contribute! We really appreciate it.

After careful review, we've decided not to merge this PR because [specific reason related to design principles].

This isn't a reflection on your code quality - it's about maintaining focus on our core goals as outlined in [design-principles.md](design-principles.md).

We'd love to have you contribute in other ways! Check out:
- Good first issues
- Help wanted issues
- Our roadmap

Thanks again for your interest in Open Notebook!
```

### Closing a Stale Issue

```markdown
We're closing this issue due to inactivity. If this is still relevant, feel free to reopen it with updated information.

Thanks!
```

### Asking for More Information

```markdown
Thanks for reporting this! To help us investigate, could you provide:

1. [Specific information needed]
2. [Logs, screenshots, etc.]
3. [Steps to reproduce]

This will help us understand the issue better and find a solution.
```

### Thanking a Contributor

```markdown
Merged!

Thank you so much for this contribution, @username! [Specific thing they did well].

This will be included in the next release.
```
## Best Practices

### Be Kind and Respectful

- Thank contributors for their time and effort
- Assume good intentions
- Be patient with newcomers
- Explain *why*, not just *what*

### Be Clear and Direct

- Don't leave ambiguity about next steps
- Be specific about what needs to change
- Explain architectural decisions
- Set clear expectations

### Be Consistent

- Apply the same standards to all contributors
- Follow the process you've defined
- Document decisions for future reference

### Be Protective of Project Vision

- It's okay to say "no"
- Prioritize long-term maintainability
- Don't accept features you can't support
- Keep the project focused

### Be Responsive

- Respond to issues within 48 hours (even just to acknowledge)
- Review PRs within a week when possible
- Keep contributors updated on status
- Close stale issues/PRs to keep things tidy

## When in Doubt

Ask yourself:
1. Does this align with our [design principles](design-principles.md)?
2. Will we be able to maintain this feature long-term?
3. Does this benefit most users, or just an edge case?
4. Is there a simpler alternative?
5. Would I want to support this in 2 years?

If you're unsure, it's perfectly fine to:
- Ask for input from other maintainers
- Start a discussion issue
- Sleep on it before making a decision

---

**Remember**: Good maintainership is about balancing openness to contributions with protection of project vision. You're not being mean by saying "no" to things that don't fit - you're being a responsible steward of the project.
128
docs/7-DEVELOPMENT/quick-start.md
Normal file
# Quick Start - Development

Get Open Notebook running locally in 5 minutes.

## Prerequisites

- **Python 3.11+**
- **Git**
- **uv** (package manager) - install with `curl -LsSf https://astral.sh/uv/install.sh | sh`
- **Docker** (optional, for SurrealDB)

## 1. Clone the Repository (2 min)

```bash
# Fork the repository on GitHub first, then clone your fork
git clone https://github.com/YOUR_USERNAME/open-notebook.git
cd open-notebook

# Add upstream remote for updates
git remote add upstream https://github.com/lfnovo/open-notebook.git
```
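With the upstream remote in place, you can later pull the project's newest commits into your fork. A typical sync flow (adjust the branch name if your fork's default branch isn't `main`):

```bash
# Sync your fork's main branch with upstream
git fetch upstream
git checkout main
git merge upstream/main
git push origin main
```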
## 2. Install Dependencies (2 min)

```bash
# Install Python dependencies
uv sync

# Verify uv is working
uv --version
```

## 3. Start Services (1 min)

In separate terminal windows:

```bash
# Terminal 1: Start SurrealDB (database)
make database
# or: docker run -d --name surrealdb -p 8000:8000 surrealdb/surrealdb:v2 start --user root --pass password --bind 0.0.0.0:8000 memory

# Terminal 2: Start API (backend on port 5055)
make api
# or: uv run --env-file .env uvicorn api.main:app --host 0.0.0.0 --port 5055

# Terminal 3: Start Frontend (UI on port 3000)
cd frontend && npm run dev
```

## 4. Verify Everything Works (instant)

- **API Health**: http://localhost:5055/health → should return `{"status": "ok"}`
- **API Docs**: http://localhost:5055/docs → interactive API documentation
- **Frontend**: http://localhost:3000 → Open Notebook UI

**All three show up?** ✅ You're ready to develop!

---

## Next Steps

- **First issue?** Pick a [good first issue](https://github.com/lfnovo/open-notebook/issues?q=label%3A%22good+first+issue%22)
- **Understand the code?** Read the [Architecture Overview](architecture.md)
- **Making changes?** Follow the [Contributing Guide](contributing.md)
- **Setup details?** See [Development Setup](development-setup.md)

---
## Troubleshooting

### "Port 5055 already in use"
```bash
# Find what's using the port
lsof -i :5055

# Use a different port
uv run uvicorn api.main:app --port 5056
```

### "Can't connect to SurrealDB"
```bash
# Check if SurrealDB is running
docker ps | grep surrealdb

# Restart it
make database
```

### "Python version is too old"
```bash
# Check your Python version
python --version  # Should be 3.11+

# Use Python 3.11 specifically
uv sync --python 3.11
```

### "npm: command not found"
```bash
# Install Node.js from https://nodejs.org/
# Then install frontend dependencies
cd frontend && npm install
```

---

## Common Development Commands

```bash
# Run tests
uv run pytest

# Format code
make ruff

# Type checking
make lint

# Run the full stack
make start-all

# View API documentation
open http://localhost:5055/docs
```

---

Need more help? See [Development Setup](development-setup.md) for details or join our [Discord](https://discord.gg/37XJPXfz2w).
423
docs/7-DEVELOPMENT/testing.md
Normal file
# Testing Guide

This document provides guidelines for writing tests in Open Notebook. Testing is critical to maintaining code quality and preventing regressions.

## Testing Philosophy

### What to Test

Focus on testing the things that matter most:

- **Business Logic** - Core domain models and their operations
- **API Contracts** - HTTP endpoint behavior and error handling
- **Critical Workflows** - End-to-end flows that users depend on
- **Data Persistence** - Database operations and data integrity
- **Error Conditions** - How the system handles failures gracefully

### What NOT to Test

Don't waste time testing framework code:

- Framework functionality (FastAPI, React, etc.)
- Third-party library implementation details
- Simple getters/setters without logic
- View/presentation layer rendering (unless it contains logic)
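As a concrete illustration of the distinction: a small pure function with real logic is a prime unit-test target, while a plain attribute passthrough is not. This sketch uses a hypothetical `slugify` helper, not an actual function from the Open Notebook codebase:

```python
import re


def slugify(name: str) -> str:
    """Business logic worth testing: lowercase a notebook name and
    collapse runs of non-alphanumeric characters into single hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    if not slug:
        raise ValueError("name must contain at least one alphanumeric character")
    return slug


# The behavior - including edge cases like punctuation and whitespace -
# is what deserves tests; a getter that merely returns `self.name` does not.
print(slugify("My Research Notebook!"))  # my-research-notebook
```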
## Test Structure

We use **pytest** with async support for all Python tests:

```python
import pytest
from httpx import AsyncClient
from open_notebook.domain.notebook import Notebook


@pytest.mark.asyncio
async def test_create_notebook():
    """Test notebook creation."""
    notebook = Notebook(name="Test Notebook", description="Test description")
    await notebook.save()

    assert notebook.id is not None
    assert notebook.name == "Test Notebook"
    assert notebook.created is not None


@pytest.mark.asyncio
async def test_api_create_notebook():
    """Test notebook creation via API."""
    async with AsyncClient(app=app, base_url="http://test") as client:
        response = await client.post(
            "/api/notebooks",
            json={"name": "Test Notebook", "description": "Test description"}
        )
        assert response.status_code == 200
        data = response.json()
        assert data["name"] == "Test Notebook"
```
## Test Categories

### 1. Unit Tests

Test individual functions and methods in isolation:

```python
@pytest.mark.asyncio
async def test_notebook_validation():
    """Test that notebook name validation works."""
    with pytest.raises(InvalidInputError):
        Notebook(name="", description="test")


@pytest.mark.asyncio
async def test_notebook_archive():
    """Test notebook archiving."""
    notebook = Notebook(name="Test", description="")
    notebook.archive()
    assert notebook.archived is True
```

**Location**: `tests/unit/`

### 2. Integration Tests

Test component interactions and database operations:

```python
@pytest.mark.asyncio
async def test_create_notebook_with_sources():
    """Test creating a notebook and adding sources."""
    notebook = await create_notebook(name="Research", description="")
    source = await add_source(notebook_id=notebook.id, url="https://example.com")

    retrieved = await get_notebook_with_sources(notebook.id)
    assert len(retrieved.sources) == 1
    assert retrieved.sources[0].id == source.id
```

**Location**: `tests/integration/`

### 3. API Tests

Test HTTP endpoints and error responses:

```python
@pytest.mark.asyncio
async def test_get_notebooks_endpoint():
    """Test GET /notebooks endpoint."""
    async with AsyncClient(app=app, base_url="http://test") as client:
        response = await client.get("/api/notebooks")
        assert response.status_code == 200
        data = response.json()
        assert isinstance(data, list)


@pytest.mark.asyncio
async def test_create_notebook_validation():
    """Test that invalid input is rejected."""
    async with AsyncClient(app=app, base_url="http://test") as client:
        response = await client.post(
            "/api/notebooks",
            json={"name": "", "description": ""}
        )
        assert response.status_code == 400
```

**Location**: `tests/api/`

### 4. Database Tests

Test data persistence and query correctness:

```python
@pytest.mark.asyncio
async def test_save_and_retrieve_notebook():
    """Test saving and retrieving a notebook from the database."""
    notebook = Notebook(name="Test", description="desc")
    await notebook.save()

    retrieved = await Notebook.get(notebook.id)
    assert retrieved.name == "Test"
    assert retrieved.description == "desc"


@pytest.mark.asyncio
async def test_query_by_criteria():
    """Test querying notebooks by criteria."""
    await create_notebook("Active", "")
    await create_notebook("Archived", "")

    active = await repo_query(
        "SELECT * FROM notebook WHERE archived = false"
    )
    assert len(active) >= 1
```

**Location**: `tests/database/`
## Running Tests

### Run All Tests

```bash
uv run pytest
```

### Run a Specific Test File

```bash
uv run pytest tests/test_notebooks.py
```

### Run a Specific Test Function

```bash
uv run pytest tests/test_notebooks.py::test_create_notebook
```

### Run with a Coverage Report

```bash
uv run pytest --cov=open_notebook
```

### Run Only Unit Tests

```bash
uv run pytest tests/unit/
```

### Run Only Integration Tests

```bash
uv run pytest tests/integration/
```

### Run Tests in Verbose Mode

```bash
uv run pytest -v
```

### Run Tests with Output

```bash
uv run pytest -s
```
## Test Fixtures

Use pytest fixtures for common setup and teardown:

```python
import pytest


@pytest.fixture
async def test_notebook():
    """Create a test notebook."""
    notebook = Notebook(name="Test Notebook", description="Test description")
    await notebook.save()
    yield notebook
    await notebook.delete()


@pytest.fixture
async def api_client():
    """Create an API test client."""
    async with AsyncClient(app=app, base_url="http://test") as client:
        yield client


@pytest.fixture
async def test_notebook_with_sources(test_notebook):
    """Create a test notebook with sample sources."""
    source1 = Source(notebook_id=test_notebook.id, url="https://example.com")
    source2 = Source(notebook_id=test_notebook.id, url="https://example.org")
    await source1.save()
    await source2.save()

    test_notebook.sources = [source1, source2]
    yield test_notebook

    # Cleanup
    await source1.delete()
    await source2.delete()
```
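The setup/teardown split around `yield` follows the same pattern as `contextlib.contextmanager`: everything before `yield` is setup, everything after is teardown. This standalone sketch (no pytest or database needed) shows the ordering:

```python
from contextlib import contextmanager

events = []


@contextmanager
def managed_notebook():
    events.append("setup")       # runs before the body, like `await notebook.save()`
    yield {"name": "Test Notebook"}
    events.append("teardown")    # runs after the body, like `await notebook.delete()`


with managed_notebook() as nb:
    events.append(f"use {nb['name']}")

print(events)  # ['setup', 'use Test Notebook', 'teardown']
```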
## Best Practices

### 1. Write Descriptive Test Names

```python
# Good - clearly describes what is being tested
async def test_create_notebook_with_valid_name_succeeds():
    ...


# Bad - vague about what's being tested
async def test_notebook():
    ...
```

### 2. Use Docstrings

```python
@pytest.mark.asyncio
async def test_vector_search_returns_sorted_results():
    """Test that vector search results are sorted by relevance score."""
    # Implementation
```

### 3. Test Edge Cases

```python
@pytest.mark.asyncio
async def test_search_with_empty_query():
    """Test that empty query raises error."""
    with pytest.raises(InvalidInputError):
        await vector_search("")


@pytest.mark.asyncio
async def test_search_with_very_long_query():
    """Test that very long query is handled."""
    long_query = "x" * 10000
    results = await vector_search(long_query)
    assert isinstance(results, list)


@pytest.mark.asyncio
async def test_search_with_special_characters():
    """Test that special characters are handled."""
    results = await vector_search("@#$%^&*()")
    assert isinstance(results, list)
```
### 4. Use Assertions Effectively

```python
# Good - specific assertions
assert notebook.name == "Test"
assert len(notebook.sources) == 3
assert notebook.created is not None

# Less good - too broad
assert notebook is not None
assert notebook  # ambiguous what's being tested
```

### 5. Test Both Success and Failure Cases

```python
@pytest.mark.asyncio
async def test_create_notebook_success():
    """Test successful notebook creation."""
    notebook = await create_notebook(name="Research", description="AI")
    assert notebook.id is not None
    assert notebook.name == "Research"


@pytest.mark.asyncio
async def test_create_notebook_empty_name_fails():
    """Test that empty name raises error."""
    with pytest.raises(InvalidInputError):
        await create_notebook(name="", description="")


@pytest.mark.asyncio
async def test_create_notebook_duplicate_fails():
    """Test that duplicate names are handled."""
    await create_notebook(name="Research", description="")
    with pytest.raises(DuplicateError):
        await create_notebook(name="Research", description="")
```
### 6. Keep Tests Independent

```python
# Good - test is self-contained
@pytest.mark.asyncio
async def test_archive_notebook():
    notebook = Notebook(name="Test", description="")
    await notebook.save()
    await notebook.archive()
    assert notebook.archived is True


# Bad - depends on another test's state
@pytest.mark.asyncio
async def test_archive_existing_notebook():
    # Assumes test_create_notebook ran first
    await notebook.archive()  # notebook undefined
```

### 7. Use Fixtures for Reusable Setup

```python
# Instead of repeating setup:
@pytest.fixture
async def client_with_auth(api_client, mock_auth):
    """Client with authentication set up."""
    api_client.headers.update({"Authorization": f"Bearer {mock_auth.token}"})
    yield api_client


@pytest.mark.asyncio
async def test_protected_endpoint(client_with_auth):
    """Test protected endpoint."""
    response = await client_with_auth.get("/api/protected")
    assert response.status_code == 200
```
## Coverage Goals

- Aim for 70%+ overall coverage
- 90%+ coverage for critical business logic
- Don't obsess over 100% - focus on meaningful tests
- Use the `--cov` flag to check coverage: `uv run pytest --cov=open_notebook`
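These thresholds can also be enforced in CI with pytest-cov's `--cov-fail-under` flag, which makes the run fail when coverage drops below the floor:

```bash
# Fail the test run if overall coverage is below 70%
uv run pytest --cov=open_notebook --cov-fail-under=70 --cov-report=term-missing
```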
## Async Test Patterns

### Testing Async Functions

```python
@pytest.mark.asyncio
async def test_async_operation():
    """Test async function."""
    result = await some_async_function()
    assert result is not None
```

### Testing Concurrent Operations

```python
@pytest.mark.asyncio
async def test_concurrent_notebook_creation():
    """Test creating multiple notebooks concurrently."""
    tasks = [
        create_notebook(f"Notebook {i}", "")
        for i in range(10)
    ]
    notebooks = await asyncio.gather(*tasks)
    assert len(notebooks) == 10
    assert all(n.id for n in notebooks)
```
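The gather pattern can be exercised on its own, independent of the application code. Here is a self-contained version with a stand-in coroutine replacing `create_notebook` (which needs a database):

```python
import asyncio


async def fake_create_notebook(name: str) -> dict:
    # Stand-in for the real async factory; sleep(0) yields control
    # to the event loop the way real I/O would.
    await asyncio.sleep(0)
    return {"id": name.lower().replace(" ", "-"), "name": name}


async def main() -> list[dict]:
    tasks = [fake_create_notebook(f"Notebook {i}") for i in range(10)]
    # gather runs all tasks concurrently and preserves input order
    return await asyncio.gather(*tasks)


notebooks = asyncio.run(main())
print(len(notebooks))   # 10
print(notebooks[0])     # {'id': 'notebook-0', 'name': 'Notebook 0'}
```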
## Common Testing Errors

### Error: "event loop is closed"

Solution: use the async fixture properly:
```python
@pytest.fixture
async def notebook():  # Use async fixture
    notebook = Notebook(name="Test", description="")
    await notebook.save()
    yield notebook
    await notebook.delete()
```

### Error: "object is not awaitable"

Solution: make sure you're using `await`:
```python
# Wrong
result = create_notebook("Test", "")

# Right
result = await create_notebook("Test", "")
```
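This mistake is easy to reproduce in isolation: calling an async function without `await` returns a coroutine object rather than its result (the `create_thing` helper below is a made-up example):

```python
import asyncio


async def create_thing(name: str) -> str:
    return f"created {name}"


wrong = create_thing("Test")               # a coroutine object, not a string
right = asyncio.run(create_thing("Test"))  # the actual result

print(type(wrong).__name__)  # coroutine
print(right)                 # created Test
wrong.close()  # close the never-awaited coroutine to suppress the RuntimeWarning
```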
---

**See also:**
- [Code Standards](code-standards.md) - Code formatting and style
- [Contributing Guide](contributing.md) - Overall contribution workflow