diff --git a/.gitignore b/.gitignore index 999338d..628a4cb 100644 --- a/.gitignore +++ b/.gitignore @@ -133,4 +133,8 @@ doc_exports/ specs/ .claude -.playwright-mcp/ \ No newline at end of file +.playwright-mcp/ + + + +**/*.local.md \ No newline at end of file diff --git a/CLAUDE.md b/CLAUDE.md index f9df7b2..9892d74 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -1,3 +1,352 @@ +# Open Notebook - Root CLAUDE.md -We have a good amount of documentation on this project on the ./docs folder. Please read through them when necessary, and always review the docs/index.md file before starting a new feature so you know at least which docs are available. +This file provides architectural guidance for contributors working on Open Notebook at the project level. +## Project Overview + +**Open Notebook** is an open-source, privacy-focused alternative to Google's Notebook LM. It's an AI-powered research assistant enabling users to upload multi-modal content (PDFs, audio, video, web pages), generate intelligent notes, search semantically, chat with AI models, and produce professional podcasts—all with complete control over data and choice of AI providers. + +**Key Values**: Privacy-first, multi-provider AI support, fully self-hosted option, open-source transparency. 
+ +--- + +## Three-Tier Architecture + +``` +┌─────────────────────────────────────────────────────────┐ +│ Frontend (React/Next.js) │ +│ frontend/ @ port 3000 │ +├─────────────────────────────────────────────────────────┤ +│ - Notebooks, sources, notes, chat, podcasts, search UI │ +│ - Zustand state management, TanStack Query (React Query)│ +│ - Shadcn/ui component library with Tailwind CSS │ +└────────────────────────┬────────────────────────────────┘ + │ HTTP REST +┌────────────────────────▼────────────────────────────────┐ +│ API (FastAPI) │ +│ api/ @ port 5055 │ +├─────────────────────────────────────────────────────────┤ +│ - REST endpoints for notebooks, sources, notes, chat │ +│ - LangGraph workflow orchestration │ +│ - Job queue for async operations (podcasts) │ +│ - Multi-provider AI provisioning via Esperanto │ +└────────────────────────┬────────────────────────────────┘ + │ SurrealQL +┌────────────────────────▼────────────────────────────────┐ +│ Database (SurrealDB) │ +│ Graph database @ port 8000 │ +├─────────────────────────────────────────────────────────┤ +│ - Records: Notebook, Source, Note, ChatSession, etc. 
│ +│ - Relationships: source-to-notebook, note-to-source │ +│ - Vector embeddings for semantic search │ +└─────────────────────────────────────────────────────────┘ +``` + +--- + +## Tech Stack + +### Frontend (`frontend/`) +- **Framework**: Next.js 15 (React 19) +- **Language**: TypeScript +- **State Management**: Zustand +- **Data Fetching**: TanStack Query (React Query) +- **Styling**: Tailwind CSS + Shadcn/ui +- **Build Tool**: Webpack (via Next.js) + +### API Backend (`api/` + `open_notebook/`) +- **Framework**: FastAPI 0.104+ +- **Language**: Python 3.11+ +- **Workflows**: LangGraph state machines +- **Database**: SurrealDB async driver +- **AI Providers**: Esperanto library (8+ providers: OpenAI, Anthropic, Google, Groq, Ollama, Mistral, DeepSeek, xAI) +- **Job Queue**: Surreal-Commands for async jobs (podcasts) +- **Logging**: Loguru +- **Validation**: Pydantic v2 +- **Testing**: Pytest + +### Database +- **SurrealDB**: Graph database with built-in embedding storage and vector search +- **Schema Migrations**: Automatic on API startup via AsyncMigrationManager + +### Additional Services +- **Content Processing**: content-core library (file/URL extraction) +- **Prompts**: AI-Prompter with Jinja2 templating +- **Podcast Generation**: podcast-creator library +- **Embeddings**: Multi-provider via Esperanto + +--- + +## Directory Structure + +``` +open-notebook/ +├── frontend/ # React/Next.js UI +│ ├── src/ +│ │ ├── app/ # Next.js app router +│ │ ├── components/ # React components +│ │ ├── hooks/ # Custom React hooks +│ │ ├── lib/ # Utilities +│ │ └── styles/ # Global styles +│ ├── package.json # Node dependencies +│ └── CLAUDE.md # Frontend-specific guidance +│ +├── api/ # FastAPI REST layer +│ ├── routers/ # HTTP endpoints +│ ├── services/ # Business logic +│ ├── models.py # Request/response schemas +│ ├── main.py # FastAPI app + lifespan +│ └── CLAUDE.md # API-specific guidance +│ +├── open_notebook/ # Python backend core (domain + workflows) +│ ├── domain/ # 
Data models (Notebook, Source, Note, etc.)
+│ ├── database/ # SurrealDB async repository & migrations
+│ ├── graphs/ # LangGraph workflows (chat, ask, source)
+│ ├── ai/ # ModelManager, AI provider provisioning
+│ ├── utils/ # Context builders, token utils
+│ ├── podcasts/ # Podcast models & generation
+│ ├── config.py # Configuration & paths
+│ ├── exceptions.py # Error hierarchy
+│ └── CLAUDE.md # Backend core guidance
+│
+├── prompts/ # Jinja2 prompt templates
+│ ├── chat/ # Chat prompt templates
+│ ├── ask/ # Search/synthesis prompts
+│ ├── podcast/ # Podcast outline & transcript
+│ └── source_chat/ # Source-specific chat
+│
+├── migrations/ # SurrealDB schema migrations
+│ ├── 001_*.surql # Initial schema
+│ └── ...
+│
+├── tests/ # Python unit & integration tests
+│ ├── test_domain.py
+│ ├── test_graphs.py
+│ └── conftest.py
+│
+├── commands/ # CLI utilities
+├── docs/ # User & deployment documentation
+├── scripts/ # Utility scripts
+├── setup_guide/ # Setup guides
+│
+├── docker-compose.yml # Multi-container orchestration
+├── Dockerfile # API container image
+├── Makefile # Development commands
+├── pyproject.toml # Python project config
+├── README.md # Project README
+└── CLAUDE.md # Root project guidance (this file)
+```
+
+---
+
+## Getting Started
+
+### 1. Clone & Install
+```bash
+git clone https://github.com/lfnovo/open-notebook.git
+cd open-notebook
+
+# Python dependencies
+uv sync
+
+# Frontend dependencies
+cd frontend
+npm install
+cd ..
+```
+
+### 2. Environment Setup
+```bash
+cp .env.example .env
+# Edit .env with your API keys (OpenAI, Anthropic, etc.)
+```
+
+### 3. Start Services
+```bash
+# Terminal 1: Start SurrealDB
+make database
+
+# Terminal 2: Start API (port 5055)
+make api
+# or: uv run --env-file .env uvicorn api.main:app --host 0.0.0.0 --port 5055
+
+# Terminal 3: Start Frontend (port 3000)
+cd frontend && npm run dev
+
+# Full stack (development)
+make start-all
+```
+
+### 4. 
Verify +- Frontend: http://localhost:3000 +- API docs: http://localhost:5055/docs +- SurrealDB: http://localhost:8000 + +--- + +## Development Workflow + +### Key Commands +```bash +# Code quality +make ruff # Lint + auto-fix Python +make lint # Type checking (mypy) + +# Testing +uv run pytest tests/ + +# Database migrations (auto-run on API startup) +# Manual check: API logs show "Running migration X" + +# Docker +make docker-build # Build multi-platform image +docker compose --profile multi up # Full stack in Docker +``` + +### Code Style +- **Python**: Ruff (auto-fix), mypy (type checking) +- **TypeScript**: ESLint config provided +- **Commits**: Conventional commits (feat:, fix:, docs:, refactor:) +- **Git Flow**: Feature branches from `main` + +--- + +## Architecture Highlights + +### 1. Async-First Design +- All database queries, graph invocations, and API calls are async (await) +- SurrealDB async driver with connection pooling +- FastAPI handles concurrent requests efficiently + +### 2. LangGraph Workflows +- **source.py**: Content ingestion (extract → embed → save) +- **chat.py**: Conversational agent with message history +- **ask.py**: Search + synthesis (retrieve relevant sources → LLM) +- **transformation.py**: Custom transformations on sources +- All use `provision_langchain_model()` for smart model selection + +### 3. Multi-Provider AI +- **Esperanto library**: Unified interface to 8+ AI providers +- **ModelManager**: Factory pattern with fallback logic +- **Smart selection**: Detects large contexts, prefers long-context models +- **Override support**: Per-request model configuration + +### 4. Database Schema +- **Automatic migrations**: AsyncMigrationManager runs on API startup +- **SurrealDB graph model**: Records with relationships and embeddings +- **Vector search**: Built-in semantic search across all content +- **Transactions**: Repo functions handle ACID operations + +### 5. 
Authentication +- **Current**: Simple password middleware (insecure, dev-only) +- **Production**: Replace with OAuth/JWT (see CONFIGURATION.md) + +--- + +## Important Quirks & Gotchas + +### API Startup +- **Migrations run automatically** on startup; check logs for errors +- **Must start API before UI**: UI depends on API for all data +- **SurrealDB must be running**: API fails without database connection + +### Frontend-Backend Communication +- **Base API URL**: Configured in `.env.local` (default: http://localhost:5055) +- **CORS enabled**: Configured in `api/main.py` (allow all origins in dev) +- **Rate limiting**: Not built-in; add at proxy layer for production + +### LangGraph Workflows +- **Blocking operations**: Chat/podcast workflows may take minutes; no timeout +- **State persistence**: Uses SQLite checkpoint storage in `/data/sqlite-db/` +- **Model fallback**: If primary model fails, falls back to cheaper/smaller model + +### Podcast Generation +- **Async job queue**: `podcast_service.py` submits jobs but doesn't wait +- **Track status**: Use `/commands/{command_id}` endpoint to poll status +- **TTS failures**: Fall back to silent audio if speech synthesis fails + +### Content Processing +- **File extraction**: Uses content-core library; supports 50+ file types +- **URL handling**: Extracts text + metadata from web pages +- **Large files**: Content processing is sync; may block API briefly + +--- + +## Component References + +See dedicated CLAUDE.md files for detailed guidance: + +- **[frontend/CLAUDE.md](frontend/CLAUDE.md)**: React/Next.js architecture, state management, API integration +- **[api/CLAUDE.md](api/CLAUDE.md)**: FastAPI structure, service pattern, endpoint development +- **[open_notebook/CLAUDE.md](open_notebook/CLAUDE.md)**: Backend core, domain models, LangGraph workflows, AI provisioning +- **[open_notebook/domain/CLAUDE.md](open_notebook/domain/CLAUDE.md)**: Data models, repository pattern, search functions +- 
**[open_notebook/ai/CLAUDE.md](open_notebook/ai/CLAUDE.md)**: ModelManager, AI provider integration, Esperanto usage +- **[open_notebook/graphs/CLAUDE.md](open_notebook/graphs/CLAUDE.md)**: LangGraph workflow design, state machines +- **[open_notebook/database/CLAUDE.md](open_notebook/database/CLAUDE.md)**: SurrealDB operations, migrations, async patterns + +--- + +## Documentation Map + +- **[README.md](README.md)**: Project overview, features, quick start +- **[docs/index.md](docs/index.md)**: Complete user & deployment documentation +- **[CONFIGURATION.md](CONFIGURATION.md)**: Environment variables, model configuration +- **[DESIGN_PRINCIPLES.md](DESIGN_PRINCIPLES.md)**: Architectural decisions & philosophy +- **[MIGRATION.md](MIGRATION.md)**: v1.0 upgrade guide from Streamlit → React +- **[CONTRIBUTING.md](CONTRIBUTING.md)**: Contribution guidelines +- **[MAINTAINER_GUIDE.md](MAINTAINER_GUIDE.md)**: Release & maintenance procedures + +--- + +## Testing Strategy + +- **Unit tests**: `tests/test_domain.py`, `test_models_api.py` +- **Graph tests**: `tests/test_graphs.py` (workflow integration) +- **Utils tests**: `tests/test_utils.py` +- **Run all**: `uv run pytest tests/` +- **Coverage**: Check with `pytest --cov` + +--- + +## Common Tasks + +### Add a New API Endpoint +1. Create router in `api/routers/feature.py` +2. Create service in `api/feature_service.py` +3. Define schemas in `api/models.py` +4. Register router in `api/main.py` +5. Test via http://localhost:5055/docs + +### Add a New LangGraph Workflow +1. Create `open_notebook/graphs/workflow_name.py` +2. Define StateDict and node functions +3. Build graph with `.add_node()` / `.add_edge()` +4. Invoke in service: `graph.ainvoke({"input": ...}, config={"..."})` +5. Test with sample data in `tests/` + +### Add Database Migration +1. Create `migrations/XXX_description.surql` +2. Write SurrealQL schema changes +3. Create `migrations/XXX_description_down.surql` (optional rollback) +4. 
API auto-detects on startup; migration runs if newer than recorded version + +### Deploy to Production +1. Review [CONFIGURATION.md](CONFIGURATION.md) for security settings +2. Use `make docker-release` for multi-platform image +3. Push to Docker Hub / GitHub Container Registry +4. Deploy `docker compose --profile multi up` +5. Verify migrations via API logs + +--- + +## Support & Community + +- **Documentation**: https://open-notebook.ai +- **Discord**: https://discord.gg/37XJPXfz2w +- **Issues**: https://github.com/lfnovo/open-notebook/issues +- **License**: MIT (see LICENSE) + +--- + +**Last Updated**: January 2026 | **Project Version**: 1.2.4+ diff --git a/api/CLAUDE.md b/api/CLAUDE.md new file mode 100644 index 0000000..6620970 --- /dev/null +++ b/api/CLAUDE.md @@ -0,0 +1,117 @@ +# API Module + +FastAPI-based REST backend exposing services for notebooks, sources, notes, chat, podcasts, and AI model management. + +## Purpose + +FastAPI application serving three architectural layers: routes (HTTP endpoints), services (business logic), and models (request/response schemas). Integrates LangGraph workflows (chat, ask, source_chat), SurrealDB persistence, and AI providers via Esperanto. + +## Architecture Overview + +**Three layers**: +1. **Routes** (`routers/*`): HTTP endpoints mapping to services +2. **Services** (`*_service.py`): Business logic orchestrating domain models, database, graphs, AI providers +3. 
**Models** (`models.py`): Pydantic request/response schemas with validation + +**Startup flow**: +- Load .env environment variables +- Initialize CORS middleware + password auth middleware +- Run database migrations via AsyncMigrationManager on lifespan startup +- Register all routers + +**Key services**: +- `chat_service.py`: Invokes chat graph with messages, context +- `podcast_service.py`: Orchestrates outline + transcript generation +- `sources_service.py`: Content ingestion, vectorization, metadata +- `notes_service.py`: Note creation, linking to sources/insights +- `transformations_service.py`: Applies transformations to content +- `models_service.py`: Manages AI provider/model configuration +- `episode_profiles_service.py`: Manages podcast speaker/episode profiles + +## Component Catalog + +### Main Application +- **main.py**: FastAPI app initialization, CORS setup, auth middleware, lifespan event, router registration +- **Lifespan handler**: Runs AsyncMigrationManager on startup (database schema migration) +- **Auth middleware**: PasswordAuthMiddleware protects endpoints (password-based access control) + +### Services (Business Logic) +- **chat_service.py**: Invokes chat.py graph; handles message history via SqliteSaver +- **podcast_service.py**: Generates outline (outline.jinja), then transcript (transcript.jinja) for episodes +- **sources_service.py**: Ingests files/URLs (content_core), extracts text, vectorizes, saves to SurrealDB +- **transformations_service.py**: Applies transformations via transformation.py graph +- **models_service.py**: Manages ModelManager config (AI provider overrides) +- **episode_profiles_service.py**: CRUD for EpisodeProfile and SpeakerProfile models +- **insights_service.py**: Generates and retrieves source insights +- **notes_service.py**: Creates notes linked to sources/insights + +### Models (Schemas) +- **models.py**: Pydantic schemas for request/response validation +- Request bodies: ChatRequest, CreateNoteRequest, 
PodcastGenerationRequest, etc.
+- Response bodies: ChatResponse, NoteResponse, PodcastResponse, etc.
+- Custom validators for enum fields, file paths, model references
+
+### Routers
+- **routers/chat.py**: POST /chat
+- **routers/source_chat.py**: POST /source/{source_id}/chat
+- **routers/podcasts.py**: POST /podcasts, GET /podcasts/{id}, etc.
+- **routers/notes.py**: POST /notes, GET /notes/{id}
+- **routers/sources.py**: POST /sources, GET /sources/{id}, DELETE /sources/{id}
+- **routers/models.py**: GET /models, POST /models/config
+- **routers/transformations.py**: POST /transformations
+- **routers/insights.py**: GET /sources/{source_id}/insights
+- **routers/auth.py**: POST /auth/password (password-based auth)
+- **routers/commands.py**: GET /commands/{command_id} (job status tracking)
+
+## Common Patterns
+
+- **Service injection via FastAPI**: Routers import services directly; no DI framework
+- **Async/await throughout**: All DB queries, graph invocations, AI calls are async
+- **SurrealDB transactions**: Services use repo_query, repo_create, repo_upsert from database layer
+- **Config override pattern**: Model config overrides from models_service are passed to graphs via graph.ainvoke(config=...) 
+- **Error handling**: Services catch exceptions and return HTTP status codes (400 Bad Request, 404 Not Found, 500 Internal Server Error) +- **Logging**: loguru logger in main.py; services expected to log key operations +- **Response normalization**: All responses follow standard schema (data + metadata structure) + +## Key Dependencies + +- `fastapi`: FastAPI app, routers, HTTPException +- `pydantic`: Validation models with Field, field_validator +- `open_notebook.graphs`: chat, ask, source_chat, source, transformation graphs +- `open_notebook.database`: SurrealDB repository functions (repo_query, repo_create, repo_upsert) +- `open_notebook.domain`: Notebook, Source, Note, SourceInsight models +- `open_notebook.ai.provision`: provision_langchain_model() factory +- `ai_prompter`: Prompter for template rendering +- `content_core`: extract_content() for file/URL processing +- `esperanto`: AI provider client library (LLM, embeddings, TTS) +- `surreal_commands`: Job queue for async operations (podcast generation) +- `loguru`: Structured logging + +## Important Quirks & Gotchas + +- **Migration auto-run**: Database schema migrations run on every API startup (via lifespan); no manual migration steps +- **PasswordAuthMiddleware is basic**: Uses simple password check; production deployments should replace with OAuth/JWT +- **No request rate limiting**: No built-in rate limiting; deployment must add via proxy/middleware +- **Service state is stateless**: Services don't cache results; each request re-queries database/AI models +- **Graph invocation is blocking**: chat/podcast workflows may take minutes; no timeout handling in services +- **Command job fire-and-forget**: podcast_service.py submits jobs but doesn't wait (async job queue pattern) +- **Model override scoping**: Model config override via RunnableConfig is per-request only (not persistent) +- **CORS open by default**: main.py CORS settings allow all origins (restrict before production) +- **No OpenAPI security 
scheme**: API docs available without auth (disable before production) +- **Services don't validate user permission**: All endpoints trust authentication layer; no per-notebook permission checks + +## How to Add New Endpoint + +1. Create router file in `routers/` (e.g., `routers/new_feature.py`) +2. Import router into `main.py` and register: `app.include_router(new_feature.router, tags=["new_feature"])` +3. Create service in `new_feature_service.py` with business logic +4. Define request/response schemas in `models.py` (or create `new_feature_models.py`) +5. Implement router functions calling service methods +6. Test with `uv run uvicorn api.main:app --host 0.0.0.0 --port 5055` + +## Testing Patterns + +- **Interactive docs**: http://localhost:5055/docs (Swagger UI) +- **Direct service tests**: Import service, call methods directly with test data +- **Mock graphs**: Replace graph.ainvoke() with mock for testing service logic +- **Database: Use test database** (separate SurrealDB instance or mock repo_query) diff --git a/batch_fix_services.py b/batch_fix_services.py deleted file mode 100644 index 4db32b6..0000000 --- a/batch_fix_services.py +++ /dev/null @@ -1,77 +0,0 @@ -#!/usr/bin/env python3 -"""Batch fix service files for mypy errors.""" -import re -from pathlib import Path - -SERVICE_FILES = [ - 'api/notes_service.py', - 'api/insights_service.py', - 'api/episode_profiles_service.py', - 'api/settings_service.py', - 'api/sources_service.py', - 'api/podcast_service.py', - 'api/command_service.py', -] - -BASE_DIR = Path('/Users/luisnovo/dev/projetos/open-notebook/open-notebook') - -for service_file in SERVICE_FILES: - file_path = BASE_DIR / service_file - if not file_path.exists(): - print(f"Skipping {service_file} - file not found") - continue - - content = file_path.read_text() - original_content = content - - # Pattern to find: var_name = api_client.method(args) - # Followed by: var_name["key"] or var_name.get("key") - lines = content.split('\n') - new_lines = [] 
- i = 0 - - while i < len(lines): - line = lines[i] - - # Check if this line has an api_client call assignment - match = re.match(r'(\s*)(\w+)\s*=\s*api_client\.(\w+)\((.*)\)\s*$', line) - if match and 'response = api_client' not in line: - indent = match.group(1) - var_name = match.group(2) - method_name = match.group(3) - args = match.group(4) - - # Look ahead to see if this variable is used with dict access - has_dict_access = False - for j in range(i+1, min(i+15, len(lines))): - next_line = lines[j] - if f'{var_name}["' in next_line or f"{var_name}['" in next_line or f'{var_name}.get(' in next_line: - has_dict_access = True - break - # Stop looking if we hit a blank line, new function, or new assignment - if (not next_line.strip() or - next_line.strip().startswith('def ') or - next_line.strip().startswith('class ') or - (re.match(r'\s*\w+\s*=', next_line) and var_name not in next_line)): - break - - if has_dict_access: - # Replace with response and isinstance check - new_lines.append(f'{indent}response = api_client.{method_name}({args})') - new_lines.append(f'{indent}{var_name} = response if isinstance(response, dict) else response[0]') - i += 1 - continue - - new_lines.append(line) - i += 1 - - new_content = '\n'.join(new_lines) - - # Check if content changed - if new_content != original_content: - file_path.write_text(new_content) - print(f"✓ Fixed {service_file}") - else: - print(f"- No changes needed for {service_file}") - -print("\nDone!") diff --git a/commands/CLAUDE.md b/commands/CLAUDE.md new file mode 100644 index 0000000..0b0eb61 --- /dev/null +++ b/commands/CLAUDE.md @@ -0,0 +1,49 @@ +# Commands Module + +**Purpose**: Defines async command handlers for long-running operations via `surreal-commands` job queue system. + +## Key Components + +- **`process_source_command`**: Ingests content through `source_graph`, creates embeddings (optional), and generates insights. Retries on transaction conflicts (exp. jitter, max 5×). 
+- **`embed_single_item_command`**: Embeds individual sources/notes/insights; splits content into chunks for vector storage. +- **`rebuild_embeddings_command`**: Bulk re-embed all/existing items with selective type filtering. +- **`generate_podcast_command`**: Creates podcasts via `podcast-creator` library using stored episode/speaker profiles. +- **`process_text_command`** (example): Test fixture for text operations (uppercase, lowercase, reverse, word_count). +- **`analyze_data_command`** (example): Test fixture for numeric aggregations. + +## Important Patterns + +- **Pydantic I/O**: All commands use `CommandInput`/`CommandOutput` subclasses for type safety and serialization. +- **Error handling**: Permanent errors return failure output; `RuntimeError` exceptions auto-retry via surreal-commands. +- **Model dumping**: Recursive `full_model_dump()` utility converts Pydantic models → dicts for DB/API responses. +- **Logging**: Uses `loguru.logger` throughout; logs execution start/end and key metrics (processing time, counts). +- **Time tracking**: All commands measure `start_time` → `processing_time` for monitoring. + +## Dependencies + +**External**: `surreal_commands` (command decorator, job queue), `loguru`, `pydantic`, `podcast_creator` +**Internal**: `open_notebook.domain.*` (Source, Note, Transformation), `open_notebook.graphs.source`, `open_notebook.ai.models` + +## Quirks & Edge Cases + +- **source_commands**: `ensure_record_id()` wraps command IDs for DB storage; transaction conflicts trigger exponential backoff retry (1-30s). Non-`RuntimeError` exceptions are permanent. +- **embedding_commands**: Queries DB directly for item state; chunk index must match source's chunk list. Model availability checked at command start. +- **podcast_commands**: Profiles loaded from SurrealDB by name (must exist); briefing can be extended with suffix. Episode records created mid-execution. 
+- **Example commands**: Accept optional `delay_seconds` for testing async behavior; not for production. + +## Code Example + +```python +@command("process_source", app="open_notebook", retry={...}) +async def process_source_command(input_data: SourceProcessingInput) -> SourceProcessingOutput: + start_time = time.time() + try: + transformations = [await Transformation.get(id) for id in input_data.transformations] + source = await Source.get(input_data.source_id) + result = await source_graph.ainvoke({...}) + return SourceProcessingOutput(success=True, ...) + except RuntimeError as e: + raise # Retry this + except Exception as e: + return SourceProcessingOutput(success=False, error_message=str(e)) +``` diff --git a/frontend/src/CLAUDE.md b/frontend/src/CLAUDE.md new file mode 100644 index 0000000..1d09ad5 --- /dev/null +++ b/frontend/src/CLAUDE.md @@ -0,0 +1,159 @@ +# Frontend Architecture + +Next.js React application providing UI for Open Notebook research assistant. Three-layer architecture: **pages** (Next.js App Router), **components** (feature-specific UI), and **lib** (data fetching, state management, utilities). + +## High-Level Data Flow + +``` +Pages (Next.js) → Components (feature-specific) → Hooks (queries/mutations) + ↓ + Stores (auth/modal state) → API module → Backend +``` + +User interactions trigger mutations/queries via hooks, which communicate with the backend through the API module. Store state (auth, modals) flows back to components via hooks. 
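A minimal, React-free sketch of that flow, with the request function injected so it stays self-contained (names like `fetchNotebooks` and `loadNotebooks` are hypothetical; the real hooks wrap TanStack Query and the shared Axios client):

```typescript
// Hypothetical sketch of the layering: API module -> hook -> component.
// The request function is injected here to keep the sketch standalone.
interface Notebook {
  id: string;
  name: string;
}

type Request = (url: string) => Promise<Notebook[]>;

// lib/api layer: one typed function per endpoint
async function fetchNotebooks(request: Request): Promise<Notebook[]> {
  return request("/api/notebooks");
}

// lib/hooks layer: wraps the API call and normalizes { data, error },
// the shape components consume (TanStack Query does this for real)
async function loadNotebooks(request: Request) {
  try {
    return { data: await fetchNotebooks(request), error: null as Error | null };
  } catch (e) {
    return { data: null as Notebook[] | null, error: e as Error };
  }
}
```

Components never call the API module directly; they consume the hook's normalized result, which is what keeps the layers swappable.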
Child CLAUDE.md files document specific modules in detail: + +- **`lib/api/CLAUDE.md`**: Axios client, FormData handling, interceptors +- **`lib/hooks/CLAUDE.md`**: TanStack Query wrappers, SSE streaming, context building +- **`lib/stores/CLAUDE.md`**: Zustand auth/modal state, localStorage persistence +- **`components/ui/CLAUDE.md`**: Radix UI primitives, CVA styling, accessibility + +## Architectural Layers + +### Pages (`src/app/`) — Next.js App Router +- `(auth)/login`: Authentication entry point +- `(dashboard)/`: Protected routes (notebooks, sources, search, models, etc.) +- Directory-based routing; each `page.tsx` is a route endpoint +- **Key pattern**: Pages call hooks to fetch data, render components with state +- **Router groups** `(auth)`, `(dashboard)` organize routes by feature without affecting URL + +### Components (`src/components/`) — Feature-Specific UI +- **layout**: `AppShell.tsx`, `AppSidebar.tsx` — main layout wrapper used by all pages +- **providers**: `ThemeProvider`, `QueryProvider`, `ModalProvider` — app-wide context setup +- **auth**: `LoginForm.tsx` — authentication UI +- **common**: `CommandPalette`, `ErrorBoundary`, `ContextToggle`, `ModelSelector` — shared across pages +- **ui**: Reusable Radix UI building blocks (see child CLAUDE.md) +- **source**, **notebooks**, **search**, **podcasts**: Feature-specific components consuming hooks + +**Component composition pattern**: Pages → Feature components → UI components. Feature components handle page-level state (loading, error), UI components remain stateless and styled. 
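A JSX-free sketch of that split (component names are hypothetical, and plain strings stand in for rendered elements):

```typescript
// Hypothetical illustration of the composition rule:
// UI components are pure functions of props; feature components own state.
type ButtonProps = { label: string; disabled?: boolean };

// components/ui layer: stateless, presentation only
const renderButton = ({ label, disabled }: ButtonProps): string =>
  `<button${disabled ? " disabled" : ""}>${label}</button>`;

// feature layer: derives props from page-level state (loading/error)
// and delegates all presentation to the UI component
function renderSaveControl(state: { isLoading: boolean }): string {
  return renderButton({
    label: state.isLoading ? "Saving..." : "Save",
    disabled: state.isLoading,
  });
}
```

Because the UI layer takes only props, it can be reused across features and tested without any state setup.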
+ +### Lib (`src/lib/`) — Data & State Layer + +#### `lib/api/` — Backend Communication +- **`client.ts`**: Central Axios instance with auth interceptor, FormData handling, 10-min timeout +- **`query-client.ts`**: TanStack Query configuration +- **Resource modules** (`sources.ts`, `chat.ts`, `notebooks.ts`, etc.): Endpoint-specific functions returning typed responses +- **Pattern**: All requests go through `apiClient`; auth token auto-added from localStorage + +#### `lib/hooks/` — React Query + Custom Logic +- **Query hooks**: `useNotebookSources`, `useSources`, `useSource` — TanStack Query wrappers with cache keys +- **Mutation hooks**: `useCreateSource`, `useUpdateSource`, `useDeleteSource` — mutations with toast feedback + cache invalidation +- **Complex hooks**: `useNotebookChat`, `useSourceChat` — session management, message streaming, context building +- **SSE streaming**: `useAsk` — parses newline-delimited JSON from backend for multi-stage workflows +- **Pattern**: Hooks return `{ data, isLoading, error, refetch }` + action functions; cache invalidation on mutations + +#### `lib/stores/` — Application State +- **`auth-store.ts`**: Authentication state (token, isAuthenticated) with 30-second check caching +- **Zustand + persist middleware**: Auto-syncs sensitive state to localStorage +- **Pattern**: Store actions (`login()`, `logout()`, `checkAuth()`) update state; consumed via hooks in components + +#### `lib/types/` — TypeScript Definitions +- API request/response shapes, domain models (Notebook, Source, Note, etc.) +- Ensures type safety across API calls and store mutations + +## Data & Control Flow Walkthrough + +### Example: Notebook Chat +1. **Page** (`notebooks/[id]/page.tsx`) fetches initial data, passes `notebookId` to `ChatColumn` component +2. **Hook call** (`useNotebookChat()`): + - Queries sessions for notebook via TanStack Query + - Sets up message state + context building logic + - Returns `{ messages, sendMessage(), setModelOverride() }` +3. 
**Component renders**: `ChatColumn` displays messages, text input +4. **User sends message**: Component calls `sendMessage()` hook +5. **Hook execution**: + - Builds context from selected sources/notes via `buildContext()` helper + - Calls `chatApi.sendMessage()` (from API module) + - Client-side optimistic update: adds message to local state before response +6. **Backend response** arrives, TanStack Query updates cache +7. **Cache invalidation** on other source/note mutations ensures stale UI refreshes + +### Example: File Upload with Source Creation +1. **Component** (`SourceDialog`) renders form with file picker +2. **Hook** (`useFileUpload`): + - Converts file to FormData (JSON fields stringified) + - Calls `sourcesApi.create()` with FormData + - API client interceptor deletes Content-Type header (lets browser set multipart boundary) +3. **Toast notifications** show progress +4. **Cache invalidation** on success: `queryClient.invalidateQueries(['sources'])` +5. **Related queries** auto-refetch: notebooks, sources list, etc. 
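The stringify-and-append step in the upload flow above can be sketched as a small helper (the name `toFormData` is hypothetical; `FormData` is global in browsers and Node 18+):

```typescript
// Hypothetical helper mirroring the documented pattern: files are appended
// as-is, while nested values (arrays/objects) are JSON-stringified so the
// backend can parse them back out of the multipart body.
function toFormData(payload: Record<string, unknown>): FormData {
  const form = new FormData();
  for (const [key, value] of Object.entries(payload)) {
    if (value instanceof Blob) {
      form.append(key, value); // file/blob becomes its own multipart part
    } else if (typeof value === "object" && value !== null) {
      form.append(key, JSON.stringify(value)); // e.g. a `sources` array
    } else {
      form.append(key, String(value));
    }
  }
  return form;
}
```

The interceptor then strips the Content-Type header for such requests so the browser can set the multipart boundary itself.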
+ +## Key Patterns & Cross-Layer Coordination + +### Caching & Invalidation +- **Query keys**: `QUERY_KEYS.notebook(id)`, `QUERY_KEYS.sources(notebookId)` — hierarchical structure +- **Broad invalidation**: `['sources']` invalidates all source queries; trade-off between accuracy + performance +- **Auto-refetch**: `refetchOnWindowFocus: true` on frequently-changing data (sources, notebooks) + +### Auth & Protected Routes +- **Middleware** (`src/middleware.ts`): Redirects unauthenticated users to `/login` +- **Auth store**: Validates token via `/notebooks` API call (actual validation, not JWT decode) +- **Interceptor**: Adds `Bearer {token}` to all requests; 401 response clears auth and redirects to login + +### Modal State Management +- **Modal hooks**: Components query modal state from stores +- **Context**: Modals pass data (e.g., notebook ID) to child components +- **Pattern**: One store per modal type; triggered by button clicks + data passing via hook arguments + +### Error Handling +- **API errors**: All request failures propagate to consuming code; components show toast notifications +- **Toast feedback**: Mutations show success/error toasts (from `sonner` library) +- **Error boundary**: App-level error boundary catches React render errors; shows fallback UI + +### FormData Handling +- **JSON fields**: Nested objects (arrays, objects) must be JSON stringified before FormData +- **Content-Type header**: Removed by interceptor for FormData requests (lets browser set boundary) +- **Example**: `sources` array converted to string via `JSON.stringify()` before appending to FormData + +## Component Organization Within Features + +- **Feature folders** (`source/`, `notebooks/`, `podcasts/`): Group related components +- **Composition**: Larger components nest smaller ones; no deep prop drilling (state lifted to hooks) +- **Dialog patterns**: Features define dialog components for inline actions (edit, create, delete) +- **Props**: Components accept data + action 
callbacks from parent or hooks + +## Providers & Context Setup + +**Root layout** (`app/layout.tsx`) wraps app with: +1. `ThemeProvider` — next-themes for light/dark mode +2. `QueryProvider` — TanStack Query client +3. `ErrorBoundary` — React error boundary +4. `ConnectionGuard` — checks backend connectivity on startup +5. `Toaster` — sonner toast notification system + +## Important Gotchas & Design Decisions + +- **Token storage**: Stored in localStorage under `auth-storage` key (Zustand persist); consumed by API interceptor +- **Base URL discovery**: API client fetches base URL from runtime config on first request (async; can be slow on startup) +- **Optimistic updates**: Chat messages added to state before server confirmation; removed on error +- **Modal lifecycle**: Dialogs not auto-reset; parent must clear form state after submit +- **Focus management**: Dialog auto-focuses first input; can cause layout shifts if inputs are conditional +- **Cache invalidation breadth**: Trade-off between precision + simplicity; broad invalidation simpler but may over-fetch + +## How to Add a New Feature + +1. **Create page**: `app/(dashboard)/feature/page.tsx` — calls hooks, renders components +2. **Create feature components**: `components/feature/` — compose UI + business logic +3. **Add hooks** (if data needed): `lib/hooks/useFeature.ts` — TanStack Query wrapper +4. **Add API module** (if backend call needed): `lib/api/feature.ts` — resource-specific functions +5. **Add types**: `lib/types/api.ts` — request/response shapes +6. **Use UI components**: Import from `components/ui/` for consistent styling +7. 
**Handle auth**: Middleware redirects unauthenticated users; no special handling needed in component + +## Testing + +- **Hooks**: Mock API functions, wrap in `QueryClientProvider`, assert query/mutation behavior +- **Components**: Mock hooks via `vi.fn()`, test rendering + user interactions +- **API calls**: Mock `axios` interceptors; test request/response shapes +- **Stores**: Mock store state, test mutations via `act()`, assert state changes + +See child CLAUDE.md files for module-specific testing patterns. diff --git a/frontend/src/components/ui/CLAUDE.md b/frontend/src/components/ui/CLAUDE.md new file mode 100644 index 0000000..ff45b17 --- /dev/null +++ b/frontend/src/components/ui/CLAUDE.md @@ -0,0 +1,64 @@ +# UI Components Module + +Radix UI-based accessible component library with CVA styling, composed building blocks, and theming support. + +## Key Components + +- **Primitives** (`button.tsx`, `dialog.tsx`, `select.tsx`, `dropdown-menu.tsx`): Radix UI wrappers with Tailwind styling +- **Composite components** (`checkbox-list.tsx`, `wizard-container.tsx`, `command.tsx`): Multi-part patterns combining primitives +- **Form components** (`input.tsx`, `textarea.tsx`, `label.tsx`, `form-section.tsx`): Input handling with accessibility +- **Feedback** (`alert.tsx`, `alert-dialog.tsx`, `sonner.tsx`, `progress.tsx`): User notifications and status +- **Layout** (`card.tsx`, `accordion.tsx`, `tabs.tsx`, `scroll-area.tsx`): Structural wrappers +- **Utilities** (`badge.tsx`, `separator.tsx`, `tooltip.tsx`, `popover.tsx`, `collapsible.tsx`): Small focused components + +## Important Patterns + +- **Radix UI wrappers**: Components delegate to Radix primitives; apply Tailwind classes via `cn()` utility +- **CVA (Class Variance Authority)**: `button.tsx` and similar use CVA for variant/size combinations +- **Composition via Slot**: `Button` uses `asChild` prop + `Slot` from radix to render as any element type +- **Data slots**: All components have `data-slot` attributes for 
testing/styling isolation +- **Controlled styling**: Classes hardcoded in components; use `className` prop to override/extend +- **Animations**: Radix `data-[state]` selectors for open/close animations (fade-in, zoom-in) +- **Accessibility first**: ARIA attributes from Radix (aria-invalid, sr-only labels, focus rings) +- **Dark mode support**: Uses Tailwind dark: prefix for color scheme (e.g., `dark:border-input`) + +## Key Dependencies + +- `@radix-ui/*`: Unstyled accessible primitives (dialog, select, dropdown-menu, etc.) +- `class-variance-authority`: CVA for variant patterns +- `lucide-react`: Icon library (XIcon in dialog close button) +- `@/lib/utils`: `cn()` utility for class merging + +## How to Add New Components + +1. Create `.tsx` file wrapping Radix primitive or composing existing components +2. Add `data-slot="component-name"` to root element +3. Use `cn()` to merge default classes with `className` prop +4. Export both component and variants (if using CVA) +5. Document prop shape and usage in JSDoc + +## Important Quirks & Gotchas + +- **Slot forwarding**: `asChild={true}` on Button passes all props to child; ensure child accepts them +- **FormData in dialogs**: Dialog not reset automatically; parent must manually clear form state +- **Focus management**: Dialog auto-focuses first input; can cause layout shifts if inputs conditionally rendered +- **Z-index stacking**: Fixed elements (Dialog overlay, dropdown menus) use z-50; be careful with other fixed elements +- **Click outside closes dropdown**: Radix dropdowns auto-close on outside click; may conflict with hover-triggered actions +- **SVG size inference**: Button uses `[&_svg:not([class*='size-'])]:size-4` to default unlabeled icons to 4x4; be explicit if different size needed +- **CSS-in-JS conflicts**: Hardcoded Tailwind classes may conflict with global CSS; specificity matters +- **Dark mode class**: Requires `dark` class on document root; not automatic with prefers-color-scheme alone + +## 
Testing Patterns + +```typescript +// Test component rendering with props +render(<Button variant="destructive" />) +expect(screen.getByRole('button')).toHaveClass('bg-destructive') + +// Test Dialog interaction +render(<Dialog open><DialogContent>Content</DialogContent></Dialog>) +expect(screen.getByText('Content')).toBeInTheDocument() + +// Test accessibility (role query asserts the accessible role) +expect(screen.getByRole('dialog')).toBeInTheDocument() +``` diff --git a/frontend/src/lib/api/CLAUDE.md b/frontend/src/lib/api/CLAUDE.md new file mode 100644 index 0000000..2307eb0 --- /dev/null +++ b/frontend/src/lib/api/CLAUDE.md @@ -0,0 +1,66 @@ +# API Module + +Axios-based client and resource-specific API modules for backend communication with auth, FormData handling, and error recovery. + +## Key Components + +- **`client.ts`**: Central Axios instance with request/response interceptors, auth headers, base URL resolution +- **Resource modules** (`sources.ts`, `notebooks.ts`, `chat.ts`, `search.ts`, etc.): Endpoint-specific functions returning typed responses +- **`query-client.ts`**: TanStack Query client configuration with default options +- **`models.ts`, `notes.ts`, `embeddings.ts`, `settings.ts`**: Additional resource APIs + +## Important Patterns + +- **Single axios instance**: `apiClient` with 10-minute timeout (for slow LLM operations) +- **Request interceptor**: Auto-fetches base URL from config, adds Bearer auth from localStorage `auth-storage` +- **FormData handling**: Auto-removes Content-Type header for FormData to let browser set multipart boundary +- **Response interceptor**: 401 clears auth and redirects to `/login` +- **Async base URL resolution**: `getApiUrl()` fetches from runtime config on first request +- **Error propagation**: Functions don't catch errors; failures propagate to callers, and successes return typed `response.data` +- **Namespaced exports**: Resource modules export namespaced objects (e.g., `sourcesApi.list()`, `sourcesApi.create()`) + +## Key Dependencies + +- `axios`: HTTP client library +- `@/lib/config`: `getApiUrl()` for dynamic base URL +- `@/lib/types/api`: TypeScript types for
request/response shapes + +## How to Add New API Modules + +1. Create new file (e.g., `transforms.ts`) +2. Import `apiClient` +3. Export namespaced object with methods: + ```typescript + export const transformsApi = { + list: async () => { const response = await apiClient.get('/transforms'); return response.data } + } + ``` +4. Add types to `@/lib/types/api` if new response shapes needed + +## Important Quirks & Gotchas + +- **Base URL delay**: First request waits for `getApiUrl()` to resolve; can be slow on startup +- **FormData fields as JSON strings**: Nested objects (arrays, objects) must be JSON stringified in FormData (e.g., `notebooks`, `transformations`) +- **Timeout for streaming**: 10-minute timeout may not cover very long-running LLM operations; consider extending if needed +- **Auth token management**: Token stored in localStorage `auth-storage` key; uses Zustand persist middleware +- **Headers mutation in interceptor**: Mutating `config.headers` directly; be careful with middleware order +- **No retry logic**: Failed requests not automatically retried; must be handled in consuming code +- **Content-Type header precedence**: FormData interceptor deletes Content-Type after checking; subsequent interceptors won't re-add it + +## Usage Example + +```typescript +// Basic list +const sources = await sourcesApi.list({ notebook_id: notebookId }) + +// File upload with FormData +const response = await sourcesApi.create({ + type: 'upload', + file: fileObj, + notebook_id: notebookId, + async_processing: true +}) + +// With auth token (auto-added by interceptor) +const notes = await notesApi.list() +``` diff --git a/frontend/src/lib/hooks/CLAUDE.md b/frontend/src/lib/hooks/CLAUDE.md new file mode 100644 index 0000000..2e969a5 --- /dev/null +++ b/frontend/src/lib/hooks/CLAUDE.md @@ -0,0 +1,64 @@ +# Hooks Module + +React hooks for API data fetching, state management, and complex workflows (chat, streaming, file handling). 
+ +## Key Components + +- **Query hooks** (`useNotebookSources`, `useSource`, `useSources`): TanStack Query wrappers for source data with infinite scroll and refetch strategies +- **Mutation hooks** (`useCreateSource`, `useUpdateSource`, `useDeleteSource`, `useFileUpload`, `useRetrySource`): Server mutations with toast notifications and cache invalidation +- **Chat hooks** (`useNotebookChat`, `useSourceChat`): Complex session management, context building, and message streaming +- **Streaming hooks** (`useAsk`): SSE parsing for multi-stage Ask workflows (strategy → answers → final answer) +- **Model/config hooks** (`useModels`, `useSettings`, `useTransformations`): Application-level settings and model management +- **Utility hooks** (`useMediaQuery`, `useToast`, `useNavigation`, `useAuth`): UI state and auth checking + +## Important Patterns + +- **TanStack Query integration**: All data hooks use `useQuery`/`useMutation` with `QUERY_KEYS` for cache consistency +- **Optimistic updates**: Mutations add local state before server response (e.g., notebook chat messages) +- **Cache invalidation**: Broad invalidation of query keys on mutations (e.g., `['sources']` catches all source queries) +- **Auto-refetch on return**: `refetchOnWindowFocus: true` on frequently-changing data (sources, notebooks) +- **Manual refetch controls**: Hooks return `refetch()` for parent components to trigger refresh +- **SSE streaming pattern**: `useAsk` manually parses newline-delimited JSON from `/api/search/ask`; handles incomplete buffers +- **Status polling**: `useSourceStatus` auto-refetches every 2s while `status === 'running' | 'queued' | 'new'` +- **Context building**: `useNotebookChat.buildContext()` assembles selected sources + notes with token/char counts + +## Key Dependencies + +- `@tanstack/react-query`: Data fetching and caching +- `sonner`: Toast notifications +- `@/lib/api/*`: API module exports (sourcesApi, chatApi, searchApi, etc.) 
+- `@/lib/types/api`: TypeScript response types +- Zustand stores: `useAuthStore`, modal managers + +## How to Add New Hooks + +1. **Data queries**: Create `useQuery` hook wrapping API call; use `QUERY_KEYS.entityName(id)` for cache key +2. **Mutations**: Create `useMutation` hook with `onSuccess` cache invalidation + toast feedback +3. **Complex state**: Use `useState` + callbacks for local state (see `useAsk`, `useNotebookChat`) +4. **Return shape**: Export object with both state and action functions for composability + +## Important Quirks & Gotchas + +- **Cache invalidation breadth**: Invalidating `['sources']` affects ALL source queries; be precise if performance matters +- **Optimistic updates + error handling**: `useNotebookChat` removes optimistic messages on error; ensure cleanup +- **SSE buffer handling**: `useAsk` keeps incomplete lines in buffer between reads; incomplete JSON silently skipped +- **Model override timing**: `useNotebookChat` stores pending model override if no session exists; applied on session creation +- **Pagination cursor**: `useNotebookSources` uses offset-based pagination; `nextOffset` calculated from page size +- **Status polling race**: `useSourceStatus` may refetch stale data before server catches up; retry logic has 3-attempt limit +- **Keyboard trap in dialogs**: Some hooks manage modal state; ensure Dialog/Modal components handle escape key properly +- **Form data handling**: `useFileUpload` and source creation convert JSON fields to strings in FormData + +## Testing Patterns + +```typescript +// Mock API +const mockApi = { + list: vi.fn().mockResolvedValue([...]) +} + +// Test hook with a QueryClientProvider wrapper +renderHook(() => useSources(), { wrapper: QueryClientProvider }) + +// Assert mutations trigger cache invalidation +await waitFor(() => expect(queryClient.invalidateQueries).toHaveBeenCalled()) +``` diff --git a/frontend/src/lib/stores/CLAUDE.md b/frontend/src/lib/stores/CLAUDE.md new file mode 100644 index 0000000..1214b7f ---
/dev/null +++ b/frontend/src/lib/stores/CLAUDE.md @@ -0,0 +1,68 @@ +# Stores Module + +Zustand-based state management for authentication, modals, and application-level settings with localStorage persistence. + +## Key Components + +- **`auth-store.ts`**: Authentication state (token, isAuthenticated) with login, logout, auth checking, and Zustand persistence +- **Modal stores** (imported via hooks): Modal visibility and data state management +- **Settings persistence**: Auto-saves sensitive state (token, auth status) to localStorage via Zustand persist middleware + +## Important Patterns + +- **Zustand create + persist**: State + actions combined in single store; `persist` middleware auto-syncs to localStorage +- **Selective persistence**: `partialize` option limits what's saved (e.g., only `token` and `isAuthenticated`, not `isLoading`) +- **Hydration tracking**: `setHasHydrated()` marks when localStorage data loaded; used to avoid hydration mismatch in SSR +- **Auth caching**: 30-second cache on `checkAuth()` to avoid excessive API calls; stores `lastAuthCheck` timestamp +- **Network resilience**: Handles 401 globally in API interceptor; graceful degradation if API unreachable +- **API validation**: Uses actual API call (`/notebooks` endpoint) to validate token instead of parsing JWT + +## Key Dependencies + +- `zustand`: State management library +- `@/lib/config`: `getApiUrl()` for dynamic server discovery +- localStorage: Browser persistence API + +## How to Add New Stores + +1. Create new file (e.g., `settings-store.ts`) +2. Define interface extending store state and actions +3. 
Use `create<State>()(persist(...))` for persistence, or plain `create()` for ephemeral state: + ```typescript + interface SettingsState { + theme: string + setTheme: (theme: string) => void + } + + export const useSettingsStore = create<SettingsState>()( + persist((set) => ({ + theme: 'dark', + setTheme: (theme) => set({ theme }) + }), { + name: 'settings-storage' + }) + ) + ``` + +## Important Quirks & Gotchas + +- **Hydration mismatch**: Server-side rendered stores must check `hasHydrated` before rendering to prevent SSR mismatches +- **localStorage key collision**: Persist middleware uses `name` option as localStorage key; ensure unique per store +- **Token not validated**: `login()` only checks HTTP 200 response; doesn't decode or validate JWT structure +- **Auth check race condition**: Multiple simultaneous `checkAuth()` calls return early if one already in progress (`isCheckingAuth`) +- **Error messages from HTTP**: Shows 401/403/5xx status codes to user; helps with debugging but may leak info +- **Network timeout handling**: Network errors in `checkAuthRequired()` set `authRequired: null` (safe default); `login()` shows generic message +- **Logout doesn't invalidate session**: Client-side logout only clears local token; server session may still be valid +- **Double authentication**: Both `login()` and `checkAuth()` test same `/notebooks` endpoint; could be optimized with dedicated endpoint + +## Testing Patterns + +```typescript +// Mock store +const mockAuthStore = { + isAuthenticated: true, + token: 'test-token', + checkAuth: vi.fn().mockResolvedValue(true), + login: vi.fn().mockResolvedValue(true), + logout: vi.fn() +} + +// Test store mutations +act(() => store.setState({ theme: 'light' })) +expect(store.getState().theme).toBe('light') +``` diff --git a/open_notebook/CLAUDE.md b/open_notebook/CLAUDE.md new file mode 100644 index 0000000..808d3c5 --- /dev/null +++ b/open_notebook/CLAUDE.md @@ -0,0 +1,242 @@ +# Open Notebook Core Backend + +The `open_notebook` module is the heart of the system: a multi-layer backend orchestrating AI-powered research workflows.
It bridges domain models, asynchronous database operations, LangGraph-based content processing, and multi-provider AI model management. + +## Purpose + +Encapsulates the entire backend architecture: +1. **Data layer**: SurrealDB persistence with async CRUD and migrations +2. **Domain layer**: Research models (Notebook, Source, Note, etc.) with embedded relationships +3. **Workflow layer**: LangGraph state machines for content ingestion, chat, and transformations +4. **AI provisioning**: Multi-provider model management with smart fallback logic +5. **Support services**: Context building, tokenization, and utility functions + +All components communicate through async/await patterns and use Pydantic for validation. + +## Architecture Overview + +``` +┌─────────────────────────────────────────────────────────────┐ +│ API / Streamlit UI │ +└──────────────────────┬──────────────────────────────────────┘ + │ + ┌──────────────────┴──────────────────┐ + │ │ +┌───▼────────────────────┐ ┌──────────▼────────────────┐ +│ Graphs (LangGraph) │ │ Domain Models (Data) │ +│ - source.py (ingestion) │ │ - Notebook, Source, Note │ +│ - chat.py │ │ - ChatSession, Asset │ +│ - ask.py (search) │ │ - SourceInsight, Embedding│ +│ - transformation.py │ │ - Transformation, Settings│ +└───┬────────────────────┘ │ - EpisodeProfile, Podcast │ + │ └──────────┬─────────────────┘ + │ │ + └───────────────────┬───────────────┘ + │ + ┌───────────────────┴────────────────────┐ + │ │ +┌───▼─────────────────┐ ┌──────────────▼──────┐ +│ AI Module (Models) │ │ Utils (Helpers) │ +│ - ModelManager │ │ - ContextBuilder │ +│ - DefaultModels │ │ - TokenUtils │ +│ - provision_langchain│ │ - TextUtils │ +│ - Multi-provider AI │ │ - VersionUtils │ +└───┬─────────────────┘ └──────────┬──────────┘ + │ │ + └───────────────────┬───────────────┘ + │ + ┌──────────────▼────────────────┐ + │ Database (SurrealDB) │ + │ - repository.py (CRUD ops) │ + │ - async_migrate.py (schema) │ + │ - Configuration │ + 
└────────────────────────────────┘ +``` + +## Component Catalog + +### Core Layers + +**See dedicated CLAUDE.md files for detailed patterns and usage:** + +- **`database/`**: Async repository pattern (repo_query, repo_create, repo_upsert), connection pooling, and automatic schema migrations on API startup. See `database/CLAUDE.md`. + +- **`domain/`**: Core data models using Pydantic with SurrealDB persistence. Two base classes: `ObjectModel` (mutable records with auto-increment IDs and embedding) and `RecordModel` (singleton configuration). Includes search functions (text_search, vector_search). See `domain/CLAUDE.md`. + +- **`graphs/`**: LangGraph state machines for async workflows. Content ingestion (source.py), conversational agents (chat.py), search synthesis (ask.py), and transformations. Uses provision_langchain_model() for smart model selection with token-aware fallback. See `graphs/CLAUDE.md`. + +- **`ai/`**: Centralized AI model lifecycle via Esperanto library. ModelManager factory with intelligent fallback (large context detection, type-specific defaults, config override). Supports 8+ providers (OpenAI, Anthropic, Google, Groq, Ollama, Mistral, DeepSeek, xAI). See `ai/CLAUDE.md`. + +- **`utils/`**: Cross-cutting utilities: ContextBuilder (flexible context assembly from sources/notes/insights with token budgeting), TextUtils (truncation, cleaning), TokenUtils (GPT token counting), VersionUtils (schema compatibility). See `utils/CLAUDE.md`. + +- **`podcasts/`**: Podcast generation models: SpeakerProfile (TTS voice config), EpisodeProfile (generation settings), PodcastEpisode (job tracking via surreal-commands). See `podcasts/CLAUDE.md`. + +### Configuration & Exceptions + +- **`config.py`**: Paths for data folder, uploads, LangGraph checkpoints, and tiktoken cache. Auto-creates directories. +- **`exceptions.py`**: Hierarchy of OpenNotebookError subclasses for database, file, network, authentication, and rate-limit failures. 
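The exception hierarchy in `exceptions.py` can be sketched as follows. Only `OpenNotebookError` is named in this document; the subclass names below are illustrative stand-ins for the database, authentication, and rate-limit failure categories it mentions.

```python
# Hedged sketch of the OpenNotebookError hierarchy; subclass names are
# assumptions, not the module's actual class names.
class OpenNotebookError(Exception):
    """Base class for all Open Notebook errors."""

class DatabaseOperationError(OpenNotebookError):
    """Raised when a SurrealDB query or connection fails."""

class AuthenticationError(OpenNotebookError):
    """Raised when provider credentials are missing or rejected."""

class RateLimitError(OpenNotebookError):
    """Raised when an AI provider throttles requests."""
```

A shared base class lets API endpoints catch `OpenNotebookError` once and map any backend failure to a consistent HTTP error response.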
+ +## Data Flow: Content Ingestion + +``` +User uploads file/URL + │ + ▼ +┌─────────────────────────────────────┐ +│ source.py (LangGraph state machine) │ +├─────────────────────────────────────┤ +│ 1. content_process() │ +│ - extract_content() from file/URL│ +│ - Use ContentSettings defaults │ +│ - speech_to_text model from DB │ +│ │ +│ 2. save_source() │ +│ - Update Source with full_text │ +│ - Preserve title if empty │ +│ │ +│ 3. trigger_transformations() │ +│ - Parallel fan-out to each TXN │ +└────────────────┬────────────────────┘ + │ + ▼ + ┌──────────────┐ + │ transformation.py (parallel) + │ - Apply prompt to source text + │ - Generate insights + │ - Auto-embed results + └──────────────┘ + │ + ▼ + ┌────────────────────┐ + │ Database Storage │ + │ - Source.full_text │ + │ - SourceInsight │ + │ - Embeddings │ + │ - (async job) │ + └────────────────────┘ +``` + +**Fire-and-forget embeddings**: Source.vectorize() returns command_id without awaiting; embedding happens asynchronously via surreal-commands job system. + +## Data Flow: Chat & Search + +``` +User message in chat + │ + ▼ +┌──────────────────────────┐ +│ ContextBuilder │ +│ - Select sources/notes │ +│ - Token budget limiting │ +│ - Priority weighting │ +└──────────┬───────────────┘ + │ + ▼ +┌──────────────────────────────────┐ +│ chat.py or ask.py (LangGraph) │ +│ - Load context from above │ +│ - provision_langchain_model() │ +│ * Auto-upgrade for large text │ +│ * Apply model_id override │ +│ - Call LLM with context │ +│ - Store message in SqliteSaver │ +└──────────┬───────────────────────┘ + │ + ▼ + ┌──────────────┐ + │ LLM Response │ + │ (persisted) │ + └──────────────┘ +``` + +## Key Patterns Across Layers + +### Async/Await Everywhere +All database operations, model provisioning, and graph execution are async. Mix with sync code only via `asyncio.run()` or LangGraph's async bridges (see graphs/CLAUDE.md for workarounds). 
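The sync-to-async bridge mentioned above can be sketched like this. It is a hedged illustration of the general technique, not the project's actual implementation: `fetch_source()` is a hypothetical stand-in for a real async repository call.

```python
# A synchronous graph node that needs an async call runs the coroutine on a
# private event loop inside a worker thread, which works even when the
# calling thread already has a running event loop.
import asyncio
from concurrent.futures import ThreadPoolExecutor

async def fetch_source(source_id: str) -> dict:
    # Stand-in for an awaited SurrealDB repository query.
    await asyncio.sleep(0)
    return {"id": source_id, "full_text": "..."}

def run_coro_blocking(coro):
    """Run a coroutine from sync code via a dedicated worker thread."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()

def source_node(state: dict) -> dict:
    # A sync LangGraph-style node bridging into async persistence code.
    source = run_coro_blocking(fetch_source(state["source_id"]))
    return {"source": source}
```

As the quirks list notes, this workaround is fragile: the worker thread's loop cannot share connections or tasks with the caller's loop.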
+ +### Type-Driven Dispatch +Model types (language, embedding, speech_to_text, text_to_speech) drive factory logic in ModelManager. Domain model IDs encode their type: `notebook:uuid`, `source:uuid`, `note:uuid`. + +### Smart Fallback Logic +`provision_langchain_model()` auto-detects large contexts (105K+ tokens) and upgrades to dedicated large_context_model. Falls back to default_chat_model if specific type not found. + +### Fire-and-Forget Jobs +Time-consuming operations (embedding, podcast generation) return command_id immediately. Caller polls surreal-commands for status; no blocking. + +### Embedding on Save +Domain models with `needs_embedding()=True` auto-generate embeddings in `save()`. Search functions (text_search, vector_search) use embeddings for semantic matching. + +### Relationship Management +SurrealDB graph edges link entities: Notebook→Source (has), Source→Note (artifact), Note→Source (refers_to). See `relate()` in domain/base.py. + +## Integration Points + +**API startup** (`api/main.py`): +- AsyncMigrationManager.run_migration_up() on lifespan startup +- Ensures schema is current before handling requests + +**Streamlit UI** (`pages/stream_app/`): +- Calls domain models directly to fetch/create notebooks, sources, notes +- Invokes graphs (chat, source, ask) via async wrapper +- Relies on API for migrations (deprecated check in UI) + +**Background Jobs** (`surreal_commands`): +- Source.vectorize() submits async embedding job +- PodcastEpisode.get_job_status() polls job queue +- Decouples long-running operations from request flow + +## Important Quirks & Gotchas + +1. **Token counting rough estimate**: Uses cl100k_base encoding; may differ 5-10% from actual model +2. **Large context threshold hard-coded**: 105,000 token limit for large_context_model upgrade (not configurable) +3. **Async loop gymnastics in graphs**: ThreadPoolExecutor workaround for LangGraph sync nodes calling async functions (fragile) +4. 
**DefaultModels always fresh**: get_instance() bypasses singleton cache to pick up live config changes +5. **Polymorphic model.get()**: Resolves subclass from ID prefix; fails silently if subclass not imported +6. **RecordID string inconsistency**: repo_update() accepts both "table:id" format and full RecordID +7. **Snapshot profiles**: podcast profiles stored as dicts, so config updates don't affect past episodes +8. **No connection pooling**: Each repo_* creates new connection (adequate for HTTP but inefficient for bulk) +9. **Circular import guard**: utils imports domain; domain must not import utils (breaks on import) +10. **SqliteSaver shared location**: LangGraph checkpoints from LANGGRAPH_CHECKPOINT_FILE env var; all graphs use same file + +## How to Add New Feature + +**New data model**: +1. Create class inheriting from `ObjectModel` with `table_name` ClassVar +2. Define Pydantic fields and validators +3. Override `needs_embedding()` if searchable +4. Add custom methods for domain logic (get_X, add_to_Y) +5. Register in domain/__init__.py exports + +**New workflow**: +1. Create state machine in graphs/WORKFLOW.py using StateGraph +2. Import domain models and provision_langchain_model() +3. Define nodes as async functions taking State, returning dict +4. Compile with graph.compile() +5. Invoke from API endpoint or Streamlit page + +**New AI model type**: +1. Add type string to Model class +2. Add AIFactory.create_* method in Esperanto +3. Handle in ModelManager.get_model() +4. Add DefaultModels field + getter + +## Key Dependencies + +- **surrealdb**: AsyncSurreal client, RecordID type +- **pydantic**: Validation, field_validator +- **langgraph**: StateGraph, Send, SqliteSaver, async/sync bridging +- **langchain_core**: Messages, OutputParser, RunnableConfig +- **esperanto**: Multi-provider AI model abstraction (OpenAI, Anthropic, Google, Groq, Ollama, etc.) 
+- **content-core**: File/URL content extraction +- **ai_prompter**: Jinja2 template rendering for prompts +- **surreal_commands**: Async job queue for embeddings, podcast generation +- **loguru**: Structured logging throughout +- **tiktoken**: GPT token encoding for context window estimation + +## Codebase Statistics + +- **Modules**: 6 core layers + support services +- **Async operations**: Database, AI provisioning, graph execution, embedding, job tracking +- **Supported AI providers**: 8+ (OpenAI, Anthropic, Google, Groq, Ollama, Mistral, DeepSeek, xAI, OpenRouter) +- **Domain models**: Notebook, Source, Note, SourceInsight, SourceEmbedding, ChatSession, Asset, Transformation, ContentSettings, EpisodeProfile, SpeakerProfile, PodcastEpisode +- **Graph workflows**: 6 (source, chat, source_chat, ask, transformation, prompt) diff --git a/open_notebook/ai/CLAUDE.md b/open_notebook/ai/CLAUDE.md new file mode 100644 index 0000000..604a54d --- /dev/null +++ b/open_notebook/ai/CLAUDE.md @@ -0,0 +1,109 @@ +# AI Module + +Model configuration, provisioning, and management for multi-provider AI integration via Esperanto. + +## Purpose + +Centralizes AI model lifecycle: database models for model metadata (provider, type), default model configuration, and factory for instantiating LLM/embedding/speech models at runtime with fallback logic. + +## Architecture Overview + +**Two-tier system**: +1. **Database models** (`Model`, `DefaultModels`): Metadata storage and default configuration +2. **ModelManager**: Factory for provisioning models with intelligent fallback (large context detection, config override) + +All models use Esperanto library as provider abstraction (OpenAI, Anthropic, Google, Groq, Ollama, Mistral, DeepSeek, xAI, OpenRouter). 
+ +## Component Catalog + +### models.py + +#### Model (ObjectModel) +- Database record: name, provider, type (language/embedding/speech_to_text/text_to_speech) +- `get_models_by_type()`: Async query to fetch all models of a specific type +- Stores provider-model pairs for AI factory instantiation + +#### DefaultModels (RecordModel) +- Singleton configuration record (record_id: `open_notebook:default_models`) +- Fields: default_chat_model, default_transformation_model, large_context_model, default_text_to_speech_model, default_speech_to_text_model, default_embedding_model, default_tools_model +- `get_instance()`: Always fetches fresh from database (overrides parent caching for real-time updates) +- Returns fresh instance on each call (no singleton cache) + +#### ModelManager +- Stateless factory for instantiating AI models +- `get_model(model_id)`: Retrieves Model by ID, creates via AIFactory.create_* based on type +- `get_defaults()`: Fetches DefaultModels configuration +- `get_default_model(model_type)`: Smart lookup (e.g., "chat" → default_chat_model, "transformation" → default_transformation_model with fallback to chat) +- `get_speech_to_text()`, `get_text_to_speech()`, `get_embedding_model()`: Type-specific convenience methods with assertions +- **Global instance**: `model_manager` singleton exported for use throughout app + +### provision.py + +#### provision_langchain_model() +- Factory for LangGraph nodes needing LLM provisioning +- **Smart fallback logic**: + - If tokens > 105,000: Use `large_context_model` + - Elif `model_id` specified: Use specific model + - Else: Use default model for type (e.g., "chat", "transformation") +- Returns LangChain-compatible model via `.to_langchain()` +- Logs model selection decision + +## Common Patterns + +- **Type dispatch**: Model.type field drives factory logic (4 model types) +- **Provider abstraction**: Esperanto handles provider differences; ModelManager unaware of provider specifics +- **Fresh defaults**: 
DefaultModels.get_instance() always fetches from database (not cached) for live config updates +- **Config override**: provision_langchain_model() accepts kwargs passed to AIFactory.create_* methods +- **Token-based selection**: provision_langchain_model() detects large contexts and upgrades model automatically +- **Type assertions**: get_speech_to_text(), get_embedding_model() assert returned type (safety check) + +## Key Dependencies + +- `esperanto`: AIFactory.create_language(), create_embedding(), create_speech_to_text(), create_text_to_speech() +- `open_notebook.database.repository`: repo_query, ensure_record_id +- `open_notebook.domain.base`: ObjectModel, RecordModel base classes +- `open_notebook.utils`: token_count() for context size detection +- `loguru`: Logging for model selection decisions + +## Important Quirks & Gotchas + +- **Token counting rough estimate**: provision_langchain_model() uses token_count() which estimates via cl100k_base encoding (may differ 5-10% from actual model) +- **Large context threshold hard-coded**: 105,000 token threshold for large_context_model upgrade (not configurable) +- **DefaultModels.get_instance() fresh fetch**: Intentionally bypasses parent singleton cache to pick up live config changes; creates new instance each call +- **Type-specific getters use assertions**: get_speech_to_text() asserts isinstance (catches misconfiguration early) +- **No validation of model existence**: ModelManager.get_model() raises ValueError if model not found (not caught upstream) +- **Esperanto caching**: Actual model instances cached by Esperanto (not by ModelManager); ModelManager stateless +- **Fallback chain specificity**: "transformation" type falls back to default_chat_model if not explicitly set (convention-based) +- **kwargs passed through**: provision_langchain_model() passes kwargs to AIFactory but doesn't validate what's accepted + +## How to Extend + +1. 
**Add new model type**: Add type string to Model.type enum, add create_* method in AIFactory, handle in ModelManager.get_model() +2. **Add new default configuration**: Extend DefaultModels with new field (e.g., default_vision_model), add getter in ModelManager +3. **Change fallback logic**: Modify provision_langchain_model() token threshold or fallback chain +4. **Add model filtering**: Extend Model.get_models_by_type() with additional filters (e.g., by provider) +5. **Implement model caching**: Wrap ModelManager methods with functools.lru_cache (be aware of kwargs mutability) + +## Usage Example + +```python +from open_notebook.ai.models import model_manager + +# Get default chat model +chat_model = await model_manager.get_default_model("chat") + +# Get specific model by ID +embedding_model = await model_manager.get_model("model:openai_embedding") + +# Get embedding model with config override +embedding_model = await model_manager.get_embedding_model(temperature=0.1) + +# Provision model for LangGraph (auto-detects large context) +from open_notebook.ai.provision import provision_langchain_model +langchain_model = await provision_langchain_model( + content=long_text, + model_id=None, # Use default + default_type="chat", + temperature=0.7 +) +``` diff --git a/open_notebook/database/CLAUDE.md b/open_notebook/database/CLAUDE.md new file mode 100644 index 0000000..17d808d --- /dev/null +++ b/open_notebook/database/CLAUDE.md @@ -0,0 +1,124 @@ +# Database Module + +SurrealDB abstraction layer providing repository pattern for CRUD operations and async migration management. + +## Purpose + +Encapsulates all database interactions: connection pooling, async CRUD operations, relationship management, and schema migrations. Provides clean interface for domain models and API endpoints to interact with SurrealDB without direct query knowledge. + +## Architecture Overview + +Two-tier system: +1. 
**Repository Layer** (repository.py): Raw async CRUD operations on SurrealDB via AsyncSurreal client +2. **Migration Layer** (async_migrate.py): Schema versioning and migration execution + +Both leverage connection context manager for lifecycle management and automatic cleanup. + +## Component Catalog + +### repository.py + +**Connection Management** +- `get_database_url()`: Resolves `SURREAL_URL` or constructs from `SURREAL_ADDRESS`/`SURREAL_PORT` (backward compatible) +- `get_database_password()`: Falls back from `SURREAL_PASSWORD` to legacy `SURREAL_PASS` env var +- `db_connection()`: Async context manager handling sign-in, namespace/database selection, and cleanup + - Opens AsyncSurreal, authenticates, selects namespace/database, yields connection, closes on exit + +**Query Operations** +- `repo_query(query_str, vars)`: Execute raw SurrealQL with parameter substitution; returns list of dicts +- `repo_create(table, data)`: Insert record; auto-adds `created`/`updated` timestamps; removes any existing `id` field +- `repo_insert(table, data_list, ignore_duplicates)`: Bulk insert multiple records; optionally ignores "already contains" errors +- `repo_upsert(table, id, data, add_timestamp)`: MERGE operation for create-or-update; optionally adds `updated` timestamp +- `repo_update(table, id, data)`: Update existing record by table+id or full record_id; auto-adds `updated`, parses ISO dates +- `repo_delete(record_id)`: Delete record by RecordID +- `repo_relate(source, relationship, target, data)`: Create graph relationship; optional relationship data + +**Utilities** +- `parse_record_ids(obj)`: Recursively converts SurrealDB RecordID objects to strings (deep tree traversal) +- `ensure_record_id(value)`: Coerces string or RecordID to RecordID type + +### async_migrate.py + +**Migration Classes** +- `AsyncMigration`: Single migration wrapper + - `from_file(path)`: Load .surrealql file; strips comments and whitespace + - `run(bump)`: Execute SQL; call bump_version() on 
success (bump=True) or lower_version() (bump=False) + +- `AsyncMigrationRunner`: Sequences multiple migrations + - `run_all()`: Execute pending migrations from current_version to end + - `run_one_up()`: Run next migration + - `run_one_down()`: Rollback latest migration + +- `AsyncMigrationManager`: Main orchestrator + - Loads 9 up migrations + 9 down migrations (hard-coded in __init__) + - `get_current_version()`: Query max version from _sbl_migrations table + - `needs_migration()`: Boolean check (current < total migrations available) + - `run_migration_up()`: Run all pending migrations with logging + +**Version Tracking** +- `get_latest_version()`: Query max version; returns 0 if _sbl_migrations table missing +- `get_all_versions()`: Fetch all migration records; returns empty list on error +- `bump_version()`: INSERT new entry into _sbl_migrations with version + applied_at timestamp +- `lower_version()`: DELETE latest migration record (rollback) + +### migrate.py + +**Backward Compatibility** +- `MigrationManager`: Sync wrapper around AsyncMigrationManager + - `get_current_version()`: Wraps async call with asyncio.run() + - `needs_migration` property: Checks if migration pending + - `run_migration_up()`: Execute migrations synchronously + +## Common Patterns + +- **Async-first design**: All operations async via AsyncSurreal; sync wrapper provided for legacy code +- **Connection per operation**: Each repo_* function opens/closes connection (no pooling); designed for serverless/stateless API +- **Auto-timestamping**: repo_create() and repo_update() auto-set `created`/`updated` fields +- **Error resilience**: RuntimeError for transaction conflicts (retriable); catches and re-raises other exceptions +- **RecordID polymorphism**: Functions accept string or RecordID; coerced to consistent type +- **Graceful degradation**: Migration queries catch exceptions and treat table-not-found as version 0 + +## Key Dependencies + +- `surrealdb`: AsyncSurreal client, RecordID type 
+- `loguru`: Logging with context (debug/error/success levels) +- Python stdlib: `os` (env vars), `datetime` (timestamps), `contextlib` (async context manager) + +## Important Quirks & Gotchas + +- **No connection pooling**: Each repo_* operation creates a new connection; adequate for HTTP request-scoped operations but inefficient for bulk workloads +- **Hard-coded migration files**: AsyncMigrationManager lists migrations 1-9 explicitly; adding a new migration requires a code change (not auto-discovery) +- **Record ID format inconsistency**: repo_update() accepts both the `table:id` string format and a full RecordID; the two inputs follow different normalization paths, so edge cases can behave subtly differently +- **ISO date parsing**: repo_update() parses the `created` field from string to datetime if present; assumes ISO format +- **Timestamp overwrite risk**: repo_create() always sets new timestamps; can't preserve the original created time on reimport +- **Transaction conflict handling**: RuntimeError from transaction conflicts is logged without a stack trace (prevents log spam) +- **Graceful null returns**: get_all_versions() returns [] when the table is missing; allows the migration system to bootstrap cleanly + +## How to Extend + +1. **Add new CRUD operation**: Follow the repo_* pattern (open connection, execute query, handle errors, close) +2. **Add migration**: Create migration files at `/migrations/N.surrealql` and `/migrations/N_down.surrealql`; update AsyncMigrationManager to load the new files +3. **Change timestamp behavior**: Modify repo_create()/repo_update() to not auto-set the `updated` field when the caller provides one +4. 
**Implement connection pooling**: Replace db_connection context manager with pool.acquire() pattern (for high-throughput scenarios) + +## Integration Points + +- **API startup** (api/main.py): FastAPI lifespan handler calls AsyncMigrationManager.run_migration_up() on server start +- **Domain models** (domain/*.py): All models call repo_* functions for persistence +- **Commands** (commands/*.py): Background jobs use repo_* for state updates +- **Streamlit UI** (pages/*.py): Deprecated migration check; relies on API to run migrations + +## Usage Example + +```python +from open_notebook.database.repository import repo_create, repo_query, repo_update + +# Create +record = await repo_create("notebooks", {"title": "Research"}) + +# Query +results = await repo_query("SELECT * FROM notebooks WHERE title = $title", {"title": "Research"}) + +# Update +await repo_update("notebooks", record["id"], {"title": "Updated Research"}) +``` diff --git a/open_notebook/domain/CLAUDE.md b/open_notebook/domain/CLAUDE.md new file mode 100644 index 0000000..81075fd --- /dev/null +++ b/open_notebook/domain/CLAUDE.md @@ -0,0 +1,100 @@ +# Domain Module + +Core data models for notebooks, sources, notes, and settings with async SurrealDB persistence, auto-embedding, and relationship management. + +## Purpose + +Two base classes support different persistence patterns: **ObjectModel** (mutable records with auto-increment IDs) and **RecordModel** (singleton configuration with fixed IDs). 
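The singleton side of this split is worth seeing in isolation. A minimal sketch of the per-subclass singleton pattern (illustrative only — the real RecordModel is a Pydantic model with async database loading, and the `ContentSettings` record id shown here is an assumption):

```python
class RecordModel:
    """Sketch of a per-subclass singleton keyed to a fixed record id."""

    record_id: str = ""  # fixed id, e.g. "open_notebook:default_models"
    _instance = None     # one cached instance per subclass

    def __new__(cls, *args, **kwargs):
        # Look up _instance on the subclass's own __dict__ so subclasses
        # never inherit (and share) the parent's cached instance.
        if cls.__dict__.get("_instance") is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    @classmethod
    def clear_instance(cls):
        # Reset the cache so each test starts with a fresh singleton
        cls._instance = None


class ContentSettings(RecordModel):
    record_id = "open_notebook:content_settings"  # hypothetical id for illustration
```

Repeated constructions return the same object until `clear_instance()` is called — which is why the quirks below recommend calling it in tests.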
+ +## Key Components + +### base.py +- **ObjectModel**: Base for notebooks, sources, notes + - `save()`: Create/update with auto-embedding for searchable content + - `delete()`: Remove by ID + - `relate(relationship, target_id)`: Create graph relationships (reference, artifact, refers_to) + - `get(id)`: Polymorphic fetch; resolves subclass from ID prefix + - `get_all(order_by)`: Fetch all records from table + - Integrates with ModelManager for automatic embedding + +- **RecordModel**: Singleton configuration (ContentSettings, DefaultPrompts) + - Fixed record_id per subclass + - `update()`: Upsert to database + - Lazy DB loading via `_load_from_db()` + +### notebook.py +- **Notebook**: Research project container + - `get_sources()`, `get_notes()`, `get_chat_sessions()`: Navigate relationships + +- **Source**: Content item (file/URL) + - `vectorize()`: Submit async embedding job (returns command_id, fire-and-forget) + - `get_status()`, `get_processing_progress()`: Track job via surreal_commands + - `get_context()`: Returns summary for LLM context + - `add_insight()`: Generate and store insights with embeddings + +- **Note**: Standalone or linked notes + - `needs_embedding()`: Always True (searchable) + - `add_to_notebook()`: Link to notebook + +- **SourceInsight, SourceEmbedding**: Derived content models +- **ChatSession**: Conversation container with optional model_override +- **Asset**: File/URL reference helper + +- **Search functions**: + - `text_search()`: Full-text keyword search + - `vector_search()`: Semantic search via embeddings (default minimum_score=0.2) + +### content_settings.py +- **ContentSettings**: Singleton for processing engines, embedding strategy, file deletion, YouTube languages + +### transformation.py +- **Transformation**: Reusable prompts for content transformation +- **DefaultPrompts**: Singleton with transformation instructions + +## Important Patterns + +- **Async/await**: All DB operations async; always use await +- **Polymorphic 
get()**: `ObjectModel.get(id)` determines subclass from ID prefix (table:id format) +- **Auto-embedding**: `save()` generates embeddings if `needs_embedding()` returns True +- **Nullable fields**: Declare via `nullable_fields` ClassVar to allow None in database +- **Timestamps**: `created` and `updated` auto-managed as ISO strings +- **Fire-and-forget jobs**: `source.vectorize()` returns command_id without waiting + +## Key Dependencies + +- `surrealdb`: RecordID type for relationships +- `pydantic`: Validation and field_validator decorators +- `open_notebook.database.repository`: CRUD and relationship functions +- `open_notebook.ai.models`: ModelManager for embeddings +- `surreal_commands`: Async job submission (vectorization, insights) +- `loguru`: Logging + +## Quirks & Gotchas + +- **Polymorphic resolution**: `ObjectModel.get()` fails if subclass not imported (search subclasses list) +- **RecordModel singleton**: __new__ returns existing instance; call `clear_instance()` in tests +- **Source.command field**: Stored as RecordID; auto-parsed from strings via field_validator +- **Text truncation**: `Note.get_context(short)` hardcodes 100-char limit +- **Embedding async**: Only Note and SourceInsight embed on save; Source too large (uses async job) +- **Relationship strings**: Must match SurrealDB schema (reference, artifact, refers_to) + +## How to Add New Model + +1. Inherit from ObjectModel with table_name ClassVar +2. Define Pydantic fields with validators +3. Override `needs_embedding()` if searchable +4. Add custom methods for domain logic (get_X, add_to_Y) +5. 
Implement `_prepare_save_data()` if custom serialization needed + +## Usage + +```python +notebook = Notebook(name="Research", description="My project") +await notebook.save() + +obj = await ObjectModel.get("notebook:123") # Polymorphic fetch + +# Search +await text_search("quantum", results=5) +await vector_search("quantum computing", results=10, minimum_score=0.3) +``` diff --git a/open_notebook/graphs/CLAUDE.md b/open_notebook/graphs/CLAUDE.md new file mode 100644 index 0000000..3406576 --- /dev/null +++ b/open_notebook/graphs/CLAUDE.md @@ -0,0 +1,61 @@ +# Graphs Module + +LangGraph-based workflow orchestration for content processing, chat interactions, and AI-powered transformations. + +## Key Components + +- **`chat.py`**: Conversational agent with message history, notebook context, and model override support +- **`source_chat.py`**: Source-focused chat with ContextBuilder for insights/content injection and context tracking +- **`ask.py`**: Multi-search strategy agent (generates search terms, retrieves results, synthesizes answers) +- **`source.py`**: Content ingestion pipeline (extract → save → transform with content-core) +- **`transformation.py`**: Single-node transformation executor with prompt templating via ai_prompter +- **`prompt.py`**: Generic pattern chain for arbitrary prompt-based LLM calls +- **`tools.py`**: Minimal tool library (currently just `get_current_timestamp()`) + +## Important Patterns + +- **Async/sync bridging in graphs**: Both `chat.py` and `source_chat.py` use `asyncio.new_event_loop()` workaround because LangGraph nodes are sync but `provision_langchain_model()` is async +- **State machines via StateGraph**: Each graph compiles to stateful runnable; conditional edges fan out work (ask.py, source.py do parallel transforms) +- **Prompt templating**: `ai_prompter.Prompter` with Jinja2 templates referenced by path ("chat/system", "ask/entry", etc.) 
+- **Model provisioning via context**: Config dict passed to node via `RunnableConfig`; defaults fall back to state overrides +- **Checkpointing**: `chat.py` and `source_chat.py` use SqliteSaver for message history (LangGraph's built-in persistence) +- **Content extraction**: `source.py` uses content-core library with provider/model from DefaultModels; URLs and files both supported + +## Quirks & Edge Cases + +- **Async loop gymnastics**: ThreadPoolExecutor workaround needed because LangGraph invokes sync nodes but we call async functions; fragile if event loop state changes +- **`clean_thinking_content()` ubiquitous**: Strips `...` tags from model responses (handles extended thinking models) +- **source_chat.py builds context twice**: ContextBuilder runs during node execution to fetch source/insights; rebuilds list from context_data (inefficient but safe) +- **source.py embedding is async**: `source.vectorize()` returns job command ID; not awaited (fire-and-forget) +- **transformation.py nullable source**: Accepts `input_text` or `source.full_text` (falls back to second if first missing) +- **ask.py hard-coded vector_search**: No fallback to text search despite commented code suggesting it was planned +- **SqliteSaver location**: Checkpoints stored in path from `LANGGRAPH_CHECKPOINT_FILE` env var; connection shared across graphs + +## Key Dependencies + +- `langgraph`: StateGraph, Send, END, START, SqliteSaver checkpoint persistence +- `langchain_core`: Messages, OutputParser, RunnableConfig +- `ai_prompter`: Prompter for Jinja2 template rendering +- `content_core`: `extract_content()` for file/URL processing +- `open_notebook.ai.provision`: `provision_langchain_model()` (async factory with fallback logic) +- `open_notebook.domain.notebook`: Domain models (Source, Note, SourceInsight, vector_search) +- `loguru`: Logging + +## Usage Example + +```python +# Invoke a graph with config override +config = {"configurable": {"model_id": "model:custom_id"}} +result = 
await chat_graph.ainvoke( + {"messages": [HumanMessage(content="...")], "notebook": notebook}, + config=config +) + +# Source processing (content → save → transform) +result = await source_graph.ainvoke({ + "content_state": {...}, # ProcessSourceState from content-core + "apply_transformations": [t1, t2], + "source_id": "source:123", + "embed": True +}) +``` diff --git a/open_notebook/plugins/podcasts.py b/open_notebook/plugins/podcasts.py deleted file mode 100644 index 9afabac..0000000 --- a/open_notebook/plugins/podcasts.py +++ /dev/null @@ -1,293 +0,0 @@ -from typing import ClassVar, List, Optional - -from loguru import logger -from podcastfy.client import generate_podcast -from pydantic import Field, field_validator, model_validator - -from open_notebook.config import DATA_FOLDER -from open_notebook.domain.notebook import ObjectModel - - -class PodcastEpisode(ObjectModel): - table_name: ClassVar[str] = "podcast_episode" - name: str - template: str - instructions: str - text: str - audio_file: str - - -class PodcastConfig(ObjectModel): - table_name: ClassVar[str] = "podcast_config" - name: str - podcast_name: str - podcast_tagline: str - output_language: str = Field(default="English") - person1_role: List[str] - person2_role: List[str] - conversation_style: List[str] - engagement_technique: List[str] - dialogue_structure: List[str] - transcript_model: Optional[str] = None - transcript_model_provider: Optional[str] = None - user_instructions: Optional[str] = None - ending_message: Optional[str] = None - creativity: float = Field(ge=0, le=1) - provider: str = Field(default="openai") - voice1: str - voice2: str - model: str - - # Backwards compatibility - @field_validator("person1_role", "person2_role", mode="before") - @classmethod - def split_string_to_list(cls, value): - if isinstance(value, str): - return [item.strip() for item in value.split(",")] - return value - - @model_validator(mode="after") - def validate_voices(self) -> "PodcastConfig": - if not 
self.voice1 or not self.voice2: - raise ValueError("Both voice1 and voice2 must be provided") - return self - - async def generate_episode( - self, - episode_name: str, - text: str, - instructions: str = "", - longform: bool = False, - chunks: int = 8, - min_chunk_size=600, - ): - self.user_instructions = ( - instructions if instructions else self.user_instructions - ) - conversation_config = { - "max_num_chunks": chunks, - "min_chunk_size": min_chunk_size, - "conversation_style": self.conversation_style, - "roles_person1": self.person1_role, - "roles_person2": self.person2_role, - "dialogue_structure": self.dialogue_structure, - "podcast_name": self.podcast_name, - "podcast_tagline": self.podcast_tagline, - "output_language": self.output_language, - "user_instructions": self.user_instructions, - "engagement_techniques": self.engagement_technique, - "creativity": self.creativity, - "text_to_speech": { - "output_directories": { - "transcripts": f"{DATA_FOLDER}/podcasts/transcripts", - "audio": f"{DATA_FOLDER}/podcasts/audio", - }, - "temp_audio_dir": f"{DATA_FOLDER}/podcasts/audio/tmp", - "ending_message": "Thank you for listening to this episode. 
Don't forget to subscribe to our podcast for more interesting conversations.", - "default_tts_model": self.provider, - self.provider: { - "default_voices": { - "question": self.voice1, - "answer": self.voice2, - }, - "model": self.model, - }, - "audio_format": "mp3", - }, - } - - api_key_label = None - llm_model_name = None - tts_model = None - - if self.transcript_model_provider: - if self.transcript_model_provider == "openai": - api_key_label = "OPENAI_API_KEY" - llm_model_name = self.transcript_model - elif self.transcript_model_provider == "anthropic": - api_key_label = "ANTHROPIC_API_KEY" - llm_model_name = self.transcript_model - elif self.transcript_model_provider == "gemini": - api_key_label = "GOOGLE_API_KEY" - llm_model_name = self.transcript_model - - if self.provider == "google": - tts_model = "gemini" - elif self.provider == "openai": - tts_model = "openai" - elif self.provider == "anthropic": - tts_model = "anthropic" - elif self.provider == "vertexai": - tts_model = "geminimulti" - elif self.provider == "elevenlabs": - tts_model = "elevenlabs" - - logger.info( - f"Generating episode {episode_name} with config {conversation_config} and using model {llm_model_name}, tts model {tts_model}" - ) - - try: - audio_file = generate_podcast( - conversation_config=conversation_config, - text=text, - tts_model=tts_model, - llm_model_name=llm_model_name, - api_key_label=api_key_label, - longform=longform, - ) - episode = PodcastEpisode( - name=episode_name, - template=self.name, - instructions=instructions, - text=str(text), - audio_file=audio_file, - ) - await episode.save() - except Exception as e: - logger.error(f"Failed to generate episode {episode_name}: {e}") - raise - - @field_validator( - "name", "podcast_name", "podcast_tagline", "output_language", "model" - ) - @classmethod - def validate_required_strings(cls, value: str, field) -> str: - if value is None or value.strip() == "": - raise ValueError(f"{field.field_name} cannot be None or empty string") - 
return value.strip() - - @field_validator("creativity") - def validate_creativity(cls, value): - if not 0 <= value <= 1: - raise ValueError("Creativity must be between 0 and 1") - return value - - -conversation_styles = [ - "Analytical", - "Argumentative", - "Informative", - "Humorous", - "Casual", - "Formal", - "Inspirational", - "Debate-style", - "Interview-style", - "Storytelling", - "Satirical", - "Educational", - "Philosophical", - "Speculative", - "Motivational", - "Fun", - "Technical", - "Light-hearted", - "Serious", - "Investigative", - "Debunking", - "Didactic", - "Thought-provoking", - "Controversial", - "Sarcastic", - "Emotional", - "Exploratory", - "Fast-paced", - "Slow-paced", - "Introspective", -] - -# Dialogue Structures -dialogue_structures = [ - "Topic Introduction", - "Opening Monologue", - "Guest Introduction", - "Icebreakers", - "Historical Context", - "Defining Terms", - "Problem Statement", - "Overview of the Issue", - "Deep Dive into Subtopics", - "Pro Arguments", - "Con Arguments", - "Cross-examination", - "Expert Interviews", - "Case Studies", - "Myth Busting", - "Q&A Session", - "Rapid-fire Questions", - "Summary of Key Points", - "Recap", - "Key Takeaways", - "Actionable Tips", - "Call to Action", - "Future Outlook", - "Closing Remarks", - "Resource Recommendations", - "Trending Topics", - "Closing Inspirational Quote", - "Final Reflections", -] - -# Podcast Participant Roles -participant_roles = [ - "Main Summarizer", - "Questioner/Clarifier", - "Optimist", - "Skeptic", - "Specialist", - "Thesis Presenter", - "Counterargument Provider", - "Professor", - "Student", - "Moderator", - "Host", - "Co-host", - "Expert Guest", - "Novice", - "Devil's Advocate", - "Analyst", - "Storyteller", - "Fact-checker", - "Comedian", - "Interviewer", - "Interviewee", - "Historian", - "Visionary", - "Strategist", - "Critic", - "Enthusiast", - "Mediator", - "Commentator", - "Researcher", - "Reporter", - "Advocate", - "Debater", - "Explorer", -] - -# Engagement 
Techniques -engagement_techniques = [ - "Rhetorical Questions", - "Anecdotes", - "Analogies", - "Humor", - "Metaphors", - "Storytelling", - "Quizzes", - "Personal Testimonials", - "Quotes", - "Jokes", - "Emotional Appeals", - "Provocative Statements", - "Sarcasm", - "Pop Culture References", - "Thought Experiments", - "Puzzles and Riddles", - "Role-playing", - "Debates", - "Catchphrases", - "Statistics and Facts", - "Open-ended Questions", - "Challenges to Assumptions", - "Evoking Curiosity", -] diff --git a/open_notebook/podcasts/CLAUDE.md b/open_notebook/podcasts/CLAUDE.md new file mode 100644 index 0000000..95a8f6c --- /dev/null +++ b/open_notebook/podcasts/CLAUDE.md @@ -0,0 +1,68 @@ +# Podcasts Module + +Domain models for podcast generation featuring speaker and episode profile management with job tracking. + +## Purpose + +Encapsulates podcast metadata and configuration: speaker profiles (voice/personality config), episode profiles (generation settings), and podcast episodes (with job status tracking via surreal-commands). + +## Architecture Overview + +Two-tier profile system: +- **SpeakerProfile**: TTS provider/model + 1-4 speaker configurations (name, voice_id, backstory, personality) +- **EpisodeProfile**: Generation settings (outline/transcript models, segment count, briefing template) +- **PodcastEpisode**: Generated episode record linking profiles, content, and async job + +All inherit from `ObjectModel` (SurrealDB base class with table_name and save/load). 
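The 1-4 speaker rule above can be sketched as a standalone check (illustrative only; in the real model this runs as a Pydantic validator on SpeakerProfile, and the plain-dict speakers here are a simplification):

```python
REQUIRED_SPEAKER_FIELDS = ("name", "voice_id", "backstory", "personality")


def validate_speakers(speakers: list) -> list:
    """Enforce the 1-4 speaker count and the required per-speaker fields."""
    if not 1 <= len(speakers) <= 4:
        raise ValueError(f"Expected 1-4 speakers, got {len(speakers)}")
    for i, speaker in enumerate(speakers):
        # A field counts as missing if absent or empty
        missing = [f for f in REQUIRED_SPEAKER_FIELDS if not speaker.get(f)]
        if missing:
            raise ValueError(f"Speaker {i} missing required fields: {missing}")
    return speakers
```

Running the check at instantiation time is what produces the ValueError behavior described under SpeakerProfile below.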
+ +## Component Catalog + +### SpeakerProfile +- Validates 1-4 speakers with required fields: name, voice_id, backstory, personality +- Stores TTS provider/model (e.g., "elevenlabs", "openai") +- `get_by_name()` async query by profile name +- Raises ValueError on invalid speaker counts or missing fields + +### EpisodeProfile +- Configures outline/transcript generation: provider, model, num_segments (3-20 validated) +- References speaker_config by name +- Stores default_briefing template for episode generation +- `get_by_name()` async query + +### PodcastEpisode +- Stores episode_profile and speaker_profile as dicts (snapshots of config at generation time) +- Optional audio_file path, transcript/outline dicts +- **Job tracking**: command field links to surreal-commands RecordID +- `get_job_status()` fetches async job status via surreal-commands library +- `_prepare_save_data()` ensures command field is always RecordID format for database + +## Common Patterns + +- **Profile snapshots**: episode_profile and speaker_profile stored as dicts to freeze config at generation time +- **Field validation**: Pydantic validators enforce constraints (segment count, speaker count, required fields) +- **Async database access**: `get_by_name()` queries via repo_query +- **Job tracking**: command field delegates to surreal-commands; get_job_status() returns "unknown" on failure +- **Record ID handling**: ensure_record_id() converts string to RecordID before save + +## Key Dependencies + +- `pydantic`: Field validators, ObjectModel inheritance +- `surrealdb`: RecordID type for job references +- `open_notebook.database.repository`: repo_query, ensure_record_id +- `open_notebook.domain.base`: ObjectModel base class +- `surreal_commands` (optional): get_command_status() for job status + +## Important Quirks & Gotchas + +- **Snapshot approach**: Episode/speaker profiles stored as dicts (not references), so profile updates don't retroactively affect past episodes +- **Job status 
resilience**: get_job_status() catches all exceptions and returns "unknown" (no error propagation) +- **validate_speakers executes late**: Validators run at instantiation; bulk inserts may not trigger full validation +- **RecordID coercion**: ensure_record_id() handles both string and RecordID inputs; command field parsed during deserialization +- **No cascade delete**: Removing a profile doesn't cascade to episodes using it + +## How to Extend + +1. **Add new speaker field**: Add to required_fields list in validate_speakers() +2. **Add episode config field**: Validate in EpisodeProfile, update briefing generation code +3. **Add job metadata**: Extend PodcastEpisode with new fields (e.g., progress tracking) +4. **Change job provider**: Replace surreal-commands with alternative job queue library; update get_job_status() diff --git a/open_notebook/utils/CLAUDE.md b/open_notebook/utils/CLAUDE.md new file mode 100644 index 0000000..aec3539 --- /dev/null +++ b/open_notebook/utils/CLAUDE.md @@ -0,0 +1,113 @@ +# Utils Module + +Utility functions and helpers for context building, text processing, tokenization, and versioning. + +## Purpose + +Provides cross-cutting concerns: building LLM context from sources/insights, text utilities (truncation, cleaning), token counting, and version management. + +## Architecture Overview + +**Four core utilities**: +1. **context_builder.py**: Flexible context assembly from sources, notes, insights with token budgeting +2. **text_utils.py**: Text truncation, whitespace cleaning, formatting helpers +3. **token_utils.py**: Token counting for LLM context windows (wrapper around encoding library) +4. **version_utils.py**: Version parsing, comparison, and schema compatibility checks + +Each utility is stateless and can be imported independently. 
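As a rough sketch of the token-budget helpers (illustrative: the real `token_count()` encodes with tiktoken's cl100k_base, so the ~4 characters/token heuristic used here is only a stand-in):

```python
def token_count(text: str) -> int:
    # Stand-in estimate (~4 chars/token); the real helper encodes the text
    # with tiktoken's cl100k_base and counts the resulting tokens.
    return max(1, len(text) // 4) if text else 0


def remaining_tokens(max_tokens: int, used: int) -> int:
    # Never report a negative budget
    return max(0, max_tokens - used)


def fits_in_context(text: str, max_tokens: int) -> bool:
    return token_count(text) <= max_tokens
```

ContextBuilder uses the same budget arithmetic when deciding which prioritized items still fit under `max_tokens`.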
+ +## Component Catalog + +### context_builder.py +- **ContextItem**: Dataclass for individual context piece (id, type, content, priority, token_count) +- **ContextConfig**: Configuration for context building (sources/notes/insights selection, max tokens, priority weights) +- **ContextBuilder**: Main class assembling context + - `add_source()`: Include source by ID with inclusion level + - `add_note()`: Include note by ID + - `add_insight()`: Include insight by ID + - `build()`: Assemble context respecting token budget and priorities + - Uses vector_search to fetch source/insight content from SurrealDB + - Returns list of ContextItem objects sorted by priority + +**Key behavior**: +- Token counting is automatic (calculated in ContextItem.__post_init__) +- Max token enforcement via priority weighting (higher priority items included first) +- Type-specific fetching: sources → Source.full_text, notes → Note.content, insights → SourceInsight.content +- Raises DatabaseOperationError if source/note fetch fails + +### text_utils.py +- **truncate_text(text, max_chars, suffix="...")**: Truncates string, adds ellipsis +- **clean_text(text)**: Removes extra whitespace, normalizes newlines +- **extract_sentences(text, max_count)**: Splits text into sentences up to limit +- **normalize_whitespace(text)**: Collapse multiple spaces/newlines into single +- **format_for_llm(text)**: Combines cleaning + normalization for LLM consumption + +**Key behavior**: All functions are pure (no side effects); safe for high-volume processing + +### token_utils.py +- **token_count(text)**: Returns estimated token count for string (via encoding library) +- **remaining_tokens(max_tokens, used)**: Returns remaining tokens in budget +- **fits_in_context(text, max_tokens)**: Boolean check if text fits token budget + +**Key behavior**: Uses fixed encoding (cl100k_base for GPT models); may differ slightly from actual model tokenization + +### version_utils.py +- **parse_version(version_string)**: 
Parses "1.2.3" format; returns Version namedtuple +- **compare_versions(v1, v2)**: Returns -1 (v1 < v2), 0 (equal), 1 (v1 > v2) +- **is_compatible(current, required)**: Checks if current version meets requirement (e.g., current >= required) +- **schema_version_check()**: Validates database schema version on startup + +**Key behavior**: Assumes semantic versioning (MAJOR.MINOR.PATCH); non-standard formats raise ValueError + +## Common Patterns + +- **Dataclass-driven config**: ContextConfig used by ContextBuilder (immutable after init) +- **Token budgeting**: ContextBuilder respects max_tokens constraint; prioritizes high-priority items +- **Error handling resilience**: token_count() returns estimate; context_builder catches DB errors gracefully +- **Pure text functions**: text_utils functions are stateless utilities (no class needed) +- **Lazy evaluation**: ContextBuilder doesn't fetch items until build() called +- **Type hints throughout**: All functions use Optional, List, Dict for clarity + +## Key Dependencies + +- `open_notebook.domain.notebook`: Source, Note, SourceInsight models; vector_search function +- `open_notebook.exceptions`: DatabaseOperationError, NotFoundError +- `tiktoken` (via token_utils.py): Token encoding for GPT models +- `loguru`: Logging in context_builder (debug-level) + +## Important Quirks & Gotchas + +- **Token count estimation**: Uses cl100k_base encoding; may differ 5-10% from actual model tokens +- **Priority weights default**: If not specified, ContextConfig uses default weights (source=1, note=0.8, insight=1.2) +- **Vector search required**: ContextBuilder assumes vector_search is available on Notebook model; fails if not +- **Source.full_text vs content**: Uses full_text field (may include extracted text + metadata) +- **Type-specific fetch logic**: ContextItem.content stores raw dict; caller must parse (e.g., dict["content"]) +- **Circular import risk**: context_builder imports from domain.notebook; avoid domain importing utils 
+- **Max tokens hard limit**: ContextBuilder stops adding items once max_tokens exceeded (not prorated)
+- **No caching**: Every build() call re-fetches from database (use cache layer if needed)
+- **Whitespace normalization lossy**: clean_text() may change intended formatting (code blocks, poetry, etc.)
+
+## How to Extend
+
+1. **Add new context source type**: Create fetch method in ContextBuilder; update ContextConfig.sources dict
+2. **Add text preprocessing**: Add new function to text_utils (e.g., remove_urls, extract_keywords)
+3. **Change tokenization**: Replace tiktoken with alternative library in token_utils; update all calls
+4. **Add context filtering**: Extend ContextConfig with filter_by_date, filter_by_topic fields
+5. **Implement caching**: Cache results of ContextBuilder.build(); note functools.lru_cache does not work on async methods (it caches the coroutine object, which can only be awaited once), so cache the awaited result or use an async-aware cache
+
+## Usage Example
+
+```python
+from open_notebook.utils.context_builder import ContextBuilder, ContextConfig
+
+config = ContextConfig(
+    sources={"source:123": "full", "source:456": "summary"},
+    max_tokens=2000,
+)
+builder = ContextBuilder(notebook, config)
+context_items = await builder.build()
+
+# context_items is List[ContextItem] sorted by priority
+for item in context_items:
+    print(f"{item.type}:{item.id} ({item.token_count} tokens)")
+```
diff --git a/prompts/CLAUDE.md b/prompts/CLAUDE.md
new file mode 100644
index 0000000..976a1a7
--- /dev/null
+++ b/prompts/CLAUDE.md
@@ -0,0 +1,190 @@
+# Prompts Module
+
+Jinja2 prompt templates for multi-provider AI workflows in Open Notebook.
+
+## Purpose
+
+Centralized prompt repository using the `ai_prompter` library to:
+1. Separate prompt engineering from Python application logic
+2. Provide reusable Jinja2 templates with variable injection
+3. Support multi-stage prompt chains (orchestrated by LangGraph workflows)
+4. 
Ensure consistency across similar workflows (chat, search, content generation) + +## Architecture Overview + +**Template Organization by Workflow**: +- **`ask/`**: Multi-stage search synthesis (entry → query_process → final_answer) +- **`chat/`**: Conversational agent with notebook context (system prompt only) +- **`source_chat/`**: Source-focused chat with insight injection (system prompt only) +- **`podcast/`**: Podcast generation pipeline (outline → transcript) + +**Rendering Pattern** (all workflows): +```python +from ai_prompter import Prompter + +# Load template + render with variables +system_prompt = Prompter(prompt_template="ask/entry", parser=parser).render( + data=state +) + +# Then invoke LLM +model = await provision_langchain_model(system_prompt, ...) +response = await model.ainvoke(system_prompt) +``` + +See detailed workflow integration in `open_notebook/graphs/CLAUDE.md` for how each template fits into chat.py, ask.py, source_chat.py. + +## Prompt Engineering Patterns + +### 1. Multi-Stage Chain (Ask Workflow) + +Three-template chain for intelligent search: + +``` +entry.jinja (user question → search strategy) + ↓ +query_process.jinja (run each search, generate sub-answer) + ↓ (multiple parallel) +final_answer.jinja (synthesize all results into final response) +``` + +**Key pattern**: `entry.jinja` generates JSON-structured reasoning (via PydanticOutputParser). Each `query_process.jinja` invocation receives one search term + retrieved results. `final_answer.jinja` combines all answers with proper source citation. + +### 2. Conditional Variable Injection (Podcast Workflow) + +Templates accept optional variables for context assembly: + +```jinja +{% if notebook %} +# PROJECT INFORMATION +{{ notebook }} +{% endif %} + +{% if context %} +# CONTEXT +{{ context }} +{% endif %} +``` + +Enabled by Jinja2's conditional blocks. Critical for podcast outline (handles list or string context) and source_chat (injects variable notebook/insight data). + +### 3. 
Repeated Emphasis on Citation Format (Ask & Chat) + +All response-generating templates emphasize source citation rules: +- Document ID syntax: `[source:id]`, `[note:id]`, `[insight:id]` +- "Do not make up document IDs" repeated multiple times +- Example citations provided inline + +**Rationale**: LLMs naturally hallucinate citations without explicit guidance; repetition + examples reduce hallucination. + +### 4. Format Instructions Delegation + +Templates accept external `{{ format_instructions }}` variable: + +```jinja +# OUTPUT FORMATTING +{{ format_instructions }} +``` + +Allows caller to inject JSON schema, XML format, or other output constraints without modifying template. Decouples prompt from output format evolution. + +### 5. JSON Output with Extended Thinking Support + +Podcast templates include extended thinking pattern: + +```jinja +IMPORTANT OUTPUT FORMAT: +- If you use extended thinking with tags, put ALL your reasoning inside tags +- Put the final JSON output OUTSIDE and AFTER any tags +``` + +Guides models with extended thinking capability to separate reasoning from output (cleaner parsing downstream). + +## File Catalog + +**`ask/` - Search Synthesis Pipeline**: +- **entry.jinja**: Analyzes user question, generates search strategy with JSON output (term + instructions per search) +- **query_process.jinja**: Accepts one search term + retrieved results, generates sub-answer with citations +- **final_answer.jinja**: Combines all sub-answers into coherent final response, enforces source citation + +**`chat/` - Conversational Agent**: +- **system.jinja**: Single system prompt for general chat. Uses conditional blocks for optional notebook context. Emphasizes citation format. + +**`source_chat/` - Source-Focused Chat**: +- **system.jinja**: Single system prompt for source-specific discussion. Injects source metadata (ID, title, topics) + selected context. Conditional blocks for optional notebook/context data. 
+ +**`podcast/` - Podcast Generation**: +- **outline.jinja**: Takes briefing + content + speaker profiles (list support via Jinja2 for-loop). Generates JSON outline with segments (name, description, size). +- **transcript.jinja**: Takes outline + segment index + optional existing transcript. Generates JSON dialogue array (speaker name + dialogue). Iterates speakers with for-loop. + +## Key Dependencies + +- **ai_prompter**: Prompter class for Jinja2 template rendering with optional OutputParser binding +- **Jinja2** (transitive via ai_prompter): Template syntax (if/for, filters, variable interpolation) +- **No external AI calls**: Templates are pure text; LLM invocation happens in calling code (graphs/) + +## How to Add New Template + +1. **Create subdirectory** in `prompts/` matching workflow name (e.g., `prompts/new_workflow/`) +2. **Define .jinja file(s)** with Jinja2 syntax: + - Use `{{ variable_name }}` for scalar injection + - Use `{% if condition %} ... {% endif %}` for optional sections + - Use `{% for item in list %} ... {% endfor %}` for iteration +3. **Document template variables** as inline comments (follow existing templates) +4. **Reference in calling code** (graphs/): + ```python + from ai_prompter import Prompter + prompt = Prompter(prompt_template="new_workflow/template_name").render(data=context_dict) + ``` +5. **If structured output needed**: Pass `parser=PydanticOutputParser(...)` to Prompter +6. **Document in graphs/CLAUDE.md** how new template fits into workflow chain + +## Important Quirks & Gotchas + +1. **Template path syntax**: Uses forward slashes without `.jinja` extension in Prompter. `"ask/entry"` maps to `/prompts/ask/entry.jinja` +2. **Variable key convention**: All data passed as `data=dict` arg to `.render()`. Template accesses variables directly (e.g., `{{ question }}`). Ensure dict keys match template variable names. +3. 
**OutputParser binding**: When using PydanticOutputParser, Prompter auto-injects `{{ format_instructions }}` into the template. If the template doesn't have this placeholder, the parser is ignored.
+4. **Jinja2 whitespace sensitivity**: Template indentation doesn't affect output, but raw newlines do. Use explicit `\n` or trim markers if output formatting matters.
+5. **Conditional blocks are loose**: Jinja2 if-conditions accept any truthy value. `{% if variable %}` is false for an empty string, empty list/dict, or undefined variable, and true for any non-empty value.
+6. **For-loop list assumption**: Templates using `{% for item in list %}` don't validate the list type. If the caller passes a string instead of a list, iteration happens character-by-character (bug risk).
+7. **No template composition/inheritance**: Templates are flat (no `{% extends %}` or `{% include %}`). Each workflow keeps templates independent to avoid coupling.
+8. **Citation ID format is caller's responsibility**: Templates emphasize citation rules but don't validate. If the model emits a wrong ID format, the template can't catch it upstream.
+9. **Parser extraction happens post-render**: OutputParser.parse() is called AFTER `.render()` returns a string. If the template has syntax errors, render fails before any parsing logic runs.
+10. **Template cache**: Prompter likely caches loaded templates. File edits may require an app restart if a cached instance is in use. 
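The truthiness and for-loop gotchas above are easy to verify with plain Jinja2. A minimal standalone sketch (the inline templates here are illustrative, not project templates):

```python
from jinja2 import Template

# Gotcha 5: {% if %} renders for any truthy value; an empty string/list skips the block.
ctx = Template("{% if context %}# CONTEXT\n{{ context }}{% endif %}")
assert ctx.render(context="") == ""               # empty string -> block skipped
assert ctx.render(context="notes") == "# CONTEXT\nnotes"

# Gotcha 6: a string passed where a list is expected iterates character-by-character.
loop = Template("{% for s in speakers %}[{{ s }}]{% endfor %}")
assert loop.render(speakers=["Alice", "Bob"]) == "[Alice][Bob]"
assert loop.render(speakers="AB") == "[A][B]"     # silent bug, no error raised
```

Validating argument types before calling `.render()` is the simplest guard against the second pitfall.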
+
+## Testing Patterns
+
+**Manual render test**:
+```python
+from ai_prompter import Prompter
+
+prompt = Prompter(prompt_template="ask/entry").render(
+    data={"question": "What is RAG?"}
+)
+print(prompt)  # Inspect Jinja2 output before sending to LLM
+```
+
+**With parser**:
+```python
+from ai_prompter import Prompter
+from langchain_core.output_parsers.pydantic import PydanticOutputParser
+from pydantic import BaseModel
+
+class Strategy(BaseModel):
+    reasoning: str
+    searches: list
+
+parser = PydanticOutputParser(pydantic_object=Strategy)
+prompt = Prompter(prompt_template="ask/entry", parser=parser).render(
+    data={"question": "..."}
+)
+# prompt now contains the parser's format instructions in place of {{ format_instructions }}
+```
+
+**Integration test** (invoke full graph):
+See `open_notebook/graphs/ask.py` for how entry.jinja is invoked inside the ask_graph workflow.
+
+## Reference Documentation
+
+- **Jinja2 syntax guide**: See existing templates for for-loop, if-conditional, and variable interpolation patterns
+- **Graph integration**: `open_notebook/graphs/CLAUDE.md` documents which template is used in which workflow
+- **Sub-directory CLAUDE.md files**: `ask/CLAUDE.md`, `chat/CLAUDE.md`, `podcast/CLAUDE.md` (if created) provide template-specific implementation notes
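To complement the Jinja2 syntax notes, a small sketch of standard Jinja2 whitespace control relevant to the whitespace gotcha above (the `-` trim markers are plain Jinja2, independent of ai_prompter):

```python
from jinja2 import Template

# Block tags normally leave their surrounding newlines in the output.
raw = Template("{% if title %}\n# {{ title }}\n{% endif %}")
# A '-' inside a tag strips adjacent whitespace and newlines.
trimmed = Template("{%- if title -%}\n# {{ title }}\n{%- endif -%}")

print(repr(raw.render(title="Intro")))      # '\n# Intro\n'
print(repr(trimmed.render(title="Intro")))  # '# Intro'
```

This matters when template indentation or blank lines would otherwise leak into LLM input.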