mirror of
https://github.com/lfnovo/open-notebook.git
synced 2026-04-29 03:50:04 +00:00
Database Module
SurrealDB abstraction layer providing repository pattern for CRUD operations and async migration management.
Purpose
Encapsulates all database interactions: connection lifecycle management, async CRUD operations, relationship management, and schema migrations. Provides a clean interface for domain models and API endpoints to interact with SurrealDB without writing queries directly.
Architecture Overview
Two-tier system:
- Repository Layer (repository.py): Raw async CRUD operations on SurrealDB via AsyncSurreal client
- Migration Layer (async_migrate.py): Schema versioning and migration execution
Both leverage connection context manager for lifecycle management and automatic cleanup.
Component Catalog
repository.py
Connection Management
- get_database_url(): Resolves SURREAL_URL or constructs it from SURREAL_ADDRESS/SURREAL_PORT (backward compatible)
- get_database_password(): Falls back from SURREAL_PASSWORD to the legacy SURREAL_PASS env var
- db_connection(): Async context manager handling sign-in, namespace/database selection, and cleanup
  - Opens AsyncSurreal, authenticates, selects namespace/database, yields connection, closes on exit
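The environment-variable fallbacks described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function names `resolve_database_url` and `resolve_database_password`, the default host and port, and the `ws://.../rpc` URL shape are all assumptions made for the example.

```python
import os
from typing import Optional


def resolve_database_url() -> str:
    """Prefer SURREAL_URL; otherwise build a URL from the legacy address/port pair."""
    url = os.environ.get("SURREAL_URL")
    if url:
        return url
    # Assumption: the defaults and the URL shape below are illustrative only.
    address = os.environ.get("SURREAL_ADDRESS", "localhost")
    port = os.environ.get("SURREAL_PORT", "8000")
    return f"ws://{address}:{port}/rpc"


def resolve_database_password() -> Optional[str]:
    """Prefer SURREAL_PASSWORD; fall back to the legacy SURREAL_PASS variable."""
    return os.environ.get("SURREAL_PASSWORD") or os.environ.get("SURREAL_PASS")
```

Checking the new variable first keeps older deployments working while letting newer ones override with a single setting.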
Query Operations
- repo_query(query_str, vars): Execute raw SurrealQL with parameter substitution; returns list of dicts
- repo_create(table, data): Insert record; auto-adds created/updated timestamps; removes any existing id field
- repo_insert(table, data_list, ignore_duplicates): Bulk insert multiple records; optionally ignores "already contains" errors
- repo_upsert(table, id, data, add_timestamp): MERGE operation for create-or-update; optionally adds updated timestamp
- repo_update(table, id, data): Update existing record by table+id or full record_id; auto-adds updated, parses ISO dates
- repo_delete(record_id): Delete record by RecordID
- repo_relate(source, relationship, target, data): Create graph relationship; optional relationship data
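The payload handling that repo_create() is described as performing (drop any incoming id, stamp created/updated) might look like the sketch below. The helper name `prepare_create_payload` is hypothetical, and the use of UTC timestamps is an assumption; the real function may differ.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def prepare_create_payload(data: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of data without 'id' and with fresh created/updated timestamps."""
    # Copy so the caller's dict is not mutated; drop any pre-existing id.
    payload = {key: value for key, value in data.items() if key != "id"}
    # Assumption: UTC is used here for illustration only.
    now = datetime.now(timezone.utc)
    payload["created"] = now  # overwrites a caller-supplied value, as the doc notes
    payload["updated"] = now
    return payload
```

Because created is always overwritten, a caller cannot preserve an original creation time through this path.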
Utilities
- parse_record_ids(obj): Recursively converts SurrealDB RecordID objects to strings (deep tree traversal)
- ensure_record_id(value): Coerces string or RecordID to RecordID type
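A recursive conversion of this kind can be sketched without the driver installed. `FakeRecordID` is a stand-in invented for this example (the real type comes from the surrealdb package), and `stringify_record_ids` is a hypothetical name; only the traversal pattern is the point.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class FakeRecordID:
    """Stand-in for the driver's RecordID so the sketch runs on its own."""
    table: str
    key: str

    def __str__(self) -> str:
        return f"{self.table}:{self.key}"


def stringify_record_ids(obj: Any) -> Any:
    """Walk dicts and lists, replacing every FakeRecordID with its string form."""
    if isinstance(obj, dict):
        return {key: stringify_record_ids(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [stringify_record_ids(item) for item in obj]
    if isinstance(obj, FakeRecordID):
        return str(obj)
    return obj  # scalars and unknown types pass through unchanged
```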
async_migrate.py
Migration Classes
- AsyncMigration: Single migration wrapper
  - from_file(path): Load .surrealql file; strips comments and whitespace
  - run(bump): Execute SQL; call bump_version() on success (bump=True) or lower_version() (bump=False)
- AsyncMigrationRunner: Sequences multiple migrations
  - run_all(): Execute pending migrations from current_version to end
  - run_one_up(): Run next migration
  - run_one_down(): Roll back latest migration
- AsyncMigrationManager: Main orchestrator
  - Loads 9 up migrations + 9 down migrations (hard-coded in __init__)
  - get_current_version(): Query max version from _sbl_migrations table
  - needs_migration(): Boolean check (current < total migrations available)
  - run_migration_up(): Run all pending migrations with logging
Version Tracking
- get_latest_version(): Query max version; returns 0 if _sbl_migrations table missing
- get_all_versions(): Fetch all migration records; returns empty list on error
- bump_version(): INSERT new entry into _sbl_migrations with version + applied_at timestamp
- lower_version(): DELETE latest migration record (rollback)
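The version-tracking behavior listed above can be modeled in memory to show how the pieces interact. `InMemoryVersionStore` is a hypothetical stand-in for the _sbl_migrations table; the real functions issue SurrealQL queries, and the missing-table error type used here is an assumption.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List


class InMemoryVersionStore:
    """Hypothetical in-memory model of the _sbl_migrations table."""

    def __init__(self, table_exists: bool = False) -> None:
        self.table_exists = table_exists
        self.rows: List[Dict[str, Any]] = []

    def _select_all(self) -> List[Dict[str, Any]]:
        if not self.table_exists:
            raise LookupError("table _sbl_migrations does not exist")
        return list(self.rows)

    def get_all_versions(self) -> List[Dict[str, Any]]:
        # Graceful degradation: a missing table reads as "no migrations applied".
        try:
            return self._select_all()
        except LookupError:
            return []

    def get_latest_version(self) -> int:
        return max((row["version"] for row in self.get_all_versions()), default=0)

    def bump_version(self) -> None:
        next_version = self.get_latest_version() + 1
        self.table_exists = True  # the first insert creates the table in this model
        self.rows.append({"version": next_version, "applied_at": datetime.now(timezone.utc)})

    def lower_version(self) -> None:
        latest = self.get_latest_version()
        self.rows = [row for row in self.rows if row["version"] != latest]


def needs_migration(current: int, total: int) -> bool:
    """Mirror of the documented check: pending work exists while current < total."""
    return current < total
```

Treating a missing table as version 0 is what lets a fresh database bootstrap without a separate setup step.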
migrate.py
Backward Compatibility
- MigrationManager: Sync wrapper around AsyncMigrationManager
  - get_current_version(): Wraps async call with asyncio.run()
  - needs_migration property: Checks if migration pending
  - run_migration_up(): Execute migrations synchronously
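The sync-over-async wrapping described above follows a common pattern, sketched here with a stub in place of the real manager. `AsyncManagerStub` and `SyncManagerSketch` are hypothetical names, and the stub's behavior (jumping straight to the latest version) is an assumption for illustration.

```python
import asyncio


class AsyncManagerStub:
    """Hypothetical stand-in for AsyncMigrationManager."""

    def __init__(self, current: int, total: int) -> None:
        self.current = current
        self.total = total

    async def get_current_version(self) -> int:
        return self.current

    async def needs_migration(self) -> bool:
        return self.current < self.total

    async def run_migration_up(self) -> None:
        self.current = self.total


class SyncManagerSketch:
    """Blocking facade: each call drives one coroutine to completion."""

    def __init__(self, inner: AsyncManagerStub) -> None:
        self._inner = inner

    def get_current_version(self) -> int:
        return asyncio.run(self._inner.get_current_version())

    @property
    def needs_migration(self) -> bool:
        return asyncio.run(self._inner.needs_migration())

    def run_migration_up(self) -> None:
        asyncio.run(self._inner.run_migration_up())
```

Note that asyncio.run() raises RuntimeError when called from a thread that already has a running event loop, so a wrapper built this way is only usable from synchronous code.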
Common Patterns
- Async-first design: All operations async via AsyncSurreal; sync wrapper provided for legacy code
- Connection per operation: Each repo_* function opens/closes connection (no pooling); designed for serverless/stateless API
- Auto-timestamping: repo_create() and repo_update() auto-set created/updated fields
- Error resilience: RuntimeError for transaction conflicts (retriable); catches and re-raises other exceptions
- RecordID polymorphism: Functions accept string or RecordID; coerced to consistent type
- Graceful degradation: Migration queries catch exceptions and treat table-not-found as version 0
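The connection-per-operation pattern can be illustrated with an async context manager around a stub client. `StubConnection`, `db_connection_sketch`, and the "ns"/"db" names are invented for this example; the real db_connection() opens an AsyncSurreal client itself rather than receiving one, which is changed here only so the lifecycle can be inspected.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator, List


class StubConnection:
    """Records lifecycle calls instead of talking to a database."""

    def __init__(self) -> None:
        self.events: List[str] = []

    async def signin(self) -> None:
        self.events.append("signin")

    async def use(self, namespace: str, database: str) -> None:
        self.events.append(f"use {namespace}/{database}")

    async def query(self, statement: str) -> list:
        self.events.append(f"query {statement}")
        return []

    async def close(self) -> None:
        self.events.append("close")


@asynccontextmanager
async def db_connection_sketch(conn: StubConnection) -> AsyncIterator[StubConnection]:
    await conn.signin()
    await conn.use("ns", "db")
    try:
        yield conn
    finally:
        # Cleanup runs even when the body raises.
        await conn.close()


async def demo() -> List[str]:
    conn = StubConnection()
    try:
        async with db_connection_sketch(conn) as db:
            await db.query("SELECT 1")
            raise RuntimeError("simulated failure inside the operation")
    except RuntimeError:
        pass
    return conn.events
```

Running `asyncio.run(demo())` shows that close is recorded even though the body failed, which is the guarantee the context manager exists to provide.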
Key Dependencies
- surrealdb: AsyncSurreal client, RecordID type
- loguru: Logging with context (debug/error/success levels)
- Python stdlib: os (env vars), datetime (timestamps), contextlib (async context manager)
Important Quirks & Gotchas
- No connection pooling: Each repo_* operation creates new connection; adequate for HTTP request-scoped operations but inefficient for bulk workloads
- Hard-coded migration files: AsyncMigrationManager lists migrations 1-9 explicitly; adding new migration requires code change (not auto-discovery)
- Record ID format inconsistency: repo_update() accepts both table:id format and full RecordID; path handling can be subtle
- ISO date parsing: repo_update() parses created field from string to datetime if present; assumes ISO format
- Timestamp overwrite risk: repo_create() always sets new timestamps; can't preserve original created time on reimport
- Transaction conflict handling: RuntimeError from transaction conflicts logged without stack trace (prevents log spam)
- Graceful null returns: get_all_versions() returns [] on table missing; allows migration system to bootstrap cleanly
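Since transaction conflicts surface as RuntimeError and are described as retriable, a caller could wrap an operation in a small retry helper. The sketch below is one possible caller-side approach, not something the module is stated to provide: `retry_on_conflict`, its parameters, and the exponential backoff policy are all assumptions.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_on_conflict(
    operation: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 0.05,
) -> T:
    """Retry an async operation on RuntimeError, doubling the delay each time."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await operation()
        except RuntimeError:
            if attempt == attempts:
                raise  # out of retries: surface the last conflict to the caller
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

This treats every RuntimeError as a conflict, so real code would likely need to inspect the error before retrying.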
How to Extend
- Add new CRUD operation: Follow repo_* pattern (open connection, execute query, handle errors, close)
- Add migration: Create migration file in /migrations/N.surrealql and /migrations/N_down.surrealql; update AsyncMigrationManager to load new files
- Change timestamp behavior: Modify repo_create()/repo_update() to not auto-set updated field if caller-provided
- Implement connection pooling: Replace db_connection context manager with pool.acquire() pattern (for high-throughput scenarios)
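The module loads migration files from an explicit list, so adding one currently means a code change. For anyone considering auto-discovery instead, a hypothetical helper could pair N.surrealql with N_down.surrealql files as sketched below. `discover_migrations` does not exist in the codebase, and the rule that every up file must have a matching down file is an assumption.

```python
import re
from pathlib import Path
from typing import List, Tuple

_UP_FILE = re.compile(r"^(\d+)\.surrealql$")


def discover_migrations(directory: Path) -> List[Tuple[int, Path, Path]]:
    """Return (version, up_path, down_path) tuples sorted by numeric version."""
    found: List[Tuple[int, Path, Path]] = []
    for path in directory.iterdir():
        match = _UP_FILE.match(path.name)
        if not match:
            continue  # skips N_down.surrealql and unrelated files
        version = int(match.group(1))
        down = directory / f"{version}_down.surrealql"
        if not down.exists():
            raise FileNotFoundError(f"missing rollback file for migration {version}")
        found.append((version, path, down))
    # Sort numerically so that 10 follows 9 rather than 1.
    found.sort(key=lambda item: item[0])
    return found
```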
Integration Points
- API startup (api/main.py): FastAPI lifespan handler calls AsyncMigrationManager.run_migration_up() on server start
- Domain models (domain/*.py): All models call repo_* functions for persistence
- Commands (commands/*.py): Background jobs use repo_* for state updates
- Streamlit UI (pages/*.py): Deprecated migration check; relies on API to run migrations
Usage Example
import asyncio

from open_notebook.database.repository import repo_create, repo_query, repo_update

async def main():
    # Create a record; created/updated timestamps are added automatically
    record = await repo_create("notebooks", {"title": "Research"})
    # Query with a bound $title parameter
    results = await repo_query("SELECT * FROM notebooks WHERE title = $title", {"title": "Research"})
    # Update by table + id
    await repo_update("notebooks", record["id"], {"title": "Updated Research"})

asyncio.run(main())