open-notebook/open_notebook/ai/CLAUDE.md
LUIS NOVO 71b8d13b24 docs: generate comprehensive CLAUDE.md reference documentation across codebase
Create a hierarchical CLAUDE.md documentation system for the entire Open Notebook
codebase with focus on concise, pattern-driven reference cards rather than
comprehensive tutorials.

## Changes

### Core Documentation System
- Updated `.claude/commands/build-claude-md.md` to distinguish between leaf and
  parent modules, with special handling for prompt/template modules
- Established clear patterns:
  * Leaf modules (40-70 lines): Components, hooks, API clients
  * Parent modules (50-150 lines): Architecture, cross-layer patterns, data flows
  * Template modules: Pattern focus, not catalog listings

### Generated Documentation
Created 15 CLAUDE.md reference files across the project:

**Frontend (React/Next.js)**
- frontend/src/CLAUDE.md: Architecture overview, data flow, three-tier design
- frontend/src/lib/hooks/CLAUDE.md: React Query patterns, state management
- frontend/src/lib/api/CLAUDE.md: Axios client, FormData handling, interceptors
- frontend/src/lib/stores/CLAUDE.md: Zustand state persistence, auth patterns
- frontend/src/components/ui/CLAUDE.md: Radix UI primitives, CVA styling

**Backend (Python/FastAPI)**
- open_notebook/CLAUDE.md: System architecture, layer interactions
- open_notebook/ai/CLAUDE.md: Model provisioning, Esperanto integration
- open_notebook/domain/CLAUDE.md: Data models, ObjectModel/RecordModel patterns
- open_notebook/database/CLAUDE.md: Repository pattern, async migrations
- open_notebook/graphs/CLAUDE.md: LangGraph workflows, async orchestration
- open_notebook/utils/CLAUDE.md: Cross-cutting utilities, context building
- open_notebook/podcasts/CLAUDE.md: Episode/speaker profiles, job tracking

**API & Other**
- api/CLAUDE.md: REST layer, service architecture
- commands/CLAUDE.md: Async command handlers, job queue patterns
- prompts/CLAUDE.md: Jinja2 templates, prompt engineering patterns (refactored)

**Project Root**
- CLAUDE.md: Project overview, three-tier architecture, tech stack, getting started

### Key Features
- Zero duplication: Parent modules reference child CLAUDE.md files, don't repeat them
- Pattern-focused: Emphasizes how components work together, not component catalogs
- Scannable: Short bullets, code examples only when necessary (1-2 per file)
- Practical: "How to extend" guides, quirks/gotchas for each module
- Navigation: Root CLAUDE.md acts as hub pointing to specialized documentation

### Cleanup
- Removed unused `batch_fix_services.py`
- Removed deprecated `open_notebook/plugins/podcasts.py`
- Updated .gitignore for documentation consistency

## Impact
New contributors can now:
1. Read root CLAUDE.md for system architecture (5 min)
2. Jump to specific layer documentation (frontend, api, open_notebook)
3. Dive into module-specific patterns in child CLAUDE.md files (1 min per module)
All documentation is lean, reference-focused, and avoids duplication.
2026-01-03 16:27:52 -03:00


AI Module

Model configuration, provisioning, and management for multi-provider AI integration via Esperanto.

Purpose

Centralizes the AI model lifecycle: database models for model metadata (provider, type), default model configuration, and a factory that instantiates LLM, embedding, and speech models at runtime with fallback logic.

Architecture Overview

Two-tier system:

  1. Database models (Model, DefaultModels): Metadata storage and default configuration
  2. ModelManager: Factory for provisioning models with intelligent fallback (large context detection, config override)

All models use the Esperanto library as the provider abstraction (OpenAI, Anthropic, Google, Groq, Ollama, Mistral, DeepSeek, xAI, OpenRouter).

Component Catalog

models.py

Model (ObjectModel)

  • Database record: name, provider, type (language/embedding/speech_to_text/text_to_speech)
  • get_models_by_type(): Async query to fetch all models of a specific type
  • Stores provider-model pairs for AI factory instantiation

DefaultModels (RecordModel)

  • Singleton configuration record (record_id: open_notebook:default_models)
  • Fields: default_chat_model, default_transformation_model, large_context_model, default_text_to_speech_model, default_speech_to_text_model, default_embedding_model, default_tools_model
  • get_instance(): Overrides the parent's singleton caching and fetches from the database on every call, returning a new instance each time so configuration changes take effect immediately
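
The fresh-fetch behaviour described above can be sketched with stand-in classes. This is a minimal illustration under stated assumptions, not the project's code: CachedRecord, FreshDefaults, and FAKE_DB are hypothetical names, the in-memory dict stands in for the database, and the sketch is synchronous for brevity (the usage example later in this file awaits the real methods).

```python
from typing import ClassVar, Dict

# Hypothetical in-memory stand-in for the database; not the real repository layer.
FAKE_DB: Dict[str, dict] = {
    "open_notebook:default_models": {"default_chat_model": "model:a"},
}


class CachedRecord:
    """Stand-in for a parent class that caches one instance per record_id."""

    record_id: ClassVar[str] = ""
    _instances: ClassVar[Dict[str, "CachedRecord"]] = {}

    def __init__(self, data: dict) -> None:
        # Copy so later edits to the store do not mutate this snapshot.
        self.data = dict(data)

    @classmethod
    def get_instance(cls) -> "CachedRecord":
        # Cached path: build once, then keep returning the same object.
        if cls.record_id not in cls._instances:
            cls._instances[cls.record_id] = cls(FAKE_DB[cls.record_id])
        return cls._instances[cls.record_id]


class FreshDefaults(CachedRecord):
    """Stand-in for DefaultModels: bypasses the cache on every call."""

    record_id = "open_notebook:default_models"

    @classmethod
    def get_instance(cls) -> "FreshDefaults":
        # Re-read the backing store each time so config edits are visible at once.
        return cls(FAKE_DB[cls.record_id])
```

The trade-off is one extra read per call in exchange for never serving a stale default after the configuration record changes.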

ModelManager

  • Stateless factory for instantiating AI models
  • get_model(model_id): Retrieves Model by ID, creates via AIFactory.create_* based on type
  • get_defaults(): Fetches DefaultModels configuration
  • get_default_model(model_type): Maps a type name to its configured default (e.g., "chat" → default_chat_model; "transformation" → default_transformation_model, falling back to default_chat_model when unset)
  • get_speech_to_text(), get_text_to_speech(), get_embedding_model(): Type-specific convenience methods with assertions
  • Global instance: model_manager singleton exported for use throughout app
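
As a rough illustration of the default lookup, the sketch below resolves a type name to a model ID using only the two mappings this file documents. Defaults and resolve_default_id are hypothetical names, and returning None for an unmapped type is an assumption; the real method may behave differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Defaults:
    """Stand-in carrying a subset of the DefaultModels fields listed above."""

    default_chat_model: Optional[str] = None
    default_transformation_model: Optional[str] = None


def resolve_default_id(defaults: Defaults, model_type: str) -> Optional[str]:
    """Return the configured model ID for a type, or None if nothing applies."""
    if model_type == "chat":
        return defaults.default_chat_model
    if model_type == "transformation":
        # Fall back to the chat default when no transformation model is set.
        return defaults.default_transformation_model or defaults.default_chat_model
    # Assumption: unmapped types resolve to None in this sketch.
    return None
```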

provision.py

provision_langchain_model()

  • Factory for LangGraph nodes needing LLM provisioning
  • Smart fallback logic:
    • If tokens > 105,000: Use large_context_model
    • Elif model_id specified: Use specific model
    • Else: Use default model for type (e.g., "chat", "transformation")
  • Returns LangChain-compatible model via .to_langchain()
  • Logs model selection decision
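
The selection order above can be sketched as a pure function. This is an illustration only: choose_model_source and its return labels are hypothetical, and the real function goes on to load the model and call .to_langchain(). Note that, per the listed order, the large-context check runs before an explicit model_id is considered.

```python
from typing import Optional, Tuple

# Threshold taken from the text above; the sketch treats it as a constant.
LARGE_CONTEXT_THRESHOLD = 105_000


def choose_model_source(
    tokens: int, model_id: Optional[str], default_type: str
) -> Tuple[str, str]:
    """Return (reason, target) describing which model would be selected."""
    if tokens > LARGE_CONTEXT_THRESHOLD:
        # Large inputs win even when a specific model was requested.
        return ("large_context", "large_context_model")
    if model_id:
        return ("explicit", model_id)
    return ("default", default_type)
```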

Common Patterns

  • Type dispatch: Model.type field drives factory logic (4 model types)
  • Provider abstraction: Esperanto handles provider differences; ModelManager unaware of provider specifics
  • Fresh defaults: DefaultModels.get_instance() always fetches from database (not cached) for live config updates
  • Config override: provision_langchain_model() accepts kwargs passed to AIFactory.create_* methods
  • Token-based selection: provision_langchain_model() detects large contexts and upgrades model automatically
  • Type assertions: get_speech_to_text(), get_embedding_model() assert returned type (safety check)
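
The type-dispatch and type-assertion patterns can be sketched together. The Fake* classes and the FACTORIES table below are hypothetical stand-ins for the objects that the AIFactory.create_* methods return; the actual Esperanto signatures are not shown in this file and are not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class FakeModel:
    """Stand-in for a provisioned model object."""

    name: str
    provider: str
    config: Dict[str, Any] = field(default_factory=dict)


class FakeLanguage(FakeModel):
    pass


class FakeEmbedding(FakeModel):
    pass


class FakeSpeechToText(FakeModel):
    pass


class FakeTextToSpeech(FakeModel):
    pass


# One entry per model type named in this file.
FACTORIES: Dict[str, Callable[..., FakeModel]] = {
    "language": FakeLanguage,
    "embedding": FakeEmbedding,
    "speech_to_text": FakeSpeechToText,
    "text_to_speech": FakeTextToSpeech,
}


def create_for_type(model_type: str, name: str, provider: str, **config: Any) -> FakeModel:
    """Dispatch on the stored type string; extra kwargs pass through as config."""
    factory = FACTORIES.get(model_type)
    if factory is None:
        raise ValueError(f"Unsupported model type: {model_type!r}")
    return factory(name=name, provider=provider, config=config)


def get_embedding(model_type: str, name: str, provider: str, **config: Any) -> FakeEmbedding:
    """Type-specific getter: fail fast if the configured model has the wrong type."""
    model = create_for_type(model_type, name, provider, **config)
    assert isinstance(model, FakeEmbedding), (
        f"Expected an embedding model, got {type(model).__name__}"
    )
    return model
```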

Key Dependencies

  • esperanto: AIFactory.create_language(), create_embedding(), create_speech_to_text(), create_text_to_speech()
  • open_notebook.database.repository: repo_query, ensure_record_id
  • open_notebook.domain.base: ObjectModel, RecordModel base classes
  • open_notebook.utils: token_count() for context size detection
  • loguru: Logging for model selection decisions

Important Quirks & Gotchas

  • Token counting rough estimate: provision_langchain_model() uses token_count() which estimates via cl100k_base encoding (may differ 5-10% from actual model)
  • Large context threshold hard-coded: 105,000 token threshold for large_context_model upgrade (not configurable)
  • DefaultModels.get_instance() fresh fetch: Intentionally bypasses parent singleton cache to pick up live config changes; creates new instance each call
  • Type-specific getters use assertions: get_speech_to_text() asserts isinstance (catches misconfiguration early)
  • Missing models are not handled gracefully: ModelManager.get_model() raises ValueError if the model is not found, and nothing upstream catches it
  • Esperanto caching: Actual model instances cached by Esperanto (not by ModelManager); ModelManager stateless
  • Fallback chain specificity: "transformation" type falls back to default_chat_model if not explicitly set (convention-based)
  • kwargs passed through: provision_langchain_model() passes kwargs to AIFactory but doesn't validate what's accepted

How to Extend

  1. Add new model type: Add type string to Model.type enum, add create_* method in AIFactory, handle in ModelManager.get_model()
  2. Add new default configuration: Extend DefaultModels with new field (e.g., default_vision_model), add getter in ModelManager
  3. Change fallback logic: Modify provision_langchain_model() token threshold or fallback chain
  4. Add model filtering: Extend Model.get_models_by_type() with additional filters (e.g., by provider)
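  5. Implement model caching: Wrap ModelManager methods with a cache such as functools.lru_cache (kwargs must be hashable to form a cache key, and lru_cache applied directly to an async method caches the coroutine object rather than its result)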
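
For the caching step, one possible approach is a small async-aware cache keyed on the model ID plus serialized kwargs. This is a sketch under assumptions, not part of the module: get_model_cached, cache_key, and the loader argument are hypothetical, and JSON serialization is just one way to turn mutable kwargs into a hashable key. The sketch does not guard against two concurrent callers loading the same key at once.

```python
import json
from typing import Any, Awaitable, Callable, Dict, Tuple

_cache: Dict[Tuple[str, str], Any] = {}


def cache_key(model_id: str, kwargs: Dict[str, Any]) -> Tuple[str, str]:
    # Serialize kwargs deterministically so dicts and lists become hashable.
    return (model_id, json.dumps(kwargs, sort_keys=True, default=str))


async def get_model_cached(
    loader: Callable[..., Awaitable[Any]], model_id: str, **kwargs: Any
) -> Any:
    """Await the loader once per (model_id, kwargs) pair and reuse the result."""
    key = cache_key(model_id, kwargs)
    if key not in _cache:
        # Store the awaited result, not the coroutine object.
        _cache[key] = await loader(model_id, **kwargs)
    return _cache[key]
```

Because DefaultModels is deliberately re-read on every call, a cache like this would need invalidation whenever defaults change; the sketch leaves that out.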

Usage Example

from open_notebook.ai.models import model_manager
from open_notebook.ai.provision import provision_langchain_model


async def main(long_text: str) -> None:
    # Get default chat model
    chat_model = await model_manager.get_default_model("chat")

    # Get specific model by ID
    model_by_id = await model_manager.get_model("model:openai_embedding")

    # Get the default embedding model with a config override passed as kwargs
    embedding_model = await model_manager.get_embedding_model(temperature=0.1)

    # Provision model for LangGraph (auto-detects large context)
    langchain_model = await provision_langchain_model(
        content=long_text,
        model_id=None,  # Use default
        default_type="chat",
        temperature=0.7,
    )