mirror of https://github.com/lfnovo/open-notebook.git synced 2026-04-29 12:00:00 +00:00

LUIS NOVO 71b8d13b24 docs: generate comprehensive CLAUDE.md reference documentation across codebase

Create a hierarchical CLAUDE.md documentation system for the entire Open Notebook
codebase with focus on concise, pattern-driven reference cards rather than
comprehensive tutorials.

## Changes

### Core Documentation System
- Updated `.claude/commands/build-claude-md.md` to distinguish between leaf and
  parent modules, with special handling for prompt/template modules
- Established clear patterns:
  * Leaf modules (40-70 lines): Components, hooks, API clients
  * Parent modules (50-150 lines): Architecture, cross-layer patterns, data flows
  * Template modules: Pattern focus, not catalog listings

### Generated Documentation
Created 15 CLAUDE.md reference files across the project:

**Frontend (React/Next.js)**
- frontend/src/CLAUDE.md: Architecture overview, data flow, three-tier design
- frontend/src/lib/hooks/CLAUDE.md: React Query patterns, state management
- frontend/src/lib/api/CLAUDE.md: Axios client, FormData handling, interceptors
- frontend/src/lib/stores/CLAUDE.md: Zustand state persistence, auth patterns
- frontend/src/components/ui/CLAUDE.md: Radix UI primitives, CVA styling

**Backend (Python/FastAPI)**
- open_notebook/CLAUDE.md: System architecture, layer interactions
- open_notebook/ai/CLAUDE.md: Model provisioning, Esperanto integration
- open_notebook/domain/CLAUDE.md: Data models, ObjectModel/RecordModel patterns
- open_notebook/database/CLAUDE.md: Repository pattern, async migrations
- open_notebook/graphs/CLAUDE.md: LangGraph workflows, async orchestration
- open_notebook/utils/CLAUDE.md: Cross-cutting utilities, context building
- open_notebook/podcasts/CLAUDE.md: Episode/speaker profiles, job tracking

**API & Other**
- api/CLAUDE.md: REST layer, service architecture
- commands/CLAUDE.md: Async command handlers, job queue patterns
- prompts/CLAUDE.md: Jinja2 templates, prompt engineering patterns (refactored)

**Project Root**
- CLAUDE.md: Project overview, three-tier architecture, tech stack, getting started

### Key Features
- Zero duplication: Parent modules reference child CLAUDE.md files, don't repeat them
- Pattern-focused: Emphasizes how components work together, not component catalogs
- Scannable: Short bullets, code examples only when necessary (1-2 per file)
- Practical: "How to extend" guides, quirks/gotchas for each module
- Navigation: Root CLAUDE.md acts as hub pointing to specialized documentation

### Cleanup
- Removed unused `batch_fix_services.py`
- Removed deprecated `open_notebook/plugins/podcasts.py`
- Updated .gitignore for documentation consistency

## Impact
New contributors can now:
1. Read root CLAUDE.md for system architecture (5 min)
2. Jump to specific layer documentation (frontend, api, open_notebook)
3. Dive into module-specific patterns in child CLAUDE.md files (1 min per module)

All documentation is lean, reference-focused, and avoids duplication.

2026-01-03 16:27:52 -03:00

3.5 KiB


Podcasts Module

Domain models for podcast generation featuring speaker and episode profile management with job tracking.

Purpose

Encapsulates podcast metadata and configuration: speaker profiles (voice/personality config), episode profiles (generation settings), and podcast episodes (with job status tracking via surreal-commands).

Architecture Overview

Two-tier profile system, plus a generated episode record:

SpeakerProfile: TTS provider/model + 1-4 speaker configurations (name, voice_id, backstory, personality)
EpisodeProfile: Generation settings (outline/transcript models, segment count, briefing template)
PodcastEpisode: Generated episode record linking profiles, content, and async job

All inherit from ObjectModel (SurrealDB base class with table_name and save/load).

Component Catalog

SpeakerProfile

Validates 1-4 speakers with required fields: name, voice_id, backstory, personality
Stores TTS provider/model (e.g., "elevenlabs", "openai")
get_by_name() async query by profile name
Raises ValueError on invalid speaker counts or missing fields
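
The speaker checks above can be sketched as a plain function. This is a dependency-free sketch, not the module's code: the source implements the rule as a Pydantic validator, and the constant name `REQUIRED_FIELDS`, the error messages, and the choice to test only for key presence (not for empty values) are assumptions.

```python
from typing import Any

# Field names come from this document; the constant name is hypothetical.
REQUIRED_FIELDS = ("name", "voice_id", "backstory", "personality")


def validate_speakers(speakers: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Enforce 1-4 speakers, each carrying every required field."""
    if not 1 <= len(speakers) <= 4:
        raise ValueError(f"Expected 1-4 speakers, got {len(speakers)}")
    for index, speaker in enumerate(speakers):
        # Assumption: a field counts as present if the key exists.
        missing = [name for name in REQUIRED_FIELDS if name not in speaker]
        if missing:
            raise ValueError(
                f"Speaker {index} is missing fields: {', '.join(missing)}"
            )
    return speakers
```

In a Pydantic model the same body would sit inside a field validator, so the ValueError surfaces when the model is constructed.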

EpisodeProfile

Configures outline/transcript generation: provider, model, num_segments (3-20 validated)
References speaker_config by name
Stores default_briefing template for episode generation
get_by_name() async query
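
The segment-count rule can be illustrated with a small stand-in class. This is a sketch under stated assumptions: `EpisodeProfileSketch` is a hypothetical dataclass rather than the real `EpisodeProfile`, the default of 5 segments is invented for illustration, and only the 3-20 range is taken from this document.

```python
from dataclasses import dataclass

MIN_SEGMENTS, MAX_SEGMENTS = 3, 20  # range stated in this document


@dataclass
class EpisodeProfileSketch:
    """Hypothetical stand-in for EpisodeProfile with only the fields used here."""

    name: str
    speaker_config: str        # name of a speaker profile, looked up separately
    num_segments: int = 5      # default value is an assumption
    default_briefing: str = ""

    def __post_init__(self) -> None:
        # Reject segment counts outside the documented range at construction.
        if not MIN_SEGMENTS <= self.num_segments <= MAX_SEGMENTS:
            raise ValueError(
                f"num_segments must be between {MIN_SEGMENTS} and "
                f"{MAX_SEGMENTS}, got {self.num_segments}"
            )
```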

PodcastEpisode

Stores episode_profile and speaker_profile as dicts (snapshots of config at generation time)
Optional audio_file path, transcript/outline dicts
Job tracking: command field links to surreal-commands RecordID
get_job_status() fetches async job status via surreal-commands library
_prepare_save_data() ensures command field is always RecordID format for database
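
The job-status behavior can be sketched as follows. This is an illustration only: the `fetch_status` parameter is a hypothetical injected callable standing in for the surreal-commands lookup, and returning `None` when no command is attached is an assumption. The `"unknown"` fallback is the behavior this document describes.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def get_job_status(
    command_id: Optional[str],
    fetch_status: Callable[[str], Awaitable[str]],  # hypothetical stand-in
) -> Optional[str]:
    """Return the job status, or "unknown" if the lookup fails for any reason."""
    if command_id is None:
        return None  # assumption: no job linked means no status
    try:
        return await fetch_status(command_id)
    except Exception:
        # Deliberately broad: status lookups never raise to the caller.
        return "unknown"
```

Swallowing every exception keeps callers simple, but it also hides outages of the job backend, so `"unknown"` should not be read as a job state.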

Common Patterns

Profile snapshots: episode_profile and speaker_profile stored as dicts to freeze config at generation time
Field validation: Pydantic validators enforce constraints (segment count, speaker count, required fields)
Async database access: get_by_name() queries via repo_query
Job tracking: command field delegates to surreal-commands; get_job_status() returns "unknown" on failure
Record ID handling: ensure_record_id() converts string to RecordID before save
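
The profile-snapshot pattern can be demonstrated without a database. In this minimal sketch, `ProfileSketch`, `EpisodeSketch`, and `create_episode` are hypothetical names; only the idea of storing a dict copy of the profile on the episode comes from this document.

```python
import copy
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ProfileSketch:
    """Hypothetical mutable profile."""

    name: str
    settings: dict[str, Any] = field(default_factory=dict)


@dataclass
class EpisodeSketch:
    """Hypothetical episode holding a frozen copy of the profile as a dict."""

    name: str
    episode_profile: dict[str, Any]


def create_episode(name: str, profile: ProfileSketch) -> EpisodeSketch:
    # Deep copy so later edits to the profile cannot reach the episode.
    snapshot = copy.deepcopy({"name": profile.name, **profile.settings})
    return EpisodeSketch(name=name, episode_profile=snapshot)


profile = ProfileSketch("tech_talk", {"num_segments": 5})
episode = create_episode("ep-1", profile)
profile.settings["num_segments"] = 8  # later edit to the live profile
```

After the edit, `episode.episode_profile["num_segments"]` still holds the value captured at creation time.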

Key Dependencies

pydantic: Field validators, ObjectModel inheritance
surrealdb: RecordID type for job references
open_notebook.database.repository: repo_query, ensure_record_id
open_notebook.domain.base: ObjectModel base class
surreal_commands (optional): get_command_status() for job status

Important Quirks & Gotchas

Snapshot approach: Episode/speaker profiles stored as dicts (not references), so profile updates don't retroactively affect past episodes
Job status resilience: get_job_status() catches all exceptions and returns "unknown" (no error propagation)
validate_speakers runs only at instantiation: Validators fire when a model is constructed; bulk inserts that bypass model construction may skip validation
RecordID coercion: ensure_record_id() handles both string and RecordID inputs; command field parsed during deserialization
No cascade delete: Removing a profile doesn't cascade to episodes using it
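
The RecordID coercion quirk can be sketched with a local stand-in type. Everything here is hypothetical: `RecordRef` and `ensure_record_ref` are illustrative names that mimic the described behavior of accepting either a `table:id` string or an already-built record ID, and they are not the surrealdb `RecordID` class or the repository's `ensure_record_id()`.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class RecordRef:
    """Hypothetical stand-in for a record ID made of table and identifier."""

    table: str
    id: str

    def __str__(self) -> str:
        return f"{self.table}:{self.id}"


def ensure_record_ref(value: Union[str, RecordRef]) -> RecordRef:
    """Pass record refs through unchanged; parse 'table:id' strings."""
    if isinstance(value, RecordRef):
        return value
    table, sep, ident = value.partition(":")
    if not sep or not table or not ident:
        raise ValueError(f"Expected 'table:id', got {value!r}")
    return RecordRef(table, ident)
```

Making the helper idempotent means callers can apply it right before every save without tracking which form the field currently holds.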

How to Extend

Add new speaker field: Add to required_fields list in validate_speakers()
Add episode config field: Validate in EpisodeProfile, update briefing generation code
Add job metadata: Extend PodcastEpisode with new fields (e.g., progress tracking)
Change job provider: Replace surreal-commands with alternative job queue library; update get_job_status()
