Commit graph

150 commits

Author SHA1 Message Date
LUIS NOVO
71b8d13b24 docs: generate comprehensive CLAUDE.md reference documentation across codebase
Create a hierarchical CLAUDE.md documentation system for the entire Open Notebook
codebase with focus on concise, pattern-driven reference cards rather than
comprehensive tutorials.

## Changes

### Core Documentation System
- Updated `.claude/commands/build-claude-md.md` to distinguish between leaf and
  parent modules, with special handling for prompt/template modules
- Established clear patterns:
  * Leaf modules (40-70 lines): Components, hooks, API clients
  * Parent modules (50-150 lines): Architecture, cross-layer patterns, data flows
  * Template modules: Pattern focus, not catalog listings

### Generated Documentation
Created 15 CLAUDE.md reference files across the project:

**Frontend (React/Next.js)**
- frontend/src/CLAUDE.md: Architecture overview, data flow, three-tier design
- frontend/src/lib/hooks/CLAUDE.md: React Query patterns, state management
- frontend/src/lib/api/CLAUDE.md: Axios client, FormData handling, interceptors
- frontend/src/lib/stores/CLAUDE.md: Zustand state persistence, auth patterns
- frontend/src/components/ui/CLAUDE.md: Radix UI primitives, CVA styling

**Backend (Python/FastAPI)**
- open_notebook/CLAUDE.md: System architecture, layer interactions
- open_notebook/ai/CLAUDE.md: Model provisioning, Esperanto integration
- open_notebook/domain/CLAUDE.md: Data models, ObjectModel/RecordModel patterns
- open_notebook/database/CLAUDE.md: Repository pattern, async migrations
- open_notebook/graphs/CLAUDE.md: LangGraph workflows, async orchestration
- open_notebook/utils/CLAUDE.md: Cross-cutting utilities, context building
- open_notebook/podcasts/CLAUDE.md: Episode/speaker profiles, job tracking

**API & Other**
- api/CLAUDE.md: REST layer, service architecture
- commands/CLAUDE.md: Async command handlers, job queue patterns
- prompts/CLAUDE.md: Jinja2 templates, prompt engineering patterns (refactored)

**Project Root**
- CLAUDE.md: Project overview, three-tier architecture, tech stack, getting started

### Key Features
- Zero duplication: Parent modules reference child CLAUDE.md files, don't repeat them
- Pattern-focused: Emphasizes how components work together, not component catalogs
- Scannable: Short bullets, code examples only when necessary (1-2 per file)
- Practical: "How to extend" guides, quirks/gotchas for each module
- Navigation: Root CLAUDE.md acts as hub pointing to specialized documentation

### Cleanup
- Removed unused `batch_fix_services.py`
- Removed deprecated `open_notebook/plugins/podcasts.py`
- Updated .gitignore for documentation consistency

## Impact
New contributors can now:
1. Read root CLAUDE.md for system architecture (5 min)
2. Jump to specific layer documentation (frontend, api, open_notebook)
3. Dive into module-specific patterns in child CLAUDE.md files (1 min per module)
All documentation is lean, reference-focused, and avoids duplication.
2026-01-03 16:27:52 -03:00
LUIS NOVO
ab5560c9a2 refactor: reorganize folder structure for better maintainability
Changes:
- Move migrations/ under open_notebook/database/migrations/
- Extract AI models to open_notebook/ai/ (Model, ModelManager, provision)
- Extract podcasts to open_notebook/podcasts/ (EpisodeProfile, SpeakerProfile, PodcastEpisode)
- Reorganize prompts to mirror graphs structure (chat/, source_chat/)

This improves code organization by:
- Consolidating database concerns (migrations now with database code)
- Separating AI infrastructure from domain entities
- Isolating podcast feature into its own module
- Creating consistent prompt/graph naming conventions

All 52 tests pass.
2026-01-03 14:04:27 -03:00
Luis Novo
d6eedde5a3
Merge pull request #333 from jflo/fix/strip-thinking-tags
fix: strip <think> tags from chat responses
2025-12-19 22:52:43 -03:00
Justin Florentine
855e730577
fix: preserve AIMessage metadata when cleaning thinking content
Use model_copy() instead of creating new AIMessage to preserve
response_metadata, id, usage_metadata, etc. Also adds test coverage
for malformed thinking tags pattern.

Addresses PR #333 feedback from lfnovo and cubic-dev-ai.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 20:08:12 -05:00
LUIS NOVO
e11f0a4db8 fix: resolve chat model selection and session display issues
- Add nullable_fields support to ObjectModel base class
- Configure ChatSession to allow model_override to be cleared to null
- Fix JSX conditional that rendered "0" when message_count was 0
- Display model name instead of raw ID in session manager

Fixes issues:
1. Switching to default model now persists correctly
2. Session list shows human-readable model names
3. Sessions with 0 messages no longer show "0" badge
2025-12-19 16:47:34 -03:00
Justin Florentine
869664a10b
fix: strip <think> tags from chat responses
Add thinking content cleaning to notebook and source chat graphs.
Previously, models that output <think>...</think> tags (like DeepSeek)
or malformed variants without opening tags (like Nemotron) would leak
reasoning content into user-visible responses.

Changes:
- chat.py: Clean AI response content before returning messages
- source_chat.py: Same fix for source-specific chat
- text_utils.py: Handle malformed output where opening <think> tag
  is missing but </think> is present
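
The cleaning behaviour described above could look roughly like the sketch below. It is illustrative only: `strip_thinking` is a hypothetical name, and the actual helper in text_utils.py may be structured differently.

```python
import re

# Hypothetical helper; the real function in text_utils.py may differ.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_thinking(text: str) -> str:
    """Remove reasoning content, including output with a missing opening tag."""
    # Well-formed case: drop every <think>...</think> block.
    cleaned = _THINK_BLOCK.sub("", text)
    # Malformed case: a closing tag with no opening tag. Treat everything up to
    # and including the last </think> as reasoning and keep only what follows.
    if "</think>" in cleaned:
        cleaned = cleaned.rsplit("</think>", 1)[1]
    return cleaned.strip()
```

Text without any tags passes through unchanged apart from whitespace trimming.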

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-18 16:31:23 -05:00
Bui Thanh Son
60566c9c4d
refactor: move environment variables loading to application entry point (#283) 2025-12-01 14:59:50 -03:00
Bui Thanh Son
55b8e6380c
fix: add UTF-8 encoding for async migrations file reading (#279)
Add UTF-8 encoding when opening migration files.
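
A minimal sketch of the idea, assuming a hypothetical `read_migration` helper (the project's actual migration loader is not shown here). Without an explicit encoding, `open()` uses the platform's locale encoding, which may not be UTF-8.

```python
from pathlib import Path


def read_migration(path: Path) -> str:
    """Read a migration file as UTF-8 regardless of the platform default."""
    # Hypothetical helper: passing encoding explicitly avoids decode errors on
    # systems whose locale encoding is not UTF-8.
    with open(path, encoding="utf-8") as f:
        return f.read()
```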
2025-11-27 09:59:47 -03:00
Luis Novo
45a99831a9
Hide sources notes (#273)
* fix: add missing overflow wrapper to notebooks list page

Adds flex-1 overflow-y-auto wrapper to enable proper scrolling
when notebook list exceeds viewport height. Matches the layout
pattern used by all other dashboard pages.

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: reorder transformation routes to prevent dynamic route interception

Moved static routes (/transformations/execute and /transformations/default-prompt)
before dynamic routes (/transformations/{transformation_id}) to ensure FastAPI
matches them correctly. Previously, requests to static routes were incorrectly
captured by the dynamic route handler.
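
The ordering problem can be illustrated with a small first-match router. This is a standard-library sketch of the matching behaviour, not FastAPI's implementation; `match_route` and the handler names are hypothetical.

```python
import re
from typing import List, Optional, Tuple


def match_route(routes: List[Tuple[str, str]], path: str) -> Optional[str]:
    """Return the name of the first route that matches, in registration order."""
    for pattern, name in routes:
        # Turn each "{param}" segment into a single-segment wildcard.
        regex = "^" + re.sub(r"\{[^/{}]+\}", "[^/]+", pattern) + "$"
        if re.match(regex, path):
            return name
    return None


# Hypothetical handler names, registered in two different orders.
dynamic_first = [
    ("/transformations/{transformation_id}", "get_transformation"),
    ("/transformations/execute", "execute"),
]
static_first = list(reversed(dynamic_first))
```

With `dynamic_first`, a request to `/transformations/execute` is captured by the dynamic route, with "execute" treated as the ID. With `static_first`, the same request reaches the static handler, and other IDs still fall through to the dynamic route.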

Fixes #250

Co-Authored-By: Claude <noreply@anthropic.com>

* chore: bump to 1.2.1

* hide source and notes panel - fixes #193

* feat: improve layout for mobile views

* bump version to 1.2.2

* fix: address PR review feedback for collapsible columns

- Remove unused CollapseButton component from CollapsibleColumn.tsx
- Rename useCollapseButton to createCollapseButton (not a React hook)
- Move dialogs outside Card in SourcesColumn.tsx for consistency
- Add useMemo for collapseButton in both columns to prevent re-renders

* feat: support multiple sources

* fix: prevent ChatColumn double mounting on desktop

Add useIsDesktop hook to conditionally render mobile view only on
mobile screens. Previously, the mobile ChatColumn was hidden via CSS
on desktop but still mounted, causing duplicate hooks initialization
and redundant network requests.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-11-25 16:59:26 -03:00
Luis Novo
f79a9040ae
Release 1.2 (#242)
* chore: improve podcast transcripts

* fix: remove date from insight - fixes #241

* fix: improve scrolling on source and insights - fixes #237

* chore: update esperanto to fix: #234

* chore: update esperanto to fix #226

* fix: process vectorization as subcommands to handle larger documents more gracefully - fix: #229

* feat: enable background job retry capabilities

* feat: reenable content types that were disabled during alpha version

* fix: remove unnecessary model caching causing many issues.

* feat: support multiple azure endpoints and keys just like openai compatible. Fixes #215

* docs: update azure variables

* chore: bump and update dependencies
2025-11-01 14:40:00 -03:00
Luis Novo
1a67f1f912
fix: enhance chat reference links and prevent text overflow (#173)
This commit addresses two related issues in the chat interface:

1. **Fix broken reference links (OSS-310)**
   - Completely rewrote convertReferencesToMarkdownLinks() with greedy pattern matching
   - Now handles all edge cases: references after commas, nested brackets, bold markdown
   - Added visual icon indicators (FileText, Lightbulb, FileEdit) for reference types
   - Implemented proper error handling with toast notifications
   - Added validation for reference types and ID lengths

2. **Fix long URL/text overflow (#172)**
   - Added break-words and overflow-wrap classes to chat messages
   - Long URLs and text now wrap properly within chat bubbles
   - Applied fix consistently across source chat, notebook chat, and search results

**Technical Details:**
- Enhanced reference detection algorithm processes from end to start to preserve indices
- Context analysis (50 chars before/after) determines original formatting
- Icons are 12px, accessible, and themed appropriately
- All changes pass linting and build successfully
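
The end-to-start processing mentioned above can be sketched in Python as follows. The reference syntax (`[source:abc123]`), the set of reference types, and the link target format are assumptions made for illustration; the actual logic lives in the TypeScript function convertReferencesToMarkdownLinks().

```python
import re

# Assumed reference syntax and types, for illustration only.
REFERENCE = re.compile(r"\[(source|note|insight):([A-Za-z0-9]+)\]")


def link_references(text: str) -> str:
    """Replace references with markdown links, working from the last match back."""
    matches = list(REFERENCE.finditer(text))
    # Replacing from the end keeps the start/end offsets of earlier matches
    # valid, because only text after them changes length.
    for m in reversed(matches):
        kind, ref_id = m.group(1), m.group(2)
        link = f"[{kind}:{ref_id}](#{kind}-{ref_id})"
        text = text[:m.start()] + link + text[m.end():]
    return text
```

Processing front to back would instead require recomputing every later offset after each replacement.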

**Files Modified:**
- frontend/src/lib/utils/source-references.tsx (core algorithm rewrite)
- frontend/src/components/source/ChatPanel.tsx (error handling + text wrapping)
- frontend/src/components/search/StreamingResponse.tsx (error handling + text wrapping)
- open_notebook/utils/token_utils.py (ruff formatting fix)

fixes #172
2025-10-19 15:38:59 -03:00
Luis Novo
aa593c60bd
feat: add persistent tiktoken cache to reduce re-downloads (#171)
Configure tiktoken to cache tokenizer encodings in ./data/tiktoken-cache
instead of using system temp directory. This prevents re-downloading
encoding files on every container restart and improves startup time.

Changes:
- Add TIKTOKEN_CACHE_DIR configuration in config.py
- Set TIKTOKEN_CACHE_DIR environment variable in token_utils.py
- Bump version to 1.0.7
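
A rough sketch of the configuration step, assuming a hypothetical `configure_tiktoken_cache` helper; the project's config.py and token_utils.py may wire this differently. tiktoken reads the `TIKTOKEN_CACHE_DIR` environment variable when it loads an encoding, so the variable should be set before the first encoding is requested.

```python
import os
from pathlib import Path


def configure_tiktoken_cache(data_dir: str = "./data") -> str:
    """Point tiktoken at a persistent cache directory under data_dir."""
    # Hypothetical helper; the directory name follows the commit description.
    cache_dir = Path(data_dir) / "tiktoken-cache"
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Must be set before tiktoken loads its first encoding.
    os.environ["TIKTOKEN_CACHE_DIR"] = str(cache_dir)
    return str(cache_dir)
```

If `./data` is mounted as a volume, the cached encoding files survive container restarts.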
2025-10-19 14:50:52 -03:00
LUIS NOVO
a51bb9d792 fix: missing parenthesis 2025-10-18 13:22:39 -03:00
LUIS NOVO
8b5daa86bc fix: max tokens max is 8192 now 2025-10-18 13:21:53 -03:00
Luis Novo
b7e656a319
Version 1 (#160)
New front-end
Launch Chat API
Manage Sources
Enable re-embedding of all contents
Sources can be added without a notebook now
Improved settings
Enable model selector on all chats
Background processing for better experience
Dark mode
Improved Notes

Improved Docs:
- Remove all Streamlit references from documentation
- Update deployment guides with React frontend setup
- Fix Docker environment variables format (SURREAL_URL, SURREAL_PASSWORD)
- Update docker image tag from :latest to :v1-latest
- Change navigation references (Settings → Models to just Models)
- Update development setup to include frontend npm commands
- Add MIGRATION.md guide for users upgrading from Streamlit
- Update quick-start guide with correct environment variables
- Add port 5055 documentation for API access
- Update project structure to reflect frontend/ directory
- Remove outdated source-chat documentation files
2025-10-18 12:46:22 -03:00
Luis Novo
fa27fe561a
Several hotfixes (#130)
* fix: prevent project failing to start when cannot talk to github - fixes #128

* improve ollama documentation - see #127

* chore: update esperanto library to enable gpt-5 - see #107; update podcast-creator library to enable TTS_BATCH_SIZE - fixes #125

* add info on ollama env variables

* chore: ignore dev logs

* chore: bump
2025-09-14 10:58:16 -03:00
Luis Novo
3b2ced54e2
fix environment variable error and enable docker build automation (#94)
* chore: fix database import error

* remove unused file and improve env example

* docker build automation
2025-07-17 09:54:28 -03:00
Luis Novo
d7b0fff954
Api podcast migration (#93)
Creates the API layer for Open Notebook
Creates a services API gateway for the Streamlit front-end
Migrates the SurrealDB SDK to the official one
Change all database calls to async
New podcast framework supporting multiple speaker configurations
Implement the surreal-commands library for async processing
Improve docker image and docker-compose configurations
2025-07-17 08:36:11 -03:00
LUIS NOVO
e3ee803a42 review: add validation and compile regex just once 2025-06-26 11:55:41 -03:00
LUIS NOVO
7eee271232 feat: extract think tags from reasoning models 2025-06-26 11:41:15 -03:00
LUIS NOVO
f4b9ccbb22 fix: remove provider check, not needed 2025-06-10 11:54:15 -03:00
LUIS NOVO
7239f719fd fix: enforce env variables are present 2025-06-10 11:53:53 -03:00
LUIS NOVO
61b3583a57 fix: fix provider routing for podcasts and add a try block to catch podcast generation issues 2025-06-10 11:52:41 -03:00
LUIS NOVO
24a359ecd3 chore: set migration target 2025-06-09 20:44:40 -03:00
LUIS NOVO
6532411d33 remove old model management code 2025-06-08 19:39:07 -03:00
LUIS NOVO
bea43f3ce7 feat: implement the new model management based on esperanto framework 2025-06-08 19:38:43 -03:00
LUIS NOVO
2afbd36cb4 refactor: implement ai_prompter library 2025-06-01 08:09:33 -03:00
LUIS NOVO
1afb5d81e8 feat: implement new content settings page and remove options from the source panel 2025-05-30 15:25:39 -03:00
LUIS NOVO
36e928eb75 feat: replace content processing engine with content-core 2025-05-30 13:35:46 -03:00
LUIS NOVO
aa4912334b fix: replace GEMINI_API_KEY with GOOGLE_API_KEY as per new SDK 2025-05-22 09:12:36 -03:00
LUIS NOVO
fe23e35670 fix elevenlabs provider selection 2025-03-11 20:49:53 -03:00
LUIS NOVO
bcef4ed46f fix bug with empty default models 2025-03-11 20:49:37 -03:00
LUIS NOVO
3d6d9c1195 fix record model empty object instantiation 2024-11-20 11:55:48 -03:00
LUIS NOVO
c297dcb809 refactor objectmodel 2024-11-19 19:03:32 -03:00
LUIS NOVO
add786d112 version bump and migration 2024-11-19 00:02:01 -03:00
LUIS NOVO
dbe362f95a enable podcast longform 2024-11-19 00:01:18 -03:00
LUIS NOVO
7f79f8224f improve context and several fixes 2024-11-18 22:01:49 -03:00
LUIS NOVO
4a5d47d934 refactor transformation, add graph and admin 2024-11-18 22:01:11 -03:00
LUIS NOVO
01db97924e improve cleanup function 2024-11-14 15:19:21 -03:00
LUIS NOVO
c8b5d422ae enable use without optional models, adds warning 2024-11-14 12:12:11 -03:00
LUIS NOVO
dd99531b00 final tweaks to podcast 2024-11-13 21:46:15 -03:00
LUIS NOVO
4f9aa63b3e add longform option to podcast generation 2024-11-13 19:08:03 -03:00
LUIS NOVO
1e35f069b0 add option to save insight as note 2024-11-13 17:33:38 -03:00
LUIS NOVO
182ae741d8 cleanup podcast 2024-11-13 17:02:48 -03:00
LUIS NOVO
06c6842f11 fix insight context to improve citations 2024-11-13 17:02:18 -03:00
LUIS NOVO
066c7a06e2 improve search functions 2024-11-13 15:52:44 -03:00
LUIS NOVO
b04761affc mypy fixes 2024-11-13 15:21:17 -03:00
LUIS NOVO
321234e485 add support for GROQ models 2024-11-13 15:09:56 -03:00
LUIS NOVO
9ba5709a3c model selector and model suggestions 2024-11-13 14:48:00 -03:00
LUIS NOVO
80353a97c9 make model rag work with vector only 2024-11-13 12:18:26 -03:00