Some LLM providers (notably Gemini, DeepSeek via OpenAI-compatible
proxies) return ai_message.content as a list of content parts:
[{'type': 'text', 'text': '...', 'extras': {...}}]
The current code uses str() on non-string content, which produces the
Python repr of the entire list — not valid JSON. This breaks
PydanticOutputParser.parse() with OutputParserException.
This adds extract_text_content() to properly unwrap text from both
string and structured content formats, applied in ask.py, chat.py,
and prompt.py.
Fixes #329
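The unwrapping described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual `extract_text_content()` from the patch; it assumes content parts shaped like the example above.

```python
def extract_text_content(content):
    """Return plain text from either a string or a list of content parts.

    Hypothetical sketch of the behavior described in this commit; the
    real implementation in ask.py/chat.py/prompt.py may differ.
    """
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        parts = []
        for part in content:
            # Structured part: {'type': 'text', 'text': '...', 'extras': {...}}
            if isinstance(part, dict) and part.get("type") == "text":
                parts.append(part.get("text", ""))
            elif isinstance(part, str):
                parts.append(part)
        return "".join(parts)
    # Last-resort fallback for unexpected content types.
    return str(content)
```

Unlike a blanket `str()`, this returns the raw text payload, so JSON emitted by the model survives intact for `PydanticOutputParser.parse()`.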
* feat: content-type aware chunking and unified embedding
- Add chunking.py with HTML, Markdown, and plain text detection
- Add embedding.py with mean pooling for large content
- Create dedicated commands: embed_note, embed_insight, embed_source
- Use fire-and-forget pattern for embedding via submit_command()
- Refactor rebuild_embeddings_command to delegate to individual commands
- Remove legacy commands and needs_embedding() methods
- Reduce chunk size to 1500 chars for Ollama compatibility
- Update CLAUDE.md documentation for new architecture
Fixes #350, #142
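The chunk-then-pool flow above can be sketched as below. This is a simplified illustration, assuming naive fixed-size character chunking; the real chunking.py is content-type aware (HTML, Markdown, plain text) and the function names here are hypothetical.

```python
import numpy as np

CHUNK_SIZE = 1500  # chars; reduced for Ollama compatibility (per the changelog)

def chunk_text(text: str, size: int = CHUNK_SIZE) -> list[str]:
    """Split text into fixed-size chunks (naive stand-in for chunking.py)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def mean_pool(chunk_embeddings: list[list[float]]) -> list[float]:
    """Average per-chunk vectors into one document-level embedding."""
    return np.asarray(chunk_embeddings, dtype=np.float32).mean(axis=0).tolist()
```

Mean pooling keeps the stored embedding a single fixed-length vector regardless of how many chunks the content was split into.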
* fix: address code review issues
- Note.save() now returns command_id for tracking embedding jobs
- Add length check after generate_embeddings() to fail fast on mismatch
- Add numpy as explicit dependency (was transitive)
- Remove hardcoded chunk sizes from docstrings
* docs: address code review comments
- Rename "SYNC PATH" to "DOMAIN MODEL PATH" in embedding router
- Add test_chunking.py and test_embedding.py to Testing Strategy
- Clarify auto-embedding behavior for each domain model
* fix: clean thinking tags from prompt graph output
Adds clean_thinking_content() to prompt.py to handle extended thinking
models that return <think>...</think> tags. This fixes empty titles
when saving notes from chat.
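The tag stripping can be sketched as a small regex pass. A hypothetical sketch, not the actual `clean_thinking_content()`:

```python
import re

# Match a <think>...</think> block, including newlines inside it.
THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)

def clean_thinking_content(text: str) -> str:
    """Strip <think>...</think> blocks emitted by extended-thinking models."""
    return THINK_RE.sub("", text).strip()
```

Without this, a title prompt whose entire visible output sits after the thinking block can come back empty once downstream code truncates or mis-parses the response.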
* chore: remove local docker-compose from git
* fix(frontend): handle null parent_id in search results
Add defensive check for null parent_id in search results to prevent
"Cannot read properties of null (reading 'split')" error. This can
happen with orphaned records in the database.
* fix: cascade delete embeddings and insights when source is deleted
When deleting a Source, now also deletes associated:
- source_embedding records
- source_insight records
This prevents orphaned records that cause null parent_id errors
in vector search results.
* fix: add cleanup for orphan embedding/insight records in migration 10
Deletes source_embedding and source_insight records where the
linked source no longer exists (source.id = NONE).
* chore: bump esperanto to 2.16
Increases ctx_num for Ollama models to accommodate larger notebook
context windows. See: https://github.com/lfnovo/esperanto/pull/69
- New front-end
- Launch Chat API
- Manage Sources
- Enable re-embedding of all content
- Sources can now be added without a notebook
- Improved settings
- Enable model selector on all chats
- Background processing for a better experience
- Dark mode
- Improved Notes
Improved Docs:
- Remove all Streamlit references from documentation
- Update deployment guides with React frontend setup
- Fix Docker environment variables format (SURREAL_URL, SURREAL_PASSWORD)
- Update docker image tag from :latest to :v1-latest
- Change navigation references (Settings → Models to just Models)
- Update development setup to include frontend npm commands
- Add MIGRATION.md guide for users upgrading from Streamlit
- Update quick-start guide with correct environment variables
- Add port 5055 documentation for API access
- Update project structure to reflect frontend/ directory
- Remove outdated source-chat documentation files
- Creates the API layer for Open Notebook
- Creates a services API gateway for the Streamlit front-end
- Migrates the SurrealDB SDK to the official one
- Changes all database calls to async
- New podcast framework supporting multiple speaker configurations
- Implements the surreal-commands library for async processing
- Improves docker image and docker-compose configurations