Commit graph

11 commits

Author SHA1 Message Date
Luis Novo
5b2c97cab7
Fix re-embedding issues and improve retry strategy (#515)
* fix: filter empty content in rebuild embeddings queries

Update collect_items_for_rebuild() to properly filter out items with
empty or whitespace-only content before submitting embedding jobs.

Changes:
- Sources: add string::trim(full_text) != '' filter
- Notes: add string::trim(content) != '' filter
- Insights: add content != none AND string::trim(content) != '' filter
  (previously had no content filter at all)

This prevents unnecessary job submissions that would fail validation
in the individual embed commands.

Ref #513
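
A minimal sketch of the filters described above, assuming table names `source`, `note`, and `source_insight` (the commit names only the fields, so the table names and the exact query shape are hypothetical). The Python predicate is a rough client-side equivalent of the same rule, not code from the repository.

```python
from typing import Optional

# Illustrative SurrealQL queries; table names are assumptions, the
# WHERE clauses follow the filters listed in the commit message.
REBUILD_QUERIES = {
    "source": "SELECT id FROM source WHERE string::trim(full_text) != ''",
    "note": "SELECT id FROM note WHERE string::trim(content) != ''",
    "source_insight": (
        "SELECT id FROM source_insight "
        "WHERE content != none AND string::trim(content) != ''"
    ),
}


def has_embeddable_content(text: Optional[str]) -> bool:
    """Reject None, empty, and whitespace-only content."""
    return text is not None and text.strip() != ""
```

Filtering at query time means items with nothing to embed never become jobs, so they cannot fail validation later.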

* feat: add command_id to embedding error logs

Add get_command_id() helper to extract command_id from execution context.
Include command_id in error logs for all embedding commands:
- embed_note_command
- embed_insight_command
- embed_source_command
- create_insight_command

This makes it easier to trace failed embedding jobs back to specific
command records in the database.

Ref #513
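
One way such a helper could look, as a sketch only: the attribute names `execution_context` and `command_id` on the input object, and the `format_embed_error` function, are assumptions for illustration and may not match the real command input model.

```python
from typing import Any, Optional


def get_command_id(input_data: Any) -> Optional[str]:
    """Return the command id from the execution context, or None if absent.

    Assumes input_data.execution_context.command_id (hypothetical layout).
    """
    context = getattr(input_data, "execution_context", None)
    command_id = getattr(context, "command_id", None)
    return str(command_id) if command_id is not None else None


def format_embed_error(input_data: Any, item_id: str, error: Exception) -> str:
    """Build an error message that includes the command id for tracing."""
    command_id = get_command_id(input_data) or "unknown"
    return f"Failed to embed {item_id} (command_id={command_id}): {error}"
```

Returning `None` instead of raising keeps logging safe when a command runs without an execution context.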

* fix: improve logging for embedding commands

Log improvements:
- Add command_id to all embedding error logs for traceability
- Transaction conflicts in repo_insert now log at DEBUG (not ERROR)
- Embedding API errors log at DEBUG, only ERROR when retries exhausted
- Friendlier retry messages: "This will be retried automatically"
- Include model name and command_id in generate_embeddings errors

Files changed:
- commands/embedding_commands.py: command_id in logs, friendlier messages
- open_notebook/database/repository.py: DEBUG for transaction conflicts
- open_notebook/utils/embedding.py: DEBUG logging, pass-through command_id

Ref #513

* fix: correct field names in rebuild embeddings status endpoint

The API status endpoint was reading the wrong field names:
- sources_processed → sources_submitted
- notes_processed → notes_submitted
- insights_processed → insights_submitted
- processed_items → jobs_submitted
- failed_items → failed_submissions

The command reports "_submitted" fields because embedding happens
asynchronously: it counts jobs submitted, not items processed.

Ref #513
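
A sketch of reading the corrected fields, assuming the command result is a plain dict. The input keys are the ones named above; the function name and the output keys are hypothetical.

```python
from typing import Any, Dict


def build_rebuild_status(result: Dict[str, Any]) -> Dict[str, int]:
    """Map the rebuild command's "_submitted" fields into a status payload."""
    return {
        "sources": int(result.get("sources_submitted", 0)),
        "notes": int(result.get("notes_submitted", 0)),
        "insights": int(result.get("insights_submitted", 0)),
        "total_jobs": int(result.get("jobs_submitted", 0)),
        "failed": int(result.get("failed_submissions", 0)),
    }
```

Reading the old "_processed" keys from such a result would silently fall back to the defaults, which matches a status that always shows zero.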

* fix: update rebuild UI text to reflect async job submission

Changed terminology from "Completed/processed" to "Jobs Submitted"
because the rebuild command submits embedding jobs for async processing
rather than completing them synchronously.

Updated in all locales: en-US, pt-BR, zh-CN, zh-TW, ja-JP

Ref #513

* refactor: migrate retry strategy from allowlist to blocklist

- Change from `retry_on: [RuntimeError, ...]` to `stop_on: [ValueError]`
- This is more resilient: new exception types auto-retry by default
- Simplified exception handling: ValueError = permanent, else = retry
- Transient errors logged at DEBUG (surreal-commands logs final failure)
- Permanent errors (ValueError) logged at ERROR

Ref #513
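
The blocklist idea can be sketched in plain Python. This is not the surreal-commands implementation: `STOP_ON`, `should_retry`, and `run_with_retry` are hypothetical names, and wait times between attempts are left out for brevity.

```python
import logging
from typing import Callable, Tuple, Type, TypeVar

logger = logging.getLogger("retry_sketch")
T = TypeVar("T")

# Blocklist: only these exception types count as permanent failures.
STOP_ON: Tuple[Type[BaseException], ...] = (ValueError,)


def should_retry(exc: BaseException,
                 stop_on: Tuple[Type[BaseException], ...] = STOP_ON) -> bool:
    """Retry everything except the exception types on the blocklist."""
    return not isinstance(exc, stop_on)


def run_with_retry(func: Callable[[], T], max_attempts: int = 3,
                   stop_on: Tuple[Type[BaseException], ...] = STOP_ON) -> T:
    """Call func until it succeeds, hits a permanent error, or runs out of attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            if not should_retry(exc, stop_on):
                # Permanent error: log loudly and give up immediately.
                logger.error("Permanent failure, not retrying: %s", exc)
                raise
            if attempt == max_attempts:
                # Retries exhausted: let the caller see the last error.
                raise
            logger.debug("Transient failure (attempt %d/%d): %s",
                         attempt, max_attempts, exc)
    raise RuntimeError("max_attempts must be at least 1")
```

With an allowlist, an exception type nobody anticipated fails on the first attempt; with a blocklist, it is retried unless it is explicitly marked permanent.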
2026-01-31 18:55:01 -03:00
Luis Novo
a329806a33
fix: improve error handling in repo_create and repo_insert (#474)
* fix: improve error handling in repo_create and repo_insert

When SurrealDB encounters an error (e.g., schema validation failure),
it may return a string error message instead of the expected record.
Previously, this caused a confusing "'str' object has no attribute
'items'" error in base.py:save().

This change adds error string detection to repo_create() and repo_insert(),
raising a RuntimeError with the actual SurrealDB error message. This helps
debug issues like #469 by showing the underlying database error.

Related to #469

* fix: preserve error details in repo_insert RuntimeError handling

The RuntimeError with SurrealDB error message was being caught by
the broad 'except Exception' block and replaced with a generic
'Failed to create record' message.

Now RuntimeError is caught separately and re-raised, preserving
the actual error details for debugging.
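
Both fixes in this commit can be illustrated together. The sketch below assumes the database call is passed in as a callable; `ensure_records` and `create_record` are hypothetical names, not the signatures of `repo_create()` or `repo_insert()`.

```python
from typing import Any, Callable, Dict, List


def ensure_records(result: Any) -> List[Dict[str, Any]]:
    """Raise if the database returned an error string instead of records."""
    if isinstance(result, str):
        raise RuntimeError(result)
    return result


def create_record(db_call: Callable[[str, Dict[str, Any]], Any],
                  table: str, data: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Run db_call and surface database errors with their original message."""
    try:
        return ensure_records(db_call(table, data))
    except RuntimeError:
        # Re-raise untouched so the database's own message is preserved.
        raise
    except Exception as exc:
        # Anything else is wrapped in a generic error, with the cause attached.
        raise RuntimeError("Failed to create record") from exc
```

Without the dedicated `except RuntimeError` branch, the broad handler would replace the database message with the generic one, which is the regression the second fix addresses.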
2026-01-25 15:11:51 -03:00
LUIS NOVO
48e2800211 fix: reduce retry log noise during concurrent chunk processing
Addresses issue #362 - users were seeing hundreds of ERROR/WARNING logs
when processing large documents due to SurrealDB v2 transaction conflicts
during concurrent chunk embedding operations.

Changes:
- Upgraded to surreal-commands v1.3.0 which includes retry_log_level feature
- Increased retry attempts from 5 to 15 with max wait time 120s (from 30s)
  to handle deep queues during concurrent processing
- Set retry_log_level to "debug" in embed_chunk and process_source commands
- Changed repository.py RuntimeError logging from ERROR to DEBUG level
- Updated command exception handlers to log retries at DEBUG level
- Updated documentation to reflect retry strategy

This is a temporary workaround for SurrealDB v2.x transaction conflict
issues with SEARCH indexes. Settings can be reduced after migrating to
SurrealDB v3 which fixes the underlying concurrency issue.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-05 11:30:55 -03:00
Luis Novo
45a99831a9
Hide sources notes (#273)
* fix: add missing overflow wrapper to notebooks list page

Adds flex-1 overflow-y-auto wrapper to enable proper scrolling
when notebook list exceeds viewport height. Matches the layout
pattern used by all other dashboard pages.

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: reorder transformation routes to prevent dynamic route interception

Moved static routes (/transformations/execute and /transformations/default-prompt)
before dynamic routes (/transformations/{transformation_id}) to ensure FastAPI
matches them correctly. Previously, requests to static routes were incorrectly
captured by the dynamic route handler.

Fixes #250
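
Why registration order matters can be shown with a simplified first-match router in plain Python. This is an illustration of the matching behaviour, not FastAPI code; `compile_route` and `match_first` are hypothetical helpers.

```python
import re
from typing import List, Optional


def compile_route(path: str) -> "re.Pattern[str]":
    """Turn "/x/{param}" into a regex where each {param} matches one path segment."""
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", path)
    return re.compile(f"^{pattern}$")


def match_first(routes: List[str], request_path: str) -> Optional[str]:
    """Return the first registered route that matches, in registration order."""
    for route in routes:
        if compile_route(route).match(request_path):
            return route
    return None


# Dynamic route registered first: it captures the static path as an id.
dynamic_first = [
    "/transformations/{transformation_id}",
    "/transformations/execute",
]

# Static route registered first: the static path is matched as intended.
static_first = [
    "/transformations/execute",
    "/transformations/{transformation_id}",
]
```

Here `match_first(dynamic_first, "/transformations/execute")` selects the dynamic route, while the same call with `static_first` selects the static one.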

Co-Authored-By: Claude <noreply@anthropic.com>

* chore: bump to 1.2.1

* hide source and notes panel - fixes #193

* feat: improve layout for mobile views

* bump version to 1.2.2

* fix: address PR review feedback for collapsible columns

- Remove unused CollapseButton component from CollapsibleColumn.tsx
- Rename useCollapseButton to createCollapseButton (not a React hook)
- Move dialogs outside Card in SourcesColumn.tsx for consistency
- Add useMemo for collapseButton in both columns to prevent re-renders

* feat: support multiple sources

* fix: prevent ChatColumn double mounting on desktop

Add useIsDesktop hook to conditionally render mobile view only on
mobile screens. Previously, the mobile ChatColumn was hidden via CSS
on desktop but still mounted, causing duplicate hooks initialization
and redundant network requests.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-11-25 16:59:26 -03:00
Luis Novo
f79a9040ae
Release 1.2 (#242)
* chore: improve podcast transcripts

* fix: remove date from insight - fixes #241

* fix: improve scrolling on source and insights - fixes #237

* chore: update esperanto to fix #234

* chore: update esperanto to fix #226

* fix: process vectorization as subcommands to handle larger documents more gracefully - fixes #229

* feat: enable background job retry capabilities

* feat: reenable content types that were disabled during alpha version

* fix: remove unnecessary model caching causing many issues.

* feat: support multiple Azure endpoints and keys, as already supported for OpenAI-compatible providers. Fixes #215

* docs: update azure variables

* chore: bump and update dependencies
2025-11-01 14:40:00 -03:00
Luis Novo
3b2ced54e2
fix environment variable error and enable docker build automation (#94)
* chore: fix database import error

* remove unused file and improve env example

* docker build automation
2025-07-17 09:54:28 -03:00
Luis Novo
d7b0fff954
Api podcast migration (#93)
Creates the API layer for Open Notebook
Creates a services API gateway for the Streamlit front-end
Migrates the SurrealDB SDK to the official one
Changes all database calls to async
New podcast framework supporting multiple speaker configurations
Implement the surreal-commands library for async processing
Improve docker image and docker-compose configurations
2025-07-17 08:36:11 -03:00
LUIS NOVO
c297dcb809 refactor objectmodel 2024-11-19 19:03:32 -03:00
LUIS NOVO
4a5d47d934 refactor transformation, add graph and admin 2024-11-18 22:01:11 -03:00
LUIS NOVO
e4b8fa8cc7 cleanup logging 2024-11-13 12:17:57 -03:00
LUIS NOVO
2de8520d0c refactor database module and migrations 2024-10-30 16:33:07 -03:00
Renamed from open_notebook/repository.py