Utils Module
Utility functions and helpers for context building, text processing, chunking, embedding, tokenization, and versioning.
Purpose
Provides cross-cutting concerns: building LLM context from sources/insights, content-type aware text chunking, unified embedding generation with mean pooling, token counting, and version management.
Architecture Overview
Six core utilities:
- context_builder.py: Flexible context assembly from sources, notes, insights with token budgeting
- chunking.py: Content-type detection and smart text chunking for embedding operations
- embedding.py: Unified embedding generation with mean pooling for large content
- text_utils.py: Text cleaning and thinking content extraction
- token_utils.py: Token counting for LLM context windows (wrapper around encoding library)
- version_utils.py: Version parsing, comparison, and schema compatibility checks
Each utility is stateless and can be imported independently.
Configuration
Chunking Configuration (chunking.py)
The chunking behavior can be configured via environment variables:
- OPEN_NOTEBOOK_CHUNK_SIZE: Maximum chunk size in characters (default: 1200)
- Minimum: 100 characters
- Warnings: emitted for values > 8192 characters or invalid values
- Use case: Smaller models (e.g., mxbai-embed-large with limited context window)
- OPEN_NOTEBOOK_CHUNK_OVERLAP: Overlap between chunks in characters (default: 15% of CHUNK_SIZE)
- Must be: >= 0 and < CHUNK_SIZE
- Warnings: emitted for invalid values or values >= CHUNK_SIZE
- Use case: Control how much context is shared between adjacent chunks
Example for models with small context windows:
export OPEN_NOTEBOOK_CHUNK_SIZE=512
export OPEN_NOTEBOOK_CHUNK_OVERLAP=50
Note: Changes require restart of the application.
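The validation rules above can be sketched as a small config reader. The variable names and bounds come from this doc; the exact fallback logic is an assumption, not the project's code:

```python
import os

DEFAULT_CHUNK_SIZE = 1200

def read_chunk_config() -> tuple[int, int]:
    """Read chunk env vars, falling back to documented defaults on invalid values."""
    try:
        size = int(os.environ.get("OPEN_NOTEBOOK_CHUNK_SIZE", DEFAULT_CHUNK_SIZE))
    except ValueError:
        size = DEFAULT_CHUNK_SIZE      # invalid value: warn and use default
    if size < 100:                     # enforce documented minimum of 100 chars
        size = DEFAULT_CHUNK_SIZE
    default_overlap = int(size * 0.15)  # default overlap: 15% of chunk size
    try:
        overlap = int(os.environ.get("OPEN_NOTEBOOK_CHUNK_OVERLAP", default_overlap))
    except ValueError:
        overlap = default_overlap
    if overlap < 0 or overlap >= size:  # overlap must be >= 0 and < chunk size
        overlap = default_overlap
    return size, overlap
```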
Component Catalog
context_builder.py
- ContextItem: Dataclass for individual context piece (id, type, content, priority, token_count)
- ContextConfig: Configuration for context building (sources/notes/insights selection, max tokens, priority weights)
- ContextBuilder: Main class assembling context
  - add_source(): Include source by ID with inclusion level
  - add_note(): Include note by ID
  - add_insight(): Include insight by ID
  - build(): Assemble context respecting token budget and priorities
- Uses vector_search to fetch source/insight content from SurrealDB
- Returns list of ContextItem objects sorted by priority
Key behavior:
- Token counting is automatic (calculated in ContextItem.__post_init__)
- Max token enforcement via priority weighting (higher priority items included first)
- Type-specific fetching: sources → Source.full_text, notes → Note.content, insights → SourceInsight.content
- Raises DatabaseOperationError if source/note fetch fails
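The budgeting behavior above (priority-first inclusion, hard stop once max_tokens would be exceeded) can be sketched with plain dataclasses; names mirror this doc, but the selection code is illustrative only:

```python
from dataclasses import dataclass

@dataclass
class Item:
    id: str
    priority: float
    token_count: int

def select_within_budget(items: list[Item], max_tokens: int) -> list[Item]:
    """Include highest-priority items first; stop at the token budget."""
    selected, used = [], 0
    for item in sorted(items, key=lambda i: i.priority, reverse=True):
        if used + item.token_count > max_tokens:
            break  # hard stop once the budget would be exceeded (not prorated)
        selected.append(item)
        used += item.token_count
    return selected
```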
chunking.py
- ContentType: Enum (HTML, MARKDOWN, PLAIN)
- CHUNK_SIZE: Configurable via OPEN_NOTEBOOK_CHUNK_SIZE env var (default: 1200)
- CHUNK_OVERLAP: Configurable via OPEN_NOTEBOOK_CHUNK_OVERLAP env var (default: 15% of CHUNK_SIZE)
- detect_content_type_from_extension(file_path): Detect type from file extension
- detect_content_type_from_heuristics(text): Detect type from content patterns (returns type + confidence)
- detect_content_type(text, file_path): Combined detection (extension primary, heuristics fallback)
- chunk_text(text, content_type, file_path): Split text using appropriate splitter
Key behavior:
- Uses LangChain splitters: HTMLHeaderTextSplitter, MarkdownHeaderTextSplitter, RecursiveCharacterTextSplitter
- Extension-based detection is primary; heuristics can override PLAIN extensions with 0.8+ confidence
- Secondary chunking applied when HTML/Markdown splitters produce oversized chunks
- Returns list of strings, each ≤ CHUNK_SIZE characters
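A hedged sketch of the heuristic detection step described above; the patterns and exact confidence values are illustrative, only the "(type, confidence)" contract and the 0.8 override threshold come from this doc:

```python
import re
from enum import Enum

class ContentType(Enum):
    HTML = "html"
    MARKDOWN = "markdown"
    PLAIN = "plain"

def detect_content_type_from_heuristics(text: str) -> tuple[ContentType, float]:
    """Guess content type from structural patterns, returning (type, confidence)."""
    # HTML: presence of common structural tags
    if re.search(r"</?(html|body|div|p|span)\b", text, re.IGNORECASE):
        return ContentType.HTML, 0.9
    # Markdown: ATX-style headers at line start
    if re.search(r"^#{1,6}\s+\S", text, re.MULTILINE):
        return ContentType.MARKDOWN, 0.85
    return ContentType.PLAIN, 0.5
```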
embedding.py
- mean_pool_embeddings(embeddings): Combine multiple embeddings via normalized mean pooling
- generate_embeddings(texts): Batch embedding via single Esperanto API call
- generate_embedding(text, content_type, file_path): Unified embedding with automatic chunking + mean pooling
Key behavior:
- Uses model_manager.get_model("embedding") for embedding model
- Short text (≤ CHUNK_SIZE): direct embedding
- Long text: chunk → embed each → mean pool results
- Mean pooling: normalize each → mean → normalize result (using numpy)
- Raises ValueError for empty/whitespace-only text
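The normalize → mean → normalize pipeline can be written directly with numpy. A sketch consistent with the steps above, not necessarily the module's exact code:

```python
import numpy as np

def mean_pool_embeddings(embeddings: list[list[float]]) -> list[float]:
    """Normalize each chunk embedding, average them, then normalize the result."""
    vectors = np.asarray(embeddings, dtype=np.float64)
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                       # guard against zero vectors
    pooled = (vectors / norms).mean(axis=0)       # mean of unit vectors
    pooled_norm = np.linalg.norm(pooled)
    return (pooled / pooled_norm).tolist() if pooled_norm else pooled.tolist()
```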
text_utils.py
- remove_non_ascii(text): Remove non-ASCII characters from text
- remove_non_printable(text): Remove non-printable characters, preserving newlines/tabs
- parse_thinking_content(content): Extract <think> tag content from AI responses
- clean_thinking_content(content): Remove <think> blocks, return cleaned content only
Key behavior:
- parse_thinking_content handles malformed output (missing opening <think> tag)
- Large content (>100KB) bypasses thinking extraction for performance
- Non-string input returns empty thinking and stringified content
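A minimal sketch of the extraction behavior above, including the malformed-output case where only the closing tag survives; the regexes are hypothetical, not the project's:

```python
import re

def parse_thinking_content(content: str) -> tuple[str, str]:
    """Return (thinking, cleaned_content) for AI responses using <think> tags."""
    if not isinstance(content, str):
        return "", str(content)          # non-string input: stringified content
    if len(content) > 100_000:           # large content bypasses extraction
        return "", content
    match = re.search(r"<think>(.*?)</think>", content, re.DOTALL)
    if match:
        thinking = match.group(1).strip()
        cleaned = (content[:match.start()] + content[match.end():]).strip()
        return thinking, cleaned
    # malformed output: closing tag without an opening one
    if "</think>" in content:
        thinking, _, cleaned = content.partition("</think>")
        return thinking.strip(), cleaned.strip()
    return "", content
```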
token_utils.py
- token_count(text): Returns estimated token count for string (via tiktoken)
- token_cost(text, model): Calculate cost estimate for text with given model
Key behavior: Uses cl100k_base encoding; may differ slightly from actual model tokenization
version_utils.py
- compare_versions(v1, v2): Returns -1 (v1 < v2), 0 (equal), 1 (v1 > v2)
- get_installed_version(package): Get version of installed Python package
- get_version_from_github(url): Fetch latest version from GitHub releases
Key behavior: Uses packaging library for version parsing; supports pre-release tags
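compare_versions can be sketched with the packaging library, which handles pre-release tags such as 1.7.0-rc1; this assumes packaging is the parser in use, as noted above:

```python
from packaging.version import Version

def compare_versions(v1: str, v2: str) -> int:
    """Return -1 (v1 < v2), 0 (equal), or 1 (v1 > v2), PEP 440 aware."""
    a, b = Version(v1), Version(v2)
    return -1 if a < b else (1 if a > b else 0)
```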
Common Patterns
- Dataclass-driven config: ContextConfig used by ContextBuilder (immutable after init)
- Token budgeting: ContextBuilder respects max_tokens constraint; prioritizes high-priority items
- Content-type aware processing: Chunking uses appropriate splitter based on detected content type
- Mean pooling for large content: Embedding handles arbitrarily large text via chunking + pooling
- Error handling resilience: token_count() returns estimate; context_builder catches DB errors gracefully
- Pure text functions: text_utils functions are stateless utilities (no class needed)
- Lazy evaluation: ContextBuilder doesn't fetch items until build() called
- Type hints throughout: All functions use Optional, List, Dict for clarity
Key Dependencies
- open_notebook.domain.notebook: Source, Note, SourceInsight models; vector_search function
- open_notebook.ai.models: model_manager for embedding model access
- open_notebook.exceptions: DatabaseOperationError, NotFoundError
- langchain_text_splitters: HTMLHeaderTextSplitter, MarkdownHeaderTextSplitter, RecursiveCharacterTextSplitter
- numpy: Mean pooling calculations
- tiktoken: Token encoding for GPT models
- loguru: Logging throughout
Important Quirks & Gotchas
- Token count estimation: Uses cl100k_base encoding; may differ 5-10% from actual model tokens
- Chunk size for Ollama: the default chunk size (1200 chars) was chosen to fit within Ollama embedding model context limits
- Content type detection order: Extension checked first, then heuristics; high-confidence heuristics (≥0.8) can override PLAIN extensions
- Mean pooling normalization: Each embedding normalized before mean, result normalized after
- Priority weights default: If not specified, ContextConfig uses default weights (source=1, note=0.8, insight=1.2)
- Vector search required: ContextBuilder assumes vector_search is available on Notebook model; fails if not
- Circular import risk: context_builder imports from domain.notebook; avoid domain importing utils
- Max tokens hard limit: ContextBuilder stops adding items once max_tokens exceeded (not prorated)
- No caching: Every build() call re-fetches from database (use cache layer if needed)
How to Extend
- Add new context source type: Create fetch method in ContextBuilder; update ContextConfig.sources dict
- Add content type: Add to ContentType enum; create splitter getter; update chunk_text()
- Change chunk size: Set OPEN_NOTEBOOK_CHUNK_SIZE and OPEN_NOTEBOOK_CHUNK_OVERLAP environment variables
- Add text preprocessing: Add new function to text_utils (e.g., remove_urls, extract_keywords)
- Change tokenization: Replace tiktoken with alternative library in token_utils; update all calls
- Add context filtering: Extend ContextConfig with filter_by_date, filter_by_topic fields
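As an example of the "add text preprocessing" extension point, a hypothetical remove_urls helper in the style of the existing text_utils functions (the name and regex are illustrative, not part of the module):

```python
import re

_URL_RE = re.compile(r"https?://\S+")

def remove_urls(text: str) -> str:
    """Strip http(s) URLs, collapsing any doubled spaces left behind."""
    return re.sub(r"  +", " ", _URL_RE.sub("", text)).strip()
```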
Usage Examples
Chunking
from open_notebook.utils.chunking import chunk_text, detect_content_type, ContentType
# Auto-detect content type and chunk
chunks = chunk_text(long_text, file_path="document.md")
# Explicit content type
chunks = chunk_text(html_content, content_type=ContentType.HTML)
Embedding
from open_notebook.utils.embedding import generate_embedding, generate_embeddings
# Single text (handles chunking + mean pooling automatically)
embedding = await generate_embedding(long_text)
# Batch embedding (more efficient for multiple texts)
embeddings = await generate_embeddings(["text1", "text2", "text3"])
Context Building
from open_notebook.utils.context_builder import ContextBuilder, ContextConfig
config = ContextConfig(
sources={"source:123": "full", "source:456": "summary"},
max_tokens=2000,
)
builder = ContextBuilder(notebook, config)
context_items = await builder.build()
for item in context_items:
print(f"{item.type}:{item.id} ({item.token_count} tokens)")
encryption.py
- get_secret_from_env(var_name): Retrieve secret from environment with Docker secrets support (checks VAR_FILE first, then VAR)
- get_fernet(): Get Fernet instance if encryption key is configured
- encrypt_value(value): Encrypt a string using Fernet symmetric encryption
- decrypt_value(value): Decrypt a Fernet-encrypted string; gracefully falls back to the original value for legacy/unencrypted data
Purpose: Provides field-level encryption for sensitive data (API keys) stored in the database. Uses Fernet symmetric encryption (AES-128-CBC with HMAC-SHA256) for authenticated encryption.
Key behavior:
- Key source: OPEN_NOTEBOOK_ENCRYPTION_KEY_FILE (Docker secrets) → OPEN_NOTEBOOK_ENCRYPTION_KEY (env var)
- Accepts any string: the key is always derived into a valid Fernet key via SHA-256
- No default key — encryption is unavailable until the env var is set
- Graceful fallback on decryption: InvalidToken errors (legacy unencrypted data) return the original value
- Lazy-loaded key: initialized on first use, not at import time
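The always-derive-via-SHA-256 behavior maps any passphrase to a valid 32-byte Fernet key; the derivation itself needs only the standard library. A sketch, the module's actual function name may differ:

```python
import base64
import hashlib

def derive_fernet_key(passphrase: str) -> bytes:
    """Derive a urlsafe-base64 Fernet key (44 bytes) from any passphrase via SHA-256."""
    digest = hashlib.sha256(passphrase.encode("utf-8")).digest()  # 32 raw bytes
    return base64.urlsafe_b64encode(digest)                       # Fernet key format
```

Fernet(derive_fernet_key(...)) from the cryptography package would then construct the cipher, assuming that dependency is installed.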
Security considerations:
- OPEN_NOTEBOOK_ENCRYPTION_KEY must be set explicitly (no default)
- Docker secrets pattern supported for secure key injection in containerized environments
- Key rotation would require re-encrypting all stored keys (not currently implemented)
- Encryption is transparent to callers; unencrypted legacy data continues to work
Usage Example:
from open_notebook.utils.encryption import encrypt_value, decrypt_value
# Encrypt before storing in database
encrypted_api_key = encrypt_value(api_key)
# Decrypt when reading from database
decrypted_api_key = decrypt_value(encrypted_api_key)
# Set any string as encryption key:
# OPEN_NOTEBOOK_ENCRYPTION_KEY=my-secret-passphrase
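The VAR_FILE-then-VAR lookup order used by get_secret_from_env follows the common Docker secrets pattern; a stdlib-only sketch, not the module's exact implementation:

```python
import os
from pathlib import Path
from typing import Optional

def get_secret_from_env(var_name: str) -> Optional[str]:
    """Check VAR_FILE (Docker secrets mount) first, then fall back to VAR itself."""
    file_path = os.environ.get(f"{var_name}_FILE")
    if file_path and Path(file_path).is_file():
        return Path(file_path).read_text().strip()  # secret mounted as a file
    return os.environ.get(var_name)                 # plain env var fallback
```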