Utils Module
Utility functions and helpers for context building, text processing, chunking, embedding, tokenization, and versioning.
Purpose
Provides cross-cutting concerns: building LLM context from sources/insights, content-type aware text chunking, unified embedding generation with mean pooling, token counting, and version management.
Architecture Overview
Six core utilities:
- context_builder.py: Flexible context assembly from sources, notes, insights with token budgeting
- chunking.py: Content-type detection and smart text chunking for embedding operations
- embedding.py: Unified embedding generation with mean pooling for large content
- text_utils.py: Text cleaning and thinking content extraction
- token_utils.py: Token counting for LLM context windows (wrapper around encoding library)
- version_utils.py: Version parsing, comparison, and schema compatibility checks
Each utility is stateless and can be imported independently.
Configuration
Chunking Configuration (chunking.py)
The chunking behavior can be configured via environment variables:
- OPEN_NOTEBOOK_CHUNK_SIZE: Maximum chunk size in characters (default: 1200)
- Minimum: 100 characters
- Warnings: emitted for values > 8192 characters or invalid values
- Use case: Smaller models (e.g., mxbai-embed-large with limited context window)
- OPEN_NOTEBOOK_CHUNK_OVERLAP: Overlap between chunks in characters (default: 15% of CHUNK_SIZE)
- Must be: >= 0 and < CHUNK_SIZE
- Warnings: emitted for invalid values or values >= CHUNK_SIZE
- Use case: Control how much context is shared between adjacent chunks
Example for models with small context windows:
export OPEN_NOTEBOOK_CHUNK_SIZE=512
export OPEN_NOTEBOOK_CHUNK_OVERLAP=50
Note: Changes require restart of the application.
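The validation rules above can be sketched as a small config reader. The variable names and bounds come from this doc; the exact fallback logic is an assumption, not the project's code:

```python
import os

DEFAULT_CHUNK_SIZE = 1200

def read_chunk_config() -> tuple[int, int]:
    """Read chunk env vars, falling back to documented defaults on invalid values."""
    try:
        size = int(os.environ.get("OPEN_NOTEBOOK_CHUNK_SIZE", DEFAULT_CHUNK_SIZE))
    except ValueError:
        size = DEFAULT_CHUNK_SIZE      # invalid value: warn and use default
    if size < 100:                     # enforce documented minimum of 100 chars
        size = DEFAULT_CHUNK_SIZE
    default_overlap = int(size * 0.15)  # default overlap: 15% of chunk size
    try:
        overlap = int(os.environ.get("OPEN_NOTEBOOK_CHUNK_OVERLAP", default_overlap))
    except ValueError:
        overlap = default_overlap
    if overlap < 0 or overlap >= size:  # overlap must be >= 0 and < chunk size
        overlap = default_overlap
    return size, overlap
```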
Component Catalog
context_builder.py
- ContextItem: Dataclass for individual context piece (id, type, content, priority, token_count)
- ContextConfig: Configuration for context building (sources/notes/insights selection, max tokens, priority weights)
- ContextBuilder: Main class assembling context
  - add_source(): Include source by ID with inclusion level
  - add_note(): Include note by ID
  - add_insight(): Include insight by ID
  - build(): Assemble context respecting token budget and priorities
- Uses vector_search to fetch source/insight content from SurrealDB
- Returns list of ContextItem objects sorted by priority
Key behavior:
- Token counting is automatic (calculated in ContextItem.__post_init__)
- Max token enforcement via priority weighting (higher priority items included first)
- Type-specific fetching: sources → Source.full_text, notes → Note.content, insights → SourceInsight.content
- Raises DatabaseOperationError if source/note fetch fails
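The budgeting behavior above (priority-first inclusion, hard stop once max_tokens would be exceeded) can be sketched with plain dataclasses; names mirror this doc, but the selection code is illustrative only:

```python
from dataclasses import dataclass

@dataclass
class Item:
    id: str
    priority: float
    token_count: int

def select_within_budget(items: list[Item], max_tokens: int) -> list[Item]:
    """Include highest-priority items first; stop at the token budget."""
    selected, used = [], 0
    for item in sorted(items, key=lambda i: i.priority, reverse=True):
        if used + item.token_count > max_tokens:
            break  # hard stop once the budget would be exceeded (not prorated)
        selected.append(item)
        used += item.token_count
    return selected
```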
chunking.py
- ContentType: Enum (HTML, MARKDOWN, PLAIN)
- CHUNK_SIZE: Configurable via OPEN_NOTEBOOK_CHUNK_SIZE env var (default: 1200)
- CHUNK_OVERLAP: Configurable via OPEN_NOTEBOOK_CHUNK_OVERLAP env var (default: 15% of CHUNK_SIZE)
- detect_content_type_from_extension(file_path): Detect type from file extension
- detect_content_type_from_heuristics(text): Detect type from content patterns (returns type + confidence)
- detect_content_type(text, file_path): Combined detection (extension primary, heuristics fallback)
- chunk_text(text, content_type, file_path): Split text using appropriate splitter
Key behavior:
- Uses LangChain splitters: HTMLHeaderTextSplitter, MarkdownHeaderTextSplitter, RecursiveCharacterTextSplitter
- Extension-based detection is primary; heuristics can override PLAIN extensions with 0.8+ confidence
- Secondary chunking applied when HTML/Markdown splitters produce oversized chunks
- Returns list of strings, each ≤ CHUNK_SIZE characters
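A hedged sketch of the heuristic detection step described above; the patterns and exact confidence values are illustrative, only the "(type, confidence)" contract and the 0.8 override threshold come from this doc:

```python
import re
from enum import Enum

class ContentType(Enum):
    HTML = "html"
    MARKDOWN = "markdown"
    PLAIN = "plain"

def detect_content_type_from_heuristics(text: str) -> tuple[ContentType, float]:
    """Guess content type from structural patterns, returning (type, confidence)."""
    # HTML: presence of common structural tags
    if re.search(r"</?(html|body|div|p|span)\b", text, re.IGNORECASE):
        return ContentType.HTML, 0.9
    # Markdown: ATX-style headers at line start
    if re.search(r"^#{1,6}\s+\S", text, re.MULTILINE):
        return ContentType.MARKDOWN, 0.85
    return ContentType.PLAIN, 0.5
```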
embedding.py
- mean_pool_embeddings(embeddings): Combine multiple embeddings via normalized mean pooling
- generate_embeddings(texts): Batch embedding via single Esperanto API call
- generate_embedding(text, content_type, file_path): Unified embedding with automatic chunking + mean pooling
Key behavior:
- Uses model_manager.get_model("embedding") for embedding model
- Short text (≤ CHUNK_SIZE): direct embedding
- Long text: chunk → embed each → mean pool results
- Mean pooling: normalize each → mean → normalize result (using numpy)
- Raises ValueError for empty/whitespace-only text
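The normalize → mean → normalize pipeline can be written directly with numpy. A sketch consistent with the steps above, not necessarily the module's exact code:

```python
import numpy as np

def mean_pool_embeddings(embeddings: list[list[float]]) -> list[float]:
    """Normalize each chunk embedding, average them, then normalize the result."""
    vectors = np.asarray(embeddings, dtype=np.float64)
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                       # guard against zero vectors
    pooled = (vectors / norms).mean(axis=0)       # mean of unit vectors
    pooled_norm = np.linalg.norm(pooled)
    return (pooled / pooled_norm).tolist() if pooled_norm else pooled.tolist()
```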
text_utils.py
- remove_non_ascii(text): Remove non-ASCII characters from text
- remove_non_printable(text): Remove non-printable characters, preserving newlines/tabs
- parse_thinking_content(content): Extract <think> tag content from AI responses
- clean_thinking_content(content): Remove <think> blocks, return cleaned content only
Key behavior:
- parse_thinking_content handles malformed output (missing opening <think> tag)
- Large content (>100KB) bypasses thinking extraction for performance
- Non-string input returns empty thinking and stringified content
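A minimal sketch of the extraction behavior above, including the malformed-output case where only the closing tag survives; the regexes are hypothetical, not the project's:

```python
import re

def parse_thinking_content(content: str) -> tuple[str, str]:
    """Return (thinking, cleaned_content) for AI responses using <think> tags."""
    if not isinstance(content, str):
        return "", str(content)          # non-string input: stringified content
    if len(content) > 100_000:           # large content bypasses extraction
        return "", content
    match = re.search(r"<think>(.*?)</think>", content, re.DOTALL)
    if match:
        thinking = match.group(1).strip()
        cleaned = (content[:match.start()] + content[match.end():]).strip()
        return thinking, cleaned
    # malformed output: closing tag without an opening one
    if "</think>" in content:
        thinking, _, cleaned = content.partition("</think>")
        return thinking.strip(), cleaned.strip()
    return "", content
```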
token_utils.py
- token_count(text): Returns estimated token count for string (via tiktoken)
- token_cost(text, model): Calculate cost estimate for text with given model
Key behavior: Uses cl100k_base encoding; may differ slightly from actual model tokenization
version_utils.py
- compare_versions(v1, v2): Returns -1 (v1 < v2), 0 (equal), 1 (v1 > v2)
- get_installed_version(package): Get version of installed Python package
- get_version_from_github(url): Fetch latest version from GitHub releases
Key behavior: Uses packaging library for version parsing; supports pre-release tags
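compare_versions can be sketched with the packaging library, which handles pre-release tags such as 1.7.0-rc1; this assumes packaging is the parser in use, as noted above:

```python
from packaging.version import Version

def compare_versions(v1: str, v2: str) -> int:
    """Return -1 (v1 < v2), 0 (equal), or 1 (v1 > v2), PEP 440 aware."""
    a, b = Version(v1), Version(v2)
    return -1 if a < b else (1 if a > b else 0)
```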
Common Patterns
- Dataclass-driven config: ContextConfig used by ContextBuilder (immutable after init)
- Token budgeting: ContextBuilder respects max_tokens constraint; prioritizes high-priority items
- Content-type aware processing: Chunking uses appropriate splitter based on detected content type
- Mean pooling for large content: Embedding handles arbitrarily large text via chunking + pooling
- Error handling resilience: token_count() returns estimate; context_builder catches DB errors gracefully
- Pure text functions: text_utils functions are stateless utilities (no class needed)
- Lazy evaluation: ContextBuilder doesn't fetch items until build() called
- Type hints throughout: All functions use Optional, List, Dict for clarity
Key Dependencies
- open_notebook.domain.notebook: Source, Note, SourceInsight models; vector_search function
- open_notebook.ai.models: model_manager for embedding model access
- open_notebook.exceptions: DatabaseOperationError, NotFoundError
- langchain_text_splitters: HTMLHeaderTextSplitter, MarkdownHeaderTextSplitter, RecursiveCharacterTextSplitter
- numpy: Mean pooling calculations
- tiktoken: Token encoding for GPT models
- loguru: Logging throughout
Important Quirks & Gotchas
- Token count estimation: Uses cl100k_base encoding; may differ 5-10% from actual model tokens
- Chunk size for Ollama: the default chunk size (1200 chars) was chosen to fit within Ollama embedding model context limits
- Content type detection order: Extension checked first, then heuristics; high-confidence heuristics (≥0.8) can override PLAIN extensions
- Mean pooling normalization: Each embedding normalized before mean, result normalized after
- Priority weights default: If not specified, ContextConfig uses default weights (source=1, note=0.8, insight=1.2)
- Vector search required: ContextBuilder assumes vector_search is available on Notebook model; fails if not
- Circular import risk: context_builder imports from domain.notebook; avoid domain importing utils
- Max tokens hard limit: ContextBuilder stops adding items once max_tokens exceeded (not prorated)
- No caching: Every build() call re-fetches from database (use cache layer if needed)
How to Extend
- Add new context source type: Create fetch method in ContextBuilder; update ContextConfig.sources dict
- Add content type: Add to ContentType enum; create splitter getter; update chunk_text()
- Change chunk size: Set OPEN_NOTEBOOK_CHUNK_SIZE and OPEN_NOTEBOOK_CHUNK_OVERLAP environment variables
- Add text preprocessing: Add new function to text_utils (e.g., remove_urls, extract_keywords)
- Change tokenization: Replace tiktoken with alternative library in token_utils; update all calls
- Add context filtering: Extend ContextConfig with filter_by_date, filter_by_topic fields
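As an example of the "add text preprocessing" extension point, a hypothetical remove_urls helper in the style of the existing text_utils functions (the name and regex are illustrative, not part of the module):

```python
import re

_URL_RE = re.compile(r"https?://\S+")

def remove_urls(text: str) -> str:
    """Strip http(s) URLs, collapsing any doubled spaces left behind."""
    return re.sub(r"  +", " ", _URL_RE.sub("", text)).strip()
```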
Usage Examples
Chunking
from open_notebook.utils.chunking import chunk_text, detect_content_type, ContentType
# Auto-detect content type and chunk
chunks = chunk_text(long_text, file_path="document.md")
# Explicit content type
chunks = chunk_text(html_content, content_type=ContentType.HTML)
Embedding
from open_notebook.utils.embedding import generate_embedding, generate_embeddings
# Single text (handles chunking + mean pooling automatically)
embedding = await generate_embedding(long_text)
# Batch embedding (more efficient for multiple texts)
embeddings = await generate_embeddings(["text1", "text2", "text3"])
Context Building
from open_notebook.utils.context_builder import ContextBuilder, ContextConfig
config = ContextConfig(
sources={"source:123": "full", "source:456": "summary"},
max_tokens=2000,
)
builder = ContextBuilder(notebook, config)
context_items = await builder.build()
for item in context_items:
print(f"{item.type}:{item.id} ({item.token_count} tokens)")
encryption.py
- get_secret_from_env(var_name): Retrieve secret from environment with Docker secrets support (checks VAR_FILE first, then VAR)
- get_fernet(): Get Fernet instance if encryption key is configured
- encrypt_value(value): Encrypt a string using Fernet symmetric encryption
- decrypt_value(value): Decrypt a Fernet-encrypted string; gracefully falls back to the original value for legacy/unencrypted data
Purpose: Provides field-level encryption for sensitive data (API keys) stored in the database. Uses Fernet symmetric encryption (AES-128-CBC with HMAC-SHA256) for authenticated encryption.
Key behavior:
- Key source: OPEN_NOTEBOOK_ENCRYPTION_KEY_FILE (Docker secrets) → OPEN_NOTEBOOK_ENCRYPTION_KEY (env var)
- Accepts any string: the key is always derived into a valid Fernet key via SHA-256
- No default key — encryption is unavailable until the env var is set
- Graceful fallback on decryption: InvalidToken errors (legacy unencrypted data) return the original value
- Lazy-loaded key: initialized on first use, not at import time
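The always-derive-via-SHA-256 behavior maps any passphrase to a valid 32-byte Fernet key; the derivation itself needs only the standard library. A sketch, the module's actual function name may differ:

```python
import base64
import hashlib

def derive_fernet_key(passphrase: str) -> bytes:
    """Derive a urlsafe-base64 Fernet key (44 bytes) from any passphrase via SHA-256."""
    digest = hashlib.sha256(passphrase.encode("utf-8")).digest()  # 32 raw bytes
    return base64.urlsafe_b64encode(digest)                       # Fernet key format
```

Fernet(derive_fernet_key(...)) from the cryptography package would then construct the cipher, assuming that dependency is installed.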
Security considerations:
- OPEN_NOTEBOOK_ENCRYPTION_KEY must be set explicitly (no default)
- Docker secrets pattern supported for secure key injection in containerized environments
- Key rotation would require re-encrypting all stored keys (not currently implemented)
- Encryption is transparent to callers; unencrypted legacy data continues to work
Usage Example:
from open_notebook.utils.encryption import encrypt_value, decrypt_value
# Encrypt before storing in database
encrypted_api_key = encrypt_value(api_key)
# Decrypt when reading from database
decrypted_api_key = decrypt_value(encrypted_api_key)
# Set any string as encryption key:
# OPEN_NOTEBOOK_ENCRYPTION_KEY=my-secret-passphrase
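The VAR_FILE-then-VAR lookup order used by get_secret_from_env follows the common Docker secrets pattern; a stdlib-only sketch, not the module's exact implementation:

```python
import os
from pathlib import Path
from typing import Optional

def get_secret_from_env(var_name: str) -> Optional[str]:
    """Check VAR_FILE (Docker secrets mount) first, then fall back to VAR itself."""
    file_path = os.environ.get(f"{var_name}_FILE")
    if file_path and Path(file_path).is_file():
        return Path(file_path).read_text().strip()  # secret mounted as a file
    return os.environ.get(var_name)                 # plain env var fallback
```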