# AI Module

Model configuration, provisioning, and management for multi-provider AI integration via Esperanto.

## Purpose

Centralizes the AI model lifecycle: database models for model metadata (provider, type), default model configuration, and a factory for instantiating LLM/embedding/speech models at runtime with fallback logic.

## Architecture Overview

Two-tier system:

- Database models (`Model`, `DefaultModels`): metadata storage and default configuration
- `ModelManager`: factory for provisioning models with intelligent fallback (large-context detection, config override)

All models use the Esperanto library as the provider abstraction (OpenAI, Anthropic, Google, Groq, Ollama, Mistral, DeepSeek, xAI, OpenRouter).
## Component Catalog

### models.py

#### Model (ObjectModel)

- Database record: name, provider, type (language/embedding/speech_to_text/text_to_speech), credential (optional link to a Credential record)
- `get_models_by_type()`: async query to fetch all models of a specific type
- `get_credential_obj()`: fetches the linked Credential object (if the credential field is set)
- `get_by_credential(credential_id)`: class method to find all models linked to a credential
- Stores provider-model pairs for AI factory instantiation

#### DefaultModels (RecordModel)

- Singleton configuration record (record_id: `open_notebook:default_models`)
- Fields: default_chat_model, default_transformation_model, large_context_model, default_text_to_speech_model, default_speech_to_text_model, default_embedding_model, default_tools_model
- `get_instance()`: always fetches fresh from the database (overrides parent caching for real-time updates)
- Returns a fresh instance on each call (no singleton cache)
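The fresh-fetch override can be illustrated with a minimal sketch. `CachedRecord` and `FreshRecord` are hypothetical stand-ins for `RecordModel` and `DefaultModels`, not the real classes; the real override re-queries the database instead of constructing in memory.

```python
# Sketch (assumed shapes, not the real domain classes): the parent caches the
# singleton, while the DefaultModels-style override rebuilds it on every call
# so live config edits are picked up immediately.
_CACHE: dict = {}

class CachedRecord:
    """Stand-in for RecordModel's default cached-singleton behavior."""
    record_id = "open_notebook:demo"

    @classmethod
    def get_instance(cls):
        if cls.record_id not in _CACHE:
            _CACHE[cls.record_id] = cls()
        return _CACHE[cls.record_id]

class FreshRecord(CachedRecord):
    """Stand-in for DefaultModels: never cache, always return a fresh instance."""
    record_id = "open_notebook:default_models"

    @classmethod
    def get_instance(cls):
        return cls()  # the real override re-fetches from the database here
```

The trade-off is one extra database round-trip per call in exchange for configuration changes taking effect without a restart.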
#### ModelManager

- Stateless factory for instantiating AI models
- `get_model(model_id)`: retrieves a Model by ID; if the model has a linked credential, uses `credential.to_esperanto_config()` for provider config; otherwise falls back to env-var provisioning via `key_provider`
- `get_defaults()`: fetches the DefaultModels configuration
- `get_default_model(model_type)`: smart lookup (e.g., "chat" → default_chat_model, "transformation" → default_transformation_model with fallback to chat)
- `get_speech_to_text()`, `get_text_to_speech()`, `get_embedding_model()`: type-specific convenience methods with assertions
- Global instance: `model_manager` singleton exported for use throughout the app
### provision.py

#### provision_langchain_model()

- Factory for LangGraph nodes needing LLM provisioning
- Smart fallback logic:
  - If tokens > 105,000: use `large_context_model`
  - Elif `model_id` specified: use that specific model
  - Else: use the default model for the type (e.g., "chat", "transformation")
- Returns a LangChain-compatible model via `.to_langchain()`
- Logs the model selection decision
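The selection order above can be sketched as a pure function. This is an illustration, not the module's actual implementation; the 105,000-token threshold matches the doc, but the `defaults` dict shape is an assumption.

```python
# Illustrative sketch of provision_langchain_model's fallback chain:
# large context wins over an explicit model_id, which wins over the type default.
LARGE_CONTEXT_THRESHOLD = 105_000  # hard-coded in the real code too

def select_model_id(tokens: int, model_id, default_type: str, defaults: dict):
    """Return which model id would be provisioned for this request."""
    if tokens > LARGE_CONTEXT_THRESHOLD and defaults.get("large_context_model"):
        return defaults["large_context_model"]
    if model_id:
        return model_id
    return defaults[f"default_{default_type}_model"]
```

Note that a large context overrides even an explicitly requested `model_id`, which is why callers don't need to check content size themselves.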
### key_provider.py

#### API Key Provider (Credential→Env Fallback)

- Purpose: provides API keys from the database first, falling back to environment variables
- Pattern: before Esperanto creates a model, keys are loaded from `Credential` records and set as environment variables
- Integration point: called by `ModelManager.get_model()` as a fallback when the model has no linked credential

#### Key Functions

- `get_api_key(provider)`: get a single API key (DB first, then env var)
- `provision_provider_keys(provider)`: set env vars from DB config for a provider
- `provision_all_keys()`: load all provider keys from the DB into env vars (useful at startup)

#### Provider Configuration Maps

- `PROVIDER_CONFIG`: simple providers (openai, anthropic, google, groq, etc.)
- `VERTEX_CONFIG`: Google Vertex AI (project, location, credentials)
- `AZURE_CONFIG`: Azure OpenAI (api_key, endpoint, api_version, mode-specific endpoints)
- `OPENAI_COMPATIBLE_CONFIG`: generic OpenAI-compatible (generic + mode-specific for LLM/EMBEDDING/STT/TTS)
## Common Patterns

- Type dispatch: the Model.type field drives factory logic (4 model types)
- Provider abstraction: Esperanto handles provider differences; ModelManager is unaware of provider specifics
- Fresh defaults: DefaultModels.get_instance() always fetches from the database (not cached) for live config updates
- Config override: provision_langchain_model() accepts kwargs passed to AIFactory.create_* methods
- Token-based selection: provision_langchain_model() detects large contexts and upgrades the model automatically
- Type assertions: get_speech_to_text() and get_embedding_model() assert the returned type (safety check)
- Credential→Env fallback: if a model has a linked credential, the config from `credential.to_esperanto_config()` is used directly; otherwise keys are checked in the database via key_provider, then environment variables; this enables UI-based key management while maintaining backward compatibility
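The Credential→Env fallback can be condensed into a small sketch. This is hypothetical code, not the real `ModelManager.get_model()`; the `model` and `db_keys` dict shapes are assumptions for illustration, and the real env-var names come from the provider configuration maps.

```python
import os

# Sketch of the two resolution paths: a linked credential's config is used
# directly, otherwise a DB-stored key is injected into the env var that
# Esperanto reads.
def resolve_provider_config(model: dict, db_keys: dict):
    """Return ("credential", config) when a credential is linked,
    else ("env", None) after provisioning the env var if possible."""
    if model.get("credential"):
        return ("credential", model["credential"])  # to_esperanto_config() output
    provider = model["provider"]
    key = db_keys.get(provider)
    if key:
        os.environ[f"{provider.upper()}_API_KEY"] = key  # Esperanto reads this
    return ("env", None)
```

The credential path avoids mutating `os.environ`, which is why it is the preferred integration (see "Integration with ModelManager" below).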
## Key Dependencies

- `esperanto`: AIFactory.create_language(), create_embedding(), create_speech_to_text(), create_text_to_speech()
- `open_notebook.database.repository`: repo_query, ensure_record_id
- `open_notebook.domain.base`: ObjectModel, RecordModel base classes
- `open_notebook.domain.credential`: Credential for database-stored API keys
- `open_notebook.utils`: token_count() for context-size detection
- `loguru`: logging for model selection decisions
## Important Quirks & Gotchas

- Token counting is a rough estimate: provision_langchain_model() uses token_count(), which estimates via the cl100k_base encoding (may differ 5-10% from the actual model's tokenizer)
- Large-context threshold is hard-coded: the 105,000-token threshold for the large_context_model upgrade is not configurable
- DefaultModels.get_instance() fresh fetch: intentionally bypasses the parent singleton cache to pick up live config changes; creates a new instance on each call
- Type-specific getters use assertions: get_speech_to_text() asserts isinstance (catches misconfiguration early)
- No validation of model existence: ModelManager.get_model() raises ValueError if the model is not found (not caught upstream)
- Esperanto caching: actual model instances are cached by Esperanto (not by ModelManager); ModelManager is stateless
- Fallback chain specificity: the "transformation" type falls back to default_chat_model if not explicitly set (convention-based)
- kwargs passed through: provision_langchain_model() passes kwargs to AIFactory but doesn't validate what's accepted
- Key provider sets env vars: `provision_provider_keys()` modifies `os.environ` to inject DB-stored keys (from `Credential` records); Esperanto reads from env vars (only used as a fallback when the model has no linked credential)
## How to Extend
- Add new model type: Add type string to Model.type enum, add create_* method in AIFactory, handle in ModelManager.get_model()
- Add new default configuration: Extend DefaultModels with new field (e.g., default_vision_model), add getter in ModelManager
- Change fallback logic: Modify provision_langchain_model() token threshold or fallback chain
- Add model filtering: Extend Model.get_models_by_type() with additional filters (e.g., by provider)
- Implement model caching: Wrap ModelManager methods with functools.lru_cache (be aware of kwargs mutability)
## Usage Example

```python
from open_notebook.ai.models import model_manager

# Get default chat model
chat_model = await model_manager.get_default_model("chat")

# Get specific model by ID
embedding_model = await model_manager.get_model("model:openai_embedding")

# Get embedding model with config override
embedding_model = await model_manager.get_embedding_model(temperature=0.1)

# Provision model for LangGraph (auto-detects large context)
from open_notebook.ai.provision import provision_langchain_model

langchain_model = await provision_langchain_model(
    content=long_text,
    model_id=None,  # Use default
    default_type="chat",
    temperature=0.7,
)
```
## Connection Testing (connection_tester.py)

### Purpose

Provides functionality to test whether a provider's API key is valid by making minimal API calls. Used by the API Configuration UI to validate user-entered credentials before saving.

### test_provider_connection()

Main entry point for testing provider connectivity.

```python
async def test_provider_connection(
    provider: str,
    model_type: str = "language",
    config_id: Optional[str] = None,
) -> Tuple[bool, str]
```

Returns: `(success: bool, message: str)` - success status and a human-readable message.
Flow:
- If `config_id` provided: loads the credential via `Credential.get(config_id)` and uses `credential.to_esperanto_config()` for provider config
- Looks up the test model from the `TEST_MODELS` dict
- For URL-based providers (ollama, openai_compatible): tests server connectivity
- For Azure: tests the `/openai/models` endpoint with api_version
- For API-based providers: creates a minimal model via Esperanto and makes a test call
- Returns user-friendly error messages for common failures
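The dispatch in the flow above can be sketched as a small routing function. The strategy labels below are descriptive only, not the module's real function names.

```python
# Sketch of the provider-to-test-strategy dispatch order described above.
URL_BASED_PROVIDERS = {"ollama", "openai_compatible"}

def pick_test_strategy(provider: str) -> str:
    """Choose how to probe a provider, mirroring the documented flow."""
    if provider in URL_BASED_PROVIDERS:
        return "url_probe"       # e.g. Ollama /api/tags, compatible /models
    if provider == "azure":
        return "azure_models"    # GET /openai/models with api_version
    return "minimal_api_call"    # tiny Esperanto model plus one test call
```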
### test_individual_model()

Tests a specific Model instance by loading its linked credential (if any) and making a minimal API call.

### TEST_MODELS Configuration

Maps each provider to (model_name, model_type) for testing:

```python
TEST_MODELS = {
    "openai": ("gpt-3.5-turbo", "language"),
    "anthropic": ("claude-3-haiku-20240307", "language"),
    "google": ("gemini-1.5-flash", "language"),
    "groq": ("llama-3.1-8b-instant", "language"),
    "voyage": ("voyage-3-lite", "embedding"),
    "elevenlabs": ("eleven_multilingual_v2", "text_to_speech"),
    "ollama": (None, "language"),  # Dynamic
    # ... more providers
}
```
### Special Provider Handlers

- `_test_ollama_connection(base_url)`: tests the Ollama server via the `/api/tags` endpoint, returns model count
- `_test_openai_compatible_connection(base_url, api_key)`: tests OpenAI-compatible servers via the `/models` endpoint
- `_get_ollama_models(base_url)`: fetches available models from the Ollama server

### Error Message Normalization

The tester normalizes error messages for user-friendly display:

- `401`/`unauthorized` -> "Invalid API key"
- `403`/`forbidden` -> "API key lacks required permissions"
- `rate limit` -> "Rate limited - but connection works" (success)
- `model not found` -> "API key valid (test model not available)" (success)
- Connection/timeout errors -> helpful troubleshooting messages
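The mapping above lends itself to a rule table. This is an illustrative sketch; the `(pattern, success, message)` rows come from the doc, but the real matching logic may differ (e.g., how ties or multiple matches are resolved).

```python
# Sketch of the normalization table described above. First matching rule wins.
_ERROR_RULES = [
    ("401", False, "Invalid API key"),
    ("unauthorized", False, "Invalid API key"),
    ("403", False, "API key lacks required permissions"),
    ("forbidden", False, "API key lacks required permissions"),
    ("rate limit", True, "Rate limited - but connection works"),
    ("model not found", True, "API key valid (test model not available)"),
]

def normalize_error(raw: str):
    """Map a raw provider error to a (success, user-facing message) pair."""
    lowered = raw.lower()
    for needle, success, message in _ERROR_RULES:
        if needle in lowered:
            return success, message
    return False, f"Connection failed: {raw}"
```

Note the deliberate asymmetry: a rate limit or missing test model counts as *success*, because it proves the key reached the provider and was accepted.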
## Key Provider (key_provider.py)

### Purpose

Unified interface for retrieving API keys with a database-first, environment-fallback strategy. Enables UI-based key management while maintaining backward compatibility with .env files. Used as a fallback when models don't have a directly linked credential.

### Core Functions

#### get_api_key(provider)

```python
async def get_api_key(provider: str) -> Optional[str]
```

Gets the API key for a provider. Checks the database (Credential records) first, then the environment variable.
Fallback Chain:
- Query `Credential` records from the database for the given provider
- Get api_key from the default credential
- Handle `SecretStr` (call `.get_secret_value()`) vs regular strings
- If the DB value exists and is non-empty, return it
- Otherwise, return `os.environ.get(env_var)`
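The chain above can be sketched synchronously. This is not the real `get_api_key` (which is async and queries `Credential.get_by_provider()`); `fetch_credential` is a hypothetical injected hook standing in for the database lookup.

```python
import os

# Synchronous sketch of the DB-first, env-fallback chain described above.
def get_api_key_sketch(provider: str, fetch_credential, env_var: str):
    """Return the DB-stored key if present and non-empty, else the env var."""
    cred = fetch_credential(provider)
    if cred:
        key = cred.get("api_key")
        # The real code also unwraps pydantic SecretStr via .get_secret_value().
        if key:
            return key
    return os.environ.get(env_var)
```

An empty-string DB value falls through to the environment, which is what keeps `.env`-based setups working after a credential record is created but left blank.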
#### provision_provider_keys(provider)

```python
async def provision_provider_keys(provider: str) -> bool
```

Main entry point for the DB→env fallback. Sets environment variables from database config for a provider. Called before model provisioning to ensure Esperanto can read keys from env vars.

Returns: `True` if any keys were set from the database.
Usage:

```python
# Before creating a model, ensure DB keys are in env vars
await provision_provider_keys("openai")
model = AIFactory.create_language(model_name="gpt-4", provider="openai")
```

#### provision_all_keys()

```python
async def provision_all_keys() -> dict[str, bool]
```

Provisions all providers at once. Useful at application startup.
### Provider Configuration Maps

#### PROVIDER_CONFIG (Simple Providers)

Single-field providers with an API key only:

```python
PROVIDER_CONFIG = {
    "openai": {"env_var": "OPENAI_API_KEY", "config_field": "openai_api_key"},
    "anthropic": {"env_var": "ANTHROPIC_API_KEY", "config_field": "anthropic_api_key"},
    "google": {"env_var": "GOOGLE_API_KEY", "config_field": "google_api_key"},
    "groq": {"env_var": "GROQ_API_KEY", "config_field": "groq_api_key"},
    "mistral": {"env_var": "MISTRAL_API_KEY", "config_field": "mistral_api_key"},
    "deepseek": {"env_var": "DEEPSEEK_API_KEY", "config_field": "deepseek_api_key"},
    "xai": {"env_var": "XAI_API_KEY", "config_field": "xai_api_key"},
    "openrouter": {"env_var": "OPENROUTER_API_KEY", "config_field": "openrouter_api_key"},
    "voyage": {"env_var": "VOYAGE_API_KEY", "config_field": "voyage_api_key"},
    "elevenlabs": {"env_var": "ELEVENLABS_API_KEY", "config_field": "elevenlabs_api_key"},
    "ollama": {"env_var": "OLLAMA_API_BASE", "config_field": "ollama_api_base"},
}
```
#### VERTEX_CONFIG (Google Vertex AI)

Multi-field configuration for Vertex AI:

```python
VERTEX_CONFIG = {
    "project": {"env_var": "VERTEX_PROJECT", "config_field": "vertex_project"},
    "location": {"env_var": "VERTEX_LOCATION", "config_field": "vertex_location"},
    "credentials": {"env_var": "GOOGLE_APPLICATION_CREDENTIALS", "config_field": "google_application_credentials"},
}
```
#### AZURE_CONFIG (Azure OpenAI)

Generic and mode-specific endpoints for Azure:

```python
AZURE_CONFIG = {
    "api_key": {"env_var": "AZURE_OPENAI_API_KEY", "config_field": "azure_openai_api_key"},
    "api_version": {"env_var": "AZURE_OPENAI_API_VERSION", "config_field": "azure_openai_api_version"},
    "endpoint": {"env_var": "AZURE_OPENAI_ENDPOINT", "config_field": "azure_openai_endpoint"},
    # Mode-specific endpoints
    "endpoint_llm": {"env_var": "AZURE_OPENAI_ENDPOINT_LLM", "config_field": "azure_openai_endpoint_llm"},
    "endpoint_embedding": {"env_var": "AZURE_OPENAI_ENDPOINT_EMBEDDING", "config_field": "azure_openai_endpoint_embedding"},
    "endpoint_stt": {"env_var": "AZURE_OPENAI_ENDPOINT_STT", "config_field": "azure_openai_endpoint_stt"},
    "endpoint_tts": {"env_var": "AZURE_OPENAI_ENDPOINT_TTS", "config_field": "azure_openai_endpoint_tts"},
}
```
#### OPENAI_COMPATIBLE_CONFIG

Generic and mode-specific configuration for OpenAI-compatible providers:

```python
OPENAI_COMPATIBLE_CONFIG = {
    # Generic
    "api_key": {"env_var": "OPENAI_COMPATIBLE_API_KEY", "config_field": "openai_compatible_api_key"},
    "base_url": {"env_var": "OPENAI_COMPATIBLE_BASE_URL", "config_field": "openai_compatible_base_url"},
    # Mode-specific: LLM, Embedding, STT, TTS
    "api_key_llm": {"env_var": "OPENAI_COMPATIBLE_API_KEY_LLM", "config_field": "openai_compatible_api_key_llm"},
    "base_url_llm": {"env_var": "OPENAI_COMPATIBLE_BASE_URL_LLM", "config_field": "openai_compatible_base_url_llm"},
    # ... similar for embedding, stt, tts
}
```
### Internal Helper Functions

- `_provision_simple_provider(provider)`: sets a single env var for simple providers
- `_provision_vertex()`: sets all Vertex AI env vars
- `_provision_azure()`: sets all Azure OpenAI env vars (handles SecretStr)
- `_provision_openai_compatible()`: sets all OpenAI-compatible env vars
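The simple-provider path can be sketched from the PROVIDER_CONFIG entry shape. This is an illustration, not the module's actual helper; `DEMO_PROVIDER_CONFIG` and the `db_config` dict are assumed shapes.

```python
import os

# Sketch of the _provision_simple_provider pattern: copy one DB-stored key
# into the env var that Esperanto reads.
DEMO_PROVIDER_CONFIG = {
    "openai": {"env_var": "OPENAI_API_KEY", "config_field": "openai_api_key"},
}

def provision_simple_provider(provider: str, db_config: dict) -> bool:
    """Set the provider's env var from DB config; True if a key was set."""
    spec = DEMO_PROVIDER_CONFIG.get(provider)
    if spec is None:
        return False  # unknown or multi-field provider; handled elsewhere
    value = db_config.get(spec["config_field"])
    if not value:
        return False  # nothing stored in the DB; env var left untouched
    os.environ[spec["env_var"]] = value
    return True
```

The boolean return lets `provision_all_keys()` report per-provider results without raising when a provider simply has no stored key.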
## Integration with ModelManager

The credential system integrates with model provisioning in two ways:

- Credential-linked models (preferred): the Model has a `credential` field pointing to a Credential record; `ModelManager.get_model()` calls `credential.to_esperanto_config()` and passes the config directly to Esperanto's `AIFactory.create_*` methods
- Env-var fallback: if the model has no linked credential, `provision_provider_keys(provider)` sets env vars from DB credentials and Esperanto reads from them
- Additionally, ConnectionTester loads the Credential directly via `Credential.get(config_id)` for testing

The credential-linked approach is preferred because it allows multiple credentials per provider and avoids env var mutation.