# Graphs Module

LangGraph-based workflow orchestration for content processing, chat interactions, and AI-powered transformations.

## Key Components

- **`chat.py`**: Conversational agent with message history, notebook context, and model override support
- **`source_chat.py`**: Source-focused chat with ContextBuilder for insights/content injection and context tracking
- **`ask.py`**: Multi-search strategy agent (generates search terms, retrieves results, synthesizes answers)
- **`source.py`**: Content ingestion pipeline (extract → save → transform with content-core)
- **`transformation.py`**: Single-node transformation executor with prompt templating via ai_prompter
- **`prompt.py`**: Generic pattern chain for arbitrary prompt-based LLM calls
- **`tools.py`**: Minimal tool library (currently just `get_current_timestamp()`)
## Important Patterns

- **Async/sync bridging in graphs**: Both `chat.py` and `source_chat.py` use an `asyncio.new_event_loop()` workaround because LangGraph nodes are sync while `provision_langchain_model()` is async
- **State machines via StateGraph**: Each graph compiles to a stateful runnable; conditional edges fan out work (`ask.py` and `source.py` run parallel transforms)
- **Prompt templating**: `ai_prompter.Prompter` renders Jinja2 templates referenced by path (`"chat/system"`, `"ask/entry"`, etc.)
- **Model provisioning via context**: A config dict is passed to each node via `RunnableConfig`; defaults fall back to state overrides
- **Checkpointing**: `chat.py` and `source_chat.py` use `SqliteSaver` for message history (LangGraph's built-in persistence)
- **Content extraction**: `source.py` uses the content-core library with provider/model from `DefaultModels`; both URLs and files are supported
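The async/sync bridge described above can be sketched as follows. This is an illustrative sketch only: `provision_model` stands in for the real async `provision_langchain_model()` factory, and the node body is hypothetical.

```python
import asyncio


async def provision_model(model_id: str) -> str:
    # Stand-in for the async provision_langchain_model() factory.
    await asyncio.sleep(0)  # simulate async I/O
    return f"model-instance:{model_id}"


def sync_node(state: dict) -> dict:
    # LangGraph invokes nodes synchronously, so we spin up a fresh
    # event loop to run the async factory, then tear it down.
    loop = asyncio.new_event_loop()
    try:
        model = loop.run_until_complete(provision_model(state["model_id"]))
    finally:
        loop.close()
    return {**state, "model": model}
```

Note the fragility the Quirks section calls out: this pattern breaks if the calling thread already has a running event loop, which is why a ThreadPoolExecutor workaround also appears in the codebase.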
## Error Handling in Graphs

All graph nodes use `classify_error()` from `open_notebook.utils.error_classifier` to catch raw LLM provider exceptions and re-raise them as typed `OpenNotebookError` subclasses with user-friendly messages. This ensures that errors from any AI provider (authentication failures, rate limits, model not found, network issues) are surfaced to the user with actionable messages instead of opaque stack traces.

**Pattern in nodes**:
```python
from open_notebook.utils.error_classifier import classify_error

try:
    result = await model.ainvoke(...)
except Exception as e:
    exc_class, message = classify_error(e)
    raise exc_class(message) from e
```
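A minimal sketch of what a classifier like `classify_error()` might do. The class names below other than `OpenNotebookError` and the string-matching rules are assumptions for illustration, not the actual `error_classifier` implementation:

```python
class OpenNotebookError(Exception):
    """Base class for user-facing errors (name taken from the docs above)."""


class AuthenticationError(OpenNotebookError):
    """Hypothetical subclass for provider auth failures."""


class RateLimitError(OpenNotebookError):
    """Hypothetical subclass for provider rate limiting."""


def classify_error(exc: Exception) -> tuple:
    # Map a raw provider exception to (typed error class, friendly message).
    msg = str(exc).lower()
    if "api key" in msg or "401" in msg:
        return AuthenticationError, "Authentication failed: check your provider API key."
    if "rate limit" in msg or "429" in msg:
        return RateLimitError, "The provider is rate limiting requests; try again shortly."
    return OpenNotebookError, f"Unexpected provider error: {exc}"
```

The node-side pattern above then re-raises the returned class, so callers can catch `OpenNotebookError` generically while the message stays actionable.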
---

## Quirks & Edge Cases
- **Async loop gymnastics**: A ThreadPoolExecutor workaround is needed because LangGraph invokes sync nodes while we call async functions; fragile if event loop state changes
- **`clean_thinking_content()` ubiquitous**: Strips `<think>...</think>` tags from model responses (handles extended-thinking models)
- **`source_chat.py` builds context twice**: ContextBuilder runs during node execution to fetch source/insights, then rebuilds the list from context_data (inefficient but safe)
- **`source.py` embedding is async**: `source.vectorize()` returns a job command ID; it is not awaited (fire-and-forget)
- **`transformation.py` nullable source**: Accepts `input_text` or `source.full_text` (falls back to the second if the first is missing)
- **`ask.py` hard-codes vector_search**: No fallback to text search, despite commented code suggesting it was planned
- **SqliteSaver location**: Checkpoints are stored at the path from the `LANGGRAPH_CHECKPOINT_FILE` env var; the connection is shared across graphs
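The `<think>`-tag stripping mentioned above can be sketched with a regex. This is an assumption about the approach; the real `clean_thinking_content()` may differ in details:

```python
import re

# DOTALL lets .*? match across newlines inside a <think> block;
# non-greedy matching keeps multiple blocks separate.
_THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)


def clean_thinking_content(text: str) -> str:
    # Remove <think>...</think> blocks emitted by extended-thinking
    # models, then trim leftover whitespace around the visible answer.
    return _THINK_RE.sub("", text).strip()
```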
## Key Dependencies

- `langgraph`: StateGraph, Send, END, START, SqliteSaver checkpoint persistence
- `langchain_core`: Messages, OutputParser, RunnableConfig
- `ai_prompter`: Prompter for Jinja2 template rendering
- `content_core`: `extract_content()` for file/URL processing
- `open_notebook.ai.provision`: `provision_langchain_model()` (async factory with fallback logic)
- `open_notebook.utils.error_classifier`: `classify_error()` for user-friendly LLM error messages
- `open_notebook.domain.notebook`: Domain models (Source, Note, SourceInsight, vector_search)
- `loguru`: Logging
## Usage Example

```python
# Invoke a graph with a config override
config = {"configurable": {"model_id": "model:custom_id"}}
result = await chat_graph.ainvoke(
    {"messages": [HumanMessage(content="...")], "notebook": notebook},
    config=config,
)

# Source processing (content → save → transform)
result = await source_graph.ainvoke({
    "content_state": {...},  # ProcessSourceState from content-core
    "apply_transformations": [t1, t2],
    "source_id": "source:123",
    "embed": True,
})
```
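One plausible reading of the override precedence used in the example above (an explicit `RunnableConfig` entry wins, then a state-level override, then the default) can be sketched as a small helper. `resolve_model_id` is a hypothetical name, not the actual provisioning code:

```python
def resolve_model_id(config, state, default="model:default"):
    # Precedence (assumed): RunnableConfig's "configurable" dict first,
    # then a model_id carried in graph state, then the configured default.
    configurable = (config or {}).get("configurable", {})
    return configurable.get("model_id") or state.get("model_id") or default
```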