docs: generate comprehensive CLAUDE.md reference documentation across codebase

Create a hierarchical CLAUDE.md documentation system for the entire Open Notebook
codebase with focus on concise, pattern-driven reference cards rather than
comprehensive tutorials.

## Changes

### Core Documentation System
- Updated `.claude/commands/build-claude-md.md` to distinguish between leaf and
  parent modules, with special handling for prompt/template modules
- Established clear patterns:
  * Leaf modules (40-70 lines): Components, hooks, API clients
  * Parent modules (50-150 lines): Architecture, cross-layer patterns, data flows
  * Template modules: Pattern focus, not catalog listings

### Generated Documentation
Created 15 CLAUDE.md reference files across the project:

**Frontend (React/Next.js)**
- frontend/src/CLAUDE.md: Architecture overview, data flow, three-tier design
- frontend/src/lib/hooks/CLAUDE.md: React Query patterns, state management
- frontend/src/lib/api/CLAUDE.md: Axios client, FormData handling, interceptors
- frontend/src/lib/stores/CLAUDE.md: Zustand state persistence, auth patterns
- frontend/src/components/ui/CLAUDE.md: Radix UI primitives, CVA styling

**Backend (Python/FastAPI)**
- open_notebook/CLAUDE.md: System architecture, layer interactions
- open_notebook/ai/CLAUDE.md: Model provisioning, Esperanto integration
- open_notebook/domain/CLAUDE.md: Data models, ObjectModel/RecordModel patterns
- open_notebook/database/CLAUDE.md: Repository pattern, async migrations
- open_notebook/graphs/CLAUDE.md: LangGraph workflows, async orchestration
- open_notebook/utils/CLAUDE.md: Cross-cutting utilities, context building
- open_notebook/podcasts/CLAUDE.md: Episode/speaker profiles, job tracking

**API & Other**
- api/CLAUDE.md: REST layer, service architecture
- commands/CLAUDE.md: Async command handlers, job queue patterns
- prompts/CLAUDE.md: Jinja2 templates, prompt engineering patterns (refactored)

**Project Root**
- CLAUDE.md: Project overview, three-tier architecture, tech stack, getting started

### Key Features
- Zero duplication: Parent modules reference child CLAUDE.md files, don't repeat them
- Pattern-focused: Emphasizes how components work together, not component catalogs
- Scannable: Short bullets, code examples only when necessary (1-2 per file)
- Practical: "How to extend" guides, quirks/gotchas for each module
- Navigation: Root CLAUDE.md acts as hub pointing to specialized documentation

### Cleanup
- Removed unused `batch_fix_services.py`
- Removed deprecated `open_notebook/plugins/podcasts.py`
- Updated .gitignore for documentation consistency

## Impact
New contributors can now:
1. Read root CLAUDE.md for system architecture (5 min)
2. Jump to specific layer documentation (frontend, api, open_notebook)
3. Dive into module-specific patterns in child CLAUDE.md files (1 min per module)
All documentation is lean, reference-focused, and avoids duplication.
LUIS NOVO 2026-01-03 16:27:52 -03:00
parent ab5560c9a2
commit 71b8d13b24
19 changed files with 1949 additions and 372 deletions

6
.gitignore vendored

@ -133,4 +133,8 @@ doc_exports/
specs/
.claude
.playwright-mcp/
.playwright-mcp/
**/*.local.md

351
CLAUDE.md

@ -1,3 +1,352 @@
# Open Notebook - Root CLAUDE.md
This project has a good amount of documentation in the `./docs` folder. Read it when necessary, and always review `docs/index.md` before starting a new feature so you know which docs are available.
This file provides architectural guidance for contributors working on Open Notebook at the project level.
## Project Overview
**Open Notebook** is an open-source, privacy-focused alternative to Google's Notebook LM. It's an AI-powered research assistant enabling users to upload multi-modal content (PDFs, audio, video, web pages), generate intelligent notes, search semantically, chat with AI models, and produce professional podcasts—all with complete control over data and choice of AI providers.
**Key Values**: Privacy-first, multi-provider AI support, fully self-hosted option, open-source transparency.
---
## Three-Tier Architecture
```
┌─────────────────────────────────────────────────────────┐
│                Frontend (React/Next.js)                 │
│                  frontend/ @ port 3000                  │
├─────────────────────────────────────────────────────────┤
│ - Notebooks, sources, notes, chat, podcasts, search UI  │
│ - Zustand state management, TanStack Query (React Query)│
│ - Shadcn/ui component library with Tailwind CSS         │
└────────────────────────┬────────────────────────────────┘
                         │ HTTP REST
┌────────────────────────▼────────────────────────────────┐
│                      API (FastAPI)                      │
│                    api/ @ port 5055                     │
├─────────────────────────────────────────────────────────┤
│ - REST endpoints for notebooks, sources, notes, chat    │
│ - LangGraph workflow orchestration                      │
│ - Job queue for async operations (podcasts)             │
│ - Multi-provider AI provisioning via Esperanto          │
└────────────────────────┬────────────────────────────────┘
                         │ SurrealQL
┌────────────────────────▼────────────────────────────────┐
│                  Database (SurrealDB)                   │
│               Graph database @ port 8000                │
├─────────────────────────────────────────────────────────┤
│ - Records: Notebook, Source, Note, ChatSession, etc.    │
│ - Relationships: source-to-notebook, note-to-source     │
│ - Vector embeddings for semantic search                 │
└─────────────────────────────────────────────────────────┘
```
---
## Tech Stack
### Frontend (`frontend/`)
- **Framework**: Next.js 15 (React 19)
- **Language**: TypeScript
- **State Management**: Zustand
- **Data Fetching**: TanStack Query (React Query)
- **Styling**: Tailwind CSS + Shadcn/ui
- **Build Tool**: Webpack (via Next.js)
### API Backend (`api/` + `open_notebook/`)
- **Framework**: FastAPI 0.104+
- **Language**: Python 3.11+
- **Workflows**: LangGraph state machines
- **Database**: SurrealDB async driver
- **AI Providers**: Esperanto library (8+ providers: OpenAI, Anthropic, Google, Groq, Ollama, Mistral, DeepSeek, xAI)
- **Job Queue**: Surreal-Commands for async jobs (podcasts)
- **Logging**: Loguru
- **Validation**: Pydantic v2
- **Testing**: Pytest
### Database
- **SurrealDB**: Graph database with built-in embedding storage and vector search
- **Schema Migrations**: Automatic on API startup via AsyncMigrationManager
### Additional Services
- **Content Processing**: content-core library (file/URL extraction)
- **Prompts**: AI-Prompter with Jinja2 templating
- **Podcast Generation**: podcast-creator library
- **Embeddings**: Multi-provider via Esperanto
---
## Directory Structure
```
open-notebook/
├── frontend/ # React/Next.js UI
│ ├── src/
│ │ ├── app/ # Next.js app router
│ │ ├── components/ # React components
│ │ ├── hooks/ # Custom React hooks
│ │ ├── lib/ # Utilities
│ │ └── styles/ # Global styles
│ ├── package.json # Node dependencies
│ └── CLAUDE.md # Frontend-specific guidance
├── api/ # FastAPI REST layer
│ ├── routers/ # HTTP endpoints
│ ├── services/ # Business logic
│ ├── models.py # Request/response schemas
│ ├── main.py # FastAPI app + lifespan
│ └── CLAUDE.md # API-specific guidance
├── open_notebook/ # Python backend core (domain + workflows)
│ ├── domain/ # Data models (Notebook, Source, Note, etc.)
│ ├── database/ # SurrealDB async repository & migrations
│ ├── graphs/ # LangGraph workflows (chat, ask, source)
│ ├── ai/ # ModelManager, AI provider provisioning
│ ├── utils/ # Context builders, token utils
│ ├── podcasts/ # Podcast models & generation
│ ├── config.py # Configuration & paths
│ ├── exceptions.py # Error hierarchy
│ └── CLAUDE.md # Backend core guidance
├── prompts/ # Jinja2 prompt templates
│ ├── chat/ # Chat prompt templates
│ ├── ask/ # Search/synthesis prompts
│ ├── podcast/ # Podcast outline & transcript
│ └── source_chat/ # Source-specific chat
├── migrations/ # SurrealDB schema migrations
│ ├── 001_*.surql # Initial schema
│ └── ...
├── tests/ # Python unit & integration tests
│ ├── test_domain.py
│ ├── test_graphs.py
│ └── conftest.py
├── commands/ # CLI utilities
├── docs/ # User & deployment documentation
├── scripts/ # Utility scripts
├── setup_guide/ # Setup guides
├── docker-compose.yml # Multi-container orchestration
├── Dockerfile # API container image
├── Makefile # Development commands
├── pyproject.toml # Python project config
├── README.md # Project README
└── CLAUDE.md # Root project guidance (THIS FILE)
```
---
## Getting Started
### 1. Clone & Install
```bash
git clone https://github.com/lfnovo/open-notebook.git
cd open-notebook
# Python dependencies
uv sync
# Frontend dependencies
cd frontend
npm install
cd ..
```
### 2. Environment Setup
```bash
cp .env.example .env
# Edit .env with your API keys (OpenAI, Anthropic, etc.)
```
### 3. Start Services
```bash
# Terminal 1: Start SurrealDB
make database
# Terminal 2: Start API (port 5055)
make api
# or: uv run --env-file .env uvicorn api.main:app --host 0.0.0.0 --port 5055
# Terminal 3: Start Frontend (port 3000)
cd frontend && npm run dev
# Full stack (development)
make start-all
```
### 4. Verify
- Frontend: http://localhost:3000
- API docs: http://localhost:5055/docs
- SurrealDB: http://localhost:8000
---
## Development Workflow
### Key Commands
```bash
# Code quality
make ruff # Lint + auto-fix Python
make lint # Type checking (mypy)
# Testing
uv run pytest tests/
# Database migrations (auto-run on API startup)
# Manual check: API logs show "Running migration X"
# Docker
make docker-build # Build multi-platform image
docker compose --profile multi up # Full stack in Docker
```
### Code Style
- **Python**: Ruff (auto-fix), mypy (type checking)
- **TypeScript**: ESLint config provided
- **Commits**: Conventional commits (feat:, fix:, docs:, refactor:)
- **Git Flow**: Feature branches from `main`
---
## Architecture Highlights
### 1. Async-First Design
- All database queries, graph invocations, and API calls are async (await)
- SurrealDB async driver with connection pooling
- FastAPI handles concurrent requests efficiently
### 2. LangGraph Workflows
- **source.py**: Content ingestion (extract → embed → save)
- **chat.py**: Conversational agent with message history
- **ask.py**: Search + synthesis (retrieve relevant sources → LLM)
- **transformation.py**: Custom transformations on sources
- All use `provision_langchain_model()` for smart model selection
### 3. Multi-Provider AI
- **Esperanto library**: Unified interface to 8+ AI providers
- **ModelManager**: Factory pattern with fallback logic
- **Smart selection**: Detects large contexts, prefers long-context models
- **Override support**: Per-request model configuration
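The selection order described above (per-request override first, then a long-context model for large inputs, otherwise the default) can be sketched as follows. This is a minimal illustration, not the actual `ModelManager` logic: `select_model`, `estimate_tokens`, `ModelChoice`, and the 100,000-token threshold are all hypothetical.

```python
from dataclasses import dataclass

# Assumed threshold; the real cutoff used by the project is not stated here.
LARGE_CONTEXT_TOKENS = 100_000


@dataclass(frozen=True)
class ModelChoice:
    name: str
    reason: str


def estimate_tokens(text: str) -> int:
    """Rough heuristic: about four characters per token."""
    return max(1, len(text) // 4)


def select_model(text, default_model, long_context_model, override=None):
    """Pick a model name: explicit override first, then context size."""
    if override:
        return ModelChoice(override, "per-request override")
    if estimate_tokens(text) > LARGE_CONTEXT_TOKENS:
        return ModelChoice(long_context_model, "large context")
    return ModelChoice(default_model, "default")
```

Returning the reason alongside the name makes it easy to log why a given model was chosen.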
### 4. Database Schema
- **Automatic migrations**: AsyncMigrationManager runs on API startup
- **SurrealDB graph model**: Records with relationships and embeddings
- **Vector search**: Built-in semantic search across all content
- **Transactions**: Repo functions handle ACID operations
### 5. Authentication
- **Current**: Simple password middleware (insecure, dev-only)
- **Production**: Replace with OAuth/JWT (see CONFIGURATION.md)
---
## Important Quirks & Gotchas
### API Startup
- **Migrations run automatically** on startup; check logs for errors
- **Must start API before UI**: UI depends on API for all data
- **SurrealDB must be running**: API fails without database connection
### Frontend-Backend Communication
- **Base API URL**: Configured in `.env.local` (default: http://localhost:5055)
- **CORS enabled**: Configured in `api/main.py` (allow all origins in dev)
- **Rate limiting**: Not built-in; add at proxy layer for production
### LangGraph Workflows
- **Blocking operations**: Chat/podcast workflows may take minutes; no timeout
- **State persistence**: Uses SQLite checkpoint storage in `/data/sqlite-db/`
- **Model fallback**: If primary model fails, falls back to cheaper/smaller model
### Podcast Generation
- **Async job queue**: `podcast_service.py` submits jobs but doesn't wait
- **Track status**: Use `/commands/{command_id}` endpoint to poll status
- **TTS failures**: Fall back to silent audio if speech synthesis fails
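Because podcast jobs are fire-and-forget, a caller has to poll for completion. The sketch below shows one way to do that, assuming the status payload is a dict with a `status` field and that `"completed"` and `"failed"` are the terminal values. Both are assumptions, as is the helper name `poll_command`; the HTTP call itself is injected as `fetch_status`.

```python
import time
from typing import Callable

TERMINAL_STATES = {"completed", "failed"}  # assumed status values


def poll_command(fetch_status: Callable[[], dict], interval: float = 2.0,
                 timeout: float = 600.0, sleep=time.sleep) -> dict:
    """Call fetch_status() until it reports a terminal state or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch_status()
        if payload.get("status") in TERMINAL_STATES:
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError("command did not finish in time")
        sleep(interval)
```

In practice `fetch_status` would wrap a GET request to `/commands/{command_id}` with whichever HTTP client the caller already uses.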
### Content Processing
- **File extraction**: Uses content-core library; supports 50+ file types
- **URL handling**: Extracts text + metadata from web pages
- **Large files**: Content processing is sync; may block API briefly
---
## Component References
See dedicated CLAUDE.md files for detailed guidance:
- **[frontend/src/CLAUDE.md](frontend/src/CLAUDE.md)**: React/Next.js architecture, state management, API integration
- **[api/CLAUDE.md](api/CLAUDE.md)**: FastAPI structure, service pattern, endpoint development
- **[open_notebook/CLAUDE.md](open_notebook/CLAUDE.md)**: Backend core, domain models, LangGraph workflows, AI provisioning
- **[open_notebook/domain/CLAUDE.md](open_notebook/domain/CLAUDE.md)**: Data models, repository pattern, search functions
- **[open_notebook/ai/CLAUDE.md](open_notebook/ai/CLAUDE.md)**: ModelManager, AI provider integration, Esperanto usage
- **[open_notebook/graphs/CLAUDE.md](open_notebook/graphs/CLAUDE.md)**: LangGraph workflow design, state machines
- **[open_notebook/database/CLAUDE.md](open_notebook/database/CLAUDE.md)**: SurrealDB operations, migrations, async patterns
---
## Documentation Map
- **[README.md](README.md)**: Project overview, features, quick start
- **[docs/index.md](docs/index.md)**: Complete user & deployment documentation
- **[CONFIGURATION.md](CONFIGURATION.md)**: Environment variables, model configuration
- **[DESIGN_PRINCIPLES.md](DESIGN_PRINCIPLES.md)**: Architectural decisions & philosophy
- **[MIGRATION.md](MIGRATION.md)**: v1.0 upgrade guide from Streamlit → React
- **[CONTRIBUTING.md](CONTRIBUTING.md)**: Contribution guidelines
- **[MAINTAINER_GUIDE.md](MAINTAINER_GUIDE.md)**: Release & maintenance procedures
---
## Testing Strategy
- **Unit tests**: `tests/test_domain.py`, `tests/test_models_api.py`
- **Graph tests**: `tests/test_graphs.py` (workflow integration)
- **Utils tests**: `tests/test_utils.py`
- **Run all**: `uv run pytest tests/`
- **Coverage**: Check with `pytest --cov`
---
## Common Tasks
### Add a New API Endpoint
1. Create router in `api/routers/feature.py`
2. Create service in `api/feature_service.py`
3. Define schemas in `api/models.py`
4. Register router in `api/main.py`
5. Test via http://localhost:5055/docs
### Add a New LangGraph Workflow
1. Create `open_notebook/graphs/workflow_name.py`
2. Define the state `TypedDict` and node functions
3. Build graph with `.add_node()` / `.add_edge()`
4. Invoke in service: `await graph.ainvoke({"input": ...}, config={...})`
5. Test with sample data in `tests/`
### Add Database Migration
1. Create `migrations/XXX_description.surql`
2. Write SurrealQL schema changes
3. Create `migrations/XXX_description_down.surql` (optional rollback)
4. API auto-detects on startup; migration runs if newer than recorded version
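The "runs if newer than recorded version" rule in step 4 can be sketched as a file scan. This is an illustration of the idea only, not the `AsyncMigrationManager` implementation; the function name `pending_migrations` and the assumption that the numeric filename prefix is the version are hypothetical.

```python
import re
from pathlib import Path

# Matches names such as "007_add_index.surql"; the numeric prefix is taken as the version.
MIGRATION_RE = re.compile(r"^(\d+)_.+\.surql$")


def pending_migrations(directory, current_version):
    """Return sorted (version, path) pairs for up-migrations newer than current_version."""
    found = []
    for path in Path(directory).iterdir():
        match = MIGRATION_RE.match(path.name)
        if not match or path.name.endswith("_down.surql"):
            continue  # skip rollback files and unrelated files
        version = int(match.group(1))
        if version > current_version:
            found.append((version, path))
    return sorted(found)
```

Sorting by the integer version keeps `010_*` after `009_*`, which a plain string sort of unpadded names would not guarantee.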
### Deploy to Production
1. Review [CONFIGURATION.md](CONFIGURATION.md) for security settings
2. Use `make docker-release` for multi-platform image
3. Push to Docker Hub / GitHub Container Registry
4. Deploy `docker compose --profile multi up`
5. Verify migrations via API logs
---
## Support & Community
- **Documentation**: https://open-notebook.ai
- **Discord**: https://discord.gg/37XJPXfz2w
- **Issues**: https://github.com/lfnovo/open-notebook/issues
- **License**: MIT (see LICENSE)
---
**Last Updated**: January 2026 | **Project Version**: 1.2.4+

117
api/CLAUDE.md Normal file

@ -0,0 +1,117 @@
# API Module
FastAPI-based REST backend exposing services for notebooks, sources, notes, chat, podcasts, and AI model management.
## Purpose
FastAPI application serving three architectural layers: routes (HTTP endpoints), services (business logic), and models (request/response schemas). Integrates LangGraph workflows (chat, ask, source_chat), SurrealDB persistence, and AI providers via Esperanto.
## Architecture Overview
**Three layers**:
1. **Routes** (`routers/*`): HTTP endpoints mapping to services
2. **Services** (`*_service.py`): Business logic orchestrating domain models, database, graphs, AI providers
3. **Models** (`models.py`): Pydantic request/response schemas with validation
**Startup flow**:
- Load .env environment variables
- Initialize CORS middleware + password auth middleware
- Run database migrations via AsyncMigrationManager on lifespan startup
- Register all routers
**Key services**:
- `chat_service.py`: Invokes chat graph with messages, context
- `podcast_service.py`: Orchestrates outline + transcript generation
- `sources_service.py`: Content ingestion, vectorization, metadata
- `notes_service.py`: Note creation, linking to sources/insights
- `transformations_service.py`: Applies transformations to content
- `models_service.py`: Manages AI provider/model configuration
- `episode_profiles_service.py`: Manages podcast speaker/episode profiles
## Component Catalog
### Main Application
- **main.py**: FastAPI app initialization, CORS setup, auth middleware, lifespan event, router registration
- **Lifespan handler**: Runs AsyncMigrationManager on startup (database schema migration)
- **Auth middleware**: PasswordAuthMiddleware protects endpoints (password-based access control)
### Services (Business Logic)
- **chat_service.py**: Invokes chat.py graph; handles message history via SqliteSaver
- **podcast_service.py**: Generates outline (outline.jinja), then transcript (transcript.jinja) for episodes
- **sources_service.py**: Ingests files/URLs (content_core), extracts text, vectorizes, saves to SurrealDB
- **transformations_service.py**: Applies transformations via transformation.py graph
- **models_service.py**: Manages ModelManager config (AI provider overrides)
- **episode_profiles_service.py**: CRUD for EpisodeProfile and SpeakerProfile models
- **insights_service.py**: Generates and retrieves source insights
- **notes_service.py**: Creates notes linked to sources/insights
### Models (Schemas)
- **models.py**: Pydantic schemas for request/response validation
- Request bodies: ChatRequest, CreateNoteRequest, PodcastGenerationRequest, etc.
- Response bodies: ChatResponse, NoteResponse, PodcastResponse, etc.
- Custom validators for enum fields, file paths, model references
### Routers
- **routers/chat.py**: POST /chat
- **routers/source_chat.py**: POST /source/{source_id}/chat
- **routers/podcasts.py**: POST /podcasts, GET /podcasts/{id}, etc.
- **routers/notes.py**: POST /notes, GET /notes/{id}
- **routers/sources.py**: POST /sources, GET /sources/{id}, DELETE /sources/{id}
- **routers/models.py**: GET /models, POST /models/config
- **routers/transformations.py**: POST /transformations
- **routers/insights.py**: GET /sources/{source_id}/insights
- **routers/auth.py**: POST /auth/password (password-based auth)
- **routers/commands.py**: GET /commands/{command_id} (job status tracking)
## Common Patterns
- **Direct service imports**: Routers import services directly; no DI framework is used
- **Async/await throughout**: All DB queries, graph invocations, AI calls are async
- **SurrealDB transactions**: Services use repo_query, repo_create, repo_upsert from database layer
- **Config override pattern**: Models/config override via models_service passed to graph.ainvoke(config=...)
- **Error handling**: Services catch exceptions and return HTTP status codes (400 Bad Request, 404 Not Found, 500 Internal Server Error)
- **Logging**: loguru logger in main.py; services expected to log key operations
- **Response normalization**: All responses follow standard schema (data + metadata structure)
## Key Dependencies
- `fastapi`: FastAPI app, routers, HTTPException
- `pydantic`: Validation models with Field, field_validator
- `open_notebook.graphs`: chat, ask, source_chat, source, transformation graphs
- `open_notebook.database`: SurrealDB repository functions (repo_query, repo_create, repo_upsert)
- `open_notebook.domain`: Notebook, Source, Note, SourceInsight models
- `open_notebook.ai.provision`: provision_langchain_model() factory
- `ai_prompter`: Prompter for template rendering
- `content_core`: extract_content() for file/URL processing
- `esperanto`: AI provider client library (LLM, embeddings, TTS)
- `surreal_commands`: Job queue for async operations (podcast generation)
- `loguru`: Structured logging
## Important Quirks & Gotchas
- **Migration auto-run**: Database schema migrations run on every API startup (via lifespan); no manual migration steps
- **PasswordAuthMiddleware is basic**: Uses simple password check; production deployments should replace with OAuth/JWT
- **No request rate limiting**: No built-in rate limiting; deployment must add via proxy/middleware
- **Services are stateless**: Services don't cache results; each request re-queries database/AI models
- **Graph invocation is blocking**: chat/podcast workflows may take minutes; no timeout handling in services
- **Command job fire-and-forget**: podcast_service.py submits jobs but doesn't wait (async job queue pattern)
- **Model override scoping**: Model config override via RunnableConfig is per-request only (not persistent)
- **CORS open by default**: main.py CORS settings allow all origins (restrict before production)
- **No OpenAPI security scheme**: API docs available without auth (disable before production)
- **Services don't validate user permission**: All endpoints trust authentication layer; no per-notebook permission checks
## How to Add New Endpoint
1. Create router file in `routers/` (e.g., `routers/new_feature.py`)
2. Import router into `main.py` and register: `app.include_router(new_feature.router, tags=["new_feature"])`
3. Create service in `new_feature_service.py` with business logic
4. Define request/response schemas in `models.py` (or create `new_feature_models.py`)
5. Implement router functions calling service methods
6. Test with `uv run uvicorn api.main:app --host 0.0.0.0 --port 5055`
## Testing Patterns
- **Interactive docs**: http://localhost:5055/docs (Swagger UI)
- **Direct service tests**: Import service, call methods directly with test data
- **Mock graphs**: Replace graph.ainvoke() with mock for testing service logic
- **Database**: Use a test database (separate SurrealDB instance or mock `repo_query`)
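The "mock graphs" pattern can be sketched with the standard library's `unittest.mock.AsyncMock`. The service function `answer_question` and the `question`/`answer` keys are hypothetical stand-ins for a real service and graph state.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def answer_question(graph, question: str) -> str:
    """Hypothetical service function: delegates to the graph and unwraps the result."""
    result = await graph.ainvoke({"question": question})
    return result["answer"]


def test_answer_question_uses_graph():
    # Replace the graph with a mock whose ainvoke() is awaitable.
    graph = MagicMock()
    graph.ainvoke = AsyncMock(return_value={"answer": "42"})

    answer = asyncio.run(answer_question(graph, "meaning of life?"))

    assert answer == "42"
    graph.ainvoke.assert_awaited_once_with({"question": "meaning of life?"})
```

Passing the graph in as an argument keeps the example self-contained; a service that imports its graph at module level would patch that name instead.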


@ -1,77 +0,0 @@
#!/usr/bin/env python3
"""Batch fix service files for mypy errors."""
import re
from pathlib import Path
SERVICE_FILES = [
    'api/notes_service.py',
    'api/insights_service.py',
    'api/episode_profiles_service.py',
    'api/settings_service.py',
    'api/sources_service.py',
    'api/podcast_service.py',
    'api/command_service.py',
]
BASE_DIR = Path('/Users/luisnovo/dev/projetos/open-notebook/open-notebook')
for service_file in SERVICE_FILES:
    file_path = BASE_DIR / service_file
    if not file_path.exists():
        print(f"Skipping {service_file} - file not found")
        continue
    content = file_path.read_text()
    original_content = content
    # Pattern to find: var_name = api_client.method(args)
    # Followed by: var_name["key"] or var_name.get("key")
    lines = content.split('\n')
    new_lines = []
    i = 0
    while i < len(lines):
        line = lines[i]
        # Check if this line has an api_client call assignment
        match = re.match(r'(\s*)(\w+)\s*=\s*api_client\.(\w+)\((.*)\)\s*$', line)
        if match and 'response = api_client' not in line:
            indent = match.group(1)
            var_name = match.group(2)
            method_name = match.group(3)
            args = match.group(4)
            # Look ahead to see if this variable is used with dict access
            has_dict_access = False
            for j in range(i+1, min(i+15, len(lines))):
                next_line = lines[j]
                if f'{var_name}["' in next_line or f"{var_name}['" in next_line or f'{var_name}.get(' in next_line:
                    has_dict_access = True
                    break
                # Stop looking if we hit a blank line, new function, or new assignment
                if (not next_line.strip() or
                        next_line.strip().startswith('def ') or
                        next_line.strip().startswith('class ') or
                        (re.match(r'\s*\w+\s*=', next_line) and var_name not in next_line)):
                    break
            if has_dict_access:
                # Replace with response and isinstance check
                new_lines.append(f'{indent}response = api_client.{method_name}({args})')
                new_lines.append(f'{indent}{var_name} = response if isinstance(response, dict) else response[0]')
                i += 1
                continue
        new_lines.append(line)
        i += 1
    new_content = '\n'.join(new_lines)
    # Check if content changed
    if new_content != original_content:
        file_path.write_text(new_content)
        print(f"✓ Fixed {service_file}")
    else:
        print(f"- No changes needed for {service_file}")
print("\nDone!")

49
commands/CLAUDE.md Normal file

@ -0,0 +1,49 @@
# Commands Module
**Purpose**: Defines async command handlers for long-running operations via `surreal-commands` job queue system.
## Key Components
- **`process_source_command`**: Ingests content through `source_graph`, optionally creates embeddings, and generates insights. Retries on transaction conflicts (exponential backoff with jitter, up to 5 times).
- **`embed_single_item_command`**: Embeds individual sources/notes/insights; splits content into chunks for vector storage.
- **`rebuild_embeddings_command`**: Bulk re-embeds all or only existing items, with selective type filtering.
- **`generate_podcast_command`**: Creates podcasts via `podcast-creator` library using stored episode/speaker profiles.
- **`process_text_command`** (example): Test fixture for text operations (uppercase, lowercase, reverse, word_count).
- **`analyze_data_command`** (example): Test fixture for numeric aggregations.
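The chunking step mentioned for `embed_single_item_command` can be sketched as overlapping character windows. The function name, the character-based sizing, and the default sizes are assumptions for illustration; the actual command may split on tokens or sentences instead.

```python
def split_into_chunks(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into fixed-size character windows that overlap slightly."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, which tends to help retrieval.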
## Important Patterns
- **Pydantic I/O**: All commands use `CommandInput`/`CommandOutput` subclasses for type safety and serialization.
- **Error handling**: Permanent errors return failure output; `RuntimeError` exceptions auto-retry via surreal-commands.
- **Model dumping**: Recursive `full_model_dump()` utility converts Pydantic models → dicts for DB/API responses.
- **Logging**: Uses `loguru.logger` throughout; logs execution start/end and key metrics (processing time, counts).
- **Time tracking**: All commands measure `start_time` → `processing_time` for monitoring.
## Dependencies
**External**: `surreal_commands` (command decorator, job queue), `loguru`, `pydantic`, `podcast_creator`
**Internal**: `open_notebook.domain.*` (Source, Note, Transformation), `open_notebook.graphs.source`, `open_notebook.ai.models`
## Quirks & Edge Cases
- **source_commands**: `ensure_record_id()` wraps command IDs for DB storage; transaction conflicts trigger exponential backoff retry (1-30s). Non-`RuntimeError` exceptions are permanent.
- **embedding_commands**: Queries DB directly for item state; chunk index must match source's chunk list. Model availability checked at command start.
- **podcast_commands**: Profiles loaded from SurrealDB by name (must exist); briefing can be extended with suffix. Episode records created mid-execution.
- **Example commands**: Accept optional `delay_seconds` for testing async behavior; not for production.
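The retry behavior described for `source_commands` (retry only on `RuntimeError`, exponential backoff between 1 and 30 seconds, at most 5 attempts) can be illustrated with a small helper. In the project the retries are driven by surreal-commands through the decorator's `retry` option, so `retry_on_conflict` below is a hypothetical stand-in that only shows the timing logic, and the use of full jitter is an assumption.

```python
import asyncio
import random


async def retry_on_conflict(operation, max_attempts=5, base_delay=1.0, max_delay=30.0,
                            sleep=asyncio.sleep):
    """Await operation(); on RuntimeError, wait with jittered exponential backoff and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await operation()
        except RuntimeError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error
            # Ceiling doubles each attempt (1s, 2s, 4s, ...) and is capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            await sleep(random.uniform(0, ceiling))  # full jitter (assumed variant)
```

Any exception other than `RuntimeError` propagates immediately, matching the rule that such errors are permanent.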
## Code Example
```python
@command("process_source", app="open_notebook", retry={...})
async def process_source_command(input_data: SourceProcessingInput) -> SourceProcessingOutput:
    start_time = time.time()
    try:
        transformations = [
            await Transformation.get(transformation_id)
            for transformation_id in input_data.transformations
        ]
        source = await Source.get(input_data.source_id)
        result = await source_graph.ainvoke({...})
        return SourceProcessingOutput(success=True, ...)
    except RuntimeError:
        raise  # Transient failure: re-raise so surreal-commands retries
    except Exception as e:
        # Permanent failure: report it in the output instead of retrying
        return SourceProcessingOutput(success=False, error_message=str(e))

159
frontend/src/CLAUDE.md Normal file

@ -0,0 +1,159 @@
# Frontend Architecture
Next.js React application providing UI for Open Notebook research assistant. Three-layer architecture: **pages** (Next.js App Router), **components** (feature-specific UI), and **lib** (data fetching, state management, utilities).
## High-Level Data Flow
```
Pages (Next.js) → Components (feature-specific) → Hooks (queries/mutations)
Stores (auth/modal state) → API module → Backend
```
User interactions trigger mutations/queries via hooks, which communicate with the backend through the API module. Store state (auth, modals) flows back to components via hooks. Child CLAUDE.md files document specific modules in detail:
- **`lib/api/CLAUDE.md`**: Axios client, FormData handling, interceptors
- **`lib/hooks/CLAUDE.md`**: TanStack Query wrappers, SSE streaming, context building
- **`lib/stores/CLAUDE.md`**: Zustand auth/modal state, localStorage persistence
- **`components/ui/CLAUDE.md`**: Radix UI primitives, CVA styling, accessibility
## Architectural Layers
### Pages (`src/app/`) — Next.js App Router
- `(auth)/login`: Authentication entry point
- `(dashboard)/`: Protected routes (notebooks, sources, search, models, etc.)
- Directory-based routing; each `page.tsx` is a route endpoint
- **Key pattern**: Pages call hooks to fetch data, render components with state
- **Router groups** `(auth)`, `(dashboard)` organize routes by feature without affecting URL
### Components (`src/components/`) — Feature-Specific UI
- **layout**: `AppShell.tsx`, `AppSidebar.tsx` — main layout wrapper used by all pages
- **providers**: `ThemeProvider`, `QueryProvider`, `ModalProvider` — app-wide context setup
- **auth**: `LoginForm.tsx` — authentication UI
- **common**: `CommandPalette`, `ErrorBoundary`, `ContextToggle`, `ModelSelector` — shared across pages
- **ui**: Reusable Radix UI building blocks (see child CLAUDE.md)
- **source**, **notebooks**, **search**, **podcasts**: Feature-specific components consuming hooks
**Component composition pattern**: Pages → Feature components → UI components. Feature components handle page-level state (loading, error), UI components remain stateless and styled.
### Lib (`src/lib/`) — Data & State Layer
#### `lib/api/` — Backend Communication
- **`client.ts`**: Central Axios instance with auth interceptor, FormData handling, 10-min timeout
- **`query-client.ts`**: TanStack Query configuration
- **Resource modules** (`sources.ts`, `chat.ts`, `notebooks.ts`, etc.): Endpoint-specific functions returning typed responses
- **Pattern**: All requests go through `apiClient`; auth token auto-added from localStorage
#### `lib/hooks/` — React Query + Custom Logic
- **Query hooks**: `useNotebookSources`, `useSources`, `useSource` — TanStack Query wrappers with cache keys
- **Mutation hooks**: `useCreateSource`, `useUpdateSource`, `useDeleteSource` — mutations with toast feedback + cache invalidation
- **Complex hooks**: `useNotebookChat`, `useSourceChat` — session management, message streaming, context building
- **SSE streaming**: `useAsk` — parses newline-delimited JSON from backend for multi-stage workflows
- **Pattern**: Hooks return `{ data, isLoading, error, refetch }` + action functions; cache invalidation on mutations
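The newline-delimited JSON parsing that `useAsk` performs comes down to buffering partial lines across network chunks. The hook itself is TypeScript; the Python sketch below only illustrates the buffering logic, and the `type` field used in the usage note is a hypothetical payload shape.

```python
import json


def iter_ndjson(chunks):
    """Yield one parsed object per complete line, buffering partial lines across chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            if line.strip():
                yield json.loads(line)
    if buffer.strip():
        yield json.loads(buffer)  # final line without a trailing newline
```

For example, feeding it `['{"type": "strat', 'egy"}\n']` yields a single object even though the JSON arrived split across two chunks.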
#### `lib/stores/` — Application State
- **`auth-store.ts`**: Authentication state (token, isAuthenticated) with 30-second check caching
- **Zustand + persist middleware**: Auto-syncs sensitive state to localStorage
- **Pattern**: Store actions (`login()`, `logout()`, `checkAuth()`) update state; consumed via hooks in components
#### `lib/types/` — TypeScript Definitions
- API request/response shapes, domain models (Notebook, Source, Note, etc.)
- Ensures type safety across API calls and store mutations
## Data & Control Flow Walkthrough
### Example: Notebook Chat
1. **Page** (`notebooks/[id]/page.tsx`) fetches initial data, passes `notebookId` to `ChatColumn` component
2. **Hook call** (`useNotebookChat()`):
- Queries sessions for notebook via TanStack Query
- Sets up message state + context building logic
- Returns `{ messages, sendMessage(), setModelOverride() }`
3. **Component renders**: `ChatColumn` displays messages, text input
4. **User sends message**: Component calls the `sendMessage()` function returned by the hook
5. **Hook execution**:
- Builds context from selected sources/notes via `buildContext()` helper
- Calls `chatApi.sendMessage()` (from API module)
- Client-side optimistic update: adds message to local state before response
6. **Backend response** arrives, TanStack Query updates cache
7. **Cache invalidation** on other source/note mutations ensures stale UI refreshes
### Example: File Upload with Source Creation
1. **Component** (`SourceDialog`) renders form with file picker
2. **Hook** (`useFileUpload`):
   - Converts file to FormData (JSON fields stringified)
   - Calls `sourcesApi.create()` with FormData
   - API client interceptor deletes Content-Type header (lets browser set multipart boundary)
3. **Toast notifications** show progress
4. **Cache invalidation** on success: `queryClient.invalidateQueries(['sources'])`
5. **Related queries** auto-refetch: notebooks, sources list, etc.
## Key Patterns & Cross-Layer Coordination
### Caching & Invalidation
- **Query keys**: `QUERY_KEYS.notebook(id)`, `QUERY_KEYS.sources(notebookId)` — hierarchical structure
- **Broad invalidation**: `['sources']` invalidates all source queries; simpler than targeted invalidation but may trigger unnecessary refetches
- **Auto-refetch**: `refetchOnWindowFocus: true` on frequently-changing data (sources, notebooks)
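
Broad invalidation works because a filter key matches every cached key that starts with it. The sketch below illustrates that prefix rule in plain TypeScript. It is a simplification, not TanStack Query's matcher (which also handles partial object matching), and the `QUERY_KEYS` factory shown here is a hypothetical stand-in whose key shapes are assumptions.

```typescript
type QueryKey = readonly unknown[]

// Hypothetical key factory in the spirit of QUERY_KEYS; the real shapes may differ.
const QUERY_KEYS = {
  sources: (notebookId: string) => ['sources', notebookId] as const,
  notebook: (id: string) => ['notebooks', id] as const,
}

// A filter matches a cached key when every filter part equals the
// cached part at the same position (compared via JSON for simplicity).
function matchesPrefix(filter: QueryKey, key: QueryKey): boolean {
  return (
    filter.length <= key.length &&
    filter.every((part, i) => JSON.stringify(part) === JSON.stringify(key[i]))
  )
}

// Returns the cached keys that an invalidation with `filter` would hit.
function keysToInvalidate(filter: QueryKey, cached: QueryKey[]): QueryKey[] {
  return cached.filter((key) => matchesPrefix(filter, key))
}
```

With this rule, `['sources']` hits the source list of every notebook, while `QUERY_KEYS.sources(notebookId)` hits only one. That is the precision versus simplicity choice described above.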
### Auth & Protected Routes
- **Middleware** (`src/middleware.ts`): Redirects unauthenticated users to `/login`
- **Auth store**: Validates token via `/notebooks` API call (actual validation, not JWT decode)
- **Interceptor**: Adds `Bearer {token}` to all requests; 401 response clears auth and redirects to login
### Modal State Management
- **Modal hooks**: Components query modal state from stores
- **Context**: Modals pass data (e.g., notebook ID) to child components
- **Pattern**: One store per modal type; triggered by button clicks + data passing via hook arguments
### Error Handling
- **API errors**: All request failures propagate to consuming code; components show toast notifications
- **Toast feedback**: Mutations show success/error toasts (from `sonner` library)
- **Error boundary**: App-level error boundary catches React render errors; shows fallback UI
### FormData Handling
- **JSON fields**: Nested values (arrays, objects) must be serialized with `JSON.stringify()` before being appended to FormData
- **Content-Type header**: Removed by interceptor for FormData requests (lets browser set boundary)
- **Example**: `sources` array converted to string via `JSON.stringify()` before appending to FormData
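
A minimal sketch of the stringify-before-append rule follows. `FormDataLike` and `appendFields` are hypothetical names introduced for illustration; the structural interface exists only so the sketch runs outside a browser, and file fields (which would be appended as `File`/`Blob` values) are left out.

```typescript
// Minimal structural stand-in for the part of FormData used here.
interface FormDataLike {
  append(name: string, value: string): void
}

type FieldValue = string | number | boolean | unknown[] | Record<string, unknown>

// Appends each field; arrays and objects are JSON-stringified first,
// because FormData values can only be strings or binary blobs.
function appendFields(form: FormDataLike, fields: Record<string, FieldValue>): void {
  for (const name of Object.keys(fields)) {
    const value = fields[name]
    if (typeof value === 'object' && value !== null) {
      form.append(name, JSON.stringify(value))
    } else {
      form.append(name, String(value))
    }
  }
}
```

On the receiving side, the stringified fields must be parsed back from JSON before use.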
## Component Organization Within Features
- **Feature folders** (`source/`, `notebooks/`, `podcasts/`): Group related components
- **Composition**: Larger components nest smaller ones; no deep prop drilling (state lifted to hooks)
- **Dialog patterns**: Features define dialog components for inline actions (edit, create, delete)
- **Props**: Components accept data + action callbacks from parent or hooks
## Providers & Context Setup
**Root layout** (`app/layout.tsx`) wraps app with:
1. `ThemeProvider` — next-themes for light/dark mode
2. `QueryProvider` — TanStack Query client
3. `ErrorBoundary` — React error boundary
4. `ConnectionGuard` — checks backend connectivity on startup
5. `Toaster` — sonner toast notification system
## Important Gotchas & Design Decisions
- **Token storage**: Stored in localStorage under `auth-storage` key (Zustand persist); consumed by API interceptor
- **Base URL discovery**: API client fetches base URL from runtime config on first request (async; can be slow on startup)
- **Optimistic updates**: Chat messages added to state before server confirmation; removed on error
- **Modal lifecycle**: Dialogs not auto-reset; parent must clear form state after submit
- **Focus management**: Dialog auto-focuses first input; can cause layout shifts if inputs are conditional
- **Cache invalidation breadth**: Trade-off between precision + simplicity; broad invalidation simpler but may over-fetch
## How to Add a New Feature
1. **Create page**: `app/(dashboard)/feature/page.tsx` — calls hooks, renders components
2. **Create feature components**: `components/feature/` — compose UI + business logic
3. **Add hooks** (if data needed): `lib/hooks/useFeature.ts` — TanStack Query wrapper
4. **Add API module** (if backend call needed): `lib/api/feature.ts` — resource-specific functions
5. **Add types**: `lib/types/api.ts` — request/response shapes
6. **Use UI components**: Import from `components/ui/` for consistent styling
7. **Handle auth**: Middleware redirects unauthenticated users; no special handling needed in component
## Testing
- **Hooks**: Mock API functions, wrap in `QueryClientProvider`, assert query/mutation behavior
- **Components**: Mock hooks via `vi.fn()`, test rendering + user interactions
- **API calls**: Mock `axios` interceptors; test request/response shapes
- **Stores**: Mock store state, test mutations via `act()`, assert state changes
See child CLAUDE.md files for module-specific testing patterns.


@ -0,0 +1,64 @@
# UI Components Module
Radix UI-based accessible component library with CVA styling, composed building blocks, and theming support.
## Key Components
- **Primitives** (`button.tsx`, `dialog.tsx`, `select.tsx`, `dropdown-menu.tsx`): Radix UI wrappers with Tailwind styling
- **Composite components** (`checkbox-list.tsx`, `wizard-container.tsx`, `command.tsx`): Multi-part patterns combining primitives
- **Form components** (`input.tsx`, `textarea.tsx`, `label.tsx`, `form-section.tsx`): Input handling with accessibility
- **Feedback** (`alert.tsx`, `alert-dialog.tsx`, `sonner.tsx`, `progress.tsx`): User notifications and status
- **Layout** (`card.tsx`, `accordion.tsx`, `tabs.tsx`, `scroll-area.tsx`): Structural wrappers
- **Utilities** (`badge.tsx`, `separator.tsx`, `tooltip.tsx`, `popover.tsx`, `collapsible.tsx`): Small focused components
## Important Patterns
- **Radix UI wrappers**: Components delegate to Radix primitives; apply Tailwind classes via `cn()` utility
- **CVA (Class Variance Authority)**: `button.tsx` and similar use CVA for variant/size combinations
- **Composition via Slot**: `Button` uses `asChild` prop + `Slot` from radix to render as any element type
- **Data slots**: All components have `data-slot` attributes for testing/styling isolation
- **Controlled styling**: Classes hardcoded in components; use `className` prop to override/extend
- **Animations**: Radix `data-[state]` selectors for open/close animations (fade-in, zoom-in)
- **Accessibility first**: ARIA attributes from Radix (aria-invalid, sr-only labels, focus rings)
- **Dark mode support**: Uses Tailwind dark: prefix for color scheme (e.g., `dark:border-input`)
## Key Dependencies
- `@radix-ui/*`: Unstyled accessible primitives (dialog, select, dropdown-menu, etc.)
- `class-variance-authority`: CVA for variant patterns
- `lucide-react`: Icon library (XIcon in dialog close button)
- `@/lib/utils`: `cn()` utility for class merging
## How to Add New Components
1. Create `.tsx` file wrapping Radix primitive or composing existing components
2. Add `data-slot="component-name"` to root element
3. Use `cn()` to merge default classes with `className` prop
4. Export both component and variants (if using CVA)
5. Document prop shape and usage in JSDoc
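
Steps 3 and 4 can be sketched without React or CVA. Everything below is illustrative: `cn` here is a simplified stand-in that only joins class names (the real `@/lib/utils` helper may also resolve conflicting Tailwind classes), and the variant tables and `buttonClasses` are hypothetical, not the contents of `button.tsx`.

```typescript
type Variant = 'default' | 'destructive'
type Size = 'default' | 'sm'

// Simplified stand-in for cn(): joins truthy class names with spaces.
function cn(...classes: Array<string | undefined | false>): string {
  return classes.filter(Boolean).join(' ')
}

// Hypothetical variant tables; the class lists are placeholders.
const VARIANT_CLASSES: Record<Variant, string> = {
  default: 'bg-primary text-primary-foreground',
  destructive: 'bg-destructive text-white',
}
const SIZE_CLASSES: Record<Size, string> = {
  default: 'h-9 px-4',
  sm: 'h-8 px-3',
}

// Base classes first, then variant and size, then the caller's className
// so callers can extend the defaults.
function buttonClasses(opts: { variant?: Variant; size?: Size; className?: string } = {}): string {
  const { variant = 'default', size = 'default', className } = opts
  return cn('inline-flex items-center', VARIANT_CLASSES[variant], SIZE_CLASSES[size], className)
}
```

Placing `className` last is the design choice that lets consumers extend defaults.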
## Important Quirks & Gotchas
- **Slot forwarding**: `asChild={true}` on Button passes all props to child; ensure child accepts them
- **Form state in dialogs**: Dialog not reset automatically; parent must manually clear form state
- **Focus management**: Dialog auto-focuses first input; can cause layout shifts if inputs conditionally rendered
- **Z-index stacking**: Fixed elements (Dialog overlay, dropdown menus) use z-50; be careful with other fixed elements
- **Click outside closes dropdown**: Radix dropdowns auto-close on outside click; may conflict with hover-triggered actions
- **SVG size inference**: Button uses `[&_svg:not([class*='size-'])]:size-4` so icons without an explicit `size-*` class default to `size-4`; set a `size-*` class on the icon if a different size is needed
- **Global CSS conflicts**: Hardcoded Tailwind classes may conflict with global CSS; specificity matters
- **Dark mode class**: Requires `dark` class on document root; not automatic with prefers-color-scheme alone
## Testing Patterns
```typescript
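// Import paths assume the `@/` alias maps to `src/`
import { render, screen } from '@testing-library/react'
import { Button } from '@/components/ui/button'
import { Dialog, DialogContent } from '@/components/ui/dialog'

// Test component rendering with props
render(<Button variant="destructive" size="sm">Delete</Button>)
expect(screen.getByRole('button')).toHaveClass('bg-destructive')
// Test Dialog interaction
render(<Dialog open={true}><DialogContent>Content</DialogContent></Dialog>)
expect(screen.getByText('Content')).toBeInTheDocument()
// Test accessibility: getByRole throws if no element exposes the dialog role
expect(screen.getByRole('dialog')).toBeInTheDocument()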
```


@ -0,0 +1,66 @@
# API Module
Axios-based client and resource-specific API modules for backend communication with auth, FormData handling, and error recovery.
## Key Components
- **`client.ts`**: Central Axios instance with request/response interceptors, auth headers, base URL resolution
- **Resource modules** (`sources.ts`, `notebooks.ts`, `chat.ts`, `search.ts`, etc.): Endpoint-specific functions returning typed responses
- **`query-client.ts`**: TanStack Query client configuration with default options
- **`models.ts`, `notes.ts`, `embeddings.ts`, `settings.ts`**: Additional resource APIs
## Important Patterns
- **Single axios instance**: `apiClient` with 10-minute timeout (for slow LLM operations)
- **Request interceptor**: Auto-fetches base URL from config, adds Bearer auth from localStorage `auth-storage`
- **FormData handling**: Auto-removes Content-Type header for FormData to let browser set multipart boundary
- **Response interceptor**: 401 clears auth and redirects to `/login`
- **Async base URL resolution**: `getApiUrl()` fetches from runtime config on first request
- **Typed responses**: All functions return typed payloads via `response.data`; request errors are not caught here and propagate to the caller
- **Namespaced exports**: Resource modules export namespaced objects (e.g., `sourcesApi.list()`, `sourcesApi.create()`)
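
The request-interceptor behaviour described above (Bearer token from `auth-storage`, Content-Type removal for FormData) can be sketched as a pure function. This is an assumption-laden illustration, not the code in `client.ts`: `RequestConfig`, `readToken`, and `prepareRequest` are hypothetical, `isFormData` stands in for a `config.data instanceof FormData` check, and the sketch copies headers rather than mutating them. The `{ state, version }` envelope is the shape Zustand's persist middleware writes to storage.

```typescript
interface RequestConfig {
  headers: Record<string, string>
  isFormData: boolean // stand-in for `config.data instanceof FormData`
}

// Reads the token out of the persisted store JSON: { state: { token }, version }.
function readToken(rawAuthStorage: string | null): string | null {
  if (!rawAuthStorage) return null
  try {
    const parsed = JSON.parse(rawAuthStorage) as { state?: { token?: string | null } }
    const token = parsed && parsed.state ? parsed.state.token : null
    return typeof token === 'string' && token.length > 0 ? token : null
  } catch {
    return null // corrupted storage is treated as "no token"
  }
}

// Adds the Authorization header when a token exists and drops
// Content-Type for FormData so the browser can set the multipart boundary.
function prepareRequest(config: RequestConfig, rawAuthStorage: string | null): RequestConfig {
  const headers = { ...config.headers }
  const token = readToken(rawAuthStorage)
  if (token) headers['Authorization'] = `Bearer ${token}`
  if (config.isFormData) delete headers['Content-Type']
  return { ...config, headers }
}
```

In a real interceptor the raw string would come from `localStorage.getItem('auth-storage')`.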
## Key Dependencies
- `axios`: HTTP client library
- `@/lib/config`: `getApiUrl()` for dynamic base URL
- `@/lib/types/api`: TypeScript types for request/response shapes
## How to Add New API Modules
1. Create new file (e.g., `transforms.ts`)
2. Import `apiClient`
3. Export namespaced object with methods:
```typescript
export const transformsApi = {
  list: async () => {
    const response = await apiClient.get('/transforms')
    return response.data
  }
}
```
4. Add types to `@/lib/types/api` if new response shapes needed
## Important Quirks & Gotchas
- **Base URL delay**: First request waits for `getApiUrl()` to resolve; can be slow on startup
- **FormData fields as JSON strings**: Nested objects (arrays, objects) must be JSON stringified in FormData (e.g., `notebooks`, `transformations`)
- **Timeout for streaming**: 10-minute timeout may not cover very long-running LLM operations; consider extending if needed
- **Auth token management**: Token stored in localStorage `auth-storage` key; uses Zustand persist middleware
- **Headers mutation in interceptor**: Mutating `config.headers` directly; be careful with middleware order
- **No retry logic**: Failed requests not automatically retried; must be handled in consuming code
- **Content-Type header precedence**: FormData interceptor deletes Content-Type after checking; subsequent interceptors won't re-add it
## Usage Example
```typescript
// Basic list
const sources = await sourcesApi.list({ notebook_id: notebookId })
// File upload with FormData
const response = await sourcesApi.create({
  type: 'upload',
  file: fileObj,
  notebook_id: notebookId,
  async_processing: true
})
// With auth token (auto-added by interceptor)
const notes = await notesApi.list()
```


@ -0,0 +1,64 @@
# Hooks Module
React hooks for API data fetching, state management, and complex workflows (chat, streaming, file handling).
## Key Components
- **Query hooks** (`useNotebookSources`, `useSource`, `useSources`): TanStack Query wrappers for source data with infinite scroll and refetch strategies
- **Mutation hooks** (`useCreateSource`, `useUpdateSource`, `useDeleteSource`, `useFileUpload`, `useRetrySource`): Server mutations with toast notifications and cache invalidation
- **Chat hooks** (`useNotebookChat`, `useSourceChat`): Complex session management, context building, and message streaming
- **Streaming hooks** (`useAsk`): SSE parsing for multi-stage Ask workflows (strategy → answers → final answer)
- **Model/config hooks** (`useModels`, `useSettings`, `useTransformations`): Application-level settings and model management
- **Utility hooks** (`useMediaQuery`, `useToast`, `useNavigation`, `useAuth`): UI state and auth checking
## Important Patterns
- **TanStack Query integration**: All data hooks use `useQuery`/`useMutation` with `QUERY_KEYS` for cache consistency
- **Optimistic updates**: Mutations add local state before server response (e.g., notebook chat messages)
- **Cache invalidation**: Broad invalidation of query keys on mutations (e.g., `['sources']` catches all source queries)
- **Auto-refetch on return**: `refetchOnWindowFocus: true` on frequently-changing data (sources, notebooks)
- **Manual refetch controls**: Hooks return `refetch()` for parent components to trigger refresh
- **SSE streaming pattern**: `useAsk` manually parses newline-delimited JSON from `/api/search/ask`; handles incomplete buffers
- **Status polling**: `useSourceStatus` auto-refetches every 2s while `status` is `'running'`, `'queued'`, or `'new'`
- **Context building**: `useNotebookChat.buildContext()` assembles selected sources + notes with token/char counts
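
The status-polling rule can be expressed as a small function of the kind one might return from a TanStack Query `refetchInterval` callback. This is a sketch under assumptions: `pollingInterval` is a hypothetical name, and the terminal statuses `'completed'` and `'failed'` are placeholders for whatever the backend actually reports.

```typescript
// 'new', 'queued', and 'running' come from the text above;
// the terminal statuses are assumed for illustration.
type SourceStatus = 'new' | 'queued' | 'running' | 'completed' | 'failed'

const ACTIVE_STATUSES: readonly SourceStatus[] = ['new', 'queued', 'running']
const POLL_INTERVAL_MS = 2000

// Returns the polling interval in milliseconds while processing is
// still in progress, or false to stop polling.
function pollingInterval(status: SourceStatus | undefined): number | false {
  if (status === undefined) return false
  return ACTIVE_STATUSES.indexOf(status) !== -1 ? POLL_INTERVAL_MS : false
}
```

Returning `false` once a terminal status arrives stops the timer without any extra cleanup in the component.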
## Key Dependencies
- `@tanstack/react-query`: Data fetching and caching
- `sonner`: Toast notifications
- `@/lib/api/*`: API module exports (sourcesApi, chatApi, searchApi, etc.)
- `@/lib/types/api`: TypeScript response types
- Zustand stores: `useAuthStore`, modal managers
## How to Add New Hooks
1. **Data queries**: Create `useQuery` hook wrapping API call; use `QUERY_KEYS.entityName(id)` for cache key
2. **Mutations**: Create `useMutation` hook with `onSuccess` cache invalidation + toast feedback
3. **Complex state**: Use `useState` + callbacks for local state (see `useAsk`, `useNotebookChat`)
4. **Return shape**: Export object with both state and action functions for composability
## Important Quirks & Gotchas
- **Cache invalidation breadth**: Invalidating `['sources']` affects ALL source queries; be precise if performance matters
- **Optimistic updates + error handling**: `useNotebookChat` removes optimistic messages on error; ensure cleanup
- **SSE buffer handling**: `useAsk` keeps incomplete lines in buffer between reads; incomplete JSON silently skipped
- **Model override timing**: `useNotebookChat` stores pending model override if no session exists; applied on session creation
- **Offset pagination**: `useNotebookSources` uses offset-based pagination (no cursor); `nextOffset` calculated from page size
- **Status polling race**: `useSourceStatus` may refetch stale data before server catches up; retry logic has 3-attempt limit
- **Keyboard trap in dialogs**: Some hooks manage modal state; ensure Dialog/Modal components handle escape key properly
- **Form data handling**: `useFileUpload` and source creation convert JSON fields to strings in FormData
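
The buffer handling noted for `useAsk` can be sketched as a pure function over text chunks. This is a minimal illustration assuming plain newline-delimited JSON; `parseNdjsonChunk` and `ParseResult` are hypothetical names, not the hook's actual internals.

```typescript
interface ParseResult {
  events: unknown[] // fully parsed JSON lines
  rest: string // trailing partial line to prepend to the next chunk
}

// Splits buffered text into complete lines, parses each as JSON, and
// returns the unfinished tail so it can be completed by the next read.
function parseNdjsonChunk(buffer: string, chunk: string): ParseResult {
  const lines = (buffer + chunk).split('\n')
  const rest = lines.pop() || '' // last piece has no newline yet
  const events: unknown[] = []
  for (const line of lines) {
    const trimmed = line.trim()
    if (trimmed === '') continue
    try {
      events.push(JSON.parse(trimmed))
    } catch {
      // Malformed line: skipped silently, as described above
    }
  }
  return { events, rest }
}
```

A reader loop would keep `rest` between reads and pass it back in as `buffer`. Silent skipping keeps the stream alive but hides backend formatting bugs, so logging skipped lines during development may be worthwhile.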
## Testing Patterns
```typescript
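import { QueryClient, QueryClientProvider } from '@tanstack/react-query'
import { render, waitFor } from '@testing-library/react'
import { vi } from 'vitest'
// Mock API (resolve with fixture data as needed)
const mockApi = {
  list: vi.fn().mockResolvedValue([])
}
// Test hook with a QueryClientProvider wrapper (the provider needs a client)
const queryClient = new QueryClient()
const invalidateSpy = vi.spyOn(queryClient, 'invalidateQueries')
const wrapper = ({ children }: { children: React.ReactNode }) => (
  <QueryClientProvider client={queryClient}>{children}</QueryClientProvider>
)
render(<Component />, { wrapper })
// Assert mutations trigger cache invalidation
await waitFor(() => expect(invalidateSpy).toHaveBeenCalled())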
```


@ -0,0 +1,68 @@
# Stores Module
Zustand-based state management for authentication, modals, and application-level settings with localStorage persistence.
## Key Components
- **`auth-store.ts`**: Authentication state (token, isAuthenticated) with login, logout, auth checking, and Zustand persistence
- **Modal stores** (imported via hooks): Modal visibility and data state management
- **Settings persistence**: Auto-saves sensitive state (token, auth status) to localStorage via Zustand persist middleware
## Important Patterns
- **Zustand create + persist**: State + actions combined in single store; `persist` middleware auto-syncs to localStorage
- **Selective persistence**: `partialize` option limits what's saved (e.g., only `token` and `isAuthenticated`, not `isLoading`)
- **Hydration tracking**: `setHasHydrated()` marks when localStorage data loaded; used to avoid hydration mismatch in SSR
- **Auth caching**: 30-second cache on `checkAuth()` to avoid excessive API calls; stores `lastAuthCheck` timestamp
- **Network resilience**: Handles 401 globally in API interceptor; graceful degradation if API unreachable
- **API validation**: Uses actual API call (`/notebooks` endpoint) to validate token instead of parsing JWT
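
The 30-second auth cache and the in-flight guard can be sketched as one decision function. This is an illustration only: `shouldCallApi` and `AuthCheckState` are hypothetical names, and the exact conditions used by `checkAuth()` in `auth-store.ts` may differ. The clock is passed in so the logic stays testable.

```typescript
const AUTH_CACHE_MS = 30000

// Subset of store state relevant to the decision; field names follow the text above.
interface AuthCheckState {
  isCheckingAuth: boolean
  lastAuthCheck: number | null // timestamp (ms) of the last completed check
}

// Decides whether checkAuth() should hit the API or reuse the cached result.
function shouldCallApi(state: AuthCheckState, now: number): boolean {
  if (state.isCheckingAuth) return false // another check is already in flight
  if (state.lastAuthCheck === null) return true // never checked before
  return now - state.lastAuthCheck >= AUTH_CACHE_MS // cache expired?
}
```

When this returns `false`, the store would return its current `isAuthenticated` value instead of issuing another `/notebooks` request.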
## Key Dependencies
- `zustand`: State management library
- `@/lib/config`: `getApiUrl()` for dynamic server discovery
- localStorage: Browser persistence API
## How to Add New Stores
1. Create new file (e.g., `settings-store.ts`)
2. Define interface extending store state and actions
3. Use `create<Interface>()(persist(...))` for persistence, or plain `create<Interface>()` for ephemeral state:
```typescript
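import { create } from 'zustand'
import { persist } from 'zustand/middleware'
interface SettingsState {
  theme: 'light' | 'dark'
  setTheme: (theme: SettingsState['theme']) => void
}

export const useSettingsStore = create<SettingsState>()(
  persist(
    (set) => ({
      theme: 'dark',
      setTheme: (theme) => set({ theme })
    }),
    { name: 'settings-storage' } // localStorage key; must be unique per store
  )
)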
```
## Important Quirks & Gotchas
- **Hydration mismatch**: Server-side rendered stores must check `hasHydrated` before rendering to prevent SSR mismatches
- **localStorage key collision**: Persist middleware uses `name` option as localStorage key; ensure unique per store
- **Token not validated**: `login()` only checks HTTP 200 response; doesn't decode or validate JWT structure
- **Auth check race condition**: Multiple simultaneous `checkAuth()` calls return early if one already in progress (`isCheckingAuth`)
- **Error messages from HTTP**: Shows 401/403/5xx status codes to user; helps with debugging but may leak info
- **Network timeout handling**: Network errors in `checkAuthRequired()` set `authRequired: null` (safe default); `login()` shows generic message
- **Logout doesn't invalidate session**: Client-side logout only clears local token; server session may still be valid
- **Double authentication**: Both `login()` and `checkAuth()` test same `/notebooks` endpoint; could be optimized with dedicated endpoint
## Testing Patterns
```typescript
// Mock store
const mockAuthStore = {
  isAuthenticated: true,
  token: 'test-token',
  checkAuth: vi.fn().mockResolvedValue(true),
  login: vi.fn().mockResolvedValue(true),
  logout: vi.fn()
}
// Test store mutations (Zustand hooks expose setState/getState)
act(() => useSettingsStore.setState({ theme: 'light' }))
expect(useSettingsStore.getState().theme).toBe('light')
```

242
open_notebook/CLAUDE.md Normal file

@ -0,0 +1,242 @@
# Open Notebook Core Backend
The `open_notebook` module is the heart of the system: a multi-layer backend orchestrating AI-powered research workflows. It bridges domain models, asynchronous database operations, LangGraph-based content processing, and multi-provider AI model management.
## Purpose
Encapsulates the entire backend architecture:
1. **Data layer**: SurrealDB persistence with async CRUD and migrations
2. **Domain layer**: Research models (Notebook, Source, Note, etc.) with embedded relationships
3. **Workflow layer**: LangGraph state machines for content ingestion, chat, and transformations
4. **AI provisioning**: Multi-provider model management with smart fallback logic
5. **Support services**: Context building, tokenization, and utility functions
All components communicate through async/await patterns and use Pydantic for validation.
## Architecture Overview
```
┌─────────────────────────────────────────────────────────────┐
│                     API / Streamlit UI                      │
└──────────────────────────────┬──────────────────────────────┘
             ┌─────────────────┴───────────────┐
             │                                 │
┌────────────▼────────────┐      ┌─────────────▼──────────────┐
│ Graphs (LangGraph)      │      │ Domain Models (Data)       │
│ - source.py (ingestion) │      │ - Notebook, Source, Note   │
│ - chat.py               │      │ - ChatSession, Asset       │
│ - ask.py (search)       │      │ - SourceInsight, Embedding │
│ - transformation.py     │      │ - Transformation, Settings │
│                         │      │ - EpisodeProfile, Podcast  │
└────────────┬────────────┘      └─────────────┬──────────────┘
             │                                 │
             └─────────────────┬───────────────┘
             ┌─────────────────┴───────────────┐
             │                                 │
┌────────────▼────────────┐      ┌─────────────▼──────────────┐
│ AI Module (Models)      │      │ Utils (Helpers)            │
│ - ModelManager          │      │ - ContextBuilder           │
│ - DefaultModels         │      │ - TokenUtils               │
│ - provision_langchain   │      │ - TextUtils                │
│ - Multi-provider AI     │      │ - VersionUtils             │
└────────────┬────────────┘      └─────────────┬──────────────┘
             │                                 │
             └─────────────────┬───────────────┘
               ┌───────────────▼───────────────┐
               │ Database (SurrealDB)          │
               │ - repository.py (CRUD ops)    │
               │ - async_migrate.py (schema)   │
               │ - Configuration               │
               └───────────────────────────────┘
```
## Component Catalog
### Core Layers
**See dedicated CLAUDE.md files for detailed patterns and usage:**
- **`database/`**: Async repository pattern (repo_query, repo_create, repo_upsert), connection handling (a new connection per `repo_*` call, no pooling), and automatic schema migrations on API startup. See `database/CLAUDE.md`.
- **`domain/`**: Core data models using Pydantic with SurrealDB persistence. Two base classes: `ObjectModel` (mutable records with auto-generated IDs and embedding) and `RecordModel` (singleton configuration). Includes search functions (text_search, vector_search). See `domain/CLAUDE.md`.
- **`graphs/`**: LangGraph state machines for async workflows. Content ingestion (source.py), conversational agents (chat.py), search synthesis (ask.py), and transformations. Uses provision_langchain_model() for smart model selection with token-aware fallback. See `graphs/CLAUDE.md`.
- **`ai/`**: Centralized AI model lifecycle via Esperanto library. ModelManager factory with intelligent fallback (large context detection, type-specific defaults, config override). Supports 8+ providers (OpenAI, Anthropic, Google, Groq, Ollama, Mistral, DeepSeek, xAI). See `ai/CLAUDE.md`.
- **`utils/`**: Cross-cutting utilities: ContextBuilder (flexible context assembly from sources/notes/insights with token budgeting), TextUtils (truncation, cleaning), TokenUtils (GPT token counting), VersionUtils (schema compatibility). See `utils/CLAUDE.md`.
- **`podcasts/`**: Podcast generation models: SpeakerProfile (TTS voice config), EpisodeProfile (generation settings), PodcastEpisode (job tracking via surreal-commands). See `podcasts/CLAUDE.md`.
### Configuration & Exceptions
- **`config.py`**: Paths for data folder, uploads, LangGraph checkpoints, and tiktoken cache. Auto-creates directories.
- **`exceptions.py`**: Hierarchy of OpenNotebookError subclasses for database, file, network, authentication, and rate-limit failures.
## Data Flow: Content Ingestion
```
User uploads file/URL
                   │
┌──────────────────▼──────────────────┐
│ source.py (LangGraph state machine) │
├─────────────────────────────────────┤
│ 1. content_process()                │
│    - extract_content() from file/URL│
│    - Use ContentSettings defaults   │
│    - speech_to_text model from DB   │
│                                     │
│ 2. save_source()                    │
│    - Update Source with full_text   │
│    - Preserve title if empty        │
│                                     │
│ 3. trigger_transformations()        │
│    - Parallel fan-out to each TXN   │
└──────────────────┬──────────────────┘
                   │
┌──────────────────▼──────────────────┐
│ transformation.py (parallel)        │
│ - Apply prompt to source text       │
│ - Generate insights                 │
│ - Auto-embed results                │
└──────────────────┬──────────────────┘
                   │
┌──────────────────▼──────────────────┐
│ Database Storage                    │
│ - Source.full_text                  │
│ - SourceInsight                     │
│ - Embeddings                        │
│ - (async job)                       │
└─────────────────────────────────────┘
```
**Fire-and-forget embeddings**: Source.vectorize() returns command_id without awaiting; embedding happens asynchronously via surreal-commands job system.
## Data Flow: Chat & Search
```
User message in chat
                 │
┌────────────────▼─────────────────┐
│ ContextBuilder                   │
│ - Select sources/notes           │
│ - Token budget limiting          │
│ - Priority weighting             │
└────────────────┬─────────────────┘
                 │
┌────────────────▼─────────────────┐
│ chat.py or ask.py (LangGraph)    │
│ - Load context from above        │
│ - provision_langchain_model()    │
│    * Auto-upgrade for large text │
│    * Apply model_id override     │
│ - Call LLM with context          │
│ - Store message in SqliteSaver   │
└────────────────┬─────────────────┘
                 │
┌────────────────▼─────────────────┐
│ LLM Response                     │
│ (persisted)                      │
└──────────────────────────────────┘
```
## Key Patterns Across Layers
### Async/Await Everywhere
All database operations, model provisioning, and graph execution are async. Mix with sync code only via `asyncio.run()` or LangGraph's async bridges (see graphs/CLAUDE.md for workarounds).
### Type-Driven Dispatch
Model types (language, embedding, speech_to_text, text_to_speech) drive factory logic in ModelManager. Domain model IDs encode their type: `notebook:uuid`, `source:uuid`, `note:uuid`.
### Smart Fallback Logic
`provision_langchain_model()` auto-detects large contexts (105K+ tokens) and upgrades to dedicated large_context_model. Falls back to default_chat_model if specific type not found.
### Fire-and-Forget Jobs
Time-consuming operations (embedding, podcast generation) return command_id immediately. Caller polls surreal-commands for status; no blocking.
### Embedding on Save
Domain models with `needs_embedding()=True` auto-generate embeddings in `save()`. Search functions (text_search, vector_search) use embeddings for semantic matching.
### Relationship Management
SurrealDB graph edges link entities: Notebook→Source (has), Source→Note (artifact), Note→Source (refers_to). See `relate()` in domain/base.py.
## Integration Points
**API startup** (`api/main.py`):
- AsyncMigrationManager.run_migration_up() on lifespan startup
- Ensures schema is current before handling requests
**Streamlit UI** (`pages/stream_app/`):
- Calls domain models directly to fetch/create notebooks, sources, notes
- Invokes graphs (chat, source, ask) via async wrapper
- Relies on API for migrations (deprecated check in UI)
**Background Jobs** (`surreal_commands`):
- Source.vectorize() submits async embedding job
- PodcastEpisode.get_job_status() polls job queue
- Decouples long-running operations from request flow
## Important Quirks & Gotchas
1. **Token counting rough estimate**: Uses cl100k_base encoding; may differ 5-10% from actual model
2. **Large context threshold hard-coded**: 105,000 token limit for large_context_model upgrade (not configurable)
3. **Async loop gymnastics in graphs**: ThreadPoolExecutor workaround for LangGraph sync nodes calling async functions (fragile)
4. **DefaultModels always fresh**: get_instance() bypasses singleton cache to pick up live config changes
5. **Polymorphic model.get()**: Resolves subclass from ID prefix; fails silently if subclass not imported
6. **RecordID string inconsistency**: repo_update() accepts both "table:id" format and full RecordID
7. **Snapshot profiles**: podcast profiles stored as dicts, so config updates don't affect past episodes
8. **No connection pooling**: Each repo_* creates new connection (adequate for HTTP but inefficient for bulk)
9. **Circular import guard**: utils imports domain; domain must not import utils (breaks on import)
10. **SqliteSaver shared location**: LangGraph checkpoints from LANGGRAPH_CHECKPOINT_FILE env var; all graphs use same file
## How to Add a New Feature
**New data model**:
1. Create class inheriting from `ObjectModel` with `table_name` ClassVar
2. Define Pydantic fields and validators
3. Override `needs_embedding()` if searchable
4. Add custom methods for domain logic (get_X, add_to_Y)
5. Register in domain/__init__.py exports
**New workflow**:
1. Create state machine in graphs/WORKFLOW.py using StateGraph
2. Import domain models and provision_langchain_model()
3. Define nodes as async functions taking State, returning dict
4. Compile with graph.compile()
5. Invoke from API endpoint or Streamlit page
**New AI model type**:
1. Add type string to Model class
2. Add AIFactory.create_* method in Esperanto
3. Handle in ModelManager.get_model()
4. Add DefaultModels field + getter
## Key Dependencies
- **surrealdb**: AsyncSurreal client, RecordID type
- **pydantic**: Validation, field_validator
- **langgraph**: StateGraph, Send, SqliteSaver, async/sync bridging
- **langchain_core**: Messages, OutputParser, RunnableConfig
- **esperanto**: Multi-provider AI model abstraction (OpenAI, Anthropic, Google, Groq, Ollama, etc.)
- **content-core**: File/URL content extraction
- **ai_prompter**: Jinja2 template rendering for prompts
- **surreal_commands**: Async job queue for embeddings, podcast generation
- **loguru**: Structured logging throughout
- **tiktoken**: GPT token encoding for context window estimation
## Codebase Statistics
- **Modules**: 6 core layers + support services
- **Async operations**: Database, AI provisioning, graph execution, embedding, job tracking
- **Supported AI providers**: 8+ (OpenAI, Anthropic, Google, Groq, Ollama, Mistral, DeepSeek, xAI, OpenRouter)
- **Domain models**: Notebook, Source, Note, SourceInsight, SourceEmbedding, ChatSession, Asset, Transformation, ContentSettings, EpisodeProfile, SpeakerProfile, PodcastEpisode
- **Graph workflows**: 6 (source, chat, source_chat, ask, transformation, prompt)

109
open_notebook/ai/CLAUDE.md Normal file

@ -0,0 +1,109 @@
# AI Module
Model configuration, provisioning, and management for multi-provider AI integration via Esperanto.
## Purpose
Centralizes AI model lifecycle: database models for model metadata (provider, type), default model configuration, and factory for instantiating LLM/embedding/speech models at runtime with fallback logic.
## Architecture Overview
**Two-tier system**:
1. **Database models** (`Model`, `DefaultModels`): Metadata storage and default configuration
2. **ModelManager**: Factory for provisioning models with intelligent fallback (large context detection, config override)
All models use Esperanto library as provider abstraction (OpenAI, Anthropic, Google, Groq, Ollama, Mistral, DeepSeek, xAI, OpenRouter).
## Component Catalog
### models.py
#### Model (ObjectModel)
- Database record: name, provider, type (language/embedding/speech_to_text/text_to_speech)
- `get_models_by_type()`: Async query to fetch all models of a specific type
- Stores provider-model pairs for AI factory instantiation
#### DefaultModels (RecordModel)
- Singleton configuration record (record_id: `open_notebook:default_models`)
- Fields: default_chat_model, default_transformation_model, large_context_model, default_text_to_speech_model, default_speech_to_text_model, default_embedding_model, default_tools_model
- `get_instance()`: Always fetches a fresh instance from the database on each call (overrides parent caching for real-time updates; no singleton cache)
#### ModelManager
- Stateless factory for instantiating AI models
- `get_model(model_id)`: Retrieves Model by ID, creates via AIFactory.create_* based on type
- `get_defaults()`: Fetches DefaultModels configuration
- `get_default_model(model_type)`: Smart lookup (e.g., "chat" → default_chat_model, "transformation" → default_transformation_model with fallback to chat)
- `get_speech_to_text()`, `get_text_to_speech()`, `get_embedding_model()`: Type-specific convenience methods with assertions
- **Global instance**: `model_manager` singleton exported for use throughout app
### provision.py
#### provision_langchain_model()
- Factory for LangGraph nodes needing LLM provisioning
- **Smart fallback logic**:
  - If tokens > 105,000: Use `large_context_model`
  - Elif `model_id` specified: Use specific model
  - Else: Use default model for type (e.g., "chat", "transformation")
- Returns LangChain-compatible model via `.to_langchain()`
- Logs model selection decision
## Common Patterns
- **Type dispatch**: Model.type field drives factory logic (4 model types)
- **Provider abstraction**: Esperanto handles provider differences; ModelManager unaware of provider specifics
- **Fresh defaults**: DefaultModels.get_instance() always fetches from database (not cached) for live config updates
- **Config override**: provision_langchain_model() accepts kwargs passed to AIFactory.create_* methods
- **Token-based selection**: provision_langchain_model() detects large contexts and upgrades model automatically
- **Type assertions**: get_speech_to_text(), get_embedding_model() assert returned type (safety check)
## Key Dependencies
- `esperanto`: AIFactory.create_language(), create_embedding(), create_speech_to_text(), create_text_to_speech()
- `open_notebook.database.repository`: repo_query, ensure_record_id
- `open_notebook.domain.base`: ObjectModel, RecordModel base classes
- `open_notebook.utils`: token_count() for context size detection
- `loguru`: Logging for model selection decisions
## Important Quirks & Gotchas
- **Token counting rough estimate**: provision_langchain_model() uses token_count() which estimates via cl100k_base encoding (may differ 5-10% from actual model)
- **Large context threshold hard-coded**: 105,000 token threshold for large_context_model upgrade (not configurable)
- **DefaultModels.get_instance() fresh fetch**: Intentionally bypasses parent singleton cache to pick up live config changes; creates new instance each call
- **Type-specific getters use assertions**: get_speech_to_text() asserts isinstance (catches misconfiguration early)
- **Missing models raise**: ModelManager.get_model() raises ValueError if the model is not found, and nothing upstream catches it
- **Esperanto caching**: Actual model instances cached by Esperanto (not by ModelManager); ModelManager stateless
- **Fallback chain specificity**: "transformation" type falls back to default_chat_model if not explicitly set (convention-based)
- **kwargs passed through**: provision_langchain_model() passes kwargs to AIFactory but doesn't validate what's accepted
## How to Extend
1. **Add new model type**: Add type string to Model.type enum, add create_* method in AIFactory, handle in ModelManager.get_model()
2. **Add new default configuration**: Extend DefaultModels with new field (e.g., default_vision_model), add getter in ModelManager
3. **Change fallback logic**: Modify provision_langchain_model() token threshold or fallback chain
4. **Add model filtering**: Extend Model.get_models_by_type() with additional filters (e.g., by provider)
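5. **Implement model caching**: Add a cache around ModelManager methods. These methods are awaited, so plain `functools.lru_cache` is unsuitable (it would cache the coroutine object, which can only be awaited once); use an async-aware cache and be aware of kwargs mutability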
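Caching (item 5) needs care because the ModelManager methods are awaited in this document's usage: `functools.lru_cache` applied to an `async def` stores the coroutine object, and a coroutine cannot be awaited twice. A minimal async-aware alternative, assuming a single string key and ignoring kwargs, might look like this (`async_cached` is a hypothetical helper, not part of the codebase):

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict


def async_cached(func: Callable[[str], Awaitable[Any]]) -> Callable[[str], Awaitable[Any]]:
    """Cache awaited results (not coroutine objects), keyed by one string argument."""
    cache: Dict[str, Any] = {}

    async def wrapper(key: str) -> Any:
        if key not in cache:
            # Await first, then store the finished result.
            cache[key] = await func(key)
        return cache[key]

    return wrapper
```

This sketch does not deduplicate concurrent first calls for the same key, and it never expires entries, so it would work against the "fresh defaults" behavior if applied to DefaultModels.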
## Usage Example
```python
from open_notebook.ai.models import model_manager
# Get default chat model
chat_model = await model_manager.get_default_model("chat")
# Get specific model by ID
embedding_model = await model_manager.get_model("model:openai_embedding")
# Get embedding model with config override
embedding_model = await model_manager.get_embedding_model(temperature=0.1)
# Provision model for LangGraph (auto-detects large context)
from open_notebook.ai.provision import provision_langchain_model
langchain_model = await provision_langchain_model(
    content=long_text,
    model_id=None,  # Use default
    default_type="chat",
    temperature=0.7
)
```

@ -0,0 +1,124 @@
# Database Module
SurrealDB abstraction layer providing repository pattern for CRUD operations and async migration management.
## Purpose
Encapsulates all database interactions: connection management, async CRUD operations, relationship management, and schema migrations. Provides a clean interface for domain models and API endpoints to interact with SurrealDB without direct query knowledge.
## Architecture Overview
Two-tier system:
1. **Repository Layer** (repository.py): Raw async CRUD operations on SurrealDB via AsyncSurreal client
2. **Migration Layer** (async_migrate.py): Schema versioning and migration execution
Both leverage connection context manager for lifecycle management and automatic cleanup.
## Component Catalog
### repository.py
**Connection Management**
- `get_database_url()`: Resolves `SURREAL_URL` or constructs from `SURREAL_ADDRESS`/`SURREAL_PORT` (backward compatible)
- `get_database_password()`: Falls back from `SURREAL_PASSWORD` to legacy `SURREAL_PASS` env var
- `db_connection()`: Async context manager handling sign-in, namespace/database selection, and cleanup
  - Opens AsyncSurreal, authenticates, selects namespace/database, yields connection, closes on exit
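The environment fallbacks described above can be sketched as follows. The environment variable names come from this document; the function names, the default address/port, and the `ws://.../rpc` URL shape are assumptions made for illustration and may not match repository.py.

```python
import os
from typing import Mapping, Optional


def resolve_database_url(env: Mapping[str, str] = os.environ) -> str:
    """Prefer SURREAL_URL; otherwise build a URL from the legacy variables."""
    url = env.get("SURREAL_URL")
    if url:
        return url
    # Assumed defaults and URL shape; the real function may differ.
    address = env.get("SURREAL_ADDRESS", "localhost")
    port = env.get("SURREAL_PORT", "8000")
    return f"ws://{address}:{port}/rpc"


def resolve_database_password(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Prefer SURREAL_PASSWORD; fall back to the legacy SURREAL_PASS."""
    return env.get("SURREAL_PASSWORD") or env.get("SURREAL_PASS")
```

Passing the environment as a mapping keeps the helpers easy to test without mutating `os.environ`.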
**Query Operations**
- `repo_query(query_str, vars)`: Execute raw SurrealQL with parameter substitution; returns list of dicts
- `repo_create(table, data)`: Insert record; auto-adds `created`/`updated` timestamps; removes any existing `id` field
- `repo_insert(table, data_list, ignore_duplicates)`: Bulk insert multiple records; optionally ignores "already contains" errors
- `repo_upsert(table, id, data, add_timestamp)`: MERGE operation for create-or-update; optionally adds `updated` timestamp
- `repo_update(table, id, data)`: Update existing record by table+id or full record_id; auto-adds `updated`, parses ISO dates
- `repo_delete(record_id)`: Delete record by RecordID
- `repo_relate(source, relationship, target, data)`: Create graph relationship; optional relationship data
**Utilities**
- `parse_record_ids(obj)`: Recursively converts SurrealDB RecordID objects to strings (deep tree traversal)
- `ensure_record_id(value)`: Coerces string or RecordID to RecordID type
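The recursive conversion performed by `parse_record_ids()` can be illustrated with a stand-in type. `FakeRecordID` is a hypothetical substitute for the `surrealdb` RecordID class so the sketch runs without the driver, and `parse_record_ids_sketch()` is an illustration rather than the repository's implementation.

```python
from typing import Any


class FakeRecordID:
    """Hypothetical stand-in for surrealdb's RecordID."""

    def __init__(self, table: str, key: str) -> None:
        self.table = table
        self.key = key

    def __str__(self) -> str:
        return f"{self.table}:{self.key}"


def parse_record_ids_sketch(obj: Any) -> Any:
    """Walk dicts and lists, replacing record IDs with their string form."""
    if isinstance(obj, dict):
        return {k: parse_record_ids_sketch(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [parse_record_ids_sketch(v) for v in obj]
    if isinstance(obj, FakeRecordID):
        return str(obj)
    return obj
```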
### async_migrate.py
**Migration Classes**
- `AsyncMigration`: Single migration wrapper
  - `from_file(path)`: Load .surrealql file; strips comments and whitespace
  - `run(bump)`: Execute SQL; call bump_version() on success (bump=True) or lower_version() (bump=False)
- `AsyncMigrationRunner`: Sequences multiple migrations
  - `run_all()`: Execute pending migrations from current_version to end
  - `run_one_up()`: Run next migration
  - `run_one_down()`: Rollback latest migration
- `AsyncMigrationManager`: Main orchestrator
  - Loads 9 up migrations + 9 down migrations (hard-coded in __init__)
  - `get_current_version()`: Query max version from _sbl_migrations table
  - `needs_migration()`: Boolean check (current < total migrations available)
  - `run_migration_up()`: Run all pending migrations with logging
**Version Tracking**
- `get_latest_version()`: Query max version; returns 0 if _sbl_migrations table missing
- `get_all_versions()`: Fetch all migration records; returns empty list on error
- `bump_version()`: INSERT new entry into _sbl_migrations with version + applied_at timestamp
- `lower_version()`: DELETE latest migration record (rollback)
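The version bookkeeping above reduces to list slicing if version N is taken to mean "migrations 1 through N have been applied". That interpretation, and the helper names below, are assumptions made for this sketch.

```python
from typing import List


def pending_migrations(current_version: int, migrations: List[str]) -> List[str]:
    """Return the migrations that still need to run, in order."""
    # Version 0 (no tracking table yet) yields the full list.
    return migrations[current_version:]


def needs_migration(current_version: int, migrations: List[str]) -> bool:
    """True when fewer migrations were applied than are available."""
    return current_version < len(migrations)
```

A database at version 7 with nine migration files would therefore run only the last two.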
### migrate.py
**Backward Compatibility**
- `MigrationManager`: Sync wrapper around AsyncMigrationManager
  - `get_current_version()`: Wraps async call with asyncio.run()
  - `needs_migration` property: Checks if migration pending
  - `run_migration_up()`: Execute migrations synchronously
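The sync wrapper pattern can be sketched with two small classes. Both class names and the placeholder version value are hypothetical; only the use of `asyncio.run()` comes from the description above.

```python
import asyncio


class AsyncManagerSketch:
    """Hypothetical async manager returning a fixed version."""

    async def get_current_version(self) -> int:
        return 9  # placeholder value for illustration


class SyncManagerSketch:
    """Sync facade that drives the async manager to completion."""

    def __init__(self) -> None:
        self._inner = AsyncManagerSketch()

    def get_current_version(self) -> int:
        # asyncio.run() creates a new event loop; it raises RuntimeError
        # when called from a thread that already has a running loop.
        return asyncio.run(self._inner.get_current_version())
```

Because of that restriction, a wrapper like this suits plain synchronous callers and should not be called from inside async code.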
## Common Patterns
- **Async-first design**: All operations async via AsyncSurreal; sync wrapper provided for legacy code
- **Connection per operation**: Each repo_* function opens/closes connection (no pooling); designed for serverless/stateless API
- **Auto-timestamping**: repo_create() and repo_update() auto-set `created`/`updated` fields
- **Error handling**: Transaction conflicts surface as RuntimeError (retriable); other exceptions are caught and re-raised
- **RecordID polymorphism**: Functions accept string or RecordID; coerced to consistent type
- **Graceful degradation**: Migration queries catch exceptions and treat table-not-found as version 0
## Key Dependencies
- `surrealdb`: AsyncSurreal client, RecordID type
- `loguru`: Logging with context (debug/error/success levels)
- Python stdlib: `os` (env vars), `datetime` (timestamps), `contextlib` (async context manager)
## Important Quirks & Gotchas
- **No connection pooling**: Each repo_* operation creates new connection; adequate for HTTP request-scoped operations but inefficient for bulk workloads
- **Hard-coded migration files**: AsyncMigrationManager lists migrations 1-9 explicitly; adding new migration requires code change (not auto-discovery)
- **Record ID format inconsistency**: repo_update() accepts both `table:id` format and full RecordID; path handling can be subtle
- **ISO date parsing**: repo_update() parses `created` field from string to datetime if present; assumes ISO format
- **Timestamp overwrite risk**: repo_create() always sets new timestamps; can't preserve original created time on reimport
- **Transaction conflict handling**: RuntimeError from transaction conflicts logged without stack trace (prevents log spam)
- **Graceful null returns**: get_all_versions() returns [] on table missing; allows migration system to bootstrap cleanly
## How to Extend
1. **Add new CRUD operation**: Follow repo_* pattern (open connection, execute query, handle errors, close)
2. **Add migration**: Create migration file in `/migrations/N.surrealql` and `/migrations/N_down.surrealql`; update AsyncMigrationManager to load new files
3. **Change timestamp behavior**: Modify repo_create()/repo_update() to not auto-set `updated` field if caller-provided
4. **Implement connection pooling**: Replace db_connection context manager with pool.acquire() pattern (for high-throughput scenarios)
## Integration Points
- **API startup** (api/main.py): FastAPI lifespan handler calls AsyncMigrationManager.run_migration_up() on server start
- **Domain models** (domain/*.py): All models call repo_* functions for persistence
- **Commands** (commands/*.py): Background jobs use repo_* for state updates
- **Streamlit UI** (pages/*.py): Deprecated migration check; relies on API to run migrations
## Usage Example
```python
from open_notebook.database.repository import repo_create, repo_query, repo_update
# Create
record = await repo_create("notebook", {"name": "Research"})
# Query
results = await repo_query("SELECT * FROM notebook WHERE name = $name", {"name": "Research"})
# Update
await repo_update("notebook", record["id"], {"name": "Updated Research"})
```

@ -0,0 +1,100 @@
# Domain Module
Core data models for notebooks, sources, notes, and settings with async SurrealDB persistence, auto-embedding, and relationship management.
## Purpose
Two base classes support different persistence patterns: **ObjectModel** (mutable records with auto-generated IDs) and **RecordModel** (singleton configuration with fixed IDs).
## Key Components
### base.py
- **ObjectModel**: Base for notebooks, sources, notes
  - `save()`: Create/update with auto-embedding for searchable content
  - `delete()`: Remove by ID
  - `relate(relationship, target_id)`: Create graph relationships (reference, artifact, refers_to)
  - `get(id)`: Polymorphic fetch; resolves subclass from ID prefix
  - `get_all(order_by)`: Fetch all records from table
  - Integrates with ModelManager for automatic embedding
- **RecordModel**: Singleton configuration (ContentSettings, DefaultPrompts)
  - Fixed record_id per subclass
  - `update()`: Upsert to database
  - Lazy DB loading via `_load_from_db()`
### notebook.py
- **Notebook**: Research project container
  - `get_sources()`, `get_notes()`, `get_chat_sessions()`: Navigate relationships
- **Source**: Content item (file/URL)
  - `vectorize()`: Submit async embedding job (returns command_id, fire-and-forget)
  - `get_status()`, `get_processing_progress()`: Track job via surreal_commands
  - `get_context()`: Returns summary for LLM context
  - `add_insight()`: Generate and store insights with embeddings
- **Note**: Standalone or linked notes
  - `needs_embedding()`: Always True (searchable)
  - `add_to_notebook()`: Link to notebook
- **SourceInsight, SourceEmbedding**: Derived content models
- **ChatSession**: Conversation container with optional model_override
- **Asset**: File/URL reference helper
- **Search functions**:
  - `text_search()`: Full-text keyword search
  - `vector_search()`: Semantic search via embeddings (default minimum_score=0.2)
### content_settings.py
- **ContentSettings**: Singleton for processing engines, embedding strategy, file deletion, YouTube languages
### transformation.py
- **Transformation**: Reusable prompts for content transformation
- **DefaultPrompts**: Singleton with transformation instructions
## Important Patterns
- **Async/await**: All DB operations async; always use await
- **Polymorphic get()**: `ObjectModel.get(id)` determines subclass from ID prefix (table:id format)
- **Auto-embedding**: `save()` generates embeddings if `needs_embedding()` returns True
- **Nullable fields**: Declare via `nullable_fields` ClassVar to allow None in database
- **Timestamps**: `created` and `updated` auto-managed as ISO strings
- **Fire-and-forget jobs**: `source.vectorize()` returns command_id without waiting
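The polymorphic `get()` pattern above can be sketched with a table-prefix lookup over loaded subclasses. The class names here are hypothetical, and the use of `__subclasses__()` is an assumption about how the lookup might work; the real base class may resolve subclasses differently.

```python
from typing import Type


class BaseSketch:
    """Hypothetical base class with a per-subclass table name."""

    table_name: str = ""

    @classmethod
    def resolve(cls, record_id: str) -> Type["BaseSketch"]:
        """Pick the subclass whose table matches the 'table:id' prefix."""
        table = record_id.split(":", 1)[0]
        # Only direct subclasses that have been imported are visible here.
        for sub in cls.__subclasses__():
            if sub.table_name == table:
                return sub
        raise ValueError(f"No model found for table '{table}'")


class NotebookSketch(BaseSketch):
    table_name = "notebook"


class NoteSketch(BaseSketch):
    table_name = "note"
```

A lookup like this only sees classes that have already been defined in the running process, which is why an unimported subclass cannot be resolved.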
## Key Dependencies
- `surrealdb`: RecordID type for relationships
- `pydantic`: Validation and field_validator decorators
- `open_notebook.database.repository`: CRUD and relationship functions
- `open_notebook.ai.models`: ModelManager for embeddings
- `surreal_commands`: Async job submission (vectorization, insights)
- `loguru`: Logging
## Quirks & Gotchas
- **Polymorphic resolution**: `ObjectModel.get()` resolves the class by searching the list of subclasses, so it fails if the target subclass has not been imported
- **RecordModel singleton**: __new__ returns existing instance; call `clear_instance()` in tests
- **Source.command field**: Stored as RecordID; auto-parsed from strings via field_validator
- **Text truncation**: `Note.get_context(short)` hardcodes 100-char limit
- **Embedding on save**: Only Note and SourceInsight embed during `save()`; Source content is too large, so it is embedded through an async job (`vectorize()`)
- **Relationship strings**: Must match SurrealDB schema (reference, artifact, refers_to)
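The RecordModel singleton quirk (`__new__` returning an existing instance, `clear_instance()` for tests) can be sketched as below. The class names and the per-class instance registry are assumptions for illustration; only the `__new__` behavior and the `clear_instance()` name come from this document.

```python
from typing import ClassVar, Dict


class SingletonSketch:
    """Hypothetical base: one shared instance per subclass."""

    _instances: ClassVar[Dict[type, "SingletonSketch"]] = {}

    def __new__(cls):
        if cls not in cls._instances:
            cls._instances[cls] = super().__new__(cls)
        return cls._instances[cls]

    @classmethod
    def clear_instance(cls) -> None:
        """Drop the cached instance so the next call builds a fresh one."""
        cls._instances.pop(cls, None)


class SettingsSketch(SingletonSketch):
    pass
```

In tests, calling `clear_instance()` between cases prevents state from one test leaking into the next through the shared instance.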
## How to Add New Model
1. Inherit from ObjectModel with table_name ClassVar
2. Define Pydantic fields with validators
3. Override `needs_embedding()` if searchable
4. Add custom methods for domain logic (get_X, add_to_Y)
5. Implement `_prepare_save_data()` if custom serialization needed
## Usage
```python
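from open_notebook.domain.base import ObjectModel
from open_notebook.domain.notebook import Notebook, text_search, vector_search

notebook = Notebook(name="Research", description="My project")
await notebook.save()
obj = await ObjectModel.get("notebook:123")  # Polymorphic fetch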
# Search
await text_search("quantum", results=5)
await vector_search("quantum computing", results=10, minimum_score=0.3)
```

@ -0,0 +1,61 @@
# Graphs Module
LangGraph-based workflow orchestration for content processing, chat interactions, and AI-powered transformations.
## Key Components
- **`chat.py`**: Conversational agent with message history, notebook context, and model override support
- **`source_chat.py`**: Source-focused chat with ContextBuilder for insights/content injection and context tracking
- **`ask.py`**: Multi-search strategy agent (generates search terms, retrieves results, synthesizes answers)
- **`source.py`**: Content ingestion pipeline (extract → save → transform with content-core)
- **`transformation.py`**: Single-node transformation executor with prompt templating via ai_prompter
- **`prompt.py`**: Generic pattern chain for arbitrary prompt-based LLM calls
- **`tools.py`**: Minimal tool library (currently just `get_current_timestamp()`)
## Important Patterns
- **Async/sync bridging in graphs**: Both `chat.py` and `source_chat.py` use `asyncio.new_event_loop()` workaround because LangGraph nodes are sync but `provision_langchain_model()` is async
- **State machines via StateGraph**: Each graph compiles to stateful runnable; conditional edges fan out work (ask.py, source.py do parallel transforms)
- **Prompt templating**: `ai_prompter.Prompter` with Jinja2 templates referenced by path ("chat/system", "ask/entry", etc.)
- **Model provisioning via context**: Config dict passed to node via `RunnableConfig`; when the config sets no model, the node falls back to the override carried in state
- **Checkpointing**: `chat.py` and `source_chat.py` use SqliteSaver for message history (LangGraph's built-in persistence)
- **Content extraction**: `source.py` uses content-core library with provider/model from DefaultModels; URLs and files both supported
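The async/sync bridging pattern above can be sketched with the standard library alone. `provision_sketch()` is a hypothetical stand-in for the async provisioning call, and `run_async_from_sync()` is an illustration of the general technique (a fresh event loop in a worker thread), not a copy of what `chat.py` does.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def provision_sketch(name: str) -> str:
    """Hypothetical async call standing in for model provisioning."""
    await asyncio.sleep(0)
    return f"model-for-{name}"


def run_async_from_sync(coro_factory):
    """Run a coroutine to completion from sync code, even inside a running loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so asyncio.run() is safe.
        return asyncio.run(coro_factory())

    def _worker():
        # A loop is already running here; use a new loop in another thread.
        loop = asyncio.new_event_loop()
        try:
            return loop.run_until_complete(coro_factory())
        finally:
            loop.close()

    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(_worker).result()
```

The calling thread blocks until the worker finishes, so inside an async application this stalls the outer event loop for the duration of the call.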
## Quirks & Edge Cases
- **Async loop gymnastics**: ThreadPoolExecutor workaround needed because LangGraph invokes sync nodes but we call async functions; fragile if event loop state changes
- **`clean_thinking_content()` ubiquitous**: Strips `<think>...</think>` tags from model responses (handles extended thinking models)
- **source_chat.py builds context twice**: ContextBuilder runs during node execution to fetch source/insights; rebuilds list from context_data (inefficient but safe)
- **source.py embedding is async**: `source.vectorize()` returns job command ID; not awaited (fire-and-forget)
- **transformation.py nullable source**: Accepts `input_text` or `source.full_text` (falls back to second if first missing)
- **ask.py hard-coded vector_search**: No fallback to text search despite commented code suggesting it was planned
- **SqliteSaver location**: Checkpoints stored in path from `LANGGRAPH_CHECKPOINT_FILE` env var; connection shared across graphs
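The tag stripping attributed to `clean_thinking_content()` above can be approximated with one regular expression. `strip_think_tags()` is a hypothetical name, and details such as handling of unclosed tags are not covered by this sketch.

```python
import re

# Non-greedy match so several separate <think> blocks are each removed.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_think_tags(text: str) -> str:
    """Remove <think>...</think> blocks and trim surrounding whitespace."""
    return _THINK_BLOCK.sub("", text).strip()
```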
## Key Dependencies
- `langgraph`: StateGraph, Send, END, START, SqliteSaver checkpoint persistence
- `langchain_core`: Messages, OutputParser, RunnableConfig
- `ai_prompter`: Prompter for Jinja2 template rendering
- `content_core`: `extract_content()` for file/URL processing
- `open_notebook.ai.provision`: `provision_langchain_model()` (async factory with fallback logic)
- `open_notebook.domain.notebook`: Domain models (Source, Note, SourceInsight, vector_search)
- `loguru`: Logging
## Usage Example
```python
from langchain_core.messages import HumanMessage

# Invoke a graph with config override
# Graphs compiled with a checkpointer need a thread_id; the value here is illustrative
config = {"configurable": {"model_id": "model:custom_id", "thread_id": "session-1"}}
result = await chat_graph.ainvoke(
    {"messages": [HumanMessage(content="...")], "notebook": notebook},
    config=config
)
# Source processing (content → save → transform)
result = await source_graph.ainvoke({
    "content_state": {...},  # ProcessSourceState from content-core
    "apply_transformations": [t1, t2],
    "source_id": "source:123",
    "embed": True
})
```

@ -1,293 +0,0 @@
from typing import ClassVar, List, Optional
from loguru import logger
from podcastfy.client import generate_podcast
from pydantic import Field, field_validator, model_validator
from open_notebook.config import DATA_FOLDER
from open_notebook.domain.notebook import ObjectModel
class PodcastEpisode(ObjectModel):
    table_name: ClassVar[str] = "podcast_episode"
    name: str
    template: str
    instructions: str
    text: str
    audio_file: str


class PodcastConfig(ObjectModel):
    table_name: ClassVar[str] = "podcast_config"
    name: str
    podcast_name: str
    podcast_tagline: str
    output_language: str = Field(default="English")
    person1_role: List[str]
    person2_role: List[str]
    conversation_style: List[str]
    engagement_technique: List[str]
    dialogue_structure: List[str]
    transcript_model: Optional[str] = None
    transcript_model_provider: Optional[str] = None
    user_instructions: Optional[str] = None
    ending_message: Optional[str] = None
    creativity: float = Field(ge=0, le=1)
    provider: str = Field(default="openai")
    voice1: str
    voice2: str
    model: str

    # Backwards compatibility
    @field_validator("person1_role", "person2_role", mode="before")
    @classmethod
    def split_string_to_list(cls, value):
        if isinstance(value, str):
            return [item.strip() for item in value.split(",")]
        return value

    @model_validator(mode="after")
    def validate_voices(self) -> "PodcastConfig":
        if not self.voice1 or not self.voice2:
            raise ValueError("Both voice1 and voice2 must be provided")
        return self

    async def generate_episode(
        self,
        episode_name: str,
        text: str,
        instructions: str = "",
        longform: bool = False,
        chunks: int = 8,
        min_chunk_size=600,
    ):
        self.user_instructions = (
            instructions if instructions else self.user_instructions
        )
        conversation_config = {
            "max_num_chunks": chunks,
            "min_chunk_size": min_chunk_size,
            "conversation_style": self.conversation_style,
            "roles_person1": self.person1_role,
            "roles_person2": self.person2_role,
            "dialogue_structure": self.dialogue_structure,
            "podcast_name": self.podcast_name,
            "podcast_tagline": self.podcast_tagline,
            "output_language": self.output_language,
            "user_instructions": self.user_instructions,
            "engagement_techniques": self.engagement_technique,
            "creativity": self.creativity,
            "text_to_speech": {
                "output_directories": {
                    "transcripts": f"{DATA_FOLDER}/podcasts/transcripts",
                    "audio": f"{DATA_FOLDER}/podcasts/audio",
                },
                "temp_audio_dir": f"{DATA_FOLDER}/podcasts/audio/tmp",
                "ending_message": "Thank you for listening to this episode. Don't forget to subscribe to our podcast for more interesting conversations.",
                "default_tts_model": self.provider,
                self.provider: {
                    "default_voices": {
                        "question": self.voice1,
                        "answer": self.voice2,
                    },
                    "model": self.model,
                },
                "audio_format": "mp3",
            },
        }
        api_key_label = None
        llm_model_name = None
        tts_model = None
        if self.transcript_model_provider:
            if self.transcript_model_provider == "openai":
                api_key_label = "OPENAI_API_KEY"
                llm_model_name = self.transcript_model
            elif self.transcript_model_provider == "anthropic":
                api_key_label = "ANTHROPIC_API_KEY"
                llm_model_name = self.transcript_model
            elif self.transcript_model_provider == "gemini":
                api_key_label = "GOOGLE_API_KEY"
                llm_model_name = self.transcript_model
        if self.provider == "google":
            tts_model = "gemini"
        elif self.provider == "openai":
            tts_model = "openai"
        elif self.provider == "anthropic":
            tts_model = "anthropic"
        elif self.provider == "vertexai":
            tts_model = "geminimulti"
        elif self.provider == "elevenlabs":
            tts_model = "elevenlabs"
        logger.info(
            f"Generating episode {episode_name} with config {conversation_config} and using model {llm_model_name}, tts model {tts_model}"
        )
        try:
            audio_file = generate_podcast(
                conversation_config=conversation_config,
                text=text,
                tts_model=tts_model,
                llm_model_name=llm_model_name,
                api_key_label=api_key_label,
                longform=longform,
            )
            episode = PodcastEpisode(
                name=episode_name,
                template=self.name,
                instructions=instructions,
                text=str(text),
                audio_file=audio_file,
            )
            await episode.save()
        except Exception as e:
            logger.error(f"Failed to generate episode {episode_name}: {e}")
            raise

    @field_validator(
        "name", "podcast_name", "podcast_tagline", "output_language", "model"
    )
    @classmethod
    def validate_required_strings(cls, value: str, field) -> str:
        if value is None or value.strip() == "":
            raise ValueError(f"{field.field_name} cannot be None or empty string")
        return value.strip()

    @field_validator("creativity")
    def validate_creativity(cls, value):
        if not 0 <= value <= 1:
            raise ValueError("Creativity must be between 0 and 1")
        return value


conversation_styles = [
"Analytical",
"Argumentative",
"Informative",
"Humorous",
"Casual",
"Formal",
"Inspirational",
"Debate-style",
"Interview-style",
"Storytelling",
"Satirical",
"Educational",
"Philosophical",
"Speculative",
"Motivational",
"Fun",
"Technical",
"Light-hearted",
"Serious",
"Investigative",
"Debunking",
"Didactic",
"Thought-provoking",
"Controversial",
"Sarcastic",
"Emotional",
"Exploratory",
"Fast-paced",
"Slow-paced",
"Introspective",
]
# Dialogue Structures
dialogue_structures = [
"Topic Introduction",
"Opening Monologue",
"Guest Introduction",
"Icebreakers",
"Historical Context",
"Defining Terms",
"Problem Statement",
"Overview of the Issue",
"Deep Dive into Subtopics",
"Pro Arguments",
"Con Arguments",
"Cross-examination",
"Expert Interviews",
"Case Studies",
"Myth Busting",
"Q&A Session",
"Rapid-fire Questions",
"Summary of Key Points",
"Recap",
"Key Takeaways",
"Actionable Tips",
"Call to Action",
"Future Outlook",
"Closing Remarks",
"Resource Recommendations",
"Trending Topics",
"Closing Inspirational Quote",
"Final Reflections",
]
# Podcast Participant Roles
participant_roles = [
"Main Summarizer",
"Questioner/Clarifier",
"Optimist",
"Skeptic",
"Specialist",
"Thesis Presenter",
"Counterargument Provider",
"Professor",
"Student",
"Moderator",
"Host",
"Co-host",
"Expert Guest",
"Novice",
"Devil's Advocate",
"Analyst",
"Storyteller",
"Fact-checker",
"Comedian",
"Interviewer",
"Interviewee",
"Historian",
"Visionary",
"Strategist",
"Critic",
"Enthusiast",
"Mediator",
"Commentator",
"Researcher",
"Reporter",
"Advocate",
"Debater",
"Explorer",
]
# Engagement Techniques
engagement_techniques = [
"Rhetorical Questions",
"Anecdotes",
"Analogies",
"Humor",
"Metaphors",
"Storytelling",
"Quizzes",
"Personal Testimonials",
"Quotes",
"Jokes",
"Emotional Appeals",
"Provocative Statements",
"Sarcasm",
"Pop Culture References",
"Thought Experiments",
"Puzzles and Riddles",
"Role-playing",
"Debates",
"Catchphrases",
"Statistics and Facts",
"Open-ended Questions",
"Challenges to Assumptions",
"Evoking Curiosity",
]

@ -0,0 +1,68 @@
# Podcasts Module
Domain models for podcast generation featuring speaker and episode profile management with job tracking.
## Purpose
Encapsulates podcast metadata and configuration: speaker profiles (voice/personality config), episode profiles (generation settings), and podcast episodes (with job status tracking via surreal-commands).
## Architecture Overview
Two profile types plus an episode record:
- **SpeakerProfile**: TTS provider/model + 1-4 speaker configurations (name, voice_id, backstory, personality)
- **EpisodeProfile**: Generation settings (outline/transcript models, segment count, briefing template)
- **PodcastEpisode**: Generated episode record linking profiles, content, and async job
All inherit from `ObjectModel` (SurrealDB base class with table_name and save/load).
## Component Catalog
### SpeakerProfile
- Validates 1-4 speakers with required fields: name, voice_id, backstory, personality
- Stores TTS provider/model (e.g., "elevenlabs", "openai")
- `get_by_name()` async query by profile name
- Raises ValueError on invalid speaker counts or missing fields
### EpisodeProfile
- Configures outline/transcript generation: provider, model, num_segments (3-20 validated)
- References speaker_config by name
- Stores default_briefing template for episode generation
- `get_by_name()` async query
### PodcastEpisode
- Stores episode_profile and speaker_profile as dicts (snapshots of config at generation time)
- Optional audio_file path, transcript/outline dicts
- **Job tracking**: command field links to surreal-commands RecordID
- `get_job_status()` fetches async job status via surreal-commands library
- `_prepare_save_data()` ensures command field is always RecordID format for database
## Common Patterns
- **Profile snapshots**: episode_profile and speaker_profile stored as dicts to freeze config at generation time
- **Field validation**: Pydantic validators enforce constraints (segment count, speaker count, required fields)
- **Async database access**: `get_by_name()` queries via repo_query
- **Job tracking**: command field delegates to surreal-commands; get_job_status() returns "unknown" on failure
- **Record ID handling**: ensure_record_id() converts string to RecordID before save
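The speaker constraints described for SpeakerProfile (1-4 speakers, four required fields) can be sketched as a plain function. The real module enforces these through Pydantic validators; this dependency-free version, its name, and its error messages are assumptions for illustration.

```python
from typing import Dict, List

REQUIRED_SPEAKER_FIELDS = ("name", "voice_id", "backstory", "personality")


def validate_speakers_sketch(speakers: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Raise ValueError on a bad speaker count or a missing/empty field."""
    if not 1 <= len(speakers) <= 4:
        raise ValueError("Must have between 1 and 4 speakers")
    for index, speaker in enumerate(speakers):
        for field in REQUIRED_SPEAKER_FIELDS:
            if not speaker.get(field):
                raise ValueError(f"Speaker {index} is missing required field: {field}")
    return speakers
```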
## Key Dependencies
- `pydantic`: Field validators; underlies ObjectModel
- `surrealdb`: RecordID type for job references
- `open_notebook.database.repository`: repo_query, ensure_record_id
- `open_notebook.domain.base`: ObjectModel base class
- `surreal_commands` (optional): get_command_status() for job status
## Important Quirks & Gotchas
- **Snapshot approach**: Episode/speaker profiles stored as dicts (not references), so profile updates don't retroactively affect past episodes
- **Job status resilience**: get_job_status() catches all exceptions and returns "unknown" (no error propagation)
- **Validation only at instantiation**: validate_speakers and the other validators run when a model is instantiated; bulk inserts that bypass model construction may not trigger full validation
- **RecordID coercion**: ensure_record_id() handles both string and RecordID inputs; command field parsed during deserialization
- **No cascade delete**: Removing a profile doesn't cascade to episodes using it
## How to Extend
1. **Add new speaker field**: Add to required_fields list in validate_speakers()
2. **Add episode config field**: Validate in EpisodeProfile, update briefing generation code
3. **Add job metadata**: Extend PodcastEpisode with new fields (e.g., progress tracking)
4. **Change job provider**: Replace surreal-commands with alternative job queue library; update get_job_status()

@ -0,0 +1,113 @@
# Utils Module
Utility functions and helpers for context building, text processing, tokenization, and versioning.
## Purpose
Provides cross-cutting concerns: building LLM context from sources/insights, text utilities (truncation, cleaning), token counting, and version management.
## Architecture Overview
**Four core utilities**:
1. **context_builder.py**: Flexible context assembly from sources, notes, insights with token budgeting
2. **text_utils.py**: Text truncation, whitespace cleaning, formatting helpers
3. **token_utils.py**: Token counting for LLM context windows (wrapper around encoding library)
4. **version_utils.py**: Version parsing, comparison, and schema compatibility checks
Each utility is stateless and can be imported independently.
## Component Catalog
### context_builder.py
- **ContextItem**: Dataclass for individual context piece (id, type, content, priority, token_count)
- **ContextConfig**: Configuration for context building (sources/notes/insights selection, max tokens, priority weights)
- **ContextBuilder**: Main class assembling context
  - `add_source()`: Include source by ID with inclusion level
  - `add_note()`: Include note by ID
  - `add_insight()`: Include insight by ID
  - `build()`: Assemble context respecting token budget and priorities
  - Uses vector_search to fetch source/insight content from SurrealDB
  - Returns list of ContextItem objects sorted by priority
**Key behavior**:
- Token counting is automatic (calculated in ContextItem.__post_init__)
- Max token enforcement via priority weighting (higher priority items included first)
- Type-specific fetching: sources → Source.full_text, notes → Note.content, insights → SourceInsight.content
- Raises DatabaseOperationError if source/note fetch fails
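The budget enforcement described above can be sketched as a greedy pass over items sorted by priority. `ItemSketch` and `select_within_budget()` are hypothetical names, and stopping at the first item that does not fit is one reading of the documented behavior, assumed here for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ItemSketch:
    """Hypothetical, reduced version of ContextItem."""
    id: str
    priority: float
    token_count: int


def select_within_budget(items: List[ItemSketch], max_tokens: int) -> List[ItemSketch]:
    """Take items in descending priority until the next one would exceed the budget."""
    selected: List[ItemSketch] = []
    used = 0
    for item in sorted(items, key=lambda i: i.priority, reverse=True):
        if used + item.token_count > max_tokens:
            break  # hard stop; items are not truncated to fit
        selected.append(item)
        used += item.token_count
    return selected
```

With a hard stop, a small low-priority item is dropped once a larger, higher-priority item has overflowed the budget, even if the small item alone would still fit.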
### text_utils.py
- **truncate_text(text, max_chars, suffix="...")**: Truncates string, adds ellipsis
- **clean_text(text)**: Removes extra whitespace, normalizes newlines
- **extract_sentences(text, max_count)**: Splits text into sentences up to limit
- **normalize_whitespace(text)**: Collapse multiple spaces/newlines into single
- **format_for_llm(text)**: Combines cleaning + normalization for LLM consumption
**Key behavior**: All functions are pure (no side effects); safe for high-volume processing
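Two of the helpers above can be sketched in a few lines. The `_sketch` names signal that these are illustrations; whether the real `truncate_text()` counts the suffix toward `max_chars` is not stated in this document, so this version assumes it does not.

```python
import re


def truncate_text_sketch(text: str, max_chars: int, suffix: str = "...") -> str:
    """Cut text to max_chars characters and append the suffix when cut."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + suffix


def normalize_whitespace_sketch(text: str) -> str:
    """Collapse any run of whitespace (spaces, tabs, newlines) to one space."""
    return re.sub(r"\s+", " ", text).strip()
```

Collapsing newlines this way is lossy for content where layout carries meaning, such as code blocks.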
### token_utils.py
- **token_count(text)**: Returns estimated token count for string (via encoding library)
- **remaining_tokens(max_tokens, used)**: Returns remaining tokens in budget
- **fits_in_context(text, max_tokens)**: Boolean check if text fits token budget
**Key behavior**: Uses fixed encoding (cl100k_base for GPT models); may differ slightly from actual model tokenization
### version_utils.py
- **parse_version(version_string)**: Parses "1.2.3" format; returns Version namedtuple
- **compare_versions(v1, v2)**: Returns -1 (v1 < v2), 0 (equal), 1 (v1 > v2)
- **is_compatible(current, required)**: Checks if current version meets requirement (e.g., current >= required)
- **schema_version_check()**: Validates database schema version on startup
**Key behavior**: Assumes semantic versioning (MAJOR.MINOR.PATCH); non-standard formats raise ValueError
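The parsing and comparison behavior above can be sketched with a `NamedTuple`, since tuples compare element by element. The `_sketch` function names are hypothetical, and this version accepts only plain numeric MAJOR.MINOR.PATCH strings.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version_sketch(version_string: str) -> Version:
    """Parse 'MAJOR.MINOR.PATCH'; raise ValueError for anything else."""
    parts = version_string.strip().split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"Not a MAJOR.MINOR.PATCH version: {version_string!r}")
    return Version(*(int(p) for p in parts))


def compare_versions_sketch(v1: str, v2: str) -> int:
    """Return -1, 0, or 1 depending on how v1 orders against v2."""
    a, b = parse_version_sketch(v1), parse_version_sketch(v2)
    return (a > b) - (a < b)
```

Comparing parsed integers avoids the string-ordering trap where "1.10.0" would sort before "1.9.3".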
## Common Patterns
- **Dataclass-driven config**: ContextConfig used by ContextBuilder (immutable after init)
- **Token budgeting**: ContextBuilder respects max_tokens constraint; prioritizes high-priority items
- **Error handling resilience**: token_count() returns estimate; context_builder catches DB errors gracefully
- **Pure text functions**: text_utils functions are stateless utilities (no class needed)
- **Lazy evaluation**: ContextBuilder doesn't fetch items until build() called
- **Type hints throughout**: All functions use Optional, List, Dict for clarity
## Key Dependencies
- `open_notebook.domain.notebook`: Source, Note, SourceInsight models; vector_search function
- `open_notebook.exceptions`: DatabaseOperationError, NotFoundError
- `tiktoken` (via token_utils.py): Token encoding for GPT models
- `loguru`: Logging in context_builder (debug-level)
## Important Quirks & Gotchas
- **Token count estimation**: Uses cl100k_base encoding; may differ 5-10% from actual model tokens
- **Priority weights default**: If not specified, ContextConfig uses default weights (source=1, note=0.8, insight=1.2)
- **Vector search required**: ContextBuilder assumes the `vector_search` function from `open_notebook.domain.notebook` is available; fails if not
- **Source.full_text vs content**: Uses full_text field (may include extracted text + metadata)
- **Type-specific fetch logic**: ContextItem.content stores raw dict; caller must parse (e.g., dict["content"])
- **Circular import risk**: context_builder imports from domain.notebook; avoid domain importing utils
- **Max tokens hard limit**: ContextBuilder stops adding items once max_tokens exceeded (not prorated)
- **No caching**: Every build() call re-fetches from database (use cache layer if needed)
- **Whitespace normalization lossy**: clean_text() may change intended formatting (code blocks, poetry, etc.)
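To make the whitespace gotcha concrete, here is a small sketch. The real `clean_text()` is not shown in this document, so the function below is an assumed stand-in that collapses every whitespace run to a single space.

```python
import re


def clean_text(text: str) -> str:
    """Assumed normalizer: collapse each whitespace run to one space."""
    return re.sub(r"\s+", " ", text).strip()


snippet = "def f(x):\n    return x\n"
cleaned = clean_text(snippet)
# The line break and indentation that made this valid Python are gone.
```

If formatting matters (code, poetry, tables), skip normalization for that content or apply it only outside fenced blocks.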
## How to Extend
1. **Add new context source type**: Create fetch method in ContextBuilder; update ContextConfig.sources dict
2. **Add text preprocessing**: Add new function to text_utils (e.g., remove_urls, extract_keywords)
3. **Change tokenization**: Replace tiktoken with alternative library in token_utils; update all calls
4. **Add context filtering**: Extend ContextConfig with filter_by_date, filter_by_topic fields
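5. **Implement caching**: Cache the awaited result of ContextBuilder.build() in an async-aware cache. `functools.lru_cache` is a poor fit because build() is awaited: it would cache the coroutine object, not the result. Treat cached lists as read-only or copy them before mutating.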
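For the caching step, a sketch of an async-aware cache may help. The usage example below awaits `build()`, and wrapping a coroutine function in `functools.lru_cache` would store the coroutine object, which cannot be awaited twice. `AsyncResultCache` and `fake_build` are hypothetical names used only for illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Hashable


class AsyncResultCache:
    """Hypothetical helper: caches awaited results under a caller-supplied key."""

    def __init__(self) -> None:
        self._results: Dict[Hashable, Any] = {}

    async def get_or_build(
        self, key: Hashable, build: Callable[[], Awaitable[Any]]
    ) -> Any:
        if key not in self._results:
            # Await first, then store the result (never the coroutine itself).
            self._results[key] = await build()
        return self._results[key]

    def invalidate(self, key: Hashable) -> None:
        self._results.pop(key, None)


async def demo() -> int:
    calls = 0

    async def fake_build() -> list:  # stands in for builder.build()
        nonlocal calls
        calls += 1
        return ["item"]

    cache = AsyncResultCache()
    await cache.get_or_build("notebook:1", fake_build)
    await cache.get_or_build("notebook:1", fake_build)  # served from cache
    return calls
```

The cached list is shared between callers, so mutating it changes what later callers see; copy it first if that matters. Concurrent first calls for the same key can still build twice in this sketch.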
## Usage Example
```python
from open_notebook.utils.context_builder import ContextBuilder, ContextConfig

# Run inside an async function; `notebook` is an existing Notebook instance.
config = ContextConfig(
    sources={"source:123": "full", "source:456": "summary"},
    max_tokens=2000,
)
builder = ContextBuilder(notebook, config)
context_items = await builder.build()

# context_items is a List[ContextItem] sorted by priority
for item in context_items:
    print(f"{item.type}:{item.id} ({item.token_count} tokens)")
```

prompts/CLAUDE.md Normal file

@ -0,0 +1,190 @@
# Prompts Module
Jinja2 prompt templates for multi-provider AI workflows in Open Notebook.
## Purpose
Centralized prompt repository using `ai_prompter` library to:
1. Separate prompt engineering from Python application logic
2. Provide reusable Jinja2 templates with variable injection
3. Support multi-stage prompt chains (orchestrated by LangGraph workflows)
4. Ensure consistency across similar workflows (chat, search, content generation)
## Architecture Overview
**Template Organization by Workflow**:
- **`ask/`**: Multi-stage search synthesis (entry → query_process → final_answer)
- **`chat/`**: Conversational agent with notebook context (system prompt only)
- **`source_chat/`**: Source-focused chat with insight injection (system prompt only)
- **`podcast/`**: Podcast generation pipeline (outline → transcript)
**Rendering Pattern** (all workflows):
```python
from ai_prompter import Prompter
# Load the template and render it with the workflow state
system_prompt = Prompter(prompt_template="ask/entry", parser=parser).render(
    data=state
)

# Then invoke the LLM
model = await provision_langchain_model(system_prompt, ...)
response = await model.ainvoke(system_prompt)
```
See `open_notebook/graphs/CLAUDE.md` for how each template fits into chat.py, ask.py, and source_chat.py.
## Prompt Engineering Patterns
### 1. Multi-Stage Chain (Ask Workflow)
Three-template chain for intelligent search:
```
entry.jinja (user question → search strategy)
    ↓
query_process.jinja (run each search, generate sub-answer)
    ↓ (multiple, in parallel)
final_answer.jinja (synthesize all results into final response)
```
**Key pattern**: `entry.jinja` generates JSON-structured reasoning (via PydanticOutputParser). Each `query_process.jinja` invocation receives one search term + retrieved results. `final_answer.jinja` combines all answers with proper source citation.
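The fan-out/fan-in shape of this chain can be sketched with plain `asyncio`. The real workflow is orchestrated by LangGraph and calls an LLM at each stage. The three `run_*` functions below are hypothetical stubs that only mimic the data flow.

```python
import asyncio
from typing import List


async def run_entry(question: str) -> List[str]:
    """Stub for the entry stage: turn a question into search terms."""
    return [f"{question} definition", f"{question} examples"]


async def run_query_process(term: str) -> str:
    """Stub for one query_process call: one search term in, one sub-answer out."""
    return f"sub-answer for '{term}'"


async def run_final_answer(question: str, sub_answers: List[str]) -> str:
    """Stub for the final_answer stage: combine every sub-answer."""
    return f"{question}: " + " | ".join(sub_answers)


async def ask(question: str) -> str:
    terms = await run_entry(question)
    # Fan out: one query_process call per search term, run concurrently.
    sub_answers = await asyncio.gather(*(run_query_process(t) for t in terms))
    # Fan in: a single final_answer call sees all sub-answers, in term order.
    return await run_final_answer(question, list(sub_answers))


result = asyncio.run(ask("RAG"))
```

`asyncio.gather` returns results in the order the awaitables were passed, so sub-answers stay aligned with their search terms.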
### 2. Conditional Variable Injection (Podcast Workflow)
Templates accept optional variables for context assembly:
```jinja
{% if notebook %}
# PROJECT INFORMATION
{{ notebook }}
{% endif %}
{% if context %}
# CONTEXT
{{ context }}
{% endif %}
```
Jinja2's conditional blocks make this possible. The pattern matters most for the podcast outline template (which handles list or string context) and for source_chat (which injects whichever notebook and insight data is available).
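A quick way to see the effect of the conditional blocks is to render a cut-down copy of the template with and without each variable. This sketch uses `jinja2.Template` directly rather than `Prompter`, and the inline template string is an abbreviated stand-in, not one of the real files.

```python
from jinja2 import Template

# Abbreviated stand-in for the conditional sections shown above.
TEMPLATE = Template(
    "{% if notebook %}# PROJECT INFORMATION\n{{ notebook }}\n{% endif %}"
    "{% if context %}# CONTEXT\n{{ context }}\n{% endif %}"
)

with_both = TEMPLATE.render(notebook="Research notes", context="Source excerpt")
context_only = TEMPLATE.render(context="Source excerpt")
neither = TEMPLATE.render()  # undefined variables are falsy by default
```

Sections whose variable is missing or empty disappear entirely, so the rendered prompt never contains an orphaned heading.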
### 3. Repeated Emphasis on Citation Format (Ask & Chat)
All response-generating templates emphasize source citation rules:
- Document ID syntax: `[source:id]`, `[note:id]`, `[insight:id]`
- "Do not make up document IDs" repeated multiple times
- Example citations provided inline
**Rationale**: LLMs tend to hallucinate citations without explicit guidance; repetition plus inline examples reduces this.
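Prompt-level emphasis reduces hallucinated IDs but cannot rule them out, so calling code may want a post-hoc check. The sketch below is not part of the codebase. `find_unknown_citations` is a hypothetical helper, and the regex assumes IDs contain no whitespace or closing bracket.

```python
import re
from typing import List, Set

# Matches [source:id], [note:id], and [insight:id].
CITATION_RE = re.compile(r"\[((?:source|note|insight):[^\]\s]+)\]")


def find_unknown_citations(answer: str, known_ids: Set[str]) -> List[str]:
    """Return cited IDs (e.g. 'source:abc') that are not in known_ids."""
    cited = CITATION_RE.findall(answer)
    return [c for c in cited if c not in known_ids]


answer = "RAG retrieves documents [source:abc123] and ranks them [note:xyz] [insight:made_up]."
unknown = find_unknown_citations(answer, {"source:abc123", "note:xyz"})
```

Any ID returned here was not among the documents supplied to the model and can be stripped or flagged before the answer is shown.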
### 4. Format Instructions Delegation
Templates accept external `{{ format_instructions }}` variable:
```jinja
# OUTPUT FORMATTING
{{ format_instructions }}
```
Allows caller to inject JSON schema, XML format, or other output constraints without modifying template. Decouples prompt from output format evolution.
### 5. JSON Output with Extended Thinking Support
Podcast templates include extended thinking pattern:
```jinja
IMPORTANT OUTPUT FORMAT:
- If you use extended thinking with <think> tags, put ALL your reasoning inside <think></think> tags
- Put the final JSON output OUTSIDE and AFTER any <think> tags
```
Guides models with extended thinking capability to separate reasoning from output (cleaner parsing downstream).
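On the parsing side, the convention above means the caller can drop the reasoning block before decoding. This is a sketch of that step under the assumption that reasoning is always wrapped in well-formed `<think>` tags. `parse_json_after_thinking` is a hypothetical name.

```python
import json
import re
from typing import Any

THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)


def parse_json_after_thinking(raw: str) -> Any:
    """Drop <think>...</think> blocks, then parse the remaining text as JSON."""
    without_thinking = THINK_RE.sub("", raw).strip()
    return json.loads(without_thinking)


raw = '<think>Three segments? {"draft": 1}</think>\n{"segments": [{"name": "Intro"}]}'
outline = parse_json_after_thinking(raw)
```

Removing the block first matters because the reasoning itself may contain braces or JSON fragments that would confuse a naive "find the first `{`" parser.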
## File Catalog
**`ask/` - Search Synthesis Pipeline**:
- **entry.jinja**: Analyzes user question, generates search strategy with JSON output (term + instructions per search)
- **query_process.jinja**: Accepts one search term + retrieved results, generates sub-answer with citations
- **final_answer.jinja**: Combines all sub-answers into coherent final response, enforces source citation
**`chat/` - Conversational Agent**:
- **system.jinja**: Single system prompt for general chat. Uses conditional blocks for optional notebook context. Emphasizes citation format.
**`source_chat/` - Source-Focused Chat**:
- **system.jinja**: Single system prompt for source-specific discussion. Injects source metadata (ID, title, topics) + selected context. Conditional blocks for optional notebook/context data.
**`podcast/` - Podcast Generation**:
- **outline.jinja**: Takes briefing + content + speaker profiles (list support via Jinja2 for-loop). Generates JSON outline with segments (name, description, size).
- **transcript.jinja**: Takes outline + segment index + optional existing transcript. Generates JSON dialogue array (speaker name + dialogue). Iterates speakers with for-loop.
## Key Dependencies
- **ai_prompter**: Prompter class for Jinja2 template rendering with optional OutputParser binding
- **Jinja2** (transitive via ai_prompter): Template syntax (if/for, filters, variable interpolation)
- **No external AI calls**: Templates are pure text; LLM invocation happens in calling code (graphs/)
## How to Add New Template
1. **Create subdirectory** in `prompts/` matching workflow name (e.g., `prompts/new_workflow/`)
2. **Define .jinja file(s)** with Jinja2 syntax:
   - Use `{{ variable_name }}` for scalar injection
   - Use `{% if condition %} ... {% endif %}` for optional sections
   - Use `{% for item in list %} ... {% endfor %}` for iteration
3. **Document template variables** as inline comments (follow existing templates)
4. **Reference in calling code** (graphs/):
   ```python
   from ai_prompter import Prompter

   prompt = Prompter(prompt_template="new_workflow/template_name").render(data=context_dict)
   ```
5. **If structured output needed**: Pass `parser=PydanticOutputParser(...)` to Prompter
6. **Document in graphs/CLAUDE.md** how new template fits into workflow chain
## Important Quirks & Gotchas
1. **Template path syntax**: Prompter takes forward-slash paths without the `.jinja` extension. `"ask/entry"` maps to `prompts/ask/entry.jinja`
2. **Variable key convention**: All data passed as `data=dict` arg to `.render()`. Template accesses variables directly (e.g., `{{ question }}`). Ensure dict keys match template variable names.
3. **OutputParser binding**: When a PydanticOutputParser is passed, Prompter auto-injects the parser's format instructions as the `format_instructions` template variable. If the template has no `{{ format_instructions }}` placeholder, those instructions never reach the rendered prompt.
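4. **Jinja2 whitespace sensitivity**: By default Jinja2 preserves whitespace outside tags, so newlines and indentation in the template appear in the rendered prompt unless the environment enables `trim_blocks`/`lstrip_blocks`. Use whitespace control (`{%- ... -%}`) or the `trim` filter where output formatting matters.
5. **Conditional blocks are loose**: `{% if variable %}` tests truthiness, not presence. It is false for an empty string, list, or dict (and for an undefined variable under Jinja2's default settings) and true for any non-empty value.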
6. **For-loop list assumption**: Templates using `{% for item in list %}` don't validate list type. If caller passes string instead of list, iteration happens character-by-character (bug risk).
7. **No template composition/inheritance**: Templates are flat (no `{% extends %}` or `{% include %}`). Each workflow keeps templates independent to avoid coupling.
8. **Citation ID format is caller's responsibility**: Templates state the citation rules but cannot enforce them. If IDs are supplied or emitted in the wrong format, nothing at the template level catches it.
9. **Parser extraction happens post-render**: OutputParser.parse() is called AFTER `.render()` returns string. If template has syntax errors, render fails before parsing logic runs.
10. **Template cache**: Prompter likely caches loaded templates. File edits require app restart if using cached instance.
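The for-loop gotcha (item 6) can be guarded against in calling code before rendering. Jinja2 iterates a string the same way Python does, one character at a time, so a small normalizer helps. `ensure_list` is a hypothetical helper, not something the codebase is known to provide.

```python
from typing import Any, List


def ensure_list(value: Any) -> List[Any]:
    """Wrap scalars (including strings) so a template for-loop sees whole items."""
    if value is None:
        return []
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


# Iterating a bare string yields characters, which is what a template
# for-loop would also see if the caller passed a string by mistake.
chars = [c for c in "Ann"]
speakers = ensure_list("Ann")
```

Passing `ensure_list(value)` in the `data` dict keeps a single speaker name from being rendered as three one-letter speakers.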
## Testing Patterns
**Manual render test**:
```python
from ai_prompter import Prompter
prompt = Prompter(prompt_template="ask/entry").render(
    data={"question": "What is RAG?"}
)
print(prompt)  # Inspect the rendered output before sending it to the LLM
```
**With parser**:
```python
from ai_prompter import Prompter
from langchain_core.output_parsers.pydantic import PydanticOutputParser
from pydantic import BaseModel

class Strategy(BaseModel):
    reasoning: str
    searches: list

parser = PydanticOutputParser(pydantic_object=Strategy)
prompt = Prompter(prompt_template="ask/entry", parser=parser).render(
    data={"question": "..."}
)
# prompt now has the parser's format instructions in place of {{ format_instructions }}
```
**Integration test** (invoke full graph):
See `open_notebook/graphs/ask.py` for how entry.jinja is invoked inside ask_graph workflow.
## Reference Documentation
- **Jinja2 syntax guide**: See existing templates for for-loop, if-conditional, variable interpolation patterns
- **Graph integration**: `open_notebook/graphs/CLAUDE.md` documents which template is used in which workflow
- **Sub-directory CLAUDE.md files**: `ask/CLAUDE.md`, `chat/CLAUDE.md`, `podcast/CLAUDE.md` (if created) provide template-specific implementation notes