docs: generate comprehensive CLAUDE.md reference documentation across codebase

Create a hierarchical CLAUDE.md documentation system for the entire Open Notebook codebase with focus on concise, pattern-driven reference cards rather than comprehensive tutorials.

## Changes

### Core Documentation System
- Updated `.claude/commands/build-claude-md.md` to distinguish between leaf and parent modules, with special handling for prompt/template modules
- Established clear patterns:
  * Leaf modules (40-70 lines): Components, hooks, API clients
  * Parent modules (50-150 lines): Architecture, cross-layer patterns, data flows
  * Template modules: Pattern focus, not catalog listings

### Generated Documentation
Created 15 CLAUDE.md reference files across the project:

**Frontend (React/Next.js)**
- frontend/src/CLAUDE.md: Architecture overview, data flow, three-tier design
- frontend/src/lib/hooks/CLAUDE.md: React Query patterns, state management
- frontend/src/lib/api/CLAUDE.md: Axios client, FormData handling, interceptors
- frontend/src/lib/stores/CLAUDE.md: Zustand state persistence, auth patterns
- frontend/src/components/ui/CLAUDE.md: Radix UI primitives, CVA styling

**Backend (Python/FastAPI)**
- open_notebook/CLAUDE.md: System architecture, layer interactions
- open_notebook/ai/CLAUDE.md: Model provisioning, Esperanto integration
- open_notebook/domain/CLAUDE.md: Data models, ObjectModel/RecordModel patterns
- open_notebook/database/CLAUDE.md: Repository pattern, async migrations
- open_notebook/graphs/CLAUDE.md: LangGraph workflows, async orchestration
- open_notebook/utils/CLAUDE.md: Cross-cutting utilities, context building
- open_notebook/podcasts/CLAUDE.md: Episode/speaker profiles, job tracking

**API & Other**
- api/CLAUDE.md: REST layer, service architecture
- commands/CLAUDE.md: Async command handlers, job queue patterns
- prompts/CLAUDE.md: Jinja2 templates, prompt engineering patterns (refactored)

**Project Root**
- CLAUDE.md: Project overview, three-tier architecture, tech stack, getting started

### Key Features
- Zero duplication: Parent modules reference child CLAUDE.md files, don't repeat them
- Pattern-focused: Emphasizes how components work together, not component catalogs
- Scannable: Short bullets, code examples only when necessary (1-2 per file)
- Practical: "How to extend" guides, quirks/gotchas for each module
- Navigation: Root CLAUDE.md acts as hub pointing to specialized documentation

### Cleanup
- Removed unused `batch_fix_services.py`
- Removed deprecated `open_notebook/plugins/podcasts.py`
- Updated .gitignore for documentation consistency

## Impact

New contributors can now:
1. Read root CLAUDE.md for system architecture (5 min)
2. Jump to specific layer documentation (frontend, api, open_notebook)
3. Dive into module-specific patterns in child CLAUDE.md files (1 min per module)

All documentation is lean, reference-focused, and avoids duplication.
2026-01-03 16:27:52 -03:00 · 2026-01-03 16:27:52 -03:00 · 71b8d13b24
commit 71b8d13b24
parent ab5560c9a2
19 changed files with 1949 additions and 372 deletions
--- a/.gitignore
+++ b/.gitignore
@ -133,4 +133,8 @@ doc_exports/
 specs/
 .claude

-.playwright-mcp/
+.playwright-mcp/
+
+
+
+**/*.local.md
--- a/CLAUDE.md
+++ b/CLAUDE.md
@ -1,3 +1,352 @@
+# Open Notebook - Root CLAUDE.md

-We have a good amount of documentation on this project on the ./docs folder. Please read through them when necessary, and always review the docs/index.md file before starting a new feature so you know at least which docs are available. 
+This file provides architectural guidance for contributors working on Open Notebook at the project level.

+## Project Overview
+
+**Open Notebook** is an open-source, privacy-focused alternative to Google's Notebook LM. It's an AI-powered research assistant enabling users to upload multi-modal content (PDFs, audio, video, web pages), generate intelligent notes, search semantically, chat with AI models, and produce professional podcasts—all with complete control over data and choice of AI providers.
+
+**Key Values**: Privacy-first, multi-provider AI support, fully self-hosted option, open-source transparency.
+
+---
+
+## Three-Tier Architecture
+
+```
+┌─────────────────────────────────────────────────────────┐
+│              Frontend (React/Next.js)                    │
+│              frontend/ @ port 3000                       │
+├─────────────────────────────────────────────────────────┤
+│ - Notebooks, sources, notes, chat, podcasts, search UI  │
+│ - Zustand state management, TanStack Query (React Query)│
+│ - Shadcn/ui component library with Tailwind CSS         │
+└────────────────────────┬────────────────────────────────┘
+                         │ HTTP REST
+┌────────────────────────▼────────────────────────────────┐
+│              API (FastAPI)                              │
+│              api/ @ port 5055                           │
+├─────────────────────────────────────────────────────────┤
+│ - REST endpoints for notebooks, sources, notes, chat    │
+│ - LangGraph workflow orchestration                      │
+│ - Job queue for async operations (podcasts)             │
+│ - Multi-provider AI provisioning via Esperanto          │
+└────────────────────────┬────────────────────────────────┘
+                         │ SurrealQL
+┌────────────────────────▼────────────────────────────────┐
+│         Database (SurrealDB)                            │
+│         Graph database @ port 8000                      │
+├─────────────────────────────────────────────────────────┤
+│ - Records: Notebook, Source, Note, ChatSession, etc.    │
+│ - Relationships: source-to-notebook, note-to-source     │
+│ - Vector embeddings for semantic search                 │
+└─────────────────────────────────────────────────────────┘
+```
+
+---
+
+## Tech Stack
+
+### Frontend (`frontend/`)
+- **Framework**: Next.js 15 (React 19)
+- **Language**: TypeScript
+- **State Management**: Zustand
+- **Data Fetching**: TanStack Query (React Query)
+- **Styling**: Tailwind CSS + Shadcn/ui
+- **Build Tool**: Webpack (via Next.js)
+
+### API Backend (`api/` + `open_notebook/`)
+- **Framework**: FastAPI 0.104+
+- **Language**: Python 3.11+
+- **Workflows**: LangGraph state machines
+- **Database**: SurrealDB async driver
+- **AI Providers**: Esperanto library (8+ providers: OpenAI, Anthropic, Google, Groq, Ollama, Mistral, DeepSeek, xAI)
+- **Job Queue**: Surreal-Commands for async jobs (podcasts)
+- **Logging**: Loguru
+- **Validation**: Pydantic v2
+- **Testing**: Pytest
+
+### Database
+- **SurrealDB**: Graph database with built-in embedding storage and vector search
+- **Schema Migrations**: Automatic on API startup via AsyncMigrationManager
+
+### Additional Services
+- **Content Processing**: content-core library (file/URL extraction)
+- **Prompts**: AI-Prompter with Jinja2 templating
+- **Podcast Generation**: podcast-creator library
+- **Embeddings**: Multi-provider via Esperanto
+
+---
+
+## Directory Structure
+
+```
+open-notebook/
+├── frontend/                    # React/Next.js UI
+│   ├── src/
+│   │   ├── app/               # Next.js app router
+│   │   ├── components/        # React components
+│   │   ├── hooks/             # Custom React hooks
+│   │   ├── lib/               # Utilities
+│   │   ├── styles/            # Global styles
+│   │   └── CLAUDE.md          # Frontend-specific guidance
+│   └── package.json           # Node dependencies
+│
+├── api/                         # FastAPI REST layer
+│   ├── routers/               # HTTP endpoints
+│   ├── services/              # Business logic
+│   ├── models.py              # Request/response schemas
+│   ├── main.py                # FastAPI app + lifespan
+│   └── CLAUDE.md              # API-specific guidance
+│
+├── open_notebook/             # Python backend core (domain + workflows)
+│   ├── domain/                # Data models (Notebook, Source, Note, etc.)
+│   ├── database/              # SurrealDB async repository & migrations
+│   ├── graphs/                # LangGraph workflows (chat, ask, source)
+│   ├── ai/                    # ModelManager, AI provider provisioning
+│   ├── utils/                 # Context builders, token utils
+│   ├── podcasts/              # Podcast models & generation
+│   ├── config.py              # Configuration & paths
+│   ├── exceptions.py          # Error hierarchy
+│   └── CLAUDE.md              # Backend core guidance
+│
+├── prompts/                    # Jinja2 prompt templates
+│   ├── chat/                  # Chat prompt templates
+│   ├── ask/                   # Search/synthesis prompts
+│   ├── podcast/               # Podcast outline & transcript
+│   └── source_chat/           # Source-specific chat
+│
+├── migrations/                # SurrealDB schema migrations
+│   ├── 001_*.surql           # Initial schema
+│   └── ...
+│
+├── tests/                      # Python unit & integration tests
+│   ├── test_domain.py
+│   ├── test_graphs.py
+│   └── conftest.py
+│
+├── commands/                   # CLI utilities
+├── docs/                       # User & deployment documentation
+├── scripts/                    # Utility scripts
+├── setup_guide/               # Setup guides
+│
+├── docker-compose.yml         # Multi-container orchestration
+├── Dockerfile                 # API container image
+├── Makefile                   # Development commands
+├── pyproject.toml            # Python project config
+├── README.md                 # Project README
+└── CLAUDE.md                 # Root project guidance (this file)
+```
+
+---
+
+## Getting Started
+
+### 1. Clone & Install
+```bash
+git clone https://github.com/lfnovo/open-notebook.git
+cd open-notebook
+
+# Python dependencies
+uv sync
+
+# Frontend dependencies
+cd frontend
+npm install
+cd ..
+```
+
+### 2. Environment Setup
+```bash
+cp .env.example .env
+# Edit .env with your API keys (OpenAI, Anthropic, etc.)
+```
+
+### 3. Start Services
+```bash
+# Terminal 1: Start SurrealDB
+make database
+
+# Terminal 2: Start API (port 5055)
+make api
+# or: uv run --env-file .env uvicorn api.main:app --host 0.0.0.0 --port 5055
+
+# Terminal 3: Start Frontend (port 3000)
+cd frontend && npm run dev
+
+# Full stack (development)
+make start-all
+```
+
+### 4. Verify
+- Frontend: http://localhost:3000
+- API docs: http://localhost:5055/docs
+- SurrealDB: http://localhost:8000
+
+---
+
+## Development Workflow
+
+### Key Commands
+```bash
+# Code quality
+make ruff          # Lint + auto-fix Python
+make lint          # Type checking (mypy)
+
+# Testing
+uv run pytest tests/
+
+# Database migrations (auto-run on API startup)
+# Manual check: API logs show "Running migration X"
+
+# Docker
+make docker-build                 # Build multi-platform image
+docker compose --profile multi up # Full stack in Docker
+```
+
+### Code Style
+- **Python**: Ruff (auto-fix), mypy (type checking)
+- **TypeScript**: ESLint config provided
+- **Commits**: Conventional commits (feat:, fix:, docs:, refactor:)
+- **Git Flow**: Feature branches from `main`
+
+---
+
+## Architecture Highlights
+
+### 1. Async-First Design
+- All database queries, graph invocations, and API calls are async (await)
+- SurrealDB async driver with connection pooling
+- FastAPI handles concurrent requests efficiently
+
+### 2. LangGraph Workflows
+- **source.py**: Content ingestion (extract → embed → save)
+- **chat.py**: Conversational agent with message history
+- **ask.py**: Search + synthesis (retrieve relevant sources → LLM)
+- **transformation.py**: Custom transformations on sources
+- All use `provision_langchain_model()` for smart model selection
+
+### 3. Multi-Provider AI
+- **Esperanto library**: Unified interface to 8+ AI providers
+- **ModelManager**: Factory pattern with fallback logic
+- **Smart selection**: Detects large contexts, prefers long-context models
+- **Override support**: Per-request model configuration
+
+### 4. Database Schema
+- **Automatic migrations**: AsyncMigrationManager runs on API startup
+- **SurrealDB graph model**: Records with relationships and embeddings
+- **Vector search**: Built-in semantic search across all content
+- **Transactions**: Repo functions handle ACID operations
+
+### 5. Authentication
+- **Current**: Simple password middleware (insecure, dev-only)
+- **Production**: Replace with OAuth/JWT (see CONFIGURATION.md)
+
+---
+
+## Important Quirks & Gotchas
+
+### API Startup
+- **Migrations run automatically** on startup; check logs for errors
+- **Must start API before UI**: UI depends on API for all data
+- **SurrealDB must be running**: API fails without database connection
+
+### Frontend-Backend Communication
+- **Base API URL**: Configured in `.env.local` (default: http://localhost:5055)
+- **CORS enabled**: Configured in `api/main.py` (allow all origins in dev)
+- **Rate limiting**: Not built-in; add at proxy layer for production
+
+### LangGraph Workflows
+- **Blocking operations**: Chat/podcast workflows may take minutes; no timeout
+- **State persistence**: Uses SQLite checkpoint storage in `/data/sqlite-db/`
+- **Model fallback**: If primary model fails, falls back to cheaper/smaller model
+
+### Podcast Generation
+- **Async job queue**: `podcast_service.py` submits jobs but doesn't wait
+- **Track status**: Use `/commands/{command_id}` endpoint to poll status
+- **TTS failures**: Fall back to silent audio if speech synthesis fails
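The status-tracking bullet above describes a fire-and-forget job that the caller polls later. A minimal sketch of such a polling loop follows. `fetch_status` is a hypothetical callable standing in for an HTTP GET on `/commands/{command_id}`, and the `status` key and the `"completed"` / `"failed"` values are assumptions, not the API's confirmed response shape.

```python
import time
from typing import Callable, Dict

# Assumed terminal states; the real API may use different status names.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_command(
    fetch_status: Callable[[str], Dict[str, str]],
    command_id: str,
    interval: float = 2.0,
    max_polls: int = 150,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Poll a job until it reaches a terminal state or the poll budget runs out.

    fetch_status is a hypothetical stand-in for GET /commands/{command_id}.
    """
    for _ in range(max_polls):
        payload = fetch_status(command_id)
        if payload.get("status") in TERMINAL_STATES:
            return payload
        sleep(interval)
    raise TimeoutError(f"command {command_id} still running after {max_polls} polls")


if __name__ == "__main__":
    # Fake backend: the job reports "running" twice, then "completed".
    responses = iter(["running", "running", "completed"])
    result = wait_for_command(
        lambda cid: {"id": cid, "status": next(responses)},
        "command:demo",
        sleep=lambda _seconds: None,  # skip real waiting in the demo
    )
    print(result)
```

Injecting `fetch_status` and `sleep` keeps the loop testable without a running API. A real client would pass a function that performs the HTTP request.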
+
+### Content Processing
+- **File extraction**: Uses content-core library; supports 50+ file types
+- **URL handling**: Extracts text + metadata from web pages
+- **Large files**: Content processing is sync; may block API briefly
+
+---
+
+## Component References
+
+See dedicated CLAUDE.md files for detailed guidance:
+
+- **[frontend/src/CLAUDE.md](frontend/src/CLAUDE.md)**: React/Next.js architecture, state management, API integration
+- **[api/CLAUDE.md](api/CLAUDE.md)**: FastAPI structure, service pattern, endpoint development
+- **[open_notebook/CLAUDE.md](open_notebook/CLAUDE.md)**: Backend core, domain models, LangGraph workflows, AI provisioning
+- **[open_notebook/domain/CLAUDE.md](open_notebook/domain/CLAUDE.md)**: Data models, repository pattern, search functions
+- **[open_notebook/ai/CLAUDE.md](open_notebook/ai/CLAUDE.md)**: ModelManager, AI provider integration, Esperanto usage
+- **[open_notebook/graphs/CLAUDE.md](open_notebook/graphs/CLAUDE.md)**: LangGraph workflow design, state machines
+- **[open_notebook/database/CLAUDE.md](open_notebook/database/CLAUDE.md)**: SurrealDB operations, migrations, async patterns
+
+---
+
+## Documentation Map
+
+- **[README.md](README.md)**: Project overview, features, quick start
+- **[docs/index.md](docs/index.md)**: Complete user & deployment documentation
+- **[CONFIGURATION.md](CONFIGURATION.md)**: Environment variables, model configuration
+- **[DESIGN_PRINCIPLES.md](DESIGN_PRINCIPLES.md)**: Architectural decisions & philosophy
+- **[MIGRATION.md](MIGRATION.md)**: v1.0 upgrade guide from Streamlit → React
+- **[CONTRIBUTING.md](CONTRIBUTING.md)**: Contribution guidelines
+- **[MAINTAINER_GUIDE.md](MAINTAINER_GUIDE.md)**: Release & maintenance procedures
+
+---
+
+## Testing Strategy
+
+- **Unit tests**: `tests/test_domain.py`, `tests/test_models_api.py`
+- **Graph tests**: `tests/test_graphs.py` (workflow integration)
+- **Utils tests**: `tests/test_utils.py`
+- **Run all**: `uv run pytest tests/`
+- **Coverage**: Check with `pytest --cov`
+
+---
+
+## Common Tasks
+
+### Add a New API Endpoint
+1. Create router in `api/routers/feature.py`
+2. Create service in `api/feature_service.py`
+3. Define schemas in `api/models.py`
+4. Register router in `api/main.py`
+5. Test via http://localhost:5055/docs
+
+### Add a New LangGraph Workflow
+1. Create `open_notebook/graphs/workflow_name.py`
+2. Define the state schema (a `TypedDict`) and node functions
+3. Build graph with `.add_node()` / `.add_edge()`
+4. Invoke in service: `await graph.ainvoke({"input": ...}, config={...})`
+5. Test with sample data in `tests/`
+
+### Add Database Migration
+1. Create `migrations/XXX_description.surql`
+2. Write SurrealQL schema changes
+3. Create `migrations/XXX_description_down.surql` (optional rollback)
+4. API auto-detects on startup; migration runs if newer than recorded version
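Step 4 above says a migration runs only if it is newer than the recorded version. The sketch below shows one way that selection could work, assuming the numeric filename prefix is the version and that files ending in `_down.surql` are rollbacks. `pending_migrations` is a hypothetical helper for illustration, not the actual AsyncMigrationManager API.

```python
import re
from typing import List

# Matches "NNN_description.surql"; group 1 is the numeric version prefix.
_MIGRATION_RE = re.compile(r"^(\d+)_.+\.surql$")


def pending_migrations(filenames: List[str], current_version: int) -> List[str]:
    """Return up-migrations newer than current_version, ordered by version."""
    candidates = []
    for name in filenames:
        match = _MIGRATION_RE.match(name)
        if match is None or name.endswith("_down.surql"):
            continue  # skip unrelated files and rollback scripts
        version = int(match.group(1))
        if version > current_version:
            candidates.append((version, name))
    return [name for _, name in sorted(candidates)]


if __name__ == "__main__":
    files = [
        "002_add_notes.surql",
        "001_init.surql",
        "002_add_notes_down.surql",
        "003_vectors.surql",
        "README.md",
    ]
    print(pending_migrations(files, current_version=1))
```

Sorting on the parsed integer, not on the filename string, keeps `010_...` after `009_...` even if the zero padding is inconsistent.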
+
+### Deploy to Production
+1. Review [CONFIGURATION.md](CONFIGURATION.md) for security settings
+2. Use `make docker-release` for multi-platform image
+3. Push to Docker Hub / GitHub Container Registry
+4. Deploy `docker compose --profile multi up`
+5. Verify migrations via API logs
+
+---
+
+## Support & Community
+
+- **Documentation**: https://open-notebook.ai
+- **Discord**: https://discord.gg/37XJPXfz2w
+- **Issues**: https://github.com/lfnovo/open-notebook/issues
+- **License**: MIT (see LICENSE)
+
+---
+
+**Last Updated**: January 2026 | **Project Version**: 1.2.4+
--- a/api/CLAUDE.md
+++ b/api/CLAUDE.md
@ -0,0 +1,117 @@
+# API Module
+
+FastAPI-based REST backend exposing services for notebooks, sources, notes, chat, podcasts, and AI model management.
+
+## Purpose
+
+FastAPI application organized into three architectural layers: routes (HTTP endpoints), services (business logic), and models (request/response schemas). It integrates LangGraph workflows (chat, ask, source_chat), SurrealDB persistence, and AI providers via Esperanto.
+
+## Architecture Overview
+
+**Three layers**:
+1. **Routes** (`routers/*`): HTTP endpoints mapping to services
+2. **Services** (`*_service.py`): Business logic orchestrating domain models, database, graphs, AI providers
+3. **Models** (`models.py`): Pydantic request/response schemas with validation
+
+**Startup flow**:
+- Load .env environment variables
+- Initialize CORS middleware + password auth middleware
+- Run database migrations via AsyncMigrationManager on lifespan startup
+- Register all routers
+
+**Key services**:
+- `chat_service.py`: Invokes chat graph with messages, context
+- `podcast_service.py`: Orchestrates outline + transcript generation
+- `sources_service.py`: Content ingestion, vectorization, metadata
+- `notes_service.py`: Note creation, linking to sources/insights
+- `transformations_service.py`: Applies transformations to content
+- `models_service.py`: Manages AI provider/model configuration
+- `episode_profiles_service.py`: Manages podcast speaker/episode profiles
+
+## Component Catalog
+
+### Main Application
+- **main.py**: FastAPI app initialization, CORS setup, auth middleware, lifespan event, router registration
+- **Lifespan handler**: Runs AsyncMigrationManager on startup (database schema migration)
+- **Auth middleware**: PasswordAuthMiddleware protects endpoints (password-based access control)
+
+### Services (Business Logic)
+- **chat_service.py**: Invokes chat.py graph; handles message history via SqliteSaver
+- **podcast_service.py**: Generates outline (outline.jinja), then transcript (transcript.jinja) for episodes
+- **sources_service.py**: Ingests files/URLs (content_core), extracts text, vectorizes, saves to SurrealDB
+- **transformations_service.py**: Applies transformations via transformation.py graph
+- **models_service.py**: Manages ModelManager config (AI provider overrides)
+- **episode_profiles_service.py**: CRUD for EpisodeProfile and SpeakerProfile models
+- **insights_service.py**: Generates and retrieves source insights
+- **notes_service.py**: Creates notes linked to sources/insights
+
+### Models (Schemas)
+- **models.py**: Pydantic schemas for request/response validation
+- Request bodies: ChatRequest, CreateNoteRequest, PodcastGenerationRequest, etc.
+- Response bodies: ChatResponse, NoteResponse, PodcastResponse, etc.
+- Custom validators for enum fields, file paths, model references
+
+### Routers
+- **routers/chat.py**: POST /chat
+- **routers/source_chat.py**: POST /source/{source_id}/chat
+- **routers/podcasts.py**: POST /podcasts, GET /podcasts/{id}, etc.
+- **routers/notes.py**: POST /notes, GET /notes/{id}
+- **routers/sources.py**: POST /sources, GET /sources/{id}, DELETE /sources/{id}
+- **routers/models.py**: GET /models, POST /models/config
+- **routers/transformations.py**: POST /transformations
+- **routers/insights.py**: GET /sources/{source_id}/insights
+- **routers/auth.py**: POST /auth/password (password-based auth)
+- **routers/commands.py**: GET /commands/{command_id} (job status tracking)
+
+## Common Patterns
+
+- **Direct service imports**: Routers import services directly; no DI framework is used
+- **Async/await throughout**: All DB queries, graph invocations, AI calls are async
+- **SurrealDB transactions**: Services use repo_query, repo_create, repo_upsert from database layer
+- **Config override pattern**: Models/config override via models_service passed to graph.ainvoke(config=...)
+- **Error handling**: Services catch exceptions and return HTTP status codes (400 Bad Request, 404 Not Found, 500 Internal Server Error)
+- **Logging**: loguru logger in main.py; services expected to log key operations
+- **Response normalization**: All responses follow standard schema (data + metadata structure)
+
+## Key Dependencies
+
+- `fastapi`: FastAPI app, routers, HTTPException
+- `pydantic`: Validation models with Field, field_validator
+- `open_notebook.graphs`: chat, ask, source_chat, source, transformation graphs
+- `open_notebook.database`: SurrealDB repository functions (repo_query, repo_create, repo_upsert)
+- `open_notebook.domain`: Notebook, Source, Note, SourceInsight models
+- `open_notebook.ai.provision`: provision_langchain_model() factory
+- `ai_prompter`: Prompter for template rendering
+- `content_core`: extract_content() for file/URL processing
+- `esperanto`: AI provider client library (LLM, embeddings, TTS)
+- `surreal_commands`: Job queue for async operations (podcast generation)
+- `loguru`: Structured logging
+
+## Important Quirks & Gotchas
+
+- **Migration auto-run**: Database schema migrations run on every API startup (via lifespan); no manual migration steps
+- **PasswordAuthMiddleware is basic**: Uses simple password check; production deployments should replace with OAuth/JWT
+- **No request rate limiting**: No built-in rate limiting; deployment must add via proxy/middleware
+- **Services are stateless**: Services don't cache results; each request re-queries database/AI models
+- **Graph invocation is blocking**: chat/podcast workflows may take minutes; no timeout handling in services
+- **Command job fire-and-forget**: podcast_service.py submits jobs but doesn't wait (async job queue pattern)
+- **Model override scoping**: Model config override via RunnableConfig is per-request only (not persistent)
+- **CORS open by default**: main.py CORS settings allow all origins (restrict before production)
+- **No OpenAPI security scheme**: API docs available without auth (disable before production)
+- **Services don't validate user permission**: All endpoints trust authentication layer; no per-notebook permission checks
+
+## How to Add a New Endpoint
+
+1. Create router file in `routers/` (e.g., `routers/new_feature.py`)
+2. Import router into `main.py` and register: `app.include_router(new_feature.router, tags=["new_feature"])`
+3. Create service in `new_feature_service.py` with business logic
+4. Define request/response schemas in `models.py` (or create `new_feature_models.py`)
+5. Implement router functions calling service methods
+6. Test with `uv run uvicorn api.main:app --host 0.0.0.0 --port 5055`
+
+## Testing Patterns
+
+- **Interactive docs**: http://localhost:5055/docs (Swagger UI)
+- **Direct service tests**: Import service, call methods directly with test data
+- **Mock graphs**: Replace graph.ainvoke() with mock for testing service logic
+- **Database**: Use a test database (separate SurrealDB instance or mock `repo_query`)
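The mock-graph pattern listed above can be sketched with the standard library's `unittest.mock.AsyncMock`. The `run_chat` function and the shape of the graph's input and output are hypothetical stand-ins for a real service method; only the `ainvoke` name comes from the text.

```python
import asyncio
from unittest.mock import AsyncMock


async def run_chat(graph, message: str) -> str:
    """Hypothetical service function: forwards a message to a graph and unwraps the reply."""
    result = await graph.ainvoke({"messages": [message]})
    return result["messages"][-1]


def test_run_chat_with_mock_graph() -> None:
    # The mock replaces the real graph object, so no model is called.
    graph = AsyncMock()
    graph.ainvoke.return_value = {"messages": ["hi", "hello from the mock"]}

    reply = asyncio.run(run_chat(graph, "hi"))

    assert reply == "hello from the mock"
    graph.ainvoke.assert_awaited_once_with({"messages": ["hi"]})


if __name__ == "__main__":
    test_run_chat_with_mock_graph()
```

Passing the graph in as a parameter is a choice made for this sketch. When a service imports its graph at module level, `unittest.mock.patch` on that module attribute achieves the same isolation.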
--- a/batch_fix_services.py
+++ b/batch_fix_services.py
@ -1,77 +0,0 @@
-#!/usr/bin/env python3
-"""Batch fix service files for mypy errors."""
-import re
-from pathlib import Path
-
-SERVICE_FILES = [
-    'api/notes_service.py',
-    'api/insights_service.py',
-    'api/episode_profiles_service.py',
-    'api/settings_service.py',
-    'api/sources_service.py',
-    'api/podcast_service.py',
-    'api/command_service.py',
-]
-
-BASE_DIR = Path('/Users/luisnovo/dev/projetos/open-notebook/open-notebook')
-
-for service_file in SERVICE_FILES:
-    file_path = BASE_DIR / service_file
-    if not file_path.exists():
-        print(f"Skipping {service_file} - file not found")
-        continue
-
-    content = file_path.read_text()
-    original_content = content
-
-    # Pattern to find: var_name = api_client.method(args)
-    # Followed by: var_name["key"] or var_name.get("key")
-    lines = content.split('\n')
-    new_lines = []
-    i = 0
-
-    while i < len(lines):
-        line = lines[i]
-
-        # Check if this line has an api_client call assignment
-        match = re.match(r'(\s*)(\w+)\s*=\s*api_client\.(\w+)\((.*)\)\s*$', line)
-        if match and 'response = api_client' not in line:
-            indent = match.group(1)
-            var_name = match.group(2)
-            method_name = match.group(3)
-            args = match.group(4)
-
-            # Look ahead to see if this variable is used with dict access
-            has_dict_access = False
-            for j in range(i+1, min(i+15, len(lines))):
-                next_line = lines[j]
-                if f'{var_name}["' in next_line or f"{var_name}['" in next_line or f'{var_name}.get(' in next_line:
-                    has_dict_access = True
-                    break
-                # Stop looking if we hit a blank line, new function, or new assignment
-                if (not next_line.strip() or
-                    next_line.strip().startswith('def ') or
-                    next_line.strip().startswith('class ') or
-                    (re.match(r'\s*\w+\s*=', next_line) and var_name not in next_line)):
-                    break
-
-            if has_dict_access:
-                # Replace with response and isinstance check
-                new_lines.append(f'{indent}response = api_client.{method_name}({args})')
-                new_lines.append(f'{indent}{var_name} = response if isinstance(response, dict) else response[0]')
-                i += 1
-                continue
-
-        new_lines.append(line)
-        i += 1
-
-    new_content = '\n'.join(new_lines)
-
-    # Check if content changed
-    if new_content != original_content:
-        file_path.write_text(new_content)
-        print(f"✓ Fixed {service_file}")
-    else:
-        print(f"- No changes needed for {service_file}")
-
-print("\nDone!")
--- a/commands/CLAUDE.md
+++ b/commands/CLAUDE.md
@ -0,0 +1,49 @@
+# Commands Module
+
+**Purpose**: Defines async command handlers for long-running operations via `surreal-commands` job queue system.
+
+## Key Components
+
+- **`process_source_command`**: Ingests content through `source_graph`, creates embeddings (optional), and generates insights. Retries on transaction conflicts (exponential backoff with jitter, up to 5 times).
+- **`embed_single_item_command`**: Embeds individual sources/notes/insights; splits content into chunks for vector storage.
+- **`rebuild_embeddings_command`**: Bulk re-embed all/existing items with selective type filtering.
+- **`generate_podcast_command`**: Creates podcasts via `podcast-creator` library using stored episode/speaker profiles.
+- **`process_text_command`** (example): Test fixture for text operations (uppercase, lowercase, reverse, word_count).
+- **`analyze_data_command`** (example): Test fixture for numeric aggregations.
+
+## Important Patterns
+
+- **Pydantic I/O**: All commands use `CommandInput`/`CommandOutput` subclasses for type safety and serialization.
+- **Error handling**: Permanent errors return failure output; `RuntimeError` exceptions auto-retry via surreal-commands.
+- **Model dumping**: Recursive `full_model_dump()` utility converts Pydantic models → dicts for DB/API responses.
+- **Logging**: Uses `loguru.logger` throughout; logs execution start/end and key metrics (processing time, counts).
+- **Time tracking**: All commands measure `start_time` → `processing_time` for monitoring.
+
+## Dependencies
+
+**External**: `surreal_commands` (command decorator, job queue), `loguru`, `pydantic`, `podcast_creator`
+**Internal**: `open_notebook.domain.*` (Source, Note, Transformation), `open_notebook.graphs.source`, `open_notebook.ai.models`
+
+## Quirks & Edge Cases
+
+- **source_commands**: `ensure_record_id()` wraps command IDs for DB storage; transaction conflicts trigger exponential backoff retry (1-30s). Non-`RuntimeError` exceptions are permanent.
+- **embedding_commands**: Queries DB directly for item state; chunk index must match source's chunk list. Model availability checked at command start.
+- **podcast_commands**: Profiles loaded from SurrealDB by name (must exist); briefing can be extended with suffix. Episode records created mid-execution.
+- **Example commands**: Accept optional `delay_seconds` for testing async behavior; not for production.
+
+## Code Example
+
+```python
+@command("process_source", app="open_notebook", retry={...})
+async def process_source_command(input_data: SourceProcessingInput) -> SourceProcessingOutput:
+    start_time = time.time()
+    try:
+        # Load each requested transformation (name chosen to avoid shadowing the built-in `id`)
+        transformations = [await Transformation.get(t_id) for t_id in input_data.transformations]
+        source = await Source.get(input_data.source_id)
+        result = await source_graph.ainvoke({...})
+        return SourceProcessingOutput(success=True, ...)
+    except RuntimeError:
+        raise  # Re-raise so surreal-commands retries the job
+    except Exception as e:
+        return SourceProcessingOutput(success=False, error_message=str(e))
+```
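The retry behaviour noted under Quirks (backoff between 1 and 30 seconds, a bounded number of tries, only `RuntimeError` retried) could look roughly like the sketch below. This is not how surreal-commands implements retries; `retry_with_backoff`, `backoff_delay`, and their parameters are illustrative names, and reading "max 5×" as five total attempts is an assumption.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Jittered delay: uniform between base and the capped exponential ceiling."""
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(base, ceiling)


async def retry_with_backoff(
    operation: Callable[[], Awaitable[T]],
    max_attempts: int = 5,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Retry operation on RuntimeError; any other exception propagates at once."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return await operation()
        except RuntimeError:
            if attempt == max_attempts - 1:
                raise  # attempts exhausted, surface the last error
            await sleep(backoff_delay(attempt))
    raise AssertionError("unreachable")
```

The jitter spreads retries out so that several workers hitting the same transaction conflict do not all retry at the same instant.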
--- a/frontend/src/CLAUDE.md
+++ b/frontend/src/CLAUDE.md
@ -0,0 +1,159 @@
+# Frontend Architecture
+
+Next.js React application providing UI for Open Notebook research assistant. Three-layer architecture: **pages** (Next.js App Router), **components** (feature-specific UI), and **lib** (data fetching, state management, utilities).
+
+## High-Level Data Flow
+
+```
+Pages (Next.js) → Components (feature-specific) → Hooks (queries/mutations)
+                                                       ↓
+                          Stores (auth/modal state) → API module → Backend
+```
+
+User interactions trigger mutations/queries via hooks, which communicate with the backend through the API module. Store state (auth, modals) flows back to components via hooks. Child CLAUDE.md files document specific modules in detail:
+
+- **`lib/api/CLAUDE.md`**: Axios client, FormData handling, interceptors
+- **`lib/hooks/CLAUDE.md`**: TanStack Query wrappers, SSE streaming, context building
+- **`lib/stores/CLAUDE.md`**: Zustand auth/modal state, localStorage persistence
+- **`components/ui/CLAUDE.md`**: Radix UI primitives, CVA styling, accessibility
+
+## Architectural Layers
+
+### Pages (`src/app/`) — Next.js App Router
+- `(auth)/login`: Authentication entry point
+- `(dashboard)/`: Protected routes (notebooks, sources, search, models, etc.)
+- Directory-based routing; each `page.tsx` is a route endpoint
+- **Key pattern**: Pages call hooks to fetch data, render components with state
+- **Router groups** `(auth)`, `(dashboard)` organize routes by feature without affecting URL
+
+### Components (`src/components/`) — Feature-Specific UI
+- **layout**: `AppShell.tsx`, `AppSidebar.tsx` — main layout wrapper used by all pages
+- **providers**: `ThemeProvider`, `QueryProvider`, `ModalProvider` — app-wide context setup
+- **auth**: `LoginForm.tsx` — authentication UI
+- **common**: `CommandPalette`, `ErrorBoundary`, `ContextToggle`, `ModelSelector` — shared across pages
+- **ui**: Reusable Radix UI building blocks (see child CLAUDE.md)
+- **source**, **notebooks**, **search**, **podcasts**: Feature-specific components consuming hooks
+
+**Component composition pattern**: Pages → Feature components → UI components. Feature components handle page-level state (loading, error), UI components remain stateless and styled.
+
+### Lib (`src/lib/`) — Data & State Layer
+
+#### `lib/api/` — Backend Communication
+- **`client.ts`**: Central Axios instance with auth interceptor, FormData handling, 10-min timeout
+- **`query-client.ts`**: TanStack Query configuration
+- **Resource modules** (`sources.ts`, `chat.ts`, `notebooks.ts`, etc.): Endpoint-specific functions returning typed responses
+- **Pattern**: All requests go through `apiClient`; auth token auto-added from localStorage
+
+#### `lib/hooks/` — React Query + Custom Logic
+- **Query hooks**: `useNotebookSources`, `useSources`, `useSource` — TanStack Query wrappers with cache keys
+- **Mutation hooks**: `useCreateSource`, `useUpdateSource`, `useDeleteSource` — mutations with toast feedback + cache invalidation
+- **Complex hooks**: `useNotebookChat`, `useSourceChat` — session management, message streaming, context building
+- **SSE streaming**: `useAsk` — parses newline-delimited JSON from backend for multi-stage workflows
+- **Pattern**: Hooks return `{ data, isLoading, error, refetch }` + action functions; cache invalidation on mutations
+
+#### `lib/stores/` — Application State
+- **`auth-store.ts`**: Authentication state (token, isAuthenticated) with 30-second check caching
+- **Zustand + persist middleware**: Auto-syncs sensitive state to localStorage
+- **Pattern**: Store actions (`login()`, `logout()`, `checkAuth()`) update state; consumed via hooks in components
+
+#### `lib/types/` — TypeScript Definitions
+- API request/response shapes, domain models (Notebook, Source, Note, etc.)
+- Ensures type safety across API calls and store mutations
+
+## Data & Control Flow Walkthrough
+
+### Example: Notebook Chat
+1. **Page** (`notebooks/[id]/page.tsx`) fetches initial data, passes `notebookId` to `ChatColumn` component
+2. **Hook call** (`useNotebookChat()`):
+   - Queries sessions for notebook via TanStack Query
+   - Sets up message state + context building logic
+   - Returns `{ messages, sendMessage(), setModelOverride() }`
+3. **Component renders**: `ChatColumn` displays messages, text input
+4. **User sends message**: Component calls the `sendMessage()` function returned by the hook
+5. **Hook execution**:
+   - Builds context from selected sources/notes via `buildContext()` helper
+   - Calls `chatApi.sendMessage()` (from API module)
+   - Client-side optimistic update: adds message to local state before response
+6. **Backend response** arrives, TanStack Query updates cache
+7. **Cache invalidation** on other source/note mutations ensures stale UI refreshes
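+
+The optimistic-update steps above (add locally, then confirm or roll back) can be sketched as pure state transitions. This is a minimal illustration, not the actual `useNotebookChat` code: the `ChatMessage` shape, the `pending` flag, and the function names are assumptions made for the example.
+
+```typescript
+// Hypothetical message shape; the real hook's types may differ.
+interface ChatMessage {
+  id: string
+  role: 'human' | 'ai'
+  content: string
+  pending?: boolean
+}
+
+// Step 5: append the user's message locally before the request resolves.
+function addOptimistic(messages: ChatMessage[], tempId: string, content: string): ChatMessage[] {
+  return [...messages, { id: tempId, role: 'human', content, pending: true }]
+}
+
+// Step 6: on success, swap the temporary entry for the server-confirmed messages.
+function confirmSend(messages: ChatMessage[], tempId: string, fromServer: ChatMessage[]): ChatMessage[] {
+  return [...messages.filter((m) => m.id !== tempId), ...fromServer]
+}
+
+// On error, drop the temporary entry so the UI matches server state again.
+function rollbackSend(messages: ChatMessage[], tempId: string): ChatMessage[] {
+  return messages.filter((m) => m.id !== tempId)
+}
+```
+
+Keeping the transitions pure makes the rollback path easy to unit-test without mocking the network.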
+
+### Example: File Upload with Source Creation
+1. **Component** (`SourceDialog`) renders form with file picker
+2. **Hook** (`useFileUpload`):
+   - Converts file to FormData (JSON fields stringified)
+   - Calls `sourcesApi.create()` with FormData
+   - API client interceptor deletes Content-Type header (lets browser set multipart boundary)
+3. **Toast notifications** show progress
+4. **Cache invalidation** on success: `queryClient.invalidateQueries(['sources'])`
+5. **Related queries** auto-refetch: notebooks, sources list, etc.
+
+## Key Patterns & Cross-Layer Coordination
+
+### Caching & Invalidation
+- **Query keys**: `QUERY_KEYS.notebook(id)`, `QUERY_KEYS.sources(notebookId)` — hierarchical structure
+- **Broad invalidation**: `['sources']` invalidates all source queries; trade-off between accuracy + performance
+- **Auto-refetch**: `refetchOnWindowFocus: true` on frequently-changing data (sources, notebooks)
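+
+Broad invalidation works because TanStack Query matches query keys by prefix by default. The sketch below imitates that matching with plain arrays so the trade-off is visible. The key shapes in `QUERY_KEYS` are assumptions, and the comparison is a simplification of the library's partial matching.
+
+```typescript
+type QueryKey = readonly unknown[]
+
+// Hypothetical key factory; the project's real QUERY_KEYS shapes may differ.
+const QUERY_KEYS = {
+  sources: (notebookId: string) => ['sources', notebookId] as const,
+  notebook: (id: string) => ['notebooks', id] as const,
+}
+
+// A key matches when the filter is an element-wise prefix of it.
+function matchesPrefix(key: QueryKey, prefix: QueryKey): boolean {
+  return (
+    prefix.length <= key.length &&
+    prefix.every((part, i) => JSON.stringify(part) === JSON.stringify(key[i]))
+  )
+}
+
+// Returns the cached keys that an invalidation with `prefix` would mark stale.
+function keysToInvalidate(cached: QueryKey[], prefix: QueryKey): QueryKey[] {
+  return cached.filter((key) => matchesPrefix(key, prefix))
+}
+```
+
+Passing `['sources']` marks every source query stale, while `QUERY_KEYS.sources(id)` narrows the refetch to one notebook.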
+
+### Auth & Protected Routes
+- **Middleware** (`src/middleware.ts`): Redirects unauthenticated users to `/login`
+- **Auth store**: Validates token via `/notebooks` API call (actual validation, not JWT decode)
+- **Interceptor**: Adds `Bearer {token}` to all requests; 401 response clears auth and redirects to login
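+
+As a rough sketch of the interceptor's token lookup: Zustand's persist middleware writes JSON of the form `{ state, version }` under the store's `name`, so the token can be read back from the `auth-storage` entry. The helper names below are hypothetical and the real interceptor may be structured differently.
+
+```typescript
+// Reads the token out of the raw localStorage value written by Zustand persist.
+// The `token` field name follows the auth store described above.
+function readPersistedToken(raw: string | null): string | null {
+  if (!raw) return null
+  try {
+    const parsed = JSON.parse(raw) as { state?: { token?: unknown } }
+    const token = parsed.state?.token
+    return typeof token === 'string' && token.length > 0 ? token : null
+  } catch {
+    return null // corrupted storage is treated as "not logged in"
+  }
+}
+
+// Returns a copy of the headers with a Bearer token added when one exists.
+function withAuthHeader(headers: Record<string, string>, token: string | null): Record<string, string> {
+  return token ? { ...headers, Authorization: `Bearer ${token}` } : { ...headers }
+}
+```
+
+In the browser, the raw value would come from `localStorage.getItem('auth-storage')`.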
+
+### Modal State Management
+- **Modal hooks**: Components query modal state from stores
+- **Context**: Modals pass data (e.g., notebook ID) to child components
+- **Pattern**: One store per modal type; triggered by button clicks + data passing via hook arguments
+
+### Error Handling
+- **API errors**: All request failures propagate to consuming code; components show toast notifications
+- **Toast feedback**: Mutations show success/error toasts (from `sonner` library)
+- **Error boundary**: App-level error boundary catches React render errors; shows fallback UI
+
+### FormData Handling
+- **JSON fields**: Nested values (arrays, objects) must be JSON-stringified before being appended to FormData
+- **Content-Type header**: Removed by interceptor for FormData requests (lets browser set boundary)
+- **Example**: `sources` array converted to string via `JSON.stringify()` before appending to FormData
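+
+A minimal sketch of the stringification rule, written against plain `[name, value]` pairs so it runs without a browser. The function name and the choice to skip `null`/`undefined` fields are assumptions; `File` values, which would be appended unchanged, are left out for brevity.
+
+```typescript
+type FieldValue = string | number | boolean | null | undefined | unknown[] | Record<string, unknown>
+
+// Converts a payload into [name, value] string pairs suitable for FormData.append().
+// Arrays and plain objects are JSON-stringified; null/undefined fields are skipped.
+function toFormFields(payload: Record<string, FieldValue>): Array<[string, string]> {
+  const fields: Array<[string, string]> = []
+  for (const [name, value] of Object.entries(payload)) {
+    if (value === null || value === undefined) continue
+    if (typeof value === 'object') {
+      fields.push([name, JSON.stringify(value)])
+    } else {
+      fields.push([name, String(value)])
+    }
+  }
+  return fields
+}
+```
+
+Each resulting pair can then be passed to `formData.append(name, value)`; the backend is expected to parse the stringified fields back into JSON.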
+
+## Component Organization Within Features
+
+- **Feature folders** (`source/`, `notebooks/`, `podcasts/`): Group related components
+- **Composition**: Larger components nest smaller ones; no deep prop drilling (state lifted to hooks)
+- **Dialog patterns**: Features define dialog components for inline actions (edit, create, delete)
+- **Props**: Components accept data + action callbacks from parent or hooks
+
+## Providers & Context Setup
+
+**Root layout** (`app/layout.tsx`) wraps app with:
+1. `ThemeProvider` — next-themes for light/dark mode
+2. `QueryProvider` — TanStack Query client
+3. `ErrorBoundary` — React error boundary
+4. `ConnectionGuard` — checks backend connectivity on startup
+5. `Toaster` — sonner toast notification system
+
+## Important Gotchas & Design Decisions
+
+- **Token storage**: Stored in localStorage under `auth-storage` key (Zustand persist); consumed by API interceptor
+- **Base URL discovery**: API client fetches base URL from runtime config on first request (async; can be slow on startup)
+- **Optimistic updates**: Chat messages added to state before server confirmation; removed on error
+- **Modal lifecycle**: Dialogs not auto-reset; parent must clear form state after submit
+- **Focus management**: Dialog auto-focuses first input; can cause layout shifts if inputs are conditional
+- **Cache invalidation breadth**: Trade-off between precision + simplicity; broad invalidation simpler but may over-fetch
+
+## How to Add a New Feature
+
+1. **Create page**: `app/(dashboard)/feature/page.tsx` — calls hooks, renders components
+2. **Create feature components**: `components/feature/` — compose UI + business logic
+3. **Add hooks** (if data needed): `lib/hooks/useFeature.ts` — TanStack Query wrapper
+4. **Add API module** (if backend call needed): `lib/api/feature.ts` — resource-specific functions
+5. **Add types**: `lib/types/api.ts` — request/response shapes
+6. **Use UI components**: Import from `components/ui/` for consistent styling
+7. **Handle auth**: Middleware redirects unauthenticated users; no special handling needed in component
+
+## Testing
+
+- **Hooks**: Mock API functions, wrap in `QueryClientProvider`, assert query/mutation behavior
+- **Components**: Mock hooks via `vi.fn()`, test rendering + user interactions
+- **API calls**: Mock `axios` interceptors; test request/response shapes
+- **Stores**: Mock store state, test mutations via `act()`, assert state changes
+
+See child CLAUDE.md files for module-specific testing patterns.
--- a/frontend/src/components/ui/CLAUDE.md
+++ b/frontend/src/components/ui/CLAUDE.md
@ -0,0 +1,64 @@
+# UI Components Module
+
+Radix UI-based accessible component library with CVA styling, composed building blocks, and theming support.
+
+## Key Components
+
+- **Primitives** (`button.tsx`, `dialog.tsx`, `select.tsx`, `dropdown-menu.tsx`): Radix UI wrappers with Tailwind styling
+- **Composite components** (`checkbox-list.tsx`, `wizard-container.tsx`, `command.tsx`): Multi-part patterns combining primitives
+- **Form components** (`input.tsx`, `textarea.tsx`, `label.tsx`, `form-section.tsx`): Input handling with accessibility
+- **Feedback** (`alert.tsx`, `alert-dialog.tsx`, `sonner.tsx`, `progress.tsx`): User notifications and status
+- **Layout** (`card.tsx`, `accordion.tsx`, `tabs.tsx`, `scroll-area.tsx`): Structural wrappers
+- **Utilities** (`badge.tsx`, `separator.tsx`, `tooltip.tsx`, `popover.tsx`, `collapsible.tsx`): Small focused components
+
+## Important Patterns
+
+- **Radix UI wrappers**: Components delegate to Radix primitives; apply Tailwind classes via `cn()` utility
+- **CVA (Class Variance Authority)**: `button.tsx` and similar use CVA for variant/size combinations
+- **Composition via Slot**: `Button` uses `asChild` prop + `Slot` from radix to render as any element type
+- **Data slots**: All components have `data-slot` attributes for testing/styling isolation
+- **Controlled styling**: Classes hardcoded in components; use `className` prop to override/extend
+- **Animations**: Radix `data-[state]` selectors for open/close animations (fade-in, zoom-in)
+- **Accessibility first**: ARIA attributes from Radix (aria-invalid, sr-only labels, focus rings)
+- **Dark mode support**: Uses Tailwind dark: prefix for color scheme (e.g., `dark:border-input`)
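+
+To make the variant pattern concrete, here is a hand-rolled stand-in for what a CVA definition resolves to. It is not the `class-variance-authority` API, and every class string except `bg-destructive` is a placeholder. Unlike a typical `cn()` helper, it only concatenates classes and does not resolve conflicting Tailwind utilities.
+
+```typescript
+// Illustrative variant table; class names are placeholders.
+const buttonVariants = {
+  base: 'inline-flex items-center rounded-md',
+  variant: {
+    default: 'bg-primary text-primary-foreground',
+    destructive: 'bg-destructive text-white',
+  },
+  size: {
+    default: 'h-9 px-4',
+    sm: 'h-8 px-3',
+  },
+} as const
+
+type Variant = keyof typeof buttonVariants.variant
+type Size = keyof typeof buttonVariants.size
+
+// Resolves base + variant + size classes, then appends caller overrides last.
+function buttonClasses(opts: { variant?: Variant; size?: Size; className?: string } = {}): string {
+  const { variant = 'default', size = 'default', className } = opts
+  return [buttonVariants.base, buttonVariants.variant[variant], buttonVariants.size[size], className]
+    .filter(Boolean)
+    .join(' ')
+}
+```
+
+Placing `className` last mirrors the "override/extend" convention described above.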
+
+## Key Dependencies
+
+- `@radix-ui/*`: Unstyled accessible primitives (dialog, select, dropdown-menu, etc.)
+- `class-variance-authority`: CVA for variant patterns
+- `lucide-react`: Icon library (XIcon in dialog close button)
+- `@/lib/utils`: `cn()` utility for class merging
+
+## How to Add New Components
+
+1. Create `.tsx` file wrapping Radix primitive or composing existing components
+2. Add `data-slot="component-name"` to root element
+3. Use `cn()` to merge default classes with `className` prop
+4. Export both component and variants (if using CVA)
+5. Document prop shape and usage in JSDoc
+
+## Important Quirks & Gotchas
+
+- **Slot forwarding**: `asChild={true}` on Button passes all props to child; ensure child accepts them
+- **FormData in dialogs**: Dialog not reset automatically; parent must manually clear form state
+- **Focus management**: Dialog auto-focuses first input; can cause layout shifts if inputs conditionally rendered
+- **Z-index stacking**: Fixed elements (Dialog overlay, dropdown menus) use z-50; be careful with other fixed elements
+- **Click outside closes dropdown**: Radix dropdowns auto-close on outside click; may conflict with hover-triggered actions
+- **SVG size inference**: Button uses `[&_svg:not([class*='size-'])]:size-4` to default icons without an explicit `size-*` class to `size-4` (1rem); set a size class explicitly if a different size is needed
+- **Global CSS conflicts**: Hardcoded Tailwind classes may conflict with global CSS; specificity matters
+- **Dark mode class**: Requires `dark` class on document root; not automatic with prefers-color-scheme alone
+
+## Testing Patterns
+
+```typescript
+// Test component rendering with props
+render(<Button variant="destructive" size="sm">Delete</Button>)
+expect(screen.getByRole('button')).toHaveClass('bg-destructive')
+
+// Test Dialog interaction
+render(<Dialog open={true}><DialogContent>Content</DialogContent></Dialog>)
+expect(screen.getByText('Content')).toBeInTheDocument()
+
+// Test accessibility
+expect(screen.getByRole('dialog')).toBeInTheDocument()
+```
--- a/frontend/src/lib/api/CLAUDE.md
+++ b/frontend/src/lib/api/CLAUDE.md
@ -0,0 +1,66 @@
+# API Module
+
+Axios-based client and resource-specific API modules for backend communication with auth, FormData handling, and error recovery.
+
+## Key Components
+
+- **`client.ts`**: Central Axios instance with request/response interceptors, auth headers, base URL resolution
+- **Resource modules** (`sources.ts`, `notebooks.ts`, `chat.ts`, `search.ts`, etc.): Endpoint-specific functions returning typed responses
+- **`query-client.ts`**: TanStack Query client configuration with default options
+- **`models.ts`, `notes.ts`, `embeddings.ts`, `settings.ts`**: Additional resource APIs
+
+## Important Patterns
+
+- **Single axios instance**: `apiClient` with 10-minute timeout (for slow LLM operations)
+- **Request interceptor**: Auto-fetches base URL from config, adds Bearer auth from localStorage `auth-storage`
+- **FormData handling**: Auto-removes Content-Type header for FormData to let browser set multipart boundary
+- **Response interceptor**: 401 clears auth and redirects to `/login`
+- **Async base URL resolution**: `getApiUrl()` fetches from runtime config on first request
+- **Typed responses, uncaught errors**: Functions return typed data via `response.data`; request failures are not caught here and propagate to the caller
+- **Namespaced exports**: Resource modules export namespaced objects (e.g., `sourcesApi.list()`, `sourcesApi.create()`)
+
+## Key Dependencies
+
+- `axios`: HTTP client library
+- `@/lib/config`: `getApiUrl()` for dynamic base URL
+- `@/lib/types/api`: TypeScript types for request/response shapes
+
+## How to Add New API Modules
+
+1. Create new file (e.g., `transforms.ts`)
+2. Import `apiClient`
+3. Export namespaced object with methods:
+   ```typescript
+   export const transformsApi = {
+     list: async () => { const response = await apiClient.get('/transforms'); return response.data }
+   }
+   ```
+4. Add types to `@/lib/types/api` if new response shapes needed
+
+## Important Quirks & Gotchas
+
+- **Base URL delay**: First request waits for `getApiUrl()` to resolve; can be slow on startup
+- **FormData fields as JSON strings**: Nested objects (arrays, objects) must be JSON stringified in FormData (e.g., `notebooks`, `transformations`)
+- **Timeout for streaming**: 10-minute timeout may not cover very long-running LLM operations; consider extending if needed
+- **Auth token management**: Token stored in localStorage `auth-storage` key; uses Zustand persist middleware
+- **Headers mutation in interceptor**: The interceptor mutates `config.headers` directly; be careful with interceptor order
+- **No retry logic**: Failed requests not automatically retried; must be handled in consuming code
+- **Content-Type header precedence**: FormData interceptor deletes Content-Type after checking; subsequent interceptors won't re-add it
+
+## Usage Example
+
+```typescript
+// Basic list
+const sources = await sourcesApi.list({ notebook_id: notebookId })
+
+// File upload with FormData
+const response = await sourcesApi.create({
+  type: 'upload',
+  file: fileObj,
+  notebook_id: notebookId,
+  async_processing: true
+})
+
+// With auth token (auto-added by interceptor)
+const notes = await notesApi.list()
+```
--- a/frontend/src/lib/hooks/CLAUDE.md
+++ b/frontend/src/lib/hooks/CLAUDE.md
@ -0,0 +1,64 @@
+# Hooks Module
+
+React hooks for API data fetching, state management, and complex workflows (chat, streaming, file handling).
+
+## Key Components
+
+- **Query hooks** (`useNotebookSources`, `useSource`, `useSources`): TanStack Query wrappers for source data with infinite scroll and refetch strategies
+- **Mutation hooks** (`useCreateSource`, `useUpdateSource`, `useDeleteSource`, `useFileUpload`, `useRetrySource`): Server mutations with toast notifications and cache invalidation
+- **Chat hooks** (`useNotebookChat`, `useSourceChat`): Complex session management, context building, and message streaming
+- **Streaming hooks** (`useAsk`): SSE parsing for multi-stage Ask workflows (strategy → answers → final answer)
+- **Model/config hooks** (`useModels`, `useSettings`, `useTransformations`): Application-level settings and model management
+- **Utility hooks** (`useMediaQuery`, `useToast`, `useNavigation`, `useAuth`): UI state and auth checking
+
+## Important Patterns
+
+- **TanStack Query integration**: All data hooks use `useQuery`/`useMutation` with `QUERY_KEYS` for cache consistency
+- **Optimistic updates**: Mutations add local state before server response (e.g., notebook chat messages)
+- **Cache invalidation**: Broad invalidation of query keys on mutations (e.g., `['sources']` catches all source queries)
+- **Auto-refetch on return**: `refetchOnWindowFocus: true` on frequently-changing data (sources, notebooks)
+- **Manual refetch controls**: Hooks return `refetch()` for parent components to trigger refresh
+- **SSE streaming pattern**: `useAsk` manually parses newline-delimited JSON from `/api/search/ask`; handles incomplete buffers
+- **Status polling**: `useSourceStatus` auto-refetches every 2s while `status` is `'running'`, `'queued'`, or `'new'`
+- **Context building**: `useNotebookChat.buildContext()` assembles selected sources + notes with token/char counts
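+
+The SSE buffering pattern can be sketched as an incremental parser that keeps the trailing partial line between reads. This is an illustration under assumptions, not the `useAsk` source: the factory name is hypothetical, and stripping an optional `data:` prefix is a guess at the wire format.
+
+```typescript
+// Incremental parser for newline-delimited JSON arriving in arbitrary chunks.
+function createNdjsonParser() {
+  let buffer = ''
+  return {
+    // Feed one decoded text chunk; returns every complete event it contained.
+    push(chunk: string): unknown[] {
+      buffer += chunk
+      const lines = buffer.split('\n')
+      buffer = lines.pop() ?? '' // last piece may be a partial line; keep it for the next chunk
+      const events: unknown[] = []
+      for (const line of lines) {
+        const trimmed = line.trim()
+        if (!trimmed) continue
+        // Assumption: tolerate an SSE-style "data:" prefix if the backend sends one.
+        const payload = trimmed.startsWith('data:') ? trimmed.slice(5).trim() : trimmed
+        try {
+          events.push(JSON.parse(payload))
+        } catch {
+          // Malformed line: skipped silently, mirroring the behavior described for useAsk
+        }
+      }
+      return events
+    },
+  }
+}
+```
+
+A caller would decode each network chunk to text, pass it to `push()`, and dispatch the returned events by stage.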
+
+## Key Dependencies
+
+- `@tanstack/react-query`: Data fetching and caching
+- `sonner`: Toast notifications
+- `@/lib/api/*`: API module exports (sourcesApi, chatApi, searchApi, etc.)
+- `@/lib/types/api`: TypeScript response types
+- Zustand stores: `useAuthStore`, modal managers
+
+## How to Add New Hooks
+
+1. **Data queries**: Create `useQuery` hook wrapping API call; use `QUERY_KEYS.entityName(id)` for cache key
+2. **Mutations**: Create `useMutation` hook with `onSuccess` cache invalidation + toast feedback
+3. **Complex state**: Use `useState` + callbacks for local state (see `useAsk`, `useNotebookChat`)
+4. **Return shape**: Export object with both state and action functions for composability
+
+## Important Quirks & Gotchas
+
+- **Cache invalidation breadth**: Invalidating `['sources']` affects ALL source queries; be precise if performance matters
+- **Optimistic updates + error handling**: `useNotebookChat` removes optimistic messages on error; ensure cleanup
+- **SSE buffer handling**: `useAsk` keeps incomplete lines in buffer between reads; incomplete JSON silently skipped
+- **Model override timing**: `useNotebookChat` stores pending model override if no session exists; applied on session creation
+- **Pagination offset**: `useNotebookSources` uses offset-based pagination (not cursors); `nextOffset` calculated from page size
+- **Status polling race**: `useSourceStatus` may refetch stale data before server catches up; retry logic has 3-attempt limit
+- **Keyboard trap in dialogs**: Some hooks manage modal state; ensure Dialog/Modal components handle escape key properly
+- **Form data handling**: `useFileUpload` and source creation convert JSON fields to strings in FormData
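+
+The status-polling rule can be written as a small function whose return value has the shape TanStack Query's `refetchInterval` option accepts (a delay in milliseconds, or `false` to stop). The terminal status names `'completed'` and `'failed'` and the function name are assumptions; only `'new'`, `'queued'`, and `'running'` come from this document.
+
+```typescript
+type SourceStatus = 'new' | 'queued' | 'running' | 'completed' | 'failed'
+
+const IN_PROGRESS: ReadonlySet<SourceStatus> = new Set<SourceStatus>(['new', 'queued', 'running'])
+
+// Returns the delay before the next poll, or false to stop polling.
+function pollingInterval(status: SourceStatus | undefined, intervalMs = 2000): number | false {
+  if (status === undefined) return false // no data yet (assumption): rely on the initial fetch
+  return IN_PROGRESS.has(status) ? intervalMs : false
+}
+```
+
+Stopping on any status outside the in-progress set avoids polling forever if the backend introduces a new terminal state.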
+
+## Testing Patterns
+
+```typescript
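+// Mock API (resolve with fixture data)
+const mockApi = {
+  list: vi.fn().mockResolvedValue([/* ...fixture sources */])
+}
+
+// QueryClientProvider requires a `client` prop, so wrap it in a small component
+const queryClient = new QueryClient()
+const invalidateSpy = vi.spyOn(queryClient, 'invalidateQueries')
+const wrapper = ({ children }: { children: React.ReactNode }) => (
+  <QueryClientProvider client={queryClient}>{children}</QueryClientProvider>
+)
+render(<Component />, { wrapper })
+
+// Assert mutations trigger cache invalidation
+await waitFor(() => expect(invalidateSpy).toHaveBeenCalled())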
+```
--- a/frontend/src/lib/stores/CLAUDE.md
+++ b/frontend/src/lib/stores/CLAUDE.md
@ -0,0 +1,68 @@
+# Stores Module
+
+Zustand-based state management for authentication, modals, and application-level settings with localStorage persistence.
+
+## Key Components
+
+- **`auth-store.ts`**: Authentication state (token, isAuthenticated) with login, logout, auth checking, and Zustand persistence
+- **Modal stores** (imported via hooks): Modal visibility and data state management
+- **Settings persistence**: Auto-saves sensitive state (token, auth status) to localStorage via Zustand persist middleware
+
+## Important Patterns
+
+- **Zustand create + persist**: State + actions combined in single store; `persist` middleware auto-syncs to localStorage
+- **Selective persistence**: `partialize` option limits what's saved (e.g., only `token` and `isAuthenticated`, not `isLoading`)
+- **Hydration tracking**: `setHasHydrated()` marks when localStorage data loaded; used to avoid hydration mismatch in SSR
+- **Auth caching**: 30-second cache on `checkAuth()` to avoid excessive API calls; stores `lastAuthCheck` timestamp
+- **Network resilience**: Handles 401 globally in API interceptor; graceful degradation if API unreachable
+- **API validation**: Uses actual API call (`/notebooks` endpoint) to validate token instead of parsing JWT
+
+## Key Dependencies
+
+- `zustand`: State management library
+- `@/lib/config`: `getApiUrl()` for dynamic server discovery
+- localStorage: Browser persistence API
+
+## How to Add New Stores
+
+1. Create new file (e.g., `settings-store.ts`)
+2. Define interface extending store state and actions
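+3. Use `create<Interface>()(persist(...))` for persistence, or plain `create<Interface>()` for ephemeral state:
+   ```typescript
+   import { create } from 'zustand'
+   import { persist } from 'zustand/middleware'
+
+   interface SettingsState {
+     theme: 'light' | 'dark'
+     setTheme: (theme: SettingsState['theme']) => void
+   }
+
+   export const useSettingsStore = create<SettingsState>()(
+     persist((set) => ({
+       theme: 'dark',
+       setTheme: (theme) => set({ theme })
+     }), {
+       name: 'settings-storage' // localStorage key; must be unique per store
+     })
+   )
+   ```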
+
+## Important Quirks & Gotchas
+
+- **Hydration mismatch**: Components that read persisted state during server-side rendering must check `hasHydrated` before rendering to prevent SSR mismatches
+- **localStorage key collision**: Persist middleware uses `name` option as localStorage key; ensure unique per store
+- **Token not validated**: `login()` only checks HTTP 200 response; doesn't decode or validate JWT structure
+- **Auth check race condition**: Multiple simultaneous `checkAuth()` calls return early if one already in progress (`isCheckingAuth`)
+- **Error messages from HTTP**: Shows 401/403/5xx status codes to user; helps with debugging but may leak info
+- **Network timeout handling**: Network errors in `checkAuthRequired()` set `authRequired: null` (safe default); `login()` shows generic message
+- **Logout doesn't invalidate session**: Client-side logout only clears local token; server session may still be valid
+- **Double authentication**: Both `login()` and `checkAuth()` test same `/notebooks` endpoint; could be optimized with dedicated endpoint
+
+## Testing Patterns
+
+```typescript
+// Mock store
+const mockAuthStore = {
+  isAuthenticated: true,
+  token: 'test-token',
+  checkAuth: vi.fn().mockResolvedValue(true),
+  login: vi.fn().mockResolvedValue(true),
+  logout: vi.fn()
+}
+
+// Test store mutations
+act(() => useSettingsStore.setState({ theme: 'light' }))
+expect(useSettingsStore.getState().theme).toBe('light')
+```
--- a/open_notebook/CLAUDE.md
+++ b/open_notebook/CLAUDE.md
@ -0,0 +1,242 @@
+# Open Notebook Core Backend
+
+The `open_notebook` module is the heart of the system: a multi-layer backend orchestrating AI-powered research workflows. It bridges domain models, asynchronous database operations, LangGraph-based content processing, and multi-provider AI model management.
+
+## Purpose
+
+Encapsulates the entire backend architecture:
+1. **Data layer**: SurrealDB persistence with async CRUD and migrations
+2. **Domain layer**: Research models (Notebook, Source, Note, etc.) with embedded relationships
+3. **Workflow layer**: LangGraph state machines for content ingestion, chat, and transformations
+4. **AI provisioning**: Multi-provider model management with smart fallback logic
+5. **Support services**: Context building, tokenization, and utility functions
+
+All components communicate through async/await patterns and use Pydantic for validation.
+
+## Architecture Overview
+
+```
+┌─────────────────────────────────────────────────────────────┐
+│                    API / Streamlit UI                        │
+└──────────────────────┬──────────────────────────────────────┘
+                       │
+    ┌──────────────────┴──────────────────┐
+    │                                     │
+┌───▼────────────────────┐   ┌──────────▼────────────────┐
+│    Graphs (LangGraph)   │   │   Domain Models (Data)    │
+│ - source.py (ingestion) │   │ - Notebook, Source, Note  │
+│ - chat.py              │   │ - ChatSession, Asset       │
+│ - ask.py (search)      │   │ - SourceInsight, Embedding│
+│ - transformation.py    │   │ - Transformation, Settings│
+└───┬────────────────────┘   │ - EpisodeProfile, Podcast │
+    │                        └──────────┬─────────────────┘
+    │                                   │
+    └───────────────────┬───────────────┘
+                        │
+    ┌───────────────────┴────────────────────┐
+    │                                        │
+┌───▼─────────────────┐      ┌──────────────▼──────┐
+│  AI Module (Models)  │      │  Utils (Helpers)     │
+│ - ModelManager       │      │ - ContextBuilder     │
+│ - DefaultModels      │      │ - TokenUtils         │
+│ - provision_langchain│      │ - TextUtils          │
+│ - Multi-provider AI  │      │ - VersionUtils       │
+└───┬─────────────────┘      └──────────┬──────────┘
+    │                                   │
+    └───────────────────┬───────────────┘
+                        │
+         ┌──────────────▼────────────────┐
+         │  Database (SurrealDB)          │
+         │ - repository.py (CRUD ops)     │
+         │ - async_migrate.py (schema)    │
+         │ - Configuration                │
+         └────────────────────────────────┘
+```
+
+## Component Catalog
+
+### Core Layers
+
+**See dedicated CLAUDE.md files for detailed patterns and usage:**
+
+- **`database/`**: Async repository pattern (repo_query, repo_create, repo_upsert), per-call connections (no pooling; see Quirks & Gotchas), and automatic schema migrations on API startup. See `database/CLAUDE.md`.
+
+- **`domain/`**: Core data models using Pydantic with SurrealDB persistence. Two base classes: `ObjectModel` (mutable records with auto-generated `table:id` record IDs and embedding) and `RecordModel` (singleton configuration). Includes search functions (text_search, vector_search). See `domain/CLAUDE.md`.
+
+- **`graphs/`**: LangGraph state machines for async workflows. Content ingestion (source.py), conversational agents (chat.py), search synthesis (ask.py), and transformations. Uses provision_langchain_model() for smart model selection with token-aware fallback. See `graphs/CLAUDE.md`.
+
+- **`ai/`**: Centralized AI model lifecycle via Esperanto library. ModelManager factory with intelligent fallback (large context detection, type-specific defaults, config override). Supports 8+ providers (OpenAI, Anthropic, Google, Groq, Ollama, Mistral, DeepSeek, xAI). See `ai/CLAUDE.md`.
+
+- **`utils/`**: Cross-cutting utilities: ContextBuilder (flexible context assembly from sources/notes/insights with token budgeting), TextUtils (truncation, cleaning), TokenUtils (GPT token counting), VersionUtils (schema compatibility). See `utils/CLAUDE.md`.
+
+- **`podcasts/`**: Podcast generation models: SpeakerProfile (TTS voice config), EpisodeProfile (generation settings), PodcastEpisode (job tracking via surreal-commands). See `podcasts/CLAUDE.md`.
+
+### Configuration & Exceptions
+
+- **`config.py`**: Paths for data folder, uploads, LangGraph checkpoints, and tiktoken cache. Auto-creates directories.
+- **`exceptions.py`**: Hierarchy of OpenNotebookError subclasses for database, file, network, authentication, and rate-limit failures.
+
+## Data Flow: Content Ingestion
+
+```
+User uploads file/URL
+         │
+         ▼
+┌─────────────────────────────────────┐
+│ source.py (LangGraph state machine) │
+├─────────────────────────────────────┤
+│ 1. content_process()                │
+│    - extract_content() from file/URL│
+│    - Use ContentSettings defaults    │
+│    - speech_to_text model from DB   │
+│                                     │
+│ 2. save_source()                    │
+│    - Update Source with full_text   │
+│    - Preserve title if empty        │
+│                                     │
+│ 3. trigger_transformations()        │
+│    - Fan-out per transformation     │
+└────────────────┬────────────────────┘
+                 │
+                 ▼
+         ┌────────────────────────────────┐
+         │ transformation.py (parallel)   │
+         │ - Apply prompt to source text  │
+         │ - Generate insights            │
+         │ - Auto-embed results           │
+         └────────────────────────────────┘
+                 │
+                 ▼
+        ┌────────────────────┐
+        │ Database Storage    │
+        │ - Source.full_text  │
+        │ - SourceInsight     │
+        │ - Embeddings        │
+        │ - (async job)       │
+        └────────────────────┘
+```
+
+**Fire-and-forget embeddings**: Source.vectorize() returns command_id without awaiting; embedding happens asynchronously via surreal-commands job system.
+
+## Data Flow: Chat & Search
+
+```
+User message in chat
+         │
+         ▼
+┌──────────────────────────┐
+│ ContextBuilder           │
+│ - Select sources/notes   │
+│ - Token budget limiting  │
+│ - Priority weighting     │
+└──────────┬───────────────┘
+           │
+           ▼
+┌──────────────────────────────────┐
+│ chat.py or ask.py (LangGraph)    │
+│ - Load context from above        │
+│ - provision_langchain_model()    │
+│   * Auto-upgrade for large text  │
+│   * Apply model_id override      │
+│ - Call LLM with context          │
+│ - Store message in SqliteSaver   │
+└──────────┬───────────────────────┘
+           │
+           ▼
+    ┌──────────────┐
+    │ LLM Response │
+    │ (persisted)  │
+    └──────────────┘
+```
+
+## Key Patterns Across Layers
+
+### Async/Await Everywhere
+All database operations, model provisioning, and graph execution are async. Mix with sync code only via `asyncio.run()` or LangGraph's async bridges (see graphs/CLAUDE.md for workarounds).
+
+### Type-Driven Dispatch
+Model types (language, embedding, speech_to_text, text_to_speech) drive factory logic in ModelManager. Domain model IDs encode their type: `notebook:uuid`, `source:uuid`, `note:uuid`.
+
+### Smart Fallback Logic
+`provision_langchain_model()` auto-detects large contexts (105K+ tokens) and upgrades to dedicated large_context_model. Falls back to default_chat_model if specific type not found.
+
+### Fire-and-Forget Jobs
+Time-consuming operations (embedding, podcast generation) return command_id immediately. Caller polls surreal-commands for status; no blocking.
+
+### Embedding on Save
+Domain models with `needs_embedding()=True` auto-generate embeddings in `save()`. Search functions (text_search, vector_search) use embeddings for semantic matching.
+
+### Relationship Management
+SurrealDB graph edges link entities: Notebook→Source (has), Source→Note (artifact), Note→Source (refers_to). See `relate()` in domain/base.py.
+
+## Integration Points
+
+**API startup** (`api/main.py`):
+- AsyncMigrationManager.run_migration_up() on lifespan startup
+- Ensures schema is current before handling requests
+
+**Streamlit UI** (`pages/stream_app/`):
+- Calls domain models directly to fetch/create notebooks, sources, notes
+- Invokes graphs (chat, source, ask) via async wrapper
+- Relies on API for migrations (deprecated check in UI)
+
+**Background Jobs** (`surreal_commands`):
+- Source.vectorize() submits async embedding job
+- PodcastEpisode.get_job_status() polls job queue
+- Decouples long-running operations from request flow
+
+## Important Quirks & Gotchas
+
+1. **Token counting rough estimate**: Uses cl100k_base encoding; may differ 5-10% from actual model
+2. **Large context threshold hard-coded**: 105,000 token limit for large_context_model upgrade (not configurable)
+3. **Async loop gymnastics in graphs**: ThreadPoolExecutor workaround for LangGraph sync nodes calling async functions (fragile)
+4. **DefaultModels always fresh**: get_instance() bypasses singleton cache to pick up live config changes
+5. **Polymorphic model.get()**: Resolves subclass from ID prefix; fails silently if subclass not imported
+6. **RecordID string inconsistency**: repo_update() accepts both "table:id" format and full RecordID
+7. **Snapshot profiles**: podcast profiles stored as dicts, so config updates don't affect past episodes
+8. **No connection pooling**: Each repo_* creates new connection (adequate for HTTP but inefficient for bulk)
+9. **Circular import guard**: utils imports domain; domain must not import utils (breaks on import)
+10. **SqliteSaver shared location**: LangGraph checkpoints from LANGGRAPH_CHECKPOINT_FILE env var; all graphs use same file
+
+## How to Add a New Feature
+
+**New data model**:
+1. Create class inheriting from `ObjectModel` with `table_name` ClassVar
+2. Define Pydantic fields and validators
+3. Override `needs_embedding()` if searchable
+4. Add custom methods for domain logic (get_X, add_to_Y)
+5. Register in domain/__init__.py exports
+
+**New workflow**:
+1. Create state machine in graphs/WORKFLOW.py using StateGraph
+2. Import domain models and provision_langchain_model()
+3. Define nodes as async functions taking State, returning dict
+4. Compile with graph.compile()
+5. Invoke from API endpoint or Streamlit page
+
+**New AI model type**:
+1. Add type string to Model class
+2. Add AIFactory.create_* method in Esperanto
+3. Handle in ModelManager.get_model()
+4. Add DefaultModels field + getter
+
+## Key Dependencies
+
+- **surrealdb**: AsyncSurreal client, RecordID type
+- **pydantic**: Validation, field_validator
+- **langgraph**: StateGraph, Send, SqliteSaver, async/sync bridging
+- **langchain_core**: Messages, OutputParser, RunnableConfig
+- **esperanto**: Multi-provider AI model abstraction (OpenAI, Anthropic, Google, Groq, Ollama, etc.)
+- **content-core**: File/URL content extraction
+- **ai_prompter**: Jinja2 template rendering for prompts
+- **surreal_commands**: Async job queue for embeddings, podcast generation
+- **loguru**: Structured logging throughout
+- **tiktoken**: GPT token encoding for context window estimation
+
+## Codebase Statistics
+
+- **Modules**: 6 core layers + support services
+- **Async operations**: Database, AI provisioning, graph execution, embedding, job tracking
+- **Supported AI providers**: 8+ (OpenAI, Anthropic, Google, Groq, Ollama, Mistral, DeepSeek, xAI, OpenRouter)
+- **Domain models**: Notebook, Source, Note, SourceInsight, SourceEmbedding, ChatSession, Asset, Transformation, ContentSettings, EpisodeProfile, SpeakerProfile, PodcastEpisode
+- **Graph workflows**: 6 (source, chat, source_chat, ask, transformation, prompt)
--- a/open_notebook/ai/CLAUDE.md
+++ b/open_notebook/ai/CLAUDE.md
@@ -0,0 +1,109 @@
+# AI Module
+
+Model configuration, provisioning, and management for multi-provider AI integration via Esperanto.
+
+## Purpose
+
+Centralizes AI model lifecycle: database models for model metadata (provider, type), default model configuration, and factory for instantiating LLM/embedding/speech models at runtime with fallback logic.
+
+## Architecture Overview
+
+**Two-tier system**:
+1. **Database models** (`Model`, `DefaultModels`): Metadata storage and default configuration
+2. **ModelManager**: Factory for provisioning models with intelligent fallback (large context detection, config override)
+
+All models use Esperanto library as provider abstraction (OpenAI, Anthropic, Google, Groq, Ollama, Mistral, DeepSeek, xAI, OpenRouter).
+
+## Component Catalog
+
+### models.py
+
+#### Model (ObjectModel)
+- Database record: name, provider, type (language/embedding/speech_to_text/text_to_speech)
+- `get_models_by_type()`: Async query to fetch all models of a specific type
+- Stores provider-model pairs for AI factory instantiation
+
+#### DefaultModels (RecordModel)
+- Singleton configuration record (record_id: `open_notebook:default_models`)
+- Fields: default_chat_model, default_transformation_model, large_context_model, default_text_to_speech_model, default_speech_to_text_model, default_embedding_model, default_tools_model
+- `get_instance()`: Fetches a fresh instance from the database on every call, overriding the parent's singleton cache so config changes apply immediately
+
+#### ModelManager
+- Stateless factory for instantiating AI models
+- `get_model(model_id)`: Retrieves Model by ID, creates via AIFactory.create_* based on type
+- `get_defaults()`: Fetches DefaultModels configuration
+- `get_default_model(model_type)`: Smart lookup (e.g., "chat" → default_chat_model, "transformation" → default_transformation_model with fallback to chat)
+- `get_speech_to_text()`, `get_text_to_speech()`, `get_embedding_model()`: Type-specific convenience methods with assertions
+- **Global instance**: `model_manager` singleton exported for use throughout app
+
+### provision.py
+
+#### provision_langchain_model()
+- Factory for LangGraph nodes needing LLM provisioning
+- **Smart fallback logic**:
+  - If tokens > 105,000: Use `large_context_model`
+  - Elif `model_id` specified: Use specific model
+  - Else: Use default model for type (e.g., "chat", "transformation")
+- Returns LangChain-compatible model via `.to_langchain()`
+- Logs model selection decision
+
+## Common Patterns
+
+- **Type dispatch**: Model.type field drives factory logic (4 model types)
+- **Provider abstraction**: Esperanto handles provider differences; ModelManager unaware of provider specifics
+- **Fresh defaults**: DefaultModels.get_instance() always fetches from database (not cached) for live config updates
+- **Config override**: provision_langchain_model() accepts kwargs passed to AIFactory.create_* methods
+- **Token-based selection**: provision_langchain_model() detects large contexts and upgrades model automatically
+- **Type assertions**: get_speech_to_text(), get_embedding_model() assert returned type (safety check)
+
+## Key Dependencies
+
+- `esperanto`: AIFactory.create_language(), create_embedding(), create_speech_to_text(), create_text_to_speech()
+- `open_notebook.database.repository`: repo_query, ensure_record_id
+- `open_notebook.domain.base`: ObjectModel, RecordModel base classes
+- `open_notebook.utils`: token_count() for context size detection
+- `loguru`: Logging for model selection decisions
+
+## Important Quirks & Gotchas
+
+- **Token counting rough estimate**: provision_langchain_model() uses token_count() which estimates via cl100k_base encoding (may differ 5-10% from actual model)
+- **Large context threshold hard-coded**: 105,000 token threshold for large_context_model upgrade (not configurable)
+- **DefaultModels.get_instance() fresh fetch**: Intentionally bypasses parent singleton cache to pick up live config changes; creates new instance each call
+- **Type-specific getters use assertions**: get_speech_to_text() asserts isinstance (catches misconfiguration early)
+- **Missing models raise**: ModelManager.get_model() raises ValueError if the model is not found, and nothing upstream catches it
+- **Esperanto caching**: Actual model instances cached by Esperanto (not by ModelManager); ModelManager stateless
+- **Fallback chain specificity**: "transformation" type falls back to default_chat_model if not explicitly set (convention-based)
+- **kwargs passed through**: provision_langchain_model() passes kwargs to AIFactory but doesn't validate what's accepted
+
+## How to Extend
+
+1. **Add new model type**: Add type string to Model.type enum, add create_* method in AIFactory, handle in ModelManager.get_model()
+2. **Add new default configuration**: Extend DefaultModels with new field (e.g., default_vision_model), add getter in ModelManager
+3. **Change fallback logic**: Modify provision_langchain_model() token threshold or fallback chain
+4. **Add model filtering**: Extend Model.get_models_by_type() with additional filters (e.g., by provider)
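+5. **Implement model caching**: Put ModelManager results behind an async-aware cache; `functools.lru_cache` on an `async def` caches the coroutine object rather than its result, and it requires hashable arguments (relevant for kwargs)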
+
+## Usage Example
+
+```python
+from open_notebook.ai.models import model_manager
+
+# Get default chat model
+chat_model = await model_manager.get_default_model("chat")
+
+# Get specific model by ID
+embedding_model = await model_manager.get_model("model:openai_embedding")
+
+# Get embedding model with config override
+embedding_model = await model_manager.get_embedding_model(temperature=0.1)
+
+# Provision model for LangGraph (auto-detects large context)
+from open_notebook.ai.provision import provision_langchain_model
+langchain_model = await provision_langchain_model(
+    content=long_text,
+    model_id=None,  # Use default
+    default_type="chat",
+    temperature=0.7
+)
+```
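
Reviewer note on `open_notebook/ai/CLAUDE.md`: the fallback order listed under `provision_langchain_model()` can be sketched as a pure function. This is a minimal illustration built only from the three rules quoted in the document, not the project's implementation; `ModelChoice` and `choose_model` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional

# Threshold quoted in the document; the notes say it is hard-coded.
LARGE_CONTEXT_THRESHOLD = 105_000


@dataclass(frozen=True)
class ModelChoice:
    """Hypothetical result type: which lookup path was taken and what to look up."""
    source: str  # "large_context", "explicit", or "default"
    key: str     # default-field name, model id, or default type


def choose_model(tokens: int, model_id: Optional[str], default_type: str) -> ModelChoice:
    # 1. Oversized content wins over everything, including an explicit model_id.
    if tokens > LARGE_CONTEXT_THRESHOLD:
        return ModelChoice("large_context", "large_context_model")
    # 2. Otherwise an explicitly requested model is used.
    if model_id:
        return ModelChoice("explicit", model_id)
    # 3. Fall back to the configured default for the requested type.
    return ModelChoice("default", default_type)
```

Note that under this ordering an explicit `model_id` is ignored once the content exceeds the threshold, which is worth keeping in mind when a caller pins a model on purpose.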
--- a/open_notebook/database/CLAUDE.md
+++ b/open_notebook/database/CLAUDE.md
@@ -0,0 +1,124 @@
+# Database Module
+
+SurrealDB abstraction layer providing repository pattern for CRUD operations and async migration management.
+
+## Purpose
+
+Encapsulates all database interactions: connection management (one connection per operation, no pooling), async CRUD operations, relationship management, and schema migrations. Provides a clean interface for domain models and API endpoints to interact with SurrealDB without direct query knowledge.
+
+## Architecture Overview
+
+Two-tier system:
+1. **Repository Layer** (repository.py): Raw async CRUD operations on SurrealDB via AsyncSurreal client
+2. **Migration Layer** (async_migrate.py): Schema versioning and migration execution
+
+Both leverage connection context manager for lifecycle management and automatic cleanup.
+
+## Component Catalog
+
+### repository.py
+
+**Connection Management**
+- `get_database_url()`: Resolves `SURREAL_URL` or constructs from `SURREAL_ADDRESS`/`SURREAL_PORT` (backward compatible)
+- `get_database_password()`: Falls back from `SURREAL_PASSWORD` to legacy `SURREAL_PASS` env var
+- `db_connection()`: Async context manager handling sign-in, namespace/database selection, and cleanup
+  - Opens AsyncSurreal, authenticates, selects namespace/database, yields connection, closes on exit
+
+**Query Operations**
+- `repo_query(query_str, vars)`: Execute raw SurrealQL with parameter substitution; returns list of dicts
+- `repo_create(table, data)`: Insert record; auto-adds `created`/`updated` timestamps; removes any existing `id` field
+- `repo_insert(table, data_list, ignore_duplicates)`: Bulk insert multiple records; optionally ignores "already contains" errors
+- `repo_upsert(table, id, data, add_timestamp)`: MERGE operation for create-or-update; optionally adds `updated` timestamp
+- `repo_update(table, id, data)`: Update existing record by table+id or full record_id; auto-adds `updated`, parses ISO dates
+- `repo_delete(record_id)`: Delete record by RecordID
+- `repo_relate(source, relationship, target, data)`: Create graph relationship; optional relationship data
+
+**Utilities**
+- `parse_record_ids(obj)`: Recursively converts SurrealDB RecordID objects to strings (deep tree traversal)
+- `ensure_record_id(value)`: Coerces string or RecordID to RecordID type
+
+### async_migrate.py
+
+**Migration Classes**
+- `AsyncMigration`: Single migration wrapper
+  - `from_file(path)`: Load .surrealql file; strips comments and whitespace
+  - `run(bump)`: Execute the migration's SurrealQL; call bump_version() on success (bump=True) or lower_version() (bump=False)
+
+- `AsyncMigrationRunner`: Sequences multiple migrations
+  - `run_all()`: Execute pending migrations from current_version to end
+  - `run_one_up()`: Run next migration
+  - `run_one_down()`: Rollback latest migration
+
+- `AsyncMigrationManager`: Main orchestrator
+  - Loads 9 up migrations + 9 down migrations (hard-coded in __init__)
+  - `get_current_version()`: Query max version from _sbl_migrations table
+  - `needs_migration()`: Boolean check (current < total migrations available)
+  - `run_migration_up()`: Run all pending migrations with logging
+
+**Version Tracking**
+- `get_latest_version()`: Query max version; returns 0 if _sbl_migrations table missing
+- `get_all_versions()`: Fetch all migration records; returns empty list on error
+- `bump_version()`: INSERT new entry into _sbl_migrations with version + applied_at timestamp
+- `lower_version()`: DELETE latest migration record (rollback)
+
+### migrate.py
+
+**Backward Compatibility**
+- `MigrationManager`: Sync wrapper around AsyncMigrationManager
+  - `get_current_version()`: Wraps async call with asyncio.run()
+  - `needs_migration` property: Checks if migration pending
+  - `run_migration_up()`: Execute migrations synchronously
+
+## Common Patterns
+
+- **Async-first design**: All operations async via AsyncSurreal; sync wrapper provided for legacy code
+- **Connection per operation**: Each repo_* function opens/closes connection (no pooling); designed for serverless/stateless API
+- **Auto-timestamping**: repo_create() and repo_update() auto-set `created`/`updated` fields
+- **Error resilience**: RuntimeError for transaction conflicts (retriable); catches and re-raises other exceptions
+- **RecordID polymorphism**: Functions accept string or RecordID; coerced to consistent type
+- **Graceful degradation**: Migration queries catch exceptions and treat table-not-found as version 0
+
+## Key Dependencies
+
+- `surrealdb`: AsyncSurreal client, RecordID type
+- `loguru`: Logging with context (debug/error/success levels)
+- Python stdlib: `os` (env vars), `datetime` (timestamps), `contextlib` (async context manager)
+
+## Important Quirks & Gotchas
+
+- **No connection pooling**: Each repo_* operation creates new connection; adequate for HTTP request-scoped operations but inefficient for bulk workloads
+- **Hard-coded migration files**: AsyncMigrationManager lists migrations 1-9 explicitly; adding new migration requires code change (not auto-discovery)
+- **Record ID format inconsistency**: repo_update() accepts both `table:id` format and full RecordID; path handling can be subtle
+- **ISO date parsing**: repo_update() parses `created` field from string to datetime if present; assumes ISO format
+- **Timestamp overwrite risk**: repo_create() always sets new timestamps; can't preserve original created time on reimport
+- **Transaction conflict handling**: RuntimeError from transaction conflicts logged without stack trace (prevents log spam)
+- **Graceful empty returns**: get_all_versions() returns [] when the migrations table is missing, which lets the migration system bootstrap cleanly
+
+## How to Extend
+
+1. **Add new CRUD operation**: Follow repo_* pattern (open connection, execute query, handle errors, close)
+2. **Add migration**: Create migration file in `/migrations/N.surrealql` and `/migrations/N_down.surrealql`; update AsyncMigrationManager to load new files
+3. **Change timestamp behavior**: Modify repo_create()/repo_update() to not auto-set `updated` field if caller-provided
+4. **Implement connection pooling**: Replace db_connection context manager with pool.acquire() pattern (for high-throughput scenarios)
+
+## Integration Points
+
+- **API startup** (api/main.py): FastAPI lifespan handler calls AsyncMigrationManager.run_migration_up() on server start
+- **Domain models** (domain/*.py): All models call repo_* functions for persistence
+- **Commands** (commands/*.py): Background jobs use repo_* for state updates
+- **Streamlit UI** (pages/*.py): Deprecated migration check; relies on API to run migrations
+
+## Usage Example
+
+```python
+from open_notebook.database.repository import repo_create, repo_query, repo_update
+
+# Create (table and field names follow the Notebook domain model)
+record = await repo_create("notebook", {"name": "Research"})
+
+# Query with a bound parameter
+results = await repo_query("SELECT * FROM notebook WHERE name = $name", {"name": "Research"})
+
+# Update
+await repo_update("notebook", record["id"], {"name": "Updated Research"})
+```
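
Reviewer note on `open_notebook/database/CLAUDE.md`: the "connection per operation" pattern can be illustrated without a database. The sketch below assumes a stand-in `FakeConnection` class (hypothetical, not the `surrealdb` client API) and simplified versions of `db_connection()` and `repo_query()` that only mirror the shape described in the document.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeConnection:
    """Stand-in for a database client; not the surrealdb API."""

    def __init__(self):
        self.closed = False

    async def query(self, query_str, variables=None):
        return [{"query": query_str, "vars": variables or {}}]

    async def close(self):
        self.closed = True


opened = []  # records every connection so the example can inspect them


@asynccontextmanager
async def db_connection():
    conn = FakeConnection()
    opened.append(conn)
    try:
        yield conn
    finally:
        # Cleanup runs even if the query raises.
        await conn.close()


async def repo_query(query_str, variables=None):
    # Each call opens and closes its own connection (no pooling).
    async with db_connection() as conn:
        return await conn.query(query_str, variables)


async def main():
    await repo_query("SELECT * FROM notebook")
    await repo_query("SELECT * FROM source")
    return len(opened), all(c.closed for c in opened)


result = asyncio.run(main())
```

Two calls produce two separate connections, both closed on exit. That is the trade-off the quirks list describes: simple lifecycle per request, extra connection overhead for bulk work.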
--- a/open_notebook/domain/CLAUDE.md
+++ b/open_notebook/domain/CLAUDE.md
@@ -0,0 +1,100 @@
+# Domain Module
+
+Core data models for notebooks, sources, notes, and settings with async SurrealDB persistence, auto-embedding, and relationship management.
+
+## Purpose
+
+Two base classes support different persistence patterns: **ObjectModel** (mutable records with auto-generated IDs) and **RecordModel** (singleton configuration with fixed IDs).
+
+## Key Components
+
+### base.py
+- **ObjectModel**: Base for notebooks, sources, notes
+  - `save()`: Create/update with auto-embedding for searchable content
+  - `delete()`: Remove by ID
+  - `relate(relationship, target_id)`: Create graph relationships (reference, artifact, refers_to)
+  - `get(id)`: Polymorphic fetch; resolves subclass from ID prefix
+  - `get_all(order_by)`: Fetch all records from table
+  - Integrates with ModelManager for automatic embedding
+
+- **RecordModel**: Singleton configuration (ContentSettings, DefaultPrompts)
+  - Fixed record_id per subclass
+  - `update()`: Upsert to database
+  - Lazy DB loading via `_load_from_db()`
+
+### notebook.py
+- **Notebook**: Research project container
+  - `get_sources()`, `get_notes()`, `get_chat_sessions()`: Navigate relationships
+
+- **Source**: Content item (file/URL)
+  - `vectorize()`: Submit async embedding job (returns command_id, fire-and-forget)
+  - `get_status()`, `get_processing_progress()`: Track job via surreal_commands
+  - `get_context()`: Returns summary for LLM context
+  - `add_insight()`: Generate and store insights with embeddings
+
+- **Note**: Standalone or linked notes
+  - `needs_embedding()`: Always True (searchable)
+  - `add_to_notebook()`: Link to notebook
+
+- **SourceInsight, SourceEmbedding**: Derived content models
+- **ChatSession**: Conversation container with optional model_override
+- **Asset**: File/URL reference helper
+
+- **Search functions**:
+  - `text_search()`: Full-text keyword search
+  - `vector_search()`: Semantic search via embeddings (default minimum_score=0.2)
+
+### content_settings.py
+- **ContentSettings**: Singleton for processing engines, embedding strategy, file deletion, YouTube languages
+
+### transformation.py
+- **Transformation**: Reusable prompts for content transformation
+- **DefaultPrompts**: Singleton with transformation instructions
+
+## Important Patterns
+
+- **Async/await**: All DB operations async; always use await
+- **Polymorphic get()**: `ObjectModel.get(id)` determines subclass from ID prefix (table:id format)
+- **Auto-embedding**: `save()` generates embeddings if `needs_embedding()` returns True
+- **Nullable fields**: Declare via `nullable_fields` ClassVar to allow None in database
+- **Timestamps**: `created` and `updated` auto-managed as ISO strings
+- **Fire-and-forget jobs**: `source.vectorize()` returns command_id without waiting
+
+## Key Dependencies
+
+- `surrealdb`: RecordID type for relationships
+- `pydantic`: Validation and field_validator decorators
+- `open_notebook.database.repository`: CRUD and relationship functions
+- `open_notebook.ai.models`: ModelManager for embeddings
+- `surreal_commands`: Async job submission (vectorization, insights)
+- `loguru`: Logging
+
+## Quirks & Gotchas
+
+- **Polymorphic resolution**: `ObjectModel.get()` fails if the target subclass has not been imported, because resolution searches the list of loaded subclasses
+- **RecordModel singleton**: __new__ returns existing instance; call `clear_instance()` in tests
+- **Source.command field**: Stored as RecordID; auto-parsed from strings via field_validator
+- **Text truncation**: `Note.get_context(short)` hardcodes 100-char limit
+- **Embedding async**: Only Note and SourceInsight embed on save; Source content is too large for that, so it is embedded through an async job
+- **Relationship strings**: Must match SurrealDB schema (reference, artifact, refers_to)
+
+## How to Add New Model
+
+1. Inherit from ObjectModel with table_name ClassVar
+2. Define Pydantic fields with validators
+3. Override `needs_embedding()` if searchable
+4. Add custom methods for domain logic (get_X, add_to_Y)
+5. Implement `_prepare_save_data()` if custom serialization needed
+
+## Usage
+
+```python
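+from open_notebook.domain.base import ObjectModel
+from open_notebook.domain.notebook import Notebook, text_search, vector_search
+
+notebook = Notebook(name="Research", description="My project")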
+await notebook.save()
+
+obj = await ObjectModel.get("notebook:123")  # Polymorphic fetch
+
+# Search
+await text_search("quantum", results=5)
+await vector_search("quantum computing", results=10, minimum_score=0.3)
+```
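
Reviewer note on `open_notebook/domain/CLAUDE.md`: the polymorphic `get()` behavior (subclass resolved from the `table:id` prefix, failing when the subclass was never imported) can be sketched with plain classes. `BaseRecord`, `NotebookRecord`, and `SourceRecord` are hypothetical stand-ins, and the lookup via `__subclasses__()` is an assumption about how such resolution is typically done, not a copy of the project's code.

```python
from typing import ClassVar, Optional, Type


class BaseRecord:
    """Hypothetical stand-in for ObjectModel; no database involved."""

    table_name: ClassVar[str] = ""

    def __init__(self, record_id: str):
        self.id = record_id

    @classmethod
    def _find_subclass(cls, table: str) -> Optional[Type["BaseRecord"]]:
        # Only classes that have already been defined (imported) are visible here.
        for sub in cls.__subclasses__():
            if sub.table_name == table:
                return sub
            found = sub._find_subclass(table)
            if found is not None:
                return found
        return None

    @classmethod
    def get(cls, record_id: str) -> "BaseRecord":
        # The table name is everything before the first colon.
        table, _, _ = record_id.partition(":")
        target = cls._find_subclass(table)
        if target is None:
            raise ValueError(f"No model registered for table '{table}'")
        return target(record_id)


class NotebookRecord(BaseRecord):
    table_name = "notebook"


class SourceRecord(BaseRecord):
    table_name = "source"
```

Calling `BaseRecord.get("notebook:123")` returns a `NotebookRecord`; an ID whose table has no loaded subclass raises `ValueError` in this sketch, which makes the import-order dependency visible.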
--- a/open_notebook/graphs/CLAUDE.md
+++ b/open_notebook/graphs/CLAUDE.md
@@ -0,0 +1,61 @@
+# Graphs Module
+
+LangGraph-based workflow orchestration for content processing, chat interactions, and AI-powered transformations.
+
+## Key Components
+
+- **`chat.py`**: Conversational agent with message history, notebook context, and model override support
+- **`source_chat.py`**: Source-focused chat with ContextBuilder for insights/content injection and context tracking
+- **`ask.py`**: Multi-search strategy agent (generates search terms, retrieves results, synthesizes answers)
+- **`source.py`**: Content ingestion pipeline (extract → save → transform with content-core)
+- **`transformation.py`**: Single-node transformation executor with prompt templating via ai_prompter
+- **`prompt.py`**: Generic pattern chain for arbitrary prompt-based LLM calls
+- **`tools.py`**: Minimal tool library (currently just `get_current_timestamp()`)
+
+## Important Patterns
+
+- **Async/sync bridging in graphs**: Both `chat.py` and `source_chat.py` use `asyncio.new_event_loop()` workaround because LangGraph nodes are sync but `provision_langchain_model()` is async
+- **State machines via StateGraph**: Each graph compiles to stateful runnable; conditional edges fan out work (ask.py, source.py do parallel transforms)
+- **Prompt templating**: `ai_prompter.Prompter` with Jinja2 templates referenced by path ("chat/system", "ask/entry", etc.)
+- **Model provisioning via context**: Config dict passed to node via `RunnableConfig`; defaults fall back to state overrides
+- **Checkpointing**: `chat.py` and `source_chat.py` use SqliteSaver for message history (LangGraph's built-in persistence)
+- **Content extraction**: `source.py` uses content-core library with provider/model from DefaultModels; URLs and files both supported
+
+## Quirks & Edge Cases
+
+- **Async loop gymnastics**: ThreadPoolExecutor workaround needed because LangGraph invokes sync nodes but we call async functions; fragile if event loop state changes
+- **`clean_thinking_content()` ubiquitous**: Strips `<think>...</think>` tags from model responses (handles extended thinking models)
+- **source_chat.py builds context twice**: ContextBuilder runs during node execution to fetch source/insights; rebuilds list from context_data (inefficient but safe)
+- **source.py embedding is async**: `source.vectorize()` returns job command ID; not awaited (fire-and-forget)
+- **transformation.py nullable source**: Accepts `input_text` or `source.full_text` (falls back to second if first missing)
+- **ask.py hard-coded vector_search**: No fallback to text search despite commented code suggesting it was planned
+- **SqliteSaver location**: Checkpoints stored in path from `LANGGRAPH_CHECKPOINT_FILE` env var; connection shared across graphs
+
+## Key Dependencies
+
+- `langgraph`: StateGraph, Send, END, START, SqliteSaver checkpoint persistence
+- `langchain_core`: Messages, OutputParser, RunnableConfig
+- `ai_prompter`: Prompter for Jinja2 template rendering
+- `content_core`: `extract_content()` for file/URL processing
+- `open_notebook.ai.provision`: `provision_langchain_model()` (async factory with fallback logic)
+- `open_notebook.domain.notebook`: Domain models (Source, Note, SourceInsight, vector_search)
+- `loguru`: Logging
+
+## Usage Example
+
+```python
+from langchain_core.messages import HumanMessage
+
+# Invoke a graph with config override
+config = {"configurable": {"model_id": "model:custom_id"}}
+result = await chat_graph.ainvoke(
+    {"messages": [HumanMessage(content="...")], "notebook": notebook},
+    config=config
+)
+
+# Source processing (content → save → transform)
+result = await source_graph.ainvoke({
+    "content_state": {...},  # ProcessSourceState from content-core
+    "apply_transformations": [t1, t2],
+    "source_id": "source:123",
+    "embed": True
+})
+```
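
Reviewer note on `open_notebook/graphs/CLAUDE.md`: the async/sync bridging described under "Important Patterns" and "Quirks & Edge Cases" can be sketched with the standard library alone. This assumes a hypothetical async dependency `provision_model()` in place of `provision_langchain_model()`, and it shows one common way to combine a worker thread with a fresh event loop; the project's actual workaround may differ in detail.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def provision_model(name: str) -> str:
    """Hypothetical async dependency standing in for an async factory."""
    await asyncio.sleep(0)
    return f"model<{name}>"


def run_coroutine_sync(coro_factory):
    """Run an async callable to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so asyncio.run() is safe.
        return asyncio.run(coro_factory())

    # A loop is already running here; asyncio.run() would raise, so run the
    # coroutine on a fresh loop in a worker thread and block for the result.
    with ThreadPoolExecutor(max_workers=1) as executor:
        return executor.submit(lambda: asyncio.run(coro_factory())).result()


def sync_node(state: dict) -> dict:
    # Shape of a synchronous graph node that needs an async result.
    model = run_coroutine_sync(lambda: provision_model(state["model"]))
    return {"model": model}


async def call_from_running_loop():
    # Simulates a sync node being invoked while an event loop is active.
    return sync_node({"model": "chat"})


plain = sync_node({"model": "chat"})
nested = asyncio.run(call_from_running_loop())
```

The cost of this approach is that the calling loop is blocked while the worker thread runs, which is part of why the document calls the pattern fragile.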
--- a/open_notebook/plugins/podcasts.py
+++ b/open_notebook/plugins/podcasts.py
@@ -1,293 +0,0 @@
-from typing import ClassVar, List, Optional
-
-from loguru import logger
-from podcastfy.client import generate_podcast
-from pydantic import Field, field_validator, model_validator
-
-from open_notebook.config import DATA_FOLDER
-from open_notebook.domain.notebook import ObjectModel
-
-
-class PodcastEpisode(ObjectModel):
-    table_name: ClassVar[str] = "podcast_episode"
-    name: str
-    template: str
-    instructions: str
-    text: str
-    audio_file: str
-
-
-class PodcastConfig(ObjectModel):
-    table_name: ClassVar[str] = "podcast_config"
-    name: str
-    podcast_name: str
-    podcast_tagline: str
-    output_language: str = Field(default="English")
-    person1_role: List[str]
-    person2_role: List[str]
-    conversation_style: List[str]
-    engagement_technique: List[str]
-    dialogue_structure: List[str]
-    transcript_model: Optional[str] = None
-    transcript_model_provider: Optional[str] = None
-    user_instructions: Optional[str] = None
-    ending_message: Optional[str] = None
-    creativity: float = Field(ge=0, le=1)
-    provider: str = Field(default="openai")
-    voice1: str
-    voice2: str
-    model: str
-
-    # Backwards compatibility
-    @field_validator("person1_role", "person2_role", mode="before")
-    @classmethod
-    def split_string_to_list(cls, value):
-        if isinstance(value, str):
-            return [item.strip() for item in value.split(",")]
-        return value
-
-    @model_validator(mode="after")
-    def validate_voices(self) -> "PodcastConfig":
-        if not self.voice1 or not self.voice2:
-            raise ValueError("Both voice1 and voice2 must be provided")
-        return self
-
-    async def generate_episode(
-        self,
-        episode_name: str,
-        text: str,
-        instructions: str = "",
-        longform: bool = False,
-        chunks: int = 8,
-        min_chunk_size=600,
-    ):
-        self.user_instructions = (
-            instructions if instructions else self.user_instructions
-        )
-        conversation_config = {
-            "max_num_chunks": chunks,
-            "min_chunk_size": min_chunk_size,
-            "conversation_style": self.conversation_style,
-            "roles_person1": self.person1_role,
-            "roles_person2": self.person2_role,
-            "dialogue_structure": self.dialogue_structure,
-            "podcast_name": self.podcast_name,
-            "podcast_tagline": self.podcast_tagline,
-            "output_language": self.output_language,
-            "user_instructions": self.user_instructions,
-            "engagement_techniques": self.engagement_technique,
-            "creativity": self.creativity,
-            "text_to_speech": {
-                "output_directories": {
-                    "transcripts": f"{DATA_FOLDER}/podcasts/transcripts",
-                    "audio": f"{DATA_FOLDER}/podcasts/audio",
-                },
-                "temp_audio_dir": f"{DATA_FOLDER}/podcasts/audio/tmp",
-                "ending_message": "Thank you for listening to this episode. Don't forget to subscribe to our podcast for more interesting conversations.",
-                "default_tts_model": self.provider,
-                self.provider: {
-                    "default_voices": {
-                        "question": self.voice1,
-                        "answer": self.voice2,
-                    },
-                    "model": self.model,
-                },
-                "audio_format": "mp3",
-            },
-        }
-
-        api_key_label = None
-        llm_model_name = None
-        tts_model = None
-
-        if self.transcript_model_provider:
-            if self.transcript_model_provider == "openai":
-                api_key_label = "OPENAI_API_KEY"
-                llm_model_name = self.transcript_model
-            elif self.transcript_model_provider == "anthropic":
-                api_key_label = "ANTHROPIC_API_KEY"
-                llm_model_name = self.transcript_model
-            elif self.transcript_model_provider == "gemini":
-                api_key_label = "GOOGLE_API_KEY"
-                llm_model_name = self.transcript_model
-
-        if self.provider == "google":
-            tts_model = "gemini"
-        elif self.provider == "openai":
-            tts_model = "openai"
-        elif self.provider == "anthropic":
-            tts_model = "anthropic"
-        elif self.provider == "vertexai":
-            tts_model = "geminimulti"
-        elif self.provider == "elevenlabs":
-            tts_model = "elevenlabs"
-
-        logger.info(
-            f"Generating episode {episode_name} with config {conversation_config} and using model {llm_model_name}, tts model {tts_model}"
-        )
-
-        try:
-            audio_file = generate_podcast(
-                conversation_config=conversation_config,
-                text=text,
-                tts_model=tts_model,
-                llm_model_name=llm_model_name,
-                api_key_label=api_key_label,
-                longform=longform,
-            )
-            episode = PodcastEpisode(
-                name=episode_name,
-                template=self.name,
-                instructions=instructions,
-                text=str(text),
-                audio_file=audio_file,
-            )
-            await episode.save()
-        except Exception as e:
-            logger.error(f"Failed to generate episode {episode_name}: {e}")
-            raise
-
-    @field_validator(
-        "name", "podcast_name", "podcast_tagline", "output_language", "model"
-    )
-    @classmethod
-    def validate_required_strings(cls, value: str, field) -> str:
-        if value is None or value.strip() == "":
-            raise ValueError(f"{field.field_name} cannot be None or empty string")
-        return value.strip()
-
-    @field_validator("creativity")
-    def validate_creativity(cls, value):
-        if not 0 <= value <= 1:
-            raise ValueError("Creativity must be between 0 and 1")
-        return value
-
-
-conversation_styles = [
-    "Analytical",
-    "Argumentative",
-    "Informative",
-    "Humorous",
-    "Casual",
-    "Formal",
-    "Inspirational",
-    "Debate-style",
-    "Interview-style",
-    "Storytelling",
-    "Satirical",
-    "Educational",
-    "Philosophical",
-    "Speculative",
-    "Motivational",
-    "Fun",
-    "Technical",
-    "Light-hearted",
-    "Serious",
-    "Investigative",
-    "Debunking",
-    "Didactic",
-    "Thought-provoking",
-    "Controversial",
-    "Sarcastic",
-    "Emotional",
-    "Exploratory",
-    "Fast-paced",
-    "Slow-paced",
-    "Introspective",
-]
-
-# Dialogue Structures
-dialogue_structures = [
-    "Topic Introduction",
-    "Opening Monologue",
-    "Guest Introduction",
-    "Icebreakers",
-    "Historical Context",
-    "Defining Terms",
-    "Problem Statement",
-    "Overview of the Issue",
-    "Deep Dive into Subtopics",
-    "Pro Arguments",
-    "Con Arguments",
-    "Cross-examination",
-    "Expert Interviews",
-    "Case Studies",
-    "Myth Busting",
-    "Q&A Session",
-    "Rapid-fire Questions",
-    "Summary of Key Points",
-    "Recap",
-    "Key Takeaways",
-    "Actionable Tips",
-    "Call to Action",
-    "Future Outlook",
-    "Closing Remarks",
-    "Resource Recommendations",
-    "Trending Topics",
-    "Closing Inspirational Quote",
-    "Final Reflections",
-]
-
-# Podcast Participant Roles
-participant_roles = [
-    "Main Summarizer",
-    "Questioner/Clarifier",
-    "Optimist",
-    "Skeptic",
-    "Specialist",
-    "Thesis Presenter",
-    "Counterargument Provider",
-    "Professor",
-    "Student",
-    "Moderator",
-    "Host",
-    "Co-host",
-    "Expert Guest",
-    "Novice",
-    "Devil's Advocate",
-    "Analyst",
-    "Storyteller",
-    "Fact-checker",
-    "Comedian",
-    "Interviewer",
-    "Interviewee",
-    "Historian",
-    "Visionary",
-    "Strategist",
-    "Critic",
-    "Enthusiast",
-    "Mediator",
-    "Commentator",
-    "Researcher",
-    "Reporter",
-    "Advocate",
-    "Debater",
-    "Explorer",
-]
-
-# Engagement Techniques
-engagement_techniques = [
-    "Rhetorical Questions",
-    "Anecdotes",
-    "Analogies",
-    "Humor",
-    "Metaphors",
-    "Storytelling",
-    "Quizzes",
-    "Personal Testimonials",
-    "Quotes",
-    "Jokes",
-    "Emotional Appeals",
-    "Provocative Statements",
-    "Sarcasm",
-    "Pop Culture References",
-    "Thought Experiments",
-    "Puzzles and Riddles",
-    "Role-playing",
-    "Debates",
-    "Catchphrases",
-    "Statistics and Facts",
-    "Open-ended Questions",
-    "Challenges to Assumptions",
-    "Evoking Curiosity",
-]
--- a/open_notebook/podcasts/CLAUDE.md
+++ b/open_notebook/podcasts/CLAUDE.md
@@ -0,0 +1,68 @@
+# Podcasts Module
+
+Domain models for podcast generation featuring speaker and episode profile management with job tracking.
+
+## Purpose
+
+Encapsulates podcast metadata and configuration: speaker profiles (voice/personality config), episode profiles (generation settings), and podcast episodes (with job status tracking via surreal-commands).
+
+## Architecture Overview
+
+Two profile types plus an episode record:
+- **SpeakerProfile**: TTS provider/model + 1-4 speaker configurations (name, voice_id, backstory, personality)
+- **EpisodeProfile**: Generation settings (outline/transcript models, segment count, briefing template)
+- **PodcastEpisode**: Generated episode record linking profiles, content, and async job
+
+All inherit from `ObjectModel` (SurrealDB base class with table_name and save/load).
+
+## Component Catalog
+
+### SpeakerProfile
+- Validates 1-4 speakers with required fields: name, voice_id, backstory, personality
+- Stores TTS provider/model (e.g., "elevenlabs", "openai")
+- `get_by_name()` async query by profile name
+- Raises ValueError on invalid speaker counts or missing fields
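The speaker rules listed above can be sketched in plain Python. This is a stand-in for the Pydantic validator, not the module's code: the constant name, the error messages, and the choice to treat an empty value like a missing key are assumptions.

```python
from typing import Any, Dict, List

# Field names taken from the list above; the constant name is hypothetical.
REQUIRED_SPEAKER_FIELDS = ("name", "voice_id", "backstory", "personality")


def validate_speakers(speakers: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Enforce the 1-4 speaker limit and the required fields on each speaker."""
    if not 1 <= len(speakers) <= 4:
        raise ValueError("Must have between 1 and 4 speakers")
    for index, speaker in enumerate(speakers):
        for field in REQUIRED_SPEAKER_FIELDS:
            # Assumption: an empty value is rejected the same way as a missing key.
            if not speaker.get(field):
                raise ValueError(f"Speaker {index} is missing required field '{field}'")
    return speakers
```

Returning the validated list mirrors how a Pydantic field validator hands the value back to the model.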
+
+### EpisodeProfile
+- Configures outline/transcript generation: provider, model, num_segments (3-20 validated)
+- References speaker_config by name
+- Stores default_briefing template for episode generation
+- `get_by_name()` async query
+
+### PodcastEpisode
+- Stores episode_profile and speaker_profile as dicts (snapshots of config at generation time)
+- Optional audio_file path, transcript/outline dicts
+- **Job tracking**: command field links to surreal-commands RecordID
+- `get_job_status()` fetches async job status via surreal-commands library
+- `_prepare_save_data()` ensures command field is always RecordID format for database
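The "never raise, report unknown" behavior of `get_job_status()` can be illustrated as follows. The status lookup is injected as a hypothetical `fetch_status` coroutine so the sketch does not depend on the surreal-commands API, and returning `None` when no command is linked is an assumption.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def get_job_status(
    command_id: Optional[str],
    fetch_status: Callable[[str], Awaitable[str]],
) -> Optional[str]:
    """Return the job status, or "unknown" if the lookup fails for any reason."""
    if not command_id:
        return None  # assumption: an episode without a linked job has no status
    try:
        return await fetch_status(command_id)
    except Exception:
        # Errors are swallowed on purpose so callers never see a failure here.
        return "unknown"
```

The trade-off is that a misconfigured job queue looks identical to a job whose status is merely unavailable, so callers that need to tell the two apart should log inside the `except` branch.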
+
+## Common Patterns
+
+- **Profile snapshots**: episode_profile and speaker_profile stored as dicts to freeze config at generation time
+- **Field validation**: Pydantic validators enforce constraints (segment count, speaker count, required fields)
+- **Async database access**: `get_by_name()` queries via repo_query
+- **Job tracking**: command field delegates to surreal-commands; get_job_status() returns "unknown" on failure
+- **Record ID handling**: ensure_record_id() converts string to RecordID before save
+
+## Key Dependencies
+
+- `pydantic`: Field validators used by the profile models
+- `surrealdb`: RecordID type for job references
+- `open_notebook.database.repository`: repo_query, ensure_record_id
+- `open_notebook.domain.base`: ObjectModel base class
+- `surreal_commands` (optional): get_command_status() for job status
+
+## Important Quirks & Gotchas
+
+- **Snapshot approach**: Episode/speaker profiles stored as dicts (not references), so profile updates don't retroactively affect past episodes
+- **Job status resilience**: get_job_status() catches all exceptions and returns "unknown" (no error propagation)
+- **validate_speakers runs only at instantiation**: Pydantic validators run when a model is constructed; writes that bypass model construction (e.g., bulk inserts) skip validation
+- **RecordID coercion**: ensure_record_id() handles both string and RecordID inputs; command field parsed during deserialization
+- **No cascade delete**: Removing a profile doesn't cascade to episodes using it
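A small illustration of the snapshot approach, using plain dicts. The helper name is hypothetical, and whether the module deep-copies or serializes the profile is an assumption; the point is only that the episode holds its own copy.

```python
import copy
from typing import Any, Dict


def snapshot_profile(profile: Dict[str, Any]) -> Dict[str, Any]:
    """Return an independent copy so later profile edits cannot reach the episode."""
    return copy.deepcopy(profile)


profile = {"name": "default", "speakers": [{"name": "Ana", "voice_id": "v1"}]}
episode = {"speaker_profile": snapshot_profile(profile)}

# Editing the live profile afterwards leaves the stored snapshot untouched.
profile["speakers"][0]["name"] = "Ben"
```

A shallow copy would not be enough here, because the nested speaker dicts would still be shared between the profile and the episode.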
+
+## How to Extend
+
+1. **Add new speaker field**: Add to required_fields list in validate_speakers()
+2. **Add episode config field**: Validate in EpisodeProfile, update briefing generation code
+3. **Add job metadata**: Extend PodcastEpisode with new fields (e.g., progress tracking)
+4. **Change job provider**: Replace surreal-commands with alternative job queue library; update get_job_status()
--- a/open_notebook/utils/CLAUDE.md
+++ b/open_notebook/utils/CLAUDE.md
@@ -0,0 +1,113 @@
+# Utils Module
+
+Utility functions and helpers for context building, text processing, tokenization, and versioning.
+
+## Purpose
+
+Provides cross-cutting concerns: building LLM context from sources/insights, text utilities (truncation, cleaning), token counting, and version management.
+
+## Architecture Overview
+
+**Four core utilities**:
+1. **context_builder.py**: Flexible context assembly from sources, notes, insights with token budgeting
+2. **text_utils.py**: Text truncation, whitespace cleaning, formatting helpers
+3. **token_utils.py**: Token counting for LLM context windows (wrapper around encoding library)
+4. **version_utils.py**: Version parsing, comparison, and schema compatibility checks
+
+Each utility is stateless and can be imported independently.
+
+## Component Catalog
+
+### context_builder.py
+- **ContextItem**: Dataclass for individual context piece (id, type, content, priority, token_count)
+- **ContextConfig**: Configuration for context building (sources/notes/insights selection, max tokens, priority weights)
+- **ContextBuilder**: Main class assembling context
+  - `add_source()`: Include source by ID with inclusion level
+  - `add_note()`: Include note by ID
+  - `add_insight()`: Include insight by ID
+  - `build()`: Assemble context respecting token budget and priorities
+  - Uses vector_search to fetch source/insight content from SurrealDB
+  - Returns list of ContextItem objects sorted by priority
+
+**Key behavior**:
+- Token counting is automatic (calculated in ContextItem.__post_init__)
+- Max token enforcement via priority weighting (higher priority items included first)
+- Type-specific fetching: sources → Source.full_text, notes → Note.content, insights → SourceInsight.content
+- Raises DatabaseOperationError if source/note fetch fails
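The budgeting behavior described above can be sketched without the database layer. Everything here is a simplified stand-in: `estimate_tokens` uses a whitespace split instead of an encoding library, `content` is a plain string, and dropping the first item that would overflow the budget is one reading of the hard-limit rule.

```python
from dataclasses import dataclass, field
from typing import List


def estimate_tokens(text: str) -> int:
    """Stand-in counter (whitespace split); the real module uses an encoding library."""
    return len(text.split())


@dataclass
class ContextItem:
    id: str
    type: str
    content: str
    priority: float = 1.0
    token_count: int = field(init=False)

    def __post_init__(self) -> None:
        # The token count is derived from the content, never passed in.
        self.token_count = estimate_tokens(self.content)


def select_within_budget(items: List[ContextItem], max_tokens: int) -> List[ContextItem]:
    """Add items in descending priority and stop at the first one that would overflow."""
    selected: List[ContextItem] = []
    used = 0
    for item in sorted(items, key=lambda i: i.priority, reverse=True):
        if used + item.token_count > max_tokens:
            break  # hard stop: items are never truncated to fit
        selected.append(item)
        used += item.token_count
    return selected
```

Because the loop breaks rather than continues, a large high-priority item can block smaller lower-priority items that would still have fit.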
+
+### text_utils.py
+- **truncate_text(text, max_chars, suffix="...")**: Truncates string, adds ellipsis
+- **clean_text(text)**: Removes extra whitespace, normalizes newlines
+- **extract_sentences(text, max_count)**: Splits text into sentences up to limit
+- **normalize_whitespace(text)**: Collapse multiple spaces/newlines into single
+- **format_for_llm(text)**: Combines cleaning + normalization for LLM consumption
+
+**Key behavior**: All functions are pure (no side effects); safe for high-volume processing
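A minimal sketch of two of these helpers, written as pure functions. The signatures follow the catalog above; counting the suffix toward `max_chars` and the exact whitespace rules are assumptions about behavior the catalog does not spell out.

```python
import re


def normalize_whitespace(text: str) -> str:
    """Collapse runs of spaces/tabs to one space and runs of newlines to one newline."""
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r"\s*\n\s*", "\n", text)
    return text.strip()


def truncate_text(text: str, max_chars: int, suffix: str = "...") -> str:
    """Cut text so the result, including the suffix, is at most max_chars long."""
    if len(text) <= max_chars:
        return text
    if max_chars <= len(suffix):
        return suffix[:max_chars]  # no room for any of the original text
    return text[: max_chars - len(suffix)] + suffix
```

Both functions return new strings and touch no shared state, which is what makes them safe to call from concurrent workers.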
+
+### token_utils.py
+- **token_count(text)**: Returns estimated token count for string (via encoding library)
+- **remaining_tokens(max_tokens, used)**: Returns remaining tokens in budget
+- **fits_in_context(text, max_tokens)**: Boolean check if text fits token budget
+
+**Key behavior**: Uses fixed encoding (cl100k_base for GPT models); may differ slightly from actual model tokenization
+
+### version_utils.py
+- **parse_version(version_string)**: Parses "1.2.3" format; returns Version namedtuple
+- **compare_versions(v1, v2)**: Returns -1 (v1 < v2), 0 (equal), 1 (v1 > v2)
+- **is_compatible(current, required)**: Checks if current version meets requirement (e.g., current >= required)
+- **schema_version_check()**: Validates database schema version on startup
+
+**Key behavior**: Assumes semantic versioning (MAJOR.MINOR.PATCH); non-standard formats raise ValueError
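The version helpers can be sketched with the standard library alone. The function names match the catalog above, but taking version strings (rather than `Version` tuples) as arguments and rejecting pre-release suffixes are assumptions.

```python
import re
from typing import NamedTuple

_SEMVER = re.compile(r"(\d+)\.(\d+)\.(\d+)", re.ASCII)


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(version_string: str) -> Version:
    """Parse "MAJOR.MINOR.PATCH"; anything else raises ValueError."""
    match = _SEMVER.fullmatch(version_string.strip())
    if match is None:
        raise ValueError(f"Not a MAJOR.MINOR.PATCH version: {version_string!r}")
    return Version(*(int(part) for part in match.groups()))


def compare_versions(v1: str, v2: str) -> int:
    """Return -1, 0, or 1 using numeric (not lexicographic) comparison."""
    a, b = parse_version(v1), parse_version(v2)
    return (a > b) - (a < b)


def is_compatible(current: str, required: str) -> bool:
    """True when current >= required."""
    return compare_versions(current, required) >= 0
```

Comparing parsed tuples avoids the classic string-comparison bug where `"1.10.0"` sorts before `"1.2.3"`.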
+
+## Common Patterns
+
+- **Dataclass-driven config**: ContextConfig used by ContextBuilder (immutable after init)
+- **Token budgeting**: ContextBuilder respects max_tokens constraint; prioritizes high-priority items
+- **Error handling resilience**: token_count() returns estimate; context_builder catches DB errors gracefully
+- **Pure text functions**: text_utils functions are stateless utilities (no class needed)
+- **Lazy evaluation**: ContextBuilder doesn't fetch items until build() called
+- **Type hints throughout**: All functions use Optional, List, Dict for clarity
+
+## Key Dependencies
+
+- `open_notebook.domain.notebook`: Source, Note, SourceInsight models; vector_search function
+- `open_notebook.exceptions`: DatabaseOperationError, NotFoundError
+- `tiktoken` (via token_utils.py): Token encoding for GPT models
+- `loguru`: Logging in context_builder (debug-level)
+
+## Important Quirks & Gotchas
+
+- **Token count estimation**: Uses cl100k_base encoding; may differ 5-10% from actual model tokens
+- **Priority weights default**: If not specified, ContextConfig uses default weights (source=1, note=0.8, insight=1.2)
+- **Vector search required**: ContextBuilder assumes vector_search is available on Notebook model; fails if not
+- **Source.full_text vs content**: Uses full_text field (may include extracted text + metadata)
+- **Type-specific fetch logic**: ContextItem.content stores raw dict; caller must parse (e.g., dict["content"])
+- **Circular import risk**: context_builder imports from domain.notebook; avoid domain importing utils
+- **Max tokens hard limit**: ContextBuilder stops adding items once max_tokens exceeded (not prorated)
+- **No caching**: Every build() call re-fetches from database (use cache layer if needed)
+- **Whitespace normalization lossy**: clean_text() may change intended formatting (code blocks, poetry, etc.)
+
+## How to Extend
+
+1. **Add new context source type**: Create fetch method in ContextBuilder; update ContextConfig.sources dict
+2. **Add text preprocessing**: Add new function to text_utils (e.g., remove_urls, extract_keywords)
+3. **Change tokenization**: Replace tiktoken with alternative library in token_utils; update all calls
+4. **Add context filtering**: Extend ContextConfig with filter_by_date, filter_by_topic fields
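+5. **Implement caching**: build() is async, so functools.lru_cache would cache the coroutine object (which can be awaited only once); cache the awaited result instead, keyed by a hashable view of the config, and invalidate when sources change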
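Since `build()` is a coroutine (it is awaited in this module's usage example), a cache has to store the awaited result rather than the call itself. A minimal sketch, in which the class name and the cache key are hypothetical:

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Hashable


class AsyncResultCache:
    """Store awaited results by key; the factory runs only on a cache miss."""

    def __init__(self) -> None:
        self._results: Dict[Hashable, Any] = {}

    async def get(self, key: Hashable, factory: Callable[[], Awaitable[Any]]) -> Any:
        if key not in self._results:
            # Await first, then store: a failed factory call caches nothing.
            self._results[key] = await factory()
        return self._results[key]

    def invalidate(self, key: Hashable) -> None:
        self._results.pop(key, None)
```

A call site might look like `await cache.get(("notebook:1", 2000), builder.build)`. Two concurrent misses on the same key will both run the factory; storing a shared task instead of the result would avoid that at the cost of more code. Cached lists are returned by reference, so callers should not mutate them.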
+
+## Usage Example
+
+```python
+from open_notebook.utils.context_builder import ContextBuilder, ContextConfig
+
+config = ContextConfig(
+    sources={"source:123": "full", "source:456": "summary"},
+    max_tokens=2000,
+)
+builder = ContextBuilder(notebook, config)  # notebook: an already-loaded Notebook instance
+context_items = await builder.build()  # must run inside an async function
+
+# context_items is List[ContextItem] sorted by priority
+for item in context_items:
+    print(f"{item.type}:{item.id} ({item.token_count} tokens)")
+```
--- a/prompts/CLAUDE.md
+++ b/prompts/CLAUDE.md
@@ -0,0 +1,190 @@
+# Prompts Module
+
+Jinja2 prompt templates for multi-provider AI workflows in Open Notebook.
+
+## Purpose
+
+Centralized prompt repository using `ai_prompter` library to:
+1. Separate prompt engineering from Python application logic
+2. Provide reusable Jinja2 templates with variable injection
+3. Support multi-stage prompt chains (orchestrated by LangGraph workflows)
+4. Ensure consistency across similar workflows (chat, search, content generation)
+
+## Architecture Overview
+
+**Template Organization by Workflow**:
+- **`ask/`**: Multi-stage search synthesis (entry → query_process → final_answer)
+- **`chat/`**: Conversational agent with notebook context (system prompt only)
+- **`source_chat/`**: Source-focused chat with insight injection (system prompt only)
+- **`podcast/`**: Podcast generation pipeline (outline → transcript)
+
+**Rendering Pattern** (all workflows):
+```python
+from ai_prompter import Prompter
+
+# Load template + render with variables (`parser` and `state` are supplied by the calling graph node)
+system_prompt = Prompter(prompt_template="ask/entry", parser=parser).render(
+    data=state
+)
+
+# Then invoke LLM
+model = await provision_langchain_model(system_prompt, ...)
+response = await model.ainvoke(system_prompt)
+```
+
+See detailed workflow integration in `open_notebook/graphs/CLAUDE.md` for how each template fits into chat.py, ask.py, source_chat.py.
+
+## Prompt Engineering Patterns
+
+### 1. Multi-Stage Chain (Ask Workflow)
+
+Three-template chain for intelligent search:
+
+```
+entry.jinja (user question → search strategy)
+    ↓
+query_process.jinja (run each search, generate sub-answer)
+    ↓ (multiple parallel)
+final_answer.jinja (synthesize all results into final response)
+```
+
+**Key pattern**: `entry.jinja` generates JSON-structured reasoning (via PydanticOutputParser). Each `query_process.jinja` invocation receives one search term + retrieved results. `final_answer.jinja` combines all answers with proper source citation.
+
+### 2. Conditional Variable Injection (Podcast Workflow)
+
+Templates accept optional variables for context assembly:
+
+```jinja
+{% if notebook %}
+# PROJECT INFORMATION
+{{ notebook }}
+{% endif %}
+
+{% if context %}
+# CONTEXT
+{{ context }}
+{% endif %}
+```
+
+Enabled by Jinja2's conditional blocks. Critical for podcast outline (handles list or string context) and source_chat (injects variable notebook/insight data).
+
+### 3. Repeated Emphasis on Citation Format (Ask & Chat)
+
+All response-generating templates emphasize source citation rules:
+- Document ID syntax: `[source:id]`, `[note:id]`, `[insight:id]`
+- "Do not make up document IDs" repeated multiple times
+- Example citations provided inline
+
+**Rationale**: LLMs tend to hallucinate citations without explicit guidance; repetition and inline examples reduce the risk.
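Prompt wording lowers the risk but cannot enforce it, so calling code may still want to check citations after generation. A minimal sketch of such a check: the function name is hypothetical, and the assumption that IDs consist of letters, digits, and underscores should be adjusted to the real ID format.

```python
import re
from typing import Iterable, List

# Matches [source:id], [note:id], [insight:id]; the ID character set is an assumption.
CITATION_PATTERN = re.compile(r"\[((?:source|note|insight):[A-Za-z0-9_]+)\]")


def find_unknown_citations(answer: str, known_ids: Iterable[str]) -> List[str]:
    """Return cited document IDs that were not part of the supplied context."""
    known = set(known_ids)
    cited = CITATION_PATTERN.findall(answer)
    return [doc_id for doc_id in cited if doc_id not in known]
```

An empty result means every citation in the answer refers to a document that was actually provided to the model.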
+
+### 4. Format Instructions Delegation
+
+Templates accept external `{{ format_instructions }}` variable:
+
+```jinja
+# OUTPUT FORMATTING
+{{ format_instructions }}
+```
+
+Allows caller to inject JSON schema, XML format, or other output constraints without modifying template. Decouples prompt from output format evolution.
+
+### 5. JSON Output with Extended Thinking Support
+
+Podcast templates include extended thinking pattern:
+
+```jinja
+IMPORTANT OUTPUT FORMAT:
+- If you use extended thinking with <think> tags, put ALL your reasoning inside <think></think> tags
+- Put the final JSON output OUTSIDE and AFTER any <think> tags
+```
+
+Guides models with extended thinking capability to separate reasoning from output (cleaner parsing downstream).
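On the consuming side, the separation makes it straightforward to drop the reasoning before parsing. A minimal sketch, assuming the model followed the instruction and emitted only JSON outside the tags; the function name is hypothetical and the project's actual parsing code may differ.

```python
import json
import re
from typing import Any

# Non-greedy match across newlines, so several <think> blocks are each removed.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def parse_json_after_thinking(raw: str) -> Any:
    """Strip <think>...</think> blocks, then parse what remains as JSON."""
    cleaned = THINK_BLOCK.sub("", raw).strip()
    return json.loads(cleaned)
```

If the model leaves stray prose outside the tags, `json.loads` raises `json.JSONDecodeError`, which is a useful signal to retry rather than something to suppress.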
+
+## File Catalog
+
+**`ask/` - Search Synthesis Pipeline**:
+- **entry.jinja**: Analyzes user question, generates search strategy with JSON output (term + instructions per search)
+- **query_process.jinja**: Accepts one search term + retrieved results, generates sub-answer with citations
+- **final_answer.jinja**: Combines all sub-answers into coherent final response, enforces source citation
+
+**`chat/` - Conversational Agent**:
+- **system.jinja**: Single system prompt for general chat. Uses conditional blocks for optional notebook context. Emphasizes citation format.
+
+**`source_chat/` - Source-Focused Chat**:
+- **system.jinja**: Single system prompt for source-specific discussion. Injects source metadata (ID, title, topics) + selected context. Conditional blocks for optional notebook/context data.
+
+**`podcast/` - Podcast Generation**:
+- **outline.jinja**: Takes briefing + content + speaker profiles (list support via Jinja2 for-loop). Generates JSON outline with segments (name, description, size).
+- **transcript.jinja**: Takes outline + segment index + optional existing transcript. Generates JSON dialogue array (speaker name + dialogue). Iterates speakers with for-loop.
+
+## Key Dependencies
+
+- **ai_prompter**: Prompter class for Jinja2 template rendering with optional OutputParser binding
+- **Jinja2** (transitive via ai_prompter): Template syntax (if/for, filters, variable interpolation)
+- **No external AI calls**: Templates are pure text; LLM invocation happens in calling code (graphs/)
+
+## How to Add New Template
+
+1. **Create subdirectory** in `prompts/` matching workflow name (e.g., `prompts/new_workflow/`)
+2. **Define .jinja file(s)** with Jinja2 syntax:
+   - Use `{{ variable_name }}` for scalar injection
+   - Use `{% if condition %} ... {% endif %}` for optional sections
+   - Use `{% for item in list %} ... {% endfor %}` for iteration
+3. **Document template variables** as inline comments (follow existing templates)
+4. **Reference in calling code** (graphs/):
+   ```python
+   from ai_prompter import Prompter
+   prompt = Prompter(prompt_template="new_workflow/template_name").render(data=context_dict)
+   ```
+5. **If structured output needed**: Pass `parser=PydanticOutputParser(...)` to Prompter
+6. **Document in graphs/CLAUDE.md** how new template fits into workflow chain
+
+## Important Quirks & Gotchas
+
+1. **Template path syntax**: Uses forward slashes without the `.jinja` extension in Prompter. `"ask/entry"` maps to `prompts/ask/entry.jinja` (relative to the prompts directory, not the filesystem root)
+2. **Variable key convention**: All data passed as `data=dict` arg to `.render()`. Template accesses variables directly (e.g., `{{ question }}`). Ensure dict keys match template variable names.
+3. **OutputParser binding**: When using PydanticOutputParser, Prompter supplies the value for `{{ format_instructions }}` at render time. If the template lacks this placeholder, the parser's instructions never reach the prompt.
+4. **Jinja2 whitespace sensitivity**: Jinja2 emits template text verbatim, so indentation and newlines both reach the output, and each block tag leaves its trailing newline behind. Use whitespace control (`{%- ... -%}`) or Jinja2's `trim_blocks`/`lstrip_blocks` options if output formatting matters.
+5. **Conditional blocks are loose**: `{% if variable %}` follows Python truthiness: false for an empty string, list, or dict (and for undefined variables under Jinja2's default `Undefined`), true for any non-empty content.
+6. **For-loop list assumption**: Templates using `{% for item in list %}` don't validate list type. If caller passes string instead of list, iteration happens character-by-character (bug risk).
+7. **No template composition/inheritance**: Templates are flat (no `{% extends %}` or `{% include %}`). Each workflow keeps templates independent to avoid coupling.
+8. **Citation ID format is caller's responsibility**: Templates emphasize citation rules but don't validate. If the model emits a malformed or invented ID, the template cannot catch it; validation has to happen in calling code.
+9. **Parser extraction happens post-render**: OutputParser.parse() is called AFTER `.render()` returns string. If template has syntax errors, render fails before parsing logic runs.
+10. **Template cache**: Prompter likely caches loaded templates. File edits require app restart if using cached instance.
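The string-versus-list risk in item 6 comes from Python iteration semantics, which Jinja2 for-loops share. A small guard applied before rendering avoids it; the helper name is hypothetical.

```python
from typing import Iterable, List, Optional, Union


def as_list(value: Optional[Union[str, Iterable[str]]]) -> List[str]:
    """Normalize a template variable so a for-loop always sees whole items."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]  # wrap, so the loop does not iterate character by character
    return list(value)
```

For example, `data={"context": as_list(context)}` passed to `.render()` makes the loop body run once for a single string instead of once per character.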
+
+## Testing Patterns
+
+**Manual render test**:
+```python
+from ai_prompter import Prompter
+
+prompt = Prompter(prompt_template="ask/entry").render(
+    data={"question": "What is RAG?"}
+)
+print(prompt)  # Inspect Jinja2 output before sending to LLM
+```
+
+**With parser**:
+```python
+from pydantic import BaseModel
+from langchain_core.output_parsers.pydantic import PydanticOutputParser
+
+class Strategy(BaseModel):
+    reasoning: str
+    searches: list
+
+parser = PydanticOutputParser(pydantic_object=Strategy)
+prompt = Prompter(prompt_template="ask/entry", parser=parser).render(
+    data={"question": "..."}
+)
+# prompt now contains the parser's format instructions where the template has {{ format_instructions }}
+```
+
+**Integration test** (invoke full graph):
+See `open_notebook/graphs/ask.py` for how entry.jinja is invoked inside ask_graph workflow.
+
+## Reference Documentation
+
+- **Jinja2 syntax guide**: See existing templates for for-loop, if-conditional, variable interpolation patterns
+- **Graph integration**: `open_notebook/graphs/CLAUDE.md` documents which template is used in which workflow
+- **Sub-directory CLAUDE.md files**: `ask/CLAUDE.md`, `chat/CLAUDE.md`, `podcast/CLAUDE.md` (if created) provide template-specific implementation notes