* feat: content-type aware chunking and unified embedding
  - Add chunking.py with HTML, Markdown, and plain text detection
  - Add embedding.py with mean pooling for large content
  - Create dedicated commands: embed_note, embed_insight, embed_source
  - Use fire-and-forget pattern for embedding via submit_command()
  - Refactor rebuild_embeddings_command to delegate to individual commands
  - Remove legacy commands and needs_embedding() methods
  - Reduce chunk size to 1500 chars for Ollama compatibility
  - Update CLAUDE.md documentation for new architecture
  Fixes #350, #142
* fix: address code review issues
  - Note.save() now returns command_id for tracking embedding jobs
  - Add length check after generate_embeddings() to fail fast on mismatch
  - Add numpy as explicit dependency (was transitive)
  - Remove hardcoded chunk sizes from docstrings
* docs: address code review comments
  - Rename "SYNC PATH" to "DOMAIN MODEL PATH" in embedding router
  - Add test_chunking.py and test_embedding.py to Testing Strategy
  - Clarify auto-embedding behavior for each domain model
* fix: clean thinking tags from prompt graph output
  Adds clean_thinking_content() to prompt.py to handle extended thinking models that return <think>...</think> tags. This fixes empty titles when saving notes from chat.
* chore: remove local docker-compose from git
* fix(frontend): handle null parent_id in search results
  Add defensive check for null parent_id in search results to prevent "Cannot read properties of null (reading 'split')" error. This can happen with orphaned records in the database.
* fix: cascade delete embeddings and insights when source is deleted
  When deleting a Source, now also deletes associated:
  - source_embedding records
  - source_insight records
  This prevents orphaned records that cause null parent_id errors in vector search results.
* fix: add cleanup for orphan embedding/insight records in migration 10
  Deletes source_embedding and source_insight records where the linked source no longer exists (source.id = NONE).
* chore: bump esperanto to 2.16
  Increases ctx_num for Ollama models to accommodate larger notebook context windows. See: https://github.com/lfnovo/esperanto/pull/69
Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[1.6.0] - 2026-01-16
Added
- Content-type aware text chunking with automatic HTML, Markdown, and plain text detection (#350, #142)
- Unified embedding generation with mean pooling for large content that exceeds model context limits
- Dedicated embedding commands: `embed_note`, `embed_insight`, `embed_source`
- New utility modules: `chunking.py` and `embedding.py` in `open_notebook/utils/`
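
As a rough illustration of what content-type aware chunking can look like, here is a minimal sketch. It is not the code in `chunking.py`: the function names `guess_content_type` and `sketch_chunk_text` are hypothetical, the detection is a simple regex heuristic, and the split strategy (paragraph boundaries first, hard cuts only for oversized paragraphs) is an assumption. Only the 1500-character default comes from this changelog.

```python
import re
from typing import List


def guess_content_type(text: str) -> str:
    """Heuristically classify text as 'html', 'markdown', or 'plain'."""
    # A few common block-level tags are treated as a sign of HTML.
    if re.search(r"<(html|body|div|p|h[1-6]|ul|ol|table)\b[^>]*>", text, re.IGNORECASE):
        return "html"
    # Headings, list markers, or code fences at line start suggest Markdown.
    if re.search(r"^(#{1,6} |[-*] |\d+\. |```)", text, re.MULTILINE):
        return "markdown"
    return "plain"


def sketch_chunk_text(text: str, max_chars: int = 1500) -> List[str]:
    """Split text into chunks of at most max_chars, preferring paragraph breaks."""
    chunks: List[str] = []
    current = ""
    for para in re.split(r"\n\s*\n", text):
        para = para.strip()
        if not para:
            continue
        # A paragraph longer than the limit is cut at fixed offsets.
        while len(para) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        # Start a new chunk when appending would exceed the limit.
        if current and len(current) + 2 + len(para) > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

A real implementation would likely pick a different splitter per detected type (for example, splitting Markdown on headings); the sketch applies one splitter to everything to stay short.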
Changed
- Embedding is now fire-and-forget: domain models submit embedding commands asynchronously after save
- `rebuild_embeddings_command` now delegates to individual `embed_*` commands instead of inline processing
- Chunk size reduced to 1500 characters for better compatibility with Ollama embedding models
Removed
- Legacy embedding commands: `embed_single_item_command`, `embed_chunk_command`, `vectorize_source_command`
- `needs_embedding()` and `get_embedding_content()` methods from domain models
- `split_text()` function from text_utils (replaced by `chunk_text()` in chunking module)
Fixed
- Embedding failures when content exceeds model context limits (#350, #142)
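
For readers unfamiliar with mean pooling, the idea is to embed each chunk separately and average the resulting vectors element by element, so that content longer than the model's context window still yields a single embedding. The sketch below is dependency-free and illustrative only: `mean_pool` and `embed_long_text` are hypothetical names, `embed_fn` stands in for whatever embedding call the project uses, and whether the project weights or normalizes the vectors is not stated in this changelog.

```python
from typing import Callable, List

Vector = List[float]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average a list of equal-length vectors element by element."""
    if not vectors:
        raise ValueError("no vectors to pool")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("vectors have different dimensions")
    count = len(vectors)
    return [sum(v[i] for v in vectors) / count for i in range(dim)]


def embed_long_text(
    chunks: List[str],
    embed_fn: Callable[[List[str]], List[Vector]],
) -> Vector:
    """Embed each chunk, then pool the chunk vectors into one vector."""
    vectors = embed_fn(chunks)
    # Fail fast if the model returned a different number of vectors.
    if len(vectors) != len(chunks):
        raise ValueError(
            f"expected {len(chunks)} embeddings, got {len(vectors)}"
        )
    return mean_pool(vectors)
```

With a toy `embed_fn` that maps each chunk to `[len(chunk), 1.0]`, the chunks `["ab", "abcd"]` pool to `[3.0, 1.0]`.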
[1.5.2] - 2026-01-15
Performance
- Improved source listing speed by 20-30x (#436, closes #351)
  - Added database indexes on `source` field for `source_insight` and `source_embedding` tables
  - Use SurrealDB `FETCH` clause for command status instead of N async calls
[1.5.1] - 2026-01-15
Fixed
- Podcast dialog infinite loop error caused by excessive translation Proxy accesses in loops
- Podcast dialog UI freezing when typing episode name or additional instructions
- Removed incorrect translation keys for user-defined episode profiles (user content should not be translated)
[1.5.0] - 2026-01-15
Added
- Internationalization (i18n) support with Chinese (Simplified and Traditional) translations (#371, closes #344, #349, #360)
- Frontend test infrastructure with Vitest (#371)
- Language toggle component for switching UI language (#371)
- Date localization using date-fns locales (#371)
- Error message translation system (#371)
Fixed
- Accessibility improvements: added missing `id`, `name`, and `autoComplete` attributes to form inputs (#371)
- Added `DialogDescription` to dialogs for Radix UI accessibility compliance (#371)
- Fixed "Collapsible is changing from uncontrolled to controlled" warning in SettingsForm (#371)
- Fixed lint command for Next.js 16 compatibility (`eslint` instead of `next lint`)
Changed
- Dockerfile optimizations: better layer caching, `--no-install-recommends` for smaller images (#371)
- Dockerfile.single refactored into 3 separate build stages for better caching (#371)
[1.4.0] - 2026-01-14
Added
- CTA button to empty state notebook list for better onboarding (#408)
- Offline deployment support for Docker containers (#414)
Fixed
- Large file uploads (>10MB) by upgrading to Next.js 16 (#423)
- Orphaned uploaded files when sources are removed (#421)
- Broken documentation links to ai-providers.md (#419)
- ZIP support indication removed from UI (#418)
- Duplicate Claude Code workflow runs on PRs (#417)
- Claude Code review workflow now runs on PRs from forks (#416)
Changed
- Upgraded Next.js from 15.4.10 to 16.1.1 (#423)
- Upgraded React from 19.1.0 to 19.2.3 (#423)
- Renamed `middleware.ts` to `proxy.ts` for Next.js 16 compatibility (#423)
Dependencies
- next: 15.4.10 → 16.1.1
- react: 19.1.0 → 19.2.3
- react-dom: 19.1.0 → 19.2.3
[1.2.4] - 2025-12-14
Added
- Infinite scroll for notebook sources - no more 50 source limit (#325)
- Markdown table rendering in chat responses, search results, and insights (#325)
Fixed
- Timeout errors with Ollama and local LLMs - increased to 10 minutes (#325)
- "Unable to Connect to API Server" on Docker startup - frontend now waits for API health check (#325, #315)
- SSL issues with langchain (#274)
- Query key consistency for source mutations to properly refresh infinite scroll (#325)
- Docker compose start-all flow (#323)
Changed
- Timeout configuration now uses granular httpx.Timeout (short connect, long read) (#325)
Dependencies
- Updated next.js to 15.4.10
- Updated httpx to >=0.27.0 for SSL fix