useCreateSource and useFileUpload invalidated only QUERY_KEYS.sources(),
which does not match the infinite-scroll query key used by the notebook
page (QUERY_KEYS.sourcesInfinite()). Added sourcesInfinite invalidation
to both hooks so the source list refreshes automatically after sources
are added.
Tests were reimplementing UUID logic locally instead of testing the
actual production code path. Extract the path-building logic into a
testable helper function and import it directly in tests.
Podcast episode names with spaces or special characters caused
filesystem errors when used directly as directory names.
Use UUID-based directory names instead, keeping the original
episode name in the database for display purposes.
Closes #663
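A minimal sketch of the UUID-based layout described above. The helper name `build_episode_output_dir` and the `podcasts/episodes` subdirectory are assumptions for illustration, not the project's actual code; the point is only that the display name never reaches the filesystem.

```python
import uuid
from pathlib import Path
from typing import Optional, Tuple


def build_episode_output_dir(
    data_root: Path, episode_id: Optional[str] = None
) -> Tuple[str, Path]:
    """Return (episode_id, output_dir) with a UUID as the directory name.

    Hypothetical helper: the episode's display name is stored elsewhere
    (e.g. in the database), so spaces and special characters in it cannot
    cause filesystem errors.
    """
    episode_id = episode_id or str(uuid.uuid4())
    # Assumed layout: <data_root>/podcasts/episodes/<uuid>
    return episode_id, data_root / "podcasts" / "episodes" / episode_id
```

Because the path logic lives in one importable function, tests can call it directly instead of re-deriving the UUID handling locally.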
The OllamaEmbeddingModel was ignoring the base_url from credentials/config,
always falling back to env vars or localhost. This caused embedding failures
for users with custom Ollama endpoints.
Fixes #655
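The intended precedence can be sketched as follows. This is an illustration under assumptions, not the OllamaEmbeddingModel code: the function name and the `OLLAMA_API_BASE` variable name are hypothetical, and `http://localhost:11434` is Ollama's usual local default.

```python
import os
from typing import Mapping, Optional

DEFAULT_OLLAMA_URL = "http://localhost:11434"  # Ollama's usual local default


def resolve_ollama_base_url(
    config: Optional[Mapping[str, str]] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Prefer the configured base_url, then the environment, then localhost."""
    env = os.environ if env is None else env
    configured = (config or {}).get("base_url")
    # OLLAMA_API_BASE is an assumed variable name, used only for illustration.
    return configured or env.get("OLLAMA_API_BASE") or DEFAULT_OLLAMA_URL
```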
Broad 'except Exception' could silently swallow unexpected failures.
URLError and ConnectionError are both subclasses of OSError, so
'except (ImportError, OSError)' captures all real offline/not-installed
cases while letting genuine programming errors propagate.
Also include the exception detail in the warning message so failures
are diagnosable in logs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
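A rough sketch of the narrowed handler, assuming a `token_count()` shaped like the one described in these commits. The `load_encoding` parameter is a hypothetical test seam, the `int()` truncation of the words × 1.3 estimate is an assumption, and stdlib `logging` stands in for loguru to keep the example self-contained.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def token_count(text: str, load_encoding: Optional[Callable] = None) -> int:
    """Count tokens with tiktoken, falling back to a word-based estimate."""
    try:
        if load_encoding is None:
            import tiktoken  # ImportError here means "not installed"

            def load_encoding():
                return tiktoken.get_encoding("o200k_base")

        # May attempt a download; offline failures surface as OSError
        # subclasses such as URLError or ConnectionError.
        encoding = load_encoding()
        return len(encoding.encode(text))
    except (ImportError, OSError) as exc:
        # Include the exception detail so the fallback is diagnosable in logs.
        logger.warning("tiktoken unavailable (%s); using word-count estimate", exc)
        return int(len(text.split()) * 1.3)
```

A `ValueError` or `TypeError` raised inside the `try` body is not caught, so genuine programming errors still propagate to the caller.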
In air-gapped / offline Docker deployments, tiktoken.get_encoding() tries
to download the encoding file from openaipublic.blob.core.windows.net.
When that request fails it raises a URLError / OSError — not an ImportError
— so the previous except clause silently missed it and the crash surfaced in
the UI.
Widened `except ImportError` to `except Exception` so all failures —
"not installed" and "network unreachable" — fall through to the word-count
fallback (words × 1.3). Added a loguru WARNING so operators can see when
the fallback is active.
TIKTOKEN_CACHE_DIR now reads from the environment with a blank-safe
fallback (`or` guard prevents os.makedirs("") on empty env var). This lets
Docker images redirect the cache to a path outside /app/data/ so user-data
volume mounts cannot shadow the pre-baked encoding.
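The blank-safe lookup might look like the sketch below. The function name and the `default_dir` parameter are illustrative assumptions; only the `TIKTOKEN_CACHE_DIR` variable and the `or` guard come from the description above.

```python
import os
from typing import Mapping, Optional


def resolve_tiktoken_cache_dir(
    default_dir: str, env: Optional[Mapping[str, str]] = None
) -> str:
    """Return the cache directory, creating it if needed."""
    env = os.environ if env is None else env
    # `or` treats a missing variable and an empty string the same way,
    # so os.makedirs("") is never reached.
    cache_dir = env.get("TIKTOKEN_CACHE_DIR") or default_dir
    os.makedirs(cache_dir, exist_ok=True)
    return cache_dir
```

A plain `env.get("TIKTOKEN_CACHE_DIR", default_dir)` would not be enough here, because a variable that is set but empty would return `""` rather than the default.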
Both images now pre-download the o200k_base encoding during the builder
stage (internet is available at build time) and copy it into the runtime
image at /app/tiktoken-cache. ENV TIKTOKEN_CACHE_DIR=/app/tiktoken-cache
is set in the runtime stage so no network call is ever needed at runtime.
Added test_token_count_network_error_fallback in tests/test_utils.py:
patches tiktoken.get_encoding with a URLError and asserts token_count()
returns a positive int instead of raising.
Fixes #264
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The language field on EpisodeProfile was being saved to the database but
had no effect during generation because podcast-creator 0.11.x didn't
support the language parameter. Version 0.12.0 adds language support to
the generation pipeline (outline + transcript templates), and since
open-notebook already passes the full episode profile config to
podcast-creator, the language field is picked up automatically.
Closes #640
* feat(podcasts): integrate model registry for profiles and credential passthrough
Replace loose provider/model string fields with record<model> references
in podcast profiles, enabling credential passthrough to podcast-creator.
Backend:
- EpisodeProfile: outline_llm, transcript_llm (record<model>) replace
  outline_provider/outline_model strings. New language field (BCP 47).
- SpeakerProfile: voice_model (record<model>) replaces tts_provider/
  tts_model strings. Per-speaker voice_model override support.
- Migration 14: schema changes making legacy fields optional, adding new
  record<model> fields.
- Data migration (migration.py): auto-converts legacy profiles to model
  registry references on startup. Idempotent.
- podcast_commands.py: resolves credentials for ALL profiles before
  calling podcast-creator.
- New /api/languages endpoint (pycountry + babel) with BCP 47 locale
  codes (pt-BR, en-US, etc.).
Frontend:
- Episode/speaker profile forms use ModelSelector instead of manual
  provider/model dropdowns.
- Language dropdown with BCP 47 codes in episode profile form.
- Per-speaker TTS voice model override in speaker profile form.
- "Templates" tab renamed to "Profiles".
- Setup required badge on unconfigured profiles.
- i18n updated across all 8 locales.
Closes #486, closes #552
* fix(i18n): remove unused legacy podcast provider/model keys
Remove 10 orphaned i18n keys across all 8 locales that were left behind
after replacing manual provider/model dropdowns with ModelSelector.
* fix: address review violations in podcast model registry
- P1: Remove profiles with failed model resolution from dicts to prevent
  podcast-creator validation errors on unrelated profiles
- P2: Use centralized QUERY_KEYS.languages instead of inline key
- P3: Fix ISO 639-1 → BCP 47 in model field description and CLAUDE.md
- P3: Update "templates" → "profiles" in locale string values (all 8)
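The P1 fix can be pictured roughly as below. This is a sketch under assumptions: `ModelResolutionError`, `drop_unresolved_profiles`, and the `resolve` callback are hypothetical names, and the real code in podcast_commands.py may be structured differently.

```python
from typing import Any, Callable, Dict, List, Tuple


class ModelResolutionError(Exception):
    """Hypothetical error raised when a profile's model cannot be resolved."""


def drop_unresolved_profiles(
    profiles: Dict[str, Any],
    resolve: Callable[[Any], Any],
) -> Tuple[Dict[str, Any], List[str]]:
    """Keep only profiles whose models resolve; report the rest by name."""
    resolved: Dict[str, Any] = {}
    skipped: List[str] = []
    for name, profile in profiles.items():
        try:
            resolved[name] = resolve(profile)
        except ModelResolutionError:
            # Leaving a half-configured profile in the dict would let it
            # fail validation even when a different profile is requested.
            skipped.append(name)
    return resolved, skipped
```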
* chore: bump version to 1.8.0
- Replace curl-based SurrealDB install in Dockerfile.single with a
  multi-stage build that copies the binary from surrealdb/surrealdb:v2,
  aligning it with the version used in docker-compose.yml and preventing
  breakage when newer SurrealDB versions introduce syntax changes.
- Fix SURREAL_PASSWORD documentation in single-container.md: the actual
  password set in supervisord.single.conf is `root`, not `password`.
Closes #498
* fix(chat): remove 50-source cap from notebook chat context
ChatColumn was independently fetching sources via useSources() which
defaults to a limit of 50 from the API. This caused the chat context
to always be capped at 50 sources regardless of how many are in the
notebook.
ChatColumn now receives sources as a prop from the parent NotebookPage,
which already fetches all sources via useNotebookSources with infinite
scroll pagination.
* test(chat): update ChatColumn tests for new sources prop interface
The inline docker-compose example in README and the environment
variable reference tables in installation docs were missing these
two required variables, causing connection failures for users who
copy-pasted the examples instead of downloading the actual file.
Closes #592
- Add batching to generate_embeddings() (50 texts per batch with per-batch retry)
  to prevent 413 Payload Too Large errors on large documents
- Add 413 error classification rule for user-friendly error messages
- Fix misleading "Created 0 embedded chunks" log in process_source_command
  by removing premature get_embedded_chunks() call (embedding is fire-and-forget)
Closes #594
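The batching change could be sketched along these lines. Only the batch size of 50 and the per-batch retry come from the description above; the function name, the `embed_batch` callback, the retry count, and the set of retryable exceptions are assumptions for illustration.

```python
import time
from typing import Callable, List, Sequence, Tuple, Type

Vector = List[float]


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[Vector]],
    batch_size: int = 50,
    max_attempts: int = 3,
    retry_delay: float = 0.0,
    retryable: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> List[Vector]:
    """Embed texts in fixed-size batches, retrying each batch on its own."""
    vectors: List[Vector] = []
    for start in range(0, len(texts), batch_size):
        batch = list(texts[start : start + batch_size])
        for attempt in range(1, max_attempts + 1):
            try:
                vectors.extend(embed_batch(batch))
                break
            except retryable:
                if attempt == max_attempts:
                    raise  # give up on this batch after the last attempt
                time.sleep(retry_delay * attempt)
    return vectors
```

Retrying per batch means a transient failure repeats at most 50 texts of work rather than the whole document, and each request body stays small enough to avoid the 413 response.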