open-notebook/api/routers
Luis Novo 5b2c97cab7
Fix re-embedding issues and improve retry strategy (#515)
* fix: filter empty content in rebuild embeddings queries

Update collect_items_for_rebuild() to properly filter out items with
empty or whitespace-only content before submitting embedding jobs.

Changes:
- Sources: add string::trim(full_text) != '' filter
- Notes: add string::trim(content) != '' filter
- Insights: add content != none AND string::trim(content) != '' filter
  (previously had no content filter at all)

This prevents unnecessary job submissions that would fail validation
in the individual embed commands.
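
The three filters share one rule: content must be present and non-empty after trimming. A minimal Python sketch of that rule, for illustration only. The real filtering happens in the database queries; `has_embeddable_content` and `select_for_rebuild` are hypothetical names, and Python's `str.strip()` is assumed to behave closely enough to `string::trim` for this purpose.

```python
from typing import Any, Dict, Iterable, List, Optional


def has_embeddable_content(text: Optional[str]) -> bool:
    """Return True when text is present and not empty or whitespace-only."""
    return text is not None and text.strip() != ""


def select_for_rebuild(
    items: Iterable[Dict[str, Any]], field: str
) -> List[Dict[str, Any]]:
    """Keep only items whose `field` is non-empty after trimming.

    `field` would be "full_text" for sources and "content" for notes
    and insights (assumed record shape: plain dicts).
    """
    return [item for item in items if has_embeddable_content(item.get(field))]
```

Items with a missing field are treated the same as items with a `None` value, which mirrors the `content != none` check added for insights.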

Ref #513

* feat: add command_id to embedding error logs

Add get_command_id() helper to extract command_id from execution context.
Include command_id in error logs for all embedding commands:
- embed_note_command
- embed_insight_command
- embed_source_command
- create_insight_command

This makes it easier to trace failed embedding jobs back to specific
command records in the database.
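
A rough sketch of what such a helper could look like. The shape of the execution context is not described here, so this assumes an object that may carry a `command_id` attribute; `extract_command_id` is a hypothetical stand-in and not the project's `get_command_id()`.

```python
from typing import Any, Optional


def extract_command_id(execution_context: Any) -> Optional[str]:
    """Return the command id as a string, or None when it is unavailable.

    Assumption: the context exposes the id as a `command_id` attribute.
    The helper never raises, so it is safe to call from an error path.
    """
    if execution_context is None:
        return None
    command_id = getattr(execution_context, "command_id", None)
    return str(command_id) if command_id is not None else None
```

Returning `None` instead of raising keeps the logging path from masking the original embedding error.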

Ref #513

* fix: improve logging for embedding commands

Log improvements:
- Add command_id to all embedding error logs for traceability
- Transaction conflicts in repo_insert now log at DEBUG (not ERROR)
- Embedding API errors log at DEBUG, and at ERROR only once retries are exhausted
- Friendlier retry messages: "This will be retried automatically"
- Include model name and command_id in generate_embeddings errors

Files changed:
- commands/embedding_commands.py: command_id in logs, friendlier messages
- open_notebook/database/repository.py: DEBUG for transaction conflicts
- open_notebook/utils/embedding.py: DEBUG logging, pass-through command_id
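
The level selection described above can be sketched with the standard library `logging` module. This is an illustration under assumptions: the project's logger, message wording, and the way attempt counts are tracked may differ, and `log_embedding_failure` is a hypothetical name.

```python
import logging
from typing import Optional

logger = logging.getLogger("embedding_sketch")


def log_embedding_failure(
    exc: Exception,
    *,
    model_name: str,
    command_id: Optional[str],
    attempt: int,
    max_attempts: int,
) -> int:
    """Log a failed embedding call and return the level that was used.

    DEBUG while retries remain, ERROR once the last attempt has failed.
    """
    context = f"model={model_name}, command_id={command_id or 'unknown'}"
    if attempt < max_attempts:
        level = logging.DEBUG
        message = (
            f"Embedding failed ({context}, attempt {attempt}/{max_attempts}): "
            f"{exc}. This will be retried automatically."
        )
    else:
        level = logging.ERROR
        message = f"Embedding failed after {max_attempts} attempts ({context}): {exc}"
    logger.log(level, message)
    return level
```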

Ref #513

* fix: correct field names in rebuild embeddings status endpoint

The API status endpoint was reading the wrong field names (old → new):
- sources_processed → sources_submitted
- notes_processed → notes_submitted
- insights_processed → insights_submitted
- processed_items → jobs_submitted
- failed_items → failed_submissions

The command reports "_submitted" fields because embedding runs
asynchronously: it counts jobs submitted, not items processed.
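
As a sketch, reading the corrected fields from the command result could look like the following. The `*_submitted` input keys come from the list above; the function name, the response keys, and the zero defaults are assumptions made for illustration.

```python
from typing import Any, Dict, Optional


def build_rebuild_status(result: Optional[Dict[str, Any]]) -> Dict[str, int]:
    """Map the rebuild command's output onto a status payload.

    Missing fields default to 0 so a partially written result
    still produces a complete response.
    """
    result = result or {}
    return {
        "sources": int(result.get("sources_submitted", 0)),
        "notes": int(result.get("notes_submitted", 0)),
        "insights": int(result.get("insights_submitted", 0)),
        "jobs_submitted": int(result.get("jobs_submitted", 0)),
        "failed_submissions": int(result.get("failed_submissions", 0)),
    }
```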

Ref #513

* fix: update rebuild UI text to reflect async job submission

Changed terminology from "Completed/processed" to "Jobs Submitted"
because the rebuild command submits embedding jobs for async processing
rather than completing them synchronously.

Updated in all locales: en-US, pt-BR, zh-CN, zh-TW, ja-JP

Ref #513

* refactor: migrate retry strategy from allowlist to blocklist

- Change from `retry_on: [RuntimeError, ...]` to `stop_on: [ValueError]`
- This is more resilient: new exception types auto-retry by default
- Simplified exception handling: ValueError = permanent, else = retry
- Transient errors logged at DEBUG (surreal-commands logs final failure)
- Permanent errors (ValueError) logged at ERROR
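
The blocklist idea can be illustrated with a small standalone retry loop. This is not the surreal-commands API: `run_with_retry` and its parameters are hypothetical, and only the `stop_on` semantics (ValueError is permanent, everything else retries) are taken from the notes above.

```python
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def run_with_retry(
    fn: Callable[[], T],
    stop_on: Tuple[Type[BaseException], ...] = (ValueError,),
    max_attempts: int = 3,
) -> T:
    """Call fn until it succeeds, retrying every exception not in stop_on."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except stop_on:
            # Permanent failure: retrying cannot help, so re-raise at once.
            raise
        except Exception:
            # Any other exception type is treated as transient by default.
            if attempt == max_attempts:
                raise
    raise RuntimeError("unreachable")  # the loop always returns or raises
```

With an allowlist, an exception type nobody anticipated would fail on the first attempt; with the blocklist it is retried unless it is explicitly marked permanent.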

Ref #513
2026-01-31 18:55:01 -03:00
__init__.py Api podcast migration (#93) 2025-07-17 08:36:11 -03:00
auth.py Feat/localization tests docker (#371) 2026-01-15 13:51:05 -03:00
chat.py feat: message counting for chat sessions (#430) 2026-01-29 23:00:22 -03:00
commands.py Feat/localization tests docker (#371) 2026-01-15 13:51:05 -03:00
config.py Feat/localization tests docker (#371) 2026-01-15 13:51:05 -03:00
context.py Feat/localization tests docker (#371) 2026-01-15 13:51:05 -03:00
embedding.py feat: content-type aware chunking and unified embedding (#444) 2026-01-21 23:49:08 -03:00
embedding_rebuild.py Fix re-embedding issues and improve retry strategy (#515) 2026-01-31 18:55:01 -03:00
episode_profiles.py Feat/localization tests docker (#371) 2026-01-15 13:51:05 -03:00
insights.py Feat/localization tests docker (#371) 2026-01-15 13:51:05 -03:00
models.py Feat/localization tests docker (#371) 2026-01-15 13:51:05 -03:00
notebooks.py feat: add cascade deletion for notebooks with delete preview (#471) 2026-01-25 14:56:14 -03:00
notes.py Feat/localization tests docker (#371) 2026-01-15 13:51:05 -03:00
podcasts.py Feat/localization tests docker (#371) 2026-01-15 13:51:05 -03:00
search.py refactor: reorganize folder structure for better maintainability 2026-01-03 14:04:27 -03:00
settings.py Feat/localization tests docker (#371) 2026-01-15 13:51:05 -03:00
source_chat.py feat: message counting for chat sessions (#430) 2026-01-29 23:00:22 -03:00
sources.py fix: async insight creation to prevent transaction conflicts (#512) 2026-01-31 15:51:27 -03:00
speaker_profiles.py Feat/localization tests docker (#371) 2026-01-15 13:51:05 -03:00
transformations.py Feat/localization tests docker (#371) 2026-01-15 13:51:05 -03:00