open-notebook/.claude/sessions/api_troubleshoot/migration_plan.md
Luis Novo d7b0fff954
Api podcast migration (#93)
Creates the API layer for Open Notebook
Creates a services API gateway for the Streamlit front-end
Migrates the SurrealDB SDK to the official one
Change all database calls to async
New podcast framework supporting multiple speaker configurations
Implement the surreal-commands library for async processing
Improve docker image and docker-compose configurations
2025-07-17 08:36:11 -03:00

10 KiB

API Migration Plan: Direct Domain Calls to API Calls

Project Context

The Open Notebook project has undergone a significant architectural migration from direct domain model access to a proper API-based architecture. The project consists of:

  1. Domain Layer: Core business logic and data models (in open_notebook/domain/)
  2. API Layer: FastAPI-based REST API endpoints (in api/)
  3. Streamlit Frontend: User interface components (in pages/)

During the development process, a comprehensive API layer was built to provide proper separation of concerns, better error handling, and standardized interfaces. However, it appears that some Streamlit components were not fully migrated to use the API layer and are still making direct calls to domain models using asyncio.run().

This creates several issues:

  • Architectural inconsistency: Some parts use APIs while others bypass them
  • Potential data consistency problems: Direct domain calls might bypass API validation and business logic
  • Maintenance difficulties: Changes to domain models could break Streamlit components unexpectedly
  • Performance issues: Direct async calls in Streamlit can cause blocking behavior

Migration Strategy

This document systematically identifies every instance where Streamlit components directly call domain models and provides the exact API replacement. The goal is to ensure that ALL frontend interactions go through the API layer, maintaining proper architectural boundaries.

Overview

This document maps all instances where the Streamlit app is directly calling domain models instead of using the API layer. Each entry includes the current implementation and the recommended API replacement.

Migration Mappings

1. pages/components/source_panel.py

Line 18: Get Source by ID

Current:

source: Source = asyncio.run(Source.get(source_id))

Should be:

from api.client import api_client
source = api_client.get_source(source_id)

API Endpoint: GET /api/sources/{source_id}

Line 62: Get All Transformations

Current:

transformations = asyncio.run(Transformation.get_all(order_by="name asc"))

Should be:

from api.transformations_service import transformations_service
transformations = transformations_service.get_all_transformations()

API Endpoint: GET /api/transformations

Line 83: Get Embedding Model

Current:

embedding_model = asyncio.run(model_manager.get_embedding_model())

Should be:

from api.models_service import models_service
default_models = models_service.get_default_models()
embedding_model = default_models.get("embedding")

API Endpoint: GET /api/models/defaults

Line 91: Check Embedded Chunks

Current:

if not asyncio.run(source.get_embedded_chunks()) and st.button(

Should be:

# Use the source object already fetched from API that includes embedded_chunks field
if not source.embedded_chunks and st.button(

API Endpoint: GET /api/sources/{source_id} (uses embedded_chunks field)

2. pages/components/note_panel.py

Line 16: Get Embedding Model

Current:

if not asyncio.run(model_manager.get_embedding_model()):

Should be:

from api.models_service import models_service
default_models = models_service.get_default_models()
if not default_models.get("embedding"):

API Endpoint: GET /api/models/defaults

Line 20: Get Note by ID

Current:

note: Note = asyncio.run(Note.get(note_id))

Should be:

from api.client import api_client
note = api_client.get_note(note_id)

API Endpoint: GET /api/notes/{note_id}

3. pages/components/model_selector.py

Line 21: Get Models by Type

Current:

models = asyncio.run(Model.get_models_by_type(model_type))

Should be:

from api.models_service import models_service
models = models_service.get_models(type=model_type)

API Endpoint: GET /api/models?type={model_type}

4. pages/stream_app/utils.py

Line 122: Get Default Models Instance

Current:

default_models = asyncio.run(DefaultModels.get_instance())

Should be:

from api.models_service import models_service
default_models = models_service.get_default_models()

API Endpoint: GET /api/models/defaults

5. pages/stream_app/chat.py

Line 89: Get All Episode Profiles

Current:

episode_profiles = asyncio.run(EpisodeProfile.get_all())

Should be:

from api.client import api_client
episode_profiles = api_client.get_episode_profiles()

API Endpoint: GET /api/episode-profiles

6. pages/stream_app/source.py

Line 30: Get Speech to Text Model

Current:

if not asyncio.run(model_manager.get_speech_to_text()):

Should be:

from api.models_service import models_service
default_models = models_service.get_default_models()
if not default_models.get("speech_to_text"):

API Endpoint: GET /api/models/defaults

Line 40: Get All Transformations

Current:

transformations = asyncio.run(Transformation.get_all())

Should be:

from api.transformations_service import transformations_service
transformations = transformations_service.get_all_transformations()

API Endpoint: GET /api/transformations

Line 167: Get Source Insights

Current:

insights = asyncio.run(source.get_insights())

Should be:

from api.insights_service import insights_service
insights = insights_service.get_source_insights(source.id)

API Endpoint: GET /api/sources/{source_id}/insights

7. pages/stream_app/note.py

Line 20: Get Embedding Model

Current:

if not asyncio.run(model_manager.get_embedding_model()):

Should be:

from api.models_service import models_service
default_models = models_service.get_default_models()
if not default_models.get("embedding"):

API Endpoint: GET /api/models/defaults

7. pages/3_🔍_Ask_and_Search.py

Line 66: Get Embedding Model

Current:

embedding_model = asyncio.run(model_manager.get_embedding_model())

Should be:

from api.models_service import models_service
default_models = models_service.get_default_models()
embedding_model = default_models.get("embedding")

API Endpoint: GET /api/models/defaults

8. pages/2_📒_Notebooks.py

Line 75: Get Notebook Sources

Current:

sources = asyncio.run(current_notebook.get_sources())

Should be:

from api.sources_service import sources_service
sources = sources_service.get_sources(notebook_id=current_notebook.id)

API Endpoint: GET /api/sources?notebook_id={notebook_id}

Line 76: Get Notebook Notes

Current:

notes = asyncio.run(current_notebook.get_notes())

Should be:

from api.notes_service import notes_service
notes = notes_service.get_notes(notebook_id=current_notebook.id)

API Endpoint: GET /api/notes?notebook_id={notebook_id}

9. pages/5_🎙️_Podcasts.py

Line 428: Get Text to Speech Models

Current:

text_to_speech_models = asyncio.run(Model.get_models_by_type("text_to_speech"))

Should be:

from api.models_service import models_service
text_to_speech_models = models_service.get_models(type="text_to_speech")

API Endpoint: GET /api/models?type=text_to_speech

Line 429: Get Language Models

Current:

text_models = asyncio.run(Model.get_models_by_type("language"))

Should be:

from api.models_service import models_service
text_models = models_service.get_models(type="language")

API Endpoint: GET /api/models?type=language

Missing APIs

All required APIs are already implemented!

The Source API already properly exposes embedded chunks information through the embedded_chunks field in both SourceResponse and SourceListResponse models.

Implementation Notes

  1. All asyncio.run() calls should be removed since the API client handles async operations internally
  2. Import statements need to be updated to use API services instead of domain models
  3. Error handling should be added for API calls
  4. Consider caching frequently accessed data like default models
  5. The API client should handle authentication and error responses consistently

Completed Tasks

API Analysis Complete: All required APIs are implemented and available Migration Plan Created: Comprehensive mapping of 20 violations across 9 files
Source API Verification: Confirmed embedded_chunks field is properly exposed SourceWithMetadata Pattern: Created clean wrapper for domain objects with API metadata Complete API Migration: All 27 violations across 11 files successfully migrated Episode Profiles Service: Created new API service for podcast episode profiles Final Verification: Independent audit confirmed 100% migration completion Post-Audit Fixes: Fixed 3 additional violations found during final review Architecture Consistency: All Streamlit components now use API layer exclusively

Remaining Tasks

  1. Systematically replace each direct domain call with its API equivalent (20/20 violations completed)
  2. Remove unused domain model imports after migration (optional cleanup)
  3. Test each component after migration to ensure functionality is preserved

Implementation Status

Phase 1: Critical Components

  • pages/components/source_panel.py (4 violations)
  • pages/components/note_panel.py (2 violations)
  • pages/components/model_selector.py (1 violation)

Phase 2: Core Streamlit Pages

  • pages/2_📒_Notebooks.py (2 violations)
  • pages/3_🔍_Ask_and_Search.py (1 violation)
  • pages/5_🎙️_Podcasts.py (2 violations)

Phase 3: Supporting Pages

  • pages/stream_app/source.py (3 violations)
  • pages/stream_app/note.py (1 violation)
  • pages/stream_app/utils.py (1 violation)
  • pages/stream_app/chat.py (1 violation)

Progress: 27/27 violations fixed (100%) 🎉