remove stale files

akdeb 2026-05-09 18:45:55 +05:30
parent b7fd3a865b
commit 72a5c60ec1
13 changed files with 6 additions and 397 deletions

@@ -1,256 +0,0 @@
## ElatoAI: Realtime Voice AI Models on FastAPI
`server-fastapi` is the simplest self-hosted Elato backend for people who want a normal Python server instead of an edge runtime.
Use this if you want:
- a FastAPI server you can run on your own machine or VM
- a classic `STT -> LLM -> TTS` voice pipeline
- a smaller provider surface that is easy to understand
- the same ESP32 transport shape as the rest of Elato
If you are new to the project, read these first:
- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/README.md`
- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server/README.md`
## The Simple Provider Set
To keep onboarding straightforward, the classic FastAPI route is centered around a small set of providers.
### LLM
- `openai`
- `claude`
- `gemini`
- `grok`
### STT
- `deepgram`
- `whisper`
### TTS
- `elevenlabs`
- `cartesia`
- `deepgram`
- `openai`
The code still uses the `models/llm`, `models/stt`, and `models/tts` layout, but the active registry is intentionally trimmed so the default experience stays simple.
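As a rough illustration of the trimmed-registry idea, the sketch below maps provider names to module paths and imports a module only when its provider is requested. The `load_factory` helper, the `create` attribute name, and the stand-in `demo_providers.echo` module are assumptions made for this example; they are not taken from the Elato codebase.

```python
import importlib
import sys
import types

# Hypothetical registry: provider name -> module path (aliases share a target).
DEMO_REGISTRY = {
    "echo": "demo_providers.echo",
    "echo_alias": "demo_providers.echo",
}


def load_factory(registry: dict[str, str], provider_name: str, kind: str):
    """Resolve a provider name to a factory, importing its module lazily."""
    try:
        module_path = registry[provider_name]
    except KeyError:
        known = ", ".join(sorted(registry))
        raise ValueError(
            f"Unknown {kind} provider {provider_name!r}. Known providers: {known}"
        ) from None
    module = importlib.import_module(module_path)
    # Assumption: each provider module exposes a callable named `create`.
    return getattr(module, "create")


# Stand-in provider module so the sketch runs without any real SDK installed.
_pkg = types.ModuleType("demo_providers")
_pkg.__path__ = []  # mark it as a package
_echo = types.ModuleType("demo_providers.echo")
_echo.create = lambda **kwargs: {"provider": "echo", **kwargs}
sys.modules["demo_providers"] = _pkg
sys.modules["demo_providers.echo"] = _echo

service = load_factory(DEMO_REGISTRY, "echo_alias", "LLM")(model="demo")
```

Because names that are not in the registry fail early with a clear error, scaffolded modules can stay on disk without being reachable from configuration.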
## Default Setup
The default classic route is:
- STT: `deepgram`
- LLM: `openai`
- TTS: `elevenlabs`
That gives people one obvious path to get running before they start swapping providers.
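To make the `STT -> LLM -> TTS` order concrete, here is a minimal sketch of one conversational turn passing through three stages. The `fake_stt`, `fake_llm`, `fake_tts`, and `run_turn` functions are hypothetical stand-ins; the real server streams audio frames through provider services rather than calling plain functions.

```python
from typing import Callable


def fake_stt(audio: bytes) -> str:
    # Stand-in for speech-to-text: pretend the audio bytes are already text.
    return audio.decode("utf-8")


def fake_llm(transcript: str) -> str:
    # Stand-in for the language model: produce a reply from the transcript.
    return f"You said: {transcript}"


def fake_tts(reply: str) -> bytes:
    # Stand-in for text-to-speech: pretend the encoded text is synthesized audio.
    return reply.encode("utf-8")


def run_turn(
    audio: bytes,
    stt: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> bytes:
    """Pass one utterance through the three stages in order."""
    return tts(llm(stt(audio)))


audio_out = run_turn(b"hello", fake_stt, fake_llm, fake_tts)
```

Swapping a provider then amounts to passing a different callable for one stage while the other two stay untouched.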
## Project Layout
```text
server-fastapi/
├── bot.py
├── classic_route.py
├── esp32_transport.py
├── server.py
├── env.example
└── models/
    ├── llm/
    ├── stt/
    └── tts/
```
## How The FastAPI Server Fits Into Elato
Elato has three backend options right now:
- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server/deno`
- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server/cloudflare`
- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi`
A clean way to think about them is:
- `Deno`: edge-first, mature provider integrations
- `Cloudflare`: Workers + Durable Objects + Workers AI
- `FastAPI`: normal Python server, easy to self-host, easy to reason about
## Quick Start
### 1. Create or activate your Python environment
Use whatever you prefer. If you already use `uv`, that is a good default.
### 2. Install dependencies
This repo declares its dependencies in `pyproject.toml`, so install from that file rather than from a `requirements.txt`.
With `uv`:
```bash
cd /Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi
uv sync
```
Or with plain pip in your venv:
```bash
cd /Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi
pip install -e .
```
### 3. Create your env file
Copy the example values from:
- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi/env.example`
Minimum example for the default route:
```env
DEEPGRAM_API_KEY=your_deepgram_api_key
OPENAI_API_KEY=your_openai_api_key
ELEVENLABS_API_KEY=your_elevenlabs_api_key
CURRENT_VOICE_ROUTE=classic
CLASSIC_STT_PROVIDER=deepgram
CLASSIC_LLM_PROVIDER=openai
CLASSIC_TTS_PROVIDER=elevenlabs
ESP32_INPUT_SAMPLE_RATE=16000
BROWSER_INPUT_SAMPLE_RATE=16000
AUDIO_OUTPUT_SAMPLE_RATE=24000
PIPELINE_AUDIO_IN_SAMPLE_RATE=16000
PIPELINE_AUDIO_OUT_SAMPLE_RATE=24000
ALLOWED_ORIGINS=*
HOST=0.0.0.0
PORT=7860
```
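Environment values arrive as plain strings, so numeric settings such as the port and sample rates need converting before use. The sketch below shows one way that conversion might look. The `Settings` dataclass and `load_settings` helper are hypothetical names for this example and are not taken from `server.py`; treating the example values as fallbacks and splitting `ALLOWED_ORIGINS` on commas are also assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    esp32_input_sample_rate: int
    audio_output_sample_rate: int
    allowed_origins: list[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build typed settings from an env mapping, falling back to the example values."""
    origins = env.get("ALLOWED_ORIGINS", "*")
    return Settings(
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "7860")),
        esp32_input_sample_rate=int(env.get("ESP32_INPUT_SAMPLE_RATE", "16000")),
        audio_output_sample_rate=int(env.get("AUDIO_OUTPUT_SAMPLE_RATE", "24000")),
        # Assumption: multiple origins are comma-separated; "*" allows any origin.
        allowed_origins=[o.strip() for o in origins.split(",") if o.strip()],
    )


settings = load_settings(
    {"PORT": "9000", "ALLOWED_ORIGINS": "http://localhost:3000, https://example.com"}
)
```

Passing the mapping in explicitly keeps the helper easy to exercise without touching the real process environment.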
### 4. Run the server
If you use `uv`:
```bash
cd /Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi
uv run server.py
```
If you use your activated venv directly:
```bash
cd /Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi
python server.py
```
### 5. Point your ESP32 at the FastAPI backend
Update the firmware config so your hardware connects to this server instead of the Deno or Cloudflare backend.
The ESP32 route is:
```text
/ws/esp32
```
For browser or Next.js testing, the server also exposes:
- `/ws/browser`
- `/ws/nextjs`
## How Provider Selection Works
The classic route reads three env vars:
- `CLASSIC_STT_PROVIDER`
- `CLASSIC_LLM_PROVIDER`
- `CLASSIC_TTS_PROVIDER`
So changing providers is just an env change.
Examples:
### OpenAI + Deepgram + ElevenLabs
```env
CLASSIC_STT_PROVIDER=deepgram
CLASSIC_LLM_PROVIDER=openai
CLASSIC_TTS_PROVIDER=elevenlabs
```
### Whisper + Claude + Cartesia
```env
CLASSIC_STT_PROVIDER=whisper
CLASSIC_LLM_PROVIDER=claude
CLASSIC_TTS_PROVIDER=cartesia
```
### Deepgram + Gemini + OpenAI TTS
```env
CLASSIC_STT_PROVIDER=deepgram
CLASSIC_LLM_PROVIDER=gemini
CLASSIC_TTS_PROVIDER=openai
```
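The selection logic can be sketched as a small lookup with defaults. The variable names and the default stack come from this README; the `select_providers` helper, the lowercasing, and the validation against the provider lists above are assumptions for illustration, not the server's actual behaviour.

```python
import os
from typing import Mapping

# Provider names listed in this README for the classic route.
SUPPORTED = {
    "stt": {"deepgram", "whisper"},
    "llm": {"openai", "claude", "gemini", "grok"},
    "tts": {"elevenlabs", "cartesia", "deepgram", "openai"},
}
DEFAULTS = {"stt": "deepgram", "llm": "openai", "tts": "elevenlabs"}


def select_providers(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Read CLASSIC_<KIND>_PROVIDER for each kind, falling back to the default stack."""
    chosen = {}
    for kind, default in DEFAULTS.items():
        name = env.get(f"CLASSIC_{kind.upper()}_PROVIDER", default).strip().lower()
        if name not in SUPPORTED[kind]:
            raise ValueError(f"Unsupported {kind.upper()} provider: {name!r}")
        chosen[kind] = name
    return chosen


providers = select_providers(
    {"CLASSIC_STT_PROVIDER": "whisper", "CLASSIC_LLM_PROVIDER": "claude"}
)
```

Here the TTS variable is unset, so the result falls back to `elevenlabs` while the other two follow the overrides.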
## Unified Experience Across Elato
A simple way to keep the product understandable is:
- keep the Next.js frontend focused on character creation and device management
- keep the ESP32 firmware focused on one transport protocol
- let users choose one backend runtime:
  - Deno
  - Cloudflare
  - FastAPI
- inside each backend, expose the same conceptual knobs:
  - `STT`
  - `LLM`
  - `TTS`
That means the hardware story stays stable:
- one firmware
- one websocket-style mental model
- three server deployment choices
The cleanest unification strategy is not “every backend supports every provider.”
It is:
- every backend should expose the same categories
- each backend should have one recommended default stack
- advanced users can swap providers later
## Recommended Defaults
If you want a simple opinionated experience for users, keep one default combo per backend.
Suggested defaults:
- `Deno`: OpenAI realtime
- `Cloudflare`: Workers AI STT/TTS + OpenAI LLM
- `FastAPI`: Deepgram + OpenAI + ElevenLabs
That gives users one obvious starting point without taking away flexibility.
## Important Files
If you want to change the FastAPI backend, start here:
- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi/server.py`
- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi/classic_route.py`
- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi/esp32_transport.py`
- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi/models/llm/__init__.py`
- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi/models/stt/__init__.py`
- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi/models/tts/__init__.py`
## Current Notes
- The filesystem still contains many scaffolded provider modules from the earlier broader experiment.
- The active provider registry is now intentionally much smaller.
- That means the codebase stays extensible, but the user-facing default path stays simple.

@@ -1,55 +0,0 @@
"""Classic STT -> LLM -> TTS pipeline builder."""

from __future__ import annotations

import os

from character_prompt import LANGUAGE_LEARNING_PAL_PROMPT
from loguru import logger
from models.llm import create_llm_service
from models.stt import create_stt_service
from models.tts import create_tts_service
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
    LLMUserAggregatorParams,
)


def build_classic_route(input_processor, context: LLMContext):
    stt_provider = os.getenv("CLASSIC_STT_PROVIDER", "deepgram")
    llm_provider = os.getenv("CLASSIC_LLM_PROVIDER", "openai")
    tts_provider = os.getenv("CLASSIC_TTS_PROVIDER", "elevenlabs")

    logger.info(
        "Building classic route with stt={} llm={} tts={}",
        stt_provider,
        llm_provider,
        tts_provider,
    )

    stt = create_stt_service(stt_provider)
    llm = create_llm_service(
        llm_provider,
        system_instruction=LANGUAGE_LEARNING_PAL_PROMPT,
    )
    tts = create_tts_service(tts_provider)

    user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
        context,
        user_params=LLMUserAggregatorParams(
            vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=1))
        ),
    )

    processors = [
        input_processor,
        stt,
        user_aggregator,
        llm,
        tts,
    ]
    return processors, assistant_aggregator

@@ -1,27 +0,0 @@
DEEPGRAM_API_KEY=your_deepgram_api_key
OPENAI_API_KEY=your_openai_api_key
ANTHROPIC_API_KEY=your_anthropic_api_key
GEMINI_API_KEY=your_gemini_api_key
XAI_API_KEY=your_xai_api_key
ELEVENLABS_API_KEY=your_elevenlabs_api_key
CARTESIA_API_KEY=your_cartesia_api_key

# Classic route providers
CURRENT_VOICE_ROUTE=classic
CLASSIC_STT_PROVIDER=deepgram
CLASSIC_LLM_PROVIDER=openai
CLASSIC_TTS_PROVIDER=elevenlabs

# Transport and pipeline sample rates
ESP32_INPUT_SAMPLE_RATE=16000
BROWSER_INPUT_SAMPLE_RATE=16000
AUDIO_OUTPUT_SAMPLE_RATE=24000
PIPELINE_AUDIO_IN_SAMPLE_RATE=16000
PIPELINE_AUDIO_OUT_SAMPLE_RATE=24000

# Browser / Next.js access
ALLOWED_ORIGINS=*

# WebSocket server settings
HOST=0.0.0.0
PORT=7860

@@ -1,20 +0,0 @@
"""LLM provider registry."""

from __future__ import annotations

from models._provider_loader import load_provider_factory

LLM_REGISTRY = {
    "claude": "models.llm.anthropic",
    "anthropic": "models.llm.anthropic",
    "gemini": "models.llm.google_gemini",
    "google_gemini": "models.llm.google_gemini",
    "google_vertex_ai": "models.llm.google_vertex_ai",
    "grok": "models.llm.grok",
    "openai": "models.llm.openai",
}


def create_llm_service(provider_name: str, **kwargs):
    factory = load_provider_factory(LLM_REGISTRY, provider_name, "LLM")
    return factory(**kwargs)

@@ -1,16 +0,0 @@
"""STT provider registry."""

from __future__ import annotations

from models._provider_loader import load_provider_factory

STT_REGISTRY = {
    "deepgram": "models.stt.deepgram",
    "openai": "models.stt.openai",
    "whisper": "models.stt.whisper",
}


def create_stt_service(provider_name: str, **kwargs):
    factory = load_provider_factory(STT_REGISTRY, provider_name, "STT")
    return factory(**kwargs)

@@ -1,17 +0,0 @@
"""TTS provider registry."""

from __future__ import annotations

from models._provider_loader import load_provider_factory

TTS_REGISTRY = {
    "cartesia": "models.tts.cartesia",
    "deepgram": "models.tts.deepgram",
    "elevenlabs": "models.tts.elevenlabs",
    "openai": "models.tts.openai",
}


def create_tts_service(provider_name: str, **kwargs):
    factory = load_provider_factory(TTS_REGISTRY, provider_name, "TTS")
    return factory(**kwargs)

@@ -54,7 +54,7 @@ That gives people one obvious path to get running before they start swapping pro
 ```text
 server/fastapi/
 ├── bot.py
-├── classic_route.py
+├── voice_pipeline.py
 ├── esp32_transport.py
 ├── server.py
 ├── env.example
@@ -327,7 +327,7 @@ That gives users one obvious starting point without taking away flexibility.
 If you want to change the FastAPI backend, start here:
 - `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server/fastapi/server.py`
-- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server/fastapi/classic_route.py`
+- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server/fastapi/voice_pipeline.py`
 - `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server/fastapi/esp32_transport.py`
 - `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server/fastapi/models/llm/__init__.py`
 - `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server/fastapi/models/stt/__init__.py`

@@ -9,7 +9,7 @@
 import os
 from typing import Literal
-from classic_route import build_classic_route
+from voice_pipeline import build_voice_pipeline
 from dotenv import load_dotenv
 from gem_live_route import build_gem_live_route
 from grok_route import build_grok_route
@@ -155,7 +155,7 @@ async def run_bot_session(
     elif voice_route == "grok":
         route_processors, assistant_aggregator = build_grok_route(input_processor, context)
     else:
-        route_processors, assistant_aggregator = build_classic_route(input_processor, context)
+        route_processors, assistant_aggregator = build_voice_pipeline(input_processor, context)
     processors = [transport.input(), *route_processors]

@@ -1,4 +1,4 @@
-"""Classic STT -> LLM -> TTS pipeline builder."""
+"""Default STT -> LLM -> TTS voice pipeline builder."""
 from __future__ import annotations
@@ -18,7 +18,7 @@ from pipecat.processors.aggregators.llm_response_universal import (
 )
-def build_classic_route(input_processor, context: LLMContext):
+def build_voice_pipeline(input_processor, context: LLMContext):
     stt_provider = os.getenv("CLASSIC_STT_PROVIDER", "deepgram")
     llm_provider = os.getenv("CLASSIC_LLM_PROVIDER", "openai")
     tts_provider = os.getenv("CLASSIC_TTS_PROVIDER", "elevenlabs")