server-cloudflare

Cloudflare Workers + Durable Objects voice backend for Elato.

This first iteration exposes a single ESP32-compatible websocket path:

  • /ws/esp32

The route is backed by a Durable Object that preserves the Elato device control protocol.
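
As a rough illustration of the routing described above, the Worker's decision can be sketched as a small pure function. Everything here is an assumption for illustration: `decideRoute`, `RouteDecision`, the `deviceId` parameter, and the idea of keying one Durable Object per device are hypothetical and not taken from this repository.

```typescript
// Hypothetical routing helper; names are illustrative, not from this repo.
type RouteDecision =
  | { kind: "forward"; objectName: string }
  | { kind: "reject"; status: number; reason: string };

const ESP32_PATH = "/ws/esp32";

function decideRoute(
  pathname: string,
  upgradeHeader: string | null,
  deviceId: string | null,
): RouteDecision {
  if (pathname !== ESP32_PATH) {
    return { kind: "reject", status: 404, reason: "unknown path" };
  }
  // WebSocket handshakes carry "Upgrade: websocket" (compared case-insensitively).
  if ((upgradeHeader ?? "").toLowerCase() !== "websocket") {
    return { kind: "reject", status: 426, reason: "expected websocket upgrade" };
  }
  // An auth check would slot in here before anything is forwarded.
  // Assumption: one Durable Object per device, keyed by a device identifier.
  return { kind: "forward", objectName: deviceId ?? "default" };
}
```

In a real Worker, a `forward` decision would typically turn into `idFromName(objectName)` on the Durable Object namespace binding, followed by `get(id).fetch(request)` on the resulting stub. The binding name depends on wrangler.toml and is not assumed here.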

Current stack

  • STT: @cf/openai/whisper
  • LLM: OpenAI Chat Completions
  • TTS: @cf/deepgram/aura-1
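
One way to picture how these three stages chain within a single voice turn is the sketch below. The `VoiceStages` interface and `runTurn` function are hypothetical; the actual calls to Workers AI and OpenAI are left behind injected functions because their exact request shapes are not shown in this README.

```typescript
// Hypothetical stage signatures; real code would call Workers AI and OpenAI here.
interface VoiceStages {
  transcribe(audio: Uint8Array): Promise<string>; // STT stage
  complete(prompt: string): Promise<string>; // LLM stage
  synthesize(text: string): Promise<Uint8Array>; // TTS stage
}

// Runs one turn: device audio in, synthesized reply audio out.
// Returns null when the transcript is empty, so no LLM or TTS call is made.
async function runTurn(
  stages: VoiceStages,
  audio: Uint8Array,
): Promise<Uint8Array | null> {
  const transcript = (await stages.transcribe(audio)).trim();
  if (transcript.length === 0) return null;
  const reply = await stages.complete(transcript);
  return stages.synthesize(reply);
}
```

Injecting the stages keeps the turn logic testable with fakes and makes it straightforward to swap a model without touching the session code.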

Local setup

  1. Install dependencies

npm install

  2. Copy .dev.vars.example to .dev.vars and fill in your keys.

  3. Run locally

npm run dev

Notes

  • ESP32 clients should connect to:
wss://<worker-domain>/ws/esp32
  • Auth is intentionally left out of this iteration. Add your own auth check in the Worker route before using this in production.
  • This backend targets the current Elato ESP32 control protocol first: auth, AUDIO.COMMITTED, RESPONSE.CREATED, binary audio frames, RESPONSE.COMPLETE, and SESSION.END.
  • It does not currently use @cloudflare/voice; the Durable Object owns the websocket session directly so the firmware protocol stays explicit.
  • The ESP32 route now packetizes Cloudflare TTS output into Opus frames before sending binary websocket packets, matching the 24 kHz mono / 120 ms framing used by server-deno.
  • The remaining gap is operational, not transport-level: this prototype still has placeholder auth / DB comments and has not been load-tested against long-running device sessions yet.
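
The 24 kHz mono / 120 ms framing mentioned above works out to 2880 samples per frame (24000 × 0.120). The sketch below shows only that chunking step, under the assumption that the TTS audio is available as 16-bit PCM at 24 kHz before encoding. `splitIntoFrames` is a hypothetical helper, and the Opus encoding itself is deliberately left out.

```typescript
const SAMPLE_RATE_HZ = 24000;
const FRAME_MS = 120;
// 24000 samples/s * 0.120 s = 2880 samples per mono frame.
const SAMPLES_PER_FRAME = (SAMPLE_RATE_HZ * FRAME_MS) / 1000;

// Splits mono 16-bit PCM into fixed-size 120 ms frames.
// Assumption: a short final chunk is padded with silence (zeros) so every
// frame handed to the encoder has the same length.
function splitIntoFrames(pcm: Int16Array): Int16Array[] {
  const frames: Int16Array[] = [];
  for (let offset = 0; offset < pcm.length; offset += SAMPLES_PER_FRAME) {
    const frame = new Int16Array(SAMPLES_PER_FRAME); // zero-filled by default
    frame.set(pcm.subarray(offset, offset + SAMPLES_PER_FRAME));
    frames.push(frame);
  }
  return frames;
}
```

Each resulting frame would then be passed to an Opus encoder and sent as one binary websocket packet. At 16 bits per sample, a frame is 5760 bytes of PCM before encoding.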