elevenlabs docs update

akdeb 2025-09-20 05:15:57 +07:00
parent 759340de72
commit e629665dcc
3 changed files with 24 additions and 24 deletions


@@ -69,7 +69,7 @@ Realtime AI Speech powered by **OpenAI Realtime API**, **Eleven Labs AI Agents**
</div>
## 📽️ Demo Video [(✨ Gemini demo)](https://youtu.be/_zUBue3pfVI)
## 📽️ Demo Video ([✨ Gemini demo](https://youtu.be/_zUBue3pfVI), [Eleven Labs Demo](https://youtu.be/7LKTIuEW-hg))
<div align="center">
<a href="https://www.youtube.com/watch?v=o1eIAwVll5I" target="_blank">
@@ -77,7 +77,7 @@ Realtime AI Speech powered by **OpenAI Realtime API**, **Eleven Labs AI Agents**
</a>
</div>
Video links: [OpenAI Demo](https://youtu.be/o1eIAwVll5I) | [Gemini Demo](https://youtu.be/_zUBue3pfVI)
Video links: [OpenAI Demo](https://youtu.be/o1eIAwVll5I) | [Gemini Demo](https://youtu.be/_zUBue3pfVI) | [Eleven Labs Demo](https://youtu.be/7LKTIuEW-hg)
## 👷‍♀️ DIY Hardware Design
@@ -235,10 +235,10 @@ flowchart TD
ESP32[ESP32 Device] -->|WebSocket| Edge[Deno Edge Function]
Edge -->|OpenAI API| OpenAI[OpenAI Realtime API]
Edge -->|Gemini API| Gemini[Gemini Live API]
Edge -->|ElevenLabs API| ElevenLabs[ElevenLabs AI Agents]
Edge -->|Eleven Labs API| Eleven Labs[Eleven Labs AI Agents]
OpenAI --> Edge
Gemini --> Edge
ElevenLabs --> Edge
Eleven Labs --> Edge
Edge -->|WebSocket| ESP32
ESP32 --> UserOutput
```
@@ -281,7 +281,7 @@ lib_deps =
```
## Additional Docs
- [⏸️ Using the ElevenLabs API](./docs/ElevenLabs.md)
- [⏸️ Using the Eleven Labs API](./docs/ElevenLabs.md)
- [📈 Core Use Cases](./docs/Usecases.md)
- [🤖🤖🤖 Getting Started with multiple devices](./docs/MultipleDevices.md)
@@ -311,7 +311,7 @@ lib_deps =
2. Adding Arduino IDE support
3. Add Hume API client for emotion detection
4. Add MCP support on Deno Edge
5. Plug in ElevenLabs API for voice generation
5. Plug in Eleven Labs API for voice generation
6. Add Azure OpenAI Support (easy pickings)
We welcome contributions


@@ -7,7 +7,7 @@
```
2. **Agent Configuration**
- Create an agent in the ElevenLabs dashboard
- Create an agent in the Eleven Labs dashboard
- Copy the agent ID
- On the Elato UI, Click `+ Create new` and create an Eleven Labs character with a `title` and the `agentId`
@@ -16,16 +16,16 @@
1. **Connection Flow**:
- ESP32 connects to your Deno server via WebSocket
- Server authenticates the user and gets their personality configuration
- If provider is "elevenlabs", server requests a signed URL from ElevenLabs API
- Server establishes WebSocket connection to ElevenLabs using the signed URL
- Server acts as a relay between ESP32 and ElevenLabs
- If provider is "elevenlabs", server requests a signed URL from Eleven Labs API
- Server establishes WebSocket connection to Eleven Labs using the signed URL
- Server acts as a relay between ESP32 and Eleven Labs
2. **Audio Processing**:
- IMPORTANT: In your ElevenLabs Agent Settings > Voice > TTS Output Format > Set this to PCM 24kHz.
- IMPORTANT: In your Eleven Labs Agent Settings > Voice > TTS Output Format > Set this to PCM 24kHz.
- Currently the AI Agent must speak first. You can change this behaviour in `Audio.cpp`.
- ESP32 sends PCM16 audio data (binary) to server (If you change this, you will also want to change your ElevenLabs Audio input settings)
- Server converts to base64 and forwards to ElevenLabs
- ElevenLabs sends back base64 audio data
- ESP32 sends PCM16 audio data (binary) to server (If you change this, you will also want to change your Eleven Labs Audio input settings)
- Server converts to base64 and forwards to Eleven Labs
- Eleven Labs sends back base64 audio data
- Server converts to PCM16, encodes with Opus, and sends to ESP32
3. **Message Types**:
@@ -35,16 +35,16 @@
## Usage
Once configured, the ElevenLabs provider works exactly like OpenAI and Gemini. The server will automatically route to the ElevenLabs implementation when the provider is set to "elevenlabs".
Once configured, the Eleven Labs provider works exactly like OpenAI and Gemini. The server will automatically route to the Eleven Labs implementation when the provider is set to "elevenlabs".
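The routing described above can be pictured as a lookup from the provider string to a relay handler. This is only a sketch under stated assumptions: the names `relayHandlers` and `routeByProvider` are hypothetical, the `"openai"` and `"gemini"` keys are guesses (only `"elevenlabs"` appears in this document), and the handlers return a label instead of opening a real connection.

```typescript
// Hypothetical provider dispatch; not the repository's implementation.
// Each handler stands in for the code that would open the upstream socket.
type RelayHandler = (deviceId: string) => string;

const relayHandlers: { [provider: string]: RelayHandler } = {
  // "openai" and "gemini" are assumed identifiers.
  openai: (deviceId) => "openai relay for " + deviceId,
  gemini: (deviceId) => "gemini relay for " + deviceId,
  // "elevenlabs" is the provider value named in the docs.
  elevenlabs: (deviceId) => "elevenlabs relay for " + deviceId,
};

function routeByProvider(provider: string, deviceId: string): string {
  // Reject unknown providers instead of silently falling back to a default.
  if (!Object.prototype.hasOwnProperty.call(relayHandlers, provider)) {
    throw new Error("Unknown provider: " + provider);
  }
  return relayHandlers[provider](deviceId);
}

console.log(routeByProvider("elevenlabs", "esp32-1"));
```

Failing loudly on an unrecognised provider is a design choice of this sketch; a real server might prefer a default provider.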
## Differences from OpenAI/Gemini
- **No system prompts**: ElevenLabs agents are configured in their dashboard through your account (and api key via a signedUrl)
- **Agent-based**: More advanced with workflow handling and tool calling with MCP through the ElevenLabs dashboard
- **No system prompts**: Eleven Labs agents are configured in their dashboard through your account (and api key via a signedUrl)
- **Agent-based**: More advanced with workflow handling and tool calling with MCP through the Eleven Labs dashboard
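Since the agent is reached through a signed URL, the server first has to request one with the account's API key. A minimal sketch of how that request might be assembled follows. The endpoint path, the `agent_id` query parameter, and the `xi-api-key` header are assumptions drawn from the public Eleven Labs API and should be verified against the current reference. The helper name `buildSignedUrlRequest` is hypothetical and does not come from this repository.

```typescript
// Hypothetical helper: describes the HTTP request for a signed URL.
// Endpoint path and header name are assumptions; check the API reference.
interface SignedUrlRequest {
  url: string;
  headers: { [headerName: string]: string };
}

function buildSignedUrlRequest(agentId: string, apiKey: string): SignedUrlRequest {
  if (agentId.length === 0) {
    throw new Error("agentId is required");
  }
  const base = "https://api.elevenlabs.io/v1/convai/conversation/get_signed_url";
  return {
    // The agent ID copied from the dashboard is URL-encoded into the query.
    url: base + "?agent_id=" + encodeURIComponent(agentId),
    // The API key stays on the server and travels only in this header.
    headers: { "xi-api-key": apiKey },
  };
}

// Example values are placeholders, not real credentials.
const signedUrlRequest = buildSignedUrlRequest("agent_123", "example-key");
console.log(signedUrlRequest.url);
```

The server would then issue this request and open its WebSocket against the URL returned in the response body.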
## Troubleshooting
1. **Connection Issues**: Ensure your ElevenLabs API key is valid and has access to Conversational AI
2. **Agent Not Found**: Verify the agent ID is correct and the agent exists in your ElevenLabs account
1. **Connection Issues**: Ensure your Eleven Labs API key is valid and has access to Conversational AI
2. **Agent Not Found**: Verify the agent ID is correct and the agent exists in your Eleven Labs account
3. **Audio Issues**: Check that the AI is speaking first. And that TTS Output is 24kHz pcm not 16kHz pcm (which is default)
4. **Transcription Missing**: Ensure your ElevenLabs agent has transcription enabled in the dashboard
4. **Transcription Missing**: Ensure your Eleven Labs agent has transcription enabled in the dashboard
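When debugging audio issues, it can help to picture the base64 steps from the Audio Processing list in isolation. Below is a minimal sketch with hypothetical function names. The `user_audio_chunk` field is an assumption about the agent WebSocket message shape, mono audio is assumed for the duration helper, and the Opus encoding step is left out.

```typescript
// Hypothetical helpers; not the repository's implementation.

// Encode raw PCM16 bytes (as received from the ESP32) to base64.
function pcm16ToBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Decode a base64 audio payload back to raw PCM16 bytes.
function base64ToPcm16(b64: string): Uint8Array {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Wrap an uplink chunk in the JSON shape assumed for the agent socket.
function toAgentMessage(bytes: Uint8Array): string {
  return JSON.stringify({ user_audio_chunk: pcm16ToBase64(bytes) });
}

// Duration of a mono PCM16 chunk: 2 bytes per sample.
function pcm16DurationMs(byteLength: number, sampleRateHz: number): number {
  return (byteLength / 2 / sampleRateHz) * 1000;
}

const sampleChunk = new Uint8Array([0x00, 0x01, 0xff, 0x7f]);
console.log(toAgentMessage(sampleChunk));
// 48000 bytes at 24 kHz, 16-bit mono is one second of audio.
console.log(pcm16DurationMs(48000, 24000)); // 1000
```

The duration helper shows why the output format matters: the same byte count played at 16 kHz instead of 24 kHz lasts 1.5 times as long, which is one way a format mismatch can show up as slowed or distorted speech.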


@@ -84,7 +84,7 @@ export default function ElevenLabsModal({ isOpen, onClose, onSuccess, selectedUs
<Label htmlFor="elevenLabsName">Character Name</Label>
<Input
id="elevenLabsName"
placeholder="My ElevenLabs Character"
placeholder="My Eleven Labs Character"
value={form.name}
onChange={handleNameChange}
required
@@ -101,7 +101,7 @@ export default function ElevenLabsModal({ isOpen, onClose, onSuccess, selectedUs
required
/>
<p className="text-xs text-gray-500">
Find this in your ElevenLabs dashboard under your agent settings
Find this in your Eleven Labs dashboard under your agent settings
</p>
</div>
@@ -130,7 +130,7 @@ export default function ElevenLabsModal({ isOpen, onClose, onSuccess, selectedUs
<Dialog open={isOpen} onOpenChange={handleClose}>
<DialogContent className="sm:max-w-md">
<DialogHeader>
<DialogTitle>Add ElevenLabs Character</DialogTitle>
<DialogTitle>Add Eleven Labs Character</DialogTitle>
</DialogHeader>
{FormContent}
</DialogContent>
@@ -142,7 +142,7 @@ export default function ElevenLabsModal({ isOpen, onClose, onSuccess, selectedUs
<Drawer open={isOpen} onOpenChange={handleClose}>
<DrawerContent>
<DrawerHeader>
<DrawerTitle>Add ElevenLabs Character</DrawerTitle>
<DrawerTitle>Add Eleven Labs Character</DrawerTitle>
</DrawerHeader>
<div className="px-4 pb-4">
{FormContent}