7.4 KiB
AI Model Selection Guide
This guide helps you choose the best AI models for your Open Notebook setup. We'll cover what makes each provider special, which models work best for different tasks, and give you ready-to-use combinations to get started quickly.
Understanding Model Types
Open Notebook uses four types of AI models:
- Language Models: For chat, text generation, summaries, and tool calling
- Embedding Models: For semantic search and content similarity
- Text-to-Speech (TTS): For generating podcasts and audio content
- Speech-to-Text (STT): For transcribing audio files
What to Consider When Choosing Models
💰 Cost: Some models are free (Ollama), others charge per token 🎯 Quality: Higher quality models often cost more but produce better results ⚡ Speed: Smaller models are faster but may be less capable 🔧 Features: Some models excel at specific tasks like tool calling or large contexts
Provider Breakdown
🟦 Google (Gemini)
Best for: Large context processing, affordable high-quality models
Language Models
gemini-2.0-flash- Excellent balance of price and performance with 1M context windowgemini-2.5-pro-preview-06-05- Premium model for complex reasoning tasks
Text-to-Speech
gemini-2.5-flash-preview-tts- Good quality at $10 per 1M tokensgemini-2.5-pro-preview-tts- Higher quality at $20 per 1M tokens
Embedding
text-embedding-004- Solid performance with generous free tier
🟢 OpenAI
Best for: Reliable performance, excellent tool calling, wide ecosystem support
Language Models
gpt-4o-mini- Great value for most tasks, perfect for everyday usegpt-4o- Premium quality with excellent tool calling capabilities
Text-to-Speech
tts-1- Good quality for personal use and podcasts
Speech-to-Text
whisper-1- Industry-standard transcription quality
Embedding
text-embedding-3-small- Affordable at $0.02 per 1M tokens with solid performance
🎤 ElevenLabs
Best for: High-quality voice synthesis and transcription
Text-to-Speech
eleven_turbo_v2_5- Excellent voice quality with reasonable pricing
Speech-to-Text
scribe_v1- High-quality transcription service
🔵 DeepSeek
Best for: Cost-effective language models with good performance
Language Models
deepseek-chat- Excellent quality-to-price ratio with 64k context window
🟡 Mistral
Best for: European-based alternative with competitive pricing
Language Models
mistral-medium-latest- Good balance of quality and priceministral-8b-latest- Perfect for simple tasks like transformations
Embedding
mistral-embed- Good quality, though not the most cost-effective
⚡ Grok (xAI)
Best for: Cutting-edge intelligence and reasoning
Language Models
grok-3- Top-tier intelligence, premium pricinggrok-3-mini- Excellent performance at more accessible pricing
🚢 Voyage AI
Best for: Specialized embedding models
Embedding
voyage-3.5-lite- Competitive with OpenAI's offering at similar pricing
🟣 Anthropic (Claude)
Best for: High-quality reasoning and safety
Language Models
claude-3-5-sonnet-latest- Exceptional quality for complex tasks
🦙 Ollama (Local/Free)
Best for: Privacy, offline use, and zero ongoing costs
Language Models
qwen3- Excellent free alternative for most language tasksgemma3- Great for chat and simple transformationsphi4- Compact but capable modeldeepseek-r1- Advanced reasoning capabilitiesllama4- Well-rounded performance
Embedding
mxbai-embed-large- Outstanding free embedding model
Recommended Combinations
🌟 Best Value (Mixed Providers)
Perfect balance of cost and performance
- Chat:
gpt-4o-mini(OpenAI) - Reliable and affordable - Tools:
gpt-4o(OpenAI) - Excellent tool calling - Transformations:
ministral-8b-latest(Mistral) - Cost-effective - Large Context:
gemini-2.0-flash(Google) - 1M context window - Embedding:
text-embedding-3-small(OpenAI) - Good price/performance - TTS:
gemini-2.5-flash-preview-tts(Google) - Affordable quality - STT:
whisper-1(OpenAI) - Industry standard
💰 Budget-Friendly (Mostly Free)
Great for getting started or keeping costs low
- Language:
qwen3(Ollama) - Free and capable - Tools:
qwen3(Ollama) - Handles basic tool calling - Transformations:
gemma3(Ollama) - Free and fast - Embedding:
mxbai-embed-large(Ollama) - Free, high quality - TTS:
gpt-4o-mini-tts(OpenAI) - Reasonable cost - STT:
whisper-1(OpenAI) - Best value
🚀 High Performance (Premium)
When quality is your top priority
- Chat:
claude-3-5-sonnet-latest(Anthropic) orgrok-3(xAI) - Exceptional reasoning - Tools:
gpt-4o(OpenAI) orclaude-3-5-sonnet-latest(Anthropic) orgrok-3(xAI) - Best tool calling - Transformations:
grok-3-mini(xAI) - Smart and efficient - Large Context:
gemini-2.5-pro-preview-06-05(Google) - Premium quality - Embedding:
voyage-3.5-lite(Voyage) - Specialized performance - TTS:
eleven_turbo_v2_5(ElevenLabs) - Premium voice quality - STT:
whisper-1(OpenAI) - Proven reliability
🏢 Single Provider (OpenAI)
Simplify billing and setup with one provider
- Chat:
gpt-4o-mini- Everyday conversations - Tools:
gpt-4o- Complex operations - Transformations:
gpt-4o-mini- Cost-effective processing - Embedding:
text-embedding-3-small- Solid performance - TTS:
gpt-4o-mini-tts- Great quality - STT:
whisper-1- Industry standard
Setting up Models
Here are the environment variables that you need to set up for each provider:
| Provider | Environment Variables |
|---|---|
| Mistral | MISTRAL_API_KEY |
| Deepseek | DEEPSEEK_API_KEY |
| OpenAI | OPENAI_API_KEY |
| Google (Gemini) | GEMINI_API_KEY |
| X.AI | XAI_API_KEY |
| ElevenLabs | ELEVENLABS_API_KEY |
| Anthropic | ANTHROPIC_API_KEY |
| Ollama | OLLAMA_BASE_URL |
| Azure OpenAI | AZURE_OPENAI_API_KEY, AZURE_OPENAI_API_VERSION, AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_DEPLOYMENT_NAME |
| Groq | GROQ_API_KEY |
| Vertex AI | VERTEX_PROJECT, GOOGLE_APPLICATION_CREDENTIALS, VERTEX_LOCATION |
Tips to use Text to Speech
OpenAI
To use the OpenAI provider for audio:
- Use "gpt-4o-mini-tts" as the audio model name
- pass the following Environment Variables
- OPENAI_API_KEY=your-openai-api-key
Gemini
To use the Gemini provider for audio:
- Use "default" as the audio model name (since vertex does not require a model name)
- pass the following Environment Variables
- GEMINI_API_KEY=gemini-2.5-flash-preview-tts
Google Cloud / Vertex (previously supported as Google)
If you were using this before, we recommend moving to GEMINI for better quality, price and ease of configuration.
To use the Google Cloud (Vertex) provider for audio:
- Use "default" as the audio model name (since vertex does not require a model name)
- pass the following Environment Variables
- VERTEX_PROJECT=your-google-cloud-project-name
- GOOGLE_APPLICATION_CREDENTIALS=./google-credentials.json
- VERTEX_LOCATION=your-google-cloud-project-location
- Setup the correct permissions in the Google Cloud Console