- Bump esperanto dependency to >=2.20.0 for new provider profiles - Register both providers in credentials, key provider, connection tester, model discovery, and models router - Add frontend provider entries (display names, modalities, docs links) - Add documentation sections for both providers in ai-providers.md, environment-reference.md, and provider comparison
16 KiB
AI Providers - Configuration Guide
Complete setup instructions for each AI provider via the Settings UI.
New in v1.2: All AI provider credentials are now managed through the Settings UI. Environment variables for API keys are deprecated.
How Provider Setup Works
Open Notebook uses a credential-based system for managing AI providers:
- Get your API key from the provider's website
- Open Settings → API Keys → Add Credential
- Test the connection to verify it works
- Discover & Register Models to make them available
- Start using the provider in your notebooks
Prerequisite: You must set
OPEN_NOTEBOOK_ENCRYPTION_KEYin your docker-compose.yml before storing credentials. See API Configuration for details.
Cloud Providers (Recommended for Most)
OpenAI
Cost: ~$0.03-0.15 per 1K tokens (varies by model)
Get Your API Key:
- Go to https://platform.openai.com/api-keys
- Create account (if needed)
- Create new API key (starts with "sk-proj-")
- Add $5+ credits to account
Configure in Open Notebook:
- Go to Settings → API Keys
- Click Add Credential
- Select provider: OpenAI
- Give it a name (e.g., "My OpenAI Key")
- Paste your API key
- Click Save, then Test Connection
- Click Discover Models to find available models
- Click Register Models to make them available
Available Models (in Open Notebook):
gpt-4o— Best quality, fast (latest version)gpt-4o-mini— Fast, cheap, good for testingo1— Advanced reasoning model (slower, more expensive)o1-mini— Faster reasoning model
Recommended:
- For general use:
gpt-4o(best balance) - For testing/cheap:
gpt-4o-mini(90% cheaper) - For complex reasoning:
o1(best for hard problems)
Cost Estimate:
Light use: $1-5/month
Medium use: $10-30/month
Heavy use: $50-100+/month
Troubleshooting:
- "Invalid API key" → Check key starts with "sk-proj-" and test the connection in Settings
- "Rate limit exceeded" → Wait or upgrade account
- "Model not available" → Try gpt-4o-mini instead, or re-discover models
Anthropic (Claude)
Cost: ~$0.80-3.00 per 1M tokens (cheaper than OpenAI for long context)
Get Your API Key:
- Go to https://console.anthropic.com/
- Create account or login
- Go to API keys section
- Create new API key (starts with "sk-ant-")
Configure in Open Notebook:
- Go to Settings → API Keys
- Click Add Credential
- Select provider: Anthropic
- Give it a name, paste your API key
- Click Save, then Test Connection
- Click Discover Models → Register Models
Available Models:
claude-sonnet-4-5-20250929— Latest, best quality (recommended)claude-3-5-sonnet-20241022— Previous generation, still excellentclaude-3-5-haiku-20241022— Fast, cheapclaude-opus-4-5-20251101— Most powerful, expensive
Recommended:
- For general use:
claude-sonnet-4-5(best overall, latest) - For cheap:
claude-3-5-haiku(80% cheaper) - For complex:
claude-opus-4-5(most capable)
Cost Estimate:
Sonnet: $3-20/month (typical use)
Haiku: $0.50-3/month
Opus: $10-50+/month
Advantages:
- Great long-context support (200K tokens)
- Excellent reasoning
- Fast processing
Troubleshooting:
- "Invalid API key" → Check it starts with "sk-ant-" and test in Settings
- "Overloaded" → Anthropic is busy, retry later
- "Model unavailable" → Re-discover models from the credential
Google Gemini
Cost: ~$0.075-0.30 per 1K tokens (competitive with OpenAI)
Get Your API Key:
- Go to https://aistudio.google.com/app/apikey
- Create account or login
- Create new API key
Configure in Open Notebook:
- Go to Settings → API Keys
- Click Add Credential
- Select provider: Google Gemini
- Give it a name, paste your API key
- Click Save, then Test Connection
- Click Discover Models → Register Models
Available Models:
gemini-2.0-flash-exp— Latest experimental, fastest (recommended)gemini-2.0-flash— Stable version, fast, cheap
Recommended:
- For general use:
gemini-2.0-flash-exp(best value, latest) - For cheap:
gemini-1.5-flash(very cheap) - For complex/long context:
gemini-1.5-pro-latest(2M token context)
Advantages:
- Very long context (1M tokens)
- Multimodal (images, audio, video)
- Good for podcasts
Troubleshooting:
- "API key invalid" → Get fresh key from aistudio.google.com
- "Quota exceeded" → Free tier limited, upgrade account
- "Model not found" → Re-discover models from the credential
Groq
Cost: ~$0.05 per 1M tokens (cheapest, but limited models)
Get Your API Key:
- Go to https://console.groq.com/keys
- Create account or login
- Create new API key
Configure in Open Notebook:
- Go to Settings → API Keys
- Click Add Credential
- Select provider: Groq
- Give it a name, paste your API key
- Click Save, then Test Connection
- Click Discover Models → Register Models
Available Models:
llama-3.3-70b-versatile— Best on Groq (recommended)llama-3.1-70b-versatile— Fast, capablemixtral-8x7b-32768— Good alternativegemma2-9b-it— Small, very fast
Recommended:
- For quality:
llama-3.3-70b-versatile(best overall) - For speed:
gemma2-9b-it(ultra-fast) - For balance:
llama-3.1-70b-versatile
Advantages:
- Ultra-fast inference
- Very cheap
- Great for transformations/batch work
Disadvantages:
- Limited model selection
- Smaller models than OpenAI/Anthropic
Troubleshooting:
- "Rate limited" → Free tier has limits, upgrade
- "Model not available" → Re-discover models from the credential
OpenRouter
Cost: Varies by model ($0.05-15 per 1M tokens)
Get Your API Key:
- Go to https://openrouter.ai/keys
- Create account or login
- Add credits to your account
- Create new API key
Configure in Open Notebook:
- Go to Settings → API Keys
- Click Add Credential
- Select provider: OpenRouter
- Give it a name, paste your API key
- Click Save, then Test Connection
- Click Discover Models → Register Models
Available Models (100+ options):
- OpenAI:
openai/gpt-4o,openai/o1 - Anthropic:
anthropic/claude-sonnet-4.5,anthropic/claude-3.5-haiku - Google:
google/gemini-2.0-flash-exp,google/gemini-1.5-pro - Meta:
meta-llama/llama-3.3-70b-instruct,meta-llama/llama-3.1-405b-instruct - Mistral:
mistralai/mistral-large-2411 - DeepSeek:
deepseek/deepseek-chat - And many more...
Recommended:
- For quality:
anthropic/claude-sonnet-4.5(best overall) - For speed/cost:
google/gemini-2.0-flash-exp(very fast, cheap) - For open-source:
meta-llama/llama-3.3-70b-instruct - For reasoning:
openai/o1
Advantages:
- One API key for 100+ models
- Unified billing
- Easy model comparison
- Access to models that may have waitlists elsewhere
Cost Estimate:
Light use: $1-5/month
Medium use: $10-30/month
Heavy use: Depends on models chosen
Troubleshooting:
- "Invalid API key" → Check it starts with "sk-or-"
- "Insufficient credits" → Add credits at openrouter.ai
- "Model not available" → Check model ID spelling (use full path)
DashScope (Qwen)
Cost: ~$0.01-0.06 per 1K tokens (varies by model)
Get Your API Key:
- Go to https://dashscope.console.aliyun.com/
- Create an Alibaba Cloud account (if needed)
- Navigate to API Keys section
- Create a new API key
Configure in Open Notebook:
- Go to Settings → API Keys
- Click Add Credential
- Select provider: DashScope (Qwen)
- Give it a name, paste your API key
- Click Save, then Test Connection
- Click Discover Models → Register Models
Available Models:
qwen-max— Most capable Qwen modelqwen-plus— Good balance of quality and speedqwen-turbo— Fastest, cheapest
Recommended:
- For quality:
qwen-max(best overall) - For general use:
qwen-plus(good balance) - For speed/cost:
qwen-turbo(cheapest)
Troubleshooting:
- "Invalid API key" → Check the key in the DashScope console
- "Model not available" → Re-discover models from the credential
MiniMax
Cost: Varies by model
Get Your API Key:
- Go to https://platform.minimaxi.com/
- Create an account (if needed)
- Navigate to API Keys section
- Create a new API key
Configure in Open Notebook:
- Go to Settings → API Keys
- Click Add Credential
- Select provider: MiniMax
- Give it a name, paste your API key
- Click Save, then Test Connection
- Click Discover Models → Register Models
Available Models:
MiniMax-M2.5— Most capable, 204K contextMiniMax-M2.5-highspeed— Faster variant, 204K context
Recommended:
- For quality:
MiniMax-M2.5(best overall) - For speed:
MiniMax-M2.5-highspeed(faster responses)
Advantages:
- Very long context (204K tokens)
- Competitive pricing
Troubleshooting:
- "Invalid API key" → Check the key in the MiniMax platform
- "Model not available" → Re-discover models from the credential
Self-Hosted / Local
Ollama (Recommended for Local)
Cost: Free (electricity only)
Setup Ollama:
- Install Ollama: https://ollama.ai
- Run Ollama in background:
ollama serve - Download a model:
ollama pull mistral
Configure in Open Notebook:
- Go to Settings → API Keys
- Click Add Credential
- Select provider: Ollama
- Give it a name (e.g., "Local Ollama")
- Enter the base URL:
- Same machine (non-Docker):
http://localhost:11434 - Docker with Ollama on host:
http://host.docker.internal:11434 - Docker with Ollama container:
http://ollama:11434
- Same machine (non-Docker):
- Click Save, then Test Connection
- Click Discover Models → Register Models
See Ollama Setup Guide for detailed network configuration.
Available Models:
llama3.3:70b— Best quality (requires 40GB+ RAM)llama3.1:8b— Recommended, balanced (8GB RAM)qwen2.5:7b— Excellent for code and reasoningmistral:7b— Good general purposephi3:3.8b— Small, fast (4GB RAM)gemma2:9b— Google's model, balanced- Many more:
ollama listto see available
Recommended:
- For quality (with GPU):
llama3.3:70b(best) - For general use:
llama3.1:8b(best balance) - For speed/low memory:
phi3:3.8b(very fast) - For coding:
qwen2.5:7b(excellent at code)
Hardware Requirements:
GPU (NVIDIA/AMD):
8GB VRAM: Runs most models fine
6GB VRAM: Works, slower
4GB VRAM: Small models only
CPU-only:
16GB+ RAM: Slow but works
8GB RAM: Very slow
4GB RAM: Not recommended
Advantages:
- Completely private (runs locally)
- Free (electricity only)
- No API key needed
- Works offline
Disadvantages:
- Slower than cloud (unless on GPU)
- Smaller models than cloud
- Requires local hardware
Troubleshooting:
- "Connection refused" → Ollama not running or wrong URL in credential
- "Model not found" → Download it:
ollama pull modelname - "Out of memory" → Use smaller model or add more RAM
LM Studio (Local Alternative)
Cost: Free
Setup LM Studio:
- Download LM Studio: https://lmstudio.ai
- Open app
- Download a model from library
- Go to "Local Server" tab
- Start server (default port: 1234)
Configure in Open Notebook:
- Go to Settings → API Keys
- Click Add Credential
- Select provider: OpenAI-Compatible
- Give it a name (e.g., "LM Studio")
- Enter the base URL:
http://host.docker.internal:1234/v1(Docker) orhttp://localhost:1234/v1(local) - API key:
lm-studio(placeholder, LM Studio doesn't require one) - Click Save, then Test Connection
Advantages:
- GUI interface (easier than Ollama CLI)
- Good model selection
- Privacy-focused
- Works offline
Disadvantages:
- Desktop only (Mac/Windows/Linux)
- Slower than cloud
- Requires local GPU
Custom OpenAI-Compatible
For Text Generation UI, vLLM, or other OpenAI-compatible endpoints:
- Go to Settings → API Keys
- Click Add Credential
- Select provider: OpenAI-Compatible
- Enter the base URL for your endpoint (e.g.,
http://localhost:8000/v1) - Enter API key if required
- Optionally configure per-service URLs (LLM, Embedding, TTS, STT)
- Click Save, then Test Connection
See OpenAI-Compatible Setup for detailed instructions.
Enterprise
Azure OpenAI
Cost: Same as OpenAI (usage-based)
Configure in Open Notebook:
- Create Azure OpenAI service in Azure portal
- Deploy GPT-4/3.5-turbo model
- Get your endpoint and key
- Go to Settings → API Keys
- Click Add Credential
- Select provider: Azure OpenAI
- Fill in: API Key, Endpoint, API Version
- Optionally configure service-specific endpoints (LLM, Embedding)
- Click Save, then Test Connection
Advantages:
- Enterprise support
- VPC integration
- Compliance (HIPAA, SOC2, etc.)
Disadvantages:
- More complex setup
- Higher overhead
- Requires Azure account
Embeddings (For Search/Semantic Features)
By default, Open Notebook uses the LLM provider's embeddings. Embedding models are discovered and registered through the same credential system — when you discover models from a credential, embedding models are included alongside language models.
Choosing Your Provider
1. Don't want to run locally and don't want to mess around with different providers:
Use OpenAI
- Cloud-based
- Good quality
- Reasonable cost
- Simplest setup, supports all modes (text, embedding, tts, stt, etc)
For budget-conscious: Groq, OpenRouter or Ollama
- Groq: Super cheap cloud
- Ollama: Free, but local
- OpenRouter: many open source models very accessible
For privacy-first: Ollama or LM Studio and Speaches (TTS, STT)
- Everything stays local
- Works offline
- No API keys sent anywhere
For enterprise: Azure OpenAI
- Compliance
- VPC integration
- Support
Next Steps
- Choose your provider from above
- Get API key (if cloud) or install locally (if Ollama)
- Set
OPEN_NOTEBOOK_ENCRYPTION_KEYin your docker-compose.yml (required for storing credentials) - Open Settings → API Keys → Add Credential
- Test Connection to verify it works
- Discover & Register Models to make them available
- Verify it works with a test chat
Multiple providers: You can add credentials for as many providers as you want. Create separate credentials for different projects or team members.
Done!
Legacy: Environment Variables (Deprecated)
Deprecated: Configuring AI provider API keys via environment variables is deprecated. Use the Settings UI instead. Environment variables may still work as a fallback but are no longer the recommended approach.
If you are migrating from an older version that used environment variables, go to Settings → API Keys and click the Migrate to Database button to import your existing keys into the credential system.
Related
- API Configuration — Detailed credential management guide
- Environment Reference - Complete list of all environment variables
- Advanced Configuration - Timeouts, SSL, performance tuning
- Ollama Setup - Detailed Ollama configuration guide
- OpenAI-Compatible - LM Studio and other compatible providers
- Local TTS Setup - Text-to-speech with Speaches
- Local STT Setup - Speech-to-text with Speaches
- Troubleshooting - Common issues and fixes