# Frequently Asked Questions This document addresses common questions about Open Notebook usage, configuration, and best practices. ## General Usage ### What is Open Notebook? Open Notebook is an open-source, privacy-focused alternative to Google's Notebook LM. It allows you to: - Create and manage research notebooks - Chat with your documents using AI - Generate podcasts from your content - Search across all your sources with semantic search - Transform and analyze your content ### How is Open Notebook different from Google Notebook LM? **Privacy**: Your data stays local by default. Only your chosen AI providers receive queries. **Flexibility**: Support for 15+ AI providers (OpenAI, Anthropic, Google, local models, etc.) **Customization**: Open source, so you can modify and extend functionality **Control**: You control your data, models, and processing ### Do I need technical skills to use Open Notebook? **Basic usage**: No technical skills required. The Docker installation is designed for non-technical users. **Advanced features**: Some technical knowledge helpful for: - Custom model configurations - API integrations - Source code modifications ### Can I use Open Notebook offline? **Partially**: The application runs locally, but requires internet for: - AI model API calls (unless using local models like Ollama) - Web content scraping - Some file processing features **Fully offline**: Possible with local models (Ollama) for basic functionality. ### What file types does Open Notebook support? **Documents**: - PDF (text extraction) - Microsoft Word (DOC, DOCX) - Plain text (TXT) - Markdown (MD) **Web Content**: - URLs (automatic web scraping) - YouTube videos (transcript extraction) - Web articles and blog posts **Media**: - Images (PNG, JPG, GIF, WebP) with OCR - Audio files (MP3, WAV, M4A) with transcription - Video files (MP4, AVI, MOV) for transcript extraction **Other**: - Direct text input - CSV data - Code files (with syntax highlighting) ### How much does it cost to run Open Notebook? **Software**: Free (open source) **AI API costs**: Pay-per-use to providers: - OpenAI: ~$0.50-5 per 1M tokens depending on model - Anthropic: ~$3-75 per 1M tokens depending on model - Google: Often free tier available - Local models: Free after initial setup **Typical monthly costs**: $5-50 for moderate usage, depending on chosen models. ## AI Models and Providers ### Which AI provider should I choose? **For beginners**: OpenAI (reliable, well-documented, good balance of cost/quality) **For advanced users**: Mix of providers based on specific needs **For privacy**: Local models (Ollama) or European providers (Mistral) **For cost optimization**: DeepSeek, Google (free tier), or OpenRouter ### Can I use multiple AI providers simultaneously? **Yes**: Open Notebook supports multiple providers. You can configure different providers for different tasks: - OpenAI for chat - Google for embeddings - ElevenLabs for text-to-speech - Anthropic for complex reasoning ### What are the best model combinations? **Budget-friendly**: - Language: `gpt-5-mini` (OpenAI) or `deepseek-chat` (DeepSeek) - Embedding: `text-embedding-3-small` (OpenAI) - TTS: `gpt-4o-mini-tts` (OpenAI) **High-quality**: - Language: `claude-3-7-sonnet` (Anthropic) or `gpt-4o` (OpenAI) - Embedding: `text-embedding-3-large` (OpenAI) - TTS: `eleven_turbo_v2_5` (ElevenLabs) **Privacy-focused**: - Language: Local Ollama models - Embedding: Local embedding models - TTS: Local TTS solutions ### How do I set up local models with Ollama? 1. **Install Ollama**: Download from https://ollama.ai 2. **Start Ollama**: `ollama serve` 3. **Download models**: `ollama pull llama2` 4. **Configure Open Notebook**: ```env OLLAMA_API_BASE=http://localhost:11434 ``` 5. **Select models**: In Models, choose Ollama models ### Why are my AI requests failing? **Common causes**: - Invalid API keys - Insufficient credits/billing - Model not available for your account - Rate limiting - Network connectivity issues **Solutions**: 1. Verify API keys in provider dashboard 2. Check billing and usage limits 3. Try different models 4. Wait and retry for rate limits 5. Check internet connection ### How do I optimize AI costs? **Model selection**: - Use smaller models for simple tasks - Use larger models only for complex reasoning - Leverage free tiers when available **Usage optimization**: - Process documents in batches - Use shorter prompts - Cache results when possible - Use local models for frequent tasks **Provider diversity**: - Use OpenRouter for expensive models - Use free tier providers for testing - Mix providers based on strength ## Data Management ### Where is my data stored? **Local storage**: By default, all data is stored locally: - Database: SurrealDB files in `surreal_data/` - Uploads: Files in `notebook_data/` - No external data transmission (except to chosen AI providers) **Cloud storage**: Not implemented, but can be configured with external storage solutions. ### How do I backup my data? **Manual backup**: ```bash # Create backup tar -czf backup-$(date +%Y%m%d).tar.gz notebook_data/ surreal_data/ # Restore backup tar -xzf backup-20240101.tar.gz ``` **Automated backup**: Set up cron jobs or use your preferred backup solution to backup the data directories. ### Can I sync data between devices? **Currently**: No built-in sync functionality. **Workarounds**: - Use shared network storage for data directories - Manual backup/restore between devices - Database replication (advanced) ### How do I migrate data between installations? 1. **Stop services**: `make stop-all` 2. **Copy data directories**: ```bash cp -r surreal_data/ new_installation/ cp -r notebook_data/ new_installation/ ``` 3. **Start new installation** 4. **Verify data integrity** ### What happens to my data if I delete a notebook? **Soft deletion**: Notebooks are marked as archived, not permanently deleted. **Hard deletion**: Currently not implemented in UI, but possible via API. **Recovery**: Archived notebooks can be restored from the database. ### How do I clean up old data? **Manual cleanup**: - Delete unused notebooks through UI - Remove old files from `notebook_data/` - Clear browser cache **Database cleanup**: Advanced users can query the database directly to remove old records. ## Best Practices ### How should I organize my notebooks? **By topic**: Create separate notebooks for different research areas **By project**: One notebook per project or course **By source type**: Separate notebooks for different content types **By time period**: Monthly or quarterly notebooks ### What's the optimal notebook size? **Recommended**: 20-100 sources per notebook **Performance**: Larger notebooks may have slower search **Organization**: Better to have focused notebooks than everything in one ### How do I get the best search results? **Use descriptive queries**: Instead of "data", use "data analysis methods" **Combine keywords**: Use multiple related terms **Use natural language**: Ask questions as you would to a human **Refine iteratively**: Start broad, then get more specific ### How can I improve chat responses? **Provide context**: Reference specific sources or topics **Be specific**: Ask detailed questions rather than general ones **Use follow-up questions**: Build on previous responses **Include examples**: Show what kind of response you want ### What's the best way to process large documents? **Break into sections**: Split large documents into smaller parts **Use transformations**: Apply summarization before adding to notebook **Batch processing**: Process multiple documents at once **Use background jobs**: For heavy processing tasks ### How do I handle multiple languages? **Model selection**: Choose models that support your languages **Language-specific providers**: Some providers are better for certain languages **Separate notebooks**: Consider separate notebooks for different languages **Encoding**: Ensure proper text encoding for non-English content ### What are the security best practices? **API keys**: Never share API keys publicly **Password protection**: Use strong passwords for public deployments **Network security**: Use HTTPS for production deployments **Regular updates**: Keep Docker images updated **Backup encryption**: Encrypt backups if they contain sensitive data ### How do I optimize performance? **Hardware**: - Use SSD storage for database - Allocate sufficient RAM (4GB+ recommended) - Use fast internet connection **Configuration**: - Choose appropriate models for your needs - Optimize embedding dimensions - Use efficient file formats **Usage patterns**: - Process documents in batches - Use background jobs for heavy tasks - Clear cache periodically ## Technical Questions ### Can I use Open Notebook programmatically? **Yes**: Open Notebook provides a comprehensive REST API: - Full API documentation at `/docs` endpoint - Support for all UI functionality - Authentication via API keys - Webhook support for notifications ### How do I extend Open Notebook? **Plugin system**: Add custom transformations and processors **API integration**: Build custom applications using the API **Source code**: Modify the open-source codebase **Custom models**: Add support for new AI providers ### Can I run Open Notebook in production? **Yes**: Designed for production use with: - Docker deployment - Horizontal scaling capability - Security features - Monitoring and logging **Considerations**: - Use production-grade database settings - Implement proper backup strategy - Configure monitoring and alerting - Use HTTPS and security best practices ### How do I contribute to Open Notebook? **Ways to contribute**: - Report bugs and issues - Suggest new features - Contribute code improvements - Improve documentation - Help other users in the community **Getting started**: - Join Discord community - Check GitHub issues - Read contribution guidelines - Start with small improvements ### What's the development roadmap? **Current focus**: - Stability and performance improvements - Additional AI provider support - Enhanced podcast generation - Better mobile experience **Future plans**: - Multi-user support - Advanced analytics - Integration with external tools - Cloud deployment options ## Troubleshooting ### Why do I get timeout errors even though transformations complete successfully? **Cause**: The default client timeout (5 minutes) may be too short for slow AI providers or hardware. **Quick fix**: ```bash # Add to your .env file API_CLIENT_TIMEOUT=600 # 10 minutes for slow hardware ``` **When this happens**: - Using local Ollama models on CPU - Using remote LM Studio over slow network - First transformation after starting (model loading) - Very large documents - Slower hardware configurations **Detailed solutions**: See [Common Issues - API Timeout Errors](./common-issues.md#api-timeout-errors-during-transformations) **Note**: If transformations complete after you refresh the page, you only need to increase `API_CLIENT_TIMEOUT`, not `ESPERANTO_LLM_TIMEOUT`. ### My question isn't answered here. What should I do? 1. **Check the troubleshooting guide**: [Common Issues](./common-issues.md) 2. **Search existing issues**: GitHub repository issues 3. **Ask the community**: Discord server 4. **Create a GitHub issue**: For bugs or feature requests 5. **Check the documentation**: Other documentation sections ### How do I report a bug? **Include**: - Steps to reproduce - Expected vs actual behavior - Error messages and logs - System information - Configuration details (without API keys) **Submit to**: GitHub Issues with bug report template ### How do I request a new feature? **Process**: 1. Check if feature already exists or is planned 2. Discuss in Discord to gauge interest 3. Create detailed GitHub issue 4. Consider contributing implementation ### Where can I get help with installation? **Resources**: - [Installation Guide](../getting-started/installation.md) - [Docker Deployment Guide](../deployment/docker.md) - [ChatGPT Installation Assistant](https://chatgpt.com/g/g-68776e2765b48191bd1bae3f30212631-open-notebook-installation-assistant) - Discord community support ### How do I stay updated with new releases? **Methods**: - Watch GitHub repository - Join Discord for announcements - Follow release notes - Enable automatic Docker updates --- *This FAQ is updated regularly based on community questions and feedback. If you have a question that's not covered here, please ask in our Discord community or create a GitHub issue.*