From 8947f93471471a7eebc0bd94966f55f02dfb22cc Mon Sep 17 00:00:00 2001
From: hex2077
Date: Tue, 3 Mar 2026 23:09:04 +0800
Subject: [PATCH] refactor: remove Ollama protocol support and refactor model routing
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Remove the Ollama protocol code, including the handler, converter, docs, and constants
- Refactor model list retrieval to support multi-provider aggregation in auto mode
- Add token calculation utility functions to unify token counting across providers
- Improve model-prefix route parsing to make auto mode more robust
- Update the multilingual docs to remove Ollama-related content
---
 README-JA.md | 43 +-
 README-ZH.md | 39 +-
 README.md | 41 +-
 docs/PROVIDER_ADAPTER_GUIDE.md | 5 -
 src/converters/register-converters.js | 2 -
 src/converters/strategies/OllamaConverter.js | 690 ----------------
 src/converters/utils.js | 81 --
 src/handlers/ollama-handler.js | 796 -------------------
 src/handlers/request-handler.js | 79 +-
 src/providers/claude/claude-kiro.js | 164 +---
 src/providers/provider-pool-manager.js | 101 ++-
 src/services/service-manager.js | 68 +-
 src/utils/common.js | 87 +-
 src/utils/token-utils.js | 170 ++++
 static/app/i18n.js | 12 +-
 static/app/utils.js | 4 +-
 static/components/section-guide.html | 25 -
 17 files changed, 437 insertions(+), 1970 deletions(-)
 delete mode 100644 src/converters/strategies/OllamaConverter.js
 delete mode 100644 src/handlers/ollama-handler.js
 create mode 100644 src/utils/token-utils.js

diff --git a/README-JA.md b/README-JA.md
index 66bdea7..7670d7c 100644
--- a/README-JA.md
+++ b/README-JA.md
@@ -43,7 +43,6 @@
 > - **2025.12.25** - 設定ファイル統一管理:すべての設定を `configs/` ディレクトリに集約。Dockerユーザーはマウントパスを `-v "ローカルパス:/app/configs"` に更新が必要
 > - **2025.12.11** - Dockerイメージが自動的にビルドされ、Docker Hubで公開されました: [justlikemaki/aiclient-2-api](https://hub.docker.com/r/justlikemaki/aiclient-2-api)
 > - **2025.11.30** - Antigravityプロトコルサポートの追加、Google内部インターフェース経由でGemini 3 Pro、Claude Sonnet 4.5などのモデルへのアクセスをサポート
-> - **2025.11.16** - Ollamaプロトコルサポートの追加、統一インターフェースでサポートされるすべてのモデルにアクセス
 > - 
**2025.11.11** - Web UI管理コントロールコンソールの追加、リアルタイム設定管理と健康状態モニタリングをサポート > - **2025.11.06** - Gemini 3 プレビュー版のサポートを追加、モデル互換性とパフォーマンス最適化を向上 > - **2025.10.18** - Kiroオープン登録、新規アカウントに500クレジット付与、Claude Sonnet 4.5を完全サポート @@ -93,7 +92,6 @@ - [📋 コア機能](#-コア機能) - [🔐 認証設定ガイド](#-認証設定ガイド) - [📁 認証ファイル保存パス](#-認証ファイル保存パス) -- [🦙 Ollamaプロトコル使用例](#-ollamaプロトコル使用例) - [⚙️ 高度な設定](#高度な設定) - [❓ よくある質問](#-よくある質問) - [📄 オープンソースライセンス](#-オープンソースライセンス) @@ -347,41 +345,6 @@ curl http://localhost:3000/claude-kiro-oauth/v1/chat/completions \ --- -### 🦙 Ollamaプロトコル使用例 - -本プロジェクトはOllamaプロトコルをサポートしており、統一インターフェースを通じてすべてのサポートモデルにアクセスできます。Ollamaエンドポイントは`/api/tags`、`/api/chat`、`/api/generate`などの標準インターフェースを提供します。 - -**Ollama API呼び出し例**: - -1. **利用可能なすべてのモデルをリスト表示**: -```bash -curl http://localhost:3000/ollama/api/tags \ - -H "Authorization: Bearer your-api-key" -``` - -2. **チャットインターフェース**: -```bash -curl http://localhost:3000/ollama/api/chat \ - -H "Content-Type: application/json" \ - -H "Authorization: Bearer your-api-key" \ - -d '{ - "model": "[Claude] claude-sonnet-4.5", - "messages": [ - {"role": "user", "content": "こんにちは"} - ] - }' -``` - -3. **モデルプレフィックスを使用してプロバイダーを指定**: -- `[Kiro]` - Kiro APIを使用してClaudeモデルにアクセス -- `[Claude]` - 公式Claude APIを使用 -- `[Gemini CLI]` - Gemini CLI OAuth経由でアクセス -- `[OpenAI]` - 公式OpenAI APIを使用 -- `[Grok]` - Grok Cookie/SSO経由でアクセス -- `[Qwen CLI]` - Qwen OAuth経由でアクセス - ---- - ### 高度な設定
@@ -672,11 +635,9 @@ kill -9 ### 10. APIが404を返す -**問題の説明**:APIエンドポイントを呼び出すと404 Not Foundエラーが返されます。 - **解決策**: -- **エンドポイントパスを確認**:`/v1/chat/completions`、`/ollama/api/chat` などの正しいエンドポイントパスを使用していることを確認 -- **クライアントの自動補完を確認**:一部のクライアント(Cherry-Studio、NextChatなど)はBase URLの後にパス(`/v1/chat/completions` など)を自動的に追加し、パスの重複を引き起こします。コンソールで実際のリクエストURLを確認し、冗長なパス部分を削除してください +- **エンドポイントパスを確認**:`/v1/chat/completions` などの正しいエンドポイントパスを使用していることを確認 +- **クライアントの自動補完を確認**:一部のクライアント(Cherry-Studio、NextChatなど)はBase URLの後にパス(`/v1/chat/completions` など)を自動的に追加し、パスの重複を引き起こします。コンソールで実際のリクエストURLを確認し、冗长なパス部分を削除してください - **サービス状態を確認**:サービスが正常に起動していることを確認、`http://localhost:3000` にアクセスしてWeb UIを確認 - **ポート設定を確認**:リクエストが正しいポート(デフォルト3000)に送信されていることを確認 - **利用可能なルートを確認**:Web UIダッシュボードページの「インタラクティブルーティング例」ですべての利用可能なエンドポイントを確認 diff --git a/README-ZH.md b/README-ZH.md index 3076146..68ca6ff 100644 --- a/README-ZH.md +++ b/README-ZH.md @@ -43,7 +43,6 @@ > - **2025.12.25** - 配置文件统一管理:所有配置集中到 `configs/` 目录,Docker 用户需更新挂载路径为 `-v "本地路径:/app/configs"` > - **2025.12.11** - Docker 镜像自动构建并发布到 Docker Hub: [justlikemaki/aiclient-2-api](https://hub.docker.com/r/justlikemaki/aiclient-2-api) > - **2025.11.30** - 新增 Antigravity 协议支持,支持通过 Google 内部接口访问 Gemini 3 Pro、Claude Sonnet 4.5 等模型 -> - **2025.11.16** - 新增 Ollama 协议支持,统一接口访问所有支持的模型(Claude、Gemini、Qwen、OpenAI等) > - **2025.11.11** - 新增 Web UI 管理控制台,支持实时配置管理和健康状态监控 > - **2025.11.06** - 新增对 Gemini 3 预览版的支持,增强模型兼容性和性能优化 > - **2025.10.18** - Kiro 开放注册,新用户赠送 500 额度,已完整支持 Claude Sonnet 4.5 @@ -92,7 +91,6 @@ - [📋 核心功能](#-核心功能) - [🔐 授权配置指南](#-授权配置指南) - [📁 授权文件存储路径](#-授权文件存储路径) -- [🦙 Ollama 协议使用示例](#-ollama-协议使用示例) - [⚙️ 高级配置](#高级配置) - [❓ 常见问题](#-常见问题) - [📄 开源许可](#-开源许可) @@ -346,41 +344,6 @@ curl http://localhost:3000/claude-kiro-oauth/v1/chat/completions \ --- -### 🦙 Ollama 协议使用示例 - -本项目支持 Ollama 协议,可以通过统一接口访问所有支持的模型。Ollama 端点提供 `/api/tags`、`/api/chat`、`/api/generate` 等标准接口。 - -**Ollama API 调用示例**: - -1. 
**列出所有可用模型**: -```bash -curl http://localhost:3000/ollama/api/tags \ - -H "Authorization: Bearer your-api-key" -``` - -2. **聊天接口**: -```bash -curl http://localhost:3000/ollama/api/chat \ - -H "Content-Type: application/json" \ - -H "Authorization: Bearer your-api-key" \ - -d '{ - "model": "[Claude] claude-sonnet-4.5", - "messages": [ - {"role": "user", "content": "你好"} - ] - }' -``` - -3. **使用模型前缀指定提供商**: -- `[Kiro]` - 使用 Kiro API 访问 Claude 模型 -- `[Claude]` - 使用 Claude 官方 API -- `[Gemini CLI]` - 通过 Gemini CLI OAuth 访问 -- `[OpenAI]` - 使用 OpenAI 官方 API -- `[Grok]` - 通过 Grok Cookie/SSO 访问 -- `[Qwen CLI]` - 通过 Qwen OAuth 访问 - ---- - ### 高级配置
@@ -674,7 +637,7 @@ kill -9 **问题描述**:调用 API 接口时返回 404 Not Found 错误。 **解决方案**: -- **检查接口路径**:确保使用正确的接口路径,如 `/v1/chat/completions`、`/ollama/api/chat` 等 +- **检查接口路径**:确保使用正确的接口路径,如 `/v1/chat/completions` 等 - **检查客户端自动补全**:某些客户端(如 Cherry-Studio、NextChat)会自动在 Base URL 后追加路径(如 `/v1/chat/completions`),导致路径重复。请查看控制台中的实际请求 URL,移除多余的路径部分 - **检查服务状态**:确认服务已正常启动,访问 `http://localhost:3000` 查看 Web UI - **检查端口配置**:确保请求发送到正确的端口(默认 3000) diff --git a/README.md b/README.md index 28d7550..bc90009 100644 --- a/README.md +++ b/README.md @@ -43,7 +43,6 @@ > - **2025.12.25** - Unified configuration management: All configs centralized to `configs/` directory. Docker users need to update mount path to `-v "local_path:/app/configs"` > - **2025.12.11** - Automatically built Docker images are now available on Docker Hub: [justlikemaki/aiclient-2-api](https://hub.docker.com/r/justlikemaki/aiclient-2-api) > - **2025.11.30** - Added Antigravity protocol support, enabling access to Gemini 3 Pro, Claude Sonnet 4.5, and other models via Google internal interfaces -> - **2025.11.16** - Added Ollama protocol support, unified interface to access all supported models (Claude, Gemini, Qwen, OpenAI, etc.) 
> - **2025.11.11** - Added Web UI management console, supporting real-time configuration management and health status monitoring > - **2025.11.06** - Added support for Gemini 3 Preview, enhanced model compatibility and performance optimization > - **2025.10.18** - Kiro open registration, new accounts get 500 credits, full support for Claude Sonnet 4.5 @@ -93,7 +92,6 @@ - [📋 Core Features](#-core-features) - [🔐 Authorization Configuration Guide](#-authorization-configuration-guide) - [📁 Authorization File Storage Paths](#-authorization-file-storage-paths) -- [🦙 Ollama Protocol Usage Examples](#-ollama-protocol-usage-examples) - [⚙️ Advanced Configuration](#advanced-configuration) - [❓ FAQ](#-faq) - [📄 Open Source License](#-open-source-license) @@ -347,41 +345,6 @@ Default storage locations for authorization credential files of each service: --- -### 🦙 Ollama Protocol Usage Examples - -This project supports the Ollama protocol, allowing access to all supported models through a unified interface. The Ollama endpoint provides standard interfaces such as `/api/tags`, `/api/chat`, `/api/generate`, etc. - -**Ollama API Call Examples**: - -1. **List all available models**: -```bash -curl http://localhost:3000/ollama/api/tags \ - -H "Authorization: Bearer your-api-key" -``` - -2. **Chat interface**: -```bash -curl http://localhost:3000/ollama/api/chat \ - -H "Content-Type: application/json" \ - -H "Authorization: Bearer your-api-key" \ - -d '{ - "model": "[Claude] claude-sonnet-4.5", - "messages": [ - {"role": "user", "content": "Hello"} - ] - }' -``` - -3. **Specify provider using model prefix**: -- `[Kiro]` - Access Claude models using Kiro API -- `[Claude]` - Use official Claude API -- `[Gemini CLI]` - Access via Gemini CLI OAuth -- `[OpenAI]` - Use official OpenAI API -- `[Grok]` - Access via Grok Cookie/SSO -- `[Qwen CLI]` - Access via Qwen OAuth - ---- - ### Advanced Configuration
@@ -672,10 +635,8 @@ Or modify the port configuration in `configs/config.json` to use a different por ### 10. API Returns 404 -**Problem Description**: When calling API endpoints, it returns 404 Not Found error. - **Solutions**: -- **Check Endpoint Path**: Ensure you're using the correct endpoint path, such as `/v1/chat/completions`, `/ollama/api/chat`, etc. +- **Check Endpoint Path**: Ensure you're using the correct endpoint path, such as `/v1/chat/completions` etc. - **Check Client Auto-completion**: Some clients (like Cherry-Studio, NextChat) automatically append paths (like `/v1/chat/completions`) after the Base URL, causing path duplication. Check the actual request URL in the console and remove redundant path parts - **Check Service Status**: Confirm the service has started normally, visit `http://localhost:3000` to view Web UI - **Check Port Configuration**: Ensure requests are sent to the correct port (default 3000) diff --git a/docs/PROVIDER_ADAPTER_GUIDE.md b/docs/PROVIDER_ADAPTER_GUIDE.md index 54f44b9..aa97535 100644 --- a/docs/PROVIDER_ADAPTER_GUIDE.md +++ b/docs/PROVIDER_ADAPTER_GUIDE.md @@ -16,7 +16,6 @@ * `static/components/section-config.html`:配置按钮。 * `static/components/section-guide.html`:使用指南。 * `static/app/routing-examples.js`:路由调用示例。 - * `src/handlers/ollama-handler.js`:Ollama 协议前缀与支持映射。 6. **系统级映射(必做)**:在 OAuth 处理器、凭据关联工具、用量统计等模块中建立映射。 --- @@ -134,10 +133,6 @@ * **路由分发**:在 [`src/ui-modules/oauth-api.js`](src/ui-modules/oauth-api.js) 的 `handleGenerateAuthUrl` 中分发到相应的处理器。 * **回调处理**:若涉及 HTTP 回调,需在 `src/auth/` 下实现回调服务器逻辑。 -### 4.5 Ollama 协议映射 ([`src/handlers/ollama-handler.js`](src/handlers/ollama-handler.js)) -* 在 `MODEL_PREFIX_MAP` 中添加该提供商对应的日志/显示前缀。 -* 在 `supportedProviders` 数组中添加该提供商标识,以支持 Ollama 协议转换。 - --- ## 5. 
注意事项 diff --git a/src/converters/register-converters.js b/src/converters/register-converters.js index bf1d6db..5428641 100644 --- a/src/converters/register-converters.js +++ b/src/converters/register-converters.js @@ -9,7 +9,6 @@ import { OpenAIConverter } from './strategies/OpenAIConverter.js'; import { OpenAIResponsesConverter } from './strategies/OpenAIResponsesConverter.js'; import { ClaudeConverter } from './strategies/ClaudeConverter.js'; import { GeminiConverter } from './strategies/GeminiConverter.js'; -import { OllamaConverter } from './strategies/OllamaConverter.js'; import { CodexConverter } from './strategies/CodexConverter.js'; import { GrokConverter } from './strategies/GrokConverter.js'; @@ -22,7 +21,6 @@ export function registerAllConverters() { ConverterFactory.registerConverter(MODEL_PROTOCOL_PREFIX.OPENAI_RESPONSES, OpenAIResponsesConverter); ConverterFactory.registerConverter(MODEL_PROTOCOL_PREFIX.CLAUDE, ClaudeConverter); ConverterFactory.registerConverter(MODEL_PROTOCOL_PREFIX.GEMINI, GeminiConverter); - ConverterFactory.registerConverter(MODEL_PROTOCOL_PREFIX.OLLAMA, OllamaConverter); ConverterFactory.registerConverter(MODEL_PROTOCOL_PREFIX.CODEX, CodexConverter); ConverterFactory.registerConverter(MODEL_PROTOCOL_PREFIX.GROK, GrokConverter); } diff --git a/src/converters/strategies/OllamaConverter.js b/src/converters/strategies/OllamaConverter.js deleted file mode 100644 index 870e1e7..0000000 --- a/src/converters/strategies/OllamaConverter.js +++ /dev/null @@ -1,690 +0,0 @@ -/** - * Ollama转换器 - * 处理Ollama协议与其他协议之间的转换 - */ - -import { v4 as uuidv4 } from 'uuid'; -import { createHash } from 'crypto'; -import { BaseConverter } from '../BaseConverter.js'; -import { MODEL_PROTOCOL_PREFIX } from '../../utils/common.js'; -import { - OLLAMA_DEFAULT_CONTEXT_LENGTH, - OLLAMA_DEFAULT_MAX_OUTPUT_TOKENS, - OLLAMA_CLAUDE_DEFAULT_CONTEXT_LENGTH, - OLLAMA_CLAUDE_SONNET_45_CONTEXT_LENGTH, - OLLAMA_CLAUDE_SONNET_45_MAX_OUTPUT_TOKENS, - 
OLLAMA_CLAUDE_HAIKU_45_CONTEXT_LENGTH, - OLLAMA_CLAUDE_HAIKU_45_MAX_OUTPUT_TOKENS, - OLLAMA_CLAUDE_OPUS_41_CONTEXT_LENGTH, - OLLAMA_CLAUDE_OPUS_41_MAX_OUTPUT_TOKENS, - OLLAMA_CLAUDE_SONNET_40_CONTEXT_LENGTH, - OLLAMA_CLAUDE_SONNET_40_MAX_OUTPUT_TOKENS, - OLLAMA_CLAUDE_SONNET_37_CONTEXT_LENGTH, - OLLAMA_CLAUDE_SONNET_37_MAX_OUTPUT_TOKENS, - OLLAMA_CLAUDE_OPUS_40_CONTEXT_LENGTH, - OLLAMA_CLAUDE_OPUS_40_MAX_OUTPUT_TOKENS, - OLLAMA_CLAUDE_HAIKU_35_CONTEXT_LENGTH, - OLLAMA_CLAUDE_HAIKU_35_MAX_OUTPUT_TOKENS, - OLLAMA_CLAUDE_HAIKU_30_CONTEXT_LENGTH, - OLLAMA_CLAUDE_HAIKU_30_MAX_OUTPUT_TOKENS, - OLLAMA_CLAUDE_SONNET_35_CONTEXT_LENGTH, - OLLAMA_CLAUDE_SONNET_35_MAX_OUTPUT_TOKENS, - OLLAMA_CLAUDE_OPUS_30_CONTEXT_LENGTH, - OLLAMA_CLAUDE_OPUS_30_MAX_OUTPUT_TOKENS, - OLLAMA_GEMINI_25_PRO_CONTEXT_LENGTH, - OLLAMA_GEMINI_25_PRO_MAX_OUTPUT_TOKENS, - OLLAMA_GEMINI_25_FLASH_CONTEXT_LENGTH, - OLLAMA_GEMINI_25_FLASH_MAX_OUTPUT_TOKENS, - OLLAMA_GEMINI_25_IMAGE_CONTEXT_LENGTH, - OLLAMA_GEMINI_25_IMAGE_MAX_OUTPUT_TOKENS, - OLLAMA_GEMINI_25_LIVE_CONTEXT_LENGTH, - OLLAMA_GEMINI_25_LIVE_MAX_OUTPUT_TOKENS, - OLLAMA_GEMINI_25_TTS_CONTEXT_LENGTH, - OLLAMA_GEMINI_25_TTS_MAX_OUTPUT_TOKENS, - OLLAMA_GEMINI_20_FLASH_CONTEXT_LENGTH, - OLLAMA_GEMINI_20_FLASH_MAX_OUTPUT_TOKENS, - OLLAMA_GEMINI_20_IMAGE_CONTEXT_LENGTH, - OLLAMA_GEMINI_20_IMAGE_MAX_OUTPUT_TOKENS, - OLLAMA_GEMINI_15_PRO_CONTEXT_LENGTH, - OLLAMA_GEMINI_15_PRO_MAX_OUTPUT_TOKENS, - OLLAMA_GEMINI_15_FLASH_CONTEXT_LENGTH, - OLLAMA_GEMINI_15_FLASH_MAX_OUTPUT_TOKENS, - OLLAMA_GEMINI_DEFAULT_CONTEXT_LENGTH, - OLLAMA_GEMINI_DEFAULT_MAX_OUTPUT_TOKENS, - OLLAMA_GPT4_TURBO_CONTEXT_LENGTH, - OLLAMA_GPT4_TURBO_MAX_OUTPUT_TOKENS, - OLLAMA_GPT4_32K_CONTEXT_LENGTH, - OLLAMA_GPT4_32K_MAX_OUTPUT_TOKENS, - OLLAMA_GPT4_BASE_CONTEXT_LENGTH, - OLLAMA_GPT4_BASE_MAX_OUTPUT_TOKENS, - OLLAMA_GPT35_16K_CONTEXT_LENGTH, - OLLAMA_GPT35_16K_MAX_OUTPUT_TOKENS, - OLLAMA_GPT35_BASE_CONTEXT_LENGTH, - OLLAMA_GPT35_BASE_MAX_OUTPUT_TOKENS, - 
OLLAMA_QWEN_CODER_PLUS_CONTEXT_LENGTH, - OLLAMA_QWEN_CODER_PLUS_MAX_OUTPUT_TOKENS, - OLLAMA_QWEN_VL_PLUS_CONTEXT_LENGTH, - OLLAMA_QWEN_VL_PLUS_MAX_OUTPUT_TOKENS, - OLLAMA_QWEN_CODER_FLASH_CONTEXT_LENGTH, - OLLAMA_QWEN_CODER_FLASH_MAX_OUTPUT_TOKENS, - OLLAMA_QWEN_DEFAULT_CONTEXT_LENGTH, - OLLAMA_QWEN_DEFAULT_MAX_OUTPUT_TOKENS, - OLLAMA_DEFAULT_FILE_TYPE, - OLLAMA_DEFAULT_QUANTIZATION_VERSION, - OLLAMA_DEFAULT_ROPE_FREQ_BASE, - OLLAMA_DEFAULT_TEMPERATURE, - OLLAMA_DEFAULT_TOP_P, - OLLAMA_DEFAULT_QUANTIZATION_LEVEL, - OLLAMA_SHOW_QUANTIZATION_LEVEL -} from '../utils.js'; - - - -/** - * Ollama转换器类 - * 实现Ollama协议到其他协议的转换 - */ -export class OllamaConverter extends BaseConverter { - constructor() { - super('ollama'); - } - - /** - * 转换请求 - Ollama -> 其他协议 - */ - convertRequest(data, targetProtocol) { - switch (targetProtocol) { - case MODEL_PROTOCOL_PREFIX.OPENAI: - case MODEL_PROTOCOL_PREFIX.CLAUDE: - case MODEL_PROTOCOL_PREFIX.GEMINI: - return this.toOpenAIRequest(data); - default: - throw new Error(`Unsupported target protocol: ${targetProtocol}`); - } - } - - /** - * 转换响应 - 其他协议 -> Ollama - */ - convertResponse(data, sourceProtocol, model) { - return this.toOllamaChatResponse(data, model); - } - - /** - * 转换流式响应块 - 其他协议 -> Ollama - */ - convertStreamChunk(chunk, sourceProtocol, model, isDone = false) { - return this.toOllamaStreamChunk(chunk, model, isDone); - } - - /** - * 转换模型列表 - 其他协议 -> Ollama - */ - convertModelList(data, sourceProtocol) { - return this.toOllamaTags(data, sourceProtocol); - } - - // ========================================================================= - // Ollama -> OpenAI 转换 - // ========================================================================= - - /** - * Ollama请求 -> OpenAI请求 - */ - toOpenAIRequest(ollamaRequest) { - const openaiRequest = { - model: ollamaRequest.model || 'default', - messages: [], - stream: ollamaRequest.stream !== undefined ? 
ollamaRequest.stream : false - }; - - // Map Ollama messages to OpenAI format - if (ollamaRequest.messages && Array.isArray(ollamaRequest.messages)) { - openaiRequest.messages = ollamaRequest.messages.map(msg => ({ - role: msg.role || 'user', - content: msg.content || '' - })); - } - - // Map Ollama options to OpenAI parameters - if (ollamaRequest.options) { - const opts = ollamaRequest.options; - if (opts.temperature !== undefined) openaiRequest.temperature = opts.temperature; - if (opts.top_p !== undefined) openaiRequest.top_p = opts.top_p; - if (opts.top_k !== undefined) openaiRequest.top_k = opts.top_k; - if (opts.num_predict !== undefined) openaiRequest.max_tokens = opts.num_predict; - if (opts.stop !== undefined) openaiRequest.stop = opts.stop; - } - - // Handle system prompt - if (ollamaRequest.system) { - openaiRequest.messages.unshift({ - role: 'system', - content: ollamaRequest.system - }); - } - - // Handle template/prompt for generate endpoint - if (ollamaRequest.prompt) { - openaiRequest.messages = [{ - role: 'user', - content: ollamaRequest.prompt - }]; - - // Add system prompt if provided - if (ollamaRequest.system) { - openaiRequest.messages.unshift({ - role: 'system', - content: ollamaRequest.system - }); - } - } - - return openaiRequest; - } - - // ========================================================================= - // OpenAI/Claude/Gemini -> Ollama 转换 - // ========================================================================= - - /** - * OpenAI/Claude/Gemini响应 -> Ollama chat响应 - */ - toOllamaChatResponse(response, model) { - const ollamaResponse = { - model: model || response.model || 'unknown', - created_at: new Date().toISOString(), - done: true - }; - - // Handle OpenAI format (choices array) - if (response.choices && response.choices.length > 0) { - const choice = response.choices[0]; - ollamaResponse.message = { - role: choice.message?.role || 'assistant', - content: choice.message?.content || '' - }; - - // Map finish reason - if 
(choice.finish_reason) { - ollamaResponse.done_reason = choice.finish_reason === 'stop' ? 'stop' : choice.finish_reason; - } - } - // Handle Claude format (content array) - else if (response.content && Array.isArray(response.content)) { - let textContent = ''; - response.content.forEach(block => { - if (block.type === 'text' && block.text) { - textContent += block.text; - } - }); - - ollamaResponse.message = { - role: response.role || 'assistant', - content: textContent - }; - - if (response.stop_reason) { - ollamaResponse.done_reason = response.stop_reason === 'end_turn' ? 'stop' : response.stop_reason; - } - } - // Handle Gemini format (candidates array) - else if (response.candidates && response.candidates.length > 0) { - const candidate = response.candidates[0]; - let textContent = ''; - if (candidate.content && candidate.content.parts) { - textContent = candidate.content.parts - .filter(part => part.text) - .map(part => part.text) - .join(''); - } - - ollamaResponse.message = { - role: candidate.content?.role || 'assistant', - content: textContent - }; - - if (candidate.finishReason) { - ollamaResponse.done_reason = candidate.finishReason.toLowerCase(); - } - } - - // Add usage statistics if available - const usage = response.usage || response.usageMetadata; - if (usage) { - ollamaResponse.prompt_eval_count = usage.prompt_tokens || usage.input_tokens || usage.promptTokenCount || 0; - ollamaResponse.eval_count = usage.completion_tokens || usage.output_tokens || usage.candidatesTokenCount || 0; - ollamaResponse.total_duration = 0; - ollamaResponse.load_duration = 0; - ollamaResponse.prompt_eval_duration = 0; - ollamaResponse.eval_duration = 0; - } - - return ollamaResponse; - } - - /** - * OpenAI/Claude/Gemini generate响应 -> Ollama generate响应 - */ - toOllamaGenerateResponse(response, model) { - const ollamaResponse = { - model: model || response.model || 'unknown', - created_at: new Date().toISOString(), - done: true - }; - - // Handle OpenAI format - if 
(response.choices && response.choices.length > 0) { - const choice = response.choices[0]; - ollamaResponse.response = choice.message?.content || choice.text || ''; - - if (choice.finish_reason) { - ollamaResponse.done_reason = choice.finish_reason === 'stop' ? 'stop' : choice.finish_reason; - } - } - // Handle Claude format - else if (response.content && Array.isArray(response.content)) { - let textContent = ''; - response.content.forEach(block => { - if (block.type === 'text' && block.text) { - textContent += block.text; - } - }); - ollamaResponse.response = textContent; - - if (response.stop_reason) { - ollamaResponse.done_reason = response.stop_reason === 'end_turn' ? 'stop' : response.stop_reason; - } - } - // Handle Gemini format - else if (response.candidates && response.candidates.length > 0) { - const candidate = response.candidates[0]; - let textContent = ''; - if (candidate.content && candidate.content.parts) { - textContent = candidate.content.parts - .filter(part => part.text) - .map(part => part.text) - .join(''); - } - ollamaResponse.response = textContent; - - if (candidate.finishReason) { - ollamaResponse.done_reason = candidate.finishReason.toLowerCase(); - } - } - - // Add usage statistics - const genUsage = response.usage || response.usageMetadata; - if (genUsage) { - ollamaResponse.prompt_eval_count = genUsage.prompt_tokens || genUsage.input_tokens || genUsage.promptTokenCount || 0; - ollamaResponse.eval_count = genUsage.completion_tokens || genUsage.output_tokens || genUsage.candidatesTokenCount || 0; - ollamaResponse.total_duration = 0; - ollamaResponse.load_duration = 0; - ollamaResponse.prompt_eval_duration = 0; - ollamaResponse.eval_duration = 0; - } - - return ollamaResponse; - } - - /** - * OpenAI/Claude/Gemini流式块 -> Ollama流式块 - */ - toOllamaStreamChunk(chunk, model, isDone = false) { - const ollamaChunk = { - model: model || 'unknown', - created_at: new Date().toISOString(), - done: isDone - }; - - // Handle Claude SSE format - if 
(chunk.type) { - if (chunk.type === 'content_block_delta' && chunk.delta) { - ollamaChunk.message = { - role: 'assistant', - content: chunk.delta.text || '' - }; - } else if (chunk.type === 'message_delta' && chunk.usage) { - ollamaChunk.message = { - role: 'assistant', - content: '' - }; - ollamaChunk.prompt_eval_count = 0; - ollamaChunk.eval_count = chunk.usage.output_tokens || 0; - } else { - ollamaChunk.message = { - role: 'assistant', - content: '' - }; - } - } - // Handle Gemini format - else if (!isDone && chunk.candidates && chunk.candidates.length > 0) { - const candidate = chunk.candidates[0]; - let content = ''; - if (candidate.content && candidate.content.parts) { - content = candidate.content.parts - .filter(part => part.text) - .map(part => part.text) - .join(''); - } - ollamaChunk.message = { - role: 'assistant', - content: content - }; - } - // Handle OpenAI format - else if (!isDone && chunk.choices && chunk.choices.length > 0) { - const delta = chunk.choices[0].delta; - ollamaChunk.message = { - role: delta.role || 'assistant', - content: delta.content || '' - }; - } - // Handle final chunk - else if (isDone) { - ollamaChunk.message = { - role: 'assistant', - content: '' - }; - ollamaChunk.done_reason = 'stop'; - } - - return ollamaChunk; - } - - /** - * OpenAI/Claude/Gemini流式块 -> Ollama generate流式块 - */ - toOllamaGenerateStreamChunk(chunk, model, isDone = false) { - const ollamaChunk = { - model: model || 'unknown', - created_at: new Date().toISOString(), - done: isDone - }; - - // Handle Claude SSE format - if (chunk.type) { - if (chunk.type === 'content_block_delta' && chunk.delta) { - ollamaChunk.response = chunk.delta.text || ''; - } else if (chunk.type === 'message_delta' && chunk.usage) { - ollamaChunk.response = ''; - ollamaChunk.prompt_eval_count = 0; - ollamaChunk.eval_count = chunk.usage.output_tokens || 0; - } else { - ollamaChunk.response = ''; - } - } - // Handle OpenAI format - else if (!isDone && chunk.choices && 
chunk.choices.length > 0) { - const delta = chunk.choices[0].delta; - ollamaChunk.response = delta.content || ''; - } - // Handle final chunk - else if (isDone) { - ollamaChunk.response = ''; - ollamaChunk.done_reason = 'stop'; - } - - return ollamaChunk; - } - - /** - * OpenAI/Claude/Gemini模型列表 -> Ollama tags - */ - toOllamaTags(modelList, sourceProtocol = null) { - const models = []; - - // Handle both OpenAI format (data array) and Gemini format (models array) - const sourceModels = modelList.data || modelList.models || []; - - if (Array.isArray(sourceModels)) { - sourceModels.forEach(model => { - // Get model name - let modelName = model.id || model.name || model.displayName || 'unknown'; - - // Remove "models/" prefix if present (for Gemini) - if (modelName.startsWith('models/')) { - modelName = modelName.substring(7); // Remove "models/" - } - - // Skip models with invalid names - if (modelName === 'unknown' || !modelName) { - return; - } - - // IMPORTANT: Copilot expects family: "Ollama" with capital O! - const modelOwner = 'Ollama'; - - models.push({ - name: modelName, - model: modelName, - modified_at: new Date().toISOString(), - size: 0, // As in the old patch - digest: '', // Empty string, as in the old patch - details: { - parent_model: '', - format: 'gguf', - family: modelOwner, // "Ollama" with capital O - families: [modelOwner], - parameter_size: '0B', // As in the old patch - quantization_level: OLLAMA_DEFAULT_QUANTIZATION_LEVEL - } - }); - }); - } - - return { models }; - } - - /** - * Generate Ollama show response - */ - toOllamaShowResponse(modelName) { - // Minimal implementation, as in the old patch - let contextLength = OLLAMA_DEFAULT_CONTEXT_LENGTH; - let maxOutputTokens = OLLAMA_DEFAULT_MAX_OUTPUT_TOKENS; - let family = 'Ollama'; // ВАЖНО: С большой буквы, как ожидает Copilot! 
- let architecture = 'transformer'; - - const lowerName = modelName.toLowerCase(); - - // Determine contextLength by model name - // Claude models - if (lowerName.includes('claude')) { - architecture = 'claude'; - contextLength = OLLAMA_CLAUDE_DEFAULT_CONTEXT_LENGTH; // Default 200K - - // Claude Sonnet 4.5 - if (lowerName.includes('sonnet-4-5') || lowerName.includes('sonnet-4.5')) { - contextLength = OLLAMA_CLAUDE_SONNET_45_CONTEXT_LENGTH; // 200K (1M beta available) - maxOutputTokens = OLLAMA_CLAUDE_SONNET_45_MAX_OUTPUT_TOKENS; // 64K output - } - // Claude Haiku 4.5 - else if (lowerName.includes('haiku-4-5') || lowerName.includes('haiku-4.5')) { - contextLength = OLLAMA_CLAUDE_HAIKU_45_CONTEXT_LENGTH; // 200K - maxOutputTokens = OLLAMA_CLAUDE_HAIKU_45_MAX_OUTPUT_TOKENS; // 64K output - } - // Claude Opus 4.1 - else if (lowerName.includes('opus-4-1') || lowerName.includes('opus-4.1')) { - contextLength = OLLAMA_CLAUDE_OPUS_41_CONTEXT_LENGTH; // 200K - maxOutputTokens = OLLAMA_CLAUDE_OPUS_41_MAX_OUTPUT_TOKENS; // 32K output - } - // Claude Sonnet 4.0 (legacy) - else if (lowerName.includes('sonnet-4-0') || lowerName.includes('sonnet-4.0') || lowerName.includes('sonnet-4-20')) { - contextLength = OLLAMA_CLAUDE_SONNET_40_CONTEXT_LENGTH; // 200K (1M beta available) - maxOutputTokens = OLLAMA_CLAUDE_SONNET_40_MAX_OUTPUT_TOKENS; // 64K output - } - // Claude Sonnet 3.7 (legacy) - else if (lowerName.includes('3-7') || lowerName.includes('3.7')) { - contextLength = OLLAMA_CLAUDE_SONNET_37_CONTEXT_LENGTH; // 200K - maxOutputTokens = OLLAMA_CLAUDE_SONNET_37_MAX_OUTPUT_TOKENS; // 64K output (128K beta available) - } - // Claude Opus 4.0 (legacy) - else if (lowerName.includes('opus-4-0') || lowerName.includes('opus-4.0') || lowerName.includes('opus-4-20')) { - contextLength = OLLAMA_CLAUDE_OPUS_40_CONTEXT_LENGTH; // 200K - maxOutputTokens = OLLAMA_CLAUDE_OPUS_40_MAX_OUTPUT_TOKENS; // 32K output - } - // Claude Haiku 3.5 (legacy) - else if (lowerName.includes('haiku-3-5') || 
lowerName.includes('haiku-3.5')) { - contextLength = OLLAMA_CLAUDE_HAIKU_35_CONTEXT_LENGTH; // 200K - maxOutputTokens = OLLAMA_CLAUDE_HAIKU_35_MAX_OUTPUT_TOKENS; // 8K output - } - // Claude Haiku 3.0 (legacy) - else if (lowerName.includes('haiku-3-0') || lowerName.includes('haiku-3.0') || lowerName.includes('haiku-20240307')) { - contextLength = OLLAMA_CLAUDE_HAIKU_30_CONTEXT_LENGTH; // 200K - maxOutputTokens = OLLAMA_CLAUDE_HAIKU_30_MAX_OUTPUT_TOKENS; // 4K output - } - // Claude Sonnet 3.5 (legacy) - else if (lowerName.includes('sonnet-3-5') || lowerName.includes('sonnet-3.5')) { - contextLength = OLLAMA_CLAUDE_SONNET_35_CONTEXT_LENGTH; // 200K - maxOutputTokens = OLLAMA_CLAUDE_SONNET_35_MAX_OUTPUT_TOKENS; // 8K output - } - // Claude Opus 3.0 (legacy) - else if (lowerName.includes('opus-3-0') || lowerName.includes('opus-3.0') || lowerName.includes('opus') && lowerName.includes('20240229')) { - contextLength = OLLAMA_CLAUDE_OPUS_30_CONTEXT_LENGTH; // 200K - maxOutputTokens = OLLAMA_CLAUDE_OPUS_30_MAX_OUTPUT_TOKENS; // 4K output - } - // Default for Claude - else { - contextLength = OLLAMA_CLAUDE_DEFAULT_CONTEXT_LENGTH; // 200K - maxOutputTokens = OLLAMA_CLAUDE_HAIKU_35_MAX_OUTPUT_TOKENS; // 8K output - } - } - // Gemini models - else if (lowerName.includes('gemini')) { - architecture = 'gemini'; - - // Gemini 2.5 Pro - if (lowerName.includes('2.5') && lowerName.includes('pro')) { - contextLength = OLLAMA_GEMINI_25_PRO_CONTEXT_LENGTH; // 1M input tokens - maxOutputTokens = OLLAMA_GEMINI_25_PRO_MAX_OUTPUT_TOKENS; // 65K output tokens - } - // Gemini 2.5 Flash / Flash-Lite - else if (lowerName.includes('2.5') && (lowerName.includes('flash') || lowerName.includes('lite'))) { - contextLength = OLLAMA_GEMINI_25_FLASH_CONTEXT_LENGTH; // 1M input tokens - maxOutputTokens = OLLAMA_GEMINI_25_FLASH_MAX_OUTPUT_TOKENS; // 65K output tokens - } - // Gemini 2.5 Flash Image - else if (lowerName.includes('2.5') && lowerName.includes('image')) { - contextLength = 
OLLAMA_GEMINI_25_IMAGE_CONTEXT_LENGTH; // 65K input tokens - maxOutputTokens = OLLAMA_GEMINI_25_IMAGE_MAX_OUTPUT_TOKENS; // 32K output tokens - } - // Gemini 2.5 Flash Live / Native Audio - else if (lowerName.includes('2.5') && (lowerName.includes('live') || lowerName.includes('native-audio'))) { - contextLength = OLLAMA_GEMINI_25_LIVE_CONTEXT_LENGTH; // 131K input tokens - maxOutputTokens = OLLAMA_GEMINI_25_LIVE_MAX_OUTPUT_TOKENS; // 8K output tokens - } - // Gemini 2.5 TTS - else if (lowerName.includes('2.5') && lowerName.includes('tts')) { - contextLength = OLLAMA_GEMINI_25_TTS_CONTEXT_LENGTH; // 8K input tokens - maxOutputTokens = OLLAMA_GEMINI_25_TTS_MAX_OUTPUT_TOKENS; // 16K output tokens - } - // Gemini 2.0 Flash - else if (lowerName.includes('2.0') && lowerName.includes('flash')) { - contextLength = OLLAMA_GEMINI_20_FLASH_CONTEXT_LENGTH; // 1M input tokens - maxOutputTokens = OLLAMA_GEMINI_20_FLASH_MAX_OUTPUT_TOKENS; // 8K output tokens - } - // Gemini 2.0 Flash Image - else if (lowerName.includes('2.0') && lowerName.includes('image')) { - contextLength = OLLAMA_GEMINI_20_IMAGE_CONTEXT_LENGTH; // 32K input tokens - maxOutputTokens = OLLAMA_GEMINI_20_IMAGE_MAX_OUTPUT_TOKENS; // 8K output tokens - } - // Gemini 1.5 Pro (legacy) - else if (lowerName.includes('1.5') && lowerName.includes('pro')) { - contextLength = OLLAMA_GEMINI_15_PRO_CONTEXT_LENGTH; // 2M tokens - maxOutputTokens = OLLAMA_GEMINI_15_PRO_MAX_OUTPUT_TOKENS; - } - // Gemini 1.5 Flash (legacy) - else if (lowerName.includes('1.5') && lowerName.includes('flash')) { - contextLength = OLLAMA_GEMINI_15_FLASH_CONTEXT_LENGTH; // 1M tokens - maxOutputTokens = OLLAMA_GEMINI_15_FLASH_MAX_OUTPUT_TOKENS; - } - // Default for Gemini - else { - contextLength = OLLAMA_GEMINI_DEFAULT_CONTEXT_LENGTH; // 1M tokens - maxOutputTokens = OLLAMA_GEMINI_DEFAULT_MAX_OUTPUT_TOKENS; - } - } - // GPT-4 models - else if (lowerName.includes('gpt-4')) { - architecture = 'gpt'; - - if (lowerName.includes('turbo') || 
lowerName.includes('preview')) { - contextLength = OLLAMA_GPT4_TURBO_CONTEXT_LENGTH; // GPT-4 Turbo - maxOutputTokens = OLLAMA_GPT4_TURBO_MAX_OUTPUT_TOKENS; - } else if (lowerName.includes('32k')) { - contextLength = OLLAMA_GPT4_32K_CONTEXT_LENGTH; - maxOutputTokens = OLLAMA_GPT4_32K_MAX_OUTPUT_TOKENS; - } else { - contextLength = OLLAMA_GPT4_BASE_CONTEXT_LENGTH; // GPT-4 base - maxOutputTokens = OLLAMA_GPT4_BASE_MAX_OUTPUT_TOKENS; - } - } - // GPT-3.5 models - else if (lowerName.includes('gpt-3.5')) { - architecture = 'gpt'; - - if (lowerName.includes('16k')) { - contextLength = OLLAMA_GPT35_16K_CONTEXT_LENGTH; - maxOutputTokens = OLLAMA_GPT35_16K_MAX_OUTPUT_TOKENS; - } else { - contextLength = OLLAMA_GPT35_BASE_CONTEXT_LENGTH; - maxOutputTokens = OLLAMA_GPT35_BASE_MAX_OUTPUT_TOKENS; - } - } - // Qwen models - else if (lowerName.includes('qwen')) { - architecture = 'qwen'; - - // Qwen3 Coder Plus (coder-model) - if (lowerName.includes('coder-plus') || lowerName.includes('coder_plus') || lowerName.includes('coder-model')) { - contextLength = OLLAMA_QWEN_CODER_PLUS_CONTEXT_LENGTH; // 128K tokens - maxOutputTokens = OLLAMA_QWEN_CODER_PLUS_MAX_OUTPUT_TOKENS; // 65K output - } - // Qwen3 VL Plus (vision-model) - else if (lowerName.includes('vl-plus') || lowerName.includes('vl_plus') || lowerName.includes('vision-model')) { - contextLength = OLLAMA_QWEN_VL_PLUS_CONTEXT_LENGTH; // 256K tokens - maxOutputTokens = OLLAMA_QWEN_VL_PLUS_MAX_OUTPUT_TOKENS; // 32K output - } - // Qwen3 Coder Flash - else if (lowerName.includes('coder-flash') || lowerName.includes('coder_flash')) { - contextLength = OLLAMA_QWEN_CODER_FLASH_CONTEXT_LENGTH; // 128K tokens - maxOutputTokens = OLLAMA_QWEN_CODER_FLASH_MAX_OUTPUT_TOKENS; // 65K output - } - // Default for Qwen - else { - contextLength = OLLAMA_QWEN_DEFAULT_CONTEXT_LENGTH; // 32K tokens - maxOutputTokens = OLLAMA_QWEN_DEFAULT_MAX_OUTPUT_TOKENS; - } - } - - // Minimal parameter_size, as in the old patch - let parameterSize = '0B'; - - 
return { - license: '', - modelfile: `# Modelfile for ${modelName}\nFROM ${modelName}`, - parameters: `num_ctx ${contextLength}\nnum_predict ${maxOutputTokens}\ntemperature ${OLLAMA_DEFAULT_TEMPERATURE}\ntop_p ${OLLAMA_DEFAULT_TOP_P}`, - template: '{{ if .System }}{{ .System }}\n{{ end }}{{ .Prompt }}', - details: { - parent_model: '', - format: 'gguf', - family: family, - families: [family], - parameter_size: parameterSize, - quantization_level: OLLAMA_SHOW_QUANTIZATION_LEVEL - }, - model_info: { - 'general.architecture': architecture, - 'general.file_type': OLLAMA_DEFAULT_FILE_TYPE, - 'general.parameter_count': 0, - 'general.quantization_version': OLLAMA_DEFAULT_QUANTIZATION_VERSION, - 'general.context_length': contextLength, - 'llama.context_length': contextLength, - 'llama.rope.freq_base': OLLAMA_DEFAULT_ROPE_FREQ_BASE - }, - capabilities: ['tools', 'vision', 'completion'] // Indicate that the model supports tool calling - }; - } -} diff --git a/src/converters/utils.js b/src/converters/utils.js index fe6bac0..04e0eba 100644 --- a/src/converters/utils.js +++ b/src/converters/utils.js @@ -49,87 +49,6 @@ export const OPENAI_RESPONSES_DEFAULT_TOP_P = 0.95; export const OPENAI_RESPONSES_DEFAULT_INPUT_TOKEN_LIMIT = 32768; export const OPENAI_RESPONSES_DEFAULT_OUTPUT_TOKEN_LIMIT = 128000; -// ============================================================================= -// Ollama 相关常量 -// ============================================================================= -export const OLLAMA_DEFAULT_CONTEXT_LENGTH = 65534; -export const OLLAMA_DEFAULT_MAX_OUTPUT_TOKENS = 8192; - -// Claude 模型上下文长度 -export const OLLAMA_CLAUDE_DEFAULT_CONTEXT_LENGTH = 200000; -export const OLLAMA_CLAUDE_SONNET_45_CONTEXT_LENGTH = 200000; -export const OLLAMA_CLAUDE_SONNET_45_MAX_OUTPUT_TOKENS = 200000; -export const OLLAMA_CLAUDE_HAIKU_45_CONTEXT_LENGTH = 200000; -export const OLLAMA_CLAUDE_HAIKU_45_MAX_OUTPUT_TOKENS = 200000; -export const OLLAMA_CLAUDE_OPUS_41_CONTEXT_LENGTH = 200000; 
-export const OLLAMA_CLAUDE_OPUS_41_MAX_OUTPUT_TOKENS = 32000; -export const OLLAMA_CLAUDE_SONNET_40_CONTEXT_LENGTH = 200000; -export const OLLAMA_CLAUDE_SONNET_40_MAX_OUTPUT_TOKENS = 200000; -export const OLLAMA_CLAUDE_SONNET_37_CONTEXT_LENGTH = 200000; -export const OLLAMA_CLAUDE_SONNET_37_MAX_OUTPUT_TOKENS = 200000; -export const OLLAMA_CLAUDE_OPUS_40_CONTEXT_LENGTH = 200000; -export const OLLAMA_CLAUDE_OPUS_40_MAX_OUTPUT_TOKENS = 32000; -export const OLLAMA_CLAUDE_HAIKU_35_CONTEXT_LENGTH = 200000; -export const OLLAMA_CLAUDE_HAIKU_35_MAX_OUTPUT_TOKENS = 200000; -export const OLLAMA_CLAUDE_HAIKU_30_CONTEXT_LENGTH = 200000; -export const OLLAMA_CLAUDE_HAIKU_30_MAX_OUTPUT_TOKENS = 8192; -export const OLLAMA_CLAUDE_SONNET_35_CONTEXT_LENGTH = 200000; -export const OLLAMA_CLAUDE_SONNET_35_MAX_OUTPUT_TOKENS = 200000; -export const OLLAMA_CLAUDE_OPUS_30_CONTEXT_LENGTH = 200000; -export const OLLAMA_CLAUDE_OPUS_30_MAX_OUTPUT_TOKENS = 8192; - -// Gemini 模型上下文长度 -export const OLLAMA_GEMINI_25_PRO_CONTEXT_LENGTH = 1048576; -export const OLLAMA_GEMINI_25_PRO_MAX_OUTPUT_TOKENS = 65534; -export const OLLAMA_GEMINI_25_FLASH_CONTEXT_LENGTH = 1048576; -export const OLLAMA_GEMINI_25_FLASH_MAX_OUTPUT_TOKENS = 65534; -export const OLLAMA_GEMINI_25_IMAGE_CONTEXT_LENGTH = 65534; -export const OLLAMA_GEMINI_25_IMAGE_MAX_OUTPUT_TOKENS = 32768; -export const OLLAMA_GEMINI_25_LIVE_CONTEXT_LENGTH = 131072; -export const OLLAMA_GEMINI_25_LIVE_MAX_OUTPUT_TOKENS = 65534; -export const OLLAMA_GEMINI_25_TTS_CONTEXT_LENGTH = 65534; -export const OLLAMA_GEMINI_25_TTS_MAX_OUTPUT_TOKENS = 16384; -export const OLLAMA_GEMINI_20_FLASH_CONTEXT_LENGTH = 1048576; -export const OLLAMA_GEMINI_20_FLASH_MAX_OUTPUT_TOKENS = 65534; -export const OLLAMA_GEMINI_20_IMAGE_CONTEXT_LENGTH = 32768; -export const OLLAMA_GEMINI_20_IMAGE_MAX_OUTPUT_TOKENS = 65534; -export const OLLAMA_GEMINI_15_PRO_CONTEXT_LENGTH = 2097152; -export const OLLAMA_GEMINI_15_PRO_MAX_OUTPUT_TOKENS = 65534; -export const 
OLLAMA_GEMINI_15_FLASH_CONTEXT_LENGTH = 1048576; -export const OLLAMA_GEMINI_15_FLASH_MAX_OUTPUT_TOKENS = 65534; -export const OLLAMA_GEMINI_DEFAULT_CONTEXT_LENGTH = 1048576; -export const OLLAMA_GEMINI_DEFAULT_MAX_OUTPUT_TOKENS = 65534; - -// GPT 模型上下文长度 -export const OLLAMA_GPT4_TURBO_CONTEXT_LENGTH = 128000; -export const OLLAMA_GPT4_TURBO_MAX_OUTPUT_TOKENS = 8192; -export const OLLAMA_GPT4_32K_CONTEXT_LENGTH = 32768; -export const OLLAMA_GPT4_32K_MAX_OUTPUT_TOKENS = 8192; -export const OLLAMA_GPT4_BASE_CONTEXT_LENGTH = 200000; -export const OLLAMA_GPT4_BASE_MAX_OUTPUT_TOKENS = 8192; -export const OLLAMA_GPT35_16K_CONTEXT_LENGTH = 16385; -export const OLLAMA_GPT35_16K_MAX_OUTPUT_TOKENS = 8192; -export const OLLAMA_GPT35_BASE_CONTEXT_LENGTH = 8192; -export const OLLAMA_GPT35_BASE_MAX_OUTPUT_TOKENS = 8192; - -// Qwen 模型上下文长度 -export const OLLAMA_QWEN_CODER_PLUS_CONTEXT_LENGTH = 128000; -export const OLLAMA_QWEN_CODER_PLUS_MAX_OUTPUT_TOKENS = 65534; -export const OLLAMA_QWEN_VL_PLUS_CONTEXT_LENGTH = 262144; -export const OLLAMA_QWEN_VL_PLUS_MAX_OUTPUT_TOKENS = 32768; -export const OLLAMA_QWEN_CODER_FLASH_CONTEXT_LENGTH = 128000; -export const OLLAMA_QWEN_CODER_FLASH_MAX_OUTPUT_TOKENS = 65534; -export const OLLAMA_QWEN_DEFAULT_CONTEXT_LENGTH = 32768; -export const OLLAMA_QWEN_DEFAULT_MAX_OUTPUT_TOKENS = 200000; - -export const OLLAMA_DEFAULT_FILE_TYPE = 2; -export const OLLAMA_DEFAULT_QUANTIZATION_VERSION = 2; -export const OLLAMA_DEFAULT_ROPE_FREQ_BASE = 10000.0; -export const OLLAMA_DEFAULT_TEMPERATURE = 0.7; -export const OLLAMA_DEFAULT_TOP_P = 0.9; -export const OLLAMA_DEFAULT_QUANTIZATION_LEVEL = 'Q4_0'; -export const OLLAMA_SHOW_QUANTIZATION_LEVEL = 'Q4_K_M'; - // ============================================================================= // 通用辅助函数 // ============================================================================= diff --git a/src/handlers/ollama-handler.js b/src/handlers/ollama-handler.js deleted file mode 100644 index 4744280..0000000 --- 
a/src/handlers/ollama-handler.js +++ /dev/null @@ -1,796 +0,0 @@ -/** - * Ollama API 处理器 - * 处理Ollama特定的端点并在后端协议之间进行转换 - */ - -import { getRequestBody, handleError, MODEL_PROTOCOL_PREFIX, MODEL_PROVIDER, getProtocolPrefix } from '../utils/common.js'; -import logger from '../utils/logger.js'; -import { convertData } from '../convert/convert.js'; -import { ConverterFactory } from '../converters/ConverterFactory.js'; -import { getProviderModels } from '../providers/provider-models.js'; -// Ollama版本号 -/** - * Model name prefix mapping for different providers - * These prefixes are added to model names in the list for user visibility - * but are removed before sending to actual providers - */ -export const MODEL_PREFIX_MAP = { - [MODEL_PROVIDER.KIRO_API]: '[Kiro]', - [MODEL_PROVIDER.CLAUDE_CUSTOM]: '[Claude]', - [MODEL_PROVIDER.GEMINI_CLI]: '[Gemini CLI]', - [MODEL_PROVIDER.OPENAI_CUSTOM]: '[OpenAI]', - [MODEL_PROVIDER.QWEN_API]: '[Qwen CLI]', - [MODEL_PROVIDER.OPENAI_CUSTOM_RESPONSES]: '[OpenAI Responses]', - [MODEL_PROVIDER.ANTIGRAVITY]: '[Antigravity]', - [MODEL_PROVIDER.IFLOW_API]: '[iFlow]', -} - -/** - * Adds provider prefix to model name for display purposes - * @param {string} modelName - Original model name - * @param {string} provider - Provider type - * @returns {string} Model name with prefix - */ -export function addModelPrefix(modelName, provider) { - if (!modelName) return modelName; - - // Don't add prefix if already exists - if (/^\[.*?\]\s+/.test(modelName)) { - return modelName; - } - - const prefix = MODEL_PREFIX_MAP[provider]; - if (!prefix) { - return modelName; - } - return `${prefix} ${modelName}`; -} - -/** - * Removes provider prefix from model name before sending to provider - * @param {string} modelName - Model name with possible prefix - * @returns {string} Clean model name without prefix - */ -export function removeModelPrefix(modelName) { - if (!modelName) { - return modelName; - } - - // Remove any prefix pattern like [Warp], [Kiro], etc. 
- const prefixPattern = /^\[.*?\]\s+/; - return modelName.replace(prefixPattern, ''); -} - -/** - * Extracts provider type from prefixed model name - * @param {string} modelName - Model name with possible prefix - * @returns {string|null} Provider type or null if no prefix found - */ -export function getProviderFromPrefix(modelName) { - if (!modelName) { - return null; - } - - const match = modelName.match(/^\[(.*?)\]/); - if (!match) { - return null; - } - - const prefixText = `[${match[1]}]`; - - // Find provider by prefix - for (const [provider, prefix] of Object.entries(MODEL_PREFIX_MAP)) { - if (prefix === prefixText) { - return provider; - } - } - - return null; -} - -/** - * Adds provider prefix to array of models (works with any format) - * @param {Array} models - Array of model objects - * @param {string} provider - Provider type - * @param {string} format - Format type ('openai', 'gemini', 'ollama') - * @returns {Array} Models with prefixed names - */ -export function addPrefixToModels(models, provider, format = 'openai') { - if (!Array.isArray(models)) return models; - - return models.map(model => { - if (format === 'openai') { - return { ...model, id: addModelPrefix(model.id, provider) }; - } else if (format === 'ollama') { - return { - ...model, - name: addModelPrefix(model.name, provider), - model: addModelPrefix(model.model || model.name, provider) - }; - } else { - // gemini/claude format - return { - ...model, - name: addModelPrefix(model.name, provider), - displayName: model.displayName ? 
addModelPrefix(model.displayName, provider) : undefined - }; - } - }); -} - -/** - * Determine which provider to use based on model name - * @param {string} modelName - Model name (may include prefix like "[Warp] gpt-5") - * @param {Object} providerPoolManager - Provider pool manager - * @param {string} defaultProvider - Default provider - * @returns {string} Provider type - */ -export function getProviderByModelName(modelName, providerPoolManager, defaultProvider) { - if (!modelName || !providerPoolManager || !providerPoolManager.providerPools) { - return defaultProvider; - } - - // First, check if model name has a prefix that directly indicates the provider - const providerFromPrefix = getProviderFromPrefix(modelName); - if (providerFromPrefix) { - logger.info(`[Provider Selection] Provider determined from prefix: ${providerFromPrefix}`); - return providerFromPrefix; - } - - // Remove prefix for further analysis - const cleanModelName = removeModelPrefix(modelName); - const lowerModelName = cleanModelName.toLowerCase(); - - // Check if it's a Claude model - if (lowerModelName.includes('claude') || lowerModelName.includes('sonnet') || lowerModelName.includes('opus') || lowerModelName.includes('haiku')) { - // Find available Claude provider - for (const [providerType, providers] of Object.entries(providerPoolManager.providerPools)) { - if (providerType.includes('claude') || providerType.includes('kiro')) { - const healthyProvider = providers.find(p => p.isHealthy); - if (healthyProvider) { - return providerType; - } - } - } - } - - // Check if it's a Gemini model - if (lowerModelName.includes('gemini')) { - // Find available Gemini provider - for (const [providerType, providers] of Object.entries(providerPoolManager.providerPools)) { - if (providerType.includes('gemini')) { - const healthyProvider = providers.find(p => p.isHealthy); - if (healthyProvider) { - return providerType; - } - } - } - } - - // Check if it's a Qwen model - if 
(lowerModelName.includes('qwen')) { - // Find available Qwen provider - for (const [providerType, providers] of Object.entries(providerPoolManager.providerPools)) { - if (providerType.includes('qwen')) { - const healthyProvider = providers.find(p => p.isHealthy); - if (healthyProvider) { - return providerType; - } - } - } - } - - // Check if it's a GPT model - if (lowerModelName.includes('gpt')) { - // Find available OpenAI provider - for (const [providerType, providers] of Object.entries(providerPoolManager.providerPools)) { - if (providerType.includes('openai')) { - const healthyProvider = providers.find(p => p.isHealthy); - if (healthyProvider) { - return providerType; - } - } - } - } - - return defaultProvider; -} - -const OLLAMA_VERSION = '0.12.10'; - -/** - * Model to Provider Mapper - * Maps model names to their corresponding providers - */ - -/** - * Get provider type for a given model name - * @param {string} modelName - The model name to look up (may include prefix like "[Warp] gpt-5") - * @param {string} defaultProvider - The default provider if no match is found - * @returns {string} The provider type - */ -export function getProviderForModel(modelName, defaultProvider) { - if (!modelName) { - return defaultProvider; - } - - // First, check if model name has a prefix that directly indicates the provider - // const providerFromPrefix = getProviderFromPrefix(modelName); - // if (providerFromPrefix) { - // return providerFromPrefix; - // } - - // Remove prefix for further analysis - const cleanModelName = removeModelPrefix(modelName); - logger.info(`[Provider Selection] Clean model name: ${cleanModelName}`); - - // Try to find the provider by checking if the model is in the provider's model list - // This handles cases where different providers have the same model name - const providerType = findProviderByModelName(cleanModelName); - logger.info(`[Provider Selection] Provider determined from model list: ${providerType}`); - if (providerType) { - return 
providerType; - } - - logger.info(`[Provider Selection] Model name not found in provider models. Using default provider: ${defaultProvider}`); - // Default to the provided default provider - return defaultProvider; -} - -/** - * Find provider type by checking if the model name is in the provider's model list - * @param {string} modelName - The model name to look up - * @returns {string|null} The provider type or null if not found - */ -function findProviderByModelName(modelName) { - // Map of provider types to check - const providerTypes = [ - MODEL_PROVIDER.GEMINI_CLI, - MODEL_PROVIDER.ANTIGRAVITY, - MODEL_PROVIDER.KIRO_API, - MODEL_PROVIDER.QWEN_API, - MODEL_PROVIDER.IFLOW_API - ]; - - // Check each provider's model list - for (const providerType of providerTypes) { - const models = getProviderModels(providerType); - if (models.includes(modelName)) { - return providerType; - } - } - - return null; -} - -/** - * 规范化 Ollama 路径并检查是否为 Ollama 端点 - * @param {string} path - 原始路径 - * @param {URL} requestUrl - 请求 URL 对象 - * @returns {Object} - { normalizedPath: string, isOllamaEndpoint: boolean } - */ -export function normalizeOllamaPath(path, requestUrl) { - let normalizedPath = path; - - // Normalize common Ollama path aliases (e.g., '/ollama/api/tags' -> '/api/tags') - if (normalizedPath.startsWith('/ollama/')) { - normalizedPath = normalizedPath.replace(/^\/ollama/, ''); - if (requestUrl) { - requestUrl.pathname = normalizedPath; - } - } - - // Map other common aliases - if (normalizedPath === '/v1/models') { - normalizedPath = '/api/tags'; - if (requestUrl) { - requestUrl.pathname = normalizedPath; - } - } - if (normalizedPath === '/api/tags/') { - normalizedPath = '/api/tags'; - if (requestUrl) { - requestUrl.pathname = normalizedPath; - } - } - - // Check if this is an Ollama endpoint - const isOllamaEndpoint = normalizedPath.startsWith('/api/'); - - return { normalizedPath, isOllamaEndpoint }; -} - -/** - * 处理所有 Ollama 相关的路径规范化和端点路由 - * @param {string} method - 
HTTP 方法 - * @param {string} path - 请求路径 - * @param {URL} requestUrl - 请求 URL 对象 - * @param {Object} req - 请求对象 - * @param {Object} res - 响应对象 - * @param {Object} apiService - API 服务实例 - * @param {Object} currentConfig - 当前配置 - * @param {Object} providerPoolManager - 提供商池管理器 - * @returns {Object} - { handled: boolean, normalizedPath: string } - */ -export async function handleOllamaRequest(method, path, requestUrl, req, res, apiService, currentConfig, providerPoolManager) { - // Normalize Ollama paths - const { normalizedPath } = normalizeOllamaPath(path, requestUrl); - - // Handle Ollama endpoints before auth check - const ollamaHandledBeforeAuth = await handleOllamaEndpointsBeforeAuth(method, normalizedPath, req, res); - if (ollamaHandledBeforeAuth) { - return { handled: true, normalizedPath }; - } - - // Handle Ollama endpoints after auth check - const ollamaHandledAfterAuth = await handleOllamaEndpointsAfterAuth(method, normalizedPath, req, res, apiService, currentConfig, providerPoolManager); - if (ollamaHandledAfterAuth) { - return { handled: true, normalizedPath }; - } - - return { handled: false, normalizedPath }; -} - -/** - * 处理 Ollama 端点路由(在认证检查之前) - * @param {string} method - HTTP 方法 - * @param {string} path - 请求路径 - * @param {Object} req - 请求对象 - * @param {Object} res - 响应对象 - * @returns {boolean} - 是否已处理请求 - */ -export async function handleOllamaEndpointsBeforeAuth(method, path, req, res) { - // Handle Ollama API endpoints BEFORE auth check (Ollama doesn't use authentication by default) - if (method === 'GET' && path === '/api/version') { - handleOllamaVersion(res); - return true; - } - - return false; -} - -/** - * 处理 Ollama 端点路由(在认证检查之后) - * @param {string} method - HTTP 方法 - * @param {string} path - 请求路径 - * @param {Object} req - 请求对象 - * @param {Object} res - 响应对象 - * @param {Object} apiService - API 服务实例 - * @param {Object} currentConfig - 当前配置 - * @param {Object} providerPoolManager - 提供商池管理器 - * @returns {boolean} - 是否已处理请求 - */ -export async 
function handleOllamaEndpointsAfterAuth(method, path, req, res, apiService, currentConfig, providerPoolManager) { - // Handle Ollama endpoints that need apiService (after auth check) - if (method === 'GET' && path === '/api/tags') { - await handleOllamaTags(req, res, apiService, currentConfig, providerPoolManager); - return true; - } - if (method === 'POST' && path === '/api/chat') { - await handleOllamaChat(req, res, apiService, currentConfig, providerPoolManager); - return true; - } - if (method === 'POST' && path === '/api/generate') { - await handleOllamaGenerate(req, res, apiService, currentConfig, providerPoolManager); - return true; - } - - return false; -} - -/** - * 处理 Ollama /api/tags 端点(列出模型) - * Note: apiService can be null when called before provider selection (e.g., from /ollama/api/tags) - * In this case, we fetch models from all healthy providers in the pool - */ -export async function handleOllamaTags(req, res, apiService, currentConfig, providerPoolManager) { - try { - logger.info('[Ollama] Handling /api/tags request'); - - const ollamaConverter = ConverterFactory.getConverter(MODEL_PROTOCOL_PREFIX.OLLAMA); - const { getServiceAdapter } = await import('../providers/adapter.js'); - - // Helper to fetch and convert models from a provider - const fetchProviderModels = async (providerType, service) => { - try { - const models = await service.listModels(); - const sourceProtocol = getProtocolPrefix(providerType); - const tags = ollamaConverter.convertModelList(models, sourceProtocol); - - if (tags.models && Array.isArray(tags.models)) { - return addPrefixToModels(tags.models, providerType, 'ollama'); - } - return []; - } catch (error) { - logger.error(`[Ollama] Error from ${providerType}:`, error.message); - return []; - } - }; - - // Collect fetch promises - const fetchPromises = []; - const processedProviderTypes = new Set(); - - // If apiService is provided, use it for the default provider - if (apiService) { - 
fetchPromises.push(fetchProviderModels(currentConfig.MODEL_PROVIDER, apiService)); - processedProviderTypes.add(currentConfig.MODEL_PROVIDER); - } - - // Add provider pool fetches (for all healthy providers) - if (providerPoolManager?.providerPools) { - for (const [providerType, providers] of Object.entries(providerPoolManager.providerPools)) { - // Skip if already processed - if (processedProviderTypes.has(providerType)) continue; - - const healthyProvider = providers.find(p => p.isHealthy && !p.isDisabled); - if (healthyProvider) { - const tempConfig = { ...currentConfig, ...healthyProvider, MODEL_PROVIDER: providerType }; - const service = getServiceAdapter(tempConfig); - fetchPromises.push(fetchProviderModels(providerType, service)); - processedProviderTypes.add(providerType); - } - } - } - - // If no providers available, return empty list - if (fetchPromises.length === 0) { - logger.warn('[Ollama] No healthy providers available to fetch models'); - const response = { models: [] }; - res.writeHead(200, { - 'Content-Type': 'application/json', - 'Access-Control-Allow-Origin': '*', - 'Server': `ollama/${OLLAMA_VERSION}` - }); - res.end(JSON.stringify(response)); - return; - } - - // Execute all fetches in parallel - const results = await Promise.all(fetchPromises); - const allModels = results.flat(); - - logger.info(`[Ollama] Fetched ${allModels.length} models from ${processedProviderTypes.size} provider(s)`); - - const response = { models: allModels }; - - res.writeHead(200, { - 'Content-Type': 'application/json', - 'Access-Control-Allow-Origin': '*', - 'Server': `ollama/${OLLAMA_VERSION}` - }); - res.end(JSON.stringify(response)); - } catch (error) { - logger.error('[Ollama Tags Error]', error); - handleError(res, error, MODEL_PROTOCOL_PREFIX.OLLAMA); - } -} - -/** - * 处理 Ollama /api/show 端点(显示模型信息) - */ -export async function handleOllamaShow(req, res) { - try { - // logger.info('[Ollama] Handling /api/show request'); - - const body = await getRequestBody(req); 
- const modelName = body.name || body.model || 'unknown'; - - const ollamaConverter = ConverterFactory.getConverter(MODEL_PROTOCOL_PREFIX.OLLAMA); - const showResponse = ollamaConverter.toOllamaShowResponse(modelName); - - res.writeHead(200, { - 'Content-Type': 'application/json', - 'Access-Control-Allow-Origin': '*', - 'Server': `ollama/${OLLAMA_VERSION}` - }); - res.end(JSON.stringify(showResponse)); - } catch (error) { - logger.error('[Ollama Show Error]', error); - handleError(res, error, MODEL_PROTOCOL_PREFIX.OLLAMA); - } -} - -/** - * 处理 Ollama /api/version 端点 - */ -export function handleOllamaVersion(res) { - try { - const response = { version: OLLAMA_VERSION }; - - res.writeHead(200, { - 'Content-Type': 'application/json', - 'Access-Control-Allow-Origin': '*', - 'Server': `ollama/${OLLAMA_VERSION}` - }); - res.end(JSON.stringify(response)); - } catch (error) { - logger.error('[Ollama Version Error]', error); - handleError(res, error, MODEL_PROTOCOL_PREFIX.OLLAMA); - } -} - -/** - * 处理 Ollama /api/chat 端点 - * Note: apiService can be null when called before provider selection - */ -export async function handleOllamaChat(req, res, apiService, currentConfig, providerPoolManager) { - try { - logger.info('[Ollama] Handling /api/chat request'); - - const ollamaRequest = await getRequestBody(req); - const { getServiceAdapter } = await import('../providers/adapter.js'); - - // Determine provider based on model name - const rawModelName = ollamaRequest.model; - const modelName = removeModelPrefix(rawModelName); - ollamaRequest.model = modelName; // Use clean model name - const detectedProvider = getProviderForModel(rawModelName, currentConfig.MODEL_PROVIDER); - - logger.info(`[Ollama] Model: ${modelName}, Detected provider: ${detectedProvider}`); - - // Get the appropriate service based on detected provider - let actualApiService = apiService; - let actualConfig = currentConfig; - - // If apiService is null or provider is different, get the appropriate service from 
pool - if (!apiService || detectedProvider !== currentConfig.MODEL_PROVIDER) { - if (providerPoolManager) { - // Select provider from pool (now async) - const providerConfig = await providerPoolManager.selectProvider(detectedProvider, modelName, { skipUsageCount: true }); - if (providerConfig) { - actualConfig = { - ...currentConfig, - ...providerConfig, - MODEL_PROVIDER: detectedProvider - }; - actualApiService = getServiceAdapter(actualConfig); - logger.info(`[Ollama] Using provider from pool: ${detectedProvider}`); - } else { - // No healthy provider in pool, try to create service directly - logger.warn(`[Ollama] No healthy provider found for ${detectedProvider} in pool`); - if (!apiService) { - throw new Error(`No healthy provider available for ${detectedProvider}`); - } - } - } else if (!apiService) { - // No pool manager and no apiService, try to create service directly - actualConfig = { ...currentConfig, MODEL_PROVIDER: detectedProvider }; - actualApiService = getServiceAdapter(actualConfig); - logger.info(`[Ollama] Created service adapter for: ${detectedProvider}`); - } - } - - // Convert Ollama request to OpenAI format - const ollamaConverter = ConverterFactory.getConverter(MODEL_PROTOCOL_PREFIX.OLLAMA); - const openaiRequest = ollamaConverter.convertRequest(ollamaRequest, MODEL_PROTOCOL_PREFIX.OPENAI); - - // Get the source protocol from the actual provider - const sourceProtocol = getProtocolPrefix(actualConfig.MODEL_PROVIDER); - - // Convert OpenAI format to backend provider format if needed - let backendRequest = openaiRequest; - if (sourceProtocol !== MODEL_PROTOCOL_PREFIX.OPENAI) { - backendRequest = convertData(openaiRequest, 'request', MODEL_PROTOCOL_PREFIX.OPENAI, sourceProtocol); - } - - // Handle streaming - if (ollamaRequest.stream) { - let clientDisconnected = false; - let listenersRegistered = false; - - // 监听客户端断开连接(只注册一次) - const onClientClose = () => { - clientDisconnected = true; - logger.info('[Ollama] Client disconnected during 
streaming'); - }; - const onClientError = (err) => { - clientDisconnected = true; - logger.error('[Ollama] Response stream error:', err.message); - }; - - if (!listenersRegistered) { - res.on('close', onClientClose); - res.on('error', onClientError); - listenersRegistered = true; - } - - try { - res.writeHead(200, { - 'Content-Type': 'application/json', - 'Transfer-Encoding': 'chunked', - 'Access-Control-Allow-Origin': '*', - 'Server': `ollama/${OLLAMA_VERSION}` - }); - - const stream = await actualApiService.generateContentStream(openaiRequest.model, backendRequest); - - for await (const chunk of stream) { - if (clientDisconnected) { - logger.info('[Ollama] Stopping stream due to client disconnect'); - break; - } - - try { - // Convert backend chunk to Ollama format - const ollamaChunk = ollamaConverter.convertStreamChunk(chunk, sourceProtocol, ollamaRequest.model, false); - if (!res.writableEnded) { - res.write(JSON.stringify(ollamaChunk) + '\n'); - } - } catch (chunkError) { - logger.error('[Ollama] Error processing chunk:', chunkError); - } - } - - // Send final chunk - if (!clientDisconnected && !res.writableEnded) { - const finalChunk = ollamaConverter.convertStreamChunk({}, sourceProtocol, ollamaRequest.model, true); - res.write(JSON.stringify(finalChunk) + '\n'); - res.end(); - } - } finally { - if (listenersRegistered) { - res.off('close', onClientClose); - res.off('error', onClientError); - } - } - } else { - // Non-streaming response - const backendResponse = await actualApiService.generateContent(openaiRequest.model, backendRequest); - const ollamaResponse = ollamaConverter.convertResponse(backendResponse, sourceProtocol, ollamaRequest.model); - - res.writeHead(200, { - 'Content-Type': 'application/json', - 'Access-Control-Allow-Origin': '*', - 'Server': `ollama/${OLLAMA_VERSION}` - }); - res.end(JSON.stringify(ollamaResponse)); - } - } catch (error) { - logger.error('[Ollama Chat Error]', error); - handleError(res, error, MODEL_PROTOCOL_PREFIX.OLLAMA); 
- } -} - -/** - * 处理 Ollama /api/generate 端点 - * Note: apiService can be null when called before provider selection - */ -export async function handleOllamaGenerate(req, res, apiService, currentConfig, providerPoolManager) { - try { - logger.info('[Ollama] Handling /api/generate request'); - - const ollamaRequest = await getRequestBody(req); - const { getServiceAdapter } = await import('../providers/adapter.js'); - - // Determine provider based on model name - const rawModelName = ollamaRequest.model; - const modelName = removeModelPrefix(rawModelName); - ollamaRequest.model = modelName; // Use clean model name - const detectedProvider = getProviderForModel(rawModelName, currentConfig.MODEL_PROVIDER); - - logger.info(`[Ollama] Model: ${modelName}, Detected provider: ${detectedProvider}`); - - // Get the appropriate service based on detected provider - let actualApiService = apiService; - let actualConfig = currentConfig; - - // If apiService is null or provider is different, get the appropriate service from pool - if (!apiService || detectedProvider !== currentConfig.MODEL_PROVIDER) { - if (providerPoolManager) { - // Select provider from pool (now async) - const providerConfig = await providerPoolManager.selectProvider(detectedProvider, modelName, { skipUsageCount: true }); - if (providerConfig) { - actualConfig = { - ...currentConfig, - ...providerConfig, - MODEL_PROVIDER: detectedProvider - }; - actualApiService = getServiceAdapter(actualConfig); - logger.info(`[Ollama] Using provider from pool: ${detectedProvider}`); - } else { - // No healthy provider in pool, try to create service directly - logger.warn(`[Ollama] No healthy provider found for ${detectedProvider} in pool`); - if (!apiService) { - throw new Error(`No healthy provider available for ${detectedProvider}`); - } - } - } else if (!apiService) { - // No pool manager and no apiService, try to create service directly - actualConfig = { ...currentConfig, MODEL_PROVIDER: detectedProvider }; - 
actualApiService = getServiceAdapter(actualConfig); - logger.info(`[Ollama] Created service adapter for: ${detectedProvider}`); - } - } - - // Convert Ollama request to OpenAI format - const ollamaConverter = ConverterFactory.getConverter(MODEL_PROTOCOL_PREFIX.OLLAMA); - const openaiRequest = ollamaConverter.convertRequest(ollamaRequest, MODEL_PROTOCOL_PREFIX.OPENAI); - - // Get the source protocol from the actual provider - const sourceProtocol = getProtocolPrefix(actualConfig.MODEL_PROVIDER); - - // Convert OpenAI format to backend provider format if needed - let backendRequest = openaiRequest; - if (sourceProtocol !== MODEL_PROTOCOL_PREFIX.OPENAI) { - backendRequest = convertData(openaiRequest, 'request', MODEL_PROTOCOL_PREFIX.OPENAI, sourceProtocol); - } - - // Handle streaming - if (ollamaRequest.stream) { - let clientDisconnected = false; - let listenersRegistered = false; - - // 监听客户端断开连接(只注册一次) - const onClientClose = () => { - clientDisconnected = true; - logger.info('[Ollama Generate] Client disconnected during streaming'); - }; - const onClientError = (err) => { - clientDisconnected = true; - logger.error('[Ollama Generate] Response stream error:', err.message); - }; - - if (!listenersRegistered) { - res.on('close', onClientClose); - res.on('error', onClientError); - listenersRegistered = true; - } - - try { - res.writeHead(200, { - 'Content-Type': 'application/json', - 'Transfer-Encoding': 'chunked', - 'Access-Control-Allow-Origin': '*', - 'Server': `ollama/${OLLAMA_VERSION}` - }); - - const stream = await actualApiService.generateContentStream(openaiRequest.model, backendRequest); - - for await (const chunk of stream) { - if (clientDisconnected) { - logger.info('[Ollama Generate] Stopping stream due to client disconnect'); - break; - } - - try { - // Convert backend chunk to Ollama generate format - const ollamaChunk = ollamaConverter.toOllamaGenerateStreamChunk(chunk, ollamaRequest.model, false); - if (!res.writableEnded) { - 
res.write(JSON.stringify(ollamaChunk) + '\n'); - } - } catch (chunkError) { - logger.error('[Ollama] Error processing chunk:', chunkError); - } - } - - // Send final chunk - if (!clientDisconnected && !res.writableEnded) { - const finalChunk = ollamaConverter.toOllamaGenerateStreamChunk({}, ollamaRequest.model, true); - res.write(JSON.stringify(finalChunk) + '\n'); - res.end(); - } - } finally { - if (listenersRegistered) { - res.off('close', onClientClose); - res.off('error', onClientError); - } - } - } else { - // Non-streaming response - const backendResponse = await actualApiService.generateContent(openaiRequest.model, backendRequest); - const ollamaResponse = ollamaConverter.toOllamaGenerateResponse(backendResponse, ollamaRequest.model); - - res.writeHead(200, { - 'Content-Type': 'application/json', - 'Access-Control-Allow-Origin': '*', - 'Server': `ollama/${OLLAMA_VERSION}` - }); - res.end(JSON.stringify(ollamaResponse)); - } - } catch (error) { - logger.error('[Ollama Generate Error]', error); - handleError(res, error, MODEL_PROTOCOL_PREFIX.OLLAMA); - } -} - diff --git a/src/handlers/request-handler.js b/src/handlers/request-handler.js index dd18b2b..c54fd1f 100644 --- a/src/handlers/request-handler.js +++ b/src/handlers/request-handler.js @@ -7,8 +7,8 @@ import { getApiService, getProviderStatus } from '../services/service-manager.js import { getProviderPoolManager } from '../services/service-manager.js'; import { MODEL_PROVIDER } from '../utils/common.js'; import { getRegisteredProviders } from '../providers/adapter.js'; +import { countTokensAnthropic } from '../utils/token-utils.js'; import { PROMPT_LOG_FILENAME } from '../core/config-manager.js'; -import { handleOllamaRequest, handleOllamaShow } from './ollama-handler.js'; import { getPluginManager } from '../core/plugin-manager.js'; import { randomUUID } from 'crypto'; import { handleGrokAssetsProxy } from '../utils/grok-assets-proxy.js'; @@ -93,12 +93,6 @@ export function createRequestHandler(config, 
providerPoolManager) { const uiHandled = await handleUIApiRequests(method, path, req, res, currentConfig, providerPoolManager); if (uiHandled) return; - // Ollama show endpoint with model name - if (method === 'POST' && path === '/ollama/api/show') { - await handleOllamaShow(req, res); - return true; - } - // logger.info(`\n${new Date().toLocaleString()}`); logger.info(`[Server] Received request: ${req.method} http://${req.headers.host}${req.url}`); @@ -170,15 +164,15 @@ export function createRequestHandler(config, providerPoolManager) { } // Check if the first path segment matches a MODEL_PROVIDER and switch if it does - // Note: 'ollama' is not a valid MODEL_PROVIDER, it's a protocol prefix for Ollama API compatibility const pathSegments = path.split('/').filter(segment => segment.length > 0); - const isOllamaPath = pathSegments[0] === 'ollama' || path.startsWith('/api/'); - if (pathSegments.length > 0 && !isOllamaPath) { + if (pathSegments.length > 0) { const firstSegment = pathSegments[0]; const registeredProviders = getRegisteredProviders(); const isValidProvider = registeredProviders.includes(firstSegment); - if (firstSegment && isValidProvider) { + const isAutoMode = firstSegment === MODEL_PROVIDER.AUTO; + + if (firstSegment && (isValidProvider || isAutoMode)) { currentConfig.MODEL_PROVIDER = firstSegment; logger.info(`[Config] MODEL_PROVIDER overridden by path segment to: ${currentConfig.MODEL_PROVIDER}`); pathSegments.shift(); @@ -215,52 +209,22 @@ export function createRequestHandler(config, providerPoolManager) { return; } - // Handle Ollama request BEFORE getting apiService (Ollama endpoints handle their own provider selection) - // This is important because Ollama /api/tags aggregates models from ALL providers, not just the default one - if (isOllamaPath) { - const { handled, normalizedPath } = await handleOllamaRequest(method, path, requestUrl, req, res, null, currentConfig, providerPoolManager); - if (handled) return; - // If not handled by Ollama 
handler, continue with normal flow - path = normalizedPath; - } - - // Get or select the API Service instance - let apiService; - try { - apiService = await getApiService(currentConfig); - } catch (error) { - handleError(res, { statusCode: 500, message: `Failed to get API service: ${error.message}` }, currentConfig.MODEL_PROVIDER); - const poolManager = getProviderPoolManager(); - if (poolManager) { - poolManager.markProviderUnhealthy(currentConfig.MODEL_PROVIDER, { - uuid: currentConfig.uuid - }); - } - return; - } - // Handle count_tokens requests (Anthropic API compatible) if (path.includes('/count_tokens') && method === 'POST') { try { const body = await parseRequestBody(req); logger.info(`[Server] Handling count_tokens request for model: ${body.model}`); - // Check if apiService has countTokens method - if (apiService && typeof apiService.countTokens === 'function') { - const result = apiService.countTokens(body); + // Use common utility method directly + try { + const result = countTokensAnthropic(body); res.writeHead(200, { 'Content-Type': 'application/json' }); res.end(JSON.stringify(result)); - } else { - // Fallback: use estimateInputTokens if available - if (apiService && typeof apiService.estimateInputTokens === 'function') { - const inputTokens = apiService.estimateInputTokens(body); - res.writeHead(200, { 'Content-Type': 'application/json' }); - res.end(JSON.stringify({ input_tokens: inputTokens })); - } else { - // Last resort: return 0 with a message - res.writeHead(200, { 'Content-Type': 'application/json' }); - res.end(JSON.stringify({ input_tokens: 0 })); - } + } catch (tokenError) { + logger.warn(`[Server] Common countTokens failed, falling back: ${tokenError.message}`); + // Last resort: return 0 + res.writeHead(200, { 'Content-Type': 'application/json' }); + res.end(JSON.stringify({ input_tokens: 0 })); } return true; } catch (error) { @@ -270,8 +234,23 @@ export function createRequestHandler(config, providerPoolManager) { } } + // Get or select the API Service instance + let
apiService; + // try { + // apiService = await getApiService(currentConfig); + // } catch (error) { + // handleError(res, { statusCode: 500, message: `Failed to get API service: ${error.message}` }, currentConfig.MODEL_PROVIDER); + // const poolManager = getProviderPoolManager(); + // if (poolManager) { + // poolManager.markProviderUnhealthy(currentConfig.MODEL_PROVIDER, { + // uuid: currentConfig.uuid + // }); + // } + // return; + // } + try { - // Handle API requests (Ollama requests are already handled above before apiService is obtained) + // Handle API requests const apiHandled = await handleAPIRequests(method, path, req, res, currentConfig, apiService, providerPoolManager, PROMPT_LOG_FILENAME); if (apiHandled) return; diff --git a/src/providers/claude/claude-kiro.js b/src/providers/claude/claude-kiro.js index 24c5921..766336b 100644 --- a/src/providers/claude/claude-kiro.js +++ b/src/providers/claude/claude-kiro.js @@ -8,7 +8,13 @@ import * as crypto from 'crypto'; import * as http from 'http'; import * as https from 'https'; import { getProviderModels } from '../provider-models.js'; -import { countTokens } from '@anthropic-ai/tokenizer'; +import { + countTextTokens as countTextTokensUtil, + estimateInputTokens as estimateInputTokensUtil, + countTokensAnthropic as countTokensUtil, + processContent as processContentUtil, + getContentText as getContentTextUtil +} from '../../utils/token-utils.js'; import { configureAxiosProxy } from '../../utils/proxy-utils.js'; import { isRetryableNetworkError, MODEL_PROVIDER, formatExpiryLog } from '../../utils/common.js'; import { getProviderPoolManager } from '../../services/service-manager.js'; @@ -720,35 +726,35 @@ async saveCredentialsToFile(filePath, newData) { } + /** + * Count tokens for a given text using Claude's official tokenizer + * Static version for use without instance + */ + static countTextTokens(text) { + return countTextTokensUtil(text); + } + + /** + * Count tokens for a message request (compatible with 
Anthropic API) + * Static version for use without instance + */ + static countTokens(requestBody) { + return countTokensUtil(requestBody); + } + + /** + * Calculate input tokens from request body + * Static version for use without instance + */ + static estimateInputTokens(requestBody) { + return estimateInputTokensUtil(requestBody); + } + /** * Extract text content from OpenAI message format */ getContentText(message) { - if(message==null){ - return ""; - } - if (Array.isArray(message)) { - return message.map(part => { - if (typeof part === 'string') return part; - if (part && typeof part === 'object') { - if (part.type === 'text' && part.text) return part.text; - if (part.text) return part.text; - } - return ''; - }).join(''); - } else if (typeof message.content === 'string') { - return message.content; - } else if (Array.isArray(message.content)) { - return message.content.map(part => { - if (typeof part === 'string') return part; - if (part && typeof part === 'object') { - if (part.type === 'text' && part.text) return part.text; - if (part.text) return part.text; - } - return ''; - }).join(''); - } - return String(message.content || message); + return getContentTextUtil(message); } /** @@ -757,22 +763,7 @@ async saveCredentialsToFile(filePath, newData) { * @returns {string} 处理后的文本 */ processContent(content) { - if (!content) return ""; - if (typeof content === 'string') return content; - if (Array.isArray(content)) { - return content.map(part => { - if (typeof part === 'string') return part; - if (part && typeof part === 'object') { - if (part.type === 'text') return part.text || ""; - if (part.type === 'thinking') return part.thinking || part.text || ""; - if (part.type === 'tool_result') return this.processContent(part.content); - if (part.type === 'tool_use' && part.input) return JSON.stringify(part.input); - if (part.text) return part.text; - } - return ""; - }).join(""); - } - return this.getContentText(content); + return processContentUtil(content); } 
_normalizeThinkingBudgetTokens(budgetTokens) { @@ -2644,56 +2635,14 @@ async saveCredentialsToFile(filePath, newData) { * Count tokens for a given text using Claude's official tokenizer */ countTextTokens(text) { - if (!text) return 0; - try { - return countTokens(text); - } catch (error) { - // Fallback to estimation if tokenizer fails - logger.warn('[Kiro] Tokenizer error, falling back to estimation:', error.message); - return Math.ceil((text || '').length / 4); - } + return KiroApiService.countTextTokens(text); } /** * Calculate input tokens from request body using Claude's official tokenizer */ estimateInputTokens(requestBody) { - let allText = ""; - - // Count system prompt tokens - if (requestBody.system) { - allText += this.processContent(requestBody.system); - } - - // Count thinking prefix tokens if thinking is enabled - if (requestBody.thinking?.type && typeof requestBody.thinking.type === 'string') { - const t = requestBody.thinking.type.toLowerCase().trim(); - if (t === 'enabled') { - const budget = this._normalizeThinkingBudgetTokens(requestBody.thinking.budget_tokens); - allText += `enabled${budget}`; - } else if (t === 'adaptive') { - const effortRaw = typeof requestBody.thinking.effort === 'string' ? requestBody.thinking.effort : ''; - const effort = effortRaw.toLowerCase().trim(); - const normalizedEffort = (effort === 'low' || effort === 'medium' || effort === 'high') ? 
effort : 'high'; - allText += `adaptive${normalizedEffort}`; - } - } - - // Count all messages tokens - if (requestBody.messages && Array.isArray(requestBody.messages)) { - for (const message of requestBody.messages) { - if (message.content) { - allText += this.processContent(message.content); - } - } - } - - // Count tools definitions tokens if present - if (requestBody.tools && Array.isArray(requestBody.tools)) { - allText += JSON.stringify(requestBody.tools); - } - - return this.countTextTokens(allText); + return KiroApiService.estimateInputTokens(requestBody); } /** @@ -2957,48 +2906,7 @@ async saveCredentialsToFile(filePath, newData) { * @returns {Object} { input_tokens: number } */ countTokens(requestBody) { - let allText = ""; - let extraTokens = 0; - - // Count system prompt tokens - if (requestBody.system) { - allText += this.processContent(requestBody.system); - } - - // Count all messages tokens - if (requestBody.messages && Array.isArray(requestBody.messages)) { - for (const message of requestBody.messages) { - if (message.content) { - if (Array.isArray(message.content)) { - for (const block of message.content) { - if (block.type === 'image') { - // Images have a fixed token cost (approximately 1600 tokens for a typical image) - // This is an estimation as actual cost depends on image size - extraTokens += 1600; - } else if (block.type === 'document') { - // Documents - estimate based on content if available - if (block.source?.data) { - // For base64 encoded documents, estimate tokens - const estimatedChars = block.source.data.length * 0.75; // base64 to bytes ratio - extraTokens += Math.ceil(estimatedChars / 4); - } - } else { - allText += this.processContent([block]); - } - } - } else { - allText += this.processContent(message.content); - } - } - } - } - - // Count tools definitions tokens if present - if (requestBody.tools && Array.isArray(requestBody.tools)) { - allText += JSON.stringify(requestBody.tools); - } - - return { input_tokens: 
this.countTextTokens(allText) + extraTokens }; + return KiroApiService.countTokens(requestBody); } /** diff --git a/src/providers/provider-pool-manager.js b/src/providers/provider-pool-manager.js index fd6ad7c..261ac73 100644 --- a/src/providers/provider-pool-manager.js +++ b/src/providers/provider-pool-manager.js @@ -1,11 +1,11 @@ import * as fs from 'fs'; -import * as crypto from 'crypto'; -import { getServiceAdapter } from './adapter.js'; +import { getServiceAdapter, getRegisteredProviders } from './adapter.js'; import logger from '../utils/logger.js'; import { MODEL_PROVIDER, getProtocolPrefix } from '../utils/common.js'; import { getProviderModels } from './provider-models.js'; import { broadcastEvent } from '../ui-modules/event-broadcast.js'; -import axios from 'axios'; +import { convertData } from '../convert/convert.js'; +import { ENDPOINT_TYPE } from '../utils/common.js'; /** * Manages a pool of API service providers, handling their health and selection. @@ -1186,12 +1186,107 @@ export class ProviderPoolManager { return stats; } + /** + * Gets all available models across all provider pools, with optional format conversion. + * @param {string} [endpointType] - Optional endpoint type for format conversion (OPENAI_MODEL_LIST or GEMINI_MODEL_LIST). + * @returns {Promise} Formatted model list or raw array of model objects. 
+ */ + async getAllAvailableModels(endpointType = null) { + const allModels = []; + + // Collect all registered providers and those present in the provider pool + const registeredProviders = getRegisteredProviders(); + const allProviderTypes = Array.from(new Set([...registeredProviders])); + + for (const providerType of allProviderTypes) { + if (this.providerStatus[providerType]) { + let models = getProviderModels(providerType); + + // If the hardcoded model list is empty, or no nodes of this provider type are configured in the pool, try fetching from the service + if (models.length === 0) { + try { + // Decide which config to use: prefer the first pool node's config, otherwise fall back to the global config + let targetConfig = this.globalConfig; + if (this.providerStatus[providerType] && this.providerStatus[providerType].length > 0) { + targetConfig = this.providerStatus[providerType][0].config; + } + + const tempConfig = { + ...this.globalConfig, + ...targetConfig, + MODEL_PROVIDER: providerType + }; + const serviceAdapter = getServiceAdapter(tempConfig); + + if (typeof serviceAdapter.listModels === 'function') { + const nativeModels = await serviceAdapter.listModels(); + // Normalize to OpenAI format so model IDs can be extracted + const convertedData = convertData(nativeModels, 'modelList', providerType, MODEL_PROVIDER.OPENAI_CUSTOM); + if (convertedData && Array.isArray(convertedData.data)) { + const fetchedModels = convertedData.data.map(m => m.id); + if (fetchedModels.length > 0) { + models = fetchedModels; + } + } + } + } catch (err) { + this._log('debug', `Failed to fetch model list for ${providerType} from service: ${err.message}`); + // Keep the existing models (possibly the empty hardcoded list or whatever getProviderModels returned) + } + } + + for (const model of models) { + allModels.push({ + id: `${providerType}:${model}`, + provider: providerType, + model: model + }); + } + } + } + + // If no endpointType is given, return the raw array + if (!endpointType) { + return allModels; + } + + // Convert to the format that matches endpointType + if (endpointType === ENDPOINT_TYPE.OPENAI_MODEL_LIST) { + // Aggregate in OpenAI format + return { + object: "list", + data: allModels.map(m => ({ + id: m.id, + object: "model", + created: Math.floor(Date.now() / 1000), + owned_by: m.provider + })) + }; + } else if
(endpointType === ENDPOINT_TYPE.GEMINI_MODEL_LIST) { + // Aggregate in Gemini format + return { + models: allModels.map(m => ({ + name: `models/${m.id}`, + baseModelId: m.model, + version: "v1", + displayName: `${m.model} (${m.provider})`, + description: `Model ${m.model} provided by ${m.provider}`, + supportedGenerationMethods: ["generateContent", "countTokens"] + })) + }; + } + + // Default: return an empty list + return { data: [] }; + } + /** * Mark a provider as needing refresh and push it onto the refresh queue * @param {string} providerType - Provider type * @param {object} providerConfig - Provider config (includes uuid) */ markProviderNeedRefresh(providerType, providerConfig) { + if (!providerConfig?.uuid) { this._log('error', 'Invalid providerConfig in markProviderNeedRefresh'); return; diff --git a/src/services/service-manager.js b/src/services/service-manager.js index c152e12..747518f 100644 --- a/src/services/service-manager.js +++ b/src/services/service-manager.js @@ -13,6 +13,7 @@ import { getFileName, formatSystemPath } from '../utils/provider-utils.js'; +import { MODEL_PROVIDER } from '../utils/common.js'; // Holds the ProviderPoolManager instance let providerPoolManager = null; @@ -351,6 +352,35 @@ export async function initApiService(config, isReady = false) { return serviceInstances; // Return the collection of initialized service instances } +/** + * [Routing resolution layer] Pre-processes model prefixes and AUTO-mode translation + * @private + * @returns {Promise} { effectiveProvider, actualModelName } + */ +async function _resolveEffectiveRouting(config, requestedModel) { + let effectiveProvider = config.MODEL_PROVIDER; + let actualModelName = requestedModel; + + // 1.
Handle an explicit prefix (supported whether or not AUTO mode is active) + if (requestedModel && requestedModel.includes(':')) { + const [prefix, ...modelParts] = requestedModel.split(':'); + const modelSuffix = modelParts.join(':'); + // Check whether the prefix is a valid provider identifier + if (providerPoolManager && (providerPoolManager.providerStatus[prefix] || config.providerPools?.[prefix])) { + effectiveProvider = prefix; + actualModelName = modelSuffix; + logger.info(`[Routing] Prefix resolved: ${prefix}:${modelSuffix}`); + } + } + + // 2. Strictness check: in AUTO mode, if no concrete provider has been resolved by this point, throw (unless this is the model-listing case) + if (effectiveProvider === MODEL_PROVIDER.AUTO && requestedModel) { + throw new Error(`[API Service] Auto-routing failed: Model name must include a provider prefix (e.g., 'provider:model'). Received: '${requestedModel}'`); + } + + return { effectiveProvider, actualModelName }; +} + /** * Get API service adapter, considering provider pools * @param {Object} config - The current request configuration @@ -360,11 +390,18 @@ export async function initApiService(config, isReady = false) { * @returns {Promise} The API service adapter */ export async function getApiService(config, requestedModel = null, options = {}) { + // 1.
Resolve routing up front + const { effectiveProvider, actualModelName } = await _resolveEffectiveRouting(config, requestedModel); + config.MODEL_PROVIDER = effectiveProvider; + + // Special case for model listing: AUTO mode with no model name + if (effectiveProvider === MODEL_PROVIDER.AUTO && !actualModelName) return null; + let serviceConfig = config; if (providerPoolManager && config.providerPools && config.providerPools[config.MODEL_PROVIDER]) { // If a pool manager exists and the current provider type has a pool, select a provider config from that pool // selectProvider is now async and uses a chained lock for concurrency safety - const selectedProviderConfig = await providerPoolManager.selectProvider(config.MODEL_PROVIDER, requestedModel, { skipUsageCount: true }); + const selectedProviderConfig = await providerPoolManager.selectProvider(config.MODEL_PROVIDER, actualModelName, { ...options, skipUsageCount: true }); if (selectedProviderConfig) { // Merge the selected provider config into the current request's config serviceConfig = deepmerge(config, selectedProviderConfig); @@ -372,12 +409,15 @@ export async function getApiService(config, requestedModel = null, options = {}) config.uuid = serviceConfig.uuid; config.customName = serviceConfig.customName; const customNameDisplay = serviceConfig.customName ? ` (${serviceConfig.customName})` : ''; - logger.info(`[API Service] Using pooled configuration for ${config.MODEL_PROVIDER}: ${serviceConfig.uuid}${customNameDisplay}${requestedModel ? ` (model: ${requestedModel})` : ''}`); + logger.info(`[API Service] Using pooled configuration for ${config.MODEL_PROVIDER}: ${serviceConfig.uuid}${customNameDisplay}${actualModelName ? ` (model: ${actualModelName})` : ''}`); } else { - const errorMsg = `[API Service] No healthy provider found in pool for ${config.MODEL_PROVIDER}${requestedModel ? ` supporting model: ${requestedModel}` : ''}`; + const errorMsg = `[API Service] No healthy provider found in pool for ${config.MODEL_PROVIDER}${actualModelName ?
` supporting model: ${actualModelName}` : ''}`; logger.error(errorMsg); throw new Error(errorMsg); } + } else if (effectiveProvider === MODEL_PROVIDER.AUTO && actualModelName) { + // If AUTO mode still has not resolved a concrete provider, throw + throw new Error(`[API Service] Auto-routing failed: Model name must include a provider prefix (e.g., 'provider:model'). Received: '${actualModelName}'`); } return getServiceAdapter(serviceConfig); } @@ -390,11 +430,20 @@ export async function getApiService(config, requestedModel = null, options = {}) * @returns {Promise} Object containing service adapter and metadata */ export async function getApiServiceWithFallback(config, requestedModel = null, options = {}) { + // 1. Resolve routing up front + const { effectiveProvider, actualModelName } = await _resolveEffectiveRouting(config, requestedModel); + config.MODEL_PROVIDER = effectiveProvider; + + // Special case for model listing: AUTO mode with no model name + if (effectiveProvider === MODEL_PROVIDER.AUTO && !actualModelName) { + return { service: null, serviceConfig: config, actualProviderType: effectiveProvider, isFallback: false, uuid: null, actualModel: null }; + } + let serviceConfig = config; let actualProviderType = config.MODEL_PROVIDER; let isFallback = false; let selectedUuid = null; - let actualModel = null; + let actualModel = actualModelName; if (providerPoolManager && config.providerPools && config.providerPools[config.MODEL_PROVIDER]) { // selectProviderWithFallback is now async and uses a chained lock for concurrency safety @@ -406,13 +455,13 @@ export async function getApiServiceWithFallback(config, requestedModel = null, o // We need an acquireSlot that supports fallback selectedResult = await providerPoolManager.acquireSlotWithFallback( config.MODEL_PROVIDER, - requestedModel, + actualModelName, options ); } else { selectedResult = await providerPoolManager.selectProviderWithFallback( config.MODEL_PROVIDER, - requestedModel, + actualModelName, { ...options, skipUsageCount: true } ); } @@ -427,17 +476,20 @@ export async function getApiServiceWithFallback(config, requestedModel = null, o
actualProviderType = selectedType; isFallback = fallbackUsed; selectedUuid = selectedProviderConfig.uuid; - actualModel = fallbackModel; + actualModel = fallbackModel || actualModelName; // If a fallback occurred, MODEL_PROVIDER must be updated if (isFallback) { serviceConfig.MODEL_PROVIDER = actualProviderType; } } else { - const errorMsg = `[API Service] No healthy provider found in pool (including fallback) for ${config.MODEL_PROVIDER}${requestedModel ? ` supporting model: ${requestedModel}` : ''}`; + const errorMsg = `[API Service] No healthy provider found in pool (including fallback) for ${config.MODEL_PROVIDER}${actualModelName ? ` supporting model: ${actualModelName}` : ''}`; logger.error(errorMsg); throw new Error(errorMsg); } + } else if (effectiveProvider === MODEL_PROVIDER.AUTO && actualModelName) { + // If AUTO mode still has not resolved a concrete provider, throw + throw new Error(`[API Service] Auto-routing failed: Model name must include a provider prefix (e.g., 'provider:model'). Received: '${actualModelName}'`); } const service = getServiceAdapter(serviceConfig); diff --git a/src/utils/common.js b/src/utils/common.js index c9e3700..8ec8fed 100644 --- a/src/utils/common.js +++ b/src/utils/common.js @@ -55,7 +55,6 @@ export const MODEL_PROTOCOL_PREFIX = { OPENAI: 'openai', OPENAI_RESPONSES: 'openaiResponses', CLAUDE: 'claude', - OLLAMA: 'ollama', CODEX: 'codex', FORWARD: 'forward', GROK: 'grok', @@ -74,6 +73,7 @@ export const MODEL_PROVIDER = { CODEX_API: 'openai-codex-oauth', FORWARD_API: 'forward-api', GROK_CUSTOM: 'grok-custom', + AUTO: 'auto', } /** @@ -795,50 +795,64 @@ export async function handleUnaryRequest(res, service, model, requestBody, fromP * and sends the JSON response. * @param {http.IncomingMessage} req The HTTP request object. * @param {http.ServerResponse} res The HTTP response object. + * @param {Object} service - The API service instance. * @param {string} endpointType The type of endpoint being called (e.g., OPENAI_MODEL_LIST).
* @param {Object} CONFIG - The server configuration object. + * @param {Object} providerPoolManager - The provider pool manager instance. + * @param {string} pooluuid - The selected provider UUID. */ export async function handleModelListRequest(req, res, service, endpointType, CONFIG, providerPoolManager, pooluuid) { - try{ + try { const clientProviderMap = { [ENDPOINT_TYPE.OPENAI_MODEL_LIST]: MODEL_PROTOCOL_PREFIX.OPENAI, [ENDPOINT_TYPE.GEMINI_MODEL_LIST]: MODEL_PROTOCOL_PREFIX.GEMINI, }; - const fromProvider = clientProviderMap[endpointType]; - const toProvider = CONFIG.MODEL_PROVIDER; - + if (!fromProvider) { throw new Error(`Unsupported endpoint type for model list: ${endpointType}`); } - // 1. Get the model list in the backend's native format. - const nativeModelList = await service.listModels(); - - // 2. Convert the model list to the client's expected format, if necessary. - let clientModelList = nativeModelList; - if (!getProtocolPrefix(toProvider).includes(getProtocolPrefix(fromProvider))) { - logger.info(`[ModelList Convert] Converting model list from ${toProvider} to ${fromProvider}`); - clientModelList = convertData(nativeModelList, 'modelList', toProvider, fromProvider); + let clientModelList; + + // --- Core logic: model aggregation in 'auto' routing mode --- + if (CONFIG.MODEL_PROVIDER === MODEL_PROVIDER.AUTO && providerPoolManager) { + logger.info(`[ModelList] Aggregating models for 'auto' mode...`); + clientModelList = await providerPoolManager.getAllAvailableModels(endpointType); } else { - logger.info(`[ModelList Convert] Model list format matches. No conversion needed.`); + // --- Original single-provider logic --- + const toProvider = CONFIG.MODEL_PROVIDER; + + // 1. Get the model list in the backend's native format. + const nativeModelList = await service.listModels(); + + // 2. Convert the model list to the client's expected format, if necessary.
+ clientModelList = nativeModelList; + if (!getProtocolPrefix(toProvider).includes(getProtocolPrefix(fromProvider))) { + logger.info(`[ModelList Convert] Converting model list from ${toProvider} to ${fromProvider}`); + clientModelList = convertData(nativeModelList, 'modelList', toProvider, fromProvider); + } else { + logger.info(`[ModelList Convert] Model list format matches. No conversion needed.`); + } } - logger.info(`[ModelList Response] Sending model list to client: ${JSON.stringify(clientModelList)}`); + // logger.info(`[ModelList Response] Sending model list to client: ${JSON.stringify(clientModelList)}`); res.writeHead(200, { 'Content-Type': 'application/json' }); res.end(JSON.stringify(clientModelList)); } catch (error) { logger.error('\n[Server] Error during model list processing:', error.stack); - if (providerPoolManager) { - // In pool mode, if request handling fails, mark the provider currently in use as unhealthy - providerPoolManager.markProviderUnhealthy(toProvider, { - uuid: pooluuid - }, error.message); - } + // if (providerPoolManager && pooluuid && CONFIG.MODEL_PROVIDER !== MODEL_PROVIDER.AUTO) { + // // In pool mode (and not auto mode), if request handling fails, mark the provider currently in use as unhealthy + // providerPoolManager.markProviderUnhealthy(CONFIG.MODEL_PROVIDER, { + // uuid: pooluuid + // }, error.message); + // } + handleError(res, error, CONFIG.MODEL_PROVIDER); } } + /** * Handles requests for content generation (both unary and streaming). This function * orchestrates request body parsing, conversion to the internal Gemini format, @@ -884,7 +898,7 @@ export async function handleContentGenerationRequest(req, res, service, endpoint // 2.5.
If a provider pool is in use, re-select the provider based on the model (fallback supported) // Note: acquireSlot: true is enabled here, which consumes a concurrency slot or enters the queue - if (providerPoolManager && CONFIG.providerPools && CONFIG.providerPools[CONFIG.MODEL_PROVIDER]) { + if (providerPoolManager && (CONFIG.MODEL_PROVIDER === MODEL_PROVIDER.AUTO || (CONFIG.providerPools && CONFIG.providerPools[CONFIG.MODEL_PROVIDER]))) { const { getApiServiceWithFallback } = await import('../services/service-manager.js'); const result = await getApiServiceWithFallback(CONFIG, model, { acquireSlot: true }); @@ -1212,35 +1226,6 @@ function _getProviderSpecificSuggestions(statusCode, provider) { ] }; - case MODEL_PROTOCOL_PREFIX.OLLAMA: - return { - auth: [ - 'Ollama typically does not require authentication', - 'If using a custom setup, verify your credentials', - 'Check if the Ollama server requires authentication' - ], - permission: [ - 'Verify the Ollama server is accessible', - 'Check if the requested model is available locally', - 'Ensure the Ollama server allows the requested operation' - ], - rateLimit: [ - 'The local Ollama server may be overloaded', - 'Try reducing concurrent requests', - 'Consider increasing server resources if running locally' - ], - serverError: [ - 'Check if the Ollama server is running', - 'Verify the server address and port are correct', - 'Check Ollama server logs for detailed error information' - ], - clientError: [ - 'Check your request format and parameters', - 'Verify the model name is available in your Ollama installation', - 'Try pulling the model first with: ollama pull ' - ] - }; - default: return defaultSuggestions; } diff --git a/src/utils/token-utils.js b/src/utils/token-utils.js new file mode 100644 index 0000000..59049fd --- /dev/null +++ b/src/utils/token-utils.js @@ -0,0 +1,170 @@ +import { countTokens } from '@anthropic-ai/tokenizer'; +import logger from './logger.js'; + +/** + * Extract text content from message format + */ +export function getContentText(message) { + if (message == null) { + return ""; + } + if
(Array.isArray(message)) { + return message.map(part => { + if (typeof part === 'string') return part; + if (part && typeof part === 'object') { + if (part.type === 'text' && part.text) return part.text; + if (part.text) return part.text; + } + return ''; + }).join(''); + } else if (typeof message.content === 'string') { + return message.content; + } else if (Array.isArray(message.content)) { + return message.content.map(part => { + if (typeof part === 'string') return part; + if (part && typeof part === 'object') { + if (part.type === 'text' && part.text) return part.text; + if (part.text) return part.text; + } + return ''; + }).join(''); + } + return String(message.content || message); +} + +/** + * Process content blocks into text + * @param {any} content - content object or array + * @returns {string} processed text + */ +export function processContent(content) { + if (!content) return ""; + if (typeof content === 'string') return content; + if (Array.isArray(content)) { + return content.map(part => { + if (typeof part === 'string') return part; + if (part && typeof part === 'object') { + if (part.type === 'text') return part.text || ""; + if (part.type === 'thinking') return part.thinking || part.text || ""; + if (part.type === 'tool_result') return processContent(part.content); + if (part.type === 'tool_use' && part.input) return JSON.stringify(part.input); + if (part.text) return part.text; + } + return ""; + }).join(""); + } + return getContentText(content); +} + +/** + * Count tokens for a given text using Claude's official tokenizer + */ +export function countTextTokens(text) { + if (!text) return 0; + try { + return countTokens(text); + } catch (error) { + // Fallback to estimation if tokenizer fails + logger.warn('[TokenUtils] Tokenizer error, falling back to estimation:', error.message); + return Math.ceil((text || '').length / 4); + } +} + +/** + * Calculate input tokens from request body using Claude's official tokenizer + */ +export function 
estimateInputTokens(requestBody) { + let allText = ""; + + // Count system prompt tokens + if (requestBody.system) { + allText += processContent(requestBody.system); + } + + // Count thinking prefix tokens if thinking is enabled + if (requestBody.thinking?.type && typeof requestBody.thinking.type === 'string') { + const t = requestBody.thinking.type.toLowerCase().trim(); + if (t === 'enabled') { + const budgetTokens = requestBody.thinking.budget_tokens; + let budget = Number(budgetTokens); + if (!Number.isFinite(budget) || budget <= 0) { + budget = 20000; + } + budget = Math.floor(budget); + if (budget < 1024) budget = 1024; + budget = Math.min(budget, 24576); + allText += `enabled${budget}`; + } +else if (t === 'adaptive') { + const effortRaw = typeof requestBody.thinking.effort === 'string' ? requestBody.thinking.effort : ''; + const effort = effortRaw.toLowerCase().trim(); + const normalizedEffort = (effort === 'low' || effort === 'medium' || effort === 'high') ? effort : 'high'; + allText += `adaptive${normalizedEffort}`; + } + } + + // Count all messages tokens + if (requestBody.messages && Array.isArray(requestBody.messages)) { + for (const message of requestBody.messages) { + if (message.content) { + allText += processContent(message.content); + } + } + } + + // Count tools definitions tokens if present + if (requestBody.tools && Array.isArray(requestBody.tools)) { + allText += JSON.stringify(requestBody.tools); + } + + return countTextTokens(allText); +} + +/** + * Count tokens for a message request (compatible with Anthropic API) + * @param {Object} requestBody - The request body containing model, messages, system, tools, etc. 
+ * @returns {Object} { input_tokens: number } + */ +export function countTokensAnthropic(requestBody) { + let allText = ""; + let extraTokens = 0; + + // Count system prompt tokens + if (requestBody.system) { + allText += processContent(requestBody.system); + } + + // Count all messages tokens + if (requestBody.messages && Array.isArray(requestBody.messages)) { + for (const message of requestBody.messages) { + if (message.content) { + if (Array.isArray(message.content)) { + for (const block of message.content) { + if (block.type === 'image') { + // Images have a fixed token cost (approximately 1600 tokens for a typical image) + extraTokens += 1600; + } else if (block.type === 'document') { + // Documents - estimate based on content if available + if (block.source?.data) { + // For base64 encoded documents, estimate tokens + const estimatedChars = block.source.data.length * 0.75; // base64 to bytes ratio + extraTokens += Math.ceil(estimatedChars / 4); + } + } else { + allText += processContent([block]); + } + } + } else { + allText += processContent(message.content); + } + } + } + } + + // Count tools definitions tokens if present + if (requestBody.tools && Array.isArray(requestBody.tools)) { + allText += JSON.stringify(requestBody.tools); + } + + return { input_tokens: countTextTokens(allText) + extraTokens }; +} diff --git a/static/app/i18n.js b/static/app/i18n.js index 3b5924c..c8817ac 100644 --- a/static/app/i18n.js +++ b/static/app/i18n.js @@ -500,7 +500,7 @@ const translations = { 'modal.provider.field.codexBaseUrl': 'Codex Base URL', 'modal.provider.field.apiKey': 'API 密钥', 'modal.provider.field.apiKey.placeholder': '请输入 API 密钥', - 'modal.provider.field.projectId.placeholder': 'Google Cloud 项目 ID', + 'modal.provider.field.projectId.placeholder': 'Google Cloud 项目 ID (留空自动发现)', 'modal.provider.field.projectId.optional.placeholder': 'Google Cloud 项目 ID (留空自动发现)', 'modal.provider.field.oauthPath.gemini.placeholder': '例如: ~/.gemini/oauth_creds.json', 
'modal.provider.field.oauthPath.kiro.placeholder': '例如: ~/.aws/sso/cache/kiro-auth-token.json', @@ -659,10 +659,6 @@ const translations = { 'guide.client.cline.step3': '设置 API Base URL 为: http://localhost:3000/{provider}/v1', 'guide.client.cline.step4': '填入 API Key 和模型名称', 'guide.client.note': '提示:将 {provider} 替换为实际的提供商路径,如 gemini-cli-oauth、claude-kiro-oauth 等。可在仪表盘的路由示例中查看完整路径。', - 'guide.ollama.title': 'Ollama 协议使用', - 'guide.ollama.desc': '本项目支持 Ollama 协议,可以通过统一接口访问所有支持的模型。', - 'guide.ollama.listModels': '列出所有可用模型', - 'guide.ollama.chat': '聊天接口', 'guide.faq.title': '常见问题', 'guide.faq.q1': 'Q: 请求返回 404 错误怎么办?', 'guide.faq.a1': 'A: 检查接口路径是否正确。某些客户端会自动在 Base URL 后追加路径,导致路径重复。请查看控制台中的实际请求 URL,移除多余的路径部分。', @@ -1314,7 +1310,7 @@ const translations = { 'modal.provider.field.codexBaseUrl': 'Codex Base URL', 'modal.provider.field.apiKey': 'API Key', 'modal.provider.field.apiKey.placeholder': 'Please enter API Key', - 'modal.provider.field.projectId.placeholder': 'Google Cloud Project ID', + 'modal.provider.field.projectId.placeholder': 'Google Cloud Project ID (Leave blank for discovery)', 'modal.provider.field.projectId.optional.placeholder': 'Google Cloud Project ID (Leave blank for discovery)', 'modal.provider.field.oauthPath.gemini.placeholder': 'e.g.: ~/.gemini/oauth_creds.json', 'modal.provider.field.oauthPath.kiro.placeholder': 'e.g.: ~/.aws/sso/cache/kiro-auth-token.json', @@ -1473,10 +1469,6 @@ const translations = { 'guide.client.cline.step3': 'Set API Base URL to: http://localhost:3000/{provider}/v1', 'guide.client.cline.step4': 'Enter API Key and model name', 'guide.client.note': 'Tip: Replace {provider} with the actual provider path, such as gemini-cli-oauth, claude-kiro-oauth, etc. 
See the routing examples on the dashboard for full paths.', - 'guide.ollama.title': 'Ollama Protocol Usage', - 'guide.ollama.desc': 'This project supports Ollama protocol, allowing unified access to all supported models.', - 'guide.ollama.listModels': 'List all available models', - 'guide.ollama.chat': 'Chat interface', 'guide.faq.title': 'FAQ', 'guide.faq.q1': 'Q: What to do if request returns 404 error?', 'guide.faq.a1': 'A: Check if the API path is correct. Some clients automatically append paths to Base URL, causing duplication. Check the actual request URL in the console and remove redundant path parts.', diff --git a/static/app/utils.js b/static/app/utils.js index 7f1567a..f796c45 100644 --- a/static/app/utils.js +++ b/static/app/utils.js @@ -375,13 +375,13 @@ function getProviderTypeFields(providerType) { }, { id: 'GROK_CF_CLEARANCE', - label: t('modal.provider.field.cfClearance'), + label: `${t('modal.provider.field.cfClearance')} ${t('config.optional')}`, type: 'text', placeholder: 'cf_clearance cookie value' }, { id: 'GROK_USER_AGENT', - label: t('modal.provider.field.userAgent'), + label: `${t('modal.provider.field.userAgent')} ${t('config.optional')}`, type: 'text', placeholder: 'Mozilla/5.0 ...' }, diff --git a/static/components/section-guide.html b/static/components/section-guide.html index 0ebfdf7..5dfd24c 100644 --- a/static/components/section-guide.html +++ b/static/components/section-guide.html @@ -166,31 +166,6 @@ - -
-Ollama 协议使用
-本项目支持 Ollama 协议,可以通过统一接口访问所有支持的模型。
-列出所有可用模型
-curl http://localhost:3000/ollama/api/tags \
-  -H "Authorization: Bearer YOUR_API_KEY"
-聊天接口
-curl http://localhost:3000/ollama/api/chat \
-  -H "Content-Type: application/json" \
-  -H "Authorization: Bearer YOUR_API_KEY" \
-  -d '{
-    "model": "[Claude] claude-sonnet-4.5",
-    "messages": [{"role": "user", "content": "你好"}]
-  }'
常见问题