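refactor: remove Ollama protocol support and restructure model routing
- Delete the Ollama protocol code, including the handler, converter, documentation, and constants
- Restructure model list retrieval to support multi-provider aggregation in auto mode
- Add a token calculation utility that unifies token counting across providers
- Improve model-prefix route parsing to make auto mode more robust
- Update the multilingual documentation to remove Ollama-related content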
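The refactored routing code is not visible in this part of the diff, so the following is only a minimal sketch of what prefix parsing and auto-mode aggregation could look like. It assumes the bracketed `[Provider] model` form shown in the removed README examples. The names `parseModelPrefix` and `aggregateModels` and the shape of the `providers` argument are hypothetical and do not come from the repository.

```javascript
// Hypothetical helper: split "[Provider] model-name" into its parts.
// provider is null when no prefix is present, so a caller can fall back to auto mode.
function parseModelPrefix(rawModel) {
  const name = String(rawModel ?? '').trim();
  const match = name.match(/^\[([^\]]+)\]\s*(.+)$/);
  if (!match) {
    return { provider: null, model: name };
  }
  return { provider: match[1].trim(), model: match[2].trim() };
}

// Hypothetical auto-mode aggregation: merge per-provider model lists and
// skip providers whose listing call fails instead of failing the whole request.
// `providers` maps a display label to an async function returning model ids.
async function aggregateModels(providers) {
  const results = await Promise.allSettled(
    Object.entries(providers).map(async ([label, listModels]) => {
      const models = await listModels();
      return models.map((id) => `[${label}] ${id}`);
    })
  );
  return results
    .filter((r) => r.status === 'fulfilled')
    .flatMap((r) => r.value);
}
```

`Promise.allSettled` is used here so that one unavailable provider does not hide the models of the others, which is one plausible reading of "more robust" in the commit message.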
parent
05df61df74
commit
8947f93471
17 changed files with 437 additions and 1970 deletions
43
README-JA.md
|
|
@ -43,7 +43,6 @@
|
||||||
> - **2025.12.25** - 設定ファイル統一管理:すべての設定を `configs/` ディレクトリに集約。Dockerユーザーはマウントパスを `-v "ローカルパス:/app/configs"` に更新が必要
|
> - **2025.12.25** - 設定ファイル統一管理:すべての設定を `configs/` ディレクトリに集約。Dockerユーザーはマウントパスを `-v "ローカルパス:/app/configs"` に更新が必要
|
||||||
> - **2025.12.11** - Dockerイメージが自動的にビルドされ、Docker Hubで公開されました: [justlikemaki/aiclient-2-api](https://hub.docker.com/r/justlikemaki/aiclient-2-api)
|
> - **2025.12.11** - Dockerイメージが自動的にビルドされ、Docker Hubで公開されました: [justlikemaki/aiclient-2-api](https://hub.docker.com/r/justlikemaki/aiclient-2-api)
|
||||||
> - **2025.11.30** - Antigravityプロトコルサポートの追加、Google内部インターフェース経由でGemini 3 Pro、Claude Sonnet 4.5などのモデルへのアクセスをサポート
|
> - **2025.11.30** - Antigravityプロトコルサポートの追加、Google内部インターフェース経由でGemini 3 Pro、Claude Sonnet 4.5などのモデルへのアクセスをサポート
|
||||||
> - **2025.11.16** - Ollamaプロトコルサポートの追加、統一インターフェースでサポートされるすべてのモデルにアクセス
|
|
||||||
> - **2025.11.11** - Web UI管理コントロールコンソールの追加、リアルタイム設定管理と健康状態モニタリングをサポート
|
> - **2025.11.11** - Web UI管理コントロールコンソールの追加、リアルタイム設定管理と健康状態モニタリングをサポート
|
||||||
> - **2025.11.06** - Gemini 3 プレビュー版のサポートを追加、モデル互換性とパフォーマンス最適化を向上
|
> - **2025.11.06** - Gemini 3 プレビュー版のサポートを追加、モデル互換性とパフォーマンス最適化を向上
|
||||||
> - **2025.10.18** - Kiroオープン登録、新規アカウントに500クレジット付与、Claude Sonnet 4.5を完全サポート
|
> - **2025.10.18** - Kiroオープン登録、新規アカウントに500クレジット付与、Claude Sonnet 4.5を完全サポート
|
||||||
|
|
@ -93,7 +92,6 @@
|
||||||
- [📋 コア機能](#-コア機能)
|
- [📋 コア機能](#-コア機能)
|
||||||
- [🔐 認証設定ガイド](#-認証設定ガイド)
|
- [🔐 認証設定ガイド](#-認証設定ガイド)
|
||||||
- [📁 認証ファイル保存パス](#-認証ファイル保存パス)
|
- [📁 認証ファイル保存パス](#-認証ファイル保存パス)
|
||||||
- [🦙 Ollamaプロトコル使用例](#-ollamaプロトコル使用例)
|
|
||||||
- [⚙️ 高度な設定](#高度な設定)
|
- [⚙️ 高度な設定](#高度な設定)
|
||||||
- [❓ よくある質問](#-よくある質問)
|
- [❓ よくある質問](#-よくある質問)
|
||||||
- [📄 オープンソースライセンス](#-オープンソースライセンス)
|
- [📄 オープンソースライセンス](#-オープンソースライセンス)
|
||||||
|
|
@ -347,41 +345,6 @@ curl http://localhost:3000/claude-kiro-oauth/v1/chat/completions \
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### 🦙 Ollamaプロトコル使用例
|
|
||||||
|
|
||||||
本プロジェクトはOllamaプロトコルをサポートしており、統一インターフェースを通じてすべてのサポートモデルにアクセスできます。Ollamaエンドポイントは`/api/tags`、`/api/chat`、`/api/generate`などの標準インターフェースを提供します。
|
|
||||||
|
|
||||||
**Ollama API呼び出し例**:
|
|
||||||
|
|
||||||
1. **利用可能なすべてのモデルをリスト表示**:
|
|
||||||
```bash
|
|
||||||
curl http://localhost:3000/ollama/api/tags \
|
|
||||||
-H "Authorization: Bearer your-api-key"
|
|
||||||
```
|
|
||||||
|
|
||||||
2. **チャットインターフェース**:
|
|
||||||
```bash
|
|
||||||
curl http://localhost:3000/ollama/api/chat \
|
|
||||||
-H "Content-Type: application/json" \
|
|
||||||
-H "Authorization: Bearer your-api-key" \
|
|
||||||
-d '{
|
|
||||||
"model": "[Claude] claude-sonnet-4.5",
|
|
||||||
"messages": [
|
|
||||||
{"role": "user", "content": "こんにちは"}
|
|
||||||
]
|
|
||||||
}'
|
|
||||||
```
|
|
||||||
|
|
||||||
3. **モデルプレフィックスを使用してプロバイダーを指定**:
|
|
||||||
- `[Kiro]` - Kiro APIを使用してClaudeモデルにアクセス
|
|
||||||
- `[Claude]` - 公式Claude APIを使用
|
|
||||||
- `[Gemini CLI]` - Gemini CLI OAuth経由でアクセス
|
|
||||||
- `[OpenAI]` - 公式OpenAI APIを使用
|
|
||||||
- `[Grok]` - Grok Cookie/SSO経由でアクセス
|
|
||||||
- `[Qwen CLI]` - Qwen OAuth経由でアクセス
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### 高度な設定
|
### 高度な設定
|
||||||
|
|
||||||
<details>
|
<details>
|
||||||
|
|
@ -672,11 +635,9 @@ kill -9 <PID>
|
||||||
|
|
||||||
### 10. APIが404を返す
|
### 10. APIが404を返す
|
||||||
|
|
||||||
**問題の説明**:APIエンドポイントを呼び出すと404 Not Foundエラーが返されます。
|
|
||||||
|
|
||||||
**解決策**:
|
**解決策**:
|
||||||
- **エンドポイントパスを確認**:`/v1/chat/completions`、`/ollama/api/chat` などの正しいエンドポイントパスを使用していることを確認
|
- **エンドポイントパスを確認**:`/v1/chat/completions` などの正しいエンドポイントパスを使用していることを確認
|
||||||
- **クライアントの自動補完を確認**:一部のクライアント(Cherry-Studio、NextChatなど)はBase URLの後にパス(`/v1/chat/completions` など)を自動的に追加し、パスの重複を引き起こします。コンソールで実際のリクエストURLを確認し、冗長なパス部分を削除してください
|
- **クライアントの自動補完を確認**:一部のクライアント(Cherry-Studio、NextChatなど)はBase URLの後にパス(`/v1/chat/completions` など)を自動的に追加し、パスの重複を引き起こします。コンソールで実際のリクエストURLを確認し、冗长なパス部分を削除してください
|
||||||
- **サービス状態を確認**:サービスが正常に起動していることを確認、`http://localhost:3000` にアクセスしてWeb UIを確認
|
- **サービス状態を確認**:サービスが正常に起動していることを確認、`http://localhost:3000` にアクセスしてWeb UIを確認
|
||||||
- **ポート設定を確認**:リクエストが正しいポート(デフォルト3000)に送信されていることを確認
|
- **ポート設定を確認**:リクエストが正しいポート(デフォルト3000)に送信されていることを確認
|
||||||
- **利用可能なルートを確認**:Web UIダッシュボードページの「インタラクティブルーティング例」ですべての利用可能なエンドポイントを確認
|
- **利用可能なルートを確認**:Web UIダッシュボードページの「インタラクティブルーティング例」ですべての利用可能なエンドポイントを確認
|
||||||
|
|
|
||||||
39
README-ZH.md
|
|
@ -43,7 +43,6 @@
|
||||||
> - **2025.12.25** - 配置文件统一管理:所有配置集中到 `configs/` 目录,Docker 用户需更新挂载路径为 `-v "本地路径:/app/configs"`
|
> - **2025.12.25** - 配置文件统一管理:所有配置集中到 `configs/` 目录,Docker 用户需更新挂载路径为 `-v "本地路径:/app/configs"`
|
||||||
> - **2025.12.11** - Docker 镜像自动构建并发布到 Docker Hub: [justlikemaki/aiclient-2-api](https://hub.docker.com/r/justlikemaki/aiclient-2-api)
|
> - **2025.12.11** - Docker 镜像自动构建并发布到 Docker Hub: [justlikemaki/aiclient-2-api](https://hub.docker.com/r/justlikemaki/aiclient-2-api)
|
||||||
> - **2025.11.30** - 新增 Antigravity 协议支持,支持通过 Google 内部接口访问 Gemini 3 Pro、Claude Sonnet 4.5 等模型
|
> - **2025.11.30** - 新增 Antigravity 协议支持,支持通过 Google 内部接口访问 Gemini 3 Pro、Claude Sonnet 4.5 等模型
|
||||||
> - **2025.11.16** - 新增 Ollama 协议支持,统一接口访问所有支持的模型(Claude、Gemini、Qwen、OpenAI等)
|
|
||||||
> - **2025.11.11** - 新增 Web UI 管理控制台,支持实时配置管理和健康状态监控
|
> - **2025.11.11** - 新增 Web UI 管理控制台,支持实时配置管理和健康状态监控
|
||||||
> - **2025.11.06** - 新增对 Gemini 3 预览版的支持,增强模型兼容性和性能优化
|
> - **2025.11.06** - 新增对 Gemini 3 预览版的支持,增强模型兼容性和性能优化
|
||||||
> - **2025.10.18** - Kiro 开放注册,新用户赠送 500 额度,已完整支持 Claude Sonnet 4.5
|
> - **2025.10.18** - Kiro 开放注册,新用户赠送 500 额度,已完整支持 Claude Sonnet 4.5
|
||||||
|
|
@ -92,7 +91,6 @@
|
||||||
- [📋 核心功能](#-核心功能)
|
- [📋 核心功能](#-核心功能)
|
||||||
- [🔐 授权配置指南](#-授权配置指南)
|
- [🔐 授权配置指南](#-授权配置指南)
|
||||||
- [📁 授权文件存储路径](#-授权文件存储路径)
|
- [📁 授权文件存储路径](#-授权文件存储路径)
|
||||||
- [🦙 Ollama 协议使用示例](#-ollama-协议使用示例)
|
|
||||||
- [⚙️ 高级配置](#高级配置)
|
- [⚙️ 高级配置](#高级配置)
|
||||||
- [❓ 常见问题](#-常见问题)
|
- [❓ 常见问题](#-常见问题)
|
||||||
- [📄 开源许可](#-开源许可)
|
- [📄 开源许可](#-开源许可)
|
||||||
|
|
@ -346,41 +344,6 @@ curl http://localhost:3000/claude-kiro-oauth/v1/chat/completions \
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### 🦙 Ollama 协议使用示例
|
|
||||||
|
|
||||||
本项目支持 Ollama 协议,可以通过统一接口访问所有支持的模型。Ollama 端点提供 `/api/tags`、`/api/chat`、`/api/generate` 等标准接口。
|
|
||||||
|
|
||||||
**Ollama API 调用示例**:
|
|
||||||
|
|
||||||
1. **列出所有可用模型**:
|
|
||||||
```bash
|
|
||||||
curl http://localhost:3000/ollama/api/tags \
|
|
||||||
-H "Authorization: Bearer your-api-key"
|
|
||||||
```
|
|
||||||
|
|
||||||
2. **聊天接口**:
|
|
||||||
```bash
|
|
||||||
curl http://localhost:3000/ollama/api/chat \
|
|
||||||
-H "Content-Type: application/json" \
|
|
||||||
-H "Authorization: Bearer your-api-key" \
|
|
||||||
-d '{
|
|
||||||
"model": "[Claude] claude-sonnet-4.5",
|
|
||||||
"messages": [
|
|
||||||
{"role": "user", "content": "你好"}
|
|
||||||
]
|
|
||||||
}'
|
|
||||||
```
|
|
||||||
|
|
||||||
3. **使用模型前缀指定提供商**:
|
|
||||||
- `[Kiro]` - 使用 Kiro API 访问 Claude 模型
|
|
||||||
- `[Claude]` - 使用 Claude 官方 API
|
|
||||||
- `[Gemini CLI]` - 通过 Gemini CLI OAuth 访问
|
|
||||||
- `[OpenAI]` - 使用 OpenAI 官方 API
|
|
||||||
- `[Grok]` - 通过 Grok Cookie/SSO 访问
|
|
||||||
- `[Qwen CLI]` - 通过 Qwen OAuth 访问
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### 高级配置
|
### 高级配置
|
||||||
|
|
||||||
<details>
|
<details>
|
||||||
|
|
@ -674,7 +637,7 @@ kill -9 <PID>
|
||||||
**问题描述**:调用 API 接口时返回 404 Not Found 错误。
|
**问题描述**:调用 API 接口时返回 404 Not Found 错误。
|
||||||
|
|
||||||
**解决方案**:
|
**解决方案**:
|
||||||
- **检查接口路径**:确保使用正确的接口路径,如 `/v1/chat/completions`、`/ollama/api/chat` 等
|
- **检查接口路径**:确保使用正确的接口路径,如 `/v1/chat/completions` 等
|
||||||
- **检查客户端自动补全**:某些客户端(如 Cherry-Studio、NextChat)会自动在 Base URL 后追加路径(如 `/v1/chat/completions`),导致路径重复。请查看控制台中的实际请求 URL,移除多余的路径部分
|
- **检查客户端自动补全**:某些客户端(如 Cherry-Studio、NextChat)会自动在 Base URL 后追加路径(如 `/v1/chat/completions`),导致路径重复。请查看控制台中的实际请求 URL,移除多余的路径部分
|
||||||
- **检查服务状态**:确认服务已正常启动,访问 `http://localhost:3000` 查看 Web UI
|
- **检查服务状态**:确认服务已正常启动,访问 `http://localhost:3000` 查看 Web UI
|
||||||
- **检查端口配置**:确保请求发送到正确的端口(默认 3000)
|
- **检查端口配置**:确保请求发送到正确的端口(默认 3000)
|
||||||
|
|
|
||||||
41
README.md
|
|
@ -43,7 +43,6 @@
|
||||||
> - **2025.12.25** - Unified configuration management: All configs centralized to `configs/` directory. Docker users need to update mount path to `-v "local_path:/app/configs"`
|
> - **2025.12.25** - Unified configuration management: All configs centralized to `configs/` directory. Docker users need to update mount path to `-v "local_path:/app/configs"`
|
||||||
> - **2025.12.11** - Automatically built Docker images are now available on Docker Hub: [justlikemaki/aiclient-2-api](https://hub.docker.com/r/justlikemaki/aiclient-2-api)
|
> - **2025.12.11** - Automatically built Docker images are now available on Docker Hub: [justlikemaki/aiclient-2-api](https://hub.docker.com/r/justlikemaki/aiclient-2-api)
|
||||||
> - **2025.11.30** - Added Antigravity protocol support, enabling access to Gemini 3 Pro, Claude Sonnet 4.5, and other models via Google internal interfaces
|
> - **2025.11.30** - Added Antigravity protocol support, enabling access to Gemini 3 Pro, Claude Sonnet 4.5, and other models via Google internal interfaces
|
||||||
> - **2025.11.16** - Added Ollama protocol support, unified interface to access all supported models (Claude, Gemini, Qwen, OpenAI, etc.)
|
|
||||||
> - **2025.11.11** - Added Web UI management console, supporting real-time configuration management and health status monitoring
|
> - **2025.11.11** - Added Web UI management console, supporting real-time configuration management and health status monitoring
|
||||||
> - **2025.11.06** - Added support for Gemini 3 Preview, enhanced model compatibility and performance optimization
|
> - **2025.11.06** - Added support for Gemini 3 Preview, enhanced model compatibility and performance optimization
|
||||||
> - **2025.10.18** - Kiro open registration, new accounts get 500 credits, full support for Claude Sonnet 4.5
|
> - **2025.10.18** - Kiro open registration, new accounts get 500 credits, full support for Claude Sonnet 4.5
|
||||||
|
|
@ -93,7 +92,6 @@
|
||||||
- [📋 Core Features](#-core-features)
|
- [📋 Core Features](#-core-features)
|
||||||
- [🔐 Authorization Configuration Guide](#-authorization-configuration-guide)
|
- [🔐 Authorization Configuration Guide](#-authorization-configuration-guide)
|
||||||
- [📁 Authorization File Storage Paths](#-authorization-file-storage-paths)
|
- [📁 Authorization File Storage Paths](#-authorization-file-storage-paths)
|
||||||
- [🦙 Ollama Protocol Usage Examples](#-ollama-protocol-usage-examples)
|
|
||||||
- [⚙️ Advanced Configuration](#advanced-configuration)
|
- [⚙️ Advanced Configuration](#advanced-configuration)
|
||||||
- [❓ FAQ](#-faq)
|
- [❓ FAQ](#-faq)
|
||||||
- [📄 Open Source License](#-open-source-license)
|
- [📄 Open Source License](#-open-source-license)
|
||||||
|
|
@ -347,41 +345,6 @@ Default storage locations for authorization credential files of each service:
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### 🦙 Ollama Protocol Usage Examples
|
|
||||||
|
|
||||||
This project supports the Ollama protocol, allowing access to all supported models through a unified interface. The Ollama endpoint provides standard interfaces such as `/api/tags`, `/api/chat`, `/api/generate`, etc.
|
|
||||||
|
|
||||||
**Ollama API Call Examples**:
|
|
||||||
|
|
||||||
1. **List all available models**:
|
|
||||||
```bash
|
|
||||||
curl http://localhost:3000/ollama/api/tags \
|
|
||||||
-H "Authorization: Bearer your-api-key"
|
|
||||||
```
|
|
||||||
|
|
||||||
2. **Chat interface**:
|
|
||||||
```bash
|
|
||||||
curl http://localhost:3000/ollama/api/chat \
|
|
||||||
-H "Content-Type: application/json" \
|
|
||||||
-H "Authorization: Bearer your-api-key" \
|
|
||||||
-d '{
|
|
||||||
"model": "[Claude] claude-sonnet-4.5",
|
|
||||||
"messages": [
|
|
||||||
{"role": "user", "content": "Hello"}
|
|
||||||
]
|
|
||||||
}'
|
|
||||||
```
|
|
||||||
|
|
||||||
3. **Specify provider using model prefix**:
|
|
||||||
- `[Kiro]` - Access Claude models using Kiro API
|
|
||||||
- `[Claude]` - Use official Claude API
|
|
||||||
- `[Gemini CLI]` - Access via Gemini CLI OAuth
|
|
||||||
- `[OpenAI]` - Use official OpenAI API
|
|
||||||
- `[Grok]` - Access via Grok Cookie/SSO
|
|
||||||
- `[Qwen CLI]` - Access via Qwen OAuth
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### Advanced Configuration
|
### Advanced Configuration
|
||||||
|
|
||||||
<details>
|
<details>
|
||||||
|
|
@ -672,10 +635,8 @@ Or modify the port configuration in `configs/config.json` to use a different por
|
||||||
|
|
||||||
### 10. API Returns 404
|
### 10. API Returns 404
|
||||||
|
|
||||||
**Problem Description**: When calling API endpoints, it returns 404 Not Found error.
|
|
||||||
|
|
||||||
**Solutions**:
|
**Solutions**:
|
||||||
- **Check Endpoint Path**: Ensure you're using the correct endpoint path, such as `/v1/chat/completions`, `/ollama/api/chat`, etc.
|
- **Check Endpoint Path**: Ensure you're using the correct endpoint path, such as `/v1/chat/completions` etc.
|
||||||
- **Check Client Auto-completion**: Some clients (like Cherry-Studio, NextChat) automatically append paths (like `/v1/chat/completions`) after the Base URL, causing path duplication. Check the actual request URL in the console and remove redundant path parts
|
- **Check Client Auto-completion**: Some clients (like Cherry-Studio, NextChat) automatically append paths (like `/v1/chat/completions`) after the Base URL, causing path duplication. Check the actual request URL in the console and remove redundant path parts
|
||||||
- **Check Service Status**: Confirm the service has started normally, visit `http://localhost:3000` to view Web UI
|
- **Check Service Status**: Confirm the service has started normally, visit `http://localhost:3000` to view Web UI
|
||||||
- **Check Port Configuration**: Ensure requests are sent to the correct port (default 3000)
|
- **Check Port Configuration**: Ensure requests are sent to the correct port (default 3000)
|
||||||
|
|
|
||||||
|
|
@ -16,7 +16,6 @@
|
||||||
* `static/components/section-config.html`:配置按钮。
|
* `static/components/section-config.html`:配置按钮。
|
||||||
* `static/components/section-guide.html`:使用指南。
|
* `static/components/section-guide.html`:使用指南。
|
||||||
* `static/app/routing-examples.js`:路由调用示例。
|
* `static/app/routing-examples.js`:路由调用示例。
|
||||||
* `src/handlers/ollama-handler.js`:Ollama 协议前缀与支持映射。
|
|
||||||
6. **系统级映射(必做)**:在 OAuth 处理器、凭据关联工具、用量统计等模块中建立映射。
|
6. **系统级映射(必做)**:在 OAuth 处理器、凭据关联工具、用量统计等模块中建立映射。
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
@ -134,10 +133,6 @@
|
||||||
* **路由分发**:在 [`src/ui-modules/oauth-api.js`](src/ui-modules/oauth-api.js) 的 `handleGenerateAuthUrl` 中分发到相应的处理器。
|
* **路由分发**:在 [`src/ui-modules/oauth-api.js`](src/ui-modules/oauth-api.js) 的 `handleGenerateAuthUrl` 中分发到相应的处理器。
|
||||||
* **回调处理**:若涉及 HTTP 回调,需在 `src/auth/` 下实现回调服务器逻辑。
|
* **回调处理**:若涉及 HTTP 回调,需在 `src/auth/` 下实现回调服务器逻辑。
|
||||||
|
|
||||||
### 4.5 Ollama 协议映射 ([`src/handlers/ollama-handler.js`](src/handlers/ollama-handler.js))
|
|
||||||
* 在 `MODEL_PREFIX_MAP` 中添加该提供商对应的日志/显示前缀。
|
|
||||||
* 在 `supportedProviders` 数组中添加该提供商标识,以支持 Ollama 协议转换。
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 5. 注意事项
|
## 5. 注意事项
|
||||||
|
|
|
||||||
|
|
@ -9,7 +9,6 @@ import { OpenAIConverter } from './strategies/OpenAIConverter.js';
|
||||||
import { OpenAIResponsesConverter } from './strategies/OpenAIResponsesConverter.js';
|
import { OpenAIResponsesConverter } from './strategies/OpenAIResponsesConverter.js';
|
||||||
import { ClaudeConverter } from './strategies/ClaudeConverter.js';
|
import { ClaudeConverter } from './strategies/ClaudeConverter.js';
|
||||||
import { GeminiConverter } from './strategies/GeminiConverter.js';
|
import { GeminiConverter } from './strategies/GeminiConverter.js';
|
||||||
import { OllamaConverter } from './strategies/OllamaConverter.js';
|
|
||||||
import { CodexConverter } from './strategies/CodexConverter.js';
|
import { CodexConverter } from './strategies/CodexConverter.js';
|
||||||
import { GrokConverter } from './strategies/GrokConverter.js';
|
import { GrokConverter } from './strategies/GrokConverter.js';
|
||||||
|
|
||||||
|
|
@ -22,7 +21,6 @@ export function registerAllConverters() {
|
||||||
ConverterFactory.registerConverter(MODEL_PROTOCOL_PREFIX.OPENAI_RESPONSES, OpenAIResponsesConverter);
|
ConverterFactory.registerConverter(MODEL_PROTOCOL_PREFIX.OPENAI_RESPONSES, OpenAIResponsesConverter);
|
||||||
ConverterFactory.registerConverter(MODEL_PROTOCOL_PREFIX.CLAUDE, ClaudeConverter);
|
ConverterFactory.registerConverter(MODEL_PROTOCOL_PREFIX.CLAUDE, ClaudeConverter);
|
||||||
ConverterFactory.registerConverter(MODEL_PROTOCOL_PREFIX.GEMINI, GeminiConverter);
|
ConverterFactory.registerConverter(MODEL_PROTOCOL_PREFIX.GEMINI, GeminiConverter);
|
||||||
ConverterFactory.registerConverter(MODEL_PROTOCOL_PREFIX.OLLAMA, OllamaConverter);
|
|
||||||
ConverterFactory.registerConverter(MODEL_PROTOCOL_PREFIX.CODEX, CodexConverter);
|
ConverterFactory.registerConverter(MODEL_PROTOCOL_PREFIX.CODEX, CodexConverter);
|
||||||
ConverterFactory.registerConverter(MODEL_PROTOCOL_PREFIX.GROK, GrokConverter);
|
ConverterFactory.registerConverter(MODEL_PROTOCOL_PREFIX.GROK, GrokConverter);
|
||||||
}
|
}
|
||||||
|
|
|
||||||
|
|
@ -1,690 +0,0 @@
|
||||||
/**
|
|
||||||
* Ollama转换器
|
|
||||||
* 处理Ollama协议与其他协议之间的转换
|
|
||||||
*/
|
|
||||||
|
|
||||||
import { v4 as uuidv4 } from 'uuid';
|
|
||||||
import { createHash } from 'crypto';
|
|
||||||
import { BaseConverter } from '../BaseConverter.js';
|
|
||||||
import { MODEL_PROTOCOL_PREFIX } from '../../utils/common.js';
|
|
||||||
import {
|
|
||||||
OLLAMA_DEFAULT_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_DEFAULT_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_CLAUDE_DEFAULT_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_CLAUDE_SONNET_45_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_CLAUDE_SONNET_45_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_CLAUDE_HAIKU_45_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_CLAUDE_HAIKU_45_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_CLAUDE_OPUS_41_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_CLAUDE_OPUS_41_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_CLAUDE_SONNET_40_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_CLAUDE_SONNET_40_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_CLAUDE_SONNET_37_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_CLAUDE_SONNET_37_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_CLAUDE_OPUS_40_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_CLAUDE_OPUS_40_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_CLAUDE_HAIKU_35_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_CLAUDE_HAIKU_35_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_CLAUDE_HAIKU_30_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_CLAUDE_HAIKU_30_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_CLAUDE_SONNET_35_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_CLAUDE_SONNET_35_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_CLAUDE_OPUS_30_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_CLAUDE_OPUS_30_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_GEMINI_25_PRO_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_GEMINI_25_PRO_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_GEMINI_25_FLASH_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_GEMINI_25_FLASH_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_GEMINI_25_IMAGE_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_GEMINI_25_IMAGE_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_GEMINI_25_LIVE_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_GEMINI_25_LIVE_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_GEMINI_25_TTS_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_GEMINI_25_TTS_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_GEMINI_20_FLASH_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_GEMINI_20_FLASH_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_GEMINI_20_IMAGE_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_GEMINI_20_IMAGE_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_GEMINI_15_PRO_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_GEMINI_15_PRO_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_GEMINI_15_FLASH_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_GEMINI_15_FLASH_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_GEMINI_DEFAULT_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_GEMINI_DEFAULT_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_GPT4_TURBO_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_GPT4_TURBO_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_GPT4_32K_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_GPT4_32K_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_GPT4_BASE_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_GPT4_BASE_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_GPT35_16K_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_GPT35_16K_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_GPT35_BASE_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_GPT35_BASE_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_QWEN_CODER_PLUS_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_QWEN_CODER_PLUS_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_QWEN_VL_PLUS_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_QWEN_VL_PLUS_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_QWEN_CODER_FLASH_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_QWEN_CODER_FLASH_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_QWEN_DEFAULT_CONTEXT_LENGTH,
|
|
||||||
OLLAMA_QWEN_DEFAULT_MAX_OUTPUT_TOKENS,
|
|
||||||
OLLAMA_DEFAULT_FILE_TYPE,
|
|
||||||
OLLAMA_DEFAULT_QUANTIZATION_VERSION,
|
|
||||||
OLLAMA_DEFAULT_ROPE_FREQ_BASE,
|
|
||||||
OLLAMA_DEFAULT_TEMPERATURE,
|
|
||||||
OLLAMA_DEFAULT_TOP_P,
|
|
||||||
OLLAMA_DEFAULT_QUANTIZATION_LEVEL,
|
|
||||||
OLLAMA_SHOW_QUANTIZATION_LEVEL
|
|
||||||
} from '../utils.js';
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Ollama转换器类
|
|
||||||
* 实现Ollama协议到其他协议的转换
|
|
||||||
*/
|
|
||||||
export class OllamaConverter extends BaseConverter {
|
|
||||||
constructor() {
|
|
||||||
super('ollama');
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
|
||||||
* 转换请求 - Ollama -> 其他协议
|
|
||||||
*/
|
|
||||||
convertRequest(data, targetProtocol) {
|
|
||||||
switch (targetProtocol) {
|
|
||||||
case MODEL_PROTOCOL_PREFIX.OPENAI:
|
|
||||||
case MODEL_PROTOCOL_PREFIX.CLAUDE:
|
|
||||||
case MODEL_PROTOCOL_PREFIX.GEMINI:
|
|
||||||
return this.toOpenAIRequest(data);
|
|
||||||
default:
|
|
||||||
throw new Error(`Unsupported target protocol: ${targetProtocol}`);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
|
||||||
* 转换响应 - 其他协议 -> Ollama
|
|
||||||
*/
|
|
||||||
convertResponse(data, sourceProtocol, model) {
|
|
||||||
return this.toOllamaChatResponse(data, model);
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
|
||||||
* 转换流式响应块 - 其他协议 -> Ollama
|
|
||||||
*/
|
|
||||||
convertStreamChunk(chunk, sourceProtocol, model, isDone = false) {
|
|
||||||
return this.toOllamaStreamChunk(chunk, model, isDone);
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
|
||||||
* 转换模型列表 - 其他协议 -> Ollama
|
|
||||||
*/
|
|
||||||
convertModelList(data, sourceProtocol) {
|
|
||||||
return this.toOllamaTags(data, sourceProtocol);
|
|
||||||
}
|
|
||||||
|
|
||||||
// =========================================================================
|
|
||||||
// Ollama -> OpenAI 转换
|
|
||||||
// =========================================================================
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Ollama请求 -> OpenAI请求
|
|
||||||
*/
|
|
||||||
toOpenAIRequest(ollamaRequest) {
|
|
||||||
const openaiRequest = {
|
|
||||||
model: ollamaRequest.model || 'default',
|
|
||||||
messages: [],
|
|
||||||
stream: ollamaRequest.stream !== undefined ? ollamaRequest.stream : false
|
|
||||||
};
|
|
||||||
|
|
||||||
// Map Ollama messages to OpenAI format
|
|
||||||
if (ollamaRequest.messages && Array.isArray(ollamaRequest.messages)) {
|
|
||||||
openaiRequest.messages = ollamaRequest.messages.map(msg => ({
|
|
||||||
role: msg.role || 'user',
|
|
||||||
content: msg.content || ''
|
|
||||||
}));
|
|
||||||
}
|
|
||||||
|
|
||||||
// Map Ollama options to OpenAI parameters
|
|
||||||
if (ollamaRequest.options) {
|
|
||||||
const opts = ollamaRequest.options;
|
|
||||||
if (opts.temperature !== undefined) openaiRequest.temperature = opts.temperature;
|
|
||||||
if (opts.top_p !== undefined) openaiRequest.top_p = opts.top_p;
|
|
||||||
if (opts.top_k !== undefined) openaiRequest.top_k = opts.top_k;
|
|
||||||
if (opts.num_predict !== undefined) openaiRequest.max_tokens = opts.num_predict;
|
|
||||||
if (opts.stop !== undefined) openaiRequest.stop = opts.stop;
|
|
||||||
}
|
|
||||||
|
|
||||||
// Handle system prompt
|
|
||||||
if (ollamaRequest.system) {
|
|
||||||
openaiRequest.messages.unshift({
|
|
||||||
role: 'system',
|
|
||||||
content: ollamaRequest.system
|
|
||||||
});
|
|
||||||
}
|
|
||||||
|
|
||||||
// Handle template/prompt for generate endpoint
|
|
||||||
if (ollamaRequest.prompt) {
|
|
||||||
openaiRequest.messages = [{
|
|
||||||
role: 'user',
|
|
||||||
content: ollamaRequest.prompt
|
|
||||||
}];
|
|
||||||
|
|
||||||
// Add system prompt if provided
|
|
||||||
if (ollamaRequest.system) {
|
|
||||||
openaiRequest.messages.unshift({
|
|
||||||
role: 'system',
|
|
||||||
content: ollamaRequest.system
|
|
||||||
});
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
return openaiRequest;
|
|
||||||
}
|
|
||||||
|
|
||||||
// =========================================================================
|
|
||||||
// OpenAI/Claude/Gemini -> Ollama 转换
|
|
||||||
// =========================================================================
|
|
||||||
|
|
||||||
/**
|
|
||||||
* OpenAI/Claude/Gemini响应 -> Ollama chat响应
|
|
||||||
*/
|
|
||||||
toOllamaChatResponse(response, model) {
|
|
||||||
const ollamaResponse = {
|
|
||||||
model: model || response.model || 'unknown',
|
|
||||||
created_at: new Date().toISOString(),
|
|
||||||
done: true
|
|
||||||
};
|
|
||||||
|
|
||||||
// Handle OpenAI format (choices array)
|
|
||||||
if (response.choices && response.choices.length > 0) {
|
|
||||||
const choice = response.choices[0];
|
|
||||||
ollamaResponse.message = {
|
|
||||||
role: choice.message?.role || 'assistant',
|
|
||||||
content: choice.message?.content || ''
|
|
||||||
};
|
|
||||||
|
|
||||||
// Map finish reason
|
|
||||||
if (choice.finish_reason) {
|
|
||||||
ollamaResponse.done_reason = choice.finish_reason === 'stop' ? 'stop' : choice.finish_reason;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
// Handle Claude format (content array)
|
|
||||||
else if (response.content && Array.isArray(response.content)) {
|
|
||||||
let textContent = '';
|
|
||||||
response.content.forEach(block => {
|
|
||||||
if (block.type === 'text' && block.text) {
|
|
||||||
textContent += block.text;
|
|
||||||
}
|
|
||||||
});
|
|
||||||
|
|
||||||
ollamaResponse.message = {
|
|
||||||
role: response.role || 'assistant',
|
|
||||||
content: textContent
|
|
||||||
};
|
|
||||||
|
|
||||||
if (response.stop_reason) {
|
|
||||||
ollamaResponse.done_reason = response.stop_reason === 'end_turn' ? 'stop' : response.stop_reason;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
// Handle Gemini format (candidates array)
|
|
||||||
else if (response.candidates && response.candidates.length > 0) {
|
|
||||||
const candidate = response.candidates[0];
|
|
||||||
let textContent = '';
|
|
||||||
if (candidate.content && candidate.content.parts) {
|
|
||||||
textContent = candidate.content.parts
|
|
||||||
.filter(part => part.text)
|
|
||||||
.map(part => part.text)
|
|
||||||
.join('');
|
|
||||||
}
|
|
||||||
|
|
||||||
ollamaResponse.message = {
|
|
||||||
role: candidate.content?.role || 'assistant',
|
|
||||||
content: textContent
|
|
||||||
};
|
|
||||||
|
|
||||||
if (candidate.finishReason) {
|
|
||||||
ollamaResponse.done_reason = candidate.finishReason.toLowerCase();
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Add usage statistics if available
|
|
||||||
const usage = response.usage || response.usageMetadata;
|
|
||||||
if (usage) {
|
|
||||||
ollamaResponse.prompt_eval_count = usage.prompt_tokens || usage.input_tokens || usage.promptTokenCount || 0;
|
|
||||||
ollamaResponse.eval_count = usage.completion_tokens || usage.output_tokens || usage.candidatesTokenCount || 0;
|
|
||||||
ollamaResponse.total_duration = 0;
|
|
||||||
ollamaResponse.load_duration = 0;
|
|
||||||
ollamaResponse.prompt_eval_duration = 0;
|
|
||||||
ollamaResponse.eval_duration = 0;
|
|
||||||
}
|
|
||||||
|
|
||||||
return ollamaResponse;
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
|
||||||
* OpenAI/Claude/Gemini generate响应 -> Ollama generate响应
|
|
||||||
*/
|
|
||||||
toOllamaGenerateResponse(response, model) {
|
|
||||||
const ollamaResponse = {
|
|
||||||
model: model || response.model || 'unknown',
|
|
||||||
created_at: new Date().toISOString(),
|
|
||||||
done: true
|
|
||||||
};
|
|
||||||
|
|
||||||
// Handle OpenAI format
|
|
||||||
if (response.choices && response.choices.length > 0) {
|
|
||||||
const choice = response.choices[0];
|
|
||||||
ollamaResponse.response = choice.message?.content || choice.text || '';
|
|
||||||
|
|
||||||
if (choice.finish_reason) {
|
|
||||||
ollamaResponse.done_reason = choice.finish_reason === 'stop' ? 'stop' : choice.finish_reason;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
// Handle Claude format
|
|
||||||
else if (response.content && Array.isArray(response.content)) {
|
|
||||||
let textContent = '';
|
|
||||||
response.content.forEach(block => {
|
|
||||||
if (block.type === 'text' && block.text) {
|
|
||||||
textContent += block.text;
|
|
||||||
}
|
|
||||||
});
|
|
||||||
ollamaResponse.response = textContent;
|
|
||||||
|
|
||||||
if (response.stop_reason) {
|
|
||||||
ollamaResponse.done_reason = response.stop_reason === 'end_turn' ? 'stop' : response.stop_reason;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
// Handle Gemini format
|
|
||||||
else if (response.candidates && response.candidates.length > 0) {
|
|
||||||
const candidate = response.candidates[0];
|
|
||||||
let textContent = '';
|
|
||||||
if (candidate.content && candidate.content.parts) {
|
|
||||||
textContent = candidate.content.parts
|
|
||||||
.filter(part => part.text)
|
|
||||||
.map(part => part.text)
|
|
||||||
.join('');
|
|
||||||
}
|
|
||||||
ollamaResponse.response = textContent;
|
|
||||||
|
|
||||||
if (candidate.finishReason) {
|
|
||||||
ollamaResponse.done_reason = candidate.finishReason.toLowerCase();
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Add usage statistics
|
|
||||||
const genUsage = response.usage || response.usageMetadata;
|
|
||||||
if (genUsage) {
|
|
||||||
ollamaResponse.prompt_eval_count = genUsage.prompt_tokens || genUsage.input_tokens || genUsage.promptTokenCount || 0;
|
|
||||||
ollamaResponse.eval_count = genUsage.completion_tokens || genUsage.output_tokens || genUsage.candidatesTokenCount || 0;
|
|
||||||
ollamaResponse.total_duration = 0;
|
|
||||||
ollamaResponse.load_duration = 0;
|
|
||||||
ollamaResponse.prompt_eval_duration = 0;
|
|
||||||
ollamaResponse.eval_duration = 0;
|
|
||||||
}
|
|
||||||
|
|
||||||
return ollamaResponse;
|
|
||||||
}
|
|
||||||
|
|
||||||
    /**
     * OpenAI/Claude/Gemini stream chunk -> Ollama stream chunk
     */
    toOllamaStreamChunk(chunk, model, isDone = false) {
        const ollamaChunk = {
            model: model || 'unknown',
            created_at: new Date().toISOString(),
            done: isDone
        };

        // Handle Claude SSE format
        if (chunk.type) {
            if (chunk.type === 'content_block_delta' && chunk.delta) {
                ollamaChunk.message = {
                    role: 'assistant',
                    content: chunk.delta.text || ''
                };
            } else if (chunk.type === 'message_delta' && chunk.usage) {
                ollamaChunk.message = {
                    role: 'assistant',
                    content: ''
                };
                ollamaChunk.prompt_eval_count = 0;
                ollamaChunk.eval_count = chunk.usage.output_tokens || 0;
            } else {
                ollamaChunk.message = {
                    role: 'assistant',
                    content: ''
                };
            }
        }
        // Handle Gemini format
        else if (!isDone && chunk.candidates && chunk.candidates.length > 0) {
            const candidate = chunk.candidates[0];
            let content = '';
            if (candidate.content && candidate.content.parts) {
                content = candidate.content.parts
                    .filter(part => part.text)
                    .map(part => part.text)
                    .join('');
            }
            ollamaChunk.message = {
                role: 'assistant',
                content: content
            };
        }
        // Handle OpenAI format
        else if (!isDone && chunk.choices && chunk.choices.length > 0) {
            const delta = chunk.choices[0].delta;
            ollamaChunk.message = {
                role: delta.role || 'assistant',
                content: delta.content || ''
            };
        }
        // Handle final chunk
        else if (isDone) {
            ollamaChunk.message = {
                role: 'assistant',
                content: ''
            };
            ollamaChunk.done_reason = 'stop';
        }

        return ollamaChunk;
    }
|
|
||||||
|
|
||||||
    /**
     * OpenAI/Claude/Gemini stream chunk -> Ollama generate stream chunk
     */
    toOllamaGenerateStreamChunk(chunk, model, isDone = false) {
        const ollamaChunk = {
            model: model || 'unknown',
            created_at: new Date().toISOString(),
            done: isDone
        };

        // Handle Claude SSE format
        if (chunk.type) {
            if (chunk.type === 'content_block_delta' && chunk.delta) {
                ollamaChunk.response = chunk.delta.text || '';
            } else if (chunk.type === 'message_delta' && chunk.usage) {
                ollamaChunk.response = '';
                ollamaChunk.prompt_eval_count = 0;
                ollamaChunk.eval_count = chunk.usage.output_tokens || 0;
            } else {
                ollamaChunk.response = '';
            }
        }
        // Handle OpenAI format
        else if (!isDone && chunk.choices && chunk.choices.length > 0) {
            const delta = chunk.choices[0].delta;
            ollamaChunk.response = delta.content || '';
        }
        // Handle final chunk
        else if (isDone) {
            ollamaChunk.response = '';
            ollamaChunk.done_reason = 'stop';
        }

        return ollamaChunk;
    }
|
|
||||||
|
|
||||||
    /**
     * OpenAI/Claude/Gemini model list -> Ollama tags
     */
    toOllamaTags(modelList, sourceProtocol = null) {
        const models = [];

        // Handle both OpenAI format (data array) and Gemini format (models array)
        const sourceModels = modelList.data || modelList.models || [];

        if (Array.isArray(sourceModels)) {
            sourceModels.forEach(model => {
                // Get model name
                let modelName = model.id || model.name || model.displayName || 'unknown';

                // Remove "models/" prefix if present (for Gemini)
                if (modelName.startsWith('models/')) {
                    modelName = modelName.substring(7); // Remove "models/"
                }

                // Skip models with invalid names
                if (modelName === 'unknown' || !modelName) {
                    return;
                }

                // IMPORTANT: Copilot expects family: "Ollama" with capital O!
                const modelOwner = 'Ollama';

                models.push({
                    name: modelName,
                    model: modelName,
                    modified_at: new Date().toISOString(),
                    size: 0, // As in the old patch
                    digest: '', // Empty string, as in the old patch
                    details: {
                        parent_model: '',
                        format: 'gguf',
                        family: modelOwner, // "Ollama" with capital O
                        families: [modelOwner],
                        parameter_size: '0B', // As in the old patch
                        quantization_level: OLLAMA_DEFAULT_QUANTIZATION_LEVEL
                    }
                });
            });
        }

        return { models };
    }
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Generate Ollama show response
|
|
||||||
*/
|
|
||||||
toOllamaShowResponse(modelName) {
|
|
||||||
// Minimal implementation, as in the old patch
|
|
||||||
let contextLength = OLLAMA_DEFAULT_CONTEXT_LENGTH;
|
|
||||||
let maxOutputTokens = OLLAMA_DEFAULT_MAX_OUTPUT_TOKENS;
|
|
||||||
let family = 'Ollama'; // IMPORTANT: capitalized, as Copilot expects!
|
|
||||||
let architecture = 'transformer';
|
|
||||||
|
|
||||||
const lowerName = modelName.toLowerCase();
|
|
||||||
|
|
||||||
// Determine contextLength by model name
|
|
||||||
// Claude models
|
|
||||||
if (lowerName.includes('claude')) {
|
|
||||||
architecture = 'claude';
|
|
||||||
contextLength = OLLAMA_CLAUDE_DEFAULT_CONTEXT_LENGTH; // Default 200K
|
|
||||||
|
|
||||||
// Claude Sonnet 4.5
|
|
||||||
if (lowerName.includes('sonnet-4-5') || lowerName.includes('sonnet-4.5')) {
|
|
||||||
contextLength = OLLAMA_CLAUDE_SONNET_45_CONTEXT_LENGTH; // 200K (1M beta available)
|
|
||||||
maxOutputTokens = OLLAMA_CLAUDE_SONNET_45_MAX_OUTPUT_TOKENS; // 64K output
|
|
||||||
}
|
|
||||||
// Claude Haiku 4.5
|
|
||||||
else if (lowerName.includes('haiku-4-5') || lowerName.includes('haiku-4.5')) {
|
|
||||||
contextLength = OLLAMA_CLAUDE_HAIKU_45_CONTEXT_LENGTH; // 200K
|
|
||||||
maxOutputTokens = OLLAMA_CLAUDE_HAIKU_45_MAX_OUTPUT_TOKENS; // 64K output
|
|
||||||
}
|
|
||||||
// Claude Opus 4.1
|
|
||||||
else if (lowerName.includes('opus-4-1') || lowerName.includes('opus-4.1')) {
|
|
||||||
contextLength = OLLAMA_CLAUDE_OPUS_41_CONTEXT_LENGTH; // 200K
|
|
||||||
maxOutputTokens = OLLAMA_CLAUDE_OPUS_41_MAX_OUTPUT_TOKENS; // 32K output
|
|
||||||
}
|
|
||||||
// Claude Sonnet 4.0 (legacy)
|
|
||||||
else if (lowerName.includes('sonnet-4-0') || lowerName.includes('sonnet-4.0') || lowerName.includes('sonnet-4-20')) {
|
|
||||||
contextLength = OLLAMA_CLAUDE_SONNET_40_CONTEXT_LENGTH; // 200K (1M beta available)
|
|
||||||
maxOutputTokens = OLLAMA_CLAUDE_SONNET_40_MAX_OUTPUT_TOKENS; // 64K output
|
|
||||||
}
|
|
||||||
// Claude Sonnet 3.7 (legacy)
|
|
||||||
else if (lowerName.includes('3-7') || lowerName.includes('3.7')) {
|
|
||||||
contextLength = OLLAMA_CLAUDE_SONNET_37_CONTEXT_LENGTH; // 200K
|
|
||||||
maxOutputTokens = OLLAMA_CLAUDE_SONNET_37_MAX_OUTPUT_TOKENS; // 64K output (128K beta available)
|
|
||||||
}
|
|
||||||
// Claude Opus 4.0 (legacy)
|
|
||||||
else if (lowerName.includes('opus-4-0') || lowerName.includes('opus-4.0') || lowerName.includes('opus-4-20')) {
|
|
||||||
contextLength = OLLAMA_CLAUDE_OPUS_40_CONTEXT_LENGTH; // 200K
|
|
||||||
maxOutputTokens = OLLAMA_CLAUDE_OPUS_40_MAX_OUTPUT_TOKENS; // 32K output
|
|
||||||
}
|
|
||||||
// Claude Haiku 3.5 (legacy)
|
|
||||||
else if (lowerName.includes('haiku-3-5') || lowerName.includes('haiku-3.5')) {
|
|
||||||
contextLength = OLLAMA_CLAUDE_HAIKU_35_CONTEXT_LENGTH; // 200K
|
|
||||||
maxOutputTokens = OLLAMA_CLAUDE_HAIKU_35_MAX_OUTPUT_TOKENS; // 8K output
|
|
||||||
}
|
|
||||||
// Claude Haiku 3.0 (legacy)
|
|
||||||
else if (lowerName.includes('haiku-3-0') || lowerName.includes('haiku-3.0') || lowerName.includes('haiku-20240307')) {
|
|
||||||
contextLength = OLLAMA_CLAUDE_HAIKU_30_CONTEXT_LENGTH; // 200K
|
|
||||||
maxOutputTokens = OLLAMA_CLAUDE_HAIKU_30_MAX_OUTPUT_TOKENS; // 4K output
|
|
||||||
}
|
|
||||||
// Claude Sonnet 3.5 (legacy)
|
|
||||||
else if (lowerName.includes('sonnet-3-5') || lowerName.includes('sonnet-3.5')) {
|
|
||||||
contextLength = OLLAMA_CLAUDE_SONNET_35_CONTEXT_LENGTH; // 200K
|
|
||||||
maxOutputTokens = OLLAMA_CLAUDE_SONNET_35_MAX_OUTPUT_TOKENS; // 8K output
|
|
||||||
}
|
|
||||||
// Claude Opus 3.0 (legacy)
|
|
||||||
else if (lowerName.includes('opus-3-0') || lowerName.includes('opus-3.0') || lowerName.includes('opus') && lowerName.includes('20240229')) {
|
|
||||||
contextLength = OLLAMA_CLAUDE_OPUS_30_CONTEXT_LENGTH; // 200K
|
|
||||||
maxOutputTokens = OLLAMA_CLAUDE_OPUS_30_MAX_OUTPUT_TOKENS; // 4K output
|
|
||||||
}
|
|
||||||
// Default for Claude
|
|
||||||
else {
|
|
||||||
contextLength = OLLAMA_CLAUDE_DEFAULT_CONTEXT_LENGTH; // 200K
|
|
||||||
maxOutputTokens = OLLAMA_CLAUDE_HAIKU_35_MAX_OUTPUT_TOKENS; // 8K output
|
|
||||||
}
|
|
||||||
}
|
|
||||||
// Gemini models
|
|
||||||
else if (lowerName.includes('gemini')) {
|
|
||||||
architecture = 'gemini';
|
|
||||||
|
|
||||||
// Gemini 2.5 Pro
|
|
||||||
if (lowerName.includes('2.5') && lowerName.includes('pro')) {
|
|
||||||
contextLength = OLLAMA_GEMINI_25_PRO_CONTEXT_LENGTH; // 1M input tokens
|
|
||||||
maxOutputTokens = OLLAMA_GEMINI_25_PRO_MAX_OUTPUT_TOKENS; // 65K output tokens
|
|
||||||
}
|
|
||||||
// Gemini 2.5 Flash / Flash-Lite
|
|
||||||
else if (lowerName.includes('2.5') && (lowerName.includes('flash') || lowerName.includes('lite'))) {
|
|
||||||
contextLength = OLLAMA_GEMINI_25_FLASH_CONTEXT_LENGTH; // 1M input tokens
|
|
||||||
maxOutputTokens = OLLAMA_GEMINI_25_FLASH_MAX_OUTPUT_TOKENS; // 65K output tokens
|
|
||||||
}
|
|
||||||
// Gemini 2.5 Flash Image
|
|
||||||
else if (lowerName.includes('2.5') && lowerName.includes('image')) {
|
|
||||||
contextLength = OLLAMA_GEMINI_25_IMAGE_CONTEXT_LENGTH; // 65K input tokens
|
|
||||||
maxOutputTokens = OLLAMA_GEMINI_25_IMAGE_MAX_OUTPUT_TOKENS; // 32K output tokens
|
|
||||||
}
|
|
||||||
// Gemini 2.5 Flash Live / Native Audio
|
|
||||||
else if (lowerName.includes('2.5') && (lowerName.includes('live') || lowerName.includes('native-audio'))) {
|
|
||||||
contextLength = OLLAMA_GEMINI_25_LIVE_CONTEXT_LENGTH; // 131K input tokens
|
|
||||||
maxOutputTokens = OLLAMA_GEMINI_25_LIVE_MAX_OUTPUT_TOKENS; // 8K output tokens
|
|
||||||
}
|
|
||||||
// Gemini 2.5 TTS
|
|
||||||
else if (lowerName.includes('2.5') && lowerName.includes('tts')) {
|
|
||||||
contextLength = OLLAMA_GEMINI_25_TTS_CONTEXT_LENGTH; // 8K input tokens
|
|
||||||
maxOutputTokens = OLLAMA_GEMINI_25_TTS_MAX_OUTPUT_TOKENS; // 16K output tokens
|
|
||||||
}
|
|
||||||
// Gemini 2.0 Flash
|
|
||||||
else if (lowerName.includes('2.0') && lowerName.includes('flash')) {
|
|
||||||
contextLength = OLLAMA_GEMINI_20_FLASH_CONTEXT_LENGTH; // 1M input tokens
|
|
||||||
maxOutputTokens = OLLAMA_GEMINI_20_FLASH_MAX_OUTPUT_TOKENS; // 8K output tokens
|
|
||||||
}
|
|
||||||
// Gemini 2.0 Flash Image
|
|
||||||
else if (lowerName.includes('2.0') && lowerName.includes('image')) {
|
|
||||||
contextLength = OLLAMA_GEMINI_20_IMAGE_CONTEXT_LENGTH; // 32K input tokens
|
|
||||||
maxOutputTokens = OLLAMA_GEMINI_20_IMAGE_MAX_OUTPUT_TOKENS; // 8K output tokens
|
|
||||||
}
|
|
||||||
// Gemini 1.5 Pro (legacy)
|
|
||||||
else if (lowerName.includes('1.5') && lowerName.includes('pro')) {
|
|
||||||
contextLength = OLLAMA_GEMINI_15_PRO_CONTEXT_LENGTH; // 2M tokens
|
|
||||||
maxOutputTokens = OLLAMA_GEMINI_15_PRO_MAX_OUTPUT_TOKENS;
|
|
||||||
}
|
|
||||||
// Gemini 1.5 Flash (legacy)
|
|
||||||
else if (lowerName.includes('1.5') && lowerName.includes('flash')) {
|
|
||||||
contextLength = OLLAMA_GEMINI_15_FLASH_CONTEXT_LENGTH; // 1M tokens
|
|
||||||
maxOutputTokens = OLLAMA_GEMINI_15_FLASH_MAX_OUTPUT_TOKENS;
|
|
||||||
}
|
|
||||||
// Default for Gemini
|
|
||||||
else {
|
|
||||||
contextLength = OLLAMA_GEMINI_DEFAULT_CONTEXT_LENGTH; // 1M tokens
|
|
||||||
maxOutputTokens = OLLAMA_GEMINI_DEFAULT_MAX_OUTPUT_TOKENS;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
// GPT-4 models
|
|
||||||
else if (lowerName.includes('gpt-4')) {
|
|
||||||
architecture = 'gpt';
|
|
||||||
|
|
||||||
if (lowerName.includes('turbo') || lowerName.includes('preview')) {
|
|
||||||
contextLength = OLLAMA_GPT4_TURBO_CONTEXT_LENGTH; // GPT-4 Turbo
|
|
||||||
maxOutputTokens = OLLAMA_GPT4_TURBO_MAX_OUTPUT_TOKENS;
|
|
||||||
} else if (lowerName.includes('32k')) {
|
|
||||||
contextLength = OLLAMA_GPT4_32K_CONTEXT_LENGTH;
|
|
||||||
maxOutputTokens = OLLAMA_GPT4_32K_MAX_OUTPUT_TOKENS;
|
|
||||||
} else {
|
|
||||||
contextLength = OLLAMA_GPT4_BASE_CONTEXT_LENGTH; // GPT-4 base
|
|
||||||
maxOutputTokens = OLLAMA_GPT4_BASE_MAX_OUTPUT_TOKENS;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
// GPT-3.5 models
|
|
||||||
else if (lowerName.includes('gpt-3.5')) {
|
|
||||||
architecture = 'gpt';
|
|
||||||
|
|
||||||
if (lowerName.includes('16k')) {
|
|
||||||
contextLength = OLLAMA_GPT35_16K_CONTEXT_LENGTH;
|
|
||||||
maxOutputTokens = OLLAMA_GPT35_16K_MAX_OUTPUT_TOKENS;
|
|
||||||
} else {
|
|
||||||
contextLength = OLLAMA_GPT35_BASE_CONTEXT_LENGTH;
|
|
||||||
maxOutputTokens = OLLAMA_GPT35_BASE_MAX_OUTPUT_TOKENS;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
// Qwen models
|
|
||||||
else if (lowerName.includes('qwen')) {
|
|
||||||
architecture = 'qwen';
|
|
||||||
|
|
||||||
// Qwen3 Coder Plus (coder-model)
|
|
||||||
if (lowerName.includes('coder-plus') || lowerName.includes('coder_plus') || lowerName.includes('coder-model')) {
|
|
||||||
contextLength = OLLAMA_QWEN_CODER_PLUS_CONTEXT_LENGTH; // 128K tokens
|
|
||||||
maxOutputTokens = OLLAMA_QWEN_CODER_PLUS_MAX_OUTPUT_TOKENS; // 65K output
|
|
||||||
}
|
|
||||||
// Qwen3 VL Plus (vision-model)
|
|
||||||
else if (lowerName.includes('vl-plus') || lowerName.includes('vl_plus') || lowerName.includes('vision-model')) {
|
|
||||||
contextLength = OLLAMA_QWEN_VL_PLUS_CONTEXT_LENGTH; // 256K tokens
|
|
||||||
maxOutputTokens = OLLAMA_QWEN_VL_PLUS_MAX_OUTPUT_TOKENS; // 32K output
|
|
||||||
}
|
|
||||||
// Qwen3 Coder Flash
|
|
||||||
else if (lowerName.includes('coder-flash') || lowerName.includes('coder_flash')) {
|
|
||||||
contextLength = OLLAMA_QWEN_CODER_FLASH_CONTEXT_LENGTH; // 128K tokens
|
|
||||||
maxOutputTokens = OLLAMA_QWEN_CODER_FLASH_MAX_OUTPUT_TOKENS; // 65K output
|
|
||||||
}
|
|
||||||
// Default for Qwen
|
|
||||||
else {
|
|
||||||
contextLength = OLLAMA_QWEN_DEFAULT_CONTEXT_LENGTH; // 32K tokens
|
|
||||||
maxOutputTokens = OLLAMA_QWEN_DEFAULT_MAX_OUTPUT_TOKENS;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Minimal parameter_size, as in the old patch
|
|
||||||
let parameterSize = '0B';
|
|
||||||
|
|
||||||
return {
|
|
||||||
license: '',
|
|
||||||
modelfile: `# Modelfile for ${modelName}\nFROM ${modelName}`,
|
|
||||||
parameters: `num_ctx ${contextLength}\nnum_predict ${maxOutputTokens}\ntemperature ${OLLAMA_DEFAULT_TEMPERATURE}\ntop_p ${OLLAMA_DEFAULT_TOP_P}`,
|
|
||||||
template: '{{ if .System }}{{ .System }}\n{{ end }}{{ .Prompt }}',
|
|
||||||
details: {
|
|
||||||
parent_model: '',
|
|
||||||
format: 'gguf',
|
|
||||||
family: family,
|
|
||||||
families: [family],
|
|
||||||
parameter_size: parameterSize,
|
|
||||||
quantization_level: OLLAMA_SHOW_QUANTIZATION_LEVEL
|
|
||||||
},
|
|
||||||
model_info: {
|
|
||||||
'general.architecture': architecture,
|
|
||||||
'general.file_type': OLLAMA_DEFAULT_FILE_TYPE,
|
|
||||||
'general.parameter_count': 0,
|
|
||||||
'general.quantization_version': OLLAMA_DEFAULT_QUANTIZATION_VERSION,
|
|
||||||
'general.context_length': contextLength,
|
|
||||||
'llama.context_length': contextLength,
|
|
||||||
'llama.rope.freq_base': OLLAMA_DEFAULT_ROPE_FREQ_BASE
|
|
||||||
},
|
|
||||||
capabilities: ['tools', 'vision', 'completion'] // Indicate that the model supports tool calling
|
|
||||||
};
|
|
||||||
}
|
|
||||||
}
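The long if/else chain in `toOllamaShowResponse` maps a model name to a context length and an output-token limit. As a rough sketch of the same idea, the lookup could be expressed as an ordered rule table. The names `LIMIT_RULES`, `DEFAULT_LIMITS`, and `resolveLimits` are hypothetical and do not appear in the repository; only a small subset of models is covered, and the numbers are copied from the `OLLAMA_*` constants removed in this commit.

```javascript
// Illustrative sketch only: LIMIT_RULES, DEFAULT_LIMITS and resolveLimits are hypothetical names.
// The numbers are copied from the OLLAMA_* constants removed in this commit.
const DEFAULT_LIMITS = { contextLength: 65534, maxOutputTokens: 8192 };

// Rules are checked in order, so specific models come before family defaults.
const LIMIT_RULES = [
    { test: (n) => n.includes('claude') && (n.includes('opus-4-1') || n.includes('opus-4.1')),
      contextLength: 200000, maxOutputTokens: 32000 },
    { test: (n) => n.includes('claude'),
      contextLength: 200000, maxOutputTokens: 200000 },
    { test: (n) => n.includes('gemini') && n.includes('2.5') && n.includes('pro'),
      contextLength: 1048576, maxOutputTokens: 65534 },
    { test: (n) => n.includes('gemini'),
      contextLength: 1048576, maxOutputTokens: 65534 },
];

// Returns the limits of the first matching rule, or the generic defaults.
function resolveLimits(modelName) {
    const lowerName = String(modelName).toLowerCase();
    const rule = LIMIT_RULES.find((r) => r.test(lowerName));
    const { contextLength, maxOutputTokens } = rule || DEFAULT_LIMITS;
    return { contextLength, maxOutputTokens };
}

console.log(resolveLimits('claude-opus-4-1-20250805'));
console.log(resolveLimits('some-unlisted-model'));
```

Because the first matching rule wins, the ordering of the table plays the role that the `else if` ordering plays in the removed method.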
|
|
||||||
|
|
@ -49,87 +49,6 @@ export const OPENAI_RESPONSES_DEFAULT_TOP_P = 0.95;
|
||||||
export const OPENAI_RESPONSES_DEFAULT_INPUT_TOKEN_LIMIT = 32768;
|
export const OPENAI_RESPONSES_DEFAULT_INPUT_TOKEN_LIMIT = 32768;
|
||||||
export const OPENAI_RESPONSES_DEFAULT_OUTPUT_TOKEN_LIMIT = 128000;
|
export const OPENAI_RESPONSES_DEFAULT_OUTPUT_TOKEN_LIMIT = 128000;
|
||||||
|
|
||||||
// =============================================================================
// Ollama-related constants
// =============================================================================
export const OLLAMA_DEFAULT_CONTEXT_LENGTH = 65534;
export const OLLAMA_DEFAULT_MAX_OUTPUT_TOKENS = 8192;

// Claude model context lengths
export const OLLAMA_CLAUDE_DEFAULT_CONTEXT_LENGTH = 200000;
export const OLLAMA_CLAUDE_SONNET_45_CONTEXT_LENGTH = 200000;
export const OLLAMA_CLAUDE_SONNET_45_MAX_OUTPUT_TOKENS = 200000;
export const OLLAMA_CLAUDE_HAIKU_45_CONTEXT_LENGTH = 200000;
export const OLLAMA_CLAUDE_HAIKU_45_MAX_OUTPUT_TOKENS = 200000;
export const OLLAMA_CLAUDE_OPUS_41_CONTEXT_LENGTH = 200000;
export const OLLAMA_CLAUDE_OPUS_41_MAX_OUTPUT_TOKENS = 32000;
export const OLLAMA_CLAUDE_SONNET_40_CONTEXT_LENGTH = 200000;
export const OLLAMA_CLAUDE_SONNET_40_MAX_OUTPUT_TOKENS = 200000;
export const OLLAMA_CLAUDE_SONNET_37_CONTEXT_LENGTH = 200000;
export const OLLAMA_CLAUDE_SONNET_37_MAX_OUTPUT_TOKENS = 200000;
export const OLLAMA_CLAUDE_OPUS_40_CONTEXT_LENGTH = 200000;
export const OLLAMA_CLAUDE_OPUS_40_MAX_OUTPUT_TOKENS = 32000;
export const OLLAMA_CLAUDE_HAIKU_35_CONTEXT_LENGTH = 200000;
export const OLLAMA_CLAUDE_HAIKU_35_MAX_OUTPUT_TOKENS = 200000;
export const OLLAMA_CLAUDE_HAIKU_30_CONTEXT_LENGTH = 200000;
export const OLLAMA_CLAUDE_HAIKU_30_MAX_OUTPUT_TOKENS = 8192;
export const OLLAMA_CLAUDE_SONNET_35_CONTEXT_LENGTH = 200000;
export const OLLAMA_CLAUDE_SONNET_35_MAX_OUTPUT_TOKENS = 200000;
export const OLLAMA_CLAUDE_OPUS_30_CONTEXT_LENGTH = 200000;
export const OLLAMA_CLAUDE_OPUS_30_MAX_OUTPUT_TOKENS = 8192;

// Gemini model context lengths
export const OLLAMA_GEMINI_25_PRO_CONTEXT_LENGTH = 1048576;
export const OLLAMA_GEMINI_25_PRO_MAX_OUTPUT_TOKENS = 65534;
export const OLLAMA_GEMINI_25_FLASH_CONTEXT_LENGTH = 1048576;
export const OLLAMA_GEMINI_25_FLASH_MAX_OUTPUT_TOKENS = 65534;
export const OLLAMA_GEMINI_25_IMAGE_CONTEXT_LENGTH = 65534;
export const OLLAMA_GEMINI_25_IMAGE_MAX_OUTPUT_TOKENS = 32768;
export const OLLAMA_GEMINI_25_LIVE_CONTEXT_LENGTH = 131072;
export const OLLAMA_GEMINI_25_LIVE_MAX_OUTPUT_TOKENS = 65534;
export const OLLAMA_GEMINI_25_TTS_CONTEXT_LENGTH = 65534;
export const OLLAMA_GEMINI_25_TTS_MAX_OUTPUT_TOKENS = 16384;
export const OLLAMA_GEMINI_20_FLASH_CONTEXT_LENGTH = 1048576;
export const OLLAMA_GEMINI_20_FLASH_MAX_OUTPUT_TOKENS = 65534;
export const OLLAMA_GEMINI_20_IMAGE_CONTEXT_LENGTH = 32768;
export const OLLAMA_GEMINI_20_IMAGE_MAX_OUTPUT_TOKENS = 65534;
export const OLLAMA_GEMINI_15_PRO_CONTEXT_LENGTH = 2097152;
export const OLLAMA_GEMINI_15_PRO_MAX_OUTPUT_TOKENS = 65534;
export const OLLAMA_GEMINI_15_FLASH_CONTEXT_LENGTH = 1048576;
export const OLLAMA_GEMINI_15_FLASH_MAX_OUTPUT_TOKENS = 65534;
export const OLLAMA_GEMINI_DEFAULT_CONTEXT_LENGTH = 1048576;
export const OLLAMA_GEMINI_DEFAULT_MAX_OUTPUT_TOKENS = 65534;

// GPT model context lengths
export const OLLAMA_GPT4_TURBO_CONTEXT_LENGTH = 128000;
export const OLLAMA_GPT4_TURBO_MAX_OUTPUT_TOKENS = 8192;
export const OLLAMA_GPT4_32K_CONTEXT_LENGTH = 32768;
export const OLLAMA_GPT4_32K_MAX_OUTPUT_TOKENS = 8192;
export const OLLAMA_GPT4_BASE_CONTEXT_LENGTH = 200000;
export const OLLAMA_GPT4_BASE_MAX_OUTPUT_TOKENS = 8192;
export const OLLAMA_GPT35_16K_CONTEXT_LENGTH = 16385;
export const OLLAMA_GPT35_16K_MAX_OUTPUT_TOKENS = 8192;
export const OLLAMA_GPT35_BASE_CONTEXT_LENGTH = 8192;
export const OLLAMA_GPT35_BASE_MAX_OUTPUT_TOKENS = 8192;

// Qwen model context lengths
export const OLLAMA_QWEN_CODER_PLUS_CONTEXT_LENGTH = 128000;
export const OLLAMA_QWEN_CODER_PLUS_MAX_OUTPUT_TOKENS = 65534;
export const OLLAMA_QWEN_VL_PLUS_CONTEXT_LENGTH = 262144;
export const OLLAMA_QWEN_VL_PLUS_MAX_OUTPUT_TOKENS = 32768;
export const OLLAMA_QWEN_CODER_FLASH_CONTEXT_LENGTH = 128000;
export const OLLAMA_QWEN_CODER_FLASH_MAX_OUTPUT_TOKENS = 65534;
export const OLLAMA_QWEN_DEFAULT_CONTEXT_LENGTH = 32768;
export const OLLAMA_QWEN_DEFAULT_MAX_OUTPUT_TOKENS = 200000;

export const OLLAMA_DEFAULT_FILE_TYPE = 2;
export const OLLAMA_DEFAULT_QUANTIZATION_VERSION = 2;
export const OLLAMA_DEFAULT_ROPE_FREQ_BASE = 10000.0;
export const OLLAMA_DEFAULT_TEMPERATURE = 0.7;
export const OLLAMA_DEFAULT_TOP_P = 0.9;
export const OLLAMA_DEFAULT_QUANTIZATION_LEVEL = 'Q4_0';
export const OLLAMA_SHOW_QUANTIZATION_LEVEL = 'Q4_K_M';
|
|
||||||
|
|
||||||
// =============================================================================
|
// =============================================================================
|
||||||
// General helper functions
|
// General helper functions
|
||||||
// =============================================================================
|
// =============================================================================
|
||||||
|
|
|
||||||
|
|
@ -1,796 +0,0 @@
|
||||||
/**
 * Ollama API handler
 * Handles Ollama-specific endpoints and converts between backend protocols
 */

import { getRequestBody, handleError, MODEL_PROTOCOL_PREFIX, MODEL_PROVIDER, getProtocolPrefix } from '../utils/common.js';
import logger from '../utils/logger.js';
import { convertData } from '../convert/convert.js';
import { ConverterFactory } from '../converters/ConverterFactory.js';
import { getProviderModels } from '../providers/provider-models.js';
// Ollama version number
/**
 * Model name prefix mapping for different providers
 * These prefixes are added to model names in the list for user visibility
 * but are removed before sending to actual providers
 */
export const MODEL_PREFIX_MAP = {
    [MODEL_PROVIDER.KIRO_API]: '[Kiro]',
    [MODEL_PROVIDER.CLAUDE_CUSTOM]: '[Claude]',
    [MODEL_PROVIDER.GEMINI_CLI]: '[Gemini CLI]',
    [MODEL_PROVIDER.OPENAI_CUSTOM]: '[OpenAI]',
    [MODEL_PROVIDER.QWEN_API]: '[Qwen CLI]',
    [MODEL_PROVIDER.OPENAI_CUSTOM_RESPONSES]: '[OpenAI Responses]',
    [MODEL_PROVIDER.ANTIGRAVITY]: '[Antigravity]',
    [MODEL_PROVIDER.IFLOW_API]: '[iFlow]',
}
|
|
||||||
|
|
||||||
/**
 * Adds provider prefix to model name for display purposes
 * @param {string} modelName - Original model name
 * @param {string} provider - Provider type
 * @returns {string} Model name with prefix
 */
export function addModelPrefix(modelName, provider) {
    if (!modelName) return modelName;

    // Don't add prefix if already exists
    if (/^\[.*?\]\s+/.test(modelName)) {
        return modelName;
    }

    const prefix = MODEL_PREFIX_MAP[provider];
    if (!prefix) {
        return modelName;
    }
    return `${prefix} ${modelName}`;
}

/**
 * Removes provider prefix from model name before sending to provider
 * @param {string} modelName - Model name with possible prefix
 * @returns {string} Clean model name without prefix
 */
export function removeModelPrefix(modelName) {
    if (!modelName) {
        return modelName;
    }

    // Remove any prefix pattern like [Warp], [Kiro], etc.
    const prefixPattern = /^\[.*?\]\s+/;
    return modelName.replace(prefixPattern, '');
}

/**
 * Extracts provider type from prefixed model name
 * @param {string} modelName - Model name with possible prefix
 * @returns {string|null} Provider type or null if no prefix found
 */
export function getProviderFromPrefix(modelName) {
    if (!modelName) {
        return null;
    }

    const match = modelName.match(/^\[(.*?)\]/);
    if (!match) {
        return null;
    }

    const prefixText = `[${match[1]}]`;

    // Find provider by prefix
    for (const [provider, prefix] of Object.entries(MODEL_PREFIX_MAP)) {
        if (prefix === prefixText) {
            return provider;
        }
    }

    return null;
}
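The three prefix helpers are meant to round-trip: a prefix is added for display, stripped before the request reaches a provider, and can be mapped back to a provider key. The following is a minimal standalone sketch of that round trip, assuming a two-entry map. `PREFIXES`, `addPrefix`, `stripPrefix`, and `providerFromPrefix` are hypothetical stand-ins, and the keys `provider-a` and `provider-b` are placeholders because the real `MODEL_PROVIDER` values are not shown in this diff.

```javascript
// Placeholder provider keys; the real values come from MODEL_PROVIDER, which this diff does not show.
const PREFIXES = { 'provider-a': '[Kiro]', 'provider-b': '[Gemini CLI]' };

// Adds "[Prefix] " unless the name is empty, already prefixed, or the provider is unknown.
function addPrefix(name, provider) {
    if (!name || /^\[.*?\]\s+/.test(name)) return name;
    const prefix = PREFIXES[provider];
    return prefix ? `${prefix} ${name}` : name;
}

// Removes a leading "[...] " group, whatever it contains.
function stripPrefix(name) {
    return name ? name.replace(/^\[.*?\]\s+/, '') : name;
}

// Maps a leading "[...]" group back to its provider key, or null if none matches.
function providerFromPrefix(name) {
    const match = name ? name.match(/^\[(.*?)\]/) : null;
    if (!match) return null;
    const found = Object.entries(PREFIXES).find(([, prefix]) => prefix === match[0]);
    return found ? found[0] : null;
}

const shown = addPrefix('claude-sonnet-4-5', 'provider-a');
console.log(shown, '|', stripPrefix(shown), '|', providerFromPrefix(shown));
```

Note that stripping accepts any bracketed prefix, while the reverse lookup only recognizes prefixes present in the map, so an unknown prefix is removed from the name but yields no provider.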
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Adds provider prefix to array of models (works with any format)
|
|
||||||
* @param {Array} models - Array of model objects
|
|
||||||
* @param {string} provider - Provider type
|
|
||||||
* @param {string} format - Format type ('openai', 'gemini', 'ollama')
|
|
||||||
* @returns {Array} Models with prefixed names
|
|
||||||
*/
|
|
||||||
export function addPrefixToModels(models, provider, format = 'openai') {
|
|
||||||
if (!Array.isArray(models)) return models;
|
|
||||||
|
|
||||||
return models.map(model => {
|
|
||||||
if (format === 'openai') {
|
|
||||||
return { ...model, id: addModelPrefix(model.id, provider) };
|
|
||||||
} else if (format === 'ollama') {
|
|
||||||
return {
|
|
||||||
...model,
|
|
||||||
name: addModelPrefix(model.name, provider),
|
|
||||||
model: addModelPrefix(model.model || model.name, provider)
|
|
||||||
};
|
|
||||||
} else {
|
|
||||||
// gemini/claude format
|
|
||||||
return {
|
|
||||||
...model,
|
|
||||||
name: addModelPrefix(model.name, provider),
|
|
||||||
displayName: model.displayName ? addModelPrefix(model.displayName, provider) : undefined
|
|
||||||
};
|
|
||||||
}
|
|
||||||
});
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Determine which provider to use based on model name
|
|
||||||
* @param {string} modelName - Model name (may include prefix like "[Warp] gpt-5")
|
|
||||||
* @param {Object} providerPoolManager - Provider pool manager
|
|
||||||
* @param {string} defaultProvider - Default provider
|
|
||||||
* @returns {string} Provider type
|
|
||||||
*/
|
|
||||||
export function getProviderByModelName(modelName, providerPoolManager, defaultProvider) {
|
|
||||||
if (!modelName || !providerPoolManager || !providerPoolManager.providerPools) {
|
|
||||||
return defaultProvider;
|
|
||||||
}
|
|
||||||
|
|
||||||
// First, check if model name has a prefix that directly indicates the provider
|
|
||||||
const providerFromPrefix = getProviderFromPrefix(modelName);
|
|
||||||
if (providerFromPrefix) {
|
|
||||||
logger.info(`[Provider Selection] Provider determined from prefix: ${providerFromPrefix}`);
|
|
||||||
return providerFromPrefix;
|
|
||||||
}
|
|
||||||
|
|
||||||
// Remove prefix for further analysis
|
|
||||||
const cleanModelName = removeModelPrefix(modelName);
|
|
||||||
const lowerModelName = cleanModelName.toLowerCase();
|
|
||||||
|
|
||||||
// Check if it's a Claude model
|
|
||||||
if (lowerModelName.includes('claude') || lowerModelName.includes('sonnet') || lowerModelName.includes('opus') || lowerModelName.includes('haiku')) {
|
|
||||||
// Find available Claude provider
|
|
||||||
for (const [providerType, providers] of Object.entries(providerPoolManager.providerPools)) {
|
|
||||||
if (providerType.includes('claude') || providerType.includes('kiro')) {
|
|
||||||
const healthyProvider = providers.find(p => p.isHealthy);
|
|
||||||
if (healthyProvider) {
|
|
||||||
return providerType;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Check if it's a Gemini model
|
|
||||||
if (lowerModelName.includes('gemini')) {
|
|
||||||
// Find available Gemini provider
|
|
||||||
for (const [providerType, providers] of Object.entries(providerPoolManager.providerPools)) {
|
|
||||||
if (providerType.includes('gemini')) {
|
|
||||||
const healthyProvider = providers.find(p => p.isHealthy);
|
|
||||||
if (healthyProvider) {
|
|
||||||
return providerType;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Check if it's a Qwen model
|
|
||||||
if (lowerModelName.includes('qwen')) {
|
|
||||||
// Find available Qwen provider
|
|
||||||
for (const [providerType, providers] of Object.entries(providerPoolManager.providerPools)) {
|
|
||||||
if (providerType.includes('qwen')) {
|
|
||||||
const healthyProvider = providers.find(p => p.isHealthy);
|
|
||||||
if (healthyProvider) {
|
|
||||||
return providerType;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Check if it's a GPT model
|
|
||||||
if (lowerModelName.includes('gpt')) {
|
|
||||||
// Find available OpenAI provider
|
|
||||||
for (const [providerType, providers] of Object.entries(providerPoolManager.providerPools)) {
|
|
||||||
if (providerType.includes('openai')) {
|
|
||||||
const healthyProvider = providers.find(p => p.isHealthy);
|
|
||||||
if (healthyProvider) {
|
|
||||||
return providerType;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
return defaultProvider;
|
|
||||||
}
|
|
||||||
|
|
||||||
const OLLAMA_VERSION = '0.12.10';
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Model to Provider Mapper
|
|
||||||
* Maps model names to their corresponding providers
|
|
||||||
*/
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Get provider type for a given model name
|
|
||||||
* @param {string} modelName - The model name to look up (may include prefix like "[Warp] gpt-5")
|
|
||||||
* @param {string} defaultProvider - The default provider if no match is found
|
|
||||||
* @returns {string} The provider type
|
|
||||||
*/
|
|
||||||
export function getProviderForModel(modelName, defaultProvider) {
|
|
||||||
if (!modelName) {
|
|
||||||
return defaultProvider;
|
|
||||||
}
|
|
||||||
|
|
||||||
// First, check if model name has a prefix that directly indicates the provider
|
|
||||||
// const providerFromPrefix = getProviderFromPrefix(modelName);
|
|
||||||
// if (providerFromPrefix) {
|
|
||||||
// return providerFromPrefix;
|
|
||||||
// }
|
|
||||||
|
|
||||||
// Remove prefix for further analysis
|
|
||||||
const cleanModelName = removeModelPrefix(modelName);
|
|
||||||
logger.info(`[Provider Selection] Clean model name: ${cleanModelName}`);
|
|
||||||
|
|
||||||
// Try to find the provider by checking if the model is in the provider's model list
|
|
||||||
// This handles cases where different providers have the same model name
|
|
||||||
const providerType = findProviderByModelName(cleanModelName);
|
|
||||||
logger.info(`[Provider Selection] Provider determined from model list: ${providerType}`);
|
|
||||||
if (providerType) {
|
|
||||||
return providerType;
|
|
||||||
}
|
|
||||||
|
|
||||||
logger.info(`[Provider Selection] Model name not found in provider models. Using default provider: ${defaultProvider}`);
|
|
||||||
// Default to the provided default provider
|
|
||||||
return defaultProvider;
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Find provider type by checking if the model name is in the provider's model list
|
|
||||||
* @param {string} modelName - The model name to look up
|
|
||||||
* @returns {string|null} The provider type or null if not found
|
|
||||||
*/
|
|
||||||
function findProviderByModelName(modelName) {
|
|
||||||
// Map of provider types to check
|
|
||||||
const providerTypes = [
|
|
||||||
MODEL_PROVIDER.GEMINI_CLI,
|
|
||||||
MODEL_PROVIDER.ANTIGRAVITY,
|
|
||||||
MODEL_PROVIDER.KIRO_API,
|
|
||||||
MODEL_PROVIDER.QWEN_API,
|
|
||||||
MODEL_PROVIDER.IFLOW_API
|
|
||||||
];
|
|
||||||
|
|
||||||
// Check each provider's model list
|
|
||||||
for (const providerType of providerTypes) {
|
|
||||||
const models = getProviderModels(providerType);
|
|
||||||
if (models.includes(modelName)) {
|
|
||||||
return providerType;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
return null;
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
|
||||||
* 规范化 Ollama 路径并检查是否为 Ollama 端点
|
|
||||||
* @param {string} path - 原始路径
|
|
||||||
* @param {URL} requestUrl - 请求 URL 对象
|
|
||||||
* @returns {Object} - { normalizedPath: string, isOllamaEndpoint: boolean }
|
|
||||||
*/
|
|
||||||
export function normalizeOllamaPath(path, requestUrl) {
|
|
||||||
let normalizedPath = path;
|
|
||||||
|
|
||||||
// Normalize common Ollama path aliases (e.g., '/ollama/api/tags' -> '/api/tags')
|
|
||||||
if (normalizedPath.startsWith('/ollama/')) {
|
|
||||||
normalizedPath = normalizedPath.replace(/^\/ollama/, '');
|
|
||||||
if (requestUrl) {
|
|
||||||
requestUrl.pathname = normalizedPath;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Map other common aliases
|
|
||||||
if (normalizedPath === '/v1/models') {
|
|
||||||
normalizedPath = '/api/tags';
|
|
||||||
if (requestUrl) {
|
|
||||||
requestUrl.pathname = normalizedPath;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
if (normalizedPath === '/api/tags/') {
|
|
||||||
normalizedPath = '/api/tags';
|
|
||||||
if (requestUrl) {
|
|
||||||
requestUrl.pathname = normalizedPath;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Check if this is an Ollama endpoint
|
|
||||||
const isOllamaEndpoint = normalizedPath.startsWith('/api/');
|
|
||||||
|
|
||||||
return { normalizedPath, isOllamaEndpoint };
|
|
||||||
}
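The alias handling in `normalizeOllamaPath` can be illustrated with a side-effect-free variant. This is a sketch under stated assumptions: `normalizeOllamaPathSketch` is a hypothetical name, and unlike the function above it does not update a `requestUrl` object.

```javascript
// Pure variant of the normalization above (hypothetical helper, not part of the commit).
function normalizeOllamaPathSketch(path) {
  let normalizedPath = path;

  // Strip the '/ollama' prefix: '/ollama/api/tags' -> '/api/tags'.
  if (normalizedPath.startsWith('/ollama/')) {
    normalizedPath = normalizedPath.slice('/ollama'.length);
  }

  // Map the remaining aliases onto the canonical model-list path.
  if (normalizedPath === '/v1/models' || normalizedPath === '/api/tags/') {
    normalizedPath = '/api/tags';
  }

  // Anything under '/api/' is treated as an Ollama endpoint.
  return { normalizedPath, isOllamaEndpoint: normalizedPath.startsWith('/api/') };
}
```

Note that, as in the original, a bare `/v1/models` request is rewritten to `/api/tags`, so it is classified as an Ollama endpoint even without the `/ollama` prefix.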
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Handle all Ollama-related path normalization and endpoint routing
|
|
||||||
* @param {string} method - The HTTP method
|
|
||||||
* @param {string} path - The request path
|
|
||||||
* @param {URL} requestUrl - The request URL object
|
|
||||||
* @param {Object} req - The request object
|
|
||||||
* @param {Object} res - The response object
|
|
||||||
* @param {Object} apiService - The API service instance
|
|
||||||
* @param {Object} currentConfig - The current configuration
|
|
||||||
* @param {Object} providerPoolManager - The provider pool manager
|
|
||||||
* @returns {Object} - { handled: boolean, normalizedPath: string }
|
|
||||||
*/
|
|
||||||
export async function handleOllamaRequest(method, path, requestUrl, req, res, apiService, currentConfig, providerPoolManager) {
|
|
||||||
// Normalize Ollama paths
|
|
||||||
const { normalizedPath } = normalizeOllamaPath(path, requestUrl);
|
|
||||||
|
|
||||||
// Handle Ollama endpoints before auth check
|
|
||||||
const ollamaHandledBeforeAuth = await handleOllamaEndpointsBeforeAuth(method, normalizedPath, req, res);
|
|
||||||
if (ollamaHandledBeforeAuth) {
|
|
||||||
return { handled: true, normalizedPath };
|
|
||||||
}
|
|
||||||
|
|
||||||
// Handle Ollama endpoints after auth check
|
|
||||||
const ollamaHandledAfterAuth = await handleOllamaEndpointsAfterAuth(method, normalizedPath, req, res, apiService, currentConfig, providerPoolManager);
|
|
||||||
if (ollamaHandledAfterAuth) {
|
|
||||||
return { handled: true, normalizedPath };
|
|
||||||
}
|
|
||||||
|
|
||||||
return { handled: false, normalizedPath };
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Handle Ollama endpoint routing (before the auth check)
|
|
||||||
* @param {string} method - The HTTP method
|
|
||||||
* @param {string} path - The request path
|
|
||||||
* @param {Object} req - The request object
|
|
||||||
* @param {Object} res - The response object
|
|
||||||
* @returns {boolean} - Whether the request was handled
|
|
||||||
*/
|
|
||||||
export async function handleOllamaEndpointsBeforeAuth(method, path, req, res) {
|
|
||||||
// Handle Ollama API endpoints BEFORE auth check (Ollama doesn't use authentication by default)
|
|
||||||
if (method === 'GET' && path === '/api/version') {
|
|
||||||
handleOllamaVersion(res);
|
|
||||||
return true;
|
|
||||||
}
|
|
||||||
|
|
||||||
return false;
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Handle Ollama endpoint routing (after the auth check)
|
|
||||||
* @param {string} method - The HTTP method
|
|
||||||
* @param {string} path - The request path
|
|
||||||
* @param {Object} req - The request object
|
|
||||||
* @param {Object} res - The response object
|
|
||||||
* @param {Object} apiService - The API service instance
|
|
||||||
* @param {Object} currentConfig - The current configuration
|
|
||||||
* @param {Object} providerPoolManager - The provider pool manager
|
|
||||||
* @returns {boolean} - Whether the request was handled
|
|
||||||
*/
|
|
||||||
export async function handleOllamaEndpointsAfterAuth(method, path, req, res, apiService, currentConfig, providerPoolManager) {
|
|
||||||
// Handle Ollama endpoints that need apiService (after auth check)
|
|
||||||
if (method === 'GET' && path === '/api/tags') {
|
|
||||||
await handleOllamaTags(req, res, apiService, currentConfig, providerPoolManager);
|
|
||||||
return true;
|
|
||||||
}
|
|
||||||
if (method === 'POST' && path === '/api/chat') {
|
|
||||||
await handleOllamaChat(req, res, apiService, currentConfig, providerPoolManager);
|
|
||||||
return true;
|
|
||||||
}
|
|
||||||
if (method === 'POST' && path === '/api/generate') {
|
|
||||||
await handleOllamaGenerate(req, res, apiService, currentConfig, providerPoolManager);
|
|
||||||
return true;
|
|
||||||
}
|
|
||||||
|
|
||||||
return false;
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Handle the Ollama /api/tags endpoint (list models)
|
|
||||||
* Note: apiService can be null when called before provider selection (e.g., from /ollama/api/tags)
|
|
||||||
* In this case, we fetch models from all healthy providers in the pool
|
|
||||||
*/
|
|
||||||
export async function handleOllamaTags(req, res, apiService, currentConfig, providerPoolManager) {
|
|
||||||
try {
|
|
||||||
logger.info('[Ollama] Handling /api/tags request');
|
|
||||||
|
|
||||||
const ollamaConverter = ConverterFactory.getConverter(MODEL_PROTOCOL_PREFIX.OLLAMA);
|
|
||||||
const { getServiceAdapter } = await import('../providers/adapter.js');
|
|
||||||
|
|
||||||
// Helper to fetch and convert models from a provider
|
|
||||||
const fetchProviderModels = async (providerType, service) => {
|
|
||||||
try {
|
|
||||||
const models = await service.listModels();
|
|
||||||
const sourceProtocol = getProtocolPrefix(providerType);
|
|
||||||
const tags = ollamaConverter.convertModelList(models, sourceProtocol);
|
|
||||||
|
|
||||||
if (tags.models && Array.isArray(tags.models)) {
|
|
||||||
return addPrefixToModels(tags.models, providerType, 'ollama');
|
|
||||||
}
|
|
||||||
return [];
|
|
||||||
} catch (error) {
|
|
||||||
logger.error(`[Ollama] Error from ${providerType}:`, error.message);
|
|
||||||
return [];
|
|
||||||
}
|
|
||||||
};
|
|
||||||
|
|
||||||
// Collect fetch promises
|
|
||||||
const fetchPromises = [];
|
|
||||||
const processedProviderTypes = new Set();
|
|
||||||
|
|
||||||
// If apiService is provided, use it for the default provider
|
|
||||||
if (apiService) {
|
|
||||||
fetchPromises.push(fetchProviderModels(currentConfig.MODEL_PROVIDER, apiService));
|
|
||||||
processedProviderTypes.add(currentConfig.MODEL_PROVIDER);
|
|
||||||
}
|
|
||||||
|
|
||||||
// Add provider pool fetches (for all healthy providers)
|
|
||||||
if (providerPoolManager?.providerPools) {
|
|
||||||
for (const [providerType, providers] of Object.entries(providerPoolManager.providerPools)) {
|
|
||||||
// Skip if already processed
|
|
||||||
if (processedProviderTypes.has(providerType)) continue;
|
|
||||||
|
|
||||||
const healthyProvider = providers.find(p => p.isHealthy && !p.isDisabled);
|
|
||||||
if (healthyProvider) {
|
|
||||||
const tempConfig = { ...currentConfig, ...healthyProvider, MODEL_PROVIDER: providerType };
|
|
||||||
const service = getServiceAdapter(tempConfig);
|
|
||||||
fetchPromises.push(fetchProviderModels(providerType, service));
|
|
||||||
processedProviderTypes.add(providerType);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// If no providers available, return empty list
|
|
||||||
if (fetchPromises.length === 0) {
|
|
||||||
logger.warn('[Ollama] No healthy providers available to fetch models');
|
|
||||||
const response = { models: [] };
|
|
||||||
res.writeHead(200, {
|
|
||||||
'Content-Type': 'application/json',
|
|
||||||
'Access-Control-Allow-Origin': '*',
|
|
||||||
'Server': `ollama/${OLLAMA_VERSION}`
|
|
||||||
});
|
|
||||||
res.end(JSON.stringify(response));
|
|
||||||
return;
|
|
||||||
}
|
|
||||||
|
|
||||||
// Execute all fetches in parallel
|
|
||||||
const results = await Promise.all(fetchPromises);
|
|
||||||
const allModels = results.flat();
|
|
||||||
|
|
||||||
logger.info(`[Ollama] Fetched ${allModels.length} models from ${processedProviderTypes.size} provider(s)`);
|
|
||||||
|
|
||||||
const response = { models: allModels };
|
|
||||||
|
|
||||||
res.writeHead(200, {
|
|
||||||
'Content-Type': 'application/json',
|
|
||||||
'Access-Control-Allow-Origin': '*',
|
|
||||||
'Server': `ollama/${OLLAMA_VERSION}`
|
|
||||||
});
|
|
||||||
res.end(JSON.stringify(response));
|
|
||||||
} catch (error) {
|
|
||||||
logger.error('[Ollama Tags Error]', error);
|
|
||||||
handleError(res, error, MODEL_PROTOCOL_PREFIX.OLLAMA);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Handle the Ollama /api/show endpoint (show model information)
|
|
||||||
*/
|
|
||||||
export async function handleOllamaShow(req, res) {
|
|
||||||
try {
|
|
||||||
// logger.info('[Ollama] Handling /api/show request');
|
|
||||||
|
|
||||||
const body = await getRequestBody(req);
|
|
||||||
const modelName = body.name || body.model || 'unknown';
|
|
||||||
|
|
||||||
const ollamaConverter = ConverterFactory.getConverter(MODEL_PROTOCOL_PREFIX.OLLAMA);
|
|
||||||
const showResponse = ollamaConverter.toOllamaShowResponse(modelName);
|
|
||||||
|
|
||||||
res.writeHead(200, {
|
|
||||||
'Content-Type': 'application/json',
|
|
||||||
'Access-Control-Allow-Origin': '*',
|
|
||||||
'Server': `ollama/${OLLAMA_VERSION}`
|
|
||||||
});
|
|
||||||
res.end(JSON.stringify(showResponse));
|
|
||||||
} catch (error) {
|
|
||||||
logger.error('[Ollama Show Error]', error);
|
|
||||||
handleError(res, error, MODEL_PROTOCOL_PREFIX.OLLAMA);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Handle the Ollama /api/version endpoint
|
|
||||||
*/
|
|
||||||
export function handleOllamaVersion(res) {
|
|
||||||
try {
|
|
||||||
const response = { version: OLLAMA_VERSION };
|
|
||||||
|
|
||||||
res.writeHead(200, {
|
|
||||||
'Content-Type': 'application/json',
|
|
||||||
'Access-Control-Allow-Origin': '*',
|
|
||||||
'Server': `ollama/${OLLAMA_VERSION}`
|
|
||||||
});
|
|
||||||
res.end(JSON.stringify(response));
|
|
||||||
} catch (error) {
|
|
||||||
logger.error('[Ollama Version Error]', error);
|
|
||||||
handleError(res, error, MODEL_PROTOCOL_PREFIX.OLLAMA);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Handle the Ollama /api/chat endpoint
|
|
||||||
* Note: apiService can be null when called before provider selection
|
|
||||||
*/
|
|
||||||
export async function handleOllamaChat(req, res, apiService, currentConfig, providerPoolManager) {
|
|
||||||
try {
|
|
||||||
logger.info('[Ollama] Handling /api/chat request');
|
|
||||||
|
|
||||||
const ollamaRequest = await getRequestBody(req);
|
|
||||||
const { getServiceAdapter } = await import('../providers/adapter.js');
|
|
||||||
|
|
||||||
// Determine provider based on model name
|
|
||||||
const rawModelName = ollamaRequest.model;
|
|
||||||
const modelName = removeModelPrefix(rawModelName);
|
|
||||||
ollamaRequest.model = modelName; // Use clean model name
|
|
||||||
const detectedProvider = getProviderForModel(rawModelName, currentConfig.MODEL_PROVIDER);
|
|
||||||
|
|
||||||
logger.info(`[Ollama] Model: ${modelName}, Detected provider: ${detectedProvider}`);
|
|
||||||
|
|
||||||
// Get the appropriate service based on detected provider
|
|
||||||
let actualApiService = apiService;
|
|
||||||
let actualConfig = currentConfig;
|
|
||||||
|
|
||||||
// If apiService is null or provider is different, get the appropriate service from pool
|
|
||||||
if (!apiService || detectedProvider !== currentConfig.MODEL_PROVIDER) {
|
|
||||||
if (providerPoolManager) {
|
|
||||||
// Select provider from pool (now async)
|
|
||||||
const providerConfig = await providerPoolManager.selectProvider(detectedProvider, modelName, { skipUsageCount: true });
|
|
||||||
if (providerConfig) {
|
|
||||||
actualConfig = {
|
|
||||||
...currentConfig,
|
|
||||||
...providerConfig,
|
|
||||||
MODEL_PROVIDER: detectedProvider
|
|
||||||
};
|
|
||||||
actualApiService = getServiceAdapter(actualConfig);
|
|
||||||
logger.info(`[Ollama] Using provider from pool: ${detectedProvider}`);
|
|
||||||
} else {
|
|
||||||
// No healthy provider in pool, try to create service directly
|
|
||||||
logger.warn(`[Ollama] No healthy provider found for ${detectedProvider} in pool`);
|
|
||||||
if (!apiService) {
|
|
||||||
throw new Error(`No healthy provider available for ${detectedProvider}`);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
} else if (!apiService) {
|
|
||||||
// No pool manager and no apiService, try to create service directly
|
|
||||||
actualConfig = { ...currentConfig, MODEL_PROVIDER: detectedProvider };
|
|
||||||
actualApiService = getServiceAdapter(actualConfig);
|
|
||||||
logger.info(`[Ollama] Created service adapter for: ${detectedProvider}`);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Convert Ollama request to OpenAI format
|
|
||||||
const ollamaConverter = ConverterFactory.getConverter(MODEL_PROTOCOL_PREFIX.OLLAMA);
|
|
||||||
const openaiRequest = ollamaConverter.convertRequest(ollamaRequest, MODEL_PROTOCOL_PREFIX.OPENAI);
|
|
||||||
|
|
||||||
// Get the source protocol from the actual provider
|
|
||||||
const sourceProtocol = getProtocolPrefix(actualConfig.MODEL_PROVIDER);
|
|
||||||
|
|
||||||
// Convert OpenAI format to backend provider format if needed
|
|
||||||
let backendRequest = openaiRequest;
|
|
||||||
if (sourceProtocol !== MODEL_PROTOCOL_PREFIX.OPENAI) {
|
|
||||||
backendRequest = convertData(openaiRequest, 'request', MODEL_PROTOCOL_PREFIX.OPENAI, sourceProtocol);
|
|
||||||
}
|
|
||||||
|
|
||||||
// Handle streaming
|
|
||||||
if (ollamaRequest.stream) {
|
|
||||||
let clientDisconnected = false;
|
|
||||||
let listenersRegistered = false;
|
|
||||||
|
|
||||||
// Listen for client disconnects (registered only once)
|
|
||||||
const onClientClose = () => {
|
|
||||||
clientDisconnected = true;
|
|
||||||
logger.info('[Ollama] Client disconnected during streaming');
|
|
||||||
};
|
|
||||||
const onClientError = (err) => {
|
|
||||||
clientDisconnected = true;
|
|
||||||
logger.error('[Ollama] Response stream error:', err.message);
|
|
||||||
};
|
|
||||||
|
|
||||||
if (!listenersRegistered) {
|
|
||||||
res.on('close', onClientClose);
|
|
||||||
res.on('error', onClientError);
|
|
||||||
listenersRegistered = true;
|
|
||||||
}
|
|
||||||
|
|
||||||
try {
|
|
||||||
res.writeHead(200, {
|
|
||||||
'Content-Type': 'application/json',
|
|
||||||
'Transfer-Encoding': 'chunked',
|
|
||||||
'Access-Control-Allow-Origin': '*',
|
|
||||||
'Server': `ollama/${OLLAMA_VERSION}`
|
|
||||||
});
|
|
||||||
|
|
||||||
const stream = await actualApiService.generateContentStream(openaiRequest.model, backendRequest);
|
|
||||||
|
|
||||||
for await (const chunk of stream) {
|
|
||||||
if (clientDisconnected) {
|
|
||||||
logger.info('[Ollama] Stopping stream due to client disconnect');
|
|
||||||
break;
|
|
||||||
}
|
|
||||||
|
|
||||||
try {
|
|
||||||
// Convert backend chunk to Ollama format
|
|
||||||
const ollamaChunk = ollamaConverter.convertStreamChunk(chunk, sourceProtocol, ollamaRequest.model, false);
|
|
||||||
if (!res.writableEnded) {
|
|
||||||
res.write(JSON.stringify(ollamaChunk) + '\n');
|
|
||||||
}
|
|
||||||
} catch (chunkError) {
|
|
||||||
logger.error('[Ollama] Error processing chunk:', chunkError);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Send final chunk
|
|
||||||
if (!clientDisconnected && !res.writableEnded) {
|
|
||||||
const finalChunk = ollamaConverter.convertStreamChunk({}, sourceProtocol, ollamaRequest.model, true);
|
|
||||||
res.write(JSON.stringify(finalChunk) + '\n');
|
|
||||||
res.end();
|
|
||||||
}
|
|
||||||
} finally {
|
|
||||||
if (listenersRegistered) {
|
|
||||||
res.off('close', onClientClose);
|
|
||||||
res.off('error', onClientError);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
} else {
|
|
||||||
// Non-streaming response
|
|
||||||
const backendResponse = await actualApiService.generateContent(openaiRequest.model, backendRequest);
|
|
||||||
const ollamaResponse = ollamaConverter.convertResponse(backendResponse, sourceProtocol, ollamaRequest.model);
|
|
||||||
|
|
||||||
res.writeHead(200, {
|
|
||||||
'Content-Type': 'application/json',
|
|
||||||
'Access-Control-Allow-Origin': '*',
|
|
||||||
'Server': `ollama/${OLLAMA_VERSION}`
|
|
||||||
});
|
|
||||||
res.end(JSON.stringify(ollamaResponse));
|
|
||||||
}
|
|
||||||
} catch (error) {
|
|
||||||
logger.error('[Ollama Chat Error]', error);
|
|
||||||
handleError(res, error, MODEL_PROTOCOL_PREFIX.OLLAMA);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Handle the Ollama /api/generate endpoint
|
|
||||||
* Note: apiService can be null when called before provider selection
|
|
||||||
*/
|
|
||||||
export async function handleOllamaGenerate(req, res, apiService, currentConfig, providerPoolManager) {
|
|
||||||
try {
|
|
||||||
logger.info('[Ollama] Handling /api/generate request');
|
|
||||||
|
|
||||||
const ollamaRequest = await getRequestBody(req);
|
|
||||||
const { getServiceAdapter } = await import('../providers/adapter.js');
|
|
||||||
|
|
||||||
// Determine provider based on model name
|
|
||||||
const rawModelName = ollamaRequest.model;
|
|
||||||
const modelName = removeModelPrefix(rawModelName);
|
|
||||||
ollamaRequest.model = modelName; // Use clean model name
|
|
||||||
const detectedProvider = getProviderForModel(rawModelName, currentConfig.MODEL_PROVIDER);
|
|
||||||
|
|
||||||
logger.info(`[Ollama] Model: ${modelName}, Detected provider: ${detectedProvider}`);
|
|
||||||
|
|
||||||
// Get the appropriate service based on detected provider
|
|
||||||
let actualApiService = apiService;
|
|
||||||
let actualConfig = currentConfig;
|
|
||||||
|
|
||||||
// If apiService is null or provider is different, get the appropriate service from pool
|
|
||||||
if (!apiService || detectedProvider !== currentConfig.MODEL_PROVIDER) {
|
|
||||||
if (providerPoolManager) {
|
|
||||||
// Select provider from pool (now async)
|
|
||||||
const providerConfig = await providerPoolManager.selectProvider(detectedProvider, modelName, { skipUsageCount: true });
|
|
||||||
if (providerConfig) {
|
|
||||||
actualConfig = {
|
|
||||||
...currentConfig,
|
|
||||||
...providerConfig,
|
|
||||||
MODEL_PROVIDER: detectedProvider
|
|
||||||
};
|
|
||||||
actualApiService = getServiceAdapter(actualConfig);
|
|
||||||
logger.info(`[Ollama] Using provider from pool: ${detectedProvider}`);
|
|
||||||
} else {
|
|
||||||
// No healthy provider in pool, try to create service directly
|
|
||||||
logger.warn(`[Ollama] No healthy provider found for ${detectedProvider} in pool`);
|
|
||||||
if (!apiService) {
|
|
||||||
throw new Error(`No healthy provider available for ${detectedProvider}`);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
} else if (!apiService) {
|
|
||||||
// No pool manager and no apiService, try to create service directly
|
|
||||||
actualConfig = { ...currentConfig, MODEL_PROVIDER: detectedProvider };
|
|
||||||
actualApiService = getServiceAdapter(actualConfig);
|
|
||||||
logger.info(`[Ollama] Created service adapter for: ${detectedProvider}`);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Convert Ollama request to OpenAI format
|
|
||||||
const ollamaConverter = ConverterFactory.getConverter(MODEL_PROTOCOL_PREFIX.OLLAMA);
|
|
||||||
const openaiRequest = ollamaConverter.convertRequest(ollamaRequest, MODEL_PROTOCOL_PREFIX.OPENAI);
|
|
||||||
|
|
||||||
// Get the source protocol from the actual provider
|
|
||||||
const sourceProtocol = getProtocolPrefix(actualConfig.MODEL_PROVIDER);
|
|
||||||
|
|
||||||
// Convert OpenAI format to backend provider format if needed
|
|
||||||
let backendRequest = openaiRequest;
|
|
||||||
if (sourceProtocol !== MODEL_PROTOCOL_PREFIX.OPENAI) {
|
|
||||||
backendRequest = convertData(openaiRequest, 'request', MODEL_PROTOCOL_PREFIX.OPENAI, sourceProtocol);
|
|
||||||
}
|
|
||||||
|
|
||||||
// Handle streaming
|
|
||||||
if (ollamaRequest.stream) {
|
|
||||||
let clientDisconnected = false;
|
|
||||||
let listenersRegistered = false;
|
|
||||||
|
|
||||||
// Listen for client disconnects (registered only once)
|
|
||||||
const onClientClose = () => {
|
|
||||||
clientDisconnected = true;
|
|
||||||
logger.info('[Ollama Generate] Client disconnected during streaming');
|
|
||||||
};
|
|
||||||
const onClientError = (err) => {
|
|
||||||
clientDisconnected = true;
|
|
||||||
logger.error('[Ollama Generate] Response stream error:', err.message);
|
|
||||||
};
|
|
||||||
|
|
||||||
if (!listenersRegistered) {
|
|
||||||
res.on('close', onClientClose);
|
|
||||||
res.on('error', onClientError);
|
|
||||||
listenersRegistered = true;
|
|
||||||
}
|
|
||||||
|
|
||||||
try {
|
|
||||||
res.writeHead(200, {
|
|
||||||
'Content-Type': 'application/json',
|
|
||||||
'Transfer-Encoding': 'chunked',
|
|
||||||
'Access-Control-Allow-Origin': '*',
|
|
||||||
'Server': `ollama/${OLLAMA_VERSION}`
|
|
||||||
});
|
|
||||||
|
|
||||||
const stream = await actualApiService.generateContentStream(openaiRequest.model, backendRequest);
|
|
||||||
|
|
||||||
for await (const chunk of stream) {
|
|
||||||
if (clientDisconnected) {
|
|
||||||
logger.info('[Ollama Generate] Stopping stream due to client disconnect');
|
|
||||||
break;
|
|
||||||
}
|
|
||||||
|
|
||||||
try {
|
|
||||||
// Convert backend chunk to Ollama generate format
|
|
||||||
const ollamaChunk = ollamaConverter.toOllamaGenerateStreamChunk(chunk, ollamaRequest.model, false);
|
|
||||||
if (!res.writableEnded) {
|
|
||||||
res.write(JSON.stringify(ollamaChunk) + '\n');
|
|
||||||
}
|
|
||||||
} catch (chunkError) {
|
|
||||||
logger.error('[Ollama] Error processing chunk:', chunkError);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Send final chunk
|
|
||||||
if (!clientDisconnected && !res.writableEnded) {
|
|
||||||
const finalChunk = ollamaConverter.toOllamaGenerateStreamChunk({}, ollamaRequest.model, true);
|
|
||||||
res.write(JSON.stringify(finalChunk) + '\n');
|
|
||||||
res.end();
|
|
||||||
}
|
|
||||||
} finally {
|
|
||||||
if (listenersRegistered) {
|
|
||||||
res.off('close', onClientClose);
|
|
||||||
res.off('error', onClientError);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
} else {
|
|
||||||
// Non-streaming response
|
|
||||||
const backendResponse = await actualApiService.generateContent(openaiRequest.model, backendRequest);
|
|
||||||
const ollamaResponse = ollamaConverter.toOllamaGenerateResponse(backendResponse, ollamaRequest.model);
|
|
||||||
|
|
||||||
res.writeHead(200, {
|
|
||||||
'Content-Type': 'application/json',
|
|
||||||
'Access-Control-Allow-Origin': '*',
|
|
||||||
'Server': `ollama/${OLLAMA_VERSION}`
|
|
||||||
});
|
|
||||||
res.end(JSON.stringify(ollamaResponse));
|
|
||||||
}
|
|
||||||
} catch (error) {
|
|
||||||
logger.error('[Ollama Generate Error]', error);
|
|
||||||
handleError(res, error, MODEL_PROTOCOL_PREFIX.OLLAMA);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
@ -7,8 +7,8 @@ import { getApiService, getProviderStatus } from '../services/service-manager.js
|
||||||
import { getProviderPoolManager } from '../services/service-manager.js';
|
import { getProviderPoolManager } from '../services/service-manager.js';
|
||||||
import { MODEL_PROVIDER } from '../utils/common.js';
|
import { MODEL_PROVIDER } from '../utils/common.js';
|
||||||
import { getRegisteredProviders } from '../providers/adapter.js';
|
import { getRegisteredProviders } from '../providers/adapter.js';
|
||||||
|
import { countTokensAnthropic } from '../utils/token-utils.js';
|
||||||
import { PROMPT_LOG_FILENAME } from '../core/config-manager.js';
|
import { PROMPT_LOG_FILENAME } from '../core/config-manager.js';
|
||||||
import { handleOllamaRequest, handleOllamaShow } from './ollama-handler.js';
|
|
||||||
import { getPluginManager } from '../core/plugin-manager.js';
|
import { getPluginManager } from '../core/plugin-manager.js';
|
||||||
import { randomUUID } from 'crypto';
|
import { randomUUID } from 'crypto';
|
||||||
import { handleGrokAssetsProxy } from '../utils/grok-assets-proxy.js';
|
import { handleGrokAssetsProxy } from '../utils/grok-assets-proxy.js';
|
||||||
|
|
@ -93,12 +93,6 @@ export function createRequestHandler(config, providerPoolManager) {
|
||||||
const uiHandled = await handleUIApiRequests(method, path, req, res, currentConfig, providerPoolManager);
|
const uiHandled = await handleUIApiRequests(method, path, req, res, currentConfig, providerPoolManager);
|
||||||
if (uiHandled) return;
|
if (uiHandled) return;
|
||||||
|
|
||||||
// Ollama show endpoint with model name
|
|
||||||
if (method === 'POST' && path === '/ollama/api/show') {
|
|
||||||
await handleOllamaShow(req, res);
|
|
||||||
return true;
|
|
||||||
}
|
|
||||||
|
|
||||||
// logger.info(`\n${new Date().toLocaleString()}`);
|
// logger.info(`\n${new Date().toLocaleString()}`);
|
||||||
logger.info(`[Server] Received request: ${req.method} http://${req.headers.host}${req.url}`);
|
logger.info(`[Server] Received request: ${req.method} http://${req.headers.host}${req.url}`);
|
||||||
|
|
||||||
|
|
@ -170,15 +164,15 @@ export function createRequestHandler(config, providerPoolManager) {
|
||||||
}
|
}
|
||||||
|
|
||||||
// Check if the first path segment matches a MODEL_PROVIDER and switch if it does
|
// Check if the first path segment matches a MODEL_PROVIDER and switch if it does
|
||||||
// Note: 'ollama' is not a valid MODEL_PROVIDER, it's a protocol prefix for Ollama API compatibility
|
|
||||||
const pathSegments = path.split('/').filter(segment => segment.length > 0);
|
const pathSegments = path.split('/').filter(segment => segment.length > 0);
|
||||||
const isOllamaPath = pathSegments[0] === 'ollama' || path.startsWith('/api/');
|
|
||||||
|
|
||||||
if (pathSegments.length > 0 && !isOllamaPath) {
|
if (pathSegments.length > 0) {
|
||||||
const firstSegment = pathSegments[0];
|
const firstSegment = pathSegments[0];
|
||||||
const registeredProviders = getRegisteredProviders();
|
const registeredProviders = getRegisteredProviders();
|
||||||
const isValidProvider = registeredProviders.includes(firstSegment);
|
const isValidProvider = registeredProviders.includes(firstSegment);
|
||||||
if (firstSegment && isValidProvider) {
|
const isAutoMode = firstSegment === MODEL_PROVIDER.AUTO;
|
||||||
|
|
||||||
|
if (firstSegment && (isValidProvider || isAutoMode)) {
|
||||||
currentConfig.MODEL_PROVIDER = firstSegment;
|
currentConfig.MODEL_PROVIDER = firstSegment;
|
||||||
logger.info(`[Config] MODEL_PROVIDER overridden by path segment to: ${currentConfig.MODEL_PROVIDER}`);
|
logger.info(`[Config] MODEL_PROVIDER overridden by path segment to: ${currentConfig.MODEL_PROVIDER}`);
|
||||||
pathSegments.shift();
|
pathSegments.shift();
|
||||||
|
|
@ -215,52 +209,22 @@ export function createRequestHandler(config, providerPoolManager) {
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
|
||||||
// Handle Ollama request BEFORE getting apiService (Ollama endpoints handle their own provider selection)
|
|
||||||
// This is important because Ollama /api/tags aggregates models from ALL providers, not just the default one
|
|
||||||
if (isOllamaPath) {
|
|
||||||
const { handled, normalizedPath } = await handleOllamaRequest(method, path, requestUrl, req, res, null, currentConfig, providerPoolManager);
|
|
||||||
if (handled) return;
|
|
||||||
// If not handled by Ollama handler, continue with normal flow
|
|
||||||
path = normalizedPath;
|
|
||||||
}
|
|
||||||
|
|
||||||
// Get or select the API service instance
|
|
||||||
let apiService;
|
|
||||||
try {
|
|
||||||
apiService = await getApiService(currentConfig);
|
|
||||||
} catch (error) {
|
|
||||||
handleError(res, { statusCode: 500, message: `Failed to get API service: ${error.message}` }, currentConfig.MODEL_PROVIDER);
|
|
||||||
const poolManager = getProviderPoolManager();
|
|
||||||
if (poolManager) {
|
|
||||||
poolManager.markProviderUnhealthy(currentConfig.MODEL_PROVIDER, {
|
|
||||||
uuid: currentConfig.uuid
|
|
||||||
});
|
|
||||||
}
|
|
||||||
return;
|
|
||||||
}
|
|
||||||
|
|
||||||
// Handle count_tokens requests (Anthropic API compatible)
|
// Handle count_tokens requests (Anthropic API compatible)
|
||||||
if (path.includes('/count_tokens') && method === 'POST') {
|
if (path.includes('/count_tokens') && method === 'POST') {
|
||||||
try {
|
try {
|
||||||
const body = await parseRequestBody(req);
|
const body = await parseRequestBody(req);
|
||||||
logger.info(`[Server] Handling count_tokens request for model: ${body.model}`);
|
logger.info(`[Server] Handling count_tokens request for model: ${body.model}`);
|
||||||
|
|
||||||
// Check if apiService has countTokens method
|
// Use common utility method directly
|
||||||
if (apiService && typeof apiService.countTokens === 'function') {
|
try {
|
||||||
const result = apiService.countTokens(body);
|
const result = countTokensAnthropic(body);
|
||||||
res.writeHead(200, { 'Content-Type': 'application/json' });
|
res.writeHead(200, { 'Content-Type': 'application/json' });
|
||||||
res.end(JSON.stringify(result));
|
res.end(JSON.stringify(result));
|
||||||
} else {
|
} catch (tokenError) {
|
||||||
// Fallback: use estimateInputTokens if available
|
logger.warn(`[Server] Common countTokens failed, falling back: ${tokenError.message}`);
|
||||||
if (apiService && typeof apiService.estimateInputTokens === 'function') {
|
// Last resort: return 0
|
||||||
const inputTokens = apiService.estimateInputTokens(body);
|
res.writeHead(200, { 'Content-Type': 'application/json' });
|
||||||
res.writeHead(200, { 'Content-Type': 'application/json' });
|
res.end(JSON.stringify({ input_tokens: 0 }));
|
||||||
res.end(JSON.stringify({ input_tokens: inputTokens }));
|
|
||||||
} else {
|
|
||||||
// Last resort: return 0 with a message
|
|
||||||
res.writeHead(200, { 'Content-Type': 'application/json' });
|
|
||||||
res.end(JSON.stringify({ input_tokens: 0 }));
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
return true;
|
return true;
|
||||||
} catch (error) {
|
} catch (error) {
|
||||||
|
|
@ -270,8 +234,23 @@ export function createRequestHandler(config, providerPoolManager) {
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Get or select the API service instance
|
||||||
|
let apiService;
|
||||||
|
// try {
|
||||||
|
// apiService = await getApiService(currentConfig);
|
||||||
|
// } catch (error) {
|
||||||
|
// handleError(res, { statusCode: 500, message: `Failed to get API service: ${error.message}` }, currentConfig.MODEL_PROVIDER);
|
||||||
|
// const poolManager = getProviderPoolManager();
|
||||||
|
// if (poolManager) {
|
||||||
|
// poolManager.markProviderUnhealthy(currentConfig.MODEL_PROVIDER, {
|
||||||
|
// uuid: currentConfig.uuid
|
||||||
|
// });
|
||||||
|
// }
|
||||||
|
// return;
|
||||||
|
// }
|
||||||
|
|
||||||
try {
|
try {
|
||||||
// Handle API requests (Ollama requests are already handled above before apiService is obtained)
|
// Handle API requests
|
||||||
const apiHandled = await handleAPIRequests(method, path, req, res, currentConfig, apiService, providerPoolManager, PROMPT_LOG_FILENAME);
|
const apiHandled = await handleAPIRequests(method, path, req, res, currentConfig, apiService, providerPoolManager, PROMPT_LOG_FILENAME);
|
||||||
if (apiHandled) return;
|
if (apiHandled) return;
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -8,7 +8,13 @@ import * as crypto from 'crypto';
|
||||||
import * as http from 'http';
|
import * as http from 'http';
|
||||||
import * as https from 'https';
|
import * as https from 'https';
|
||||||
import { getProviderModels } from '../provider-models.js';
|
import { getProviderModels } from '../provider-models.js';
|
||||||
import { countTokens } from '@anthropic-ai/tokenizer';
|
import {
|
||||||
|
countTextTokens as countTextTokensUtil,
|
||||||
|
estimateInputTokens as estimateInputTokensUtil,
|
||||||
|
countTokensAnthropic as countTokensUtil,
|
||||||
|
processContent as processContentUtil,
|
||||||
|
getContentText as getContentTextUtil
|
||||||
|
} from '../../utils/token-utils.js';
|
||||||
import { configureAxiosProxy } from '../../utils/proxy-utils.js';
|
import { configureAxiosProxy } from '../../utils/proxy-utils.js';
|
||||||
import { isRetryableNetworkError, MODEL_PROVIDER, formatExpiryLog } from '../../utils/common.js';
|
import { isRetryableNetworkError, MODEL_PROVIDER, formatExpiryLog } from '../../utils/common.js';
|
||||||
import { getProviderPoolManager } from '../../services/service-manager.js';
|
import { getProviderPoolManager } from '../../services/service-manager.js';
|
||||||
|
|
@ -720,35 +726,35 @@ async saveCredentialsToFile(filePath, newData) {
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Count tokens for a given text using Claude's official tokenizer
|
||||||
|
* Static version for use without instance
|
||||||
|
*/
|
||||||
|
static countTextTokens(text) {
|
||||||
|
return countTextTokensUtil(text);
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Count tokens for a message request (compatible with Anthropic API)
|
||||||
|
* Static version for use without instance
|
||||||
|
*/
|
||||||
|
static countTokens(requestBody) {
|
||||||
|
return countTokensUtil(requestBody);
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Calculate input tokens from request body
|
||||||
|
* Static version for use without instance
|
||||||
|
*/
|
||||||
|
static estimateInputTokens(requestBody) {
|
||||||
|
return estimateInputTokensUtil(requestBody);
|
||||||
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Extract text content from OpenAI message format
|
* Extract text content from OpenAI message format
|
||||||
*/
|
*/
|
||||||
getContentText(message) {
|
getContentText(message) {
|
||||||
if(message==null){
|
return getContentTextUtil(message);
|
||||||
return "";
|
|
||||||
}
|
|
||||||
if (Array.isArray(message)) {
|
|
||||||
return message.map(part => {
|
|
||||||
if (typeof part === 'string') return part;
|
|
||||||
if (part && typeof part === 'object') {
|
|
||||||
if (part.type === 'text' && part.text) return part.text;
|
|
||||||
if (part.text) return part.text;
|
|
||||||
}
|
|
||||||
return '';
|
|
||||||
}).join('');
|
|
||||||
} else if (typeof message.content === 'string') {
|
|
||||||
return message.content;
|
|
||||||
} else if (Array.isArray(message.content)) {
|
|
||||||
return message.content.map(part => {
|
|
||||||
if (typeof part === 'string') return part;
|
|
||||||
if (part && typeof part === 'object') {
|
|
||||||
if (part.type === 'text' && part.text) return part.text;
|
|
||||||
if (part.text) return part.text;
|
|
||||||
}
|
|
||||||
return '';
|
|
||||||
}).join('');
|
|
||||||
}
|
|
||||||
return String(message.content || message);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
|
|
@ -757,22 +763,7 @@ async saveCredentialsToFile(filePath, newData) {
|
||||||
* @returns {string} The processed text
|
* @returns {string} The processed text
|
||||||
*/
|
*/
|
||||||
processContent(content) {
|
processContent(content) {
|
||||||
if (!content) return "";
|
return processContentUtil(content);
|
||||||
if (typeof content === 'string') return content;
|
|
||||||
if (Array.isArray(content)) {
|
|
||||||
return content.map(part => {
|
|
||||||
if (typeof part === 'string') return part;
|
|
||||||
if (part && typeof part === 'object') {
|
|
||||||
if (part.type === 'text') return part.text || "";
|
|
||||||
if (part.type === 'thinking') return part.thinking || part.text || "";
|
|
||||||
if (part.type === 'tool_result') return this.processContent(part.content);
|
|
||||||
if (part.type === 'tool_use' && part.input) return JSON.stringify(part.input);
|
|
||||||
if (part.text) return part.text;
|
|
||||||
}
|
|
||||||
return "";
|
|
||||||
}).join("");
|
|
||||||
}
|
|
||||||
return this.getContentText(content);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
_normalizeThinkingBudgetTokens(budgetTokens) {
|
_normalizeThinkingBudgetTokens(budgetTokens) {
|
||||||
|
|
@ -2644,56 +2635,14 @@ async saveCredentialsToFile(filePath, newData) {
|
||||||
* Count tokens for a given text using Claude's official tokenizer
|
* Count tokens for a given text using Claude's official tokenizer
|
||||||
*/
|
*/
|
||||||
countTextTokens(text) {
|
countTextTokens(text) {
|
||||||
if (!text) return 0;
|
return KiroApiService.countTextTokens(text);
|
||||||
try {
|
|
||||||
return countTokens(text);
|
|
||||||
} catch (error) {
|
|
||||||
// Fallback to estimation if tokenizer fails
|
|
||||||
logger.warn('[Kiro] Tokenizer error, falling back to estimation:', error.message);
|
|
||||||
return Math.ceil((text || '').length / 4);
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Calculate input tokens from request body using Claude's official tokenizer
|
* Calculate input tokens from request body using Claude's official tokenizer
|
||||||
*/
|
*/
|
||||||
estimateInputTokens(requestBody) {
|
estimateInputTokens(requestBody) {
|
||||||
let allText = "";
|
return KiroApiService.estimateInputTokens(requestBody);
|
||||||
|
|
||||||
// Count system prompt tokens
|
|
||||||
if (requestBody.system) {
|
|
||||||
allText += this.processContent(requestBody.system);
|
|
||||||
}
|
|
||||||
|
|
||||||
// Count thinking prefix tokens if thinking is enabled
|
|
||||||
if (requestBody.thinking?.type && typeof requestBody.thinking.type === 'string') {
|
|
||||||
const t = requestBody.thinking.type.toLowerCase().trim();
|
|
||||||
if (t === 'enabled') {
|
|
||||||
const budget = this._normalizeThinkingBudgetTokens(requestBody.thinking.budget_tokens);
|
|
||||||
allText += `<thinking_mode>enabled</thinking_mode><max_thinking_length>${budget}</max_thinking_length>`;
|
|
||||||
} else if (t === 'adaptive') {
|
|
||||||
const effortRaw = typeof requestBody.thinking.effort === 'string' ? requestBody.thinking.effort : '';
|
|
||||||
const effort = effortRaw.toLowerCase().trim();
|
|
||||||
const normalizedEffort = (effort === 'low' || effort === 'medium' || effort === 'high') ? effort : 'high';
|
|
||||||
allText += `<thinking_mode>adaptive</thinking_mode><thinking_effort>${normalizedEffort}</thinking_effort>`;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Count all messages tokens
|
|
||||||
if (requestBody.messages && Array.isArray(requestBody.messages)) {
|
|
||||||
for (const message of requestBody.messages) {
|
|
||||||
if (message.content) {
|
|
||||||
allText += this.processContent(message.content);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Count tools definitions tokens if present
|
|
||||||
if (requestBody.tools && Array.isArray(requestBody.tools)) {
|
|
||||||
allText += JSON.stringify(requestBody.tools);
|
|
||||||
}
|
|
||||||
|
|
||||||
return this.countTextTokens(allText);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
|
|
@ -2957,48 +2906,7 @@ async saveCredentialsToFile(filePath, newData) {
|
||||||
* @returns {Object} { input_tokens: number }
|
* @returns {Object} { input_tokens: number }
|
||||||
*/
|
*/
|
||||||
countTokens(requestBody) {
|
countTokens(requestBody) {
|
||||||
let allText = "";
|
return KiroApiService.countTokens(requestBody);
|
||||||
let extraTokens = 0;
|
|
||||||
|
|
||||||
// Count system prompt tokens
|
|
||||||
if (requestBody.system) {
|
|
||||||
allText += this.processContent(requestBody.system);
|
|
||||||
}
|
|
||||||
|
|
||||||
// Count all messages tokens
|
|
||||||
if (requestBody.messages && Array.isArray(requestBody.messages)) {
|
|
||||||
for (const message of requestBody.messages) {
|
|
||||||
if (message.content) {
|
|
||||||
if (Array.isArray(message.content)) {
|
|
||||||
for (const block of message.content) {
|
|
||||||
if (block.type === 'image') {
|
|
||||||
// Images have a fixed token cost (approximately 1600 tokens for a typical image)
|
|
||||||
// This is an estimation as actual cost depends on image size
|
|
||||||
extraTokens += 1600;
|
|
||||||
} else if (block.type === 'document') {
|
|
||||||
// Documents - estimate based on content if available
|
|
||||||
if (block.source?.data) {
|
|
||||||
// For base64 encoded documents, estimate tokens
|
|
||||||
const estimatedChars = block.source.data.length * 0.75; // base64 to bytes ratio
|
|
||||||
extraTokens += Math.ceil(estimatedChars / 4);
|
|
||||||
}
|
|
||||||
} else {
|
|
||||||
allText += this.processContent([block]);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
} else {
|
|
||||||
allText += this.processContent(message.content);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Count tools definitions tokens if present
|
|
||||||
if (requestBody.tools && Array.isArray(requestBody.tools)) {
|
|
||||||
allText += JSON.stringify(requestBody.tools);
|
|
||||||
}
|
|
||||||
|
|
||||||
return { input_tokens: this.countTextTokens(allText) + extraTokens };
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
|
|
|
||||||
|
|
@ -1,11 +1,11 @@
|
||||||
import * as fs from 'fs';
|
import * as fs from 'fs';
|
||||||
import * as crypto from 'crypto';
|
import { getServiceAdapter, getRegisteredProviders } from './adapter.js';
|
||||||
import { getServiceAdapter } from './adapter.js';
|
|
||||||
import logger from '../utils/logger.js';
|
import logger from '../utils/logger.js';
|
||||||
import { MODEL_PROVIDER, getProtocolPrefix } from '../utils/common.js';
|
import { MODEL_PROVIDER, getProtocolPrefix } from '../utils/common.js';
|
||||||
import { getProviderModels } from './provider-models.js';
|
import { getProviderModels } from './provider-models.js';
|
||||||
import { broadcastEvent } from '../ui-modules/event-broadcast.js';
|
import { broadcastEvent } from '../ui-modules/event-broadcast.js';
|
||||||
import axios from 'axios';
|
import { convertData } from '../convert/convert.js';
|
||||||
|
import { ENDPOINT_TYPE } from '../utils/common.js';
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Manages a pool of API service providers, handling their health and selection.
|
* Manages a pool of API service providers, handling their health and selection.
|
||||||
|
|
@ -1186,12 +1186,107 @@ export class ProviderPoolManager {
|
||||||
return stats;
|
return stats;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Gets all available models across all provider pools, with optional format conversion.
|
||||||
|
* @param {string} [endpointType] - Optional endpoint type for format conversion (OPENAI_MODEL_LIST or GEMINI_MODEL_LIST).
|
||||||
|
* @returns {Promise<Object|Array>} Formatted model list or raw array of model objects.
|
||||||
|
*/
|
||||||
|
async getAllAvailableModels(endpointType = null) {
|
||||||
|
const allModels = [];
|
||||||
|
|
||||||
|
// Collect all registered providers and the providers present in the provider pools
|
||||||
|
const registeredProviders = getRegisteredProviders();
|
||||||
|
const allProviderTypes = Array.from(new Set([...registeredProviders]));
|
||||||
|
|
||||||
|
for (const providerType of allProviderTypes) {
|
||||||
|
if (this.providerStatus[providerType]) {
|
||||||
|
let models = getProviderModels(providerType);
|
||||||
|
|
||||||
|
// If the hard-coded model list is empty, or this provider type has no nodes configured in the pool, try fetching the list from the service
|
||||||
|
if (models.length === 0) {
|
||||||
|
try {
|
||||||
|
// Decide which config to use: prefer the first pool node's config, otherwise fall back to the global config
|
||||||
|
let targetConfig = this.globalConfig;
|
||||||
|
if (this.providerStatus[providerType] && this.providerStatus[providerType].length > 0) {
|
||||||
|
targetConfig = this.providerStatus[providerType][0].config;
|
||||||
|
}
|
||||||
|
|
||||||
|
const tempConfig = {
|
||||||
|
...this.globalConfig,
|
||||||
|
...targetConfig,
|
||||||
|
MODEL_PROVIDER: providerType
|
||||||
|
};
|
||||||
|
const serviceAdapter = getServiceAdapter(tempConfig);
|
||||||
|
|
||||||
|
if (typeof serviceAdapter.listModels === 'function') {
|
||||||
|
const nativeModels = await serviceAdapter.listModels();
|
||||||
|
// Convert everything to the OpenAI format so the model IDs can be extracted
|
||||||
|
const convertedData = convertData(nativeModels, 'modelList', providerType, MODEL_PROVIDER.OPENAI_CUSTOM);
|
||||||
|
if (convertedData && Array.isArray(convertedData.data)) {
|
||||||
|
const fetchedModels = convertedData.data.map(m => m.id);
|
||||||
|
if (fetchedModels.length > 0) {
|
||||||
|
models = fetchedModels;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
} catch (err) {
|
||||||
|
this._log('debug', `Failed to fetch model list for ${providerType} from service: ${err.message}`);
|
||||||
|
// Keep the existing models (possibly the hard-coded empty list or the result of getProviderModels)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
for (const model of models) {
|
||||||
|
allModels.push({
|
||||||
|
id: `${providerType}:${model}`,
|
||||||
|
provider: providerType,
|
||||||
|
model: model
|
||||||
|
});
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// If no endpointType is given, return the raw array
|
||||||
|
if (!endpointType) {
|
||||||
|
return allModels;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Convert to the format that matches endpointType
|
||||||
|
if (endpointType === ENDPOINT_TYPE.OPENAI_MODEL_LIST) {
|
||||||
|
// Aggregate in OpenAI format
|
||||||
|
return {
|
||||||
|
object: "list",
|
||||||
|
data: allModels.map(m => ({
|
||||||
|
id: m.id,
|
||||||
|
object: "model",
|
||||||
|
created: Math.floor(Date.now() / 1000),
|
||||||
|
owned_by: m.provider
|
||||||
|
}))
|
||||||
|
};
|
||||||
|
} else if (endpointType === ENDPOINT_TYPE.GEMINI_MODEL_LIST) {
|
||||||
|
// Aggregate in Gemini format
|
||||||
|
return {
|
||||||
|
models: allModels.map(m => ({
|
||||||
|
name: `models/${m.id}`,
|
||||||
|
baseModelId: m.model,
|
||||||
|
version: "v1",
|
||||||
|
displayName: `${m.model} (${m.provider})`,
|
||||||
|
description: `Model ${m.model} provided by ${m.provider}`,
|
||||||
|
supportedGenerationMethods: ["generateContent", "countTokens"]
|
||||||
|
}))
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
// Return an empty list by default
|
||||||
|
return { data: [] };
|
||||||
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
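* Marks a provider as needing a refresh and pushes it onto the refresh queue
|
* Marks a provider as needing a refresh and pushes it onto the refresh queue
|
||||||
* @param {string} providerType - The provider type
|
* @param {string} providerType - The provider type
|
||||||
* @param {object} providerConfig - The provider config (includes uuid)
|
* @param {object} providerConfig - The provider config (includes uuid)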
|
||||||
*/
|
*/
|
||||||
markProviderNeedRefresh(providerType, providerConfig) {
|
markProviderNeedRefresh(providerType, providerConfig) {
|
||||||
|
|
||||||
if (!providerConfig?.uuid) {
|
if (!providerConfig?.uuid) {
|
||||||
this._log('error', 'Invalid providerConfig in markProviderNeedRefresh');
|
this._log('error', 'Invalid providerConfig in markProviderNeedRefresh');
|
||||||
return;
|
return;
|
||||||
|
|
|
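Review note: the OpenAI branch of `getAllAvailableModels` above maps each aggregated `{ id, provider, model }` entry to an OpenAI-style model object. A minimal standalone sketch of that mapping follows. `toOpenAIModelList` and the `nowMs` parameter are hypothetical names introduced for illustration, and `example-model` is a placeholder model name; none of them appear in this commit.

```javascript
// Hypothetical helper mirroring the OpenAI branch of getAllAvailableModels.
// It is not part of the project; nowMs is injectable only to make the output deterministic.
function toOpenAIModelList(allModels, nowMs = Date.now()) {
    return {
        object: 'list',
        data: allModels.map(m => ({
            id: m.id,                           // 'provider:model', the form auto routing expects
            object: 'model',
            created: Math.floor(nowMs / 1000),  // Unix seconds, as in the diff
            owned_by: m.provider
        }))
    };
}

// Placeholder entry; 'grok-custom' is one of the MODEL_PROVIDER values in this commit.
const aggregated = [
    { id: 'grok-custom:example-model', provider: 'grok-custom', model: 'example-model' }
];
console.log(JSON.stringify(toOpenAIModelList(aggregated, 0)));
```

Because every `id` carries the provider prefix, a client can pass an entry from this list straight back as the `model` field of a request in auto mode.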
||||||
|
|
@ -13,6 +13,7 @@ import {
|
||||||
getFileName,
|
getFileName,
|
||||||
formatSystemPath
|
formatSystemPath
|
||||||
} from '../utils/provider-utils.js';
|
} from '../utils/provider-utils.js';
|
||||||
|
import { MODEL_PROVIDER } from '../utils/common.js';
|
||||||
|
|
||||||
// Holds the ProviderPoolManager instance
|
// Holds the ProviderPoolManager instance
|
||||||
let providerPoolManager = null;
|
let providerPoolManager = null;
|
||||||
|
|
@ -351,6 +352,35 @@ export async function initApiService(config, isReady = false) {
|
||||||
return serviceInstances; // Return the collection of initialized service instances
|
return serviceInstances; // Return the collection of initialized service instances
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* [Routing resolution layer] Handles model-name prefixes and AUTO mode translation up front
|
||||||
|
* @private
|
||||||
|
* @returns {Promise<Object>} { effectiveProvider, actualModelName }
|
||||||
|
*/
|
||||||
|
async function _resolveEffectiveRouting(config, requestedModel) {
|
||||||
|
let effectiveProvider = config.MODEL_PROVIDER;
|
||||||
|
let actualModelName = requestedModel;
|
||||||
|
|
||||||
|
// 1. Handle an explicit prefix (supported whether or not AUTO mode is active)
|
||||||
|
if (requestedModel && requestedModel.includes(':')) {
|
||||||
|
const [prefix, ...modelParts] = requestedModel.split(':');
|
||||||
|
const modelSuffix = modelParts.join(':');
|
||||||
|
// Check whether the prefix is a valid provider identifier
|
||||||
|
if (providerPoolManager && (providerPoolManager.providerStatus[prefix] || config.providerPools?.[prefix])) {
|
||||||
|
effectiveProvider = prefix;
|
||||||
|
actualModelName = modelSuffix;
|
||||||
|
logger.info(`[Routing] Prefix resolved: ${prefix}:${modelSuffix}`);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// 2. Strictness check: in AUTO mode, if no concrete provider has been resolved by this point, throw (unless this is the model-listing case)
|
||||||
|
if (effectiveProvider === MODEL_PROVIDER.AUTO && requestedModel) {
|
||||||
|
throw new Error(`[API Service] Auto-routing failed: Model name must include a provider prefix (e.g., 'provider:model'). Received: '${requestedModel}'`);
|
||||||
|
}
|
||||||
|
|
||||||
|
return { effectiveProvider, actualModelName };
|
||||||
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Get API service adapter, considering provider pools
|
* Get API service adapter, considering provider pools
|
||||||
* @param {Object} config - The current request configuration
|
* @param {Object} config - The current request configuration
|
||||||
|
|
@ -360,11 +390,18 @@ export async function initApiService(config, isReady = false) {
|
||||||
* @returns {Promise<Object>} The API service adapter
|
* @returns {Promise<Object>} The API service adapter
|
||||||
*/
|
*/
|
||||||
export async function getApiService(config, requestedModel = null, options = {}) {
|
export async function getApiService(config, requestedModel = null, options = {}) {
|
||||||
|
// 1. Resolve routing up front
|
||||||
|
const { effectiveProvider, actualModelName } = await _resolveEffectiveRouting(config, requestedModel);
|
||||||
|
config.MODEL_PROVIDER = effectiveProvider;
|
||||||
|
|
||||||
|
// Special case for model listing: AUTO with no model name
|
||||||
|
if (effectiveProvider === MODEL_PROVIDER.AUTO && !actualModelName) return null;
|
||||||
|
|
||||||
let serviceConfig = config;
|
let serviceConfig = config;
|
||||||
if (providerPoolManager && config.providerPools && config.providerPools[config.MODEL_PROVIDER]) {
|
if (providerPoolManager && config.providerPools && config.providerPools[config.MODEL_PROVIDER]) {
|
||||||
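// If a pool manager exists and the current provider type has a matching pool, select a provider config from that pool
|
// If a pool manager exists and the current provider type has a matching pool, select a provider config from that pool
|
||||||
// selectProvider is now async and uses a chained lock for concurrency safety
|
// selectProvider is now async and uses a chained lock for concurrency safety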
|
||||||
const selectedProviderConfig = await providerPoolManager.selectProvider(config.MODEL_PROVIDER, requestedModel, { skipUsageCount: true });
|
const selectedProviderConfig = await providerPoolManager.selectProvider(config.MODEL_PROVIDER, actualModelName, { ...options, skipUsageCount: true });
|
||||||
if (selectedProviderConfig) {
|
if (selectedProviderConfig) {
|
||||||
// Merge the selected provider config into the current request's config
|
// Merge the selected provider config into the current request's config
|
||||||
serviceConfig = deepmerge(config, selectedProviderConfig);
|
serviceConfig = deepmerge(config, selectedProviderConfig);
|
||||||
|
|
@ -372,12 +409,15 @@ export async function getApiService(config, requestedModel = null, options = {})
|
||||||
config.uuid = serviceConfig.uuid;
|
config.uuid = serviceConfig.uuid;
|
||||||
config.customName = serviceConfig.customName;
|
config.customName = serviceConfig.customName;
|
||||||
const customNameDisplay = serviceConfig.customName ? ` (${serviceConfig.customName})` : '';
|
const customNameDisplay = serviceConfig.customName ? ` (${serviceConfig.customName})` : '';
|
||||||
logger.info(`[API Service] Using pooled configuration for ${config.MODEL_PROVIDER}: ${serviceConfig.uuid}${customNameDisplay}${requestedModel ? ` (model: ${requestedModel})` : ''}`);
|
logger.info(`[API Service] Using pooled configuration for ${config.MODEL_PROVIDER}: ${serviceConfig.uuid}${customNameDisplay}${actualModelName ? ` (model: ${actualModelName})` : ''}`);
|
||||||
} else {
|
} else {
|
||||||
const errorMsg = `[API Service] No healthy provider found in pool for ${config.MODEL_PROVIDER}${requestedModel ? ` supporting model: ${requestedModel}` : ''}`;
|
const errorMsg = `[API Service] No healthy provider found in pool for ${config.MODEL_PROVIDER}${actualModelName ? ` supporting model: ${actualModelName}` : ''}`;
|
||||||
logger.error(errorMsg);
|
logger.error(errorMsg);
|
||||||
throw new Error(errorMsg);
|
throw new Error(errorMsg);
|
||||||
}
|
}
|
||||||
|
} else if (effectiveProvider === MODEL_PROVIDER.AUTO && actualModelName) {
|
||||||
|
// If AUTO mode still could not resolve a concrete provider, throw
|
||||||
|
throw new Error(`[API Service] Auto-routing failed: Model name must include a provider prefix (e.g., 'provider:model'). Received: '${actualModelName}'`);
|
||||||
}
|
}
|
||||||
return getServiceAdapter(serviceConfig);
|
return getServiceAdapter(serviceConfig);
|
||||||
}
|
}
|
||||||
|
|
@ -390,11 +430,20 @@ export async function getApiService(config, requestedModel = null, options = {})
|
||||||
* @returns {Promise<Object>} Object containing service adapter and metadata
|
* @returns {Promise<Object>} Object containing service adapter and metadata
|
||||||
*/
|
*/
|
||||||
export async function getApiServiceWithFallback(config, requestedModel = null, options = {}) {
|
export async function getApiServiceWithFallback(config, requestedModel = null, options = {}) {
|
||||||
|
// 1. Resolve routing up front
|
||||||
|
const { effectiveProvider, actualModelName } = await _resolveEffectiveRouting(config, requestedModel);
|
||||||
|
config.MODEL_PROVIDER = effectiveProvider;
|
||||||
|
|
||||||
|
// Special case for model listing: AUTO with no model name
|
||||||
|
if (effectiveProvider === MODEL_PROVIDER.AUTO && !actualModelName) {
|
||||||
|
return { service: null, serviceConfig: config, actualProviderType: effectiveProvider, isFallback: false, uuid: null, actualModel: null };
|
||||||
|
}
|
||||||
|
|
||||||
let serviceConfig = config;
|
let serviceConfig = config;
|
||||||
let actualProviderType = config.MODEL_PROVIDER;
|
let actualProviderType = config.MODEL_PROVIDER;
|
||||||
let isFallback = false;
|
let isFallback = false;
|
||||||
let selectedUuid = null;
|
let selectedUuid = null;
|
||||||
let actualModel = null;
|
let actualModel = actualModelName;
|
||||||
|
|
||||||
if (providerPoolManager && config.providerPools && config.providerPools[config.MODEL_PROVIDER]) {
|
if (providerPoolManager && config.providerPools && config.providerPools[config.MODEL_PROVIDER]) {
|
||||||
// selectProviderWithFallback is now async and uses a chained lock for concurrency safety
|
// selectProviderWithFallback is now async and uses a chained lock for concurrency safety
|
||||||
|
|
@ -406,13 +455,13 @@ export async function getApiServiceWithFallback(config, requestedModel = null, o
|
||||||
// We need an acquireSlot that supports fallback
|
// We need an acquireSlot that supports fallback
|
||||||
selectedResult = await providerPoolManager.acquireSlotWithFallback(
|
selectedResult = await providerPoolManager.acquireSlotWithFallback(
|
||||||
config.MODEL_PROVIDER,
|
config.MODEL_PROVIDER,
|
||||||
requestedModel,
|
actualModelName,
|
||||||
options
|
options
|
||||||
);
|
);
|
||||||
} else {
|
} else {
|
||||||
selectedResult = await providerPoolManager.selectProviderWithFallback(
|
selectedResult = await providerPoolManager.selectProviderWithFallback(
|
||||||
config.MODEL_PROVIDER,
|
config.MODEL_PROVIDER,
|
||||||
requestedModel,
|
actualModelName,
|
||||||
{ ...options, skipUsageCount: true }
|
{ ...options, skipUsageCount: true }
|
||||||
);
|
);
|
||||||
}
|
}
|
||||||
|
|
@ -427,17 +476,20 @@ export async function getApiServiceWithFallback(config, requestedModel = null, o
|
||||||
actualProviderType = selectedType;
|
actualProviderType = selectedType;
|
||||||
isFallback = fallbackUsed;
|
isFallback = fallbackUsed;
|
||||||
selectedUuid = selectedProviderConfig.uuid;
|
selectedUuid = selectedProviderConfig.uuid;
|
||||||
actualModel = fallbackModel;
|
actualModel = fallbackModel || actualModelName;
|
||||||
|
|
||||||
// If a fallback occurred, MODEL_PROVIDER must be updated
|
// If a fallback occurred, MODEL_PROVIDER must be updated
|
||||||
if (isFallback) {
|
if (isFallback) {
|
||||||
serviceConfig.MODEL_PROVIDER = actualProviderType;
|
serviceConfig.MODEL_PROVIDER = actualProviderType;
|
||||||
}
|
}
|
||||||
} else {
|
} else {
|
||||||
const errorMsg = `[API Service] No healthy provider found in pool (including fallback) for ${config.MODEL_PROVIDER}${requestedModel ? ` supporting model: ${requestedModel}` : ''}`;
|
const errorMsg = `[API Service] No healthy provider found in pool (including fallback) for ${config.MODEL_PROVIDER}${actualModelName ? ` supporting model: ${actualModelName}` : ''}`;
|
||||||
logger.error(errorMsg);
|
logger.error(errorMsg);
|
||||||
throw new Error(errorMsg);
|
throw new Error(errorMsg);
|
||||||
}
|
}
|
||||||
|
} else if (effectiveProvider === MODEL_PROVIDER.AUTO && actualModelName) {
|
||||||
|
// If AUTO mode still could not resolve a concrete provider, throw
|
||||||
|
throw new Error(`[API Service] Auto-routing failed: Model name must include a provider prefix (e.g., 'provider:model'). Received: '${actualModelName}'`);
|
||||||
}
|
}
|
||||||
|
|
||||||
const service = getServiceAdapter(serviceConfig);
|
const service = getServiceAdapter(serviceConfig);
|
||||||
|
|
|
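Review note: the prefix rule implemented by `_resolveEffectiveRouting` can be exercised in isolation. The sketch below is an assumption-laden stand-in, not the project's code: `resolveRouting` is a hypothetical name, and the `knownProviders` set replaces the lookup against `providerPoolManager.providerStatus` and `config.providerPools`. The model name `some-model:beta` is a placeholder.

```javascript
// Hypothetical standalone version of the prefix-routing rule from this commit.
// knownProviders stands in for providerPoolManager.providerStatus / config.providerPools.
function resolveRouting(defaultProvider, requestedModel, knownProviders) {
    let effectiveProvider = defaultProvider;
    let actualModelName = requestedModel;

    if (requestedModel && requestedModel.includes(':')) {
        // Only the first segment is treated as the prefix; the remaining
        // segments are rejoined so model names containing ':' stay intact.
        const [prefix, ...modelParts] = requestedModel.split(':');
        if (knownProviders.has(prefix)) {
            effectiveProvider = prefix;
            actualModelName = modelParts.join(':');
        }
    }

    // In auto mode, a model name without a recognized prefix cannot be routed.
    // A missing model name is allowed: that is the model-listing case.
    if (effectiveProvider === 'auto' && requestedModel) {
        throw new Error(`Auto-routing failed: expected 'provider:model', received '${requestedModel}'`);
    }
    return { effectiveProvider, actualModelName };
}

// 'grok-custom' and 'forward-api' are MODEL_PROVIDER values visible in this commit.
const providers = new Set(['grok-custom', 'forward-api']);
console.log(resolveRouting('auto', 'grok-custom:some-model:beta', providers));
```

An unrecognized prefix leaves the model name untouched, so outside auto mode a name such as `foo:bar` is still passed to the configured provider unchanged.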
||||||
|
|
@ -55,7 +55,6 @@ export const MODEL_PROTOCOL_PREFIX = {
|
||||||
OPENAI: 'openai',
|
OPENAI: 'openai',
|
||||||
OPENAI_RESPONSES: 'openaiResponses',
|
OPENAI_RESPONSES: 'openaiResponses',
|
||||||
CLAUDE: 'claude',
|
CLAUDE: 'claude',
|
||||||
OLLAMA: 'ollama',
|
|
||||||
CODEX: 'codex',
|
CODEX: 'codex',
|
||||||
FORWARD: 'forward',
|
FORWARD: 'forward',
|
||||||
GROK: 'grok',
|
GROK: 'grok',
|
||||||
|
|
@ -74,6 +73,7 @@ export const MODEL_PROVIDER = {
|
||||||
CODEX_API: 'openai-codex-oauth',
|
CODEX_API: 'openai-codex-oauth',
|
||||||
FORWARD_API: 'forward-api',
|
FORWARD_API: 'forward-api',
|
||||||
GROK_CUSTOM: 'grok-custom',
|
GROK_CUSTOM: 'grok-custom',
|
||||||
|
AUTO: 'auto',
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
|
|
@ -795,50 +795,64 @@ export async function handleUnaryRequest(res, service, model, requestBody, fromP
|
||||||
* and sends the JSON response.
|
* and sends the JSON response.
|
||||||
* @param {http.IncomingMessage} req The HTTP request object.
|
* @param {http.IncomingMessage} req The HTTP request object.
|
||||||
* @param {http.ServerResponse} res The HTTP response object.
|
* @param {http.ServerResponse} res The HTTP response object.
|
||||||
|
* @param {Object} service - The API service instance.
|
||||||
* @param {string} endpointType The type of endpoint being called (e.g., OPENAI_MODEL_LIST).
|
* @param {string} endpointType The type of endpoint being called (e.g., OPENAI_MODEL_LIST).
|
||||||
* @param {Object} CONFIG - The server configuration object.
|
* @param {Object} CONFIG - The server configuration object.
|
||||||
|
* @param {Object} providerPoolManager - The provider pool manager instance.
|
||||||
|
* @param {string} pooluuid - The selected provider UUID.
|
||||||
*/
|
*/
|
||||||
export async function handleModelListRequest(req, res, service, endpointType, CONFIG, providerPoolManager, pooluuid) {
|
export async function handleModelListRequest(req, res, service, endpointType, CONFIG, providerPoolManager, pooluuid) {
|
||||||
try{
|
try {
|
||||||
const clientProviderMap = {
|
const clientProviderMap = {
|
||||||
[ENDPOINT_TYPE.OPENAI_MODEL_LIST]: MODEL_PROTOCOL_PREFIX.OPENAI,
|
[ENDPOINT_TYPE.OPENAI_MODEL_LIST]: MODEL_PROTOCOL_PREFIX.OPENAI,
|
||||||
[ENDPOINT_TYPE.GEMINI_MODEL_LIST]: MODEL_PROTOCOL_PREFIX.GEMINI,
|
[ENDPOINT_TYPE.GEMINI_MODEL_LIST]: MODEL_PROTOCOL_PREFIX.GEMINI,
|
||||||
};
|
};
|
||||||
|
|
||||||
|
|
||||||
const fromProvider = clientProviderMap[endpointType];
|
const fromProvider = clientProviderMap[endpointType];
|
||||||
const toProvider = CONFIG.MODEL_PROVIDER;
|
|
||||||
|
|
||||||
if (!fromProvider) {
|
if (!fromProvider) {
|
||||||
throw new Error(`Unsupported endpoint type for model list: ${endpointType}`);
|
throw new Error(`Unsupported endpoint type for model list: ${endpointType}`);
|
||||||
}
|
}
|
||||||
|
|
||||||
// 1. Get the model list in the backend's native format.
|
let clientModelList;
|
||||||
const nativeModelList = await service.listModels();
|
|
||||||
|
// --- Core logic: model aggregation in auto routing mode ---
|
||||||
// 2. Convert the model list to the client's expected format, if necessary.
|
if (CONFIG.MODEL_PROVIDER === MODEL_PROVIDER.AUTO && providerPoolManager) {
|
||||||
let clientModelList = nativeModelList;
|
logger.info(`[ModelList] Aggregating models for 'auto' mode...`);
|
||||||
if (!getProtocolPrefix(toProvider).includes(getProtocolPrefix(fromProvider))) {
|
clientModelList = await providerPoolManager.getAllAvailableModels(endpointType);
|
||||||
logger.info(`[ModelList Convert] Converting model list from ${toProvider} to ${fromProvider}`);
|
|
||||||
clientModelList = convertData(nativeModelList, 'modelList', toProvider, fromProvider);
|
|
||||||
} else {
|
} else {
|
||||||
logger.info(`[ModelList Convert] Model list format matches. No conversion needed.`);
|
// --- Original single-provider logic ---
|
||||||
|
const toProvider = CONFIG.MODEL_PROVIDER;
|
||||||
|
|
||||||
|
// 1. Get the model list in the backend's native format.
|
||||||
|
const nativeModelList = await service.listModels();
|
||||||
|
|
||||||
|
// 2. Convert the model list to the client's expected format, if necessary.
|
||||||
|
clientModelList = nativeModelList;
|
||||||
|
if (!getProtocolPrefix(toProvider).includes(getProtocolPrefix(fromProvider))) {
|
||||||
|
logger.info(`[ModelList Convert] Converting model list from ${toProvider} to ${fromProvider}`);
|
||||||
|
clientModelList = convertData(nativeModelList, 'modelList', toProvider, fromProvider);
|
||||||
|
} else {
|
||||||
|
logger.info(`[ModelList Convert] Model list format matches. No conversion needed.`);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
logger.info(`[ModelList Response] Sending model list to client: ${JSON.stringify(clientModelList)}`);
|
// logger.info(`[ModelList Response] Sending model list to client: ${JSON.stringify(clientModelList)}`);
|
||||||
res.writeHead(200, { 'Content-Type': 'application/json' });
|
res.writeHead(200, { 'Content-Type': 'application/json' });
|
||||||
res.end(JSON.stringify(clientModelList));
|
res.end(JSON.stringify(clientModelList));
|
||||||
} catch (error) {
|
} catch (error) {
|
||||||
logger.error('\n[Server] Error during model list processing:', error.stack);
|
logger.error('\n[Server] Error during model list processing:', error.stack);
|
||||||
if (providerPoolManager) {
|
// if (providerPoolManager && pooluuid && CONFIG.MODEL_PROVIDER !== MODEL_PROVIDER.AUTO) {
|
||||||
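// In pool mode, if request handling fails, mark the provider currently in use as unhealthy
|
// // In pool mode (and not auto mode), if request handling fails, mark the provider currently in use as unhealthy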
|
||||||
providerPoolManager.markProviderUnhealthy(toProvider, {
|
// providerPoolManager.markProviderUnhealthy(CONFIG.MODEL_PROVIDER, {
|
||||||
uuid: pooluuid
|
// uuid: pooluuid
|
||||||
}, error.message);
|
// }, error.message);
|
||||||
}
|
// }
|
||||||
|
handleError(res, error, CONFIG.MODEL_PROVIDER);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Handles requests for content generation (both unary and streaming). This function
|
* Handles requests for content generation (both unary and streaming). This function
|
||||||
* orchestrates request body parsing, conversion to the internal Gemini format,
|
* orchestrates request body parsing, conversion to the internal Gemini format,
|
||||||
|
|
@ -884,7 +898,7 @@ export async function handleContentGenerationRequest(req, res, service, endpoint
|
||||||
|
|
||||||
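// 2.5. If a provider pool is in use, reselect the provider based on the model (fallback supported)
|
// 2.5. If a provider pool is in use, reselect the provider based on the model (fallback supported)
|
||||||
// Note: acquireSlot: true is enabled here, which takes a concurrency slot or joins the queue
|
// Note: acquireSlot: true is enabled here, which takes a concurrency slot or joins the queue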
|
||||||
if (providerPoolManager && CONFIG.providerPools && CONFIG.providerPools[CONFIG.MODEL_PROVIDER]) {
|
if (providerPoolManager && (CONFIG.MODEL_PROVIDER === MODEL_PROVIDER.AUTO || (CONFIG.providerPools && CONFIG.providerPools[CONFIG.MODEL_PROVIDER]))) {
|
||||||
const { getApiServiceWithFallback } = await import('../services/service-manager.js');
|
const { getApiServiceWithFallback } = await import('../services/service-manager.js');
|
||||||
const result = await getApiServiceWithFallback(CONFIG, model, { acquireSlot: true });
|
const result = await getApiServiceWithFallback(CONFIG, model, { acquireSlot: true });
|
||||||
|
|
||||||
|
|
@ -1212,35 +1226,6 @@ function _getProviderSpecificSuggestions(statusCode, provider) {
|
||||||
]
|
]
|
||||||
};
|
};
|
||||||
|
|
||||||
case MODEL_PROTOCOL_PREFIX.OLLAMA:
|
|
||||||
return {
|
|
||||||
auth: [
|
|
||||||
'Ollama typically does not require authentication',
|
|
||||||
'If using a custom setup, verify your credentials',
|
|
||||||
'Check if the Ollama server requires authentication'
|
|
||||||
],
|
|
||||||
permission: [
|
|
||||||
'Verify the Ollama server is accessible',
|
|
||||||
'Check if the requested model is available locally',
|
|
||||||
'Ensure the Ollama server allows the requested operation'
|
|
||||||
],
|
|
||||||
rateLimit: [
|
|
||||||
'The local Ollama server may be overloaded',
|
|
||||||
'Try reducing concurrent requests',
|
|
||||||
'Consider increasing server resources if running locally'
|
|
||||||
],
|
|
||||||
serverError: [
|
|
||||||
'Check if the Ollama server is running',
|
|
||||||
'Verify the server address and port are correct',
|
|
||||||
'Check Ollama server logs for detailed error information'
|
|
||||||
],
|
|
||||||
clientError: [
|
|
||||||
'Check your request format and parameters',
|
|
||||||
'Verify the model name is available in your Ollama installation',
|
|
||||||
'Try pulling the model first with: ollama pull <model-name>'
|
|
||||||
]
|
|
||||||
};
|
|
||||||
|
|
||||||
default:
|
default:
|
||||||
return defaultSuggestions;
|
return defaultSuggestions;
|
||||||
}
|
}
|
||||||
|
|
|
||||||
170
src/utils/token-utils.js
Normal file
|
|
@ -0,0 +1,170 @@
import { countTokens } from '@anthropic-ai/tokenizer';
import logger from './logger.js';

/**
 * Extract text content from message format
 */
export function getContentText(message) {
    if (message == null) {
        return "";
    }
    if (Array.isArray(message)) {
        return message.map(part => {
            if (typeof part === 'string') return part;
            if (part && typeof part === 'object') {
                if (part.type === 'text' && part.text) return part.text;
                if (part.text) return part.text;
            }
            return '';
        }).join('');
    } else if (typeof message.content === 'string') {
        return message.content;
    } else if (Array.isArray(message.content)) {
        return message.content.map(part => {
            if (typeof part === 'string') return part;
            if (part && typeof part === 'object') {
                if (part.type === 'text' && part.text) return part.text;
                if (part.text) return part.text;
            }
            return '';
        }).join('');
    }
    return String(message.content || message);
}

/**
 * Process content blocks into text
 * @param {any} content - content object or array
 * @returns {string} processed text
 */
export function processContent(content) {
    if (!content) return "";
    if (typeof content === 'string') return content;
    if (Array.isArray(content)) {
        return content.map(part => {
            if (typeof part === 'string') return part;
            if (part && typeof part === 'object') {
                if (part.type === 'text') return part.text || "";
                if (part.type === 'thinking') return part.thinking || part.text || "";
                if (part.type === 'tool_result') return processContent(part.content);
                if (part.type === 'tool_use' && part.input) return JSON.stringify(part.input);
                if (part.text) return part.text;
            }
            return "";
        }).join("");
    }
    return getContentText(content);
}

/**
 * Count tokens for a given text using Claude's official tokenizer
 */
export function countTextTokens(text) {
    if (!text) return 0;
    try {
        return countTokens(text);
    } catch (error) {
        // Fallback to estimation if tokenizer fails
        logger.warn('[TokenUtils] Tokenizer error, falling back to estimation:', error.message);
        return Math.ceil((text || '').length / 4);
    }
}

/**
 * Calculate input tokens from request body using Claude's official tokenizer
 */
export function estimateInputTokens(requestBody) {
    let allText = "";

    // Count system prompt tokens
    if (requestBody.system) {
        allText += processContent(requestBody.system);
    }

    // Count thinking prefix tokens if thinking is enabled
    if (requestBody.thinking?.type && typeof requestBody.thinking.type === 'string') {
        const t = requestBody.thinking.type.toLowerCase().trim();
        if (t === 'enabled') {
            const budgetTokens = requestBody.thinking.budget_tokens;
            let budget = Number(budgetTokens);
            if (!Number.isFinite(budget) || budget <= 0) {
                budget = 20000;
            }
            budget = Math.floor(budget);
            if (budget < 1024) budget = 1024;
            budget = Math.min(budget, 24576);
            allText += `<thinking_mode>enabled</thinking_mode><max_thinking_length>${budget}</max_thinking_length>`;
        }
        else if (t === 'adaptive') {
            const effortRaw = typeof requestBody.thinking.effort === 'string' ? requestBody.thinking.effort : '';
            const effort = effortRaw.toLowerCase().trim();
            const normalizedEffort = (effort === 'low' || effort === 'medium' || effort === 'high') ? effort : 'high';
            allText += `<thinking_mode>adaptive</thinking_mode><thinking_effort>${normalizedEffort}</thinking_effort>`;
        }
    }

    // Count all messages tokens
    if (requestBody.messages && Array.isArray(requestBody.messages)) {
        for (const message of requestBody.messages) {
            if (message.content) {
                allText += processContent(message.content);
            }
        }
    }

    // Count tools definitions tokens if present
    if (requestBody.tools && Array.isArray(requestBody.tools)) {
        allText += JSON.stringify(requestBody.tools);
    }

    return countTextTokens(allText);
}
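
The thinking-budget handling in `estimateInputTokens` is easier to review in isolation. The sketch below is not part of the commit: `normalizeThinkingBudget` is a hypothetical helper name, introduced only to show the rule the function applies inline. A missing, non-numeric, or non-positive `budget_tokens` falls back to 20000, the value is floored, and the result is clamped to the range 1024 to 24576.

```javascript
// Hypothetical helper (not in the commit): mirrors the inline budget
// handling of estimateInputTokens so the clamping rule can be checked alone.
function normalizeThinkingBudget(budgetTokens) {
    let budget = Number(budgetTokens);
    // Missing, non-numeric, zero, or negative values use the default.
    if (!Number.isFinite(budget) || budget <= 0) {
        budget = 20000;
    }
    budget = Math.floor(budget);
    // Clamp to the [1024, 24576] range used by the diff.
    if (budget < 1024) budget = 1024;
    return Math.min(budget, 24576);
}

console.log(normalizeThinkingBudget(undefined)); // 20000
console.log(normalizeThinkingBudget(500));       // 1024
console.log(normalizeThinkingBudget('2048.7'));  // 2048
console.log(normalizeThinkingBudget(100000));    // 24576
```

Note that the default of 20000 is applied before clamping, so an invalid budget yields 20000 rather than either bound.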

/**
 * Count tokens for a message request (compatible with Anthropic API)
 * @param {Object} requestBody - The request body containing model, messages, system, tools, etc.
 * @returns {Object} { input_tokens: number }
 */
export function countTokensAnthropic(requestBody) {
    let allText = "";
    let extraTokens = 0;

    // Count system prompt tokens
    if (requestBody.system) {
        allText += processContent(requestBody.system);
    }

    // Count all messages tokens
    if (requestBody.messages && Array.isArray(requestBody.messages)) {
        for (const message of requestBody.messages) {
            if (message.content) {
                if (Array.isArray(message.content)) {
                    for (const block of message.content) {
                        if (block.type === 'image') {
                            // Images have a fixed token cost (approximately 1600 tokens for a typical image)
                            extraTokens += 1600;
                        } else if (block.type === 'document') {
                            // Documents - estimate based on content if available
                            if (block.source?.data) {
                                // For base64 encoded documents, estimate tokens
                                const estimatedChars = block.source.data.length * 0.75; // base64 to bytes ratio
                                extraTokens += Math.ceil(estimatedChars / 4);
                            }
                        } else {
                            allText += processContent([block]);
                        }
                    }
                } else {
                    allText += processContent(message.content);
                }
            }
        }
    }

    // Count tools definitions tokens if present
    if (requestBody.tools && Array.isArray(requestBody.tools)) {
        allText += JSON.stringify(requestBody.tools);
    }

    return { input_tokens: countTextTokens(allText) + extraTokens };
}
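
For reviewers checking the non-text accounting in `countTokensAnthropic`, the following sketch isolates it. It is an illustration under assumptions, not code from the commit: `estimateExtraTokens` is a hypothetical name, and the sample blocks are made up. It applies the same two heuristics as the diff, a flat 1600 tokens per image and, for documents with inline base64 data, `ceil(length * 0.75 / 4)`.

```javascript
// Hypothetical helper (not in the commit): reproduces only the image and
// document heuristics of countTokensAnthropic, without the tokenizer.
function estimateExtraTokens(blocks) {
    let extraTokens = 0;
    for (const block of blocks) {
        if (block.type === 'image') {
            // Flat per-image cost used by the diff.
            extraTokens += 1600;
        } else if (block.type === 'document' && block.source?.data) {
            // base64 length * 0.75 approximates the decoded byte count;
            // dividing by 4 is the same rough chars-per-token estimate
            // used by the tokenizer fallback.
            const estimatedBytes = block.source.data.length * 0.75;
            extraTokens += Math.ceil(estimatedBytes / 4);
        }
        // Any other block type is text and would go through the tokenizer.
    }
    return extraTokens;
}

// Made-up sample content: one text block, one image, one 4000-character document.
const sampleBlocks = [
    { type: 'text', text: 'Summarize the attachment.' },
    { type: 'image', source: { type: 'base64', data: 'AAAA' } },
    { type: 'document', source: { type: 'base64', data: 'A'.repeat(4000) } },
];

console.log(estimateExtraTokens(sampleBlocks)); // 1600 + ceil(4000 * 0.75 / 4) = 2350
```

Both figures are estimates: the image cost does not depend on the image size, and the document estimate measures encoded bytes rather than extracted text, so the reported `input_tokens` should be treated as approximate for multimodal requests.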
|
||||||
|
|
@ -500,7 +500,7 @@ const translations = {
|
||||||
'modal.provider.field.codexBaseUrl': 'Codex Base URL',
|
'modal.provider.field.codexBaseUrl': 'Codex Base URL',
|
||||||
'modal.provider.field.apiKey': 'API 密钥',
|
'modal.provider.field.apiKey': 'API 密钥',
|
||||||
'modal.provider.field.apiKey.placeholder': '请输入 API 密钥',
|
'modal.provider.field.apiKey.placeholder': '请输入 API 密钥',
|
||||||
'modal.provider.field.projectId.placeholder': 'Google Cloud 项目 ID',
|
'modal.provider.field.projectId.placeholder': 'Google Cloud 项目 ID (留空自动发现)',
|
||||||
'modal.provider.field.projectId.optional.placeholder': 'Google Cloud 项目 ID (留空自动发现)',
|
'modal.provider.field.projectId.optional.placeholder': 'Google Cloud 项目 ID (留空自动发现)',
|
||||||
'modal.provider.field.oauthPath.gemini.placeholder': '例如: ~/.gemini/oauth_creds.json',
|
'modal.provider.field.oauthPath.gemini.placeholder': '例如: ~/.gemini/oauth_creds.json',
|
||||||
'modal.provider.field.oauthPath.kiro.placeholder': '例如: ~/.aws/sso/cache/kiro-auth-token.json',
|
'modal.provider.field.oauthPath.kiro.placeholder': '例如: ~/.aws/sso/cache/kiro-auth-token.json',
|
||||||
|
|
@ -659,10 +659,6 @@ const translations = {
|
||||||
'guide.client.cline.step3': '设置 API Base URL 为: http://localhost:3000/{provider}/v1',
|
'guide.client.cline.step3': '设置 API Base URL 为: http://localhost:3000/{provider}/v1',
|
||||||
'guide.client.cline.step4': '填入 API Key 和模型名称',
|
'guide.client.cline.step4': '填入 API Key 和模型名称',
|
||||||
'guide.client.note': '提示:将 {provider} 替换为实际的提供商路径,如 gemini-cli-oauth、claude-kiro-oauth 等。可在仪表盘的路由示例中查看完整路径。',
|
'guide.client.note': '提示:将 {provider} 替换为实际的提供商路径,如 gemini-cli-oauth、claude-kiro-oauth 等。可在仪表盘的路由示例中查看完整路径。',
|
||||||
'guide.ollama.title': 'Ollama 协议使用',
|
|
||||||
'guide.ollama.desc': '本项目支持 Ollama 协议,可以通过统一接口访问所有支持的模型。',
|
|
||||||
'guide.ollama.listModels': '列出所有可用模型',
|
|
||||||
'guide.ollama.chat': '聊天接口',
|
|
||||||
'guide.faq.title': '常见问题',
|
'guide.faq.title': '常见问题',
|
||||||
'guide.faq.q1': 'Q: 请求返回 404 错误怎么办?',
|
'guide.faq.q1': 'Q: 请求返回 404 错误怎么办?',
|
||||||
'guide.faq.a1': 'A: 检查接口路径是否正确。某些客户端会自动在 Base URL 后追加路径,导致路径重复。请查看控制台中的实际请求 URL,移除多余的路径部分。',
|
'guide.faq.a1': 'A: 检查接口路径是否正确。某些客户端会自动在 Base URL 后追加路径,导致路径重复。请查看控制台中的实际请求 URL,移除多余的路径部分。',
|
||||||
|
|
@ -1314,7 +1310,7 @@ const translations = {
|
||||||
'modal.provider.field.codexBaseUrl': 'Codex Base URL',
|
'modal.provider.field.codexBaseUrl': 'Codex Base URL',
|
||||||
'modal.provider.field.apiKey': 'API Key',
|
'modal.provider.field.apiKey': 'API Key',
|
||||||
'modal.provider.field.apiKey.placeholder': 'Please enter API Key',
|
'modal.provider.field.apiKey.placeholder': 'Please enter API Key',
|
||||||
'modal.provider.field.projectId.placeholder': 'Google Cloud Project ID',
|
'modal.provider.field.projectId.placeholder': 'Google Cloud Project ID (Leave blank for discovery)',
|
||||||
'modal.provider.field.projectId.optional.placeholder': 'Google Cloud Project ID (Leave blank for discovery)',
|
'modal.provider.field.projectId.optional.placeholder': 'Google Cloud Project ID (Leave blank for discovery)',
|
||||||
'modal.provider.field.oauthPath.gemini.placeholder': 'e.g.: ~/.gemini/oauth_creds.json',
|
'modal.provider.field.oauthPath.gemini.placeholder': 'e.g.: ~/.gemini/oauth_creds.json',
|
||||||
'modal.provider.field.oauthPath.kiro.placeholder': 'e.g.: ~/.aws/sso/cache/kiro-auth-token.json',
|
'modal.provider.field.oauthPath.kiro.placeholder': 'e.g.: ~/.aws/sso/cache/kiro-auth-token.json',
|
||||||
|
|
@ -1473,10 +1469,6 @@ const translations = {
|
||||||
'guide.client.cline.step3': 'Set API Base URL to: http://localhost:3000/{provider}/v1',
|
'guide.client.cline.step3': 'Set API Base URL to: http://localhost:3000/{provider}/v1',
|
||||||
'guide.client.cline.step4': 'Enter API Key and model name',
|
'guide.client.cline.step4': 'Enter API Key and model name',
|
||||||
'guide.client.note': 'Tip: Replace {provider} with the actual provider path, such as gemini-cli-oauth, claude-kiro-oauth, etc. See the routing examples on the dashboard for full paths.',
|
'guide.client.note': 'Tip: Replace {provider} with the actual provider path, such as gemini-cli-oauth, claude-kiro-oauth, etc. See the routing examples on the dashboard for full paths.',
|
||||||
'guide.ollama.title': 'Ollama Protocol Usage',
|
|
||||||
'guide.ollama.desc': 'This project supports Ollama protocol, allowing unified access to all supported models.',
|
|
||||||
'guide.ollama.listModels': 'List all available models',
|
|
||||||
'guide.ollama.chat': 'Chat interface',
|
|
||||||
'guide.faq.title': 'FAQ',
|
'guide.faq.title': 'FAQ',
|
||||||
'guide.faq.q1': 'Q: What to do if request returns 404 error?',
|
'guide.faq.q1': 'Q: What to do if request returns 404 error?',
|
||||||
'guide.faq.a1': 'A: Check if the API path is correct. Some clients automatically append paths to Base URL, causing duplication. Check the actual request URL in the console and remove redundant path parts.',
|
'guide.faq.a1': 'A: Check if the API path is correct. Some clients automatically append paths to Base URL, causing duplication. Check the actual request URL in the console and remove redundant path parts.',
|
||||||
|
|
|
||||||
|
|
@ -375,13 +375,13 @@ function getProviderTypeFields(providerType) {
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
id: 'GROK_CF_CLEARANCE',
|
id: 'GROK_CF_CLEARANCE',
|
||||||
label: t('modal.provider.field.cfClearance'),
|
label: `${t('modal.provider.field.cfClearance')} <span class="optional-tag">${t('config.optional')}</span>`,
|
||||||
type: 'text',
|
type: 'text',
|
||||||
placeholder: 'cf_clearance cookie value'
|
placeholder: 'cf_clearance cookie value'
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
id: 'GROK_USER_AGENT',
|
id: 'GROK_USER_AGENT',
|
||||||
label: t('modal.provider.field.userAgent'),
|
label: `${t('modal.provider.field.userAgent')} <span class="optional-tag">${t('config.optional')}</span>`,
|
||||||
type: 'text',
|
type: 'text',
|
||||||
placeholder: 'Mozilla/5.0 ...'
|
placeholder: 'Mozilla/5.0 ...'
|
||||||
},
|
},
|
||||||
|
|
|
||||||
|
|
@ -166,31 +166,6 @@
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
<!-- Ollama protocol usage -->
|
|
||||||
<div class="guide-panel">
|
|
||||||
<h3><i class="fas fa-llama"></i> <span data-i18n="guide.ollama.title">Ollama 协议使用</span></h3>
|
|
||||||
<div class="guide-content">
|
|
||||||
<p data-i18n="guide.ollama.desc">本项目支持 Ollama 协议,可以通过统一接口访问所有支持的模型。</p>
|
|
||||||
|
|
||||||
<div class="api-example">
|
|
||||||
<h4 data-i18n="guide.ollama.listModels">列出所有可用模型</h4>
|
|
||||||
<pre><code>curl http://localhost:3000/ollama/api/tags \
|
|
||||||
-H "Authorization: Bearer YOUR_API_KEY"</code></pre>
|
|
||||||
</div>
|
|
||||||
|
|
||||||
<div class="api-example">
|
|
||||||
<h4 data-i18n="guide.ollama.chat">聊天接口</h4>
|
|
||||||
<pre><code>curl http://localhost:3000/ollama/api/chat \
|
|
||||||
-H "Content-Type: application/json" \
|
|
||||||
-H "Authorization: Bearer YOUR_API_KEY" \
|
|
||||||
-d '{
|
|
||||||
"model": "[Claude] claude-sonnet-4.5",
|
|
||||||
"messages": [{"role": "user", "content": "你好"}]
|
|
||||||
}'</code></pre>
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
|
|
||||||
<!-- FAQ -->
|
<!-- FAQ -->
|
||||||
<div class="guide-panel">
|
<div class="guide-panel">
|
||||||
<h3><i class="fas fa-question-circle"></i> <span data-i18n="guide.faq.title">常见问题</span></h3>
|
<h3><i class="fas fa-question-circle"></i> <span data-i18n="guide.faq.title">常见问题</span></h3>
|
||||||
|
|
|
||||||