diff --git a/docs/research/codex-app-server-model-catalog-plan.md b/docs/research/codex-app-server-model-catalog-plan.md new file mode 100644 index 00000000..84ad8210 --- /dev/null +++ b/docs/research/codex-app-server-model-catalog-plan.md @@ -0,0 +1,2372 @@ +# Codex App-Server Model Catalog Plan + +**Date**: 2026-04-21 +**Status**: implementation complete in feature worktrees, pending final review/commit +**Worktree**: `/Users/belief/dev/projects/claude/claude_team_codex_model_catalog_plan` +**Branch**: `spike/codex-model-catalog-plan` +**Primary repo**: `claude_team` +**Secondary repo worktree**: `/Users/belief/dev/projects/claude/agent_teams_orchestrator_codex_native_spike` +**Architecture reference**: [FEATURE_ARCHITECTURE_STANDARD.md](../FEATURE_ARCHITECTURE_STANDARD.md) + +## Executive Summary + +Codex model selection should move from hardcoded local lists to the official Codex app-server `model/list` catalog. + +Chosen implementation: + +- Add a dedicated `src/features/codex-model-catalog` feature in `claude_team`. +- Use `codex app-server` JSON-RPC `model/list` as the primary source for Codex models. +- Keep the existing static Codex catalog only as a bounded fallback when app-server is unavailable. +- Add rich, additive model metadata to `CliProviderStatus` while keeping `models: string[]` for backwards compatibility. +- Use per-model `supportedReasoningEfforts` and `defaultReasoningEffort` for the Codex model picker and launch validation. +- Keep Anthropic and Gemini behavior unchanged by default. +- Update `agent_teams_orchestrator` so Codex launches pass reasoning effort through Codex config key `model_reasoning_effort`, not through an invented `--effort` flag. + +Decision score: + +- `🎯 9 🛡️ 9 🧠 6` +- estimated implementation size: `1200-2400` lines across `claude_team` and `agent_teams_orchestrator` + +Why this is the safest path: + +- It follows the real Codex client contract instead of chasing static releases. 
+- It solves future model releases like `gpt-5.5` without an app release, as long as Codex app-server already exposes the model. +- It avoids breaking Anthropic by making the new catalog contract additive and provider-scoped. +- It handles `xhigh` correctly as Codex-specific reasoning effort, not as Anthropic `max`. + +Current implementation state: + +- `claude_team` has the dedicated Codex model catalog feature, app-server JSON-RPC client, static fallback, provider status integration, Codex model picker integration, provider-aware effort UI, launch validation, launch identity persistence, and targeted tests. +- `agent_teams_orchestrator_codex_native_spike` exposes runtime capabilities for dynamic Codex models and Codex reasoning config pass-through, and its Codex native exec runner passes effort through `-c model_reasoning_effort="value"`. +- Anthropic remains isolated from Codex-only effort values. Anthropic launch UI still uses `low | medium | high`; Codex can use per-model `minimal | low | medium | high | xhigh` only where catalog/runtime policy allows it. +- Future Codex app-server models can appear immediately in UI. Launch is allowed only when the local runtime declares dynamic Codex model support; otherwise they remain visible with upgrade/policy copy instead of failing late during spawn. +- `Default` Codex selection is resolved to a concrete model immediately before provisioning and stored as additive launch identity metadata. +- The remaining work before merge is review/signoff, not more architecture discovery. + +## Sources And Verification + +Official sources checked: + +- [Codex App Server](https://developers.openai.com/codex/app-server) +- [Codex CLI command line options](https://developers.openai.com/codex/cli/reference) +- [Codex Configuration Reference](https://developers.openai.com/codex/config-reference) + +Important official facts: + +- `model/list` is explicitly intended for rendering model and personality selectors. 
+- `model/list` returns `id`, `model`, `displayName`, `hidden`, `defaultReasoningEffort`, `supportedReasoningEfforts`, `inputModalities`, `supportsPersonality`, `isDefault`, `upgrade`, and `upgradeInfo`. +- `includeHidden: false` returns picker-visible models by default. +- `codex exec` has `--model` and `-c, --config key=value`. +- `codex exec` does not expose a first-class `--effort` flag. +- Codex config key `model_reasoning_effort` supports `minimal | low | medium | high | xhigh`. +- `xhigh` is model-dependent. +- `config/read` exists in app-server and returns effective configuration after configuration layering. +- Codex loads user config from `~/.codex/config.toml` and can also load project-scoped `.codex/config.toml` only for trusted projects. +- `model_catalog_json` can override the model catalog, including profile-level overrides. +- `codex exec` supports `--cd` and `--profile`, and `-c key=value` overrides take precedence for one invocation. + +Local probe: + +- binary: `codex-cli 0.117.0` +- method: `codex app-server` over JSON-RPC stdio +- transport: newline-delimited JSON-RPC over stdio, not `Content-Length` framing +- request: `model/list` with `{ "limit": 20, "includeHidden": false }` +- result: 8 visible models, `gpt-5.4` marked default, `nextCursor: null` +- visible models returned: `gpt-5.4`, `gpt-5.2-codex`, `gpt-5.1-codex-max`, `gpt-5.4-mini`, `gpt-5.3-codex`, `gpt-5.3-codex-spark`, `gpt-5.2`, `gpt-5.1-codex-mini` +- `xhigh` is already returned for most models. +- `gpt-5.1-codex-mini` only returned `medium | high`, so effort options must be per-model. +- `gpt-5.3-codex-spark` returned default effort `high`, so default effort must not be global. +- `codex exec --help` locally confirms `--cd`, `--profile`, `--model`, `--oss`, `--local-provider`, and repeatable `-c key=value`. +- local help confirms `--oss` is equivalent to `-c model_provider=oss`, so provider scope can differ from subscription-backed OpenAI Codex if not guarded. 
+- live `config/read` probe returned `{ config, origins }`. +- live `config/read` probe requires `params` object; missing `params` returns JSON-RPC error `-32600`. +- live `config/read` probe accepted `{ cwd }` and `{ profile }` without error, so the implementation should feature-detect and test scoped reads instead of assuming only global config. +- final live smoke on this worktree confirmed `model/list` returns 8 visible models, default `gpt-5.4`, `xhigh` for most models, and `medium | high` only for `gpt-5.1-codex-mini`. + +Combined app-server session probe: + +- one initialized app-server process successfully handled `account/read`, `account/rateLimits/read`, and `model/list` sequentially +- `account/read` returned a ChatGPT account shape in the local environment +- `account/rateLimits/read` returned `primary.windowDurationMins = 300` and `secondary.windowDurationMins = 10080` +- `model/list` returned the same 8 visible models in that same session +- conclusion: provider refresh should use a combined control-plane session when it needs account, limits, and catalog truth + +## Lowest-Confidence Areas And Decisions + +### 1. Auth-scoped catalog truth + +`🎯 7 🛡️ 9 🧠 6` +Estimated implementation impact: `180-350` lines + +Uncertainty: + +- app-server `model/list` may return different catalogs depending on active Codex auth state, account plan, org policy, API-key mode, or future Codex rollout flags. +- The local probe only proves one logged-in environment, not all account modes. 
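The local probe referenced above spoke newline-delimited JSON over stdio rather than `Content-Length` framing. A minimal sketch of that framing follows. The helper names `encodeRequest` and `decodeLines` are hypothetical, and the envelope shape (only `id`, `method`, `params`) is an assumption that should be verified against the installed Codex binary.

```typescript
// Hypothetical helpers illustrating newline-delimited framing; not the project's real client.
interface RpcRequest {
  id: number;
  method: string;
  params: Record<string, unknown>;
}

// One request per line: serialize, then terminate with "\n" (no Content-Length header).
function encodeRequest(
  id: number,
  method: string,
  params: Record<string, unknown> = {},
): string {
  const request: RpcRequest = { id, method, params };
  return JSON.stringify(request) + "\n";
}

// Split a stdout buffer into complete messages and keep any trailing
// partial line so it can be prepended to the next chunk.
function decodeLines(buffer: string): { messages: unknown[]; rest: string } {
  const parts = buffer.split("\n");
  const rest = parts.pop() ?? "";
  const messages = parts
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as unknown);
  return { messages, rest };
}

// The same request parameters the local probe used.
const modelListLine = encodeRequest(2, "model/list", { limit: 20, includeHidden: false });
```

Keeping the partial-line remainder explicit matters because stdout chunks are not guaranteed to end on a message boundary.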
+ +Decision: + +- treat Codex model catalog as auth-scoped, not global +- cache key must include binary path, Codex home, preferred auth mode, effective auth mode, managed account stable identity when available, and API-key availability source +- never reuse a ChatGPT-account catalog as API-key-mode catalog +- never reuse an API-key-mode catalog as ChatGPT-account catalog +- when auth mode changes, keep previous catalog visible only as stale UI while refresh is in flight, then replace it + +Implementation rule: + +```text +catalogCacheKey = + binaryPath + + binaryVersion + + codexHome + + preferredAuthMode + + effectiveAuthMode + + managedAccountHash or "no-chatgpt-account" + + apiKey.source or "no-api-key" +``` + +The hash should use a per-process salt and should not be persisted. Do not persist raw email solely for catalog cache. + +### 2. Default model determinism + +`🎯 8 🛡️ 9 🧠 6` +Estimated implementation impact: `220-420` lines + +Uncertainty: + +- current UI can represent model as empty string meaning `Default` +- Codex app-server default can change after a Codex release +- launch logs, relaunch, replay, and team metadata need to stay explainable + +Decision: + +- keep `Default` as a UI selection +- resolve `Default` to a concrete `resolvedLaunchModel` immediately before launch +- persist both user selection and resolved runtime truth in launch metadata +- never silently rewrite old team config from one concrete model to another +- if a team stored `Default`, relaunch should show that it will resolve to the current Codex default before launch + +Required persisted launch identity: + +```ts +export interface ProviderModelLaunchIdentity { + providerId: TeamProviderId; + providerBackendId: TeamProviderBackendId | null; + selectedModel: string | null; + selectedModelKind: 'default' | 'explicit'; + resolvedLaunchModel: string; + catalogId: string | null; + catalogSource: 'app-server' | 'static-fallback' | 'unavailable'; + catalogFetchedAt: string | null; + 
selectedEffort: string | null; + resolvedEffort: string | null; +} +``` + +This identity should be written into exact logs and launch-derived metadata. It should not replace existing fields in Phase 1, but it should become the canonical explanation layer for Codex relaunch/replay. + +### 3. Effort transport through orchestrator + +`🎯 8 🛡️ 9 🧠 5` +Estimated implementation impact: `180-320` lines + +Uncertainty: + +- Agent Teams exposes a generic `--effort` concept today +- Codex CLI does not expose `--effort` +- Codex uses config key `model_reasoning_effort` + +Decision: + +- UI and main process may accept provider-aware effort strings +- orchestrator public Agent Teams CLI can continue accepting `--effort` +- Codex executor must translate Codex effort to `codex exec -c model_reasoning_effort='"value"'` +- Anthropic executor must not see Codex-only effort values +- Codex executor must not see Anthropic `max` + +No implementation phase may ship `xhigh` as selectable until this pass-through is tested. + +### 4. Catalog availability vs team-agent safety policy + +`🎯 8 🛡️ 9 🧠 5` +Estimated implementation impact: `160-280` lines + +Uncertainty: + +- app-server `model/list` says a model is available to Codex +- our team-agent contract can still make a model unsafe for Agent Teams if it breaks task/reply/bootstrap conventions +- current UI has local disabled policy for `gpt-5.3-codex-spark`, `gpt-5.2-codex`, and `gpt-5.1-codex-mini` + +Decision: + +- model catalog answers "can Codex offer this model" +- team policy answers "can Agent Teams safely launch this model" +- keep these as separate layers +- do not remove current disabled policies just because app-server returns a model +- show clear disabled copy: `Available in Codex, disabled for Agent Teams` +- disabled models can still display catalog metadata and effort metadata for transparency + +### 5. 
Codex binary version and app-server method compatibility + +`🎯 7 🛡️ 9 🧠 6` +Estimated implementation impact: `220-420` lines + +Uncertainty: + +- `codex app-server` is documented as an app-server integration surface, but local users can have older Codex binaries. +- `model/list` may be missing, renamed, or return a narrower shape in older binaries. +- Current `JsonRpcStdioClient` collapses JSON-RPC errors to `Error(message)`, which loses method, code, and structured details needed to distinguish `method not found` from auth/network/timeout. +- Current `CodexBinaryResolver` caches only binary path, not binary version. + +Decision: + +- make binary version part of catalog cache identity +- add structured JSON-RPC error metadata before implementing catalog fallback +- treat `method not found` as `static-fallback`, not as account failure +- treat malformed model rows as catalog degradation, not app-server runtime failure +- clear catalog cache when resolved Codex binary path or version changes + +Required implementation detail: + +```ts +export class JsonRpcRequestError extends Error { + readonly method: string; + readonly code: number | null; + readonly details: unknown; +} +``` + +The app-server model client should classify: + +- `method_not_found`: fallback to static catalog and show upgrade hint +- `timeout`: stale cache if available, then fallback +- `malformed_response`: fallback plus diagnostics +- `process_exit`: shared app-server failure for all sub-results in combined snapshot +- `auth_required`: account/read decides auth truth; model/list must not invent auth truth + +### 6. `auto` auth resolution for model catalog + +`🎯 7 🛡️ 9 🧠 6` +Estimated implementation impact: `180-320` lines + +Uncertainty: + +- UI lets users pick `auto`, `chatgpt`, or `api_key`. +- Catalog can differ between ChatGPT subscription and API key. +- The model picker must preview the catalog for the mode that launch will actually use, not only the configured preference. 
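To make the last point concrete, here is a minimal sketch of turning a configured preference into the mode whose catalog should be previewed. The `AuthReadiness` shape and the ChatGPT-first ordering for `auto` are assumptions for illustration only; the real resolver should reuse whatever readiness logic launch already applies.

```typescript
type PreferredAuthMode = "auto" | "chatgpt" | "api_key";
type EffectiveAuthMode = "chatgpt" | "api_key" | null;

// Assumed readiness inputs; the real launch readiness check may use more signals.
interface AuthReadiness {
  chatgptAccountReady: boolean;
  apiKeyAvailable: boolean;
}

// Resolve the preference into the mode launch would actually use.
// Returns null when no usable mode exists, so callers can show an unavailable state.
function resolveEffectiveAuthMode(
  preferred: PreferredAuthMode,
  readiness: AuthReadiness,
): EffectiveAuthMode {
  if (preferred === "chatgpt") return readiness.chatgptAccountReady ? "chatgpt" : null;
  if (preferred === "api_key") return readiness.apiKeyAvailable ? "api_key" : null;
  // "auto" (assumed ordering): prefer the ChatGPT account, then fall back to an API key.
  if (readiness.chatgptAccountReady) return "chatgpt";
  if (readiness.apiKeyAvailable) return "api_key";
  return null;
}
```

The catalog request and cache entry would then be keyed on the returned value rather than on `auto`.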
+ +Decision: + +- `preferredAuthMode=auto` is not a catalog scope by itself +- resolve `auto` into `effectiveAuthMode` using the same readiness logic as launch +- catalog request should be scoped to the effective launch mode +- Provider Settings can show both preference and effective catalog scope when they differ +- if effective mode flips from ChatGPT to API key because ChatGPT becomes unavailable, keep stale ChatGPT catalog visually stale and refresh API-key catalog + +UX copy rule: + +- do not say `Detected from OPENAI_API_KEY` as the primary model catalog source when ChatGPT account is the effective mode +- show API-key availability only as fallback/secondary when selected auth is ChatGPT or auto resolves to ChatGPT + +### 7. App-server notifications and refresh cadence + +`🎯 8 🛡️ 8 🧠 5` +Estimated implementation impact: `160-260` lines + +Uncertainty: + +- account login flow has notifications +- current docs and local probe do not establish a dedicated model-catalog changed notification +- keeping a long-lived app-server just for model catalog would increase lifecycle complexity + +Decision: + +- do not introduce a long-lived model catalog subscription in this rollout +- use short-lived app-server sessions for refresh +- trigger catalog refresh after login success, logout, auth mode change, API-key source change, manual refresh, and provider status refresh +- do not poll `model/list` aggressively from renderer +- use `10 minute` success TTL and stale cache for UI continuity + +If a future app-server release adds model catalog notifications, integrate them later behind the catalog feature port without changing renderer contracts. + +### 8. 
Backup, restore, and relaunch compatibility + +`🎯 7 🛡️ 9 🧠 7` +Estimated implementation impact: `240-520` lines + +Uncertainty: + +- team launch metadata already persisted provider/model/effort/backend in several places +- adding dynamic defaults and resolved model identity can make old backups ambiguous +- old teams may contain no `modelCatalog` metadata + +Decision: + +- new `ProviderModelLaunchIdentity` is additive +- old teams without it remain readable +- relaunch derives missing identity from existing provider/model/effort/backend fields +- restore does not require the old catalog to be available +- if restored explicit model is missing from current catalog, UI preserves the explicit model with a warning instead of silently replacing it with current default +- if restored model was `Default`, relaunch preview resolves it against current catalog and says so before launch + +Migration rule: + +```text +old explicit model -> selectedModelKind="explicit", resolvedLaunchModel=old model +old empty model -> selectedModelKind="default", resolvedLaunchModel=current default at next launch +missing effort -> selectedEffort=null, resolvedEffort=current model default at next launch +``` + +### 9. UI and orchestrator version skew + +`🎯 7 🛡️ 10 🧠 7` +Estimated implementation impact: `280-620` lines + +Uncertainty: + +- `claude_team` and `agent_teams_orchestrator` can be updated at different times. +- UI can learn about `xhigh`, `minimal`, or a future model like `gpt-5.5` before the installed orchestrator can launch it safely. +- The current orchestrator static Codex helpers can reject a model that Codex app-server already exposed. 
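The skew described above means a launch control depends on three independent answers: catalog presence, team policy, and runtime capability. A minimal sketch of combining them, assuming hypothetical names `LaunchGateInput` and `evaluateLaunchGate` and treating unknown capabilities as "not supported":

```typescript
interface LaunchGateInput {
  inCatalog: boolean;
  disabledByTeamPolicy: boolean;
  // null means the runtime did not report capabilities at all.
  runtimeSupportsLaunch: boolean | null;
}

type LaunchGateResult =
  | { enabled: true }
  | { enabled: false; reason: "missing-from-catalog" | "disabled-for-agent-teams" | "requires-runtime-upgrade" };

// All three layers must agree before the launch control is enabled.
// Missing capability data fails safe: the model stays visible but not launchable.
function evaluateLaunchGate(input: LaunchGateInput): LaunchGateResult {
  if (!input.inCatalog) return { enabled: false, reason: "missing-from-catalog" };
  if (input.disabledByTeamPolicy) return { enabled: false, reason: "disabled-for-agent-teams" };
  if (input.runtimeSupportsLaunch !== true) return { enabled: false, reason: "requires-runtime-upgrade" };
  return { enabled: true };
}
```

Returning a reason instead of a bare boolean lets the picker choose explicit disabled copy rather than failing late during spawn.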
+ +Decision: + +- catalog visibility and launch capability are separate contracts +- UI may display app-server catalog metadata as soon as it is available +- UI must not enable launch controls that require new orchestrator behavior until runtime capability says that behavior exists +- provider-explicit Codex model strings can be accepted only after orchestrator declares dynamic Codex model support +- Codex `xhigh` can be shown as metadata before Phase 4, but it is disabled for launch until Codex effort pass-through is available + +Required runtime capability contract: + +```ts +export interface ProviderRuntimeCapabilities { + providerId: TeamProviderId; + codex?: { + supportsDynamicAppServerModels: boolean; + supportsCodexReasoningEffortConfig: boolean; + supportedCodexReasoningEfforts: Array<'minimal' | 'low' | 'medium' | 'high' | 'xhigh'>; + acceptsProviderExplicitFutureModels: boolean; + }; +} +``` + +Compatibility rule: + +```text +catalog says model/effort exists ++ team policy says model is not disabled ++ runtime capability says launch path supports it += launch control enabled +``` + +If any part is missing, the picker can still display the model, but launch must be disabled with explicit copy. + +Recommended copy: + +- `Available in Codex, waiting for Agent Teams runtime support` +- `This Codex effort is visible in Codex, but this Agent Teams runtime cannot launch it yet` +- `Upgrade the Agent Teams runtime to use this model` + +This avoids a bad state where the user selects `xhigh` successfully in UI and then gets a late `codex exec` failure. + +### 10. Future model policy, including `gpt-5.5` + +`🎯 8 🛡️ 8 🧠 6` +Estimated implementation impact: `240-520` lines + +Uncertainty: + +- app-server can expose a new model immediately after OpenAI releases it. +- the user goal is that new Codex models appear without us shipping a new static list. +- Agent Teams still needs a safety layer so one unexpected model row does not break team launch flows. 
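One way to keep a single unexpected model row from breaking the picker is to normalize rows defensively and count what was dropped. The sketch below is illustrative only: `normalizeModelRows` is a hypothetical helper, and it assumes each `supportedReasoningEfforts` entry is either a plain string or an object carrying a `reasoningEffort` string, which should be checked against real app-server output.

```typescript
type Effort = "minimal" | "low" | "medium" | "high" | "xhigh";
const KNOWN_EFFORTS = new Set<string>(["minimal", "low", "medium", "high", "xhigh"]);

interface NormalizedModel {
  id: string;
  displayName: string;
  isDefault: boolean;
  defaultReasoningEffort: Effort | null;
  supportedReasoningEfforts: Effort[];
}

function isEffort(value: unknown): value is Effort {
  return typeof value === "string" && KNOWN_EFFORTS.has(value);
}

// Accept either "high" or { reasoningEffort: "high" } (assumed shapes).
function readEffort(entry: unknown): Effort | null {
  if (isEffort(entry)) return entry;
  if (typeof entry === "object" && entry !== null) {
    const nested = (entry as Record<string, unknown>)["reasoningEffort"];
    if (isEffort(nested)) return nested;
  }
  return null;
}

// Malformed rows are dropped and counted so diagnostics can report degradation
// instead of treating the whole catalog as failed.
function normalizeModelRows(rows: unknown): { models: NormalizedModel[]; dropped: number } {
  const models: NormalizedModel[] = [];
  let dropped = 0;
  if (!Array.isArray(rows)) return { models, dropped };
  for (const row of rows as unknown[]) {
    if (typeof row !== "object" || row === null) { dropped += 1; continue; }
    const record = row as Record<string, unknown>;
    const id = record["id"];
    if (typeof id !== "string" || id === "") { dropped += 1; continue; }
    const displayName = record["displayName"];
    const rawEfforts = record["supportedReasoningEfforts"];
    const defaultEffort = record["defaultReasoningEffort"];
    models.push({
      id,
      displayName: typeof displayName === "string" ? displayName : id,
      isDefault: record["isDefault"] === true,
      defaultReasoningEffort: isEffort(defaultEffort) ? defaultEffort : null,
      supportedReasoningEfforts: Array.isArray(rawEfforts)
        ? (rawEfforts as unknown[]).map(readEffort).filter((e): e is Effort => e !== null)
        : [],
    });
  }
  return { models, dropped };
}
```

Unknown effort strings are ignored rather than passed through, so a future effort value cannot reach launch until it is explicitly supported.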
+ +Top 3 policies: + +1. Allow every app-server-visible model immediately: `🎯 8 🛡️ 5 🧠 3`, `80-180` lines. This best solves future releases, but it can route unverified models into team launch without product copy or rollback clarity. +2. Show every app-server-visible model immediately, launch with capability gate plus "new model" warning: `🎯 9 🛡️ 8 🧠 5`, `240-520` lines. This keeps future models visible without app releases, but still blocks only real launch incompatibilities. +3. Hide or disable unknown models until a code release updates policy: `🎯 4 🛡️ 9 🧠 2`, `60-120` lines. This is safe but defeats the reason to use `model/list`. + +Chosen policy: option 2. + +Implementation rule: + +- app-server-visible, non-hidden models appear in the picker immediately +- known disabled Agent Teams models remain disabled +- new unknown models are selectable only if runtime capabilities support dynamic Codex models +- new unknown models get a `New from Codex catalog` note until a successful launch or explicit policy promotion marks them `verified` +- if the new model does not expose usable text input or any supported effort we can launch, it is shown but disabled +- hidden models are never introduced into new-team pickers by default + +Policy statuses: + +```ts +export type CodexTeamModelPolicyStatus = + | 'verified' + | 'new-from-codex-catalog' + | 'disabled-for-agent-teams' + | 'requires-runtime-upgrade' + | 'missing-from-current-catalog'; +``` + +This means `gpt-5.5` can appear the day app-server returns it, but the UI will not pretend the full Agent Teams launch path is verified unless the local runtime can actually handle provider-explicit dynamic Codex models. + +### 11. Hidden, upgraded, and persisted models + +`🎯 8 🛡️ 9 🧠 5` +Estimated implementation impact: `160-340` lines + +Uncertainty: + +- official docs say `includeHidden: false` returns picker-visible models by default. 
+- persisted teams can reference a model that later becomes hidden, upgraded, renamed, or unavailable. +- app-server exposes `upgrade` and `upgradeInfo`, but we do not know every future migration shape. + +Decision: + +- normal picker uses `includeHidden: false` +- if a persisted explicit Codex model is not found in the visible catalog, run one scoped refresh with `includeHidden: true` +- if hidden lookup finds the model, show it as `Hidden in Codex catalog` and keep relaunch possible only if runtime capability and team policy allow it +- if `upgrade` points to a visible replacement, show a non-destructive migration suggestion +- never auto-rewrite persisted model ids during restore or relaunch + +Relaunch behavior: + +```text +visible model found -> normal relaunch +hidden model found -> relaunch allowed only with warning and policy pass +upgrade available -> show "Switch to recommended model" action +missing model -> keep value visible, require user to choose another model before launch +``` + +This avoids both failure modes: silently changing a user's team model, or breaking old teams because a model moved out of the default picker. + +### 12. Stored effort schema and non-dialog launch paths + +`🎯 7 🛡️ 9 🧠 7` +Estimated implementation impact: `320-760` lines + +Uncertainty: + +- effort is not used only in launch dialogs. +- team metadata, member metadata, backup/restore, draft retry, localStorage launch params, and scheduled/provisioned flows can all carry `effort`. +- current normalizers in team data paths may silently discard anything outside `low | medium | high`. 
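The normalizer risk above can be illustrated with a parser that never discards a stored value silently. This is a sketch under stated assumptions: `parseStoredEffort` and the `ParsedEffort` union are hypothetical names, and the provider id `"codex"` is taken from this plan's wording rather than from the real type definitions.

```typescript
type LegacyEffort = "low" | "medium" | "high";
type CodexEffort = "minimal" | LegacyEffort | "xhigh";

const LEGACY_EFFORTS: readonly string[] = ["low", "medium", "high"];
const CODEX_EFFORTS: readonly string[] = ["minimal", "low", "medium", "high", "xhigh"];

type ParsedEffort =
  | { kind: "none" }
  | { kind: "legacy"; value: LegacyEffort }
  // supportedByModel is null when no catalog row was available to validate against.
  | { kind: "codex"; value: CodexEffort; supportedByModel: boolean | null }
  // Preserved verbatim so the UI can warn instead of silently downgrading.
  | { kind: "unresolved"; raw: string };

function isLegacyEffort(value: string): value is LegacyEffort {
  return LEGACY_EFFORTS.includes(value);
}

function isCodexEffort(value: string): value is CodexEffort {
  return CODEX_EFFORTS.includes(value);
}

function parseStoredEffort(
  raw: unknown,
  providerId: string | null,
  modelEfforts: readonly string[] | null,
): ParsedEffort {
  if (typeof raw !== "string" || raw === "") return { kind: "none" };
  if (providerId === "codex" && isCodexEffort(raw)) {
    return {
      kind: "codex",
      value: raw,
      supportedByModel: modelEfforts ? modelEfforts.includes(raw) : null,
    };
  }
  // Without Codex context, only the legacy values carry a known meaning.
  if (isLegacyEffort(raw)) return { kind: "legacy", value: raw };
  return { kind: "unresolved", raw };
}
```

The `unresolved` branch is the important one: a value such as `xhigh` restored without provider context stays in storage as a string and is resolved, or rejected with a warning, only at launch time.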
+ +Decision: + +- provider-aware effort parsing must be added at every inbound boundary, not only in React components +- old persisted `low | medium | high` values stay valid +- new Codex-specific values are preserved only with provider/model context +- if provider context is missing, parse as legacy effort and do not invent Codex-specific meaning +- scheduled launches and automation-like flows must either be updated in the same phase or explicitly block Codex-only efforts until updated + +High-risk code paths to audit during implementation: + +- `src/main/services/team/TeamMembersMetaStore.ts` +- `src/main/services/team/TeamDataService.ts` +- `src/main/services/team/TeamBackupService.ts` +- `src/main/services/team/TeamProvisioningService.ts` +- `src/shared/types/schedule.ts` +- `src/main/ipc/teams.ts` +- `src/main/http/teams.ts` +- renderer launch prefill and draft retry localStorage state + +Migration rule: + +```text +legacy effort with no provider context -> keep if low | medium | high +codex effort with provider=codex -> validate against selected model catalog +codex effort with provider missing -> store as selected string only, resolve before launch +unsupported restored effort -> show warning, do not silently downgrade +``` + +### 13. Renderer stale state, HMR, and out-of-order refreshes + +`🎯 8 🛡️ 9 🧠 5` +Estimated implementation impact: `180-380` lines + +Uncertainty: + +- previous provider settings work showed transient wrong states after HMR and slow refreshes. +- catalog, account, and rate limits can refresh with different timings. +- a stale app-server response can arrive after a newer auth-mode change. 
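A stale response arriving late can be handled with a small guard in the renderer store. The following is a simplified sketch, not the shipped reducer: `RefreshState`, `IncomingSnapshot`, and `decideSnapshotWrite` are hypothetical names, and the auth-scope check is reduced to a comparison against the currently selected scope.

```typescript
interface RefreshState {
  providerId: string;
  latestAcceptedRequestId: number;
  selectedAuthScope: string;
}

interface IncomingSnapshot {
  providerId: string;
  requestId: number;
  authScope: string;
}

type WriteDecision = "accept" | "reject" | "stale-diagnostics";

// Decide whether an incoming provider status snapshot may replace visible state.
function decideSnapshotWrite(state: RefreshState, incoming: IncomingSnapshot): WriteDecision {
  // A response for another provider never touches this provider's state.
  if (incoming.providerId !== state.providerId) return "reject";
  // Older than the latest accepted refresh: drop it, even if it arrived last.
  if (incoming.requestId < state.latestAcceptedRequestId) return "reject";
  // Newer, but computed for an auth scope the user has since left:
  // keep it for diagnostics only, do not render it as current truth.
  if (incoming.authScope !== state.selectedAuthScope) return "stale-diagnostics";
  return "accept";
}
```

Because `requestId` is monotonic per provider, arrival order stops mattering: only the id comparison decides which snapshot wins.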
+ +Decision: + +- every provider status refresh should carry a monotonic `requestId` or `snapshotVersion` +- renderer stores the latest accepted version per provider +- responses older than the latest accepted version are ignored +- `modelCatalog.schemaVersion` is required and future versions are treated as degraded, not fatal +- HMR should keep last ready provider status visible while a refresh is in flight +- a catalog refresh cannot overwrite account connected state unless it came from the same combined snapshot + +Required stale-write guard: + +```text +if incoming.providerId != current.providerId -> reject +if incoming.requestId < current.requestId -> reject +if incoming.authScope != current.authScope and incoming.status is not from current auth selection -> keep as stale diagnostics only +``` + +This directly targets flicker like `Codex native unavailable` followed by ready state, or fallback API-key copy appearing while ChatGPT account mode is selected. + +### 14. Privacy, logs, and diagnostics + +`🎯 8 🛡️ 9 🧠 4` +Estimated implementation impact: `120-260` lines + +Uncertainty: + +- account-scoped cache keys need stable identity, but raw email should not leak into exact logs, runtime snapshots, or persistent diagnostics. +- API-key source is useful for UX, but no secret or env value should be logged. 
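Both constraints can be met by keying on a salted hash that only exists in memory. A minimal sketch using Node's `crypto` module follows; `hashAccountIdentity` is a hypothetical helper, and the trim/lowercase normalization of the identity is an assumption, not something this plan specifies.

```typescript
import { createHmac, randomBytes } from "node:crypto";

// Generated once per process and never written to disk, so hashes cannot be
// correlated across restarts or recovered from persisted diagnostics.
const PROCESS_SALT = randomBytes(32);

// Returns a stable-within-process token for cache keys.
// The sentinel mirrors the "no-chatgpt-account" placeholder used in the cache key rule.
function hashAccountIdentity(identity: string | null): string {
  if (identity === null || identity.trim() === "") return "no-chatgpt-account";
  return createHmac("sha256", PROCESS_SALT)
    .update(identity.trim().toLowerCase()) // assumed normalization
    .digest("hex");
}
```

Logs and diagnostics would record only `authScope` and, at most, this hash; the raw identity never leaves the function's input.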
+ +Decision: + +- hash managed account identity in memory for cache keys +- use a per-process salt for volatile cache keys +- do not persist raw account email solely for model catalog cache +- exact logs can record `authScope=chatgpt` or `authScope=api_key`, not raw account identity +- diagnostics can record `apiKeySource=OPENAI_API_KEY` but never the value +- error messages preserve method/code/timeout, but redact command env and tokens + +Required diagnostic fields: + +```ts +export interface CodexModelCatalogDiagnostics { + source: 'app-server' | 'static-fallback' | 'unavailable'; + status: 'ready' | 'stale' | 'degraded' | 'unavailable'; + method?: 'model/list'; + errorCode?: string | number | null; + errorCategory?: string | null; + binaryVersion?: string | null; + effectiveAuthMode?: 'chatgpt' | 'api_key' | null; + cacheAgeMs?: number | null; +} +``` + +No UI surface should show `Unknown error` for catalog failures after this feature. + +### 15. Rollout ordering across repos + +`🎯 8 🛡️ 10 🧠 6` +Estimated implementation impact: `120-260` lines + +Uncertainty: + +- `claude_team` can ship UI before the user has a compatible `agent_teams_orchestrator` runtime in cache. +- the app can point to `CLAUDE_DEV_RUNTIME_ROOT`, bundled runtime cache, or a user-installed runtime binary. + +Decision: + +- implement orchestrator support first or behind a UI capability gate +- Provider Settings can show catalog metadata before launch support exists +- Create/Launch dialogs must consult runtime capabilities before enabling new Codex models or new Codex efforts +- the runtime health check should expose a version/capability payload, not force UI to infer support from binary version strings +- if capabilities are unavailable, default to safe: display metadata, disable launch-only features + +Rollout sequence: + +1. Add orchestrator dynamic Codex model and effort capability support. +2. Add `claude_team` catalog feature and provider status metadata. +3. 
Show catalog in UI with capability gates. +4. Enable launch when capability and catalog agree. +5. Remove any temporary guard only after bundled runtime and dev runtime both report capabilities in CI/smoke. + +This is the cleanest way to avoid UI and runtime getting out of sync. + +### 16. Codex config/profile/cwd catalog mismatch + +`🎯 6 🛡️ 10 🧠 8` +Estimated implementation impact: `360-900` lines + +Uncertainty: + +- official config docs allow `model_catalog_json`, and profile-level `profiles..model_catalog_json` can override it. +- Codex loads project-scoped `.codex/config.toml` only when a project or worktree is trusted. +- `codex exec` can run with a different `cwd`, profile, and inline `-c` overrides than the short-lived app-server preview session. +- current `CodexAppServerSessionFactory` starts `codex app-server` without an explicit `cwd` or profile. + +Failure mode: + +- Provider Settings shows catalog A from global config. +- Launch runs `codex exec` in project cwd with project-scoped or profile config and effectively uses catalog B. +- The user selects a model that preview says is valid, but launch resolves against a different provider/catalog. + +Top 3 policies: + +1. Global-only catalog preview: `🎯 7 🛡️ 5 🧠 3`, `80-180` lines. Fast and simple, but wrong for project-scoped Codex configs. +2. Project-scoped catalog preview for launch flows, global preview for dashboard: `🎯 9 🛡️ 9 🧠 7`, `360-900` lines. More work, but it matches actual `codex exec` launch context. +3. Ignore config and force a static OpenAI Codex provider always: `🎯 5 🛡️ 8 🧠 4`, `200-420` lines. Safer than mismatch, but it discards legitimate user Codex config and can surprise power users. + +Chosen policy: option 2. 
+ +Decision: + +- dashboard/provider card can show a global Codex catalog snapshot +- Create/Launch dialogs must fetch or resolve catalog for the selected launch `cwd` +- if profile selection exists or is introduced, catalog cache key must include profile name +- if we pass inline config overrides to `codex exec`, equivalent preview scope must include those overrides or launch must be marked "not preview-verified" +- if project trust/config cannot be resolved, launch UI falls back to global catalog but shows `Catalog may differ for this project` + +Required preview scope: + +```ts +export interface CodexModelCatalogScope { + codexHome: string; + binaryPath: string; + binaryVersion: string | null; + cwd: string | null; + projectTrust: 'trusted' | 'untrusted' | 'unknown'; + profileName: string | null; + configFingerprint: string | null; + preferredAuthMode: 'auto' | 'chatgpt' | 'api_key' | null; + effectiveAuthMode: 'chatgpt' | 'api_key' | null; + launchOverridesFingerprint: string | null; +} +``` + +Cache key correction: + +```text +catalogCacheKey = + binaryPath + + binaryVersion + + codexHome + + cwd or "global" + + projectTrust + + profileName or "default-profile" + + configFingerprint or "unknown-config" + + launchOverridesFingerprint or "no-launch-overrides" + + preferredAuthMode + + effectiveAuthMode + + forcedLoginMethod or "no-forced-login-method" + + forcedWorkspaceHash or "no-forced-workspace" + + managedAccountHash or "no-chatgpt-account" + + apiKey.source or "no-api-key" +``` + +Implementation notes: + +- use app-server `config/read` when available to get effective config fingerprints for the same scope that launch will use +- do not parse arbitrary TOML as the primary config source if app-server can resolve effective configuration +- if app-server cannot scope `config/read` by cwd/profile, keep that uncertainty visible in diagnostics +- do not use raw config file contents as a cache key or log payload; hash only the relevant effective keys + +Relevant 
effective keys: + +- `model` +- `model_provider` +- `model_catalog_json` +- `profiles..model_catalog_json` +- `model_reasoning_effort` +- `forced_login_method` +- `forced_chatgpt_workspace_id` +- `openai_base_url` +- `model_providers.*` only as a redacted structural fingerprint +- `projects..trust_level` + +Acceptance: + +- a team launch from project A and project B can have different Codex catalog cache entries +- a trusted project `.codex/config.toml` changing `model_catalog_json` invalidates preview for that project +- global dashboard status does not claim to be launch-exact for every project +- exact logs record the catalog scope fingerprint, not raw config values + +### 17. Built-in OpenAI Codex provider vs custom/OSS Codex config + +`🎯 7 🛡️ 9 🧠 7` +Estimated implementation impact: `260-620` lines + +Uncertainty: + +- Codex config supports `model_provider`, custom providers, `oss_provider`, and provider auth settings. +- Agent Teams "Codex" provider is intended to mean native Codex through OpenAI/ChatGPT subscription or API-key billing, not arbitrary custom provider execution. +- app-server `model/list` can be influenced by configuration, but our product copy currently talks about Codex subscription. 
+ +Decision: + +- this cutover should keep Agent Teams Codex scoped to the built-in OpenAI Codex provider +- custom provider and OSS provider support should be a separate provider feature, not silently mixed into `provider=codex` +- if effective config says `model_provider` is not built-in OpenAI for the launch scope, show a clear warning and block subscription-mode launch unless the user intentionally switches to a future custom-provider flow +- when launching Agent Teams Codex, pass or enforce provider config consistently so `codex exec` uses the same provider class previewed by the catalog + +Recommended launch guard: + +```text +if provider=codex and effective model_provider is neither missing nor "openai": + status = degraded + launch = blocked + copy = "This project config points Codex at a custom/local provider. Agent Teams Codex currently supports the built-in OpenAI Codex provider only." +``` + +If the team wants to support custom providers later: + +- add a separate `provider=codex-custom` or generic OpenAI-compatible provider +- do not reuse subscription UX or rate-limit UI +- do not show ChatGPT account limits for custom provider launches + +This prevents a confusing case where UI says "Codex subscription" but runtime actually routes to local OSS or a custom endpoint. + +### 18. Modalities and personality support + +`🎯 8 🛡️ 9 🧠 4` +Estimated implementation impact: `120-280` lines + +Uncertainty: + +- app-server model rows expose `inputModalities` and `supportsPersonality`. +- Agent Teams launch prompts are text-first today, but future UI can attach images or personality-like instructions. +- older model catalogs can omit `inputModalities`, and docs say missing modalities should be treated as `["text", "image"]` for backward compatibility. 
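The backward-compatible default for missing modalities is easy to get wrong at call sites, so it helps to centralize it. A small sketch, with hypothetical helper names and a row type reduced to the two fields discussed here:

```typescript
interface ModelRowCapabilities {
  inputModalities?: string[] | null;
  supportsPersonality?: boolean | null;
}

// Documented fallback for catalogs that omit inputModalities.
const DEFAULT_INPUT_MODALITIES: readonly string[] = ["text", "image"];

function effectiveInputModalities(row: ModelRowCapabilities): string[] {
  return Array.isArray(row.inputModalities)
    ? row.inputModalities
    : [...DEFAULT_INPUT_MODALITIES];
}

// Text input is the only modality a normal team launch needs.
// An explicit list without "text" (including an empty list) blocks launch.
function isTextLaunchable(row: ModelRowCapabilities): boolean {
  return effectiveInputModalities(row).includes("text");
}

// Only an explicit false hides personality controls; it never affects launchability.
function shouldHidePersonalityControls(row: ModelRowCapabilities): boolean {
  return row.supportsPersonality === false;
}
```

Image support stays informational: `effectiveInputModalities` can feed a capability badge without participating in the launch decision.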
+ +Decision: + +- launchability requires `text` input support +- image support is displayed as capability metadata, not required for normal team launch +- `supportsPersonality=false` must not disable normal team launch, but the UI must not claim `/personality` or personality-specific behavior for that model +- missing `inputModalities` uses the documented backward-compatible default + +Validation rule: + +```text +if inputModalities exists and does not include "text": + show model, disable launch, copy "This Codex model is not text-launch compatible for Agent Teams" + +if supportsPersonality=false: + hide personality controls for this model if those controls exist +``` + +This keeps model picker truthful without overfitting to the current text-only launch flow. + +### 19. Stable app-server surface vs experimental fields + +`🎯 8 🛡️ 9 🧠 4` +Estimated implementation impact: `80-180` lines + +Uncertainty: + +- app-server has an `experimentalApi` capability. +- `model/list` itself is documented on the stable API overview, but adjacent methods and future richer fields can be experimental. +- opting into experimental API globally can change response surface and error behavior. + +Decision: + +- keep `experimentalApi=false` for the model catalog rollout +- rely only on stable `model/list` fields listed in the docs +- treat extra fields as diagnostics only +- add a later explicit spike before using experimental catalog, plugin, or app-server thread features in this path + +Acceptance: + +- catalog tests run with `experimentalApi=false` +- no Phase 1-5 task depends on experimental fields +- if a future field appears, normalization ignores it unless we add a typed, tested use case + +### 20. App-server preview vs native exec signoff + +`🎯 8 🛡️ 10 🧠 6` +Estimated implementation impact: `180-420` lines + +Uncertainty: + +- `model/list` is the correct picker source, but the actual launch surface remains `codex exec --json`. 
+- a model can appear in app-server before `codex exec` in the installed binary handles it correctly. +- effort config can be accepted syntactically but rejected by the model/provider at runtime. + +Decision: + +- app-server catalog is necessary for UI, but not the only release gate for enabling new launch capability +- Phase 4 must include a live or mocked native-exec compatibility probe for the selected launch path +- native exec signoff should test model, provider scope, cwd, profile, and non-default effort together +- if live signoff is not available in CI, use a fixture-based unit test plus one documented local smoke command before merging + +Required signoff matrix: + +```text +default model + default effort + selected cwd +explicit gpt-5.4 + xhigh + selected cwd +gpt-5.1-codex-mini + high + selected cwd +gpt-5.1-codex-mini + xhigh -> blocked before exec +synthetic future model + capability disabled -> blocked before exec +synthetic future model + capability enabled -> argv accepted by orchestrator test +custom model_provider config -> blocked or explicit custom-provider copy +``` + +This prevents the plan from treating app-server catalog presence as proof that the full Agent Teams runtime path is healthy. + +### 21. `config/read` scope contract is only partially documented + +`🎯 7 🛡️ 9 🧠 6` +Estimated implementation impact: `180-420` lines + +Uncertainty: + +- docs list `config/read`, but the detailed request/response shape is not as explicit as `model/list`. +- local probe confirms `config/read` returns `{ config, origins }` and accepts `params`. +- local probe confirms missing `params` returns `-32600`, so callers must always send `{}` at minimum. +- local probe confirms `{ cwd }` and `{ profile }` are accepted, but we still need tests around whether they fully mirror `codex exec --cd/--profile` in all installations. 
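
Because the local probe shows that a missing `params` is rejected with `-32600`, a request builder can make the "always send an object" rule structural. The sketch below is illustrative only: `buildConfigReadRequest` and the `{ id, method, params }` envelope are assumptions, not the existing `JsonRpcStdioClient` API.

```typescript
// Hypothetical request builder; the envelope shape is an assumption
// made for illustration, not the real transport type.
interface ConfigReadScope {
  cwd?: string | null;
  profile?: string | null;
}

interface ConfigReadRequest {
  id: number;
  method: 'config/read';
  params: { cwd?: string; profile?: string };
}

function buildConfigReadRequest(id: number, scope?: ConfigReadScope | null): ConfigReadRequest {
  // Start from an empty object so `params` is never missing on the wire.
  const params: { cwd?: string; profile?: string } = {};
  if (scope && scope.cwd) params.cwd = scope.cwd;
  if (scope && scope.profile) params.profile = scope.profile;
  return { id: id, method: 'config/read', params: params };
}
```

With this shape a global read serializes `params` as `{}`, and a scoped read adds only the keys that were actually provided.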
+ +Decision: + +- treat `config/read` as a feature-detected helper, not as a hard dependency for model catalog availability +- always call `config/read` with an object, never with missing params +- include `config/read` method/code/details in diagnostics +- if scoped `config/read` fails but global succeeds, mark launch catalog as `scope_unverified`, not `ready` +- if `config/read` is missing on older binaries, fall back to global catalog and require runtime capability plus explicit degraded copy before launch enablement + +Recommended DTO: + +```ts +export interface CodexAppServerConfigReadParams { + cwd?: string | null; + profile?: string | null; +} + +export interface CodexAppServerConfigReadResponse { + config: Record<string, unknown>; + origins: Record<string, unknown>; +} +``` + +Feature-detect result: + +```ts +export type CodexConfigReadSupport = + | 'supported-scoped' + | 'supported-global-only' + | 'method-missing' + | 'failed'; +``` + +Acceptance: + +- unit tests cover missing `params`, method-not-found, global success, scoped success, and scoped failure +- `config/read` failure never breaks account/rate-limit/model-list reads +- launch UI does not present a project-scoped catalog as verified unless config scope was actually checked + +### 22. Forced login method and ChatGPT workspace scope + +`🎯 7 🛡️ 10 🧠 6` +Estimated implementation impact: `180-420` lines + +Uncertainty: + +- effective Codex config can include `forced_login_method`. +- effective Codex config can include `forced_chatgpt_workspace_id`. +- workspace/account policy can affect available models, rate limits, and whether ChatGPT subscription mode is valid. +- previous Codex account work already had a real bug around forced login method, so this is not theoretical.
+ +Decision: + +- auth scope must include forced login method and forced workspace identity when present +- if UI-selected auth mode conflicts with `forced_login_method`, effective auth mode wins and UI must explain why +- forced workspace id must be hashed before cache/log usage +- rate-limit, account, and model catalog snapshots must be scoped together so workspace changes cannot reuse stale catalog + +Auth scope correction: + +```ts +export interface CodexCatalogAuthScope { + preferredAuthMode: 'auto' | 'chatgpt' | 'api_key' | null; + effectiveAuthMode: 'chatgpt' | 'api_key' | null; + forcedLoginMethod: 'chatgpt' | 'api_key' | null; + managedAccountHash: string | null; + forcedWorkspaceHash: string | null; + apiKeySource: string | null; +} +``` + +UX rules: + +- if user selected ChatGPT but config forces API key, show `Codex config forces API key mode for this scope` +- if user selected API key but config forces ChatGPT, show `Codex config forces ChatGPT account mode for this scope` +- if workspace id changes, show `Codex workspace changed, refreshing subscription limits and model catalog` +- never show raw workspace id in UI unless Codex app-server provides a display name that is intended for users + +Cache invalidation: + +- forced login method change invalidates both auth and catalog cache +- forced workspace hash change invalidates ChatGPT-scoped rate limits and catalog +- account logout clears all ChatGPT workspace-scoped entries + +### 23. Model catalog file trust and local file changes + +`🎯 6 🛡️ 9 🧠 7` +Estimated implementation impact: `220-520` lines + +Uncertainty: + +- `model_catalog_json` can point to a local JSON file. +- app-server resolves effective config, but our app may not know if that JSON file changed unless config fingerprint includes enough origin data. +- project-scoped `.codex/config.toml` only applies for trusted projects, so a file can exist but not be active. 
+ +Decision: + +- treat `model_catalog_json` as part of effective config, not as a file we parse directly by default +- if `config/read.origins` exposes enough origin/path data, hash only path and mtime for invalidation, not file contents +- if origin/path data is unavailable, rely on manual refresh and short TTL +- never read arbitrary `model_catalog_json` file contents into logs or diagnostics +- do not apply project-scoped model catalog unless Codex effective config says the project is trusted and the catalog is active + +Top 3 invalidation policies: + +1. TTL/manual-refresh only: `🎯 7 🛡️ 6 🧠 2`, `40-100` lines. Simple but stale after local file edits. +2. Hash effective config plus optional mtime for active catalog file: `🎯 8 🛡️ 9 🧠 5`, `220-520` lines. Best balance without parsing arbitrary catalog files ourselves. +3. Parse and watch every possible catalog file: `🎯 5 🛡️ 7 🧠 8`, `500-1000` lines. Too much responsibility and security surface for this feature. + +Chosen policy: option 2. + +Acceptance: + +- active `model_catalog_json` path change invalidates cache +- active catalog file mtime change invalidates cache when path is available +- inactive untrusted project `.codex/config.toml` does not affect the trusted/global catalog + +## Top 3 Implementation Options + +### 1. 
Dedicated Codex model catalog feature - chosen + +`🎯 9 🛡️ 9 🧠 6` +Estimated size: `1200-2400` lines + +Core idea: + +- create `src/features/codex-model-catalog` +- keep model catalog rules isolated from account UI, provider status plumbing, and Electron transport +- reuse existing `CodexAppServerSessionFactory` +- expose a small feature facade to provider status and renderer model picker +- update orchestrator only where runtime status and launch effort transport require it + +Why it wins: + +- best SOLID alignment +- clean domain rules for model visibility, effort validation, fallback, and default selection +- does not make `codex-account` responsible for model policy +- least risk to Anthropic +- easiest to test without full app startup + +Main tradeoff: + +- needs small integration glue in existing provider status and team launch flows + +### 2. Fold catalog into `codex-account` + +`🎯 7 🛡️ 7 🧠 5` +Estimated size: `800-1600` lines + +Core idea: + +- extend `src/features/codex-account` with `model/list` +- use account snapshot as the only Codex control-plane snapshot +- merge account, rate limits, and model catalog in one feature + +Why it is tempting: + +- fewer new folders +- account feature already owns app-server account/rate-limit reads +- easier to fetch account plus model catalog in one app-server session + +Why I do not recommend it: + +- model catalog is not account management +- the feature becomes a broad Codex control-plane catch-all +- future provider catalog work would have to pull model rules back out +- more risk of account UI churn when only model picker changes are needed + +### 3. 
Full provider model catalog for all providers now + +`🎯 7 🛡️ 8 🧠 9` +Estimated size: `2500-4500` lines + +Core idea: + +- build one provider-agnostic model catalog for Anthropic, Codex, Gemini, and future providers +- move static renderer catalog policy into a shared feature +- expose one rich contract for all provider model pickers + +Why it is attractive: + +- cleanest long-term abstraction +- one UI model for labels, availability, capabilities, and efforts +- reduces future duplication + +Why not now: + +- too much surface area while Codex runtime cutover is still fresh +- Anthropic model behavior is already stable and should not be reworked for a Codex catalog issue +- would delay the concrete Codex model release problem + +## Current Code Reality + +### `claude_team` + +Existing app-server infrastructure: + +- `src/main/services/infrastructure/codexAppServer/JsonRpcStdioClient.ts` +- `src/main/services/infrastructure/codexAppServer/CodexAppServerSessionFactory.ts` +- `src/main/services/infrastructure/codexAppServer/protocol.ts` +- `src/features/codex-account/main/infrastructure/CodexAccountAppServerClient.ts` + +Current account client behavior: + +- `readAccount()` opens one app-server session. +- `readRateLimits()` opens another app-server session. +- `logout()` opens another app-server session. +- no `model/list` protocol types exist yet. +- `CodexAppServerSessionFactory` starts `codex app-server` with no explicit `cwd` or profile option. +- app-server initialize response includes `codexHome`, but the current protocol types do not expose effective config or config fingerprint. + +Current shared provider status: + +- `CliProviderStatus.models` is only `string[]`. +- `CliProviderStatus.modelAvailability` has per-model verification status but no rich model metadata. +- renderer model selector can already prefer runtime-provided `providerStatus.models`. 
+ +Current effort type: + +```ts +export type EffortLevel = 'low' | 'medium' | 'high'; +``` + +Risk: + +- adding `xhigh` directly without provider-specific validation would let Anthropic UI accidentally offer unsupported choices. + +Current persistence and non-dialog launch paths: + +- team metadata and member metadata normalize launch-derived provider/model/effort in multiple services. +- backup/restore copies metadata but restore-time launch preview must still tolerate missing catalog metadata. +- draft retry and launch prefill can reuse old localStorage state. +- scheduled launch types can reference the shared effort type. + +Risk: + +- updating only the visible launch dialogs would leave hidden paths that silently drop Codex-only efforts or relaunch with stale default semantics. + +### `agent_teams_orchestrator` + +Current Codex model catalog: + +- `src/utils/model/codex.ts` +- static `CODEX_MODELS` +- static `DEFAULT_CODEX_MODEL` +- `isCodexModel()` checks only static ids + +Current runtime status: + +- `getUnifiedRuntimeStatusPayload('codex')` returns static Codex model ids. + +Current CLI effort: + +- top-level `--effort ` currently accepts `low | medium | high | max`. +- Codex native execution is ultimately `codex exec --json`. +- installed `codex exec --help` shows no `--effort` flag. + +Risk: + +- if we send `--effort xhigh` through current orchestrator, it fails before Codex can use it. +- if we map Anthropic `max` to Codex `xhigh`, the semantics are wrong. +- if we show `xhigh` in UI before the launch path supports it, the picker becomes misleading. 
+ +## Target Architecture + +### Feature folder + +```text +src/features/codex-model-catalog/ + contracts/ + codexModelCatalog.dto.ts + index.ts + core/ + domain/ + codexModelCatalog.ts + codexReasoningEffort.ts + codexModelCatalogFallback.ts + normalizeCodexAppServerModel.ts + application/ + GetCodexModelCatalogUseCase.ts + CodexModelCatalogPorts.ts + main/ + composition/ + createCodexModelCatalogFeature.ts + adapters/ + output/ + CodexAppServerModelCatalogSource.ts + StaticCodexModelCatalogSource.ts + infrastructure/ + CodexModelCatalogAppServerClient.ts + InMemoryCodexModelCatalogCache.ts + preload/ + index.ts + renderer/ + adapters/ + codexModelCatalogViewModel.ts + hooks/ + useCodexModelCatalog.ts + ui/ + CodexModelEffortHint.tsx +``` + +Rules: + +- `core/domain` has all normalization and validation rules. +- `main/infrastructure` is the only layer that knows JSON-RPC method names. +- renderer never receives raw app-server rows. +- app shell imports only public feature entrypoints. + +### App-server lifecycle + +Use the existing `CodexAppServerSessionFactory`. + +Request sequence: + +1. Spawn `codex app-server`. +2. Send `initialize` with `clientInfo` and capabilities. +3. Send `initialized`. +4. Request `model/list`. +5. Drain or ignore notifications safely. +6. Close stdin and terminate the process on completion or timeout. 
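
The request sequence above can be sketched as an ordered list of outbound messages plus a response matcher that ignores notifications (step 5). This is a sketch under stated assumptions: `buildModelListHandshake` and `findResponse` are hypothetical helpers, and the exact `initialize` params shape is assumed for illustration rather than taken from `protocol.ts`.

```typescript
// Hypothetical message shapes; not the real transport types.
interface OutboundMessage {
  id?: number; // notifications carry no id
  method: string;
  params?: Record<string, unknown>;
}

interface InboundMessage {
  id?: number;
  method?: string;
  result?: unknown;
  error?: { code: number; message: string };
}

// Steps 2-4: initialize request, initialized notification, model/list request.
function buildModelListHandshake(clientName: string, clientVersion: string): OutboundMessage[] {
  return [
    {
      id: 1,
      method: 'initialize',
      // Assumed params shape: clientInfo plus capabilities.
      params: { clientInfo: { name: clientName, version: clientVersion }, capabilities: {} },
    },
    { method: 'initialized' },
    { id: 2, method: 'model/list', params: { includeHidden: false } },
  ];
}

// Step 5: skip notifications and server-initiated requests, return only
// the response that matches the request id.
function findResponse(inbound: InboundMessage[], requestId: number): InboundMessage | null {
  for (const msg of inbound) {
    if (msg.id === requestId && msg.method === undefined) return msg;
  }
  return null;
}
```

Keeping the handshake as plain data makes the ordering easy to assert in unit tests without spawning `codex app-server`.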
+ +Recommended timeouts: + +- initialize: `6000ms` +- `model/list`: `4500ms` +- total model catalog read: `9000ms` + +Recommended pagination: + +- request `limit: 100`, `includeHidden: false` for normal UI +- follow `nextCursor` until `null` +- hard-stop after 5 pages to avoid runaway loops +- log a degraded catalog warning if the hard-stop is hit + +### Single-session snapshot policy + +Provider status currently risks multiple sequential app-server starts: + +- account read +- rate limits read +- future model list read + +This caused slow provider loading in earlier UI work, so the plan should not add another app-server spawn in the hot path. + +Preferred design: + +- keep `codex-model-catalog` as a separate feature for ownership +- add an optional combined Codex control-plane read in composition +- when provider status refresh needs account plus rate limits plus model catalog, use one app-server session and issue all three requests inside it +- each sub-result has independent soft-failure state +- total snapshot can be partially healthy + +Snapshot shape: + +```ts +export interface CodexControlPlaneSnapshot { + binary: { + path: string; + version: string | null; + }; + account: CodexAccountSnapshotResult; + rateLimits: CodexRateLimitsSnapshotResult; + modelCatalog: CodexModelCatalogSnapshotResult; + configScope: { + cwd: string | null; + profileName: string | null; + projectTrust: 'trusted' | 'untrusted' | 'unknown'; + configReadSupport: CodexConfigReadSupport; + effectiveConfigFingerprint: string | null; + launchOverridesFingerprint: string | null; + activeModelCatalogFileFingerprint: string | null; + }; + initialize: { + codexHome: string; + platformFamily: string; + platformOs: string; + }; + fetchedAt: string; +} +``` + +Soft-failure rules: + +- account failure must not erase a fresh cached model catalog +- model catalog failure must not mark ChatGPT account disconnected +- rate-limit failure must not hide model picker options +- if app-server initialize 
fails, all three sub-results are degraded from the same root cause + +Required correction to the existing account flow: + +- current `CodexAccountAppServerClient.readAccount()` and `readRateLimits()` each open their own app-server process +- adding a third standalone `readModelCatalog()` would be a Provider Settings latency regression +- implement a combined app-server read path before wiring catalog into provider refresh +- keep separate methods for mutations and focused tests, but use the combined path for normal status refresh +- enrich `JsonRpcStdioClient` errors before catalog integration so the combined reader can classify `model/list` method failures without losing account truth + +Recommended application service shape: + +```ts +export interface CodexControlPlaneReader { + readSnapshot(options: CodexControlPlaneReadOptions): Promise<CodexControlPlaneSnapshot>; +} +``` + +This can live in `codex-model-catalog` composition or in a small shared Codex control-plane composition module. Do not put model normalization inside `codex-account`. + +Read scope: + +- Provider Settings global refresh can pass `cwd=null`. +- Create/Launch dialogs should pass the selected absolute `cwd`. +- Relaunch/restore should pass the team's persisted project path. +- Scheduled launch validation should pass `schedule.launchConfig.cwd`. +- If a future UI supports Codex profile selection, the same profile must be passed to preview and launch.
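
The read-scope rules above can be sketched as one resolver, so every caller derives `cwd` and profile the same way. This is a minimal sketch: `ReadCaller`, `ReadScope`, and `resolveReadScope` are hypothetical names, and the caller kinds simply mirror the bullets above.

```typescript
// Hypothetical caller descriptions mirroring the read-scope bullets.
type ReadCaller =
  | { kind: 'provider-settings' }
  | { kind: 'launch-dialog'; selectedCwd: string }
  | { kind: 'relaunch'; persistedProjectPath: string }
  | { kind: 'scheduled-launch'; launchConfigCwd: string };

interface ReadScope {
  cwd: string | null;
  profile: string | null;
}

function resolveReadScope(caller: ReadCaller, profile: string | null): ReadScope {
  switch (caller.kind) {
    case 'provider-settings':
      // Global refresh: no project scope.
      return { cwd: null, profile: profile };
    case 'launch-dialog':
      return { cwd: caller.selectedCwd, profile: profile };
    case 'relaunch':
      return { cwd: caller.persistedProjectPath, profile: profile };
    case 'scheduled-launch':
      return { cwd: caller.launchConfigCwd, profile: profile };
  }
}
```

Passing the same `ReadScope` value to both preview and launch is what keeps the previewed catalog and the launched configuration aligned.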
+ +## Contracts + +### App-server protocol types + +Add protocol DTOs to `src/main/services/infrastructure/codexAppServer/protocol.ts`: + +```ts +export type CodexAppServerReasoningEffort = + | 'none' + | 'minimal' + | 'low' + | 'medium' + | 'high' + | 'xhigh'; + +export interface CodexAppServerReasoningEffortOption { + reasoningEffort: CodexAppServerReasoningEffort; + description?: string | null; +} + +export type CodexAppServerInputModality = 'text' | 'image' | string; + +export interface CodexAppServerModel { + id: string; + model: string; + displayName: string; + description?: string | null; + hidden: boolean; + supportedReasoningEfforts: CodexAppServerReasoningEffortOption[]; + defaultReasoningEffort: CodexAppServerReasoningEffort; + inputModalities?: CodexAppServerInputModality[] | null; + supportsPersonality?: boolean | null; + isDefault: boolean; + upgrade?: string | null; + upgradeInfo?: unknown; + availabilityNux?: unknown; +} + +export interface CodexAppServerModelListParams { + cursor?: string | null; + limit?: number | null; + includeHidden?: boolean | null; +} + +export interface CodexAppServerModelListResponse { + data: CodexAppServerModel[]; + nextCursor: string | null; +} + +export interface CodexAppServerConfigReadParams { + cwd?: string | null; + profile?: string | null; +} + +export interface CodexAppServerConfigReadResponse { + config: Record<string, unknown>; + origins: Record<string, unknown>; +} +``` + +`config/read` caller rule: + +- always pass a params object, even when empty +- call global config as `config/read` with `{}` +- call project scope as `config/read` with `{ cwd }` +- call profile scope as `config/read` with `{ profile }` +- if both cwd and profile are needed, test `{ cwd, profile }` in Phase 1 and record the behavior before enabling profile-aware UI + +### Domain model + +Use separate ids: + +- `catalogId`: app-server `id`, stable identity for React keys, telemetry, and dedupe +- `launchModel`: app-server `model` when non-empty, otherwise `id` + +Reason: + +- 
local probe currently returned equal values, but official schema exposes both fields, so they can diverge later. +- using `id` for launch would be a latent bug if Codex introduces a display/catalog alias. + +```ts +export interface CodexCatalogModel { + catalogId: string; + launchModel: string; + displayName: string; + description: string | null; + hidden: boolean; + isDefault: boolean; + supportedReasoningEfforts: CodexReasoningEffort[]; + defaultReasoningEffort: CodexReasoningEffort | null; + inputModalities: CodexInputModality[]; + supportsPersonality: boolean; + upgrade: string | null; + source: 'app-server' | 'static-fallback'; +} +``` + +Normalization rules: + +- reject rows without a usable `id` +- derive `launchModel` from `model || id` +- default missing `inputModalities` to `['text', 'image']` for older catalogs +- default missing `supportsPersonality` to `false` +- accept documented `supportedReasoningEfforts` objects with `reasoningEffort` +- defensively accept string effort entries in tests, because older generated local types and live clients can drift +- drop duplicate `catalogId` rows after the first visible row +- drop duplicate `launchModel` rows after the first visible row unless a hidden row is the only available row +- keep unknown effort strings out of the selectable UI, but preserve them in diagnostics +- if no model is marked `isDefault`, choose static fallback default only as degraded fallback and label it as such + +### Provider status contract + +Add an optional rich catalog to `CliProviderStatus`: + +```ts +export interface CliProviderModelCatalog { + schemaVersion: 1; + source: 'app-server' | 'static-fallback' | 'unavailable'; + status: 'ready' | 'stale' | 'degraded' | 'unavailable'; + fetchedAt: string | null; + staleAt: string | null; + binary?: { + path: string | null; + version: string | null; + }; + authScope?: { + preferredAuthMode: 'auto' | 'chatgpt' | 'api_key' | null; + effectiveAuthMode: 'chatgpt' | 'api_key' | null; + 
forcedLoginMethod?: 'chatgpt' | 'api_key' | null; + managedAccountHash?: string | null; + forcedWorkspaceHash?: string | null; + apiKeySource?: string | null; + }; + launchScope?: { + cwd: string | null; + profileName: string | null; + projectTrust: 'trusted' | 'untrusted' | 'unknown'; + configFingerprint: string | null; + launchOverridesFingerprint: string | null; + }; + errorMessage?: string | null; + defaultModelId?: string | null; + defaultLaunchModel?: string | null; + models: CliProviderModelInfo[]; +} + +export interface CliProviderModelInfo { + catalogId: string; + launchModel: string; + displayName: string; + description?: string | null; + hidden?: boolean; + isDefault?: boolean; + supportedReasoningEfforts?: CliProviderReasoningEffort[]; + defaultReasoningEffort?: CliProviderReasoningEffort | null; + inputModalities?: string[]; + supportsPersonality?: boolean; + upgrade?: string | null; +} + +export type CliProviderReasoningEffort = + | 'none' + | 'minimal' + | 'low' + | 'medium' + | 'high' + | 'xhigh' + | 'max'; + +export interface CliProviderRuntimeCapabilities { + schemaVersion: 1; + codex?: { + supportsDynamicAppServerModels: boolean; + supportsCodexReasoningEffortConfig: boolean; + supportedCodexReasoningEfforts: Array<'minimal' | 'low' | 'medium' | 'high' | 'xhigh'>; + acceptsProviderExplicitFutureModels: boolean; + }; +} +``` + +Backwards compatibility: + +- keep `CliProviderStatus.models: string[]` +- add `CliProviderStatus.runtimeCapabilities?: CliProviderRuntimeCapabilities` +- for Codex, derive `models` from `modelCatalog.models.map(model => model.launchModel)` +- for Anthropic and Gemini, do not require `modelCatalog` +- old renderers continue to work from `models` +- new renderers prefer `modelCatalog` when present +- never put team-agent disabled policy directly into `CliProviderModelCatalog`; catalog describes Codex availability, while Agent Teams policy is applied by renderer and launch validators +- never infer launch capability only from 
catalog presence + +Renderer integration hotspot: + +- update `TeamModelRuntimeProviderStatus` in `src/renderer/utils/teamModelAvailability.ts` to include `modelCatalog` +- update `getRuntimeSelectorModels()` to use `modelCatalog.models[*].launchModel` for Codex +- update `getAvailableTeamProviderModelOptions()` to map rich Codex options with display labels, default badge, and catalog diagnostics +- keep Anthropic path on `getFallbackTeamProviderModelOptions()` +- keep Gemini path on existing `models: string[]` until Gemini has a richer catalog + +### Team launch effort contract + +Do not add a separate per-provider lane for this feature. + +Use existing team-level model/provider selection, but make effort provider-aware. + +Recommended implementation: + +- keep persisted field name `effort` +- widen internal effort type to `ProviderReasoningEffort` +- add provider/model validators at every launch boundary +- Anthropic UI only shows `low | medium | high` +- Codex UI shows only the selected model's `supportedReasoningEfforts` +- orchestrator accepts `minimal | low | medium | high | xhigh` for Codex and `low | medium | high | max` for Anthropic paths + +Existing validator hotspots: + +- `src/shared/types/team.ts` currently defines `EffortLevel = 'low' | 'medium' | 'high'` +- `src/main/ipc/teams.ts` currently validates only `low | medium | high` +- `src/main/http/teams.ts` currently validates only `low | medium | high` +- `src/renderer/components/team/dialogs/EffortLevelSelector.tsx` currently hardcodes only `Default | Low | Medium | High` +- `LaunchTeamDialog`, `CreateTeamDialog`, member draft rows, and member editor utilities currently cast strings with `as EffortLevel` + +Required migration: + +- replace unsafe `as EffortLevel` casts with a provider-aware normalization function +- parse provider before parsing effort in IPC and HTTP paths +- validate lead effort against lead provider/model +- validate member effort against each member's resolved provider/model +- 
keep old persisted `low | medium | high` values readable without migration + +Validation rule: + +```text +provider=codex: + effort must be in selectedModel.supportedReasoningEfforts + +provider=anthropic: + effort must be low | medium | high + +provider=gemini: + keep current behavior unless Gemini gets a richer effort contract +``` + +Important: + +- do not map Anthropic `max` to Codex `xhigh` +- do not map Codex `xhigh` to Anthropic `max` +- if selected Codex model changes and old effort is unsupported, reset to the new model's `defaultReasoningEffort` +- if catalog is unavailable, only allow static fallback efforts that are proven launchable + +Launch identity rule: + +- `effort` is user selection +- `resolvedEffort` is what launch sends to runtime +- if user selection is empty/default, `resolvedEffort` comes from app-server `defaultReasoningEffort` +- if resolved effort equals app-server default, runtime transport may omit `model_reasoning_effort`, but exact logs still record the resolved value + +## Runtime Launch Transport + +This was the highest-risk area in the earlier plan. The corrected plan is explicit. + +Facts: + +- `codex exec` has `--model`. +- `codex exec` has `-c, --config key=value`. +- `codex exec` has no documented `--effort`. +- Codex config has `model_reasoning_effort`. +- `model_reasoning_effort` supports `minimal | low | medium | high | xhigh`. + +Therefore: + +- Codex native launch must not pass `--effort xhigh` to Codex CLI. +- Orchestrator may keep accepting `--effort` as its public Agent Teams flag. +- When provider is Codex native, orchestrator must translate accepted effort into `codex exec -c model_reasoning_effort="value"`. +- When no effort is selected, omit `model_reasoning_effort` and let Codex use its model default. +- When effort equals the selected model's app-server default, either omit it or pass it consistently, but pick one policy and test it. 
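
The translation described in the list above can be sketched as an argv builder. This is a sketch, not the orchestrator's actual runner: `buildCodexExecArgs` and its options are hypothetical, and it only uses the `--json`, `--model`, and `-c key=value` options named in the facts above.

```typescript
// Effort values that `model_reasoning_effort` supports per the facts above.
const CODEX_EFFORTS: string[] = ['minimal', 'low', 'medium', 'high', 'xhigh'];

interface CodexExecOptions {
  model: string | null;  // null lets Codex pick its default model
  effort: string | null; // null omits model_reasoning_effort entirely
  prompt: string;
}

// Hypothetical builder: returns argv entries, never a shell string.
function buildCodexExecArgs(options: CodexExecOptions): string[] {
  const argv: string[] = ['exec', '--json'];
  if (options.model) {
    argv.push('--model', options.model);
  }
  if (options.effort !== null) {
    // Reject anything outside the Codex effort set (for example Anthropic `max`).
    if (CODEX_EFFORTS.indexOf(options.effort) === -1) {
      throw new Error('Unsupported Codex reasoning effort: ' + options.effort);
    }
    // `-c` and the key=value pair are separate argv entries.
    argv.push('-c', 'model_reasoning_effort="' + options.effort + '"');
  }
  argv.push(options.prompt);
  return argv;
}
```

Returning an array keeps quoting out of the builder: the inner double quotes belong to the config value, and any shell quoting is applied only when rendering logs.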
+ +Recommended policy: + +- omit effort when it equals app-server `defaultReasoningEffort` +- pass effort only when user explicitly selected a non-default value + +Reason: + +- this tracks Codex defaults as Codex evolves +- exact logs remain cleaner +- future app-server default changes are not blocked by stale persisted values + +Live signoff command shape: + +```bash +codex exec --json --model gpt-5.4 -c model_reasoning_effort='"xhigh"' --skip-git-repo-check --ephemeral "Return only: ok" +``` + +Quoting requirement: + +- command builder must pass `-c` and `model_reasoning_effort="xhigh"` as separate argv entries +- shell-rendered exact logs can show `-c model_reasoning_effort='"xhigh"'` +- tests should assert argv arrays, not only shell strings +- never concatenate user-controlled effort into a shell string without argv escaping + +Prelaunch validation must block: + +- `gpt-5.1-codex-mini` with `low` +- `gpt-5.1-codex-mini` with `xhigh` +- unknown effort strings from app-server until explicitly supported by our UI and orchestrator type + +## Static Fallback + +Fallback stays necessary because: + +- user may have an older Codex binary +- app-server may fail to initialize +- app-server may start but not support `model/list` +- offline usage should not make the entire model picker empty +- tests should not depend on live Codex availability + +Fallback rules: + +- fallback source is explicitly marked `static-fallback` +- fallback never claims to be current +- fallback has a short visible warning in Provider Settings only when user is choosing Codex models +- fallback model list should be minimal and conservative +- fallback must not include newly guessed future models +- fallback caused by missing `model/list` should include an upgrade hint tied to the detected Codex binary version when available + +Recommended fallback models: + +- `gpt-5.4` +- `gpt-5.4-mini` +- `gpt-5.3-codex` +- `gpt-5.2` +- `gpt-5.1-codex-mini` + +Fallback effort rules: + +- use `medium | high` 
for `gpt-5.1-codex-mini` +- use `low | medium | high | xhigh` for known models only if live signoff confirms `model_reasoning_effort` pass-through +- otherwise fallback UI can show richer metadata but disable non-launchable options + +API-key mode note: + +- do not use OpenAI `/v1/models` as the primary Codex picker for subscription-backed Codex +- optional API `/v1/models` fallback is allowed only for explicit API-key mode diagnostics +- if API `/v1/models` disagrees with Codex app-server `model/list`, Codex app-server wins for native Codex execution +- reason: the actual runtime surface is `codex exec`, and app-server describes what Codex clients should show + +## Cache And Refresh + +Goal: + +- make model updates feel fresh without making Provider Settings slow or flaky. + +Main-process cache: + +- key: Codex binary path plus Codex binary version plus Codex home plus launch cwd/profile/config fingerprint plus preferred auth mode plus effective auth mode plus managed account hash plus API-key source +- success TTL: `10 minutes` +- stale TTL: `24 hours` +- in-flight dedupe: one live `model/list` request per key +- manual refresh bypasses success TTL but still dedupes in-flight work +- auth mode change invalidates the ready cache for UI selection purposes +- `forced_login_method` and forced workspace changes invalidate the affected auth/catalog scope +- logout clears ChatGPT-scoped catalog cache +- API key source change clears API-key-scoped catalog cache +- project `.codex/config.toml`, global `config.toml`, or `model_catalog_json` changes clear the affected scope when detected by fingerprint change +- binary path or version change clears all Codex model catalog cache entries + +Renderer cache: + +- consume `CliProviderStatus.modelCatalog` +- no independent polling loop in the model picker +- refresh through existing provider status refresh action + +Dashboard policy: + +- do not run `model/list` on every dashboard render +- use existing provider status refresh 
cadence +- model catalog stale state can be shown only inside settings/model picker, not as a scary dashboard error +- dashboard catalog is a global/default-scope summary, not a promise that every project cwd has the same catalog + +Provider Settings policy: + +- open dialog with cached provider status immediately +- refresh in background +- show `Checking...` only for the area still being refreshed +- never replace a ready catalog with empty state during a refresh + +Avoid this bug: + +- do not set global provider status to `unavailable` while only the model catalog refresh is pending +- do not replace a ChatGPT-ready account state with a catalog timeout +- do not show generic `Unknown error`; preserve app-server method, timeout, and fallback source in diagnostics +- if `auto` resolves to ChatGPT, API-key detection copy stays secondary +- if `auto` resolves to API key because ChatGPT is unavailable, show why ChatGPT was skipped before showing API-key catalog + +## UI Behavior + +### Model picker + +When `provider=codex`: + +- prefer `providerStatus.modelCatalog.models` +- option value is `launchModel` +- React key can use `catalogId` +- label uses `displayName` +- default badge uses `isDefault` +- hidden app-server models are excluded from normal selector unless already persisted in a team +- disabled state uses existing Agent Teams policy plus app-server `upgrade` hints +- runtime-capability state controls whether a visible model is launchable +- fallback badge says `Using fallback catalog` only when source is fallback +- if app-server says a model is available but Agent Teams disables it, show `Available in Codex, disabled for Agent Teams` +- if app-server says a future model exists but runtime capability is missing, show `Available in Codex, waiting for Agent Teams runtime support` +- if a persisted model is missing from current catalog, show it as `Unavailable in current Codex catalog` and require user confirmation before relaunch +- if the dialog has a 
selected cwd and only a global catalog is available, show global options as provisional until project-scoped catalog finishes +- if project-scoped catalog differs from global catalog, keep the user's explicit selection only if it exists in the project-scoped catalog or is a preserved persisted value + +When catalog is loading: + +- keep previous options visible +- show a subtle "Refreshing models" state +- do not show an empty Codex picker unless no cached or fallback models exist +- label provisional global catalog rows as `Checking this project...` when launch cwd is known + +When catalog fails: + +- use stale cache if present +- otherwise use static fallback +- show the app-server error in diagnostics, not as a generic unknown error + +### Effort selector + +When `provider=codex` and selected model has catalog metadata: + +- show efforts from `supportedReasoningEfforts` +- mark `defaultReasoningEffort` as default +- include `xhigh` if returned by app-server and runtime capability says Codex effort config pass-through is supported +- if runtime capability is missing, show Codex-only efforts as metadata or disabled rows, not selectable launch values +- if selected effort is no longer valid, reset to default with a small explanation +- if model is Agent Teams-disabled, keep effort selector read-only or disabled to avoid suggesting launchability + +When selected model has no catalog metadata: + +- show only safe fallback efforts +- do not show `xhigh` unless launch pass-through is implemented and tested + +When `provider=anthropic`: + +- keep current selector behavior +- do not show Codex-only `minimal`, `none`, or `xhigh` +- do not change Anthropic copy + +### Default model + +Recommended behavior: + +- app-server `isDefault` defines the Codex default in UI +- "Default" label can render as `Default (gpt-5.4)` or `Default (GPT-5.4)` when catalog is ready +- new Codex teams can display `Default`, but launch must resolve it to a concrete `resolvedLaunchModel` +- 
existing teams keep their persisted model unless user changes it +- do not rewrite old team metadata just because app-server default changed +- exact logs and team metadata should record both selected `Default` and concrete resolved model + +Reason: + +- new teams benefit from current Codex defaults +- existing teams remain explainable even if Codex default changes later + +## Orchestrator Changes + +### Model status + +Short term: + +- keep static `CODEX_MODELS` for standalone fallback and non-app UI compatibility +- add richer status only if orchestrator can read app-server directly without slowing CLI startup + +Recommended first cut: + +- `claude_team` owns app-server model catalog for UI +- orchestrator keeps static runtime status until a dedicated orchestrator catalog source is added +- launch validation accepts provider-explicit Codex model strings even if not in static `CODEX_MODELS` +- orchestrator exposes runtime capabilities for dynamic Codex model ids and Codex reasoning effort config pass-through + +Reason: + +- UI is where the dynamic picker is needed immediately +- orchestrator should not reject a future model that Codex app-server already exposed and `claude_team` selected +- UI should not guess whether the current runtime can launch that future model + +### Validation + +Update validation so: + +- provider-explicit `codex` launches can use model strings from app-server catalog +- unknown model strings are not guessed as Codex without provider context +- static `isCodexModel()` remains valid for generic detection, not authoritative for provider-explicit launches +- if provider context is missing, keep existing conservative static validation + +### Effort transport + +Update orchestrator: + +- accept Codex efforts `minimal | low | medium | high | xhigh` +- preserve Anthropic `max` +- in Codex native executor, convert Codex effort to `-c model_reasoning_effort='"value"'` +- do not pass unsupported effort values to `codex exec` +- exact logs should 
show the selected effort as normalized Agent Teams metadata and the actual Codex config override + +Required tests: + +- Codex native `xhigh` becomes `-c model_reasoning_effort='"xhigh"'` +- no effort omits `model_reasoning_effort` +- Anthropic `max` remains Anthropic-only +- Codex `max` is rejected +- Anthropic `xhigh` is rejected + +## Concrete Implementation Touchpoints + +`claude_team`: + +- `src/main/services/infrastructure/codexAppServer/protocol.ts` - add app-server model DTOs +- `src/main/services/infrastructure/codexAppServer/JsonRpcStdioClient.ts` - preserve JSON-RPC error code, method, and details +- `src/main/services/infrastructure/codexAppServer/CodexBinaryResolver.ts` or a nearby service - expose binary version for cache invalidation +- `src/features/codex-model-catalog` - new feature for catalog domain, use case, app-server source, fallback source, and cache +- `src/features/codex-account/main/composition/createCodexAccountFeature.ts` - coordinate combined control-plane snapshot or delegate to shared reader +- `src/features/codex-account/renderer/mergeCodexProviderStatusWithSnapshot.ts` - preserve account truth while merging model catalog truth +- `src/shared/types/cliInstaller.ts` - add optional provider model catalog +- `src/shared/types/team.ts` - widen provider-aware effort types without breaking old persisted values +- `src/shared/types/schedule.ts` - prevent scheduled launches from dropping Codex-specific efforts +- `src/main/services/team/TeamDataService.ts` - preserve provider-aware effort and launch identity when reconstructing team state +- `src/main/services/team/TeamMembersMetaStore.ts` - stop filtering Codex efforts down to legacy `low | medium | high` +- `src/main/services/team/TeamBackupService.ts` and restore paths - preserve additive launch identity and tolerate old backups +- `src/main/services/runtime/CliProviderModelAvailabilityService.ts` - keep runtime verification compatible with `launchModel` values and do not verify 
hidden/catalog-only rows by accident +- `src/main/ipc/teams.ts` and `src/main/http/teams.ts` - parse provider first, then validate effort +- `src/renderer/utils/teamModelAvailability.ts` - consume rich Codex catalog +- `src/renderer/utils/teamModelCatalog.ts` - demote Codex static list to fallback and labels only +- `src/renderer/components/team/dialogs/EffortLevelSelector.tsx` - make options provider/model-aware +- `src/renderer/components/team/dialogs/LaunchTeamDialog.tsx` and `CreateTeamDialog.tsx` - remove unsafe effort casts and persist resolved launch identity +- member draft/editor components - validate per-member resolved provider/model/effort +- renderer launch prefill and draft retry storage - add a versioned launch identity payload and tolerate old entries + +`agent_teams_orchestrator`: + +- `src/entrypoints/sdk/runtimeTypes.ts` - add provider-aware Codex effort support +- `src/main.tsx` - update `--effort` parser or provider-specific validation path +- `src/utils/effort.ts` and `src/utils/providerEffort.ts` - separate Anthropic `max` from Codex `xhigh` +- Codex native executor path - convert effort to `-c model_reasoning_effort` +- `src/utils/model/codex.ts` - rename static list semantics to fallback/static detection +- `src/utils/model/validateModel.ts` - allow provider-explicit Codex app-catalog models +- runtime status/capability endpoint - expose dynamic Codex model and effort pass-through support +- exact-log/runtime status code - record selected model, resolved model, selected effort, resolved effort, and config override + +## Phased Implementation + +### Phase 0 - contracts and live spike + +Commit boundary: `docs(codex): plan app-server model catalog` + +Tasks: + +- add this plan +- keep live probe output in signoff notes or test fixture +- confirm installed Codex supports `model/list` +- confirm one app-server session can read account, rate limits, and model catalog +- confirm docs support `model_reasoning_effort` +- decide exact shell quoting 
for `-c model_reasoning_effort` +- capture fixtures for at least two catalog shapes: current live shape and synthetic `id !== model` +- capture current Codex binary version and document cache invalidation expectations + +Acceptance: + +- plan exists in the dedicated worktree +- no code behavior changes +- weak areas are explicitly called out + +### Phase 1 - app-server model catalog feature + +Commit boundary: `feat(codex): add app-server model catalog source` + +Tasks: + +- add structured JSON-RPC request errors with method/code/details +- expose or probe Codex binary version for catalog cache keys +- add effective config fingerprint support using app-server `config/read` when available +- add `config/read` support detection and always send `{}` params at minimum +- add `src/features/codex-model-catalog` +- add app-server protocol types +- add `CodexModelCatalogAppServerClient` +- add normalization domain rules +- add static fallback source +- add in-memory cache with TTL and in-flight dedupe +- include launch scope fields in cache keys: cwd, profile, trust, config fingerprint, launch override fingerprint +- include forced login method and forced workspace hash in auth-scoped cache keys +- normalize both documented effort option objects and defensive string effort values +- classify `method not found`, timeout, malformed response, and empty catalog separately +- add structured diagnostics without raw account email or secret-bearing env values +- expose feature facade from main composition + +Acceptance: + +- JSON-RPC `method not found` can be detected in tests +- binary version changes invalidate catalog cache +- config fingerprint changes invalidate catalog cache for that scope +- forced login/workspace changes invalidate account, limits, and catalog cache for that scope +- unit tests cover normalization, fallback, pagination, duplicate ids, missing modalities, unknown effort strings, and `id !== model` +- app-server client tests cover `model/list` request params 
and timeout labels +- method-not-found falls back without marking account disconnected +- diagnostics include source, status, method, error category, binary version, effective auth mode, and cache age +- no renderer behavior changes yet + +### Phase 2 - provider status integration + +Commit boundary: `feat(runtime): expose codex model catalog metadata` + +Tasks: + +- add optional `modelCatalog` to `CliProviderStatus` +- add optional `runtimeCapabilities` to `CliProviderStatus` +- merge Codex model catalog into provider status +- keep `models: string[]` derived from `launchModel` +- make provider refresh use cached, auth-scoped catalog +- implement combined account/rate-limits/catalog app-server read for normal refresh +- avoid extra app-server session in hot paths where account snapshot already refreshes +- clear ChatGPT-scoped catalog on logout and API-key-scoped catalog when API key source changes +- clear all catalog entries when Codex binary path or version changes +- ensure `auto` catalog scope follows effective launch auth mode, not just configured preference +- add request/snapshot versioning so stale refresh responses cannot overwrite newer auth state +- support global provider refresh and project-scoped launch refresh as different catalog scopes +- preserve Anthropic provider status shape + +Acceptance: + +- Codex provider status includes `modelCatalog` +- Codex provider status includes runtime capability metadata when available +- old `models` still works +- `auto` with ChatGPT ready uses ChatGPT-scoped catalog even if API key is detected +- `auto` with ChatGPT unavailable and API key ready uses API-key-scoped catalog with clear degraded copy +- forced login method overrides are reflected in effective auth copy and cache scope +- one normal Codex provider refresh does not spawn separate app-server processes for account, limits, and catalog +- Anthropic snapshots are byte-for-byte equivalent except ordering noise already present +- provider dashboard does 
not block on a slow catalog refresh when stale cache exists +- older refresh results are ignored after auth mode or runtime capability changes +- global dashboard catalog and project launch catalog do not overwrite each other + +### Phase 3 - dynamic UI model picker and effort selector + +Commit boundary: `feat(codex): use dynamic model catalog in team launch UI` + +Tasks: + +- update Codex model picker to prefer rich catalog +- show app-server labels, default badge, and fallback source state +- update effort selector to be provider/model-aware +- show `xhigh` metadata only for Codex models that return it +- make `xhigh` selectable only when runtime capability says Codex effort config pass-through is supported +- hide Codex-only efforts for Anthropic +- reset invalid effort on model change +- preserve missing persisted models as visible warning rows instead of silently clearing selection +- keep Agent Teams disabled policy separate from Codex app-server availability +- show future app-server models immediately, with `New from Codex catalog` status when policy has not verified them yet +- when cwd is selected, refresh project-scoped Codex catalog before enabling launch-only controls + +Acceptance: + +- `gpt-5.1-codex-mini` shows only `medium | high` +- `gpt-5.3-codex-spark` defaults to `high` +- `gpt-5.4` shows `low | medium | high | xhigh` as catalog metadata +- `xhigh` is disabled with runtime-upgrade copy until capability support is present +- app-server-visible but Agent Teams-disabled model shows disabled copy, not unavailable copy +- synthetic future `gpt-5.5` fixture appears without touching static catalog +- persisted model missing from current catalog is visible with a warning +- Anthropic UI remains `low | medium | high` +- static fallback still renders when app-server is unavailable +- global catalog can be displayed provisionally, but launch enablement waits for project-scoped catalog or explicit degraded confirmation + +### Phase 4 - launch validation 
and Codex effort pass-through + +Commit boundary: `feat(runtime): pass codex reasoning effort through native exec` + +Tasks: + +- widen team launch effort validation with provider-specific rules +- update IPC and HTTP validators +- update `TeamProvisioningService` request shaping +- persist additive `ProviderModelLaunchIdentity` into team metadata, exact-log metadata, and backup/restore payloads where launch identity is reconstructed +- update orchestrator parser and runtime types +- expose orchestrator runtime capability metadata for dynamic Codex models and Codex effort config +- translate Codex effort to argv entries `['-c', 'model_reasoning_effort="value"']` +- keep Anthropic `max` separate +- add exact-log metadata for selected model, resolved launch model, catalog source, selected effort, and resolved effort +- resolve `Default` to concrete launch model before provisioning +- update scheduled/provisioned launch paths or block Codex-only efforts in those paths until updated +- enforce built-in OpenAI Codex provider scope or block custom/OSS provider configs with clear copy +- pass profile/cwd/config overrides consistently between preview and `codex exec` + +Acceptance: + +- Codex `xhigh` launch reaches `codex exec` as `model_reasoning_effort` +- Codex `max` is rejected before launch +- Anthropic `xhigh` is rejected before launch +- unsupported model-effort pairs are blocked before provisioning +- provider-explicit synthetic future model is accepted only when runtime capability says dynamic Codex models are supported +- member metadata, team metadata, draft retry, and backup/restore preserve provider-aware effort +- replay/exact logs show what was selected, what default resolved to, and what was passed to Codex +- exact logs include catalog scope fingerprint and provider scope, but not raw config values + +### Phase 5 - cleanup and fallback tightening + +Commit boundary: `refactor(codex): demote static model catalog to fallback` + +Tasks: + +- rename static 
Codex catalog helpers to make fallback status explicit +- remove UI assumptions that static list is authoritative +- make future provider-explicit Codex ids launchable when selected from app-server catalog +- add diagnostics for catalog source and staleness +- document fallback behavior +- add a fixture/test with synthetic future model `gpt-5.5` +- remove any remaining hardcoded Codex model order from the primary Codex UI path +- add hidden-model fixture and upgrade-suggestion fixture +- add one migration test for old localStorage launch prefill without provider model launch identity +- add project-scoped catalog fixture with `model_catalog_json` +- add custom-provider config fixture +- add forced login method and forced workspace fixtures +- add `config/read` method-missing and invalid-params fixtures + +Acceptance: + +- new app-server model can appear in UI without code changes +- static fallback is visible as fallback in diagnostics +- no code path treats static `CODEX_MODELS` as the only valid Codex provider model list +- synthetic `gpt-5.5` appears through app-server fixture and can be selected without touching static catalog +- hidden persisted model is preserved with warning and is not introduced into new-team picker +- project-scoped catalog differences are visible and do not corrupt global provider status +- forced login method changes are visible and do not reuse stale catalog/rate-limit scope + +## Test Plan + +### `claude_team` unit tests + +Add tests for: + +- structured JSON-RPC error classification +- binary version cache invalidation +- effective config fingerprint cache invalidation +- `config/read` support detection, including the invalid-params error returned when params are missing +- project-scoped `model_catalog_json` fixture +- app-server model normalization +- `id` vs `model` split +- default model selection +- per-model effort options +- unknown effort filtering +- auth-scoped catalog cache keys +- `auto` auth resolving to ChatGPT vs API-key catalog scope +- combined 
app-server snapshot partial failures +- method-not-found fallback for older Codex app-server +- fallback catalog source +- stale cache behavior +- stale refresh response is ignored after newer auth-scope request +- global catalog and project-scoped catalog use separate cache entries +- forced login method and forced workspace hash use separate cache entries +- custom/OSS `model_provider` config is blocked or marked unsupported for Agent Teams Codex +- raw managed account email does not appear in catalog diagnostics or exact-log metadata +- provider status `models` compatibility +- provider status runtime capabilities compatibility +- provider model availability uses `launchModel`, not `catalogId` +- renderer model picker with rich catalog +- renderer effort selector with Codex and Anthropic providers +- renderer disables Codex-only efforts when runtime capability is missing +- renderer shows synthetic future model as `New from Codex catalog` +- renderer preserves hidden persisted model after `includeHidden: true` recovery +- persisted missing model warning row +- Agent Teams disabled policy overlay for app-server-visible models +- backup/restore reads old metadata and preserves new launch identity when present +- draft retry and launch prefill read old localStorage entries without dropping provider/model identity +- scheduled launch validation either supports Codex-specific effort or blocks it with explicit error +- launch preview with selected cwd does not enable launch from global-only catalog when project-scoped catalog is still unknown + +Suggested commands: + +```bash +pnpm vitest run \ + test/features/codex-model-catalog \ + test/features/codex-account \ + test/renderer/components/team \ + test/renderer/utils/teamModelCatalog.test.ts +``` + +### `agent_teams_orchestrator` tests + +Add tests for: + +- provider-explicit Codex model validation +- Codex effort parser accepts `minimal | low | medium | high | xhigh` +- Anthropic effort parser keeps existing 
behavior +- Codex native executor emits `-c model_reasoning_effort` +- Codex native executor builds argv entries, not unsafe shell concatenation +- no effort omits Codex effort config +- `max` is not accepted for Codex +- synthetic `gpt-5.5` passes when provider is explicitly Codex and model came from app catalog +- capability payload reports dynamic Codex model support and effort config support +- provider-explicit future model fails closed when capability is disabled +- Codex native exec argv includes cwd/profile/config override semantics that match preview scope +- custom provider config is not silently routed through subscription Codex UX + +Suggested command: + +```bash +pnpm test -- runtimeBackends providerEffort spawnMultiAgent codex +``` + +### live smoke + +Run only when developer has Codex login/API available: + +```bash +codex app-server +``` + +JSON-RPC smoke: + +```json +{ "jsonrpc": "2.0", "id": 1, "method": "model/list", "params": { "limit": 20, "includeHidden": false } } +``` + +Native exec effort smoke: + +```bash +codex exec --json --model gpt-5.4 -c model_reasoning_effort='"xhigh"' --skip-git-repo-check --ephemeral "Return only: ok" +``` + +Failure smoke: + +```bash +codex exec --json --model gpt-5.1-codex-mini -c model_reasoning_effort='"xhigh"' --skip-git-repo-check --ephemeral "Return only: ok" +``` + +Expected: + +- our app should block the second case before launch once catalog metadata is available +- if run manually, Codex may return model/provider-specific error, but product UX should not rely on that late failure + +## Risks And Mitigations + +### Risk 1 - app-server startup slows provider settings + +`🎯 8 🛡️ 8 🧠 5` + +Mitigation: + +- cache model catalog in main process +- dedupe in-flight refreshes +- use stale cache while refreshing +- combine account/rate-limit/catalog reads where possible +- never clear ready UI while refresh is pending + +### Risk 2 - effort values leak into Anthropic + +`🎯 9 🛡️ 9 🧠 4` + +Mitigation: + +- 
provider-specific effort validation +- renderer selector branches by provider and selected model +- tests for Anthropic not showing `xhigh`, `minimal`, or `none` +- orchestrator rejects invalid provider-effort pairs + +### Risk 3 - `id` and `model` diverge later + +`🎯 8 🛡️ 9 🧠 3` + +Mitigation: + +- use `catalogId` for identity +- use `launchModel` for runtime +- tests with fixture where `id !== model` + +### Risk 4 - app-server catalog has unknown fields or new efforts + +`🎯 8 🛡️ 8 🧠 5` + +Mitigation: + +- tolerant protocol DTOs +- unknown efforts preserved in diagnostics but not selectable +- add one small allow-list update when product intentionally supports a new effort +- no hard crash on unknown `inputModalities` + +### Risk 5 - static fallback becomes accidentally authoritative again + +`🎯 7 🛡️ 8 🧠 4` + +Mitigation: + +- name fallback helpers clearly +- include `source` in model catalog +- tests assert app-server source wins over fallback +- UI diagnostics expose fallback source + +### Risk 6 - launch path accepts model from UI but orchestrator rejects it + +`🎯 8 🛡️ 8 🧠 6` + +Mitigation: + +- provider-explicit Codex launch validation should trust `provider=codex` plus app-server-selected model +- static `isCodexModel()` remains only a generic detector +- exact tests with a future-model fixture like `gpt-5.5` + +### Risk 7 - auth-scoped catalog leaks between modes + +`🎯 7 🛡️ 9 🧠 6` + +Mitigation: + +- include auth scope in catalog cache key +- clear scoped cache on logout and API-key source changes +- tests for ChatGPT catalog not being reused in API-key mode +- UI labels catalog source and auth scope in diagnostics + +### Risk 8 - Default becomes nondeterministic across relaunch + +`🎯 8 🛡️ 9 🧠 6` + +Mitigation: + +- persist selected model kind and resolved launch model in launch identity +- exact logs record both `Default` and concrete model +- relaunch preview shows current default resolution before launch +- do not silently rewrite old explicit models + 
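The Risk 8 mitigation above can be sketched in TypeScript. This is a minimal illustration under stated assumptions, not the shipped code: `CatalogModelSketch`, `ModelSelectionSketch`, `LaunchIdentitySketch`, and `resolveLaunchIdentity` are hypothetical names, and the `'app-server'` source label is assumed (only `static-fallback` is named in this plan). The field names `catalogId`, `launchModel`, `isDefault`, and `resolvedLaunchModel` follow the plan.

```typescript
// Hypothetical shapes for illustration only; the real contracts live in
// `@shared/types` and may differ.
interface CatalogModelSketch {
  catalogId: string; // identity inside the catalog (React key)
  launchModel: string; // value actually handed to the runtime
  isDefault: boolean;
}

type ModelSelectionSketch =
  | { kind: 'default' }
  | { kind: 'explicit'; launchModel: string };

interface LaunchIdentitySketch {
  selectedModelKind: 'default' | 'explicit';
  resolvedLaunchModel: string;
  catalogSource: 'app-server' | 'static-fallback'; // 'app-server' is an assumed label
}

function resolveLaunchIdentity(
  selection: ModelSelectionSketch,
  models: readonly CatalogModelSketch[],
  catalogSource: LaunchIdentitySketch['catalogSource'],
): LaunchIdentitySketch {
  if (selection.kind === 'explicit') {
    // Explicit selections are kept as-is, even if the catalog default has moved.
    return {
      selectedModelKind: 'explicit',
      resolvedLaunchModel: selection.launchModel,
      catalogSource,
    };
  }
  for (const model of models) {
    if (model.isDefault) {
      // Record both facts: the user chose `Default`, and what it resolved to.
      return {
        selectedModelKind: 'default',
        resolvedLaunchModel: model.launchModel,
        catalogSource,
      };
    }
  }
  // Fail before provisioning instead of guessing a model.
  throw new Error('Codex catalog has no default model; cannot resolve "Default"');
}
```

Throwing when no row is marked `isDefault` is one possible choice; a real implementation might instead fall back to the static catalog's default and mark the identity as degraded.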
+### Risk 9 - older Codex binary lacks `model/list` + +`🎯 7 🛡️ 9 🧠 5` + +Mitigation: + +- preserve JSON-RPC error code and method +- classify method-not-found separately from app-server failure +- show static fallback with Codex upgrade hint +- cache key includes binary version so upgrades refresh the catalog + +### Risk 10 - `auto` auth shows the wrong catalog + +`🎯 7 🛡️ 9 🧠 6` + +Mitigation: + +- resolve effective auth mode before catalog scope +- keep ChatGPT and API-key catalogs separate +- UI copy distinguishes selected preference, effective launch mode, and fallback credentials +- tests cover ChatGPT-ready + API-key-present and ChatGPT-missing + API-key-ready cases + +### Risk 11 - UI enables a capability the installed runtime cannot launch + +`🎯 7 🛡️ 10 🧠 7` + +Mitigation: + +- add explicit runtime capability metadata +- display catalog metadata separately from launch enablement +- fail closed when capability is missing or stale +- test Phase 3 UI against a pre-Phase-4 runtime fixture + +### Risk 12 - future models appear but break team-agent behavior + +`🎯 8 🛡️ 8 🧠 6` + +Mitigation: + +- split Codex catalog availability from Agent Teams policy status +- show new models as `New from Codex catalog` +- block only hard incompatibilities: runtime capability missing, unsupported modality, disabled policy, unsupported effort +- exact logs record new-model status for later debugging + +### Risk 13 - hidden or upgraded persisted models are silently lost + +`🎯 8 🛡️ 9 🧠 5` + +Mitigation: + +- run one `includeHidden: true` lookup for persisted explicit models missing from visible catalog +- preserve model value during restore and relaunch preview +- show upgrade suggestions without auto-rewriting metadata +- test hidden-model and upgrade fixtures + +### Risk 14 - non-dialog launch path drops Codex effort + +`🎯 7 🛡️ 9 🧠 7` + +Mitigation: + +- audit team metadata, members metadata, backup/restore, draft retry, launch prefill, and schedule types +- parse provider before 
parsing effort at every main-process boundary +- block Codex-only effort in any path not updated in the same phase +- add tests outside React launch dialogs + +### Risk 15 - HMR or slow refresh overwrites correct provider state + +`🎯 8 🛡️ 9 🧠 5` + +Mitigation: + +- add request/snapshot versioning +- ignore out-of-order provider status responses +- do not let catalog failures overwrite account truth +- keep last ready state visible while a refresh is pending + +### Risk 16 - global catalog preview differs from project launch catalog + +`🎯 6 🛡️ 10 🧠 8` + +Mitigation: + +- include cwd, profile, trust, config fingerprint, and launch override fingerprint in catalog scope +- use app-server `config/read` when available to derive effective config +- keep dashboard/global catalog separate from launch/project catalog +- require project-scoped catalog before enabling launch-only controls when cwd is known + +### Risk 17 - custom or OSS Codex config is mistaken for subscription Codex + +`🎯 7 🛡️ 9 🧠 7` + +Mitigation: + +- keep Agent Teams Codex scoped to built-in OpenAI Codex provider +- detect effective `model_provider` when possible +- block or degrade custom/OSS provider configs with explicit copy +- do not show ChatGPT account limits for custom provider execution + +### Risk 18 - non-text model row appears in catalog + +`🎯 8 🛡️ 9 🧠 4` + +Mitigation: + +- require `text` input modality for Agent Teams launch +- treat missing `inputModalities` with the documented backward-compatible default +- do not claim personality support when `supportsPersonality=false` + +### Risk 19 - experimental app-server surface changes behavior + +`🎯 8 🛡️ 9 🧠 4` + +Mitigation: + +- keep `experimentalApi=false` +- rely only on documented stable `model/list` fields +- ignore unknown fields unless a typed use case is added + +### Risk 20 - app-server catalog passes but native exec fails + +`🎯 8 🛡️ 10 🧠 6` + +Mitigation: + +- treat app-server catalog as picker truth, not full launch proof +- require 
Phase 4 native exec argv tests and live smoke where possible +- test model, effort, cwd, profile, and provider scope together +- block unsupported model-effort pairs before `codex exec` + +### Risk 21 - `config/read` behavior differs across Codex versions + +`🎯 7 🛡️ 9 🧠 6` + +Mitigation: + +- feature-detect `config/read` +- always send `{}` params at minimum +- classify method-missing, invalid-params, scoped-failure, and global-success separately +- never make config-read failure disconnect the Codex account + +### Risk 22 - forced login/workspace reuses stale catalog + +`🎯 7 🛡️ 10 🧠 6` + +Mitigation: + +- include forced login method and forced workspace hash in auth scope +- invalidate account, limits, and catalog together when either changes +- display forced auth copy instead of showing conflicting selected auth copy +- redact workspace ids in logs and diagnostics + +### Risk 23 - local `model_catalog_json` changes without config change + +`🎯 6 🛡️ 9 🧠 7` + +Mitigation: + +- hash effective config and optionally active catalog file mtime when app-server exposes enough origin data +- keep TTL/manual refresh fallback when origin data is unavailable +- do not parse or log arbitrary catalog file contents +- do not apply untrusted project-scoped catalog files unless effective config says they are active + +## Definition Of Done + +The feature is done when: + +- Codex model picker uses app-server `model/list` when available. +- New app-server-visible Codex models appear without app code changes. +- `supportedReasoningEfforts` and `defaultReasoningEffort` drive Codex effort UI. +- `xhigh` appears only where Codex reports it. +- Anthropic UI and launch behavior are unchanged. +- Codex launches pass effort through `model_reasoning_effort`. +- UI launch controls are gated by runtime capabilities, not by catalog metadata alone. +- Future app-server-visible models appear without code changes and are marked as new until policy/runtime support is clear. 
+- `Default` Codex selection resolves to concrete launch identity before provisioning. +- Auth changes do not reuse stale model catalogs across ChatGPT and API-key modes. +- Project-scoped Codex config and `model_catalog_json` cannot make launch use a different catalog than preview without explicit degraded copy. +- Custom or OSS Codex provider config is not silently presented as ChatGPT subscription-backed Agent Teams Codex. +- `config/read` compatibility is feature-detected and never breaks account truth on older binaries. +- Forced login method and forced workspace changes cannot reuse stale account, rate-limit, or catalog cache. +- Codex binary upgrades invalidate stale catalog cache and retry `model/list`. +- Older Codex binaries without `model/list` fall back without breaking account state. +- Static Codex catalog is clearly fallback, not primary truth. +- Hidden persisted models are preserved with explicit warnings. +- Backup/restore, draft retry, launch prefill, member metadata, and scheduled paths do not drop provider-aware effort. +- Exact logs and diagnostics do not persist raw account identifiers or secret values. +- Exact logs include catalog scope and provider scope fingerprints for debugging preview vs launch mismatch. +- HMR and out-of-order refreshes do not replace ready provider status with stale fallback/error state. +- Provider Settings remains fast and does not show transient empty/error states during refresh. +- Tests cover catalog source, fallback, effort validation, and launch pass-through. + +## Final Signoff And Handoff + +The implementation is now ready for review after these checks stay green: + +1. `claude_team`: `pnpm typecheck` +2. `claude_team`: targeted catalog/runtime/team provisioning Vitest suites +3. `agent_teams_orchestrator_codex_native_spike`: targeted Codex native exec and runtime capability Bun suites +4. Live `codex app-server model/list` smoke against the installed Codex binary +5. 
Optional UI smoke with `CLAUDE_DEV_RUNTIME_ROOT=/Users/belief/dev/projects/claude/agent_teams_orchestrator_codex_native_spike`
+
+Merge requirement:
+
+- merge/pair the `claude_team` branch with the `agent_teams_orchestrator_codex_native_spike` runtime capability change.
+- if the UI branch is merged without the runtime capability change, the feature remains safe but conservative: dynamic future Codex models and `xhigh` are visible as catalog metadata but blocked for launch.
+- if the runtime capability change is merged without the UI branch, existing Codex native behavior remains unchanged except for the explicit runtime status payload and `xhigh` exact argv support already covered by tests.
+
+Recommended final manual smoke:
+
+```bash
+CLAUDE_DEV_RUNTIME_ROOT=/Users/belief/dev/projects/claude/agent_teams_orchestrator_codex_native_spike pnpm dev
+```
+
+Then verify:
+
+- Provider Settings Codex model list is populated from app-server catalog.
+- `gpt-5.1-codex-mini` shows only `medium | high`.
+- `gpt-5.4` shows `low | medium | high | xhigh`.
+- Anthropic does not show `minimal`, `none`, or `xhigh`.
+- A synthetic or newly released Codex model is not silently hidden by static UI code.
+- Launch logs include selected model, resolved launch model, selected effort, resolved effort, catalog source, and runtime capability truth.
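Reviewer sketch (not part of the patch): a minimal illustration of the per-model effort gate and config pass-through that the checks above exercise. `CatalogModel` and `buildCodexExecArgs` are hypothetical names chosen for this example, and the `exec --model` argv shape is an assumption about the runner; only the `-c model_reasoning_effort="value"` form and the per-model effort lists come from this plan.

```typescript
// Hypothetical types and helper for illustration only; the real runner may differ.
type ReasoningEffort = 'minimal' | 'low' | 'medium' | 'high' | 'xhigh';

interface CatalogModel {
  launchModel: string;
  supportedReasoningEfforts: readonly ReasoningEffort[];
  defaultReasoningEffort: ReasoningEffort | null;
}

// Reject unsupported model-effort pairs before anything is spawned, then pass
// the effort as a Codex config override instead of an invented --effort flag.
function buildCodexExecArgs(model: CatalogModel, requestedEffort?: ReasoningEffort): string[] {
  const effort = requestedEffort ?? model.defaultReasoningEffort;
  if (effort !== null && !model.supportedReasoningEfforts.includes(effort)) {
    throw new Error(`Effort "${effort}" is not supported by ${model.launchModel}.`);
  }

  // Assumed argv shape: `exec --model <id>` followed by optional overrides.
  const args = ['exec', '--model', model.launchModel];
  if (effort !== null) {
    args.push('-c', `model_reasoning_effort="${effort}"`);
  }
  return args;
}

// Mirrors the static fallback entry for the mini model (medium | high only).
const mini: CatalogModel = {
  launchModel: 'gpt-5.1-codex-mini',
  supportedReasoningEfforts: ['medium', 'high'],
  defaultReasoningEffort: 'medium',
};

console.log(buildCodexExecArgs(mini, 'high').join(' '));
```

Validating before building argv keeps an unsupported pair such as `gpt-5.1-codex-mini` with `xhigh` from failing late during spawn.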
diff --git a/src/features/codex-model-catalog/contracts/dto.ts b/src/features/codex-model-catalog/contracts/dto.ts
new file mode 100644
index 00000000..10b65283
--- /dev/null
+++ b/src/features/codex-model-catalog/contracts/dto.ts
@@ -0,0 +1,13 @@
+import type {
+  CliProviderModelCatalog,
+  CliProviderModelCatalogItem,
+  CliProviderModelCatalogSource,
+  CliProviderModelCatalogStatus,
+  CliProviderReasoningEffort,
+} from '@shared/types';
+
+export type CodexModelCatalogDto = CliProviderModelCatalog;
+export type CodexModelCatalogItemDto = CliProviderModelCatalogItem;
+export type CodexModelCatalogSourceDto = CliProviderModelCatalogSource;
+export type CodexModelCatalogStatusDto = CliProviderModelCatalogStatus;
+export type CodexModelReasoningEffortDto = CliProviderReasoningEffort;
diff --git a/src/features/codex-model-catalog/contracts/index.ts b/src/features/codex-model-catalog/contracts/index.ts
new file mode 100644
index 00000000..894516e9
--- /dev/null
+++ b/src/features/codex-model-catalog/contracts/index.ts
@@ -0,0 +1,7 @@
+export type {
+  CodexModelCatalogDto,
+  CodexModelCatalogItemDto,
+  CodexModelCatalogSourceDto,
+  CodexModelCatalogStatusDto,
+  CodexModelReasoningEffortDto,
+} from './dto';
diff --git a/src/features/codex-model-catalog/core/domain/__tests__/normalizeCodexAppServerModel.test.ts b/src/features/codex-model-catalog/core/domain/__tests__/normalizeCodexAppServerModel.test.ts
new file mode 100644
index 00000000..ed7f325c
--- /dev/null
+++ b/src/features/codex-model-catalog/core/domain/__tests__/normalizeCodexAppServerModel.test.ts
@@ -0,0 +1,95 @@
+import { describe, expect, it } from 'vitest';
+
+import { normalizeCodexAppServerModels } from '../normalizeCodexAppServerModel';
+
+describe('normalizeCodexAppServerModels', () => {
+  it('keeps app-server model metadata required by the UI picker', () => {
+    const result = normalizeCodexAppServerModels([
+      {
+        id: 'gpt-5.5',
+        displayName: 'GPT-5.5',
+        supportedReasoningEfforts: [
+          {
reasoningEffort: 'low' },
+          { reasoningEffort: 'medium' },
+          { reasoningEffort: 'high' },
+          { reasoningEffort: 'xhigh' },
+        ],
+        defaultReasoningEffort: 'xhigh',
+        inputModalities: ['text', 'image'],
+        supportsPersonality: true,
+        isDefault: true,
+      },
+    ]);
+
+    expect(result.defaultModelId).toBe('gpt-5.5');
+    expect(result.models).toEqual([
+      expect.objectContaining({
+        id: 'gpt-5.5',
+        launchModel: 'gpt-5.5',
+        displayName: 'GPT-5.5',
+        supportedReasoningEfforts: ['low', 'medium', 'high', 'xhigh'],
+        defaultReasoningEffort: 'xhigh',
+        inputModalities: ['text', 'image'],
+        supportsPersonality: true,
+        isDefault: true,
+        source: 'app-server',
+      }),
+    ]);
+  });
+
+  it('filters hidden models unless the caller explicitly asks for them', () => {
+    const result = normalizeCodexAppServerModels([
+      { id: 'gpt-visible', hidden: false },
+      { id: 'gpt-hidden', hidden: true },
+    ]);
+
+    expect(result.models.map((model) => model.id)).toEqual(['gpt-visible']);
+
+    const withHidden = normalizeCodexAppServerModels(
+      [
+        { id: 'gpt-visible', hidden: false },
+        { id: 'gpt-hidden', hidden: true },
+      ],
+      { includeHidden: true }
+    );
+
+    expect(withHidden.models.map((model) => model.id)).toEqual(['gpt-visible', 'gpt-hidden']);
+  });
+
+  it('drops unknown effort values instead of leaking them into launch options', () => {
+    const result = normalizeCodexAppServerModels([
+      {
+        id: 'gpt-5.4',
+        supportedReasoningEfforts: ['none', 'medium', { reasoningEffort: 'future-effort' }],
+        defaultReasoningEffort: 'future-effort',
+      },
+    ]);
+
+    expect(result.models[0]?.supportedReasoningEfforts).toEqual(['medium']);
+    expect(result.models[0]?.defaultReasoningEffort).toBe('medium');
+  });
+
+  it('uses model as the launch value and de-duplicates duplicate launch models', () => {
+    const result = normalizeCodexAppServerModels([
+      {
+        id: 'catalog-alias',
+        model: 'gpt-5.5',
+        displayName: 'GPT-5.5 Alias',
+      },
+      {
+        id: 'catalog-duplicate',
+        model: 'gpt-5.5',
+        displayName: 'Duplicate GPT-5.5
Alias',
+      },
+    ]);
+
+    expect(result.models).toEqual([
+      expect.objectContaining({
+        id: 'catalog-alias',
+        launchModel: 'gpt-5.5',
+        displayName: 'GPT-5.5 Alias',
+      }),
+    ]);
+    expect(result.diagnostics).toContain('model/list returned duplicate launch model gpt-5.5.');
+  });
+});
diff --git a/src/features/codex-model-catalog/core/domain/codexModelCatalogFallback.ts b/src/features/codex-model-catalog/core/domain/codexModelCatalogFallback.ts
new file mode 100644
index 00000000..5f9c724c
--- /dev/null
+++ b/src/features/codex-model-catalog/core/domain/codexModelCatalogFallback.ts
@@ -0,0 +1,61 @@
+import type { CliProviderModelCatalogItem, CliProviderReasoningEffort } from '@shared/types';
+
+const DEFAULT_CODEX_EFFORTS = ['low', 'medium', 'high', 'xhigh'] as const;
+const MINI_CODEX_EFFORTS = ['medium', 'high'] as const;
+
+function createFallbackModel(options: {
+  id: string;
+  displayName: string;
+  badgeLabel: string;
+  isDefault?: boolean;
+  efforts?: readonly CliProviderReasoningEffort[];
+  defaultEffort?: CliProviderReasoningEffort;
+}): CliProviderModelCatalogItem {
+  const efforts = [...(options.efforts ?? DEFAULT_CODEX_EFFORTS)];
+  return {
+    id: options.id,
+    launchModel: options.id,
+    displayName: options.displayName,
+    hidden: false,
+    supportedReasoningEfforts: efforts,
+    defaultReasoningEffort: options.defaultEffort ??
'medium',
+    inputModalities: ['text', 'image'],
+    supportsPersonality: false,
+    isDefault: options.isDefault === true,
+    upgrade: false,
+    source: 'static-fallback',
+    badgeLabel: options.badgeLabel,
+  };
+}
+
+export function createStaticCodexModelCatalogModels(): CliProviderModelCatalogItem[] {
+  return [
+    createFallbackModel({
+      id: 'gpt-5.4',
+      displayName: 'GPT-5.4',
+      badgeLabel: '5.4',
+      isDefault: true,
+    }),
+    createFallbackModel({
+      id: 'gpt-5.4-mini',
+      displayName: 'GPT-5.4 Mini',
+      badgeLabel: '5.4-mini',
+    }),
+    createFallbackModel({
+      id: 'gpt-5.3-codex',
+      displayName: 'GPT-5.3 Codex',
+      badgeLabel: '5.3-codex',
+    }),
+    createFallbackModel({
+      id: 'gpt-5.2',
+      displayName: 'GPT-5.2',
+      badgeLabel: '5.2',
+    }),
+    createFallbackModel({
+      id: 'gpt-5.1-codex-mini',
+      displayName: 'GPT-5.1 Codex Mini',
+      badgeLabel: '5.1-codex-mini',
+      efforts: MINI_CODEX_EFFORTS,
+    }),
+  ];
+}
diff --git a/src/features/codex-model-catalog/core/domain/codexReasoningEffort.ts b/src/features/codex-model-catalog/core/domain/codexReasoningEffort.ts
new file mode 100644
index 00000000..e817c6c0
--- /dev/null
+++ b/src/features/codex-model-catalog/core/domain/codexReasoningEffort.ts
@@ -0,0 +1,24 @@
+import type { CliProviderReasoningEffort } from '@shared/types';
+
+export const CODEX_REASONING_EFFORTS = [
+  'minimal',
+  'low',
+  'medium',
+  'high',
+  'xhigh',
+] as const satisfies readonly CliProviderReasoningEffort[];
+
+const CODEX_REASONING_EFFORT_SET = new Set<string>(CODEX_REASONING_EFFORTS);
+
+export function isCodexReasoningEffort(value: unknown): value is CliProviderReasoningEffort {
+  return typeof value === 'string' && CODEX_REASONING_EFFORT_SET.has(value);
+}
+
+export function normalizeCodexReasoningEffort(value: unknown): CliProviderReasoningEffort | null {
+  if (typeof value !== 'string') {
+    return null;
+  }
+
+  const normalized = value.trim().toLowerCase();
+  return isCodexReasoningEffort(normalized) ?
normalized : null;
+}
diff --git a/src/features/codex-model-catalog/core/domain/normalizeCodexAppServerModel.ts b/src/features/codex-model-catalog/core/domain/normalizeCodexAppServerModel.ts
new file mode 100644
index 00000000..075c81e4
--- /dev/null
+++ b/src/features/codex-model-catalog/core/domain/normalizeCodexAppServerModel.ts
@@ -0,0 +1,151 @@
+import { normalizeCodexReasoningEffort, CODEX_REASONING_EFFORTS } from './codexReasoningEffort';
+
+import type { CodexAppServerModel } from '@main/services/infrastructure/codexAppServer';
+import type { CliProviderModelCatalogItem, CliProviderReasoningEffort } from '@shared/types';
+
+export interface NormalizedCodexModelCatalogResult {
+  models: CliProviderModelCatalogItem[];
+  defaultModelId: string | null;
+  diagnostics: string[];
+}
+
+function normalizeModelId(model: CodexAppServerModel): string | null {
+  const id = model.id?.trim() || model.model?.trim() || null;
+  return id && id.length > 0 ? id : null;
+}
+
+function normalizeEffortOption(option: unknown): CliProviderReasoningEffort | null {
+  if (typeof option === 'string') {
+    return normalizeCodexReasoningEffort(option);
+  }
+
+  if (option && typeof option === 'object' && 'reasoningEffort' in option) {
+    return normalizeCodexReasoningEffort((option as { reasoningEffort?: unknown }).reasoningEffort);
+  }
+
+  return null;
+}
+
+function normalizeEfforts(model: CodexAppServerModel): CliProviderReasoningEffort[] {
+  const efforts = model.supportedReasoningEfforts?.flatMap((option) => {
+    const normalized = normalizeEffortOption(option);
+    return normalized ?
[normalized] : [];
+  });
+
+  if (!efforts || efforts.length === 0) {
+    return ['low', 'medium', 'high'];
+  }
+
+  return CODEX_REASONING_EFFORTS.filter((effort) => efforts.includes(effort));
+}
+
+function normalizeDefaultEffort(
+  defaultEffort: unknown,
+  supportedEfforts: readonly CliProviderReasoningEffort[]
+): CliProviderReasoningEffort | null {
+  const normalized = normalizeCodexReasoningEffort(defaultEffort);
+  if (!normalized) {
+    return supportedEfforts.includes('medium') ? 'medium' : (supportedEfforts[0] ?? null);
+  }
+
+  return supportedEfforts.includes(normalized)
+    ? normalized
+    : supportedEfforts.includes('medium')
+      ? 'medium'
+      : (supportedEfforts[0] ?? null);
+}
+
+function normalizeModalities(value: unknown): string[] {
+  if (!Array.isArray(value)) {
+    return ['text', 'image'];
+  }
+
+  const seen = new Set<string>();
+  const modalities: string[] = [];
+  for (const item of value) {
+    if (typeof item !== 'string') {
+      continue;
+    }
+    const normalized = item.trim().toLowerCase();
+    if (!normalized || seen.has(normalized)) {
+      continue;
+    }
+    seen.add(normalized);
+    modalities.push(normalized);
+  }
+
+  return modalities.length > 0 ? modalities : ['text', 'image'];
+}
+
+function asBadgeLabel(modelId: string): string {
+  return modelId.replace(/^gpt-/, '');
+}
+
+export function normalizeCodexAppServerModels(
+  models: readonly CodexAppServerModel[] | undefined,
+  options: {
+    includeHidden?: boolean;
+  } = {}
+): NormalizedCodexModelCatalogResult {
+  const diagnostics: string[] = [];
+  const seen = new Set<string>();
+  const seenLaunchModels = new Set<string>();
+  const normalizedModels: CliProviderModelCatalogItem[] = [];
+
+  for (const model of models ??
[]) {
+    const id = normalizeModelId(model);
+    if (!id) {
+      diagnostics.push('model/list returned a model without id/model.');
+      continue;
+    }
+
+    if (seen.has(id)) {
+      diagnostics.push(`model/list returned duplicate model id ${id}.`);
+      continue;
+    }
+    seen.add(id);
+
+    const hidden = model.hidden === true;
+    if (hidden && options.includeHidden !== true) {
+      continue;
+    }
+
+    const launchModel = model.model?.trim() || id;
+    if (seenLaunchModels.has(launchModel)) {
+      diagnostics.push(`model/list returned duplicate launch model ${launchModel}.`);
+      continue;
+    }
+    seenLaunchModels.add(launchModel);
+
+    const supportedReasoningEfforts = normalizeEfforts(model);
+    normalizedModels.push({
+      id,
+      launchModel,
+      displayName: model.displayName?.trim() || id,
+      hidden,
+      supportedReasoningEfforts,
+      defaultReasoningEffort: normalizeDefaultEffort(
+        model.defaultReasoningEffort,
+        supportedReasoningEfforts
+      ),
+      inputModalities: normalizeModalities(model.inputModalities),
+      supportsPersonality: model.supportsPersonality === true,
+      isDefault: model.isDefault === true,
+      upgrade: Boolean(model.upgrade),
+      source: 'app-server',
+      badgeLabel: asBadgeLabel(id),
+    });
+  }
+
+  const defaultModel =
+    normalizedModels.find((model) => model.isDefault) ??
+    normalizedModels.find((model) => !model.hidden) ??
+    normalizedModels[0] ??
+    null;
+
+  return {
+    models: normalizedModels,
+    defaultModelId: defaultModel?.id ??
null,
+    diagnostics,
+  };
+}
diff --git a/src/features/codex-model-catalog/index.ts b/src/features/codex-model-catalog/index.ts
new file mode 100644
index 00000000..5296c495
--- /dev/null
+++ b/src/features/codex-model-catalog/index.ts
@@ -0,0 +1,9 @@
+export type {
+  CodexModelCatalogDto,
+  CodexModelCatalogItemDto,
+  CodexModelCatalogSourceDto,
+  CodexModelCatalogStatusDto,
+  CodexModelReasoningEffortDto,
+} from './contracts';
+export type { CodexModelCatalogFeatureFacade, CodexModelCatalogRequest } from './main';
+export { createCodexModelCatalogFeature } from './main';
diff --git a/src/features/codex-model-catalog/main/composition/createCodexModelCatalogFeature.ts b/src/features/codex-model-catalog/main/composition/createCodexModelCatalogFeature.ts
new file mode 100644
index 00000000..5fee0da6
--- /dev/null
+++ b/src/features/codex-model-catalog/main/composition/createCodexModelCatalogFeature.ts
@@ -0,0 +1,357 @@
+import { createHash, randomBytes } from 'node:crypto';
+
+import type { CodexAccountSnapshotDto } from '@features/codex-account/contracts';
+import type { CodexAccountFeatureFacade } from '@features/codex-account/main';
+import { CodexAccountEnvBuilder } from '@features/codex-account/main/infrastructure/CodexAccountEnvBuilder';
+import { createStaticCodexModelCatalogModels } from '@features/codex-model-catalog/core/domain/codexModelCatalogFallback';
+import { normalizeCodexAppServerModels } from '@features/codex-model-catalog/core/domain/normalizeCodexAppServerModel';
+import {
+  CodexAppServerSessionFactory,
+  CodexBinaryResolver,
+  JsonRpcRequestError,
+  JsonRpcStdioClient,
+} from '@main/services/infrastructure/codexAppServer';
+
+import { CodexModelCatalogAppServerClient } from '../infrastructure/CodexModelCatalogAppServerClient';
+import { InMemoryCodexModelCatalogCache } from '../infrastructure/InMemoryCodexModelCatalogCache';
+
+import type { CodexModelCatalogDto } from '@features/codex-model-catalog/contracts';
+import type { Logger } from
'@shared/utils/logger';
+
+type LoggerPort = Pick;
+
+const CATALOG_CACHE_TTL_MS = 10 * 60_000;
+const CATALOG_STALE_TTL_MS = 24 * 60 * 60_000;
+const HASH_SALT = randomBytes(16).toString('hex');
+
+export interface CodexModelCatalogRequest {
+  cwd?: string | null;
+  profile?: string | null;
+  includeHidden?: boolean;
+  forceRefresh?: boolean;
+}
+
+export interface CodexModelCatalogFeatureFacade {
+  getCatalog(options?: CodexModelCatalogRequest): Promise<CodexModelCatalogDto>;
+  invalidate(): void;
+  dispose(): Promise<void>;
+}
+
+function nowIso(): string {
+  return new Date().toISOString();
+}
+
+function staleAtIso(): string {
+  return new Date(Date.now() + CATALOG_CACHE_TTL_MS).toISOString();
+}
+
+function hashValue(value: unknown): string {
+  return createHash('sha256')
+    .update(HASH_SALT)
+    .update(JSON.stringify(value ?? null))
+    .digest('hex')
+    .slice(0, 16);
+}
+
+function classifyAppServerFailure(error: unknown): {
+  appServerState: CodexModelCatalogDto['diagnostics']['appServerState'];
+  message: string;
+  code: string | null;
+} {
+  const message = error instanceof Error ? error.message : String(error);
+  const lower = message.toLowerCase();
+  const rpcCode =
+    error instanceof JsonRpcRequestError && error.code !== null ? String(error.code) : null;
+
+  if (
+    lower.includes('unknown method') ||
+    lower.includes('method not found') ||
+    lower.includes('unknown command') ||
+    lower.includes('no such command') ||
+    rpcCode === '-32601'
+  ) {
+    return {
+      appServerState: 'incompatible',
+      message: 'The installed Codex binary does not support app-server model/list yet.',
+      code: rpcCode ??
'method-not-found',
+    };
+  }
+
+  return {
+    appServerState: 'degraded',
+    message,
+    code: rpcCode,
+  };
+}
+
+function createCacheKey(options: {
+  binaryPath: string | null;
+  binaryVersion: string | null;
+  accountSnapshot: CodexAccountSnapshotDto;
+  cwd?: string | null;
+  profile?: string | null;
+  configFingerprint?: string | null;
+  includeHidden?: boolean;
+}): string {
+  return hashValue({
+    binaryPath: options.binaryPath,
+    binaryVersion: options.binaryVersion,
+    preferredAuthMode: options.accountSnapshot.preferredAuthMode,
+    effectiveAuthMode: options.accountSnapshot.effectiveAuthMode,
+    managedAccount: options.accountSnapshot.managedAccount
+      ? {
+          type: options.accountSnapshot.managedAccount.type,
+          planType: options.accountSnapshot.managedAccount.planType,
+          emailHash: hashValue(options.accountSnapshot.managedAccount.email),
+        }
+      : null,
+    apiKeySource: options.accountSnapshot.apiKey.source,
+    cwd: options.cwd?.trim() || null,
+    profile: options.profile?.trim() || null,
+    configFingerprint: options.configFingerprint ?? null,
+    includeHidden: options.includeHidden === true,
+    codexHome: process.env.CODEX_HOME?.trim() || null,
+  });
+}
+
+function setCatalogCacheEntries(
+  cache: InMemoryCodexModelCatalogCache,
+  keys: readonly string[],
+  catalog: CodexModelCatalogDto
+): void {
+  const seen = new Set<string>();
+  for (const key of keys) {
+    if (seen.has(key)) {
+      continue;
+    }
+    seen.add(key);
+    cache.set(key, catalog);
+  }
+}
+
+function createFallbackCatalog(options: {
+  sourceMessage: string;
+  appServerState: CodexModelCatalogDto['diagnostics']['appServerState'];
+  status?: CodexModelCatalogDto['status'];
+  code?: string | null;
+}): CodexModelCatalogDto {
+  const models = createStaticCodexModelCatalogModels();
+  const defaultModel = models.find((model) => model.isDefault) ?? models[0] ?? null;
+  return {
+    schemaVersion: 1,
+    providerId: 'codex',
+    source: 'static-fallback',
+    status: options.status ??
'degraded',
+    fetchedAt: nowIso(),
+    staleAt: staleAtIso(),
+    defaultModelId: defaultModel?.id ?? null,
+    defaultLaunchModel: defaultModel?.launchModel ?? null,
+    models,
+    diagnostics: {
+      configReadState: 'skipped',
+      appServerState: options.appServerState,
+      message: options.sourceMessage,
+      code: options.code ?? null,
+    },
+  };
+}
+
+function markCatalogStale(
+  catalog: CodexModelCatalogDto,
+  diagnostics: CodexModelCatalogDto['diagnostics']
+): CodexModelCatalogDto {
+  return {
+    ...catalog,
+    status: 'stale',
+    diagnostics,
+  };
+}
+
+export function createCodexModelCatalogFeature(options: {
+  logger: LoggerPort;
+  codexAccountFeature: Pick<CodexAccountFeatureFacade, 'getSnapshot'>;
+}): CodexModelCatalogFeatureFacade {
+  const envBuilder = new CodexAccountEnvBuilder();
+  const cache = new InMemoryCodexModelCatalogCache();
+  const inFlightRefreshes = new Map<string, Promise<CodexModelCatalogDto>>();
+  let cacheGeneration = 0;
+  const client = new CodexModelCatalogAppServerClient(
+    new CodexAppServerSessionFactory(new JsonRpcStdioClient(options.logger))
+  );
+
+  async function getCatalog(request: CodexModelCatalogRequest = {}): Promise<CodexModelCatalogDto> {
+    const accountSnapshot = await options.codexAccountFeature.getSnapshot();
+    const binaryPath = await CodexBinaryResolver.resolve();
+    const binaryVersion = await CodexBinaryResolver.resolveVersion(binaryPath);
+
+    if (!binaryPath) {
+      return createFallbackCatalog({
+        sourceMessage: 'Codex CLI was not found.
Showing static fallback model list.',
+        appServerState: 'runtime-missing',
+        status: 'unavailable',
+      });
+    }
+
+    const env = envBuilder.buildControlPlaneEnv({ binaryPath });
+    const preflightCacheKey = createCacheKey({
+      binaryPath,
+      binaryVersion,
+      accountSnapshot,
+      cwd: request.cwd,
+      profile: request.profile,
+      configFingerprint: null,
+      includeHidden: request.includeHidden,
+    });
+
+    if (request.forceRefresh !== true) {
+      const cached = cache.get(preflightCacheKey, CATALOG_CACHE_TTL_MS);
+      if (cached) {
+        return cached;
+      }
+    }
+
+    const existingRefresh = inFlightRefreshes.get(preflightCacheKey);
+    if (existingRefresh) {
+      return existingRefresh;
+    }
+
+    const refreshGeneration = cacheGeneration;
+    const refreshPromise = (async (): Promise<CodexModelCatalogDto> => {
+      let configFingerprint: string | null = null;
+      let configReadState: CodexModelCatalogDto['diagnostics']['configReadState'] = 'skipped';
+      let configReadMessage: string | null = null;
+      let cacheKey = preflightCacheKey;
+
+      try {
+        const payload = await client.readModelCatalogWithConfig({
+          binaryPath,
+          env,
+          includeHidden: request.includeHidden,
+          cwd: request.cwd,
+          profile: request.profile,
+        });
+
+        if (payload.config.ok) {
+          configReadState = 'ready';
+          configFingerprint = hashValue(payload.config.value);
+        } else {
+          configReadState =
+            payload.config.error instanceof JsonRpcRequestError &&
+            payload.config.error.code === -32601
+              ? 'unsupported'
+              : 'failed';
+          configReadMessage =
+            payload.config.error instanceof Error
+              ? payload.config.error.message
+              : String(payload.config.error);
+        }
+
+        cacheKey = createCacheKey({
+          binaryPath,
+          binaryVersion,
+          accountSnapshot,
+          cwd: request.cwd,
+          profile: request.profile,
+          configFingerprint,
+          includeHidden: request.includeHidden,
+        });
+
+        const normalized = normalizeCodexAppServerModels(
+          payload.modelCatalog.models ??
payload.modelCatalog.data,
+          {
+            includeHidden: request.includeHidden,
+          }
+        );
+
+        const defaultModel =
+          normalized.models.find((model) => model.id === normalized.defaultModelId) ??
+          normalized.models.find((model) => model.isDefault) ??
+          normalized.models[0] ??
+          null;
+        const diagnostics = [
+          ...normalized.diagnostics,
+          configReadMessage ? `config/read: ${configReadMessage}` : null,
+          payload.modelCatalog.truncated
+            ? 'model/list pagination reached the safety page limit; some Codex models may be omitted.'
+            : null,
+        ].filter(Boolean);
+        const catalog: CodexModelCatalogDto = {
+          schemaVersion: 1,
+          providerId: 'codex',
+          source: 'app-server',
+          status: 'ready',
+          fetchedAt: nowIso(),
+          staleAt: staleAtIso(),
+          defaultModelId: defaultModel?.id ?? null,
+          defaultLaunchModel: defaultModel?.launchModel ?? null,
+          models: normalized.models,
+          diagnostics: {
+            configReadState,
+            appServerState: 'healthy',
+            message: diagnostics.length > 0 ? diagnostics.join(' ') : null,
+            code: null,
+          },
+        };
+
+        if (normalized.models.length === 0) {
+          throw new Error('Codex app-server model/list returned no visible models.');
+        }
+
+        if (refreshGeneration === cacheGeneration) {
+          setCatalogCacheEntries(cache, [preflightCacheKey, cacheKey], catalog);
+        }
+        return catalog;
+      } catch (error) {
+        const failure = classifyAppServerFailure(error);
+        const stale =
+          cache.getLatest(cacheKey) ??
+          (cacheKey === preflightCacheKey ?
null : cache.getLatest(preflightCacheKey));
+        if (stale && Date.parse(stale.fetchedAt) + CATALOG_STALE_TTL_MS > Date.now()) {
+          return markCatalogStale(stale, {
+            configReadState,
+            appServerState: failure.appServerState,
+            message: failure.message,
+            code: failure.code,
+          });
+        }
+
+        options.logger.warn('codex model catalog refresh failed', {
+          error: failure.message,
+          code: failure.code,
+        });
+        const fallback = createFallbackCatalog({
+          sourceMessage: failure.message,
+          appServerState: failure.appServerState,
+          code: failure.code,
+        });
+        if (refreshGeneration === cacheGeneration) {
+          setCatalogCacheEntries(cache, [preflightCacheKey, cacheKey], fallback);
+        }
+        return fallback;
+      }
+    })();
+
+    inFlightRefreshes.set(preflightCacheKey, refreshPromise);
+    try {
+      return await refreshPromise;
+    } finally {
+      if (inFlightRefreshes.get(preflightCacheKey) === refreshPromise) {
+        inFlightRefreshes.delete(preflightCacheKey);
+      }
+    }
+  }
+
+  return {
+    getCatalog,
+    invalidate: () => {
+      cacheGeneration += 1;
+      cache.clear();
+      inFlightRefreshes.clear();
+    },
+    dispose: async () => {
+      cacheGeneration += 1;
+      cache.clear();
+      inFlightRefreshes.clear();
+    },
+  };
+}
diff --git a/src/features/codex-model-catalog/main/index.ts b/src/features/codex-model-catalog/main/index.ts
new file mode 100644
index 00000000..0d29b79f
--- /dev/null
+++ b/src/features/codex-model-catalog/main/index.ts
@@ -0,0 +1,5 @@
+export type {
+  CodexModelCatalogFeatureFacade,
+  CodexModelCatalogRequest,
+} from './composition/createCodexModelCatalogFeature';
+export { createCodexModelCatalogFeature } from './composition/createCodexModelCatalogFeature';
diff --git a/src/features/codex-model-catalog/main/infrastructure/CodexModelCatalogAppServerClient.ts b/src/features/codex-model-catalog/main/infrastructure/CodexModelCatalogAppServerClient.ts
new file mode 100644
index 00000000..42b46702
--- /dev/null
+++ b/src/features/codex-model-catalog/main/infrastructure/CodexModelCatalogAppServerClient.ts
@@
-0,0 +1,159 @@
+import type {
+  CodexAppServerListModelsParams,
+  CodexAppServerListModelsResponse,
+  CodexAppServerReadConfigParams,
+  CodexAppServerReadConfigResponse,
+  CodexAppServerSession,
+  CodexAppServerSessionFactory,
+} from '@main/services/infrastructure/codexAppServer';
+
+const MODEL_LIST_PAGE_LIMIT = 100;
+const MODEL_LIST_MAX_PAGES = 5;
+const MODEL_LIST_TIMEOUT_MS = 4_500;
+const CONFIG_READ_TIMEOUT_MS = 3_500;
+const INITIALIZE_TIMEOUT_MS = 6_000;
+const TOTAL_TIMEOUT_MS = 9_000;
+
+export class CodexModelCatalogAppServerClient {
+  constructor(private readonly sessionFactory: CodexAppServerSessionFactory) {}
+
+  async readModelCatalogWithConfig(options: {
+    binaryPath: string;
+    env: NodeJS.ProcessEnv;
+    includeHidden?: boolean;
+    cwd?: string | null;
+    profile?: string | null;
+  }): Promise<{
+    modelCatalog: CodexAppServerListModelsResponse;
+    config: { ok: true; value: CodexAppServerReadConfigResponse } | { ok: false; error: unknown };
+  }> {
+    const configParams = this.buildConfigReadParams(options);
+
+    return this.sessionFactory.withSession(
+      {
+        binaryPath: options.binaryPath,
+        env: options.env,
+        requestTimeoutMs: MODEL_LIST_TIMEOUT_MS,
+        initializeTimeoutMs: INITIALIZE_TIMEOUT_MS,
+        totalTimeoutMs: TOTAL_TIMEOUT_MS,
+        label: 'codex app-server model/list with config/read',
+        experimentalApi: false,
+      },
+      async (session) => {
+        const configPromise = session
+          .request<CodexAppServerReadConfigResponse>(
+            'config/read',
+            configParams,
+            CONFIG_READ_TIMEOUT_MS
+          )
+          .then((value) => ({ ok: true as const, value }))
+          .catch((error: unknown) => ({ ok: false as const, error }));
+        const modelCatalogPromise = this.readModelCatalogPages(session, {
+          includeHidden: options.includeHidden,
+        });
+        const [config, modelCatalog] = await Promise.all([configPromise, modelCatalogPromise]);
+        return {
+          config,
+          modelCatalog,
+        };
+      }
+    );
+  }
+
+  async readModelCatalog(options: {
+    binaryPath: string;
+    env: NodeJS.ProcessEnv;
+    includeHidden?: boolean;
+  }): Promise<CodexAppServerListModelsResponse> {
+    return
this.sessionFactory.withSession(
+      {
+        binaryPath: options.binaryPath,
+        env: options.env,
+        requestTimeoutMs: MODEL_LIST_TIMEOUT_MS,
+        initializeTimeoutMs: INITIALIZE_TIMEOUT_MS,
+        totalTimeoutMs: TOTAL_TIMEOUT_MS,
+        label: 'codex app-server model/list',
+        experimentalApi: false,
+      },
+      async (session) =>
+        this.readModelCatalogPages(session, {
+          includeHidden: options.includeHidden,
+        })
+    );
+  }
+
+  async readConfig(options: {
+    binaryPath: string;
+    env: NodeJS.ProcessEnv;
+    cwd?: string | null;
+    profile?: string | null;
+  }): Promise<CodexAppServerReadConfigResponse> {
+    const params = this.buildConfigReadParams(options);
+
+    return this.sessionFactory.withSession(
+      {
+        binaryPath: options.binaryPath,
+        env: options.env,
+        requestTimeoutMs: CONFIG_READ_TIMEOUT_MS,
+        initializeTimeoutMs: INITIALIZE_TIMEOUT_MS,
+        totalTimeoutMs: TOTAL_TIMEOUT_MS,
+        label: 'codex app-server config/read',
+        experimentalApi: false,
+      },
+      async (session) =>
+        session.request<CodexAppServerReadConfigResponse>(
+          'config/read',
+          params,
+          CONFIG_READ_TIMEOUT_MS
+        )
+    );
+  }
+
+  private buildConfigReadParams(options: {
+    cwd?: string | null;
+    profile?: string | null;
+  }): CodexAppServerReadConfigParams {
+    const params: CodexAppServerReadConfigParams = {};
+    if (options.cwd?.trim()) {
+      params.cwd = options.cwd.trim();
+    }
+    if (options.profile?.trim()) {
+      params.profile = options.profile.trim();
+    }
+    return params;
+  }
+
+  private async readModelCatalogPages(
+    session: CodexAppServerSession,
+    options: { includeHidden?: boolean }
+  ): Promise<CodexAppServerListModelsResponse> {
+    const data: NonNullable<CodexAppServerListModelsResponse['data']> = [];
+    let cursor: string | null = null;
+    let nextCursor: string | null = null;
+
+    for (let page = 0; page < MODEL_LIST_MAX_PAGES; page += 1) {
+      const payload: CodexAppServerListModelsResponse =
+        await session.request<CodexAppServerListModelsResponse>(
+          'model/list',
+          {
+            cursor,
+            limit: MODEL_LIST_PAGE_LIMIT,
+            includeHidden: options.includeHidden === true,
+          } satisfies CodexAppServerListModelsParams,
+          MODEL_LIST_TIMEOUT_MS
+        );
+      data.push(...(payload.data ?? payload.models ??
[]));
+      nextCursor = payload.nextCursor ?? null;
+      if (!nextCursor) {
+        break;
+      }
+      cursor = nextCursor;
+    }
+
+    return {
+      data,
+      nextCursor,
+      truncated: nextCursor !== null,
+    };
+  }
+}
diff --git a/src/features/codex-model-catalog/main/infrastructure/InMemoryCodexModelCatalogCache.ts b/src/features/codex-model-catalog/main/infrastructure/InMemoryCodexModelCatalogCache.ts
new file mode 100644
index 00000000..6c8c4e9e
--- /dev/null
+++ b/src/features/codex-model-catalog/main/infrastructure/InMemoryCodexModelCatalogCache.ts
@@ -0,0 +1,37 @@
+import type { CodexModelCatalogDto } from '@features/codex-model-catalog/contracts';
+
+interface CacheEntry {
+  value: CodexModelCatalogDto;
+  observedAt: number;
+}
+
+export class InMemoryCodexModelCatalogCache {
+  private readonly entries = new Map<string, CacheEntry>();
+
+  get(key: string, maxAgeMs: number): CodexModelCatalogDto | null {
+    const entry = this.entries.get(key);
+    if (!entry) {
+      return null;
+    }
+    if (Date.now() - entry.observedAt > maxAgeMs) {
+      return null;
+    }
+    return structuredClone(entry.value);
+  }
+
+  getLatest(key: string): CodexModelCatalogDto | null {
+    const entry = this.entries.get(key);
+    return entry ?
structuredClone(entry.value) : null;
+  }
+
+  set(key: string, value: CodexModelCatalogDto): void {
+    this.entries.set(key, {
+      value: structuredClone(value),
+      observedAt: Date.now(),
+    });
+  }
+
+  clear(): void {
+    this.entries.clear();
+  }
+}
diff --git a/src/features/codex-model-catalog/main/infrastructure/__tests__/CodexModelCatalogAppServerClient.test.ts b/src/features/codex-model-catalog/main/infrastructure/__tests__/CodexModelCatalogAppServerClient.test.ts
new file mode 100644
index 00000000..c8217d87
--- /dev/null
+++ b/src/features/codex-model-catalog/main/infrastructure/__tests__/CodexModelCatalogAppServerClient.test.ts
@@ -0,0 +1,88 @@
+import { describe, expect, it } from 'vitest';
+
+import { CodexModelCatalogAppServerClient } from '../CodexModelCatalogAppServerClient';
+
+import type {
+  CodexAppServerSession,
+  CodexAppServerSessionFactory,
+} from '@main/services/infrastructure/codexAppServer';
+
+describe('CodexModelCatalogAppServerClient', () => {
+  it('reads config and paginated model/list in one app-server session', async () => {
+    const requests: Array<{ method: string; params: unknown }> = [];
+    let sessionCount = 0;
+    const session: CodexAppServerSession = {
+      initializeResponse: {
+        userAgent: 'codex-cli 0.117.0',
+        codexHome: '/Users/me/.codex',
+        platformFamily: 'macos',
+        platformOs: 'darwin',
+      },
+      request: async <TResult>(method: string, params?: unknown): Promise<TResult> => {
+        requests.push({ method, params });
+        if (method === 'config/read') {
+          return { config: { model: 'gpt-5.4' }, origins: {} } as TResult;
+        }
+        if (method === 'model/list') {
+          const cursor = (params as { cursor?: string | null }).cursor ??
null; + if (cursor === null) { + return { + data: [{ id: 'gpt-5.4', model: 'gpt-5.4' }], + nextCursor: 'page-2', + } as TResult; + } + return { + data: [{ id: 'gpt-5.5', model: 'gpt-5.5' }], + nextCursor: null, + } as TResult; + } + throw new Error(`Unexpected method ${method}`); + }, + notify: async () => undefined, + onNotification: () => () => undefined, + close: async () => undefined, + }; + const factory = { + withSession: async ( + _options: unknown, + handler: (session: CodexAppServerSession) => Promise + ): Promise => { + sessionCount += 1; + return handler(session); + }, + } as unknown as CodexAppServerSessionFactory; + + const client = new CodexModelCatalogAppServerClient(factory); + const result = await client.readModelCatalogWithConfig({ + binaryPath: '/usr/local/bin/codex', + env: {}, + cwd: '/repo', + profile: 'work', + }); + + expect(sessionCount).toBe(1); + expect(result.config).toEqual({ + ok: true, + value: { config: { model: 'gpt-5.4' }, origins: {} }, + }); + expect(result.modelCatalog).toEqual({ + data: [ + { id: 'gpt-5.4', model: 'gpt-5.4' }, + { id: 'gpt-5.5', model: 'gpt-5.5' }, + ], + nextCursor: null, + truncated: false, + }); + expect(requests).toEqual([ + { method: 'config/read', params: { cwd: '/repo', profile: 'work' } }, + { + method: 'model/list', + params: { cursor: null, limit: 100, includeHidden: false }, + }, + { + method: 'model/list', + params: { cursor: 'page-2', limit: 100, includeHidden: false }, + }, + ]); + }); +}); diff --git a/src/main/http/teams.ts b/src/main/http/teams.ts index 5495aef5..afebd5d1 100644 --- a/src/main/http/teams.ts +++ b/src/main/http/teams.ts @@ -1,5 +1,9 @@ import { validateTeamName } from '@main/ipc/guards'; import { getErrorMessage } from '@shared/utils/errorHandling'; +import { + formatEffortLevelListForProvider, + isTeamEffortLevelForProvider, +} from '@shared/utils/effortLevels'; import { createLogger } from '@shared/utils/logger'; import { migrateProviderBackendId } from 
'@shared/utils/providerBackend'; import { isAbsolute } from 'path'; @@ -12,8 +16,6 @@ const logger = createLogger('HTTP:teams'); type LaunchBody = Omit; -const EFFORT_LEVELS = new Set(['low', 'medium', 'high']); - class HttpBadRequestError extends Error {} class HttpFeatureUnavailableError extends Error {} @@ -76,16 +78,21 @@ function assertOptionalBoolean(value: unknown, fieldName: string): boolean | und return value; } -function assertOptionalEffort(value: unknown): EffortLevel | undefined { +function assertOptionalEffort( + value: unknown, + providerId: TeamLaunchRequest['providerId'] +): EffortLevel | undefined { if (value == null) { return undefined; } - if (typeof value !== 'string' || !EFFORT_LEVELS.has(value as EffortLevel)) { - throw new HttpBadRequestError('effort must be one of: low, medium, high'); + if (!isTeamEffortLevelForProvider(value, providerId)) { + throw new HttpBadRequestError( + `effort must be one of: ${formatEffortLevelListForProvider(providerId)}` + ); } - return value as EffortLevel; + return value; } function parseLaunchRequest(teamName: string, body: unknown): TeamLaunchRequest { @@ -109,7 +116,7 @@ function parseLaunchRequest(teamName: string, body: unknown): TeamLaunchRequest ); } const model = assertOptionalString(payload.model, 'model'); - const effort = assertOptionalEffort(payload.effort); + const effort = assertOptionalEffort(payload.effort, providerId); const clearContext = assertOptionalBoolean(payload.clearContext, 'clearContext'); const skipPermissions = assertOptionalBoolean(payload.skipPermissions, 'skipPermissions'); const worktree = assertOptionalString(payload.worktree, 'worktree'); diff --git a/src/main/index.ts b/src/main/index.ts index a89d959e..7a6c373f 100644 --- a/src/main/index.ts +++ b/src/main/index.ts @@ -25,6 +25,10 @@ import { registerCodexAccountIpc, removeCodexAccountIpc, } from '@features/codex-account/main'; +import { + createCodexModelCatalogFeature, + type CodexModelCatalogFeatureFacade, +} from 
'@features/codex-model-catalog/main'; import { createRecentProjectsFeature, type RecentProjectsFeatureFacade, @@ -422,6 +426,7 @@ let notificationManager: NotificationManager; let updaterService: UpdaterService; let sshConnectionManager: SshConnectionManager; let codexAccountFeature: CodexAccountFeatureFacade | null = null; +let codexModelCatalogFeature: CodexModelCatalogFeatureFacade | null = null; let recentProjectsFeature: RecentProjectsFeatureFacade; let teamDataService: TeamDataService; let teamProvisioningService: TeamProvisioningService; @@ -988,6 +993,11 @@ async function initializeServices(): Promise { configManager, }); providerConnectionService.setCodexAccountFeature(codexAccountFeature); + codexModelCatalogFeature = createCodexModelCatalogFeature({ + logger: createLogger('Feature:CodexModelCatalog'), + codexAccountFeature, + }); + providerConnectionService.setCodexModelCatalogFeature(codexModelCatalogFeature); // startProcessHealthPolling() is deferred to after window creation // (did-finish-load handler) to avoid thread pool contention at startup. 
@@ -1185,7 +1195,10 @@ function shutdownServices(): void { } void skillsWatcherService?.stopAll(); + providerConnectionService.setCodexModelCatalogFeature(null); providerConnectionService.setCodexAccountFeature(null); + void codexModelCatalogFeature?.dispose(); + codexModelCatalogFeature = null; void codexAccountFeature?.dispose(); codexAccountFeature = null; diff --git a/src/main/ipc/teams.ts b/src/main/ipc/teams.ts index a01e1d5c..11f34910 100644 --- a/src/main/ipc/teams.ts +++ b/src/main/ipc/teams.ts @@ -91,6 +91,10 @@ import { PROTECTED_CLI_FLAGS, } from '@shared/utils/cliArgsParser'; import { createLogger } from '@shared/utils/logger'; +import { + formatEffortLevelListForProvider, + isTeamEffortLevelForProvider, +} from '@shared/utils/effortLevels'; import { isTeamProviderBackendId, migrateProviderBackendId } from '@shared/utils/providerBackend'; import { isRateLimitMessage } from '@shared/utils/rateLimitDetector'; import { @@ -1112,10 +1116,8 @@ function isProvisioningTeamName(teamName: string): boolean { return parts.every((p) => /^[a-z0-9]+$/.test(p)); } -const VALID_EFFORT_LEVELS: readonly string[] = ['low', 'medium', 'high']; - -function isValidEffort(value: unknown): value is EffortLevel { - return typeof value === 'string' && VALID_EFFORT_LEVELS.includes(value); +function isValidEffort(value: unknown, providerId?: TeamProviderId | null): value is EffortLevel { + return isTeamEffortLevelForProvider(value, providerId); } function parseOptionalMemberProviderId( @@ -1165,15 +1167,35 @@ function parseOptionalProviderBackendId( } function parseOptionalMemberEffort( - value: unknown + value: unknown, + providerId?: TeamProviderId | null ): { valid: true; value: EffortLevel | undefined } | { valid: false; error: string } { if (value === undefined || value === null || value === '') { return { valid: true, value: undefined }; } - if (isValidEffort(value)) { + if (isValidEffort(value, providerId)) { return { valid: true, value }; } - return { valid: false, error: 
'member effort must be low, medium, or high' }; + return { + valid: false, + error: `member effort must be one of ${formatEffortLevelListForProvider(providerId)}`, + }; +} + +function parseOptionalTeamEffort( + value: unknown, + providerId?: TeamProviderId | null +): { valid: true; value: EffortLevel | undefined } | { valid: false; error: string } { + if (value === undefined || value === null || value === '') { + return { valid: true, value: undefined }; + } + if (isValidEffort(value, providerId)) { + return { valid: true, value }; + } + return { + valid: false, + error: `effort must be one of ${formatEffortLevelListForProvider(providerId)}`, + }; } async function validateProvisioningRequest( @@ -1202,6 +1224,12 @@ async function validateProvisioningRequest( if (!Array.isArray(payload.members)) { return { valid: false, error: 'members must be an array' }; } + const providerId = + payload.providerId === 'codex' + ? 'codex' + : payload.providerId === 'gemini' + ? 'gemini' + : 'anthropic'; const seenNames = new Set(); const members: TeamCreateRequest['members'] = []; @@ -1237,12 +1265,20 @@ async function validateProvisioningRequest( if (model !== undefined && typeof model !== 'string') { return { valid: false, error: 'member model must be string' }; } + const effortValidation = parseOptionalMemberEffort( + (member as { effort?: unknown }).effort, + providerValidation.value ?? providerId + ); + if (!effortValidation.valid) { + return { valid: false, error: effortValidation.error }; + } members.push({ name: memberName, role: typeof role === 'string' ? role.trim() : undefined, workflow: typeof workflow === 'string' ? workflow.trim() : undefined, providerId: providerValidation.value, model: typeof model === 'string' ? 
model.trim() || undefined : undefined, + effort: effortValidation.value, }); } @@ -1257,12 +1293,6 @@ async function validateProvisioningRequest( if (payload.prompt !== undefined && typeof payload.prompt !== 'string') { return { valid: false, error: 'prompt must be a string' }; } - const providerId = - payload.providerId === 'codex' - ? 'codex' - : payload.providerId === 'gemini' - ? 'gemini' - : 'anthropic'; const providerBackendValidation = parseOptionalProviderBackendId( payload.providerBackendId, providerId @@ -1270,6 +1300,10 @@ async function validateProvisioningRequest( if (!providerBackendValidation.valid) { return { valid: false, error: providerBackendValidation.error }; } + const effortValidation = parseOptionalTeamEffort(payload.effort, providerId); + if (!effortValidation.valid) { + return { valid: false, error: effortValidation.error }; + } try { await fs.promises.mkdir(cwd, { recursive: true }); @@ -1324,7 +1358,7 @@ async function validateProvisioningRequest( providerId, providerBackendId: providerBackendValidation.value, model: typeof payload.model === 'string' ? payload.model.trim() || undefined : undefined, - effort: isValidEffort(payload.effort) ? payload.effort : undefined, + effort: effortValidation.value, skipPermissions: typeof payload.skipPermissions === 'boolean' ? payload.skipPermissions : undefined, worktree: @@ -1474,6 +1508,10 @@ async function handleLaunchTeam( : meta?.providerId === 'gemini' ? 'gemini' : 'anthropic'; + const effortValidation = parseOptionalTeamEffort(payload.effort, resolvedProviderId); + if (!effortValidation.valid) { + return { success: false, error: effortValidation.error }; + } const createRequest: TeamCreateRequest = { teamName: tn, @@ -1488,7 +1526,7 @@ async function handleLaunchTeam( providerBackendValidation.value ?? meta?.providerBackendId ?? membersMeta?.providerBackendId ), model: typeof payload.model === 'string' ? payload.model.trim() || undefined : undefined, - effort: isValidEffort(payload.effort) ? 
payload.effort : undefined, + effort: effortValidation.value, limitContext: typeof payload.limitContext === 'boolean' ? payload.limitContext : undefined, skipPermissions: typeof payload.skipPermissions === 'boolean' ? payload.skipPermissions : undefined, @@ -1520,6 +1558,11 @@ async function handleLaunchTeam( ); } + const effortValidation = parseOptionalTeamEffort(payload.effort, providerId); + if (!effortValidation.valid) { + return { success: false, error: effortValidation.error }; + } + return wrapTeamHandler('launch', () => { addMainBreadcrumb('team', 'launch', { teamName: validatedTeamName.value! }); return getTeamProvisioningService().launchTeam( @@ -1530,7 +1573,7 @@ async function handleLaunchTeam( providerId, providerBackendId: providerBackendValidation.value, model: typeof payload.model === 'string' ? payload.model.trim() || undefined : undefined, - effort: isValidEffort(payload.effort) ? payload.effort : undefined, + effort: effortValidation.value, clearContext: payload.clearContext === true ? true : undefined, skipPermissions: typeof payload.skipPermissions === 'boolean' ? 
payload.skipPermissions : undefined, @@ -2652,7 +2695,10 @@ async function handleCreateConfig( if (model !== undefined && typeof model !== 'string') { return { success: false, error: 'member model must be string' }; } - const effortValidation = parseOptionalMemberEffort((member as { effort?: unknown }).effort); + const effortValidation = parseOptionalMemberEffort( + (member as { effort?: unknown }).effort, + providerValidation.value + ); if (!effortValidation.valid) { return { success: false, error: effortValidation.error }; } @@ -3090,7 +3136,10 @@ async function handleAddMember( if (model !== undefined && typeof model !== 'string') { return { success: false, error: 'model must be a string' }; } - const effortValidation = parseOptionalMemberEffort((payload as { effort?: unknown }).effort); + const effortValidation = parseOptionalMemberEffort( + (payload as { effort?: unknown }).effort, + providerValidation.value + ); if (!effortValidation.valid) { return { success: false, error: effortValidation.error }; } @@ -3162,7 +3211,7 @@ async function handleReplaceMembers( workflow?: string; providerId?: 'anthropic' | 'codex' | 'gemini'; model?: string; - effort?: 'low' | 'medium' | 'high'; + effort?: EffortLevel; }[] = []; for (const item of payload.members) { if (!item || typeof item !== 'object') { @@ -3196,7 +3245,10 @@ async function handleReplaceMembers( if (m.model !== undefined && typeof m.model !== 'string') { return { success: false, error: 'member model must be string' }; } - const effortValidation = parseOptionalMemberEffort((m as { effort?: unknown }).effort); + const effortValidation = parseOptionalMemberEffort( + (m as { effort?: unknown }).effort, + providerValidation.value + ); if (!effortValidation.valid) { return { success: false, error: effortValidation.error }; } diff --git a/src/main/services/infrastructure/CliInstallerService.ts b/src/main/services/infrastructure/CliInstallerService.ts index 24254cc4..d5fc7beb 100644 --- 
a/src/main/services/infrastructure/CliInstallerService.ts +++ b/src/main/services/infrastructure/CliInstallerService.ts @@ -152,7 +152,11 @@ function cloneCliInstallationStatus(status: CliInstallationStatus): CliInstallat providers: status.providers.map((provider) => ({ ...provider, modelVerificationState: provider.modelVerificationState ?? 'idle', + modelCatalog: provider.modelCatalog ? structuredClone(provider.modelCatalog) : null, modelAvailability: provider.modelAvailability?.map((item) => ({ ...item })) ?? [], + runtimeCapabilities: provider.runtimeCapabilities + ? structuredClone(provider.runtimeCapabilities) + : null, capabilities: { ...provider.capabilities, extensions: { diff --git a/src/main/services/infrastructure/codexAppServer/CodexBinaryResolver.ts b/src/main/services/infrastructure/codexAppServer/CodexBinaryResolver.ts index 7b7184bf..71627c13 100644 --- a/src/main/services/infrastructure/codexAppServer/CodexBinaryResolver.ts +++ b/src/main/services/infrastructure/codexAppServer/CodexBinaryResolver.ts @@ -2,11 +2,15 @@ import { constants as fsConstants } from 'node:fs'; import * as fsp from 'node:fs/promises'; import path from 'node:path'; +import { execCli } from '@main/utils/childProcess'; + const CACHE_VERIFY_TTL_MS = 30_000; +const VERSION_CACHE_TTL_MS = 30_000; let cachedBinaryPath: string | null | undefined; let cacheVerifiedAt = 0; let resolveInFlight: Promise<string | null> | null = null; +const versionCache = new Map<string, { version: string | null; observedAt: number }>(); async function fileExists(filePath: string): Promise<boolean> { try { @@ -69,6 +73,7 @@ export class CodexBinaryResolver { cachedBinaryPath = undefined; cacheVerifiedAt = 0; resolveInFlight = null; + versionCache.clear(); } static async resolve(): Promise<string | null> { @@ -117,4 +122,34 @@ export class CodexBinaryResolver { cacheVerifiedAt = Date.now(); return null; } + + static async resolveVersion(binaryPath: string | null | undefined): Promise<string | null> { + const normalizedPath = binaryPath?.trim(); + if (!normalizedPath) { + return null; + } + + const cached =
versionCache.get(normalizedPath); + if (cached && Date.now() - cached.observedAt <= VERSION_CACHE_TTL_MS) { + return cached.version; + } + + try { + const result = await execCli(normalizedPath, ['--version'], { + timeout: 3_000, + }); + const version = result.stdout.trim().split(/\s+/).filter(Boolean).at(-1) ?? null; + versionCache.set(normalizedPath, { + version, + observedAt: Date.now(), + }); + return version; + } catch { + versionCache.set(normalizedPath, { + version: null, + observedAt: Date.now(), + }); + return null; + } + } } diff --git a/src/main/services/infrastructure/codexAppServer/JsonRpcStdioClient.ts b/src/main/services/infrastructure/codexAppServer/JsonRpcStdioClient.ts index 4b2bf21c..fd7d29a6 100644 --- a/src/main/services/infrastructure/codexAppServer/JsonRpcStdioClient.ts +++ b/src/main/services/infrastructure/codexAppServer/JsonRpcStdioClient.ts @@ -10,6 +10,7 @@ interface JsonRpcLogger { interface JsonRpcErrorPayload { code?: number; message?: string; + data?: unknown; } interface JsonRpcResponse { @@ -49,6 +50,22 @@ function withTimeout(promise: Promise, timeoutMs: number, label: string): const DEFAULT_REQUEST_TIMEOUT_MS = 3_000; const DEFAULT_TOTAL_TIMEOUT_MS = 8_000; +export class JsonRpcRequestError extends Error { + readonly code: number | null; + readonly data: unknown; + readonly details: unknown; + readonly method: string; + + constructor(method: string, payload: JsonRpcErrorPayload) { + super(payload.message ?? 'Unknown JSON-RPC error'); + this.name = 'JsonRpcRequestError'; + this.method = method; + this.code = typeof payload.code === 'number' ? 
payload.code : null; + this.data = payload.data; + this.details = payload.data; + } +} + export class JsonRpcStdioClient { constructor(private readonly logger: JsonRpcLogger) {} @@ -93,6 +110,7 @@ export class JsonRpcStdioClient { const pending = new Map< number, { + method: string; resolve: (value: unknown) => void; reject: (error: Error) => void; timeoutId: ReturnType<typeof setTimeout>; @@ -149,7 +167,7 @@ export class JsonRpcStdioClient { pending.delete(message.id); if (message.error) { - entry.reject(new Error(message.error.message ?? 'Unknown JSON-RPC error')); + entry.reject(new JsonRpcRequestError(entry.method, message.error)); return; } @@ -222,17 +240,25 @@ export class JsonRpcStdioClient { reject(new Error(`JSON-RPC request timed out: ${method}`)); }, timeoutMs); - pending.set(id, { resolve: resolve as (value: unknown) => void, reject, timeoutId }); - - child.stdin.write(`${JSON.stringify({ id, method, params })}\n`, (error) => { - if (!error) { - return; - } - - clearTimeout(timeoutId); - pending.delete(id); - reject(error instanceof Error ? error : new Error(String(error))); + pending.set(id, { + method, + resolve: resolve as (value: unknown) => void, + reject, + timeoutId, }); + + child.stdin.write( + `${JSON.stringify({ jsonrpc: '2.0', id, method, params })}\n`, + (error) => { + if (!error) { + return; + } + + clearTimeout(timeoutId); + pending.delete(id); + reject(error instanceof Error ? error : new Error(String(error))); + } + ); }), notify: async (method: string, params?: unknown): Promise<void> => { @@ -241,7 +267,7 @@ export class JsonRpcStdioClient { } await new Promise<void>((resolve, reject) => { - child.stdin!.write(`${JSON.stringify({ method, params })}\n`, (error) => { + child.stdin!.write(`${JSON.stringify({ jsonrpc: '2.0', method, params })}\n`, (error) => { if (error) { reject(error instanceof Error ?
error : new Error(String(error))); return; diff --git a/src/main/services/infrastructure/codexAppServer/__tests__/JsonRpcStdioClient.test.ts b/src/main/services/infrastructure/codexAppServer/__tests__/JsonRpcStdioClient.test.ts new file mode 100644 index 00000000..5e6e308b --- /dev/null +++ b/src/main/services/infrastructure/codexAppServer/__tests__/JsonRpcStdioClient.test.ts @@ -0,0 +1,79 @@ +import fs from 'node:fs'; +import os from 'node:os'; +import path from 'node:path'; + +import { afterEach, describe, expect, it } from 'vitest'; + +import { JsonRpcStdioClient } from '../JsonRpcStdioClient'; + +const tempDirs: string[] = []; + +function createStrictJsonRpcServerScript(): string { + const tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'json-rpc-stdio-client-')); + tempDirs.push(tempDir); + const scriptPath = path.join(tempDir, 'server.cjs'); + fs.writeFileSync( + scriptPath, + ` +const readline = require('node:readline'); +const rl = readline.createInterface({ input: process.stdin }); +rl.on('line', (line) => { + const message = JSON.parse(line); + if (message.jsonrpc !== '2.0') { + return; + } + if (message.method === 'fail') { + process.stdout.write(JSON.stringify({ + jsonrpc: '2.0', + id: message.id, + error: { code: -32601, message: 'No such method', data: { method: message.method } }, + }) + '\\n'); + return; + } + process.stdout.write(JSON.stringify({ + jsonrpc: '2.0', + id: message.id, + result: { ok: true, params: message.params }, + }) + '\\n'); +}); +`, + 'utf8' + ); + return scriptPath; +} + +afterEach(() => { + for (const dir of tempDirs.splice(0)) { + fs.rmSync(dir, { recursive: true, force: true }); + } +}); + +describe('JsonRpcStdioClient', () => { + it('sends JSON-RPC 2.0 framed requests and preserves structured errors', async () => { + const scriptPath = createStrictJsonRpcServerScript(); + const client = new JsonRpcStdioClient({ warn: () => undefined }); + + await client.withSession( + { + binaryPath: process.execPath, + args: [scriptPath], 
+ label: 'strict json-rpc smoke', + requestTimeoutMs: 1_000, + totalTimeoutMs: 2_000, + }, + async (session) => { + await expect(session.request('ping', { value: 1 })).resolves.toEqual({ + ok: true, + params: { value: 1 }, + }); + + await expect(session.request('fail')).rejects.toMatchObject({ + method: 'fail', + code: -32601, + data: { method: 'fail' }, + details: { method: 'fail' }, + }); + } + ); + }); +}); diff --git a/src/main/services/infrastructure/codexAppServer/index.ts b/src/main/services/infrastructure/codexAppServer/index.ts index 63b3b013..a875f880 100644 --- a/src/main/services/infrastructure/codexAppServer/index.ts +++ b/src/main/services/infrastructure/codexAppServer/index.ts @@ -5,7 +5,7 @@ export { } from './CodexAppServerSessionFactory'; export { CodexBinaryResolver } from './CodexBinaryResolver'; export type { JsonRpcSession } from './JsonRpcStdioClient'; -export { JsonRpcStdioClient } from './JsonRpcStdioClient'; +export { JsonRpcRequestError, JsonRpcStdioClient } from './JsonRpcStdioClient'; export type { CodexAppServerAccount, CodexAppServerAccountLoginCompletedNotification, @@ -20,10 +20,17 @@ export type { CodexAppServerGetAccountRateLimitsResponse, CodexAppServerGetAccountResponse, CodexAppServerInitializeResponse, + CodexAppServerListModelsParams, + CodexAppServerListModelsResponse, CodexAppServerLoginAccountParams, CodexAppServerLoginAccountResponse, CodexAppServerLogoutAccountResponse, + CodexAppServerModel, CodexAppServerPlanType, CodexAppServerRateLimitSnapshot, CodexAppServerRateLimitWindow, + CodexAppServerReadConfigParams, + CodexAppServerReadConfigResponse, + CodexAppServerReasoningEffort, + CodexAppServerReasoningEffortOption, } from './protocol'; diff --git a/src/main/services/infrastructure/codexAppServer/protocol.ts b/src/main/services/infrastructure/codexAppServer/protocol.ts index 4fd8fbf1..7ca4b714 100644 --- a/src/main/services/infrastructure/codexAppServer/protocol.ts +++ 
b/src/main/services/infrastructure/codexAppServer/protocol.ts @@ -111,3 +111,53 @@ export type CodexAppServerCancelLoginAccountStatus = 'canceled' | 'notFound'; export interface CodexAppServerCancelLoginAccountResponse { status: CodexAppServerCancelLoginAccountStatus; } + +export type CodexAppServerReasoningEffort = + | 'none' + | 'minimal' + | 'low' + | 'medium' + | 'high' + | 'xhigh'; + +export interface CodexAppServerReasoningEffortOption { + reasoningEffort?: string; + description?: string | null; +} + +export interface CodexAppServerModel { + id?: string; + model?: string; + displayName?: string; + hidden?: boolean; + supportedReasoningEfforts?: (string | CodexAppServerReasoningEffortOption)[]; + defaultReasoningEffort?: string | null; + inputModalities?: string[] | null; + supportsPersonality?: boolean; + isDefault?: boolean; + upgrade?: boolean | string | null; + upgradeInfo?: unknown; +} + +export interface CodexAppServerListModelsParams { + cursor?: string | null; + limit?: number | null; + includeHidden?: boolean; +} + +export interface CodexAppServerListModelsResponse { + data?: CodexAppServerModel[]; + models?: CodexAppServerModel[]; + nextCursor?: string | null; + truncated?: boolean; +} + +export interface CodexAppServerReadConfigParams { + cwd?: string; + profile?: string; +} + +export interface CodexAppServerReadConfigResponse { + config?: Record; + origins?: Record; +} diff --git a/src/main/services/runtime/ClaudeMultimodelBridgeService.ts b/src/main/services/runtime/ClaudeMultimodelBridgeService.ts index 004869a8..87489e8b 100644 --- a/src/main/services/runtime/ClaudeMultimodelBridgeService.ts +++ b/src/main/services/runtime/ClaudeMultimodelBridgeService.ts @@ -30,6 +30,18 @@ interface RuntimeExtensionCapabilitiesResponse { apiKeys?: RuntimeExtensionCapabilityResponse; } +interface RuntimeProviderCapabilitiesResponse { + modelCatalog?: { + dynamic?: boolean; + source?: 'app-server' | 'static-fallback' | 'runtime'; + }; + reasoningEffort?: { + 
supported?: boolean; + values?: string[]; + configPassthrough?: boolean; + }; +} + interface ProviderStatusCommandResponse { schemaVersion?: number; providers?: Record< @@ -53,6 +65,7 @@ interface ProviderStatusCommandResponse { projectId?: string | null; authMethodDetail?: string | null; } | null; + runtimeCapabilities?: RuntimeProviderCapabilitiesResponse; } >; } @@ -119,6 +132,7 @@ interface UnifiedRuntimeStatusResponse { projectId?: string | null; authMethodDetail?: string | null; } | null; + runtimeCapabilities?: RuntimeProviderCapabilitiesResponse; } >; } @@ -164,6 +178,8 @@ function createDefaultProviderStatus(providerId: CliProviderId): CliProviderStat externalRuntimeDiagnostics: [], backend: null, connection: null, + modelCatalog: null, + runtimeCapabilities: null, }; } @@ -301,6 +317,34 @@ export class ClaudeMultimodelBridgeService { authMethodDetail: runtimeStatus.backend.authMethodDetail ?? null, } : null, + runtimeCapabilities: runtimeStatus.runtimeCapabilities + ? { + modelCatalog: runtimeStatus.runtimeCapabilities.modelCatalog + ? { + dynamic: runtimeStatus.runtimeCapabilities.modelCatalog.dynamic === true, + source: runtimeStatus.runtimeCapabilities.modelCatalog.source, + } + : undefined, + reasoningEffort: runtimeStatus.runtimeCapabilities.reasoningEffort + ? { + supported: runtimeStatus.runtimeCapabilities.reasoningEffort.supported === true, + values: + runtimeStatus.runtimeCapabilities.reasoningEffort.values?.flatMap((value) => + value === 'none' || + value === 'minimal' || + value === 'low' || + value === 'medium' || + value === 'high' || + value === 'xhigh' + ? [value] + : [] + ) ?? 
[], + configPassthrough: + runtimeStatus.runtimeCapabilities.reasoningEffort.configPassthrough === true, + } + : undefined, + } + : null, }; } diff --git a/src/main/services/runtime/ProviderConnectionService.ts b/src/main/services/runtime/ProviderConnectionService.ts index 5581bc29..05133652 100644 --- a/src/main/services/runtime/ProviderConnectionService.ts +++ b/src/main/services/runtime/ProviderConnectionService.ts @@ -11,10 +11,12 @@ import type { CodexAccountSnapshotDto, } from '@features/codex-account/contracts'; import type { CodexAccountFeatureFacade } from '@features/codex-account/main'; +import type { CodexModelCatalogFeatureFacade } from '@features/codex-model-catalog/main'; import type { CliProviderAuthMode, CliProviderConnectionInfo, CliProviderId, + CliProviderReasoningEffort, CliProviderStatus, } from '@shared/types'; @@ -77,6 +79,8 @@ function buildCodexForcedLoginLaunchArgs( export class ProviderConnectionService { private static instance: ProviderConnectionService | null = null; private codexAccountFeature: Pick | null = null; + private codexModelCatalogFeature: Pick | null = + null; constructor( private readonly apiKeyService = new ApiKeyService(), @@ -92,6 +96,12 @@ export class ProviderConnectionService { this.codexAccountFeature = feature; } + setCodexModelCatalogFeature( + feature: Pick | null + ): void { + this.codexModelCatalogFeature = feature; + } + getConfiguredAuthMode(providerId: CliProviderId): CliProviderAuthMode | null { if (providerId === 'anthropic') { return this.configManager.getConfig().providerConnections.anthropic.authMode; @@ -353,10 +363,53 @@ export class ProviderConnectionService { } async enrichProviderStatus(provider: CliProviderStatus): Promise { - return { + const withConnection = { ...provider, connection: await this.getConnectionInfo(provider.providerId), }; + + if (provider.providerId !== 'codex' || !this.codexModelCatalogFeature) { + return withConnection; + } + + try { + const catalog = await 
this.codexModelCatalogFeature.getCatalog(); + const models = catalog.models + .filter((model) => !model.hidden) + .map((model) => model.launchModel.trim()) + .filter(Boolean); + const reasoningEfforts = Array.from( + new Set( + catalog.models.flatMap( + (model) => model.supportedReasoningEfforts + ) + ) + ); + const runtimeReasoningCapability = withConnection.runtimeCapabilities?.reasoningEffort; + const runtimeModelCatalogCapability = withConnection.runtimeCapabilities?.modelCatalog; + return { + ...withConnection, + models: models.length > 0 ? models : withConnection.models, + modelCatalog: catalog, + runtimeCapabilities: { + ...withConnection.runtimeCapabilities, + modelCatalog: { + dynamic: runtimeModelCatalogCapability?.dynamic === true, + source: catalog.source, + }, + reasoningEffort: { + supported: runtimeReasoningCapability?.supported ?? reasoningEfforts.length > 0, + values: + runtimeReasoningCapability?.values && runtimeReasoningCapability.values.length > 0 + ? runtimeReasoningCapability.values + : (['low', 'medium', 'high'] satisfies CliProviderReasoningEffort[]), + configPassthrough: runtimeReasoningCapability?.configPassthrough === true, + }, + }, + }; + } catch { + return withConnection; + } } async enrichProviderStatuses(providers: CliProviderStatus[]): Promise { diff --git a/src/main/services/team/TeamDataService.ts b/src/main/services/team/TeamDataService.ts index 5807ea51..9bdf17b9 100644 --- a/src/main/services/team/TeamDataService.ts +++ b/src/main/services/team/TeamDataService.ts @@ -8,6 +8,7 @@ import { wrapAgentBlock, } from '@shared/constants/agentBlocks'; import { getMemberColorByName } from '@shared/constants/memberColors'; +import { isTeamEffortLevel } from '@shared/utils/effortLevels'; import { classifyIdleNotificationText } from '@shared/utils/idleNotificationSemantics'; import { isLeadMember } from '@shared/utils/leadDetection'; import { createLogger } from '@shared/utils/logger'; @@ -1258,10 +1259,7 @@ export class TeamDataService { 
? request.providerId : undefined, model: request.model?.trim() || undefined, - effort: - request.effort === 'low' || request.effort === 'medium' || request.effort === 'high' - ? request.effort - : undefined, + effort: isTeamEffortLevel(request.effort) ? request.effort : undefined, agentType: 'general-purpose', joinedAt: Date.now(), }; @@ -1297,7 +1295,7 @@ export class TeamDataService { workflow?: string; providerId?: 'anthropic' | 'codex' | 'gemini'; model?: string; - effort?: 'low' | 'medium' | 'high'; + effort?: TeamMember['effort']; }[]; } ): Promise { @@ -1339,10 +1337,7 @@ export class TeamDataService { workflow: member.workflow?.trim() || undefined, providerId: normalizeOptionalTeamProviderId(member.providerId), model: member.model?.trim() || undefined, - effort: - member.effort === 'low' || member.effort === 'medium' || member.effort === 'high' - ? member.effort - : undefined, + effort: isTeamEffortLevel(member.effort) ? member.effort : undefined, agentType: prev?.agentType ?? 'general-purpose', agentId: isSameActiveMember ? prev?.agentId : undefined, color: prev?.color, @@ -2418,10 +2413,7 @@ export class TeamDataService { workflow: member.workflow?.trim() || undefined, providerId: normalizeOptionalTeamProviderId(member.providerId), model: member.model?.trim() || undefined, - effort: - member.effort === 'low' || member.effort === 'medium' || member.effort === 'high' - ? member.effort - : undefined, + effort: isTeamEffortLevel(member.effort) ? 
member.effort : undefined, agentType: 'general-purpose' as const, joinedAt, })) diff --git a/src/main/services/team/TeamMemberResolver.ts b/src/main/services/team/TeamMemberResolver.ts index 62bc8f06..9e819f66 100644 --- a/src/main/services/team/TeamMemberResolver.ts +++ b/src/main/services/team/TeamMemberResolver.ts @@ -123,7 +123,7 @@ export class TeamMemberResolver { workflow?: string; providerId?: 'anthropic' | 'codex' | 'gemini'; model?: string; - effort?: 'low' | 'medium' | 'high'; + effort?: TeamMember['effort']; color?: string; cwd?: string; } @@ -166,7 +166,7 @@ export class TeamMemberResolver { workflow?: string; providerId?: 'anthropic' | 'codex' | 'gemini'; model?: string; - effort?: 'low' | 'medium' | 'high'; + effort?: TeamMember['effort']; color?: string; removedAt?: number; } diff --git a/src/main/services/team/TeamMembersMetaStore.ts b/src/main/services/team/TeamMembersMetaStore.ts index 064b0273..7a316a15 100644 --- a/src/main/services/team/TeamMembersMetaStore.ts +++ b/src/main/services/team/TeamMembersMetaStore.ts @@ -1,5 +1,6 @@ import { FileReadTimeoutError, readFileUtf8WithTimeout } from '@main/utils/fsRead'; import { getTeamsBasePath } from '@main/utils/pathDecoder'; +import { isTeamEffortLevel } from '@shared/utils/effortLevels'; import { createCliAutoSuffixNameGuard } from '@shared/utils/teamMemberName'; import { normalizeOptionalTeamProviderId } from '@shared/utils/teamProvider'; import * as fs from 'fs'; @@ -36,10 +37,7 @@ function normalizeMember(member: TeamMember): TeamMember | null { workflow: typeof member.workflow === 'string' ? member.workflow.trim() || undefined : undefined, providerId: normalizeOptionalTeamProviderId(member.providerId), model: typeof member.model === 'string' ? member.model.trim() || undefined : undefined, - effort: - member.effort === 'low' || member.effort === 'medium' || member.effort === 'high' - ? member.effort - : undefined, + effort: isTeamEffortLevel(member.effort) ? 
member.effort : undefined, agentType: typeof member.agentType === 'string' ? member.agentType.trim() || undefined : undefined, color: typeof member.color === 'string' ? member.color.trim() || undefined : undefined, diff --git a/src/main/services/team/TeamMetaStore.ts b/src/main/services/team/TeamMetaStore.ts index 2f9553a0..71170eff 100644 --- a/src/main/services/team/TeamMetaStore.ts +++ b/src/main/services/team/TeamMetaStore.ts @@ -6,6 +6,8 @@ import * as path from 'path'; import { atomicWriteAsync } from './atomicWrite'; +import type { ProviderModelLaunchIdentity, TeamProviderId } from '@shared/types'; + /** * Persisted team-level metadata saved by the UI before CLI provisioning. * CLI does not know about this file — it only reads/writes config.json. @@ -27,6 +29,7 @@ export interface TeamMetaFile { worktree?: string; extraCliArgs?: string; limitContext?: boolean; + launchIdentity?: ProviderModelLaunchIdentity; createdAt: number; } @@ -40,6 +43,70 @@ function normalizeOptionalBackendId(value: unknown): string | undefined { return trimmed.length > 0 ? trimmed : undefined; } +function normalizeProviderId(value: unknown): TeamProviderId | undefined { + return value === 'anthropic' || value === 'codex' || value === 'gemini' ? value : undefined; +} + +function normalizeOptionalString(value: unknown): string | null { + return typeof value === 'string' && value.trim().length > 0 ? value.trim() : null; +} + +function normalizeLaunchIdentity(value: unknown): ProviderModelLaunchIdentity | undefined { + if (!value || typeof value !== 'object') { + return undefined; + } + + const raw = value as Partial; + const providerId = normalizeProviderId(raw.providerId); + const selectedModelKind = + raw.selectedModelKind === 'default' || raw.selectedModelKind === 'explicit' + ? 
raw.selectedModelKind + : null; + if (!providerId || !selectedModelKind) { + return undefined; + } + + const catalogSource = + raw.catalogSource === 'app-server' || + raw.catalogSource === 'static-fallback' || + raw.catalogSource === 'runtime' || + raw.catalogSource === 'unavailable' + ? raw.catalogSource + : 'unavailable'; + const selectedEffort = + raw.selectedEffort === 'none' || + raw.selectedEffort === 'minimal' || + raw.selectedEffort === 'low' || + raw.selectedEffort === 'medium' || + raw.selectedEffort === 'high' || + raw.selectedEffort === 'xhigh' + ? raw.selectedEffort + : null; + const resolvedEffort = + raw.resolvedEffort === 'none' || + raw.resolvedEffort === 'minimal' || + raw.resolvedEffort === 'low' || + raw.resolvedEffort === 'medium' || + raw.resolvedEffort === 'high' || + raw.resolvedEffort === 'xhigh' + ? raw.resolvedEffort + : null; + + return { + providerId, + providerBackendId: + migrateProviderBackendId(providerId, normalizeOptionalString(raw.providerBackendId)) ?? null, + selectedModel: normalizeOptionalString(raw.selectedModel), + selectedModelKind, + resolvedLaunchModel: normalizeOptionalString(raw.resolvedLaunchModel), + catalogId: normalizeOptionalString(raw.catalogId), + catalogSource, + catalogFetchedAt: normalizeOptionalString(raw.catalogFetchedAt), + selectedEffort, + resolvedEffort, + }; +} + export class TeamMetaStore { private getMetaPath(teamName: string): string { return path.join(getTeamsBasePath(), teamName, 'team.meta.json'); @@ -110,6 +177,7 @@ export class TeamMetaStore { extraCliArgs: typeof file.extraCliArgs === 'string' ? file.extraCliArgs.trim() || undefined : undefined, limitContext: typeof file.limitContext === 'boolean' ? file.limitContext : undefined, + launchIdentity: normalizeLaunchIdentity(file.launchIdentity), createdAt: typeof file.createdAt === 'number' ? 
file.createdAt : Date.now(), }; } @@ -133,6 +201,7 @@ export class TeamMetaStore { worktree: data.worktree?.trim() || undefined, extraCliArgs: data.extraCliArgs?.trim() || undefined, limitContext: data.limitContext, + launchIdentity: normalizeLaunchIdentity(data.launchIdentity), createdAt: data.createdAt, }; await atomicWriteAsync(this.getMetaPath(teamName), JSON.stringify(payload, null, 2)); diff --git a/src/main/services/team/TeamProvisioningService.ts b/src/main/services/team/TeamProvisioningService.ts index 06ba0121..312f039a 100644 --- a/src/main/services/team/TeamProvisioningService.ts +++ b/src/main/services/team/TeamProvisioningService.ts @@ -40,6 +40,7 @@ import { resolveLanguageName } from '@shared/utils/agentLanguage'; import { getAnthropicDefaultTeamModel } from '@shared/utils/anthropicModelDefaults'; import { parseCliArgs } from '@shared/utils/cliArgsParser'; import { deriveContextMetrics, inferContextWindowTokens } from '@shared/utils/contextMetrics'; +import { isTeamEffortLevel } from '@shared/utils/effortLevels'; import { getErrorMessage } from '@shared/utils/errorHandling'; import { isInboxNoiseMessage, @@ -169,6 +170,7 @@ interface RelayInboxMessageView { import type { ActiveToolCall, + CliProviderRuntimeCapabilities, CrossTeamSendResult, EffortLevel, InboxMessage, @@ -179,6 +181,7 @@ import type { MemberSpawnStatusEntry, PersistedTeamLaunchPhase, PersistedTeamLaunchSummary, + ProviderModelLaunchIdentity, TeamAgentRuntimeBackendType, TeamAgentRuntimeEntry, TeamAgentRuntimeSnapshot, @@ -320,6 +323,21 @@ interface ProviderModelListCommandResponse { >; } +interface RuntimeStatusCommandResponse { + providers?: Record< + string, + { + runtimeCapabilities?: CliProviderRuntimeCapabilities | null; + } + >; +} + +interface RuntimeProviderLaunchFacts { + defaultModel: string | null; + modelIds: Set; + runtimeCapabilities: CliProviderRuntimeCapabilities | null; +} + function extractJsonObjectFromCli(raw: string): T { const trimmed = raw.trim(); try { @@ 
-334,6 +352,65 @@ function extractJsonObjectFromCli(raw: string): T { } } +function getExplicitLaunchModelSelection(model: string | undefined): string | undefined { + const trimmed = model?.trim(); + if (!trimmed || isDefaultProviderModelSelection(trimmed)) { + return undefined; + } + return trimmed; +} + +function getLaunchModelArg( + providerId: TeamProviderId, + model: string | undefined, + launchIdentity?: ProviderModelLaunchIdentity | null +): string | undefined { + const explicitModel = getExplicitLaunchModelSelection(model); + if (explicitModel) { + return explicitModel; + } + + if ( + providerId === 'codex' && + launchIdentity?.selectedModelKind === 'default' && + launchIdentity.resolvedLaunchModel + ) { + return launchIdentity.resolvedLaunchModel; + } + + return undefined; +} + +function normalizeProviderModelListModels( + provider: NonNullable[string] | undefined +): Set { + const models = new Set(); + for (const entry of provider?.models ?? []) { + const modelId = typeof entry === 'string' ? 
entry : entry.id; + const trimmed = modelId?.trim(); + if (trimmed) { + models.add(trimmed); + } + } + return models; +} + +function isLegacySafeEffort(effort: EffortLevel): boolean { + return effort === 'low' || effort === 'medium' || effort === 'high'; +} + +function isCodexEffortRuntimeSupported( + effort: EffortLevel, + capabilities: CliProviderRuntimeCapabilities | null +): boolean { + if (isLegacySafeEffort(effort)) { + return true; + } + + const reasoning = capabilities?.reasoningEffort; + return reasoning?.configPassthrough === true && reasoning.values.includes(effort); +} + function isProbeTimeoutMessage(message: string): boolean { const lower = message.toLowerCase(); return ( @@ -476,6 +553,7 @@ function logRuntimeLaunchSnapshot( geminiRuntimeAuth?: GeminiRuntimeAuthState | null; promptSize?: PromptSizeSummary | null; expectedMembersCount?: number; + launchIdentity?: ProviderModelLaunchIdentity | null; } ): void { const providerId = resolveTeamProviderId(request.providerId); @@ -489,6 +567,7 @@ function logRuntimeLaunchSnapshot( getConfiguredRuntimeBackend(providerId), promptSize: options?.promptSize ?? null, expectedMembersCount: options?.expectedMembersCount ?? null, + launchIdentity: options?.launchIdentity ?? null, geminiRuntimeAuth: providerId === 'gemini' ? { @@ -1257,17 +1336,22 @@ function buildEffectiveTeamMemberSpec( const defaultProviderId = normalizeTeamMemberProviderId(defaults.providerId); const effectiveProviderId = memberProviderId ?? defaultProviderId ?? 'anthropic'; const model = - member.model?.trim() || + getExplicitLaunchModelSelection(member.model) || (memberProviderId == null || memberProviderId === defaultProviderId - ? defaults.model?.trim() + ? getExplicitLaunchModelSelection(defaults.model) : undefined) || undefined; + const effort = + member.effort ?? + (memberProviderId == null || memberProviderId === defaultProviderId + ? 
defaults.effort + : undefined); return { ...member, providerId: effectiveProviderId, model, - effort: member.effort ?? defaults.effort, + effort, }; } @@ -2859,6 +2943,214 @@ export class TeamProvisioningService { this.controlApiBaseUrlResolver = resolver; } + private async readRuntimeProviderLaunchFacts(params: { + claudePath: string; + cwd: string; + providerId: TeamProviderId; + env: NodeJS.ProcessEnv; + limitContext?: boolean; + }): Promise { + if (params.providerId === 'anthropic') { + return { + defaultModel: getAnthropicDefaultTeamModel(params.limitContext === true), + modelIds: new Set(), + runtimeCapabilities: null, + }; + } + + const modelListPromise = execCli( + params.claudePath, + ['model', 'list', '--json', '--provider', params.providerId], + { + cwd: params.cwd, + env: params.env, + timeout: 10_000, + } + ); + const runtimeStatusPromise = + params.providerId === 'codex' + ? execCli(params.claudePath, ['runtime', 'status', '--json', '--provider', 'codex'], { + cwd: params.cwd, + env: params.env, + timeout: 8_000, + }) + : null; + + const [modelListResult, runtimeStatusResult] = await Promise.allSettled([ + modelListPromise, + runtimeStatusPromise, + ]); + + let defaultModel: string | null = null; + let modelIds = new Set(); + if (modelListResult.status === 'fulfilled') { + try { + const parsed = extractJsonObjectFromCli( + modelListResult.value.stdout + ); + const provider = parsed.providers?.[params.providerId]; + defaultModel = + typeof provider?.defaultModel === 'string' && provider.defaultModel.trim().length > 0 + ? provider.defaultModel.trim() + : null; + modelIds = normalizeProviderModelListModels(provider); + } catch (error) { + logger.warn( + `[${params.providerId}] Failed to parse runtime model list for launch validation: ${ + error instanceof Error ? 
error.message : String(error) + }` + ); + } + } + + let runtimeCapabilities: CliProviderRuntimeCapabilities | null = null; + if ( + runtimeStatusResult.status === 'fulfilled' && + runtimeStatusResult.value && + typeof runtimeStatusResult.value.stdout === 'string' + ) { + try { + const parsed = extractJsonObjectFromCli( + runtimeStatusResult.value.stdout + ); + runtimeCapabilities = parsed.providers?.[params.providerId]?.runtimeCapabilities ?? null; + } catch (error) { + logger.warn( + `[${params.providerId}] Failed to parse runtime capabilities for launch validation: ${ + error instanceof Error ? error.message : String(error) + }` + ); + } + } + + return { + defaultModel, + modelIds, + runtimeCapabilities, + }; + } + + private buildProviderModelLaunchIdentity(params: { + request: Pick; + facts: RuntimeProviderLaunchFacts; + }): ProviderModelLaunchIdentity { + const providerId = resolveTeamProviderId(params.request.providerId); + const explicitModel = getExplicitLaunchModelSelection(params.request.model); + const resolvedLaunchModel = explicitModel ?? params.facts.defaultModel; + const resolvedEffort = params.request.effort ?? null; + + return { + providerId, + providerBackendId: + migrateProviderBackendId(providerId, params.request.providerBackendId) ?? null, + selectedModel: explicitModel ?? null, + selectedModelKind: explicitModel ? 'explicit' : 'default', + resolvedLaunchModel, + catalogId: resolvedLaunchModel, + catalogSource: 'runtime', + catalogFetchedAt: null, + selectedEffort: params.request.effort ?? 
null, + resolvedEffort, + }; + } + + private validateRuntimeLaunchSelection(params: { + actorLabel: string; + providerId: TeamProviderId; + model?: string; + effort?: EffortLevel; + facts: RuntimeProviderLaunchFacts; + }): void { + const explicitModel = getExplicitLaunchModelSelection(params.model); + + if (params.providerId !== 'codex') { + if (params.effort && !isLegacySafeEffort(params.effort)) { + throw new Error( + `${params.actorLabel} uses effort "${params.effort}", but ${getTeamProviderLabel( + params.providerId + )} currently supports only low, medium, or high effort in Agent Teams.` + ); + } + return; + } + + if ( + params.effort && + !isCodexEffortRuntimeSupported(params.effort, params.facts.runtimeCapabilities) + ) { + throw new Error( + `${params.actorLabel} uses Codex effort "${params.effort}", but this Agent Teams runtime does not expose Codex reasoning config passthrough yet. Use low, medium, or high for now.` + ); + } + + if (!explicitModel || params.facts.modelIds.has(explicitModel)) { + return; + } + + if (params.facts.runtimeCapabilities?.modelCatalog?.dynamic === true) { + return; + } + + throw new Error( + `${params.actorLabel} uses Codex model "${explicitModel}", but this Agent Teams runtime does not declare dynamic Codex model launch support yet. 
Upgrade the runtime or pick a listed Codex model.` + ); + } + + private async resolveAndValidateLaunchIdentity(params: { + claudePath: string; + cwd: string; + env: NodeJS.ProcessEnv; + request: Pick< + TeamCreateRequest, + 'providerId' | 'providerBackendId' | 'model' | 'effort' | 'limitContext' + >; + effectiveMembers: TeamCreateRequest['members']; + }): Promise { + const leadProviderId = resolveTeamProviderId(params.request.providerId); + const factsByProvider = new Map(); + const getFacts = async (providerId: TeamProviderId): Promise => { + const cached = factsByProvider.get(providerId); + if (cached) { + return cached; + } + const facts = await this.readRuntimeProviderLaunchFacts({ + claudePath: params.claudePath, + cwd: params.cwd, + providerId, + env: params.env, + limitContext: params.request.limitContext, + }); + factsByProvider.set(providerId, facts); + return facts; + }; + + const leadFacts = await getFacts(leadProviderId); + this.validateRuntimeLaunchSelection({ + actorLabel: 'Team lead', + providerId: leadProviderId, + model: params.request.model, + effort: params.request.effort, + facts: leadFacts, + }); + + for (const member of params.effectiveMembers) { + const memberProviderId = resolveTeamProviderId(member.providerId); + const memberFacts = await getFacts(memberProviderId); + this.validateRuntimeLaunchSelection({ + actorLabel: `Member ${member.name}`, + providerId: memberProviderId, + model: member.model, + effort: member.effort, + facts: memberFacts, + }); + } + + return this.buildProviderModelLaunchIdentity({ + request: params.request, + facts: leadFacts, + }); + } + async getClaudeLogs( teamName: string, query?: { offset?: number; limit?: number } @@ -6227,6 +6519,13 @@ export class TeamProvisioningService { primaryEnv: provisioningEnv, limitContext: request.limitContext, }); + const launchIdentity = await this.resolveAndValidateLaunchIdentity({ + claudePath, + cwd: request.cwd, + env: shellEnv, + request, + effectiveMembers: 
effectiveMemberSpecs, + }); const runId = randomUUID(); const startedAt = nowIso(); const run: ProvisioningRun = { @@ -6363,6 +6662,11 @@ export class TeamProvisioningService { run.bootstrapUserPromptPath = null; throw error; } + const launchModelArg = getLaunchModelArg( + resolveTeamProviderId(request.providerId), + request.model, + launchIdentity + ); const spawnArgs = [ '--input-format', 'stream-json', @@ -6385,7 +6689,7 @@ export class TeamProvisioningService { ...(request.skipPermissions !== false ? ['--dangerously-skip-permissions', '--permission-mode', 'bypassPermissions'] : ['--permission-prompt-tool', 'stdio', '--permission-mode', 'default']), - ...(request.model ? ['--model', request.model] : []), + ...(launchModelArg ? ['--model', launchModelArg] : []), ...(request.effort ? ['--effort', request.effort] : []), ...(request.worktree ? ['--worktree', request.worktree] : []), ...parseCliArgs(request.extraCliArgs), @@ -6400,6 +6704,7 @@ export class TeamProvisioningService { geminiRuntimeAuth, promptSize, expectedMembersCount: effectiveMemberSpecs.length, + launchIdentity, }); try { // Pre-save our meta files before spawn — CLI doesn't touch these. @@ -6422,6 +6727,7 @@ export class TeamProvisioningService { worktree: request.worktree, extraCliArgs: request.extraCliArgs, limitContext: request.limitContext, + launchIdentity, createdAt: Date.now(), }); const membersToWrite = applyDistinctProvisioningMemberColors( @@ -6431,10 +6737,7 @@ export class TeamProvisioningService { workflow: m.workflow?.trim() || undefined, providerId: normalizeOptionalTeamProviderId(m.providerId), model: m.model?.trim() || undefined, - effort: - m.effort === 'low' || m.effort === 'medium' || m.effort === 'high' - ? m.effort - : undefined, + effort: isTeamEffortLevel(m.effort) ? 
m.effort : undefined, agentType: 'general-purpose' as const, joinedAt: Date.now(), })) @@ -6804,6 +7107,13 @@ export class TeamProvisioningService { primaryEnv: provisioningEnv, limitContext: request.limitContext, }); + const launchIdentity = await this.resolveAndValidateLaunchIdentity({ + claudePath, + cwd: request.cwd, + env: shellEnv, + request, + effectiveMembers: effectiveMemberSpecs, + }); // Build a synthetic TeamCreateRequest for reuse by shared infrastructure const syntheticRequest: TeamCreateRequest = { @@ -7013,8 +7323,13 @@ export class TeamProvisioningService { `[${request.teamName}] Launching with --resume ${previousSessionId} for session continuity` ); } - if (request.model) { - launchArgs.push('--model', request.model); + const launchModelArg = getLaunchModelArg( + resolveTeamProviderId(request.providerId), + request.model, + launchIdentity + ); + if (launchModelArg) { + launchArgs.push('--model', launchModelArg); } if (request.effort) { launchArgs.push('--effort', request.effort); @@ -7033,6 +7348,7 @@ export class TeamProvisioningService { geminiRuntimeAuth, promptSize, expectedMembersCount: effectiveMemberSpecs.length, + launchIdentity, }); // --resume is added above when a valid previous session JSONL exists. // Without it, CLI creates a fresh session ID automatically. @@ -7050,6 +7366,7 @@ export class TeamProvisioningService { worktree: request.worktree, extraCliArgs: request.extraCliArgs, limitContext: request.limitContext, + launchIdentity, createdAt: Date.now(), }); await this.membersMetaStore.writeMembers( @@ -7060,10 +7377,7 @@ export class TeamProvisioningService { workflow: member.workflow?.trim() || undefined, providerId: normalizeOptionalTeamProviderId(member.providerId), model: member.model?.trim() || undefined, - effort: - member.effort === 'low' || member.effort === 'medium' || member.effort === 'high' - ? member.effort - : undefined, + effort: isTeamEffortLevel(member.effort) ? 
member.effort : undefined, agentType: 'general-purpose', color: getMemberColorByName(member.name.trim()), joinedAt: Date.now(), @@ -8514,16 +8828,11 @@ export class TeamProvisioningService { normalizeTeamMemberProviderId(metaMember?.providerId) ?? normalizeTeamMemberProviderId(configuredMember?.providerId); const model = metaMember?.model?.trim() || configuredMember?.model?.trim() || undefined; - const effort = - metaMember?.effort === 'low' || - metaMember?.effort === 'medium' || - metaMember?.effort === 'high' - ? metaMember.effort - : configuredMember?.effort === 'low' || - configuredMember?.effort === 'medium' || - configuredMember?.effort === 'high' - ? configuredMember.effort - : undefined; + const effort = isTeamEffortLevel(metaMember?.effort) + ? metaMember.effort + : isTeamEffortLevel(configuredMember?.effort) + ? configuredMember.effort + : undefined; const agentType = metaMember?.agentType?.trim() || configuredMember?.agentType?.trim() || undefined; const removedAt = metaMember?.removedAt ?? configuredMember?.removedAt; @@ -13035,12 +13344,9 @@ export class TeamProvisioningService { const effectiveLeadProviderId = normalizeTeamMemberProviderId(launchState.providerId) ?? 'anthropic'; const effectiveLeadModel = launchState.model?.trim() || undefined; - const effectiveLeadEffort = - launchState.effort === 'low' || - launchState.effort === 'medium' || - launchState.effort === 'high' - ? launchState.effort - : undefined; + const effectiveLeadEffort = isTeamEffortLevel(launchState.effort) + ? launchState.effort + : undefined; const membersByName = new Map( (launchState.members ?? []).map((member) => [member.name.toLowerCase(), member] as const) @@ -13075,10 +13381,7 @@ export class TeamProvisioningService { delete nextMember.model; } - const effort = - state.effort === 'low' || state.effort === 'medium' || state.effort === 'high' - ? state.effort - : undefined; + const effort = isTeamEffortLevel(state.effort) ? 
state.effort : undefined; if (effort) { nextMember.effort = effort; } else { @@ -13712,10 +14015,7 @@ export class TeamProvisioningService { workflow: member.workflow?.trim() || undefined, providerId: normalizeOptionalTeamProviderId(member.providerId), model: member.model?.trim() || undefined, - effort: - member.effort === 'low' || member.effort === 'medium' || member.effort === 'high' - ? member.effort - : undefined, + effort: isTeamEffortLevel(member.effort) ? member.effort : undefined, agentType: 'general-purpose' as const, joinedAt, })) @@ -13758,10 +14058,7 @@ export class TeamProvisioningService { const providerId = normalizeOptionalTeamProviderId(member.providerId); const model = typeof member.model === 'string' ? member.model.trim() || undefined : undefined; - const effort = - member.effort === 'low' || member.effort === 'medium' || member.effort === 'high' - ? member.effort - : undefined; + const effort = isTeamEffortLevel(member.effort) ? member.effort : undefined; const prev = byName.get(name); if (!prev) { byName.set(name, { name, role, workflow, providerId, model, effort }); @@ -13923,10 +14220,7 @@ export class TeamProvisioningService { typeof member.workflow === 'string' ? member.workflow.trim() || undefined : undefined, providerId: normalizeTeamMemberProviderId(member.providerId ?? member.provider), model: typeof member.model === 'string' ? member.model.trim() || undefined : undefined, - effort: - member.effort === 'low' || member.effort === 'medium' || member.effort === 'high' - ? member.effort - : undefined, + effort: isTeamEffortLevel(member.effort) ? member.effort : undefined, }); } // Defense: ignore CLI auto-suffixed duplicates (alice-2) when base name exists. 
diff --git a/src/renderer/components/team/dialogs/EffortLevelSelector.tsx b/src/renderer/components/team/dialogs/EffortLevelSelector.tsx index 76de3985..b3395d46 100644 --- a/src/renderer/components/team/dialogs/EffortLevelSelector.tsx +++ b/src/renderer/components/team/dialogs/EffortLevelSelector.tsx @@ -2,54 +2,136 @@ import React from 'react'; import { Label } from '@renderer/components/ui/label'; import { cn } from '@renderer/lib/utils'; +import { useStore } from '@renderer/store'; import { Brain } from 'lucide-react'; -const EFFORT_OPTIONS = [ +import type { CliProviderStatus, EffortLevel, TeamProviderId } from '@shared/types'; + +const BASE_EFFORT_OPTIONS = [ { value: '', label: 'Default' }, { value: 'low', label: 'Low' }, { value: 'medium', label: 'Medium' }, { value: 'high', label: 'High' }, ] as const; +const EFFORT_LABELS: Record<EffortLevel, string> = { + none: 'None', + minimal: 'Minimal', + low: 'Low', + medium: 'Medium', + high: 'High', + xhigh: 'XHigh', +}; + +const BASE_CODEX_SAFE_EFFORTS = new Set(['low', 'medium', 'high']); + export interface EffortLevelSelectorProps { value: string; onValueChange: (value: string) => void; id?: string; + providerId?: TeamProviderId; + model?: string; +} + +function getCatalogModel( + providerStatus: CliProviderStatus | null | undefined, + model: string | undefined +): NonNullable<CliProviderStatus['modelCatalog']>['models'][number] | null { + const catalog = providerStatus?.modelCatalog; + if (!catalog || catalog.providerId !== 'codex') { + return null; + } + + const explicitModel = model?.trim(); + if (explicitModel) { + return ( + catalog.models.find( + (item) => item.launchModel === explicitModel || item.id === explicitModel + ) ?? null + ); + } + + return ( + catalog.models.find((item) => item.id === catalog.defaultModelId) ?? + catalog.models.find((item) => item.isDefault) ?? 
+ null + ); +} + +function getEffortOptions(params: { + providerId?: TeamProviderId; + model?: string; + providerStatus?: CliProviderStatus | null; +}): readonly { value: string; label: string }[] { + if (params.providerId !== 'codex') { + return BASE_EFFORT_OPTIONS; + } + + const runtimeCapability = params.providerStatus?.runtimeCapabilities?.reasoningEffort; + const catalogModel = getCatalogModel(params.providerStatus, params.model); + const catalogEfforts = catalogModel?.supportedReasoningEfforts ?? []; + const candidateEfforts = + catalogEfforts.length > 0 ? catalogEfforts : (runtimeCapability?.values ?? []); + const safeEfforts = + runtimeCapability?.configPassthrough === true + ? candidateEfforts + : candidateEfforts.filter((effort) => BASE_CODEX_SAFE_EFFORTS.has(effort)); + const efforts = safeEfforts.length > 0 ? safeEfforts : (['low', 'medium', 'high'] as const); + const defaultLabel = catalogModel?.defaultReasoningEffort + ? `Default (${EFFORT_LABELS[catalogModel.defaultReasoningEffort]})` + : 'Default'; + + return [ + { value: '', label: defaultLabel }, + ...efforts.map((effort) => ({ + value: effort, + label: EFFORT_LABELS[effort], + })), + ]; } export const EffortLevelSelector: React.FC<EffortLevelSelectorProps> = ({ value, onValueChange, id, -}) => ( -
- -
- -
- {EFFORT_OPTIONS.map((opt) => ( - - ))} + providerId, + model, +}) => { + const providerStatus = useStore( + (s) => s.cliStatus?.providers.find((provider) => provider.providerId === providerId) ?? null + ); + const effortOptions = getEffortOptions({ providerId, model, providerStatus }); + + return ( +
+ +
+ +
+ {effortOptions.map((opt) => ( + + ))} +
+

+ Controls how much reasoning the selected provider invests before responding. Default uses + the provider's standard behavior for the selected model. +

-

- Controls how much reasoning the selected provider invests before responding. Default uses the - provider's standard behavior for the selected model. -

-
-); + ); +}; diff --git a/src/renderer/components/team/dialogs/LaunchTeamDialog.tsx b/src/renderer/components/team/dialogs/LaunchTeamDialog.tsx index 49203ee3..bb917507 100644 --- a/src/renderer/components/team/dialogs/LaunchTeamDialog.tsx +++ b/src/renderer/components/team/dialogs/LaunchTeamDialog.tsx @@ -2011,6 +2011,8 @@ export const LaunchTeamDialog = (props: LaunchTeamDialogProps): React.JSX.Elemen value={selectedEffort} onValueChange={setSelectedEffort} id="dialog-effort" + providerId={selectedProviderId} + model={selectedModel} /> {providerId === 'anthropic' ? ( {lockProviderModel && (

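Reviewer sketch: the option filtering that `getEffortOptions` performs for Codex can be checked without the store or React. The function below is a minimal sketch under stated assumptions. `buildCodexEffortOptions` and its flattened parameter object are hypothetical names introduced for illustration; only the filtering order (catalog efforts first, runtime values as fallback, safe subset unless `configPassthrough` is true) is taken from the `EffortLevelSelector.tsx` hunk.

```typescript
type EffortLevel = 'none' | 'minimal' | 'low' | 'medium' | 'high' | 'xhigh';

interface EffortOption {
  value: string;
  label: string;
}

const EFFORT_LABELS: Record<EffortLevel, string> = {
  none: 'None',
  minimal: 'Minimal',
  low: 'Low',
  medium: 'Medium',
  high: 'High',
  xhigh: 'XHigh',
};

const SAFE_EFFORTS = new Set<EffortLevel>(['low', 'medium', 'high']);

// Hypothetical flattened inputs; the real component reads these from
// the provider status and the selected catalog model.
function buildCodexEffortOptions(params: {
  catalogEfforts: EffortLevel[];
  runtimeEfforts: EffortLevel[];
  configPassthrough: boolean;
  defaultEffort?: EffortLevel;
}): EffortOption[] {
  // Prefer per-model catalog efforts, fall back to runtime-declared values.
  const candidates =
    params.catalogEfforts.length > 0 ? params.catalogEfforts : params.runtimeEfforts;
  // Without config passthrough, only the legacy-safe subset is offered.
  const safe = params.configPassthrough
    ? candidates
    : candidates.filter((effort) => SAFE_EFFORTS.has(effort));
  const fallback: EffortLevel[] = ['low', 'medium', 'high'];
  const efforts = safe.length > 0 ? safe : fallback;
  const defaultLabel = params.defaultEffort
    ? `Default (${EFFORT_LABELS[params.defaultEffort]})`
    : 'Default';

  return [
    { value: '', label: defaultLabel },
    ...efforts.map((effort) => ({ value: effort, label: EFFORT_LABELS[effort] })),
  ];
}
```

For a model whose catalog entry lists `low`, `medium`, `high`, and `xhigh`, this hides `xhigh` until the runtime reports `configPassthrough: true`, so the picker cannot offer an effort that launch validation would reject.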
diff --git a/src/renderer/components/team/members/MembersEditorSection.tsx b/src/renderer/components/team/members/MembersEditorSection.tsx index dac0dee7..e7fb826e 100644 --- a/src/renderer/components/team/members/MembersEditorSection.tsx +++ b/src/renderer/components/team/members/MembersEditorSection.tsx @@ -5,6 +5,7 @@ import { Label } from '@renderer/components/ui/label'; import { getParticipantAvatarUrlByIndex } from '@renderer/utils/memberAvatarCatalog'; import { CUSTOM_ROLE, NO_ROLE, PRESET_ROLES } from '@renderer/constants/teamRoles'; import { normalizeOptionalTeamProviderId } from '@shared/utils/teamProvider'; +import { isTeamEffortLevel } from '@shared/utils/effortLevels'; import { Plus } from 'lucide-react'; import { MembersJsonEditor } from '../dialogs/MembersJsonEditor'; @@ -50,10 +51,9 @@ function parseJsonToDrafts(text: string): MemberDraft[] { const workflow = typeof item.workflow === 'string' ? item.workflow.trim() : ''; const providerId = normalizeOptionalTeamProviderId(item.providerId); const model = typeof item.model === 'string' ? item.model.trim() : ''; - const effort: EffortLevel | undefined = - item.effort === 'low' || item.effort === 'medium' || item.effort === 'high' - ? item.effort - : undefined; + const effort: EffortLevel | undefined = isTeamEffortLevel(item.effort) + ? item.effort + : undefined; const presetRoles: readonly string[] = PRESET_ROLES; const isPreset = presetRoles.includes(role); return createMemberDraft({ @@ -227,8 +227,7 @@ export const MembersEditorSection = ({ c.id === memberId ? { ...c, - effort: - effort === 'low' || effort === 'medium' || effort === 'high' ? effort : undefined, + effort: isTeamEffortLevel(effort) ? 
effort : undefined, } : c ) diff --git a/src/renderer/components/team/members/membersEditorUtils.ts b/src/renderer/components/team/members/membersEditorUtils.ts index 78bda46f..336916a1 100644 --- a/src/renderer/components/team/members/membersEditorUtils.ts +++ b/src/renderer/components/team/members/membersEditorUtils.ts @@ -2,6 +2,7 @@ import { CUSTOM_ROLE, NO_ROLE, PRESET_ROLES } from '@renderer/constants/teamRole import { serializeChipsWithText } from '@renderer/types/inlineChip'; import { normalizeCreateLaunchProviderForUi } from '@renderer/utils/geminiUiFreeze'; import { normalizeExplicitTeamModelForUi } from '@renderer/utils/teamModelAvailability'; +import { isTeamEffortLevel } from '@shared/utils/effortLevels'; import { isLeadMember } from '@shared/utils/leadDetection'; import { buildTeamMemberColorMap } from '@shared/utils/teamMemberColors'; import { validateTeamMemberNameFormat } from '@shared/utils/teamMemberName'; @@ -120,10 +121,7 @@ export function normalizeMemberDraftForProviderMode( } function normalizeDraftEffort(value: string | undefined): EffortLevel | undefined { - if (value === 'low' || value === 'medium' || value === 'high') { - return value; - } - return undefined; + return isTeamEffortLevel(value) ? 
value : undefined; } interface ExistingMemberColorInput { diff --git a/src/renderer/services/createTeamDraftStorage.ts b/src/renderer/services/createTeamDraftStorage.ts index eb616f60..2a7cab07 100644 --- a/src/renderer/services/createTeamDraftStorage.ts +++ b/src/renderer/services/createTeamDraftStorage.ts @@ -11,6 +11,10 @@ import { del, get, set } from 'idb-keyval'; +import { isTeamEffortLevel } from '@shared/utils/effortLevels'; + +import type { EffortLevel } from '@shared/types'; + // --------------------------------------------------------------------------- // Types // --------------------------------------------------------------------------- @@ -27,7 +31,7 @@ export interface SerializedMemberDraft { workflow?: string; providerId?: 'anthropic' | 'codex' | 'gemini'; model?: string; - effort?: 'low' | 'medium' | 'high'; + effort?: EffortLevel; } export interface CreateTeamDraftSnapshot { @@ -67,10 +71,7 @@ function isValidMember(m: unknown): m is SerializedMemberDraft { obj.providerId === 'codex' || obj.providerId === 'gemini') && (obj.model === undefined || typeof obj.model === 'string') && - (obj.effort === undefined || - obj.effort === 'low' || - obj.effort === 'medium' || - obj.effort === 'high') + (obj.effort === undefined || isTeamEffortLevel(obj.effort)) ); } diff --git a/src/renderer/utils/__tests__/teamModelAvailability.codexCatalog.test.ts b/src/renderer/utils/__tests__/teamModelAvailability.codexCatalog.test.ts new file mode 100644 index 00000000..af312422 --- /dev/null +++ b/src/renderer/utils/__tests__/teamModelAvailability.codexCatalog.test.ts @@ -0,0 +1,161 @@ +import { describe, expect, it } from 'vitest'; + +import { + CODEX_DYNAMIC_MODEL_REQUIRES_RUNTIME_SUPPORT_REASON, + getAvailableTeamProviderModelOptions, + getAvailableTeamProviderModels, + getTeamModelSelectionError, +} from '../teamModelAvailability'; + +import type { CliProviderStatus } from '@shared/types'; + +function createCodexProviderStatus( + models: NonNullable['models'], + 
options: { dynamicLaunch?: boolean } = {} +): CliProviderStatus { + return { + providerId: 'codex', + displayName: 'Codex', + supported: true, + authenticated: true, + authMethod: 'chatgpt', + verificationState: 'verified', + models: models.map((model) => model.launchModel), + modelCatalog: { + schemaVersion: 1, + providerId: 'codex', + source: 'app-server', + status: 'ready', + fetchedAt: '2026-04-21T00:00:00.000Z', + staleAt: '2026-04-21T00:01:00.000Z', + defaultModelId: models[0]?.id ?? null, + defaultLaunchModel: models[0]?.launchModel ?? null, + models, + diagnostics: { + configReadState: 'ready', + appServerState: 'healthy', + }, + }, + modelAvailability: [], + runtimeCapabilities: { + modelCatalog: { + dynamic: options.dynamicLaunch === true, + source: 'app-server', + }, + reasoningEffort: { + supported: true, + values: ['low', 'medium', 'high'], + configPassthrough: false, + }, + }, + canLoginFromUi: true, + capabilities: { + teamLaunch: true, + oneShot: true, + extensions: { + plugins: { status: 'unsupported', ownership: 'shared', reason: null }, + mcp: { status: 'supported', ownership: 'shared', reason: null }, + skills: { status: 'supported', ownership: 'shared', reason: null }, + apiKeys: { status: 'supported', ownership: 'shared', reason: null }, + }, + }, + }; +} + +describe('team model availability Codex catalog integration', () => { + it('uses app-server catalog models even when the static Codex list has not learned a new model yet', () => { + const providerStatus = createCodexProviderStatus( + [ + { + id: 'gpt-5.5', + launchModel: 'gpt-5.5', + displayName: 'GPT-5.5', + hidden: false, + supportedReasoningEfforts: ['low', 'medium', 'high', 'xhigh'], + defaultReasoningEffort: 'high', + inputModalities: ['text', 'image'], + supportsPersonality: false, + isDefault: true, + upgrade: false, + source: 'app-server', + badgeLabel: '5.5', + }, + ], + { dynamicLaunch: true } + ); + + expect(getAvailableTeamProviderModels('codex', 
providerStatus)).toEqual(['gpt-5.5']); + expect(getAvailableTeamProviderModelOptions('codex', providerStatus)).toEqual([ + { value: '', label: 'Default', badgeLabel: 'Default' }, + { + value: 'gpt-5.5', + label: '5.5', + badgeLabel: '5.5', + availabilityStatus: 'available', + availabilityReason: null, + }, + ]); + }); + + it('shows app-server future models but blocks launch until runtime declares dynamic support', () => { + const providerStatus = createCodexProviderStatus([ + { + id: 'gpt-5.5', + launchModel: 'gpt-5.5', + displayName: 'GPT-5.5', + hidden: false, + supportedReasoningEfforts: ['low', 'medium', 'high', 'xhigh'], + defaultReasoningEffort: 'high', + inputModalities: ['text', 'image'], + supportsPersonality: false, + isDefault: true, + upgrade: false, + source: 'app-server', + }, + ]); + + expect(getAvailableTeamProviderModels('codex', providerStatus)).toEqual([]); + expect(getAvailableTeamProviderModelOptions('codex', providerStatus)[1]).toMatchObject({ + value: 'gpt-5.5', + label: '5.5', + badgeLabel: 'New', + availabilityStatus: null, + }); + expect(getTeamModelSelectionError('codex', 'gpt-5.5', providerStatus)).toContain( + CODEX_DYNAMIC_MODEL_REQUIRES_RUNTIME_SUPPORT_REASON + ); + }); + + it('keeps existing disabled model policy on top of the dynamic catalog', () => { + const providerStatus = createCodexProviderStatus([ + { + id: 'gpt-5.3-codex-spark', + launchModel: 'gpt-5.3-codex-spark', + displayName: 'GPT-5.3 Codex Spark', + hidden: false, + supportedReasoningEfforts: ['high'], + defaultReasoningEffort: 'high', + inputModalities: ['text', 'image'], + supportsPersonality: false, + isDefault: false, + upgrade: false, + source: 'app-server', + }, + { + id: 'gpt-5.4', + launchModel: 'gpt-5.4', + displayName: 'GPT-5.4', + hidden: false, + supportedReasoningEfforts: ['low', 'medium', 'high'], + defaultReasoningEffort: 'medium', + inputModalities: ['text', 'image'], + supportsPersonality: false, + isDefault: true, + upgrade: false, + source: 
'app-server', + }, + ]); + + expect(getAvailableTeamProviderModels('codex', providerStatus)).toEqual(['gpt-5.4']); + }); +}); diff --git a/src/renderer/utils/teamModelAvailability.ts b/src/renderer/utils/teamModelAvailability.ts index 585ec2d5..5521f3c0 100644 --- a/src/renderer/utils/teamModelAvailability.ts +++ b/src/renderer/utils/teamModelAvailability.ts @@ -4,6 +4,7 @@ import { getTeamProviderLabel, getTeamProviderModelOptions, getVisibleTeamProviderModels, + CODEX_DYNAMIC_MODEL_REQUIRES_RUNTIME_SUPPORT_REASON, GPT_5_1_CODEX_MAX_CHATGPT_UI_DISABLED_REASON, GPT_5_1_CODEX_MINI_UI_DISABLED_MODEL, GPT_5_1_CODEX_MINI_UI_DISABLED_REASON, @@ -28,6 +29,7 @@ import type { } from '@shared/types'; export { + CODEX_DYNAMIC_MODEL_REQUIRES_RUNTIME_SUPPORT_REASON, GPT_5_1_CODEX_MAX_CHATGPT_UI_DISABLED_REASON, GPT_5_1_CODEX_MINI_UI_DISABLED_MODEL, GPT_5_1_CODEX_MINI_UI_DISABLED_REASON, @@ -44,8 +46,10 @@ export type TeamModelRuntimeProviderStatus = Pick< CliProviderStatus, | 'providerId' | 'models' + | 'modelCatalog' | 'modelAvailability' | 'modelVerificationState' + | 'runtimeCapabilities' | 'authMethod' | 'backend' | 'authenticated' @@ -100,6 +104,56 @@ function getFallbackTeamProviderModelOptions( })); } +function getRuntimeCatalogModels( + providerId: SupportedProviderId, + providerStatus?: TeamModelRuntimeProviderStatus | null +): string[] | null { + if (providerId !== 'codex' || providerStatus?.modelCatalog?.providerId !== 'codex') { + return null; + } + + const models = providerStatus.modelCatalog.models + .filter((model) => !model.hidden) + .map((model) => model.launchModel.trim()) + .filter(Boolean); + return models.length > 0 ? 
models : null; +} + +function getRuntimeCatalogModelOption( + providerId: SupportedProviderId, + model: string, + providerStatus?: TeamModelRuntimeProviderStatus | null +): TeamRuntimeModelOption | null { + if (providerId !== 'codex' || providerStatus?.modelCatalog?.providerId !== 'codex') { + return null; + } + + const catalogModel = providerStatus.modelCatalog.models.find( + (item) => item.launchModel === model || item.id === model + ); + if (!catalogModel) { + return null; + } + + return { + value: catalogModel.launchModel, + label: + getProviderScopedTeamModelLabel(providerId, catalogModel.displayName) ?? + catalogModel.displayName, + badgeLabel: + catalogModel.badgeLabel ?? + (getTeamProviderModelOptions(providerId).some((option) => option.value === model) + ? undefined + : 'New'), + availabilityStatus: getRuntimeModelAvailability( + providerId, + catalogModel.launchModel, + providerStatus + ), + availabilityReason: getRuntimeModelAvailabilityReason(catalogModel.launchModel, providerStatus), + }; +} + function getRuntimeSelectorModels( providerId: SupportedProviderId, providerStatus?: TeamModelRuntimeProviderStatus | null @@ -108,6 +162,11 @@ function getRuntimeSelectorModels( return []; } + const catalogModels = getRuntimeCatalogModels(providerId, providerStatus); + if (catalogModels) { + return getVisibleTeamProviderModels(providerId, catalogModels, providerStatus); + } + return sortTeamProviderModels(providerId, providerStatus.models); } @@ -208,12 +267,18 @@ export function getAvailableTeamProviderModelOptions( const visibleModels = getRuntimeSelectorModels(providerId, providerStatus); return [ { value: '', label: 'Default', badgeLabel: 'Default' }, - ...visibleModels.map((model) => ({ - value: model, - label: getProviderScopedTeamModelLabel(providerId, model) ?? 
model, - availabilityStatus: getRuntimeModelAvailability(providerId, model, providerStatus), - availabilityReason: getRuntimeModelAvailabilityReason(model, providerStatus), - })), + ...visibleModels.map((model) => { + const catalogOption = getRuntimeCatalogModelOption(providerId, model, providerStatus); + if (catalogOption) { + return catalogOption; + } + return { + value: model, + label: getProviderScopedTeamModelLabel(providerId, model) ?? model, + availabilityStatus: getRuntimeModelAvailability(providerId, model, providerStatus), + availabilityReason: getRuntimeModelAvailabilityReason(model, providerStatus), + }; + }), ]; } diff --git a/src/renderer/utils/teamModelCatalog.ts b/src/renderer/utils/teamModelCatalog.ts index 94f72c49..0e0d1e2f 100644 --- a/src/renderer/utils/teamModelCatalog.ts +++ b/src/renderer/utils/teamModelCatalog.ts @@ -15,7 +15,10 @@ export { } from '@shared/utils/providerModelVisibility'; type SupportedProviderId = CliProviderId | TeamProviderId; -type RuntimeAwareProviderStatus = Pick; +type RuntimeAwareProviderStatus = Pick< + CliProviderStatus, + 'providerId' | 'authMethod' | 'backend' | 'modelCatalog' | 'runtimeCapabilities' +>; export interface TeamProviderModelOption { value: string; @@ -33,6 +36,8 @@ export const GPT_5_2_CODEX_UI_DISABLED_REASON = 'Temporarily disabled for team agents - this model is not currently available on the Codex native runtime.'; export const GPT_5_3_CODEX_SPARK_UI_DISABLED_REASON = 'Temporarily disabled for team agents - this model has been less reliable with bootstrap, task, and reply tool contracts.'; +export const CODEX_DYNAMIC_MODEL_REQUIRES_RUNTIME_SUPPORT_REASON = + 'Available in Codex, waiting for Agent Teams runtime support.'; const TEAM_PROVIDER_LABELS: Record = { anthropic: 'Anthropic', @@ -152,6 +157,13 @@ function getKnownTeamProviderModelOption( return TEAM_PROVIDER_MODEL_OPTIONS[providerId].find((option) => option.value === trimmed); } +function isKnownTeamProviderModel( + providerId: 
SupportedProviderId | undefined, + model: string | undefined +): boolean { + return Boolean(getKnownTeamProviderModelOption(providerId, model)); +} + export function getTeamProviderModelOptions( providerId: SupportedProviderId ): readonly TeamProviderModelOption[] { @@ -389,6 +401,18 @@ export function getRuntimeAwareTeamModelUiDisabledReason( return null; } + if ( + providerId === 'codex' && + providerStatus?.modelCatalog?.providerId === 'codex' && + providerStatus.modelCatalog.models.some( + (item) => item.launchModel === trimmed || item.id === trimmed + ) && + !isKnownTeamProviderModel(providerId, trimmed) && + providerStatus.runtimeCapabilities?.modelCatalog?.dynamic !== true + ) { + return CODEX_DYNAMIC_MODEL_REQUIRES_RUNTIME_SUPPORT_REASON; + } + return isRuntimeHiddenTeamModel(providerId, trimmed, providerStatus) ? GPT_5_1_CODEX_MAX_CHATGPT_UI_DISABLED_REASON : null; diff --git a/src/shared/types/cliInstaller.ts b/src/shared/types/cliInstaller.ts index 4f5a4fb5..027ccfc7 100644 --- a/src/shared/types/cliInstaller.ts +++ b/src/shared/types/cliInstaller.ts @@ -117,6 +117,57 @@ export interface CliProviderModelAvailability { checkedAt?: string | null; } +export type CliProviderReasoningEffort = 'none' | 'minimal' | 'low' | 'medium' | 'high' | 'xhigh'; + +export type CliProviderModelCatalogSource = 'app-server' | 'static-fallback'; +export type CliProviderModelCatalogStatus = 'ready' | 'stale' | 'degraded' | 'unavailable'; + +export interface CliProviderModelCatalogItem { + id: string; + launchModel: string; + displayName: string; + hidden: boolean; + supportedReasoningEfforts: CliProviderReasoningEffort[]; + defaultReasoningEffort: CliProviderReasoningEffort | null; + inputModalities: string[]; + supportsPersonality: boolean; + isDefault: boolean; + upgrade: boolean; + source: CliProviderModelCatalogSource; + badgeLabel?: string | null; + statusMessage?: string | null; +} + +export interface CliProviderModelCatalog { + schemaVersion: 1; + providerId: 
CliProviderId; + source: CliProviderModelCatalogSource; + status: CliProviderModelCatalogStatus; + fetchedAt: string; + staleAt: string; + defaultModelId: string | null; + defaultLaunchModel: string | null; + models: CliProviderModelCatalogItem[]; + diagnostics: { + configReadState: 'ready' | 'unsupported' | 'failed' | 'skipped'; + appServerState: 'healthy' | 'degraded' | 'runtime-missing' | 'incompatible'; + message?: string | null; + code?: string | null; + }; +} + +export interface CliProviderRuntimeCapabilities { + modelCatalog?: { + dynamic: boolean; + source?: CliProviderModelCatalogSource | 'runtime'; + }; + reasoningEffort?: { + supported: boolean; + values: CliProviderReasoningEffort[]; + configPassthrough?: boolean; + }; +} + export interface CliProviderStatus { providerId: CliProviderId; displayName: string; @@ -127,7 +178,9 @@ export interface CliProviderStatus { modelVerificationState?: 'idle' | 'verifying' | 'verified'; statusMessage?: string | null; models: string[]; + modelCatalog?: CliProviderModelCatalog | null; modelAvailability?: CliProviderModelAvailability[]; + runtimeCapabilities?: CliProviderRuntimeCapabilities | null; canLoginFromUi: boolean; capabilities: { teamLaunch: boolean; diff --git a/src/shared/types/team.ts b/src/shared/types/team.ts index c5f9499b..9eb2ee92 100644 --- a/src/shared/types/team.ts +++ b/src/shared/types/team.ts @@ -782,10 +782,23 @@ export interface TeamViewSnapshot { isAlive?: boolean; } -export type EffortLevel = 'low' | 'medium' | 'high'; +export type EffortLevel = 'none' | 'minimal' | 'low' | 'medium' | 'high' | 'xhigh'; export type TeamProviderId = 'anthropic' | 'codex' | 'gemini'; export type TeamProviderBackendId = 'auto' | 'adapter' | 'api' | 'cli-sdk' | 'codex-native'; +export interface ProviderModelLaunchIdentity { + providerId: TeamProviderId; + providerBackendId: TeamProviderBackendId | null; + selectedModel: string | null; + selectedModelKind: 'default' | 'explicit'; + resolvedLaunchModel: string | null; 
+ catalogId: string | null; + catalogSource: 'app-server' | 'static-fallback' | 'runtime' | 'unavailable'; + catalogFetchedAt: string | null; + selectedEffort: EffortLevel | null; + resolvedEffort: EffortLevel | null; +} + export interface TeamLaunchRequest { teamName: string; cwd: string; diff --git a/src/shared/utils/effortLevels.ts b/src/shared/utils/effortLevels.ts new file mode 100644 index 00000000..958d9c03 --- /dev/null +++ b/src/shared/utils/effortLevels.ts @@ -0,0 +1,57 @@ +import type { EffortLevel, TeamProviderId } from '@shared/types/team'; + +export const TEAM_EFFORT_LEVELS = [ + 'none', + 'minimal', + 'low', + 'medium', + 'high', + 'xhigh', +] as const satisfies readonly EffortLevel[]; + +export const LEGACY_TEAM_EFFORT_LEVELS = [ + 'low', + 'medium', + 'high', +] as const satisfies readonly EffortLevel[]; + +export const CODEX_TEAM_EFFORT_LEVELS = [ + 'minimal', + 'low', + 'medium', + 'high', + 'xhigh', +] as const satisfies readonly EffortLevel[]; + +const LEGACY_TEAM_EFFORT_LEVEL_SET = new Set(LEGACY_TEAM_EFFORT_LEVELS); +const CODEX_TEAM_EFFORT_LEVEL_SET = new Set(CODEX_TEAM_EFFORT_LEVELS); + +export function isTeamEffortLevel(value: unknown): value is EffortLevel { + return typeof value === 'string' && TEAM_EFFORT_LEVELS.includes(value as EffortLevel); +} + +export function formatEffortLevelList(): string { + return TEAM_EFFORT_LEVELS.join(', '); +} + +export function isTeamEffortLevelForProvider( + value: unknown, + providerId?: TeamProviderId | null +): value is EffortLevel { + if (!isTeamEffortLevel(value)) { + return false; + } + + if (providerId === 'codex') { + return CODEX_TEAM_EFFORT_LEVEL_SET.has(value); + } + + return LEGACY_TEAM_EFFORT_LEVEL_SET.has(value); +} + +export function formatEffortLevelListForProvider(providerId?: TeamProviderId | null): string { + if (providerId === 'codex') { + return CODEX_TEAM_EFFORT_LEVELS.join(', '); + } + return LEGACY_TEAM_EFFORT_LEVELS.join(', '); +} diff --git 
a/test/main/services/team/TeamProvisioningService.test.ts b/test/main/services/team/TeamProvisioningService.test.ts index d18c5f8d..948b6995 100644 --- a/test/main/services/team/TeamProvisioningService.test.ts +++ b/test/main/services/team/TeamProvisioningService.test.ts @@ -46,6 +46,46 @@ vi.mock('@main/services/team/TeamTaskReader', () => ({ })); vi.mock('@main/utils/childProcess', () => ({ + execCli: vi.fn(async (_binaryPath: string | null, args: string[]) => { + if (args[0] === 'model') { + return { + stdout: JSON.stringify({ + schemaVersion: 1, + providers: { + codex: { + defaultModel: 'gpt-5.4', + models: [{ id: 'gpt-5.4', label: 'GPT-5.4', description: 'Codex default' }], + }, + gemini: { + defaultModel: 'gemini-2.5-pro', + models: [{ id: 'gemini-2.5-pro', label: 'Gemini 2.5 Pro', description: 'Default' }], + }, + }, + }), + stderr: '', + }; + } + if (args[0] === 'runtime') { + return { + stdout: JSON.stringify({ + providers: { + codex: { + runtimeCapabilities: { + modelCatalog: { dynamic: false, source: 'runtime' }, + reasoningEffort: { + supported: true, + values: ['low', 'medium', 'high'], + configPassthrough: false, + }, + }, + }, + }, + }), + stderr: '', + }; + } + return { stdout: '', stderr: '' }; + }), spawnCli: vi.fn(), killProcessTree: vi.fn(), })); diff --git a/test/main/services/team/TeamProvisioningServicePrompts.test.ts b/test/main/services/team/TeamProvisioningServicePrompts.test.ts index d0e3e0e7..8968fdd3 100644 --- a/test/main/services/team/TeamProvisioningServicePrompts.test.ts +++ b/test/main/services/team/TeamProvisioningServicePrompts.test.ts @@ -24,6 +24,46 @@ vi.mock('@main/services/team/ClaudeBinaryResolver', () => ({ })); vi.mock('@main/utils/childProcess', () => ({ + execCli: vi.fn(async (_binaryPath: string | null, args: string[]) => { + if (args[0] === 'model') { + return { + stdout: JSON.stringify({ + schemaVersion: 1, + providers: { + codex: { + defaultModel: 'gpt-5.4', + models: [{ id: 'gpt-5.4', label: 'GPT-5.4', description: 
'Codex default' }], + }, + gemini: { + defaultModel: 'gemini-2.5-pro', + models: [{ id: 'gemini-2.5-pro', label: 'Gemini 2.5 Pro', description: 'Default' }], + }, + }, + }), + stderr: '', + }; + } + if (args[0] === 'runtime') { + return { + stdout: JSON.stringify({ + providers: { + codex: { + runtimeCapabilities: { + modelCatalog: { dynamic: false, source: 'runtime' }, + reasoningEffort: { + supported: true, + values: ['low', 'medium', 'high'], + configPassthrough: false, + }, + }, + }, + }, + }), + stderr: '', + }; + } + return { stdout: '', stderr: '' }; + }), spawnCli: vi.fn(), killProcessTree: vi.fn(), })); @@ -45,7 +85,7 @@ import { TeamProvisioningService, } from '@main/services/team/TeamProvisioningService'; import { ClaudeBinaryResolver } from '@main/services/team/ClaudeBinaryResolver'; -import { spawnCli } from '@main/utils/childProcess'; +import { execCli, spawnCli } from '@main/utils/childProcess'; import { setAppDataBasePath } from '@main/utils/pathDecoder'; function createFakeChild() { @@ -314,6 +354,72 @@ describe('TeamProvisioningService prompt content (solo mode discipline)', () => await svc.cancelProvisioning(runId); }); + it('blocks Codex xhigh launch effort until runtime exposes reasoning config passthrough', async () => { + vi.mocked(ClaudeBinaryResolver.resolve).mockResolvedValue('/fake/codex'); + vi.mocked(spawnCli).mockReset(); + + const svc = new TeamProvisioningService(); + (svc as any).buildProvisioningEnv = vi.fn(async () => ({ + env: {}, + authSource: 'codex_runtime', + providerArgs: [], + })); + (svc as any).validateAgentTeamsMcpRuntime = vi.fn(async () => {}); + (svc as any).startFilesystemMonitor = vi.fn(); + (svc as any).pathExists = vi.fn(async () => false); + + await expect( + svc.createTeam( + { + teamName: 'codex-xhigh-blocked', + cwd: process.cwd(), + members: [], + providerId: 'codex', + effort: 'xhigh', + }, + () => {} + ) + ).rejects.toThrow('does not expose Codex reasoning config passthrough yet'); + + 
expect(spawnCli).not.toHaveBeenCalled(); + }); + + it('blocks future Codex catalog models until runtime declares dynamic launch support', async () => { + vi.mocked(ClaudeBinaryResolver.resolve).mockResolvedValue('/fake/codex'); + vi.mocked(spawnCli).mockReset(); + + const svc = new TeamProvisioningService(); + (svc as any).buildProvisioningEnv = vi.fn(async () => ({ + env: {}, + authSource: 'codex_runtime', + providerArgs: [], + })); + (svc as any).validateAgentTeamsMcpRuntime = vi.fn(async () => {}); + (svc as any).startFilesystemMonitor = vi.fn(); + (svc as any).pathExists = vi.fn(async () => false); + + await expect( + svc.createTeam( + { + teamName: 'codex-future-model-blocked', + cwd: process.cwd(), + members: [], + providerId: 'codex', + model: 'gpt-5.5', + effort: 'medium', + }, + () => {} + ) + ).rejects.toThrow('does not declare dynamic Codex model launch support yet'); + + expect(execCli).toHaveBeenCalledWith( + '/fake/codex', + ['runtime', 'status', '--json', '--provider', 'codex'], + expect.objectContaining({ cwd: process.cwd() }) + ); + expect(spawnCli).not.toHaveBeenCalled(); + }); + it('restart teammate message keeps the exact teammate identity and avoids duplicate semantics', () => { const message = buildRestartMemberSpawnMessage('forge-labs', 'Forge Labs', 'lead', { name: 'alice',