# Codex Native Runtime Integration Decision

**Status**: Decision
**Date**: 2026-04-19
**Owner repos**:

- `claude_team`
- `agent_teams_orchestrator`
- `plugin-kit-ai`

## Purpose

Record the chosen direction for improving Codex integration in the multimodel runtime without losing native Codex capabilities such as plugins, skills, and MCP.

## Chosen Plan Assessment

- Chosen plan: a normalized internal event/log layer plus a staged `Codex-native` backend lane
- Assessment: `🎯 9 🛡️ 9 🧠 7`
- Estimated size of the first serious wave: `2200-4500` lines across `agent_teams_orchestrator`, `claude_team`, and `plugin-kit-ai`

## Current Reality

Today, `Codex` inside our multimodel runtime is **not** executed through the real Codex runtime.

Instead, the current path is:

- `claude_team`
- `agent_teams_orchestrator`
- internal Codex backend
- OpenAI Responses API

In practice this means:

- the orchestrator keeps Anthropic-style streaming semantics
- `Codex` is treated as a model backend, not as a native runtime
- native Codex plugins are not genuinely supported end to end
- current `Codex` capability support is limited by our adapter, not by the real Codex runtime

## What We Learned

After a deep analysis of the code and docs, the most important conclusions are:

1. `@openai/codex-sdk` and `codex exec --json` are the real official execution seam for embedded Codex runtime usage.
2. `codex exec` supports API-key mode, so API-key mode itself is not the blocker.
3. `Codex` native plugins, apps, skills, and MCP are part of the real Codex runtime flow.
4. Our current `agent_teams_orchestrator` query loop is deeply coupled to Anthropic-style events and tool semantics.
5. A full drop-in swap from the current Codex adapter to `@openai/codex-sdk` / `codex exec` would not be a safe transport-only change. It would change runtime semantics.
6. `plugin-kit-ai` is a good fit for plugin management and native plugin placement.
7. `codex app-server` is promising for richer control-plane features, but should not be the foundation of the first production rollout for plugin management.

## Chosen Direction

We will **not** force Codex into the current Anthropic-shaped runtime contract.

We will instead:

- add a new **internal normalized event/log layer**
- keep execution semantics provider-native where needed
- add a separate **Codex-native runtime lane**
- use `plugin-kit-ai` for plugin management and native plugin placement

In practical terms:

- current Codex path stays available as the fallback/default path at first
- real Codex runtime execution becomes a separate lane instead of a drop-in replacement
- unified logs come from normalization, not from pretending every provider has Anthropic-native runtime semantics
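
The "separate lane with the adapter as fallback" idea can be sketched as a small selection function. This is a minimal illustration only: the `Lane` names, `LaneConfig` fields, and `select_codex_lane` are hypothetical and do not exist in any of the owner repos.

```python
from dataclasses import dataclass
from enum import Enum


class Lane(Enum):
    """Execution lanes for the Codex provider (illustrative names)."""
    CODEX_ADAPTER = "codex-adapter"  # current Responses API adapter path
    CODEX_NATIVE = "codex-native"    # real Codex runtime lane


@dataclass(frozen=True)
class LaneConfig:
    # Hypothetical settings; the real configuration surface is not defined here.
    native_lane_enabled: bool = False       # opt-in while the adapter stays the default
    native_runtime_available: bool = False  # e.g. the result of a startup health check


def select_codex_lane(config: LaneConfig) -> Lane:
    """Pick the native lane only when it is both enabled and available."""
    if config.native_lane_enabled and config.native_runtime_available:
        return Lane.CODEX_NATIVE
    # Fallback/default: the existing adapter path keeps working unchanged.
    return Lane.CODEX_ADAPTER
```

With the default `LaneConfig()`, the function returns the adapter lane, which matches the "fallback/default path at first" rule.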

## Decision Summary

### We are doing this

- keep the current Codex adapter path as the fallback/default path initially
- introduce a new `Codex-native` backend lane using `@openai/codex-sdk` / `codex exec`
- introduce a normalized internal event/log format for all providers
- map Anthropic, Gemini, and future Codex-native events into that normalized format
- keep unified logging, transcript projection, analytics, and UI-facing event handling on top of the normalized layer
- use `plugin-kit-ai` for:
  - install
  - update
  - remove
  - repair
  - discover
  - catalog
  - native Codex plugin placement through native marketplace/filesystem layout

### We are not doing this

- not replacing the whole multimodel runtime in one shot
- not forcing real Codex runtime execution into fake Anthropic transport semantics
- not pretending a full `@openai/codex-sdk` / `codex exec` swap is a drop-in backend replacement
- not making `app-server plugin/*` the first production seam

## Why We Chose This

### Main benefit

This path gives us both:

- unified internal logs/events
- a real path to native Codex runtime capabilities

without requiring a full rewrite of the current multimodel runtime.

### Main reason against a direct full swap

The current orchestrator is deeply coupled to Anthropic-shaped runtime behavior:

- `tool_use`
- `tool_result`
- `content_block_start`
- `input_json_delta`
- `message_delta`
- current permission and sandbox flow
- current synthetic tool/result handling
- current transcript persistence and resume logic

`codex exec` emits a different event model:

- `thread.started`
- `turn.started`
- `turn.completed`
- `turn.failed`
- `item.started`
- `item.updated`
- `item.completed`

and item types such as:

- `agent_message`
- `reasoning`
- `command_execution`
- `file_change`
- `mcp_tool_call`

That is not just a different wire format. It is a different runtime shape.
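
To make the thread/turn/item shape concrete, here is a minimal sketch of a reader for a `codex exec --json` stream. It assumes that each output line is one JSON object with a `type` field, and that item events nest the item under an `item` key with its own `type`. Those field names are assumptions to verify against real CLI output. Only the event and item type names come from the lists above.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple

# Event type names as listed above.
THREAD_TURN_EVENTS = {"thread.started", "turn.started", "turn.completed", "turn.failed"}
ITEM_EVENTS = {"item.started", "item.updated", "item.completed"}


def summarize_codex_events(lines: Iterable[str]) -> Iterator[Tuple[str, Optional[str]]]:
    """Yield (event_type, item_type) pairs from a JSONL stream.

    Assumption: every non-empty line is a JSON object with a "type" field,
    and item events carry the item under "item" with its own "type".
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        event_type = event.get("type", "")
        if event_type in ITEM_EVENTS:
            item = event.get("item") or {}
            yield event_type, item.get("type")
        elif event_type in THREAD_TURN_EVENTS:
            yield event_type, None
        else:
            # Surface unknown events instead of dropping them, so schema drift stays visible.
            yield "unknown:" + event_type, None


# Hand-written sample lines, not captured CLI output.
sample = [
    '{"type": "thread.started"}',
    '{"type": "item.completed", "item": {"type": "command_execution"}}',
    '{"type": "turn.completed"}',
]
events = list(summarize_codex_events(sample))
```

Nothing in this loop resembles a `tool_use` / `tool_result` pair: the stream is organized around items inside turns, which is why a mapper into a neutral model fits better than an emulation of Anthropic block events.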

## What Changes Per Repo

### `agent_teams_orchestrator`

This repo takes the biggest change.

We want to:

- introduce a provider-neutral normalized event/log model
- add adapter mappers from current Anthropic/Gemini style streams into that model
- add a separate `Codex-native` backend lane through `@openai/codex-sdk` / `codex exec`
- keep the current Codex adapter path alive as fallback during migration
- avoid forcing `codex exec` events into fake `tool_use/tool_result` transport semantics

We do **not** want to:

- replace the current Codex backend in one shot
- rewrite all providers around Codex-native semantics
- make transcript/log normalization depend on Anthropic wire events
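
An adapter mapper for the Anthropic-style stream could look roughly like the sketch below. It folds `content_block_start`, `content_block_delta` (`text_delta` / `input_json_delta`), and `content_block_stop` events into coarse normalized events. The output dict shape (`{"kind": ...}`) and the function name are hypothetical, and plain dicts are used only for brevity.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def map_anthropic_stream(events: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Fold Anthropic-style block events into coarse normalized events."""
    blocks: Dict[int, Dict[str, Any]] = {}
    for event in events:
        etype = event.get("type")
        if etype == "content_block_start":
            block = event["content_block"]
            blocks[event["index"]] = {
                "type": block["type"],
                "name": block.get("name"),
                "buffer": block.get("text", ""),
            }
        elif etype == "content_block_delta":
            delta = event["delta"]
            state = blocks[event["index"]]
            if delta["type"] == "text_delta":
                state["buffer"] += delta["text"]
            elif delta["type"] == "input_json_delta":
                state["buffer"] += delta["partial_json"]
        elif etype == "content_block_stop":
            state = blocks.pop(event["index"])
            if state["type"] == "text":
                yield {"kind": "assistant_text", "text": state["buffer"]}
            elif state["type"] == "tool_use":
                # Tool input arrives as JSON fragments; parse once the block closes.
                args = json.loads(state["buffer"] or "{}")
                yield {"kind": "tool_call", "name": state["name"], "input": args}


# Hand-written sample stream for illustration.
stream = [
    {"type": "content_block_start", "index": 0, "content_block": {"type": "text", "text": ""}},
    {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Run"}},
    {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "ning"}},
    {"type": "content_block_stop", "index": 0},
    {"type": "content_block_start", "index": 1,
     "content_block": {"type": "tool_use", "id": "t1", "name": "bash", "input": {}}},
    {"type": "content_block_delta", "index": 1,
     "delta": {"type": "input_json_delta", "partial_json": '{"cmd": '}},
    {"type": "content_block_delta", "index": 1,
     "delta": {"type": "input_json_delta", "partial_json": '"ls"}'}},
    {"type": "content_block_stop", "index": 1},
]
normalized = list(map_anthropic_stream(stream))
```

The mapper sits on top of the provider stream and only produces normalized output. The normalized model never needs to know that `input_json_delta` exists, which keeps the dependency pointing in the intended direction.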

### `claude_team`

This repo should stay relatively stable compared with the orchestrator.

We want to:

- keep one multimodel runtime concept
- stay capability-aware per provider/backend lane
- consume normalized runtime/log DTOs rather than assuming one provider-shaped event model
- integrate plugin management through `plugin-kit-ai`
- keep Codex plugin support gated behind the real Codex-native lane

We do **not** want to:

- invent a fake Codex plugin support state while execution still goes through the old adapter lane
- force UI logic to infer runtime truth from provider labels alone
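
A capability-aware DTO makes the last point easier to see. In the sketch below, two sessions carry the same provider label but sit on different lanes, and the UI decision reads the reported capability instead of the label. `RuntimeLaneInfo`, its fields, and the lane names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuntimeLaneInfo:
    """Hypothetical DTO describing the lane a session actually runs on."""
    provider: str         # display label, e.g. "codex"
    lane: str             # e.g. "codex-adapter" or "codex-native"
    native_plugins: bool  # reported by the backend lane, not derived from the label


def show_codex_plugin_ui(info: RuntimeLaneInfo) -> bool:
    """Decide from the reported capability, never from the provider label."""
    return info.native_plugins


adapter_session = RuntimeLaneInfo("codex", "codex-adapter", native_plugins=False)
native_session = RuntimeLaneInfo("codex", "codex-native", native_plugins=True)
```

Both sessions report `provider == "codex"`, so a label-based check would treat them the same. Only the capability flag tells them apart.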

### `plugin-kit-ai`

This repo remains the management engine, not the execution engine.

We want to:

- use it for catalog
- use it for discover
- use it for install/update/remove/repair
- use it for native Codex plugin placement through native marketplace/filesystem layout

We do **not** want to:

- make it responsible for running Codex plugins inside sessions
- blur installation and execution into one concern

## Target Architecture

### Runtime execution

- `Anthropic` can continue on the current path for now
- `Gemini` can continue on the current path for now
- `Codex-native` gets a dedicated backend lane through `@openai/codex-sdk` / `codex exec`

### Internal normalization

All runtime backends must project into a shared internal event/log model.

The normalized layer should represent concepts such as:

- turn started
- assistant text
- reasoning
- command execution
- MCP call
- file change
- approval request
- turn completed
- turn failed

The normalized format is the source of truth for:

- logs
- transcript projection
- analytics
- UI-facing activity/event summaries

The normalized format is **not** required to preserve provider-native wire semantics.
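
One possible shape for the shared model is sketched below. The `EventKind` members mirror the concept list above. `NormalizedEvent`, its fields, and the mapping table are illustrative assumptions, not an agreed contract.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, Optional


class EventKind(Enum):
    TURN_STARTED = "turn_started"
    ASSISTANT_TEXT = "assistant_text"
    REASONING = "reasoning"
    COMMAND_EXECUTION = "command_execution"
    MCP_CALL = "mcp_call"
    FILE_CHANGE = "file_change"
    APPROVAL_REQUEST = "approval_request"
    TURN_COMPLETED = "turn_completed"
    TURN_FAILED = "turn_failed"


@dataclass(frozen=True)
class NormalizedEvent:
    """Hypothetical provider-neutral event consumed by logs, transcripts, and UI."""
    kind: EventKind
    provider: str  # e.g. "anthropic", "gemini", "codex"
    lane: str      # which backend lane produced the event
    payload: Dict[str, Any] = field(default_factory=dict)


# Codex item types (from this document) mapped onto normalized kinds.
CODEX_ITEM_KINDS = {
    "agent_message": EventKind.ASSISTANT_TEXT,
    "reasoning": EventKind.REASONING,
    "command_execution": EventKind.COMMAND_EXECUTION,
    "file_change": EventKind.FILE_CHANGE,
    "mcp_tool_call": EventKind.MCP_CALL,
}


def normalize_codex_item(item_type: str, payload: Dict[str, Any]) -> Optional[NormalizedEvent]:
    """Return a normalized event, or None for item types the table does not cover."""
    kind = CODEX_ITEM_KINDS.get(item_type)
    if kind is None:
        return None
    return NormalizedEvent(kind, "codex", "codex-native", payload)
```

The payload stays an opaque dict here on purpose: the model names the concept (`MCP_CALL`, `FILE_CHANGE`) without promising to reproduce any provider's wire format.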

## Codex Plugins Strategy

For Codex plugins we want:

- native Codex runtime execution
- native Codex marketplace/filesystem placement
- provider-aware plugin management in `claude_team`

Therefore:

- `plugin-kit-ai` is the management engine
- real Codex runtime is the execution engine

This is important because plugin installation and plugin execution are different concerns.

Installing a native Codex plugin is not enough by itself if the session still runs through our current Responses API adapter path.
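
The split between installation and execution can be modeled as a three-way state instead of a single "supported" flag. The sketch below is illustrative: `PluginState`, `codex_plugin_state`, and the `"codex-native"` lane string are hypothetical names.

```python
from enum import Enum


class PluginState(Enum):
    NOT_INSTALLED = "not_installed"
    INSTALLED_INACTIVE = "installed_inactive"  # placed on disk, but this session cannot run it
    ACTIVE = "active"                          # installed and the session runs on the native lane


def codex_plugin_state(installed: bool, session_lane: str) -> PluginState:
    """Combine the management result with the lane the session actually uses."""
    if not installed:
        return PluginState.NOT_INSTALLED
    if session_lane == "codex-native":
        return PluginState.ACTIVE
    return PluginState.INSTALLED_INACTIVE
```

An installed plugin in an adapter-lane session resolves to `INSTALLED_INACTIVE`, which gives the UI an honest state to show instead of implying support.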

## App Server Position

`codex app-server` remains relevant, but not as the first critical path for this migration.

It is better positioned as a later control-plane enhancement for things like:

- auth state
- MCP status and OAuth flows
- skills/config inspection
- external config import

For the first production rollout, it should not be the hard dependency for plugin lifecycle management.

## Implementation Phases

### Phase 1

- design and introduce the normalized internal event/log layer
- keep current backends working
- define the internal mapping contract clearly

### Phase 2

- add a `Codex-native` backend lane through `@openai/codex-sdk` / `codex exec`
- keep the current Codex adapter as fallback
- validate API-key mode, working directory behavior, sandbox mode, approval policy, thread resume, and streaming
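
For the Phase 2 validation runs, a small argv builder keeps the invocation explicit and easy to test. The sketch below only assembles the argument list and does not run anything. `--json` is named in this document. The `--cd` and `--sandbox` flags are assumptions that should be confirmed against `codex exec --help` for the installed CLI version. API-key handling, approval policy, and thread resume are deliberately left out.

```python
from typing import List, Optional


def build_codex_exec_argv(
    prompt: str,
    workdir: Optional[str] = None,
    sandbox: Optional[str] = None,
) -> List[str]:
    """Assemble a `codex exec --json` command line for one validation run."""
    argv = ["codex", "exec", "--json"]
    if workdir is not None:
        # Assumed flag for the working directory; verify before relying on it.
        argv += ["--cd", workdir]
    if sandbox is not None:
        # Assumed flag for the sandbox mode; verify before relying on it.
        argv += ["--sandbox", sandbox]
    argv.append(prompt)
    return argv
```

Returning a list instead of a shell string avoids quoting problems when the prompt contains spaces or shell metacharacters, and the result can be passed directly to `subprocess.Popen`.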

### Phase 3

- integrate `plugin-kit-ai` for provider-aware plugin management
- add native Codex plugin placement through the native marketplace/filesystem model
- keep current UI provider-aware and capability-aware

### Phase 4

- optionally add selective `codex app-server` control-plane integration where it provides clear value

## Main Risks And Guardrails

### Risk 1 - treating `codex-sdk/exec` as a transport-only swap

This is the most dangerous mistake.

Guardrail:

- treat `Codex-native` as a separate runtime lane
- normalize logs/events above it
- do not assume the current Anthropic-shaped tool loop can be preserved unchanged

### Risk 2 - claiming Codex plugin support too early

Installing native Codex plugins is not enough if execution still runs through the current adapter path.

Guardrail:

- only advertise Codex plugin support when the session actually runs through the Codex-native lane

### Risk 3 - overcommitting to `app-server` too early

`codex app-server` is useful, but it should not become a hard dependency for the first production plugin rollout.

Guardrail:

- use it later for selective control-plane features
- do not block the first migration on `app-server plugin/*`

## Practical Rule

If we need **unified logs**, we normalize events.

If we need **native Codex capabilities**, we do not fake Codex into Anthropic runtime semantics.

That is the core architectural rule for this migration.