agent-ecosystem/docs/research/codex-native-runtime-integration-decision.md

9.6 KiB

Codex Native Runtime Integration Decision

Status: Decision
Date: 2026-04-19
Owner repos:

  • claude_team
  • agent_teams_orchestrator
  • plugin-kit-ai

Purpose

Record the chosen direction for improving Codex integration in the multimodel runtime without losing native Codex capabilities such as plugins, skills, and MCP.

Chosen Plan Assessment

  • Chosen plan: normalized internal event/log layer plus staged Codex-native backend lane
  • Assessment: 🎯 9 🛡️ 9 🧠 7
  • Estimated first serious wave: 2200-4500 lines across agent_teams_orchestrator, claude_team, and plugin-kit-ai

Current Reality

Today, Codex inside our multimodel runtime is not executed through the real Codex runtime.

Instead, the current path is:

  • claude_team
  • agent_teams_orchestrator
  • internal Codex backend
  • OpenAI Responses API

In practice this means:

  • the orchestrator keeps Anthropic-style streaming semantics
  • Codex is treated as a model backend, not as a native runtime
  • native Codex plugins are not honestly end-to-end supported
  • current Codex capability support is limited by our adapter, not by the real Codex runtime

What We Learned

After deep code and docs analysis, the most important conclusions are:

  1. @openai/codex-sdk and codex exec --json are the real official execution seam for embedded Codex runtime usage.
  2. codex exec supports API-key mode, so API-key mode itself is not the blocker.
  3. Codex native plugins, apps, skills, and MCP are part of the real Codex runtime flow.
  4. Our current agent_teams_orchestrator query loop is deeply coupled to Anthropic-style events and tool semantics.
  5. A full drop-in swap from the current Codex adapter to @openai/codex-sdk / codex exec would not be a safe transport-only change. It would change runtime semantics.
  6. plugin-kit-ai is a good fit for plugin management and native plugin placement.
  7. codex app-server is promising for richer control-plane features, but should not be the foundation of the first production rollout for plugin management.

Chosen Direction

We will not force Codex into the current Anthropic-shaped runtime contract.

We will instead:

  • add a new internal normalized event/log layer
  • keep execution semantics provider-native where needed
  • add a separate Codex-native runtime lane
  • use plugin-kit-ai for plugin management and native plugin placement

In practical terms:

  • current Codex path stays available as the fallback/default path at first
  • real Codex runtime execution becomes a separate lane instead of a drop-in replacement
  • unified logs come from normalization, not from pretending every provider has Anthropic-native runtime semantics

Decision Summary

We are doing this

  • keep the current Codex adapter path as the fallback/default path initially
  • introduce a new Codex-native backend lane using @openai/codex-sdk / codex exec
  • introduce a normalized internal event/log format for all providers
  • map Anthropic, Gemini, and future Codex-native events into that normalized format
  • keep unified logging, transcript projection, analytics, and UI-facing event handling on top of the normalized layer
  • use plugin-kit-ai for:
    • install
    • update
    • remove
    • repair
    • discover
    • catalog
    • native Codex plugin placement through native marketplace/filesystem layout

We are not doing this

  • not replacing the whole multimodel runtime in one shot
  • not forcing real Codex runtime execution into fake Anthropic transport semantics
  • not pretending a full @openai/codex-sdk / codex exec swap is a drop-in backend replacement
  • not making app-server plugin/* the first production seam

Why We Chose This

Main benefit

This path gives us both:

  • unified internal logs/events
  • a real path to native Codex runtime capabilities

without requiring a full rewrite of the current multimodel runtime.

Main reason against a direct full swap

The current orchestrator is deeply coupled to Anthropic-shaped runtime behavior:

  • tool_use
  • tool_result
  • content_block_start
  • input_json_delta
  • message_delta
  • current permission and sandbox flow
  • current synthetic tool/result handling
  • current transcript persistence and resume logic

codex exec emits a different event model:

  • thread.started
  • turn.started
  • turn.completed
  • turn.failed
  • item.started
  • item.updated
  • item.completed

and item types such as:

  • agent_message
  • reasoning
  • command_execution
  • file_change
  • mcp_tool_call

That is not just a different wire format. It is a different runtime shape.

What Changes Per Repo

agent_teams_orchestrator

This repo takes the biggest change.

We want to:

  • introduce a provider-neutral normalized event/log model
  • add adapter mappers from current Anthropic/Gemini style streams into that model
  • add a separate Codex-native backend lane through @openai/codex-sdk / codex exec
  • keep the current Codex adapter path alive as fallback during migration
  • avoid forcing codex exec events into fake tool_use/tool_result transport semantics

We do not want to:

  • replace the current Codex backend in one shot
  • rewrite all providers around Codex-native semantics
  • make transcript/log normalization depend on Anthropic wire events

claude_team

This repo should stay relatively stable compared with the orchestrator.

We want to:

  • keep one multimodel runtime concept
  • stay capability-aware per provider/backend lane
  • consume normalized runtime/log DTOs rather than assuming one provider-shaped event model
  • integrate plugin management through plugin-kit-ai
  • keep Codex plugin support gated behind the real Codex-native lane

We do not want to:

  • invent a fake Codex plugin support state while execution still goes through the old adapter lane
  • force UI logic to infer runtime truth from provider labels alone

plugin-kit-ai

This repo remains the management engine, not the execution engine.

We want to:

  • use it for catalog
  • use it for discover
  • use it for install/update/remove/repair
  • use it for native Codex plugin placement through native marketplace/filesystem layout

We do not want to:

  • make it responsible for running Codex plugins inside sessions
  • blur installation and execution into one concern

Target Architecture

Runtime execution

  • Anthropic can continue on the current path for now
  • Gemini can continue on the current path for now
  • Codex-native gets a dedicated backend lane through @openai/codex-sdk / codex exec

Internal normalization

All runtime backends must project into a shared internal event/log model.

The normalized layer should represent concepts such as:

  • turn started
  • assistant text
  • reasoning
  • command execution
  • MCP call
  • file change
  • approval request
  • turn completed
  • turn failed

The normalized format is the source of truth for:

  • logs
  • transcript projection
  • analytics
  • UI-facing activity/event summaries

The normalized format is not required to preserve provider-native wire semantics.

Codex Plugins Strategy

For Codex plugins we want:

  • native Codex runtime execution
  • native Codex marketplace/filesystem placement
  • provider-aware plugin management in claude_team

Therefore:

  • plugin-kit-ai is the management engine
  • real Codex runtime is the execution engine

This is important because plugin installation and plugin execution are different concerns.

Installing a native Codex plugin is not enough by itself if the session still runs through our current Responses API adapter path.

App Server Position

codex app-server remains relevant, but not as the first critical path for this migration.

It is better positioned as a later control-plane enhancement for things like:

  • auth state
  • MCP status and OAuth flows
  • skills/config inspection
  • external config import

For the first production rollout, it should not be the hard dependency for plugin lifecycle management.

Implementation Phases

Phase 1

  • design and introduce the normalized internal event/log layer
  • keep current backends working
  • define the internal mapping contract clearly

Phase 2

  • add a Codex-native backend lane through @openai/codex-sdk / codex exec
  • keep the current Codex adapter as fallback
  • validate API-key mode, working directory behavior, sandbox mode, approval policy, thread resume, and streaming

Phase 3

  • integrate plugin-kit-ai for provider-aware plugin management
  • add native Codex plugin placement through native marketplace/filesystem model
  • keep current UI provider-aware and capability-aware

Phase 4

  • optionally add selective codex app-server control-plane integration where it provides clear value

Main Risks And Guardrails

Risk 1 - treating codex-sdk/exec as a transport-only swap

This is the most dangerous mistake.

Guardrail:

  • treat Codex-native as a separate runtime lane
  • normalize logs/events above it
  • do not assume the current Anthropic-shaped tool loop can be preserved unchanged

Risk 2 - claiming Codex plugin support too early

Installing native Codex plugins is not enough if execution still runs through the current adapter path.

Guardrail:

  • only advertise Codex plugin support when the session actually runs through the Codex-native lane

Risk 3 - overcommitting to app-server too early

codex app-server is useful, but it should not become a hard dependency for the first production plugin rollout.

Guardrail:

  • use it later for selective control-plane features
  • do not block the first migration on app-server plugin/*

Practical Rule

If we need unified logs, we normalize events.

If we need native Codex capabilities, we do not fake Codex into Anthropic runtime semantics.

That is the core architectural rule for this migration.