admin-valentin/agent-ecosystem

Fork 0

777genius ba37c1caf5 feat(runtime): add codex-native phase 0 app integration

2026-04-19 19:33:37 +03:00

6.2 KiB

Raw Blame History

Codex Native Runtime - Phase 0 Sign-off Evidence

Captured on 2026-04-19.

This file is the repo-visible evidence package referenced by:

codex-native-runtime-phase-0-implementation-spec.md

Verdict

Phase 0 sign-off evidence is now captured.

What this proves:

the codex-native lane executes through the raw codex exec --json seam
persisted transcript projection remains parseable by current claude_team readers
ephemeral and persistent runs keep different history-completeness truth
thread status, warning attribution, executable identity, and usage authority survive end-to-end
old Codex lane fallback truth remains covered by targeted regression tests

What this does not mean:

codex-native should be unlocked for general runtime selection
auto should start resolving to codex-native
broader plugin or interactive capability claims are now safe

Command Package

`agent_teams_orchestrator`

Executed:

bun test src/services/codexNative/signOffHarness.test.ts \
  src/services/codexNative/statusAuthority.test.ts \
  src/services/codexNative/transcriptProjector.test.ts \
  src/services/codexNative/turnExecutor.test.ts \
  src/services/codexNative/execRunner.test.ts \
  src/services/codexNative/jsonlMapper.test.ts \
  src/services/runtimeBackends/codexBackendResolver.test.ts \
  src/services/runtimeBackends/registry.agentTeams.test.ts

Observed result:

27 pass
0 fail

`claude_team`

Executed:

pnpm exec vitest run \
  test/main/utils/jsonl.test.ts \
  test/main/services/parsing/SessionParser.test.ts \
  test/main/services/team/BoardTaskExactLogStrictParser.test.ts \
  test/main/ipc/configValidation.test.ts \
  test/main/services/runtime/ProviderConnectionService.test.ts \
  test/main/services/runtime/providerAwareCliEnv.test.ts \
  test/main/services/runtime/ClaudeMultimodelBridgeService.test.ts \
  test/renderer/components/runtime/providerConnectionUi.test.ts \
  test/renderer/components/runtime/ProviderRuntimeSettingsDialog.test.ts \
  test/renderer/components/cli/CliStatusVisibility.test.ts

Observed result:

134 pass
0 fail

Diff cleanliness

Executed:

git diff --check

Observed result:

clean in both worktrees

Live Native Run Evidence

Common live-run facts

Observed from both runs:

native binary path: /usr/local/bin/codex
native binary source: system-path
native binary version: codex-cli 0.117.0
credential input source for the sign-off harness: OPENAI_API_KEY
credential source observed by the runner: explicit-api-key
capability profile: headless-limited
final assistant text: OK

Ephemeral run

Executed:

bun run ./scripts/codex-native-phase0-signoff.ts \
  --cwd /tmp \
  --prompt 'Reply only with OK' \
  --ephemeral

Observed result:

thread id: 019da680-6f43-7e10-824c-4d985bcdca12
completion policy: ephemeral
final history completeness: live-only
final usage authority: live-turn-completed
assistant usage:
- input tokens: 23616
- cached input tokens: 0
- output tokens: 42

History authority proof:

projected warning subtype: codex_native_warning
projected warning source: history
observed warning text contained:
- thread/read failed while backfilling turn items for turn completion
- ephemeral threads do not support includeTurns

This is the explicit proof that ephemeral live stream does not equal canonical hydrated history.

Persistent run

Executed:

bun run ./scripts/codex-native-phase0-signoff.ts \
  --cwd /tmp \
  --prompt 'Reply only with OK' \
  --persistent

Observed result:

thread id: 019da680-6f42-77c0-94f1-4e450a69d1f1
completion policy: persistent
final history completeness: explicit-hydration-required
final usage authority: live-turn-completed
assistant usage:
- input tokens: 23616
- cached input tokens: 0
- output tokens: 33

This is the explicit proof that persistent native runs keep a different history-completeness contract from ephemeral runs.

Warning Attribution Proof

The live runs produced both:

process/runtime warnings
history-completeness warnings

Observed process-attributed warnings included:

plugin cache / featured plugins unauthorized warnings
state DB migration mismatch warnings
shell snapshot timeout warnings
MCP process-group termination warnings

Observed history-attributed warning included:

thread/read failed while backfilling turn items for turn completion: ... ephemeral threads do not support includeTurns

This proves the lane now keeps process and history warning truth distinct in projected transcript rows.

Thread-status Proof

Observed projected system rows included:

codex_native_thread_status
- running
- completed

This proves the lane now writes native thread-status authority into persisted transcript-compatible rows instead of forcing UI and replay consumers to infer health from provider-global process truth.

Parser And Exact-log Proof

Covered by green targeted tests:

test/main/utils/jsonl.test.ts
test/main/services/parsing/SessionParser.test.ts
test/main/services/team/BoardTaskExactLogStrictParser.test.ts

These tests prove:

projected assistant usage remains parseable
projected warning/source metadata remains parseable
projected execution-summary/history metadata remains parseable
exact-log readers do not drop the native authority rows

Degraded Old-lane Fallback Proof

Covered by green targeted tests:

src/services/runtimeBackends/codexBackendResolver.test.ts
src/services/runtimeBackends/registry.agentTeams.test.ts

Those tests prove:

auto still does not silently resolve to codex-native
native lane remains unavailable without:
- feature flag
- binary
- CODEX_API_KEY
old Codex lane remains the truthful fallback when native is absent or degraded

Sign-off Conclusion

✅ The Phase 0 code path is implementation-complete and evidence-backed.

⚠️ The lane should still remain:

feature-flagged
non-default
non-auto-resolved
non-selectable for normal runtime switching

That remaining lock is now a rollout-policy choice, not a missing-code problem.

6.2 KiB Raw Blame History

Codex Native Runtime - Phase 0 Sign-off Evidence

Verdict

Command Package

agent_teams_orchestrator

claude_team

Diff cleanliness

Live Native Run Evidence

Common live-run facts

Ephemeral run

Persistent run

Warning Attribution Proof

Thread-status Proof

Parser And Exact-log Proof

Degraded Old-lane Fallback Proof

Sign-off Conclusion

6.2 KiB

Raw Blame History

`agent_teams_orchestrator`

`claude_team`