OpenCode Snapshot-First Proof Upgrade Plan
Goal
Reduce false or avoidable OpenCode review warnings for new tasks by upgrading
metadata-only OpenCode edit, write, and apply_patch changes to verified
full-text before/after changes when, and only when, existing OpenCode snapshot
evidence proves the exact file state transition.
The implementation must be fail-closed:
- If proof is complete, store full before/after content and remove manual-only warnings.
- If proof is incomplete, ambiguous, too large, binary, outside scope, or unavailable, keep the current warning.
- Never use current disk content as proof for historical before/after.
- Never broaden attribution outside strict delivery context.
Non-goals
- Do not change Codex or Anthropic task extraction.
- Do not change generic review UI semantics.
- Do not infer diffs from current disk.
- Do not scan unrelated OpenCode sessions.
- Do not increase OpenCode snapshot file size limits as part of this work.
- Do not retroactively "fix" old tasks unless existing ledger backfill has strict delivery and snapshot evidence.
Current System Facts
Desktop repo:
- `ChangeExtractorService` requests OpenCode ledger backfill only when delivery context is available.
- Backfill goes through `OpenCodeReadinessBridge.backfillOpenCodeTaskLedger`.
- Imported events already support full before/after content and metadata-only fallbacks.
Orchestrator repo:
- `OpenCodeProfileManager.buildManagedConfig()` already sets `snapshot: true`.
- `OpenCodeLedgerBridgeService.backfill()` reconstructs toolpart changes, then calls `OpenCodeChangeEvidenceEnricher.enrich()`.
- `OpenCodeOfflineSessionReader` reads OpenCode SQLite history in read-only mode and extracts snapshot windows.
- `OpenCodeSnapshotEvidenceProviderService` reads before/after snapshot file contents with strict limits.
- `OpenCodeToolpartChangeReconstructor` already creates exact `toolpart-chain` changes when it has a known baseline.
- `OpenCodeChangeEvidenceEnricher` already upgrades some metadata-only changes through snapshot or inverse chain proof.
This plan should strengthen the existing evidence path rather than introduce a new capture subsystem.
Risk Estimate
Recommended implementation:
- Functional bug risk: 3/10.
- Performance regression risk: 2/10.
- Data safety risk: 2/10.
- Complexity: 7/10.
- Approximate runtime change size: 220-450 LOC.
- Approximate total change size with tests and diagnostics: 450-900 LOC.
The low data safety risk depends on preserving fail-closed behavior. If any step starts accepting guesses as proof, data safety risk becomes 7/10 or worse.
Hard Safety Invariants
These invariants are more important than reducing warnings.
- A full-text upgrade must be tied to one task, one member, one OpenCode session, one delivery record, one assistant message, one toolpart, and one snapshot window.
- The upgrade must be local to OpenCode. Codex, Anthropic, generic task-log parsing, and non-OpenCode review flows must not change.
- `strict-delivery` is required for every snapshot-based full-text upgrade. Compatible attribution may still import metadata-only events, but it must not produce auto-safe before/after content.
- Current disk content is never historical evidence. It can be displayed as read-only context by the desktop, but it cannot remove a warning or enable safe reject.
- Hash-only evidence is not full-text evidence. A hash can verify text that was already read from a trusted snapshot, but a hash alone is not enough.
- Empty string is valid full text. `null` and `undefined` mean unavailable.
- Large, binary, truncated, path-unsafe, or schema-unsupported content stays metadata-only.
- A failed upgrade must preserve the original change event shape as much as possible. It may add diagnostics, but it must not remove warnings or mutate operation/confidence.
- Imported event idempotency must remain based on existing source import keys. The upgrade must not create duplicate events for the same toolpart/path.
- Any multi-change path chain must be all-or-nothing for unresolved changes in that path/window. Partial upgrades are allowed only for changes that were already independently exact before the chain attempt.
- The implementation must never make a previously non-rejectable change rejectable unless both `beforeContent` and the target after/absence state are proven from the same trusted historical evidence path.
- Diagnostics are allowed to become more detailed. They are not allowed to be used as a substitute for proof.
Things That Are Explicitly Not Proof
These signals can be useful diagnostics, but they must not remove warnings or enable safe reject by themselves:
- Current disk content matching an expected hash.
- Current disk content matching `newString`.
- A file path appearing in OpenCode tool metadata.
- A file path appearing in a snapshot diff without readable before/after text.
- A before/after hash without the corresponding text blob.
- An OpenCode tool status of `completed`.
- The absence of an error in the toolpart.
- The task title mentioning the same directory.
- A member name matching the expected teammate.
- A session id matching but no strict delivery record.
- A snapshot window in the same session but a different assistant message.
- A snapshot window that overlaps several toolparts ambiguously.
- A successful manual UI render of current disk preview.
If implementation pressure makes any of these tempting, stop and keep the warning.
Threat Model
The feature is not security-sensitive in the network sense, but it is data-safety-sensitive. The main threat is a false full-text proof that enables safe reject/apply for the wrong historical state.
Bug classes to defend against:
- Cross-task contamination:
  - A file change from task A appears in task B review.
  - Main defense: strict delivery, canonical task id, source message/window matching, real-data smoke.
- Cross-member contamination:
  - A teammate using the same OpenCode profile is attributed to another member.
  - Main defense: delivery record member/lane/session matching.
- Cross-window contamination:
  - A toolpart is matched to the wrong snapshot window in the same message.
  - Main defense: exactly-one window matching and order tests.
- False baseline:
  - Current disk or hash-only evidence is treated as historical before text.
  - Main defense: "not proof" list and code review checklist.
- Unsafe warning removal:
  - UI stops warning about a file that is still manual-only.
  - Main defense: central warning predicate and negative warning tests.
- Duplicate imported events:
  - The same source toolpart appears twice after re-backfill.
  - Main defense: source-key idempotency audit and repeated-backfill tests.
- Silent performance regression:
  - Snapshot proof reads too many blobs or times out often.
  - Main defense: proof-needed filtering, existing limits, timing counters.
- Unsupported upstream shape:
  - OpenCode changes SQLite/snapshot schema and our parser guesses.
  - Main defense: shape fingerprint, unsupported fallback, abort condition.
For every bug class above, the implementation needs at least one negative test or real-data smoke assertion.
Pre-Implementation Audit Checklist
Before writing runtime code, answer these questions from the current codebase:
- Does the task-change ledger importer update, replace, supersede, or append events with the same `sourceImportKey`?
- Does the desktop review bundle dedupe by source key, file path, event id, or a computed change id?
- Which exact helper decides whether a file is rejectable?
- Which warnings are currently surfaced in `TeamChangesSection` versus the full review dialog?
- Does the OpenCode backfill cache hide an upgraded result for up to 60 seconds after a first metadata-only result?
- Does `materializeMetadataOnlyChanges` preserve `evidenceProof`, `snapshotId`, `snapshotSource`, and warnings exactly?
- Can the snapshot provider return `beforeState`/`afterState` with hashes but no text, and how is that serialized into task-change events?
- Are OpenCode snapshot windows always message-local in current real data?
- Are there real examples where a single toolpart touches more than one file?
- Are there real examples where `apply_patch` contains rename or mode-only changes?
- Does the snapshot provider ever return duplicate file entries for the same normalized path?
- Does the task-change worker cache bundle results independently from the OpenCode backfill cache?
- Are task-level warnings derived only from imported events, or can they come from boundary parsing separately?
- Does a safe reject require `beforeContent`, or can `beforeState.exists === false` plus `afterContent` be enough for creates?
- Is there any existing telemetry/log sink where structured counters can be emitted without leaking file contents?
If any answer is unknown, add a focused diagnostic or unit test before changing behavior. Do not use implementation guesses for these contracts.
Decision Gates
These gates must be passed in order. Do not skip gates to get fewer warnings faster.
| Gate | Required evidence | If not met |
|---|---|---|
| G0 contract audit | importer, bundle, rejectability, cache behavior known | no runtime change |
| G1 diagnostics-only | new diagnostics pass tests with no behavior change | fix diagnostics first |
| G2 shadow proof | proof computes stats but imports original changes | keep behavior disabled |
| G3 single-change proof | positive and negative single-change tests pass | keep apply disabled |
| G4 real-data single-change smoke | OpenCode improves or stays same, non-OpenCode unchanged | do not enable default |
| G5 multi-change proof | all chain tests pass, no ambiguous branch accepted | keep full unavailable |
| G6 real-data full smoke | no cross-task leakage, budgets pass | keep default single-change |
| G7 rollback check | `OPENCODE_SNAPSHOT_PROOF_UPGRADE=off` restores old behavior | do not ship |
The implementation should be easy to stop after G4. Single-change mode is a
valid ship point; full mode is optional.
Known Unknowns That Block Full Mode
`full` mode must stay disabled if any of these are still unknown:
- Whether importer supersedes or appends duplicate `sourceImportKey` events.
- Whether real OpenCode data has nested or overlapping snapshot windows.
- Whether real OpenCode `apply_patch` parts include rename, chmod, or binary patch shapes.
- Whether multi-change same-path chains occur often enough to justify the risk.
- Whether review bundle dedupe can handle upgraded old events without duplicate rows.
- Whether snapshot proof stats can be collected without logging sensitive content.
Unknowns do not block diagnostics or single-change mode. They block full mode.
Assumption Ledger
Keep an explicit ledger of assumptions. Each assumption needs a validation path and a fallback. Do not leave assumptions implicit in implementation code.
| Assumption | Validation | Fallback if false |
|---|---|---|
| OpenCode snapshot windows are message-local | unit fixture and real-data diagnostics | metadata-only fallback |
| Source import keys are stable across re-backfill | repeated-backfill test | new-imports-only, no old rewrite |
| Review bundle dedupes safely | Phase 0 audit and bridge test | do not upgrade old events |
| Empty string survives materialization | serialization test | do not upgrade empty files |
| Existing reject helper checks current disk | desktop contract test | fix shared helper before enabling |
| Snapshot store objects remain readable long enough | retention fixture and diagnostics | metadata-only fallback |
| Part ordering is stable enough for chains | ordering unit tests | disable full |
| Warning predicates are complete | unit tests naming every removed warning | preserve warning |
| Stats can be emitted without content | log review and tests | disable stats or redact harder |
| Non-OpenCode fingerprints stay identical | real-data mode comparison | keep apply modes disabled |
If an assumption has no validation path, it should be moved to "Known Unknowns"
and block full mode.
Capability And Version Gates
Do not assume that snapshot: true in managed OpenCode config means snapshot
evidence is usable for every session. Treat snapshot proof as a runtime
capability that must be observed for the specific session being backfilled.
Required capability checks:
- OpenCode SQLite schema is supported.
- Session identity includes project id, directory, worktree, and git VCS.
- Session worktree matches the expected workspace root.
- Snapshot windows are present and paired.
- Snapshot git store reader reports the expected shape fingerprint.
- Snapshot file evidence can be read under the existing limits.
- The proof path sees the same normalized relative path in reconstruction and snapshot evidence.
Suggested result type:
type SnapshotProofCapability =
  | {
      supported: true
      shapeFingerprint: string
      sessionId: string
      projectId: string
    }
  | {
      supported: false
      code:
        | 'sqlite-schema-unsupported'
        | 'session-identity-missing'
        | 'workspace-mismatch'
        | 'snapshot-window-missing'
        | 'snapshot-store-unsupported'
        | 'snapshot-store-missing'
      diagnostics: string[]
    }
Rules:
- Unsupported capability returns metadata-only fallback.
- Unknown capability returns metadata-only fallback.
- Capability diagnostics may be emitted in `shadow`.
- Capability success alone is not proof. It only allows proof attempts.
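The capability rules above could be applied with a small gate like the following. This is a minimal sketch, not the project's code: `decideProofAttempt` and `ProofAttemptDecision` are hypothetical names, the capability type is a simplified copy of the suggested result type (its `code` is widened to `string`), and treating `undefined` as "unknown capability" is an assumption.

```typescript
// Simplified copy of the suggested capability type (code widened to string).
type SnapshotProofCapability =
  | { supported: true; shapeFingerprint: string; sessionId: string; projectId: string }
  | { supported: false; code: string; diagnostics: string[] }

// Hypothetical decision type: either a proof attempt is allowed, or we fall back.
type ProofAttemptDecision =
  | { attempt: true; shapeFingerprint: string }
  | { attempt: false; fallback: 'metadata-only-fallback'; diagnostics: string[] }

// Assumption: an unobserved capability is passed as undefined.
function decideProofAttempt(
  capability: SnapshotProofCapability | undefined,
): ProofAttemptDecision {
  if (capability === undefined) {
    // Unknown capability is handled exactly like unsupported capability.
    return {
      attempt: false,
      fallback: 'metadata-only-fallback',
      diagnostics: ['capability-unknown'],
    }
  }
  if (!capability.supported) {
    return {
      attempt: false,
      fallback: 'metadata-only-fallback',
      diagnostics: [capability.code, ...capability.diagnostics],
    }
  }
  // Success only permits a proof attempt. It is not proof by itself.
  return { attempt: true, shapeFingerprint: capability.shapeFingerprint }
}
```

The result deliberately carries no file content, so a successful capability check cannot be mistaken for evidence further down the pipeline.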
Mode Behavior Matrix
The mode must determine both proof computation and proof application.
| Mode | Compute proof? | Apply proof? | Import changed events? | Intended use |
|---|---|---|---|---|
| `off` | no | no | no | rollback and baseline comparison |
| `shadow` | yes | no | no | validate proof quality and performance |
| `single-change` | yes | only one unresolved change per path/window | yes | safe first rollout |
| `full` | yes | one-change and verified chains | yes | optional later rollout |
If implementation makes `shadow` import different events from `off`, it is a bug. If implementation makes `off` compute snapshot proof, it is a performance bug.
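One way to keep the matrix from drifting into scattered conditionals is to encode it as a single lookup table. The sketch below is illustrative only: `MODE_BEHAVIOR`, `ModeBehavior`, and `returnsOriginalChanges` are hypothetical names, and the `applyProof` labels are shorthand for the matrix cells.

```typescript
type SnapshotProofUpgradeMode = 'off' | 'shadow' | 'single-change' | 'full'

// Hypothetical shape mirroring the matrix columns.
type ModeBehavior = {
  computeProof: boolean
  applyProof: 'none' | 'single-change-only' | 'single-change-and-chains'
  importChangedEvents: boolean
}

const MODE_BEHAVIOR: Record<SnapshotProofUpgradeMode, ModeBehavior> = {
  off: { computeProof: false, applyProof: 'none', importChangedEvents: false },
  shadow: { computeProof: true, applyProof: 'none', importChangedEvents: false },
  'single-change': {
    computeProof: true,
    applyProof: 'single-change-only',
    importChangedEvents: true,
  },
  full: {
    computeProof: true,
    applyProof: 'single-change-and-chains',
    importChangedEvents: true,
  },
}

// In off and shadow the importer must receive the original, unmodified changes.
function returnsOriginalChanges(mode: SnapshotProofUpgradeMode): boolean {
  return !MODE_BEHAVIOR[mode].importChangedEvents
}
```

Because the record is keyed by the mode union, adding a fifth mode without filling in all three columns becomes a compile error rather than a silent default.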
Minimum Safe Scope
The first behavior-changing implementation should intentionally support less than the full theoretical feature.
Allowed in first single-change apply mode:
- OpenCode only.
- `strict-delivery` only.
- One unresolved change for one normalized path inside one snapshot window.
- Text files within existing size limits.
- `write` create when before absence and after text are proven.
- `write` modify when before text, after text, and toolpart after content agree.
- `edit` modify when `oldString` occurs exactly once and produces snapshot after.
- delete when before text and after absence are proven.
Explicitly excluded from first apply mode:
- Multi-change same-path chains.
- `apply_patch` without parsed hunks.
- rename, chmod, binary patch, submodule, and mode-only changes.
- Any case requiring current disk as evidence.
- Any case requiring line-ending normalization.
- Any case where snapshot evidence exists but operation semantics are unclear.
- Any old metadata-only event rewrite unless source-key supersede is proven.
This scope is deliberately conservative. The goal is to prove the pipeline, not to maximize warning reduction in the first implementation.
Lowest-Confidence Areas And Mitigations
The implementation should explicitly address the areas below because they are where mistakes are most likely.
Snapshot Window Matching
Risk: OpenCode history can contain several step-start and step-finish
records in the same assistant message. Incorrect ordering could attach a
toolpart to the wrong snapshot pair.
Mitigation:
- Keep the existing requirement that a toolpart must match exactly one window.
- Keep message-local matching. Do not match a toolpart to a window from another assistant message.
- Add tests where a toolpart is before the first window, after the last window, and inside two overlapping windows.
- If window order cannot be proven from `rawParts`, skip the upgrade.
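The exactly-one-window rule can be sketched as below. The field names (`startPartOrder`, `endPartOrder`, `partOrder`) and the function `matchExactlyOneWindow` are assumptions made for illustration; the real window and part records may expose ordering differently.

```typescript
// Hypothetical shapes: order values are positions within one assistant message.
type SnapshotWindow = {
  windowId: string
  messageId: string
  startPartOrder: number // position of the step-start part
  endPartOrder: number // position of the step-finish part
}

type ToolpartRef = { partId: string; messageId: string; partOrder: number }

// Returns the single enclosing window, or null when there are zero or several.
function matchExactlyOneWindow(
  part: ToolpartRef,
  windows: readonly SnapshotWindow[],
): SnapshotWindow | null {
  const matches = windows.filter(
    (w) =>
      w.messageId === part.messageId && // message-local matching only
      w.startPartOrder < part.partOrder &&
      part.partOrder < w.endPartOrder,
  )
  const [only, ...rest] = matches
  // Zero matches and ambiguous matches are both fail-closed.
  return only !== undefined && rest.length === 0 ? only : null
}
```

A `null` result means the change stays metadata-only; the caller should not fall back to the nearest window.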
Multi-Change Chains
Risk: several edits to the same file can produce the same final content through more than one path. This is the easiest place to create a convincing but wrong diff.
Mitigation:
- Implement single-change upgrades first.
- Gate multi-change chain upgrades behind a narrow helper and dense tests.
- Do not allow a `write` in the middle of a reverse chain unless both sides of that write are independently proven.
- Abort the whole path/window chain on the first ambiguous step.
- Add a kill switch that can disable multi-change upgrades while leaving single-change upgrades enabled.
Warning Removal
Risk: broad substring filtering can hide warnings that still matter, especially task-boundary or attribution warnings.
Mitigation:
- Do not remove warnings by broad terms like `manual-only` alone.
- Centralize warning predicates and match only known OpenCode baseline/content warning messages.
- Preserve all warnings that mention attribution, delivery, boundary, confidence, path scope, binary, too-large, truncated, or unavailable snapshot content.
- Add tests where a warning contains `manual-only` but is unrelated to baseline proof.
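A central predicate that matches whole messages rather than substrings might look like this. It is a sketch under assumptions: the two warning strings are placeholders, not the project's real messages, and `stripResolvedBaselineWarnings` is a hypothetical name.

```typescript
// Placeholder messages. The real list must come from the existing OpenCode
// baseline/content warning constants, not from this sketch.
const RESOLVABLE_BASELINE_WARNINGS: ReadonlySet<string> = new Set([
  'OpenCode before content unavailable; manual-only review',
  'OpenCode after content unavailable; manual-only review',
])

// Removes only exact, known baseline warnings. Anything else is preserved,
// including unrelated warnings that merely contain "manual-only".
function stripResolvedBaselineWarnings(warnings: readonly string[]): string[] {
  return warnings.filter((warning) => !RESOLVABLE_BASELINE_WARNINGS.has(warning))
}
```

Exact matching is intentionally brittle: if an upstream message is reworded, the warning stays visible instead of being hidden by accident.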
Snapshot Shape Stability
Risk: OpenCode can change SQLite or snapshot git-store shape. A shape change could make old assumptions invalid.
Mitigation:
- Keep `snapshotShapeFingerprint` checks visible in diagnostics.
- Treat unknown or unsupported shapes as metadata-only fallback.
- Do not add compatibility shims that guess from partial rows.
- Add an abort condition for a real-data shape mismatch.
Snapshot Store Retention
Risk: OpenCode SQLite can contain snapshot window hashes while the corresponding git-store objects are missing, pruned, moved, or unreadable. The history then looks promising but cannot prove full text.
Mitigation:
- Treat missing snapshot objects as metadata-only fallback.
- Keep a distinct diagnostic for missing store object versus unsupported shape.
- Do not retry by reading current disk.
- Do not reconstruct from only one side of the snapshot pair.
- Add a fixture where the window exists but object read fails.
Performance
Risk: reading snapshot blobs for every task can become expensive on large sessions.
Mitigation:
- Try snapshot proof only for unresolved OpenCode changes in strict delivery.
- Pass only unresolved touched paths to the snapshot reader unless a same-path chain requires exact already-proven neighbors.
- Keep the current snapshot read limits.
- Add timing diagnostics around snapshot proof attempts.
- Abort rollout if repeated snapshot timeouts appear in smoke data.
Existing Ledger Events
Risk: a task that was previously imported as metadata-only may later be backfilled with better evidence. If importer semantics are append-only, the UI could show duplicates or stale warnings.
Mitigation:
- Audit importer behavior before enabling upgrades for old data.
- Prefer stable source-key replacement/superseding if already supported.
- If replacement is not supported, limit the behavior change to new backfill imports and leave old events untouched.
- Add repeated-backfill tests before real-data smoke.
Cache Invalidation
Risk: desktop or worker cache may return an old metadata-only bundle after the orchestrator has imported stronger evidence, making validation confusing or causing stale warnings to persist.
Mitigation:
- Audit all cache layers in Phase 0.
- Include the OpenCode ledger fingerprint or imported event count in cache invalidation if an existing mechanism supports it.
- For tests, clear or bypass caches instead of waiting for TTLs.
- Do not add broad cache busting for all teams. Keep invalidation scoped to the requested team/task.
Partial Success Semantics
Risk: one file in a task upgrades while another remains metadata-only. Bulk review actions might accidentally assume the task is fully safe.
Mitigation:
- Keep rejectability file-level.
- Keep task-level warnings if any file remains manual-only or if boundaries are uncertain.
- Add a mixed-task desktop test.
Feature Flag And Rollback
Add a runtime guard before changing behavior:
type SnapshotProofUpgradeMode = 'off' | 'shadow' | 'single-change' | 'full'

function getSnapshotProofUpgradeMode(env: NodeJS.ProcessEnv): SnapshotProofUpgradeMode {
  const raw = env.OPENCODE_SNAPSHOT_PROOF_UPGRADE
  if (raw === '0' || raw === 'off') {
    return 'off'
  }
  if (raw === 'shadow') {
    return 'shadow'
  }
  if (raw === 'full') {
    return 'full'
  }
  if (raw === 'single-change') {
    return 'single-change'
  }
  return 'shadow'
}
Recommended rollout:
- Default to `shadow` during development and first smoke validation.
- Move to `single-change` only after shadow stats show expected upgrades with no behavior changes.
- Move to `full` only after multi-change chain tests and real-data smoke pass.
- Keep `off` available as an emergency rollback path.
- If `full` later becomes the default, that should be a separate rollout change after the implementation has passed real-data smoke in explicit `full` mode.
If the project already has a central feature-flag/env helper for OpenCode runtime behavior, use that instead of adding a new ad-hoc parser.
`shadow` mode is intentionally different from `off`:
- `off` does not attempt proof.
- `shadow` attempts proof and records stats/diagnostics, but returns the original changes to the importer.
- `single-change` applies only one-change path/window upgrades.
- `full` applies single-change and multi-change chain upgrades.
This gives a low-risk way to validate proof quality and performance on real data before changing review safety.
Architecture
Use this pipeline:
OpenCode SQLite history
-> toolpart reconstruction
-> strict delivery attribution
-> snapshot window grouping
-> snapshot file read with limits
-> proof upgrade per path
-> validate candidate batch
-> import task-change events
The upgrade belongs in the orchestrator evidence layer, primarily around:
- `OpenCodeChangeEvidenceEnricher.ts`
- `OpenCodeSnapshotEvidenceProvider.ts`
- `OpenCodeToolpartChangeReconstructor.ts` only if a small helper or extra metadata is needed
- tests near existing OpenCode evidence and ledger bridge tests
Avoid touching desktop review UI for the proof itself. The desktop should only benefit from better imported event content.
Composite Identity Contract
Every full-text proof must be anchored to a composite identity. Do not rely on any single field alone.
Required identity dimensions:
type SnapshotProofIdentity = {
  teamName: string
  taskId: string
  memberName: string
  laneId?: string
  sessionId: string
  parentUserMessageId?: string
  assistantMessageId: string
  sourceMessageId: string
  sourcePartId: string
  toolUseId: string
  relativePath: string
  snapshotWindowId: string
  fromSnapshot: string
  toSnapshot: string
}
Rules:
- `taskId` must be canonical, not display-only.
- `memberName`, `laneId`, and `sessionId` must come from strict delivery or already trusted session records.
- `sourceMessageId` must match the snapshot window message id.
- `sourcePartId` must be inside the matched window according to the same message's part order.
- `relativePath` must be normalized through the existing OpenCode path helpers.
- `fromSnapshot` and `toSnapshot` must be the exact pair used to read file evidence.
If any required identity dimension is missing, the default is `metadata-only-fallback`.
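A completeness check over the identity could be sketched as follows. `REQUIRED_IDENTITY_FIELDS` and `missingIdentityFields` are hypothetical names; the list covers the non-optional fields of `SnapshotProofIdentity`, and treating an empty string as missing is an assumption of this sketch.

```typescript
// Non-optional fields of SnapshotProofIdentity (laneId and parentUserMessageId
// are optional in the type and therefore not listed).
const REQUIRED_IDENTITY_FIELDS = [
  'teamName',
  'taskId',
  'memberName',
  'sessionId',
  'assistantMessageId',
  'sourceMessageId',
  'sourcePartId',
  'toolUseId',
  'relativePath',
  'snapshotWindowId',
  'fromSnapshot',
  'toSnapshot',
] as const

type RequiredIdentityField = (typeof REQUIRED_IDENTITY_FIELDS)[number]

// Returns the names of missing fields. An empty result means the identity is
// complete enough to continue; it does not mean the proof is valid.
function missingIdentityFields(
  candidate: Partial<Record<RequiredIdentityField, string>>,
): RequiredIdentityField[] {
  return REQUIRED_IDENTITY_FIELDS.filter((field) => {
    const value = candidate[field]
    // Assumption: an empty identifier is as unusable as an absent one.
    return typeof value !== 'string' || value.length === 0
  })
}
```

Returning the missing field names, rather than a boolean, gives the skip diagnostic something concrete to report without logging any file content.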
Ordering Contract
Same-path chains are safe only if toolpart order is stable and proven. Use the existing ordering data from OpenCode SQLite. Do not introduce a new sort.
Preferred order keys, in priority order:
1. `messageTimeCreated`
2. `messageIdSort`
3. `messagePartOrder`
4. `partId`
Rules:
- Do not sort only by `partId`.
- Do not sort only by timestamp.
- Do not merge parts from different `sourceMessageId` values into one chain.
- If two parts have indistinguishable order, do not upgrade the chain.
- If raw part order is unavailable, single-change upgrade may still work, but multi-change mode must skip.
Example guard:
function hasStablePartOrder(parts: SourcePartSortKey[]): boolean {
  const seen = new Set<string>()
  for (const part of parts) {
    const key = [
      part.messageTimeCreated,
      part.messageIdSort,
      part.messagePartOrder,
      part.partId,
    ].join('\0')
    if (seen.has(key)) {
      return false
    }
    seen.add(key)
  }
  return true
}
If the real symbol names differ, keep the same invariant.
Cross-Repo Contract Boundaries
This feature crosses the desktop repo and the orchestrator repo. Keep the contract explicit.
Desktop responsibilities:
- Request OpenCode backfill only when delivery context exists.
- Keep cache/in-flight dedupe behavior.
- Render full-text events as diffs.
- Render metadata-only events as manual-only warnings.
- Use existing rejectability checks. Do not special-case OpenCode snapshot events in the UI unless a rendering bug is found.
Orchestrator responsibilities:
- Read OpenCode history and snapshot evidence.
- Decide whether proof is strong enough to materialize before/after text.
- Preserve strict delivery attribution.
- Preserve source import keys.
- Emit diagnostics explaining why upgrades were skipped.
Shared contract:
type ReviewSafetyContract = {
  sourceImportKey: string
  evidenceProof: OpenCodeEvidenceProof
  beforeContent: string | null
  afterContent: string | null
  beforeState?: { exists?: boolean; sha256?: string; sizeBytes?: number; unavailableReason?: string }
  afterState?: { exists?: boolean; sha256?: string; sizeBytes?: number; unavailableReason?: string }
  warnings?: string[]
}
Safe reject requires a proven historical baseline:
function hasSafeHistoricalBaseline(change: ReviewSafetyContract): boolean {
  if (change.beforeContent !== null) {
    return true
  }
  return change.beforeState?.exists === false && change.afterContent !== null
}
The exact desktop helper may have a different name. The invariant should match this contract.
Apply/Reject Execution Safety Contract
Snapshot proof can make a review event eligible for normal diff rendering and safe reject consideration. It must not bypass current worktree conflict checks.
Review safety and execution safety are different:
- Review safety answers: "Do we know the historical before/after for this change?"
- Execution safety answers: "Can we apply or reject this change against the user's current disk state right now?"
This feature only upgrades review safety. It must not weaken execution safety.
Required rules:
- Rejecting a modify still requires the current file to match the expected after state, or whatever stricter existing conflict check is already used.
- Rejecting a create still requires the current file to match the created after state before deletion.
- Rejecting a delete still requires the current absence/after state to match the expected deleted state before restoring before content.
- Accepting an OpenCode change must not overwrite unrelated current disk edits.
- Bulk `Reject All` must keep per-file conflict checks and skip unsafe files.
- Current disk mismatch should produce a conflict/manual warning, not a proof downgrade.
Suggested predicate split:
function isReviewSafe(change: ReviewSafetyContract): boolean {
  return isSnapshotReviewSafe(change)
}

function canExecuteReject(input: {
  change: ReviewSafetyContract
  currentDiskState: { exists: boolean; sha256?: string }
}): boolean {
  if (!isReviewSafe(input.change)) {
    return false
  }
  // Use the existing project helper here. This sketch only documents that
  // execution safety is a separate check from proof safety.
  return currentDiskMatchesExpectedAfterState(input.change, input.currentDiskState)
}
Do not implement currentDiskMatchesExpectedAfterState ad hoc if the project
already has a conflict/rejectability helper. This plan requires preserving that
existing behavior.
Data Model Contract
Do not introduce a new task-change event shape unless absolutely necessary. Prefer filling existing fields:
type UpgradedOpenCodeChangeContract = {
  sourceTool: 'write' | 'edit' | 'apply_patch' | 'snapshot_patch'
  sourceImportKey: string
  evidenceProof: 'opencode-snapshot' | 'inverse-edit-chain' | 'inverse-apply-patch-chain' | 'toolpart-chain'
  confidence: 'high' | 'exact'
  beforeContent: string | null
  afterContent: string | null
  beforeState: {
    exists?: boolean
    sha256?: string
    sizeBytes?: number
    unavailableReason?: never
  }
  afterState: {
    exists?: boolean
    sha256?: string
    sizeBytes?: number
    unavailableReason?: never
  }
  snapshotId?: string
  snapshotSource?: 'opencode'
  warnings: string[]
}
Important:
- Upgraded full-text events should not carry `unavailableReason` for the before/after side they claim to prove.
- Metadata-only events may carry `unavailableReason`, but then they must remain non-rejectable.
- `confidence: 'high'` is acceptable for snapshot proof. Use `exact` only for truly exact toolpart chains that already have local full text.
- `snapshotId` is useful provenance, but it is not required for safety if the proof was otherwise validated. A missing `snapshotId` should be reported as a diagnostic.
Storage And Memory Contract
The feature must not create a second blob storage path or keep large full-text content in memory longer than the existing ledger import requires.
Rules:
- Reuse the existing task-change ledger content storage.
- Do not duplicate before/after text in diagnostics, stats, or cache keys.
- Do not add per-team global caches of snapshot file content.
- Do not store both snapshot raw blobs and task-change blobs unless the existing snapshot reader already does that internally.
- Apply the per-file and total byte limits before materializing upgraded events.
- If a file exceeds the limit, store metadata-only state with a reason.
- If many small files exceed the total byte budget, skip the excess files as metadata-only instead of raising the limit.
- Stats should count bytes read and files skipped, but never include content.
Suggested stats additions:
type SnapshotProofStorageStats = {
  bytesRead: number
  bytesMaterialized: number
  skippedByByteLimit: number
  skippedByTotalBudget: number
}
Memory pressure is a reason to keep metadata-only fallback. It is not a reason to increase limits or stream partial text into a diff.
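The per-file and total byte rules could be applied before materialization roughly as follows. This is a sketch under assumptions: `applyByteBudget`, `FileCandidate`, and `BudgetDecision` are hypothetical names, the limits are passed in rather than read from the existing snapshot reader, and files are considered in input order.

```typescript
type FileCandidate = { relativePath: string; sizeBytes: number }

type BudgetDecision = {
  relativePath: string
  materialize: boolean
  reason?: 'file-too-large' | 'total-budget-exceeded'
}

// Decides, per file, whether full text may be materialized. Skipped files
// stay metadata-only with a reason; limits are never raised here.
function applyByteBudget(
  files: readonly FileCandidate[],
  perFileLimitBytes: number,
  totalBudgetBytes: number,
): {
  decisions: BudgetDecision[]
  bytesMaterialized: number
  skippedByByteLimit: number
  skippedByTotalBudget: number
} {
  const decisions: BudgetDecision[] = []
  let bytesMaterialized = 0
  let skippedByByteLimit = 0
  let skippedByTotalBudget = 0
  for (const file of files) {
    if (file.sizeBytes > perFileLimitBytes) {
      skippedByByteLimit += 1
      decisions.push({ relativePath: file.relativePath, materialize: false, reason: 'file-too-large' })
    } else if (bytesMaterialized + file.sizeBytes > totalBudgetBytes) {
      skippedByTotalBudget += 1
      decisions.push({ relativePath: file.relativePath, materialize: false, reason: 'total-budget-exceeded' })
    } else {
      bytesMaterialized += file.sizeBytes
      decisions.push({ relativePath: file.relativePath, materialize: true })
    }
  }
  return { decisions, bytesMaterialized, skippedByByteLimit, skippedByTotalBudget }
}
```

The counters map directly onto `bytesMaterialized`, `skippedByByteLimit`, and `skippedByTotalBudget` in the suggested stats type, and none of them carry content.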
Mutation Rules
When an upgrade is skipped, only diagnostics may change. The returned `ReconstructedOpenCodeToolChange` must preserve:
- `beforeContent`
- `afterContent`
- `beforeState`
- `afterState`
- `operation`
- `confidence`
- `warnings`
- `evidenceProof`
- `sourceImportKey`
When an upgrade succeeds, only these fields may change:
- `beforeContent`
- `afterContent`
- `beforeState`
- `afterState`
- `operation`, only when the tool semantics and snapshot operation both prove it
- `confidence`
- `warnings`, only through the central resolved-warning predicate
- `evidenceProof`
- `evidenceDiagnostics`
- `snapshotId`
- `snapshotSource`
No other fields should be rewritten by the proof upgrade. This reduces accidental attribution changes.
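The skip rule lends itself to a test helper that reports which preserved fields were changed. The sketch below is an assumption-laden illustration: `changedFieldsOnSkip` is a hypothetical name, changes are modeled as plain records, and comparison uses `JSON.stringify`, which is sensitive to object key order and therefore only suitable for tests where the skipped change is a copy of the original.

```typescript
// Fields that must be identical before and after a skipped upgrade.
const PRESERVED_ON_SKIP = [
  'beforeContent',
  'afterContent',
  'beforeState',
  'afterState',
  'operation',
  'confidence',
  'warnings',
  'evidenceProof',
  'sourceImportKey',
] as const

// Returns the names of preserved fields whose serialized value differs.
// Assumption: values are JSON-serializable and key order is stable.
function changedFieldsOnSkip(
  original: Record<string, unknown>,
  returned: Record<string, unknown>,
): string[] {
  return PRESERVED_ON_SKIP.filter(
    (field) => JSON.stringify(original[field]) !== JSON.stringify(returned[field]),
  )
}
```

In a negative test, asserting that this returns an empty array for every skipped path catches accidental warning removal or confidence changes.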
Proof Levels
Use explicit proof labels and keep their meaning strict.
type OpenCodeEvidenceProof =
  | 'toolpart-chain'
  | 'opencode-snapshot'
  | 'inverse-edit-chain'
  | 'inverse-apply-patch-chain'
  | 'metadata-only-fallback'
Accepted for auto review:
- `toolpart-chain`
- `opencode-snapshot`
- `inverse-edit-chain`
- `inverse-apply-patch-chain`
Not accepted for safe reject/apply:
- `metadata-only-fallback`
- current disk only
- file path metadata only
- hash without text
- text without matching task/window/path proof
Proof Decision Tables
Operation State Table
| Tool | Snapshot before | Snapshot after | Tool fields | Upgrade? | Reason |
|---|---|---|---|---|---|
| `write` create | absent | text | content absent or same as after | yes | create is fully proven |
| `write` modify | text | text | content absent or same as after | yes | before and after are fully proven |
| `write` modify | unavailable | text | any | no | overwrite baseline is unknown |
| `write` modify | text | text | content differs from after | no | toolpart and snapshot disagree |
| `edit` modify | text | text | old/new apply exactly once | yes | edit transition is proven |
| `edit` modify | text | text | old/new ambiguous | no | multiple valid transitions are possible |
| `apply_patch` modify | text | text | hunks verify exactly | yes | patch transition is proven |
| `apply_patch` modify | text | text | hunks missing | maybe | only if the snapshot window has exact single-file proof and no competing changes |
| delete | text | absent | delete operation | yes | delete is fully proven |
| any | binary/large/unavailable | any | any | no | full text is not available |
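The two `edit` rows can be illustrated with a small verifier. This is a sketch, not the enricher's actual logic: `verifyEditTransition` and `countOccurrences` are hypothetical names, overlapping occurrences are counted as ambiguous, an empty `oldString` is rejected, and no line-ending normalization is performed.

```typescript
// Counts occurrences of needle in haystack, including overlapping ones.
function countOccurrences(haystack: string, needle: string): number {
  if (needle.length === 0) {
    return 0 // an empty needle cannot anchor an edit
  }
  let count = 0
  let index = haystack.indexOf(needle)
  while (index !== -1) {
    count += 1
    index = haystack.indexOf(needle, index + 1)
  }
  return count
}

// True only when oldString occurs exactly once in the snapshot before text
// and replacing it with newString reproduces the snapshot after text exactly.
function verifyEditTransition(input: {
  before: string
  after: string
  oldString: string
  newString: string
}): boolean {
  if (countOccurrences(input.before, input.oldString) !== 1) {
    return false
  }
  const at = input.before.indexOf(input.oldString)
  const produced =
    input.before.slice(0, at) +
    input.newString +
    input.before.slice(at + input.oldString.length)
  return produced === input.after
}
```

Slicing around the match, rather than calling `String.prototype.replace`, avoids the special `$` replacement patterns that would otherwise alter `newString`.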
Confidence Table
| Evidence | Confidence | Safe reject/apply? |
|---|---|---|
| toolpart chain with known previous text | exact | yes |
| snapshot before/after with verified transition | high | yes |
| inverse edit/apply-patch chain with exact single replacements | high | yes |
| snapshot path anchor without verified transition | medium | no |
| metadata-only toolpart | medium | no |
Do not upgrade confidence from medium to high unless safe reject/apply would
also be valid.
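One way to keep confidence and safe reject/apply from drifting apart is to hold the table as data and check the invariant in a unit test. This is a minimal sketch: EvidenceKind, ConfidenceRow, CONFIDENCE_TABLE, and confidenceTableIsConsistent are hypothetical names used for illustration, not existing orchestrator types.

```typescript
// Hypothetical labels for the five evidence rows in the confidence table.
type EvidenceKind =
  | 'toolpart-chain-known-previous-text'
  | 'snapshot-verified-transition'
  | 'inverse-chain-exact-single-replacements'
  | 'snapshot-path-anchor-unverified'
  | 'metadata-only-toolpart'

type Confidence = 'exact' | 'high' | 'medium'

type ConfidenceRow = { confidence: Confidence; safeRejectApply: boolean }

const CONFIDENCE_TABLE: Record<EvidenceKind, ConfidenceRow> = {
  'toolpart-chain-known-previous-text': { confidence: 'exact', safeRejectApply: true },
  'snapshot-verified-transition': { confidence: 'high', safeRejectApply: true },
  'inverse-chain-exact-single-replacements': { confidence: 'high', safeRejectApply: true },
  'snapshot-path-anchor-unverified': { confidence: 'medium', safeRejectApply: false },
  'metadata-only-toolpart': { confidence: 'medium', safeRejectApply: false },
}

// Invariant from the plan: confidence above medium is only allowed
// when safe reject/apply is also valid.
function confidenceTableIsConsistent(table: Record<EvidenceKind, ConfidenceRow>): boolean {
  return (Object.keys(table) as EvidenceKind[]).every(kind => {
    const row = table[kind]
    return row.confidence === 'medium' || row.safeRejectApply
  })
}
```

A test can call confidenceTableIsConsistent(CONFIDENCE_TABLE) and fail when someone raises a row to high without also making it safe.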
Proof State Machine
Implement the upgrade as a state machine, not as scattered conditionals.
original change
  -> not eligible
  -> candidate
    -> snapshot evidence requested
      -> snapshot evidence matched
        -> transition verified
          -> upgraded change
            -> validated import candidate
Failure from any state returns to the original metadata-only change plus diagnostics.
Allowed transitions:
| From | To | Required condition |
|---|---|---|
| original | not eligible | not OpenCode, already exact, flag off, or non-strict delivery |
| original | candidate | OpenCode unresolved change, strict delivery, flag permits |
| candidate | snapshot requested | non-empty touched path set within limits |
| snapshot requested | snapshot matched | exactly one window and one file anchor |
| snapshot matched | transition verified | operation-specific before/after proof succeeds |
| transition verified | upgraded | state hashes match content and warnings stripped safely |
| upgraded | validated import candidate | existing candidate validation accepts it |
Forbidden transitions:
- original -> upgraded
- candidate -> upgraded
- snapshot requested -> upgraded
- snapshot matched -> upgraded without operation-specific verification
- skipped -> upgraded
Suggested type:
type ProofState =
| { state: 'not-eligible'; reason: SnapshotUpgradeDiagnosticCode }
| { state: 'candidate'; change: ReconstructedOpenCodeToolChange }
| { state: 'snapshot-matched'; change: ReconstructedOpenCodeToolChange; anchor: SnapshotFileAnchor }
| { state: 'transition-verified'; change: ReconstructedOpenCodeToolChange; before: string | null; after: string | null }
| { state: 'upgraded'; change: ReconstructedOpenCodeToolChange }
| { state: 'skipped'; reason: SnapshotUpgradeDiagnosticCode; original: ReconstructedOpenCodeToolChange }
The concrete implementation does not have to use this exact union, but it should preserve the same transitions.
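The allowed and forbidden transition lists can also be checked mechanically. The sketch below is illustrative only: ProofStateName, ALLOWED_TRANSITIONS, and isAllowedTransition are hypothetical names, and modelling failure as a move to 'skipped' from any non-terminal state is an assumption drawn from the "failure from any state" rule.

```typescript
// Hypothetical state names that mirror the transition table.
type ProofStateName =
  | 'original'
  | 'not-eligible'
  | 'candidate'
  | 'snapshot-requested'
  | 'snapshot-matched'
  | 'transition-verified'
  | 'upgraded'
  | 'validated-import-candidate'
  | 'skipped'

const ALLOWED_TRANSITIONS: ReadonlyArray<readonly [ProofStateName, ProofStateName]> = [
  ['original', 'not-eligible'],
  ['original', 'candidate'],
  ['candidate', 'snapshot-requested'],
  ['snapshot-requested', 'snapshot-matched'],
  ['snapshot-matched', 'transition-verified'],
  ['transition-verified', 'upgraded'],
  ['upgraded', 'validated-import-candidate'],
]

function isAllowedTransition(from: ProofStateName, to: ProofStateName): boolean {
  if (to === 'skipped') {
    // Assumption: any non-terminal state may fail back to the original change.
    return from !== 'skipped' && from !== 'not-eligible'
  }
  // Everything else must be listed explicitly; there is no permissive default.
  return ALLOWED_TRANSITIONS.some(
    ([allowedFrom, allowedTo]) => allowedFrom === from && allowedTo === to,
  )
}
```

Because only listed pairs pass, every forbidden shortcut such as original -> upgraded or skipped -> upgraded is rejected without a separate deny list.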
Exhaustiveness And Type Safety
Use exhaustive switches for proof decisions, operation handling, and feature
flag modes. Do not add a permissive default branch that silently preserves or
upgrades without naming the case.
function assertNever(value: never, context: string): never {
throw new Error(`Unexpected ${context}: ${String(value)}`)
}
function applyProofDecision(
decision: SnapshotProofDecision,
original: ReconstructedOpenCodeToolChange,
): ReconstructedOpenCodeToolChange {
switch (decision.type) {
case 'upgraded':
return decision.change
case 'skipped':
return original
default:
return assertNever(decision, 'snapshot proof decision')
}
}
If TypeScript cannot prove exhaustiveness, keep the code more explicit rather than using casts. A cast in proof code should be treated as a review smell.
Default Answers To Uncertainty
Use these defaults when implementation hits an unclear case:
| Question | Default |
|---|---|
| Is attribution strict enough? | no upgrade |
| Is toolpart order stable? | no multi-change upgrade |
| Does snapshot text prove the operation? | no upgrade |
| Does warning removal feel broad? | preserve warning |
| Is content text or binary? | treat as unavailable |
| Does old event replacement behavior seem unclear? | new imports only |
| Is cache invalidation unclear? | do not rely on cache for proof |
| Does UI need an OpenCode-specific branch? | fix shared helper or stop |
| Is performance impact unclear? | keep flag off or single-change only |
These defaults are part of the safety design, not temporary indecision.
Formal Proof Predicates
Implement proof decisions through small predicates that can be unit tested directly. Avoid spreading equivalent checks across several branches.
function isReadableFullText(value: string | null | undefined): value is string {
return typeof value === 'string'
}
function isKnownAbsent(state: { exists?: boolean } | undefined): boolean {
return state?.exists === false
}
function hasUnavailableReason(state: { unavailableReason?: string } | undefined): boolean {
return typeof state?.unavailableReason === 'string' && state.unavailableReason.length > 0
}
function isProvenCreate(change: ReviewSafetyContract): boolean {
return (
isKnownAbsent(change.beforeState) &&
isReadableFullText(change.afterContent) &&
!hasUnavailableReason(change.afterState)
)
}
function isProvenModify(change: ReviewSafetyContract): boolean {
return (
isReadableFullText(change.beforeContent) &&
isReadableFullText(change.afterContent) &&
!hasUnavailableReason(change.beforeState) &&
!hasUnavailableReason(change.afterState)
)
}
function isProvenDelete(change: ReviewSafetyContract): boolean {
return (
isReadableFullText(change.beforeContent) &&
change.afterState?.exists === false &&
!hasUnavailableReason(change.beforeState)
)
}
function isSnapshotReviewSafe(change: ReviewSafetyContract): boolean {
return (
change.evidenceProof === 'opencode-snapshot' ||
change.evidenceProof === 'inverse-edit-chain' ||
change.evidenceProof === 'inverse-apply-patch-chain' ||
change.evidenceProof === 'toolpart-chain'
) && (
isProvenCreate(change) ||
isProvenModify(change) ||
isProvenDelete(change)
)
}
The real implementation can use existing helper names, but tests should cover
the predicates above as behavior. In particular, unavailableReason on a side
that claims full proof should make the change unsafe.
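A table-driven test is one way to pin that behavior down. The sketch below condenses the three operation predicates into one function so it stays self-contained; MinimalChange, SideState, and hasProvenTransition are hypothetical stand-ins, and the real ReviewSafetyContract is assumed to carry more fields.

```typescript
// Hypothetical minimal shapes; only the fields the predicates read.
type SideState = { exists?: boolean; unavailableReason?: string }
type MinimalChange = {
  beforeContent?: string | null
  afterContent?: string | null
  beforeState?: SideState
  afterState?: SideState
}

const isText = (value: string | null | undefined): value is string => typeof value === 'string'
const isUnavailable = (state: SideState | undefined): boolean =>
  typeof state?.unavailableReason === 'string' && state.unavailableReason.length > 0

// Condensed form of the create / modify / delete predicates.
function hasProvenTransition(change: MinimalChange): boolean {
  const create =
    change.beforeState?.exists === false &&
    isText(change.afterContent) &&
    !isUnavailable(change.afterState)
  const modify =
    isText(change.beforeContent) &&
    isText(change.afterContent) &&
    !isUnavailable(change.beforeState) &&
    !isUnavailable(change.afterState)
  const remove =
    isText(change.beforeContent) &&
    change.afterState?.exists === false &&
    !isUnavailable(change.beforeState)
  return create || modify || remove
}

const cases: Array<{ name: string; change: MinimalChange; expected: boolean }> = [
  { name: 'create', change: { beforeState: { exists: false }, afterContent: 'x' }, expected: true },
  { name: 'modify', change: { beforeContent: 'a', afterContent: 'b' }, expected: true },
  { name: 'modify from empty file', change: { beforeContent: '', afterContent: 'b' }, expected: true },
  { name: 'delete', change: { beforeContent: 'a', afterState: { exists: false } }, expected: true },
  {
    name: 'modify with unavailable before side',
    change: { beforeContent: 'a', afterContent: 'b', beforeState: { unavailableReason: 'too-large' } },
    expected: false,
  },
  { name: 'after text only', change: { afterContent: 'b' }, expected: false },
]

// Names of the cases where the predicate disagrees with the expectation.
const failedCases = cases.filter(c => hasProvenTransition(c.change) !== c.expected).map(c => c.name)
```

An empty file counts as readable text here, which is why the check is on the type of the content and not on its truthiness.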
Atomicity And Failure Semantics
Snapshot proof should behave atomically at three levels.
Per change:
- Success returns one upgraded change.
- Failure returns the original change unchanged plus diagnostics.
- No intermediate state should be visible to importer validation.
Per same-path chain:
- Success upgrades every unresolved change in the chain.
- Failure upgrades none of the unresolved changes in the chain.
- Already exact changes may remain exact, but they must not be rewritten by the failed chain attempt.
Per import batch:
- Candidate validation runs after proof upgrade.
- If import fails, review safety must not observe a partially imported safe state.
- Retry uses stable source import keys.
Implementation pattern:
const original = change
const decision = tryUpgradeChange(change)
if (decision.type === 'skipped') {
diagnostics.push(decision.reason)
return original
}
const upgraded = decision.change
if (!isSnapshotReviewSafe(upgraded)) {
diagnostics.push('snapshot-upgrade-skipped/postcondition-failed')
return original
}
return upgraded
Do not mutate change in place before postconditions pass.
Postconditions
Every successful upgrade must satisfy these postconditions:
function assertUpgradePostconditions(input: {
original: ReconstructedOpenCodeToolChange
upgraded: ReconstructedOpenCodeToolChange
}): boolean {
const { original, upgraded } = input
return (
original.sourceImportKey === upgraded.sourceImportKey &&
original.taskId === upgraded.taskId &&
original.teamName === upgraded.teamName &&
original.memberName === upgraded.memberName &&
original.sessionId === upgraded.sessionId &&
original.sourcePartId === upgraded.sourcePartId &&
original.sourceMessageId === upgraded.sourceMessageId &&
original.relativePath === upgraded.relativePath &&
upgraded.evidenceProof !== 'metadata-only-fallback' &&
isSnapshotReviewSafe(upgraded)
)
}
If a postcondition fails, keep the original change and emit a diagnostic. A postcondition failure is a bug in proof logic, not a reason to relax safety.
Runtime Assertion Policy
Assertions should catch programmer errors without making production data unsafe.
Rules:
- In tests, postcondition failures should fail loudly.
- In production backfill, postcondition failures should skip the upgrade, preserve the original metadata-only change, and emit a diagnostic.
- Assertions must never catch an error and continue with upgraded content.
- Assertions must not include file content in thrown messages.
- Assertions should include stable identifiers such as task id, source part id, source import key, and normalized relative path.
Suggested pattern:
function enforceUpgradePostconditions(input: {
original: ReconstructedOpenCodeToolChange
upgraded: ReconstructedOpenCodeToolChange
diagnostics: string[]
}): ReconstructedOpenCodeToolChange {
if (assertUpgradePostconditions(input)) {
return input.upgraded
}
input.diagnostics.push(
`snapshot-upgrade-skipped/postcondition-failed:${input.original.sourceImportKey}`,
)
return input.original
}
Do not use runtime assertions to justify looser proof predicates. Assertions are a last guard, not the proof itself.
Implementation Phases
Phase 0 - Audit Contracts Before Behavior Changes
This phase should be completed before any runtime behavior change.
Audit:
- Ledger import behavior for duplicate sourceImportKey.
- Review bundle dedupe behavior.
- Existing rejectability helper.
- Existing OpenCode backfill cache and in-flight behavior.
- materializeMetadataOnlyChanges serialization of proof fields.
- Current real-data snapshot diagnostics for a few OpenCode teams.
Deliverable:
Contract audit:
- sourceImportKey duplicate policy: replace | supersede | append | unknown
- review bundle dedupe key: ...
- rejectability helper: ...
- metadata materialization preserves proof fields: yes | no
- observed snapshot shape fingerprint: ...
- can proceed to Phase 1: yes | no
If any field is unknown, do not proceed to behavior changes.
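The gate can be expressed as a small check over the audit record. This is a sketch under assumptions: ContractAudit, unknownAuditFields, and canProceedToPhase1 are hypothetical names, and the field names simply mirror the deliverable template.

```typescript
// Hypothetical record mirroring the Phase 0 deliverable.
type ContractAudit = {
  sourceImportKeyDuplicatePolicy: 'replace' | 'supersede' | 'append' | 'unknown'
  reviewBundleDedupeKey: string
  rejectabilityHelper: string
  metadataMaterializationPreservesProofFields: 'yes' | 'no' | 'unknown'
  observedSnapshotShapeFingerprint: string
}

// Returns the names of fields that are still empty, "unknown", or a "..." placeholder.
function unknownAuditFields(audit: ContractAudit): string[] {
  return (Object.keys(audit) as Array<keyof ContractAudit>).filter(field => {
    const value = audit[field].trim().toLowerCase()
    return value === '' || value === 'unknown' || value === '...'
  })
}

// Only the "unknown blocks progress" rule is encoded here; whether an answer
// such as "append" or "no" is acceptable stays a human decision.
function canProceedToPhase1(audit: ContractAudit): boolean {
  return unknownAuditFields(audit).length === 0
}
```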
Phase 1 - Add Targeted Diagnostics
Add diagnostics that explain why a metadata-only change was not upgraded. This makes real-data validation much easier.
Examples:
- snapshot-upgrade-skipped/no-window
- snapshot-upgrade-skipped/ambiguous-window
- snapshot-upgrade-skipped/no-file-anchor
- snapshot-upgrade-skipped/binary
- snapshot-upgrade-skipped/too-large
- snapshot-upgrade-skipped/path-chain-ambiguous
- snapshot-upgrade-skipped/toolpart-after-mismatch
- snapshot-upgrade-skipped/current-disk-not-proof
- snapshot-upgrade-skipped/strict-delivery-required
- snapshot-upgrade-skipped/unsupported-snapshot-shape
- snapshot-upgrade-skipped/warning-preserved
- snapshot-upgrade-skipped/feature-flag-off
These diagnostics should not be user-noisy by default, but they should be available in backfill result diagnostics and tests.
Diagnostics should be structured internally even if the public result remains a string array:
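type SnapshotUpgradeDiagnosticCode =
| 'snapshot-upgrade-skipped/no-window'
| 'snapshot-upgrade-skipped/ambiguous-window'
| 'snapshot-upgrade-skipped/no-file-anchor'
| 'snapshot-upgrade-skipped/no-after-anchor'
| 'snapshot-upgrade-skipped/operation-mismatch'
| 'snapshot-upgrade-skipped/toolpart-after-mismatch'
| 'snapshot-upgrade-skipped/path-chain-ambiguous'
| 'snapshot-upgrade-skipped/path-chain-boundary-mismatch'
| 'snapshot-upgrade-skipped/binary'
| 'snapshot-upgrade-skipped/too-large'
| 'snapshot-upgrade-skipped/current-disk-not-proof'
| 'snapshot-upgrade-skipped/strict-delivery-required'
| 'snapshot-upgrade-skipped/unsupported-snapshot-shape'
| 'snapshot-upgrade-skipped/warning-preserved'
| 'snapshot-upgrade-skipped/feature-flag-off'
| 'snapshot-upgrade-skipped/postcondition-failed'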
function pushSnapshotDiagnostic(
diagnostics: string[],
code: SnapshotUpgradeDiagnosticCode,
detail: string,
): void {
diagnostics.push(`${code}: ${detail}`)
}
Using a small typed union makes it harder to accidentally invent inconsistent diagnostics throughout the proof code.
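Because the public result stays a string array, tests and smoke output need a way to recover the code from each entry. A minimal sketch, assuming every entry starts with the code followed by a colon and optional detail; diagnosticCode and countDiagnosticsByCode are hypothetical helper names.

```typescript
// Extracts the code part of a "code: detail" or "code:detail" diagnostic entry.
function diagnosticCode(entry: string): string {
  const separator = entry.indexOf(':')
  return (separator === -1 ? entry : entry.slice(0, separator)).trim()
}

// Tallies diagnostics by code so tests can assert on reasons without matching detail text.
function countDiagnosticsByCode(diagnostics: readonly string[]): Record<string, number> {
  const counts: Record<string, number> = {}
  for (const entry of diagnostics) {
    const code = diagnosticCode(entry)
    counts[code] = (counts[code] ?? 0) + 1
  }
  return counts
}

const exampleCounts = countDiagnosticsByCode([
  'snapshot-upgrade-skipped/no-window: part 1',
  'snapshot-upgrade-skipped/no-window: part 2',
  'snapshot-upgrade-skipped/postcondition-failed:key-1',
])
```

This relies on codes never containing a colon, which holds for the codes listed in this plan.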
Phase 2 - Make Upgrade Eligibility Explicit
Add a helper that decides whether a change needs snapshot proof. It should skip already exact full-text changes.
function needsSnapshotProof(change: ReconstructedOpenCodeToolChange): boolean {
if (change.evidenceProof === 'toolpart-chain') {
return false
}
// Only real strings count as captured text; undefined must not be mistaken for full content.
const hasBefore = typeof change.beforeContent === 'string'
const hasAfter = typeof change.afterContent === 'string'
if (hasBefore && hasAfter) {
return false
}
if (change.beforeState?.exists === false && hasAfter) {
return false
}
if (hasBefore && change.afterState?.exists === false) {
return false
}
return (
change.sourceTool === 'write' ||
change.sourceTool === 'edit' ||
change.sourceTool === 'apply_patch' ||
change.sourceTool === 'snapshot_patch'
)
}
Use this helper before expensive snapshot work where possible:
const changesNeedingProof = params.changes.filter(needsSnapshotProof)
if (changesNeedingProof.length === 0) {
return result
}
Important: this helper should reduce work, not reduce safety. If in doubt, include the change in the snapshot proof attempt and let the proof logic reject it.
Add a second helper for safety eligibility. It must be stricter than
needsSnapshotProof.
function mayUseSnapshotProof(input: {
attributionMode: OpenCodeLedgerAttributionMode
change: ReconstructedOpenCodeToolChange
mode: SnapshotProofUpgradeMode
}): boolean {
if (input.mode === 'off') {
return false
}
if (input.attributionMode !== 'strict-delivery') {
return false
}
if (!needsSnapshotProof(input.change)) {
return false
}
return input.change.attributionMethod === 'delivery-ledger-taskrefs'
}
This keeps "should we spend time trying?" separate from "is this proof allowed to affect review safety?".
Add a third helper for apply eligibility. shadow may compute proof, but must
not apply it.
function mayApplySnapshotProof(input: {
mode: SnapshotProofUpgradeMode
changeCountForPathWindow: number
}): boolean {
if (input.mode === 'off' || input.mode === 'shadow') {
return false
}
if (input.mode === 'single-change') {
return input.changeCountForPathWindow === 1
}
return input.mode === 'full'
}
The call site should look structurally like this:
const decision = tryComputeSnapshotProof(change)
stats.record(decision)
if (!mayApplySnapshotProof({ mode, changeCountForPathWindow })) {
return originalChange
}
return decision.type === 'upgraded' ? decision.change : originalChange
This prevents diagnostics-only validation from accidentally changing imported review events.
Also make the final proof decision return a typed result instead of a nullable change. Nullable returns tend to hide why an upgrade failed.
type SnapshotProofDecision =
| {
type: 'upgraded'
change: ReconstructedOpenCodeToolChange
proof: Exclude<OpenCodeEvidenceProof, 'metadata-only-fallback'>
}
| {
type: 'skipped'
reason: SnapshotUpgradeDiagnosticCode
preserveOriginal: true
}
function preserveOriginal(
reason: SnapshotUpgradeDiagnosticCode,
): SnapshotProofDecision {
return { type: 'skipped', reason, preserveOriginal: true }
}
Callers should be forced to handle both branches. A skipped decision must return the original change unchanged except for diagnostics collected outside the change object.
Phase 3 - Strengthen Snapshot Anchor Matching
Snapshot anchors should be accepted only when all these conditions hold:
- The change belongs to a strict delivery session.
- The source toolpart belongs to exactly one snapshot window.
- The snapshot window belongs to the same OpenCode message.
- The normalized touched path is inside the session worktree.
- The snapshot reader returns an anchor for the exact relative path.
- Text content is full text, not binary, and within existing limits.
- The file operation is compatible with the tool operation.
Add a small validation helper:
type SnapshotAnchorValidation =
| { ok: true }
| { ok: false; reason: string }
function validateSnapshotAnchorForChange(input: {
change: ReconstructedOpenCodeToolChange
anchor: SnapshotFileAnchor | undefined
}): SnapshotAnchorValidation {
const { change, anchor } = input
if (!anchor) {
return { ok: false, reason: 'snapshot-upgrade-skipped/no-file-anchor' }
}
if (change.operation === 'create' && anchor.operation !== 'create') {
return { ok: false, reason: 'snapshot-upgrade-skipped/operation-mismatch' }
}
if (change.operation === 'delete' && anchor.operation !== 'delete') {
return { ok: false, reason: 'snapshot-upgrade-skipped/operation-mismatch' }
}
if (change.operation === 'modify' && anchor.operation === 'create') {
return { ok: false, reason: 'snapshot-upgrade-skipped/operation-mismatch' }
}
return { ok: true }
}
Do not rely only on operation matching. It is a gate, not proof.
The validation helper should also distinguish these concepts:
- anchor operation: what the snapshot diff says happened to the file.
- tool operation: what the reconstructed toolpart thinks happened.
- review operation: what the imported task-change event will expose.
If these disagree, do not silently rewrite the operation unless the snapshot
transition and tool semantics both prove the new value. For example, a write
with no previous baseline may be reconstructed as modify; if snapshot says
create and before is absent, it may be upgraded to create. A reconstructed
edit must not become create or delete.
Add source identity checks before path checks:
function isSameSourceWindow(input: {
change: ReconstructedOpenCodeToolChange
windowMessageId: string
windowId: string
matchedWindowIds: string[]
}): boolean {
return (
input.change.sourceMessageId === input.windowMessageId &&
input.matchedWindowIds.length === 1 &&
input.matchedWindowIds[0] === input.windowId
)
}
The exact data shape can differ, but the check must prove message-local and single-window identity before using the snapshot file anchor.
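The path condition from the list above ("inside the session worktree") can be enforced lexically before any snapshot read. The following is a sketch only: normalizeTouchedRelativePath is a hypothetical helper, treating backslashes as separators is an assumption, and symlink resolution is deliberately out of scope.

```typescript
// Returns a normalized POSIX-style relative path, or null when the input is
// empty, absolute, or could escape the worktree through parent traversal.
function normalizeTouchedRelativePath(rawPath: string): string | null {
  // Assumption: backslashes are path separators, never part of a file name.
  const unified = rawPath.replace(/\\/g, '/')
  if (unified.length === 0 || unified.charAt(0) === '/' || /^[A-Za-z]:/.test(unified)) {
    return null
  }
  const segments: string[] = []
  for (const segment of unified.split('/')) {
    if (segment === '' || segment === '.') {
      continue
    }
    if (segment === '..') {
      // Fail closed: do not try to resolve parent traversal.
      return null
    }
    segments.push(segment)
  }
  return segments.length > 0 ? segments.join('/') : null
}
```

A null result would map to a skipped upgrade, and the normalized value is what should be compared against the snapshot anchor path.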
Phase 4 - Upgrade Single-Change Snapshot Proof
For one change touching a file within a snapshot window, upgrade directly if the snapshot anchor proves the full transition.
Rules:
- write create:
  - accept when beforeState.exists === false and afterContent is full text.
  - if toolpart content exists, require it to equal snapshot after content.
- write modify:
  - accept when both snapshot before and after are full text.
  - if toolpart content exists, require it to equal snapshot after content.
- edit modify:
  - accept when both snapshot before and after are full text.
  - require applying oldString -> newString to before to equal after, unless the edit came from a verified snapshot patch.
- apply_patch:
  - accept when snapshot before and after are full text.
  - if parsed hunks exist, verify before-to-after application or inverse chain.
- delete:
  - accept when snapshot before is full text and snapshot after is absent.
For phase 4, do not support "maybe" apply_patch upgrades without parsed hunks.
Keep those for phase 5 or leave them metadata-only. This reduces the first
behavior change to the most provable cases.
Example helper:
function applyEditExactlyOnce(input: {
before: string
oldString: string | undefined
newString: string | undefined
}): string | null {
if (
typeof input.oldString !== 'string' ||
typeof input.newString !== 'string' ||
input.oldString === input.newString
) {
return null
}
if (countOccurrences(input.before, input.oldString) !== 1) {
return null
}
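// Use a replacer function so `$&`, `$1`, and similar patterns in newString are inserted literally.
const replacement = input.newString
return input.before.replace(input.oldString, () => replacement)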
}
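The helper above depends on countOccurrences, which is not defined in this plan. One possible sketch follows; counting overlapping matches and returning zero for an empty needle are assumptions chosen so that ambiguous edits fail the exactly-once check.

```typescript
// Counts occurrences of needle in haystack, including overlapping ones.
// Overlaps are counted on purpose: 'aa' inside 'aaa' has two possible edit
// positions, so the edit is ambiguous and must not pass an "exactly once" check.
function countOccurrences(haystack: string, needle: string): number {
  if (needle.length === 0) {
    // An empty needle matches everywhere; report zero so callers reject it.
    return 0
  }
  let count = 0
  let index = haystack.indexOf(needle)
  while (index !== -1) {
    count += 1
    index = haystack.indexOf(needle, index + 1)
  }
  return count
}
```

With this definition, countOccurrences('aaa', 'aa') is 2, so an edit whose oldString is 'aa' would be rejected for that file.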
Example upgrade:
function upgradeEditFromSnapshot(input: {
change: ReconstructedOpenCodeToolChange
anchor: SnapshotFileAnchor
}): ReconstructedOpenCodeToolChange | null {
const before = input.anchor.beforeContent
const after = input.anchor.afterContent
if (typeof before !== 'string' || typeof after !== 'string') {
return null
}
const applied = applyEditExactlyOnce({
before,
oldString: input.change.oldString,
newString: input.change.newString,
})
if (applied !== after) {
return null
}
return {
...input.change,
beforeContent: before,
afterContent: after,
beforeState: contentStateForText(before),
afterState: contentStateForText(after),
confidence: 'high',
evidenceProof: 'opencode-snapshot',
snapshotId: input.anchor.snapshotId,
snapshotSource: input.anchor.snapshotId ? 'opencode' : undefined,
warnings: stripManualOnlyWarnings(input.change.warnings, input.anchor.warnings),
}
}
Add a generic transition verifier so write/edit/apply_patch decisions share the same state checks:
type VerifiedTransition =
| { ok: true; beforeContent: string | null; afterContent: string | null; operation: 'create' | 'modify' | 'delete' }
| { ok: false; reason: SnapshotUpgradeDiagnosticCode }
function verifySnapshotTransition(input: {
change: ReconstructedOpenCodeToolChange
anchor: SnapshotFileAnchor
}): VerifiedTransition {
const before = input.anchor.beforeContent
const after = input.anchor.afterContent
if (input.anchor.operation === 'create') {
return typeof after === 'string'
? { ok: true, beforeContent: null, afterContent: after, operation: 'create' }
: { ok: false, reason: 'snapshot-upgrade-skipped/no-file-anchor' }
}
if (input.anchor.operation === 'delete') {
return typeof before === 'string'
? { ok: true, beforeContent: before, afterContent: null, operation: 'delete' }
: { ok: false, reason: 'snapshot-upgrade-skipped/no-file-anchor' }
}
if (typeof before !== 'string' || typeof after !== 'string') {
return { ok: false, reason: 'snapshot-upgrade-skipped/no-file-anchor' }
}
return { ok: true, beforeContent: before, afterContent: after, operation: 'modify' }
}
This function should not be the final proof for edit or apply_patch. It only
proves that snapshot text exists for the operation state.
Before returning an upgraded change, verify the emitted states match the emitted content:
function assertStateMatchesContent(input: {
beforeContent: string | null
afterContent: string | null
beforeState: ReconstructedOpenCodeToolChange['beforeState']
afterState: ReconstructedOpenCodeToolChange['afterState']
}): boolean {
if (input.beforeContent !== null) {
const expected = contentStateForText(input.beforeContent)
if (input.beforeState?.sha256 !== expected.sha256) {
return false
}
}
if (input.afterContent !== null) {
const expected = contentStateForText(input.afterContent)
if (input.afterState?.sha256 !== expected.sha256) {
return false
}
}
return true
}
If this assertion fails, keep the original metadata-only change and emit a diagnostic. Do not import inconsistent state/content.
Phase 5 - Upgrade Multi-Change Same-Path Chains
When several changes touch the same file inside one snapshot window, only upgrade if the whole chain verifies.
Algorithm:
- Start from snapshot afterContent.
- Walk changes for that path in reverse source order.
- For each change:
  - if it already has full before/after, require its after to equal the cursor.
  - for edit, reverse newString -> oldString exactly once.
  - for apply_patch, reverse parsed hunks exactly once.
  - for write, only allow it as the first/oldest operation if snapshot before matches the previous state or known absent state.
- If any step is ambiguous, stop and keep all unresolved warnings.
- If the reverse chain reaches snapshot beforeContent, materialize replacements for every unresolved change in the chain.
Pseudo-code:
function upgradeSamePathChain(input: {
changes: ReconstructedOpenCodeToolChange[]
anchor: SnapshotFileAnchor
diagnostics: string[]
}): Map<string, ReconstructedOpenCodeToolChange> {
const replacements = new Map<string, ReconstructedOpenCodeToolChange>()
let cursor = input.anchor.afterContent
if (typeof cursor !== 'string') {
input.diagnostics.push('snapshot-upgrade-skipped/no-after-anchor')
return replacements
}
for (let index = input.changes.length - 1; index >= 0; index -= 1) {
const change = input.changes[index]
if (!change) {
continue
}
const upgraded = reverseOneChangeFromAfter({ change, after: cursor, anchor: input.anchor })
if (!upgraded) {
input.diagnostics.push(`snapshot-upgrade-skipped/path-chain-ambiguous:${change.relativePath}`)
return new Map()
}
replacements.set(change.sourceImportKey, upgraded.change)
cursor = upgraded.beforeContent
}
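// Fail closed: create must end at known absence; modify/delete must end at the exact snapshot before text.
const boundaryMatches = input.anchor.operation === 'create'
? cursor === null
: typeof input.anchor.beforeContent === 'string' && cursor === input.anchor.beforeContent
if (!boundaryMatches) {
input.diagnostics.push('snapshot-upgrade-skipped/path-chain-boundary-mismatch')
return new Map()
}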
return replacements
}
This is the highest-risk section. Keep tests dense here.
If there is any schedule pressure, defer this whole phase. Single-change upgrades are enough to reduce many warnings and are much less risky.
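For the edit step of the reverse walk, a round-trip check keeps the reversal honest. This is a sketch under assumptions: reverseEditExactlyOnce, replaceSingle, and countOverlapping are hypothetical helpers, and it covers only the edit case, not apply_patch or write.

```typescript
// Counts occurrences including overlapping ones; an empty needle counts as zero.
function countOverlapping(haystack: string, needle: string): number {
  if (needle.length === 0) {
    return 0
  }
  let count = 0
  let index = haystack.indexOf(needle)
  while (index !== -1) {
    count += 1
    index = haystack.indexOf(needle, index + 1)
  }
  return count
}

// Replaces search with replacement only when search occurs exactly once.
function replaceSingle(text: string, search: string, replacement: string): string | null {
  if (countOverlapping(text, search) !== 1) {
    return null
  }
  const at = text.indexOf(search)
  return text.slice(0, at) + replacement + text.slice(at + search.length)
}

// Reverses one edit from the known after text. Returns the before text, or
// null when the reversal is ambiguous or does not round-trip.
function reverseEditExactlyOnce(input: {
  after: string
  oldString: string
  newString: string
}): string | null {
  if (input.oldString === input.newString) {
    return null
  }
  const before = replaceSingle(input.after, input.newString, input.oldString)
  if (before === null) {
    return null
  }
  // Round-trip: the forward edit must also apply exactly once and reproduce after.
  return replaceSingle(before, input.oldString, input.newString) === input.after ? before : null
}
```

An edit that deletes text (empty newString) cannot be reversed this way because the position is unknown, so it returns null and the chain stays metadata-only.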
Additional multi-change restrictions:
- Do not cross snapshot-window boundaries.
- Do not cross assistant-message boundaries.
- Do not cross task delivery boundaries.
- Do not mix changes with different sourceMessageId.
- Do not mix changes with different normalized relativePath.
- Do not include changes whose source import key is missing or duplicated.
- Do not upgrade a chain if any change in the path has an operation that cannot be reversed from the current cursor.
- Do not upgrade if the final reverse cursor does not exactly equal snapshot beforeContent for modify/delete, or known absence for create.
Add this explicit guard:
function assertSinglePathWindowChain(input: {
changes: ReconstructedOpenCodeToolChange[]
}): boolean {
const relativePaths = new Set(input.changes.map(change => change.relativePath))
const messageIds = new Set(input.changes.map(change => change.sourceMessageId))
const importKeys = new Set(input.changes.map(change => change.sourceImportKey))
return (
relativePaths.size === 1 &&
messageIds.size === 1 &&
importKeys.size === input.changes.length
)
}
Phase 6 - Warning Stripping Must Be Conservative
Only remove warnings that are made false by the new proof.
Safe to remove after verified before/after:
- OpenCode edit was captured without a proven full-text baseline; apply/reject is manual-only.
- OpenCode write overwrote an existing file before the bridge had a known baseline; reject is manual-only.
- OpenCode apply_patch was captured without full before/after text; review is manual-only.
- OpenCode toolpart content was unavailable or too large; review is manual-only.
- full review depends on snapshot evidence
Do not remove:
- attribution warnings
- low confidence task boundary warnings
- delivery context warnings
- path outside session directory warnings
- large/binary warnings for other files
- warnings attached to unrelated changes in the same task
- snapshot unavailable warnings attached to the same file
- any warning whose text is not in the known resolved warning predicate
Example:
function isResolvedByFullTextProof(warning: string): boolean {
return (
warning === 'OpenCode edit was captured without a proven full-text baseline; apply/reject is manual-only.' ||
warning === 'OpenCode write overwrote an existing file before the bridge had a known baseline; reject is manual-only.' ||
warning === 'OpenCode apply_patch was captured without full before/after text; review is manual-only.' ||
warning === 'OpenCode toolpart content was unavailable or too large; review is manual-only.' ||
warning.includes('full review depends on snapshot evidence')
)
}
function stripManualOnlyWarnings(
existing: string[] | undefined,
snapshotWarnings: string[] | undefined,
): string[] {
return [
...(existing ?? []).filter(warning => !isResolvedByFullTextProof(warning)),
...(snapshotWarnings ?? []),
].filter(Boolean)
}
If snapshot warnings contain unavailable content for this exact file, the change should probably not have been upgraded. Add a test for that.
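A guard for that case could run before warnings are stripped. The sketch assumes snapshot warnings are available in a structured form; SnapshotFileWarning and snapshotWarningsBlockUpgrade are hypothetical, and the real reader may expose plain strings only, in which case the match rule would need to be defined differently.

```typescript
// Hypothetical structured form of a snapshot warning.
type SnapshotFileWarning = {
  relativePath: string
  kind: 'unavailable' | 'binary' | 'too-large' | 'other'
}

// True when the snapshot itself reports missing or unreadable content for this
// exact file. In that case the upgrade should be skipped, not just re-warned.
function snapshotWarningsBlockUpgrade(
  relativePath: string,
  warnings: ReadonlyArray<SnapshotFileWarning>,
): boolean {
  return warnings.some(
    warning => warning.relativePath === relativePath && warning.kind !== 'other',
  )
}
```

Warnings for other files in the same window do not block the upgrade; they are carried through unchanged.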
Phase 7 - Preserve Performance Limits And Add Budgets
Do not increase these limits by default:
- maxFiles: 100
- maxBytesPerTextFile: 1024 * 1024
- maxTotalBytes: 4 * 1024 * 1024
- timeoutMs: 3000
Additional guard:
const unresolved = params.changes.filter(needsSnapshotProof)
if (unresolved.length === 0) {
return result
}
const touchedRelativePaths = [...new Set(unresolved.map(change => change.relativePath))]
Do not pass already exact changes into touchedRelativePaths unless needed for
chain verification. This keeps snapshot reads narrow.
Add explicit performance budgets:
- A no-op backfill with no unresolved OpenCode changes should not invoke the snapshot reader.
- A strict-delivery task with one unresolved file should read one touched path.
- Snapshot proof attempt should record elapsed time in diagnostics when it exceeds 500 ms.
- More than two snapshot timeouts in a real-data smoke run blocks rollout.
- The broad real-data smoke should not increase total runtime by more than 10% compared with the baseline measured before the change.
Implementation sketch:
const startedAt = performance.now()
const snapshotResult = await readSnapshotEvidence()
const elapsedMs = performance.now() - startedAt
if (elapsedMs > 500) {
diagnostics.push(`snapshot-upgrade-slow: ${Math.round(elapsedMs)}ms`)
}
Use the local runtime timing primitive already used in the orchestrator if
performance.now() is not available in that module.
Add a resource envelope for one backfill call:
type SnapshotProofResourceEnvelope = {
maxSnapshotReadsPerBackfill: 10
maxTouchedPathsPerRead: 100
maxBytesPerTextFile: 1024 * 1024
maxTotalBytesPerRead: 4 * 1024 * 1024
maxElapsedMsPerRead: 3000
}
Do not add hidden retries that can multiply these limits. One failed or timed out snapshot read should produce diagnostics and preserve metadata-only changes.
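One way to make the envelope hard to bypass is to route every snapshot read through a counter that refuses once the budget is spent. This is a sketch only; createSnapshotReadBudget and ReadBudget are hypothetical names, and the limit field names are borrowed from the envelope above.

```typescript
type ReadBudget = {
  // Returns true and consumes one read when the request fits the envelope.
  tryAcquire(touchedPathCount: number): boolean
  used(): number
}

function createSnapshotReadBudget(limits: {
  maxSnapshotReadsPerBackfill: number
  maxTouchedPathsPerRead: number
}): ReadBudget {
  let readsUsed = 0
  return {
    tryAcquire(touchedPathCount: number): boolean {
      if (touchedPathCount <= 0 || touchedPathCount > limits.maxTouchedPathsPerRead) {
        return false // oversized or empty requests never consume budget
      }
      if (readsUsed >= limits.maxSnapshotReadsPerBackfill) {
        return false // budget exhausted: caller keeps metadata-only changes
      }
      readsUsed += 1
      return true
    },
    used: () => readsUsed,
  }
}
```

A retry would have to call tryAcquire again, so retries count against the same budget and cannot multiply the limits silently.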
Phase 8 - Idempotency And Existing Ledger Events
The upgrade may change the materialized content for a source event that was previously imported as metadata-only. That needs a clear policy.
Preferred policy:
- Keep sourceImportKey stable.
- Let the existing ledger importer treat the upgraded event as the same source event, not a new file change.
- If the importer is append-only and cannot update a previous event safely, do not attempt to rewrite old ledger data in this feature.
- For new tasks, the upgraded evidence should be imported on the first backfill.
- For old tasks, a re-backfill can show better evidence only if the existing ledger/import layer already supports replacing or superseding by source key.
Add a test for repeated backfill. It should not duplicate files in the review bundle.
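The repeated-backfill test needs a way to detect duplicates in the resulting bundle. A minimal sketch, assuming review bundle entries expose sourceImportKey; findDuplicateSourceImportKeys is a hypothetical test helper.

```typescript
// Returns the source import keys that appear more than once, sorted for stable output.
function findDuplicateSourceImportKeys(
  events: ReadonlyArray<{ sourceImportKey: string }>,
): string[] {
  const seen = new Set<string>()
  const duplicates = new Set<string>()
  for (const event of events) {
    if (seen.has(event.sourceImportKey)) {
      duplicates.add(event.sourceImportKey)
    }
    seen.add(event.sourceImportKey)
  }
  return Array.from(duplicates).sort()
}
```

The test would run backfill twice against the same fixture and assert that this helper returns an empty array for the final review bundle.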
Phase 9 - Desktop Contract Validation
This phase should not add new UI behavior unless tests expose a bug. It validates that the upgraded events are already consumed safely.
Checklist:
- Full-text upgraded OpenCode event renders through the same path as Codex full-text diffs.
- Metadata-only OpenCode event still renders the warning banner.
- Mixed full-text and metadata-only task keeps per-file rejectability.
- Reject All skips metadata-only files.
- Current disk preview remains read-only context.
- Task summary warnings remain visible if attribution or boundary warnings remain.
If any item fails, fix the shared review safety helper rather than adding a separate OpenCode-specific branch in the UI.
Observability And Metrics
Add counters to diagnostics or existing debug output. They should be cheap and safe to expose in test logs.
Suggested counters:
type SnapshotProofStats = {
attemptedChanges: number
upgradedChanges: number
skippedChanges: number
skippedByReason: Record<string, number>
snapshotReadCount: number
snapshotReadTimeouts: number
snapshotReadElapsedMs: number
touchedPathCount: number
exactToolpartChainCount: number
metadataOnlyFallbackCount: number
}
Use these stats in smoke output:
OpenCode snapshot proof:
- attempted: 12
- upgraded: 7
- skipped: 5
- skipped/no-window: 2
- skipped/path-chain-ambiguous: 1
- skipped/too-large: 2
- snapshot reads: 3
- snapshot read time: 184ms
Metrics must not include file content or secrets. Paths are acceptable only if the existing diagnostics already expose paths in the same context.
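Recording can stay a pure function over decisions so that it never touches file content. The sketch below uses a reduced stats shape; ProofStats, RecordedDecision, createProofStats, recordDecision, and formatProofStats are hypothetical names.

```typescript
// Reduced, hypothetical stats shape; only counters and reason codes, never content.
type ProofStats = {
  attemptedChanges: number
  upgradedChanges: number
  skippedChanges: number
  skippedByReason: Record<string, number>
}

type RecordedDecision = { type: 'upgraded' } | { type: 'skipped'; reason: string }

function createProofStats(): ProofStats {
  return { attemptedChanges: 0, upgradedChanges: 0, skippedChanges: 0, skippedByReason: {} }
}

function recordDecision(stats: ProofStats, decision: RecordedDecision): void {
  stats.attemptedChanges += 1
  if (decision.type === 'upgraded') {
    stats.upgradedChanges += 1
    return
  }
  stats.skippedChanges += 1
  stats.skippedByReason[decision.reason] = (stats.skippedByReason[decision.reason] ?? 0) + 1
}

// Formats counters for smoke output; reasons are sorted so output is deterministic.
function formatProofStats(stats: ProofStats): string[] {
  const lines = [
    'OpenCode snapshot proof:',
    `- attempted: ${stats.attemptedChanges}`,
    `- upgraded: ${stats.upgradedChanges}`,
    `- skipped: ${stats.skippedChanges}`,
  ]
  for (const reason of Object.keys(stats.skippedByReason).sort()) {
    lines.push(`- ${reason}: ${stats.skippedByReason[reason]}`)
  }
  return lines
}
```

A cheap consistency check for tests is that attemptedChanges always equals upgradedChanges plus skippedChanges.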
Deterministic Output Comparison
Use deterministic fingerprints to compare off, shadow, and apply modes.
This catches accidental behavior changes that are hard to see in UI screenshots.
Suggested fingerprint input:
type ReviewBundleFingerprintInput = Array<{
taskId: string
relativePath: string
sourceImportKey: string
evidenceProof: string | undefined
operation: string
beforeSha256?: string
afterSha256?: string
warningCount: number
rejectable: boolean
}>
Rules:
- off and shadow fingerprints must match except for diagnostics/stats.
- single-change may change OpenCode entries only.
- full may change OpenCode entries only.
- Non-OpenCode entries must have identical fingerprints in every mode.
- Fingerprints must not include raw file content.
If a mode comparison fails, inspect the structured diff before looking at UI.
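A fingerprint only helps if it is independent of entry order. The sketch below builds a canonical string instead of a hash to stay dependency-free; FingerprintEntry and fingerprintBundle are hypothetical names, and a real implementation might hash the resulting string.

```typescript
type FingerprintEntry = {
  taskId: string
  relativePath: string
  sourceImportKey: string
  evidenceProof: string | undefined
  operation: string
  beforeSha256?: string
  afterSha256?: string
  warningCount: number
  rejectable: boolean
}

function entrySortKey(entry: FingerprintEntry): string {
  return [entry.taskId, entry.relativePath, entry.sourceImportKey].join('\u0000')
}

// Produces a canonical string: entries are sorted and fields are emitted in a
// fixed order, so two bundles with the same entries always compare equal.
function fingerprintBundle(entries: ReadonlyArray<FingerprintEntry>): string {
  const sorted = entries.slice().sort((a, b) => {
    const keyA = entrySortKey(a)
    const keyB = entrySortKey(b)
    return keyA < keyB ? -1 : keyA > keyB ? 1 : 0
  })
  return JSON.stringify(
    sorted.map(entry => [
      entry.taskId,
      entry.relativePath,
      entry.sourceImportKey,
      entry.evidenceProof ?? null,
      entry.operation,
      entry.beforeSha256 ?? null,
      entry.afterSha256 ?? null,
      entry.warningCount,
      entry.rejectable,
    ]),
  )
}
```

Comparing off and shadow then reduces to a string equality check, and only content hashes, never raw content, enter the fingerprint.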
Cache And Re-Backfill Policy
The safest initial policy is:
- New backfills may import upgraded proof.
- Existing metadata-only events should not be rewritten unless the current importer already has a proven source-key replacement/supersede path.
- The desktop cache should not be globally invalidated.
- A task-specific refresh may re-read after successful OpenCode import.
- If cache behavior is unclear, tests should bypass cache and the rollout should leave old events unchanged.
Pseudo-policy:
type ExistingEventPolicy = 'new-imports-only' | 'supersede-by-source-key'
function chooseExistingEventPolicy(audit: {
importerSupersedesBySourceKey: boolean
reviewBundleDedupesBySourceKey: boolean
}): ExistingEventPolicy {
return audit.importerSupersedesBySourceKey && audit.reviewBundleDedupesBySourceKey
? 'supersede-by-source-key'
: 'new-imports-only'
}
Do not create a third policy that appends upgraded duplicates and relies on UI filtering to hide the old event.
Rollback Runbook
Rollback must be possible without data repair.
Immediate rollback:
OPENCODE_SNAPSHOT_PROOF_UPGRADE=off
Expected behavior after rollback:
- New OpenCode backfills return to previous metadata-only/manual-only behavior for cases without exact toolpart chains.
- Existing already-imported upgraded events remain valid historical full-text events. Do not delete them as part of rollback.
- No new upgraded events should be imported while the flag is off.
- Desktop review should continue to render previously imported full-text events.
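For rollback to be this simple, the flag parser itself should fail closed. The following is a minimal sketch, assuming the four mode names used in this plan and assuming that unset or unrecognized values fall back to `off`; the function name and the whitespace/case tolerance are illustrative choices, not existing behavior.

```typescript
type SnapshotProofUpgradeMode = 'off' | 'shadow' | 'single-change' | 'full'

const KNOWN_MODES: readonly string[] = ['off', 'shadow', 'single-change', 'full']

// Hypothetical parser: anything that is not an exact known mode becomes 'off',
// so a typo in the environment can never enable an apply mode.
function parseSnapshotProofUpgradeMode(raw: string | undefined): SnapshotProofUpgradeMode {
  // Assumption: surrounding whitespace and letter case are tolerated.
  const value = (raw ?? '').trim().toLowerCase()
  return KNOWN_MODES.includes(value) ? (value as SnapshotProofUpgradeMode) : 'off'
}
```

A caller would pass the raw value of `OPENCODE_SNAPSHOT_PROOF_UPGRADE` from the process environment and treat the result as the only source of truth for the mode.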
If rollback is needed because upgraded duplicates were imported:
- Do not add renderer-side filtering as a permanent fix.
- Identify whether duplicates share `sourceImportKey`.
- Fix importer/source-key dedupe.
- Add a regression test with the duplicated event fixture.
- Only then consider a one-off ledger cleanup, and only with explicit user approval.
If rollback is needed because of performance:
- Keep diagnostics.
- Disable proof upgrade.
- Preserve exact `toolpart-chain` behavior.
- Inspect snapshot read counters and touched path counts.
- Re-enable only after reducing reads, not after raising limits.
Implementation Slices
Prefer these slices even if the work lands in one PR. Each slice should compile and have focused tests before the next slice starts.
1. Diagnostics only:
   - Add typed diagnostic codes.
   - Add stats object.
   - No behavior change.
2. Eligibility only:
   - Add feature flag parser.
   - Add `needsSnapshotProof` and `mayUseSnapshotProof`.
   - Prove `off` mode has no behavior change.
3. Shadow proof:
   - Compute proof decisions and stats.
   - Return original changes to importer.
   - Compare `shadow` and `off` outputs.
4. Single-change proof:
   - Implement formal predicates.
   - Implement create/modify/delete proof for one path/window/change.
   - Keep multi-change groups skipped.
5. Import/idempotency validation:
   - Verify source-key dedupe or choose `new-imports-only`.
   - Add repeated-backfill tests.
6. Desktop validation:
   - Verify shared rejectability consumes upgraded events safely.
   - No OpenCode-specific renderer branch unless a shared helper bug is found.
7. Multi-change proof:
   - Implement only after ordering contract tests pass.
   - Keep behind `full`.
8. Default enablement:
   - Enable `single-change` only after real-data smoke.
   - Enable `full` only in a separate rollout decision.
Stop points:
- It is acceptable to stop after slice 3 and ship only `shadow`.
- It is acceptable to stop after slice 4 and ship only `single-change`.
- It is acceptable to stop after diagnostics if real data shows unsupported snapshot shape.
- It is not acceptable to ship multi-change proof without real or synthetic chain coverage.
Definition Of Done By Mode
off
- No behavior change from current metadata-only/full-text decisions.
- Diagnostics may mention that the feature is disabled.
- Tests prove no upgraded event appears in this mode.
shadow
Required before any apply mode can be default:
- Proof attempts run for eligible OpenCode changes.
- Importer receives the original change list.
- Stats include would-upgrade and skipped counts.
- No review diff, rejectability, warning, or file count changes.
- Real-data smoke shows non-OpenCode teams unchanged.
- Performance budget passes while proof is computed but not applied.
single-change
Required before this mode can be default:
- Only one unresolved change for a path/window can upgrade.
- Multi-change path/window groups are skipped with diagnostics.
- `write` create/modify, `edit` modify, and delete cases have positive and negative tests.
- Non-OpenCode teams are unchanged in real-data smoke.
- Metadata-only count for OpenCode tasks decreases or stays equal.
- No duplicate review rows after repeated backfill.
full
Required before this mode can be default:
- Every known unknown that blocks full mode is resolved.
- Same-path multi-change order is proven by tests.
- Chain upgrades are all-or-nothing for unresolved changes.
- Real-data smoke includes at least one actual multi-change chain or a synthetic fixture with equivalent shape.
- `full` mode can be disabled without changing code.
- A separate rollout decision enables `full`; it must not become default as a side effect of implementing single-change mode.
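The per-mode rules above can be condensed into a single gate. This is a sketch under stated assumptions: `ProofGroup` and its fields are hypothetical names for "the unresolved changes that share one path/window", and `mayApplySnapshotProof` is the predicate name this plan uses in its test list, not a confirmed signature.

```typescript
type SnapshotProofUpgradeMode = 'off' | 'shadow' | 'single-change' | 'full'

// Hypothetical shape: one group of unresolved changes for a path/window.
type ProofGroup = {
  unresolvedChangeCount: number
  // True only when every step in the group verified (all-or-nothing).
  verified: boolean
}

function mayApplySnapshotProof(mode: SnapshotProofUpgradeMode, group: ProofGroup): boolean {
  switch (mode) {
    case 'off':
    case 'shadow':
      // Shadow may compute decisions and stats, but never applies them.
      return false
    case 'single-change':
      return group.verified && group.unresolvedChangeCount === 1
    case 'full':
      return group.verified && group.unresolvedChangeCount >= 1
    default: {
      // Compile-time exhaustiveness check: adding a mode breaks this line.
      const unhandled: never = mode
      throw new Error(`Unhandled mode: ${String(unhandled)}`)
    }
  }
}
```

Keeping the `never` assignment in the default branch means a future fifth mode cannot be added without the compiler pointing at this gate.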
Edge Case Matrix
Attribution and Task Boundaries
- No delivery context:
- Do not run strict snapshot upgrade.
- Keep existing backfill skipped behavior.
- Delivery context exists but does not include the requested task:
- Keep `no-attribution` behavior. Do not use compatible fallback for safe full-text.
- Compatible attribution mode:
- Do not upgrade to auto-safe full text.
- Reason: task ownership is not strict enough.
- Missing task start boundary:
- Snapshot proof may prove file content, but task boundary warning remains.
- Estimated end boundary:
- Snapshot proof may prove file content, but boundary warning remains.
- Same OpenCode session contains several tasks:
- Only strict delivery records for the requested task are eligible.
- Same member touches same file for two tasks:
- Do not merge changes across delivery windows.
- Multiple members share an OpenCode profile:
- Require the delivery record member/lane/session match. Do not trust profile alone.
- Runtime delivery ledger was reset after launch:
- No strict delivery context means no safe upgrade. Keep warnings.
- Delivery record has task refs but missing observed assistant message:
- Do not use message-local snapshot proof unless the toolpart can still be tied to the delivered prompt through existing strict delivery matching.
- Delivery record has a pre-prompt cursor but no post-prompt cursor:
- Keep strict-delivery matching conservative. Do not widen to the whole session.
- Task display id matches but canonical task id differs:
- Use canonical task id for safe upgrades.
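The attribution rules above amount to a strict eligibility predicate. The sketch below assumes simplified record shapes: `DeliveryRecord`, `ProofCandidate`, the `'strict' | 'compatible'` mode values, and the reason strings other than `no-attribution` are hypothetical and stand in for whatever the existing delivery-context types expose.

```typescript
// Hypothetical, simplified shapes for illustration only.
type DeliveryRecord = {
  canonicalTaskId: string
  memberName: string
  laneId?: string
  sessionId: string
}

type ProofCandidate = DeliveryRecord & { attributionMode: 'strict' | 'compatible' }

type Eligibility = { eligible: true } | { eligible: false; reason: string }

function mayUseSnapshotProof(
  candidate: ProofCandidate,
  deliveries: DeliveryRecord[] | undefined,
): Eligibility {
  // No delivery context (or a reset ledger): never upgrade.
  if (deliveries === undefined || deliveries.length === 0) {
    return { eligible: false, reason: 'no-delivery-context' }
  }
  // Compatible attribution is not strict enough for safe full text.
  if (candidate.attributionMode !== 'strict') {
    return { eligible: false, reason: 'compatible-attribution' }
  }
  // Match on canonical task id plus member/lane/session, never on profile
  // or display id alone.
  const matched = deliveries.some(
    (d) =>
      d.canonicalTaskId === candidate.canonicalTaskId &&
      d.memberName === candidate.memberName &&
      d.laneId === candidate.laneId &&
      d.sessionId === candidate.sessionId,
  )
  return matched ? { eligible: true } : { eligible: false, reason: 'no-attribution' }
}
```

Task-boundary warnings are deliberately outside this predicate: a candidate can be eligible for content proof while its boundary warning stays in place.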
Snapshot Windows
- No snapshot windows:
- Keep metadata-only warning.
- Toolpart outside window:
- Keep metadata-only warning.
- Toolpart matches multiple windows:
- Keep metadata-only warning.
- Window has before hash but no after hash:
- Keep metadata-only warning.
- Window has after hash but no before hash:
- Allow create only if file absence is explicitly proven. Otherwise keep warning.
- Snapshot diff contains the path but operation is unknown:
- Keep metadata-only warning.
- Snapshot diff includes more changed files than reconstructed toolparts:
- Upgrade only exact reconstructed paths. Add diagnostic for extra snapshot paths.
- Snapshot diff misses a reconstructed path:
- Keep that path metadata-only.
- OpenCode SQLite changed during read:
- Existing transaction snapshot is okay, but add diagnostic.
- OpenCode schema changed:
- Treat as unsupported history shape, no upgrade.
- Snapshot git store object is missing:
- Keep metadata-only warning and include the store diagnostic.
- Snapshot git store read times out:
- Keep metadata-only warning and include timeout diagnostic.
- Snapshot window hashes exist but git-store object is pruned:
- Keep metadata-only warning and include retention diagnostic.
- Snapshot window hashes exist but point to an object from a different project:
- Treat as workspace mismatch and skip.
- Snapshot window contains no reconstructed changes after path filtering:
- Do not read files for that window.
- Snapshot reader returns duplicate entries for one relative path:
- Treat as ambiguous and skip that path.
- Snapshot reader returns content for a path with different casing:
- Use existing normalized comparison key. If identity is ambiguous, skip.
- Snapshot window is valid but OpenCode part JSON was truncated by our reader cap:
- Treat affected toolparts as metadata-only. Do not combine partial part data with snapshot proof.
- Snapshot window contains changes from a tool type not modeled by this plan:
- Keep those changes metadata-only until the tool type has explicit tests.
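The window-matching cases above share one rule: a toolpart is usable only when exactly one window in the same assistant message contains it. A minimal sketch, assuming the field names from the synthetic fixture schema later in this plan, inclusive part-order bounds, and hypothetical reason codes (only `no-window` appears in the smoke output above):

```typescript
type SnapshotWindow = {
  messageId: string
  windowId: string
  fromSnapshot?: string
  toSnapshot?: string
  startPartOrder: number
  finishPartOrder: number
}

type ToolpartRef = { messageId: string; order: number }

type WindowDecision =
  | { kind: 'matched'; window: SnapshotWindow }
  | { kind: 'skipped'; reason: 'no-window' | 'window-ambiguous' | 'window-hash-missing' }

function selectWindowForPart(part: ToolpartRef, windows: SnapshotWindow[]): WindowDecision {
  // Never match across assistant messages. Bounds are assumed inclusive.
  const candidates = windows.filter(
    (w) =>
      w.messageId === part.messageId &&
      w.startPartOrder <= part.order &&
      part.order <= w.finishPartOrder,
  )
  const only = candidates[0]
  if (only === undefined) return { kind: 'skipped', reason: 'no-window' }
  if (candidates.length > 1) return { kind: 'skipped', reason: 'window-ambiguous' }
  // This sketch requires both hashes; the "create with proven absence" case
  // is intentionally not modeled here.
  if (!only.fromSnapshot || !only.toSnapshot) {
    return { kind: 'skipped', reason: 'window-hash-missing' }
  }
  return { kind: 'matched', window: only }
}
```

Every skipped branch keeps the metadata-only warning, so the function can only ever narrow the set of upgrade candidates.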
File Content
- Text over size limit:
- Keep warning with `too-large`.
- Binary or null-byte:
- Keep warning with `binary`.
- Empty file:
- Valid text content. Do not confuse empty string with unavailable.
- Missing file after delete:
- Valid delete if before content is known.
- Missing file before create:
- Valid create if after content is known.
- File exists before create operation:
- Operation mismatch, no upgrade.
- File absent after modify operation:
- Operation mismatch, no upgrade.
- Content normalizes differently by line endings:
- Do not normalize for proof. Exact byte-equivalent UTF-8 text comparison is required.
- Content has invalid UTF-8:
- Treat as binary/unavailable.
- Generated/minified text below limit:
- It can be upgraded if full text is available, but review UI may still choose to collapse display.
- File mode-only changes:
- Do not create a text diff upgrade unless text before/after also changed or mode changes are explicitly modeled.
- Very small binary file:
- Size does not make it text. Binary detection still wins.
- UTF-16 or other non-UTF-8 text:
- Treat as unavailable unless the existing snapshot reader explicitly decodes and hashes the exact same text representation used by review events.
- Secrets in file content:
- Do not log content in diagnostics. Existing ledger storage rules apply to before/after blobs.
- Git LFS pointer file:
- Treat the pointer text as the file content if that is what the snapshot contains. Do not dereference LFS objects.
- Sparse checkout missing working-tree file:
- Irrelevant for proof. Snapshot evidence may still be valid, but execution safety must handle current disk conflict separately.
- Submodule path:
- Do not read inside submodule git data unless existing snapshot reader explicitly models submodules. Treat as metadata-only otherwise.
- Permission denied reading snapshot object:
- Keep metadata-only warning and include a permission diagnostic.
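Several of the content rules above (size limit, null bytes, invalid UTF-8, empty file) can be decided from the raw snapshot bytes alone. The following sketch assumes a hypothetical `classifySnapshotBytes` helper and a placeholder byte limit; the real limit is whatever the existing snapshot reader enforces and must not be raised here.

```typescript
type SnapshotText =
  | { kind: 'text'; text: string }
  | { kind: 'unavailable'; reason: 'too-large' | 'binary' }

// Placeholder value for illustration; not the project's actual limit.
const ASSUMED_MAX_TEXT_BYTES = 1024 * 1024

function classifySnapshotBytes(
  bytes: Uint8Array,
  maxBytes: number = ASSUMED_MAX_TEXT_BYTES,
): SnapshotText {
  if (bytes.length > maxBytes) return { kind: 'unavailable', reason: 'too-large' }
  // A null byte marks the file as binary regardless of its size.
  if (bytes.includes(0)) return { kind: 'unavailable', reason: 'binary' }
  try {
    // fatal: throw on invalid UTF-8 instead of inserting replacement characters.
    // ignoreBOM: keep a leading BOM in the text rather than silently dropping it.
    const text = new TextDecoder('utf-8', { fatal: true, ignoreBOM: true }).decode(bytes)
    // An empty file decodes to '' and is valid text, not "unavailable".
    return { kind: 'text', text }
  } catch {
    return { kind: 'unavailable', reason: 'binary' }
  }
}
```

No line-ending normalization happens at any point, so CRLF and LF variants of the same file remain distinct strings.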
Edit Semantics
- `oldString === newString`:
- Skip as a no-op, as the current code does.
- `oldString` missing:
- Cannot prove edit from toolpart alone.
- `newString` missing:
- Cannot prove edit from toolpart alone.
- `oldString` appears twice:
- No upgrade unless snapshot chain proves exact transition through another trusted source.
- `newString` appears twice when reversing:
- No inverse upgrade.
- Replacement creates same final content through multiple possible paths:
- No upgrade.
- `replaceAll` or multi-replacement edit shape appears:
- Do not upgrade until that tool shape is explicitly parsed and tested.
- Edit tool reports success but snapshot before does not contain `oldString`:
- No upgrade.
- Edit tool reports success but snapshot after does not contain `newString`:
- No upgrade unless the replacement legitimately deletes the string and the exact transition verifies.
- Empty `oldString`:
- Do not upgrade. Empty search strings are ambiguous.
- Empty `newString`:
- Valid deletion only when `oldString` occurs exactly once and snapshot after equals the deletion result.
- Overlapping replacements:
- Do not upgrade unless exact before-to-after application has one valid path.
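The single-edit rules above can be checked with a small pure function. This is a sketch, not the enricher's actual code: `verifyEditTransition` and its reason strings are hypothetical, and it covers only the single-replacement shape (no `replaceAll`, no chains).

```typescript
type EditProof = { ok: true } | { ok: false; reason: string }

function verifyEditTransition(
  before: string,
  oldString: string,
  newString: string,
  after: string,
): EditProof {
  // Empty search strings match everywhere and are always ambiguous.
  if (oldString === '') return { ok: false, reason: 'empty-old-string' }
  if (oldString === newString) return { ok: false, reason: 'no-op' }
  const first = before.indexOf(oldString)
  if (first === -1) return { ok: false, reason: 'old-string-missing' }
  // A different last index means a second (possibly overlapping) occurrence.
  if (before.lastIndexOf(oldString) !== first) {
    return { ok: false, reason: 'old-string-ambiguous' }
  }
  // Splice manually: String.prototype.replace would interpret `$` patterns
  // inside newString.
  const applied = before.slice(0, first) + newString + before.slice(first + oldString.length)
  // Exact comparison, no line-ending normalization.
  return applied === after ? { ok: true } : { ok: false, reason: 'after-mismatch' }
}
```

An empty `newString` needs no special branch: it is accepted only when `oldString` occurs exactly once and the spliced result equals the snapshot after text.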
Write Semantics
- Write creates new file:
- Upgrade only if before absent and after content known.
- Write overwrites existing file:
- Upgrade only if snapshot before and snapshot after are known.
- Toolpart content differs from snapshot after:
- No upgrade.
- Toolpart content is truncated:
- Snapshot can still prove after only if snapshot after is available and operation is fully verified.
- Keep a diagnostic that toolpart content was truncated but snapshot proof was used.
- Existing file baseline unavailable:
- Keep current warning.
- Write after earlier edit in the same path/window:
- Treat as chain case. Do not single-change upgrade.
- Write followed by edit in the same path/window:
- Treat as chain case. Single-change upgrade is not enough.
- Write content is available but snapshot after is unavailable:
- Do not use toolpart content alone for existing-file overwrite baseline.
- Write content equals previous content:
- It may be a no-op. Do not create a misleading modify diff unless snapshot shows a real text state transition or the review event model supports no-op changes explicitly.
- Write creates parent directories:
- Only the file text is in scope. Directory creation is not a text proof.
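For a single `write` in a path/window, the rules above reduce to checks on the two snapshot sides plus an optional cross-check against the toolpart content. The sketch below uses hypothetical names (`Side`, `verifyWriteTransition`, the reason strings) and treats a truncated or missing toolpart content as `undefined`; chain cases are out of scope.

```typescript
type Side =
  | { kind: 'absent' }
  | { kind: 'text'; text: string }
  | { kind: 'unavailable' }

type WriteProof = { ok: true } | { ok: false; reason: string }

function verifyWriteTransition(input: {
  operation: 'create' | 'modify'
  before: Side
  after: Side
  // undefined when the toolpart content is missing or was truncated.
  toolpartContent?: string
}): WriteProof {
  const { operation, before, after, toolpartContent } = input
  // Toolpart content alone never substitutes for the snapshot after state.
  if (after.kind !== 'text') return { ok: false, reason: 'after-unavailable' }
  if (before.kind === 'unavailable') return { ok: false, reason: 'before-unavailable' }
  if (operation === 'create' && before.kind !== 'absent') {
    return { ok: false, reason: 'operation-mismatch' }
  }
  if (operation === 'modify') {
    if (before.kind === 'absent') return { ok: false, reason: 'operation-mismatch' }
    // Identical text would produce a misleading empty modify diff.
    if (before.text === after.text) return { ok: false, reason: 'no-op' }
  }
  if (toolpartContent !== undefined && toolpartContent !== after.text) {
    return { ok: false, reason: 'toolpart-content-mismatch' }
  }
  return { ok: true }
}
```

When `toolpartContent` is `undefined` because of truncation, a caller would still be expected to emit the "truncated but snapshot proof was used" diagnostic described above.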
Apply Patch Semantics
- Patch text unavailable:
- Snapshot can prove final file-level before/after only if window has exact path anchor.
- Parsed update hunks apply exactly once:
- Allow inverse chain proof.
- Parsed hunks apply multiple places:
- No upgrade.
- Patch creates/deletes file:
- Verify operation with snapshot before/after states.
- Patch touches files not in metadata:
- Add diagnostic and do not upgrade missing paths unless snapshot path proof is exact.
- Patch contains rename:
- Do not upgrade as text modify unless rename support is explicitly modeled.
- Patch changes file mode only:
- Keep metadata-only unless mode changes are supported by the review event schema.
- Patch contains CRLF-sensitive context:
- Exact text verification is required. Do not line-ending-normalize.
- Patch partially applies in reverse:
- No upgrade. All hunks must verify.
- Patch has context-only hunks:
- Do not treat context as a change without before/after text proof.
- Patch deletes and recreates the same file in one patch:
- Treat as ambiguous unless the parser explicitly models it and tests cover it.
Paths and Workspaces
- Absolute paths outside workspace:
- Reject upgrade.
- `..` path traversal:
- Reject upgrade.
- Windows path separators:
- Normalize, then validate.
- Symlink points outside workspace:
- Do not read current disk. Snapshot git store path normalization should be trusted only for repository paths.
- Session directory is subdirectory:
- Touched paths outside session directory get diagnostic. Do not let this alone prove or disprove content.
- Case-insensitive filesystem:
- Use existing OpenCode path comparison helpers.
- Unicode normalization differences in file names:
- Use existing normalized path keys. Do not add a second normalization scheme in this feature.
- Nested git repository inside workspace:
- Verify snapshot identity against the OpenCode project worktree, not just the process cwd.
- Worktree moved after task:
- Use recorded project identity. If workspace identity cannot be trusted, no upgrade.
- Workspace root is a symlink:
- Use existing workspace comparison helpers. Do not add ad-hoc `realpath` behavior unless tests cover both symlinked and non-symlinked roots.
- File path contains newline or control characters:
- Do not include raw path in diagnostics without escaping. Upgrade only if existing path normalization accepts it safely.
- Case-only rename:
- Treat as rename/path operation, not a text modify, unless the review event schema explicitly models it.
- Path appears both as file and directory across before/after:
- Keep metadata-only unless snapshot reader explicitly models the transition.
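For the rule about newlines and control characters in paths, escaping can be kept local to diagnostics so that path identity and normalization stay with the existing helpers. A minimal sketch, assuming a hypothetical `escapePathForDiagnostics` helper; doubling backslashes is an added choice so that escaped output stays unambiguous.

```typescript
// Hypothetical diagnostics-only helper. It does not normalize or validate
// the path; it only makes control characters visible and harmless in logs.
function escapePathForDiagnostics(path: string): string {
  return path.replace(/[\u0000-\u001f\u007f\\]/g, (ch) =>
    ch === '\\'
      ? '\\\\'
      : `\\u${ch.charCodeAt(0).toString(16).padStart(4, '0')}`,
  )
}
```

Because the function is only applied when a diagnostic string is built, it does not introduce a second normalization scheme for path comparison.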
Concurrency and Later Changes
- Current disk changed after task:
- Irrelevant. Do not use current disk for proof.
- Another member changed same file after OpenCode task:
- Snapshot proof remains historical. Review conflict detection must happen elsewhere.
- Backfill runs twice:
- Source import keys must dedupe.
- Backfill interrupted:
- Existing ledger import must remain idempotent.
- OpenCode host is still writing SQLite:
- Rely on read-only transaction and existing fingerprint diagnostics. Do not retry aggressively.
- Two backfills run concurrently:
- Existing in-flight dedupe should prevent duplicate desktop calls. The importer must still dedupe by source key.
- User manually edits a file while review is open:
- Snapshot proof remains historical. Apply/reject conflict handling is outside this feature.
- Team is relaunched while backfill is running:
- Use run/session identity from the delivery context. Do not merge new runtime sessions into the old task proof.
- Snapshot proof succeeds but ledger import fails:
- Retry should be idempotent by source key. Diagnostics should not mark the task as safely upgraded until import succeeds.
- Snapshot store is pruned between capability check and file read:
- Treat the read failure as metadata-only fallback. Do not retry from current disk.
- OpenCode writes a new assistant message while backfill reads SQLite:
- Use the read-only transaction snapshot and existing fingerprint diagnostics. Do not merge later rows into the current proof attempt.
- SQLite WAL is corrupt or cannot be read:
- Treat session history as unavailable/unsupported. Do not use partial rows for safe proof.
- OpenCode JSON row is malformed:
- Skip that row and keep affected changes metadata-only. Do not infer from surrounding rows.
UI And Review Semantics
- Full-text upgrade enables normal diff rendering only if the imported event has both safe baseline and safe target state.
- Metadata-only warnings should remain visible and should not be hidden by task summary aggregation.
- `Reject All` must still skip non-rejectable files.
- A task-level warning may remain even when all file diffs are full-text.
- A file-level warning may remain even when another file in the same task is upgraded.
- Do not change viewed-count behavior in this feature.
- Do not hide task cards solely because all OpenCode warnings were resolved.
- Do not change accept/reject button labels or statuses in this feature.
- Do not mark a file viewed just because snapshot proof succeeded.
- A file becoming review-safe does not mean reject execution must succeed if the user changed the worktree after the task.
- Conflict messaging for apply/reject should remain the existing shared behavior, not a new OpenCode-only message path.
- If a task card warning disappears because all file-level OpenCode baseline warnings were resolved, task-boundary and attribution warnings must still remain visible.
- The UI should not describe a snapshot-upgraded file as "guaranteed safe". It is "review-safe" or "full-text verified"; execution can still conflict.
- Do not add success toasts or celebratory messaging for proof upgrades. This is infrastructure, not a user-facing achievement.
Security And Privacy
- Do not log before/after content.
- Do not include long snippets in diagnostics.
- Do not include raw paths with control characters in diagnostics.
- Do not include delivery payload text in proof stats.
- Do not expand file size limits for convenience.
- Do not add a new IPC path that exposes arbitrary snapshot reads.
- If a file is upgraded, it is stored through the existing task-change ledger content path. Do not add a second storage location.
Serialization And Backward Compatibility
- Older desktop builds may see new diagnostics but should not require a new event schema to render metadata-only fallback.
- Missing `snapshotSource` should not crash review rendering.
- Missing `snapshotId` should not crash review rendering.
- Unknown `evidenceProof` values should be treated as unsafe by review safety helpers.
- JSON serialization must preserve empty string content.
- JSON serialization must distinguish absent file from empty file.
- Large content omitted by limits must serialize as unavailable state, not empty content.
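The three states that must survive serialization (absent file, empty file, unavailable content) are easiest to keep apart with an explicit discriminant. The sketch below is illustrative: `FileSideState` and both helper names are hypothetical and do not describe the existing event schema, which carries `beforeState`/`afterState` objects alongside nullable content.

```typescript
type FileSideState =
  | { kind: 'absent' }
  | { kind: 'text'; content: string }
  | { kind: 'unavailable'; reason: string }

function serializeSide(state: FileSideState): string {
  return JSON.stringify(state)
}

function parseSide(json: string): FileSideState {
  const raw: unknown = JSON.parse(json)
  if (typeof raw === 'object' && raw !== null) {
    const r = raw as Record<string, unknown>
    if (r.kind === 'absent') return { kind: 'absent' }
    // typeof check, not a truthiness check: '' must stay valid content.
    if (r.kind === 'text' && typeof r.content === 'string') {
      return { kind: 'text', content: r.content }
    }
    if (r.kind === 'unavailable' && typeof r.reason === 'string') {
      return { kind: 'unavailable', reason: r.reason }
    }
  }
  // Unknown or malformed shapes are treated as unavailable, never as empty.
  return { kind: 'unavailable', reason: 'unrecognized-state' }
}
```

The same fail-closed default applies to anything the parser does not recognize, which matches the rule that unknown values are unsafe rather than empty.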
Test Plan
Risk To Test Traceability
| Risk | Required test/smoke |
|---|---|
| Cross-task contamination | real-data smoke with at least two tasks in one OpenCode session |
| Cross-member contamination | fixture with shared profile but different member/lane |
| Wrong snapshot window | unit test with overlapping windows and outside-window toolpart |
| False baseline from current disk | unit test proving current disk is never consulted |
| Unsafe warning removal | unit test with unrelated manual-only warning preserved |
| Duplicate imported events | repeated-backfill bridge test |
| Performance regression | smoke budget with snapshot read counters |
| Unsupported OpenCode shape | snapshot provider unsupported-shape test |
| Mixed safe/unsafe task | desktop integration test for Reject All skipping metadata-only |
| Cache stale result | bridge or desktop worker test bypassing/invalidating cache deliberately |
| Capability false positive | fixture with snapshot enabled but missing store object |
| Shadow mode mutation | fingerprint comparison between off and shadow |
| Snapshot retention loss | fixture where window exists but object read fails |
| Execution conflict bypass | desktop/review test where current disk differs from expected after |
| Memory/storage blowup | fixture with many small files exceeding total byte budget |
| Malformed OpenCode rows | offline reader/reconstructor fixture with malformed part JSON |
Negative Control Fixtures
Negative controls are cases that look close to valid proof but must not upgrade.
Required negative controls:
- Same file path, same member, but different task id.
- Same task id, same file path, but different member/lane.
- Same session and file path, but toolpart outside the snapshot window.
- Snapshot before/after text exists, but `oldString` occurs twice.
- Snapshot after equals toolpart content, but before is unavailable.
- Snapshot path matches, but operation is rename or mode-only.
- Current disk matches expected after, but snapshot before is missing.
- `shadow` computes an upgrade decision, but imported fingerprint matches `off`.
- Existing metadata-only event appears before upgraded event with same source key.
- Unknown `evidenceProof` appears in imported data.
Each negative control should assert both behavior and diagnostic reason. A negative control without a reason is hard to debug and easy to regress.
Golden Fixture Coverage Matrix
Maintain a small set of golden fixtures that cover the supported state space.
| Fixture | Mode | Expected |
|---|---|---|
| write-create-text | `single-change` | upgraded create |
| write-modify-text | `single-change` | upgraded modify |
| edit-modify-once | `single-change` | upgraded modify |
| delete-text | `single-change` | upgraded delete |
| duplicate-old-string | `single-change` | metadata-only |
| missing-before | `single-change` | metadata-only |
| toolpart-outside-window | `single-change` | metadata-only |
| shadow-valid-edit | `shadow` | would-upgrade stats, original import |
| non-opencode-task | all modes | unchanged fingerprint |
| missing-snapshot-object | all apply modes | metadata-only |
| multi-change-chain | `single-change` | skipped |
| multi-change-chain | `full` | upgraded only if chain verifies |
Golden fixtures should be tiny and deterministic. They should not depend on wall-clock time, filesystem case behavior, or the user's current worktree.
Unit Tests
Add or extend OpenCodeChangeEvidenceEnricher.test.ts.
Tests:
- Upgrades metadata-only edit from exact snapshot before/after.
- Does not upgrade edit when `oldString` appears twice.
- Does not upgrade edit when snapshot after does not equal applied result.
- Upgrades write create with before absent and after text.
- Upgrades write modify when toolpart content equals snapshot after.
- Does not upgrade write modify when toolpart content differs from snapshot after.
- Upgrades delete with before text and after absent.
- Does not remove attribution warnings after content proof.
- Keeps manual-only warning when anchor has unavailable before/after content.
- Multi-edit same-path chain upgrades only when the whole chain verifies.
- Multi-edit same-path chain keeps all metadata-only fallbacks when one step is ambiguous.
- Snapshot provider unavailable keeps current behavior.
- Does not remove unrelated `manual-only` warning text.
- Keeps task boundary warnings after successful content proof.
- Does not upgrade compatible attribution mode.
- Does not upgrade when feature flag is `off`.
- Single-change mode skips multi-change chain upgrade.
- Empty file create and empty file modify are handled as valid text.
- CRLF/LF mismatch fails proof instead of normalizing.
- Duplicate source import keys block chain upgrade.
- Empty `newString` deletion upgrades only with exact single occurrence.
- Empty `oldString` never upgrades.
- State hashes must match emitted full text.
- Snapshot anchor duplicate path entry skips upgrade.
- `write` no-op does not create a misleading diff.
- Skipped proof preserves the original change object fields.
- Successful proof mutates only allowed fields.
- Snapshot proof decision returns typed skipped reason, not `null`.
- Unsupported snapshot shape never upgrades.
- Existing-event policy defaults to `new-imports-only` when dedupe is unknown.
- State machine cannot jump from candidate to upgraded without transition verification.
- Unstable part order blocks multi-change upgrade.
- Unknown `evidenceProof` is unsafe in review safety helper.
- Empty string survives materialization and serialization.
- Absent file is not serialized as empty file.
- `shadow` mode computes proof stats but returns original changes.
- `mayApplySnapshotProof` blocks multi-change groups in `single-change`.
- Exhaustive switches fail compilation when a new mode/proof decision is not handled.
- Capability success is required before snapshot proof attempt.
- Missing snapshot git-store object keeps metadata-only fallback.
- `off` and `shadow` review bundle fingerprints match.
- Non-OpenCode fingerprints are identical across all modes.
- Review-safe upgraded change still fails reject execution when current disk mismatches expected after.
- Total byte budget skips excess files as metadata-only.
- Malformed OpenCode part JSON cannot produce upgraded proof.
- LFS pointer text is not dereferenced.
- Submodule paths stay metadata-only unless explicitly modeled.
- Runtime postcondition failure preserves original metadata-only change.
- Every golden fixture has a paired negative control.
- Minimum safe scope excludes unsupported operation shapes.
Example fixture shape:
const change: ReconstructedOpenCodeToolChange = {
taskId: 'task-1',
taskRef: 'task-1',
taskRefKind: 'canonical',
teamName: 'team',
memberName: 'alice',
sessionId: 'session',
assistantMessageId: 'message-1',
toolUseId: 'tool-1',
sourcePartId: 'part-1',
sourceMessageId: 'message-1',
sourceTool: 'edit',
sourceImportKey: 'session:part-1:src/app.ts',
filePath: '/workspace/src/app.ts',
relativePath: 'src/app.ts',
beforeContent: null,
afterContent: null,
operation: 'modify',
confidence: 'medium',
attributionMethod: 'delivery-ledger-taskrefs',
oldString: 'const value = 1',
newString: 'const value = 2',
beforeState: { exists: true, unavailableReason: 'opencode-edit-baseline-not-captured' },
afterState: { exists: true, unavailableReason: 'opencode-edit-final-content-unavailable' },
evidenceProof: 'metadata-only-fallback',
warnings: ['OpenCode edit was captured without a proven full-text baseline; apply/reject is manual-only.'],
timestamp: new Date(0).toISOString(),
}
Synthetic Fixture Schema
Use a compact fixture builder so edge cases do not depend entirely on live OpenCode data.
type SnapshotProofFixture = {
name: string
mode: SnapshotProofUpgradeMode
attributionMode: OpenCodeLedgerAttributionMode
delivery: {
teamName: string
taskId: string
memberName: string
laneId?: string
sessionId: string
assistantMessageId: string
}
windows: Array<{
messageId: string
windowId: string
fromSnapshot: string
toSnapshot: string
startPartOrder: number
finishPartOrder: number
}>
parts: Array<{
partId: string
messageId: string
order: number
tool: 'write' | 'edit' | 'apply_patch'
filePath: string
oldString?: string
newString?: string
content?: string
}>
snapshotFiles: Array<{
relativePath: string
beforeContent?: string
afterContent?: string
beforeExists: boolean
afterExists: boolean
}>
expected: {
upgraded: number
metadataOnly: number
diagnostics: string[]
}
}
Fixture rules:
- Every positive fixture needs a paired negative fixture that differs by one proof condition.
- Fixtures should prefer tiny strings so failures are easy to inspect.
- Fixtures must include at least one empty string case and one absent-file case.
- Fixtures must include one path with unsafe characters for diagnostics escaping.
- Fixtures must not include secrets or large blobs.
Snapshot Provider Tests
Extend OpenCodeSnapshotEvidenceProvider.test.ts.
Tests:
- Groups only unresolved proof-needed changes into touched paths.
- Emits diagnostic for missing window.
- Emits diagnostic for ambiguous window.
- Preserves existing limits.
- Does not read snapshot for unrelated exact changes.
- Does not match windows across assistant messages.
- Emits diagnostic for extra snapshot paths not in reconstructed toolparts.
- Emits timeout diagnostic while preserving metadata-only fallback.
- Does not read snapshot windows with no unresolved touched paths.
- Escapes unsafe path text in diagnostics.
Ledger Bridge Tests
Extend OpenCodeLedgerBridgeService tests or add a focused fixture test.
Tests:
- Backfill imports upgraded full-text event for strict delivery OpenCode edit.
- Backfill keeps metadata-only event for compatible attribution.
- Backfill keeps metadata-only event with no delivery context.
- Imported event has stable source import key and dedupes on rerun.
- `snapshotShapeFingerprint` is present when snapshot proof was used.
- Repeated backfill does not duplicate file entries.
- Old metadata-only imported event is not rewritten unless importer already supports superseding by source key.
- Snapshot proof is not attempted for Codex or Anthropic members.
- Snapshot proof is not attempted for OpenCode exact `toolpart-chain` changes.
- Backfill cache does not return stale metadata-only data after an upgraded import in the same test.
- Import failure leaves no partial safe-review state.
Desktop Integration Tests
Only if needed. The desktop review UI already handles full text and metadata-only.
Smoke check:
- Full-text OpenCode upgraded event renders a real diff.
- Metadata-only event still renders manual-only warning.
- Reject is enabled only for full-text safe baseline.
- Warnings remain visible for task boundary uncertainty.
- `Reject All` skips a mixed task where one OpenCode file upgraded and another stayed metadata-only.
- Current disk preview remains read-only and does not become a reject baseline.
- Viewed count is unchanged by proof upgrade.
- Task-level boundary warning remains visible after all file diffs upgrade.
- Reject execution still blocks when current disk no longer matches the expected after state.
- Bulk `Reject All` rejects only files that pass both review-safety and execution-safety checks.
Property-Like Tests
Add small table-driven tests for transition verification:
const editCases = [
{ name: 'single replacement', before: 'a = 1', oldString: '1', newString: '2', after: 'a = 2', ok: true },
{ name: 'duplicate old', before: 'a 1 b 1', oldString: '1', newString: '2', after: 'a 2 b 1', ok: false },
{ name: 'empty old', before: 'abc', oldString: '', newString: 'x', after: 'xabc', ok: false },
{ name: 'delete exactly once', before: 'abc', oldString: 'b', newString: '', after: 'ac', ok: true },
]
The point is not random fuzzing. The point is to make ambiguous replacement rules explicit and hard to regress.
Real Data Smoke
Before implementation, capture a baseline:
time pnpm test --run test/main/services/team/TaskChangeComputer.test.ts
time pnpm test --run test/main/services/team/ChangeExtractorService.test.ts
After implementation, run the same commands:
time pnpm test --run test/main/services/team/TaskChangeComputer.test.ts
time pnpm test --run test/main/services/team/ChangeExtractorService.test.ts
Then run the existing real-data smoke scripts used for task changes. Required checks:
- `errors: 0`
- no increase in item errors
- no cross-task file leakage
- no increase in metadata-only count for OpenCode tasks
- no change for Codex-only teams
- no change for Anthropic-only teams
- broad smoke runtime increase <= 10%
- snapshot timeout count <= 2
- upgraded OpenCode full-text count is explainable by diagnostics
- no decrease in task-boundary warnings unless task-boundary code changed separately
- off and shadow fingerprints match except diagnostics/stats
- non-OpenCode fingerprints match in all modes
Manual target cases:
- relay-works-3/#1f735bea
- relay-works-3/#bf01e5c3
- relay-works-3/#43e6b9b0 should remain Codex-related, not OpenCode-upgraded
- signal-ops-22 should remain unaffected because it has no OpenCode members
- any OpenCode team with real snapshotShapeFingerprint present in diagnostics
- one team with missing/reset delivery ledger, if available
Add at least one synthetic OpenCode snapshot fixture if real data lacks a clean single-change full-text snapshot case. Real data validates integration, but a synthetic fixture is better for precise edge cases.
Real-data smoke should compare before/after summaries:
Before:
- OpenCode metadata-only file changes: N
- OpenCode full-text file changes: M
- non-OpenCode full-text file changes: X
- task-boundary warnings: B
After:
- OpenCode metadata-only file changes: <= N
- OpenCode full-text file changes: >= M
- non-OpenCode full-text file changes: X
- task-boundary warnings: B
Any non-OpenCode count change is a blocker.
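The before/after comparison above could be automated along these lines. This is a sketch only: the `SmokeSummary` field names and the `findSmokeBlockers` helper are hypothetical and would need to be mapped onto whatever the existing smoke scripts actually emit.

```typescript
// Hypothetical summary shape; one instance per smoke run.
interface SmokeSummary {
  openCodeMetadataOnly: number; // N
  openCodeFullText: number; // M
  nonOpenCodeFullText: number; // X
  taskBoundaryWarnings: number; // B
}

// Returns a human-readable blocker for every rule the candidate run violates.
function findSmokeBlockers(baseline: SmokeSummary, candidate: SmokeSummary): string[] {
  const blockers: string[] = [];
  if (candidate.openCodeMetadataOnly > baseline.openCodeMetadataOnly) {
    blockers.push('OpenCode metadata-only count increased');
  }
  if (candidate.openCodeFullText < baseline.openCodeFullText) {
    blockers.push('OpenCode full-text count decreased');
  }
  if (candidate.nonOpenCodeFullText !== baseline.nonOpenCodeFullText) {
    blockers.push('non-OpenCode full-text count changed');
  }
  if (candidate.taskBoundaryWarnings !== baseline.taskBoundaryWarnings) {
    blockers.push('task-boundary warning count changed');
  }
  return blockers;
}
```

An empty result means the run satisfies the four summary rules; it does not replace the leakage and fingerprint checks listed earlier.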
Failure Injection Tests
Add targeted failure injection where practical:
- Snapshot provider throws.
- Snapshot provider times out.
- Snapshot provider returns duplicate path entries.
- Ledger importer rejects the batch.
- Backfill runs twice with the same source import key.
- Feature flag changes from single-change to off.
- Snapshot proof succeeds for one file and fails for another file in the same task.
Expected result for every failure injection: original metadata-only safety is preserved, no duplicate review rows, diagnostics explain the skip/failure.
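One way to make that expected result hard to violate is to route every proof attempt through a single fail-closed wrapper. The sketch below is illustrative: `upgradeOrPreserve`, `ChangeLike`, and the `skipReason` strings are hypothetical names, and the proof callback is synchronous for brevity even though the real snapshot provider is presumably asynchronous.

```typescript
// Hypothetical minimal change shape used only for this sketch.
interface ChangeLike {
  path: string;
  evidenceProof: string;
}

interface ProofAttempt<T> {
  change: T; // the original object unless proof fully succeeded
  upgraded: boolean;
  skipReason?: string; // diagnostics only; never contains file content
}

// Runs one proof attempt. Any throw or null result preserves the original change.
function upgradeOrPreserve<T extends ChangeLike>(
  original: T,
  tryUpgrade: (change: T) => T | null,
): ProofAttempt<T> {
  try {
    const upgraded = tryUpgrade(original);
    if (upgraded === null) {
      return { change: original, upgraded: false, skipReason: 'proof-incomplete' };
    }
    return { change: upgraded, upgraded: true };
  } catch (error) {
    // Record only the error class name, not the message, to keep content out of diagnostics.
    const reason = error instanceof Error ? error.name : 'unknown-error';
    return { change: original, upgraded: false, skipReason: `proof-failed:${reason}` };
  }
}
```

Because the wrapper is called per file, a mixed task where one proof succeeds and another throws ends up with one upgraded change and one untouched metadata-only change.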
Serialization Tests
Add tests around the task-change event materialization boundary:
- beforeContent: '' remains empty string.
- afterContent: '' remains empty string.
- beforeContent: null remains unavailable/absent according to state.
- Unknown evidenceProof does not make a file rejectable.
- Snapshot fields survive import/export if present.
- Snapshot fields may be absent without renderer crashes.
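The empty-versus-absent distinction in the list above can be illustrated with a small round-trip sketch. The `SerializedChange` shape, the helper names, and the single-entry allow-list are assumptions made for illustration; the real event schema and the full set of review-safe proof values live in the existing code.

```typescript
// Hypothetical wire shape: null means "content unavailable", '' means "empty file".
interface SerializedChange {
  beforeContent: string | null;
  afterContent: string | null;
  evidenceProof: string;
}

// Assumed allow-list; only values named here can make a file rejectable.
const REVIEW_SAFE_PROOFS: ReadonlySet<string> = new Set(['opencode-snapshot']);

// JSON round trip standing in for import/export.
function roundTrip(change: SerializedChange): SerializedChange {
  return JSON.parse(JSON.stringify(change)) as SerializedChange;
}

// An empty string is present content; only null or undefined is missing.
function hasContent(value: string | null | undefined): value is string {
  return typeof value === 'string';
}

// Unknown proof values fall through to "not rejectable".
function isRejectableProof(evidenceProof: string): boolean {
  return REVIEW_SAFE_PROOFS.has(evidenceProof);
}
```

A truthiness check such as `if (change.beforeContent)` would collapse the two states, which is exactly the regression these tests are meant to catch.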
Manual QA Runbook
Manual QA is not a substitute for tests, but it helps catch integration mistakes.
Prepare:
- Pick one OpenCode team with snapshot evidence.
- Pick one Codex-only or Anthropic-only team.
- Record before counts for:
- OpenCode metadata-only files.
- OpenCode full-text files.
- non-OpenCode full-text files.
- task-boundary warnings.
- snapshot proof diagnostics.
Run with off:
OPENCODE_SNAPSHOT_PROOF_UPGRADE=off pnpm test --run test/main/services/team/TaskChangeComputer.test.ts
Expected:
- No new upgraded OpenCode snapshot events.
- Existing exact toolpart-chain behavior unchanged.
Run with shadow:
OPENCODE_SNAPSHOT_PROOF_UPGRADE=shadow pnpm test --run test/main/services/team/TaskChangeComputer.test.ts
Expected:
- Snapshot proof stats are emitted.
- Would-upgrade counts are visible.
- Imported/reviewed changes are identical to off.
- Any difference from off outside diagnostics is a blocker.
Run with single-change:
OPENCODE_SNAPSHOT_PROOF_UPGRADE=single-change pnpm test --run test/main/services/team/TaskChangeComputer.test.ts
Expected:
- OpenCode full-text count may increase.
- OpenCode metadata-only count may decrease or stay equal.
- non-OpenCode counts are unchanged.
- Multi-change groups are skipped with diagnostics.
Run with full only after tests pass:
OPENCODE_SNAPSHOT_PROOF_UPGRADE=full pnpm test --run test/main/services/team/TaskChangeComputer.test.ts
Expected:
- Same guarantees as single-change.
- Multi-change upgrades appear only when diagnostics can explain the full chain.
UI spot check:
- Open a mixed task with one upgraded file and one metadata-only file.
- Verify the upgraded file shows a diff.
- Verify the metadata-only file still shows a warning.
- Verify Reject All skips metadata-only files.
- Verify current disk preview is not treated as baseline.
- Verify task boundary warnings remain if present.
Any mismatch is a blocker.
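The four modes exercised in this runbook can be summarized in code. The sketch below is an assumption-laden illustration: `parseProofUpgradeMode`, `behaviorForMode`, and the `ModeBehavior` fields are hypothetical, and treating an unset or unrecognized flag value as off is a fail-closed assumption rather than documented behavior.

```typescript
type ProofUpgradeMode = 'off' | 'shadow' | 'single-change' | 'full';

const PROOF_UPGRADE_MODES: readonly ProofUpgradeMode[] = ['off', 'shadow', 'single-change', 'full'];

// Assumption: anything other than an exact known value falls back to 'off'.
function parseProofUpgradeMode(raw: string | undefined): ProofUpgradeMode {
  const match = PROOF_UPGRADE_MODES.find((mode) => mode === raw);
  return match ?? 'off';
}

// Hypothetical summary of what each mode is allowed to do.
interface ModeBehavior {
  emitsProofStats: boolean;
  appliesSingleChange: boolean;
  appliesMultiChange: boolean;
}

function behaviorForMode(mode: ProofUpgradeMode): ModeBehavior {
  switch (mode) {
    case 'off':
      return { emitsProofStats: false, appliesSingleChange: false, appliesMultiChange: false };
    case 'shadow':
      return { emitsProofStats: true, appliesSingleChange: false, appliesMultiChange: false };
    case 'single-change':
      return { emitsProofStats: true, appliesSingleChange: true, appliesMultiChange: false };
    case 'full':
      return { emitsProofStats: true, appliesSingleChange: true, appliesMultiChange: true };
    default: {
      // Compile-time exhaustiveness check: adding a mode without a case fails to typecheck.
      const unreachable: never = mode;
      throw new Error(`Unhandled mode: ${String(unreachable)}`);
    }
  }
}
```

In a Node process the flag would be read as `parseProofUpgradeMode(process.env.OPENCODE_SNAPSHOT_PROOF_UPGRADE)`.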
Acceptance Criteria
The implementation is acceptable only if all are true:
- Only OpenCode behavior changed.
- Strict delivery remains required for snapshot full-text upgrades.
- Exact existing toolpart-chain behavior is unchanged.
- Metadata-only fallback still works.
- No current disk content is used as historical proof.
- No broad OpenCode session scan is introduced.
- Snapshot read limits are unchanged or narrower.
- Ambiguous chains keep warnings.
- Large and binary files keep warnings.
- Tests cover same-path multi-change chains.
- Real-data smoke shows no cross-task leakage.
- Feature flag can disable the upgrade.
- Repeated backfill does not duplicate review files.
- Warning removal is limited to known resolved warning predicates.
- Performance budgets pass.
- The implementation has an explicit fallback for unsupported OpenCode snapshot shapes.
- The implementation includes Phase 0 contract audit notes in the PR/commit description or test output.
- No warning is removed unless a unit test names that exact warning or predicate.
- No current-disk preview path is involved in a proof decision.
- No behavior change occurs when OPENCODE_SNAPSHOT_PROOF_UPGRADE=off.
- Smoke output includes attempted/upgraded/skipped counts.
- Full mode is not enabled while any known unknown remains unresolved.
- Existing metadata-only events are not rewritten unless source-key supersede is proven by tests.
- Cache behavior is documented in the Phase 0 audit.
- Composite proof identity is enforced before snapshot text is trusted.
- Toolpart ordering is explicitly verified before multi-change upgrades.
- single-change and full have separate definitions of done.
- Serialization preserves empty string versus absent file.
- shadow mode proves expected upgrades without changing imported review events.
- Exhaustive handling covers every proof decision and feature flag mode.
- Capability gates are checked per session, not inferred from config alone.
- Missing/pruned snapshot store objects preserve metadata-only fallback.
- Deterministic fingerprints prove non-OpenCode behavior is unchanged.
- Apply/reject execution safety still checks current disk state after review proof succeeds.
- Storage and memory budgets are enforced without duplicate blob storage.
- Malformed/truncated OpenCode rows cannot produce upgraded proof.
- First apply rollout stays within the minimum safe scope.
- Negative controls prove close-but-invalid cases remain metadata-only.
- Runtime postcondition failures preserve original changes.
Verification Command Matrix
Use the narrowest useful commands first, then broader smoke.
| Layer | Command or check | Required result |
|---|---|---|
| Typecheck | pnpm typecheck | passes |
| Enricher unit | targeted OpenCodeChangeEvidenceEnricher tests | passes |
| Snapshot provider | targeted OpenCodeSnapshotEvidenceProvider tests | passes |
| Ledger bridge | targeted OpenCodeLedgerBridgeService tests | passes |
| Desktop review safety | targeted review/rejectability tests | passes |
| Off mode | task-change tests with OPENCODE_SNAPSHOT_PROOF_UPGRADE=off | old behavior |
| Shadow mode | task-change tests with OPENCODE_SNAPSHOT_PROOF_UPGRADE=shadow | stats only |
| Single-change mode | task-change tests with OPENCODE_SNAPSHOT_PROOF_UPGRADE=single-change | only one-change upgrades |
| Full mode | task-change tests with OPENCODE_SNAPSHOT_PROOF_UPGRADE=full | only after chain tests |
| Real data | existing task-change smoke on OpenCode and non-OpenCode teams | no leakage |
Do not use full smoke as a substitute for single-change smoke. They prove
different safety boundaries.
Code Review Checklist
Use this checklist before merging the implementation:
- Every upgraded change has a non-metadata-only-fallback proof.
- Every upgraded modify has both beforeContent and afterContent.
- Every upgraded create has beforeState.exists === false and afterContent.
- Every upgraded delete has beforeContent and afterState.exists === false.
- State hashes match emitted content.
- No branch reads current disk as proof.
- No branch catches an error and upgrades anyway.
- Every skipped branch preserves the original change.
- Warning stripping uses a central predicate.
- Multi-change mode can be disabled independently.
- Snapshot reader limits are unchanged or narrower.
- Tests include at least one negative case for every positive upgrade case.
- Real-data smoke includes at least one OpenCode team and one non-OpenCode team.
- No new IPC or filesystem read path bypasses existing workspace trust checks.
- No content appears in diagnostics, metrics, or thrown error messages.
- off mode is covered by a test and is easy to use during rollback.
- The proof logic is structured so forbidden state transitions are not possible without an obvious code review smell.
- shadow mode has been run on real data before any apply mode is enabled.
- Any new union member requires an exhaustive switch update, not a permissive default branch.
- Review-safe and execution-safe are checked separately.
- Large-file and total-byte budget tests prove metadata-only fallback.
- Minimum safe scope is visible in code structure, not only in tests.
- Negative controls exist for task, member, window, baseline, and operation mismatch.
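Several of the checklist items about upgraded changes can be expressed as a runtime postcondition. The sketch below assumes a hypothetical `UpgradedChange` shape with an `operation` field and an injected `sha256Hex` function; the real event type and hashing helper may differ.

```typescript
type ChangeOperation = 'create' | 'modify' | 'delete';

interface FileState {
  exists: boolean;
  sha256?: string;
}

// Hypothetical shape of a change that claims to be upgraded.
interface UpgradedChange {
  operation: ChangeOperation;
  beforeState: FileState;
  afterState: FileState;
  beforeContent: string | null;
  afterContent: string | null;
  evidenceProof: string;
}

// Returns every violated postcondition; an empty array means the upgrade may be emitted.
function postconditionViolations(
  change: UpgradedChange,
  sha256Hex: (text: string) => string,
): string[] {
  const violations: string[] = [];
  if (change.evidenceProof === 'metadata-only-fallback') {
    violations.push('upgraded change still carries metadata-only-fallback proof');
  }
  if (change.operation !== 'create' && change.beforeContent === null) {
    violations.push('missing beforeContent');
  }
  if (change.operation !== 'delete' && change.afterContent === null) {
    violations.push('missing afterContent');
  }
  if (change.operation === 'create' && change.beforeState.exists) {
    violations.push('create requires beforeState.exists === false');
  }
  if (change.operation === 'delete' && change.afterState.exists) {
    violations.push('delete requires afterState.exists === false');
  }
  // Hashes are recomputed from the emitted text, never trusted on their own.
  if (change.beforeContent !== null && change.beforeState.sha256 !== sha256Hex(change.beforeContent)) {
    violations.push('before hash does not match beforeContent');
  }
  if (change.afterContent !== null && change.afterState.sha256 !== sha256Hex(change.afterContent)) {
    violations.push('after hash does not match afterContent');
  }
  return violations;
}
```

If the result is non-empty, the caller would keep the original metadata-only change, in line with the rule that runtime postcondition failures preserve original changes.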
Implementation Anti-Patterns
Do not implement the feature using these patterns:
- A broad try/catch that returns an upgraded change on partial data.
- Mutating the original change object in place before proof has succeeded.
- Removing warnings before the final proof decision.
- Reading current disk to fill beforeContent.
- Comparing normalized line endings for proof.
- Treating a matching hash as content.
- Creating OpenCode-specific rejectability logic in the renderer.
- Appending upgraded duplicate events and expecting UI sorting to hide stale ones.
- Increasing snapshot limits to make a test pass.
- Falling back from strict delivery to compatible attribution for safety.
- Adding full mode as the default in the same change that introduces it.
- Treating empty string as missing content.
- Treating missing content as empty string.
- Sorting same-path chains by only one field.
- Retrying snapshot reads in a loop without a budget.
- Treating review-safe as automatically execution-safe.
- Adding a second blob store or cache for before/after content.
- Dereferencing Git LFS or submodule content outside the existing snapshot reader.
- Using malformed partial OpenCode JSON rows as proof context.
- Expanding the first apply mode to unsupported operations because the snapshot text happens to be available.
- Hiding uncertainty by changing user-facing wording from warning to success.
Rollout Strategy
- Land diagnostics and helper functions with no behavior change if practical.
- Add the feature flag with default off in tests where needed.
- Add snapshot-first upgrade for single-change same-path cases.
- Run targeted tests and real-data smoke.
- Enable single-change mode for local smoke.
- Add multi-change chain upgrade only after tests are solid.
- Move to full mode only if multi-change smoke is clean.
- Inspect warnings before and after for OpenCode tasks.
If multi-change support looks risky during implementation, stop after single-change mode. Single-change upgrade is already useful and lower risk.
Recommended shipping sequence:
PR 1: diagnostics + eligibility helpers + no behavior change
PR 2: single-change snapshot proof upgrade behind flag
PR 3: enable single-change by default for OpenCode strict delivery
PR 4: multi-change chain upgrade behind flag
PR 5: enable full mode only after real-data smoke
If this stays as one PR, keep the same commit structure locally and verify each step before moving to the next one.
Abort Conditions
Do not continue implementation if any of these happens:
- Snapshot windows cannot be reliably matched to toolparts.
- Existing OpenCode snapshot shape differs from tests in real data.
- Real-data smoke shows any new cross-task file leakage.
- Performance smoke shows repeated timeouts.
- A change would require using current disk as proof.
- A change would require broad compatible attribution scanning.
- Warning stripping needs broad substring matching to pass tests.
- Multi-change support requires accepting ambiguous edit/apply-patch replacements.
- Source import key dedupe behavior is unclear.
- The only available validation is manual UI inspection.
- A test has to assert against current wall-clock timing without a stable budget.
- Formal proof predicates require exceptions to support the first implementation.
- Postconditions fail for any positive fixture.
- single-change mode needs multi-change assumptions to pass.
- full mode needs renderer-specific special cases to appear safe.
- Rollback with OPENCODE_SNAPSHOT_PROOF_UPGRADE=off does not restore old behavior for new backfills.
- Empty content and missing content cannot be distinguished at serialization.
Open Questions Template For Implementation PR
Every implementation PR should answer these in its description:
OpenCode snapshot proof PR checklist:
- Mode implemented: diagnostics | single-change | full
- Default mode:
- Phase 0 contract audit completed: yes | no
- Source import key duplicate policy:
- Review bundle dedupe key:
- Rejectability helper:
- Existing event policy: new-imports-only | supersede-by-source-key
- Snapshot shape fingerprint observed:
- Real-data teams tested:
- Non-OpenCode teams unchanged: yes | no
- Snapshot proof stats:
- Rollback tested with OPENCODE_SNAPSHOT_PROOF_UPGRADE=off: yes | no
- Known unknowns remaining:
If the PR cannot answer one of these, it should not enable new behavior by default.
Example Final Change Shape
Before upgrade, metadata-only edit:
{
"sourceTool": "edit",
"before": {
"exists": true,
"unavailableReason": "opencode-edit-baseline-not-captured"
},
"after": {
"exists": true,
"unavailableReason": "opencode-edit-final-content-unavailable"
},
"evidenceProof": "metadata-only-fallback",
"warnings": [
"OpenCode edit was captured without a proven full-text baseline; apply/reject is manual-only."
]
}
After verified snapshot upgrade:
{
"sourceTool": "edit",
"before": {
"exists": true,
"sha256": "before-hash",
"sizeBytes": 128
},
"after": {
"exists": true,
"sha256": "after-hash",
"sizeBytes": 128
},
"evidenceProof": "opencode-snapshot",
"snapshotSource": "opencode",
"warnings": []
}
If any proof check fails, the event must stay in the first shape.
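The transition between the two shapes could be sketched as a pure function that copies rather than mutates and strips warnings by exact match only. `ReviewEvent`, `finalizeEvent`, and the one-entry resolved-warning set are hypothetical; the warning text is the one from the metadata-only example.

```typescript
// Hypothetical minimal event shape for this sketch.
interface ReviewEvent {
  sourceTool: string;
  evidenceProof: string;
  warnings: string[];
}

// Central, exact-match list of warnings that a verified snapshot proof resolves.
const WARNINGS_RESOLVED_BY_SNAPSHOT_PROOF: ReadonlySet<string> = new Set([
  'OpenCode edit was captured without a proven full-text baseline; apply/reject is manual-only.',
]);

// Returns the original object untouched unless the final proof decision succeeded.
function finalizeEvent(original: ReviewEvent, proofSucceeded: boolean): ReviewEvent {
  if (!proofSucceeded) return original;
  return {
    ...original,
    evidenceProof: 'opencode-snapshot',
    // Only known resolved warnings are dropped; everything else stays visible.
    warnings: original.warnings.filter(
      (warning) => !WARNINGS_RESOLVED_BY_SNAPSHOT_PROOF.has(warning),
    ),
  };
}
```

Because matching is exact rather than substring-based, an unrelated warning such as a task-boundary notice survives the upgrade unchanged.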
Notes for Future Maintainers
The important invariant is not "fewer warnings". The invariant is "warnings are removed only when the system has stronger evidence than before".
Warnings are correct when historical full text is not proven. A warning is a better outcome than an unsafe reject button.