From 4087060cca95405ae0073b931c386b56f2630f6f Mon Sep 17 00:00:00 2001 From: 777genius Date: Sun, 17 May 2026 16:03:14 +0300 Subject: [PATCH] docs(opencode): add snapshot proof upgrade plan --- ...ncode-snapshot-first-proof-upgrade-plan.md | 3068 +++++++++++++++++ 1 file changed, 3068 insertions(+) create mode 100644 docs/team-management/opencode-snapshot-first-proof-upgrade-plan.md diff --git a/docs/team-management/opencode-snapshot-first-proof-upgrade-plan.md b/docs/team-management/opencode-snapshot-first-proof-upgrade-plan.md new file mode 100644 index 00000000..75631a42 --- /dev/null +++ b/docs/team-management/opencode-snapshot-first-proof-upgrade-plan.md @@ -0,0 +1,3068 @@ +# OpenCode Snapshot-First Proof Upgrade Plan + +## Goal + +Reduce false or avoidable OpenCode review warnings for new tasks by upgrading +metadata-only OpenCode `edit`, `write`, and `apply_patch` changes to verified +full-text before/after changes when, and only when, existing OpenCode snapshot +evidence proves the exact file state transition. + +The implementation must be fail-closed: + +- If proof is complete, store full before/after content and remove manual-only warnings. +- If proof is incomplete, ambiguous, too large, binary, outside scope, or unavailable, keep the current warning. +- Never use current disk content as proof for historical before/after. +- Never broaden attribution outside strict delivery context. + +## Non-goals + +- Do not change Codex or Anthropic task extraction. +- Do not change generic review UI semantics. +- Do not infer diffs from current disk. +- Do not scan unrelated OpenCode sessions. +- Do not increase OpenCode snapshot file size limits as part of this work. +- Do not retroactively "fix" old tasks unless existing ledger backfill has strict delivery and snapshot evidence. + +## Current System Facts + +Desktop repo: + +- `ChangeExtractorService` requests OpenCode ledger backfill only when delivery context is available. +- Backfill goes through `OpenCodeReadinessBridge.backfillOpenCodeTaskLedger`. +- Imported events already support full before/after content and metadata-only fallbacks. + +Orchestrator repo: + +- `OpenCodeProfileManager.buildManagedConfig()` already sets `snapshot: true`. +- `OpenCodeLedgerBridgeService.backfill()` reconstructs toolpart changes, then calls `OpenCodeChangeEvidenceEnricher.enrich()`. +- `OpenCodeOfflineSessionReader` reads OpenCode SQLite history in read-only mode and extracts snapshot windows. +- `OpenCodeSnapshotEvidenceProviderService` reads before/after snapshot file contents with strict limits. +- `OpenCodeToolpartChangeReconstructor` already creates exact `toolpart-chain` changes when it has a known baseline. +- `OpenCodeChangeEvidenceEnricher` already upgrades some metadata-only changes through snapshot or inverse chain proof. + +This plan should strengthen the existing evidence path rather than introduce a +new capture subsystem. + +## Risk Estimate + +Recommended implementation: + +- Functional bug risk: 3/10. +- Performance regression risk: 2/10. +- Data safety risk: 2/10. +- Complexity: 7/10. +- Approximate runtime change size: 220-450 LOC. +- Approximate total change size with tests and diagnostics: 450-900 LOC. + +The low data safety risk depends on preserving fail-closed behavior. If any +step starts accepting guesses as proof, data safety risk becomes 7/10 or worse. + +## Hard Safety Invariants + +These invariants are more important than reducing warnings. + +1. A full-text upgrade must be tied to one task, one member, one OpenCode + session, one delivery record, one assistant message, one toolpart, and one + snapshot window. +2. The upgrade must be local to OpenCode. Codex, Anthropic, generic task-log + parsing, and non-OpenCode review flows must not change. +3. `strict-delivery` is required for every snapshot-based full-text upgrade. + Compatible attribution may still import metadata-only events, but it must not + produce auto-safe before/after content. +4. Current disk content is never historical evidence. It can be displayed as + read-only context by the desktop, but it cannot remove a warning or enable + safe reject. +5. Hash-only evidence is not full-text evidence. A hash can verify text that was + already read from a trusted snapshot, but a hash alone is not enough. +6. Empty string is valid full text. `null` and `undefined` mean unavailable. +7. Large, binary, truncated, path-unsafe, or schema-unsupported content stays + metadata-only. +8. A failed upgrade must preserve the original change event shape as much as + possible. It may add diagnostics, but it must not remove warnings or mutate + operation/confidence. +9. Imported event idempotency must remain based on existing source import keys. + The upgrade must not create duplicate events for the same toolpart/path. +10. Any multi-change path chain must be all-or-nothing for unresolved changes in + that path/window. Partial upgrades are allowed only for changes that were + already independently exact before the chain attempt. +11. The implementation must never make a previously non-rejectable change + rejectable unless both `beforeContent` and the target after/absence state + are proven from the same trusted historical evidence path. +12. Diagnostics are allowed to become more detailed. They are not allowed to be + used as a substitute for proof. + +## Things That Are Explicitly Not Proof + +These signals can be useful diagnostics, but they must not remove warnings or +enable safe reject by themselves: + +- Current disk content matching an expected hash. +- Current disk content matching `newString`. +- A file path appearing in OpenCode tool metadata. +- A file path appearing in a snapshot diff without readable before/after text. +- A before/after hash without the corresponding text blob. +- An OpenCode tool status of `completed`. +- The absence of an error in the toolpart. +- The task title mentioning the same directory. +- A member name matching the expected teammate. +- A session id matching but no strict delivery record. +- A snapshot window in the same session but a different assistant message. +- A snapshot window that overlaps several toolparts ambiguously. +- A successful manual UI render of current disk preview. + +If implementation pressure makes any of these tempting, stop and keep the +warning. + +## Threat Model + +The feature is not security-sensitive in the network sense, but it is +data-safety-sensitive. The main threat is a false full-text proof that enables +safe reject/apply for the wrong historical state. + +Bug classes to defend against: + +1. Cross-task contamination: + - A file change from task A appears in task B review. + - Main defense: strict delivery, canonical task id, source message/window + matching, real-data smoke. +2. Cross-member contamination: + - A teammate using the same OpenCode profile is attributed to another member. + - Main defense: delivery record member/lane/session matching. +3. Cross-window contamination: + - A toolpart is matched to the wrong snapshot window in the same message. + - Main defense: exactly-one window matching and order tests. +4. False baseline: + - Current disk or hash-only evidence is treated as historical before text. + - Main defense: "not proof" list and code review checklist. +5. Unsafe warning removal: + - UI stops warning about a file that is still manual-only. + - Main defense: central warning predicate and negative warning tests. +6. Duplicate imported events: + - The same source toolpart appears twice after re-backfill. + - Main defense: source-key idempotency audit and repeated-backfill tests. +7. Silent performance regression: + - Snapshot proof reads too many blobs or times out often. + - Main defense: proof-needed filtering, existing limits, timing counters. +8. Unsupported upstream shape: + - OpenCode changes SQLite/snapshot schema and our parser guesses. + - Main defense: shape fingerprint, unsupported fallback, abort condition. + +For every bug class above, the implementation needs at least one negative test +or real-data smoke assertion. + +## Pre-Implementation Audit Checklist + +Before writing runtime code, answer these questions from the current codebase: + +1. Does the task-change ledger importer update, replace, supersede, or append + events with the same `sourceImportKey`? +2. Does the desktop review bundle dedupe by source key, file path, event id, or + a computed change id? +3. Which exact helper decides whether a file is rejectable? +4. Which warnings are currently surfaced in `TeamChangesSection` versus the + full review dialog? +5. Does the OpenCode backfill cache hide an upgraded result for up to 60 seconds + after a first metadata-only result? +6. Does `materializeMetadataOnlyChanges` preserve `evidenceProof`, + `snapshotId`, `snapshotSource`, and warnings exactly? +7. Can the snapshot provider return `beforeState`/`afterState` with hashes but + no text, and how is that serialized into task-change events? +8. Are OpenCode snapshot windows always message-local in current real data? +9. Are there real examples where a single toolpart touches more than one file? +10. Are there real examples where `apply_patch` contains rename or mode-only + changes? +11. Does the snapshot provider ever return duplicate file entries for the same + normalized path? +12. Does the task-change worker cache bundle results independently from the + OpenCode backfill cache? +13. Are task-level warnings derived only from imported events, or can they come + from boundary parsing separately? +14. Does a safe reject require `beforeContent`, or can `beforeState.exists === false` + plus `afterContent` be enough for creates? +15. Is there any existing telemetry/log sink where structured counters can be + emitted without leaking file contents? + +If any answer is unknown, add a focused diagnostic or unit test before changing +behavior. Do not use implementation guesses for these contracts. + +## Decision Gates + +These gates must be passed in order. Do not skip gates to get fewer warnings +faster. + +| Gate | Required evidence | If not met | +| --- | --- | --- | +| G0 contract audit | importer, bundle, rejectability, cache behavior known | no runtime change | +| G1 diagnostics-only | new diagnostics pass tests with no behavior change | fix diagnostics first | +| G2 shadow proof | proof computes stats but imports original changes | keep behavior disabled | +| G3 single-change proof | positive and negative single-change tests pass | keep apply disabled | +| G4 real-data single-change smoke | OpenCode improves or stays same, non-OpenCode unchanged | do not enable default | +| G5 multi-change proof | all chain tests pass, no ambiguous branch accepted | keep `full` unavailable | +| G6 real-data full smoke | no cross-task leakage, budgets pass | keep default `single-change` | +| G7 rollback check | `OPENCODE_SNAPSHOT_PROOF_UPGRADE=off` restores old behavior | do not ship | + +The implementation should be easy to stop after G4. Single-change mode is a +valid ship point; `full` mode is optional. + +## Known Unknowns That Block Full Mode + +`full` mode must stay disabled if any of these are still unknown: + +- Whether importer supersedes or appends duplicate `sourceImportKey` events. +- Whether real OpenCode data has nested or overlapping snapshot windows. +- Whether real OpenCode `apply_patch` parts include rename, chmod, or binary + patch shapes. +- Whether multi-change same-path chains occur often enough to justify the risk. +- Whether review bundle dedupe can handle upgraded old events without duplicate + rows. +- Whether snapshot proof stats can be collected without logging sensitive + content. + +Unknowns do not block diagnostics or single-change mode. They block `full` mode. + +## Assumption Ledger + +Keep an explicit ledger of assumptions. Each assumption needs a validation path +and a fallback. Do not leave assumptions implicit in implementation code. + +| Assumption | Validation | Fallback if false | +| --- | --- | --- | +| OpenCode snapshot windows are message-local | unit fixture and real-data diagnostics | metadata-only fallback | +| Source import keys are stable across re-backfill | repeated-backfill test | new-imports-only, no old rewrite | +| Review bundle dedupes safely | Phase 0 audit and bridge test | do not upgrade old events | +| Empty string survives materialization | serialization test | do not upgrade empty files | +| Existing reject helper checks current disk | desktop contract test | fix shared helper before enabling | +| Snapshot store objects remain readable long enough | retention fixture and diagnostics | metadata-only fallback | +| Part ordering is stable enough for chains | ordering unit tests | disable `full` | +| Warning predicates are complete | unit tests naming every removed warning | preserve warning | +| Stats can be emitted without content | log review and tests | disable stats or redact harder | +| Non-OpenCode fingerprints stay identical | real-data mode comparison | keep apply modes disabled | + +If an assumption has no validation path, it should be moved to "Known Unknowns" +and block `full` mode. + +## Capability And Version Gates + +Do not assume that `snapshot: true` in managed OpenCode config means snapshot +evidence is usable for every session. Treat snapshot proof as a runtime +capability that must be observed for the specific session being backfilled. + +Required capability checks: + +- OpenCode SQLite schema is supported. +- Session identity includes project id, directory, worktree, and git VCS. +- Session worktree matches the expected workspace root. +- Snapshot windows are present and paired. +- Snapshot git store reader reports the expected shape fingerprint. +- Snapshot file evidence can be read under the existing limits. +- The proof path sees the same normalized relative path in reconstruction and + snapshot evidence. + +Suggested result type: + +```ts +type SnapshotProofCapability = + | { + supported: true + shapeFingerprint: string + sessionId: string + projectId: string + } + | { + supported: false + code: + | 'sqlite-schema-unsupported' + | 'session-identity-missing' + | 'workspace-mismatch' + | 'snapshot-window-missing' + | 'snapshot-store-unsupported' + | 'snapshot-store-missing' + diagnostics: string[] + } +``` + +Rules: + +- Unsupported capability returns metadata-only fallback. +- Unknown capability returns metadata-only fallback. +- Capability diagnostics may be emitted in `shadow`. +- Capability success alone is not proof. It only allows proof attempts. + +## Mode Behavior Matrix + +The mode must determine both proof computation and proof application. + +| Mode | Compute proof? | Apply proof? | Import changed events? | Intended use | +| --- | --- | --- | --- | --- | +| `off` | no | no | no | rollback and baseline comparison | +| `shadow` | yes | no | no | validate proof quality and performance | +| `single-change` | yes | only one unresolved change per path/window | yes | safe first rollout | +| `full` | yes | one-change and verified chains | yes | optional later rollout | + +If implementation makes `shadow` import different events from `off`, it is a +bug. If implementation makes `off` compute snapshot proof, it is a performance +bug. + +## Minimum Safe Scope + +The first behavior-changing implementation should intentionally support less +than the full theoretical feature. + +Allowed in first `single-change` apply mode: + +- OpenCode only. +- `strict-delivery` only. +- One unresolved change for one normalized path inside one snapshot window. +- Text files within existing size limits. +- `write` create when before absence and after text are proven. +- `write` modify when before text, after text, and toolpart after content agree. +- `edit` modify when `oldString` occurs exactly once and produces snapshot after. +- delete when before text and after absence are proven. + +Explicitly excluded from first apply mode: + +- Multi-change same-path chains. +- `apply_patch` without parsed hunks. +- rename, chmod, binary patch, submodule, and mode-only changes. +- Any case requiring current disk as evidence. +- Any case requiring line-ending normalization. +- Any case where snapshot evidence exists but operation semantics are unclear. +- Any old metadata-only event rewrite unless source-key supersede is proven. + +This scope is deliberately conservative. The goal is to prove the pipeline, not +to maximize warning reduction in the first implementation. + +## Lowest-Confidence Areas And Mitigations + +The implementation should explicitly address the areas below because they are +where mistakes are most likely. + +### Snapshot Window Matching + +Risk: OpenCode history can contain several `step-start` and `step-finish` +records in the same assistant message. Incorrect ordering could attach a +toolpart to the wrong snapshot pair. + +Mitigation: + +- Keep the existing requirement that a toolpart must match exactly one window. +- Keep message-local matching. Do not match a toolpart to a window from another + assistant message. +- Add tests where a toolpart is before the first window, after the last window, + and inside two overlapping windows. +- If window order cannot be proven from `rawParts`, skip the upgrade. + +### Multi-Change Chains + +Risk: several edits to the same file can produce the same final content through +more than one path. This is the easiest place to create a convincing but wrong +diff. + +Mitigation: + +- Implement single-change upgrades first. +- Gate multi-change chain upgrades behind a narrow helper and dense tests. +- Do not allow a `write` in the middle of a reverse chain unless both sides of + that write are independently proven. +- Abort the whole path/window chain on the first ambiguous step. +- Add a kill switch that can disable multi-change upgrades while leaving + single-change upgrades enabled. + +### Warning Removal + +Risk: broad substring filtering can hide warnings that still matter, especially +task-boundary or attribution warnings. + +Mitigation: + +- Do not remove warnings by broad terms like `manual-only` alone. +- Centralize warning predicates and match only known OpenCode baseline/content + warning messages. +- Preserve all warnings that mention attribution, delivery, boundary, + confidence, path scope, binary, too-large, truncated, or unavailable snapshot + content. +- Add tests where a warning contains `manual-only` but is unrelated to baseline + proof. + +### Snapshot Shape Stability + +Risk: OpenCode can change SQLite or snapshot git-store shape. A shape change +could make old assumptions invalid. + +Mitigation: + +- Keep `snapshotShapeFingerprint` checks visible in diagnostics. +- Treat unknown or unsupported shapes as metadata-only fallback. +- Do not add compatibility shims that guess from partial rows. +- Add an abort condition for a real-data shape mismatch. + +### Snapshot Store Retention + +Risk: OpenCode SQLite can contain snapshot window hashes while the corresponding +git-store objects are missing, pruned, moved, or unreadable. The history then +looks promising but cannot prove full text. + +Mitigation: + +- Treat missing snapshot objects as metadata-only fallback. +- Keep a distinct diagnostic for missing store object versus unsupported shape. +- Do not retry by reading current disk. +- Do not reconstruct from only one side of the snapshot pair. +- Add a fixture where the window exists but object read fails. + +### Performance + +Risk: reading snapshot blobs for every task can become expensive on large +sessions. + +Mitigation: + +- Try snapshot proof only for unresolved OpenCode changes in strict delivery. +- Pass only unresolved touched paths to the snapshot reader unless a same-path + chain requires exact already-proven neighbors. +- Keep the current snapshot read limits. +- Add timing diagnostics around snapshot proof attempts. +- Abort rollout if repeated snapshot timeouts appear in smoke data. + +### Existing Ledger Events + +Risk: a task that was previously imported as metadata-only may later be +backfilled with better evidence. If importer semantics are append-only, the UI +could show duplicates or stale warnings. + +Mitigation: + +- Audit importer behavior before enabling upgrades for old data. +- Prefer stable source-key replacement/superseding if already supported. +- If replacement is not supported, limit the behavior change to new backfill + imports and leave old events untouched. +- Add repeated-backfill tests before real-data smoke. + +### Cache Invalidation + +Risk: desktop or worker cache may return an old metadata-only bundle after the +orchestrator has imported stronger evidence, making validation confusing or +causing stale warnings to persist. + +Mitigation: + +- Audit all cache layers in Phase 0. +- Include the OpenCode ledger fingerprint or imported event count in cache + invalidation if an existing mechanism supports it. +- For tests, clear or bypass caches instead of waiting for TTLs. +- Do not add broad cache busting for all teams. Keep invalidation scoped to the + requested team/task. + +### Partial Success Semantics + +Risk: one file in a task upgrades while another remains metadata-only. Bulk +review actions might accidentally assume the task is fully safe. + +Mitigation: + +- Keep rejectability file-level. +- Keep task-level warnings if any file remains manual-only or if boundaries are + uncertain. +- Add a mixed-task desktop test. + +## Feature Flag And Rollback + +Add a runtime guard before changing behavior: + +```ts +type SnapshotProofUpgradeMode = 'off' | 'shadow' | 'single-change' | 'full' + +function getSnapshotProofUpgradeMode(env: NodeJS.ProcessEnv): SnapshotProofUpgradeMode { + const raw = env.OPENCODE_SNAPSHOT_PROOF_UPGRADE + if (raw === '0' || raw === 'off') { + return 'off' + } + if (raw === 'shadow') { + return 'shadow' + } + if (raw === 'full') { + return 'full' + } + if (raw === 'single-change') { + return 'single-change' + } + return 'shadow' +} +``` + +Recommended rollout: + +- Default to `shadow` during development and first smoke validation. +- Move to `single-change` only after shadow stats show expected upgrades with + no behavior changes. +- Move to `full` only after multi-change chain tests and real-data smoke pass. +- Keep `off` available as an emergency rollback path. +- If `full` later becomes the default, that should be a separate rollout change + after the implementation has passed real-data smoke in explicit `full` mode. + +If the project already has a central feature-flag/env helper for OpenCode +runtime behavior, use that instead of adding a new ad-hoc parser. + +`shadow` mode is intentionally different from `off`: + +- `off` does not attempt proof. +- `shadow` attempts proof and records stats/diagnostics, but returns the + original changes to the importer. +- `single-change` applies only one-change path/window upgrades. +- `full` applies single-change and multi-change chain upgrades. + +This gives a low-risk way to validate proof quality and performance on real +data before changing review safety. + +## Architecture + +Use this pipeline: + +```text +OpenCode SQLite history + -> toolpart reconstruction + -> strict delivery attribution + -> snapshot window grouping + -> snapshot file read with limits + -> proof upgrade per path + -> validate candidate batch + -> import task-change events +``` + +The upgrade belongs in the orchestrator evidence layer, primarily around: + +- `OpenCodeChangeEvidenceEnricher.ts` +- `OpenCodeSnapshotEvidenceProvider.ts` +- `OpenCodeToolpartChangeReconstructor.ts` only if a small helper or extra metadata is needed +- tests near existing OpenCode evidence and ledger bridge tests + +Avoid touching desktop review UI for the proof itself. The desktop should only +benefit from better imported event content. + +## Composite Identity Contract + +Every full-text proof must be anchored to a composite identity. Do not rely on +any single field alone. + +Required identity dimensions: + +```ts +type SnapshotProofIdentity = { + teamName: string + taskId: string + memberName: string + laneId?: string + sessionId: string + parentUserMessageId?: string + assistantMessageId: string + sourceMessageId: string + sourcePartId: string + toolUseId: string + relativePath: string + snapshotWindowId: string + fromSnapshot: string + toSnapshot: string +} +``` + +Rules: + +- `taskId` must be canonical, not display-only. +- `memberName`, `laneId`, and `sessionId` must come from strict delivery or + already trusted session records. +- `sourceMessageId` must match the snapshot window message id. +- `sourcePartId` must be inside the matched window according to the same + message's part order. +- `relativePath` must be normalized through the existing OpenCode path helpers. +- `fromSnapshot` and `toSnapshot` must be the exact pair used to read file + evidence. + +If any identity dimension is missing, the default is `metadata-only-fallback`. + +## Ordering Contract + +Same-path chains are safe only if toolpart order is stable and proven. Use the +existing ordering data from OpenCode SQLite. Do not introduce a new sort. + +Preferred order keys, in priority order: + +1. `messageTimeCreated` +2. `messageIdSort` +3. `messagePartOrder` +4. `partId` + +Rules: + +- Do not sort only by `partId`. +- Do not sort only by timestamp. +- Do not merge parts from different `sourceMessageId` values into one chain. +- If two parts have indistinguishable order, do not upgrade the chain. +- If raw part order is unavailable, single-change upgrade may still work, but + multi-change mode must skip. + +Example guard: + +```ts +function hasStablePartOrder(parts: SourcePartSortKey[]): boolean { + const seen = new Set() + for (const part of parts) { + const key = [ + part.messageTimeCreated, + part.messageIdSort, + part.messagePartOrder, + part.partId, + ].join('\0') + if (seen.has(key)) { + return false + } + seen.add(key) + } + return true +} +``` + +If the real symbol names differ, keep the same invariant. + +## Cross-Repo Contract Boundaries + +This feature crosses the desktop repo and the orchestrator repo. Keep the +contract explicit. + +Desktop responsibilities: + +- Request OpenCode backfill only when delivery context exists. +- Keep cache/in-flight dedupe behavior. +- Render full-text events as diffs. +- Render metadata-only events as manual-only warnings. +- Use existing rejectability checks. Do not special-case OpenCode snapshot + events in the UI unless a rendering bug is found. + +Orchestrator responsibilities: + +- Read OpenCode history and snapshot evidence. +- Decide whether proof is strong enough to materialize before/after text. +- Preserve strict delivery attribution. +- Preserve source import keys. +- Emit diagnostics explaining why upgrades were skipped. + +Shared contract: + +```ts +type ReviewSafetyContract = { + sourceImportKey: string + evidenceProof: OpenCodeEvidenceProof + beforeContent: string | null + afterContent: string | null + beforeState?: { exists?: boolean; sha256?: string; sizeBytes?: number; unavailableReason?: string } + afterState?: { exists?: boolean; sha256?: string; sizeBytes?: number; unavailableReason?: string } + warnings?: string[] +} +``` + +Safe reject requires a proven historical baseline: + +```ts +function hasSafeHistoricalBaseline(change: ReviewSafetyContract): boolean { + if (change.beforeContent !== null) { + return true + } + return change.beforeState?.exists === false && change.afterContent !== null +} +``` + +The exact desktop helper may have a different name. The invariant should match +this contract. + +## Apply/Reject Execution Safety Contract + +Snapshot proof can make a review event eligible for normal diff rendering and +safe reject consideration. It must not bypass current worktree conflict checks. + +Review safety and execution safety are different: + +- Review safety answers: "Do we know the historical before/after for this + change?" +- Execution safety answers: "Can we apply or reject this change against the + user's current disk state right now?" + +This feature only upgrades review safety. It must not weaken execution safety. + +Required rules: + +- Rejecting a modify still requires the current file to match the expected after + state, or whatever stricter existing conflict check is already used. +- Rejecting a create still requires the current file to match the created after + state before deletion. +- Rejecting a delete still requires the current absence/after state to match the + expected deleted state before restoring before content. +- Accepting an OpenCode change must not overwrite unrelated current disk edits. +- Bulk `Reject All` must keep per-file conflict checks and skip unsafe files. +- Current disk mismatch should produce a conflict/manual warning, not a proof + downgrade. + +Suggested predicate split: + +```ts +function isReviewSafe(change: ReviewSafetyContract): boolean { + return isSnapshotReviewSafe(change) +} + +function canExecuteReject(input: { + change: ReviewSafetyContract + currentDiskState: { exists: boolean; sha256?: string } +}): boolean { + if (!isReviewSafe(input.change)) { + return false + } + // Use the existing project helper here. This sketch only documents that + // execution safety is a separate check from proof safety. + return currentDiskMatchesExpectedAfterState(input.change, input.currentDiskState) +} +``` + +Do not implement `currentDiskMatchesExpectedAfterState` ad hoc if the project +already has a conflict/rejectability helper. This plan requires preserving that +existing behavior. + +## Data Model Contract + +Do not introduce a new task-change event shape unless absolutely necessary. +Prefer filling existing fields: + +```ts +type UpgradedOpenCodeChangeContract = { + sourceTool: 'write' | 'edit' | 'apply_patch' | 'snapshot_patch' + sourceImportKey: string + evidenceProof: 'opencode-snapshot' | 'inverse-edit-chain' | 'inverse-apply-patch-chain' | 'toolpart-chain' + confidence: 'high' | 'exact' + beforeContent: string | null + afterContent: string | null + beforeState: { + exists?: boolean + sha256?: string + sizeBytes?: number + unavailableReason?: never + } + afterState: { + exists?: boolean + sha256?: string + sizeBytes?: number + unavailableReason?: never + } + snapshotId?: string + snapshotSource?: 'opencode' + warnings: string[] +} +``` + +Important: + +- Upgraded full-text events should not carry `unavailableReason` for the + before/after side they claim to prove. +- Metadata-only events may carry `unavailableReason`, but then they must remain + non-rejectable. +- `confidence: 'high'` is acceptable for snapshot proof. Use `exact` only for + truly exact toolpart chains that already have local full text. +- `snapshotId` is useful provenance, but it is not required for safety if the + proof was otherwise validated. Missing `snapshotId` should be diagnostic. + +## Storage And Memory Contract + +The feature must not create a second blob storage path or keep large full-text +content in memory longer than the existing ledger import requires. + +Rules: + +- Reuse the existing task-change ledger content storage. +- Do not duplicate before/after text in diagnostics, stats, or cache keys. +- Do not add per-team global caches of snapshot file content. +- Do not store both snapshot raw blobs and task-change blobs unless the existing + snapshot reader already does that internally. +- Apply the per-file and total byte limits before materializing upgraded events. +- If a file exceeds the limit, store metadata-only state with a reason. +- If many small files exceed the total byte budget, skip the excess files as + metadata-only instead of raising the limit. +- Stats should count bytes read and files skipped, but never include content. + +Suggested stats additions: + +```ts +type SnapshotProofStorageStats = { + bytesRead: number + bytesMaterialized: number + skippedByByteLimit: number + skippedByTotalBudget: number +} +``` + +Memory pressure is a reason to keep metadata-only fallback. It is not a reason +to increase limits or stream partial text into a diff. + +## Mutation Rules + +When an upgrade is skipped, only diagnostics may change. The returned +`ReconstructedOpenCodeToolChange` must preserve: + +- `beforeContent` +- `afterContent` +- `beforeState` +- `afterState` +- `operation` +- `confidence` +- `warnings` +- `evidenceProof` +- `sourceImportKey` + +When an upgrade succeeds, only these fields may change: + +- `beforeContent` +- `afterContent` +- `beforeState` +- `afterState` +- `operation`, only when the tool semantics and snapshot operation both prove it +- `confidence` +- `warnings`, only through the central resolved-warning predicate +- `evidenceProof` +- `evidenceDiagnostics` +- `snapshotId` +- `snapshotSource` + +No other fields should be rewritten by the proof upgrade. This reduces +accidental attribution changes. + +## Proof Levels + +Use explicit proof labels and keep their meaning strict. + +```ts +type OpenCodeEvidenceProof = + | 'toolpart-chain' + | 'opencode-snapshot' + | 'inverse-edit-chain' + | 'inverse-apply-patch-chain' + | 'metadata-only-fallback' +``` + +Accepted for auto review: + +- `toolpart-chain` +- `opencode-snapshot` +- `inverse-edit-chain` +- `inverse-apply-patch-chain` + +Not accepted for safe reject/apply: + +- `metadata-only-fallback` +- current disk only +- file path metadata only +- hash without text +- text without matching task/window/path proof + +## Proof Decision Tables + +### Operation State Table + +| Tool | Snapshot before | Snapshot after | Tool fields | Upgrade? | Reason | +| --- | --- | --- | --- | --- | --- | +| `write` create | absent | text | content absent or same as after | yes | create is fully proven | +| `write` modify | text | text | content absent or same as after | yes | before and after are fully proven | +| `write` modify | unavailable | text | any | no | overwrite baseline is unknown | +| `write` modify | text | text | content differs from after | no | toolpart and snapshot disagree | +| `edit` modify | text | text | old/new apply exactly once | yes | edit transition is proven | +| `edit` modify | text | text | old/new ambiguous | no | multiple valid transitions are possible | +| `apply_patch` modify | text | text | hunks verify exactly | yes | patch transition is proven | +| `apply_patch` modify | text | text | hunks missing | maybe | only if the snapshot window has exact single-file proof and no competing changes | +| delete | text | absent | delete operation | yes | delete is fully proven | +| any | binary/large/unavailable | any | any | no | full text is not available | + +### Confidence Table + +| Evidence | Confidence | Safe reject/apply? | +| --- | --- | --- | +| toolpart chain with known previous text | `exact` | yes | +| snapshot before/after with verified transition | `high` | yes | +| inverse edit/apply-patch chain with exact single replacements | `high` | yes | +| snapshot path anchor without verified transition | `medium` | no | +| metadata-only toolpart | `medium` | no | + +Do not upgrade confidence from `medium` to `high` unless safe reject/apply would +also be valid. + +## Proof State Machine + +Implement the upgrade as a state machine, not as scattered conditionals. + +```text +original change + -> not eligible + -> candidate + -> snapshot evidence requested + -> snapshot evidence matched + -> transition verified + -> upgraded change + -> validated import candidate +``` + +Failure from any state returns to the original metadata-only change plus +diagnostics. + +Allowed transitions: + +| From | To | Required condition | +| --- | --- | --- | +| original | not eligible | not OpenCode, exact already, flag off, non-strict delivery | +| original | candidate | OpenCode unresolved change, strict delivery, flag permits | +| candidate | snapshot requested | non-empty touched path set within limits | +| snapshot requested | snapshot matched | exactly one window and one file anchor | +| snapshot matched | transition verified | operation-specific before/after proof succeeds | +| transition verified | upgraded | state hashes match content and warnings stripped safely | +| upgraded | validated import candidate | existing candidate validation accepts it | + +Forbidden transitions: + +- original -> upgraded +- candidate -> upgraded +- snapshot requested -> upgraded +- snapshot matched -> upgraded without operation-specific verification +- skipped -> upgraded + +Suggested type: + +```ts +type ProofState = + | { state: 'not-eligible'; reason: SnapshotUpgradeDiagnosticCode } + | { state: 'candidate'; change: ReconstructedOpenCodeToolChange } + | { state: 'snapshot-matched'; change: ReconstructedOpenCodeToolChange; anchor: SnapshotFileAnchor } + | { state: 'transition-verified'; change: ReconstructedOpenCodeToolChange; before: string | null; after: string | null } + | { state: 'upgraded'; change: ReconstructedOpenCodeToolChange } + | { state: 'skipped'; reason: SnapshotUpgradeDiagnosticCode; original: ReconstructedOpenCodeToolChange } +``` + +The concrete implementation does not have to use this exact union, but it +should preserve the same transitions. + +## Exhaustiveness And Type Safety + +Use exhaustive switches for proof decisions, operation handling, and feature +flag modes. Do not add a permissive `default` branch that silently preserves or +upgrades without naming the case. + +```ts +function assertNever(value: never, context: string): never { + throw new Error(`Unexpected ${context}: ${String(value)}`) +} + +function applyProofDecision( + decision: SnapshotProofDecision, + original: ReconstructedOpenCodeToolChange, +): ReconstructedOpenCodeToolChange { + switch (decision.type) { + case 'upgraded': + return decision.change + case 'skipped': + return original + default: + return assertNever(decision, 'snapshot proof decision') + } +} +``` + +If TypeScript cannot prove exhaustiveness, keep the code more explicit rather +than using casts. A cast in proof code should be treated as a review smell. + +## Default Answers To Uncertainty + +Use these defaults when implementation hits an unclear case: + +| Question | Default | +| --- | --- | +| Is attribution strict enough? | no upgrade | +| Is toolpart order stable? | no multi-change upgrade | +| Does snapshot text prove the operation? | no upgrade | +| Does warning removal feel broad? | preserve warning | +| Is content text or binary? | treat as unavailable | +| Does old event replacement behavior seem unclear? | new imports only | +| Is cache invalidation unclear? | do not rely on cache for proof | +| Does UI need an OpenCode-specific branch? | fix shared helper or stop | +| Is performance impact unclear? | keep flag off or single-change only | + +These defaults are part of the safety design, not temporary indecision. + +## Formal Proof Predicates + +Implement proof decisions through small predicates that can be unit tested +directly. Avoid spreading equivalent checks across several branches. + +```ts +function isReadableFullText(value: string | null | undefined): value is string { + return typeof value === 'string' +} + +function isKnownAbsent(state: { exists?: boolean } | undefined): boolean { + return state?.exists === false +} + +function hasUnavailableReason(state: { unavailableReason?: string } | undefined): boolean { + return typeof state?.unavailableReason === 'string' && state.unavailableReason.length > 0 +} + +function isProvenCreate(change: ReviewSafetyContract): boolean { + return ( + isKnownAbsent(change.beforeState) && + isReadableFullText(change.afterContent) && + !hasUnavailableReason(change.afterState) + ) +} + +function isProvenModify(change: ReviewSafetyContract): boolean { + return ( + isReadableFullText(change.beforeContent) && + isReadableFullText(change.afterContent) && + !hasUnavailableReason(change.beforeState) && + !hasUnavailableReason(change.afterState) + ) +} + +function isProvenDelete(change: ReviewSafetyContract): boolean { + return ( + isReadableFullText(change.beforeContent) && + change.afterState?.exists === false && + !hasUnavailableReason(change.beforeState) + ) +} + +function isSnapshotReviewSafe(change: ReviewSafetyContract): boolean { + return ( + change.evidenceProof === 'opencode-snapshot' || + change.evidenceProof === 'inverse-edit-chain' || + change.evidenceProof === 'inverse-apply-patch-chain' || + change.evidenceProof === 'toolpart-chain' + ) && ( + isProvenCreate(change) || + isProvenModify(change) || + isProvenDelete(change) + ) +} +``` + +The real implementation can use existing helper names, but tests should cover +the predicates above as behavior. In particular, `unavailableReason` on a side +that claims full proof should make the change unsafe. + +## Atomicity And Failure Semantics + +Snapshot proof should behave atomically at three levels. + +Per change: + +- Success returns one upgraded change. +- Failure returns the original change unchanged plus diagnostics. +- No intermediate state should be visible to importer validation. + +Per same-path chain: + +- Success upgrades every unresolved change in the chain. +- Failure upgrades none of the unresolved changes in the chain. +- Already exact changes may remain exact, but they must not be rewritten by the + failed chain attempt. + +Per import batch: + +- Candidate validation runs after proof upgrade. +- If import fails, review safety must not observe a partially imported safe + state. +- Retry uses stable source import keys. + +Implementation pattern: + +```ts +const original = change +const decision = tryUpgradeChange(change) +if (decision.type === 'skipped') { + diagnostics.push(decision.reason) + return original +} +const upgraded = decision.change +if (!isSnapshotReviewSafe(upgraded)) { + diagnostics.push('snapshot-upgrade-skipped/postcondition-failed') + return original +} +return upgraded +``` + +Do not mutate `change` in place before postconditions pass. + +## Postconditions + +Every successful upgrade must satisfy these postconditions: + +```ts +function assertUpgradePostconditions(input: { + original: ReconstructedOpenCodeToolChange + upgraded: ReconstructedOpenCodeToolChange +}): boolean { + const { original, upgraded } = input + return ( + original.sourceImportKey === upgraded.sourceImportKey && + original.taskId === upgraded.taskId && + original.teamName === upgraded.teamName && + original.memberName === upgraded.memberName && + original.sessionId === upgraded.sessionId && + original.sourcePartId === upgraded.sourcePartId && + original.sourceMessageId === upgraded.sourceMessageId && + original.relativePath === upgraded.relativePath && + upgraded.evidenceProof !== 'metadata-only-fallback' && + isSnapshotReviewSafe(upgraded) + ) +} +``` + +If a postcondition fails, keep the original change and emit a diagnostic. A +postcondition failure is a bug in proof logic, not a reason to relax safety. + +## Runtime Assertion Policy + +Assertions should catch programmer errors without making production data unsafe. + +Rules: + +- In tests, postcondition failures should fail loudly. +- In production backfill, postcondition failures should skip the upgrade, + preserve the original metadata-only change, and emit a diagnostic. +- Assertions must never catch an error and continue with upgraded content. +- Assertions must not include file content in thrown messages. +- Assertions should include stable identifiers such as task id, source part id, + source import key, and normalized relative path. + +Suggested pattern: + +```ts +function enforceUpgradePostconditions(input: { + original: ReconstructedOpenCodeToolChange + upgraded: ReconstructedOpenCodeToolChange + diagnostics: string[] +}): ReconstructedOpenCodeToolChange { + if (assertUpgradePostconditions(input)) { + return input.upgraded + } + input.diagnostics.push( + `snapshot-upgrade-skipped/postcondition-failed:${input.original.sourceImportKey}`, + ) + return input.original +} +``` + +Do not use runtime assertions to justify looser proof predicates. Assertions are +a last guard, not the proof itself. + +## Implementation Phases + +### Phase 0 - Audit Contracts Before Behavior Changes + +This phase should be completed before any runtime behavior change. + +Audit: + +- Ledger import behavior for duplicate `sourceImportKey`. +- Review bundle dedupe behavior. +- Existing rejectability helper. +- Existing OpenCode backfill cache and in-flight behavior. +- `materializeMetadataOnlyChanges` serialization of proof fields. +- Current real-data snapshot diagnostics for a few OpenCode teams. + +Deliverable: + +```text +Contract audit: +- sourceImportKey duplicate policy: replace | supersede | append | unknown +- review bundle dedupe key: ... +- rejectability helper: ... +- metadata materialization preserves proof fields: yes | no +- observed snapshot shape fingerprint: ... +- can proceed to Phase 1: yes | no +``` + +If any field is `unknown`, do not proceed to behavior changes. + +### Phase 1 - Add Targeted Diagnostics + +Add diagnostics that explain why a metadata-only change was not upgraded. +This makes real-data validation much easier. + +Examples: + +- `snapshot-upgrade-skipped/no-window` +- `snapshot-upgrade-skipped/ambiguous-window` +- `snapshot-upgrade-skipped/no-file-anchor` +- `snapshot-upgrade-skipped/binary` +- `snapshot-upgrade-skipped/too-large` +- `snapshot-upgrade-skipped/path-chain-ambiguous` +- `snapshot-upgrade-skipped/toolpart-after-mismatch` +- `snapshot-upgrade-skipped/current-disk-not-proof` +- `snapshot-upgrade-skipped/strict-delivery-required` +- `snapshot-upgrade-skipped/unsupported-snapshot-shape` +- `snapshot-upgrade-skipped/warning-preserved` +- `snapshot-upgrade-skipped/feature-flag-off` + +These diagnostics should not be user-noisy by default, but they should be +available in backfill result diagnostics and tests. + +Diagnostics should be structured internally even if the public result remains a +string array: + +```ts +type SnapshotUpgradeDiagnosticCode = + | 'snapshot-upgrade-skipped/no-window' + | 'snapshot-upgrade-skipped/ambiguous-window' + | 'snapshot-upgrade-skipped/no-file-anchor' + | 'snapshot-upgrade-skipped/operation-mismatch' + | 'snapshot-upgrade-skipped/toolpart-after-mismatch' + | 'snapshot-upgrade-skipped/path-chain-ambiguous' + | 'snapshot-upgrade-skipped/strict-delivery-required' + | 'snapshot-upgrade-skipped/unsupported-snapshot-shape' + +function pushSnapshotDiagnostic( + diagnostics: string[], + code: SnapshotUpgradeDiagnosticCode, + detail: string, +): void { + diagnostics.push(`${code}: ${detail}`) +} +``` + +Using a small typed union makes it harder to accidentally invent inconsistent +diagnostics throughout the proof code. + +### Phase 2 - Make Upgrade Eligibility Explicit + +Add a helper that decides whether a change needs snapshot proof. It should skip +already exact full-text changes. + +```ts +function needsSnapshotProof(change: ReconstructedOpenCodeToolChange): boolean { + if (change.evidenceProof === 'toolpart-chain') { + return false + } + if (change.beforeContent !== null && change.afterContent !== null) { + return false + } + if (change.beforeState?.exists === false && change.afterContent !== null) { + return false + } + if (change.beforeContent !== null && change.afterState?.exists === false) { + return false + } + return ( + change.sourceTool === 'write' || + change.sourceTool === 'edit' || + change.sourceTool === 'apply_patch' || + change.sourceTool === 'snapshot_patch' + ) +} +``` + +Use this helper before expensive snapshot work where possible: + +```ts +const changesNeedingProof = params.changes.filter(needsSnapshotProof) +if (changesNeedingProof.length === 0) { + return result +} +``` + +Important: this helper should reduce work, not reduce safety. If in doubt, +include a change in snapshot proof attempt and let the proof logic reject it. + +Add a second helper for safety eligibility. It must be stricter than +`needsSnapshotProof`. + +```ts +function mayUseSnapshotProof(input: { + attributionMode: OpenCodeLedgerAttributionMode + change: ReconstructedOpenCodeToolChange + mode: SnapshotProofUpgradeMode +}): boolean { + if (input.mode === 'off') { + return false + } + if (input.attributionMode !== 'strict-delivery') { + return false + } + if (!needsSnapshotProof(input.change)) { + return false + } + return input.change.attributionMethod === 'delivery-ledger-taskrefs' +} +``` + +This keeps "should we spend time trying?" separate from "is this proof allowed +to affect review safety?". + +Add a third helper for apply eligibility. `shadow` may compute proof, but must +not apply it. + +```ts +function mayApplySnapshotProof(input: { + mode: SnapshotProofUpgradeMode + changeCountForPathWindow: number +}): boolean { + if (input.mode === 'off' || input.mode === 'shadow') { + return false + } + if (input.mode === 'single-change') { + return input.changeCountForPathWindow === 1 + } + return input.mode === 'full' +} +``` + +The call site should look structurally like this: + +```ts +const decision = tryComputeSnapshotProof(change) +stats.record(decision) +if (!mayApplySnapshotProof({ mode, changeCountForPathWindow })) { + return originalChange +} +return decision.type === 'upgraded' ? decision.change : originalChange +``` + +This prevents diagnostics-only validation from accidentally changing imported +review events. + +Also make the final proof decision return a typed result instead of a nullable +change. Nullable returns tend to hide why an upgrade failed. + +```ts +type SnapshotProofDecision = + | { + type: 'upgraded' + change: ReconstructedOpenCodeToolChange + proof: Exclude + } + | { + type: 'skipped' + reason: SnapshotUpgradeDiagnosticCode + preserveOriginal: true + } + +function preserveOriginal( + reason: SnapshotUpgradeDiagnosticCode, +): SnapshotProofDecision { + return { type: 'skipped', reason, preserveOriginal: true } +} +``` + +Callers should be forced to handle both branches. A skipped decision must return +the original change unchanged except for diagnostics collected outside the +change object. + +### Phase 3 - Strengthen Snapshot Anchor Matching + +Snapshot anchors should be accepted only when all these conditions hold: + +1. The change belongs to a strict delivery session. +2. The source toolpart belongs to exactly one snapshot window. +3. The snapshot window belongs to the same OpenCode message. +4. The normalized touched path is inside the session worktree. +5. The snapshot reader returns an anchor for the exact relative path. +6. Text content is full text, not binary, and within existing limits. +7. The file operation is compatible with the tool operation. + +Add a small validation helper: + +```ts +type SnapshotAnchorValidation = + | { ok: true } + | { ok: false; reason: string } + +function validateSnapshotAnchorForChange(input: { + change: ReconstructedOpenCodeToolChange + anchor: SnapshotFileAnchor | undefined +}): SnapshotAnchorValidation { + const { change, anchor } = input + if (!anchor) { + return { ok: false, reason: 'snapshot-upgrade-skipped/no-file-anchor' } + } + + if (change.operation === 'create' && anchor.operation !== 'create') { + return { ok: false, reason: 'snapshot-upgrade-skipped/operation-mismatch' } + } + + if (change.operation === 'delete' && anchor.operation !== 'delete') { + return { ok: false, reason: 'snapshot-upgrade-skipped/operation-mismatch' } + } + + if (change.operation === 'modify' && anchor.operation === 'create') { + return { ok: false, reason: 'snapshot-upgrade-skipped/operation-mismatch' } + } + + return { ok: true } +} +``` + +Do not rely only on operation matching. It is a gate, not proof. + +The validation helper should also distinguish these concepts: + +- `anchor operation`: what the snapshot diff says happened to the file. +- `tool operation`: what the reconstructed toolpart thinks happened. +- `review operation`: what the imported task-change event will expose. + +If these disagree, do not silently rewrite the operation unless the snapshot +transition and tool semantics both prove the new value. For example, a `write` +with no previous baseline may be reconstructed as `modify`; if snapshot says +`create` and before is absent, it may be upgraded to `create`. A reconstructed +`edit` must not become `create` or `delete`. + +Add source identity checks before path checks: + +```ts +function isSameSourceWindow(input: { + change: ReconstructedOpenCodeToolChange + windowMessageId: string + windowId: string + matchedWindowIds: string[] +}): boolean { + return ( + input.change.sourceMessageId === input.windowMessageId && + input.matchedWindowIds.length === 1 && + input.matchedWindowIds[0] === input.windowId + ) +} +``` + +The exact data shape can differ, but the check must prove message-local and +single-window identity before using the snapshot file anchor. + +### Phase 4 - Upgrade Single-Change Snapshot Proof + +For one change touching a file within a snapshot window, upgrade directly if the +snapshot anchor proves the full transition. + +Rules: + +- `write` create: + - accept when `beforeState.exists === false` and `afterContent` is full text. + - if toolpart content exists, require it to equal snapshot after content. +- `write` modify: + - accept when both snapshot before and after are full text. + - if toolpart content exists, require it to equal snapshot after content. +- `edit` modify: + - accept when both snapshot before and after are full text. + - require applying `oldString -> newString` to before to equal after, unless the edit came from a verified snapshot patch. +- `apply_patch`: + - accept when snapshot before and after are full text. + - if parsed hunks exist, verify before-to-after application or inverse chain. +- delete: + - accept when snapshot before is full text and snapshot after is absent. + +For phase 4, do not support "maybe" `apply_patch` upgrades without parsed hunks. +Keep those for phase 5 or leave them metadata-only. This reduces the first +behavior change to the most provable cases. + +Example helper: + +```ts +function applyEditExactlyOnce(input: { + before: string + oldString: string | undefined + newString: string | undefined +}): string | null { + if ( + typeof input.oldString !== 'string' || + typeof input.newString !== 'string' || + input.oldString === input.newString + ) { + return null + } + if (countOccurrences(input.before, input.oldString) !== 1) { + return null + } + return input.before.replace(input.oldString, input.newString) +} +``` + +Example upgrade: + +```ts +function upgradeEditFromSnapshot(input: { + change: ReconstructedOpenCodeToolChange + anchor: SnapshotFileAnchor +}): ReconstructedOpenCodeToolChange | null { + const before = input.anchor.beforeContent + const after = input.anchor.afterContent + if (typeof before !== 'string' || typeof after !== 'string') { + return null + } + + const applied = applyEditExactlyOnce({ + before, + oldString: input.change.oldString, + newString: input.change.newString, + }) + if (applied !== after) { + return null + } + + return { + ...input.change, + beforeContent: before, + afterContent: after, + beforeState: contentStateForText(before), + afterState: contentStateForText(after), + confidence: 'high', + evidenceProof: 'opencode-snapshot', + snapshotId: input.anchor.snapshotId, + snapshotSource: input.anchor.snapshotId ? 'opencode' : undefined, + warnings: stripManualOnlyWarnings(input.change.warnings, input.anchor.warnings), + } +} +``` + +Add a generic transition verifier so write/edit/apply_patch decisions share the +same state checks: + +```ts +type VerifiedTransition = + | { ok: true; beforeContent: string | null; afterContent: string | null; operation: 'create' | 'modify' | 'delete' } + | { ok: false; reason: SnapshotUpgradeDiagnosticCode } + +function verifySnapshotTransition(input: { + change: ReconstructedOpenCodeToolChange + anchor: SnapshotFileAnchor +}): VerifiedTransition { + const before = input.anchor.beforeContent + const after = input.anchor.afterContent + + if (input.anchor.operation === 'create') { + return typeof after === 'string' + ? { ok: true, beforeContent: null, afterContent: after, operation: 'create' } + : { ok: false, reason: 'snapshot-upgrade-skipped/no-file-anchor' } + } + + if (input.anchor.operation === 'delete') { + return typeof before === 'string' + ? { ok: true, beforeContent: before, afterContent: null, operation: 'delete' } + : { ok: false, reason: 'snapshot-upgrade-skipped/no-file-anchor' } + } + + if (typeof before !== 'string' || typeof after !== 'string') { + return { ok: false, reason: 'snapshot-upgrade-skipped/no-file-anchor' } + } + + return { ok: true, beforeContent: before, afterContent: after, operation: 'modify' } +} +``` + +This function should not be the final proof for `edit` or `apply_patch`. It only +proves that snapshot text exists for the operation state. + +Before returning an upgraded change, verify the emitted states match the emitted +content: + +```ts +function assertStateMatchesContent(input: { + beforeContent: string | null + afterContent: string | null + beforeState: ReconstructedOpenCodeToolChange['beforeState'] + afterState: ReconstructedOpenCodeToolChange['afterState'] +}): boolean { + if (input.beforeContent !== null) { + const expected = contentStateForText(input.beforeContent) + if (input.beforeState?.sha256 !== expected.sha256) { + return false + } + } + if (input.afterContent !== null) { + const expected = contentStateForText(input.afterContent) + if (input.afterState?.sha256 !== expected.sha256) { + return false + } + } + return true +} +``` + +If this assertion fails, keep the original metadata-only change and emit a +diagnostic. Do not import inconsistent state/content. + +### Phase 5 - Upgrade Multi-Change Same-Path Chains + +When several changes touch the same file inside one snapshot window, only +upgrade if the whole chain verifies. + +Algorithm: + +1. Start from snapshot `afterContent`. +2. Walk changes for that path in reverse source order. +3. For each change: + - if it already has full before/after, require its after to equal the cursor. + - for `edit`, reverse `newString -> oldString` exactly once. + - for `apply_patch`, reverse parsed hunks exactly once. + - for `write`, only allow it as the first/oldest operation if snapshot before + matches the previous state or known absent state. +4. If any step is ambiguous, stop and keep all unresolved warnings. +5. If the reverse chain reaches snapshot `beforeContent`, materialize + replacements for every unresolved change in the chain. + +Pseudo-code: + +```ts +function upgradeSamePathChain(input: { + changes: ReconstructedOpenCodeToolChange[] + anchor: SnapshotFileAnchor + diagnostics: string[] +}): Map { + const replacements = new Map() + let cursor = input.anchor.afterContent + + if (typeof cursor !== 'string') { + input.diagnostics.push('snapshot-upgrade-skipped/no-after-anchor') + return replacements + } + + for (let index = input.changes.length - 1; index >= 0; index -= 1) { + const change = input.changes[index] + if (!change) { + continue + } + + const upgraded = reverseOneChangeFromAfter({ change, after: cursor, anchor: input.anchor }) + if (!upgraded) { + input.diagnostics.push(`snapshot-upgrade-skipped/path-chain-ambiguous:${change.relativePath}`) + return new Map() + } + + replacements.set(change.sourceImportKey, upgraded.change) + cursor = upgraded.beforeContent + } + + if (typeof input.anchor.beforeContent === 'string' && cursor !== input.anchor.beforeContent) { + input.diagnostics.push('snapshot-upgrade-skipped/path-chain-boundary-mismatch') + return new Map() + } + + return replacements +} +``` + +This is the highest-risk section. Keep tests dense here. + +If there is any schedule pressure, defer this whole phase. Single-change +upgrades are enough to reduce many warnings and are much less risky. + +Additional multi-change restrictions: + +- Do not cross snapshot-window boundaries. +- Do not cross assistant-message boundaries. +- Do not cross task delivery boundaries. +- Do not mix changes with different `sourceMessageId`. +- Do not mix changes with different normalized `relativePath`. +- Do not include changes whose source import key is missing or duplicated. +- Do not upgrade a chain if any change in the path has an operation that cannot + be reversed from the current cursor. +- Do not upgrade if the final reverse cursor does not exactly equal snapshot + `beforeContent` for modify/delete, or known absence for create. + +Add this explicit guard: + +```ts +function assertSinglePathWindowChain(input: { + changes: ReconstructedOpenCodeToolChange[] +}): boolean { + const relativePaths = new Set(input.changes.map(change => change.relativePath)) + const messageIds = new Set(input.changes.map(change => change.sourceMessageId)) + const importKeys = new Set(input.changes.map(change => change.sourceImportKey)) + return ( + relativePaths.size === 1 && + messageIds.size === 1 && + importKeys.size === input.changes.length + ) +} +``` + +### Phase 6 - Warning Stripping Must Be Conservative + +Only remove warnings that are made false by the new proof. + +Safe to remove after verified before/after: + +- `OpenCode edit was captured without a proven full-text baseline; apply/reject is manual-only.` +- `OpenCode write overwrote an existing file before the bridge had a known baseline; reject is manual-only.` +- `OpenCode apply_patch was captured without full before/after text; review is manual-only.` +- `OpenCode toolpart content was unavailable or too large; review is manual-only.` +- `full review depends on snapshot evidence` + +Do not remove: + +- attribution warnings +- low confidence task boundary warnings +- delivery context warnings +- path outside session directory warnings +- large/binary warnings for other files +- warnings attached to unrelated changes in the same task +- snapshot unavailable warnings attached to the same file +- any warning whose text is not in the known resolved warning predicate + +Example: + +```ts +function isResolvedByFullTextProof(warning: string): boolean { + return ( + warning === 'OpenCode edit was captured without a proven full-text baseline; apply/reject is manual-only.' || + warning === 'OpenCode write overwrote an existing file before the bridge had a known baseline; reject is manual-only.' || + warning === 'OpenCode apply_patch was captured without full before/after text; review is manual-only.' || + warning === 'OpenCode toolpart content was unavailable or too large; review is manual-only.' || + warning.includes('full review depends on snapshot evidence') + ) +} + +function stripManualOnlyWarnings( + existing: string[] | undefined, + snapshotWarnings: string[] | undefined, +): string[] { + return [ + ...(existing ?? []).filter(warning => !isResolvedByFullTextProof(warning)), + ...(snapshotWarnings ?? []), + ].filter(Boolean) +} +``` + +If snapshot warnings contain unavailable content for this exact file, the change +should probably not have been upgraded. Add a test for that. + +### Phase 7 - Preserve Performance Limits And Add Budgets + +Do not increase these limits by default: + +- `maxFiles: 100` +- `maxBytesPerTextFile: 1024 * 1024` +- `maxTotalBytes: 4 * 1024 * 1024` +- `timeoutMs: 3000` + +Additional guard: + +```ts +const unresolved = params.changes.filter(needsSnapshotProof) +if (unresolved.length === 0) { + return result +} + +const touchedRelativePaths = [...new Set(unresolved.map(change => change.relativePath))] +``` + +Do not pass already exact changes into `touchedRelativePaths` unless needed for +chain verification. This keeps snapshot reads narrow. + +Add explicit performance budgets: + +- A no-op backfill with no unresolved OpenCode changes should not invoke the + snapshot reader. +- A strict-delivery task with one unresolved file should read one touched path. +- Snapshot proof attempt should record elapsed time in diagnostics when it + exceeds 500 ms. +- More than two snapshot timeouts in a real-data smoke run blocks rollout. +- The broad real-data smoke should not increase total runtime by more than 10% + compared with the baseline measured before the change. + +Implementation sketch: + +```ts +const startedAt = performance.now() +const snapshotResult = await readSnapshotEvidence() +const elapsedMs = performance.now() - startedAt +if (elapsedMs > 500) { + diagnostics.push(`snapshot-upgrade-slow: ${Math.round(elapsedMs)}ms`) +} +``` + +Use the local runtime timing primitive already used in the orchestrator if +`performance.now()` is not available in that module. + +Add a resource envelope for one backfill call: + +```ts +type SnapshotProofResourceEnvelope = { + maxSnapshotReadsPerBackfill: 10 + maxTouchedPathsPerRead: 100 + maxBytesPerTextFile: 1024 * 1024 + maxTotalBytesPerRead: 4 * 1024 * 1024 + maxElapsedMsPerRead: 3000 +} +``` + +Do not add hidden retries that can multiply these limits. One failed or timed +out snapshot read should produce diagnostics and preserve metadata-only changes. + +### Phase 8 - Idempotency And Existing Ledger Events + +The upgrade may change the materialized content for a source event that was +previously imported as metadata-only. That needs a clear policy. + +Preferred policy: + +1. Keep `sourceImportKey` stable. +2. Let the existing ledger importer treat the upgraded event as the same source + event, not a new file change. +3. If the importer is append-only and cannot update a previous event safely, + do not attempt to rewrite old ledger data in this feature. +4. For new tasks, the upgraded evidence should be imported on the first backfill. +5. For old tasks, a re-backfill can show better evidence only if the existing + ledger/import layer already supports replacing or superseding by source key. + +Add a test for repeated backfill. It should not duplicate files in the review +bundle. + +### Phase 9 - Desktop Contract Validation + +This phase should not add new UI behavior unless tests expose a bug. It validates +that the upgraded events are already consumed safely. + +Checklist: + +- Full-text upgraded OpenCode event renders through the same path as Codex full-text diffs. +- Metadata-only OpenCode event still renders the warning banner. +- Mixed full-text and metadata-only task keeps per-file rejectability. +- `Reject All` skips metadata-only files. +- Current disk preview remains read-only context. +- Task summary warnings remain visible if attribution or boundary warnings remain. + +If any item fails, fix the shared review safety helper rather than adding a +separate OpenCode-specific branch in the UI. + +## Observability And Metrics + +Add counters to diagnostics or existing debug output. They should be cheap and +safe to expose in test logs. + +Suggested counters: + +```ts +type SnapshotProofStats = { + attemptedChanges: number + upgradedChanges: number + skippedChanges: number + skippedByReason: Record + snapshotReadCount: number + snapshotReadTimeouts: number + snapshotReadElapsedMs: number + touchedPathCount: number + exactToolpartChainCount: number + metadataOnlyFallbackCount: number +} +``` + +Use these stats in smoke output: + +```text +OpenCode snapshot proof: +- attempted: 12 +- upgraded: 7 +- skipped: 5 +- skipped/no-window: 2 +- skipped/path-chain-ambiguous: 1 +- skipped/too-large: 2 +- snapshot reads: 3 +- snapshot read time: 184ms +``` + +Metrics must not include file content or secrets. Paths are acceptable only if +the existing diagnostics already expose paths in the same context. + +## Deterministic Output Comparison + +Use deterministic fingerprints to compare `off`, `shadow`, and apply modes. +This catches accidental behavior changes that are hard to see in UI screenshots. + +Suggested fingerprint input: + +```ts +type ReviewBundleFingerprintInput = Array<{ + taskId: string + relativePath: string + sourceImportKey: string + evidenceProof: string | undefined + operation: string + beforeSha256?: string + afterSha256?: string + warningCount: number + rejectable: boolean +}> +``` + +Rules: + +- `off` and `shadow` fingerprints must match except for diagnostics/stats. +- `single-change` may change OpenCode entries only. +- `full` may change OpenCode entries only. +- Non-OpenCode entries must have identical fingerprints in every mode. +- Fingerprints must not include raw file content. + +If a mode comparison fails, inspect the structured diff before looking at UI. + +## Cache And Re-Backfill Policy + +The safest initial policy is: + +- New backfills may import upgraded proof. +- Existing metadata-only events should not be rewritten unless the current + importer already has a proven source-key replacement/supersede path. +- The desktop cache should not be globally invalidated. +- A task-specific refresh may re-read after successful OpenCode import. +- If cache behavior is unclear, tests should bypass cache and the rollout should + leave old events unchanged. + +Pseudo-policy: + +```ts +type ExistingEventPolicy = 'new-imports-only' | 'supersede-by-source-key' + +function chooseExistingEventPolicy(audit: { + importerSupersedesBySourceKey: boolean + reviewBundleDedupesBySourceKey: boolean +}): ExistingEventPolicy { + return audit.importerSupersedesBySourceKey && audit.reviewBundleDedupesBySourceKey + ? 'supersede-by-source-key' + : 'new-imports-only' +} +``` + +Do not create a third policy that appends upgraded duplicates and relies on UI +filtering to hide the old event. + +## Rollback Runbook + +Rollback must be possible without data repair. + +Immediate rollback: + +```bash +OPENCODE_SNAPSHOT_PROOF_UPGRADE=off +``` + +Expected behavior after rollback: + +- New OpenCode backfills return to previous metadata-only/manual-only behavior + for cases without exact toolpart chains. +- Existing already-imported upgraded events remain valid historical full-text + events. Do not delete them as part of rollback. +- No new upgraded events should be imported while the flag is off. +- Desktop review should continue to render previously imported full-text events. + +If rollback is needed because upgraded duplicates were imported: + +1. Do not add renderer-side filtering as a permanent fix. +2. Identify whether duplicates share `sourceImportKey`. +3. Fix importer/source-key dedupe. +4. Add a regression test with the duplicated event fixture. +5. Only then consider a one-off ledger cleanup, and only with explicit user + approval. + +If rollback is needed because of performance: + +1. Keep diagnostics. +2. Disable proof upgrade. +3. Preserve exact `toolpart-chain` behavior. +4. Inspect snapshot read counters and touched path counts. +5. Re-enable only after reducing reads, not after raising limits. + +## Implementation Slices + +Prefer these slices even if the work lands in one PR. Each slice should compile +and have focused tests before the next slice starts. + +1. Diagnostics only: + - Add typed diagnostic codes. + - Add stats object. + - No behavior change. +2. Eligibility only: + - Add feature flag parser. + - Add `needsSnapshotProof` and `mayUseSnapshotProof`. + - Prove `off` mode has no behavior change. +3. Shadow proof: + - Compute proof decisions and stats. + - Return original changes to importer. + - Compare `shadow` and `off` outputs. +4. Single-change proof: + - Implement formal predicates. + - Implement create/modify/delete proof for one path/window/change. + - Keep multi-change groups skipped. +5. Import/idempotency validation: + - Verify source-key dedupe or choose `new-imports-only`. + - Add repeated-backfill tests. +6. Desktop validation: + - Verify shared rejectability consumes upgraded events safely. + - No OpenCode-specific renderer branch unless a shared helper bug is found. +7. Multi-change proof: + - Implement only after ordering contract tests pass. + - Keep behind `full`. +8. Default enablement: + - Enable `single-change` only after real-data smoke. + - Enable `full` only in a separate rollout decision. + +Stop points: + +- It is acceptable to stop after slice 3 and ship only `shadow`. +- It is acceptable to stop after slice 4 and ship only `single-change`. +- It is acceptable to stop after diagnostics if real data shows unsupported + snapshot shape. +- It is not acceptable to ship multi-change proof without real or synthetic + chain coverage. + +## Definition Of Done By Mode + +### `off` + +- No behavior change from current metadata-only/full-text decisions. +- Diagnostics may mention that the feature is disabled. +- Tests prove no upgraded event appears in this mode. + +### `shadow` + +Required before any apply mode can be default: + +- Proof attempts run for eligible OpenCode changes. +- Importer receives the original change list. +- Stats include would-upgrade and skipped counts. +- No review diff, rejectability, warning, or file count changes. +- Real-data smoke shows non-OpenCode teams unchanged. +- Performance budget passes while proof is computed but not applied. + +### `single-change` + +Required before this mode can be default: + +- Only one unresolved change for a path/window can upgrade. +- Multi-change path/window groups are skipped with diagnostics. +- `write` create/modify, `edit` modify, and delete cases have positive and + negative tests. +- Non-OpenCode teams are unchanged in real-data smoke. +- Metadata-only count for OpenCode tasks decreases or stays equal. +- No duplicate review rows after repeated backfill. + +### `full` + +Required before this mode can be default: + +- Every known unknown that blocks full mode is resolved. +- Same-path multi-change order is proven by tests. +- Chain upgrades are all-or-nothing for unresolved changes. +- Real-data smoke includes at least one actual multi-change chain or a synthetic + fixture with equivalent shape. +- `full` mode can be disabled without changing code. +- A separate rollout decision enables `full`; it must not become default as a + side effect of implementing single-change mode. + +## Edge Case Matrix + +### Attribution and Task Boundaries + +- No delivery context: + - Do not run strict snapshot upgrade. + - Keep existing backfill skipped behavior. +- Delivery context exists but does not include the requested task: + - Keep `no-attribution` behavior. Do not use compatible fallback for safe full-text. +- Compatible attribution mode: + - Do not upgrade to auto-safe full text. + - Reason: task ownership is not strict enough. +- Missing task start boundary: + - Snapshot proof may prove file content, but task boundary warning remains. +- Estimated end boundary: + - Snapshot proof may prove file content, but boundary warning remains. +- Same OpenCode session contains several tasks: + - Only strict delivery records for the requested task are eligible. +- Same member touches same file for two tasks: + - Do not merge changes across delivery windows. +- Multiple members share an OpenCode profile: + - Require the delivery record member/lane/session match. Do not trust profile alone. +- Runtime delivery ledger was reset after launch: + - No strict delivery context means no safe upgrade. Keep warnings. +- Delivery record has task refs but missing observed assistant message: + - Do not use message-local snapshot proof unless the toolpart can still be + tied to the delivered prompt through existing strict delivery matching. +- Delivery record has a pre-prompt cursor but no post-prompt cursor: + - Keep strict-delivery matching conservative. Do not widen to the whole session. +- Task display id matches but canonical task id differs: + - Use canonical task id for safe upgrades. + +### Snapshot Windows + +- No snapshot windows: + - Keep metadata-only warning. +- Toolpart outside window: + - Keep metadata-only warning. +- Toolpart matches multiple windows: + - Keep metadata-only warning. +- Window has before hash but no after hash: + - Keep metadata-only warning. +- Window has after hash but no before hash: + - Allow create only if file absence is explicitly proven. Otherwise keep warning. +- Snapshot diff contains the path but operation is unknown: + - Keep metadata-only warning. +- Snapshot diff includes more changed files than reconstructed toolparts: + - Upgrade only exact reconstructed paths. Add diagnostic for extra snapshot paths. +- Snapshot diff misses a reconstructed path: + - Keep that path metadata-only. +- OpenCode SQLite changed during read: + - Existing transaction snapshot is okay, but add diagnostic. +- OpenCode schema changed: + - Treat as unsupported history shape, no upgrade. +- Snapshot git store object is missing: + - Keep metadata-only warning and include the store diagnostic. +- Snapshot git store read times out: + - Keep metadata-only warning and include timeout diagnostic. +- Snapshot window hashes exist but git-store object is pruned: + - Keep metadata-only warning and include retention diagnostic. +- Snapshot window hashes exist but point to an object from a different project: + - Treat as workspace mismatch and skip. +- Snapshot window contains no reconstructed changes after path filtering: + - Do not read files for that window. +- Snapshot reader returns duplicate entries for one relative path: + - Treat as ambiguous and skip that path. +- Snapshot reader returns content for a path with different casing: + - Use existing normalized comparison key. If identity is ambiguous, skip. +- Snapshot window is valid but OpenCode part JSON was truncated by our reader + cap: + - Treat affected toolparts as metadata-only. Do not combine partial part data + with snapshot proof. +- Snapshot window contains changes from a tool type not modeled by this plan: + - Keep those changes metadata-only until the tool type has explicit tests. + +### File Content + +- Text over size limit: + - Keep warning with `too-large`. +- Binary or null-byte: + - Keep warning with `binary`. +- Empty file: + - Valid text content. Do not confuse empty string with unavailable. +- Missing file after delete: + - Valid delete if before content is known. +- Missing file before create: + - Valid create if after content is known. +- File exists before create operation: + - Operation mismatch, no upgrade. +- File absent after modify operation: + - Operation mismatch, no upgrade. +- Content normalizes differently by line endings: + - Do not normalize for proof. Exact byte-equivalent UTF-8 text comparison is required. +- Content has invalid UTF-8: + - Treat as binary/unavailable. +- Generated/minified text below limit: + - It can be upgraded if full text is available, but review UI may still choose to collapse display. +- File mode-only changes: + - Do not create a text diff upgrade unless text before/after also changed or mode changes are explicitly modeled. +- Very small binary file: + - Size does not make it text. Binary detection still wins. +- UTF-16 or other non-UTF-8 text: + - Treat as unavailable unless the existing snapshot reader explicitly decodes + and hashes the exact same text representation used by review events. +- Secrets in file content: + - Do not log content in diagnostics. Existing ledger storage rules apply to + before/after blobs. +- Git LFS pointer file: + - Treat the pointer text as the file content if that is what the snapshot + contains. Do not dereference LFS objects. +- Sparse checkout missing working-tree file: + - Irrelevant for proof. Snapshot evidence may still be valid, but execution + safety must handle current disk conflict separately. +- Submodule path: + - Do not read inside submodule git data unless existing snapshot reader + explicitly models submodules. Treat as metadata-only otherwise. +- Permission denied reading snapshot object: + - Keep metadata-only warning and include a permission diagnostic. + +### Edit Semantics + +- `oldString === newString`: + - Skip as no-op as current code does. +- `oldString` missing: + - Cannot prove edit from toolpart alone. +- `newString` missing: + - Cannot prove edit from toolpart alone. +- `oldString` appears twice: + - No upgrade unless snapshot chain proves exact transition through another trusted source. +- `newString` appears twice when reversing: + - No inverse upgrade. +- Replacement creates same final content through multiple possible paths: + - No upgrade. +- `replaceAll` or multi-replacement edit shape appears: + - Do not upgrade until that tool shape is explicitly parsed and tested. +- Edit tool reports success but snapshot before does not contain `oldString`: + - No upgrade. +- Edit tool reports success but snapshot after does not contain `newString`: + - No upgrade unless the replacement legitimately deletes the string and the exact transition verifies. +- Empty `oldString`: + - Do not upgrade. Empty search strings are ambiguous. +- Empty `newString`: + - Valid deletion only when `oldString` occurs exactly once and snapshot after + equals the deletion result. +- Overlapping replacements: + - Do not upgrade unless exact before-to-after application has one valid path. + +### Write Semantics + +- Write creates new file: + - Upgrade only if before absent and after content known. +- Write overwrites existing file: + - Upgrade only if snapshot before and snapshot after are known. +- Toolpart content differs from snapshot after: + - No upgrade. +- Toolpart content is truncated: + - Snapshot can still prove after only if snapshot after is available and operation is fully verified. + - Keep a diagnostic that toolpart content was truncated but snapshot proof was used. +- Existing file baseline unavailable: + - Keep current warning. +- Write after earlier edit in the same path/window: + - Treat as chain case. Do not single-change upgrade. +- Write followed by edit in the same path/window: + - Treat as chain case. Single-change upgrade is not enough. +- Write content is available but snapshot after is unavailable: + - Do not use toolpart content alone for existing-file overwrite baseline. +- Write content equals previous content: + - It may be a no-op. Do not create a misleading modify diff unless snapshot + shows a real text state transition or the review event model supports no-op + changes explicitly. +- Write creates parent directories: + - Only the file text is in scope. Directory creation is not a text proof. + +### Apply Patch Semantics + +- Patch text unavailable: + - Snapshot can prove final file-level before/after only if window has exact path anchor. +- Parsed update hunks apply exactly once: + - Allow inverse chain proof. +- Parsed hunks apply multiple places: + - No upgrade. +- Patch creates/deletes file: + - Verify operation with snapshot before/after states. +- Patch touches files not in metadata: + - Add diagnostic and do not upgrade missing paths unless snapshot path proof is exact. +- Patch contains rename: + - Do not upgrade as text modify unless rename support is explicitly modeled. +- Patch changes file mode only: + - Keep metadata-only unless mode changes are supported by the review event schema. +- Patch contains CRLF-sensitive context: + - Exact text verification is required. Do not line-ending-normalize. +- Patch partially applies in reverse: + - No upgrade. All hunks must verify. +- Patch has context-only hunks: + - Do not treat context as a change without before/after text proof. +- Patch deletes and recreates the same file in one patch: + - Treat as ambiguous unless the parser explicitly models it and tests cover it. + +### Paths and Workspaces + +- Absolute paths outside workspace: + - Reject upgrade. +- `..` path traversal: + - Reject upgrade. +- Windows path separators: + - Normalize, then validate. +- Symlink points outside workspace: + - Do not read current disk. Snapshot git store path normalization should be trusted only for repository paths. +- Session directory is subdirectory: + - Touched paths outside session directory get diagnostic. Do not let this alone prove or disprove content. +- Case-insensitive filesystem: + - Use existing OpenCode path comparison helpers. +- Unicode normalization differences in file names: + - Use existing normalized path keys. Do not add a second normalization scheme in this feature. +- Nested git repository inside workspace: + - Verify snapshot identity against the OpenCode project worktree, not just the process cwd. +- Worktree moved after task: + - Use recorded project identity. If workspace identity cannot be trusted, no upgrade. +- Workspace root is a symlink: + - Use existing workspace comparison helpers. Do not add ad-hoc `realpath` + behavior unless tests cover both symlinked and non-symlinked roots. +- File path contains newline or control characters: + - Do not include raw path in diagnostics without escaping. Upgrade only if + existing path normalization accepts it safely. +- Case-only rename: + - Treat as rename/path operation, not a text modify, unless the review event + schema explicitly models it. +- Path appears both as file and directory across before/after: + - Keep metadata-only unless snapshot reader explicitly models the transition. + +### Concurrency and Later Changes + +- Current disk changed after task: + - Irrelevant. Do not use current disk for proof. +- Another member changed same file after OpenCode task: + - Snapshot proof remains historical. Review conflict detection must happen elsewhere. +- Backfill runs twice: + - Source import keys must dedupe. +- Backfill interrupted: + - Existing ledger import must remain idempotent. +- OpenCode host is still writing SQLite: + - Rely on read-only transaction and existing fingerprint diagnostics. Do not retry aggressively. +- Two backfills run concurrently: + - Existing in-flight dedupe should prevent duplicate desktop calls. The importer must still dedupe by source key. +- User manually edits a file while review is open: + - Snapshot proof remains historical. Apply/reject conflict handling is outside this feature. +- Team is relaunched while backfill is running: + - Use run/session identity from the delivery context. Do not merge new runtime + sessions into the old task proof. +- Snapshot proof succeeds but ledger import fails: + - Retry should be idempotent by source key. Diagnostics should not mark the + task as safely upgraded until import succeeds. +- Snapshot store is pruned between capability check and file read: + - Treat the read failure as metadata-only fallback. Do not retry from current disk. +- OpenCode writes a new assistant message while backfill reads SQLite: + - Use the read-only transaction snapshot and existing fingerprint diagnostics. + Do not merge later rows into the current proof attempt. +- SQLite WAL is corrupt or cannot be read: + - Treat session history as unavailable/unsupported. Do not use partial rows for + safe proof. +- OpenCode JSON row is malformed: + - Skip that row and keep affected changes metadata-only. Do not infer from + surrounding rows. + +### UI And Review Semantics + +- Full-text upgrade enables normal diff rendering only if the imported event has + both safe baseline and safe target state. +- Metadata-only warnings should remain visible and should not be hidden by task + summary aggregation. +- `Reject All` must still skip non-rejectable files. +- A task-level warning may remain even when all file diffs are full-text. +- A file-level warning may remain even when another file in the same task is upgraded. +- Do not change viewed-count behavior in this feature. +- Do not hide task cards solely because all OpenCode warnings were resolved. +- Do not change accept/reject button labels or statuses in this feature. +- Do not mark a file viewed just because snapshot proof succeeded. +- A file becoming review-safe does not mean reject execution must succeed if the + user changed the worktree after the task. +- Conflict messaging for apply/reject should remain the existing shared + behavior, not a new OpenCode-only message path. +- If a task card warning disappears because all file-level OpenCode baseline + warnings were resolved, task-boundary and attribution warnings must still + remain visible. +- The UI should not describe a snapshot-upgraded file as "guaranteed safe". + It is "review-safe" or "full-text verified"; execution can still conflict. +- Do not add success toasts or celebratory messaging for proof upgrades. This is + infrastructure, not a user-facing achievement. + +### Security And Privacy + +- Do not log before/after content. +- Do not include long snippets in diagnostics. +- Do not include raw paths with control characters in diagnostics. +- Do not include delivery payload text in proof stats. +- Do not expand file size limits for convenience. +- Do not add a new IPC path that exposes arbitrary snapshot reads. +- If a file is upgraded, it is stored through the existing task-change ledger + content path. Do not add a second storage location. + +### Serialization And Backward Compatibility + +- Older desktop builds may see new diagnostics but should not require a new + event schema to render metadata-only fallback. +- Missing `snapshotSource` should not crash review rendering. +- Missing `snapshotId` should not crash review rendering. +- Unknown `evidenceProof` values should be treated as unsafe by review safety + helpers. +- JSON serialization must preserve empty string content. +- JSON serialization must distinguish absent file from empty file. +- Large content omitted by limits must serialize as unavailable state, not empty + content. + +## Test Plan + +### Risk To Test Traceability + +| Risk | Required test/smoke | +| --- | --- | +| Cross-task contamination | real-data smoke with at least two tasks in one OpenCode session | +| Cross-member contamination | fixture with shared profile but different member/lane | +| Wrong snapshot window | unit test with overlapping windows and outside-window toolpart | +| False baseline from current disk | unit test proving current disk is never consulted | +| Unsafe warning removal | unit test with unrelated `manual-only` warning preserved | +| Duplicate imported events | repeated-backfill bridge test | +| Performance regression | smoke budget with snapshot read counters | +| Unsupported OpenCode shape | snapshot provider unsupported-shape test | +| Mixed safe/unsafe task | desktop integration test for `Reject All` skipping metadata-only | +| Cache stale result | bridge or desktop worker test bypassing/invalidating cache deliberately | +| Capability false positive | fixture with snapshot enabled but missing store object | +| Shadow mode mutation | fingerprint comparison between `off` and `shadow` | +| Snapshot retention loss | fixture where window exists but object read fails | +| Execution conflict bypass | desktop/review test where current disk differs from expected after | +| Memory/storage blowup | fixture with many small files exceeding total byte budget | +| Malformed OpenCode rows | offline reader/reconstructor fixture with malformed part JSON | + +### Negative Control Fixtures + +Negative controls are cases that look close to valid proof but must not upgrade. + +Required negative controls: + +- Same file path, same member, but different task id. +- Same task id, same file path, but different member/lane. +- Same session and file path, but toolpart outside the snapshot window. +- Snapshot before/after text exists, but `oldString` occurs twice. +- Snapshot after equals toolpart content, but before is unavailable. +- Snapshot path matches, but operation is rename or mode-only. +- Current disk matches expected after, but snapshot before is missing. +- `shadow` computes an upgrade decision, but imported fingerprint matches `off`. +- Existing metadata-only event appears before upgraded event with same source key. +- Unknown `evidenceProof` appears in imported data. + +Each negative control should assert both behavior and diagnostic reason. A +negative control without a reason is hard to debug and easy to regress. + +### Golden Fixture Coverage Matrix + +Maintain a small set of golden fixtures that cover the supported state space. + +| Fixture | Mode | Expected | +| --- | --- | --- | +| write-create-text | `single-change` | upgraded create | +| write-modify-text | `single-change` | upgraded modify | +| edit-modify-once | `single-change` | upgraded modify | +| delete-text | `single-change` | upgraded delete | +| duplicate-old-string | `single-change` | metadata-only | +| missing-before | `single-change` | metadata-only | +| toolpart-outside-window | `single-change` | metadata-only | +| shadow-valid-edit | `shadow` | would-upgrade stats, original import | +| non-opencode-task | all modes | unchanged fingerprint | +| missing-snapshot-object | all apply modes | metadata-only | +| multi-change-chain | `single-change` | skipped | +| multi-change-chain | `full` | upgraded only if chain verifies | + +Golden fixtures should be tiny and deterministic. They should not depend on +wall-clock time, filesystem case behavior, or the user's current worktree. + +### Unit Tests + +Add or extend `OpenCodeChangeEvidenceEnricher.test.ts`. + +Tests: + +1. Upgrades metadata-only edit from exact snapshot before/after. +2. Does not upgrade edit when `oldString` appears twice. +3. Does not upgrade edit when snapshot after does not equal applied result. +4. Upgrades write create with before absent and after text. +5. Upgrades write modify when toolpart content equals snapshot after. +6. Does not upgrade write modify when toolpart content differs from snapshot after. +7. Upgrades delete with before text and after absent. +8. Does not remove attribution warnings after content proof. +9. Keeps manual-only warning when anchor has unavailable before/after content. +10. Multi-edit same-path chain upgrades only when the whole chain verifies. +11. Multi-edit same-path chain keeps all metadata-only fallbacks when one step is ambiguous. +12. Snapshot provider unavailable keeps current behavior. +13. Does not remove unrelated `manual-only` warning text. +14. Keeps task boundary warnings after successful content proof. +15. Does not upgrade compatible attribution mode. +16. Does not upgrade when feature flag is `off`. +17. Single-change mode skips multi-change chain upgrade. +18. Empty file create and empty file modify are handled as valid text. +19. CRLF/LF mismatch fails proof instead of normalizing. +20. Duplicate source import keys block chain upgrade. +21. Empty `newString` deletion upgrades only with exact single occurrence. +22. Empty `oldString` never upgrades. +23. State hashes must match emitted full text. +24. Snapshot anchor duplicate path entry skips upgrade. +25. `write` no-op does not create a misleading diff. +26. Skipped proof preserves the original change object fields. +27. Successful proof mutates only allowed fields. +28. Snapshot proof decision returns typed skipped reason, not `null`. +29. Unsupported snapshot shape never upgrades. +30. Existing-event policy defaults to `new-imports-only` when dedupe is unknown. +31. State machine cannot jump from candidate to upgraded without transition verification. +32. Unstable part order blocks multi-change upgrade. +33. Unknown `evidenceProof` is unsafe in review safety helper. +34. Empty string survives materialization and serialization. +35. Absent file is not serialized as empty file. +36. `shadow` mode computes proof stats but returns original changes. +37. `mayApplySnapshotProof` blocks multi-change groups in `single-change`. +38. Exhaustive switches fail compilation when a new mode/proof decision is not handled. +39. Capability success is required before snapshot proof attempt. +40. Missing snapshot git-store object keeps metadata-only fallback. +41. `off` and `shadow` review bundle fingerprints match. +42. Non-OpenCode fingerprints are identical across all modes. +43. Review-safe upgraded change still fails reject execution when current disk mismatches expected after. +44. Total byte budget skips excess files as metadata-only. +45. Malformed OpenCode part JSON cannot produce upgraded proof. +46. LFS pointer text is not dereferenced. +47. Submodule paths stay metadata-only unless explicitly modeled. +48. Runtime postcondition failure preserves original metadata-only change. +49. Every golden fixture has a paired negative control. +50. Minimum safe scope excludes unsupported operation shapes. + +Example fixture shape: + +```ts +const change: ReconstructedOpenCodeToolChange = { + taskId: 'task-1', + taskRef: 'task-1', + taskRefKind: 'canonical', + teamName: 'team', + memberName: 'alice', + sessionId: 'session', + assistantMessageId: 'message-1', + toolUseId: 'tool-1', + sourcePartId: 'part-1', + sourceMessageId: 'message-1', + sourceTool: 'edit', + sourceImportKey: 'session:part-1:src/app.ts', + filePath: '/workspace/src/app.ts', + relativePath: 'src/app.ts', + beforeContent: null, + afterContent: null, + operation: 'modify', + confidence: 'medium', + attributionMethod: 'delivery-ledger-taskrefs', + oldString: 'const value = 1', + newString: 'const value = 2', + beforeState: { exists: true, unavailableReason: 'opencode-edit-baseline-not-captured' }, + afterState: { exists: true, unavailableReason: 'opencode-edit-final-content-unavailable' }, + evidenceProof: 'metadata-only-fallback', + warnings: ['OpenCode edit was captured without a proven full-text baseline; apply/reject is manual-only.'], + timestamp: new Date(0).toISOString(), +} +``` + +### Synthetic Fixture Schema + +Use a compact fixture builder so edge cases do not depend entirely on live +OpenCode data. + +```ts +type SnapshotProofFixture = { + name: string + mode: SnapshotProofUpgradeMode + attributionMode: OpenCodeLedgerAttributionMode + delivery: { + teamName: string + taskId: string + memberName: string + laneId?: string + sessionId: string + assistantMessageId: string + } + windows: Array<{ + messageId: string + windowId: string + fromSnapshot: string + toSnapshot: string + startPartOrder: number + finishPartOrder: number + }> + parts: Array<{ + partId: string + messageId: string + order: number + tool: 'write' | 'edit' | 'apply_patch' + filePath: string + oldString?: string + newString?: string + content?: string + }> + snapshotFiles: Array<{ + relativePath: string + beforeContent?: string + afterContent?: string + beforeExists: boolean + afterExists: boolean + }> + expected: { + upgraded: number + metadataOnly: number + diagnostics: string[] + } +} +``` + +Fixture rules: + +- Every positive fixture needs a paired negative fixture that differs by one + proof condition. +- Fixtures should prefer tiny strings so failures are easy to inspect. +- Fixtures must include at least one empty string case and one absent-file case. +- Fixtures must include one path with unsafe characters for diagnostics escaping. +- Fixtures must not include secrets or large blobs. + +### Snapshot Provider Tests + +Extend `OpenCodeSnapshotEvidenceProvider.test.ts`. + +Tests: + +1. Groups only unresolved proof-needed changes into touched paths. +2. Emits diagnostic for missing window. +3. Emits diagnostic for ambiguous window. +4. Preserves existing limits. +5. Does not read snapshot for unrelated exact changes. +6. Does not match windows across assistant messages. +7. Emits diagnostic for extra snapshot paths not in reconstructed toolparts. +8. Emits timeout diagnostic while preserving metadata-only fallback. +9. Does not read snapshot windows with no unresolved touched paths. +10. Escapes unsafe path text in diagnostics. + +### Ledger Bridge Tests + +Extend `OpenCodeLedgerBridgeService` tests or add a focused fixture test. + +Tests: + +1. Backfill imports upgraded full-text event for strict delivery OpenCode edit. +2. Backfill keeps metadata-only event for compatible attribution. +3. Backfill keeps metadata-only event with no delivery context. +4. Imported event has stable source import key and dedupes on rerun. +5. `snapshotShapeFingerprint` is present when snapshot proof was used. +6. Repeated backfill does not duplicate file entries. +7. Old metadata-only imported event is not rewritten unless importer already supports superseding by source key. +8. Snapshot proof is not attempted for Codex or Anthropic members. +9. Snapshot proof is not attempted for OpenCode exact `toolpart-chain` changes. +10. Backfill cache does not return stale metadata-only data after an upgraded import in the same test. +11. Import failure leaves no partial safe-review state. + +### Desktop Integration Tests + +Only if needed. The desktop review UI already handles full text and metadata-only. + +Smoke check: + +1. Full-text OpenCode upgraded event renders a real diff. +2. Metadata-only event still renders manual-only warning. +3. Reject is enabled only for full-text safe baseline. +4. Warnings remain visible for task boundary uncertainty. +5. `Reject All` skips a mixed task where one OpenCode file upgraded and another stayed metadata-only. +6. Current disk preview remains read-only and does not become a reject baseline. +7. Viewed count is unchanged by proof upgrade. +8. Task-level boundary warning remains visible after all file diffs upgrade. +9. Reject execution still blocks when current disk no longer matches the expected + after state. +10. Bulk `Reject All` rejects only files that pass both review-safety and + execution-safety checks. + +### Property-Like Tests + +Add small table-driven tests for transition verification: + +```ts +const editCases = [ + { name: 'single replacement', before: 'a = 1', oldString: '1', newString: '2', after: 'a = 2', ok: true }, + { name: 'duplicate old', before: 'a 1 b 1', oldString: '1', newString: '2', after: 'a 2 b 1', ok: false }, + { name: 'empty old', before: 'abc', oldString: '', newString: 'x', after: 'xabc', ok: false }, + { name: 'delete exactly once', before: 'abc', oldString: 'b', newString: '', after: 'ac', ok: true }, +] +``` + +The point is not random fuzzing. The point is to make ambiguous replacement +rules explicit and hard to regress. + +### Real Data Smoke + +Before implementation, capture a baseline: + +```bash +time pnpm test --run test/main/services/team/TaskChangeComputer.test.ts +time pnpm test --run test/main/services/team/ChangeExtractorService.test.ts +``` + +After implementation, run the same commands: + +```bash +pnpm test --run test/main/services/team/TaskChangeComputer.test.ts +pnpm test --run test/main/services/team/ChangeExtractorService.test.ts +``` + +Then run the existing real-data smoke scripts used for task changes. Required +checks: + +- `errors: 0` +- no increase in item errors +- no cross-task file leakage +- no increase in metadata-only count for OpenCode tasks +- no change for Codex-only teams +- no change for Anthropic-only teams +- broad smoke runtime increase <= 10% +- snapshot timeout count <= 2 +- upgraded OpenCode full-text count is explainable by diagnostics +- no decrease in task-boundary warnings unless task-boundary code changed separately +- `off` and `shadow` fingerprints match except diagnostics/stats +- non-OpenCode fingerprints match in all modes + +Manual target cases: + +- `relay-works-3/#1f735bea` +- `relay-works-3/#bf01e5c3` +- `relay-works-3/#43e6b9b0` should remain Codex-related, not OpenCode-upgraded +- `signal-ops-22` should remain unaffected because it has no OpenCode members +- any OpenCode team with real `snapshotShapeFingerprint` present in diagnostics +- one team with missing/reset delivery ledger, if available + +Add at least one synthetic OpenCode snapshot fixture if real data lacks a clean +single-change full-text snapshot case. Real data validates integration, but a +synthetic fixture is better for precise edge cases. + +Real-data smoke should compare before/after summaries: + +```text +Before: +- OpenCode metadata-only file changes: N +- OpenCode full-text file changes: M +- non-OpenCode full-text file changes: X +- task-boundary warnings: B + +After: +- OpenCode metadata-only file changes: <= N +- OpenCode full-text file changes: >= M +- non-OpenCode full-text file changes: X +- task-boundary warnings: B +``` + +Any non-OpenCode count change is a blocker. + +### Failure Injection Tests + +Add targeted failure injection where practical: + +- Snapshot provider throws. +- Snapshot provider times out. +- Snapshot provider returns duplicate path entries. +- Ledger importer rejects the batch. +- Backfill runs twice with the same source import key. +- Feature flag changes from `single-change` to `off`. +- Snapshot proof succeeds for one file and fails for another file in the same task. + +Expected result for every failure injection: original metadata-only safety is +preserved, no duplicate review rows, diagnostics explain the skip/failure. + +### Serialization Tests + +Add tests around the task-change event materialization boundary: + +- `beforeContent: ''` remains empty string. +- `afterContent: ''` remains empty string. +- `beforeContent: null` remains unavailable/absent according to state. +- Unknown `evidenceProof` does not make a file rejectable. +- Snapshot fields survive import/export if present. +- Snapshot fields may be absent without renderer crashes. + +### Manual QA Runbook + +Manual QA is not a substitute for tests, but it helps catch integration mistakes. + +Prepare: + +1. Pick one OpenCode team with snapshot evidence. +2. Pick one Codex-only or Anthropic-only team. +3. Record before counts for: + - OpenCode metadata-only files. + - OpenCode full-text files. + - non-OpenCode full-text files. + - task-boundary warnings. + - snapshot proof diagnostics. + +Run with `off`: + +```bash +OPENCODE_SNAPSHOT_PROOF_UPGRADE=off pnpm test --run test/main/services/team/TaskChangeComputer.test.ts +``` + +Expected: + +- No new upgraded OpenCode snapshot events. +- Existing exact toolpart-chain behavior unchanged. + +Run with `shadow`: + +```bash +OPENCODE_SNAPSHOT_PROOF_UPGRADE=shadow pnpm test --run test/main/services/team/TaskChangeComputer.test.ts +``` + +Expected: + +- Snapshot proof stats are emitted. +- Would-upgrade counts are visible. +- Imported/reviewed changes are identical to `off`. +- Any difference from `off` outside diagnostics is a blocker. + +Run with `single-change`: + +```bash +OPENCODE_SNAPSHOT_PROOF_UPGRADE=single-change pnpm test --run test/main/services/team/TaskChangeComputer.test.ts +``` + +Expected: + +- OpenCode full-text count may increase. +- OpenCode metadata-only count may decrease or stay equal. +- non-OpenCode counts are unchanged. +- Multi-change groups are skipped with diagnostics. + +Run with `full` only after tests pass: + +```bash +OPENCODE_SNAPSHOT_PROOF_UPGRADE=full pnpm test --run test/main/services/team/TaskChangeComputer.test.ts +``` + +Expected: + +- Same guarantees as `single-change`. +- Multi-change upgrades appear only when diagnostics can explain the full chain. + +UI spot check: + +- Open a mixed task with one upgraded file and one metadata-only file. +- Verify the upgraded file shows a diff. +- Verify the metadata-only file still shows a warning. +- Verify `Reject All` skips metadata-only files. +- Verify current disk preview is not treated as baseline. +- Verify task boundary warnings remain if present. + +Any mismatch is a blocker. + +## Acceptance Criteria + +The implementation is acceptable only if all are true: + +- OpenCode-only behavior changed. +- Strict delivery remains required for snapshot full-text upgrades. +- Exact existing `toolpart-chain` behavior is unchanged. +- Metadata-only fallback still works. +- No current disk content is used as historical proof. +- No broad OpenCode session scan is introduced. +- Snapshot read limits are unchanged or narrower. +- Ambiguous chains keep warnings. +- Large and binary files keep warnings. +- Tests cover same-path multi-change chains. +- Real-data smoke shows no cross-task leakage. +- Feature flag can disable the upgrade. +- Repeated backfill does not duplicate review files. +- Warning removal is limited to known resolved warning predicates. +- Performance budgets pass. +- The implementation has an explicit fallback for unsupported OpenCode snapshot shapes. +- The implementation includes Phase 0 contract audit notes in the PR/commit + description or test output. +- No warning is removed unless a unit test names that exact warning or predicate. +- No current-disk preview path is involved in a proof decision. +- No behavior change occurs when `OPENCODE_SNAPSHOT_PROOF_UPGRADE=off`. +- Smoke output includes attempted/upgraded/skipped counts. +- Full mode is not enabled while any known unknown remains unresolved. +- Existing metadata-only events are not rewritten unless source-key supersede is + proven by tests. +- Cache behavior is documented in the Phase 0 audit. +- Composite proof identity is enforced before snapshot text is trusted. +- Toolpart ordering is explicitly verified before multi-change upgrades. +- `single-change` and `full` have separate definitions of done. +- Serialization preserves empty string versus absent file. +- `shadow` mode proves expected upgrades without changing imported review events. +- Exhaustive handling covers every proof decision and feature flag mode. +- Capability gates are checked per session, not inferred from config alone. +- Missing/pruned snapshot store objects preserve metadata-only fallback. +- Deterministic fingerprints prove non-OpenCode behavior is unchanged. +- Apply/reject execution safety still checks current disk state after review + proof succeeds. +- Storage and memory budgets are enforced without duplicate blob storage. +- Malformed/truncated OpenCode rows cannot produce upgraded proof. +- First apply rollout stays within the minimum safe scope. +- Negative controls prove close-but-invalid cases remain metadata-only. +- Runtime postcondition failures preserve original changes. + +## Verification Command Matrix + +Use the narrowest useful commands first, then broader smoke. + +| Layer | Command or check | Required result | +| --- | --- | --- | +| Typecheck | `pnpm typecheck` | passes | +| Enricher unit | targeted `OpenCodeChangeEvidenceEnricher` tests | passes | +| Snapshot provider | targeted `OpenCodeSnapshotEvidenceProvider` tests | passes | +| Ledger bridge | targeted `OpenCodeLedgerBridgeService` tests | passes | +| Desktop review safety | targeted review/rejectability tests | passes | +| Off mode | task-change tests with `OPENCODE_SNAPSHOT_PROOF_UPGRADE=off` | old behavior | +| Shadow mode | task-change tests with `OPENCODE_SNAPSHOT_PROOF_UPGRADE=shadow` | stats only | +| Single-change mode | task-change tests with `OPENCODE_SNAPSHOT_PROOF_UPGRADE=single-change` | only one-change upgrades | +| Full mode | task-change tests with `OPENCODE_SNAPSHOT_PROOF_UPGRADE=full` | only after chain tests | +| Real data | existing task-change smoke on OpenCode and non-OpenCode teams | no leakage | + +Do not use `full` smoke as a substitute for single-change smoke. They prove +different safety boundaries. + +## Code Review Checklist + +Use this checklist before merging the implementation: + +- Every upgraded change has a non-`metadata-only-fallback` proof. +- Every upgraded modify has both `beforeContent` and `afterContent`. +- Every upgraded create has `beforeState.exists === false` and `afterContent`. +- Every upgraded delete has `beforeContent` and `afterState.exists === false`. +- State hashes match emitted content. +- No branch reads current disk as proof. +- No branch catches an error and upgrades anyway. +- Every skipped branch preserves the original change. +- Warning stripping uses a central predicate. +- Multi-change mode can be disabled independently. +- Snapshot reader limits are unchanged or narrower. +- Tests include at least one negative case for every positive upgrade case. +- Real-data smoke includes at least one OpenCode team and one non-OpenCode team. +- No new IPC or filesystem read path bypasses existing workspace trust checks. +- No content appears in diagnostics, metrics, or thrown error messages. +- `off` mode is covered by a test and is easy to use during rollback. +- The proof logic is structured so forbidden state transitions are not possible + without an obvious code review smell. +- `shadow` mode has been run on real data before any apply mode is enabled. +- Any new union member requires an exhaustive switch update, not a permissive + default branch. +- Review-safe and execution-safe are checked separately. +- Large-file and total-byte budget tests prove metadata-only fallback. +- Minimum safe scope is visible in code structure, not only in tests. +- Negative controls exist for task, member, window, baseline, and operation + mismatch. + +## Implementation Anti-Patterns + +Do not implement the feature using these patterns: + +- A broad `try/catch` that returns an upgraded change on partial data. +- Mutating the original change object in place before proof has succeeded. +- Removing warnings before the final proof decision. +- Reading current disk to fill `beforeContent`. +- Comparing normalized line endings for proof. +- Treating a matching hash as content. +- Creating OpenCode-specific rejectability logic in the renderer. +- Appending upgraded duplicate events and expecting UI sorting to hide stale ones. +- Increasing snapshot limits to make a test pass. +- Falling back from strict delivery to compatible attribution for safety. +- Adding `full` mode as the default in the same change that introduces it. +- Treating empty string as missing content. +- Treating missing content as empty string. +- Sorting same-path chains by only one field. +- Retrying snapshot reads in a loop without a budget. +- Treating review-safe as automatically execution-safe. +- Adding a second blob store or cache for before/after content. +- Dereferencing Git LFS or submodule content outside the existing snapshot reader. +- Using malformed partial OpenCode JSON rows as proof context. +- Expanding the first apply mode to unsupported operations because the snapshot + text happens to be available. +- Hiding uncertainty by changing user-facing wording from warning to success. + +## Rollout Strategy + +1. Land diagnostics and helper functions with no behavior change if practical. +2. Add the feature flag with default `off` in tests where needed. +3. Add snapshot-first upgrade for single-change same-path cases. +4. Run targeted tests and real-data smoke. +5. Enable `single-change` mode for local smoke. +6. Add multi-change chain upgrade only after tests are solid. +7. Move to `full` mode only if multi-change smoke is clean. +8. Inspect warnings before and after for OpenCode tasks. + +If multi-change support looks risky during implementation, stop after +single-change mode. Single-change upgrade is already useful and lower risk. + +Recommended shipping sequence: + +```text +PR 1: diagnostics + eligibility helpers + no behavior change +PR 2: single-change snapshot proof upgrade behind flag +PR 3: enable single-change by default for OpenCode strict delivery +PR 4: multi-change chain upgrade behind flag +PR 5: enable full mode only after real-data smoke +``` + +If this stays as one PR, keep the same commit structure locally and verify each +step before moving to the next one. + +## Abort Conditions + +Do not continue implementation if any of these happens: + +- Snapshot windows cannot be reliably matched to toolparts. +- Existing OpenCode snapshot shape differs from tests in real data. +- Real-data smoke shows any new cross-task file leakage. +- Performance smoke shows repeated timeouts. +- A change would require using current disk as proof. +- A change would require broad compatible attribution scanning. +- Warning stripping needs broad substring matching to pass tests. +- Multi-change support requires accepting ambiguous edit/apply-patch replacements. +- Source import key dedupe behavior is unclear. +- The only available validation is manual UI inspection. +- A test has to assert against current wall-clock timing without a stable budget. +- Formal proof predicates require exceptions to support the first implementation. +- Postconditions fail for any positive fixture. +- `single-change` mode needs multi-change assumptions to pass. +- `full` mode needs renderer-specific special cases to appear safe. +- Rollback with `OPENCODE_SNAPSHOT_PROOF_UPGRADE=off` does not restore old + behavior for new backfills. +- Empty content and missing content cannot be distinguished at serialization. + +## Open Questions Template For Implementation PR + +Every implementation PR should answer these in its description: + +```text +OpenCode snapshot proof PR checklist: +- Mode implemented: diagnostics | single-change | full +- Default mode: +- Phase 0 contract audit completed: yes | no +- Source import key duplicate policy: +- Review bundle dedupe key: +- Rejectability helper: +- Existing event policy: new-imports-only | supersede-by-source-key +- Snapshot shape fingerprint observed: +- Real-data teams tested: +- Non-OpenCode teams unchanged: yes | no +- Snapshot proof stats: +- Rollback tested with OPENCODE_SNAPSHOT_PROOF_UPGRADE=off: yes | no +- Known unknowns remaining: +``` + +If the PR cannot answer one of these, it should not enable new behavior by +default. + +## Example Final Change Shape + +Before upgrade, metadata-only edit: + +```json +{ + "sourceTool": "edit", + "before": { + "exists": true, + "unavailableReason": "opencode-edit-baseline-not-captured" + }, + "after": { + "exists": true, + "unavailableReason": "opencode-edit-final-content-unavailable" + }, + "evidenceProof": "metadata-only-fallback", + "warnings": [ + "OpenCode edit was captured without a proven full-text baseline; apply/reject is manual-only." + ] +} +``` + +After verified snapshot upgrade: + +```json +{ + "sourceTool": "edit", + "before": { + "exists": true, + "sha256": "before-hash", + "sizeBytes": 128 + }, + "after": { + "exists": true, + "sha256": "after-hash", + "sizeBytes": 128 + }, + "evidenceProof": "opencode-snapshot", + "snapshotSource": "opencode", + "warnings": [] +} +``` + +If any proof check fails, the event must stay in the first shape. + +## Notes for Future Maintainers + +The important invariant is not "fewer warnings". The invariant is "warnings are +removed only when the system has stronger evidence than before". + +Warnings are correct when historical full text is not proven. A warning is a +better outcome than an unsafe reject button.