# OpenCode Snapshot-First Proof Upgrade Plan

## Goal

Reduce false or avoidable OpenCode review warnings for new tasks by upgrading
metadata-only OpenCode `edit`, `write`, and `apply_patch` changes to verified
full-text before/after changes when, and only when, existing OpenCode snapshot
evidence proves the exact file state transition.

The implementation must be fail-closed:

- If proof is complete, store full before/after content and remove manual-only warnings.
- If proof is incomplete, ambiguous, too large, binary, outside scope, or unavailable, keep the current warning.
- Never use current disk content as proof for historical before/after.
- Never broaden attribution outside strict delivery context.

## Non-goals

- Do not change Codex or Anthropic task extraction.
- Do not change generic review UI semantics.
- Do not infer diffs from current disk.
- Do not scan unrelated OpenCode sessions.
- Do not increase OpenCode snapshot file size limits as part of this work.
- Do not retroactively "fix" old tasks unless existing ledger backfill has strict delivery and snapshot evidence.

## Current System Facts

Desktop repo:

- `ChangeExtractorService` requests OpenCode ledger backfill only when delivery context is available.
- Backfill goes through `OpenCodeReadinessBridge.backfillOpenCodeTaskLedger`.
- Imported events already support full before/after content and metadata-only fallbacks.

Orchestrator repo:

- `OpenCodeProfileManager.buildManagedConfig()` already sets `snapshot: true`.
- `OpenCodeLedgerBridgeService.backfill()` reconstructs toolpart changes, then calls `OpenCodeChangeEvidenceEnricher.enrich()`.
- `OpenCodeOfflineSessionReader` reads OpenCode SQLite history in read-only mode and extracts snapshot windows.
- `OpenCodeSnapshotEvidenceProviderService` reads before/after snapshot file contents with strict limits.
- `OpenCodeToolpartChangeReconstructor` already creates exact `toolpart-chain` changes when it has a known baseline.
- `OpenCodeChangeEvidenceEnricher` already upgrades some metadata-only changes through snapshot or inverse chain proof.

This plan should strengthen the existing evidence path rather than introduce a
new capture subsystem.

## Risk Estimate

Recommended implementation:

- Functional bug risk: 3/10.
- Performance regression risk: 2/10.
- Data safety risk: 2/10.
- Complexity: 7/10.
- Approximate runtime change size: 220-450 LOC.
- Approximate total change size with tests and diagnostics: 450-900 LOC.

The low data safety risk depends on preserving fail-closed behavior. If any
step starts accepting guesses as proof, data safety risk becomes 7/10 or worse.

## Hard Safety Invariants

These invariants are more important than reducing warnings.

1. A full-text upgrade must be tied to one task, one member, one OpenCode
   session, one delivery record, one assistant message, one toolpart, and one
   snapshot window.
2. The upgrade must be local to OpenCode. Codex, Anthropic, generic task-log
   parsing, and non-OpenCode review flows must not change.
3. `strict-delivery` is required for every snapshot-based full-text upgrade.
   Compatible attribution may still import metadata-only events, but it must not
   produce auto-safe before/after content.
4. Current disk content is never historical evidence. It can be displayed as
   read-only context by the desktop, but it cannot remove a warning or enable
   safe reject.
5. Hash-only evidence is not full-text evidence. A hash can verify text that was
   already read from a trusted snapshot, but a hash alone is not enough.
6. Empty string is valid full text. `null` and `undefined` mean unavailable.
7. Large, binary, truncated, path-unsafe, or schema-unsupported content stays
   metadata-only.
8. A failed upgrade must preserve the original change event shape as much as
   possible. It may add diagnostics, but it must not remove warnings or mutate
   operation/confidence.
9. Imported event idempotency must remain based on existing source import keys.
   The upgrade must not create duplicate events for the same toolpart/path.
10. Any multi-change path chain must be all-or-nothing for unresolved changes in
    that path/window. Partial upgrades are allowed only for changes that were
    already independently exact before the chain attempt.
11. The implementation must never make a previously non-rejectable change
    rejectable unless both `beforeContent` and the target after/absence state
    are proven from the same trusted historical evidence path.
12. Diagnostics are allowed to become more detailed. They are not allowed to be
    used as a substitute for proof.
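
Invariant 6 is easy to break with a truthiness check, because `''` is falsy in JavaScript. A minimal sketch of the distinction follows. The helper name `hasFullText` is hypothetical and not an existing project symbol.

```ts
// Hypothetical helper: separates "proven empty file" from "content unavailable".
function hasFullText(content: string | null | undefined): content is string {
  // Deliberately not `Boolean(content)`: '' is valid full text for an empty file.
  return typeof content === 'string'
}
```

Any code path that decides between full-text and metadata-only could route through one check like this instead of using `if (content)`.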

## Things That Are Explicitly Not Proof

These signals can be useful diagnostics, but they must not remove warnings or
enable safe reject by themselves:

- Current disk content matching an expected hash.
- Current disk content matching `newString`.
- A file path appearing in OpenCode tool metadata.
- A file path appearing in a snapshot diff without readable before/after text.
- A before/after hash without the corresponding text blob.
- An OpenCode tool status of `completed`.
- The absence of an error in the toolpart.
- The task title mentioning the same directory.
- A member name matching the expected teammate.
- A session id matching but no strict delivery record.
- A snapshot window in the same session but a different assistant message.
- A snapshot window that overlaps several toolparts ambiguously.
- A successful manual UI render of current disk preview.

If implementation pressure makes any of these tempting, stop and keep the
warning.

## Threat Model

The feature is not security-sensitive in the network sense, but it is
data-safety-sensitive. The main threat is a false full-text proof that enables
safe reject/apply for the wrong historical state.

Bug classes to defend against:

1. Cross-task contamination:
   - A file change from task A appears in task B review.
   - Main defense: strict delivery, canonical task id, source message/window
     matching, real-data smoke.
2. Cross-member contamination:
   - A teammate using the same OpenCode profile is attributed to another member.
   - Main defense: delivery record member/lane/session matching.
3. Cross-window contamination:
   - A toolpart is matched to the wrong snapshot window in the same message.
   - Main defense: exactly-one window matching and order tests.
4. False baseline:
   - Current disk or hash-only evidence is treated as historical before text.
   - Main defense: "not proof" list and code review checklist.
5. Unsafe warning removal:
   - UI stops warning about a file that is still manual-only.
   - Main defense: central warning predicate and negative warning tests.
6. Duplicate imported events:
   - The same source toolpart appears twice after re-backfill.
   - Main defense: source-key idempotency audit and repeated-backfill tests.
7. Silent performance regression:
   - Snapshot proof reads too many blobs or times out often.
   - Main defense: proof-needed filtering, existing limits, timing counters.
8. Unsupported upstream shape:
   - OpenCode changes SQLite/snapshot schema and our parser guesses.
   - Main defense: shape fingerprint, unsupported fallback, abort condition.

For every bug class above, the implementation needs at least one negative test
or real-data smoke assertion.

## Pre-Implementation Audit Checklist

Before writing runtime code, answer these questions from the current codebase:

1. Does the task-change ledger importer update, replace, supersede, or append
   events with the same `sourceImportKey`?
2. Does the desktop review bundle dedupe by source key, file path, event id, or
   a computed change id?
3. Which exact helper decides whether a file is rejectable?
4. Which warnings are currently surfaced in `TeamChangesSection` versus the
   full review dialog?
5. Does the OpenCode backfill cache hide an upgraded result for up to 60 seconds
   after a first metadata-only result?
6. Does `materializeMetadataOnlyChanges` preserve `evidenceProof`,
   `snapshotId`, `snapshotSource`, and warnings exactly?
7. Can the snapshot provider return `beforeState`/`afterState` with hashes but
   no text, and how is that serialized into task-change events?
8. Are OpenCode snapshot windows always message-local in current real data?
9. Are there real examples where a single toolpart touches more than one file?
10. Are there real examples where `apply_patch` contains rename or mode-only
    changes?
11. Does the snapshot provider ever return duplicate file entries for the same
    normalized path?
12. Does the task-change worker cache bundle results independently from the
    OpenCode backfill cache?
13. Are task-level warnings derived only from imported events, or can they come
    from boundary parsing separately?
14. Does a safe reject require `beforeContent`, or can `beforeState.exists === false`
    plus `afterContent` be enough for creates?
15. Is there any existing telemetry/log sink where structured counters can be
    emitted without leaking file contents?

If any answer is unknown, add a focused diagnostic or unit test before changing
behavior. Do not use implementation guesses for these contracts.

## Decision Gates

These gates must be passed in order. Do not skip gates to get fewer warnings
faster.

| Gate | Required evidence | If not met |
| --- | --- | --- |
| G0 contract audit | importer, bundle, rejectability, cache behavior known | no runtime change |
| G1 diagnostics-only | new diagnostics pass tests with no behavior change | fix diagnostics first |
| G2 shadow proof | proof computes stats but imports original changes | keep behavior disabled |
| G3 single-change proof | positive and negative single-change tests pass | keep apply disabled |
| G4 real-data single-change smoke | OpenCode results improve or stay the same; non-OpenCode results unchanged | do not enable default |
| G5 multi-change proof | all chain tests pass, no ambiguous branch accepted | keep `full` unavailable |
| G6 real-data full smoke | no cross-task leakage, budgets pass | keep default `single-change` |
| G7 rollback check | `OPENCODE_SNAPSHOT_PROOF_UPGRADE=off` restores old behavior | do not ship |

The implementation should be easy to stop after G4. Single-change mode is a
valid ship point; `full` mode is optional.

## Known Unknowns That Block Full Mode

`full` mode must stay disabled if any of these are still unknown:

- Whether the importer supersedes or appends duplicate `sourceImportKey` events.
- Whether real OpenCode data has nested or overlapping snapshot windows.
- Whether real OpenCode `apply_patch` parts include rename, chmod, or binary
  patch shapes.
- Whether multi-change same-path chains occur often enough to justify the risk.
- Whether review bundle dedupe can handle upgraded old events without duplicate
  rows.
- Whether snapshot proof stats can be collected without logging sensitive
  content.

Unknowns do not block diagnostics or single-change mode. They block `full` mode.

## Assumption Ledger

Keep an explicit ledger of assumptions. Each assumption needs a validation path
and a fallback. Do not leave assumptions implicit in implementation code.

| Assumption | Validation | Fallback if false |
| --- | --- | --- |
| OpenCode snapshot windows are message-local | unit fixture and real-data diagnostics | metadata-only fallback |
| Source import keys are stable across re-backfill | repeated-backfill test | new-imports-only, no old rewrite |
| Review bundle dedupes safely | Phase 0 audit and bridge test | do not upgrade old events |
| Empty string survives materialization | serialization test | do not upgrade empty files |
| Existing reject helper checks current disk | desktop contract test | fix shared helper before enabling |
| Snapshot store objects remain readable long enough | retention fixture and diagnostics | metadata-only fallback |
| Part ordering is stable enough for chains | ordering unit tests | disable `full` |
| Warning predicates are complete | unit tests naming every removed warning | preserve warning |
| Stats can be emitted without content | log review and tests | disable stats or redact harder |
| Non-OpenCode fingerprints stay identical | real-data mode comparison | keep apply modes disabled |

If an assumption has no validation path, it should be moved to "Known Unknowns"
and block `full` mode.

## Capability And Version Gates

Do not assume that `snapshot: true` in managed OpenCode config means snapshot
evidence is usable for every session. Treat snapshot proof as a runtime
capability that must be observed for the specific session being backfilled.

Required capability checks:

- OpenCode SQLite schema is supported.
- Session identity includes project id, directory, worktree, and git VCS.
- Session worktree matches the expected workspace root.
- Snapshot windows are present and paired.
- Snapshot git store reader reports the expected shape fingerprint.
- Snapshot file evidence can be read under the existing limits.
- The proof path sees the same normalized relative path in reconstruction and
  snapshot evidence.

Suggested result type:

```ts
type SnapshotProofCapability =
  | {
      supported: true
      shapeFingerprint: string
      sessionId: string
      projectId: string
    }
  | {
      supported: false
      code:
        | 'sqlite-schema-unsupported'
        | 'session-identity-missing'
        | 'workspace-mismatch'
        | 'snapshot-window-missing'
        | 'snapshot-store-unsupported'
        | 'snapshot-store-missing'
      diagnostics: string[]
    }
```

Rules:

- Unsupported capability returns metadata-only fallback.
- Unknown capability returns metadata-only fallback.
- Capability diagnostics may be emitted in `shadow`.
- Capability success alone is not proof. It only allows proof attempts.
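
The rules above can be sketched as a small gate in front of any proof attempt. This is an illustration only: `gateProofAttempt`, `ProofGate`, and the `capability-unknown` diagnostic are hypothetical names, and `Capability` is a simplified local mirror of the suggested result type.

```ts
// Simplified local mirror of the suggested SnapshotProofCapability type.
type Capability =
  | { supported: true; shapeFingerprint: string; sessionId: string; projectId: string }
  | { supported: false; code: string; diagnostics: string[] }

type ProofGate =
  | { attempt: true }
  | { attempt: false; fallback: 'metadata-only-fallback'; diagnostics: string[] }

// Hypothetical gate: unknown (undefined) and unsupported capability both fail closed.
function gateProofAttempt(capability: Capability | undefined): ProofGate {
  if (capability === undefined) {
    return {
      attempt: false,
      fallback: 'metadata-only-fallback',
      diagnostics: ['capability-unknown'],
    }
  }
  if (!capability.supported) {
    return {
      attempt: false,
      fallback: 'metadata-only-fallback',
      diagnostics: [capability.code, ...capability.diagnostics],
    }
  }
  // Success only permits a proof attempt; it proves nothing about any file.
  return { attempt: true }
}
```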

## Mode Behavior Matrix

The mode must determine both proof computation and proof application.

| Mode | Compute proof? | Apply proof? | Import changed events? | Intended use |
| --- | --- | --- | --- | --- |
| `off` | no | no | no | rollback and baseline comparison |
| `shadow` | yes | no | no | validate proof quality and performance |
| `single-change` | yes | only one unresolved change per path/window | yes | safe first rollout |
| `full` | yes | one-change and verified chains | yes | optional later rollout |

If the implementation makes `shadow` import different events from `off`, that is
a correctness bug. If it makes `off` compute snapshot proof, that is a
performance bug.
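
One way to keep the matrix from drifting is to encode it as a single lookup and derive the import decision from it. The sketch below assumes hypothetical names (`MODE_BEHAVIOR`, `selectChangesForImport`) and is not taken from the existing codebase.

```ts
type SnapshotProofUpgradeMode = 'off' | 'shadow' | 'single-change' | 'full'

type ModeBehavior = {
  computeProof: boolean
  applySingleChange: boolean
  applyChains: boolean
}

// One row per mode, mirroring the matrix above.
const MODE_BEHAVIOR: Record<SnapshotProofUpgradeMode, ModeBehavior> = {
  off: { computeProof: false, applySingleChange: false, applyChains: false },
  shadow: { computeProof: true, applySingleChange: false, applyChains: false },
  'single-change': { computeProof: true, applySingleChange: true, applyChains: false },
  full: { computeProof: true, applySingleChange: true, applyChains: true },
}

// Hypothetical selector: decides which change list the importer receives.
function selectChangesForImport<T>(
  mode: SnapshotProofUpgradeMode,
  original: T[],
  upgraded: T[] | undefined,
): T[] {
  const behavior = MODE_BEHAVIOR[mode]
  const applies = behavior.applySingleChange || behavior.applyChains
  // `off` and `shadow` always import the original changes, even if proof was computed.
  return applies && upgraded !== undefined ? upgraded : original
}
```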

## Minimum Safe Scope

The first behavior-changing implementation should intentionally support less
than the full theoretical feature.

Allowed in first `single-change` apply mode:

- OpenCode only.
- `strict-delivery` only.
- One unresolved change for one normalized path inside one snapshot window.
- Text files within existing size limits.
- `write` create when before absence and after text are proven.
- `write` modify when before text, after text, and toolpart after content agree.
- `edit` modify when `oldString` occurs exactly once and produces snapshot after.
- delete when before text and after absence are proven.

Explicitly excluded from first apply mode:

- Multi-change same-path chains.
- `apply_patch` without parsed hunks.
- rename, chmod, binary patch, submodule, and mode-only changes.
- Any case requiring current disk as evidence.
- Any case requiring line-ending normalization.
- Any case where snapshot evidence exists but operation semantics are unclear.
- Any old metadata-only event rewrite unless source-key supersede is proven.

This scope is deliberately conservative. The goal is to prove the pipeline, not
to maximize warning reduction in the first implementation.
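
The `edit` rule in the allowed list ("`oldString` occurs exactly once and produces snapshot after") could be verified roughly as follows. The function names are hypothetical, and the sketch assumes both snapshot texts have already been read from trusted snapshot evidence.

```ts
// Counts occurrences of `needle`, including overlapping ones, so that any
// ambiguity about where the edit applied fails closed.
function countOccurrences(haystack: string, needle: string): number {
  if (needle.length === 0) {
    return 0
  }
  let count = 0
  let index = haystack.indexOf(needle)
  while (index !== -1) {
    count += 1
    index = haystack.indexOf(needle, index + 1)
  }
  return count
}

// Hypothetical verifier for a single `edit` inside one snapshot window.
function verifySingleEdit(input: {
  before: string
  after: string
  oldString: string
  newString: string
}): boolean {
  if (countOccurrences(input.before, input.oldString) !== 1) {
    return false
  }
  const index = input.before.indexOf(input.oldString)
  // Splice manually instead of String.replace, which interprets `$` patterns.
  const candidate =
    input.before.slice(0, index) +
    input.newString +
    input.before.slice(index + input.oldString.length)
  // Exact comparison: no line-ending normalization in first apply mode.
  return candidate === input.after
}
```

An empty `oldString` counts as zero occurrences here, so it is rejected rather than treated as an insertion at an unknown position.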

## Lowest-Confidence Areas And Mitigations

The implementation should explicitly address the areas below because they are
where mistakes are most likely.

### Snapshot Window Matching

Risk: OpenCode history can contain several `step-start` and `step-finish`
records in the same assistant message. Incorrect ordering could attach a
toolpart to the wrong snapshot pair.

Mitigation:

- Keep the existing requirement that a toolpart must match exactly one window.
- Keep message-local matching. Do not match a toolpart to a window from another
  assistant message.
- Add tests where a toolpart is before the first window, after the last window,
  and inside two overlapping windows.
- If window order cannot be proven from `rawParts`, skip the upgrade.
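
A minimal sketch of the exactly-one, message-local matching rule is shown below. The `SnapshotWindow` and `ToolpartRef` shapes and their field names are assumptions for illustration; the real reader may expose part order differently.

```ts
// Assumed shapes: part order is a per-message integer position.
type SnapshotWindow = {
  id: string
  messageId: string
  startPartOrder: number
  finishPartOrder: number
}

type ToolpartRef = { messageId: string; partOrder: number }

// Returns the single containing window, or null when zero or several match.
function matchExactlyOneWindow(
  part: ToolpartRef,
  windows: SnapshotWindow[],
): SnapshotWindow | null {
  const matches = windows.filter(
    (w) =>
      w.messageId === part.messageId && // message-local only
      w.startPartOrder < part.partOrder &&
      part.partOrder < w.finishPartOrder,
  )
  return matches.length === 1 ? (matches[0] ?? null) : null
}
```

A `null` result maps to skipping the upgrade. There is no tie-breaking between overlapping windows.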

### Multi-Change Chains

Risk: several edits to the same file can produce the same final content through
more than one path. This is the easiest place to create a convincing but wrong
diff.

Mitigation:

- Implement single-change upgrades first.
- Gate multi-change chain upgrades behind a narrow helper and dense tests.
- Do not allow a `write` in the middle of a reverse chain unless both sides of
  that write are independently proven.
- Abort the whole path/window chain on the first ambiguous step.
- Add a kill switch that can disable multi-change upgrades while leaving
  single-change upgrades enabled.
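
The "abort the whole path/window chain on the first ambiguous step" rule can be sketched as an all-or-nothing runner. `upgradeChainAllOrNothing`, `StepResult`, and `ChainResult` are hypothetical names; the per-step verifier is passed in and is not specified here.

```ts
type StepResult<U> = { ok: true; value: U } | { ok: false; reason: string }

type ChainResult<U> =
  | { upgraded: U[] }
  | { upgraded: null; abortedAt: number; reason: string }

// Returns upgraded steps only if every step in the chain verifies.
function upgradeChainAllOrNothing<S, U>(
  steps: S[],
  upgradeStep: (step: S) => StepResult<U>,
): ChainResult<U> {
  const upgraded: U[] = []
  let index = 0
  for (const step of steps) {
    const result = upgradeStep(step)
    if (!result.ok) {
      // First ambiguous step: discard everything collected so far.
      return { upgraded: null, abortedAt: index, reason: result.reason }
    }
    upgraded.push(result.value)
    index += 1
  }
  return { upgraded }
}
```

On abort, the caller would return the original unresolved changes for that path/window untouched and record `reason` as a diagnostic.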

### Warning Removal

Risk: broad substring filtering can hide warnings that still matter, especially
task-boundary or attribution warnings.

Mitigation:

- Do not remove warnings by broad terms like `manual-only` alone.
- Centralize warning predicates and match only known OpenCode baseline/content
  warning messages.
- Preserve all warnings that mention attribution, delivery, boundary,
  confidence, path scope, binary, too-large, truncated, or unavailable snapshot
  content.
- Add tests where a warning contains `manual-only` but is unrelated to baseline
  proof.
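
A centralized predicate along these lines would keep warning removal narrow. The two message strings in `RESOLVABLE_BASELINE_WARNINGS` are invented placeholders; the real entries must be copied from the actual OpenCode baseline/content warnings. The function names are also hypothetical.

```ts
// Placeholder messages (assumed): replace with the exact strings the project emits.
const RESOLVABLE_BASELINE_WARNINGS: ReadonlySet<string> = new Set([
  'OpenCode edit baseline unavailable; manual-only review',
  'OpenCode write baseline unavailable; manual-only review',
])

// Terms that always preserve a warning, taken from the mitigation list above.
const PROTECTED_TERMS = [
  'attribution',
  'delivery',
  'boundary',
  'confidence',
  'path scope',
  'binary',
  'too-large',
  'truncated',
  'unavailable snapshot',
]

function isResolvedByFullTextProof(warning: string): boolean {
  const lower = warning.toLowerCase()
  if (PROTECTED_TERMS.some((term) => lower.includes(term))) {
    return false
  }
  // Exact match only: no substring matching on terms like `manual-only`.
  return RESOLVABLE_BASELINE_WARNINGS.has(warning)
}

function removeResolvedWarnings(warnings: string[]): string[] {
  return warnings.filter((warning) => !isResolvedByFullTextProof(warning))
}
```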

### Snapshot Shape Stability

Risk: OpenCode can change SQLite or snapshot git-store shape. A shape change
could make old assumptions invalid.

Mitigation:

- Keep `snapshotShapeFingerprint` checks visible in diagnostics.
- Treat unknown or unsupported shapes as metadata-only fallback.
- Do not add compatibility shims that guess from partial rows.
- Add an abort condition for a real-data shape mismatch.

### Snapshot Store Retention

Risk: OpenCode SQLite can contain snapshot window hashes while the corresponding
git-store objects are missing, pruned, moved, or unreadable. The history then
looks promising but cannot prove full text.

Mitigation:

- Treat missing snapshot objects as metadata-only fallback.
- Keep a distinct diagnostic for missing store object versus unsupported shape.
- Do not retry by reading current disk.
- Do not reconstruct from only one side of the snapshot pair.
- Add a fixture where the window exists but object read fails.

### Performance

Risk: reading snapshot blobs for every task can become expensive on large
sessions.

Mitigation:

- Try snapshot proof only for unresolved OpenCode changes in strict delivery.
- Pass only unresolved touched paths to the snapshot reader unless a same-path
  chain requires exact already-proven neighbors.
- Keep the current snapshot read limits.
- Add timing diagnostics around snapshot proof attempts.
- Abort rollout if repeated snapshot timeouts appear in smoke data.

### Existing Ledger Events

Risk: a task that was previously imported as metadata-only may later be
backfilled with better evidence. If importer semantics are append-only, the UI
could show duplicates or stale warnings.

Mitigation:

- Audit importer behavior before enabling upgrades for old data.
- Prefer stable source-key replacement/superseding if already supported.
- If replacement is not supported, limit the behavior change to new backfill
  imports and leave old events untouched.
- Add repeated-backfill tests before real-data smoke.

### Cache Invalidation

Risk: desktop or worker cache may return an old metadata-only bundle after the
orchestrator has imported stronger evidence, making validation confusing or
causing stale warnings to persist.

Mitigation:

- Audit all cache layers in Phase 0.
- Include the OpenCode ledger fingerprint or imported event count in cache
  invalidation if an existing mechanism supports it.
- For tests, clear or bypass caches instead of waiting for TTLs.
- Do not add broad cache busting for all teams. Keep invalidation scoped to the
  requested team/task.

### Partial Success Semantics

Risk: one file in a task upgrades while another remains metadata-only. Bulk
review actions might accidentally assume the task is fully safe.

Mitigation:

- Keep rejectability file-level.
- Keep task-level warnings if any file remains manual-only or if boundaries are
  uncertain.
- Add a mixed-task desktop test.
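
The mixed-task case can be pinned down with two small predicates. This is a sketch with hypothetical names (`FileReviewState`, `shouldKeepTaskLevelWarning`, `selectBulkRejectCandidates`); the existing desktop helpers may already cover the same logic.

```ts
type FileReviewState = { path: string; rejectable: boolean }

// Task-level warning stays if any file is manual-only or boundaries are uncertain.
function shouldKeepTaskLevelWarning(
  files: FileReviewState[],
  boundariesUncertain: boolean,
): boolean {
  return boundariesUncertain || files.some((file) => !file.rejectable)
}

// Bulk actions only consider rejectable files; per-file conflict checks still
// run afterwards and are not modeled here.
function selectBulkRejectCandidates(files: FileReviewState[]): string[] {
  return files.filter((file) => file.rejectable).map((file) => file.path)
}
```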

## Feature Flag And Rollback

Add a runtime guard before changing behavior:

```ts
type SnapshotProofUpgradeMode = 'off' | 'shadow' | 'single-change' | 'full'

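// Parses OPENCODE_SNAPSHOT_PROOF_UPGRADE. Unset or unrecognized values fall back
// to 'shadow', which computes proof but returns the original changes.
function getSnapshotProofUpgradeMode(env: NodeJS.ProcessEnv): SnapshotProofUpgradeMode {
  const raw = env.OPENCODE_SNAPSHOT_PROOF_UPGRADE
  // '0' is accepted as an alias for 'off'.
  if (raw === '0' || raw === 'off') {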
    return 'off'
  }
  if (raw === 'shadow') {
    return 'shadow'
  }
  if (raw === 'full') {
    return 'full'
  }
  if (raw === 'single-change') {
    return 'single-change'
  }
  return 'shadow'
}
```

Recommended rollout:

- Default to `shadow` during development and first smoke validation.
- Move to `single-change` only after shadow stats show expected upgrades with
  no behavior changes.
- Move to `full` only after multi-change chain tests and real-data smoke pass.
- Keep `off` available as an emergency rollback path.
- If `full` later becomes the default, that should be a separate rollout change
  after the implementation has passed real-data smoke in explicit `full` mode.

If the project already has a central feature-flag/env helper for OpenCode
runtime behavior, use that instead of adding a new ad-hoc parser.

`shadow` mode is intentionally different from `off`:

- `off` does not attempt proof.
- `shadow` attempts proof and records stats/diagnostics, but returns the
  original changes to the importer.
- `single-change` applies only one-change path/window upgrades.
- `full` applies single-change and multi-change chain upgrades.

This gives a low-risk way to validate proof quality and performance on real
data before changing review safety.

## Architecture

Use this pipeline:

```text
OpenCode SQLite history
  -> toolpart reconstruction
  -> strict delivery attribution
  -> snapshot window grouping
  -> snapshot file read with limits
  -> proof upgrade per path
  -> validate candidate batch
  -> import task-change events
```

The upgrade belongs in the orchestrator evidence layer, primarily around:

- `OpenCodeChangeEvidenceEnricher.ts`
- `OpenCodeSnapshotEvidenceProvider.ts`
- `OpenCodeToolpartChangeReconstructor.ts` only if a small helper or extra metadata is needed
- tests near existing OpenCode evidence and ledger bridge tests

Avoid touching desktop review UI for the proof itself. The desktop should only
benefit from better imported event content.

## Composite Identity Contract

Every full-text proof must be anchored to a composite identity. Do not rely on
any single field alone.

Required identity dimensions:

```ts
type SnapshotProofIdentity = {
  teamName: string
  taskId: string
  memberName: string
  laneId?: string
  sessionId: string
  parentUserMessageId?: string
  assistantMessageId: string
  sourceMessageId: string
  sourcePartId: string
  toolUseId: string
  relativePath: string
  snapshotWindowId: string
  fromSnapshot: string
  toSnapshot: string
}
```

Rules:

- `taskId` must be canonical, not display-only.
- `memberName`, `laneId`, and `sessionId` must come from strict delivery or
  already trusted session records.
- `sourceMessageId` must match the snapshot window message id.
- `sourcePartId` must be inside the matched window according to the same
  message's part order.
- `relativePath` must be normalized through the existing OpenCode path helpers.
- `fromSnapshot` and `toSnapshot` must be the exact pair used to read file
  evidence.

If any identity dimension is missing, the default is `metadata-only-fallback`.
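
A completeness check over the required dimensions might look like the sketch below. The optional `laneId` and `parentUserMessageId` fields are left out because the type marks them optional. Treating an empty string as missing is an assumption, and `missingIdentityFields` is a hypothetical name.

```ts
const REQUIRED_IDENTITY_FIELDS = [
  'teamName',
  'taskId',
  'memberName',
  'sessionId',
  'assistantMessageId',
  'sourceMessageId',
  'sourcePartId',
  'toolUseId',
  'relativePath',
  'snapshotWindowId',
  'fromSnapshot',
  'toSnapshot',
] as const

type RequiredIdentityField = (typeof REQUIRED_IDENTITY_FIELDS)[number]
type PartialIdentity = Partial<Record<RequiredIdentityField, string>>

// Returns the required fields that are absent or empty (assumed to mean missing).
function missingIdentityFields(identity: PartialIdentity): RequiredIdentityField[] {
  return REQUIRED_IDENTITY_FIELDS.filter((field) => {
    const value = identity[field]
    return typeof value !== 'string' || value.length === 0
  })
}
```

A non-empty result would keep the change as `metadata-only-fallback` and could be emitted as a diagnostic, since field names carry no file content.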

## Ordering Contract

Same-path chains are safe only if toolpart order is stable and proven. Use the
existing ordering data from OpenCode SQLite. Do not introduce a new sort.

Preferred order keys, in priority order:

1. `messageTimeCreated`
2. `messageIdSort`
3. `messagePartOrder`
4. `partId`

Rules:

- Do not sort only by `partId`.
- Do not sort only by timestamp.
- Do not merge parts from different `sourceMessageId` values into one chain.
- If two parts have indistinguishable order, do not upgrade the chain.
- If raw part order is unavailable, single-change upgrade may still work, but
  multi-change mode must skip.

Example guard:

```ts
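// Returns false when two parts share the same composite order key, meaning
// their relative order cannot be proven and the chain must not be upgraded.
function hasStablePartOrder(parts: SourcePartSortKey[]): boolean {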
  const seen = new Set<string>()
  for (const part of parts) {
    const key = [
      part.messageTimeCreated,
      part.messageIdSort,
      part.messagePartOrder,
      part.partId,
    ].join('\0')
    if (seen.has(key)) {
      return false
    }
    seen.add(key)
  }
  return true
}
```

If the real symbol names differ, keep the same invariant.

## Cross-Repo Contract Boundaries

This feature crosses the desktop repo and the orchestrator repo. Keep the
contract explicit.

Desktop responsibilities:

- Request OpenCode backfill only when delivery context exists.
- Keep cache/in-flight dedupe behavior.
- Render full-text events as diffs.
- Render metadata-only events as manual-only warnings.
- Use existing rejectability checks. Do not special-case OpenCode snapshot
  events in the UI unless a rendering bug is found.

Orchestrator responsibilities:

- Read OpenCode history and snapshot evidence.
- Decide whether proof is strong enough to materialize before/after text.
- Preserve strict delivery attribution.
- Preserve source import keys.
- Emit diagnostics explaining why upgrades were skipped.

Shared contract:

```ts
type ReviewSafetyContract = {
  sourceImportKey: string
  evidenceProof: OpenCodeEvidenceProof
  beforeContent: string | null
  afterContent: string | null
  beforeState?: { exists?: boolean; sha256?: string; sizeBytes?: number; unavailableReason?: string }
  afterState?: { exists?: boolean; sha256?: string; sizeBytes?: number; unavailableReason?: string }
  warnings?: string[]
}
```

Safe reject requires a proven historical baseline:

```ts
function hasSafeHistoricalBaseline(change: ReviewSafetyContract): boolean {
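  // Empty string is a valid baseline; only null means the before text is unavailable.
  if (change.beforeContent !== null) {
    return true
  }
  // Create case: the file is proven absent before and its after text is known.
  return change.beforeState?.exists === false && change.afterContent !== null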
}
```

The exact desktop helper may have a different name. The invariant should match
this contract.

## Apply/Reject Execution Safety Contract

Snapshot proof can make a review event eligible for normal diff rendering and
safe reject consideration. It must not bypass current worktree conflict checks.

Review safety and execution safety are different:

- Review safety answers: "Do we know the historical before/after for this
  change?"
- Execution safety answers: "Can we apply or reject this change against the
  user's current disk state right now?"

This feature only upgrades review safety. It must not weaken execution safety.

Required rules:

- Rejecting a modify still requires the current file to match the expected after
  state, or whatever stricter existing conflict check is already used.
- Rejecting a create still requires the current file to match the created after
  state before deletion.
- Rejecting a delete still requires the current absence/after state to match the
  expected deleted state before restoring before content.
- Accepting an OpenCode change must not overwrite unrelated current disk edits.
- Bulk `Reject All` must keep per-file conflict checks and skip unsafe files.
- Current disk mismatch should produce a conflict/manual warning, not a proof
  downgrade.

Suggested predicate split:

```ts
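function isReviewSafe(change: ReviewSafetyContract): boolean {
  // `isSnapshotReviewSafe` is a placeholder for the proof-level predicate,
  // for example `hasSafeHistoricalBaseline` plus an accepted `evidenceProof`.
  return isSnapshotReviewSafe(change)
}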

function canExecuteReject(input: {
  change: ReviewSafetyContract
  currentDiskState: { exists: boolean; sha256?: string }
}): boolean {
  if (!isReviewSafe(input.change)) {
    return false
  }
  // Use the existing project helper here. This sketch only documents that
  // execution safety is a separate check from proof safety.
  return currentDiskMatchesExpectedAfterState(input.change, input.currentDiskState)
}
```

Do not implement `currentDiskMatchesExpectedAfterState` ad hoc if the project
already has a conflict/rejectability helper. This plan requires preserving that
existing behavior.

## Data Model Contract

Do not introduce a new task-change event shape unless absolutely necessary.
Prefer filling existing fields:

```ts
type UpgradedOpenCodeChangeContract = {
  sourceTool: 'write' | 'edit' | 'apply_patch' | 'snapshot_patch'
  sourceImportKey: string
  evidenceProof: 'opencode-snapshot' | 'inverse-edit-chain' | 'inverse-apply-patch-chain' | 'toolpart-chain'
  confidence: 'high' | 'exact'
  beforeContent: string | null
  afterContent: string | null
  beforeState: {
    exists?: boolean
    sha256?: string
    sizeBytes?: number
    unavailableReason?: never
  }
  afterState: {
    exists?: boolean
    sha256?: string
    sizeBytes?: number
    unavailableReason?: never
  }
  snapshotId?: string
  snapshotSource?: 'opencode'
  warnings: string[]
}
```

Important:

- Upgraded full-text events should not carry `unavailableReason` for the
  before/after side they claim to prove.
- Metadata-only events may carry `unavailableReason`, but then they must remain
  non-rejectable.
- `confidence: 'high'` is acceptable for snapshot proof. Use `exact` only for
  truly exact toolpart chains that already have local full text.
- `snapshotId` is useful provenance, but it is not required for safety if the
  proof was otherwise validated. A missing `snapshotId` should be reported as a
  diagnostic.

## Storage And Memory Contract

The feature must not create a second blob storage path or keep large full-text
content in memory longer than the existing ledger import requires.

Rules:

- Reuse the existing task-change ledger content storage.
- Do not duplicate before/after text in diagnostics, stats, or cache keys.
- Do not add per-team global caches of snapshot file content.
- Do not store both snapshot raw blobs and task-change blobs unless the existing
  snapshot reader already does that internally.
- Apply the per-file and total byte limits before materializing upgraded events.
- If a file exceeds the limit, store metadata-only state with a reason.
- If many small files exceed the total byte budget, skip the excess files as
  metadata-only instead of raising the limit.
- Stats should count bytes read and files skipped, but never include content.

Suggested stats additions:

```ts
type SnapshotProofStorageStats = {
  bytesRead: number
  bytesMaterialized: number
  skippedByByteLimit: number
  skippedByTotalBudget: number
}
```

Memory pressure is a reason to keep metadata-only fallback. It is not a reason
to increase limits or stream partial text into a diff.
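
The per-file and total-budget rules could be applied before materialization roughly as follows. This sketch assumes `sizeBytes` is the combined before/after size for a candidate and that both limits are passed in from the existing snapshot reader configuration; `applyByteBudgets` is a hypothetical name.

```ts
type Candidate = { path: string; sizeBytes: number }

type BudgetDecision = {
  path: string
  materialize: boolean
  reason?: 'per-file-limit' | 'total-budget'
}

// Decides, in input order, which candidates may be materialized as full text.
function applyByteBudgets(
  candidates: Candidate[],
  perFileLimitBytes: number,
  totalBudgetBytes: number,
): BudgetDecision[] {
  let usedBytes = 0
  return candidates.map((candidate): BudgetDecision => {
    if (candidate.sizeBytes > perFileLimitBytes) {
      return { path: candidate.path, materialize: false, reason: 'per-file-limit' }
    }
    if (usedBytes + candidate.sizeBytes > totalBudgetBytes) {
      // Over budget: stay metadata-only instead of raising the limit.
      return { path: candidate.path, materialize: false, reason: 'total-budget' }
    }
    usedBytes += candidate.sizeBytes
    return { path: candidate.path, materialize: true }
  })
}
```

The decisions carry only paths, sizes, and reasons, so they can feed `skippedByByteLimit` and `skippedByTotalBudget` counters without exposing content.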

## Mutation Rules

When an upgrade is skipped, only diagnostics may change. The returned
`ReconstructedOpenCodeToolChange` must preserve:

- `beforeContent`
- `afterContent`
- `beforeState`
- `afterState`
- `operation`
- `confidence`
- `warnings`
- `evidenceProof`
- `sourceImportKey`

When an upgrade succeeds, only these fields may change:

- `beforeContent`
- `afterContent`
- `beforeState`
- `afterState`
- `operation`, only when the tool semantics and snapshot operation both prove it
- `confidence`
- `warnings`, only through the central resolved-warning predicate
- `evidenceProof`
- `evidenceDiagnostics`
- `snapshotId`
- `snapshotSource`

No other fields should be rewritten by the proof upgrade. This reduces
accidental attribution changes.

## Proof Levels

Use explicit proof labels and keep their meaning strict.

```ts
type OpenCodeEvidenceProof =
  | 'toolpart-chain'
  | 'opencode-snapshot'
  | 'inverse-edit-chain'
  | 'inverse-apply-patch-chain'
  | 'metadata-only-fallback'
```

Accepted for safe reject/apply:

- `toolpart-chain`
- `opencode-snapshot`
- `inverse-edit-chain`
- `inverse-apply-patch-chain`

Not accepted for safe reject/apply:

- `metadata-only-fallback`
- current disk only
- file path metadata only
- hash without text
- text without matching task/window/path proof

## Proof Decision Tables

### Operation State Table

| Tool | Snapshot before | Snapshot after | Tool fields | Upgrade? | Reason |
| --- | --- | --- | --- | --- | --- |
| `write` create | absent | text | content absent or same as after | yes | create is fully proven |
| `write` modify | text | text | content absent or same as after | yes | before and after are fully proven |
| `write` modify | unavailable | text | any | no | overwrite baseline is unknown |
| `write` modify | text | text | content differs from after | no | toolpart and snapshot disagree |
| `edit` modify | text | text | old/new apply exactly once | yes | edit transition is proven |
| `edit` modify | text | text | old/new ambiguous | no | multiple valid transitions are possible |
| `apply_patch` modify | text | text | hunks verify exactly | yes | patch transition is proven |
| `apply_patch` modify | text | text | hunks missing | maybe | only if the snapshot window has exact single-file proof and no competing changes |
| delete | text | absent | delete operation | yes | delete is fully proven |
| any | binary/large/unavailable | any | any | no | full text is not available |

### Confidence Table

| Evidence | Confidence | Safe reject/apply? |
| --- | --- | --- |
| toolpart chain with known previous text | `exact` | yes |
| snapshot before/after with verified transition | `high` | yes |
| inverse edit/apply-patch chain with exact single replacements | `high` | yes |
| snapshot path anchor without verified transition | `medium` | no |
| metadata-only toolpart | `medium` | no |

Do not upgrade confidence from `medium` to `high` unless safe reject/apply would
also be valid.

## Proof State Machine

Implement the upgrade as a state machine, not as scattered conditionals.

```text
original change
  -> not eligible
  -> candidate
  -> snapshot evidence requested
  -> snapshot evidence matched
  -> transition verified
  -> upgraded change
  -> validated import candidate
```

Failure from any state returns to the original metadata-only change plus
diagnostics.

Allowed transitions:

| From | To | Required condition |
| --- | --- | --- |
| original | not eligible | not OpenCode, already exact, flag off, or non-strict delivery |
| original | candidate | OpenCode unresolved change, strict delivery, flag permits |
| candidate | snapshot requested | non-empty touched path set within limits |
| snapshot requested | snapshot matched | exactly one window and one file anchor |
| snapshot matched | transition verified | operation-specific before/after proof succeeds |
| transition verified | upgraded | state hashes match content and warnings stripped safely |
| upgraded | validated import candidate | existing candidate validation accepts it |

Forbidden transitions:

- original -> upgraded
- candidate -> upgraded
- snapshot requested -> upgraded
- snapshot matched -> upgraded without operation-specific verification
- skipped -> upgraded

Suggested type:

```ts
type ProofState =
  | { state: 'not-eligible'; reason: SnapshotUpgradeDiagnosticCode }
  | { state: 'candidate'; change: ReconstructedOpenCodeToolChange }
  | { state: 'snapshot-matched'; change: ReconstructedOpenCodeToolChange; anchor: SnapshotFileAnchor }
  | { state: 'transition-verified'; change: ReconstructedOpenCodeToolChange; before: string | null; after: string | null }
  | { state: 'upgraded'; change: ReconstructedOpenCodeToolChange }
  | { state: 'skipped'; reason: SnapshotUpgradeDiagnosticCode; original: ReconstructedOpenCodeToolChange }
```

The concrete implementation does not have to use this exact union, but it
should preserve the same transitions.
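The allowed and forbidden transitions can also be kept as data so that a test can check them directly. This is a sketch under assumptions: `ProofStateName`, `ALLOWED_TRANSITIONS`, and `canTransition` are hypothetical names, and modeling every failure as a terminal `skipped` state is one interpretation of "failure from any state returns to the original change".

```ts
type ProofStateName =
  | 'original'
  | 'not-eligible'
  | 'candidate'
  | 'snapshot-requested'
  | 'snapshot-matched'
  | 'transition-verified'
  | 'upgraded'
  | 'validated-import-candidate'
  | 'skipped'

// Each state lists the only states it may move to. Terminal states list none.
const ALLOWED_TRANSITIONS: Record<ProofStateName, readonly ProofStateName[]> = {
  original: ['not-eligible', 'candidate'],
  'not-eligible': [],
  candidate: ['snapshot-requested', 'skipped'],
  'snapshot-requested': ['snapshot-matched', 'skipped'],
  'snapshot-matched': ['transition-verified', 'skipped'],
  'transition-verified': ['upgraded', 'skipped'],
  upgraded: ['validated-import-candidate', 'skipped'],
  'validated-import-candidate': [],
  skipped: [],
}

function canTransition(from: ProofStateName, to: ProofStateName): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1
}
```

Because `upgraded` is reachable only from `transition-verified`, every forbidden shortcut in the list above is rejected by construction rather than by a separate deny-list.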

## Exhaustiveness And Type Safety

Use exhaustive switches for proof decisions, operation handling, and feature
flag modes. Do not add a permissive `default` branch that silently preserves or
upgrades without naming the case.

```ts
function assertNever(value: never, context: string): never {
  throw new Error(`Unexpected ${context}: ${String(value)}`)
}

function applyProofDecision(
  decision: SnapshotProofDecision,
  original: ReconstructedOpenCodeToolChange,
): ReconstructedOpenCodeToolChange {
  switch (decision.type) {
    case 'upgraded':
      return decision.change
    case 'skipped':
      return original
    default:
      return assertNever(decision, 'snapshot proof decision')
  }
}
```

If TypeScript cannot prove exhaustiveness, keep the code more explicit rather
than using casts. A cast in proof code should be treated as a review smell.

## Default Answers To Uncertainty

Use these defaults when implementation hits an unclear case:

| Question | Default |
| --- | --- |
| Is attribution strict enough? | no upgrade |
| Is toolpart order stable? | no multi-change upgrade |
| Does snapshot text prove the operation? | no upgrade |
| Does warning removal feel broad? | preserve warning |
| Is content text or binary? | treat as unavailable |
| Does old event replacement behavior seem unclear? | new imports only |
| Is cache invalidation unclear? | do not rely on cache for proof |
| Does UI need an OpenCode-specific branch? | fix shared helper or stop |
| Is performance impact unclear? | keep flag off or single-change only |

These defaults are part of the safety design, not temporary indecision.

## Formal Proof Predicates

Implement proof decisions through small predicates that can be unit tested
directly. Avoid spreading equivalent checks across several branches.

```ts
function isReadableFullText(value: string | null | undefined): value is string {
  return typeof value === 'string'
}

function isKnownAbsent(state: { exists?: boolean } | undefined): boolean {
  return state?.exists === false
}

function hasUnavailableReason(state: { unavailableReason?: string } | undefined): boolean {
  return typeof state?.unavailableReason === 'string' && state.unavailableReason.length > 0
}

function isProvenCreate(change: ReviewSafetyContract): boolean {
  return (
    isKnownAbsent(change.beforeState) &&
    isReadableFullText(change.afterContent) &&
    !hasUnavailableReason(change.afterState)
  )
}

function isProvenModify(change: ReviewSafetyContract): boolean {
  return (
    isReadableFullText(change.beforeContent) &&
    isReadableFullText(change.afterContent) &&
    !hasUnavailableReason(change.beforeState) &&
    !hasUnavailableReason(change.afterState)
  )
}

function isProvenDelete(change: ReviewSafetyContract): boolean {
  return (
    isReadableFullText(change.beforeContent) &&
    change.afterState?.exists === false &&
    !hasUnavailableReason(change.beforeState)
  )
}

function isSnapshotReviewSafe(change: ReviewSafetyContract): boolean {
  return (
    change.evidenceProof === 'opencode-snapshot' ||
    change.evidenceProof === 'inverse-edit-chain' ||
    change.evidenceProof === 'inverse-apply-patch-chain' ||
    change.evidenceProof === 'toolpart-chain'
  ) && (
    isProvenCreate(change) ||
    isProvenModify(change) ||
    isProvenDelete(change)
  )
}
```

The real implementation can use existing helper names, but tests should cover
the predicates above as behavior. In particular, `unavailableReason` on a side
that claims full proof should make the change unsafe.

## Atomicity And Failure Semantics

Snapshot proof should behave atomically at three levels.

Per change:

- Success returns one upgraded change.
- Failure returns the original change unchanged plus diagnostics.
- No intermediate state should be visible to importer validation.

Per same-path chain:

- Success upgrades every unresolved change in the chain.
- Failure upgrades none of the unresolved changes in the chain.
- Already exact changes may remain exact, but they must not be rewritten by the
  failed chain attempt.

Per import batch:

- Candidate validation runs after proof upgrade.
- If import fails, review safety must not observe a partially imported safe
  state.
- Retry uses stable source import keys.

Implementation pattern:

```ts
const original = change
const decision = tryUpgradeChange(change)
if (decision.type === 'skipped') {
  diagnostics.push(decision.reason)
  return original
}
const upgraded = decision.change
if (!isSnapshotReviewSafe(upgraded)) {
  diagnostics.push('snapshot-upgrade-skipped/postcondition-failed')
  return original
}
return upgraded
```

Do not mutate `change` in place before postconditions pass.
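The per-chain rule (upgrade all or none) can follow the same pattern by staging results and discarding them on the first failure. The sketch below is illustrative only: `upgradeChainAtomically` and `ChainDecision` are hypothetical names, and the per-change upgrade function is passed in rather than taken from the real enricher.

```ts
type ChainChange = { sourceImportKey: string }
type ChainDecision<T> =
  | { type: 'upgraded'; change: T }
  | { type: 'skipped'; reason: string }

function upgradeChainAtomically<T extends ChainChange>(
  chain: readonly T[],
  tryUpgrade: (change: T) => ChainDecision<T>,
  diagnostics: string[],
): T[] {
  // Upgraded copies are staged here and only returned if every step succeeds.
  const staged: T[] = []
  for (const change of chain) {
    const decision = tryUpgrade(change)
    if (decision.type === 'skipped') {
      diagnostics.push(`${decision.reason}:${change.sourceImportKey}`)
      // Discard all staged upgrades and hand back the untouched originals.
      return [...chain]
    }
    staged.push(decision.change)
  }
  return staged
}
```

Returning the original objects by reference makes it easy for a test to assert that a failed chain attempt rewrote nothing, including changes that were already exact.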

## Postconditions

Every successful upgrade must satisfy these postconditions:

```ts
function assertUpgradePostconditions(input: {
  original: ReconstructedOpenCodeToolChange
  upgraded: ReconstructedOpenCodeToolChange
}): boolean {
  const { original, upgraded } = input
  return (
    original.sourceImportKey === upgraded.sourceImportKey &&
    original.taskId === upgraded.taskId &&
    original.teamName === upgraded.teamName &&
    original.memberName === upgraded.memberName &&
    original.sessionId === upgraded.sessionId &&
    original.sourcePartId === upgraded.sourcePartId &&
    original.sourceMessageId === upgraded.sourceMessageId &&
    original.relativePath === upgraded.relativePath &&
    upgraded.evidenceProof !== 'metadata-only-fallback' &&
    isSnapshotReviewSafe(upgraded)
  )
}
```

If a postcondition fails, keep the original change and emit a diagnostic. A
postcondition failure is a bug in proof logic, not a reason to relax safety.

## Runtime Assertion Policy

Assertions should catch programmer errors without making production data unsafe.

Rules:

- In tests, postcondition failures should fail loudly.
- In production backfill, postcondition failures should skip the upgrade,
  preserve the original metadata-only change, and emit a diagnostic.
- Assertions must never catch an error and continue with upgraded content.
- Assertions must not include file content in thrown messages.
- Assertions should include stable identifiers such as task id, source part id,
  source import key, and normalized relative path.

Suggested pattern:

```ts
function enforceUpgradePostconditions(input: {
  original: ReconstructedOpenCodeToolChange
  upgraded: ReconstructedOpenCodeToolChange
  diagnostics: string[]
}): ReconstructedOpenCodeToolChange {
  if (assertUpgradePostconditions(input)) {
    return input.upgraded
  }
  input.diagnostics.push(
    `snapshot-upgrade-skipped/postcondition-failed:${input.original.sourceImportKey}`,
  )
  return input.original
}
```

Do not use runtime assertions to justify looser proof predicates. Assertions are
a last guard, not the proof itself.

## Implementation Phases

### Phase 0 - Audit Contracts Before Behavior Changes

This phase should be completed before any runtime behavior change.

Audit:

- Ledger import behavior for duplicate `sourceImportKey`.
- Review bundle dedupe behavior.
- Existing rejectability helper.
- Existing OpenCode backfill cache and in-flight behavior.
- `materializeMetadataOnlyChanges` serialization of proof fields.
- Current real-data snapshot diagnostics for a few OpenCode teams.

Deliverable:

```text
Contract audit:
- sourceImportKey duplicate policy: replace | supersede | append | unknown
- review bundle dedupe key: ...
- rejectability helper: ...
- metadata materialization preserves proof fields: yes | no
- observed snapshot shape fingerprint: ...
- can proceed to Phase 1: yes | no
```

If any field is `unknown`, do not proceed to behavior changes.

### Phase 1 - Add Targeted Diagnostics

Add diagnostics that explain why a metadata-only change was not upgraded.
This makes real-data validation much easier.

Examples:

- `snapshot-upgrade-skipped/no-window`
- `snapshot-upgrade-skipped/ambiguous-window`
- `snapshot-upgrade-skipped/no-file-anchor`
- `snapshot-upgrade-skipped/binary`
- `snapshot-upgrade-skipped/too-large`
- `snapshot-upgrade-skipped/path-chain-ambiguous`
- `snapshot-upgrade-skipped/toolpart-after-mismatch`
- `snapshot-upgrade-skipped/current-disk-not-proof`
- `snapshot-upgrade-skipped/strict-delivery-required`
- `snapshot-upgrade-skipped/unsupported-snapshot-shape`
- `snapshot-upgrade-skipped/warning-preserved`
- `snapshot-upgrade-skipped/feature-flag-off`

These diagnostics should not be user-noisy by default, but they should be
available in backfill result diagnostics and tests.

Diagnostics should be structured internally even if the public result remains a
string array:

```ts
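type SnapshotUpgradeDiagnosticCode =
  | 'snapshot-upgrade-skipped/no-window'
  | 'snapshot-upgrade-skipped/ambiguous-window'
  | 'snapshot-upgrade-skipped/no-file-anchor'
  | 'snapshot-upgrade-skipped/no-after-anchor'
  | 'snapshot-upgrade-skipped/binary'
  | 'snapshot-upgrade-skipped/too-large'
  | 'snapshot-upgrade-skipped/operation-mismatch'
  | 'snapshot-upgrade-skipped/toolpart-after-mismatch'
  | 'snapshot-upgrade-skipped/path-chain-ambiguous'
  | 'snapshot-upgrade-skipped/path-chain-boundary-mismatch'
  | 'snapshot-upgrade-skipped/current-disk-not-proof'
  | 'snapshot-upgrade-skipped/strict-delivery-required'
  | 'snapshot-upgrade-skipped/unsupported-snapshot-shape'
  | 'snapshot-upgrade-skipped/warning-preserved'
  | 'snapshot-upgrade-skipped/feature-flag-off'
  | 'snapshot-upgrade-skipped/postcondition-failed'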

function pushSnapshotDiagnostic(
  diagnostics: string[],
  code: SnapshotUpgradeDiagnosticCode,
  detail: string,
): void {
  diagnostics.push(`${code}: ${detail}`)
}
```

Using a small typed union makes it harder to accidentally invent inconsistent
diagnostics throughout the proof code.
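Because the public result stays a string array, smoke runs need a way to group entries back by code. The following is a minimal sketch, assuming every entry is either a bare code or a code followed by `:` and free-form detail; `countDiagnosticsByCode` is a hypothetical helper name.

```ts
function countDiagnosticsByCode(diagnostics: readonly string[]): Record<string, number> {
  const counts: Record<string, number> = {}
  for (const entry of diagnostics) {
    // Codes contain no ':', so everything before the first ':' is the code.
    const separator = entry.indexOf(':')
    const code = (separator === -1 ? entry : entry.slice(0, separator)).trim()
    if (code.length === 0) {
      continue
    }
    counts[code] = (counts[code] ?? 0) + 1
  }
  return counts
}
```

The detail part is dropped on purpose, so the resulting counts are safe to print even when the detail contains paths.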

### Phase 2 - Make Upgrade Eligibility Explicit

Add a helper that decides whether a change needs snapshot proof. It should skip
already exact full-text changes.

```ts
function needsSnapshotProof(change: ReconstructedOpenCodeToolChange): boolean {
  if (change.evidenceProof === 'toolpart-chain') {
    return false
  }
  // Check for real text: `!== null` would also treat `undefined` as content.
  const hasBefore = typeof change.beforeContent === 'string'
  const hasAfter = typeof change.afterContent === 'string'
  if (hasBefore && hasAfter) {
    return false
  }
  if (change.beforeState?.exists === false && hasAfter) {
    return false
  }
  if (hasBefore && change.afterState?.exists === false) {
    return false
  }
  return (
    change.sourceTool === 'write' ||
    change.sourceTool === 'edit' ||
    change.sourceTool === 'apply_patch' ||
    change.sourceTool === 'snapshot_patch'
  )
}
```

Use this helper before expensive snapshot work where possible:

```ts
const changesNeedingProof = params.changes.filter(needsSnapshotProof)
if (changesNeedingProof.length === 0) {
  return result
}
```

Important: this helper should reduce work, not reduce safety. If in doubt,
include a change in snapshot proof attempt and let the proof logic reject it.

Add a second helper for safety eligibility. It must be stricter than
`needsSnapshotProof`.

```ts
function mayUseSnapshotProof(input: {
  attributionMode: OpenCodeLedgerAttributionMode
  change: ReconstructedOpenCodeToolChange
  mode: SnapshotProofUpgradeMode
}): boolean {
  if (input.mode === 'off') {
    return false
  }
  if (input.attributionMode !== 'strict-delivery') {
    return false
  }
  if (!needsSnapshotProof(input.change)) {
    return false
  }
  return input.change.attributionMethod === 'delivery-ledger-taskrefs'
}
```

This keeps "should we spend time trying?" separate from "is this proof allowed
to affect review safety?".

Add a third helper for apply eligibility. `shadow` may compute proof, but must
not apply it.

```ts
function mayApplySnapshotProof(input: {
  mode: SnapshotProofUpgradeMode
  changeCountForPathWindow: number
}): boolean {
  if (input.mode === 'off' || input.mode === 'shadow') {
    return false
  }
  if (input.mode === 'single-change') {
    return input.changeCountForPathWindow === 1
  }
  return input.mode === 'full'
}
```

The call site should look structurally like this:

```ts
const decision = tryComputeSnapshotProof(originalChange)
stats.record(decision)
if (!mayApplySnapshotProof({ mode, changeCountForPathWindow })) {
  return originalChange
}
return decision.type === 'upgraded' ? decision.change : originalChange
```

This prevents diagnostics-only validation from accidentally changing imported
review events.

Also make the final proof decision return a typed result instead of a nullable
change. Nullable returns tend to hide why an upgrade failed.

```ts
type SnapshotProofDecision =
  | {
      type: 'upgraded'
      change: ReconstructedOpenCodeToolChange
      proof: Exclude<OpenCodeEvidenceProof, 'metadata-only-fallback'>
    }
  | {
      type: 'skipped'
      reason: SnapshotUpgradeDiagnosticCode
      preserveOriginal: true
    }

function preserveOriginal(
  reason: SnapshotUpgradeDiagnosticCode,
): SnapshotProofDecision {
  return { type: 'skipped', reason, preserveOriginal: true }
}
```

Callers should be forced to handle both branches. A skipped decision must return
the original change unchanged except for diagnostics collected outside the
change object.

### Phase 3 - Strengthen Snapshot Anchor Matching

Snapshot anchors should be accepted only when all these conditions hold:

1. The change belongs to a strict delivery session.
2. The source toolpart belongs to exactly one snapshot window.
3. The snapshot window belongs to the same OpenCode message.
4. The normalized touched path is inside the session worktree.
5. The snapshot reader returns an anchor for the exact relative path.
6. Text content is full text, not binary, and within existing limits.
7. The file operation is compatible with the tool operation.

Add a small validation helper:

```ts
type SnapshotAnchorValidation =
  | { ok: true }
  | { ok: false; reason: SnapshotUpgradeDiagnosticCode }

function validateSnapshotAnchorForChange(input: {
  change: ReconstructedOpenCodeToolChange
  anchor: SnapshotFileAnchor | undefined
}): SnapshotAnchorValidation {
  const { change, anchor } = input
  if (!anchor) {
    return { ok: false, reason: 'snapshot-upgrade-skipped/no-file-anchor' }
  }

  if (change.operation === 'create' && anchor.operation !== 'create') {
    return { ok: false, reason: 'snapshot-upgrade-skipped/operation-mismatch' }
  }

  if (change.operation === 'delete' && anchor.operation !== 'delete') {
    return { ok: false, reason: 'snapshot-upgrade-skipped/operation-mismatch' }
  }

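  // A `write` with no known baseline may be reconstructed as `modify` even when
  // the snapshot proves a create, so only non-write tools are rejected here.
  if (
    change.operation === 'modify' &&
    anchor.operation === 'create' &&
    change.sourceTool !== 'write'
  ) {
    return { ok: false, reason: 'snapshot-upgrade-skipped/operation-mismatch' }
  }

  // A reconstructed modify must never be promoted to a delete.
  if (change.operation === 'modify' && anchor.operation === 'delete') {
    return { ok: false, reason: 'snapshot-upgrade-skipped/operation-mismatch' }
  }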

  return { ok: true }
}
```

Do not rely only on operation matching. It is a gate, not proof.

The validation helper should also distinguish these concepts:

- `anchor operation`: what the snapshot diff says happened to the file.
- `tool operation`: what the reconstructed toolpart thinks happened.
- `review operation`: what the imported task-change event will expose.

If these disagree, do not silently rewrite the operation unless the snapshot
transition and tool semantics both prove the new value. For example, a `write`
with no previous baseline may be reconstructed as `modify`; if snapshot says
`create` and before is absent, it may be upgraded to `create`. A reconstructed
`edit` must not become `create` or `delete`.

Add source identity checks before path checks:

```ts
function isSameSourceWindow(input: {
  change: ReconstructedOpenCodeToolChange
  windowMessageId: string
  windowId: string
  matchedWindowIds: string[]
}): boolean {
  return (
    input.change.sourceMessageId === input.windowMessageId &&
    input.matchedWindowIds.length === 1 &&
    input.matchedWindowIds[0] === input.windowId
  )
}
```

The exact data shape can differ, but the check must prove message-local and
single-window identity before using the snapshot file anchor.
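Condition 4 in the list above (the normalized touched path is inside the session worktree) can be checked lexically before any snapshot lookup. This is a sketch under assumptions: `normalizeRelativePath` is a hypothetical helper, it works on strings only, and it does not resolve symlinks, which the real implementation may need to handle separately.

```ts
// Returns a normalized worktree-relative path, or null when the input is
// absolute, empty, or escapes the worktree root through `..` segments.
function normalizeRelativePath(rawPath: string): string | null {
  // Treat backslashes as separators so Windows-style input cannot bypass the check.
  const unified = rawPath.replace(/\\/g, '/')
  // Absolute paths and drive-letter paths are never relative to the worktree.
  if (unified.startsWith('/') || /^[A-Za-z]:/.test(unified)) {
    return null
  }
  const segments: string[] = []
  for (const segment of unified.split('/')) {
    if (segment === '' || segment === '.') {
      continue
    }
    if (segment === '..') {
      if (segments.length === 0) {
        return null // would step above the worktree root
      }
      segments.pop()
      continue
    }
    segments.push(segment)
  }
  return segments.length > 0 ? segments.join('/') : null
}
```

A `null` result maps naturally to a skipped upgrade, and the normalized string is what the exact-path anchor lookup in condition 5 would compare against.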

### Phase 4 - Upgrade Single-Change Snapshot Proof

For one change touching a file within a snapshot window, upgrade directly if the
snapshot anchor proves the full transition.

Rules:

- `write` create:
  - accept when `beforeState.exists === false` and `afterContent` is full text.
  - if toolpart content exists, require it to equal snapshot after content.
- `write` modify:
  - accept when both snapshot before and after are full text.
  - if toolpart content exists, require it to equal snapshot after content.
- `edit` modify:
  - accept when both snapshot before and after are full text.
  - require applying `oldString -> newString` to before to equal after, unless the edit came from a verified snapshot patch.
- `apply_patch`:
  - accept when snapshot before and after are full text.
  - if parsed hunks exist, verify before-to-after application or inverse chain.
- delete:
  - accept when snapshot before is full text and snapshot after is absent.

For Phase 4, do not support "maybe" `apply_patch` upgrades without parsed hunks.
Keep those for Phase 5 or leave them metadata-only. This reduces the first
behavior change to the most provable cases.

Example helper:

```ts
function applyEditExactlyOnce(input: {
  before: string
  oldString: string | undefined
  newString: string | undefined
}): string | null {
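  const { before, oldString, newString } = input
  if (
    typeof oldString !== 'string' ||
    typeof newString !== 'string' ||
    oldString.length === 0 ||
    oldString === newString
  ) {
    return null
  }
  if (countOccurrences(before, oldString) !== 1) {
    return null
  }
  // Splice by index instead of String.prototype.replace, which would expand
  // `$&`, `$1`, and `$$` patterns that happen to appear inside newString.
  const index = before.indexOf(oldString)
  return before.slice(0, index) + newString + before.slice(index + oldString.length)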
}
```
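The helper above relies on `countOccurrences`, which this plan does not define. A possible sketch follows. Counting overlapping matches and treating an empty needle as "no usable match" are assumptions chosen to keep the exactly-once check conservative.

```ts
function countOccurrences(haystack: string, needle: string): number {
  // An empty needle cannot identify a location, so report no usable match.
  if (needle.length === 0) {
    return 0
  }
  let count = 0
  let from = 0
  while (true) {
    const index = haystack.indexOf(needle, from)
    if (index === -1) {
      return count
    }
    count += 1
    // Advance by one character so overlapping matches are counted too.
    from = index + 1
  }
}
```

Overlapping matches matter for proof: in `aaa` the string `aa` can be replaced at two different positions, so a count of 2 correctly marks the edit as ambiguous even though a non-overlapping scan would report 1.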

Example upgrade:

```ts
function upgradeEditFromSnapshot(input: {
  change: ReconstructedOpenCodeToolChange
  anchor: SnapshotFileAnchor
}): ReconstructedOpenCodeToolChange | null {
  const before = input.anchor.beforeContent
  const after = input.anchor.afterContent
  if (typeof before !== 'string' || typeof after !== 'string') {
    return null
  }

  const applied = applyEditExactlyOnce({
    before,
    oldString: input.change.oldString,
    newString: input.change.newString,
  })
  if (applied !== after) {
    return null
  }

  return {
    ...input.change,
    beforeContent: before,
    afterContent: after,
    beforeState: contentStateForText(before),
    afterState: contentStateForText(after),
    confidence: 'high',
    evidenceProof: 'opencode-snapshot',
    snapshotId: input.anchor.snapshotId,
    snapshotSource: input.anchor.snapshotId ? 'opencode' : undefined,
    warnings: stripManualOnlyWarnings(input.change.warnings, input.anchor.warnings),
  }
}
```

Add a generic transition verifier so write/edit/apply_patch decisions share the
same state checks:

```ts
type VerifiedTransition =
  | { ok: true; beforeContent: string | null; afterContent: string | null; operation: 'create' | 'modify' | 'delete' }
  | { ok: false; reason: SnapshotUpgradeDiagnosticCode }

function verifySnapshotTransition(input: {
  change: ReconstructedOpenCodeToolChange
  anchor: SnapshotFileAnchor
}): VerifiedTransition {
  const before = input.anchor.beforeContent
  const after = input.anchor.afterContent

  if (input.anchor.operation === 'create') {
    return typeof after === 'string'
      ? { ok: true, beforeContent: null, afterContent: after, operation: 'create' }
      : { ok: false, reason: 'snapshot-upgrade-skipped/no-file-anchor' }
  }

  if (input.anchor.operation === 'delete') {
    return typeof before === 'string'
      ? { ok: true, beforeContent: before, afterContent: null, operation: 'delete' }
      : { ok: false, reason: 'snapshot-upgrade-skipped/no-file-anchor' }
  }

  if (typeof before !== 'string' || typeof after !== 'string') {
    return { ok: false, reason: 'snapshot-upgrade-skipped/no-file-anchor' }
  }

  return { ok: true, beforeContent: before, afterContent: after, operation: 'modify' }
}
```

This function should not be the final proof for `edit` or `apply_patch`. It only
proves that snapshot text exists for the operation state.

Before returning an upgraded change, verify the emitted states match the emitted
content:

```ts
function assertStateMatchesContent(input: {
  beforeContent: string | null
  afterContent: string | null
  beforeState: ReconstructedOpenCodeToolChange['beforeState']
  afterState: ReconstructedOpenCodeToolChange['afterState']
}): boolean {
  if (input.beforeContent !== null) {
    const expected = contentStateForText(input.beforeContent)
    if (input.beforeState?.sha256 !== expected.sha256) {
      return false
    }
  }
  if (input.afterContent !== null) {
    const expected = contentStateForText(input.afterContent)
    if (input.afterState?.sha256 !== expected.sha256) {
      return false
    }
  }
  return true
}
```

If this assertion fails, keep the original metadata-only change and emit a
diagnostic. Do not import inconsistent state/content.

### Phase 5 - Upgrade Multi-Change Same-Path Chains

When several changes touch the same file inside one snapshot window, only
upgrade if the whole chain verifies.

Algorithm:

1. Start from snapshot `afterContent`.
2. Walk changes for that path in reverse source order.
3. For each change:
   - if it already has full before/after, require its after to equal the cursor.
   - for `edit`, reverse `newString -> oldString` exactly once.
   - for `apply_patch`, reverse parsed hunks exactly once.
   - for `write`, only allow it as the first/oldest operation if snapshot before
     matches the previous state or known absent state.
4. If any step is ambiguous, stop and keep all unresolved warnings.
5. If the reverse chain reaches snapshot `beforeContent`, materialize
   replacements for every unresolved change in the chain.

Pseudo-code:

```ts
function upgradeSamePathChain(input: {
  changes: ReconstructedOpenCodeToolChange[]
  anchor: SnapshotFileAnchor
  diagnostics: string[]
}): Map<string, ReconstructedOpenCodeToolChange> {
  const replacements = new Map<string, ReconstructedOpenCodeToolChange>()
  let cursor = input.anchor.afterContent

  if (typeof cursor !== 'string') {
    input.diagnostics.push('snapshot-upgrade-skipped/no-after-anchor')
    return replacements
  }

  for (let index = input.changes.length - 1; index >= 0; index -= 1) {
    const change = input.changes[index]
    if (!change) {
      continue
    }

    const upgraded = reverseOneChangeFromAfter({ change, after: cursor, anchor: input.anchor })
    if (!upgraded) {
      input.diagnostics.push(`snapshot-upgrade-skipped/path-chain-ambiguous:${change.relativePath}`)
      return new Map()
    }

    replacements.set(change.sourceImportKey, upgraded.change)
    cursor = upgraded.beforeContent
  }

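  // The reverse cursor must land exactly on the snapshot baseline: known absence
  // for a create, otherwise the full snapshot before text. A missing baseline
  // fails closed instead of skipping the check.
  const reachedBaseline =
    input.anchor.operation === 'create'
      ? cursor === null
      : typeof input.anchor.beforeContent === 'string' && cursor === input.anchor.beforeContent
  if (!reachedBaseline) {
    input.diagnostics.push('snapshot-upgrade-skipped/path-chain-boundary-mismatch')
    return new Map()
  }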

  return replacements
}
```

This is the highest-risk section. Keep tests dense here.

If there is any schedule pressure, defer this whole phase. Single-change
upgrades are enough to reduce many warnings and are much less risky.
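Step 3 of the algorithm, reversing a single `edit` from the cursor, is where ambiguity usually hides. The sketch below shows one conservative reading: `reverseEditExactlyOnce` and `reverseEditChain` are hypothetical names, and the extra forward check (the reconstructed text must contain `oldString` exactly once, at the same position) is an assumption rather than a stated requirement.

```ts
type EditStep = { oldString: string; newString: string }

// Undo one edit: find newString exactly once in `after` and put oldString back.
function reverseEditExactlyOnce(after: string, step: EditStep): string | null {
  const { oldString, newString } = step
  // Empty strings cannot be located, and a no-op edit proves nothing.
  if (oldString.length === 0 || newString.length === 0 || oldString === newString) {
    return null
  }
  const position = after.indexOf(newString)
  if (position === -1 || after.indexOf(newString, position + 1) !== -1) {
    return null // missing or ambiguous, overlapping matches included
  }
  const before = after.slice(0, position) + oldString + after.slice(position + newString.length)
  // Forward check: the original edit must have been unambiguous as well.
  const forward = before.indexOf(oldString)
  if (forward !== position || before.indexOf(oldString, forward + 1) !== -1) {
    return null
  }
  return before
}

// Walk the edits in reverse source order, starting from the snapshot after text.
function reverseEditChain(after: string, stepsInSourceOrder: readonly EditStep[]): string | null {
  let cursor = after
  for (let index = stepsInSourceOrder.length - 1; index >= 0; index -= 1) {
    const step = stepsInSourceOrder[index]
    if (!step) {
      return null
    }
    const previous = reverseEditExactlyOnce(cursor, step)
    if (previous === null) {
      return null
    }
    cursor = previous
  }
  return cursor
}
```

The caller would still compare the returned text with snapshot `beforeContent`. A non-null result only means every step was reversible, not that the chain reached the snapshot baseline.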

Additional multi-change restrictions:

- Do not cross snapshot-window boundaries.
- Do not cross assistant-message boundaries.
- Do not cross task delivery boundaries.
- Do not mix changes with different `sourceMessageId`.
- Do not mix changes with different normalized `relativePath`.
- Do not include changes whose source import key is missing or duplicated.
- Do not upgrade a chain if any change in the path has an operation that cannot
  be reversed from the current cursor.
- Do not upgrade if the final reverse cursor does not exactly equal snapshot
  `beforeContent` for modify/delete, or known absence for create.

Add this explicit guard:

```ts
function assertSinglePathWindowChain(input: {
  changes: ReconstructedOpenCodeToolChange[]
}): boolean {
  const relativePaths = new Set(input.changes.map(change => change.relativePath))
  const messageIds = new Set(input.changes.map(change => change.sourceMessageId))
  const importKeys = new Set(input.changes.map(change => change.sourceImportKey))
  return (
    relativePaths.size === 1 &&
    messageIds.size === 1 &&
    importKeys.size === input.changes.length
  )
}
```

### Phase 6 - Warning Stripping Must Be Conservative

Only remove warnings that are made false by the new proof.

Safe to remove after verified before/after:

- `OpenCode edit was captured without a proven full-text baseline; apply/reject is manual-only.`
- `OpenCode write overwrote an existing file before the bridge had a known baseline; reject is manual-only.`
- `OpenCode apply_patch was captured without full before/after text; review is manual-only.`
- `OpenCode toolpart content was unavailable or too large; review is manual-only.`
- `full review depends on snapshot evidence`

Do not remove:

- attribution warnings
- low confidence task boundary warnings
- delivery context warnings
- path outside session directory warnings
- large/binary warnings for other files
- warnings attached to unrelated changes in the same task
- snapshot unavailable warnings attached to the same file
- any warning whose text is not in the known resolved warning predicate

Example:

```ts
function isResolvedByFullTextProof(warning: string): boolean {
  return (
    warning === 'OpenCode edit was captured without a proven full-text baseline; apply/reject is manual-only.' ||
    warning === 'OpenCode write overwrote an existing file before the bridge had a known baseline; reject is manual-only.' ||
    warning === 'OpenCode apply_patch was captured without full before/after text; review is manual-only.' ||
    warning === 'OpenCode toolpart content was unavailable or too large; review is manual-only.' ||
    warning.includes('full review depends on snapshot evidence')
  )
}

function stripManualOnlyWarnings(
  existing: string[] | undefined,
  snapshotWarnings: string[] | undefined,
): string[] {
  return [
    ...(existing ?? []).filter(warning => !isResolvedByFullTextProof(warning)),
    ...(snapshotWarnings ?? []),
  ].filter(Boolean)
}
```

If snapshot warnings report unavailable content for this exact file, the proof
is incomplete and the change must stay metadata-only. Add a test for that.

### Phase 7 - Preserve Performance Limits And Add Budgets

Do not increase these limits by default:

- `maxFiles: 100`
- `maxBytesPerTextFile: 1024 * 1024`
- `maxTotalBytes: 4 * 1024 * 1024`
- `timeoutMs: 3000`

Additional guard:

```ts
const unresolved = params.changes.filter(needsSnapshotProof)
if (unresolved.length === 0) {
  return result
}

const touchedRelativePaths = [...new Set(unresolved.map(change => change.relativePath))]
```

Do not pass already exact changes into `touchedRelativePaths` unless needed for
chain verification. This keeps snapshot reads narrow.

Add explicit performance budgets:

- A no-op backfill with no unresolved OpenCode changes should not invoke the
  snapshot reader.
- A strict-delivery task with one unresolved file should read one touched path.
- Snapshot proof attempt should record elapsed time in diagnostics when it
  exceeds 500 ms.
- More than two snapshot timeouts in a real-data smoke run blocks rollout.
- The broad real-data smoke should not increase total runtime by more than 10%
  compared with the baseline measured before the change.

Implementation sketch:

```ts
const startedAt = performance.now()
const snapshotResult = await readSnapshotEvidence()
const elapsedMs = performance.now() - startedAt
if (elapsedMs > 500) {
  diagnostics.push(`snapshot-upgrade-slow: ${Math.round(elapsedMs)}ms`)
}
```

Use the local runtime timing primitive already used in the orchestrator if
`performance.now()` is not available in that module.

Add a resource envelope for one backfill call:

```ts
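// Arithmetic such as `1024 * 1024` is not valid in a type position, so the
// envelope is declared as a constant and the type is derived from it.
const SNAPSHOT_PROOF_RESOURCE_ENVELOPE = {
  maxSnapshotReadsPerBackfill: 10,
  maxTouchedPathsPerRead: 100,
  maxBytesPerTextFile: 1024 * 1024,
  maxTotalBytesPerRead: 4 * 1024 * 1024,
  maxElapsedMsPerRead: 3000,
} as const

type SnapshotProofResourceEnvelope = typeof SNAPSHOT_PROOF_RESOURCE_ENVELOPE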
```

Do not add hidden retries that can multiply these limits. One failed or timed
out snapshot read should produce diagnostics and preserve metadata-only changes.
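A per-backfill read cap without retries can be sketched as a simple counter. Everything here is illustrative: `createSnapshotReadBudget`, `selectReadsWithinBudget`, and the `read-budget-exhausted` diagnostic are hypothetical and not part of the diagnostic list in this plan.

```ts
// Hypothetical helper: caps the number of snapshot reads for one backfill call.
function createSnapshotReadBudget(maxReads: number): { tryAcquire: () => boolean; used: () => number } {
  let used = 0
  return {
    tryAcquire: () => {
      if (used >= maxReads) {
        return false
      }
      used += 1
      return true
    },
    used: () => used,
  }
}

// Each batch of touched paths costs one read. Batches beyond the cap are
// reported and left metadata-only; nothing is retried or re-queued.
function selectReadsWithinBudget(
  pathBatches: readonly string[][],
  maxReads: number,
  diagnostics: string[],
): string[][] {
  const budget = createSnapshotReadBudget(maxReads)
  const allowed: string[][] = []
  for (const batch of pathBatches) {
    if (!budget.tryAcquire()) {
      diagnostics.push(`snapshot-upgrade-skipped/read-budget-exhausted: ${batch.length} paths`)
      continue
    }
    allowed.push(batch)
  }
  return allowed
}
```

Acquiring the budget before the read, rather than after a successful one, means a failed or timed-out read still consumes its slot, which is what prevents hidden retries from multiplying the limits.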

### Phase 8 - Idempotency And Existing Ledger Events

The upgrade may change the materialized content for a source event that was
previously imported as metadata-only. That needs a clear policy.

Preferred policy:

1. Keep `sourceImportKey` stable.
2. Let the existing ledger importer treat the upgraded event as the same source
   event, not a new file change.
3. If the importer is append-only and cannot update a previous event safely,
   do not attempt to rewrite old ledger data in this feature.
4. For new tasks, the upgraded evidence should be imported on the first backfill.
5. For old tasks, a re-backfill can show better evidence only if the existing
   ledger/import layer already supports replacing or superseding by source key.

Add a test for repeated backfill. It should not duplicate files in the review
bundle.
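
A repeated-backfill test could be written against a small in-memory stand-in like the one below. This is only a sketch of the `new-imports-only` behavior: `InMemoryLedger` and its methods are hypothetical, and `'snapshot-proof'` is a placeholder proof value, not a confirmed `evidenceProof` literal.

```ts
// Hypothetical in-memory stand-in for the ledger importer; the real importer may differ.
type ImportedEvent = {
  sourceImportKey: string
  relativePath: string
  evidenceProof: string
}

class InMemoryLedger {
  private readonly eventsByKey = new Map<string, ImportedEvent>()

  // 'new-imports-only': the first event for a source key wins, later ones are ignored.
  importEvents(events: ImportedEvent[]): { imported: number; ignored: number } {
    let imported = 0
    let ignored = 0
    for (const event of events) {
      if (this.eventsByKey.has(event.sourceImportKey)) {
        ignored += 1
        continue
      }
      this.eventsByKey.set(event.sourceImportKey, event)
      imported += 1
    }
    return { imported, ignored }
  }

  // One review row per source key, so a rerun cannot duplicate files.
  reviewFiles(): string[] {
    return Array.from(this.eventsByKey.values()).map(event => event.relativePath)
  }

  proofFor(sourceImportKey: string): string | undefined {
    return this.eventsByKey.get(sourceImportKey)?.evidenceProof
  }
}
```

Under this policy a later upgraded event with the same `sourceImportKey` is ignored rather than appended, which matches the rule that old ledger data is not rewritten by this feature.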

### Phase 9 - Desktop Contract Validation

This phase should not add new UI behavior unless tests expose a bug. It validates
that the upgraded events are already consumed safely.

Checklist:

- Full-text upgraded OpenCode event renders through the same path as Codex full-text diffs.
- Metadata-only OpenCode event still renders the warning banner.
- Mixed full-text and metadata-only task keeps per-file rejectability.
- `Reject All` skips metadata-only files.
- Current disk preview remains read-only context.
- Task summary warnings remain visible if attribution or boundary warnings remain.

If any item fails, fix the shared review safety helper rather than adding a
separate OpenCode-specific branch in the UI.

## Observability And Metrics

Add counters to diagnostics or existing debug output. They should be cheap and
safe to expose in test logs.

Suggested counters:

```ts
type SnapshotProofStats = {
  attemptedChanges: number
  upgradedChanges: number
  skippedChanges: number
  skippedByReason: Record<string, number>
  snapshotReadCount: number
  snapshotReadTimeouts: number
  snapshotReadElapsedMs: number
  touchedPathCount: number
  exactToolpartChainCount: number
  metadataOnlyFallbackCount: number
}
```

Use these stats in smoke output:

```text
OpenCode snapshot proof:
- attempted: 12
- upgraded: 7
- skipped: 5
- skipped/no-window: 2
- skipped/path-chain-ambiguous: 1
- skipped/too-large: 2
- snapshot reads: 3
- snapshot read time: 184ms
```

Metrics must not include file content or secrets. Paths are acceptable only if
the existing diagnostics already expose paths in the same context.
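
The smoke output above could be produced from the counters with two small helpers. This is a sketch only: `recordSkip`, `formatProofStats`, and the `SnapshotProofStatsSubset` type are hypothetical names, and the subset omits counters that are not printed.

```ts
// Hypothetical subset of SnapshotProofStats covering only the printed counters.
type SnapshotProofStatsSubset = {
  attemptedChanges: number
  upgradedChanges: number
  skippedChanges: number
  skippedByReason: Record<string, number>
  snapshotReadCount: number
  snapshotReadElapsedMs: number
}

// Keeps the total and the per-reason counter in sync.
function recordSkip(stats: SnapshotProofStatsSubset, reason: string): void {
  stats.skippedChanges += 1
  stats.skippedByReason[reason] = (stats.skippedByReason[reason] ?? 0) + 1
}

// Reasons are sorted so the output is deterministic across runs.
function formatProofStats(stats: SnapshotProofStatsSubset): string {
  const reasonLines = Object.keys(stats.skippedByReason)
    .sort()
    .map(reason => `- skipped/${reason}: ${stats.skippedByReason[reason]}`)
  return [
    'OpenCode snapshot proof:',
    `- attempted: ${stats.attemptedChanges}`,
    `- upgraded: ${stats.upgradedChanges}`,
    `- skipped: ${stats.skippedChanges}`,
    ...reasonLines,
    `- snapshot reads: ${stats.snapshotReadCount}`,
    `- snapshot read time: ${Math.round(stats.snapshotReadElapsedMs)}ms`,
  ].join('\n')
}
```

Only counts and reason codes are formatted, so the output cannot leak file content.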

## Deterministic Output Comparison

Use deterministic fingerprints to compare `off`, `shadow`, and apply modes.
This catches accidental behavior changes that are hard to see in UI screenshots.

Suggested fingerprint input:

```ts
type ReviewBundleFingerprintInput = Array<{
  taskId: string
  relativePath: string
  sourceImportKey: string
  evidenceProof: string | undefined
  operation: string
  beforeSha256?: string
  afterSha256?: string
  warningCount: number
  rejectable: boolean
}>
```

Rules:

- `off` and `shadow` fingerprints must match except for diagnostics/stats.
- `single-change` may change OpenCode entries only.
- `full` may change OpenCode entries only.
- Non-OpenCode entries must have identical fingerprints in every mode.
- Fingerprints must not include raw file content.

If a mode comparison fails, inspect the structured diff before looking at UI.
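
One way to make the comparison deterministic is to serialize each entry with a fixed field order and sort the rows. The sketch below assumes the fingerprint is the canonical string itself; `canonicalFingerprint` is a hypothetical name, and hashing the result (for example with SHA-256) is an optional extra step not shown here.

```ts
type FingerprintEntry = {
  taskId: string
  relativePath: string
  sourceImportKey: string
  evidenceProof: string | undefined
  operation: string
  beforeSha256?: string
  afterSha256?: string
  warningCount: number
  rejectable: boolean
}

// Hypothetical helper: produces the same string for the same set of entries,
// regardless of input order or object key order.
function canonicalFingerprint(entries: FingerprintEntry[]): string {
  const rows = entries.map(entry =>
    // A fixed-position array avoids any dependence on object key order.
    JSON.stringify([
      entry.taskId,
      entry.relativePath,
      entry.sourceImportKey,
      entry.evidenceProof ?? null,
      entry.operation,
      entry.beforeSha256 ?? null,
      entry.afterSha256 ?? null,
      entry.warningCount,
      entry.rejectable,
    ])
  )
  // `map` returned a fresh array, so sorting does not mutate the caller's input.
  return rows.sort().join('\n')
}
```

Because each row is one line, a failed comparison between two modes can be inspected with an ordinary line diff.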

## Cache And Re-Backfill Policy

The safest initial policy is:

- New backfills may import upgraded proof.
- Existing metadata-only events should not be rewritten unless the current
  importer already has a proven source-key replacement/supersede path.
- The desktop cache should not be globally invalidated.
- A task-specific refresh may re-read after successful OpenCode import.
- If cache behavior is unclear, tests should bypass cache and the rollout should
  leave old events unchanged.

Pseudo-policy:

```ts
type ExistingEventPolicy = 'new-imports-only' | 'supersede-by-source-key'

function chooseExistingEventPolicy(audit: {
  importerSupersedesBySourceKey: boolean
  reviewBundleDedupesBySourceKey: boolean
}): ExistingEventPolicy {
  return audit.importerSupersedesBySourceKey && audit.reviewBundleDedupesBySourceKey
    ? 'supersede-by-source-key'
    : 'new-imports-only'
}
```

Do not create a third policy that appends upgraded duplicates and relies on UI
filtering to hide the old event.

## Rollback Runbook

Rollback must be possible without data repair.

Immediate rollback:

```bash
OPENCODE_SNAPSHOT_PROOF_UPGRADE=off
```

Expected behavior after rollback:

- New OpenCode backfills return to previous metadata-only/manual-only behavior
  for cases without exact toolpart chains.
- Existing already-imported upgraded events remain valid historical full-text
  events. Do not delete them as part of rollback.
- No new upgraded events should be imported while the flag is off.
- Desktop review should continue to render previously imported full-text events.
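
For the flag to work as a rollback switch, parsing should fail closed. The sketch below assumes that an unset or unrecognized value maps to `off` and that matching is case-insensitive; both are assumptions, and `parseSnapshotProofUpgradeMode` is a hypothetical name. The caller is expected to pass in the raw value of `OPENCODE_SNAPSHOT_PROOF_UPGRADE`.

```ts
type SnapshotProofUpgradeMode = 'off' | 'shadow' | 'single-change' | 'full'

const KNOWN_MODES: readonly SnapshotProofUpgradeMode[] = [
  'off',
  'shadow',
  'single-change',
  'full',
]

// Hypothetical parser: anything that is not an exact known mode becomes 'off'.
function parseSnapshotProofUpgradeMode(raw: string | undefined): SnapshotProofUpgradeMode {
  const normalized = (raw ?? '').trim().toLowerCase()
  for (const mode of KNOWN_MODES) {
    if (mode === normalized) {
      return mode
    }
  }
  // Fail closed: typos and unknown future values never enable an apply mode.
  return 'off'
}
```

A typo such as `singlechange` therefore disables the upgrade instead of silently selecting a broader mode.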

If rollback is needed because upgraded duplicates were imported:

1. Do not add renderer-side filtering as a permanent fix.
2. Identify whether duplicates share `sourceImportKey`.
3. Fix importer/source-key dedupe.
4. Add a regression test with the duplicated event fixture.
5. Only then consider a one-off ledger cleanup, and only with explicit user
   approval.

If rollback is needed because of performance:

1. Keep diagnostics.
2. Disable proof upgrade.
3. Preserve exact `toolpart-chain` behavior.
4. Inspect snapshot read counters and touched path counts.
5. Re-enable only after reducing reads, not after raising limits.

## Implementation Slices

Prefer these slices even if the work lands in one PR. Each slice should compile
and have focused tests before the next slice starts.

1. Diagnostics only:
   - Add typed diagnostic codes.
   - Add stats object.
   - No behavior change.
2. Eligibility only:
   - Add feature flag parser.
   - Add `needsSnapshotProof` and `mayUseSnapshotProof`.
   - Prove `off` mode has no behavior change.
3. Shadow proof:
   - Compute proof decisions and stats.
   - Return original changes to importer.
   - Compare `shadow` and `off` outputs.
4. Single-change proof:
   - Implement formal predicates.
   - Implement create/modify/delete proof for one path/window/change.
   - Keep multi-change groups skipped.
5. Import/idempotency validation:
   - Verify source-key dedupe or choose `new-imports-only`.
   - Add repeated-backfill tests.
6. Desktop validation:
   - Verify shared rejectability consumes upgraded events safely.
   - No OpenCode-specific renderer branch unless a shared helper bug is found.
7. Multi-change proof:
   - Implement only after ordering contract tests pass.
   - Keep behind `full`.
8. Default enablement:
   - Enable `single-change` only after real-data smoke.
   - Enable `full` only in a separate rollout decision.

Stop points:

- It is acceptable to stop after slice 3 and ship only `shadow`.
- It is acceptable to stop after slice 4 and ship only `single-change`.
- It is acceptable to stop after slice 1 (diagnostics only) if real data shows
  an unsupported snapshot shape.
- It is not acceptable to ship multi-change proof without real or synthetic
  chain coverage.

## Definition Of Done By Mode

### `off`

- No behavior change from current metadata-only/full-text decisions.
- Diagnostics may mention that the feature is disabled.
- Tests prove no upgraded event appears in this mode.

### `shadow`

Required before any apply mode can be default:

- Proof attempts run for eligible OpenCode changes.
- Importer receives the original change list.
- Stats include would-upgrade and skipped counts.
- No review diff, rejectability, warning, or file count changes.
- Real-data smoke shows non-OpenCode teams unchanged.
- Performance budget passes while proof is computed but not applied.

### `single-change`

Required before this mode can be default:

- Only one unresolved change for a path/window can upgrade.
- Multi-change path/window groups are skipped with diagnostics.
- `write` create/modify, `edit` modify, and delete cases have positive and
  negative tests.
- Non-OpenCode teams are unchanged in real-data smoke.
- Metadata-only count for OpenCode tasks decreases or stays equal.
- No duplicate review rows after repeated backfill.

### `full`

Required before this mode can be default:

- Every known unknown that blocks full mode is resolved.
- Same-path multi-change order is proven by tests.
- Chain upgrades are all-or-nothing for unresolved changes.
- Real-data smoke includes at least one actual multi-change chain or a synthetic
  fixture with equivalent shape.
- `full` mode can be disabled without changing code.
- A separate rollout decision enables `full`; it must not become default as a
  side effect of implementing single-change mode.
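
The per-mode rules can be collected into a single gate with an exhaustive switch, so adding a new mode fails compilation until it is handled. This is a sketch under assumptions: the real `mayApplySnapshotProof` signature is not specified in this plan, and `ProofGroup` plus the reason strings are hypothetical.

```ts
type SnapshotProofUpgradeMode = 'off' | 'shadow' | 'single-change' | 'full'

// Hypothetical summary of one path/window group after proof was computed.
type ProofGroup = { unresolvedChangeCount: number; allTransitionsVerified: boolean }

type ApplyDecision =
  | { apply: true }
  | {
      apply: false
      reason: 'mode-off' | 'shadow-only' | 'proof-not-verified' | 'multi-change-group'
    }

function mayApplySnapshotProof(mode: SnapshotProofUpgradeMode, group: ProofGroup): ApplyDecision {
  switch (mode) {
    case 'off':
      return { apply: false, reason: 'mode-off' }
    case 'shadow':
      // Proof is computed for stats, but the importer gets the original changes.
      return { apply: false, reason: 'shadow-only' }
    case 'single-change':
      if (!group.allTransitionsVerified) {
        return { apply: false, reason: 'proof-not-verified' }
      }
      return group.unresolvedChangeCount === 1
        ? { apply: true }
        : { apply: false, reason: 'multi-change-group' }
    case 'full':
      // Chain upgrades are all-or-nothing.
      return group.allTransitionsVerified
        ? { apply: true }
        : { apply: false, reason: 'proof-not-verified' }
    default: {
      // Compile-time exhaustiveness check: a new mode makes this assignment fail.
      const unhandled: never = mode
      throw new Error(`Unhandled snapshot proof mode: ${String(unhandled)}`)
    }
  }
}
```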

## Edge Case Matrix

### Attribution and Task Boundaries

- No delivery context:
  - Do not run strict snapshot upgrade.
  - Keep existing backfill skipped behavior.
- Delivery context exists but does not include the requested task:
  - Keep `no-attribution` behavior. Do not use compatible fallback for safe full-text.
- Compatible attribution mode:
  - Do not upgrade to auto-safe full text.
  - Reason: task ownership is not strict enough.
- Missing task start boundary:
  - Snapshot proof may prove file content, but task boundary warning remains.
- Estimated end boundary:
  - Snapshot proof may prove file content, but boundary warning remains.
- Same OpenCode session contains several tasks:
  - Only strict delivery records for the requested task are eligible.
- Same member touches same file for two tasks:
  - Do not merge changes across delivery windows.
- Multiple members share an OpenCode profile:
  - Require the delivery record's member, lane, and session to match. Do not trust the profile alone.
- Runtime delivery ledger was reset after launch:
  - No strict delivery context means no safe upgrade. Keep warnings.
- Delivery record has task refs but missing observed assistant message:
  - Do not use message-local snapshot proof unless the toolpart can still be
    tied to the delivered prompt through existing strict delivery matching.
- Delivery record has a pre-prompt cursor but no post-prompt cursor:
  - Keep strict-delivery matching conservative. Do not widen to the whole session.
- Task display id matches but canonical task id differs:
  - Use canonical task id for safe upgrades.

### Snapshot Windows

- No snapshot windows:
  - Keep metadata-only warning.
- Toolpart outside window:
  - Keep metadata-only warning.
- Toolpart matches multiple windows:
  - Keep metadata-only warning.
- Window has before hash but no after hash:
  - Keep metadata-only warning.
- Window has after hash but no before hash:
  - Allow create only if file absence is explicitly proven. Otherwise keep warning.
- Snapshot diff contains the path but operation is unknown:
  - Keep metadata-only warning.
- Snapshot diff includes more changed files than reconstructed toolparts:
  - Upgrade only exact reconstructed paths. Add diagnostic for extra snapshot paths.
- Snapshot diff misses a reconstructed path:
  - Keep that path metadata-only.
- OpenCode SQLite changed during read:
  - The existing read-only transaction snapshot is sufficient, but add a diagnostic.
- OpenCode schema changed:
  - Treat as unsupported history shape, no upgrade.
- Snapshot git store object is missing:
  - Keep metadata-only warning and include the store diagnostic.
- Snapshot git store read times out:
  - Keep metadata-only warning and include timeout diagnostic.
- Snapshot window hashes exist but git-store object is pruned:
  - Keep metadata-only warning and include retention diagnostic.
- Snapshot window hashes exist but point to an object from a different project:
  - Treat as workspace mismatch and skip.
- Snapshot window contains no reconstructed changes after path filtering:
  - Do not read files for that window.
- Snapshot reader returns duplicate entries for one relative path:
  - Treat as ambiguous and skip that path.
- Snapshot reader returns content for a path with different casing:
  - Use existing normalized comparison key. If identity is ambiguous, skip.
- Snapshot window is valid but OpenCode part JSON was truncated by our reader
  cap:
  - Treat affected toolparts as metadata-only. Do not combine partial part data
    with snapshot proof.
- Snapshot window contains changes from a tool type not modeled by this plan:
  - Keep those changes metadata-only until the tool type has explicit tests.

### File Content

- Text over size limit:
  - Keep warning with `too-large`.
- Binary or null-byte:
  - Keep warning with `binary`.
- Empty file:
  - Valid text content. Do not confuse empty string with unavailable.
- Missing file after delete:
  - Valid delete if before content is known.
- Missing file before create:
  - Valid create if after content is known.
- File exists before create operation:
  - Operation mismatch, no upgrade.
- File absent after modify operation:
  - Operation mismatch, no upgrade.
- Content normalizes differently by line endings:
  - Do not normalize for proof. Exact byte-equivalent UTF-8 text comparison is required.
- Content has invalid UTF-8:
  - Treat as binary/unavailable.
- Generated/minified text below limit:
  - It can be upgraded if full text is available, but review UI may still choose to collapse display.
- File mode-only changes:
  - Do not create a text diff upgrade unless text before/after also changed or mode changes are explicitly modeled.
- Very small binary file:
  - Size does not make it text. Binary detection still wins.
- UTF-16 or other non-UTF-8 text:
  - Treat as unavailable unless the existing snapshot reader explicitly decodes
    and hashes the exact same text representation used by review events.
- Secrets in file content:
  - Do not log content in diagnostics. Existing ledger storage rules apply to
    before/after blobs.
- Git LFS pointer file:
  - Treat the pointer text as the file content if that is what the snapshot
    contains. Do not dereference LFS objects.
- Sparse checkout missing working-tree file:
  - Irrelevant for proof. Snapshot evidence may still be valid, but execution
    safety must handle current disk conflict separately.
- Submodule path:
  - Do not read inside submodule git data unless existing snapshot reader
    explicitly models submodules. Treat as metadata-only otherwise.
- Permission denied reading snapshot object:
  - Keep metadata-only warning and include a permission diagnostic.
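
The size, binary, and UTF-8 rules can be checked in one pass over the raw snapshot bytes. This sketch uses the standard `TextDecoder` in fatal mode; `classifySnapshotBytes` is a hypothetical name, and mapping invalid UTF-8 to the `binary` reason is an assumption based on the "binary/unavailable" rule above.

```ts
type TextClassification =
  | { kind: 'text'; text: string }
  | { kind: 'unavailable'; reason: 'too-large' | 'binary' }

// Hypothetical classifier for raw snapshot bytes.
function classifySnapshotBytes(bytes: Uint8Array, maxBytes: number): TextClassification {
  if (bytes.byteLength > maxBytes) {
    return { kind: 'unavailable', reason: 'too-large' }
  }
  // A null byte wins over size: a tiny binary file is still binary.
  if (bytes.indexOf(0) !== -1) {
    return { kind: 'unavailable', reason: 'binary' }
  }
  try {
    // fatal: throw on invalid UTF-8 instead of inserting replacement characters.
    // ignoreBOM: keep a leading BOM in the text so nothing is silently dropped.
    const decoder = new TextDecoder('utf-8', { fatal: true, ignoreBOM: true })
    return { kind: 'text', text: decoder.decode(bytes) }
  } catch {
    return { kind: 'unavailable', reason: 'binary' }
  }
}
```

An empty byte array decodes to an empty string and is reported as `text`, which keeps "empty file" distinct from "unavailable".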

### Edit Semantics

- `oldString === newString`:
  - Skip as a no-op, as the current code does.
- `oldString` missing:
  - Cannot prove edit from toolpart alone.
- `newString` missing:
  - Cannot prove edit from toolpart alone.
- `oldString` appears twice:
  - No upgrade unless snapshot chain proves exact transition through another trusted source.
- `newString` appears twice when reversing:
  - No inverse upgrade.
- Replacement creates same final content through multiple possible paths:
  - No upgrade.
- `replaceAll` or multi-replacement edit shape appears:
  - Do not upgrade until that tool shape is explicitly parsed and tested.
- Edit tool reports success but snapshot before does not contain `oldString`:
  - No upgrade.
- Edit tool reports success but snapshot after does not contain `newString`:
  - No upgrade unless the replacement legitimately deletes the string and the exact transition verifies.
- Empty `oldString`:
  - Do not upgrade. Empty search strings are ambiguous.
- Empty `newString`:
  - Valid deletion only when `oldString` occurs exactly once and snapshot after
    equals the deletion result.
- Overlapping replacements:
  - Do not upgrade unless exact before-to-after application has one valid path.

### Write Semantics

- Write creates new file:
  - Upgrade only if before absent and after content known.
- Write overwrites existing file:
  - Upgrade only if snapshot before and snapshot after are known.
- Toolpart content differs from snapshot after:
  - No upgrade.
- Toolpart content is truncated:
  - Snapshot can still prove the after state only if snapshot after is available and the operation is fully verified.
  - Keep a diagnostic that toolpart content was truncated but snapshot proof was used.
- Existing file baseline unavailable:
  - Keep current warning.
- Write after earlier edit in the same path/window:
  - Treat as chain case. Do not single-change upgrade.
- Write followed by edit in the same path/window:
  - Treat as chain case. Single-change upgrade is not enough.
- Write content is available but snapshot after is unavailable:
  - Do not use toolpart content alone for existing-file overwrite baseline.
- Write content equals previous content:
  - It may be a no-op. Do not create a misleading modify diff unless snapshot
    shows a real text state transition or the review event model supports no-op
    changes explicitly.
- Write creates parent directories:
  - Only the file text is in scope. Directory creation is not a text proof.

### Apply Patch Semantics

- Patch text unavailable:
  - Snapshot can prove final file-level before/after only if the window has an exact path anchor.
- Parsed update hunks apply exactly once:
  - Allow inverse chain proof.
- Parsed hunks apply in multiple places:
  - No upgrade.
- Patch creates/deletes file:
  - Verify operation with snapshot before/after states.
- Patch touches files not in metadata:
  - Add diagnostic and do not upgrade missing paths unless snapshot path proof is exact.
- Patch contains rename:
  - Do not upgrade as text modify unless rename support is explicitly modeled.
- Patch changes file mode only:
  - Keep metadata-only unless mode changes are supported by the review event schema.
- Patch contains CRLF-sensitive context:
  - Exact text verification is required. Do not line-ending-normalize.
- Patch partially applies in reverse:
  - No upgrade. All hunks must verify.
- Patch has context-only hunks:
  - Do not treat context as a change without before/after text proof.
- Patch deletes and recreates the same file in one patch:
  - Treat as ambiguous unless the parser explicitly models it and tests cover it.

### Paths and Workspaces

- Absolute paths outside workspace:
  - Reject upgrade.
- `..` path traversal:
  - Reject upgrade.
- Windows path separators:
  - Normalize, then validate.
- Symlink points outside workspace:
  - Do not read current disk. Snapshot git store path normalization should be trusted only for repository paths.
- Session directory is a subdirectory:
  - Touched paths outside the session directory get a diagnostic. Do not let this alone prove or disprove content.
- Case-insensitive filesystem:
  - Use existing OpenCode path comparison helpers.
- Unicode normalization differences in file names:
  - Use existing normalized path keys. Do not add a second normalization scheme in this feature.
- Nested git repository inside workspace:
  - Verify snapshot identity against the OpenCode project worktree, not just the process cwd.
- Worktree moved after task:
  - Use recorded project identity. If workspace identity cannot be trusted, no upgrade.
- Workspace root is a symlink:
  - Use existing workspace comparison helpers. Do not add ad-hoc `realpath`
    behavior unless tests cover both symlinked and non-symlinked roots.
- File path contains newline or control characters:
  - Do not include raw path in diagnostics without escaping. Upgrade only if
    existing path normalization accepts it safely.
- Case-only rename:
  - Treat as rename/path operation, not a text modify, unless the review event
    schema explicitly models it.
- Path appears both as file and directory across before/after:
  - Keep metadata-only unless snapshot reader explicitly models the transition.

### Concurrency and Later Changes

- Current disk changed after task:
  - Irrelevant. Do not use current disk for proof.
- Another member changed same file after OpenCode task:
  - Snapshot proof remains historical. Review conflict detection must happen elsewhere.
- Backfill runs twice:
  - Source import keys must dedupe.
- Backfill interrupted:
  - Existing ledger import must remain idempotent.
- OpenCode host is still writing SQLite:
  - Rely on read-only transaction and existing fingerprint diagnostics. Do not retry aggressively.
- Two backfills run concurrently:
  - Existing in-flight dedupe should prevent duplicate desktop calls. The importer must still dedupe by source key.
- User manually edits a file while review is open:
  - Snapshot proof remains historical. Apply/reject conflict handling is outside this feature.
- Team is relaunched while backfill is running:
  - Use run/session identity from the delivery context. Do not merge new runtime
    sessions into the old task proof.
- Snapshot proof succeeds but ledger import fails:
  - Retry should be idempotent by source key. Diagnostics should not mark the
    task as safely upgraded until import succeeds.
- Snapshot store is pruned between capability check and file read:
  - Treat the read failure as metadata-only fallback. Do not retry from current disk.
- OpenCode writes a new assistant message while backfill reads SQLite:
  - Use the read-only transaction snapshot and existing fingerprint diagnostics.
    Do not merge later rows into the current proof attempt.
- SQLite WAL is corrupt or cannot be read:
  - Treat session history as unavailable/unsupported. Do not use partial rows for
    safe proof.
- OpenCode JSON row is malformed:
  - Skip that row and keep affected changes metadata-only. Do not infer from
    surrounding rows.

### UI And Review Semantics

- Full-text upgrade enables normal diff rendering only if the imported event has
  both safe baseline and safe target state.
- Metadata-only warnings should remain visible and should not be hidden by task
  summary aggregation.
- `Reject All` must still skip non-rejectable files.
- A task-level warning may remain even when all file diffs are full-text.
- A file-level warning may remain even when another file in the same task is upgraded.
- Do not change viewed-count behavior in this feature.
- Do not hide task cards solely because all OpenCode warnings were resolved.
- Do not change accept/reject button labels or statuses in this feature.
- Do not mark a file viewed just because snapshot proof succeeded.
- A file becoming review-safe does not mean reject execution must succeed if the
  user changed the worktree after the task.
- Conflict messaging for apply/reject should remain the existing shared
  behavior, not a new OpenCode-only message path.
- If a task card warning disappears because all file-level OpenCode baseline
  warnings were resolved, task-boundary and attribution warnings must still
  remain visible.
- The UI should not describe a snapshot-upgraded file as "guaranteed safe".
  It is "review-safe" or "full-text verified"; execution can still conflict.
- Do not add success toasts or celebratory messaging for proof upgrades. This is
  infrastructure, not a user-facing achievement.

### Security And Privacy

- Do not log before/after content.
- Do not include long snippets in diagnostics.
- Do not include raw paths with control characters in diagnostics.
- Do not include delivery payload text in proof stats.
- Do not expand file size limits for convenience.
- Do not add a new IPC path that exposes arbitrary snapshot reads.
- If a file is upgraded, it is stored through the existing task-change ledger
  content path. Do not add a second storage location.
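
Escaping paths before they reach diagnostics could look like the sketch below. `escapePathForDiagnostics` and the 200-character cap are hypothetical. The escape format (`\uXXXX` for control characters, doubled backslashes) is one reasonable choice, not an existing convention in the codebase.

```ts
// Hypothetical helper: makes control characters visible and keeps diagnostics single-line.
function escapePathForDiagnostics(path: string, maxLength = 200): string {
  const escaped = path.replace(/[\u0000-\u001f\u007f-\u009f\\]/g, char => {
    if (char === '\\') {
      // Double backslashes so escaped output cannot be confused with a literal escape.
      return '\\\\'
    }
    const hex = char.charCodeAt(0).toString(16)
    return '\\u' + ('0000' + hex).slice(-4)
  })
  // Cap the length so a hostile path cannot flood the diagnostics.
  return escaped.length > maxLength ? escaped.slice(0, maxLength) + '...' : escaped
}
```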

### Serialization And Backward Compatibility

- Older desktop builds may see new diagnostics but should not require a new
  event schema to render metadata-only fallback.
- Missing `snapshotSource` should not crash review rendering.
- Missing `snapshotId` should not crash review rendering.
- Unknown `evidenceProof` values should be treated as unsafe by review safety
  helpers.
- JSON serialization must preserve empty string content.
- JSON serialization must distinguish absent file from empty file.
- Large content omitted by limits must serialize as unavailable state, not empty
  content.
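
The absent, empty, and unavailable cases stay distinct through JSON if they are modeled as separate shapes rather than as `null` or missing strings. In the sketch below, the `exists` and `unavailableReason` fields follow the fixture shape used in this plan. Carrying `content` on the state object, and the names `SerializedFileState` and `describeState`, are assumptions for illustration.

```ts
// Hypothetical serialized state: three shapes that JSON keeps apart.
type SerializedFileState =
  | { exists: false }
  | { exists: true; content: string }
  | { exists: true; unavailableReason: string }

type StateKind = 'absent' | 'empty' | 'text' | 'unavailable'

function describeState(state: SerializedFileState): StateKind {
  if (!state.exists) return 'absent'
  // Unavailable content (for example omitted by limits) is never treated as empty.
  if ('unavailableReason' in state) return 'unavailable'
  return state.content.length === 0 ? 'empty' : 'text'
}

// Round trip through JSON the way a ledger event would be stored and reloaded.
function roundTrip(state: SerializedFileState): SerializedFileState {
  return JSON.parse(JSON.stringify(state)) as SerializedFileState
}
```

The empty string survives `JSON.stringify`, while an `undefined` content field would be dropped, which is why the unavailable case carries an explicit reason instead of a missing value.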

## Test Plan

### Risk To Test Traceability

| Risk | Required test/smoke |
| --- | --- |
| Cross-task contamination | real-data smoke with at least two tasks in one OpenCode session |
| Cross-member contamination | fixture with shared profile but different member/lane |
| Wrong snapshot window | unit test with overlapping windows and outside-window toolpart |
| False baseline from current disk | unit test proving current disk is never consulted |
| Unsafe warning removal | unit test with unrelated `manual-only` warning preserved |
| Duplicate imported events | repeated-backfill bridge test |
| Performance regression | smoke budget with snapshot read counters |
| Unsupported OpenCode shape | snapshot provider unsupported-shape test |
| Mixed safe/unsafe task | desktop integration test for `Reject All` skipping metadata-only |
| Cache stale result | bridge or desktop worker test bypassing/invalidating cache deliberately |
| Capability false positive | fixture with snapshot enabled but missing store object |
| Shadow mode mutation | fingerprint comparison between `off` and `shadow` |
| Snapshot retention loss | fixture where window exists but object read fails |
| Execution conflict bypass | desktop/review test where current disk differs from expected after |
| Memory/storage blowup | fixture with many small files exceeding total byte budget |
| Malformed OpenCode rows | offline reader/reconstructor fixture with malformed part JSON |

### Negative Control Fixtures

Negative controls are cases that look close to valid proof but must not upgrade.

Required negative controls:

- Same file path, same member, but different task id.
- Same task id, same file path, but different member/lane.
- Same session and file path, but toolpart outside the snapshot window.
- Snapshot before/after text exists, but `oldString` occurs twice.
- Snapshot after equals toolpart content, but before is unavailable.
- Snapshot path matches, but operation is rename or mode-only.
- Current disk matches expected after, but snapshot before is missing.
- `shadow` computes an upgrade decision, but imported fingerprint matches `off`.
- Existing metadata-only event appears before upgraded event with same source key.
- Unknown `evidenceProof` appears in imported data.

Each negative control should assert both behavior and diagnostic reason. A
negative control without a reason is hard to debug and easy to regress.

### Golden Fixture Coverage Matrix

Maintain a small set of golden fixtures that cover the supported state space.

| Fixture | Mode | Expected |
| --- | --- | --- |
| write-create-text | `single-change` | upgraded create |
| write-modify-text | `single-change` | upgraded modify |
| edit-modify-once | `single-change` | upgraded modify |
| delete-text | `single-change` | upgraded delete |
| duplicate-old-string | `single-change` | metadata-only |
| missing-before | `single-change` | metadata-only |
| toolpart-outside-window | `single-change` | metadata-only |
| shadow-valid-edit | `shadow` | would-upgrade stats, original import |
| non-opencode-task | all modes | unchanged fingerprint |
| missing-snapshot-object | all apply modes | metadata-only |
| multi-change-chain | `single-change` | skipped |
| multi-change-chain | `full` | upgraded only if chain verifies |

Golden fixtures should be tiny and deterministic. They should not depend on
wall-clock time, filesystem case behavior, or the user's current worktree.

### Unit Tests

Add or extend `OpenCodeChangeEvidenceEnricher.test.ts`.

Tests:

1. Upgrades metadata-only edit from exact snapshot before/after.
2. Does not upgrade edit when `oldString` appears twice.
3. Does not upgrade edit when snapshot after does not equal applied result.
4. Upgrades write create with before absent and after text.
5. Upgrades write modify when toolpart content equals snapshot after.
6. Does not upgrade write modify when toolpart content differs from snapshot after.
7. Upgrades delete with before text and after absent.
8. Does not remove attribution warnings after content proof.
9. Keeps manual-only warning when anchor has unavailable before/after content.
10. Multi-edit same-path chain upgrades only when the whole chain verifies.
11. Multi-edit same-path chain keeps all metadata-only fallbacks when one step is ambiguous.
12. Snapshot provider unavailable keeps current behavior.
13. Does not remove unrelated `manual-only` warning text.
14. Keeps task boundary warnings after successful content proof.
15. Does not upgrade compatible attribution mode.
16. Does not upgrade when feature flag is `off`.
17. Single-change mode skips multi-change chain upgrade.
18. Empty file create and empty file modify are handled as valid text.
19. CRLF/LF mismatch fails proof instead of normalizing.
20. Duplicate source import keys block chain upgrade.
21. Empty `newString` deletion upgrades only with exact single occurrence.
22. Empty `oldString` never upgrades.
23. State hashes must match emitted full text.
24. Snapshot anchor duplicate path entry skips upgrade.
25. `write` no-op does not create a misleading diff.
26. Skipped proof preserves the original change object fields.
27. Successful proof mutates only allowed fields.
28. Snapshot proof decision returns typed skipped reason, not `null`.
29. Unsupported snapshot shape never upgrades.
30. Existing-event policy defaults to `new-imports-only` when dedupe is unknown.
31. State machine cannot jump from candidate to upgraded without transition verification.
32. Unstable part order blocks multi-change upgrade.
33. Unknown `evidenceProof` is unsafe in review safety helper.
34. Empty string survives materialization and serialization.
35. Absent file is not serialized as empty file.
36. `shadow` mode computes proof stats but returns original changes.
37. `mayApplySnapshotProof` blocks multi-change groups in `single-change`.
38. Exhaustive switches fail compilation when a new mode/proof decision is not handled.
39. Capability success is required before snapshot proof attempt.
40. Missing snapshot git-store object keeps metadata-only fallback.
41. `off` and `shadow` review bundle fingerprints match.
42. Non-OpenCode fingerprints are identical across all modes.
43. Review-safe upgraded change still fails reject execution when current disk mismatches expected after.
44. Total byte budget skips excess files as metadata-only.
45. Malformed OpenCode part JSON cannot produce upgraded proof.
46. LFS pointer text is not dereferenced.
47. Submodule paths stay metadata-only unless explicitly modeled.
48. Runtime postcondition failure preserves original metadata-only change.
49. Every golden fixture has a paired negative control.
50. Minimum safe scope excludes unsupported operation shapes.

Example fixture shape:

```ts
const change: ReconstructedOpenCodeToolChange = {
  taskId: 'task-1',
  taskRef: 'task-1',
  taskRefKind: 'canonical',
  teamName: 'team',
  memberName: 'alice',
  sessionId: 'session',
  assistantMessageId: 'message-1',
  toolUseId: 'tool-1',
  sourcePartId: 'part-1',
  sourceMessageId: 'message-1',
  sourceTool: 'edit',
  sourceImportKey: 'session:part-1:src/app.ts',
  filePath: '/workspace/src/app.ts',
  relativePath: 'src/app.ts',
  beforeContent: null,
  afterContent: null,
  operation: 'modify',
  confidence: 'medium',
  attributionMethod: 'delivery-ledger-taskrefs',
  oldString: 'const value = 1',
  newString: 'const value = 2',
  beforeState: { exists: true, unavailableReason: 'opencode-edit-baseline-not-captured' },
  afterState: { exists: true, unavailableReason: 'opencode-edit-final-content-unavailable' },
  evidenceProof: 'metadata-only-fallback',
  warnings: ['OpenCode edit was captured without a proven full-text baseline; apply/reject is manual-only.'],
  timestamp: new Date(0).toISOString(),
}
```

### Synthetic Fixture Schema

Use a compact fixture builder so edge cases do not depend entirely on live
OpenCode data.

```ts
type SnapshotProofFixture = {
  name: string
  mode: SnapshotProofUpgradeMode
  attributionMode: OpenCodeLedgerAttributionMode
  delivery: {
    teamName: string
    taskId: string
    memberName: string
    laneId?: string
    sessionId: string
    assistantMessageId: string
  }
  windows: Array<{
    messageId: string
    windowId: string
    fromSnapshot: string
    toSnapshot: string
    startPartOrder: number
    finishPartOrder: number
  }>
  parts: Array<{
    partId: string
    messageId: string
    order: number
    tool: 'write' | 'edit' | 'apply_patch'
    filePath: string
    oldString?: string
    newString?: string
    content?: string
  }>
  snapshotFiles: Array<{
    relativePath: string
    beforeContent?: string
    afterContent?: string
    beforeExists: boolean
    afterExists: boolean
  }>
  expected: {
    upgraded: number
    metadataOnly: number
    diagnostics: string[]
  }
}
```

Fixture rules:

- Every positive fixture needs a paired negative fixture that differs by one
  proof condition.
- Fixtures should prefer tiny strings so failures are easy to inspect.
- Fixtures must include at least one empty string case and one absent-file case.
- Fixtures must include one path with unsafe characters for diagnostics escaping.
- Fixtures must not include secrets or large blobs.

### Snapshot Provider Tests

Extend `OpenCodeSnapshotEvidenceProvider.test.ts`.

Tests:

1. Groups only unresolved proof-needed changes into touched paths.
2. Emits diagnostic for missing window.
3. Emits diagnostic for ambiguous window.
4. Preserves existing limits.
5. Does not read snapshot for unrelated exact changes.
6. Does not match windows across assistant messages.
7. Emits diagnostic for extra snapshot paths not in reconstructed toolparts.
8. Emits timeout diagnostic while preserving metadata-only fallback.
9. Does not read snapshot windows with no unresolved touched paths.
10. Escapes unsafe path text in diagnostics.
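
Test 10 could be exercised against a helper along these lines. This is a sketch under assumptions: `escapePathForDiagnostics`, `missingWindowDiagnostic`, the `snapshot-window-missing` code, and the 200-character cap are hypothetical and not taken from the existing provider. It relies only on `JSON.stringify`, which escapes quotes, backslashes, and control characters in strings.

```ts
// Hypothetical cap so one hostile path cannot flood diagnostics output.
const MAX_DIAGNOSTIC_PATH_LENGTH = 200

// Returns a quoted, escaped form of the path that is safe to log on one line.
function escapePathForDiagnostics(path: string): string {
  const truncated =
    path.length > MAX_DIAGNOSTIC_PATH_LENGTH
      ? path.slice(0, MAX_DIAGNOSTIC_PATH_LENGTH) + '...'
      : path
  return JSON.stringify(truncated)
}

// Example diagnostic line; the message format is an assumption.
function missingWindowDiagnostic(relativePath: string): string {
  return `snapshot-window-missing path=${escapePathForDiagnostics(relativePath)}`
}
```

The diagnostic carries the path only, never file content, which matches the rule that content must not appear in diagnostics.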

### Ledger Bridge Tests

Extend `OpenCodeLedgerBridgeService` tests or add a focused fixture test.

Tests:

1. Backfill imports upgraded full-text event for strict delivery OpenCode edit.
2. Backfill keeps metadata-only event for compatible attribution.
3. Backfill keeps metadata-only event with no delivery context.
4. Imported event has stable source import key and dedupes on rerun.
5. `snapshotShapeFingerprint` is present when snapshot proof was used.
6. Repeated backfill does not duplicate file entries.
7. Old metadata-only imported event is not rewritten unless importer already supports superseding by source key.
8. Snapshot proof is not attempted for Codex or Anthropic members.
9. Snapshot proof is not attempted for OpenCode exact `toolpart-chain` changes.
10. Backfill cache does not return stale metadata-only data after an upgraded import in the same test.
11. Import failure leaves no partial safe-review state.

### Desktop Integration Tests

Add these only if needed. The desktop review UI already handles both full-text
and metadata-only events.

Smoke check:

1. Full-text OpenCode upgraded event renders a real diff.
2. Metadata-only event still renders manual-only warning.
3. Reject is enabled only for full-text safe baseline.
4. Warnings remain visible for task boundary uncertainty.
5. `Reject All` on a mixed task, where one OpenCode file upgraded and another stayed metadata-only, skips the metadata-only file.
6. Current disk preview remains read-only and does not become a reject baseline.
7. Viewed count is unchanged by proof upgrade.
8. Task-level boundary warning remains visible after all file diffs upgrade.
9. Reject execution still blocks when current disk no longer matches the expected
   after state.
10. Bulk `Reject All` rejects only files that pass both review-safety and
    execution-safety checks.

### Property-Like Tests

Add small table-driven tests for transition verification:

```ts
const editCases = [
  { name: 'single replacement', before: 'a = 1', oldString: '1', newString: '2', after: 'a = 2', ok: true },
  { name: 'duplicate old', before: 'a 1 b 1', oldString: '1', newString: '2', after: 'a 2 b 1', ok: false },
  { name: 'empty old', before: 'abc', oldString: '', newString: 'x', after: 'xabc', ok: false },
  { name: 'delete exactly once', before: 'abc', oldString: 'b', newString: '', after: 'ac', ok: true },
]
```

The point is not random fuzzing. The point is to make ambiguous replacement
rules explicit and hard to regress.
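
One way the table could be driven is a verifier that accepts an edit only when `oldString` is non-empty, matches exactly one position in `before`, and the single replacement reproduces `after` byte for byte. `verifyEditTransition` is a hypothetical name for illustration, not an existing helper.

```ts
type EditCase = {
  name: string
  before: string
  oldString: string
  newString: string
  after: string
  ok: boolean
}

// Returns true only for an unambiguous, exactly reproduced single replacement.
function verifyEditTransition(
  before: string,
  oldString: string,
  newString: string,
  after: string,
): boolean {
  if (oldString.length === 0) return false
  const first = before.indexOf(oldString)
  if (first === -1) return false
  // Comparing first and last match positions also rejects overlapping matches.
  if (first !== before.lastIndexOf(oldString)) return false
  const expected =
    before.slice(0, first) + newString + before.slice(first + oldString.length)
  // Exact comparison: no trimming and no line-ending normalization.
  return expected === after
}

const transitionCases: EditCase[] = [
  { name: 'single replacement', before: 'a = 1', oldString: '1', newString: '2', after: 'a = 2', ok: true },
  { name: 'duplicate old', before: 'a 1 b 1', oldString: '1', newString: '2', after: 'a 2 b 1', ok: false },
  { name: 'empty old', before: 'abc', oldString: '', newString: 'x', after: 'xabc', ok: false },
  { name: 'delete exactly once', before: 'abc', oldString: 'b', newString: '', after: 'ac', ok: true },
]

const transitionFailures = transitionCases
  .filter((c) => verifyEditTransition(c.before, c.oldString, c.newString, c.after) !== c.ok)
  .map((c) => c.name)
```

In a test runner, each case would become one named test, and `transitionFailures` would be asserted empty.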

### Real Data Smoke

Before implementation, capture a baseline:

```bash
time pnpm test --run test/main/services/team/TaskChangeComputer.test.ts
time pnpm test --run test/main/services/team/ChangeExtractorService.test.ts
```

After implementation, run the same commands, again under `time`, so the runtimes
can be compared against the baseline:

```bash
time pnpm test --run test/main/services/team/TaskChangeComputer.test.ts
time pnpm test --run test/main/services/team/ChangeExtractorService.test.ts
```

Then run the existing real-data smoke scripts used for task changes. Required
checks:

- `errors: 0`
- no increase in item errors
- no cross-task file leakage
- no increase in metadata-only count for OpenCode tasks
- no change for Codex-only teams
- no change for Anthropic-only teams
- broad smoke runtime increase <= 10%
- snapshot timeout count <= 2
- upgraded OpenCode full-text count is explainable by diagnostics
- no decrease in task-boundary warnings unless task-boundary code changed separately
- `off` and `shadow` fingerprints match except diagnostics/stats
- non-OpenCode fingerprints match in all modes

Manual target cases:

- `relay-works-3/#1f735bea`
- `relay-works-3/#bf01e5c3`
- `relay-works-3/#43e6b9b0` should remain Codex-related, not OpenCode-upgraded
- `signal-ops-22` should remain unaffected because it has no OpenCode members
- any OpenCode team with real `snapshotShapeFingerprint` present in diagnostics
- one team with missing/reset delivery ledger, if available

Add at least one synthetic OpenCode snapshot fixture if real data lacks a clean
single-change full-text snapshot case. Real data validates integration, but a
synthetic fixture is better for precise edge cases.

Real-data smoke should compare before/after summaries:

```text
Before:
- OpenCode metadata-only file changes: N
- OpenCode full-text file changes: M
- non-OpenCode full-text file changes: X
- task-boundary warnings: B

After:
- OpenCode metadata-only file changes: <= N
- OpenCode full-text file changes: >= M
- non-OpenCode full-text file changes: X
- task-boundary warnings: B
```

Any non-OpenCode count change is a blocker.
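
The before/after comparison can be automated with a small check like the one below. It is a sketch only: `SmokeSummary`, its field names, and `findSmokeBlockers` are hypothetical, and how the counts are collected from the smoke scripts is left open. Task-boundary warnings are flagged on decrease, following the required checks above.

```ts
type SmokeSummary = {
  openCodeMetadataOnly: number
  openCodeFullText: number
  nonOpenCodeFullText: number
  taskBoundaryWarnings: number
}

// Returns one message per violated expectation; an empty array means the
// summaries are consistent with the expected shape.
function findSmokeBlockers(before: SmokeSummary, after: SmokeSummary): string[] {
  const blockers: string[] = []
  if (after.openCodeMetadataOnly > before.openCodeMetadataOnly) {
    blockers.push('OpenCode metadata-only count increased')
  }
  if (after.openCodeFullText < before.openCodeFullText) {
    blockers.push('OpenCode full-text count decreased')
  }
  if (after.nonOpenCodeFullText !== before.nonOpenCodeFullText) {
    blockers.push('non-OpenCode full-text count changed')
  }
  if (after.taskBoundaryWarnings < before.taskBoundaryWarnings) {
    blockers.push('task-boundary warnings decreased')
  }
  return blockers
}
```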

### Failure Injection Tests

Add targeted failure injection where practical:

- Snapshot provider throws.
- Snapshot provider times out.
- Snapshot provider returns duplicate path entries.
- Ledger importer rejects the batch.
- Backfill runs twice with the same source import key.
- Feature flag changes from `single-change` to `off`.
- Snapshot proof succeeds for one file and fails for another file in the same task.

Expected result for every failure injection: original metadata-only safety is
preserved, no duplicate review rows, diagnostics explain the skip/failure.
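
The expected result can be expressed as a wrapper that either returns a new upgraded change or returns the original object untouched. The sketch below assumes a simplified change shape; `upgradeOrPreserve`, `ProofResult`, and the diagnostic codes are hypothetical. The warning text is the one used in the example fixture above.

```ts
type Change = {
  sourceImportKey: string
  beforeContent: string | null
  afterContent: string | null
  evidenceProof: string
  warnings: string[]
}

type ProofResult =
  | { status: 'proven'; beforeContent: string; afterContent: string }
  | { status: 'skipped'; reason: string }

const MANUAL_ONLY_WARNING =
  'OpenCode edit was captured without a proven full-text baseline; apply/reject is manual-only.'

// Converts a throwing provider into a skip. The error text is dropped on
// purpose so no file content can leak into diagnostics.
function tryProve(change: Change, prove: (c: Change) => ProofResult): ProofResult {
  try {
    return prove(change)
  } catch {
    return { status: 'skipped', reason: 'snapshot-proof-threw' }
  }
}

function upgradeOrPreserve(
  original: Change,
  prove: (c: Change) => ProofResult,
): { change: Change; diagnostics: string[] } {
  const result = tryProve(original, prove)
  if (result.status === 'skipped') {
    // Fail closed: hand back the same object, never a partially filled copy.
    return { change: original, diagnostics: [result.reason] }
  }
  return {
    change: {
      ...original,
      beforeContent: result.beforeContent,
      afterContent: result.afterContent,
      evidenceProof: 'opencode-snapshot',
      // Only the exact known warning is removed; anything else is kept.
      warnings: original.warnings.filter((w) => w !== MANUAL_ONLY_WARNING),
    },
    diagnostics: [],
  }
}
```

Building the upgraded change as a new object means a failure at any point cannot leave the original half-modified.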

### Serialization Tests

Add tests around the task-change event materialization boundary:

- `beforeContent: ''` remains empty string.
- `afterContent: ''` remains empty string.
- `beforeContent: null` remains unavailable/absent according to state.
- Unknown `evidenceProof` does not make a file rejectable.
- Snapshot fields survive import/export if present.
- Snapshot fields may be absent without renderer crashes.
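
The empty-versus-absent rules usually break on truthiness checks such as `content || null`. The sketch below shows the distinction with a simplified state shape. `SideState`, `hasContent`, and `isReviewRejectable` are hypothetical names, and the allow-list of proof values is an assumption based on the `toolpart-chain` and `opencode-snapshot` names used elsewhere in this plan.

```ts
type SideState = { exists: boolean; content: string | null }

// Availability is decided by `!== null`, never by truthiness, so '' counts
// as real (empty) content.
function hasContent(side: SideState): boolean {
  return side.content !== null
}

// Stand-in for the import/export boundary; JSON keeps '' and null distinct.
function roundTrip<T>(value: T): T {
  return JSON.parse(JSON.stringify(value)) as T
}

// Assumed allow-list: any proof value not listed here is treated as unproven.
const KNOWN_FULL_TEXT_PROOFS: ReadonlySet<string> = new Set([
  'toolpart-chain',
  'opencode-snapshot',
])

function isReviewRejectable(event: {
  evidenceProof: string
  before: SideState
  after: SideState
}): boolean {
  if (!KNOWN_FULL_TEXT_PROOFS.has(event.evidenceProof)) return false
  // A side that exists must carry content; an absent file needs none.
  const beforeKnown = !event.before.exists || hasContent(event.before)
  const afterKnown = !event.after.exists || hasContent(event.after)
  return beforeKnown && afterKnown
}
```

This covers review safety only; execution safety still needs its own check against current disk state at apply/reject time.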

### Manual QA Runbook

Manual QA is not a substitute for tests, but it helps catch integration mistakes.

Prepare:

1. Pick one OpenCode team with snapshot evidence.
2. Pick one Codex-only or Anthropic-only team.
3. Record before counts for:
   - OpenCode metadata-only files.
   - OpenCode full-text files.
   - non-OpenCode full-text files.
   - task-boundary warnings.
   - snapshot proof diagnostics.

Run with `off`:

```bash
OPENCODE_SNAPSHOT_PROOF_UPGRADE=off pnpm test --run test/main/services/team/TaskChangeComputer.test.ts
```

Expected:

- No new upgraded OpenCode snapshot events.
- Existing exact toolpart-chain behavior unchanged.

Run with `shadow`:

```bash
OPENCODE_SNAPSHOT_PROOF_UPGRADE=shadow pnpm test --run test/main/services/team/TaskChangeComputer.test.ts
```

Expected:

- Snapshot proof stats are emitted.
- Would-upgrade counts are visible.
- Imported/reviewed changes are identical to `off`.
- Any difference from `off` outside diagnostics is a blocker.

Run with `single-change`:

```bash
OPENCODE_SNAPSHOT_PROOF_UPGRADE=single-change pnpm test --run test/main/services/team/TaskChangeComputer.test.ts
```

Expected:

- OpenCode full-text count may increase.
- OpenCode metadata-only count may decrease or stay equal.
- non-OpenCode counts are unchanged.
- Multi-change groups are skipped with diagnostics.

Run with `full` only after tests pass:

```bash
OPENCODE_SNAPSHOT_PROOF_UPGRADE=full pnpm test --run test/main/services/team/TaskChangeComputer.test.ts
```

Expected:

- Same guarantees as `single-change`.
- Multi-change upgrades appear only when diagnostics can explain the full chain.

UI spot check:

- Open a mixed task with one upgraded file and one metadata-only file.
- Verify the upgraded file shows a diff.
- Verify the metadata-only file still shows a warning.
- Verify `Reject All` skips metadata-only files.
- Verify current disk preview is not treated as baseline.
- Verify task boundary warnings remain if present.

Any mismatch is a blocker.
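
The four runs above map onto a small mode table. The sketch below assumes the `SnapshotProofUpgradeMode` union has exactly the four values used in this runbook, that unknown or missing values fall back to `off`, and that `off` emits no proof stats; all three are assumptions, as are the function names.

```ts
type SnapshotProofUpgradeMode = 'off' | 'shadow' | 'single-change' | 'full'

// Unknown, empty, or missing values fall back to 'off' (assumed fail-closed default).
function parseSnapshotProofUpgradeMode(raw: string | undefined): SnapshotProofUpgradeMode {
  switch (raw) {
    case 'off':
    case 'shadow':
    case 'single-change':
    case 'full':
      return raw
    default:
      return 'off'
  }
}

type ModeBehavior = {
  emitsStats: boolean
  appliesSingleChange: boolean
  appliesMultiChange: boolean
}

function describeMode(mode: SnapshotProofUpgradeMode): ModeBehavior {
  switch (mode) {
    case 'off':
      return { emitsStats: false, appliesSingleChange: false, appliesMultiChange: false }
    case 'shadow':
      return { emitsStats: true, appliesSingleChange: false, appliesMultiChange: false }
    case 'single-change':
      return { emitsStats: true, appliesSingleChange: true, appliesMultiChange: false }
    case 'full':
      return { emitsStats: true, appliesSingleChange: true, appliesMultiChange: true }
    default: {
      // Adding a union member without a case above fails to compile here.
      const unreachable: never = mode
      throw new Error(`unhandled mode: ${String(unreachable)}`)
    }
  }
}
```

A caller would pass the raw value of `OPENCODE_SNAPSHOT_PROOF_UPGRADE` into `parseSnapshotProofUpgradeMode`. The `never` assignment is what makes a new mode a compile error instead of a silently permissive default.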

## Acceptance Criteria

The implementation is acceptable only if all are true:

- Only OpenCode behavior changed.
- Strict delivery remains required for snapshot full-text upgrades.
- Exact existing `toolpart-chain` behavior is unchanged.
- Metadata-only fallback still works.
- No current disk content is used as historical proof.
- No broad OpenCode session scan is introduced.
- Snapshot read limits are unchanged or narrower.
- Ambiguous chains keep warnings.
- Large and binary files keep warnings.
- Tests cover same-path multi-change chains.
- Real-data smoke shows no cross-task leakage.
- Feature flag can disable the upgrade.
- Repeated backfill does not duplicate review files.
- Warning removal is limited to known resolved warning predicates.
- Performance budgets pass.
- The implementation has an explicit fallback for unsupported OpenCode snapshot shapes.
- The implementation includes Phase 0 contract audit notes in the PR/commit
  description or test output.
- No warning is removed unless a unit test names that exact warning or predicate.
- No current-disk preview path is involved in a proof decision.
- No behavior change occurs when `OPENCODE_SNAPSHOT_PROOF_UPGRADE=off`.
- Smoke output includes attempted/upgraded/skipped counts.
- Full mode is not enabled while any known unknown remains unresolved.
- Existing metadata-only events are not rewritten unless source-key supersede is
  proven by tests.
- Cache behavior is documented in the Phase 0 audit.
- Composite proof identity is enforced before snapshot text is trusted.
- Toolpart ordering is explicitly verified before multi-change upgrades.
- `single-change` and `full` have separate definitions of done.
- Serialization preserves empty string versus absent file.
- `shadow` mode proves expected upgrades without changing imported review events.
- Exhaustive handling covers every proof decision and feature flag mode.
- Capability gates are checked per session, not inferred from config alone.
- Missing/pruned snapshot store objects preserve metadata-only fallback.
- Deterministic fingerprints prove non-OpenCode behavior is unchanged.
- Apply/reject execution safety still checks current disk state after review
  proof succeeds.
- Storage and memory budgets are enforced without duplicate blob storage.
- Malformed/truncated OpenCode rows cannot produce upgraded proof.
- First apply rollout stays within the minimum safe scope.
- Negative controls prove close-but-invalid cases remain metadata-only.
- Runtime postcondition failures preserve original changes.

## Verification Command Matrix

Use the narrowest useful commands first, then broader smoke.

| Layer | Command or check | Required result |
| --- | --- | --- |
| Typecheck | `pnpm typecheck` | passes |
| Enricher unit | targeted `OpenCodeChangeEvidenceEnricher` tests | passes |
| Snapshot provider | targeted `OpenCodeSnapshotEvidenceProvider` tests | passes |
| Ledger bridge | targeted `OpenCodeLedgerBridgeService` tests | passes |
| Desktop review safety | targeted review/rejectability tests | passes |
| Off mode | task-change tests with `OPENCODE_SNAPSHOT_PROOF_UPGRADE=off` | old behavior |
| Shadow mode | task-change tests with `OPENCODE_SNAPSHOT_PROOF_UPGRADE=shadow` | stats only |
| Single-change mode | task-change tests with `OPENCODE_SNAPSHOT_PROOF_UPGRADE=single-change` | only one-change upgrades |
| Full mode | task-change tests with `OPENCODE_SNAPSHOT_PROOF_UPGRADE=full` | only after chain tests |
| Real data | existing task-change smoke on OpenCode and non-OpenCode teams | no leakage |

Do not use `full` smoke as a substitute for single-change smoke. They prove
different safety boundaries.

## Code Review Checklist

Use this checklist before merging the implementation:

- Every upgraded change has a non-`metadata-only-fallback` proof.
- Every upgraded modify has both `beforeContent` and `afterContent`.
- Every upgraded create has `beforeState.exists === false` and `afterContent`.
- Every upgraded delete has `beforeContent` and `afterState.exists === false`.
- State hashes match emitted content.
- No branch reads current disk as proof.
- No branch catches an error and upgrades anyway.
- Every skipped branch preserves the original change.
- Warning stripping uses a central predicate.
- Multi-change mode can be disabled independently.
- Snapshot reader limits are unchanged or narrower.
- Tests include at least one negative case for every positive upgrade case.
- Real-data smoke includes at least one OpenCode team and one non-OpenCode team.
- No new IPC or filesystem read path bypasses existing workspace trust checks.
- No content appears in diagnostics, metrics, or thrown error messages.
- `off` mode is covered by a test and is easy to use during rollback.
- The proof logic is structured so forbidden state transitions are not possible
  without an obvious code review smell.
- `shadow` mode has been run on real data before any apply mode is enabled.
- Any new union member requires an exhaustive switch update, not a permissive
  default branch.
- Review-safe and execution-safe are checked separately.
- Large-file and total-byte budget tests prove metadata-only fallback.
- Minimum safe scope is visible in code structure, not only in tests.
- Negative controls exist for task, member, window, baseline, and operation
  mismatch.
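
The first four checklist items are mechanical enough to assert in a test helper. The sketch below assumes a simplified change shape and the operation names `create`, `modify`, and `delete`; `findUpgradeViolations` is a hypothetical name, and the hash check is left out.

```ts
type UpgradedChange = {
  operation: 'create' | 'modify' | 'delete'
  evidenceProof: string
  beforeContent: string | null
  afterContent: string | null
  beforeState: { exists: boolean }
  afterState: { exists: boolean }
}

// Returns a message for each checklist item the change violates.
function findUpgradeViolations(change: UpgradedChange): string[] {
  const violations: string[] = []
  if (change.evidenceProof === 'metadata-only-fallback') {
    violations.push('upgraded change still has metadata-only-fallback proof')
  }
  // Content checks use `=== null` so an empty string still counts as content.
  if (change.operation === 'modify') {
    if (change.beforeContent === null || change.afterContent === null) {
      violations.push('modify needs both beforeContent and afterContent')
    }
  }
  if (change.operation === 'create') {
    if (change.beforeState.exists || change.afterContent === null) {
      violations.push('create needs an absent before state and afterContent')
    }
  }
  if (change.operation === 'delete') {
    if (change.beforeContent === null || change.afterState.exists) {
      violations.push('delete needs beforeContent and an absent after state')
    }
  }
  return violations
}
```

Running this over every change emitted by a positive fixture gives reviewers a quick signal before reading the proof logic itself.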

## Implementation Anti-Patterns

Do not implement the feature using these patterns:

- A broad `try/catch` that returns an upgraded change on partial data.
- Mutating the original change object in place before proof has succeeded.
- Removing warnings before the final proof decision.
- Reading current disk to fill `beforeContent`.
- Comparing normalized line endings for proof.
- Treating a matching hash as content.
- Creating OpenCode-specific rejectability logic in the renderer.
- Appending upgraded duplicate events and expecting UI sorting to hide stale ones.
- Increasing snapshot limits to make a test pass.
- Falling back from strict delivery to compatible attribution for safety.
- Adding `full` mode as the default in the same change that introduces it.
- Treating empty string as missing content.
- Treating missing content as empty string.
- Sorting same-path chains by only one field.
- Retrying snapshot reads in a loop without a budget.
- Treating review-safe as automatically execution-safe.
- Adding a second blob store or cache for before/after content.
- Dereferencing Git LFS or submodule content outside the existing snapshot reader.
- Using malformed partial OpenCode JSON rows as proof context.
- Expanding the first apply mode to unsupported operations because the snapshot
  text happens to be available.
- Hiding uncertainty by changing user-facing wording from warning to success.

## Rollout Strategy

1. Land diagnostics and helper functions with no behavior change if practical.
2. Add the feature flag with a default of `off`, and override it in tests where needed.
3. Add snapshot-first upgrade for single-change same-path cases.
4. Run targeted tests and real-data smoke.
5. Enable `single-change` mode for local smoke.
6. Add multi-change chain upgrade only after tests are solid.
7. Move to `full` mode only if multi-change smoke is clean.
8. Inspect warnings before and after for OpenCode tasks.

If multi-change support looks risky during implementation, stop after
single-change mode. Single-change upgrade is already useful and lower risk.

Recommended shipping sequence:

```text
PR 1: diagnostics + eligibility helpers + no behavior change
PR 2: single-change snapshot proof upgrade behind flag
PR 3: enable single-change by default for OpenCode strict delivery
PR 4: multi-change chain upgrade behind flag
PR 5: enable full mode only after real-data smoke
```

If this stays as one PR, keep the same commit structure locally and verify each
step before moving to the next one.

## Abort Conditions

Do not continue implementation if any of these happens:

- Snapshot windows cannot be reliably matched to toolparts.
- The OpenCode snapshot shape found in real data differs from the shape the tests assume.
- Real-data smoke shows any new cross-task file leakage.
- Performance smoke shows repeated timeouts.
- A change would require using current disk as proof.
- A change would require broad compatible attribution scanning.
- Warning stripping needs broad substring matching to pass tests.
- Multi-change support requires accepting ambiguous edit/apply-patch replacements.
- Source import key dedupe behavior is unclear.
- The only available validation is manual UI inspection.
- A test has to assert against current wall-clock timing without a stable budget.
- Formal proof predicates require exceptions to support the first implementation.
- Postconditions fail for any positive fixture.
- `single-change` mode needs multi-change assumptions to pass.
- `full` mode needs renderer-specific special cases to appear safe.
- Rollback with `OPENCODE_SNAPSHOT_PROOF_UPGRADE=off` does not restore old
  behavior for new backfills.
- Empty content and missing content cannot be distinguished at serialization.

## Open Questions Template For Implementation PR

Every implementation PR should answer these in its description:

```text
OpenCode snapshot proof PR checklist:
- Mode implemented: diagnostics | single-change | full
- Default mode:
- Phase 0 contract audit completed: yes | no
- Source import key duplicate policy:
- Review bundle dedupe key:
- Rejectability helper:
- Existing event policy: new-imports-only | supersede-by-source-key
- Snapshot shape fingerprint observed:
- Real-data teams tested:
- Non-OpenCode teams unchanged: yes | no
- Snapshot proof stats:
- Rollback tested with OPENCODE_SNAPSHOT_PROOF_UPGRADE=off: yes | no
- Known unknowns remaining:
```

If the PR cannot answer one of these, it should not enable new behavior by
default.

## Example Final Change Shape

Before upgrade, metadata-only edit:

```json
{
  "sourceTool": "edit",
  "before": {
    "exists": true,
    "unavailableReason": "opencode-edit-baseline-not-captured"
  },
  "after": {
    "exists": true,
    "unavailableReason": "opencode-edit-final-content-unavailable"
  },
  "evidenceProof": "metadata-only-fallback",
  "warnings": [
    "OpenCode edit was captured without a proven full-text baseline; apply/reject is manual-only."
  ]
}
```

After verified snapshot upgrade:

```json
{
  "sourceTool": "edit",
  "before": {
    "exists": true,
    "sha256": "before-hash",
    "sizeBytes": 128
  },
  "after": {
    "exists": true,
    "sha256": "after-hash",
    "sizeBytes": 128
  },
  "evidenceProof": "opencode-snapshot",
  "snapshotSource": "opencode",
  "warnings": []
}
```

If any proof check fails, the event must stay in the first shape.

## Notes for Future Maintainers

The important invariant is not "fewer warnings". The invariant is "warnings are
removed only when the system has stronger evidence than before".

Warnings are correct when historical full text is not proven. A warning is a
better outcome than an unsafe reject button.