about things
at main 147 lines 5.3 kB view raw view rendered
1# the inversion algorithm 2 3how `verifyCommitDiff` proves a commit is a valid state transition. 4 5## the claim being verified 6 7given: 8- a signed commit with MST root `data_cid` (the "after" state) 9- a CAR containing the commit block + changed MST nodes + new records 10- a list of operations (create/update/delete with paths and CIDs) 11- a previous MST root `prev_data` (the "before" state, from the preceding commit) 12 13prove: the operations, applied to `prev_data`, produce `data_cid`. 14 15the trick: instead of *applying* ops forward (which would require the full previous tree), *invert* them on the new tree and check if you get back to the old root. 16 17## why inversion instead of forward application 18 19forward application would require: 201. the full previous MST (potentially millions of entries) 212. applying each op to produce the new tree 223. comparing the new root against `data_cid` 23 24inversion requires: 251. only the partial MST from the CAR (changed nodes only) 262. reversing each op on the new tree 273. comparing the inverted root against `prev_data` 28 29the CAR already contains the new partial tree. unchanged subtrees become stubs — their CIDs are known but their contents aren't loaded. this is the key insight: you only need blocks for the parts that changed. 30 31## step by step 32 33source: `repo_verifier.zig:387-469` 34 35### 1. parse CAR, extract commit 36 37``` 38loadCommitFromCAR(blocks) → { commit, unsigned_bytes, repo_car } 39``` 40 41the commit is CBOR with fields: `did`, `version`, `data` (MST root CID), `rev`, `prev`, `sig`. the unsigned bytes are re-encoded without `sig` for signature verification. 42 43### 2. verify signature 44 45``` 46verify(unsigned_bytes, commit.sig, public_key) → ok | error 47``` 48 49proves the commit came from the account's signing key. this is independent of MST — it proves authorship. 50 51### 3. load partial MST from CAR 52 53``` 54Mst.loadFromBlocks(repo_car, commit.data_cid) → partial tree 55``` 56 57source: `mst.zig:459-480` 58 59walks from the root block, loading nodes whose blocks exist in the CAR. blocks not found become `ChildRef.stub` — a CID placeholder. the result is a tree where: 60- changed paths have fully loaded nodes 61- unchanged subtrees are stubs (just their hash) 62 63### 4. copy tree for inversion 64 65``` 66tree.copy() → inverted 67``` 68 69deep clone. stubs stay as stubs. the original tree is preserved. 70 71### 5. normalize operations 72 73``` 74normalizeOps(msg_ops) → sorted_ops 75``` 76 77source: `mst.zig:773-800` 78 79- sort: **deletions first**, then by path lexicographically 80- reject duplicates (same path twice → `DuplicatePath`) 81 82deletions first matters: if you invert a create before a delete on related paths, you might touch a node that should already be gone. ordering ensures deterministic application. 83 84### 6. invert each operation 85 86source: `mst.zig:804-820` 87 88for each op in sorted order: 89 90| forward op | inverse action | verification | 91|---|---|---| 92| **create**(path, cid) | `deleteReturn(path)` | removed value must equal `cid` | 93| **update**(path, new_cid, prev_cid) | `putReturn(path, prev_cid)` | displaced value must equal `new_cid` | 94| **delete**(path, prev_cid) | `putReturn(path, prev_cid)` | path must not already exist (displaced == null) | 95 96each step is self-checking. if any assertion fails → `InversionMismatch`. 97 98### 7. compute inverted root 99 100``` 101inverted.rootCid() → cid 102``` 103 104traverses the (now-inverted) tree bottom-up. loaded nodes are serialized as DAG-CBOR and SHA-256 hashed. stubs contribute their known CIDs directly — this is safe because we trust the previous state. 105 106if inversion needs to descend into a stub (operation touches an unchanged subtree), it fails with `PartialTree` — the CAR didn't include enough context. 107 108### 8. compare 109 110``` 111inverted_root == prev_data → valid transition 112``` 113 114if they match, the operations *fully explain* the state change from the old root to the new root. no hidden mutations, no missing ops, no reordering. 115 116## what stubs mean for the proof 117 118stubs are the mechanism that makes partial verification possible: 119 120``` 121ChildRef = union(enum) { 122 none, — no child (leaf boundary) 123 node: *Node, — loaded from CAR, fully traversable 124 stub: cbor.Cid, — CID known, content trusted 125} 126``` 127 128when computing `rootCid()`: 129- `node` → serialize, hash, produce CID 130- `stub` → use the CID as-is (it's trusted from previous verification) 131- `none` → encoded as null in CBOR 132 133the proof chain holds because each stub was once a verified node in a previous commit. by induction, every stub's CID is correct. 134 135## error taxonomy 136 137| error | meaning | response | 138|---|---|---| 139| `SignatureVerificationFailed` | wrong key or tampered commit | reject, re-resolve DID | 140| `PrevDataMismatch` | ops don't explain the transition | chain broken, re-sync | 141| `InversionMismatch` | individual op inconsistent with tree state | reject commit | 142| `PartialTree` | CAR missing blocks needed for inversion | incomplete data | 143| `DuplicatePath` | same path appears in multiple ops | malformed commit | 144| `InvalidMstNode` | CBOR structure of MST node is wrong | corrupted data | 145| `InvalidCommit` | commit object malformed or wrong version | corrupted data | 146 147the first three are the interesting ones — they distinguish between identity fraud (signature), history fraud (prevData), and operation fraud (inversion).