about things
1# the inversion algorithm
2
3how `verifyCommitDiff` proves a commit is a valid state transition.
4
5## the claim being verified
6
7given:
8- a signed commit with MST root `data_cid` (the "after" state)
9- a CAR containing the commit block + changed MST nodes + new records
10- a list of operations (create/update/delete with paths and CIDs)
11- a previous MST root `prev_data` (the "before" state, from the preceding commit)
12
13prove: the operations, applied to `prev_data`, produce `data_cid`.
14
15the trick: instead of *applying* ops forward (which would require the full previous tree), *invert* them on the new tree and check if you get back to the old root.
16
17## why inversion instead of forward application
18
19forward application would require:
201. the full previous MST (potentially millions of entries)
212. applying each op to produce the new tree
223. comparing the new root against `data_cid`
23
24inversion requires:
251. only the partial MST from the CAR (changed nodes only)
262. reversing each op on the new tree
273. comparing the inverted root against `prev_data`
28
29the CAR already contains the new partial tree. unchanged subtrees become stubs — their CIDs are known but their contents aren't loaded. this is the key insight: you only need blocks for the parts that changed.
30
31## step by step
32
33source: `repo_verifier.zig:387-469`
34
35### 1. parse CAR, extract commit
36
37```
38loadCommitFromCAR(blocks) → { commit, unsigned_bytes, repo_car }
39```
40
41the commit is CBOR with fields: `did`, `version`, `data` (MST root CID), `rev`, `prev`, `sig`. the unsigned bytes are re-encoded without `sig` for signature verification.
42
43### 2. verify signature
44
45```
46verify(unsigned_bytes, commit.sig, public_key) → ok | error
47```
48
49proves the commit came from the account's signing key. this is independent of MST — it proves authorship.
50
51### 3. load partial MST from CAR
52
53```
54Mst.loadFromBlocks(repo_car, commit.data_cid) → partial tree
55```
56
57source: `mst.zig:459-480`
58
59walks from the root block, loading nodes whose blocks exist in the CAR. blocks not found become `ChildRef.stub` — a CID placeholder. the result is a tree where:
60- changed paths have fully loaded nodes
61- unchanged subtrees are stubs (just their hash)
62
63### 4. copy tree for inversion
64
65```
66tree.copy() → inverted
67```
68
69deep clone. stubs stay as stubs. the original tree is preserved.
70
71### 5. normalize operations
72
73```
74normalizeOps(msg_ops) → sorted_ops
75```
76
77source: `mst.zig:773-800`
78
79- sort: **deletions first**, then by path lexicographically
80- reject duplicates (same path twice → `DuplicatePath`)
81
82deletions first matters: if you invert a create before a delete on related paths, you might touch a node that should already be gone. ordering ensures deterministic application.
83
84### 6. invert each operation
85
86source: `mst.zig:804-820`
87
88for each op in sorted order:
89
90| forward op | inverse action | verification |
91|---|---|---|
92| **create**(path, cid) | `deleteReturn(path)` | removed value must equal `cid` |
93| **update**(path, new_cid, prev_cid) | `putReturn(path, prev_cid)` | displaced value must equal `new_cid` |
94| **delete**(path, prev_cid) | `putReturn(path, prev_cid)` | path must not already exist (displaced == null) |
95
96each step is self-checking. if any assertion fails → `InversionMismatch`.
97
98### 7. compute inverted root
99
100```
101inverted.rootCid() → cid
102```
103
104traverses the (now-inverted) tree bottom-up. loaded nodes are serialized as DAG-CBOR and SHA-256 hashed. stubs contribute their known CIDs directly — this is safe because we trust the previous state.
105
106if inversion needs to descend into a stub (operation touches an unchanged subtree), it fails with `PartialTree` — the CAR didn't include enough context.
107
108### 8. compare
109
110```
111inverted_root == prev_data → valid transition
112```
113
114if they match, the operations *fully explain* the state change from the old root to the new root. no hidden mutations, no missing ops, no reordering.
115
116## what stubs mean for the proof
117
118stubs are the mechanism that makes partial verification possible:
119
120```
121ChildRef = union(enum) {
122 none, — no child (leaf boundary)
123 node: *Node, — loaded from CAR, fully traversable
124 stub: cbor.Cid, — CID known, content trusted
125}
126```
127
128when computing `rootCid()`:
129- `node` → serialize, hash, produce CID
130- `stub` → use the CID as-is (it's trusted from previous verification)
131- `none` → encoded as null in CBOR
132
133the proof chain holds because each stub was once a verified node in a previous commit. by induction, every stub's CID is correct.
134
135## error taxonomy
136
137| error | meaning | response |
138|---|---|---|
139| `SignatureVerificationFailed` | wrong key or tampered commit | reject, re-resolve DID |
140| `PrevDataMismatch` | ops don't explain the transition | chain broken, re-sync |
141| `InversionMismatch` | individual op inconsistent with tree state | reject commit |
142| `PartialTree` | CAR missing blocks needed for inversion | incomplete data |
143| `DuplicatePath` | same path appears in multiple ops | malformed commit |
144| `InvalidMstNode` | CBOR structure of MST node is wrong | corrupted data |
145| `InvalidCommit` | commit object malformed or wrong version | corrupted data |
146
147the first three are the interesting ones — they distinguish between identity fraud (signature), history fraud (prevData), and operation fraud (inversion).