the inversion algorithm#
how verifyCommitDiff proves a commit is a valid state transition.
the claim being verified#
given:
- a signed commit with MST root
data_cid(the "after" state) - a CAR containing the commit block + changed MST nodes + new records
- a list of operations (create/update/delete with paths and CIDs)
- a previous MST root
prev_data(the "before" state, from the preceding commit)
prove: the operations, applied to prev_data, produce data_cid.
the trick: instead of applying ops forward (which would require the full previous tree), invert them on the new tree and check if you get back to the old root.
why inversion instead of forward application#
forward application would require:
- the full previous MST (potentially millions of entries)
- applying each op to produce the new tree
- comparing the new root against
data_cid
inversion requires:
- only the partial MST from the CAR (changed nodes only)
- reversing each op on the new tree
- comparing the inverted root against
prev_data
the CAR already contains the new partial tree. unchanged subtrees become stubs — their CIDs are known but their contents aren't loaded. this is the key insight: you only need blocks for the parts that changed.
step by step#
source: repo_verifier.zig:387-469
1. parse CAR, extract commit#
loadCommitFromCAR(blocks) → { commit, unsigned_bytes, repo_car }
the commit is CBOR with fields: did, version, data (MST root CID), rev, prev, sig. the unsigned bytes are re-encoded without sig for signature verification.
2. verify signature#
verify(unsigned_bytes, commit.sig, public_key) → ok | error
proves the commit came from the account's signing key. this is independent of MST — it proves authorship.
3. load partial MST from CAR#
Mst.loadFromBlocks(repo_car, commit.data_cid) → partial tree
source: mst.zig:459-480
walks from the root block, loading nodes whose blocks exist in the CAR. blocks not found become ChildRef.stub — a CID placeholder. the result is a tree where:
- changed paths have fully loaded nodes
- unchanged subtrees are stubs (just their hash)
4. copy tree for inversion#
tree.copy() → inverted
deep clone. stubs stay as stubs. the original tree is preserved.
5. normalize operations#
normalizeOps(msg_ops) → sorted_ops
source: mst.zig:773-800
- sort: deletions first, then by path lexicographically
- reject duplicates (same path twice →
DuplicatePath)
deletions first matters: if you invert a create before a delete on related paths, you might touch a node that should already be gone. ordering ensures deterministic application.
6. invert each operation#
source: mst.zig:804-820
for each op in sorted order:
| forward op | inverse action | verification |
|---|---|---|
| create(path, cid) | deleteReturn(path) |
removed value must equal cid |
| update(path, new_cid, prev_cid) | putReturn(path, prev_cid) |
displaced value must equal new_cid |
| delete(path, prev_cid) | putReturn(path, prev_cid) |
path must not already exist (displaced == null) |
each step is self-checking. if any assertion fails → InversionMismatch.
7. compute inverted root#
inverted.rootCid() → cid
traverses the (now-inverted) tree bottom-up. loaded nodes are serialized as DAG-CBOR and SHA-256 hashed. stubs contribute their known CIDs directly — this is safe because we trust the previous state.
if inversion needs to descend into a stub (operation touches an unchanged subtree), it fails with PartialTree — the CAR didn't include enough context.
8. compare#
inverted_root == prev_data → valid transition
if they match, the operations fully explain the state change from the old root to the new root. no hidden mutations, no missing ops, no reordering.
what stubs mean for the proof#
stubs are the mechanism that makes partial verification possible:
ChildRef = union(enum) {
none, — no child (leaf boundary)
node: *Node, — loaded from CAR, fully traversable
stub: cbor.Cid, — CID known, content trusted
}
when computing rootCid():
node→ serialize, hash, produce CIDstub→ use the CID as-is (it's trusted from previous verification)none→ encoded as null in CBOR
the proof chain holds because each stub was once a verified node in a previous commit. by induction, every stub's CID is correct.
error taxonomy#
| error | meaning | response |
|---|---|---|
SignatureVerificationFailed |
wrong key or tampered commit | reject, re-resolve DID |
PrevDataMismatch |
ops don't explain the transition | chain broken, re-sync |
InversionMismatch |
individual op inconsistent with tree state | reject commit |
PartialTree |
CAR missing blocks needed for inversion | incomplete data |
DuplicatePath |
same path appears in multiple ops | malformed commit |
InvalidMstNode |
CBOR structure of MST node is wrong | corrupted data |
InvalidCommit |
commit object malformed or wrong version | corrupted data |
the first three are the interesting ones — they distinguish between identity fraud (signature), history fraud (prevData), and operation fraud (inversion).