sync verification (sync 1.1)#
how relays and downstream services verify firehose commits without trusting the source.
source: atproto sync spec, repository spec, event stream spec, cryptography spec
the inductive proof chain#
the core idea: instead of re-fetching the full repo to verify each change, you verify that each commit is a valid transition from the previous state. this only requires tracking two values per DID:
- `rev`: the TID of the last verified commit
- `data`: the MST root CID of the last verified commit
base case#
establish ground truth by fetching the full repo (via getRepo CAR export), verifying the MST structure, and verifying the commit signature. now you know the repo state is correct at this point.
inductive step#
for each subsequent #commit from the firehose:
1. check chain continuity: the event's `since` must match your stored `rev`, and `prevData` must match your stored `data`
2. verify signature: re-encode the commit without `sig`, SHA-256 hash, verify ECDSA
3. MST inversion: apply the `ops` in reverse against the partial MST from the CAR blocks. if the ops are complete, the resulting root CID must equal `prevData`
4. update state: store the new `rev` and `data`
if step 1 or 3 fails, the chain is broken → mark the repo as desynchronized and re-fetch.
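a minimal sketch of the per-DID state and the continuity check (step 1) in python. `RepoState`, `CommitEvent`, `check_continuity`, and `advance` are hypothetical names for illustration, not part of any spec or library, and CIDs are treated as opaque strings. signature verification and MST inversion are assumed to happen elsewhere.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RepoState:
    """Last verified state for one DID (hypothetical type)."""
    rev: str   # TID of the last verified commit
    data: str  # MST root CID of the last verified commit


@dataclass
class CommitEvent:
    """Subset of #commit fields needed for the continuity check (hypothetical type)."""
    did: str
    rev: str                  # rev from the signed commit
    data: str                 # MST root CID from the signed commit
    since: Optional[str]      # unsigned chain link: rev of the preceding commit
    prev_data: Optional[str]  # unsigned chain link: MST root CID of the preceding commit


def check_continuity(state: RepoState, ev: CommitEvent) -> str:
    """Return 'stale', 'broken', or 'ok' for an incoming commit event."""
    # TIDs are fixed-length and sort lexicographically, so string comparison orders them.
    if ev.rev <= state.rev:
        return "stale"
    if ev.since != state.rev or ev.prev_data != state.data:
        return "broken"
    return "ok"


def advance(ev: CommitEvent) -> RepoState:
    """Step 4: the new tracked state, to be stored only after steps 1-3 have passed."""
    return RepoState(rev=ev.rev, data=ev.data)
```

a caller would treat `"broken"` as the trigger for marking the repo desynchronized, and `"stale"` as an event to ignore.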
commit event fields that matter#
the signed commit object (in the CAR) has:
| field | notes |
|---|---|
| `did` | account DID |
| `version` | always 3 (v1 dead, v2 legacy-compatible) |
| `data` | CID of MST root; this is what the proof chain tracks |
| `rev` | TID, must increase monotonically |
| `prev` | virtually always null in v3; vestigial from v2 |
| `sig` | ECDSA signature over all other fields encoded as DAG-CBOR |
the firehose #commit event adds:
| field | notes |
|---|---|
| `since` | rev of the preceding commit (chain link) |
| `prevData` | MST root CID of the preceding commit (chain link) |
| `blocks` | CAR slice: only changed blocks, max 2 MB |
| `ops` | up to 200 record operations |
each repoOp has:
| field | notes |
|---|---|
| `action` | create, update, or delete |
| `path` | `<collection>/<rkey>` |
| `cid` | new record CID (null for deletes) |
| `prev` | previous record CID (for updates/deletes; required for inversion) |
important: `since` and `prevData` are unsigned; they're not in the signed commit object. `prevData` is still verifiable via MST inversion: the inverted ops must reproduce exactly that root CID. `since` is checked against the stored `rev`.
MST inversion (the "trick")#
from the spec:
the trick to this process is record operation inversion. `#commit` messages contain both a repo diff (CAR slice), and an array of record operations. the operations can be applied in reverse against a copy of the partial repo tree contained in the diff blocks. if the list of operations is complete, the root of the tree should be exactly that of the previous commit object of the repository.
concretely, for each op applied in reverse:
- create → delete that key from the new tree, verify removed CID matches `op.cid`
- update → put `op.prev` back, verify displaced CID matches `op.cid`
- delete → re-insert `op.prev`, verify key didn't already exist
the CAR contains a partial MST — only the nodes that changed. unchanged subtrees become "stubs" (just their CID). after inversion, the root is computed bottom-up: stubs contribute their known CIDs, loaded nodes are serialized and hashed. if it matches prevData, the transition is proven valid.
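the three inversion rules can be sketched on a toy model. this is a deliberate simplification: the tree is a flat dict from path to record CID, and `toy_root` is a stand-in hash, not a real MST root CID (a real implementation works on partial MST nodes with stubs). `RepoOp`, `toy_root`, and `invert_ops` are hypothetical names.

```python
import hashlib
from typing import Dict, List, Optional, TypedDict


class RepoOp(TypedDict):
    action: str          # "create", "update", or "delete"
    path: str            # "<collection>/<rkey>"
    cid: Optional[str]   # new record CID (None for deletes)
    prev: Optional[str]  # previous record CID (updates/deletes)


def toy_root(tree: Dict[str, str]) -> str:
    """Stand-in for the MST root CID: a hash over the sorted path/CID pairs."""
    h = hashlib.sha256()
    for path in sorted(tree):
        h.update(f"{path}\x00{tree[path]}\n".encode())
    return h.hexdigest()


def invert_ops(new_tree: Dict[str, str], ops: List[RepoOp]) -> Dict[str, str]:
    """Undo ops against a copy of the post-commit tree; raise ValueError on any mismatch."""
    tree = dict(new_tree)
    for op in reversed(ops):
        path, action = op["path"], op["action"]
        if action == "create":
            # Undo a create: the key must exist and hold exactly op.cid.
            removed = tree.pop(path, None)
            if removed is None or removed != op["cid"]:
                raise ValueError(f"create mismatch at {path}")
        elif action == "update":
            # Undo an update: the displaced value must be op.cid; restore op.prev.
            if op["prev"] is None or tree.get(path) != op["cid"]:
                raise ValueError(f"update mismatch at {path}")
            tree[path] = op["prev"]
        elif action == "delete":
            # Undo a delete: the key must be absent; re-insert op.prev.
            if op["prev"] is None or path in tree:
                raise ValueError(f"delete mismatch at {path}")
            tree[path] = op["prev"]
        else:
            raise ValueError(f"unknown action {action!r}")
    return tree
```

the transition is accepted only when `toy_root(invert_ops(new_tree, ops))` equals the stored previous root. an incomplete op list either raises or yields a different root.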
what fails if data is tampered with#
| tampering | detection |
|---|---|
| modified record content | CAR block hash ≠ CID |
| forged commit (wrong DID, rev, data) | signature verification fails |
| wrong MST structure | inverted root ≠ prevData |
| extra/missing operations | inverted root ≠ prevData, or inversion mismatch |
| op claims to create X but X isn't in tree | deleteReturn returns null |
| op touches unchanged subtree not in CAR | stub error (partial tree) |
| high-S signature malleability | low-S check rejects it |
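the first row of the table (block hash ≠ CID) can be sketched as follows. this assumes binary CIDv1 values with the DAG-CBOR codec and a SHA-256 multihash, which is what repo blocks use; `block_matches_cid` is a hypothetical helper name, and other codecs or hash functions are simply rejected here.

```python
import hashlib

# Binary CIDv1 prefix: version 0x01, codec 0x71 (DAG-CBOR),
# multihash code 0x12 (SHA-256), digest length 0x20 (32 bytes).
DAG_CBOR_SHA256_PREFIX = bytes([0x01, 0x71, 0x12, 0x20])


def block_matches_cid(cid_bytes: bytes, block: bytes) -> bool:
    """Return True if the block's SHA-256 digest equals the digest embedded in the CID."""
    if len(cid_bytes) != 36 or cid_bytes[:4] != DAG_CBOR_SHA256_PREFIX:
        return False
    return hashlib.sha256(block).digest() == cid_bytes[4:]
```

any change to a record's bytes changes the digest, so a tampered block no longer matches the CID that the MST or commit refers to.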
chain break → resync#
when the chain breaks (mismatched since/prevData, or a #sync event):
- mark the repo as `desynchronized`
- queue incoming events for this DID (don't drop them)
- fetch the full CAR — from the upstream relay first (not the PDS) to avoid thundering herd
- verify and reconcile state
- replay queued events
from the spec:
if many services attempt to re-synchronize a repository at the same time, the upstream PDS host may be overwhelmed with a 'thundering herd' of requests. to mitigate this, receiving services should first attempt to fetch the repo CAR file from their direct upstream (often a relay instance).
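a rough sketch of the queue-then-replay flow and the upstream-first fetch order. `ResyncBuffer` and `fetch_repo_car` are hypothetical names, events are plain dicts with a `rev` key, and fetchers are caller-supplied callables. dropping queued events whose `rev` is not newer than the re-fetched repo is an assumption of this sketch.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Iterable, Set


class ResyncBuffer:
    """Holds events for desynchronized DIDs until their repo has been re-fetched."""

    def __init__(self) -> None:
        self.desynced: Set[str] = set()
        self.queued: Dict[str, Deque[dict]] = defaultdict(deque)

    def mark_desynced(self, did: str) -> None:
        self.desynced.add(did)

    def offer(self, did: str, event: dict) -> bool:
        """Queue the event if the DID is desynchronized; return True if it was queued."""
        if did in self.desynced:
            self.queued[did].append(event)
            return True
        return False

    def finish_resync(self, did: str, resynced_rev: str,
                      process: Callable[[dict], None]) -> int:
        """Replay queued events newer than the re-fetched rev, in arrival order."""
        self.desynced.discard(did)
        replayed = 0
        for event in self.queued.pop(did, deque()):
            if event["rev"] > resynced_rev:  # TIDs sort lexicographically
                process(event)
                replayed += 1
        return replayed


def fetch_repo_car(did: str, fetchers: Iterable[Callable[[str], bytes]]) -> bytes:
    """Try each fetcher in order (upstream relay first, PDS last); return the first result."""
    for fetch in fetchers:
        try:
            return fetch(did)
        except OSError:
            continue  # this source failed; fall through to the next one
    raise RuntimeError(f"no source could provide a repo CAR for {did}")
```

passing the relay fetcher before the PDS fetcher is what keeps most resync traffic off the PDS.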
#sync events#
sent when repo state has been reset or is ambiguous (e.g. account reactivation after data corruption). contains only the commit block, not the full repo.
note that the repository contents are not included in the sync event: the `blocks` field only contains the repo commit object. downstream services would need to fetch the full repo CAR file to re-synchronize.
#account events#
| field | notes |
|---|---|
| `active` | whether the repo can be redistributed |
| `status` | takendown, suspended, deleted, deactivated, desynchronized, throttled |
the spec is clear: non-active accounts' content should not be redistributed. this means listReposByCollection should filter by active status.
when an account status is non-`active`, the content that hosts should not redistributed includes: repository exports (CAR files), repo records, transformed records ('views', embeds, etc), blobs, transformed blobs (thumbnails, etc)
account events are hop-by-hop — they describe status at the emitting service, not globally.
validation checklist (from spec)#
what a relay should do for each #commit:
- verify commit signature (refresh identity on initial failure)
- verify event fields match the signed commit in `blocks`
- verify `blocks` against `ops` and `prevData` via MST inversion
- check `since` against stored `rev` (mismatch → out-of-sync)
- check `prevData` against stored `data` (mismatch → out-of-sync)
- ignore events with `rev ≤ stored_rev`
- reject events with future `rev` (beyond clock drift window)
- ignore events for non-active accounts
- do NOT validate records against lexicons (relay-specific)
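the "future `rev`" check needs the timestamp inside the TID. a sketch, assuming the standard TID layout (13 base32-sortable characters encoding a 64-bit integer whose low 10 bits are a clock id, with a 53-bit microsecond timestamp above them). `tid_timestamp_us` and `is_future_rev` are hypothetical names, and the 5-minute default drift window is an arbitrary placeholder, not a value from the spec.

```python
B32_SORTABLE = "234567abcdefghijklmnopqrstuvwxyz"


def tid_timestamp_us(tid: str) -> int:
    """Decode the microsecond timestamp from a TID; raise ValueError if malformed."""
    if len(tid) != 13:
        raise ValueError("TID must be 13 characters")
    value = 0
    for ch in tid:
        # str.index raises ValueError for characters outside the alphabet.
        value = (value << 5) | B32_SORTABLE.index(ch)
    # Drop the 10-bit clock id, then keep the 53 timestamp bits.
    return (value >> 10) & ((1 << 53) - 1)


def is_future_rev(tid: str, now_us: int, max_drift_us: int = 5 * 60 * 1_000_000) -> bool:
    """True if the rev's timestamp is beyond the allowed clock drift window."""
    return tid_timestamp_us(tid) > now_us + max_drift_us
```

in practice `now_us` would come from something like `time.time_ns() // 1000`; it is a parameter here so the check stays deterministic.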
cryptographic details#
- two curves: P-256 (`secp256r1`) and secp256k1; implementations must support both
- low-S normalization required for both curves
- signing: DAG-CBOR encode unsigned commit → SHA-256 hash (binary) → ECDSA sign
- the CID of a signed commit uses the signed DAG-CBOR encoding (codec `0x71`)
- public keys: compressed 33-byte points, multicodec-prefixed (`0x80 0x24` for P-256, `0xe7 0x01` for secp256k1), then multibase-encoded (`z` + base58btc)
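a sketch of two of the checks above: picking the curve from the multicodec prefix and rejecting high-S signatures. it assumes the key has already been base58btc-decoded and that signatures are 64-byte compact `r || s` values. the function names and the `"p256"` / `"k256"` labels are hypothetical; the curve orders are the standard published constants.

```python
# Group orders (n) of the two supported curves.
CURVE_ORDER = {
    "p256": 0xFFFFFFFF_00000000_FFFFFFFF_FFFFFFFF_BCE6FAAD_A7179E84_F3B9CAC2_FC632551,
    "k256": 0xFFFFFFFF_FFFFFFFF_FFFFFFFF_FFFFFFFE_BAAEDCE6_AF48A03B_BFD25E8C_D0364141,
}

# Varint-encoded multicodec prefixes for compressed public keys.
MULTICODEC_PREFIX = {bytes([0x80, 0x24]): "p256", bytes([0xE7, 0x01]): "k256"}


def curve_from_multikey(decoded: bytes) -> str:
    """Identify the curve from multicodec-prefixed key bytes (after base58btc decoding)."""
    prefix, point = decoded[:2], decoded[2:]
    if prefix not in MULTICODEC_PREFIX:
        raise ValueError("unsupported key type")
    if len(point) != 33 or point[0] not in (0x02, 0x03):
        raise ValueError("expected a 33-byte compressed point")
    return MULTICODEC_PREFIX[prefix]


def is_low_s(sig: bytes, curve: str) -> bool:
    """True if the signature's s value is in the lower half of the curve order."""
    if len(sig) != 64:
        return False
    s = int.from_bytes(sig[32:], "big")
    return 1 <= s <= CURVE_ORDER[curve] // 2


def normalize_low_s(sig: bytes, curve: str) -> bytes:
    """Signer-side helper: replace a high s with n - s, leaving r untouched."""
    if len(sig) != 64:
        raise ValueError("expected a 64-byte compact signature")
    if is_low_s(sig, curve):
        return sig
    s = int.from_bytes(sig[32:], "big")
    return sig[:32] + (CURVE_ORDER[curve] - s).to_bytes(32, "big")
```

a verifier would call `is_low_s` and reject on `False` rather than normalizing, since accepting both forms is exactly the malleability being ruled out.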
cursor semantics#
sequence numbers are per-service, per-endpoint. reconnection rules:
- no cursor → start from current position
- cursor in rollback window → replay from that point
- cursor too old → info message, then replay entire rollback window
- cursor in the future → `FutureCursor` error, close connection
a relay should track both "last received seq" (for reconnection) and "high water mark" (for persistence after processing completes).
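the reconnection rules can be written as one server-side decision function. this is a sketch with hypothetical names and action labels; it assumes the cursor is the last seq the consumer has seen (so replay starts at `cursor + 1`) and that the server knows the oldest and latest seq in its rollback window.

```python
from typing import Optional, Tuple


def resolve_cursor(cursor: Optional[int], oldest_seq: int,
                   latest_seq: int) -> Tuple[str, Optional[int]]:
    """Map a requested cursor to (action, first seq to send)."""
    if cursor is None:
        # No cursor: start from the current position, no replay.
        return ("live", None)
    if cursor > latest_seq:
        # Cursor in the future: send a FutureCursor error and close the connection.
        return ("future_cursor_error", None)
    if cursor + 1 < oldest_seq:
        # Cursor too old: send an info message, then replay the whole window.
        return ("info_then_replay", oldest_seq)
    # Cursor inside the rollback window: replay from just after it.
    return ("replay", cursor + 1)
```

for example, with a window of seq 100 to 200, `resolve_cursor(150, 100, 200)` yields `("replay", 151)`.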
implementations#
| | zat/zlay | lightrail (fig) | collectiondir (indigo) |
|---|---|---|---|
| signature verification | yes (P-256 + secp256k1) | not yet (resolver ready) | no |
| MST inversion | yes (`verifyCommitDiff`) | not yet (MST parsing exists) | no |
| per-DID ordering | caller responsibility | `CommitDispatcher` enforces | n/a |
| prev chain tracking | yes (postgres, CAS upsert) | `RepoPrev` storage built | no |
| chain continuity checks | yes (log-only, metrics) | not yet | no |
| account status | dual status (local + upstream) | tracked, not filtered at query | append-only (no removal) |
| resync on discontinuity | not yet | architecture ready | n/a |
zat has the cryptographic and structural verification. zlay (march 2026) runs chain continuity detection in observation mode — logging breaks and counting them via prometheus, not yet enforcing. verifyCommitDiff is wired but behind a config flag; production uses verifyCommitCar (signature-only). see inductive-proof/relay-integration.md for details.
lightrail has the operational scheduling and recovery. collectiondir trusts the upstream relay entirely.
see also#
- inductive-proof/ — deep dive: algorithm, relay integration, SDK affordances
- firehose — event stream basics, consuming events
- data — repos, records, collections
- identity — DIDs, handles, key resolution