relay integration#
how zlay uses the sync 1.1 APIs from zat, as of march 2026. zlay is ~4 days old.
current state#
zlay is on zat v0.2.10. the sync 1.1 verification is wired but deployed in observation mode — chain breaks are logged and counted, not enforced.
the pipeline#
subscriber (reader thread)
  → header decode, cursor tracking
  → submit raw frame to thread pool
frame_worker (pool worker)
  → CBOR decode payload
  → rev clock check (reject future timestamps beyond 5min skew)
  → chain continuity check (log-only):
      since vs stored rev
      prevData vs stored data_cid
  → dispatch to validator
validator
  → DID cache lookup (miss → queue background resolve, skip frame)
  → verifyCommitCar(blocks, public_key, {verify_mst: false})
    OR verifyCommitDiff(blocks, ops, prev_data, public_key) [behind config flag]
  → return (data_cid, commit_rev)
event_log
  → persist frame to disk
  → conditional upsert: UPDATE ... WHERE rev < new_rev
  → broadcast to consumers
what's working#
chain continuity detection (frame_worker.zig, subscriber.zig):
- compares incoming `since` against stored `rev`
- compares incoming `prevData` CID against stored `data_cid`
- increments `relay_chain_breaks_total` prometheus counter
- log-only: commits still flow through
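the check above can be sketched in a few lines. this is an illustrative python model, not zlay's zig code: `RepoState`, `check_chain`, `handle_commit`, and the `enforce` switch are hypothetical names, and treating a missing `since`/`prevData` as "nothing to compare" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RepoState:
    rev: str       # last stored commit rev (a TID)
    data_cid: str  # last stored MST root CID, as text


@dataclass
class ChainCheck:
    ok: bool
    reason: Optional[str] = None


# stand-in for the relay_chain_breaks_total prometheus counter
counters = {"relay_chain_breaks_total": 0}


def check_chain(state: Optional[RepoState],
                since: Optional[str],
                prev_data: Optional[str]) -> ChainCheck:
    """Compare an incoming commit's (since, prevData) with stored (rev, data_cid)."""
    if state is None:
        return ChainCheck(True)  # assumption: no stored state means nothing to compare
    if since is not None and since != state.rev:
        return ChainCheck(False, "since does not match stored rev")
    if prev_data is not None and prev_data != state.data_cid:
        return ChainCheck(False, "prevData does not match stored data_cid")
    return ChainCheck(True)


def handle_commit(state: Optional[RepoState],
                  since: Optional[str],
                  prev_data: Optional[str],
                  enforce: bool = False) -> bool:
    """Return True if the commit should keep flowing through the pipeline."""
    result = check_chain(state, since, prev_data)
    if not result.ok:
        counters["relay_chain_breaks_total"] += 1
        if enforce:
            return False  # strict mode: drop the commit
    return True  # log-only mode: count the break, let the commit through
```

returning a result value instead of raising keeps the log-only and strict modes on the same code path, so flipping to enforcement is a one-flag change.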
conditional state upsert (event_log.zig):
INSERT INTO account_repo (uid, rev, commit_data_cid)
VALUES ($1, $2, $3)
ON CONFLICT (uid) DO UPDATE
SET rev = EXCLUDED.rev, commit_data_cid = EXCLUDED.commit_data_cid
WHERE account_repo.rev < EXCLUDED.rev
prevents concurrent workers from rolling back state. returns whether the update actually happened.
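to see the guard in action, here is a small sketch using python's built-in `sqlite3` (an assumption made for a self-contained demo; the `$1` placeholders above suggest zlay runs this against postgres). the table layout and the `upsert_state` helper are hypothetical. the comparison works on text because TIDs are fixed-length and sort lexicographically in time order.

```python
import sqlite3

UPSERT = """
INSERT INTO account_repo (uid, rev, commit_data_cid)
VALUES (?, ?, ?)
ON CONFLICT (uid) DO UPDATE
SET rev = excluded.rev, commit_data_cid = excluded.commit_data_cid
WHERE account_repo.rev < excluded.rev
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with a minimal account_repo table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account_repo ("
        " uid INTEGER PRIMARY KEY,"
        " rev TEXT NOT NULL,"
        " commit_data_cid TEXT NOT NULL)"
    )
    return conn


def upsert_state(conn: sqlite3.Connection, uid: int, rev: str, data_cid: str) -> bool:
    """Apply the conditional upsert; return True only if a row was written."""
    before = conn.total_changes
    conn.execute(UPSERT, (uid, rev, data_cid))
    conn.commit()
    # a stale rev fails the WHERE clause, so the statement changes no rows
    return conn.total_changes > before
```

a worker that loses the race (its rev is older than what is already stored) gets `False` back and can skip any follow-up work that assumes it advanced the state.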
extractOps fix (validator.zig):
- previously looked for separate `collection`/`rkey` fields (wrong)
- now reads `path` field and splits on `/` (matches firehose wire format)
- validates both halves: NSID for collection, rkey for record key
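a minimal python sketch of the path handling described above. `split_op_path` and `is_valid_op_path` are hypothetical names, and the two patterns are simplified approximations of the NSID and record-key grammars, not the full atproto rules.

```python
import re
from typing import Tuple

# simplified: a letter-led first segment followed by at least two more dotted segments
NSID_RE = re.compile(
    r"[a-zA-Z](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?){2,}"
)
# simplified: 1-512 characters from the record-key character set
RKEY_RE = re.compile(r"[A-Za-z0-9._:~-]{1,512}")


def split_op_path(path: str) -> Tuple[str, str]:
    """Split a firehose op path 'collection/rkey' and validate both halves."""
    collection, sep, rkey = path.partition("/")
    if not sep or "/" in rkey:
        raise ValueError("path must contain exactly one '/'")
    if not NSID_RE.fullmatch(collection):
        raise ValueError("collection is not a valid NSID")
    if rkey in (".", "..") or not RKEY_RE.fullmatch(rkey):
        raise ValueError("rkey is not a valid record key")
    return collection, rkey


def is_valid_op_path(path: str) -> bool:
    """Boolean wrapper for callers that only need accept/reject."""
    try:
        split_op_path(path)
    except ValueError:
        return False
    return True
```

for example, `split_op_path("app.bsky.feed.post/3jzfcijpj2z2a")` yields the collection `app.bsky.feed.post` and the rkey `3jzfcijpj2z2a`.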
future rev rejection (frame_worker.zig):
- parses incoming rev as TID, extracts microsecond timestamp
- compares against wall clock + configurable skew (default 5 minutes)
- rejects commits claiming to be from the future
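the rev check can be sketched as follows, assuming the standard TID layout (13 base32-sortable characters encoding a 64-bit integer whose low 10 bits are a clock id and whose next 53 bits are microseconds since the unix epoch). the function names are hypothetical; `encode_tid` exists only to make the example round-trippable.

```python
import time
from typing import Optional

TID_ALPHABET = "234567abcdefghijklmnopqrstuvwxyz"  # base32-sortable
DEFAULT_SKEW_US = 5 * 60 * 1_000_000               # 5 minutes, in microseconds


def tid_timestamp_us(tid: str) -> int:
    """Decode a TID and return its microsecond timestamp."""
    if len(tid) != 13:
        raise ValueError("TID must be 13 characters")
    value = 0
    for ch in tid:
        idx = TID_ALPHABET.find(ch)
        if idx < 0:
            raise ValueError("invalid TID character: %r" % ch)
        value = (value << 5) | idx
    # drop the 10-bit clock id, keep the 53-bit timestamp
    return (value >> 10) & ((1 << 53) - 1)


def encode_tid(timestamp_us: int, clock_id: int = 0) -> str:
    """Inverse of tid_timestamp_us, used here only for the example."""
    value = (timestamp_us << 10) | (clock_id & 0x3FF)
    chars = []
    for _ in range(13):
        chars.append(TID_ALPHABET[value & 31])
        value >>= 5
    return "".join(reversed(chars))


def is_future_rev(rev: str,
                  now_us: Optional[int] = None,
                  max_skew_us: int = DEFAULT_SKEW_US) -> bool:
    """True if the rev claims a time later than wall clock plus allowed skew."""
    if now_us is None:
        now_us = time.time_ns() // 1000
    return tid_timestamp_us(rev) > now_us + max_skew_us
```

passing `now_us` explicitly keeps the check deterministic in tests; in a live worker it would default to the wall clock.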
what's not yet enabled#
full diff verification (verifyCommitDiff):
- wired in validator.zig but behind `config.verify_commit_diff` flag
- disabled in production: currently all commits go through `verifyCommitCar` (signature-only, MST verification disabled)
- the observation mode lets operators measure chain break rates before strict enforcement
resync on chain break:
- breaks are detected and logged but no recovery action is taken
- the spec says: mark desynchronized, queue events, fetch full CAR, reconcile, replay
- this is a significant operational feature (thundering herd concerns, etc.)
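one possible shape for the recovery steps listed above, as a rough python sketch. everything here is hypothetical: the `RepoResync` class, the `(rev, payload)` event tuples, and the `fetch_full_repo_rev` callback, which stands in for "fetch full CAR and reconcile" and simply returns the rev of the fetched snapshot. replaying only queued events newer than that snapshot is an assumption, not something the source specifies.

```python
from typing import Callable, List, Optional, Tuple

Event = Tuple[str, str]  # (rev, payload), a hypothetical event shape


class RepoResync:
    """Per-repo bookkeeping: desynchronized flag plus a queue of held events."""

    def __init__(self, rev: Optional[str] = None) -> None:
        self.rev = rev
        self.desynchronized = False
        self.queued: List[Event] = []
        self.applied: List[Event] = []

    def on_event(self, event: Event, chain_ok: bool) -> None:
        # once a break is seen, hold every event until resync completes
        if self.desynchronized or not chain_ok:
            self.desynchronized = True
            self.queued.append(event)
            return
        self._apply(event)

    def _apply(self, event: Event) -> None:
        self.rev = event[0]
        self.applied.append(event)

    def resync(self, fetch_full_repo_rev: Callable[[], str]) -> None:
        # stand-in for: fetch the full CAR, reconcile, learn the snapshot rev
        self.rev = fetch_full_repo_rev()
        queued, self.queued = self.queued, []
        self.desynchronized = False
        # replay only events newer than the snapshot (assumption)
        for event in sorted(queued):
            if event[0] > self.rev:
                self._apply(event)
```

a real implementation would also need rate limiting around the fetch step, which is where the thundering herd concern comes in.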
the optimistic validation pattern#
zlay's approach to DID resolution creates a trust window:
- first commit from a DID → cache miss → broadcast immediately, resolve key in background
- subsequent commits → cache hit → verify signature
- verification failure → evict cache, re-resolve, skip this frame
this is a deliberate trade-off: brief trust window for throughput. bounded by resolver thread count and resolution latency (~200ms per DID).
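the three cases can be modeled as a small python sketch. `OptimisticValidator`, the string outcomes, and the injected `verify` callable are all hypothetical; the actual signature check is out of scope here. on a cache miss the sketch follows the description above (pass the frame through unverified and queue a resolve).

```python
from collections import deque
from typing import Callable, Deque, Dict


class OptimisticValidator:
    """Key cache with background resolution; a miss opens a brief trust window."""

    def __init__(self, verify: Callable[[bytes, bytes], bool]) -> None:
        self.verify = verify                      # (frame, public_key) -> bool
        self.keys: Dict[str, bytes] = {}          # DID -> cached public key
        self.resolve_queue: Deque[str] = deque()  # drained by resolver threads

    def handle(self, did: str, frame: bytes) -> str:
        key = self.keys.get(did)
        if key is None:
            # cache miss: let the frame through, resolve the key in the background
            self.resolve_queue.append(did)
            return "pass_unverified"
        if self.verify(frame, key):
            return "verified"
        # verification failed: the cached key may be stale, so evict and re-resolve
        del self.keys[did]
        self.resolve_queue.append(did)
        return "skipped"

    def on_resolved(self, did: str, key: bytes) -> None:
        """Called by a resolver thread once a DID's key is known."""
        self.keys[did] = key
```

the length of the trust window is the time between `handle` queuing a DID and `on_resolved` being called for it, which is why resolver thread count and resolution latency bound it.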
state requirements#
per-DID state for chain verification is minimal:
| field | type | purpose |
|---|---|---|
| uid | u64 | internal ID (from DID mapping cache) |
| rev | text | last verified commit TID |
| commit_data_cid | text | last verified MST root, multibase-encoded |
this maps directly to what the spec requires: track rev and data per repo.
questions for SDK design#
observations from watching zlay integrate:
- extractOps was wrong for months: the SDK provides `MstOperation` but the firehose wire format uses a different field layout (`path` vs `collection`+`rkey`, `cid` vs `value`). should the SDK provide a firehose-aware operation parser?
- chain continuity is caller responsibility: every consumer needs to track (rev, data_cid) and compare against incoming (since, prevData). this is boilerplate with subtle ordering requirements. could the SDK help?
- the observation-then-enforcement pattern: zlay chose log-only first. this is sensible for any consumer. does the SDK's error-based API support this well, or does it force binary accept/reject?
- multibase encoding of CIDs: zlay encodes CIDs as multibase base32lower for storage/comparison. this is a common need. the SDK has `multibase.encode` but the pattern of "extract CID from verify result, encode for storage" is repeated.
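for reference, the base32lower encoding in the last item is small enough to show in full. this python sketch uses only the standard library and is not the zat API; the helper names are hypothetical. multibase base32lower is the prefix `b` followed by RFC 4648 base32, lowercased, without padding.

```python
import base64


def multibase_base32lower(raw: bytes) -> str:
    """Encode raw bytes (e.g. a binary CID) as multibase base32lower text."""
    body = base64.b32encode(raw).decode("ascii").lower().rstrip("=")
    return "b" + body


def multibase_base32lower_decode(text: str) -> bytes:
    """Decode multibase base32lower text back to raw bytes."""
    if not text.startswith("b"):
        raise ValueError("not a base32lower multibase string")
    body = text[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding b32decode expects
    return base64.b32decode(body)
```

a CIDv1 with the dag-cbor codec and a sha2-256 hash starts with the bytes `01 71 12 20`, which is why such CIDs show up as text beginning with `bafyrei`. because the encoding is canonical, comparing the stored text against a freshly encoded CID is equivalent to comparing the raw bytes.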