Streaming Tree ARchive format
4 2 0

Clone this repository

https://tangled.org/microcosm.blue/star https://tangled.org/did:plc:lulmyldiq4sb2ikags5sfb25/star
git@tangled.org:microcosm.blue/star git@tangled.org:did:plc:lulmyldiq4sb2ikags5sfb25/star

For self-hosted knots, clone URLs may differ based on your setup.

Download tar.gz
readme.md

STAR: Streaming Tree ARchive format#

status: just thinking about it

STAR is an archival format for Merkle Search Trees (MSTs) as implemented in atproto. It offers efficient key-ordered streaming of repository contents and reduced archive size compared to CARs.

  • convertible to/from CAR (lossless except any out-of-tree blocks from a CAR)
  • extra garbage strictly not allowed (unlike CAR)
  • canonical (unlike CAR)
  • strict depth-first (MST key-ordered) node and record ordering -> efficient reading (unlike CAR)
  • the header simply is the commit, followed by the serialized tree.
  • CID link are implicit for blocks included in the STAR: linked blocks follow in a deterministic order and recompute CID from on their contents.
  • fewer edge cases: empty MST nodes strictly disallowed: commit omits data for an empty tree

the two primary motivations are

  1. bounded-resource streaming readers.

    atproto MSTs in CARs have to buffer and retain all record blocks, and typically buffer most MST node blocks, just to traverse the tree. even if a CAR appears to be in stream-friendly block ordering, you can only safely discard record blocks if you know for sure it's actually stream-friendly.

    you also cannot reliably identify MST node blocks and record blocks in an atproto CAR without walkign the tree, so you cannot discard any potentially garbage blocks from the buffered data before walking. A malicious PDS can serve a cheap-to-generate endless CAR stream of garbage blocks, and you just have to keep buffering them.

    since STAR is strictly stream-ordered, there is no node/block ambiguity, and extra garbage is not allowed. CIDs commit the contents of subtrees and records, and since reading is the same as walking the tree, it might be possible to reject some kinds of malicious block-generation attacks early. (haven't thought this through)

  2. reduced archive size.

    CIDs are large, compression-unfriendly, and redundant if you are including the CID's actual content.

    for example, my atproto repo is around 5.0MB and contains 14,673 blocks with a CID prefix plus 14,675 CID links in its MST. Each CID is 32 bytes, so (14,673 + 14,675) * 32 = 0.9MB just for the CIDS, almost 20%.

    from a few more samples of various sizes from real atproto repos:

      CIDs     CAR   potential savings
    0.53KB / 3.4KB = 16%
    23.2KB / 279KB = 8%
     0.9MB / 5.0MB = 18%
    25.9MB / 128MB = 20%
    94.8MB / 449MB = 21%
    

    These calculations don't include the 4-bytes-per-CID prefix size, since that overhead will already typically be eliminated by compression.

    STARs retain the raw CBOR serialization of records, but may use a new MST node serialization that further reduces this overhead.

    Since omitting CIDs by making them implicit removes uncompressible content from CARs, I'm optimistic that real savings for compressed STARs vs CARs will be higher.

    Note that all repository content in a STAR is still cryptographically bound to the signed root commit's CID: it's just a little more work to prove it.

scope#

STAR is specialized for atproto MST storage, and best-suited for serializing complete trees.

  • It should work for "CAR slices" -- contiguous narrowed partial trees that may omit content before and/or after a specific key. (CIDS referencing missing nodes at the boundaries cannot be eliminated)

    • It's desireable to be able to archive complete subtrees, so enforcing a well-formed atproto commit as the header might not be sufficient on its own. (subtrees could be stored as CAR slices so this may be unnecessary)
  • It might be interesting to allow arbitrary sparse trees. Not sure yet.

  • It's not suitable for firehose commit CARs, which need to include blocks that aren't in a strict single MST.

STAR format does not aim to provide efficient access to random nodes or through other tree iteration patterns. Almost any kind of inspection requires a linear scan through the archive (especially if global key compression happens).

problems#

It might be difficult to convert a STAR to stream-friendly (preorder traversal) CAR format, since the CID of each MST node block can only be computed after visiting all of its children.

STAR (could)[https://bsky.app/profile/bad-example.com/post/3mcv4zxwtgs2w] require that MST nodes above a certain depth store their own CIDs which would be sad but pragmatic.

format#

|--------- header ---------| |---------------- optional tree ----------------|
[ '*' | ver | len | commit ] [ node ] [ node OR record ] [ node OR record ] …
  • * (fixed u8): The first byte of the header is always * (hex 0x2A).

  • ver (varint): Next is an LEB128 varint specifying the version, with a fixed value of 1 for the current format.

  • len (varint): The length of the proceeding atproto commit object in bytes.

  • commit: An atproto commit object in DAG-CBOR derived from the repo spec:

    • did (string, nullable): same as repo spec
    • version (integer, required): corresponding CAR repo format version, currently fixed value of 3
    • data (hash link, nullable): CID of the first (root) node in the MST. an empty tree is represented by the presence of a null here
    • rev (string, required): same as repo spec
    • prev (hash link, nullable): same as repo spec
    • sig (byte array, nullable): to enable archiving stable sub-trees which might be later stitched into full signed MSTs, the sig property is allowed to be null.

verifying a commit#

The commit object can be converted to a repo-spec compliant commit:

  • if data is null, replace it with the CID of an empty repo-spec style MST (bafyreihmh6lpqcmyus4kt4rsypvxgvnvzkmj4aqczyewol5rsf7pdzzta4)
  • follow steps from the repo spec to resolve the identity and verify the signature.

When sig is null (typically for archived MST sub-trees), this STAR cannot be converted to a repo-spec compliant commit.

optional tree#

  • node: TODO: we need a new node format. It must be convertible back to a repo-spec style node.

  • record: The atproto record. Its CID can be computed over the bytes of its block (see below).

record#

|--- record --|
[ len | block ]
  • len (varint): the length of the proceeding binary record block in bytes.

  • block (bytes): the raw bytes of the (DAG-CBOR) record

order of nodes and records#

The MST must be stored in key order, which for an MST is a depth-first walk across the tree.

For each included child of a node (indicated by ?? in its entries. null for cid?), todo blah blah

excluded children (indicated by a CID link being present in entries) are not included in the series of nodes and records.

key compression#

TODO

  • (but basically do what the repo-spec does but apply it across the whole stream)
  • (but also actually run some tests and measure how much this decreases file sizes post-normal-file-compression)