 # repo-stream
 
-a futures atproto record stream from CAR file
+Fast and (aspirationally) robust atproto CAR file processing in rust
+
+
+todo
+
+- [ ] car file test fixtures & validation tests
+- [ ] make sure we can get the did and signature out for verification
+- [ ] spec compliance todos
+  - [ ] assert that keys are ordered and fail if not
+  - [ ] verify node mst depth from key (possibly pending [interop test fixes](https://github.com/bluesky-social/atproto-interop-tests/issues/5))
+- [ ] performance todos
+  - [ ] consume the serialized nodes into a mutable efficient format
+  - [ ] maybe customize the deserialize impl to do that directly?
+  - [ ] benchmark and profile
+- [ ] robustness todos
+  - [ ] swap the blocks hashmap for a BlockStore trait that can be dumped to redb
+  - [ ] maybe keep the redb function behind a feature flag?
+  - [ ] can we assert a max size for node blocks?
+  - [ ] figure out why asserting the upper nibble of the fourth byte of a node fails fingerprinting
+  - [ ] max mst depth (there is actually a hard limit but a malicious repo could do anything)
+  - [ ] i don't think we need a max recursion depth for processing cbor contents since we leave records to the user to decode
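
The robustness todo about swapping the blocks hashmap for a `BlockStore` trait could look roughly like the sketch below. Only the trait name comes from the todo list. The method names (`put`, `get`), the byte-vector CID keys, and the `MemoryBlockStore` type are assumptions made for illustration, not the crate's actual API. A redb-backed store would be a second implementation of the same trait, possibly behind a feature flag as the todo suggests.

```rust
use std::collections::HashMap;
use std::convert::Infallible;

/// Hypothetical storage abstraction. Only the name `BlockStore` comes from
/// the todo list; the methods and key type are assumed for this sketch.
pub trait BlockStore {
    /// Backend-specific error (an on-disk store would surface I/O errors here).
    type Error: std::fmt::Debug;

    /// Store a block under its CID bytes, replacing any previous value.
    fn put(&mut self, cid: Vec<u8>, block: Vec<u8>) -> Result<(), Self::Error>;

    /// Fetch a copy of the block for a CID, if it has been stored.
    fn get(&self, cid: &[u8]) -> Result<Option<Vec<u8>>, Self::Error>;
}

/// In-memory implementation: the existing hashmap behind the trait.
#[derive(Default)]
pub struct MemoryBlockStore {
    blocks: HashMap<Vec<u8>, Vec<u8>>,
}

impl BlockStore for MemoryBlockStore {
    // A plain hashmap cannot fail, so the error type is uninhabited.
    type Error = Infallible;

    fn put(&mut self, cid: Vec<u8>, block: Vec<u8>) -> Result<(), Self::Error> {
        self.blocks.insert(cid, block);
        Ok(())
    }

    fn get(&self, cid: &[u8]) -> Result<Option<Vec<u8>>, Self::Error> {
        Ok(self.blocks.get(cid).cloned())
    }
}
```

Keeping the error as an associated type lets the in-memory store stay infallible while a disk-backed store reports real failures through the same interface.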
 
-current notes
+newer ideas
+
+- fixing the interleaved mst walk/ block load actually does perform ok: just need the walker to tell the block loader which block we actually need next, so that the block loader can go ahead and load all blocks until that one without checking back with the walker. so i think we're streaming-order ready!
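
The "newer ideas" note describes the walker telling the block loader which block it needs next, with the loader reading ahead until that block arrives. A minimal sketch of that loop follows. The `BlockLoader` type, the `load_until` method, and the `u64` stand-in for a CID are all hypothetical, and the CAR stream is modelled as a plain synchronous iterator rather than the async stream the crate presumably uses. The sketch also assumes each block is requested at most once, so a block is removed from the buffer when it is handed out.

```rust
use std::collections::HashMap;

/// Stand-in for a real CID type (assumption for this sketch).
type Cid = u64;

/// Hypothetical loader that pulls blocks from a CAR-like stream on demand.
struct BlockLoader<I: Iterator<Item = (Cid, Vec<u8>)>> {
    stream: I,
    /// Blocks read from the stream before the walker asked for them.
    buffered: HashMap<Cid, Vec<u8>>,
}

impl<I: Iterator<Item = (Cid, Vec<u8>)>> BlockLoader<I> {
    fn new(stream: I) -> Self {
        Self { stream, buffered: HashMap::new() }
    }

    /// Return the block the walker needs next. Every other block seen on the
    /// way is buffered, without checking back with the walker in between.
    fn load_until(&mut self, wanted: Cid) -> Option<Vec<u8>> {
        // The block may already have been read while loading an earlier one.
        if let Some(bytes) = self.buffered.remove(&wanted) {
            return Some(bytes);
        }
        for (cid, bytes) in self.stream.by_ref() {
            if cid == wanted {
                return Some(bytes);
            }
            self.buffered.insert(cid, bytes);
        }
        // Stream exhausted: the block is missing from the CAR.
        None
    }

    /// Number of blocks currently held in memory.
    fn buffered_len(&self) -> usize {
        self.buffered.len()
    }
}
```

When blocks arrive in the order the walk needs them, the buffer stays nearly empty. When they arrive in some other order, this degrades toward buffering everything.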
+
+
+later ideas
 
 - just buffering all the blocks is 2.5x faster than interleaving optimistic walking
   - at least, this is true on huge CARs with the current (stream-unfriendly) pds export behaviour