commits
- Fix major data corruption by removing Next skip from iterator adapter (a hack that was added for leveldb)
- Serialize and restore DID bloom filter
- Move stuff around, clean up test helpers
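
For reference, a minimal sketch of what serializing and restoring a bloom filter can look like. Everything here is an assumption for illustration: the `Bloom` type, the FNV-based double hashing and the byte layout (4-byte probe count, 8-byte word count, then the words) are hypothetical and not necessarily what the actual DID filter uses.

```go
package main

import (
	"encoding/binary"
	"errors"
	"hash/fnv"
)

// Bloom is a hypothetical, minimal Bloom filter used only for this sketch.
type Bloom struct {
	bits []uint64 // bit array, 64 bits per word
	k    uint32   // number of probes per key
}

// NewBloom creates a filter with at least mBits bits and k probes per key.
func NewBloom(mBits uint64, k uint32) *Bloom {
	words := (mBits + 63) / 64
	if words == 0 {
		words = 1
	}
	return &Bloom{bits: make([]uint64, words), k: k}
}

// positions derives k bit positions via double hashing (h1 + i*h2).
func (b *Bloom) positions(key []byte) []uint64 {
	h := fnv.New64a()
	h.Write(key)
	h1 := h.Sum64()
	h.Write([]byte{0xff}) // extend the input to obtain a second hash value
	h2 := h.Sum64() | 1   // avoid a zero step between probes
	m := uint64(len(b.bits)) * 64
	out := make([]uint64, b.k)
	for i := range out {
		out[i] = (h1 + uint64(i)*h2) % m
	}
	return out
}

// Add marks key as present.
func (b *Bloom) Add(key []byte) {
	for _, p := range b.positions(key) {
		b.bits[p/64] |= 1 << (p % 64)
	}
}

// MayContain reports false only if key was definitely never added.
func (b *Bloom) MayContain(key []byte) bool {
	for _, p := range b.positions(key) {
		if b.bits[p/64]&(1<<(p%64)) == 0 {
			return false
		}
	}
	return true
}

// MarshalBinary encodes the probe count, the word count and the words.
func (b *Bloom) MarshalBinary() ([]byte, error) {
	buf := make([]byte, 12+8*len(b.bits))
	binary.LittleEndian.PutUint32(buf[0:4], b.k)
	binary.LittleEndian.PutUint64(buf[4:12], uint64(len(b.bits)))
	for i, w := range b.bits {
		binary.LittleEndian.PutUint64(buf[12+8*i:], w)
	}
	return buf, nil
}

// UnmarshalBinary restores a filter written by MarshalBinary.
func (b *Bloom) UnmarshalBinary(data []byte) error {
	if len(data) < 12 {
		return errors.New("bloom: truncated header")
	}
	k := binary.LittleEndian.Uint32(data[0:4])
	n := binary.LittleEndian.Uint64(data[4:12])
	body := data[12:]
	if n == 0 || len(body)%8 != 0 || uint64(len(body)/8) != n {
		return errors.New("bloom: length mismatch")
	}
	bits := make([]uint64, n)
	for i := range bits {
		bits[i] = binary.LittleEndian.Uint64(body[8*i:])
	}
	b.bits, b.k = bits, k
	return nil
}
```

Restoring the filter from its serialized form avoids re-scanning every known DID on startup, at the cost of having to keep the persisted copy consistent with the store it summarizes.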
And all it took was disabling prefetching on the iterator (the bloom filter then gave an additional 2x speedup).
The index sadly can't be rebuilt from just the tree nodes, mainly because of the validFrom-validTo logic. If we ever confirm that historical queries (at a specific tree height) are not worth supporting, we can revisit this.
leveldb causes excessive write amplification on SSDs, especially when paired with iavl's merkle tree updates.
badger uses _much_ more memory, but if we move the operation values (not the tree inner nodes) into its value log, it seems to write significantly fewer GBs to disk, and compactions are either less frequent or much less noticeable (i.e. they stall the normal operation of the application less).
(writing to the iavl tree still causes many more GBs to be written to disk than the data that's actually being inserted into the tree, but that appears to be the cost of using merkle trees)
- Badger turned out to stall too much on compactions and seemingly ended up using more disk space than goleveldb, even when both were set to snappy compression (which goleveldb uses by default)
- Keeping the DID->operations pointers in the IAVL tree led to writes of unordered keys, which caused constant rebalancing and extra rehashing within the merkleized tree. Writes became impossibly slow as the tree grew, and the performance of the underlying KV store suffered no matter its implementation, since more reads and writes were needed to reach the same outcome (everything the IAVL tree was doing was still O(log n), but you definitely don't want to hit the worst cases all the time)
- Begin cleaning up the TreeVersions/Adapters/Heights mess by introducing unified transaction interfaces that manage commits/rollbacks within the tree and the index KV store in tandem
This should allow us to once again keep the entire tree history (the diffs between tree versions should be quite small, especially now that almost no keys are ever rewritten)
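
A rough sketch of the unified transaction idea, to make the "in tandem" part concrete. The names (`committer`, `WriteTx`, `fakeStore`) are hypothetical and the real interfaces may look quite different; in particular, this does not give atomicity if the process dies between the two commits.

```go
package main

import "errors"

// committer is a hypothetical stand-in for anything that stages writes and
// can persist or discard them (the tree, or a batch on the index KV store).
type committer interface {
	Commit() error
	Rollback() error
}

// ErrTxDone is returned when Commit is called on a finished transaction.
var ErrTxDone = errors.New("transaction already finished")

// WriteTx is a hypothetical write transaction spanning the tree and the index.
type WriteTx struct {
	tree, index committer
	done        bool
}

// NewWriteTx pairs the two stores so they are committed or rolled back together.
func NewWriteTx(tree, index committer) *WriteTx {
	return &WriteTx{tree: tree, index: index}
}

// Commit persists the tree first and the index second.
func (tx *WriteTx) Commit() error {
	if tx.done {
		return ErrTxDone
	}
	tx.done = true
	if err := tx.tree.Commit(); err != nil {
		// Assumes a failed tree commit persisted nothing, so discard both sides.
		return errors.Join(err, tx.tree.Rollback(), tx.index.Rollback())
	}
	// A failure here leaves the tree ahead of the index; a real
	// implementation needs a recovery step for that case.
	return tx.index.Commit()
}

// Rollback discards staged writes on both sides. It is a no-op after Commit
// or a previous Rollback, which makes `defer tx.Rollback()` safe.
func (tx *WriteTx) Rollback() error {
	if tx.done {
		return nil
	}
	tx.done = true
	return errors.Join(tx.tree.Rollback(), tx.index.Rollback())
}

// fakeStore counts calls; it exists only to exercise WriteTx in this sketch.
type fakeStore struct {
	failCommit         bool
	commits, rollbacks int
}

func (f *fakeStore) Commit() error {
	if f.failCommit {
		return errors.New("commit failed")
	}
	f.commits++
	return nil
}

func (f *fakeStore) Rollback() error {
	f.rollbacks++
	return nil
}
```

Callers then only deal with one handle, so the tree version and the index contents cannot drift apart through a forgotten rollback on one side.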
- Avoid fetching next seq from tree on every op store
- Misc cleanups
- Eager async fetch of entries to import
- Drop old tree history with async iavl tree pruning (improves the situation but is not a permanent solution; more investigation required)
- Skip executing import on proposal preparation
- Effectively enforce timeouts through proper context propagation and checking
- Iterate mutable tree with correct iterator when obtaining latest sequence value
- Avoid repeated lock acquisition when importing multiple operations