Git fork

Documentation: describe incremental MIDX bitmaps

Prepare to implement support for reachability bitmaps for the new
incremental multi-pack index (MIDX) feature over the following commits.

This commit begins by first describing the relevant format and usage
details for incremental MIDX bitmaps.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Authored by Taylor Blau and committed by Junio C Hamano
4887bdd4 4a9179d1

+72
Documentation/technical/multi-pack-index.adoc
···
(in the C implementation, this is often computed as `i +
m->num_objects_in_base`).

=== Pseudo-pack order for incremental MIDXs

The original implementation of multi-pack reachability bitmaps defined
the pseudo-pack order in linkgit:gitformat-pack[5] (see the section
titled "multi-pack-index reverse indexes") roughly as follows:

____
In short, a MIDX's pseudo-pack is the de-duplicated concatenation of
objects in packs stored by the MIDX, laid out in pack order, and the
packs arranged in MIDX order (with the preferred pack coming first).
____

In the incremental MIDX design, we extend this definition to include
objects from multiple layers of the MIDX chain. The pseudo-pack order
for incremental MIDXs is determined by concatenating the pseudo-pack
ordering for each layer of the MIDX chain in order. Formally two objects
`o1` and `o2` are compared as follows:

1. If `o1` appears in an earlier layer of the MIDX chain than `o2`, then
   `o1` sorts ahead of `o2`.

2. Otherwise, if `o1` and `o2` appear in the same MIDX layer, and that
   MIDX layer has no base, then if one of `pack(o1)` and `pack(o2)` is
   preferred and the other is not, then the preferred one sorts ahead of
   the non-preferred one. If there is a base layer (i.e. the MIDX layer
   is not the first layer in the chain), then if `pack(o1)` appears
   earlier in that MIDX layer's pack order, then `o1` sorts ahead of
   `o2`. Likewise if `pack(o2)` appears earlier, then the opposite is
   true.

3. Otherwise, `o1` and `o2` appear in the same pack, and thus in the
   same MIDX layer. Sort `o1` and `o2` by their offset within their
   containing packfile.

Note that the preferred pack is a property of the MIDX chain, not the
individual layers themselves. Fundamentally we could introduce a
per-layer preferred pack, but this is less relevant now that we can
perform multi-pack reuse across the set of packs in a MIDX.

=== Reachability bitmaps and incremental MIDXs

Each layer of an incremental MIDX chain may have its objects (and the
objects from any previous layer in the same MIDX chain) represented in
its own `*.bitmap` file.

The structure of a `*.bitmap` file belonging to an incremental MIDX
chain is identical to that of a non-incremental MIDX bitmap, or a
classic single-pack bitmap. Since objects are added to the end of the
incremental MIDX's pseudo-pack order (see above), it is possible to
extend a bitmap when appending to the end of a MIDX chain.

(Note: it is possible likewise to compress a contiguous sequence of MIDX
incremental layers, and their `*.bitmap` files into a single layer and
`*.bitmap`, but this is not yet implemented.)

The object positions used are global within the pseudo-pack order, so
subsequent layers will have, for example, `m->num_objects_in_base`
number of `0` bits in each of their four type bitmaps. This follows from
the fact that we only write type bitmap entries for objects present in
the layer immediately corresponding to the bitmap).

Note also that only the bitmap pertaining to the most recent layer in an
incremental MIDX chain is used to store reachability information about
the interesting and uninteresting objects in a reachability query.
Earlier bitmap layers are only used to look up commit and pseudo-merge
bitmaps from that layer, as well as the type-level bitmaps for objects
in that layer.

To simplify the implementation, type-level bitmaps are iterated
simultaneously, and their results are OR'd together to avoid recursively
calling internal bitmap functions.
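
As an illustration only, the three comparison rules from the "Pseudo-pack
order for incremental MIDXs" section could be sketched as a C comparator.
The `struct obj_loc` type and the `pseudo_pack_cmp()` function are
hypothetical names made up for this sketch and are not part of Git's
source. The sketch also assumes that two non-preferred packs in the first
layer fall back to that layer's pack order, as in the non-incremental
pseudo-pack definition quoted above.

```c
#include <stdint.h>

/* Hypothetical description of where one object lives; not a Git struct. */
struct obj_loc {
	uint32_t layer;     /* 0 is the first layer (the one with no base) */
	uint32_t pack;      /* position of the pack in its layer's pack order */
	int pack_preferred; /* non-zero if the pack is the chain's preferred pack */
	uint64_t offset;    /* byte offset of the object within its pack */
};

/* Three-way comparison that cannot overflow. */
static int cmp_u64(uint64_t a, uint64_t b)
{
	return (a > b) - (a < b);
}

/*
 * Returns a negative value if `a` sorts ahead of `b` in the pseudo-pack
 * order, a positive value if `b` sorts ahead of `a`, and 0 if both
 * describe the same position.
 */
int pseudo_pack_cmp(const struct obj_loc *a, const struct obj_loc *b)
{
	/* Rule 1: objects in earlier layers of the chain sort first. */
	if (a->layer != b->layer)
		return cmp_u64(a->layer, b->layer);

	/*
	 * Rule 2: in the layer without a base, the preferred pack wins
	 * when exactly one of the two packs is preferred.
	 */
	if (a->layer == 0 && !a->pack_preferred != !b->pack_preferred)
		return a->pack_preferred ? -1 : 1;

	/*
	 * Rule 2, continued: otherwise order by the layer's pack order.
	 * Applying this to the first layer too is an assumption of this
	 * sketch.
	 */
	if (a->pack != b->pack)
		return cmp_u64(a->pack, b->pack);

	/* Rule 3: same pack, so order by offset within that pack. */
	return cmp_u64(a->offset, b->offset);
}
```

A comparator of this shape could be passed to a sort routine through a
small wrapper to lay out objects in pseudo-pack order. Because the layer
is compared first, every object in an earlier layer ends up with a lower
position than any object in a later layer, which matches the statement
that new objects are added at the end of the order.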

Future Work
-----------