Git fork

midx: implement support for writing incremental MIDX chains

Now that the rest of the MIDX subsystem and its relevant callers know
how to read and process incremental MIDX chains, update
`write_midx_internal()` so that it can write them.

This new feature is available behind the `--incremental` option for the
`multi-pack-index` builtin, like so:

    $ git multi-pack-index write --incremental

The implementation is relatively straightforward, and boils down to a
handful of changes in this patch:

  - The `compute_sorted_entries()` function is taught to reject objects
    which appear in any existing MIDX layer.

  - Functions like `write_midx_revindex()` are adjusted to write
    pack_order values which are offset by the number of objects in the
    base MIDX layer.

  - The end of `write_midx_internal()` is adjusted to move
    non-incremental MIDX files when necessary (i.e. when creating an
    incremental chain with an existing non-incremental MIDX in the
    repository).
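
The first two bullets above can be sketched in isolation. This is only
an illustration under simplifying assumptions: plain `uint32_t` values
stand in for object IDs, and `base_has_id()`, `keep_new_objects()` and
`offset_pack_order()` are hypothetical names, not functions from
midx-write.c (the patch itself uses `midx_has_oid()` for the lookup).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for midx_has_oid(): binary-search the sorted
 * array `base` (the objects of all existing layers) for `id`.
 */
static int base_has_id(const uint32_t *base, size_t base_nr, uint32_t id)
{
	size_t lo = 0, hi = base_nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (base[mid] == id)
			return 1;
		if (base[mid] < id)
			lo = mid + 1;
		else
			hi = mid;
	}
	return 0;
}

/*
 * Copy into `out` only those candidates which are absent from the
 * base layers. Returns the number of entries kept.
 */
static size_t keep_new_objects(const uint32_t *base, size_t base_nr,
			       const uint32_t *cand, size_t cand_nr,
			       uint32_t *out)
{
	size_t i, nr = 0;

	assert(out);
	for (i = 0; i < cand_nr; i++)
		if (!base_has_id(base, base_nr, cand[i]))
			out[nr++] = cand[i];
	return nr;
}

/*
 * Shift layer-local positions so that they continue after the
 * `nr_base` objects already covered by the base layers.
 */
static void offset_pack_order(uint32_t *pack_order, size_t nr,
			      uint32_t nr_base)
{
	size_t i;

	for (i = 0; i < nr; i++)
		pack_order[i] += nr_base;
}
```

With three objects in the base layers, a new layer whose local order is
{2, 0, 1} would be written as {5, 3, 4} under this scheme.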

A handful of other changes are introduced as well, such as new
functions to clear incremental MIDX files that are unrelated to the
current chain (using the same "keep_hash" mechanism as in the
non-incremental case).
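
The "keep_hash" idea can be sketched as a predicate over file names. The
sketch below is an assumption-laden illustration, not the code in midx.c:
`should_remove()`, `has_prefix()` and `has_suffix()` are hypothetical
helpers written against the C standard library only, and the keep list
holds complete file names rather than hashes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int has_prefix(const char *s, const char *prefix)
{
	return !strncmp(s, prefix, strlen(prefix));
}

static int has_suffix(const char *s, const char *suffix)
{
	size_t n = strlen(s), m = strlen(suffix);

	return n >= m && !strcmp(s + n - m, suffix);
}

/*
 * Returns 1 when `file_name` looks like a MIDX auxiliary file with
 * extension `ext` and is not one of the `keep_nr` names in `keep`;
 * returns 0 when the file should be left alone.
 */
static int should_remove(const char *file_name, const char *ext,
			 const char *const *keep, size_t keep_nr)
{
	size_t i;

	assert(file_name && ext);
	if (!has_prefix(file_name, "multi-pack-index-") ||
	    !has_suffix(file_name, ext))
		return 0;
	for (i = 0; i < keep_nr; i++)
		if (!strcmp(keep[i], file_name))
			return 0;
	return 1;
}
```

Passing an empty keep list removes every matching file, which mirrors
calling the clearing function with no hash to keep.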

The tests explicitly exercising the new incremental MIDX feature are
relatively limited for two reasons:

  1. Most of the "interesting" behavior is already thoroughly covered in
     t5319-multi-pack-index.sh, which handles the core logic of reading
     objects through a MIDX.

     The new tests in t5334-incremental-multi-pack-index.sh are mostly
     focused on creating and destroying incremental MIDXs, as well as
     stitching their results together across layers.

  2. A new GIT_TEST environment variable is added called
     "GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL", which modifies the
     entire test suite to write incremental MIDXs after repacking when
     combined with the "GIT_TEST_MULTI_PACK_INDEX" variable.

     This exercises the long tail of other interesting behavior that is
     defined implicitly throughout the rest of the CI suite. It is
     likewise added to the linux-TEST-vars job.
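
How the two variables combine can be sketched as a pure function. This
is a simplified illustration: `sketch_env_bool()` is a hypothetical and
much cruder stand-in for Git's `git_env_bool()`, `midx_test_flags()` is
not a function in the patch, and only the bit value `1 << 5` is taken
from the midx.h hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same bit as MIDX_WRITE_INCREMENTAL in the midx.h hunk. */
#define SKETCH_MIDX_WRITE_INCREMENTAL (1u << 5)

/*
 * Simplified boolean parse (an assumption for this sketch): unset,
 * empty, "0" and "false" are false, everything else is true.
 */
static int sketch_env_bool(const char *value)
{
	return value && *value && strcmp(value, "0") &&
	       strcmp(value, "false");
}

/*
 * Returns 1 and stores the write flags in `*flags` when a MIDX should
 * be written after repacking; returns 0 (with `*flags` cleared) when
 * the first variable is not enabled.
 */
static int midx_test_flags(const char *midx_env,
			   const char *incremental_env,
			   unsigned *flags)
{
	assert(flags);
	*flags = 0;
	if (!sketch_env_bool(midx_env))
		return 0;
	if (sketch_env_bool(incremental_env))
		*flags |= SKETCH_MIDX_WRITE_INCREMENTAL;
	return 1;
}
```

A caller would pass `getenv("GIT_TEST_MULTI_PACK_INDEX")` and
`getenv("GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL")`. Note that the
second variable has no effect on its own.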

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Authored by Taylor Blau and committed by Junio C Hamano
fcb2205b 147c3f67

+459 -103
+10 -1
Documentation/git-multi-pack-index.txt
@@ -64,6 +64,12 @@
 duplicates. (If a given OID is given more than once, it is marked as
 preferred if at least one instance of it begins with the special `+`
 marker).
+
+	--incremental::
+		Write an incremental MIDX file containing only objects
+		and packs not present in an existing MIDX layer.
+		Migrates non-incremental MIDXs to incremental ones when
+		necessary. Incompatible with `--bitmap`.
 	--
 
 verify::
@@ -74,6 +80,8 @@
 	have no objects referenced by the MIDX (with the exception of
 	`.keep` packs and cruft packs). Rewrite the MIDX file afterward
 	to remove all references to these pack-files.
++
+NOTE: this mode is incompatible with incremental MIDX files.
 
 repack::
 	Create a new pack-file containing objects in small pack-files
@@ -95,7 +103,8 @@
 +
 If `repack.packKeptObjects` is `false`, then any pack-files with an
 associated `.keep` file will not be selected for the batch to repack.
-
++
+NOTE: this mode is incompatible with incremental MIDX files.
 
 EXAMPLES
 --------
+2
builtin/multi-pack-index.c
@@ -129,6 +129,8 @@
 			MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX),
 		OPT_BIT(0, "progress", &opts.flags,
 			N_("force progress reporting"), MIDX_PROGRESS),
+		OPT_BIT(0, "incremental", &opts.flags,
+			N_("write a new incremental MIDX"), MIDX_WRITE_INCREMENTAL),
 		OPT_BOOL(0, "stdin-packs", &opts.stdin_packs,
 			 N_("write multi-pack index containing only given indexes")),
 		OPT_FILENAME(0, "refs-snapshot", &opts.refs_snapshot,
+6 -2
builtin/repack.c
@@ -1514,8 +1514,12 @@
 	if (run_update_server_info)
 		update_server_info(0);
 
-	if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0))
-		write_midx_file(get_object_directory(), NULL, NULL, 0);
+	if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0)) {
+		unsigned flags = 0;
+		if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL, 0))
+			flags |= MIDX_WRITE_INCREMENTAL;
+		write_midx_file(get_object_directory(), NULL, NULL, flags);
+	}
 
 cleanup:
 	string_list_clear(&names, 1);
+1
ci/run-build-and-tests.sh
@@ -25,6 +25,7 @@
 	export GIT_TEST_COMMIT_GRAPH=1
 	export GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=1
 	export GIT_TEST_MULTI_PACK_INDEX=1
+	export GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL=1
 	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master
 	export GIT_TEST_NO_WRITE_REV_INDEX=1
 	export GIT_TEST_CHECKOUT_WORKERS=2
+267 -45
midx-write.c
@@ -17,6 +17,8 @@
 #include "refs.h"
 #include "revision.h"
 #include "list-objects.h"
+#include "path.h"
+#include "pack-revindex.h"
 
 #define PACK_EXPIRED UINT_MAX
 #define BITMAP_POS_UNKNOWN (~((uint32_t)0))
@@ -25,7 +27,11 @@
 
 extern int midx_checksum_valid(struct multi_pack_index *m);
 extern void clear_midx_files_ext(const char *object_dir, const char *ext,
-				 unsigned char *keep_hash);
+				 const char *keep_hash);
+extern void clear_incremental_midx_files_ext(const char *object_dir,
+					     const char *ext,
+					     const char **keep_hashes,
+					     uint32_t hashes_nr);
 extern int cmp_idx_or_pack_name(const char *idx_or_pack_name,
 				const char *idx_name);
 
@@ -86,6 +92,7 @@
 	size_t nr;
 	size_t alloc;
 	struct multi_pack_index *m;
+	struct multi_pack_index *base_midx;
 	struct progress *progress;
 	unsigned pack_paths_checked;
 
@@ -98,6 +105,9 @@
 	uint32_t num_large_offsets;
 
 	int preferred_pack_idx;
+
+	int incremental;
+	uint32_t num_multi_pack_indexes_before;
 
 	struct string_list *to_include;
 };
@@ -121,6 +131,9 @@
 		 * to_include before the existing MIDX).
 		 */
 		if (ctx->m && midx_contains_pack(ctx->m, file_name))
+			return 0;
+		else if (ctx->base_midx && midx_contains_pack(ctx->base_midx,
+							      file_name))
 			return 0;
 		else if (ctx->to_include &&
 			 !string_list_has_string(ctx->to_include, file_name))
@@ -338,7 +351,7 @@
 	for (cur_fanout = 0; cur_fanout < 256; cur_fanout++) {
 		fanout.nr = 0;
 
-		if (ctx->m)
+		if (ctx->m && !ctx->incremental)
 			midx_fanout_add_midx_fanout(&fanout, ctx->m, cur_fanout,
 						    ctx->preferred_pack_idx);
 
@@ -364,6 +377,10 @@
 			if (cur_object && oideq(&fanout.entries[cur_object - 1].oid,
 						&fanout.entries[cur_object].oid))
 				continue;
+			if (ctx->incremental && ctx->base_midx &&
+			    midx_has_oid(ctx->base_midx,
+					 &fanout.entries[cur_object].oid))
+				continue;
 
 			ALLOC_GROW(ctx->entries, st_add(ctx->entries_nr, 1),
 				   alloc_objects);
@@ -547,10 +564,16 @@
 			       void *data)
 {
 	struct write_midx_context *ctx = data;
-	uint32_t i;
+	uint32_t i, nr_base;
+
+	if (ctx->incremental && ctx->base_midx)
+		nr_base = ctx->base_midx->num_objects +
+			ctx->base_midx->num_objects_in_base;
+	else
+		nr_base = 0;
 
 	for (i = 0; i < ctx->entries_nr; i++)
-		hashwrite_be32(f, ctx->pack_order[i]);
+		hashwrite_be32(f, ctx->pack_order[i] + nr_base);
 
 	return 0;
 }
@@ -579,12 +602,18 @@
 static uint32_t *midx_pack_order(struct write_midx_context *ctx)
 {
 	struct midx_pack_order_data *data;
-	uint32_t *pack_order;
+	uint32_t *pack_order, base_objects = 0;
 	uint32_t i;
 
 	trace2_region_enter("midx", "midx_pack_order", the_repository);
 
+	if (ctx->incremental && ctx->base_midx)
+		base_objects = ctx->base_midx->num_objects +
+			ctx->base_midx->num_objects_in_base;
+
+	ALLOC_ARRAY(pack_order, ctx->entries_nr);
 	ALLOC_ARRAY(data, ctx->entries_nr);
+
 	for (i = 0; i < ctx->entries_nr; i++) {
 		struct pack_midx_entry *e = &ctx->entries[i];
 		data[i].nr = i;
@@ -596,12 +625,11 @@
 
 	QSORT(data, ctx->entries_nr, midx_pack_order_cmp);
 
-	ALLOC_ARRAY(pack_order, ctx->entries_nr);
 	for (i = 0; i < ctx->entries_nr; i++) {
 		struct pack_midx_entry *e = &ctx->entries[data[i].nr];
 		struct pack_info *pack = &ctx->info[ctx->pack_perm[e->pack_int_id]];
 		if (pack->bitmap_pos == BITMAP_POS_UNKNOWN)
-			pack->bitmap_pos = i;
+			pack->bitmap_pos = i + base_objects;
 		pack->bitmap_nr++;
 		pack_order[i] = data[i].nr;
 	}
@@ -649,7 +677,8 @@
 	prepare_packing_data(the_repository, pdata);
 
 	for (i = 0; i < ctx->entries_nr; i++) {
-		struct pack_midx_entry *from = &ctx->entries[ctx->pack_order[i]];
+		uint32_t pos = ctx->pack_order[i];
+		struct pack_midx_entry *from = &ctx->entries[pos];
 		struct object_entry *to = packlist_alloc(pdata, &from->oid);
 
 		oe_set_in_pack(pdata, to,
@@ -897,35 +926,128 @@
 static int fill_packs_from_midx(struct write_midx_context *ctx,
 				const char *preferred_pack_name, uint32_t flags)
 {
-	uint32_t i;
+	struct multi_pack_index *m;
+
+	for (m = ctx->m; m; m = m->base_midx) {
+		uint32_t i;
 
-	for (i = 0; i < ctx->m->num_packs; i++) {
-		ALLOC_GROW(ctx->info, ctx->nr + 1, ctx->alloc);
+		for (i = 0; i < m->num_packs; i++) {
+			ALLOC_GROW(ctx->info, ctx->nr + 1, ctx->alloc);
 
-		if (flags & MIDX_WRITE_REV_INDEX || preferred_pack_name) {
 			/*
 			 * If generating a reverse index, need to have
 			 * packed_git's loaded to compare their
 			 * mtimes and object count.
-			 *
 			 *
 			 * If a preferred pack is specified, need to
 			 * have packed_git's loaded to ensure the chosen
 			 * preferred pack has a non-zero object count.
 			 */
-			if (prepare_midx_pack(the_repository, ctx->m, i))
-				return error(_("could not load pack"));
+			if (flags & MIDX_WRITE_REV_INDEX ||
+			    preferred_pack_name) {
+				if (prepare_midx_pack(the_repository, m,
+						      m->num_packs_in_base + i)) {
+					error(_("could not load pack"));
+					return 1;
+				}
 
-			if (open_pack_index(ctx->m->packs[i]))
-				die(_("could not open index for %s"),
-				    ctx->m->packs[i]->pack_name);
+				if (open_pack_index(m->packs[i]))
+					die(_("could not open index for %s"),
+					    m->packs[i]->pack_name);
+			}
+
+			fill_pack_info(&ctx->info[ctx->nr++], m->packs[i],
+				       m->pack_names[i],
+				       m->num_packs_in_base + i);
 		}
+	}
+	return 0;
+}
 
-		fill_pack_info(&ctx->info[ctx->nr++], ctx->m->packs[i],
-			       ctx->m->pack_names[i], i);
+static struct {
+	const char *non_split;
+	const char *split;
+} midx_exts[] = {
+	{NULL, MIDX_EXT_MIDX},
+	{MIDX_EXT_BITMAP, MIDX_EXT_BITMAP},
+	{MIDX_EXT_REV, MIDX_EXT_REV},
+};
+
+static int link_midx_to_chain(struct multi_pack_index *m)
+{
+	struct strbuf from = STRBUF_INIT;
+	struct strbuf to = STRBUF_INIT;
+	int ret = 0;
+	size_t i;
+
+	if (!m || m->has_chain) {
+		/*
+		 * Either no MIDX previously existed, or it was already
+		 * part of a MIDX chain. In both cases, we have nothing
+		 * to link, so return early.
+		 */
+		goto done;
 	}
 
-	return 0;
+	for (i = 0; i < ARRAY_SIZE(midx_exts); i++) {
+		const unsigned char *hash = get_midx_checksum(m);
+
+		get_midx_filename_ext(&from, m->object_dir, hash,
+				      midx_exts[i].non_split);
+		get_split_midx_filename_ext(&to, m->object_dir, hash,
+					    midx_exts[i].split);
+
+		if (link(from.buf, to.buf) < 0 && errno != ENOENT) {
+			ret = error_errno(_("unable to link '%s' to '%s'"),
+					  from.buf, to.buf);
+			goto done;
+		}
+
+		strbuf_reset(&from);
+		strbuf_reset(&to);
+	}
+
+done:
+	strbuf_release(&from);
+	strbuf_release(&to);
+	return ret;
+}
+
+static void clear_midx_files(const char *object_dir,
+			     const char **hashes,
+			     uint32_t hashes_nr,
+			     unsigned incremental)
+{
+	/*
+	 * if incremental:
+	 *   - remove all non-incremental MIDX files
+	 *   - remove any incremental MIDX files not in the current one
+	 *
+	 * if non-incremental:
+	 *   - remove all incremental MIDX files
+	 *   - remove any non-incremental MIDX files not matching the current
+	 *     hash
+	 */
+	struct strbuf buf = STRBUF_INIT;
+	const char *exts[] = { MIDX_EXT_BITMAP, MIDX_EXT_REV, MIDX_EXT_MIDX };
+	uint32_t i, j;
+
+	for (i = 0; i < ARRAY_SIZE(exts); i++) {
+		clear_incremental_midx_files_ext(object_dir, exts[i],
+						 hashes, hashes_nr);
+		for (j = 0; j < hashes_nr; j++)
+			clear_midx_files_ext(object_dir, exts[i], hashes[j]);
+	}
+
+	if (incremental)
+		get_midx_filename(&buf, object_dir);
+	else
+		get_midx_chain_filename(&buf, object_dir);
+
+	if (unlink(buf.buf) && errno != ENOENT)
+		die_errno(_("failed to clear multi-pack-index at %s"), buf.buf);
+
+	strbuf_release(&buf);
 }
 
 static int write_midx_internal(const char *object_dir,
@@ -940,42 +1062,66 @@
 	uint32_t i, start_pack;
 	struct hashfile *f = NULL;
 	struct lock_file lk;
+	struct tempfile *incr;
 	struct write_midx_context ctx = { 0 };
 	int bitmapped_packs_concat_len = 0;
 	int pack_name_concat_len = 0;
 	int dropped_packs = 0;
 	int result = 0;
+	const char **keep_hashes = NULL;
 	struct chunkfile *cf;
 
 	trace2_region_enter("midx", "write_midx_internal", the_repository);
 
-	get_midx_filename(&midx_name, object_dir);
+	ctx.incremental = !!(flags & MIDX_WRITE_INCREMENTAL);
+	if (ctx.incremental && (flags & MIDX_WRITE_BITMAP))
+		die(_("cannot write incremental MIDX with bitmap"));
+
+	if (ctx.incremental)
+		strbuf_addf(&midx_name,
+			    "%s/pack/multi-pack-index.d/tmp_midx_XXXXXX",
+			    object_dir);
+	else
+		get_midx_filename(&midx_name, object_dir);
 	if (safe_create_leading_directories(midx_name.buf))
 		die_errno(_("unable to create leading directories of %s"),
 			  midx_name.buf);
 
-	if (!packs_to_include) {
-		/*
-		 * Only reference an existing MIDX when not filtering which
-		 * packs to include, since all packs and objects are copied
-		 * blindly from an existing MIDX if one is present.
-		 */
-		ctx.m = lookup_multi_pack_index(the_repository, object_dir);
-	}
+	if (!packs_to_include || ctx.incremental) {
+		struct multi_pack_index *m = lookup_multi_pack_index(the_repository,
+								     object_dir);
+		if (m && !midx_checksum_valid(m)) {
+			warning(_("ignoring existing multi-pack-index; checksum mismatch"));
+			m = NULL;
+		}
 
-	if (ctx.m && !midx_checksum_valid(ctx.m)) {
-		warning(_("ignoring existing multi-pack-index; checksum mismatch"));
-		ctx.m = NULL;
+		if (m) {
+			/*
+			 * Only reference an existing MIDX when not filtering
+			 * which packs to include, since all packs and objects
+			 * are copied blindly from an existing MIDX if one is
+			 * present.
+			 */
+			if (ctx.incremental)
+				ctx.base_midx = m;
+			else if (!packs_to_include)
+				ctx.m = m;
+		}
 	}
 
 	ctx.nr = 0;
-	ctx.alloc = ctx.m ? ctx.m->num_packs : 16;
+	ctx.alloc = ctx.m ? ctx.m->num_packs + ctx.m->num_packs_in_base : 16;
 	ctx.info = NULL;
 	ALLOC_ARRAY(ctx.info, ctx.alloc);
 
-	if (ctx.m && fill_packs_from_midx(&ctx, preferred_pack_name,
-					  flags) < 0) {
-		result = 1;
+	if (ctx.incremental) {
+		struct multi_pack_index *m = ctx.base_midx;
+		while (m) {
+			ctx.num_multi_pack_indexes_before++;
+			m = m->base_midx;
+		}
+	} else if (ctx.m && fill_packs_from_midx(&ctx, preferred_pack_name,
+						 flags) < 0) {
 		goto cleanup;
 	}
 
@@ -992,7 +1138,8 @@
 	for_each_file_in_pack_dir(object_dir, add_pack_to_midx, &ctx);
 	stop_progress(&ctx.progress);
 
-	if ((ctx.m && ctx.nr == ctx.m->num_packs) &&
+	if ((ctx.m && ctx.nr == ctx.m->num_packs + ctx.m->num_packs_in_base) &&
+	    !ctx.incremental &&
 	    !(packs_to_include || packs_to_drop)) {
 		struct bitmap_index *bitmap_git;
 		int bitmap_exists;
@@ -1008,11 +1155,13 @@
 			 * corresponding bitmap (or one wasn't requested).
 			 */
 			if (!want_bitmap)
-				clear_midx_files_ext(object_dir, ".bitmap",
-						     NULL);
+				clear_midx_files_ext(object_dir, "bitmap", NULL);
 			goto cleanup;
 		}
 	}
+
+	if (ctx.incremental && !ctx.nr)
+		goto cleanup; /* nothing to do */
 
 	if (preferred_pack_name) {
 		ctx.preferred_pack_idx = -1;
@@ -1159,8 +1308,30 @@
 		pack_name_concat_len += MIDX_CHUNK_ALIGNMENT -
 					(pack_name_concat_len % MIDX_CHUNK_ALIGNMENT);
 
-	hold_lock_file_for_update(&lk, midx_name.buf, LOCK_DIE_ON_ERROR);
-	f = hashfd(get_lock_file_fd(&lk), get_lock_file_path(&lk));
+	if (ctx.incremental) {
+		struct strbuf lock_name = STRBUF_INIT;
+
+		get_midx_chain_filename(&lock_name, object_dir);
+		hold_lock_file_for_update(&lk, lock_name.buf, LOCK_DIE_ON_ERROR);
+		strbuf_release(&lock_name);
+
+		incr = mks_tempfile_m(midx_name.buf, 0444);
+		if (!incr) {
+			error(_("unable to create temporary MIDX layer"));
+			return -1;
+		}
+
+		if (adjust_shared_perm(get_tempfile_path(incr))) {
+			error(_("unable to adjust shared permissions for '%s'"),
+			      get_tempfile_path(incr));
+			return -1;
+		}
+
+		f = hashfd(get_tempfile_fd(incr), get_tempfile_path(incr));
+	} else {
+		hold_lock_file_for_update(&lk, midx_name.buf, LOCK_DIE_ON_ERROR);
+		f = hashfd(get_lock_file_fd(&lk), get_lock_file_path(&lk));
+	}
 
 	if (ctx.nr - dropped_packs == 0) {
 		error(_("no pack files to index."));
@@ -1253,14 +1424,55 @@
 	 * have been freed in the previous if block.
 	 */
 
-	if (ctx.m)
+	CALLOC_ARRAY(keep_hashes, ctx.num_multi_pack_indexes_before + 1);
+
+	if (ctx.incremental) {
+		FILE *chainf = fdopen_lock_file(&lk, "w");
+		struct strbuf final_midx_name = STRBUF_INIT;
+		struct multi_pack_index *m = ctx.base_midx;
+
+		if (!chainf) {
+			error_errno(_("unable to open multi-pack-index chain file"));
+			return -1;
+		}
+
+		if (link_midx_to_chain(ctx.base_midx) < 0)
+			return -1;
+
+		get_split_midx_filename_ext(&final_midx_name, object_dir,
+					    midx_hash, MIDX_EXT_MIDX);
+
+		if (rename_tempfile(&incr, final_midx_name.buf) < 0) {
+			error_errno(_("unable to rename new multi-pack-index layer"));
+			return -1;
+		}
+
+		keep_hashes[ctx.num_multi_pack_indexes_before] =
+			xstrdup(hash_to_hex(midx_hash));
+
+		for (i = 0; i < ctx.num_multi_pack_indexes_before; i++) {
+			uint32_t j = ctx.num_multi_pack_indexes_before - i - 1;
+
+			keep_hashes[j] = xstrdup(hash_to_hex(get_midx_checksum(m)));
+			m = m->base_midx;
+		}
+
+		for (i = 0; i < ctx.num_multi_pack_indexes_before + 1; i++)
+			fprintf(get_lock_file_fp(&lk), "%s\n", keep_hashes[i]);
+	} else {
+		keep_hashes[ctx.num_multi_pack_indexes_before] =
+			xstrdup(hash_to_hex(midx_hash));
+	}
+
+	if (ctx.m || ctx.base_midx)
 		close_object_store(the_repository->objects);
 
 	if (commit_lock_file(&lk) < 0)
 		die_errno(_("could not write multi-pack-index"));
 
-	clear_midx_files_ext(object_dir, ".bitmap", midx_hash);
-	clear_midx_files_ext(object_dir, ".rev", midx_hash);
+	clear_midx_files(object_dir, keep_hashes,
+			 ctx.num_multi_pack_indexes_before + 1,
+			 ctx.incremental);
 
 cleanup:
 	for (i = 0; i < ctx.nr; i++) {
@@ -1275,6 +1487,11 @@
 	free(ctx.entries);
 	free(ctx.pack_perm);
 	free(ctx.pack_order);
+	if (keep_hashes) {
+		for (i = 0; i < ctx.num_multi_pack_indexes_before + 1; i++)
+			free((char *)keep_hashes[i]);
+		free(keep_hashes);
+	}
 	strbuf_release(&midx_name);
 
 	trace2_region_leave("midx", "write_midx_internal", the_repository);
@@ -1310,6 +1527,9 @@
 
 	if (!m)
 		return 0;
+
+	if (m->base_midx)
+		die(_("cannot expire packs from an incremental multi-pack-index"));
 
 	CALLOC_ARRAY(count, m->num_packs);
 
@@ -1485,6 +1705,8 @@
 
 	if (!m)
 		return 0;
+	if (m->base_midx)
+		die(_("cannot repack an incremental multi-pack-index"));
 
 	CALLOC_ARRAY(include_pack, m->num_packs);
 
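
The order in which `write_midx_internal()` fills `keep_hashes` (and thus
the lines of the chain file) can be sketched on its own: existing layers
are walked from newest to oldest and stored back to front, and the new
layer goes last. In this sketch `struct layer`, `count_layers()` and
`chain_order()` are hypothetical stand-ins, and hashes are plain strings
rather than checksums read from a MIDX.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct multi_pack_index. */
struct layer {
	const char *hash;         /* hex checksum of this layer */
	const struct layer *base; /* next-older layer, or NULL */
};

/* Count the layers reachable from `tip`, walking newest to oldest. */
static size_t count_layers(const struct layer *tip)
{
	size_t nr = 0;

	for (; tip; tip = tip->base)
		nr++;
	return nr;
}

/*
 * Fill `out`, which must have room for count_layers(tip) + 1 entries,
 * with the hashes in chain-file order: oldest layer first, the newly
 * written layer last. Returns the number of entries written.
 */
static size_t chain_order(const struct layer *tip, const char *new_hash,
			  const char **out)
{
	size_t nr_before = count_layers(tip);
	size_t i;

	assert(new_hash && out);
	out[nr_before] = new_hash;
	for (i = 0; i < nr_before; i++) {
		out[nr_before - i - 1] = tip->hash;
		tip = tip->base;
	}
	return nr_before + 1;
}
```

With no existing layers the result is a single entry, which matches the
non-incremental branch keeping only the hash of the MIDX just written.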
+51 -11
midx.c
@@ -16,7 +16,10 @@
 
 int midx_checksum_valid(struct multi_pack_index *m);
 void clear_midx_files_ext(const char *object_dir, const char *ext,
-			  unsigned char *keep_hash);
+			  const char *keep_hash);
+void clear_incremental_midx_files_ext(const char *object_dir, const char *ext,
+				      char **keep_hashes,
+				      uint32_t hashes_nr);
 int cmp_idx_or_pack_name(const char *idx_or_pack_name,
 			 const char *idx_name);
 
@@ -521,6 +524,11 @@
 	return 0;
 }
 
+int midx_has_oid(struct multi_pack_index *m, const struct object_id *oid)
+{
+	return bsearch_midx(oid, m, NULL);
+}
+
 struct object_id *nth_midxed_object_oid(struct object_id *oid,
 					struct multi_pack_index *m,
 					uint32_t n)
@@ -723,7 +731,8 @@
 }
 
 struct clear_midx_data {
-	char *keep;
+	char **keep;
+	uint32_t keep_nr;
 	const char *ext;
 };
 
@@ -731,32 +740,63 @@
 				const char *file_name, void *_data)
 {
 	struct clear_midx_data *data = _data;
+	uint32_t i;
 
 	if (!(starts_with(file_name, "multi-pack-index-") &&
 	      ends_with(file_name, data->ext)))
 		return;
-	if (data->keep && !strcmp(data->keep, file_name))
-		return;
-
+	for (i = 0; i < data->keep_nr; i++) {
+		if (!strcmp(data->keep[i], file_name))
+			return;
+	}
 	if (unlink(full_path))
 		die_errno(_("failed to remove %s"), full_path);
 }
 
 void clear_midx_files_ext(const char *object_dir, const char *ext,
-			  unsigned char *keep_hash)
+			  const char *keep_hash)
 {
 	struct clear_midx_data data;
 	memset(&data, 0, sizeof(struct clear_midx_data));
 
-	if (keep_hash)
-		data.keep = xstrfmt("multi-pack-index-%s%s",
-				    hash_to_hex(keep_hash), ext);
+	if (keep_hash) {
+		ALLOC_ARRAY(data.keep, 1);
+
+		data.keep[0] = xstrfmt("multi-pack-index-%s.%s", keep_hash, ext);
+		data.keep_nr = 1;
+	}
 	data.ext = ext;
 
 	for_each_file_in_pack_dir(object_dir,
 				  clear_midx_file_ext,
 				  &data);
 
+	if (keep_hash)
+		free(data.keep[0]);
+	free(data.keep);
+}
+
+void clear_incremental_midx_files_ext(const char *object_dir, const char *ext,
+				      char **keep_hashes,
+				      uint32_t hashes_nr)
+{
+	struct clear_midx_data data;
+	uint32_t i;
+
+	memset(&data, 0, sizeof(struct clear_midx_data));
+
+	ALLOC_ARRAY(data.keep, hashes_nr);
+	for (i = 0; i < hashes_nr; i++)
+		data.keep[i] = xstrfmt("multi-pack-index-%s.%s", keep_hashes[i],
+				       ext);
+	data.keep_nr = hashes_nr;
+	data.ext = ext;
+
+	for_each_file_in_pack_subdir(object_dir, "multi-pack-index.d",
+				     clear_midx_file_ext, &data);
+
+	for (i = 0; i < hashes_nr; i++)
+		free(data.keep[i]);
 	free(data.keep);
 }
 
@@ -774,8 +814,8 @@
 	if (remove_path(midx.buf))
 		die(_("failed to clear multi-pack-index at %s"), midx.buf);
 
-	clear_midx_files_ext(r->objects->odb->path, ".bitmap", NULL);
-	clear_midx_files_ext(r->objects->odb->path, ".rev", NULL);
+	clear_midx_files_ext(r->objects->odb->path, MIDX_EXT_BITMAP, NULL);
+	clear_midx_files_ext(r->objects->odb->path, MIDX_EXT_REV, NULL);
 
 	strbuf_release(&midx);
 }
+4
midx.h
@@ -29,6 +29,8 @@
 #define MIDX_LARGE_OFFSET_NEEDED 0x80000000
 
 #define GIT_TEST_MULTI_PACK_INDEX "GIT_TEST_MULTI_PACK_INDEX"
+#define GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL \
+	"GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL"
 
 struct multi_pack_index {
 	struct multi_pack_index *next;
@@ -77,6 +79,7 @@
 #define MIDX_WRITE_BITMAP (1 << 2)
 #define MIDX_WRITE_BITMAP_HASH_CACHE (1 << 3)
 #define MIDX_WRITE_BITMAP_LOOKUP_TABLE (1 << 4)
+#define MIDX_WRITE_INCREMENTAL (1 << 5)
 
 #define MIDX_EXT_REV "rev"
 #define MIDX_EXT_BITMAP "bitmap"
@@ -101,6 +104,7 @@
 		 uint32_t *result);
 int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m,
 		 uint32_t *result);
+int midx_has_oid(struct multi_pack_index *m, const struct object_id *oid);
 off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos);
 uint32_t nth_midxed_pack_int_id(struct multi_pack_index *m, uint32_t pos);
 struct object_id *nth_midxed_object_oid(struct object_id *oid,
+13 -3
packfile.c
@@ -815,9 +815,10 @@
 	report_helper(list, seen_bits, first, list->nr);
 }
 
-void for_each_file_in_pack_dir(const char *objdir,
-			       each_file_in_pack_dir_fn fn,
-			       void *data)
+void for_each_file_in_pack_subdir(const char *objdir,
+				  const char *subdir,
+				  each_file_in_pack_dir_fn fn,
+				  void *data)
 {
 	struct strbuf path = STRBUF_INIT;
 	size_t dirnamelen;
@@ -826,6 +827,8 @@
 
 	strbuf_addstr(&path, objdir);
 	strbuf_addstr(&path, "/pack");
+	if (subdir)
+		strbuf_addf(&path, "/%s", subdir);
 	dir = opendir(path.buf);
 	if (!dir) {
 		if (errno != ENOENT)
@@ -845,6 +848,13 @@
 
 	closedir(dir);
 	strbuf_release(&path);
+}
+
+void for_each_file_in_pack_dir(const char *objdir,
+			       each_file_in_pack_dir_fn fn,
+			       void *data)
+{
+	for_each_file_in_pack_subdir(objdir, NULL, fn, data);
 }
 
 struct prepare_pack_data {
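
The directory that the refactored walker opens can be sketched with a
fixed-size buffer in place of Git's strbuf. `pack_subdir_path()` is a
hypothetical helper for illustration only. It mirrors the path
construction in the hunk above: "<objdir>/pack", plus "/<subdir>" when a
subdirectory is given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "<objdir>/pack" or "<objdir>/pack/<subdir>" into `buf`.
 * A NULL `subdir` selects the pack directory itself. Returns 0 on
 * success, or -1 if `buf` (of `size` bytes) is too small.
 */
static int pack_subdir_path(char *buf, size_t size,
			    const char *objdir, const char *subdir)
{
	size_t need;

	assert(buf && objdir);
	need = strlen(objdir) + strlen("/pack") + 1;
	if (subdir)
		need += 1 + strlen(subdir);
	if (need > size)
		return -1;

	if (subdir)
		snprintf(buf, size, "%s/pack/%s", objdir, subdir);
	else
		snprintf(buf, size, "%s/pack", objdir);
	return 0;
}
```

Keeping the NULL case as the plain pack directory is what lets the old
entry point become a one-line wrapper around the new one.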
+4
packfile.h
@@ -55,6 +55,10 @@
 
 typedef void each_file_in_pack_dir_fn(const char *full_path, size_t full_path_len,
 				      const char *file_name, void *data);
+void for_each_file_in_pack_subdir(const char *objdir,
+				  const char *subdir,
+				  each_file_in_pack_dir_fn fn,
+				  void *data);
 void for_each_file_in_pack_dir(const char *objdir,
 			       each_file_in_pack_dir_fn fn,
 			       void *data);
+4
t/README
@@ -469,6 +469,10 @@
 index to be written after every 'git repack' command, and overrides the
 'core.multiPackIndex' setting to true.
 
+GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL=<boolean>, when true, sets
+the '--incremental' option on all invocations of 'git multi-pack-index
+write'.
+
 GIT_TEST_SIDEBAND_ALL=<boolean>, when true, overrides the
 'uploadpack.allowSidebandAll' setting to true, and when false, forces
 fetch-pack to not request sideband-all (even if the server advertises
+2 -4
t/lib-bitmap.sh
@@ -1,6 +1,8 @@
 # Helpers for scripts testing bitmap functionality; see t5310 for
 # example usage.
 
+. "$TEST_DIRECTORY"/lib-midx.sh
+
 objdir=.git/objects
 midx=$objdir/pack/multi-pack-index
 
@@ -262,10 +264,6 @@
 	echo $2 >expect &&
 	echo $1 | git cat-file --batch-check="%(deltabase)" >actual &&
 	test_cmp expect actual
-}
-
-midx_checksum () {
-	test-tool read-midx --checksum "$1"
 }
 
 # midx_pack_source <obj>
+28
t/lib-midx.sh
@@ -6,3 +6,31 @@
 	test_cmp expect actual &&
 	git multi-pack-index --object-dir=$1 verify
 }
+
+midx_checksum () {
+	test-tool read-midx --checksum "$1"
+}
+
+midx_git_two_modes () {
+	git -c core.multiPackIndex=false $1 >expect &&
+	git -c core.multiPackIndex=true $1 >actual &&
+	if [ "$2" = "sorted" ]
+	then
+		sort <expect >expect.sorted &&
+		mv expect.sorted expect &&
+		sort <actual >actual.sorted &&
+		mv actual.sorted actual
+	fi &&
+	test_cmp expect actual
+}
+
+compare_results_with_midx () {
+	MSG=$1
+	test_expect_success "check normal git operations: $MSG" '
+		midx_git_two_modes "rev-list --objects --all" &&
+		midx_git_two_modes "log --raw" &&
+		midx_git_two_modes "count-objects --verbose" &&
+		midx_git_two_modes "cat-file --batch-all-objects --batch-check" &&
+		midx_git_two_modes "cat-file --batch-all-objects --batch-check --unordered" sorted
+	'
+}
+3 -24
t/t5319-multi-pack-index.sh
@@ -3,8 +3,11 @@
 test_description='multi-pack-indexes'
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-chunk.sh
+. "$TEST_DIRECTORY"/lib-midx.sh
 
 GIT_TEST_MULTI_PACK_INDEX=0
+GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0
+GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL=0
 objdir=.git/objects
 
 HASH_LEN=$(test_oid rawsz)
@@ -106,30 +109,6 @@
 	git multi-pack-index --object-dir=$objdir write &&
 	midx_read_expect 1 18 4 $objdir
 '
-
-midx_git_two_modes () {
-	git -c core.multiPackIndex=false $1 >expect &&
-	git -c core.multiPackIndex=true $1 >actual &&
-	if [ "$2" = "sorted" ]
-	then
-		sort <expect >expect.sorted &&
-		mv expect.sorted expect &&
-		sort <actual >actual.sorted &&
-		mv actual.sorted actual
-	fi &&
-	test_cmp expect actual
-}
-
-compare_results_with_midx () {
-	MSG=$1
-	test_expect_success "check normal git operations: $MSG" '
-		midx_git_two_modes "rev-list --objects --all" &&
-		midx_git_two_modes "log --raw" &&
-		midx_git_two_modes "count-objects --verbose" &&
-		midx_git_two_modes "cat-file --batch-all-objects --batch-check" &&
-		midx_git_two_modes "cat-file --batch-all-objects --batch-check --unordered" sorted
-	'
-}
 
 test_expect_success 'write midx with one v2 pack' '
 	git pack-objects --index-version=2,0x40 $objdir/pack/test <obj-list &&
+1
t/t5326-multi-pack-bitmaps.sh
@@ -7,6 +7,7 @@
 # We'll be writing our own MIDX, so avoid getting confused by the
 # automatic ones.
 GIT_TEST_MULTI_PACK_INDEX=0
+GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL=0
 
 # This test exercise multi-pack bitmap functionality where the object order is
 # stored and read from a special chunk within the MIDX, so use the default
+1
t/t5327-multi-pack-bitmaps-rev.sh
@@ -8,6 +8,7 @@
 # We'll be writing our own MIDX, so avoid getting confused by the automatic
 # ones.
 GIT_TEST_MULTI_PACK_INDEX=0
+GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL=0
 
 # Unlike t5326, this test exercise multi-pack bitmap functionality where the
 # object order is stored in a separate .rev file.
+2
t/t5332-multi-pack-reuse.sh
@@ -6,6 +6,8 @@
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-bitmap.sh
 
+GIT_TEST_MULTI_PACK_INDEX=0
+GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL=0
 objdir=.git/objects
 packdir=$objdir/pack
 
+46
t/t5334-incremental-multi-pack-index.sh
@@ -0,0 +1,46 @@
+#!/bin/sh
+
+test_description='incremental multi-pack-index'
+. ./test-lib.sh
+. "$TEST_DIRECTORY"/lib-midx.sh
+
+GIT_TEST_MULTI_PACK_INDEX=0
+export GIT_TEST_MULTI_PACK_INDEX
+
+objdir=.git/objects
+packdir=$objdir/pack
+midxdir=$packdir/multi-pack-index.d
+midx_chain=$midxdir/multi-pack-index-chain
+
+test_expect_success 'convert non-incremental MIDX to incremental' '
+	test_commit base &&
+	git repack -ad &&
+	git multi-pack-index write &&
+
+	test_path_is_file $packdir/multi-pack-index &&
+	old_hash="$(midx_checksum $objdir)" &&
+
+	test_commit other &&
+	git repack -d &&
+	git multi-pack-index write --incremental &&
+
+	test_path_is_missing $packdir/multi-pack-index &&
+	test_path_is_file $midx_chain &&
+	test_line_count = 2 $midx_chain &&
+	grep $old_hash $midx_chain
+'
+
+compare_results_with_midx 'incremental MIDX'
+
+test_expect_success 'convert incremental to non-incremental' '
+	test_commit squash &&
+	git repack -d &&
+	git multi-pack-index write &&
+
+	test_path_is_file $packdir/multi-pack-index &&
+	test_dir_is_empty $midxdir
+'
+
+compare_results_with_midx 'non-incremental MIDX conversion'
+
+test_done
+14 -13
t/t7700-repack.sh
···
 . "${TEST_DIRECTORY}/lib-midx.sh"
 . "${TEST_DIRECTORY}/lib-terminal.sh"
 
+GIT_TEST_MULTI_PACK_INDEX=0
+GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL=0
+
 commit_and_pack () {
 	test_commit "$@" 1>&2 &&
 	incrpackid=$(git pack-objects --all --unpacked --incremental .git/objects/pack/pack </dev/null) &&
···
 	(
 		cd member &&
 		test_commit "object" &&
-		GIT_TEST_MULTI_PACK_INDEX=0 git repack -Adl --write-bitmap-index 2>err &&
+		git repack -Adl --write-bitmap-index 2>err &&
 		cat >expect <<-EOF &&
 		warning: disabling bitmap writing, as some objects are not being packed
 		EOF
···
 test_expect_success '--write-midx unchanged' '
 	(
 		cd midx &&
-		GIT_TEST_MULTI_PACK_INDEX=0 git repack &&
+		git repack &&
 		test_path_is_missing $midx &&
 		test_path_is_missing $midx-*.bitmap &&
 
-		GIT_TEST_MULTI_PACK_INDEX=0 git repack --write-midx &&
+		git repack --write-midx &&
 
 		test_path_is_file $midx &&
 		test_path_is_missing $midx-*.bitmap &&
···
 		cd midx &&
 		test_commit loose &&
 
-		GIT_TEST_MULTI_PACK_INDEX=0 git repack --write-midx &&
+		git repack --write-midx &&
 
 		test_path_is_file $midx &&
 		test_path_is_missing $midx-*.bitmap &&
···
 test_expect_success '--write-midx with -b' '
 	(
 		cd midx &&
-		GIT_TEST_MULTI_PACK_INDEX=0 git repack -mb &&
+		git repack -mb &&
 
 		test_path_is_file $midx &&
 		test_path_is_file $midx-*.bitmap &&
···
 		cd midx &&
 		test_commit repack &&
 
-		GIT_TEST_MULTI_PACK_INDEX=0 git repack -Ad --write-midx &&
+		git repack -Ad --write-midx &&
 
 		test_path_is_file $midx &&
 		test_path_is_missing $midx-*.bitmap &&
···
 		cd midx &&
 
 		test_commit repack-2 &&
-		GIT_TEST_MULTI_PACK_INDEX=0 git repack -Adb --write-midx &&
+		git repack -Adb --write-midx &&
 
 		checksum=$(midx_checksum $objdir) &&
 		test_path_is_file $midx &&
 		test_path_is_file $midx-$checksum.bitmap &&
 
 		test_commit repack-3 &&
-		GIT_TEST_MULTI_PACK_INDEX=0 git repack -Adb --write-midx &&
+		git repack -Adb --write-midx &&
 
 		test_path_is_file $midx &&
 		test_path_is_missing $midx-$checksum.bitmap &&
 		test_path_is_file $midx-$(midx_checksum $objdir).bitmap &&
 
 		test_commit repack-4 &&
-		GIT_TEST_MULTI_PACK_INDEX=0 git repack -Adb &&
+		git repack -Adb &&
 
 		find $objdir/pack -type f -name "multi-pack-index*" >files &&
 		test_must_be_empty files
···
 		git log --format="create refs/tags/%s/%s %H" HEAD >refs &&
 		git update-ref --stdin <refs &&
 
-		GIT_TEST_MULTI_PACK_INDEX=0 \
 		git repack --write-midx --write-bitmap-index &&
 		test_path_is_file $midx &&
 		test_path_is_file $midx-$(midx_checksum $objdir).bitmap &&
···
 	(
 		cd repo &&
 		test_commit base &&
-		GIT_TEST_MULTI_PACK_INDEX=0 git repack -Ab &&
+		git repack -Ab &&
 
 		pack_bitmap=$(ls $objdir/pack/pack-*.bitmap) &&
 		test_path_is_file "$pack_bitmap" &&
 
 		test_commit tip &&
-		GIT_TEST_MULTI_PACK_INDEX=0 git repack -bm &&
+		git repack -bm &&
 
 		test_path_is_file $midx &&
 		test_path_is_file $midx-$(midx_checksum $objdir).bitmap &&
···
 		keep="$objdir/pack/pack-$one.keep" &&
 		touch "$keep" &&
 
-		GIT_TEST_MULTI_PACK_INDEX=0 \
 		git repack --write-midx --write-bitmap-index --geometric=2 -d \
 			--pack-kept-objects &&
 