Git fork

fsck: document and test sorted skipList input

Ever since the skipList support was first added in cd94c6f91 ("fsck:
git receive-pack: support excluding objects from fsck'ing",
2015-06-22), the documentation for the format has said that the file
is a sorted list of object names.

Thus, anyone using the feature would have thought the list needed to
be sorted. For example, in conjunction with my fetch.fsck.*
implementation in 1362df0d41 ("fetch: implement fetch.fsck.*",
2018-07-27), I recently wrote some code to ship a skipList and went
out of my way to sort it.

Doing so seems intuitive: the file contains fixed-width records and
has no support for comments, so one might expect it to be binary
searched in-place on-disk.

However, as documented here, this was never a requirement, so let's
change the documentation. Since this changes the documented file
format, let's also document what was said about this in the past, so
that someone like myself reading the new docs can see the list never
needed to be sorted ("why do I have all this code to sort this
thing...").

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Authored by Ævar Arnfjörð Bjarmason and committed by Junio C Hamano
58dc440b 536a9ce8

+28 -1
+9 -1
Documentation/config.txt
@@ -1710,7 +1710,7 @@
 	will only cause git to warn.
 
 fsck.skipList::
-	The path to a sorted list of object names (i.e. one SHA-1 per
+	The path to a list of object names (i.e. one SHA-1 per
 	line) that are known to be broken in a non-fatal way and should
 	be ignored. This feature is useful when an established project
 	should be accepted despite early commits containing errors that
@@ -1725,6 +1725,14 @@
 fall back on the `fsck.skipList` configuration if they aren't set. To
 uniformly configure the same fsck settings in different circumstances
 all three of them they must all set to the same values.
++
+Older versions of Git (before 2.20) documented that the object names
+list should be sorted. This was never a requirement, the object names
+can appear in any order, but when reading the list we track whether
+the list is sorted for the purposes of an internal binary search
+implementation, which can save itself some work with an already sorted
+list. Unless you have a humongous list there's no reason to go out of
+your way to pre-sort the list.
 
 gc.aggressiveDepth::
 	The depth parameter used in the delta compression
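
The documentation paragraph added above says that the reader tracks whether the list is already sorted, so that a later binary search can skip some work. The following shell sketch illustrates that idea only; it is not Git's implementation (which is written in C), and the helper names `list_is_sorted` and `sorted_view` as well as the temporary file names are hypothetical.

```shell
#!/bin/sh
# Illustrative only: hypothetical helpers, not Git's C implementation.

# Succeed if the file is already in byte-wise sorted order.
list_is_sorted () {
	LC_ALL=C sort -c "$1" 2>/dev/null
}

# Print the list in sorted order, paying for a sort only when needed.
sorted_view () {
	if list_is_sorted "$1"
	then
		cat "$1"
	else
		LC_ALL=C sort "$1"
	fi
}

# Demo: a sorted and an unsorted copy of the same names give the same view.
demo=${TMPDIR:-/tmp}/skiplist-demo.$$
printf '%s\n' \
	0000000000000000000000000000000000000004 \
	0000000000000000000000000000000000000002 \
	0000000000000000000000000000000000000001 >"$demo.unsorted"
LC_ALL=C sort "$demo.unsorted" >"$demo.sorted"

sorted_view "$demo.unsorted" >"$demo.a"
sorted_view "$demo.sorted" >"$demo.b"
cmp -s "$demo.a" "$demo.b" && echo "same result for either input order"
rm -f "$demo.unsorted" "$demo.sorted" "$demo.a" "$demo.b"
```

Either input produces the same sorted view; the pre-sorted file merely avoids the extra sort, which matches the advice that pre-sorting is only worth it for very large lists.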
+19
t/t5504-fetch-receive-strict.sh
@@ -142,6 +142,25 @@
 	test_i18ngrep "missingEmail" err
 '
 
+test_expect_success 'setup sorted and unsorted skipLists' '
+	cat >SKIP.unsorted <<-EOF &&
+	0000000000000000000000000000000000000004
+	0000000000000000000000000000000000000002
+	$commit
+	0000000000000000000000000000000000000001
+	0000000000000000000000000000000000000003
+	EOF
+	sort SKIP.unsorted >SKIP.sorted
+'
+
+test_expect_success 'fsck with sorted skipList' '
+	git -c fsck.skipList=SKIP.sorted fsck
+'
+
+test_expect_success 'fsck with unsorted skipList' '
+	git -c fsck.skipList=SKIP.unsorted fsck
+'
+
 test_expect_success 'fsck with invalid or bogus skipList input' '
 	git -c fsck.skipList=/dev/null -c fsck.missingEmail=ignore fsck &&
 	test_must_fail git -c fsck.skipList=does-not-exist -c fsck.missingEmail=ignore fsck 2>err &&
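
Since the commit message describes the file as fixed-width records with no support for comments, a script that ships a skipList may want to check the format rather than the order. The sketch below is one way to do that; the function name `skiplist_is_well_formed` is hypothetical, and the pattern assumes 40 lowercase hexadecimal digits per line (one SHA-1 per line, as the documentation puts it).

```shell
#!/bin/sh
# Illustrative only: hypothetical helper, not part of Git.

# Succeed if the file is readable and every line is exactly 40 lowercase
# hex digits. Line order is deliberately not checked.
skiplist_is_well_formed () {
	[ -r "$1" ] && ! grep -qvxE '[0-9a-f]{40}' "$1"
}

# Possible use before running fsck (the git invocation is taken from the
# tests above and left commented out so the sketch runs without a repo):
#
#	skiplist_is_well_formed SKIP.unsorted &&
#	git -c fsck.skipList=SKIP.unsorted fsck
```

An empty file passes this check, and a file containing a comment line or a missing file fails it.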