Git fork

docs: add documentation for loose objects

We currently have no documentation for how loose objects are stored.
Let's add some here so it's easy for people to understand how they
work.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

authored by brian m. carlson and committed by Junio C Hamano
d4f43954 24d46f86

+55
+1
Documentation/Makefile
@@ -34,6 +34,7 @@
 MAN5_TXT += gitformat-chunk.adoc
 MAN5_TXT += gitformat-commit-graph.adoc
 MAN5_TXT += gitformat-index.adoc
+MAN5_TXT += gitformat-loose.adoc
 MAN5_TXT += gitformat-pack.adoc
 MAN5_TXT += gitformat-signature.adoc
 MAN5_TXT += githooks.adoc
+53
Documentation/gitformat-loose.adoc
@@ -0,0 +1,53 @@
+gitformat-loose(5)
+==================
+
+NAME
+----
+gitformat-loose - Git loose object format
+
+
+SYNOPSIS
+--------
+[verse]
+$GIT_DIR/objects/[0-9a-f][0-9a-f]/*
+
+DESCRIPTION
+-----------
+
+Loose objects are how Git stores individual objects, where every object is
+written as a separate file.
+
+Over the lifetime of a repository, objects are usually written as loose objects
+initially.  Eventually, these loose objects will be compacted into packfiles
+via repository maintenance to improve disk space usage and speed up the lookup
+of these objects.
+
+== Loose objects
+
+Each loose object contains a prefix, followed immediately by the data of the
+object.  The prefix contains `<type> <size>\0`.  `<type>` is one of `blob`,
+`tree`, `commit`, or `tag` and `size` is the size of the data (without the
+prefix) as a decimal integer expressed in ASCII.
+
+The entire contents, prefix and data concatenated, is then compressed with zlib
+and the compressed data is stored in the file.  The object ID of the object is
+the SHA-1 or SHA-256 (as appropriate) hash of the uncompressed data.
+
+The file for the loose object is stored under the `objects` directory, with the
+first two hex characters of the object ID being the directory and the remaining
+characters being the file name.  This is done to shard the data and avoid too
+many files being in one directory, since some file systems perform poorly with
+many items in a directory.
+
+As an example, the empty tree contains the data (when uncompressed) `tree 0\0`
+and, in a SHA-256 repository, would have the object ID
+`6ef19b41225c5369f1c104d45d8d85efa9b057b53b14b4b9b939dd74decc5321` and would be
+stored under
+`$GIT_DIR/objects/6e/f19b41225c5369f1c104d45d8d85efa9b057b53b14b4b9b939dd74decc5321`.
+
+Similarly, a blob containing the contents `abc` would have the uncompressed
+data of `blob 3\0abc`.
+
+GIT
+---
+Part of the linkgit:git[1] suite
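
The format described in gitformat-loose.adoc can be sketched in a few lines of Python. This is only an illustration of the documented layout, not Git's own implementation: the helper names (`encode_loose`, `object_id`, `loose_path`, `decode_loose`) are hypothetical, and the path helper returns a path relative to `$GIT_DIR`.

```python
import hashlib
import zlib
from typing import Tuple


def encode_loose(obj_type: bytes, data: bytes) -> bytes:
    # Uncompressed contents: "<type> <size>" + NUL byte + data, where
    # <size> is the data length as a decimal ASCII integer.
    prefix = obj_type + b" " + str(len(data)).encode("ascii") + b"\0"
    return prefix + data


def object_id(contents: bytes, algo: str = "sha1") -> str:
    # The object ID is the hash of the *uncompressed* prefix + data.
    # algo is "sha1" or "sha256", depending on the repository.
    return hashlib.new(algo, contents).hexdigest()


def loose_path(oid: str) -> str:
    # First two hex characters name the directory, the rest the file.
    return "objects/" + oid[:2] + "/" + oid[2:]


def decode_loose(compressed: bytes) -> Tuple[bytes, bytes]:
    # Inverse of zlib.compress(encode_loose(...)): returns (type, data).
    contents = zlib.decompress(compressed)
    prefix, _, data = contents.partition(b"\0")
    obj_type, _, size = prefix.partition(b" ")
    if int(size) != len(data):
        raise ValueError("size in prefix does not match data length")
    return obj_type, data


# The empty tree from the document, in a SHA-256 repository.
empty_tree = encode_loose(b"tree", b"")
oid = object_id(empty_tree, "sha256")
print(oid)
print(loose_path(oid))

# Round trip: what would be stored in the file is the zlib-compressed contents.
stored = zlib.compress(encode_loose(b"blob", b"abc"))
print(decode_loose(stored))
```

For the empty tree, the printed ID should match the SHA-256 value quoted in the documentation above. Note that the hash is taken before compression, so the zlib compression level does not affect the object ID.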
+1
Documentation/meson.build
@@ -171,6 +171,7 @@
 'gitformat-chunk.adoc' : 5,
 'gitformat-commit-graph.adoc' : 5,
 'gitformat-index.adoc' : 5,
+'gitformat-loose.adoc' : 5,
 'gitformat-pack.adoc' : 5,
 'gitformat-signature.adoc' : 5,
 'githooks.adoc' : 5,