odb: teach read_blob_entry to use size_t · freshlybakedca.ke/git@e9aa762

Git fork

odb: teach read_blob_entry to use size_t

There is mixed use of size_t and unsigned long to deal with sizes in the
codebase. Recall that Windows defines unsigned long as 32 bits even on
64-bit platforms, meaning that converting size_t to unsigned long narrows
the range. This mostly doesn't cause a problem since Git rarely deals
with files larger than 2^32 bytes.

But adjunct systems such as Git LFS, which use smudge/clean filters to
keep huge files out of the repository, may have huge file contents passed
through some of the functions in entry.c and convert.c. On Windows, this
results in a truncated file being written to the workdir. I traced this to
one specific use of unsigned long in write_entry (and a similar instance
in write_pc_item_to_fd for parallel checkout). That appeared to be for
the call to read_blob_entry, which expects a pointer to unsigned long.

By altering the signature of read_blob_entry to expect a size_t,
write_entry can be switched to use size_t internally (which all of its
callers and most of its callees already used). To avoid touching dozens of
additional files, read_blob_entry uses a local unsigned long to call a
chain of functions which aren't prepared to accept size_t.

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matt Cooper <vtbassmatt@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

authored by vtbassmatt.tngl.sh and committed by Junio C Hamano 4 years ago e9aa762c b79541af

+8 -6

4 changed files


entry.c

entry.h

parallel-checkout.c

t1051-large-conversion.sh

+5 -3

entry.c

@@ -82,11 +82,13 @@
 	return open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
 }
 
-void *read_blob_entry(const struct cache_entry *ce, unsigned long *size)
+void *read_blob_entry(const struct cache_entry *ce, size_t *size)
 {
 	enum object_type type;
-	void *blob_data = read_object_file(&ce->oid, &type, size);
+	unsigned long ul;
+	void *blob_data = read_object_file(&ce->oid, &type, &ul);
 
+	*size = ul;
 	if (blob_data) {
 		if (type == OBJ_BLOB)
 			return blob_data;
@@ -270,7 +272,7 @@
 	int fd, ret, fstat_done = 0;
 	char *new_blob;
 	struct strbuf buf = STRBUF_INIT;
-	unsigned long size;
+	size_t size;
 	ssize_t wrote;
 	size_t newsize = 0;
 	struct stat st;

+1 -1

entry.h

@@ -51,7 +51,7 @@
  */
 void unlink_entry(const struct cache_entry *ce);
 
-void *read_blob_entry(const struct cache_entry *ce, unsigned long *size);
+void *read_blob_entry(const struct cache_entry *ce, size_t *size);
 int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st);
 void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
 			   struct stat *st);

+1 -1

parallel-checkout.c

@@ -261,7 +261,7 @@
 	struct stream_filter *filter;
 	struct strbuf buf = STRBUF_INIT;
 	char *blob;
-	unsigned long size;
+	size_t size;
 	ssize_t wrote;
 
 	/* Sanity check */

+1 -1

t/t1051-large-conversion.sh

@@ -85,7 +85,7 @@
 
 # This smudge filter prepends 5GB of zeros to the file it checks out. This
 # ensures that smudging doesn't mangle large files on 64-bit Windows.
-test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
 		'files over 4GB convert on output' '
 	test_commit test small "a small file" &&
 	small_size=$(test_file_size small) &&