Git fork

rev-list: support delimiting objects with NUL bytes

When walking objects, git-rev-list(1) prints each object entry on a
separate line. Some options, such as `--objects`, may print additional
information about tree and blob objects on the same line in the form:

$ git rev-list --objects <rev>
<tree/blob oid> SP [<path>] LF

Note that in this form the SP is appended regardless of whether the tree
or blob object has path information available. Paths containing a
newline are also truncated at the newline.
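
To make the limitation concrete, here is a minimal sketch of the legacy
entry format in C. `format_legacy_entry` is a hypothetical helper written
for illustration, not a function in git; it only mirrors the two behaviors
noted above (the unconditional SP and the cut at the first newline).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): write one legacy `--objects`
 * entry, "<oid> SP [<path>] LF", into out as a C string.
 * Returns the number of bytes written, excluding the terminating NUL.
 */
static size_t format_legacy_entry(char *out, size_t cap,
				  const char *oid, const char *path)
{
	size_t n = 0;

	assert(out && oid && path);
	/* oid + SP + path + LF + terminating NUL */
	assert(cap >= strlen(oid) + strlen(path) + 3);

	for (const char *p = oid; *p; p++)
		out[n++] = *p;
	out[n++] = ' ';		/* written even when path is empty */
	for (const char *p = path; *p && *p != '\n'; p++)
		out[n++] = *p;	/* copying stops at the first newline */
	out[n++] = '\n';
	out[n] = '\0';
	return n;
}
```

With an empty path the entry is `<oid> SP LF`, and a path such as
`dir/a\nb` is emitted as `dir/a`, so a reader cannot recover the
original name from the output.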

Introduce the `-z` option for git-rev-list(1) which reformats the output
to use NUL-delimiters between objects and associated info in the
following form:

$ git rev-list -z --objects <rev>
<oid> NUL [path=<path> NUL]

In this form, the start of each record is signaled by an OID entry that
is all hexadecimal and does not contain any '='. Additional path info
from `--objects` is appended to the record as a token/value pair
`path=<path>` as-is without any truncation.
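
A consumer might split this output along the lines sketched below.
`struct zrecord` and `parse_z_records` are hypothetical names, not git
APIs, and the sketch assumes the whole output is already in memory and
that its last byte is the NUL that closes the final token. It uses the
rule stated above: a token without '=' starts a new record, and a
`path=` token belongs to the record before it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One parsed record; both pointers refer into the input buffer. */
struct zrecord {
	const char *oid;
	const char *path;	/* NULL when the record has no path= token */
};

/*
 * Hypothetical helper (not part of git): split len bytes of
 * `git rev-list -z --objects` output into at most max_recs records.
 * Returns the number of records, or -1 on malformed input or overflow.
 */
static int parse_z_records(const char *buf, size_t len,
			   struct zrecord *recs, int max_recs)
{
	int n = 0;
	size_t pos = 0;

	assert(buf && recs);
	if (len && buf[len - 1] != '\0')
		return -1;	/* final token is not NUL-terminated */

	while (pos < len) {
		const char *tok = buf + pos;

		if (!strchr(tok, '=')) {
			/* no '=': this token is an OID, start a record */
			if (n == max_recs)
				return -1;
			recs[n].oid = tok;
			recs[n].path = NULL;
			n++;
		} else if (n == 0) {
			return -1;	/* metadata before any OID */
		} else if (!strncmp(tok, "path=", 5)) {
			recs[n - 1].path = tok + 5;
		}
		/* any other token=value pair is skipped in this sketch */
		pos += strlen(tok) + 1;
	}
	return n;
}
```

Because the path is taken verbatim up to the next NUL, names that
contain a newline or a further '=' come through unchanged.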

For now, the `--objects` flag is the only option that can be used in
combination with `-z`. In a subsequent commit, NUL-delimited support for
other options is added. Other options that do not make sense when used
in combination with `-z` are rejected.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

authored by Justin Tobler and committed by Junio C Hamano
c3d59c2e c9907a19

+86 -5
+21
Documentation/rev-list-options.adoc
@@ -361,6 +361,27 @@
 --progress=<header>::
 	Show progress reports on stderr as objects are considered. The
 	`<header>` text will be printed with each progress update.
+
+-z::
+	Instead of being newline-delimited, each outputted object and its
+	accompanying metadata is delimited using NUL bytes. Output is printed
+	in the following form:
++
+-----------------------------------------------------------------------
+<OID> NUL [<token>=<value> NUL]...
+-----------------------------------------------------------------------
++
+Additional object metadata, such as object paths, is printed using the
+`<token>=<value>` form. Token values are printed as-is without any
+encoding/truncation. An OID entry never contains a '=' character and thus
+is used to signal the start of a new object record. Examples:
++
+-----------------------------------------------------------------------
+<OID> NUL
+<OID> NUL path=<path> NUL
+-----------------------------------------------------------------------
++
+This mode is only compatible with the `--objects` output option.
 endif::git-rev-list[]

 History Simplification
+30 -5
builtin/rev-list.c
@@ -65,6 +65,7 @@
 " --abbrev-commit\n"
 " --left-right\n"
 " --count\n"
+" -z\n"
 " special purpose:\n"
 " --bisect\n"
 " --bisect-vars\n"
@@ -96,6 +97,9 @@
 static int arg_show_object_names = 1;

 #define DEFAULT_OIDSET_SIZE (16*1024)
+
+static char line_term = '\n';
+static char info_term = ' ';

 static int show_disk_usage;
 static off_t total_disk_usage;
@@ -264,7 +268,7 @@
 	if (revs->commit_format == CMIT_FMT_ONELINE)
 		putchar(' ');
 	else if (revs->include_header)
-		putchar('\n');
+		putchar(line_term);

 	if (revs->verbose_header) {
 		struct strbuf buf = STRBUF_INIT;
@@ -361,12 +365,16 @@
 	printf("%s", oid_to_hex(&obj->oid));

 	if (arg_show_object_names) {
-		putchar(' ');
-		for (const char *p = name; *p && *p != '\n'; p++)
-			putchar(*p);
+		if (line_term) {
+			putchar(info_term);
+			for (const char *p = name; *p && *p != '\n'; p++)
+				putchar(*p);
+		} else if (*name) {
+			printf("%cpath=%s", info_term, name);
+		}
 	}

-	putchar('\n');
+	putchar(line_term);
 }

 static void show_edge(struct commit *commit)
@@ -642,6 +650,9 @@
 			revs.exclude_promisor_objects = 1;
 		} else if (skip_prefix(arg, "--missing=", &arg)) {
 			parse_missing_action_value(arg);
+		} else if (!strcmp(arg, "-z")) {
+			line_term = '\0';
+			info_term = '\0';
 		}
 	}

@@ -757,6 +768,20 @@
 		usage(rev_list_usage);

 	}
+
+	/*
+	 * Reject options currently incompatible with -z. For some options, this
+	 * is not an inherent limitation and support may be implemented in the
+	 * future.
+	 */
+	if (!line_term) {
+		if (revs.graph || revs.verbose_header || show_disk_usage ||
+		    info.show_timestamp || info.header_prefix || bisect_list ||
+		    use_bitmap_index || revs.edge_hint || revs.left_right ||
+		    revs.cherry_mark || arg_missing_action || revs.boundary)
+			die(_("-z option used with unsupported option"));
+	}
+
 	if (revs.commit_format != CMIT_FMT_USERFORMAT)
 		revs.include_header = 1;
 	if (revs.commit_format != CMIT_FMT_UNSPECIFIED) {
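
The `-z` branch of the object-printing change in this file can be
exercised in isolation with a small sketch. `format_z_record` is a
hypothetical helper, not code from this commit: it writes into a
caller-supplied buffer instead of stdout, and it assumes both terminators
are NUL, as they are after `-z` is parsed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): write one record in the
 * -z form "<oid> NUL [path=<path> NUL]" into out.
 * Returns the number of bytes written, embedded NULs included.
 */
static size_t format_z_record(char *out, size_t cap,
			      const char *oid, const char *name)
{
	size_t n = 0, oid_len, name_len;

	assert(out && oid && name);
	oid_len = strlen(oid);
	name_len = strlen(name);
	/* oid + NUL + "path=" + name + NUL */
	assert(cap >= oid_len + name_len + 7);

	memcpy(out + n, oid, oid_len);
	n += oid_len;
	if (*name) {
		/* the path token is written only for a non-empty name */
		out[n++] = '\0';
		memcpy(out + n, "path=", 5);
		n += 5;
		memcpy(out + n, name, name_len);	/* copied verbatim */
		n += name_len;
	}
	out[n++] = '\0';	/* record terminator */
	return n;
}
```

Unlike the newline-terminated branch, an object without a name yields
only `<oid> NUL`, and a newline inside the name is preserved.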
+35
t/t6000-rev-list-misc.sh
@@ -182,4 +182,39 @@
 	test_cmp expect actual
 '

+test_expect_success 'rev-list -z' '
+	test_when_finished rm -rf repo &&
+
+	git init repo &&
+	test_commit -C repo 1 &&
+	test_commit -C repo 2 &&
+
+	oid1=$(git -C repo rev-parse HEAD~) &&
+	oid2=$(git -C repo rev-parse HEAD) &&
+
+	printf "%s\0%s\0" "$oid2" "$oid1" >expect &&
+	git -C repo rev-list -z HEAD >actual &&
+
+	test_cmp expect actual
+'
+
+test_expect_success 'rev-list -z --objects' '
+	test_when_finished rm -rf repo &&
+
+	git init repo &&
+	test_commit -C repo 1 &&
+	test_commit -C repo 2 &&
+
+	oid1=$(git -C repo rev-parse HEAD:1.t) &&
+	oid2=$(git -C repo rev-parse HEAD:2.t) &&
+	path1=1.t &&
+	path2=2.t &&
+
+	printf "%s\0path=%s\0%s\0path=%s\0" "$oid1" "$path1" "$oid2" "$path2" \
+		>expect &&
+	git -C repo rev-list -z --objects HEAD:1.t HEAD:2.t >actual &&
+
+	test_cmp expect actual
+'
+
 test_done