doc: remembering-renames.adoc: fix asciidoc warnings

Both asciidoc and asciidoctor issue warnings about 'list item index:
expected n got n-1' for n=1->9 on lines 13, 15, 17, 20, 23, 25, 29,
31 and 33. In asciidoc, numbered lists must start at one, whereas this
file has a list starting at zero. Also, asciidoc and asciidoctor warn
about 'section title out of sequence: expected level 1, got level 2'
on line 38. (asciidoc only complains about the first instance of this,
while asciidoctor complains about them all, on lines 94, 141, 142,
184, 185, 257, 288, 289, 290, 397, 424, 485, 486 and 487). These
warnings stem from the section titles not being correctly nested within
a document/chapter title.

In order to address the first set of warnings, simply renumber the list
from one to nine, rather than zero to eight. This also requires altering
the text which refers to the section numbers, including other section
titles.

In order to address the second set of warnings, change the section title
syntax from '=== title ===' to '== title ==', effectively reducing the
nesting level of the title by one. Also, some of the titles are given
over multiple lines (they are very long), with a title '===' prefix
on each line. This leads to them being treated as separate sections
with no body text (as you can see from the line numbers given for the
asciidoctor warnings, above). So, for these titles, turn them into a
single (long) line of text.

In addition to the warnings, address some other formatting issues:

- the ascii branch diagrams didn't format correctly on asciidoctor
so include them in a literal block.
- several blocks of text were intended to be formatted 'as is' but
were not included in a literal block.
- in section 8, format the (A)->(D) in the text description as a
literal with `` marks, since (C) is rendered as a copyright
symbol in html otherwise.
- in section 9, a sub-list of two items is not formatted as such.
Change the '*' introducer to '**' to correct the sub-list format.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Authored by Ramsay Jones and committed by Junio C Hamano
4fa0e4d0 52751000

+78 -42
Documentation/technical/remembering-renames.adoc
···
 10  10
 11  11   Outline:
 12  12
 13     - 0. Assumptions
     13 + 1. Assumptions
 14  14
 15     - 1. How rebasing and cherry-picking work
     15 + 2. How rebasing and cherry-picking work
 16  16
 17     - 2. Why the renames on MERGE_SIDE1 in any given pick are *always* a
     17 + 3. Why the renames on MERGE_SIDE1 in any given pick are *always* a
 18  18      superset of the renames on MERGE_SIDE1 for the next pick.
 19  19
 20     - 3. Why any rename on MERGE_SIDE1 in any given pick is _almost_ always also
     20 + 4. Why any rename on MERGE_SIDE1 in any given pick is _almost_ always also
 21  21      a rename on MERGE_SIDE1 for the next pick
 22  22
 23     - 4. A detailed description of the counter-examples to #3.
     23 + 5. A detailed description of the counter-examples to #4.
 24  24
 25     - 5. Why the special cases in #4 are still fully reasonable to use to pair
     25 + 6. Why the special cases in #5 are still fully reasonable to use to pair
 26  26      up files for three-way content merging in the merge machinery, and why
 27  27      they do not affect the correctness of the merge.
 28  28
 29     - 6. Interaction with skipping of "irrelevant" renames
     29 + 7. Interaction with skipping of "irrelevant" renames
 30  30
 31     - 7. Additional items that need to be cached
     31 + 8. Additional items that need to be cached
 32  32
 33     - 8. How directory rename detection interacts with the above and why this
     33 + 9. How directory rename detection interacts with the above and why this
 34  34      optimization is still safe even if merge.directoryRenames is set to
 35  35      "true".
 36  36
 37  37
 38     - === 0. Assumptions ===
     38 + == 1. Assumptions ==
 39  39
 40  40   There are two assumptions that will hold throughout this document:
 41  41
···
 44  44
 45  45   * All merges are fully automatic
 46  46
 47     - and a third that will hold in sections 2-5 for simplicity, that I'll later
 48     - address in section 8:
     47 + and a third that will hold in sections 3-6 for simplicity, that I'll later
     48 + address in section 9:
 49  49
 50  50   * No directory renames occur
 51  51
···
 77  77   stored on disk, and thus is thrown away as soon as the rebase or cherry
 78  78   pick stops for the user to resolve the operation.
 79  79
 80     - The third assumption makes sections 2-5 simpler, and allows people to
     80 + The third assumption makes sections 3-6 simpler, and allows people to
 81  81   understand the basics of why this optimization is safe and effective, and
 82     - then I can go back and address the specifics in section 8. It is probably
     82 + then I can go back and address the specifics in section 9. It is probably
 83  83   also worth noting that if directory renames do occur, then the default of
 84  84   merge.directoryRenames being set to "conflict" means that the operation
 85  85   will stop for users to resolve the conflicts and the cache will be thrown
···
 88  88   users will have set merge.directoryRenames to "true" to allow the merges to
 89  89   continue to proceed automatically. The optimization is still safe with
 90  90   this config setting, but we have to discuss a few more cases to show why;
 91     - this discussion is deferred until section 8.
     91 + this discussion is deferred until section 9.
 92  92
 93  93
 94     - === 1. How rebasing and cherry-picking work ===
     94 + == 2. How rebasing and cherry-picking work ==
 95  95
 96  96   Consider the following setup (from the git-rebase manpage):
 97  97
     98 + ------------
 98  99         A---B---C topic
 99 100        /
100 101   D---E---F---G main
    102 + ------------
101 103
102 104   After rebasing or cherry-picking topic onto main, this will appear as:
103 105
    106 + ------------
104 107                 A'--B'--C' topic
105 108                /
106 109   D---E---F---G main
    110 + ------------
107 111
108 112   The way the commits A', B', and C' are created is through a series of
109 113   merges, where rebase or cherry-pick sequentially uses each of the three
···
111 115   in the merge operation as MERGE_BASE, MERGE_SIDE1, and MERGE_SIDE2. For
112 116   this picture, the three commits for each of the three merges would be:
113 117
    118 + ....
114 119   To create A':
115 120     MERGE_BASE: E
116 121     MERGE_SIDE1: G
···
125 130     MERGE_BASE: B
126 131     MERGE_SIDE1: B'
127 132     MERGE_SIDE2: C
    133 + ....
128 134
129 135   Sometimes, folks are surprised that these three-way merges are done. It
130 136   can be useful in understanding these three-way merges to view them in a
···
138 144   B, B', and C, at least the parts before you decide to record a commit.
139 145
140 146
141     - === 2. Why the renames on MERGE_SIDE1 in any given pick are always a ===
142     - === superset of the renames on MERGE_SIDE1 for the next pick. ===
    147 + == 3. Why the renames on MERGE_SIDE1 in any given pick are always a superset of the renames on MERGE_SIDE1 for the next pick. ==
143 148
144 149   The merge machinery uses the filenames it is fed from MERGE_BASE,
145 150   MERGE_SIDE1, and MERGE_SIDE2. It will only move content to a different
···
156 161   First, let's remember what commits are involved in the first and second
157 162   picks of the cherry-pick or rebase sequence:
158 163
    164 + ....
159 165   To create A':
160 166     MERGE_BASE: E
161 167     MERGE_SIDE1: G
···
165 171     MERGE_BASE: A
166 172     MERGE_SIDE1: A'
167 173     MERGE_SIDE2: B
    174 + ....
168 175
169 176   So, in particular, we need to show that the renames between E and G are a
170 177   superset of those between A and A'.
···
181 188   and G are a superset of those between A and A'.
182 189
183 190
184     - === 3. Why any rename on MERGE_SIDE1 in any given pick is _almost_ ===
185     - === always also a rename on MERGE_SIDE1 for the next pick. ===
    191 + == 4. Why any rename on MERGE_SIDE1 in any given pick is _almost_ always also a rename on MERGE_SIDE1 for the next pick. ==
186 192
187 193   Let's again look at the first two picks:
188 194
    195 + ....
189 196   To create A':
190 197     MERGE_BASE: E
191 198     MERGE_SIDE1: G
···
195 202     MERGE_BASE: A
196 203     MERGE_SIDE1: A'
197 204     MERGE_SIDE2: B
    205 + ....
198 206
199 207   Now let's look at any given rename from MERGE_SIDE1 of the first pick, i.e.
200 208   any given rename from E to G. Let's use the filenames 'oldfile' and
201 209   'newfile' for demonstration purposes. That first pick will function as
202 210   follows; when the rename is detected, the merge machinery will do a
203 211   three-way content merge of the following:
    212 +
    213 + ....
204 214   E:oldfile
205 215   G:newfile
206 216   A:oldfile
    217 + ....
    218 +
207 219   and produce a new result:
    220 +
    221 + ....
208 222   A':newfile
    223 + ....
209 224
210 225   Note above that I've assumed that E->A did not rename oldfile. If that
211 226   side did rename, then we most likely have a rename/rename(1to2) conflict
···
254 269   detectable as renames almost always.
255 270
256 271
257     - === 4. A detailed description of the counter-examples to #3. ===
    272 + == 5. A detailed description of the counter-examples to #4. ==
258 273
259     - We already noted in section 3 that rename/rename(1to1) (i.e. both sides
    274 + We already noted in section 4 that rename/rename(1to1) (i.e. both sides
260 275   renaming a file the same way) was one counter-example. The more
261 276   interesting bit, though, is why did we need to use the "almost" qualifier
262 277   when stating that A:oldfile and A':newfile are "almost" always detectable
263 278   as renames?
264 279
265     - Let's repeat an earlier point that section 3 made:
    280 + Let's repeat an earlier point that section 4 made:
266 281
    282 + ....
267 283   A':newfile was created by applying the changes between E:oldfile and
268 284   G:newfile to A:oldfile. The changes between E:oldfile and G:newfile were
269 285   <50% of the size of E:oldfile.
    286 + ....
270 287
271 288   If those changes that were <50% of the size of E:oldfile are also <50% of
272 289   the size of A:oldfile, then A:oldfile and A':newfile will be detectable as
···
276 293   detect A:oldfile and A':newfile as renames.
277 294
278 295   Here's an example where that can happen:
    296 +
279 297   * E:oldfile had 20 lines
280 298   * G:newfile added 10 new lines at the beginning of the file
281 299   * A:oldfile kept the first 3 lines of the file, and deleted all the rest
    300 +
282 301   then
    302 +
    303 + ....
283 304   => A':newfile would have 13 lines, 3 of which matches those in A:oldfile.
284     - E:oldfile -> G:newfile would be detected as a rename, but A:oldfile and
285     - A':newfile would not be.
    305 + E:oldfile -> G:newfile would be detected as a rename, but A:oldfile and
    306 + A':newfile would not be.
    307 + ....
286 308
287 309
288     - === 5. Why the special cases in #4 are still fully reasonable to use to ===
289     - === pair up files for three-way content merging in the merge machinery, ===
290     - === and why they do not affect the correctness of the merge. ===
    310 + == 6. Why the special cases in #5 are still fully reasonable to use to pair up files for three-way content merging in the merge machinery, and why they do not affect the correctness of the merge. ==
291 311
292 312   In the rename/rename(1to1) case, A:newfile and A':newfile are not renames
293 313   since they use the *same* filename. However, files with the same filename
···
295 315   machinery has never employed break detection). The interesting
296 316   counter-example case is thus not the rename/rename(1to1) case, but the case
297 317   where A did not rename oldfile. That was the case that we spent most of
298     - the time discussing in sections 3 and 4. The remainder of this section
    318 + the time discussing in sections 4 and 5. The remainder of this section
299 319   will be devoted to that case as well.
300 320
301 321   So, even if A:oldfile and A':newfile aren't detectable as renames, why is
302 322   it still reasonable to pair them up for three-way content merging in the
303 323   merge machinery? There are multiple reasons:
304 324
305     - * As noted in sections 3 and 4, the diff between A:oldfile and A':newfile
    325 + * As noted in sections 4 and 5, the diff between A:oldfile and A':newfile
306 326     is *exactly* the same as the diff between E:oldfile and G:newfile. The
307 327     latter pair were detected as renames, so it seems unlikely to surprise
308 328     users for us to treat A:oldfile and A':newfile as renames.
···
394 414   optimization than without.
395 415
396 416
397     - === 6. Interaction with skipping of "irrelevant" renames ===
    417 + == 7. Interaction with skipping of "irrelevant" renames ==
398 418
399 419   Previous optimizations involved skipping rename detection for paths
400 420   considered to be "irrelevant". See for example the following commits:
···
421 441   already detected renames.
422 442
423 443
424     - === 7. Additional items that need to be cached ===
    444 + == 8. Additional items that need to be cached ==
425 445
426 446   It turns out we have to cache more than just renames; we also cache:
427 447
    448 + ....
428 449   A) non-renames (i.e. unpaired deletes)
429 450   B) counts of renames within directories
430 451   C) sources that were marked as RELEVANT_LOCATION, but which were
431 452      downgraded to RELEVANT_NO_MORE
432 453   D) the toplevel trees involved in the merge
    454 + ....
433 455
434 456   These are all stored in struct rename_info, and respectively appear in
    457 +
435 458   * cached_pairs (along side actual renames, just with a value of NULL)
436 459   * dir_rename_counts
437 460   * cached_irrelevant
438 461   * merge_trees
439 462
440     - The reason for (A) comes from the irrelevant renames skipping
441     - optimization discussed in section 6. The fact that irrelevant renames
    463 + The reason for `(A)` comes from the irrelevant renames skipping
    464 + optimization discussed in section 7. The fact that irrelevant renames
442 465   are skipped means we only get a subset of the potential renames
443 466   detected and subsequent commits may need to run rename detection on
444 467   the upstream side on a subset of the remaining renames (to get the
···
447 470   repeatedly check that those paths remain unpaired on the upstream side
448 471   with every commit we are transplanting.
449 472
450     - The reason for (B) is that diffcore_rename_extended() is what
    473 + The reason for `(B)` is that diffcore_rename_extended() is what
451 474   generates the counts of renames by directory which is needed in
452 475   directory rename detection, and if we don't run
453 476   diffcore_rename_extended() again then we need to have the output from
454 477   it, including dir_rename_counts, from the previous run.
455 478
456     - The reason for (C) is that merge-ort's tree traversal will again think
    479 + The reason for `(C)` is that merge-ort's tree traversal will again think
457 480   those paths are relevant (marking them as RELEVANT_LOCATION), but the
458 481   fact that they were downgraded to RELEVANT_NO_MORE means that
459 482   dir_rename_counts already has the information we need for directory
460 483   rename detection. (A path which becomes RELEVANT_CONTENT in a
461 484   subsequent commit will be removed from cached_irrelevant.)
462 485
463     - The reason for (D) is that is how we determine whether the remember
    486 + The reason for `(D)` is that is how we determine whether the remember
464 487   renames optimization can be used. In particular, remembering that our
465 488   sequence of merges looks like:
466 489
    490 + ....
467 491   Merge 1:
468 492     MERGE_BASE: E
469 493     MERGE_SIDE1: G
···
475 499     MERGE_SIDE1: A'
476 500     MERGE_SIDE2: B
477 501     => Creates B'
    502 + ....
478 503
479 504   It is the fact that the trees A and A' appear both in Merge 1 and in
480 505   Merge 2, with A as a parent of A' that allows this optimization. So
···
482 507   time.
483 508
484 509
485     - === 8. How directory rename detection interacts with the above and ===
486     - === why this optimization is still safe even if ===
487     - === merge.directoryRenames is set to "true". ===
    510 + == 9. How directory rename detection interacts with the above and why this optimization is still safe even if merge.directoryRenames is set to "true". ==
488 511
489 512   As noted in the assumptions section:
490 513
    514 + ....
491 515   """
492 516   ...if directory renames do occur, then the default of
493 517   merge.directoryRenames being set to "conflict" means that the operation
···
497 521   is that some users will have set merge.directoryRenames to "true" to
498 522   allow the merges to continue to proceed automatically.
499 523   """
    524 + ....
500 525
501 526   Let's remember that we need to look at how any given pick affects the next
502 527   one. So let's again use the first two picks from the diagram in section
503 528   one:
504 529
    530 + ....
505 531   First pick does this three-way merge:
506 532     MERGE_BASE: E
507 533     MERGE_SIDE1: G
···
513 539     MERGE_SIDE1: A'
514 540     MERGE_SIDE2: B
515 541     => creates B'
    542 + ....
516 543
517 544   Now, directory rename detection exists so that if one side of history
518 545   renames a directory, and the other side adds a new file to the old
···
545 572   concerned; see the assumptions section). Two interesting sub-notes
546 573   about these counts:
547 574
548     - * If we need to perform rename-detection again on the given side (e.g.
    575 + ** If we need to perform rename-detection again on the given side (e.g.
549 576     some paths are relevant for rename detection that weren't before),
550 577     then we clear dir_rename_counts and recompute it, making use of
551 578     cached_pairs. The reason it is important to do this is optimizations
···
556 583     easiest way to "fix up" dir_rename_counts in such cases is to just
557 584     recompute it.
558 585
559     - * If we prune rename/rename(1to1) entries from the cache, then we also
    586 + ** If we prune rename/rename(1to1) entries from the cache, then we also
560 587     need to update dir_rename_counts to decrement the counts for the
561 588     involved directory and any relevant parent directories (to undo what
562 589     update_dir_rename_counts() in diffcore-rename.c incremented when the
···
578 605
579 606   Case 1: MERGE_SIDE1 renames old dir, MERGE_SIDE2 adds new file to old dir
580 607
    608 + ....
581 609   This case looks like this:
582 610
583 611   MERGE_BASE: E, Has olddir/
···
595 623   * MERGE_SIDE1 has cached olddir/newfile -> newdir/newfile
596 624   Given the cached rename noted above, the second merge can proceed as
597 625   expected without needing to perform rename detection from A -> A'.
    626 + ....
598 627
599 628   Case 2: MERGE_SIDE1 renames old dir, MERGE_SIDE2 renames file into old dir
600 629
    630 + ....
601 631   This case looks like this:
    632 +
602 633   MERGE_BASE: E oldfile, olddir/
603 634   MERGE_SIDE1: G oldfile, olddir/ -> newdir/
604 635   MERGE_SIDE2: A oldfile -> olddir/newfile
···
617 648
618 649   Given the cached rename noted above, the second merge can proceed as
619 650   expected without needing to perform rename detection from A -> A'.
    651 + ....
620 652
621 653   Case 3: MERGE_SIDE1 adds new file to old dir, MERGE_SIDE2 renames old dir
622 654
    655 + ....
623 656   This case looks like this:
624 657
625 658   MERGE_BASE: E, Has olddir/
···
635 668   In this case, with the optimization, note that after the first commit there
636 669   were no renames on MERGE_SIDE1, and any renames on MERGE_SIDE2 are tossed.
637 670   But the second merge didn't need any renames so this is fine.
    671 + ....
638 672
639 673   Case 4: MERGE_SIDE1 renames file into old dir, MERGE_SIDE2 renames old dir
640 674
    675 + ....
641 676   This case looks like this:
642 677
643 678   MERGE_BASE: E, Has olddir/
···
658 693
659 694   Given the cached rename noted above, the second merge can proceed as
660 695   expected without needing to perform rename detection from A -> A'.
    696 + ....
661 697
662 698   Finally, I'll just note here that interactions with the
663 699   skip-irrelevant-renames optimization means we sometimes don't detect