Git fork

merge-ort: fix merge.directoryRenames=false

There are two issues here.

First, when merge.directoryRenames is set to false, there are a few code
paths that should be turned off. I missed one; collect_renames() was
still doing some directory rename detection logic unconditionally. It
ended up not having much effect because
get_provisional_directory_renames() was skipped earlier and not setting
up renames->dir_renames, but the code should still be skipped.

Second, the larger issue is that sometimes we get a cached_pair rename
from a previous commit being replayed mapping A->B, but in a subsequent
commit but collect_merge_info() doesn't even recurse into the
directory containing B because there are no source pairings for that
rename that are relevant; we can merge that commit fine without knowing
the rename. But since the cached renames are added to the normal
renames, when we go to process it and find that B is not part of
opt->priv->paths, we hit the assertion error
process_renames: Assertion `newinfo && ~newinfo->merged.clean` failed.
I think we could fix this at the beginning of detect_regular_renames() by
pruning from cached_pairs any entry whose destination isn't in
opt->priv->paths, but it's suboptimal in that we'd kind of like the
cached_pair to be restored afterwards so that it can help the subsequent
commit, but more importantly since it sits at the intersection of
the caching renames optimization and the relevant renames optimization,
and the trivial directory resolution optimization, and I don't currently
have Documentation/technical/remembering-renames.txt fully paged in, I'm
not sure if that's a full solution or a bandaid for the current
testcase. However, since the remembering renames optimization was the
weakest of the set, and the optimization is far less important when
directory rename detection is off (as that implies far fewer potential
renames), let's just use a bigger hammer to ensure this special case is
fixed: turn off the rename caching. We do the same thing already when
we encounter rename/rename(1to1) cases (as per `git grep -3
disabling.the.optimization`, though it uses a slightly different
triggering mechanism since it's trying to affect the next time that
merge_check_renames_reusable() is called), and I think it makes sense
to do the same here.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

authored by

Elijah Newren and committed by
Junio C Hamano
a16e8efe a9185cc8

+30 -3
+29 -2
merge-ort.c
··· 3404 3404 pool_diff_free_filepair(&opt->priv->pool, p); 3405 3405 continue; 3406 3406 } 3407 + if (opt->detect_directory_renames == MERGE_DIRECTORY_RENAMES_NONE && 3408 + p->status == 'R' && 1) { 3409 + possibly_cache_new_pair(renames, p, side_index, NULL); 3410 + goto skip_directory_renames; 3411 + } 3407 3412 3408 3413 new_path = check_for_directory_rename(opt, p->two->path, 3409 3414 side_index, ··· 3421 3426 if (new_path) 3422 3427 apply_directory_rename_modifications(opt, p, new_path); 3423 3428 3429 + skip_directory_renames: 3424 3430 /* 3425 3431 * p->score comes back from diffcore_rename_extended() with 3426 3432 * the similarity of the renamed file. The similarity is ··· 5025 5031 trace2_region_leave("merge", "allocate/init", opt->repo); 5026 5032 } 5027 5033 5028 - static void merge_check_renames_reusable(struct merge_result *result, 5034 + static void merge_check_renames_reusable(struct merge_options *opt, 5035 + struct merge_result *result, 5029 5036 struct tree *merge_base, 5030 5037 struct tree *side1, 5031 5038 struct tree *side2) ··· 5046 5053 */ 5047 5054 if (!merge_trees[0]) { 5048 5055 assert(!merge_trees[0] && !merge_trees[1] && !merge_trees[2]); 5056 + renames->cached_pairs_valid_side = 0; /* neither side valid */ 5057 + return; 5058 + } 5059 + 5060 + /* 5061 + * Avoid using cached renames when directory rename detection is 5062 + * turned off. Cached renames are far less important in that case, 5063 + * and they lead to testcases with an interesting intersection of 5064 + * effects from relevant renames optimization, trivial directory 5065 + * resolution optimization, and cached renames all converging when 5066 + * the target of a cached rename is in a directory that 5067 + * collect_merge_info() does not recurse into. To avoid such 5068 + * problems, simply disable cached renames for this case (similar 5069 + * to the rename/rename(1to1) case; see the "disabling the 5070 + * optimization" comment near that case). 5071 + * 5072 + * This could be revisited in the future; see the commit message 5073 + * where this comment was added for some possible pointers. 5074 + */ 5075 + if (opt->detect_directory_renames == MERGE_DIRECTORY_RENAMES_NONE) { 5049 5076 renames->cached_pairs_valid_side = 0; /* neither side valid */ 5050 5077 return; 5051 5078 } ··· 5258 5285 5259 5286 trace2_region_enter("merge", "merge_start", opt->repo); 5260 5287 assert(opt->ancestor != NULL); 5261 - merge_check_renames_reusable(result, merge_base, side1, side2); 5288 + merge_check_renames_reusable(opt, result, merge_base, side1, side2); 5262 5289 merge_start(opt, result); 5263 5290 /* 5264 5291 * Record the trees used in this merge, so if there's a next merge in
+1 -1
t/t3650-replay-basics.sh
··· 195 195 done 196 196 ' 197 197 198 - test_expect_failure 'merge.directoryRenames=false' ' 198 + test_expect_success 'merge.directoryRenames=false' ' 199 199 # create a test case that stress-tests the rename caching 200 200 git switch -c rename-onto && 201 201