
The man page git-rebase(1) says:

-m
--merge
Use merging strategies to rebase. [...]

But of course one can also run into "merge conflicts" without using the --merge option, so in that case, too, there must be some "merge strategy" to handle these conflicts.

What difference does the --merge option make to a rebase?

It seems to be something rather fundamental: For a rebase --merge, Git stores its working files in a folder named $GIT_DIR/rebase-merge (as it does for interactive rebases). If the --merge option is not used (and the rebase is non-interactive) that folder is named $GIT_DIR/rebase-apply.
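To see which of the two folders a stopped rebase is using, a small check like the following may help. It is only a sketch: the function name `rebase_backend` is made up for illustration, and it merely tests for the two directory names mentioned above. Note also that, as far as I know, Git 2.26 and later changed the default backend, so the folder you observe for a plain rebase may differ from what is described here.

```shell
# Hypothetical helper (not part of Git): report which rebase state
# directory exists under a Git directory.
# Usage: rebase_backend [git-dir]   (defaults to the current repository)
rebase_backend() {
    git_dir=${1:-$(git rev-parse --git-dir)} || return 1
    if [ -d "$git_dir/rebase-merge" ]; then
        echo "merge"    # --merge or interactive rebase in progress
    elif [ -d "$git_dir/rebase-apply" ]; then
        echo "apply"    # patch-based rebase in progress
    else
        echo "none"     # neither directory is present
    fi
}
```

Running `rebase_backend` inside a repository while a rebase is stopped on a conflict prints one of the three words.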

Jürgen
  • Interesting question. The manual suggests that you would need it when a file was renamed in the upstream, but I just tested it and a plain rebase dealt with that situation automatically and applied the commit to the renamed file. So I'm interested in the answer too, if someone knows. – joanis Apr 29 '19 at 14:59
  • With some more experimentation, my guess is now that it enables specifying the merge strategy (via `-s` and/or `-X`, which both imply `-m`). I say this because although `-m` technically changes the algorithm used, in the several cases I tested the final result was identical. More confusing is that I was able to create a scenario where `git merge upstream` and `git rebase upstream` gave different results, but in that case `git rebase -m upstream` gave the same results as `git rebase upstream`, although the log messages looked different along the way. – joanis Apr 29 '19 at 15:15
  • For the record, I've tested with a file being renamed in the upstream, and that did not confuse any rebase or merge. I've also tried having a change done in upstream and the same change done and undone in my branch, which is the case where merge differs from rebase. The two rebases behaved identically: the change is undone in the final result, whereas merge keeps the change in the final result. – joanis Apr 29 '19 at 15:20

1 Answer


In one sentence, what -m or --merge does for git rebase is to make sure that rebase uses git cherry-pick internally.

The -m flag to force cherry-pick is often, but not always, redundant. In particular, any interactive rebase always uses cherry-pick anyway. As joanis noted in a comment, specifying any -s or -X options also forces the use of cherry-pick. So does -k, as noted below.

Long (or at least longer)

Rebase has a long history in Git: the first rebase operations were done by formatting each commit-to-be-rebased into a patch, then applying the patch to some other commit. That is, originally, git rebase was mostly just:

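# Remember the current branch and resolve the commit to rebase onto.
branch=$(git symbolic-ref --short HEAD)
target=$(git rev-parse "${onto:-$upstream}")
# Turn each commit in upstream..HEAD into a patch, all in one mbox file.
git format-patch --stdout "$upstream"..HEAD > "$temp_file"
# Apply the patches on top of the target, then move the branch there.
git checkout "$target"
git am -3 "$temp_file"
git checkout -B "$branch" HEAD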

(except for argument handling, all the error checking, and the fact that the git am can stop with an error, requiring hand-fixing and git rebase --continue; also, the above scripting is my reduced-for-readability version and probably does not resemble the original script much).

This kind of rebase handles most cases fairly well. The most common case that it doesn't handle well involves rebasing across some file renames. It also cannot copy an "empty" commit—one whose patch is empty, that is—as git format-patch is not allowed to omit the patch part.

These empty commits are normally omitted by git rebase even when using -m; you must add -k to preserve them. To preserve them, git rebase must switch to the cherry-pick variant, if it has not already done so.

To pass -s or -X arguments, rebase must invoke git cherry-pick rather than git am, so any of those flags also require the cherry-pick variant.
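The options discussed so far can be summarized in a tiny shell predicate. This is purely an illustration of the list above, not how Git itself decides: the function name `forces_cherry_pick` is invented, and it only looks at the spellings of -m, -i, -k, -s and -X that I am aware of.

```shell
# Hypothetical helper (not part of Git): succeeds if any argument is one of
# the options that, per the description above, select the cherry-pick variant.
forces_cherry_pick() {
    for arg in "$@"; do
        case "$arg" in
            -m|--merge|-i|--interactive|-k|--keep-empty) return 0 ;;
            -s*|--strategy|--strategy=*) return 0 ;;
            -X*|--strategy-option|--strategy-option=*) return 0 ;;
        esac
    done
    return 1    # none of the listed options was given
}
```

For example, `forces_cherry_pick -X ours upstream && echo "cherry-pick variant"` prints the message, while `forces_cherry_pick upstream` returns a failure status.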

Using git format-patch never does any rename detection. Hence, if the stream of commits you're copying should all have rename detection applied with respect to HEAD, the -m flag is very important. For a concrete example, consider this series of commits:

          B--C--D   <-- topic
         /
...--o--A--E--F--G   <-- mainline

Suppose that the difference from A to B, B to C, and C to D is all handled within a file named lib-foo.ext. But in commit F, this file is renamed to be lib/foo.ext instead. A git format-patch of A..D will show changes to be made to file lib-foo.ext, none of which will apply correctly to commit G as there is no lib-foo.ext file. The rebase as a whole will fail.

A git cherry-pick of commit B when HEAD identifies commit G, however, will find the rename and apply the A-vs-B changes to the version of lib/foo.ext in commit G:

          B--C--D   <-- topic
         /
...--o--A--E--F--G   <-- mainline
                  \
                   B'   <-- HEAD [detached]

The next cherry-pick, of C while HEAD identifies B', will discover that the B-to-C change to lib-foo.ext should be applied to the renamed lib/foo.ext, and the last cherry-pick of D will do the same, so that the rebase will succeed.
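The scenario can be tried out in a throwaway repository. The script below is a reduced sketch, not the full graph: it creates only commit A, one topic commit B, and the rename commit F, and the file contents, identity settings and temporary directory are placeholders chosen for illustration. With -m, the appended line should end up in lib/foo.ext.

```shell
# Build a throwaway repository with a reduced version of the A/B/F layout.
repo=$(mktemp -d)
(
    set -e
    cd "$repo"
    git init -q
    git config user.name "Example"              # placeholder identity
    git config user.email "example@example.com"
    git config commit.gpgsign false

    # Commit A: the file still has its old name.
    printf 'one\ntwo\nthree\nfour\nfive\n' > lib-foo.ext
    git add lib-foo.ext
    git commit -q -m A
    git branch -M mainline

    # Commit B on topic: append a line to lib-foo.ext.
    git checkout -q -b topic
    echo 'topic change' >> lib-foo.ext
    git commit -q -am B

    # Commit F on mainline: rename the file without changing its contents.
    git checkout -q mainline
    mkdir lib
    git mv lib-foo.ext lib/foo.ext
    git commit -q -m F

    # Rebase topic onto mainline using the cherry-pick variant.
    git checkout -q topic
    git rebase -q -m mainline
)
# Show the renamed file as it looks on the rebased topic branch.
cat "$repo/lib/foo.ext"
```

Dropping the -m from the last command is an easy way to compare the two variants on your own Git version; as the comments below suggest, a pure rename like this one may well succeed either way.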

The rename detection code is slow, so a rebase that has no renames to do, and no "empty" commits to keep, can run much faster when run via the git format-patch | git am system. That's about the only way in which the original method is better than the cherry-pick variant: it's faster in constrained cases. (However, the speed improvement only occurs when there are lots of rename candidates, but either none of them are actual renames, or none of them matter.)

(Side note: the -3 argument, or --3way to use the longer spelling, tells git am to pass that flag on to each git apply, where the apply will attempt to do a three-way merge if needed, using the blob hashes in the index line in the diff. Under some conditions, it seems like this might suffice to handle renamed files—in particular if the blob hash exactly matches. The cherry-pick method does full rename detection, which handles inexact matches; -3 cannot do that. See also What is the difference between git cherry-pick and git format-patch | git am?, as Jürgen noted.)

torek
  • Thank you, torek, for this very enlightening answer. Up to now, I had also pictured cherry-picks as applications of previously formatted patches. On that point, I found the thorough answers to this related question very helpful: ["What is the difference between git cherry-pick and git format-patch | git am?"](https://stackoverflow.com/q/52119937/11402257) – Jürgen Apr 29 '19 at 18:06
  • Thanks @torek for this detailed answer, this is very helpful! I still wonder, though: in my tests, I simulated a rename situation that should have failed in the `format-patch | am` pipeline, yet succeeded anyway. Is there some heuristic in `git rebase` that sometimes switches to the cherry-pick variant even for a non-interactive rebase with no switches? My test was with a `dev.upstream` branch having a rename commit, and the current branch having an edit on that file before the renaming, and `git rebase dev.upstream` worked as is, applying the change to the renamed file. – joanis Apr 29 '19 at 19:29
  • @joanis: no, there isn't (or wasn't the last time I looked, which might have been 2.15ish), and I would expect that to have failed in at least some cases. Note though that `git am` uses `git apply -3` to do three-way matching on blobs, so a pure rename (as opposed to a rename-with-mods) might (maybe) be found. I'd have to experiment a bit to check all the nitty-gritty details here. – torek Apr 29 '19 at 20:21