How do I make --squash the default when merging?

jww*_*jww 3 git merge

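We use separate branches for non-trivial bug fixes and features. The branches are kept in sync with master by frequently running git checkout <x>; git merge master.

I've noticed that when merging, git pollutes the log with multiple unrelated messages. For example, rather than a single "Merge <X> into Master" or "Merge Master into <X>", git adds all the commit messages. This is a problem for governance (processes and procedures) on master, because bugs that may have existed in a branch during development are not, and never were, present in the master branch.

Worse, the behavior differs between branches and master. When merging master into a branch, a log entry similar to "Merge Master into <X>" is generated. However, when merging a branch into master, there is no "Merge <X> into Master". According to the log, it is as if the development branch never existed and the merge never happened.

I understand that I have to do something special to make git behave as expected; namely, How to use git merge --squash? (It's the classic git modus operandi: take something simple and make it difficult.)

My question is: how do I make --squash the default action during a merge?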

tor*_*rek 10

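(Note: I got here by following a link from one of your more recent questions. I'm not sure whether you still care about this, but I might as well answer it.)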

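You cannot make Git do a squash "merge" by default for all branches, but you can make it do a squash "merge" by default for some branches. Since you particularly want this to happen for master, that may be just what you want.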
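As a minimal sketch of that per-branch default, the setting named in the comments under this answer (branch.master.mergeOptions) can be tried in a throwaway repository. The branch name feature, the file names, and the commit_file helper are hypothetical and exist only for this illustration.

```shell
# Throwaway repository; "feature", the file names and commit_file are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master        # name the first branch master
git config user.name "Example" && git config user.email "example@example.com"
commit_file() { echo "$2" > "$1" && git add "$1" && git commit -q -m "$3"; }

commit_file a.txt base "base"

# Per-branch default: merges into master behave like git merge --squash.
git config branch.master.mergeOptions "--squash"

git checkout -q -b feature
commit_file f1.txt one "feature: step 1"
commit_file f2.txt two "feature: step 2"

git checkout -q master
commit_file m.txt main "master work"

git merge feature        # squashes, and stops before committing
git commit -q -m "Merge feature into master (squashed)"

# Count the parents of the new commit: 1 means an ordinary commit, not a merge.
parents=$(git rev-list --parents -n 1 HEAD | wc -w)
echo "parents: $((parents - 1))"
git log --oneline master
```

Merging in the other direction (git checkout feature; git merge master) is not affected, because the setting is attached to master only.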

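Let's do a quick[1] review of what git merge really does since, in the usual Git fashion, Git complicates everything. And this:

We use separate branches for non-trivial bug fixes and features. The branches are kept in sync with master by frequently running git checkout <x>; git merge master.

is the reverse of what many people consider the "right" workflow in Git. I have some doubts as to whether any Git workflow can be called "right" :-), but some are more successful than others, and this is definitely the reverse of one of the more successful ones. (I think it can work well, as noted in the extended discussion below.)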


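[1] Well, I tried to keep it short. :-) Feel free to skim, though there is a bunch of important material here. If TL;DR, just jump straight to the end.

The commit graph

As you know, but others may not, a lot in Git is controlled by the commit graph. Every[1] commit has some parent commit or, in the case of a merge commit, two or more parents. To make a new commit, we get onto some branch: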

$ git checkout funkybranch

and do some work in the work-tree, git add some files, and finally git commit the result to branch funkybranch:

... work work work ...
$ git commit -m 'do a thing'

The current commit is the (one, single) commit to which the name funkybranch points. Git finds this by reading HEAD: HEAD normally contains the name of the branch, and the branch contains the raw SHA-1 hash ID of the commit.

To make the new commit, Git reads the ID of the current commit from the branch we're on, saves the index/staging-area into the repository,[2] writes the new commit with the current commit's ID as the new commit's parent, and, last, writes the new commit's ID to the branch information file.

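This is how branches grow: from one commit, we make a new one, then move the branch name to point to the new commit. When we do this as a linear chain, we get a nice linear history: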

... <- C <- D <- E   <-- funkybranch

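Commit E (which might really be e35d9f... or whatever) is the current commit. It points back to D because D was the current commit when we made E; D points back to C because C was current at that time; and so on.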
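A small sketch of that chain, assuming a throwaway repository with three empty commits standing in for C, D and E: HEAD holds the branch name, the branch name holds the hash of the tip commit, and each commit records its parent.

```shell
# Throwaway repository; the commit messages are placeholders for C, D and E.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/funkybranch   # start on a branch named funkybranch
git config user.name "Example" && git config user.email "example@example.com"

for name in C D E; do
  git commit -q --allow-empty -m "commit $name"
done

git symbolic-ref HEAD              # HEAD holds the branch name
tip=$(git rev-parse funkybranch)   # the branch name holds the hash of E
parent=$(git rev-parse HEAD^)      # E's stored parent is D

# One line per commit: its own hash, its parent's hash, its message.
git log --format='%h  parent=%p  %s' funkybranch
```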

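When we make a new branch using, e.g., git checkout -b, all we are doing is telling Git to make a new name pointing to some existing commit. Usually that is just the current commit. So if we are on funkybranch, and funkybranch points to commit E, and we run:

$ git checkout -b newbranch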

then we get this:

... <- C <- D <- E   <-- funkybranch, newbranch

That is, both names point to commit E. Git knows that we are on newbranch now because HEAD says newbranch. I like to include that in this kind of drawing as well:

... <- C <- D <- E   <-- funkybranch, HEAD -> newbranch

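I also like to draw my graphs in a more compact manner. We know that commits always point "backwards" to their parents, because it is impossible to make the new commit E before we have made commit D. So these arrows always point leftward, and we can just draw one or two dashes: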

...--C--D--E   <-- funkybranch, HEAD -> newbranch

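(And then, if we don't need to know which commit is which, we can just draw a round o node for each commit, but for now I will stick with single uppercase letters.)

If we make a new commit now (commit F), it causes newbranch to advance (because, as we can see from HEAD, we are on newbranch). So let's draw that: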

...--C--D--E      <-- funkybranch
            \
             F    <-- HEAD -> newbranch

Now let's git checkout funkybranch again, do some work there and commit it, making new commit G:

...--C--D--E--G   <-- HEAD -> funkybranch
            \
             F    <-- newbranch

(HEAD now points to funkybranch.) Now we have something we can merge.
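The drawings above can be reproduced with a rough sketch like the following, which assumes a throwaway repository and uses empty commits named E, F and G as stand-ins for real work.

```shell
# Throwaway repository; empty commits E, F and G stand in for real work.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/funkybranch
git config user.name "Example" && git config user.email "example@example.com"

git commit -q --allow-empty -m "E"
git checkout -q -b newbranch             # a new name for the same commit

# Both names resolve to one hash, so this counts a single distinct commit.
same=$(git rev-parse funkybranch newbranch | sort -u | wc -l)

git commit -q --allow-empty -m "F"       # only newbranch advances
git checkout -q funkybranch
git commit -q --allow-empty -m "G"       # only funkybranch advances

git log --graph --oneline --all          # E with two children, F and G
```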


[1] Well, every commit except for root commits. In most Git repositories there is just one root commit, which is the very first commit. Obviously it cannot have a parent commit, since the parent of each new commit is whatever commit was current when we made the new commit. With no commits at all, there is no current commit yet when we make the first commit. So it becomes a root commit, and then all later commits are its children, grandchildren, great-grand-children, and so on.

[2] Most of the "save" work actually happens at each git add. The index/staging-area contains hash IDs, rather than actual file contents: the file contents were saved away when you ran git add. This is because Git's graph is not just of commit objects, but of every object in the repository. This is part of what makes Git so fast as compared to, e.g., Mercurial (which saves the files away at commit time rather than add time). Fortunately this, unlike the commit graph itself, is something users need not know or care about.

Git merge

As before, we have to be on some branch.[1] We're on funkybranch, so we are all good to go:

$ git merge newbranch

At this point, most people seem to think that Magic Happens. It's not magic at all though. Git now finds the merge base between our current commit and the one we named, and then runs two git diff commands.

The merge base is simply[2] the first commit "in common" on the two branches: the first commit that is on both branches. We are on funkybranch, which points to G. We gave Git the branch name newbranch, which points to commit F. So we're merging commits G and F, and Git follows both of their parent pointers until it reaches a commit node that is on both branches. In this case, that's commit E: commit E is the merge base.

Now Git runs those two git diff commands. One compares the merge base against the current commit: git diff <id-of-E> <id-of-G>. The second diff compares the merge base against the other commit: git diff <id-of-E> <id-of-F>.
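Those two steps can be run by hand. The sketch below assumes a throwaway repository in which each commit adds one hypothetical file (e.txt, f.txt, g.txt); the commit_file helper is also an assumption made for brevity. git merge-base prints the merge base, and the two git diff commands are the comparisons described above.

```shell
# Throwaway repository; file names and the commit_file helper are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/funkybranch
git config user.name "Example" && git config user.email "example@example.com"
commit_file() { echo "$2" > "$1" && git add "$1" && git commit -q -m "$3"; }

commit_file e.txt shared "E"
git checkout -q -b newbranch && commit_file f.txt theirs "F"
git checkout -q funkybranch  && commit_file g.txt ours "G"

base=$(git merge-base funkybranch newbranch)     # commit E
git log -1 --format='merge base: %s' "$base"

git diff --stat "$base" funkybranch              # our changes: E vs G
git diff --stat "$base" newbranch                # their changes: E vs F
```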

Finally, Git attempts to combine the two sets of changes, writing the result to our current work-tree. If the changes seem independent, Git takes both of them. If they seem to collide, Git stops with a "merge conflict" and makes us clean it up. If they seem to be the same changes, Git takes just one copy of the changes.

All of this "seems" stuff is done on a purely textual basis. Git has no understanding of code. It just sees things like "delete a line reading ++x;" and "add a line reading y *= 2;". Those look different, so as long as they seem to be in different areas, it does the one delete and the one add, to the files in the merge-base, putting the result in the work-tree.

Last, assuming all goes well and the merge does not stop with a conflict, Git makes a new commit. The new commit is a merge commit, which means it has two[3] parents. The first parent (the order matters) is the current commit, just as with regular, non-merge commits. The second parent is the other commit. Once the commit is safely written to the repository, Git writes the new commit's ID into the branch name, as usual. So, assuming the merge works, we get this:

...--C--D--E--G--H  <-- HEAD -> funkybranch
            \   /
              F     <-- newbranch

Note that newbranch has not moved: it still points to commit F. HEAD has not changed either: it still contains the name funkybranch. Only funkybranch has changed: it now points to the new merge commit H, and H points back to G, and also to F.
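To check these claims, here is a sketch under the same assumptions as before (throwaway repository, hypothetical file names and commit_file helper). It performs the merge and then inspects the two parents of the new commit with HEAD^1 and HEAD^2.

```shell
# Throwaway repository; file names and the commit_file helper are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/funkybranch
git config user.name "Example" && git config user.email "example@example.com"
commit_file() { echo "$2" > "$1" && git add "$1" && git commit -q -m "$3"; }

commit_file e.txt shared "E"
git checkout -q -b newbranch && commit_file f.txt theirs "F"
git checkout -q funkybranch  && commit_file g.txt ours "G"

f_before=$(git rev-parse newbranch)      # where newbranch points before the merge
git merge -q --no-edit newbranch         # creates merge commit H

git log -1 --format='H parents: %p'      # two abbreviated hashes
first=$(git log -1 --format=%s HEAD^1)   # first parent: the commit we were on (G)
second=$(git rev-parse HEAD^2)           # second parent: the commit we merged (F)
```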


[1] Git is a bit schizoid about this. If we git checkout a raw SHA-1, or anything else that is not a branch name, it goes into a state it calls "detached HEAD". Internally, this works by shoving the SHA-1 hash directly into the HEAD file, so that HEAD gives the commit ID, rather than the name of the branch. But the way Git does everything else makes it work as though we're on a special branch whose name is just the empty string. It's the (single) anonymous branch or, equivalently, the branch named HEAD. So in one sense, we're always on a branch: even if Git says that we're not on any branch, Git also says that we're on the special anonymous branch.

This causes a lot of confusion, and it might be more sensible if it weren't allowed, but Git uses it internally during git rebase, so it's actually pretty important. If the rebase goes wrong, this detail leaks out, and you wind up having to know what "detached HEAD" means, and is.

[2] I am deliberately ignoring a hard case here, which occurs when there are multiple possible merge base commits. Mercurial and Git use different solutions here: Mercurial picks one at (what seems to be) random, while Git gives you options. These cases are rare though, and ideally, even when they do occur, Mercurial's simpler method works anyway.

[3] Two or more, really: Git supports the concept of an octopus merge. But there's no need to go there. :-)

Merge changes the graph from a tree to a DAG

Merges—true merges: commits with two or more parents—have a bunch of important—critical, even—side effects. The main one is that the presence of a merge causes the commit graph data structure to change from a tree, where branches simply fork off and grow on their own, into a DAG: a Directed Acyclic Graph.

When Git walks the graph, as it does for so many operations, it usually follows all paths back. Since a merge has two parents, git log, which walks the graph, shows both parent commits. Hence this is considered a Feature:

For example, rather than a single "Merge <X> into Master" or "Merge Master into <X>", git will add all the commit messages.

Git is following, and hence logging, both the original commit sequence—commits H, G, E, D, and so on—and the merged-in commit sequence F, E, D, and so on. Of course, it shows each commit only once; and by default, it sorts these commits by their date-stamps, intermingling the two branches if each one has many commits with dates that overlap.

If you don't want to see the commits that came in via the "other side" of a merge, Git has a way to do that: --first-parent tells every Git command that walks the graph[1] to follow only the first parent of each merge. The other side is still there in the graph, and it still affects how Git computes things like the merge base, but git log --first-parent won't show it.
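A quick way to see the difference, again as a sketch in a throwaway repository with hypothetical file names and a hypothetical commit_file helper: after merging newbranch into funkybranch, a plain walk reaches four commits, while a --first-parent walk skips F.

```shell
# Throwaway repository; file names and the commit_file helper are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/funkybranch
git config user.name "Example" && git config user.email "example@example.com"
commit_file() { echo "$2" > "$1" && git add "$1" && git commit -q -m "$3"; }

commit_file e.txt shared "E"
git checkout -q -b newbranch && commit_file f.txt theirs "F"
git checkout -q funkybranch  && commit_file g.txt ours "G"
git merge -q --no-edit newbranch         # merge commit H

git log --format=%s                      # walks both parents: H, G, F and E
git log --first-parent --format=%s       # follows first parents only: H, G, E

all=$(git rev-list --count HEAD)
mainline=$(git rev-list --count --first-parent HEAD)
echo "all: $all, first-parent: $mainline"
```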


[1] This is quite a lot of Git commands. They use, or in the case of git log itself, are, variants of git rev-list, which is Git's general purpose graph-walk program. This code is central to push, fetch, bisect, log, blame, rebase, and numerous others. Its documentation has a dizzying array of options. The key ones to know as a casual user are --first-parent (just discussed here); --no-walk (suppresses graph walking entirely); --ancestry-path (simplifies history for source tree related work); --simplify-by-decoration (simplifies history for git log output); --branches, --remotes, and --tags (selects starting points for graph walking by branch, remote, or tag name); --merges and --no-merges (include or exclude merge commits); --since and --until (limit commits by date ranges); and the basic .. and ... (two and three dot) graph subsetting operations.

Benefits of merges

Having the merge in place means that development on a branch can continue on that branch, and a later git merge finds a newer—and hence less complicated—merge base. Consider this graph, where only a few commits have single-letter names:

  o--o--o--o--H--o--o--I        <-- feature2
 /             \        \
A--o--B---C-----D--E-----F--G   <-- master
 \       /        /        /
  o--o--J--o--o--K--o--o--L     <-- feature1

Here, except for two early commits done on master after the root commit A, all development has taken place on side branches feature1 and feature2. Commits C, D, E, F, and G are all merges (in this case, strictly into master), bringing the feature-work into master when it was ready.

Note that when we made commit C on master, we did:

$ git checkout master; git merge feature1

which found A as the merge base and B and J as the two tip commits to merge. When we made D:

$ git checkout master; git merge feature2

we had A as the merge base and C and H as the two tip commits. So far, this is nothing special. But when we made E, we had this much so far (the final os, and even I, on feature2 may or may not have been in place—they have no effect):

  o--o--o--o--H--o--o           <-- feature2
 /             \
A--o--B---C-----D               <-- master
 \       /
  o--o--J--o--o--K              <-- feature1

The merge base of master and feature1 is the first commit that is on both branches, which is commit J, which is the one we merged in to make C. So to do this merge, Git compares J vs D—the code we brought in from feature2—and J vs K: the new code (and only the new code) on feature1. If all goes well, or once we fix merge conflicts, this makes commit E and we now have:

  o--o--o--o--H--o--o--I        <-- feature2
 /             \
A--o--B---C-----D--E            <-- master
 \       /        /
  o--o--J--o--o--K--o--o        <-- feature1

when we go to merge feature2 again. This time the merge base is commit H: moving straight back from feature2 soon hits H, and moving from E to D and then up to H from master also hits H. So now Git compares H vs E, which is what we brought in from feature1, and H vs I, which is the new stuff we added to feature2, and merges just those.
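The moving merge base can be observed directly. The sketch below is a cut-down version of the drawing, with only master and feature1 and one commit per letter; the file names and the commit_file helper are hypothetical. It asks git merge-base for the base before and after the real merge that makes C.

```shell
# Throwaway repository; file names and the commit_file helper are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"
commit_file() { echo "$2" > "$1" && git add "$1" && git commit -q -m "$3"; }

commit_file a.txt a "A"
git checkout -q -b feature1 && commit_file j.txt j "J"
git checkout -q master      && commit_file b.txt b "B"

base1=$(git log -1 --format=%s "$(git merge-base master feature1)")   # A

git merge -q --no-edit feature1          # real merge: commit C, parents B and J
git checkout -q feature1 && commit_file k.txt k "K"

base2=$(git log -1 --format=%s "$(git merge-base master feature1)")   # J
echo "first merge base: $base1, second merge base: $base2"
```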

Drawbacks of merges

Trees have some very nice graph-theoretic properties, such as a guarantee of a single simple merge-base. Arbitrary DAGs may lose these properties. In particular, doing merges both ways—merging master into branch and merging branch into master—results in "criss cross merges" that can give you multiple merge bases.

Merges also make the graph (git log) very hard to follow. Using --first-parent or --simplify-by-decoration helps, especially if you practice good merging, but these graphs just naturally get messy.

Squash merges

Squash merges avoid the problems, but do so by paying a fairly heavy price: they are not merges at all. (Soon, we'll see how to deal with this.)

When you run git merge --squash, Git goes through the same motions as before in terms of finding a merge base, and making two diffs: merge-base vs current-commit, and merge-base vs other-commit. It then combines the changes in exactly the same way as for a regular merge. But then it makes an ordinary commit.[1] The new commit has just a single parent, taken from the current branch.

Let's see that in action for the same sequence with feature1 and feature2:

  o--o--o                       <-- feature2
 /
A--o--B                         <-- master
 \ 
  o--o--J                       <-- feature1

We do git checkout master; git merge --squash feature1 to make new commit C. Git compares A vs B to see what we did on master, and A vs J to see what they (we) did on feature1. Git combines those changes and we get commit C, but with only one parent:

  o--o--o                       <-- feature2
 /
A--o--B---C                     <-- master
 \
  o--o--J                       <-- feature1

Now we'll make D as a squash from feature2:

  o--o--o--o--H                 <-- feature2
 /
A--o--B---C                     <-- master
 \
  o--o--J--o--o                 <-- feature1

Git compares A vs C, and A vs H, same as last time. We now get D. So far it's much the same, except that there are no points where the branches rejoin. But now it is time to make E:

  o--o--o--o--H--o--o           <-- feature2
 /
A--o--B---C-----D               <-- master
 \
  o--o--J--o--o--K              <-- feature1

We run git checkout master; git merge --squash feature1 as before.

Last time, Git compared J-vs-D and J-vs-K, as commit J was our merge base.

This time, commit A is (still) our merge base. Git compares A vs D, and A vs K. If there were conflicts we solved at C last time, we probably have to solve them again. This is bad—but we're not lost yet.
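For comparison with the real-merge case, this sketch repeats the same cut-down history (master and feature1 only, hypothetical file names and commit_file helper) but makes C with git merge --squash. The squash commit has one parent, and the merge base stays at A.

```shell
# Throwaway repository; file names and the commit_file helper are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"
commit_file() { echo "$2" > "$1" && git add "$1" && git commit -q -m "$3"; }

commit_file a.txt a "A"
git checkout -q -b feature1 && commit_file j.txt j "J"
git checkout -q master      && commit_file b.txt b "B"

git merge -q --squash feature1           # combines the changes, then stops
git commit -q -m "C (squash of feature1)"
c_parents=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))

git checkout -q feature1 && commit_file k.txt k "K"

base=$(git log -1 --format=%s "$(git merge-base master feature1)")
echo "C has $c_parents parent(s); merge base is still: $base"
```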


[1] Ordinary, as opposed to merge. As such, a squash merge is not a merge at all: it's a "get me the work done" commit, but it's not a merge commit. We need a real merge commit in addition; we will get to this in the next section.

Git actually stops here and forces you to run git commit to make the squash commit. Why? Who knows, it's Git. :-)

Squash merges can work

To solve the above, we just need to re-merge (with a non-squash "real merge") from master back to the feature branches. That is, instead of simply merging from whichever feature branch into master, and then continuing to work on the feature branch, we do this:

  o--o--o--o--H--*-o--o        <-- feature2
 /              /
A--o--B---C----D               <-- master
 \         \
  o--o--J---*--o--o--K         <-- feature1

These new commits, marked *, are (non-squash) merges from master, into feature1 and feature2. We made squash merge C to pick up changes made from A to J. So we then make a real merge into feature1, preferably using the tree straight from master [1] (which has whatever goodies were in o--B-- as well). (We also made the * on feature2, just as general preparation, after making D on master to bring in everything from A to H. Like the * on feature1 we probably just want the source tree straight from master.)

Now that we're ready to bring in more work from feature1, we can just do another (squash) merge. The merge-base of master and feature1 is commit C, and the two tips are D and K, which is just what we want. Git's merge code will come up with a reasonably close result; we fix up any conflicts, test, fix any breakage, and commit; and then we do another "prep work" merge from master back into feature1 as before.

This work-flow is a bit more complicated than the "merge into master" one, but should give good results.
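A rough sketch of this round trip follows, with some loud assumptions: it uses a plain git merge master for the merge back into feature1 (not the "tree straight from master" variant), an ordinary commit D stands in for the squash from feature2, and the file names and commit_file helper are hypothetical.

```shell
# Throwaway repository; file names and the commit_file helper are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"
commit_file() { echo "$2" > "$1" && git add "$1" && git commit -q -m "$3"; }

commit_file a.txt a "A"
git checkout -q -b feature1 && commit_file j.txt j "J"
git checkout -q master      && commit_file b.txt b "B"

git merge -q --squash feature1 && git commit -q -m "C (squash of feature1)"

# Prep work: a real merge from master back into the feature branch (the * commit).
git checkout -q feature1
git merge -q --no-edit master
commit_file k.txt k "K"

git checkout -q master
commit_file d.txt d "D"                  # stand-in for the squash from feature2

base=$(git log -1 --format=%s "$(git merge-base master feature1)")
echo "merge base: $base"                 # now C, not A

git merge -q --squash feature1 && git commit -q -m "E (squash of feature1 again)"
e_parents=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
```

The second squash only has to deal with the work done after C, which is the point of the extra merge back into feature1.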


[1] Git does not make this totally trivial: we want a merge with a -s theirs strategy, which Git simply doesn't have. There is an easy way to get the desired effect using "plumbing" commands, but I'll leave that out of this answer, which is already crazy-long.

So, if that all works, how about the mechanics?

Note that what we want is merge --squash when merging into master, but regular (non-squash) merge when merging from master. In other words:

$ git checkout master && git merge foo

should use --squash, but:

$ git checkout foo && git merge master

should not use --squash. (The tree copying from the footnote in the previous section might be nice, but should be unnecessary: the merge result should basically always be the tree straight out of __CO

  • tl;dr: $ git config branch.master.mergeOptions "--squash" (2 upvotes)
  • @Joanvo: Yes, I tried to make that clear with the "jump to the end" bit... (2 upvotes)