How does 'git merge' work in detail?

aby*_*s.7 69 git merge conflict

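I want to know the exact algorithm (or something close to it) behind 'git merge'. Answers to at least these sub-questions would be helpful:

  • How does git detect the context of a particular non-conflicting change?
  • How does git find out that there is a conflict in these exact lines?
  • Which things does git auto-merge?
  • How does git perform when there is no common base for the branches being merged?
  • How does git perform when there are multiple common bases for the branches being merged?
  • What happens when I merge multiple branches at once?
  • What is the difference between the merge strategies?

But a description of the whole algorithm would be much better.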

twa*_*erg 51

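You might be best off looking for a description of a 3-way merge algorithm. A high-level description would go something like this:

  1. Find a suitable merge base B: a version of the file that is an ancestor of both of the new versions (X and Y), and usually the most recent such base (although there are cases where it will have to go back further, which is one of the features of Git's default recursive merge).
  2. Perform diffs of X with B and of Y with B.
  3. Walk through the change blocks identified in the two diffs. If both sides introduce the same change in the same spot, accept either one; if one introduces a change and the other leaves that region alone, introduce the change in the final result; if both introduce changes in a spot, but they don't match, mark a conflict to be resolved manually.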
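
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard library's difflib rather than Git's own diff code, works on lists of lines, and the names match_map and merge3 are made up for this example. It is not how Git itself is implemented.

```python
import difflib


def match_map(base, other):
    """Map each base index to the index of the same unchanged line in `other`."""
    mapping = {}
    matcher = difflib.SequenceMatcher(None, base, other, autojunk=False)
    for a, b, size in matcher.get_matching_blocks():
        for k in range(size):
            mapping[a + k] = b + k
    return mapping


def merge3(base, x, y):
    """Merge line lists x and y against base; return (merged_lines, had_conflict)."""
    mx, my = match_map(base, x), match_map(base, y)
    # Sync points: base lines left untouched by BOTH sides, plus an end sentinel.
    sync = [(i, mx[i], my[i]) for i in range(len(base)) if i in mx and i in my]
    sync.append((len(base), len(x), len(y)))

    merged, had_conflict = [], False
    b = ix = iy = 0  # cursors into base, x and y
    for sb, sx, sy in sync:
        cb, cx, cy = base[b:sb], x[ix:sx], y[iy:sy]  # chunk between sync points
        if cx == cy:        # same change (or no change) on both sides
            merged += cx
        elif cx == cb:      # only y changed this region
            merged += cy
        elif cy == cb:      # only x changed this region
            merged += cx
        else:               # both changed it, and differently: conflict
            had_conflict = True
            merged += ["<<<<<<< x", *cx, "=======", *cy, ">>>>>>> y"]
        if sb < len(base):  # emit the sync line itself
            merged.append(base[sb])
        b, ix, iy = sb + 1, sx + 1, sy + 1
    return merged, had_conflict


base = ["a", "b", "c"]
print(merge3(base, ["a", "B", "c"], ["a", "b", "c", "d"]))  # clean merge
print(merge3(base, ["a", "X", "c"], ["a", "Y", "c"]))       # conflicting edits
```

The first call combines an edit from one side with an appended line from the other; the second produces a conflict block because both sides rewrote the same line differently.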

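The full algorithm deals with this in a lot more detail, and even has some documentation (/usr/share/doc/git-doc/technical/trivial-merge.txt for one, along with the git help XXX pages, where XXX is one of merge-base, merge-file, merge, merge-one-file and possibly a few others). If that's not deep enough, there's always source code...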


Cir*_*四事件 9

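How does git perform when there are multiple common bases for the branches being merged?

This article was very helpful: http://codicesoftware.blogspot.com/2011/09/merge-recursive-strategy.html (here is part 2).

The recursive strategy uses diff3 recursively to generate a virtual branch, which is then used as the ancestor.

For example: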

(A)----(B)----(C)-----(F)
        |      |       |
        |      |   +---+
        |      |   |
        |      +-------+
        |          |   |
        |      +---+   |
        |      |       |
        +-----(D)-----(E)

Then:

git checkout E
git merge F

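There are two best common ancestors (common ancestors that are not ancestors of any other common ancestor), C and D. Git merges them into a new virtual branch V, and then uses V as the base.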

(A)----(B)----(C)--------(F)
        |      |          |
        |      |      +---+
        |      |      |
        |      +----------+
        |      |      |   |
        |      +--(V) |   |
        |          |  |   |
        |      +---+  |   |
        |      |      |   |
        |      +------+   |
        |      |          |
        +-----(D)--------(E)

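I suppose Git would just keep going if there were more best common ancestors, merging V with the next one.

The article says that if there is a merge conflict while generating the virtual branch, Git just leaves the conflict markers where they are and continues.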
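
To make "best common ancestor" concrete, here is a small Python sketch over the example graph. The parents dictionary is my reading of the ASCII diagram (E has parents D and C, F has parents C and D), and the function names are invented for this illustration. On a real repository, git merge-base --all E F should list the same commits.

```python
def ancestors(parents, commit):
    """All commits reachable from `commit` via parent links, including itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen


def best_common_ancestors(parents, a, b):
    """Common ancestors of a and b that are not ancestors of another common ancestor."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {
        c for c in common
        if not any(c != d and c in ancestors(parents, d) for d in common)
    }


# Commit graph from the diagram (an assumption based on how the lines cross).
parents = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["B"],
    "E": ["D", "C"],
    "F": ["C", "D"],
}

print(sorted(best_common_ancestors(parents, "E", "F")))
```

A and B are also common ancestors of E and F, but each is an ancestor of C (and of D), so only C and D survive the filter. That criss-cross situation is exactly the case where the recursive strategy builds the virtual base V.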

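What happens when I merge multiple branches at once?

As @Nevik Rehnel explained, it depends on the strategy; this is well explained in the MERGE STRATEGIES section of man git-merge.

Only octopus and ours support merging multiple branches at once; recursive, for example, does not. (theirs is an option of the recursive strategy, not a strategy of its own.)

octopus refuses to merge if there would be conflicts, and ours is a trivial merge, so there can be no conflicts.

The new commit generated by such a merge has more than 2 parents.

I ran one octopus merge without conflicts on Git 1.8.5 to see how it goes.

Initial state: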

   +--B
   |
A--+--C
   |
   +--D

Action:

git checkout B
git merge -s octopus C D

New state:

   +--B--+
   |     |
A--+--C--+--E
   |     |
   +--D--+

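As expected, E has 3 parents.

TODO: how exactly octopus operates on modifications to a single file. Recursive two-by-two 3-way merges?

How does git perform when there is no common base for the branches being merged?

@Torek mentions that since Git 2.9, the merge fails without --allow-unrelated-histories.

I tried it out empirically on Git 1.8.5: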

git init
printf 'a\nc\n' > a
git add .
git commit -m a

git checkout --orphan b
printf 'a\nb\nc\n' > a
git add .
git commit -m b
git merge master

a contains:

a
<<<<<<< ours
b
=======
>>>>>>> theirs
c

Then:

git checkout --conflict=diff3 -- .

a contains:

<<<<<<< ours
a
b
c
||||||| base
=======
a
c
>>>>>>> theirs

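Interpretation:

  • the base is empty
  • when the base is empty, it is not possible to resolve any modification to a single file; only things like the addition of new files can be resolved. The conflict above would be resolved as a single-line addition in a 3-way merge with base a\nc\n
  • I think that a 3-way merge without a base file is called a 2-way merge, which is just a diff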
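
The interpretation above can be checked with a short Python sketch. It uses difflib as a stand-in for Git's diff (an assumption, as is the helper name changes) and compares what each side changed relative to an empty base versus the base a\nc\n.

```python
import difflib


def changes(base, side):
    """Return the non-'equal' opcodes of the diff from base to side."""
    matcher = difflib.SequenceMatcher(None, base, side, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]


ours = ["a", "b", "c"]
theirs = ["a", "c"]

# Empty base: each side is one big insertion at the same spot, and the
# inserted contents differ, so a 3-way merge has to report a conflict.
print(changes([], ours))
print(changes([], theirs))

# Base a\nc\n: ours inserts a single line, theirs changes nothing,
# so the merge can take ours without any conflict.
real_base = ["a", "c"]
print(changes(real_base, ours))
print(changes(real_base, theirs))
```

With the empty base the whole file is the changed region on both sides, which matches the diff3-style output above. Git's default conflict style seems to trim the lines both sides share, which would explain why only b appears between the markers in the first output.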


13r*_*ren 7

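I'm interested too. I don't know the answer, but...

A complex system that works is invariably found to have evolved from a simple system that worked

I think git's merging is highly sophisticated and will be very difficult to understand, but one way to approach it is to start from its precursors and to focus on the heart of your concern. That is, given two files that don't have a common ancestor, how does git merge work out how to merge them, and where the conflicts are?

Let's try to find some precursors. From git help merge-file: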

git merge-file is designed to be a minimal clone of RCS merge; that is,
       it implements all of RCS merge's functionality which is needed by
       git(1).

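From Wikipedia: http://en.wikipedia.org/wiki/Git_%28software%29 -> http://en.wikipedia.org/wiki/Three-way_merge#Three-way_merge -> http://en.wikipedia.org/wiki/Diff3 -> http://www.cis.upenn.edu/~bcpierce/papers/diff3-short.pdf

That last link is a PDF of a paper describing the diff3 algorithm in detail. Here's a Google PDF-viewer version. It's only 12 pages long, and the algorithm itself takes only a couple of pages, but it is a full-on mathematical treatment. This might seem a bit too formal, but if you want to understand git's merge, you'll need to understand the simpler version first. I haven't checked yet, but with a name like diff3, you'll probably also need to understand diff (which uses a longest common subsequence algorithm). However, there may be a more intuitive explanation of diff3 out there, if you search on Google...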
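
For readers who have not met it, the longest common subsequence idea mentioned above fits in a few lines. This is a textbook dynamic-programming sketch in Python, assuming inputs small enough for a quadratic table; production diff tools use faster algorithms, and the function name lcs is just for this example.

```python
def lcs(a, b):
    """Return one longest common subsequence of sequences a and b."""
    n, m = len(a), len(b)
    # table[i][j] = length of the LCS of a[i:] and b[j:]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the top-left corner to recover the subsequence.
    out, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


old = ["a", "c"]
new = ["a", "b", "c"]
print(lcs(old, new))
```

Lines in the LCS are the ones a diff reports as unchanged; everything else in old is a deletion and everything else in new is an insertion. diff3 builds on two such diffs, one from the old version to each new version.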


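Now, I just did an experiment comparing diff3 and git merge-file. They take the same three input files, version1 oldversion version2, and mark conflicts the same way, with <<<<<<< version1, =======, >>>>>>> version2 (diff3 also has ||||||| oldversion), showing their common heritage.

I used an empty file for oldversion, and near-identical files for version1 and version2, with just one extra line added to version2.

Result: git merge-file identified the single changed line as the conflict, but diff3 treated the whole of both files as a conflict. Thus, sophisticated as diff3 is, git's merge is even more sophisticated, even in this simplest of cases.

Here are the actual results (I used @twalberg's answer as the text). Note the options needed (see the respective manpages).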

$ git merge-file -p fun1.txt fun0.txt fun2.txt

You might be best off looking for a description of a 3-way merge algorithm. A
high-level description would go something like this:

    Find a suitable merge base B - a version of the file that is an ancestor of
both of the new versions (X and Y), and usually the most recent such base
(although there are cases where it will have to go back further, which is one
of the features of gits default recursive merge) Perform diffs of X with B and
Y with B.  Walk through the change blocks identified in the two diffs. If both
sides introduce the same change in the same spot, accept either one; if one
introduces a change and the other leaves that region alone, introduce the
change in the final; if both introduce changes in a spot, but they don't match,
mark a conflict to be resolved manually.
<<<<<<< fun1.txt
=======
THIS IS A BIT DIFFERENT
>>>>>>> fun2.txt

The full algorithm deals with this in a lot more detail, and even has some
documentation (/usr/share/doc/git-doc/technical/trivial-merge.txt for one,
along with the git help XXX pages, where XXX is one of merge-base, merge-file,
merge, merge-one-file and possibly a few others). If that's not deep enough,
there's always source code...

$ diff3 -m fun1.txt fun0.txt fun2.txt

<<<<<<< fun1.txt
You might be best off looking for a description of a 3-way merge algorithm. A
high-level description would go something like this:

    Find a suitable merge base B - a version of the file that is an ancestor of
both of the new versions (X and Y), and usually the most recent such base
(although there are cases where it will have to go back further, which is one
of the features of gits default recursive merge) Perform diffs of X with B and
Y with B.  Walk through the change blocks identified in the two diffs. If both
sides introduce the same change in the same spot, accept either one; if one
introduces a change and the other leaves that region alone, introduce the
change in the final; if both introduce changes in a spot, but they don't match,
mark a conflict to be resolved manually.

The full algorithm deals with this in a lot more detail, and even has some
documentation (/usr/share/doc/git-doc/technical/trivial-merge.txt for one,
along with the git help XXX pages, where XXX is one of merge-base, merge-file,
merge, merge-one-file and possibly a few others). If that's not deep enough,
there's always source code...
||||||| fun0.txt
=======
You might be best off looking for a description of a 3-way merge algorithm. A
high-level description would go something like this:

    Find a suitable merge base B - a version of the file that is an ancestor of
both of the new versions (X and Y), and usually the most recent such base
(although there are cases where it will have to go back further, which is one
of the features of gits default recursive merge) Perform diffs of X with B and
Y with B.  Walk through the change blocks identified in the two diffs. If both
sides introduce the same change in the same spot, accept either one; if one
introduces a change and the other leaves that region alone, introduce the
change in the final; if both introduce changes in a spot, but they don't match,
mark a conflict to be resolved manually.
THIS IS A BIT DIFFERENT

The full algorithm deals with this in a lot more detail, and even has some
documentation (/usr/share/doc/git-doc/technical/trivial-merge.txt for one,
along with the git help XXX pages, where XXX is one of merge-base, merge-file,
merge, merge-one-file and possibly a few others). If that's not deep enough,
there's always source code...
>>>>>>> fun2.txt

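If you are truly interested in this, it's a bit of a rabbit hole. To me, it seems as deep as regular expressions, the longest common subsequence algorithm of diff, context-free grammars, or relational algebra. If you want to get to the bottom of it, I think you can, but it will take some determined study.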